Comparing eddfef0339..4cd94a0608 - mesa

Author	SHA1	Message	Date
Emil Velikov	4cd5e5b48e	nouveau: update the Makefile.sources list Reflect the nv50->g80 change and the new gm107_texture header. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-22 11:40:29 +00:00
Marek Olšák	ff360a52e6	radeonsi: implement binary shaders & shader cache in memory (v2) v2: handle _mesa_hash_table_insert failure other cosmetic changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	1132910e50	gallium/radeon: remove unused radeon_shader_binary_free_* functions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	50ac2612d0	radeonsi: make radeon_shader_reloc name string fixed-sized This will simplify implementations of binary shaders. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	1fe73d55e3	radeonsi: move some struct si_shader members to new struct si_shader_info This will be part of shader binaries. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	10fa269f4f	radeonsi: use smaller types for some si_shader members in order to decrease the shader size for a shader cache. v2: add & use SI_MAX_VS_OUTPUTS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	9aaf28da62	radeonsi: enable compiling one variant per shader Shader stats from VERDE: Default scheduler: Totals: SGPRS: 491272 -> 488672 (-0.53 %) VGPRS: 289980 -> 311093 (7.28 %) Code Size: 11091656 -> 11219948 (1.16 %) bytes LDS: 97 -> 97 (0.00 %) blocks Scratch: 1732608 -> 2246656 (29.67 %) bytes per wave Max Waves: 78063 -> 77352 (-0.91 %) Wait states: 0 -> 0 (0.00 %) Looking at some of the worst regressions, I get: - The VGPR increase seems to be caused by the fact that if PS has used less than 16 VGPRs, now it will always use 16 VGPRs and sometimes even 20. However, the wave count remains at 10 if VGPRs <= 24, so no harm there. - The scratch increase seems to be caused by SGPR spilling. The unnecessary SGPR spilling has been an ongoing issue with the compiler and it's completely fixable by rematerializing s_loads or reordering instructions. SI scheduler: Totals: SGPRS: 374848 -> 374576 (-0.07 %) VGPRS: 284456 -> 307515 (8.11 %) Code Size: 11433068 -> 11535452 (0.90 %) bytes LDS: 97 -> 97 (0.00 %) blocks Scratch: 509952 -> 522240 (2.41 %) bytes per wave Max Waves: 79456 -> 78217 (-1.56 %) Wait states: 0 -> 0 (0.00 %) VGPRs - same story as before. The SI scheduler doesn't spill SGPRs so much and generally spills way less than the default scheduler. (522240 spills vs 2246656 spills) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	754cf171e9	radeonsi: print full shader name before disassembly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	3c98e0b369	radeonsi: compile non-GS middle parts of shaders immediately if enabled Still disabled. Only prologs & epilogs are compiled in draw calls, but each variant of those is compiled only once per process. VS is always compiled as hw VS. TES is always compiled as hw VS. LS and ES stages are always compiled on demand. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	e038f8fd49	radeonsi: rework polygon stippling for PS prolog Don't use the pstipple module. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	4636d9be4a	radeonsi: add PS prolog Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	e79bb746ab	radeonsi: add PS epilog Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	eb10919b83	radeonsi: add TCS epilog Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	e1b21696a3	radeonsi: add VS epilog It only exports the primitive ID. Also used by TES when it's compiled as VS. The VS input location of the primitive ID input is v2. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	70de433dea	radeonsi: add VS prolog This is disabled with use_monolithic_shaders = true. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	19a92886a8	radeonsi: first bits for non-monolithic shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	0303886b10	radeonsi: add code for dumping all shader parts together (v2) v2: unify some code into si_get_shader_binary_size Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	17eb99d8b9	radeonsi: add code for combining and uploading shaders from 3 shader parts Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	9d5bf1a3ef	radeonsi: fail compilation if non-GS non-CS shaders have rodata Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	09408764c1	radeonsi: separate 2 pieces of code from create_function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	292759220c	radeonsi: add samplemask parameter to si_export_mrt_color Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	e6aea08b86	radeonsi: add start_instance parameter to get_instance_index_for_fetch Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	dc27456194	radeonsi: separate out shader key bits for prologs & epilogs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	d995d4830e	radeonsi: compute how many input VGPRs fragment shaders have Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	fe1b6ede01	radeonsi: compute how many input SGPRs and VGPRs shaders have Prologs (shader binaries inserted before the API shader binary) need to know this, so that they won't change the input registers unintentionally. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	36202182ac	gallium/radeon: add basic code for setting shader return values LLVMBuildInsertValue will be used on return_value. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Samuel Pitoiset	3c9ed2015c	nvc0: enable compute shaders on Fermi Kepler compute support is really different than Fermi and it's not ready yet. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:42:32 +01:00
Samuel Pitoiset	14a810e9d0	nv50/ir: add atomics support on shared memory for Fermi Changes from v3: - move the previous OP_SELP change to the previous commit Changes from v2: - make sure the op is OP_SELP when emitting the predicate and add one assert - use bld.getSSA() for mkOp2() - add cross edge between tryLockAndSetBB and joinBB Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:42:32 +01:00
Samuel Pitoiset	e0371e63df	nv50/ir: make OP_SELP a compare instruction This OP_SELP insn will be used to handle compare and swap subops. Changes from v2: - fix logic for GK110+ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:42:29 +01:00
Samuel Pitoiset	0c930557bf	nv50/ir: add lock/unlock subops for load/store Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:42:02 +01:00
Samuel Pitoiset	45e85e16f5	nv50/ir: use s[] addr space for shared buffers Shared memory address space (FILE_MEMORY_SHARED) must be used instead of global memory when a shared memory area is declared. Changes from v2: - oops, do not remove TGSI_FILE_BUFFER in a switch in nv50_ir_from_tgsi.cpp Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:58 +01:00
Samuel Pitoiset	80fc67fba5	nvc0: reduce likelihood of collision for real buffers on Fermi Reduce likelihood of collision with real buffers by placing the hole at the top of the 4G area. This fixes some indirect draw+compute tests with large buffers. Suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:53 +01:00
Samuel Pitoiset	807901b639	nvc0: invalidate compute state when switching pipe contexts Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:48 +01:00
Samuel Pitoiset	c6293877f0	nvc0: add support for indirect compute on Fermi When indirect compute is used, the size of the grid (in blocks) is stored as three integers inside a buffer. This requires a macro to set up GRIDDIM_YX and GRIDDIM_Z. Changes from v2: - do not launch the grid if the number of groups for a dimension is 0 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:45 +01:00
Samuel Pitoiset	fa7333a742	nvc0: bind textures/samplers for compute on Fermi Textures and samplers don't seem to be aliased between COMPUTE and 3D. Changes from v2: - refactor the code to share (almost) the same logic between 3d and compute Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:40 +01:00
Samuel Pitoiset	917a5ff6ea	nvc0: bind shader buffers for compute on Fermi This is loosely based on 3D. Shader buffers are bound on c15 (the driver constbuf) at offset 0x200. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:37 +01:00
Samuel Pitoiset	a9b70a86db	nvc0: bind driver constbuf for compute on Fermi Changes from v3: - add new validation state for COMPUTE driver constbuf Changes from v2: - always bind the driver consts even if user params come in via clover Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:32 +01:00
Samuel Pitoiset	527652629d	nvc0: add a new validation state for 3D driver constbuf This will be used to invalidate 3D driver constbuf when using COMPUTE and vice-versa. This is needed because this CB contains a bunch of useful information like the addrs of shader buffers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:29 +01:00
Samuel Pitoiset	57d4251003	nvc0: bind constant buffers for compute on Fermi Loosely based on 3D. Changes from v3: - invalidate COMPUTE CBs after validating 3D CBs because they are aliased Changes from v2: - get rid of the 's' param to nvc0_cb_bo_push() because it doesn't matter to upload constbufs for compute using the 3d chan Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:25 +01:00
Samuel Pitoiset	53f92bb7f9	nvc0: allocate an area for compute user constbufs For compute shaders, we might need to upload uniforms. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-21 10:41:21 +01:00
Samuel Pitoiset	89d25a82e8	nv50: do not advertise about compute shaders Compute shaders are totally unsupported. This keeps Clover from reporting that OpenCL is supported on Tesla, because it's a lie. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-20 19:25:12 +01:00
Rhys Kidd	a0f55e91cc	docs: Correct typo in LLVMpipe envvar description Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-20 16:15:35 +01:00
Ilia Mirkin	0b10ec1086	st/mesa: force depth mode to GL_RED for sized depth/stencil formats See commit `9db2098d` for the i965 version of this. This fixes depth in a bunch of dEQP EXT_texture_border_clamp tests. And probably other ones as well. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2016-02-19 17:37:39 -05:00
Daniel Czarnowski	e6f1a44d14	egl_dri2: set correct error code if swapbuffers fails A return value of '-1' means that there was an error during swap with a window drawable; in this case we set the error to EGL_BAD_NATIVE_WINDOW. v2: coding style cleanup, better commit message Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-19 18:23:19 +00:00
Dongwon Kim	d1e1563bb6	egl: move Null check to eglGetSyncAttribKHR to prevent Segfault Null-check on "*value" is currently done in _eglGetSyncAttrib, which is after eglGetSyncAttribKHR dereferences it. Move the check a layer up (in the beginning of eglGetSyncAttribKHR) to avoid segfaults. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> [Emil Velikov: tweak commit message, add stable tag] Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-19 18:23:19 +00:00
Ilia Mirkin	b697400a97	meta/copy_image: use precomputed dst_internal_format to avoid segfault If the destination is a renderbuffer, dst_tex_image will be NULL. This fixes the *to_renderbuffer dEQP copy image tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: mesa-stable@lists.freedesktop.org	2016-02-19 13:10:28 -05:00
Ilia Mirkin	a03d6f2aa3	mesa: add GL_OES_texture_stencil8 support It's basically the same thing as GL_ARB_texture_stencil8 except that glCopyTexImage isn't supported, so add STENCIL_INDEX to the list of invalid GLES formats for glCopyTexImage. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-02-19 12:37:22 -05:00
Ilia Mirkin	2b938a390c	st/mesa: fix pbo uploads - LOD must be provided in .w for TXF (even for buffer textures) - User buffer must be valid at draw time - Must have a sampler associated with the sampler view This makes PBO uploads work again on nouveau. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-19 11:30:33 -05:00
Ilia Mirkin	68c4af1c19	mesa: check fbo completeness based on internal format, not driver format The base format is a function of the user-requested format, while the driver format is not. So we should use the base format instead. The driver format can be anything. Specifically in the stencil-only case, it might be a depth/stencil format. However we still want to refuse such an attachment when bound to GL_DEPTH. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-19 11:30:33 -05:00
Brian Paul	0eb7b5c2a3	mesa: small optimization of _mesa_expand_bitmap() Avoid a per-pixel multiply. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-19 08:51:51 -07:00
Brian Paul	8a2a1a6bd6	mesa: add special case ubyte[4] / BGRA conversion function This reduces a glTexImage(GL_RGBA, GL_UNSIGNED_BYTE) hot spot when storing the texture as BGRA. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-19 08:51:51 -07:00
Brian Paul	44f48fead5	st/mesa: implement a simple cache for glDrawPixels Instead of discarding the texture we created, keep it around in case the next glDrawPixels draws the same image again. This is intended to help applications which draw the same image several times in a row, either within a frame or in subsequent frames. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-02-19 08:51:51 -07:00
Brian Paul	71dcc067a5	llvmpipe: add a few const qualifiers Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-19 08:51:51 -07:00
Brian Paul	6d551f9ea3	trace: assorted whitespace and formatting fixes Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-02-19 08:49:51 -07:00
Brian Paul	e8689d9df3	trace: remove unneeded inline qualifiers Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-02-19 08:49:41 -07:00
Iago Toral Quiroga	72794b0bd9	glsl: fix emit_inline_matrix_constructor for doubles Specifically, for the case where we initialize a dmat with a source matrix that has fewer columns/rows. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-19 14:16:05 +01:00
Iago Toral Quiroga	d1617b4088	glsl: Mark float constants as such So we don't generate double to float conversion code Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-19 14:16:05 +01:00
Iago Toral Quiroga	ad22886ef1	glsl: fix indentation in emit_inline_matrix_constructor Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-19 14:16:05 +01:00
Rob Clark	04ad05c987	glsl: fix standalone compiler Need to set some non-zero limits for MaxCombinedUniformComponents, otherwise we hit a "Too many <type> shader uniform components" error in the linker. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-02-19 08:02:02 -05:00
Nicolai Hähnle	d7c4ffd1ee	st/mesa: disable depth/stencil/alpha tests in PBO upload Noticed by Brian Paul. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-18 20:49:12 -05:00
Brian Paul	2f3d06d9f9	svga: allow non-contiguous VS input declarations This fixes a glDrawPixels regression since `b63fe0552b`. The new quad-drawing utility code uses 3 vertex attributes (xyz, rgba, st). For the glDrawPixels path we don't use the rgba attribute so there's a gap in the TGSI VS input declarations (INPUT[0] = pos, INPUT[2] = texcoord). The TGSI->VGPU10 translation code did not handle this correctly. I missed this because my VM was configured for HWv11 while testing. Another way to fix this would be to change the tgsi_scan.c code so that the tgsi_shader_info::num_inputs (and num_outputs) included the unused inputs/outputs. These counts would then actually be "max input register index + 1" rather than "number of used inputs". But that change could impact all drivers so put it off for now. No regressions found with piglit or typical GL apps. v2: also update alloc_system_value_index() to use info.file_max[] Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-02-18 15:46:17 -07:00
Oded Gabbay	a3e3c3e621	gallivm: Check whether to stop disassemble only for x86 Because the if statement that checks whether we have a return statement is valid only on x86, surround it with X86 or X86-64 arch defines Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-19 00:18:11 +02:00
Oded Gabbay	b3d42934a1	gallivm: use sstream for disassembling Currently, disassemble() directly prints to stdout. This has broken the profiling support for llvmpipe JIT code. This patch redirects the output to an sstream object, which is then either printed to stdout (for assembly debugging) or written to a file in /tmp/ (for profiling support). Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-19 00:18:11 +02:00
Rob Clark	93c62fdee9	trace: fix new gcc6 warnings src/gallium/drivers/trace/tr_context.c:1713:39: warning: ‘rbug_blocker_flags’ defined but not used [-Wunused-const-variable] static const struct debug_named_value rbug_blocker_flags[] = { ^~~~~~~~~~~~~~~~~~ Note that use of rbug_blocker_flags was removed in: commit `5494332128` Author: Jakob Bornecrantz <jakob@vmware.com> Date: Wed May 12 19:26:19 2010 +0100 trace: Remove rbug from trace Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-02-18 17:10:55 -05:00
Rob Clark	5051d85b03	gallium/auxiliary: fix new gcc6 warnings src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c: In function ‘mm_bufmgr_create_from_buffer’: src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c:288:4: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation] if(mm->map) ^~ src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c:286:1: note: ...this ‘if’ clause, but it is not if(mm->heap) ^~ Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-02-18 17:10:55 -05:00
Rob Clark	bba836ea6a	gallium/hud: fix new gcc6 warnings src/gallium/auxiliary/hud/font.c:234:22: warning: ‘Fixed8x13_Character_159’ defined but not used [-Wunused-const-variable] static const GLubyte Fixed8x13_Character_159[] = { 9, 0, 0, 0, 0, 0, 0,170, 0, 0, 0,130, 0, 0, 0,130, 0, 0, 0,130, 0, 0, 0,170, 0, 0, 0, 0, 0}; ^~~~~~~~~~~~~~~~~~~~~~~ .... many more.. These are simply unused, just #if 0 them out for now, in case someone wants to use them in the future. Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-02-18 17:10:55 -05:00
Rob Clark	7d5372bfe8	mesa: fix new gcc6 warnings src/mesa/main/texstore.c:92:22: warning: ‘map_1032’ defined but not used [-Wunused-const-variable] static const GLubyte map_1032[6] = { 1, 0, 3, 2, ZERO, ONE }; ^~~~~~~~ src/mesa/main/texstore.c:91:22: warning: ‘map_3210’ defined but not used [-Wunused-const-variable] static const GLubyte map_3210[6] = { 3, 2, 1, 0, ZERO, ONE }; ^~~~~~~~ src/mesa/main/texstore.c:90:22: warning: ‘map_identity’ defined but not used [-Wunused-const-variable] static const GLubyte map_identity[6] = { 0, 1, 2, 3, ZERO, ONE }; ^~~~~~~~~~~~ These appear to be unused since: commit `8ec6534b26` Author: Iago Toral Quiroga <itoral@igalia.com> AuthorDate: Wed Oct 15 13:42:11 2014 +0200 mesa: Use _mesa_format_convert to implement texstore_rgba. Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-18 17:10:55 -05:00
Rob Clark	b01575ec99	glsl: fix new gcc6 warnings src/compiler/glsl/lower_discard_flow.cpp:79:1: warning: ‘ir_visitor_status {anonymous}::lower_discard_flow_visitor::visit_enter(ir_loop_jump)’ defined but not used [-Wunused-function] lower_discard_flow_visitor::visit_enter(ir_loop_jump ir) ^~~~~~~~~~~~~~~~~~~~~~~~~~ The base class method that was intended to be overridden was 'visit(ir_loop_jump *ir)', not visit_enter(). Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-18 17:10:55 -05:00
Rob Clark	e93caca071	glsl: fix new gcc6 warnings src/compiler/glsl/ast_to_hir.cpp: In function ‘unsigned int ast_process_struct_or_iface_block_members(exec_list, _mesa_glsl_parse_state, exec_list, glsl_struct_field, bool, glsl_matrix_layout, bool, ir_variable_mode, ast_type_qualifier, unsigned int, unsigned int)’: src/compiler/glsl/ast_to_hir.cpp:6339:52: warning: ‘first_member_has_explicit_location’ may be used uninitialized in this function [-Wmaybe-uninitialized] if (!layout->flags.q.explicit_location && ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~ ((first_member_has_explicit_location && ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ !qual->flags.q.explicit_location) || ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (!first_member_has_explicit_location && ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ qual->flags.q.explicit_location))) { ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-18 17:10:55 -05:00
Rob Clark	e2060aaf57	i965: fix new gcc6 warnings src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp:244:1: warning: ‘void {anonymous}::fs_copy_prop_dataflow::dump_block_data() const’ defined but not used [-Wunused-function] fs_copy_prop_dataflow::dump_block_data() const ^~~~~~~~~~~~~~~~~~~~~ From looking at git history, it looks like this is intended to be unused (ie. just for adding on-demand debug prints) Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-18 17:10:55 -05:00
Rob Clark	a13442ac67	util: fix new gcc6 warnings src/util/hash_table.h:111:23: warning: ‘_mesa_fnv32_1a_offset_bias’ defined but not used [-Wunused-const-variable] static const uint32_t _mesa_fnv32_1a_offset_bias = 2166136261u; ^~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-18 17:10:55 -05:00
Kenneth Graunke	1c694a6c20	glcpp: Disallow "defined" as a macro name. Both GCC and Clang disallow this, and glslang has recently started disallowing it as well. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94188 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-18 13:38:50 -08:00
Samuel Pitoiset	dfc95ad6d1	gallium/cso: only enable compute shaders when TGSI is supported Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94186 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-18 20:41:25 +01:00
Rob Herring	5c7f97426d	Android: disable unused-parameter warning Android builds with -Wunused-parameter enabled which results in spewing lots of warnings. Disable it so more meaningful warnings are more visible. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-18 17:47:33 +00:00
Rob Herring	7efc273df1	Android: enable building on arm64 Use the LOCAL_CFLAGS_{32/64} instead of arch specific variants to define the DEFAULT_DRIVER_DIR. This enables building for arm64. Cc: Chih-Wei Huang <cwhuang@android-x86.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-18 17:47:33 +00:00
Rob Herring	1f53a57b2f	Android: Fix building secondary arch in mixed 32/64-bit builds TARGET_CC is not defined for the secondary arch on combined 32/64-bit builds. The build system uses 2ND_TARGET_CC instead and it is not meant to be used in module makefiles. LOCAL_CC was used to provide C only flags as -std=c99 is not valid for C++ files. Since Android 4.4, LOCAL_CONLYFLAGS was added to set compiler flags on C files only, so it can be used now instead of LOCAL_CC. This will break on pre-4.4 versions of Android, but it is unlikely that anyone is using current Mesa with such an old version of Android. Cc: Chih-Wei Huang <cwhuang@android-x86.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-18 17:47:33 +00:00
Rob Herring	ba06ea1a37	egl: android: clean-up config attribute setting Pass the additional config attributes to dri2_add_config to set them instead of open coding them. This is in preparation to add more attributes. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-18 17:47:33 +00:00
Varad Gautam	e35c5af337	egl: android: fix visuals declaration Signed-off-by: Varad Gautam <varadgautam@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-18 17:47:33 +00:00
Rob Herring	64d2f398f6	Android: fix build break in libmesa_program Commit `5fd848f6c9` ("program: Use _mesa_geometric_samples to calculate gl_NumSamples") broke Android builds. Add the missing include path "main" to framebuffer.h like other includes in prog_statevars.c. Cc: Neil Roberts <neil@linux.intel.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-18 17:47:33 +00:00
Ilia Mirkin	12e3ad2ae9	mesa: gl_NumSamples should always be at least one From ARB_sample_shading: "gl_NumSamples is the total number of samples in the framebuffer, or one if rendering to a non-multisample framebuffer" So make sure to always pass in at least 1. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O`Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2016-02-18 12:35:28 -05:00
Plamena Manolova	65dfb3048e	compiler/glsl: Fix uniform location counting. This patch moves the calculation of current uniforms to link_uniforms, which makes use of UniformRemapTable which stores all the reserved uniform locations. Location assignment for implicit uniforms now tries to use any gaps left in the table after the location assignment for explicit uniforms. This gives us more space to store more uniforms. Patch is based on earlier patch with following changes/additions: 1: Move the counting of explicit locations to check_explicit_uniform_locations and then pass the number to link_assign_uniform_locations. 2: Count the number of empty slots in UniformRemapTable and store them in a list_head. 3: Try to find an empty slot for implicit locations from the list, if that fails resize UniformRemapTable. Fixes following CTS tests: ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max-array Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Plamena Manolova <plamena.manolova@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93696	2016-02-18 11:53:35 +02:00
Roland Scheidegger	d335b6abc0	gallivm, tgsi: provide fake sample_i_ms implementations Just like the rest of the msaa "implementation" it's just fake for now... Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-18 05:00:03 +01:00
Brian Paul	06d3b0a006	st/mesa: new st_DrawAtlasBitmaps() function for drawing bitmap text This basically saves the current pipeline state, sets up state for rendering, constructs a set of textured quads, renders, then restores the previous pipeline state. It shouldn't be hard to implement a similar function for non-gallium drivers. With some code refactoring, the vertex definition code could probably be shared. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-17 19:57:48 -07:00
Brian Paul	b26ddda12f	mesa: implement a display list / glBitmap texture atlas This improves the performance of applications which use glXUseXFont() or wglUseFontBitmaps() and glCallLists() to draw bitmap text. Basically, we collect all the glBitmap images from the display lists and put them into a texture atlas. To render the bitmaps for a glCallLists() command, we render a set of textured quads where each quad is textured with one bitmap image. Actually, the rendering part has to be done by the Mesa driver or Mesa/gallium state tracker. Note that GLUT demos that use glutBitmapCharacter() don't benefit from this. v2, per Nicolai Hähnle: - check the max tex rect size is at least 1024. - add comment in dd.h that texture_rectangle is required. - in _mesa_DeleteLists(), try to delete the atlas before the list(s) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-17 19:57:48 -07:00
Ilia Mirkin	6f4a725073	st/mesa: apply DepthMode swizzle to stencil texturing as well Gallium doesn't present these as GL_RED-style. A swizzle is necessary to present the proper data in the unused components. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-17 21:20:24 -05:00
Ben Widawsky	20e8ee3662	i965/skl: Update Skylake renderer strings Also adds some of the Iris/Pro parts which we previously didn't have named. v2: 0x192d is gt3, not gt4 Adding some 'e' tags for eDRAM parts Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Acked-by: Michał Winiarski <michal.winiarski@intel.com>	2016-02-17 16:50:59 -08:00
Ben Widawsky	644c8a5151	i965/skl: Add two missing device IDs The Iris part is left unbranded because we did not have these with original SKL. v2: 0x192d is gt3, not gt4 v3: Forgot to update the temporary brand string when I did v2. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Acked-by: Michał Winiarski <michal.winiarski@intel.com>	2016-02-17 16:50:59 -08:00
Ilia Mirkin	f3cd62a765	mesa: allow multisampled format info to be returned on GLES 3.1 The restriction on multisampled integer texture formats only applies to GLES 3.0, so don't apply it to GLES 3.1 contexts. This fixes a slew of dEQP-GLES31.functional.state_query.internal_format.* tests, which now all pass. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-17 19:30:40 -05:00
Ben Widawsky	2bf041d94f	i965: Extract push constant state to a new file Every stage has a corresponding 3DSTATE_CONSTANT_XS packet, so having the code to create and emit push constant buffers in genX_vs_state.c is a little strange. Moving it to a separate file seems more logical. v2 [Ken]: Rebase on master, explain motivation in the commit message. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-17 12:34:23 -08:00
Matt Turner	0e9dc59a58	i965: Make emit_minmax return an instruction*. And use it in brw_fs_nir.cpp.	2016-02-17 12:35:27 -08:00
Matt Turner	2f2c00c727	i965: Lower min/max after optimization on Gen4/5. Gen4/5's SEL instruction cannot use conditional modifiers, so min/max are implemented as CMP + SEL. Handling that after optimization lets us CSE more. On Ironlake: total instructions in shared programs: 6426035 -> 6422753 (-0.05%) instructions in affected programs: 326604 -> 323322 (-1.00%) helped: 1411 total cycles in shared programs: 129184700 -> 129101586 (-0.06%) cycles in affected programs: 18950290 -> 18867176 (-0.44%) helped: 2419 HURT: 328 Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-17 12:35:27 -08:00
Matt Turner	378d98f87e	i965/vec4: Initialize force_writemask_all in vec4_builder(). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-17 12:35:27 -08:00
Tom Stellard	dc7cf07af3	radeon/llvm: Add TargetLibraryInfo to the pass manager This will prevent optimization passes from introducing unsupported library calls. Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-17 19:06:41 +00:00
Tom Stellard	4f351a6cb1	radeon/llvm: Set the target triple on the module Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-17 19:06:41 +00:00
Tom Stellard	77f4e1c7ff	gallivm: Add helpers for creating and destroying TargetLibraryInfo This functionality is not exposed via the LLVM C API. Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-17 19:06:41 +00:00
Samuel Pitoiset	cfd1dd0500	nvc0: invalidate all buffers when switching pipe contexts Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-17 21:14:24 +01:00
Ilia Mirkin	49c67926c7	st/mesa: fix up result_src.type when doing i2u/u2i conversions Even though it's a no-op, it's important to keep track of the type so that we can pick the properly-signed op later on. This fixes dEQP-GLES3.functional.shaders.precision.uint.highp_div_fragment, which ended up using IDIV instead of UDIV. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2016-02-17 13:30:33 -05:00
Brian Paul	5e52df2198	st/mesa: use cso_set_viewport_dims() in try_pbo_upload_common() Note that this results in a different transformation for the viewport's Z axis (depth range), but that doesn't matter for this case. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-17 11:25:02 -07:00
Jordan Justen	9a939ebb47	i965/gen7: Use predicated rendering for indirect compute On gen7 (Ivy Bridge, Haswell), we will get a GPU hang if an indirect dispatch is used, but one of the dimensions is 0. Therefore we use predicated rendering on the GPGPU_WALKER command to handle this case. Fixes piglit test: spec/arb_compute_shader/zero-dispatch-size From the ARB_compute_shader spec, under DispatchCompute: "If the work group count in any dimension is zero, no work groups are dispatched." And then for DispatchComputeIndirect: ... "is equivalent (assuming no errors are generated) to calling DispatchCompute with <num_groups_x>, <num_groups_y> and <num_groups_z>" ... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94100 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-17 09:25:47 -08:00
Rob Clark	37d540ba70	freedreno: expose time-elapsed query Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	ba194630cc	freedreno/a4xx: implement time-elapsed query Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	62fa868728	freedreno/a4xx: better occlusion/sample counting This seems to give more reliable results. More similar to what we do on a3xx, although I think it breaks the a3xx theory that the four sets of results map to each MRT (since we appear to still only have four sets on a4xx). The divide-by-two is a bit odd, but seems to be needed for some reason. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	87eb406791	freedreno/query: fix refcnt'ing issue Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	0e91dccf9c	freedreno/query: some queries don't have ->begin_query() Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	9d23d7b7cb	freedreno/query: align counter snapshot locations Some hw queries need their sample memory locations to have certain alignment. At the moment that isn't an issue, since the only hw query is occlusion, so all samples have the same size. But when others are added with different sample sizes, this starts to be a problem. All current and immediately upcoming hw queries simply need their sample address aligned to their size, so let's use that for now. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	8529e210ec	freedreno/query: add optional enable hook Add enable hook for hw query providers. Some will need to configure perfctr selector registers, which we want to do at the start of the submit. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	45ab5b1c34	freedreno: query max gpu freq This will be needed to support converting from cycle counts to time for performance related queries (initially time-elapsed, but there are some additional performance counters that could be wired up). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	dcb69185a0	freedreno: update generated headers Mostly to pull in perf ctrs. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-17 10:41:55 -05:00
Rob Clark	2a7ceb5957	freedreno/ir3: fix new gcc6 errors src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c: In function ‘emit_tex’: src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c:1368:26: warning: unused variable ‘const_off’ [-Wunused-variable] struct ir3_instruction *const_off[4]; ^~~~~~~~~ unused since: commit `8750299a42` Author: Jason Ekstrand <jason.ekstrand@intel.com> Date: Tue Feb 9 14:51:28 2016 -0800 nir: Remove the const_offset from nir_tex_instr Signed-off-by: Rob Clark <robdclark@gmail.com>	2016-02-17 10:41:55 -05:00
Karol Herbst	edf774bb7e	nv50/ir: we can't do the add to mad conversion when the mul saturates Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 18:20:10 -05:00
Karol Herbst	068e9848ba	nv50/ir: optimize neg(and(set, 1)) to set helps shaders in saints row IV, bioshock infinite and shadow warrior total instructions in shared programs : 1914931 -> 1903900 (-0.58%) total gprs used in shared programs : 247920 -> 247785 (-0.05%) total local used in shared programs : 5673 -> 5673 (0.00%) total bytes used in shared programs : 17558272 -> 17457320 (-0.57%) local gpr inst bytes helped 0 137 719 719 hurt 0 12 0 0 v2: remove this opt for OP_SLCT and check against float for OP_SET v3: simplified the code Signed-off-by: Karol Herbst <nouveau@karolherbst.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 18:20:10 -05:00
Ilia Mirkin	ca23c8081f	nv50/ir: fix quadop emission in the presence of predication When there's a predicate, it just goes onto the sources list. If the quadop only has a single regular source, we will end up thinking that the predicate is the second source. Check explicitly for the predSrc so that we don't accidentally emit the wrong thing. This fixes a bunch of dEQP-GLES3.functional.shaders.derivate.* tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-02-16 18:20:10 -05:00
Ilia Mirkin	1d1ddfe5f8	nv50,nvc0: enable/disable seamless cubemap texturing as requested In a situation where the seamless setting isn't available on a per-texture basis (G200+ Teslas, and all Fermis), assume that all samplers will have it identically set, and enable accordingly. This fixes arb_seamless_cubemap piglit test on Fermi and Tesla. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 18:20:10 -05:00
Rob Clark	d49307435a	st/mesa: add missing ETC2 entries to format_map Noticed by Ilia when I was trying to figure out why some app was failing to use ETC2. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:53:43 -05:00
Samuel Pitoiset	3d5f61a262	nvc0: enable compute support on GK110:GM200 with an envvar Without this NVF0_COMPUTE environment variable, compute support is initialized by default and this is not what we want for now because it might break 3D. It will be enabled by default once we are sure it won't break anything. Please note that compute support on GM200+ is not enabled yet because it needs to be double-checked. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 21:39:00 +01:00
Samuel Pitoiset	6d74fa5756	nvc0: add compute support for GM107 Fortunately, compute support on GM107 is very close to GK110, except the GK110_COMPUTE.UNK02C4 which is invalid and should not be used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 21:39:00 +01:00
Samuel Pitoiset	bc331dd838	nvc0: fix compute state initialization on GK110+ Because our firmware doesn't support the GK110_COMPUTE.FIRMWARE[0x6] method, the GPU hangs when it is used. Removing it fixes the issue and allows compute shaders to be launched on GK110+. Tested on GK208 and GM107. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 21:39:00 +01:00
Timothy Arceri	a61823b584	glsl: remove duplicate interpolation_string() function We already have one in the IR code that can be used everywhere it's needed in the AST code, so remove the one from the AST. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-02-17 07:26:38 +11:00
Timothy Arceri	e70ece4eea	glsl: remove unused helper Seems to have become unused when i965 moved to NIR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-17 07:25:10 +11:00
Timothy Arceri	07e6a37332	glsl: set user defined varyings to smooth by default in ES This is usually handled by the backends in order to handle the various interactions with the gl_*Color built-ins. The problem is this means linking will fail if one side of the interface adds the smooth qualifier to the varying and the other side just uses the default even though they match. This fixes various deqp tests. The spec is not clear what to do for desktop GL so leave it as is for now. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92743	2016-02-17 07:23:49 +11:00
Samuel Pitoiset	f638512890	gm107/ir: add ATOM CAS emission This fixes the following dEQP test and the other compswap variants. dEQP-GLES31.functional.ssbo.atomic.compswap.highp_int Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 20:53:39 +01:00
Samuel Pitoiset	09446cf5f6	st/mesa: do not init limits when compute shaders are not supported When the number of uniform blocks is less than 12, ARB_uniform_buffer_object can't be enabled and the maximum GL version is not even 3.1... This fixes a regression introduced in `7c79c1e` (st/mesa: add compute shader state) if the maximum number of uniform blocks allowed for compute shaders is less than 12. This happens on Kepler but this might also affect other Gallium drivers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2016-02-16 20:53:35 +01:00
Jordan Justen	f28d80fabf	mesa: Don't call driver when there is no compute work The ARB_compute_shader spec says: "If the work group count in any dimension is zero, no work groups are dispatched." Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 09:25:20 -08:00
Jordan Justen	8514c75a26	i965: Set compute shader shared memory max to 64k See Ivy Bridge PRM, Volume 2, Part 2, 1.8.4 INTERFACE_DESCRIPTOR_DATA: DWORD 5, bits 20:16: "This field indicates how much shared local memory the thread group requires. The amount is specified in 4k blocks, but only powers of 2 are allowed: 0, 4k, 8k, 16k, 32k and 64k per half-slice." For Haswell, see Volume 2d, INTERFACE_DESCRIPTOR_DATA: DWORD 5, bits 20:16: With text identical to the Ivy Bridge PRM. For Broadwell, see Volume 2d, INTERFACE_DESCRIPTOR_DATA: DWORD 6, bits 20:16: With text identical to the Ivy Bridge PRM. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-16 09:25:20 -08:00
Brian Paul	f90801cd40	st/mesa: use new CSO_BITS_ALL_SHADERS Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-16 10:22:32 -07:00
Brian Paul	1bf8fa8277	cso: add CSO_BITS_ALL_SHADERS For saving/restoring all shader stages. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-16 10:22:32 -07:00
Brian Paul	a0636157c4	st/mesa: simplify st->ctx, ctx->st usage in various places	2016-02-16 10:22:32 -07:00
Brian Paul	5239832cf1	st/mesa: use _mesa_geometric_width/height() in glDrawPixels code Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 10:22:32 -07:00
Brian Paul	b92d48fb6b	st/mesa: rename attr variable in st_DrawTex() Rename to 'tex_attr' to be a bit more clear. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	5ce1f1245d	st/mesa: use 'cso' instead of 'st->cso_context' in st_DrawTex() Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	79ffe94c8b	st/mesa: fix whitespace and add comment in st_DrawTex() Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	4277618235	st/mesa: use _mesa_num_tex_faces() in st_finalize_texture() Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	ffa1a1dd21	cso: make most of the cso_save/restore_x() functions static Users of the CSO save/restore facility all use the new cso_save/restore_state() functions instead. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	223ffd8a08	postprocess: use new cso_save/restore_state() functions Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	70e8a4f734	gallium/hud: use new cso_save/restore_state() functions Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	66889d8f84	gallium/util: use new cso_save/restore_state() functions Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	38db9a4e26	st/mesa: use cso_save/restore_state() in st_cb_texture.c This simplifies the error handling code too. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	33fc248606	st/mesa: use new cso_save/restore_state() functions Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	9403571755	cso: add new cso_save/restore_state() functions cso_save_state() takes a bitmask of state items to save. Calling cso_restore_state() restores those states. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	017a003f1c	cso: remove comment There's a similar comment just a few lines before.	2016-02-16 10:22:32 -07:00
Brian Paul	347b9418ac	st/mesa: use new cso_set_viewport_dims() helper Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	f7af12ae85	cso: add new cso_set_viewport_dims() helper To simplify some viewport setting code in the state tracker. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	f88c859cd3	st/mesa: use 'cso' local var instead of st->cso_context Just a little cleaner. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	d7d4fe90c4	st/mesa: consolidate quad drawing code The glClear, glBitmap and glDrawPixels code now use a new st_draw_quad() helper function. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:32 -07:00
Brian Paul	b63fe0552b	st/mesa: overhaul vertex setup for clearing, glDrawPixels, glBitmap Define a new st_util_vertex structure which is a bit smaller (9 floats versus the previous 12 floats per vertex). Clean up the glClear, glDrawPixels and glBitmap code that sets up the vertex data and does the drawing so it's all very similar. This can lead to more consolidation. v2: add assertion that vertex buffer slot == 0 to catch possible future change in cso_get_aux_vertex_buffer_slot() behavior. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:31 -07:00
Brian Paul	2b1535f82f	st/mesa: include u_draw.h, not u_draw_quad.h in st_draw.c Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-16 10:22:31 -07:00
Jan Vesely	04085afcbf	configure: Bail out on llvm-config component error Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-16 10:09:33 -05:00
Matthew Dawson	0bba5ca468	Handle removal of LLVMAddTargetData in SVN revision 260919 LLVM removed LLVMAddTargetData for the 3.9 release in r260919. For the two places in mesa where this is called, only enable the lines when compiling for less than 3.9. For the radeon driver, I'm not sure how to check if any other LLVM calls need to be adjusted. I think since the target data used is extracted from the LLVMModule, it isn't necessary to pass it back to LLVM again. The code does compile, and at least for radeonsi does run OpenGL games. [ Michel Dänzer: Move #if closer to LLVMAddTargetData in lp_bld_init.c, and add HAVE_LLVM < 0x0309 guards around now unused occurrences of TD and data_layout ] Signed-off-by: Matthew Dawson <matthew@mjdsystems.ca> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-16 16:18:35 +09:00
Topi Pohjolainen	7287cc8440	i965: Expose logic telling if non-msrt mcs is supported Also use the opportunity to mark inputs constant. (Context has to be given as read-write to intel_miptree_supports_non_msrt_fast_clear() to support debug output). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-16 08:52:24 +02:00
Topi Pohjolainen	dd37b6aaa9	i965/gen9: Refactor msrt mcs initialization This will be re-used to initialize auxiliary buffers in lossless compression case. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-16 08:52:24 +02:00
Topi Pohjolainen	2bd58790e2	i965: Add a few assertions on lossless compression v2 (Ben): Use combination of msaa_layout and number of samples instead of introducing explicit type for lossless compression (intel_miptree_is_lossless_compressed()). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-16 08:52:24 +02:00
Topi Pohjolainen	56f29911ec	i965: Add a flag telling color resolve pass to ignore CCS_E v2 (Ben): Use combination of msaa_layout and number of samples instead of introducing explicit type for lossless compression (intel_miptree_is_lossless_compressed()). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-16 08:52:24 +02:00
Topi Pohjolainen	97f4ca90b8	i965: Add resolve option for lossless compression v2 (Ben): Use combination of msaa_layout and number of samples instead of introducing explicit type for lossless compression (intel_miptree_is_lossless_compressed()). Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-16 08:52:24 +02:00
Topi Pohjolainen	0e79bff957	i965: Allow fast clear to be used with lossless compression v2 (Ben): Use combination of msaa_layout and number of samples instead of introducing explicit type for lossless compression. v3 (Ben): Squash with "i965: Resolve color buffer also in lossless compression case" and clarify simple non-compressed fast clear case. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-16 08:52:24 +02:00
Topi Pohjolainen	4b801116d3	i965: Add helper for detecting lossless compression Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-16 08:52:23 +02:00
Topi Pohjolainen	36b7c0dad9	Revert "i965: Restore vbo after color resolve during brw_try_draw_prims()" This got pushed accidentally in the first place but wasn't reverted as it didn't regress piglit but instead fixed one newly introduced test exercising a corner case in the i965 driver. However, saving and restoring vertex buffer context is complicated and requires more thought. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94150 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Tapani Palli <tapani.palli@intel.com>	2016-02-16 08:52:14 +02:00
Ben Skeggs	33ace5544e	nvc0: initial support for GM20x GPUs Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:57:16 +10:00
Ben Skeggs	97fc3fd559	nvc0: implement support for maxwell texture headers Adds support for the new TIC layout that's present on Maxwell GPUs, heavily based on the code for the existing layout. This code is required for GM20x support. While GM10x supports the older layout still, this commit switches it to use the updated version instead. Piglit testing shows zero regressions on GM107. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:57:13 +10:00
Ben Skeggs	7333b0c20c	nvc0: import maxwell texture header definitions from rnndb Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:57:10 +10:00
Ben Skeggs	733c8f8c73	nv50-: split tic format specification We previously stored texture format information as it would appear in the TIC. We're about to support the new TIC layout that appeared with Maxwell, so it makes more sense to store the data in a split-out format. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:57:07 +10:00
Ben Skeggs	a928cbc205	nv50-: remove nv50_texture.xml.h Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:57:05 +10:00
Ben Skeggs	ff1af29dd9	nvc0: switch nvc0_tex.c to updated g80_texture.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:57:03 +10:00
Ben Skeggs	c999736c18	nvc0: switch nvc0_surface.c to updated g80_texture.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:57:02 +10:00
Ben Skeggs	63880dca12	nv50: switch nv50_tex.c to updated g80_texture.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:57:00 +10:00
Ben Skeggs	a15c08c95c	nv50: switch nv50_surface.c to updated g80_texture.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:58 +10:00
Ben Skeggs	59d93ad1be	nv50: switch nv50_state.c to updated g80_texture.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:56 +10:00
Ben Skeggs	1a45b7afb6	nv50-: switch nv50_formats.c to updated g80_texture.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:54 +10:00
Ben Skeggs	d5ac81295d	nv50: import updated g80_texture.xml.h from rnndb Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:52 +10:00
Ben Skeggs	7235b6250d	nv50-: remove nv50_defs.xml.h Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:50 +10:00
Ben Skeggs	b04b16754c	nv50-: switch nv50_formats.c to updated g80_defs.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:48 +10:00
Ben Skeggs	3444f83077	nv50-: improved macros to handle format specification Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:45 +10:00
Ben Skeggs	346d7a24ea	nv50-: separate vertex formats from surface format descriptions We've previously had identical naming between vertex and texture formats, so it mostly made sense to define these together. However, upcoming patches are going to transition the driver over to using updated texture header definitions using NVIDIA's naming, and this will no longer be the case. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:42 +10:00
Ben Skeggs	3e2dd50d81	nvc0: remove unnecessary includes Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:40 +10:00
Ben Skeggs	e8eda47898	nvc0: switch nvc0_tex.c to updated g80_defs.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:38 +10:00
Ben Skeggs	546ccf3f82	nvc0: switch nvc0_surface.c to updated g80_defs.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:36 +10:00
Ben Skeggs	0a0d8e4497	nv50: remove unnecessary include Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:33 +10:00
Ben Skeggs	9c4b7748db	nv50: switch nv50_transfer.c to g80_defs.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:31 +10:00
Ben Skeggs	577eeb7984	nv50: switch nv50_tex.c to updated g80_defs.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:29 +10:00
Ben Skeggs	114d41feb2	nv50: switch nv50_surface.c to updated g80_defs.xml.h Verified (binary diff) to produce identical code. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:27 +10:00
Ben Skeggs	413cc25753	nv50: import updated g80_defs.xml.h from rnndb Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-16 15:56:12 +10:00
Nicolai Hähnle	2de9317d5f	st/mesa: count shader images in MaxCombinedShaderOutputResources Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-15 22:22:34 -05:00
Ilia Mirkin	1edbe0157d	st/mesa: enable GL image extensions when backend supports them This enables ARB_shader_image_load_store and ARB_shader_image_size when the backend claims support for these. It will also implicitly enable the image component of ARB_shader_texture_image_samples. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-15 22:22:33 -05:00
Ilia Mirkin	2e0a84208b	st/mesa: convert GLSL image intrinsics into TGSI Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-15 22:22:33 -05:00
Ilia Mirkin	672257dc69	st/mesa: allow st_format.h to be included from C++ files Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-15 22:22:33 -05:00
Nicolai Hähnle	ef27190a34	st/mesa: set pipe_image_view layers correctly for 3D textures Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-15 22:22:33 -05:00
Nicolai Hähnle	f1b0bda6bc	st/mesa: call st_finalize_texture from image atoms Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-15 22:22:33 -05:00
Ilia Mirkin	78093167b1	st/mesa: add an image atom for shader images Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-15 22:22:33 -05:00
Ilia Mirkin	e2a1ec5f0f	tgsi: show textual format representation Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-15 22:22:33 -05:00
Ilia Mirkin	9fbfa1abb2	gallium: add PIPE_SHADER_CAP_MAX_SHADER_IMAGES Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-15 22:22:33 -05:00
Ilia Mirkin	bceff68114	gallium: make image views non-persistent objects Make them akin to shader buffers, with no refcounting/etc. Just used to pass data about the bound image in ->set_shader_images. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-15 22:22:33 -05:00
Ilia Mirkin	cfbf25ac8f	st/mesa: empty buffer binding if the buffer's not really there This can happen with 0-sized buffers. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-15 22:22:33 -05:00
Rhys Kidd	76e2af3dd4	docs: Document VC4_DEBUG envvar Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2016-02-15 17:13:52 -08:00
Rhys Kidd	aa82cc4b22	vc4: Add missing braces in initializer Silences the following GCC warning: mesa/src/gallium/drivers/vc4/vc4_qir_schedule.c: In function 'qir_schedule_instructions': mesa/src/gallium/drivers/vc4/vc4_qir_schedule.c:578:16: warning: missing braces around initializer [-Wmissing-braces] struct schedule_state state = { 0 }; ^ Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2016-02-15 17:13:52 -08:00
Rhys Kidd	c75ced3623	vc4: Correct typo setting 'handled_qinst_cond' Variable was previously always set to true. Accordingly, the later assert() served no active purpose. Found with GCC warning and code inspection: mesa/src/gallium/drivers/vc4/vc4_qpu_emit.c: In function 'vc4_generate_code': mesa/src/gallium/drivers/vc4/vc4_qpu_emit.c:315:22: warning: variable 'handled_qinst_cond' set but not used [-Wunused-but-set-variable] bool handled_qinst_cond = true; ^ Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2016-02-15 17:13:52 -08:00
Eric Anholt	655fa0f465	vc4: Don't treat conditional MOVs as raw MOV. The two consumers want to know that the destination will be exactly the source, which is not true if we might not set the destination. Signed-off-by: Eric Anholt <eric@anholt.net>	2016-02-15 17:13:52 -08:00
Timothy Arceri	00a1bd13b5	glsl: warn in GL as well as ES when varying not written Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93339	2016-02-16 11:15:43 +11:00
Ilia Mirkin	6d39075c06	docs: update GLES 3.1 section for recent nvc0 additions Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-15 17:43:37 -05:00
Ilia Mirkin	4360ba0caf	mesa: need to check resource and set length even if bufSize is 0 This fixes a number of dEQP tests, such as: dEQP-GLES31.functional.program_interface_query.buffer_limited_query.resource_query It was expecting the length to be set even in the bufSize == 0 case. Also _mesa_get_program_resourceiv does some error checking on the resource which should probably happen even in the bufSize == 0 case as well although there's no dEQP test for that. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-02-15 12:20:25 -05:00
Ben Widawsky	66c790720b	i965/bxt: Production thread counts v2: Forgot to squash in the comment removal Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-15 07:48:09 -08:00
Daniel Czarnowski	5d87a7c894	egl_dri2: NULL check for xcb_dri2_get_buffers_reply() Without the check, unsuccessful xcb_dri2_get_buffers_reply(...) causes segmentation fault in dri2_get_buffers. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-02-15 07:43:27 +02:00
Edward O'Callaghan	331f963b7e	nv50,nvc0: Remove duplicate logic from nvc0_set_framebuffer_state() We already have this logic in the gallium/util functions so lets reduce some entropy while here. V.2: Apply change to nv50 also as suggested by Samuel Pitoiset. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-02-14 23:56:54 +01:00
Samuel Pitoiset	cbf24a01dd	nv50: add missing PIPE_SHADER_CAP_SUPPORTED_IRS Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-14 22:56:02 +01:00
Kenneth Graunke	8122d21d15	i965: Fix gl_DrawID in the vec4 backend. brw_draw_upload.c uploads VertexID/InstanceID first, then DrawID. So we need to assign the attribute mapping in that order as well. Fixes the following Piglit tests with the vec4 backend: - arb_shader_draw_parameters-drawid vertexid - arb_shader_draw_parameters-drawid-indirect basevertex Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-02-14 13:24:07 -08:00
Brian Paul	816c987b67	mesa: move assertion in _mesa_cube_face_target() Fixes piglit arb_texture_view-sampling-2d-array-as-2d-layer regression. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94134 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-14 09:16:22 -07:00
Serge Martin	a4cff1859e	clover: fix build failure since `bfd695e` Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-14 11:00:29 +01:00
Kenneth Graunke	565aa69970	glsl: Fix overflow of ImageAccess[] array. The ImageAccess array is statically sized to MAX_IMAGE_UNIFORMS: GLenum ImageAccess[MAX_IMAGE_UNIFORMS]; There was no bounds checking ensuring we don't overflow. Passing in a shader with too many uniforms would cause writes to extend into other fields, such as sh->NumImages. Later linker checks already handle reporting an error when there are too many images, so just avoid corrupting structures here. This rearranges the logic a bit to look more like the sampler case. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 21:12:18 -08:00
Ilia Mirkin	6411444c36	mesa: default FixedSampleLocations to true when using a dummy image GL_ARB_texture_multisample and GLES 3.1 expect the initial value to be GL_TRUE. This fixes dEQP-GLES31.functional.state_query.texture_level.texture_2d_multisample_array.fixed_sample_locations_integer and a few related tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-02-13 23:41:28 -05:00
Jason Ekstrand	7410c60988	nir/types: Add more type constructor functions Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Jason Ekstrand	f05f576803	nir/types: Add a few more glsl_type_is_ functions Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Jason Ekstrand	914829f766	nir/types: Add helpers for working with sampler and image types Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Jason Ekstrand	d140b13fd5	nir/types: Add helpers for function types Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Jason Ekstrand	b9e94ad806	glsl/types: Expose glsl_struct_field and glsl_function_param to C Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Jason Ekstrand	954d46184f	glsl/types: Add a helper for getting image types Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Jason Ekstrand	95ea9f7708	glsl/types: Add support for function types SPIR-V has a concept of a function type that's used fairly heavily. We could special-case function types in SPIR-V -> NIR but it's easier if we just add support to glsl_types. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Jason Ekstrand	5ec6a65388	glsl/types: Add a bare "sampler" type This is to be used by SPIR-V for representing a sampler that isn't attached to any particular image. In SPIR-V, all of the interesting bits such as dimensionality, sampled type, etc. come from the image, the bare "sampler" type simply uses a sampled type of VOID and 0 values for the rest. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Jason Ekstrand	ac089126b9	glsl/types: Rename sampler_type to sampled_type It's a bit more descriptive since it is the base type that you get when you sample from it. Also, the next commit adds a bare "sampler" type and we need glsl_type::sampler_type available for a public static member. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-13 17:22:36 -08:00
Vinson Lee	4ed4c1d921	llvmpipe: Do not use barriers if not using threads. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94088 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-13 14:42:05 -08:00
Francisco Jerez	9e30d66b7c	i965: Reupload push and pull constants when we get new shader image unit state. Fixes several of the "dEQP-GLES31.functional.image_load_storeload_storesingle_layer" dEQP tests that use image formats we implement using untyped surface messages. Cc: mesa-stable@lists.freedesktop.org Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-13 14:33:32 -08:00
Samuel Pitoiset	40fcb6b9f9	i965: fix MAX_COMPUTE_SHARED_SIZE constant value MAX_COMPUTE_SHARED_SIZE should be set to 32768. This fixes a regression introduced in `be27f77` (mesa: do not use a constant for MAX_COMPUTE_SHARED_SIZE). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94139 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-13 23:13:31 +01:00
Samuel Pitoiset	7f0a19400e	nv50/ir: add missing SV_TID and SV_CTAID sysvals on GM107 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 22:26:38 +01:00
Samuel Pitoiset	d11266aa06	nv50/ir: add MEMBAR emission for GM107 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 22:06:15 +01:00
Alejandro Piñeiro	a150101125	docs: document MESA_GLES_VERSION_OVERRIDE envvar v2: Removed reference to FC not being an allowed suffix (Brian Paul) Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-13 20:21:06 +01:00
Samuel Pitoiset	b410ed9215	st/mesa: fix pipe_grid_info initializer Fixes MSVC build error which doesn't allow empty initializers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-13 17:08:24 +01:00
Samuel Pitoiset	628b0e8571	trace: add all compute related functions Changes from v3: - dump the TGSI compute program Changes from v2: - remove use of MALLOC() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 16:01:02 +01:00
Samuel Pitoiset	fe0b55f39e	st/mesa: implement limits for ARB_compute_shader According to the spec, this also increases the following minimum values: - MAX_COMBINED_TEXTURE_IMAGE_UNITS 96 (616), was 80 - MAX_UNIFORM_BUFFER_BINDINGS 72 (612), was 60 ARB_compute_shader is not enabled by default because images support is still not implemented yet. If you want to use it you need to set MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader. Changes from v2: - make use of the new PIPE_CAP_SHADER_SUPPORTED_IRS cap instead of enabling the extension when PIPE_CAP_COMPUTE is enabled. - query for PIPE_CAP_COMPUTE first - s/shader_supported_irs/compute_supported_irs/ - disable ARB_compute_shader and add a comment which explains why Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 16:01:02 +01:00
Samuel Pitoiset	8aa666981b	st/mesa: add compute program dispatch callbacks This state tracker implements DispatchCompute() and DispatchComputeIndirect(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 16:01:01 +01:00
Samuel Pitoiset	805d92e540	st/mesa: add state validation for compute shaders This binds atomics, constants, samplers, ssbos, textures and ubos. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 16:01:01 +01:00
Samuel Pitoiset	61c87cd2c0	st/mesa: add mappings for compute shader sysvals LOCAL_INVOCATION_ID, WORK_GROUP_ID and NUM_WORK_GROUPS are respectively mapped to THREAD_ID, BLOCK_ID and GRID_SIZE. Changes from v2: - add assertions in st_translate_program() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 16:01:00 +01:00
Samuel Pitoiset	e8db4e4e0a	st/mesa: keep track of shared memory declarations Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 16:01:00 +01:00
Samuel Pitoiset	dfa58f0ff0	st/mesa: add intrinsics for shared variables This adds GLSL intrinsics for load/store and atomic operations. Changes from v2: - use PROGRAM_MEMORY instead of PROGRAM_BUFFER Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 16:01:00 +01:00
Samuel Pitoiset	44e04dc809	st/mesa: add conversion for compute shaders According to the spec, there are no predefined inputs nor any fixed-function outputs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 16:01:00 +01:00
Samuel Pitoiset	7c79c1e3e2	st/mesa: add compute shader states Changes from v2: - use as much common code as possible (eg. st_basic_variant) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 16:00:54 +01:00
Samuel Pitoiset	08c46025c8	st/mesa: add a second pipeline for compute Compute needs a new and different validation path. Changes from v2: - make use of unreachable() instead of assert() when the pipeline is invalid - move the st_pipeline enumeration to st_context.h instead of st_api.h Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 15:51:17 +01:00
Samuel Pitoiset	a8328e3a50	tgsi/ureg: add shared variables support for compute shaders This introduces TGSI_FILE_MEMORY for shared, global and local memory. Only shared memory is currently supported. Changes from v2: - introduce TGSI_FILE_MEMORY Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 15:51:17 +01:00
Samuel Pitoiset	5e09ac78e5	gallium: add PIPE_SHADER_CAP_SUPPORTED_IRS This cap indicates the supported representations of programs. It should be a mask of pipe_shader_ir bits. It will allow to enable ARB_compute_shader if the underlying driver supports TGSI. Changes from v2: - improve description of PIPE_SHADER_CAP_SUPPORTED_IRS Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 15:51:17 +01:00
Samuel Pitoiset	43f4420fba	gallium: add indirect compute parameters to pipe_grid_info Like indirect draw, we need to store a resource and an offset that needs to be 4 byte aligned. When indirect is used, the size of the grid (in blocks) is stored with three 32-bit integers. Changes from v2: - s/most values/block sizes/ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 15:51:17 +01:00
Samuel Pitoiset	bfd695e1d2	gallium: add a new interface for pipe_context::launch_grid() This introduces pipe_grid_info which contains all information to describe a launch_grid call. This will be used to implement indirect compute in the same fashion as indirect draw. Changes from v2: - correctly initialize pipe_grid_info for nv50/nvc0 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 15:51:17 +01:00
Samuel Pitoiset	61ed09c7ea	gallium/cso: add support for compute shaders Changes from v2: - removed cso_{save,restore}_compute_shader() functions and the compute_shader_saved variable because disabling compute shaders for meta ops is not currently needed Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 15:51:17 +01:00
Samuel Pitoiset	ffd9c7fd74	mesa: add PROGRAM_MEMORY This will be used for shared, global and local memory areas. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 15:51:17 +01:00
Samuel Pitoiset	a9eb1327be	mesa: store shared size in gl_compute_program The size of shared variables needs to be stored in gl_compute_program in order to set up pipe_compute_state::req_local_mem. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 15:51:17 +01:00
Samuel Pitoiset	be27f772e8	mesa: do not use a constant for MAX_COMPUTE_SHARED_SIZE This will allow to query the underlying drivers for the maximum total storage size of all variables declared as <shared> with PIPE_COMPUTE_CAP_MAX_LOCAL_SIZE. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-13 15:51:17 +01:00
Ilia Mirkin	f2547883cf	mesa: make compute maximums reflect driver-provided values Looks like the various max's were never plumbed through. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-13 15:51:17 +01:00
Topi Pohjolainen	f709a08457	i965: Add means for limiting color resolves Until now there has been only one type of color buffer that needs to be resolved - namely single sampled fast clear. As even the sampler engine in GPU doesn't understand the associated meta data, the color values need to be always resolved prior to reading them. From SKL onwards there is a new scheme supported called the lossless compression of single sampled color buffers. This is something that is understood by the sampling engine and therefore resolving of these types of buffers is not necessary before sampling. This patch adds means to make the distinction when considering if resolve is needed. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-13 09:50:24 +02:00
Topi Pohjolainen	7513c5c782	i965: Refactor resolving of auxiliary mode Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-13 09:30:36 +02:00
Topi Pohjolainen	9002bcdb35	i965: Don't try to create aux buffer for non-msrt aux-buffer In addition to simply calling miptree_create() the higher level call intel_miptree_create() also considers if the buffer should be associated with an auxiliary buffer based on the given format. Here we are allocating an auxiliary buffer which in turn has such format that would mislead intel_miptree_create_layout() later on to try to associate the auxiliary buffer with an auxiliary buffer. To prevent this the actual buffer creation logic was split out into its own function. Lets invoke that instead. v2 (Ben): Do not signal msaa layout with explicit argument but using layout_flags instead. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-13 09:28:41 +02:00
Ben Widawsky	5743fd9571	i965: Rename optimizer debug 00 filename This allows ls, and scripts to get the file names in the correct order of optimization. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-02-12 20:52:28 -08:00
Kenneth Graunke	c8b0020f2f	i965: Make brw_clear_cache NULL out stale program pointers. The L3 partitioning code tries to look at all programs - both render programs (VS/TCS/TES/GS/FS) and compute (CS). After calling brw_clear_cache, all prog_data pointers are invalid and point to freed data. The intention was that flagging the dirty bits for all programs would cause the next draw call to re-run the atoms for each program stage, uploading new programs and installing new, valid pointers. However, this doesn't quite work in our new multi-pipeline world. When drawing or dispatching a compute workload, we only consider the programs for the appropriate pipeline: drawing sets up VS/TCS/TES/GS/FS, but not CS, and vice versa. This leaves pointers dangling a bit longer than intended. The L3 configuration code tries to inspect the prog_data for all shader stages, so that we avoid having to reconfigure it when swapping back and forth between render and compute workloads. So we can't have dangling pointers. The fix is simple: have brw_clear_cache NULL out stale prog_data pointers, making it safe to inspect. The next L3 configuration pass will see either the render shaders or compute shader as missing for one go around, but will pick them up when both pipelines have run. In other words, we'll simply reconfigure L3 twice, which is safe, if a tiny bit wasteful - but then again, we just threw every compiled shader we had on the floor and started recompiling them from scratch, which is massively more wasteful, so it's not much of a concern. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93790 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jljusten@gmail.com>	2016-02-12 20:35:34 -08:00
Ilia Mirkin	f56b5de877	mesa: avoid segfault in GetProgramPipelineInfoLog when no length If there is no pipe info log, we would unconditionally deref length, which was only optionally there. _mesa_copy_string handles the source being null, as well as the length, so may as well just always call it. Fixes a segfault in dEQP-GLES31.functional.state_query.program_pipeline.info_log Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-12 18:22:50 -05:00
Ilia Mirkin	f82ff6207c	mesa: reset offset/size to 0 when removing atomic binding Similar to commit `dd9d2963d6` (mesa: AtomicBufferBindings should be initialized to zero.), we should reset these to zero when unbinding. This fixes a number of dEQP failures due to cross-test pollution. The tests properly unbound everything, but when querying the values again, the expectation was that they would be 0. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-12 18:22:49 -05:00
Ilia Mirkin	b7e246d89a	mesa: recognize enums GL_COLOR_ATTACHMENT8-31 as valid Similar as for AUX1-3, these enums aren't invalid (i.e. -1) but also not supported by mesa. Returning BUFFER_COUNT causes the proper error to be returned by ReadBuffer and other functions. This resolves some failures in dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.read_buffer Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-12 18:22:49 -05:00
Ilia Mirkin	a663aa2a37	mesa/clear: update ClearBufferfv error handling for GL 4.5 spec This fixes dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.clear_bufferfv and brings the logic up to spec with GL 4.5 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-12 18:22:49 -05:00
Ilia Mirkin	3a0051bea9	mesa/clear: update ClearBufferuiv error handling for GL 4.5 spec This fixes dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.clear_bufferuiv and brings the logic up to spec with GL 4.5 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-12 18:22:49 -05:00
Ilia Mirkin	758162923b	mesa/clear: simplify ClearBufferiv error handling Might as well handle everything in the same error call. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-12 18:22:49 -05:00
Ilia Mirkin	86fd9d6b8e	mesa/clear: remove dead code handling ClearBufferiv(GL_DEPTH) There's a hunk above which sets INVALID_ENUM for GL_DEPTH unconditionally. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-12 18:22:48 -05:00
Ilia Mirkin	d33ef19479	mesa: allow DEPTH_STENCIL_TEXTURE_MODE queries in GLES 3.1 contexts This fixes dEQP-GLES31.functional.state_query.texture.texture_2d_multisample.depth_stencil_mode_integer and a few related tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-02-12 18:22:48 -05:00
Kenneth Graunke	2a0fc82864	i915: include teximage.h To get _mesa_num_tex_faces() prototype.	2016-02-12 15:20:29 -08:00
Brian Paul	320ccf710e	i965: include teximage.h To get _mesa_num_tex_faces() prototype.	2016-02-12 15:42:54 -07:00
Axel Davy	cc0114f30b	st/nine: Implement Managed vertex/index buffers We were implementing those the same way as the default pool, which is sub-optimal. The buffer is supposed to return a pointer to a ram copy when the user locks, and automatically update the vram copy when needed. v2: Rename NineBuffer9_Validate to NineBuffer9_Upload Rename validate_buffers to update_managed_buffers Initialize NineBuffer9 managed fields after the resource is allocated. In case of allocation failure, when the dtor is executed, This->base.pool is then rightfully set. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-12 23:26:36 +01:00
Axel Davy	77d6c11f8f	st/nine: Align stack for entry points For 32 bits, incoming stack is 4-byte aligned. We need to realign the stack to 16-byte at some point, or there are issues later (crash with SSE, llvm, etc). This patch chooses to align the stack at API entry points. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-12 23:26:36 +01:00
Axel Davy	d7a5468da9	st/nine: Drop path for ureg_NRM and ureg_CLAMP using MIN/MAX is fine instead of CLAMP. NRM doesn't exist anymore. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	6b43f5b1d4	st/nine: Remove usage of SQRT in ff code SQRT is not supported everywhere, so replace it by RSQ + MUL and handle case <= 0. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-12 23:26:36 +01:00
Axel Davy	17078d92ea	st/nine: Fix stateblocks crashes with lights We had several issues of crashes with it. This should fix it. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	6cba347530	st/nine: SCRATCH does support all formats Add new argument to d3d9_to_pipe_format_checked to be able to bypass format support checks. This argument is set to TRUE when the requested Pool is SCRATCH. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	dbcb4f46ad	st/nine: Add format checks to create_zs_or_rt_surface Returns INVALIDCALL when trying to create a surface of unsupported format. In practice, apps are supposed to check for format support before trying to create a render target of that format. However some bad behaving apps could just try to create the surface and deduce if it failed that it wasn't supported. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	3a2e0c7784	st/nine: Support ATI1/ATI2 for CubeTexture Texture and CubeTexture use common code, and thus ATI1/ATI2 is already implemented for CubeTexture. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	6c4774bbe4	st/nine: Clean pSharedHandle Texture ctors checks Clarify the behaviour and clean the checks Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	bb65b189f3	st/nine: Move texture creation checks We were having checks at both CreateTexture functions and in ctors. Move all CreateTexture checks to ctors. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	d973a525d3	st/nine: Clean useless code in texture9.c This->base.base.resource is worth NULL for SYSTEMMEM textures. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	36b4bb303c	st/nine: Do not set SHARED flag for shared textures. We do not support shared textures, thus no need to set the shared flag. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Axel Davy	77a5871c1d	st/nine: Do not set resource usage for SYSTEMMEM We do not create a resource for SYSTEMMEM textures, thus we do not need to set resource usage. The only exception is vertexbuffer SYSTEMMEM, since we do use a pipe resource for them. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-12 23:26:36 +01:00
Brian Paul	9675fb6c68	mesa: move _mesa_num_tex_faces() to teximage.h So it's near the other cube map helper functions. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-12 15:11:38 -07:00
Brian Paul	6e09df24b5	mesa: simplify some code with new _mesa_cube_face_target() function Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-12 15:11:38 -07:00
Brian Paul	82db969ac0	mesa: add _mesa_cube_face_target() helper Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-12 15:11:24 -07:00
Brian Paul	d73f5a3133	mesa: make _mesa_tex_target_to_face() an inline function Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-12 15:10:37 -07:00
Brian Paul	6a08673c5e	mesa: remove _ARB suffix from cube map enums Just minor clean-up so we're consistent everywhere. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-12 15:10:15 -07:00
Brian Paul	ae70d0d68c	docs: Visual Studio 2013 or later is now required	2016-02-12 15:08:35 -07:00
Timothy Arceri	4e59362d1b	glsl: replace _strtoui64() with strtoull() for MSVC Now that MSVC 2013 is required we can remove this. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-13 08:57:01 +11:00
Jose Fonseca	950da38164	mesa: Use _aligned_malloc/free for MinGW too. We already use these for gallium in src/gallium/auxiliary/os/os_memory_stdc.h and it's always better to minimize divergences between MinGW and MSVC. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-12 14:51:28 +00:00
Jose Fonseca	c69ef377c8	mesa: Remove support for MSVC2008. Spotted by Emil Velikov. Trivial.	2016-02-12 10:31:15 +00:00
Jose Fonseca	5bc8d34526	util/u_atomic: Remove MSVC 2008 support. Spotted by Emil Velikov. Trivial.	2016-02-12 10:31:15 +00:00
Topi Pohjolainen	30711d984f	i965: Stop considering if msrt aux buffers need aux buffer Auxiliary buffers are always created with sample number of zero which effectively prevents intel_miptree_create_layout() from trying to associate auxiliary buffers with auxiliary buffers. Now that there is more direct path available lets start using it instead and stop even checking for such (im)possibility. v2 (Ben): Do not signal msaa layout with explicit argument but using layout_flags instead. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-12 09:17:29 +02:00
Topi Pohjolainen	422b1386d7	i965: Separate miptree creation from auxiliary buffer setup Currently the logic allocating and setting up miptrees is closely combined with decision making when to re-allocate buffers in X-tiled layout and when to associate colors with auxiliary buffers. These auxiliary buffers are in turn also represented as miptrees and are created by the same miptree creation logic calling itself recursively. This means considering in vain if the auxiliary buffers should be represented in X-tiled layout or if they should be associated with auxiliary buffers again. While this is somewhat unnecessary, this doesn't impose any problems currently. Miptrees for auxiliary buffers are created as single-sampled fusing the consideration for multi-sampled compression auxiliary buffers. The format in turn is such that is not applicable for single-sampled fast clears (that would require accompanying auxiliary buffer). But once the driver starts to support lossless compression of color buffers the auxiliary buffer will have a format that would itself be applicable for lossless compression. This would be rather difficult and ugly to detect in the current miptree creation logic, and therefore this patch seeks to separate the association logic from the general allocation and setup steps. v2 (Ben): - Do not reconsider for X-tiling in intel_miptree_create() as it was just forced to Y-tiling in miptree_create(). - Do not drop checks for allocation failures. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-02-12 09:13:07 +02:00
Topi Pohjolainen	d089f2d932	i965: Isolate aligned dimensions for stencil only This makes the logic a little more explicit and helps to keep subsequent patches easier to read. Suggested-by: Ben Widawsky <benjamin.widawsky@intel.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-02-12 09:13:07 +02:00
Topi Pohjolainen	0dcd9a09d1	i965: Restore vbo after color resolve during brw_try_draw_prims() Part of brw_try_draw_prims() is a check to validate textures (brw_validate_textures()). In case of textures that currently have only level zero but are marked for mipmap generation, i965 driver will decide to replace the underlying buffer with a larger one capable of holding also the additional levels. This results in a blit from the original buffer to the newly allocated (see intel_miptree_copy_teximage()). This blit is currently handled with blitter engine and hence it won't affect the ongoing draw operation. However, this blit in turn may trigger color resolve on the source buffer. In principle, this should be possible with fast cleared buffers but I only started hitting it when I enabled lossless compression (that requires similar resolve to fast cleared buffers). Now, the color resolve is a meta operation and uses the same drawing path we are already in the middle of. After quite a bit of debugging I realized that the resolve will modify the current vbo setup but it won't restore it afterwards resulting in the original draw call using wrong vertex data. When brw_try_draw_prims() gets called, the vbo logic in the Mesa core (see vbo_draw_arrays()) has just bound the vbo (see vbo_bind_arrays() and recalculate_input_bindings()). Color resolve operation will overwrite the vbo setup by calling vbo_bind_arrays() against the resolve rectangle (see brw_draw_rectlist()). Once the color resolve is done the vbo setup is left to the resolve rectangle state and the original drawing call yields bogus results. This patch aims to restore the original state after the color resolve by calling vbo_bind_arrays() yet again after the vertex array state in the core context has been restored. Now having said all this, I'd also like to state that I'm quite uncomfortable with the nested meta operations. The original draw call in this case is in fact a meta operation itself. It is a blit from level zero to level one when generating the additional mipmap levels (see _mesa_meta_GenerateMipmap()). Imagine the complexity if the blit in the middle from buffer to another would go to meta path also instead of blitter. I would be very tempted to try to move all the resolves to happen before a meta operation is started. Additionally I still feel that work I did earlier in the spring/summer time moving meta operations to use direct state upload bypassing the core context would make sense. v2: Force input recalculation by setting the flag explicitly v3: Do not attempt to restore vbo for opengles1 which doesn't support vertex buffer objects. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-02-12 09:13:07 +02:00
Topi Pohjolainen	779429d063	i965: Validate textures before altering driver state Validation may kick off copies and subsequently color resolves. Color resolves (and the copies themselves if ending up in meta path) will overwrite the internal driver state but are not prepared to restore it. Instead of adding that capability the validation can be simply performed before the state is updated. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-02-12 09:13:07 +02:00
Kenneth Graunke	76f6f59c6e	i965: Make brw_clear_cache flag all the bits on both pipelines. Setting brw->ctx.NewDriverState and brw->ctx.NewGLState affects the dirty bits for the current pipeline. But, we need to flag everything dirty on both pipelines, so that when we switch back, we'll realize our programs are stale and re-upload them. To accomplish this, flag the saved state for both pipelines. Only one of them should matter, but this way we don't have to check which we need to set. It's harmless to set the other. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93790 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-11 22:53:19 -08:00
Samuel Iglesias Gonsálvez	61ceb36ead	glsl: Allow invariant qualifier in block members in desktop OpenGL. Feedback from Khronos is that 'invariant' should be allowed on block members for desktop OpenGL. Fix piglit regression added by `fe1e89a0`: invariant-qualifier-in-out-block-01.vert v2: - Allow it for in/out blocks in OpenGL ES too, so when OES_shader_io_blocks is supported we don't need to do any change (Timothy) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89330 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-12 07:20:47 +01:00
Kenneth Graunke	e9644cb1f9	i965: Consider tessellation in get_pipeline_state_l3_weights. I think this was just missed; Curro and I were probably writing code simultaneously and forgot to combine them at the end. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-11 19:15:17 -08:00
Kenneth Graunke	f275c61c30	i965: Split brw_upload_texture_surfaces into compute/render atoms. When uploading state for the compute pipeline, we don't want to look at VS/TCS/TES/GS/FS programs, as they might be stale, and aren't relevant anyway. Likewise, the render pipeline shouldn't look at CS. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93790 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-11 19:15:08 -08:00
Marek Olšák	f3943614ff	radeonsi: fix build with LLVM 3.6 Broken by this cleanup: `3dc1cb0cc7` Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-12 00:41:36 +01:00
Jason Ekstrand	9f8c01b03c	i965/gs: Pass VerticesIn through prog_data Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-11 15:07:20 -08:00
Jason Ekstrand	56eb9c44ad	i965/fs: Pass usage of depth, W, and sample mask through prog_data We really need to stop pulling information directly out of shaders for state setup. For one thing, if we want any sort of an on-disk shader cache, having all of this metadata in one place is going to be crucial. Also, passing it all through prog_data cleans up the compiler <-> state setup API substantially. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-11 15:07:20 -08:00
Jason Ekstrand	ae3543950c	i965/fs: Refactor setup_payload_gen6 to assume FS It's extremely FS specific so the fact that we have a stage check in the middle of it is rather bogus. While we're here, we rename setup_payload_gen4 and setup_payload_gen6 to make it obvious that they are both FS specific. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-11 15:07:20 -08:00
Samuel Pitoiset	d759f0ddf1	nv50,nvc0: remove unused parameter in nvXX_state_validate() This 'words' parameter is there since 2011 but it has never been used. While we are at it, get rid of the extern declaration. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-11 23:14:16 +01:00
Timothy Arceri	b600247035	glsl: don't validate interface blocks twice We already check for opaque types so don't recheck for atomics and images. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-12 09:12:23 +11:00
Timothy Arceri	98d3cc9fbc	glsl: remove duplicate embedded struct validation Commit `c98deb18d5` in 2010 disallowed embedded struct definitions in ES. Then in 2013 `d9bb8b7b56` disallowed it for everything but GLSL 1.10. Commit `c98deb18d5` seemed the cleanest way to do the check so it's been extended to cover GL and the other version has been removed. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-12 09:06:49 +11:00
Jose Fonseca	0d4898ae80	include,gallium: Remove pre-MSVC 2013 compatibility. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-11 21:36:00 +00:00
Jose Fonseca	a97a955b92	scons: Eliminate MSVC2008 compatibility. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-11 21:36:00 +00:00
Jose Fonseca	1cadfe08c4	configure: Eliminate MSVC2008 compatibility. We no longer need to build any part of Mesa with Windows SDK 7.0.7600 or MSVC 2008. MSVC 2013 will be the oldest we support. In practice this means people are now free to declare variables in the middle of blocks, on the whole Mesa tree. Care should still be taken with variable length arrays and void pointer arithmetic. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Hella-acked-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-11 21:36:00 +00:00
Chris Forbes	a2c8b5ece5	i965: ir: dump floats as %-g rather than %f, so we can see denormals Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-11 12:10:29 -08:00
Jordan Justen	9f36070c2f	i965/gen7: Require kernel cmd_parser 5 for ARB_compute_shader The indirect dispatch registers were whitelisted in command parser version 5. (Version 5 is available as of Linux 4.4) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-11 10:49:13 -08:00
Marek Olšák	a8aa73f768	st/mesa: release GLSL IR in LinkShader after it's not needed Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-11 17:31:40 +01:00
Marek Olšák	906ecab450	mesa: call build_program_resource_list inside Driver.LinkShader to allow LinkShader to free the GLSL IR. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-11 16:56:28 +01:00
Marek Olšák	0f235c960c	st/mesa: use correct pipe functions to create tess shaders Broken by one of my cleanups. Spotted by luck. Radeonsi doesn't care, because all shader create callbacks go to the same function. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-11 16:56:28 +01:00
Marek Olšák	100796c15c	gallium/radeon: drop support for LLVM 3.5 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> v2: adjust the comment in the amdgpu winsys	2016-02-11 16:48:30 +01:00
Marek Olšák	3dc1cb0cc7	radeonsi: obtain commonly used LLVM types only once Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-11 16:48:30 +01:00
Marek Olšák	1643dca513	radeonsi: cleanup shader codegen si_shader_ctx -> ctx type * ptr -> type ptr si_shader_context shader -> si_shader_context *ctx Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-11 16:48:30 +01:00
Marek Olšák	1c8a1a8fed	radeonsi: fix a crash when binding a sampler buffer Buffers don't contain r600_texture. Broken by `7aedbbacae`: "radeonsi: put image, fmask, and sampler descriptors into one array" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94091	2016-02-11 16:48:30 +01:00
Emil Velikov	0f3cea95ab	docs: add news item and link release notes for 11.1.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-11 01:47:16 +00:00
Emil Velikov	0802afd92d	docs: add sha256 checksums for 11.1.2 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `e49dd21bcb`)	2016-02-11 01:45:27 +00:00
Emil Velikov	323782aa57	docs: add release notes for 11.1.2 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `7bcd827806`)	2016-02-11 01:45:25 +00:00
Jason Ekstrand	8750299a42	nir: Remove the const_offset from nir_tex_instr When NIR was originally drafted, there was no easy way to determine if something was constant or not. The result was that we had lots of special-casing for constant values such as this. Now that load_const instructions are SSA-only, it's really easy to find constants and this isn't really needed anymore. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Rob Clark <robclark@gmail.com>	2016-02-10 16:33:50 -08:00
Jason Ekstrand	70dff4a55e	nir/lower_vec_to_movs: Better report channels handled by insert_mov This fixes two issues. First, we had a use-after-free in the case where the instruction got deleted and we tried to return mov->dest.write_mask. Second, in the case where we are doing a self-mov of a register, we delete those channels that are moved to themselves from the write-mask. This means that those channels aren't reported as being handled even though they are. We now stash off the write-mask before removing unneeded channels so that they still get reported as handled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94073 Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-02-10 16:33:14 -08:00
Marek Olšák	6ee1c386fe	radeonsi: don't emit unnecessary NULL exports for unbound targets (v3) v2: remove semantic index == 0 checks add the else statement to remove shadowing of args v3: fix fbo-alphatest-nocolor regression Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v2)	2016-02-10 23:53:17 +01:00
Ben Widawsky	088280e022	i965: Make sure we blit a full compressed block This fixes an assertion failure in [at least] one of the Unreal Engine Linux demo/games that uses DXT1 compression. Specifically, the "Vehicle Game". At some point, the game ends up trying to blit a mip level whose size is 2x2, which is smaller than a DXT1 block. As a result, the assertion in the blit path is triggered. It should be safe to simply make sure we align the width and height, which is sadly an example of compression being less efficient. NOTE: The demo seems to work fine without the assert, and therefore release builds of mesa wouldn't stumble over this. Perhaps there is some unnoticeable corruption, but I had trouble spotting it. Thanks to Jason for looking at my backtrace and figuring out what was going on. v2: Use NPOT alignment to make sure ASTC is handled properly (Ilia) Remove comment about how this doesn't fix other bugs, because it does. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93358 Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Tested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-10 14:08:46 -08:00
Marek Olšák	79d0082c64	radeon/uvd: silence a warning	2016-02-10 20:16:17 +01:00
Marek Olšák	d9c8a8fe61	r300g: silence warnings	2016-02-10 20:16:17 +01:00
Ian Romanick	0ecc9d907e	meta/decompress: Don't pollute the renderbuffer namespace tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Fixes piglit 'object-namespace-pollution glGetTexImage-compressed renderbuffer' test. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:55 -08:00
Ian Romanick	3aeff21fbf	meta: Use internal functions for renderbuffer access Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:53 -08:00
Ian Romanick	4087c17832	meta/decompress: Track renderbuffer using gl_renderbuffer instead of GL API object handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:50 -08:00
Ian Romanick	47a5aa4bfa	i965/meta: Don't pollute the renderbuffer namespace tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:47 -08:00
Ian Romanick	03506c9ef1	i965/meta: Use internal functions for renderbuffer access Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:44 -08:00
Ian Romanick	4c6b0e017c	i965/meta: Return struct gl_renderbuffer* from brw_get_rb_for_slice instead of GL API handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:42 -08:00
Ian Romanick	ab2b631703	meta: Don't save or restore the renderbuffer binding Nothing left in meta does anything with the RBO binding, so we don't need to save or restore it. The FBO binding is still modified. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:40 -08:00
Ian Romanick	e273bbd60b	meta: Use _mesa_CreateRenderbuffers instead of _mesa_GenRenderbuffers and _mesa_BindRenderbuffer This has the advantage that it does not pollute the global binding state. It also enables later patches that will stop calling _mesa_GenRenderbuffers / _mesa_CreateRenderbuffers which pollute the renderbuffer namespace. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:36 -08:00
Ian Romanick	1e055e9211	i965/meta: Use _mesa_CreateRenderbuffers instead of _mesa_GenRenderbuffers and _mesa_BindRenderbuffer This has the advantage that it does not pollute the global binding state. It also enables later patches that will stop calling _mesa_GenRenderbuffers / _mesa_CreateRenderbuffers which pollute the renderbuffer namespace. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:33 -08:00
Ian Romanick	eb5bc62e97	mesa: Refactor renderbuffer_storage to make _mesa_renderbuffer_storage Pulls the parts of renderbuffer_storage that aren't just parameter validation out into a function that can be called from other parts of Mesa (e.g., meta). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:31 -08:00
Ian Romanick	9ae42ab1ec	mesa: Refactor _mesa_framebuffer_renderbuffer This function previously was only used in fbobject.c and contained a bunch of API validation. Split the function into framebuffer_renderbuffer that is static and contains the validation, and _mesa_framebuffer_renderbuffer that is suitable for calling from elsewhere in Mesa (e.g., meta). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-02-10 10:59:28 -08:00
Marek Olšák	7aedbbacae	radeonsi: put image, fmask, and sampler descriptors into one array The texture slot is expanded to 16 dwords containing 2 descriptors. Those can be: - Image and fmask, or - Image and sampler state By carefully choosing the locations, we can put all three into one slot, with the fmask and sampler state being mutually exclusive. This improves shaders in 2 ways: - 2 user SGPRs are unused, shaders can use them as temporary registers now - each pair of descriptors is always on the same cache line v2: cosmetic changes: add back v8i32, don't load a sampler state & fmask at the same time Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-10 19:41:49 +01:00
Marek Olšák	796ee76e2e	winsys/radeon: fix the num_tile_pipes comment to silence warnings	2016-02-10 19:41:49 +01:00
Alexandre Demers	111602e159	winsys/radeon: better explain the num_tile_pipes fixup for TAHITI (v2) v2: Clarify the relation between num_tiles_pipes and GB_TILE_MODE and the fix needed for Tahiti as suggested by Marek. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-02-10 19:29:41 +01:00
Samuel Pitoiset	5e8db898fd	st/mesa: check ureg_create() retval in create_pbo_upload_vs() This avoids a possible NULL dereference because ureg_create() might return a NULL pointer. Spotted by coverity. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-10 18:26:20 +01:00
Bernhard Rosenkränzer	e86ba7844f	freedreno/ir3: Get rid of nested functions This allows building Freedreno with clang Signed-off-by: Bernhard Rosenkränzer <bero@linaro.org> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-10 11:26:48 -05:00
Chris Forbes	43d23e879c	i965/blorp: Fix hiz ops on MSAA surfaces Two things were broken here: - The depth/stencil surface dimensions were broken for MSAA. - Sample count was programmed incorrectly. Result was the depth resolve didn't work correctly on MSAA surfaces, and so sampling the surface later produced garbage. Fixes the new piglit test arb_texture_multisample-sample-depth, and various artifacts in 'tesseract' with msaa=4 glineardepth=0. Fixes freedesktop bug #76396. Not observed any piglit regressions on Haswell. v2: Just set brw_hiz_op_params::dst.num_samples rather than adding a helper function (Ken). Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> v3: moved the alignment needed for hiz+msaa to brw_blorp.cpp, as suggested by Chad Versace (Alejandro Piñeiro on behalf of Chris Forbes) Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com> Tested-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-02-10 09:00:05 +01:00
Topi Pohjolainen	878b2b8964	i965/gen8: Remove dead assertion The assertion is inside a condition mandating num_samples > 1 and therefore the first half of the constraint is always met. The second half in turn would only be applicable for single sampled case and moreover it is trying to falsely check against surface type instead of format. Subsequent patches will introduce proper support for the lossless compression and dropping this here makes the patches a little simpler. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-10 09:11:34 +02:00
Topi Pohjolainen	3c432d48bf	i965: Use constant pointer when checking for compression Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-10 09:10:45 +02:00
Brian Paul	85fab1f09a	mesa: fix trivial comment typo in dlist.c	2016-02-09 20:09:30 -07:00
Kenneth Graunke	85f5c18fef	i965/vec4: Drop support for ATTR as an instruction destination. This is no longer necessary...and it doesn't make much sense to have inputs as destinations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-02-09 17:01:45 -08:00
Kenneth Graunke	67c5d00273	i965/vec4/gs: Stop munging the ATTR containing gl_PointSize. gl_PointSize is delivered in the .w component of the VUE header, while the language expects it to be a float (and thus in the .x component). Previously, we emitted MOVs to copy it over to the .x component. But this is silly - we can just use a .wwww swizzle and access it without copying anything or clobbering the value stored at .x (which admittedly is useless). Removes the last use of ATTR destinations. v2: Use BRW_SWIZZLE_WWWW, not SWIZZLE_WWWW (caught by GCC). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-02-09 17:01:45 -08:00
Kenneth Graunke	d56ae2d160	i965: Apply VS attribute workarounds in NIR. This patch re-implements the pre-Haswell VS attribute workarounds. Instead of emitting shader code in the vec4 backend, we now simply call a NIR pass to emit the necessary code. This simplifies the vec4 backend. Beyond deleting code, it removes the primary use of ATTR as a destination. It also eliminates the requirement that the vec4 VS backend express the ATTR file in terms of VERT_ATTRIB_* locations, giving us a bit more flexibility. This approach is a little different: rather than munging the attributes at the top, we emit code to fix them up when they're accessed. However, we run the optimizer afterwards, so CSE should eliminate the redundant math. It may even be able to fuse it with other calculations based on the input value. shader-db does not handle non-default NOS settings, so I have no statistics about this patch. Note that the scalar backend does not implement VS attribute workarounds, as they are unnecessary on hardware which allows SIMD8 VS. v2: Do one multiply for FIXED rescaling and select components from either the original or scaled copy, rather than multiplying each component separately (suggested by Matt Turner). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-02-09 17:01:45 -08:00
Brian Paul	cac54d7987	st/mesa: clarify some texture target code in st_cb_drawpix.c Use st->internal_target instead of PIPE_TEXTURE_2D when choosing the texture format. Probably no real difference, but let's be consistent. Simplify a test when determining whether we need normalized texcoords. Add a new assertion. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-09 17:48:26 -07:00
Brian Paul	5e4de781fa	st/mesa: fix bitmap texture target code and simplify tex sampler state Bitmaps may be drawn with a PIPE_TEXTURE_2D or PIPE_TEXTURE_RECT resource as determined at context creation by checking if PIPE_CAP_NPOT_TEXTURES is supported. But many places in the bitmap code were hard-coded to use PIPE_TEXTURE_2D. Use st->internal_target instead. I think an older NV chip is the only case where a gallium driver does not support NPOT textures. Bitmap drawing was probably broken for that GPU. Also, we only need one sampler state with texcoord normalization set up according to st->internal_target. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-09 17:48:25 -07:00
Brian Paul	9e2a9d5743	st/mesa: use MAX3() macro, as we do for sampler view code below Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-09 17:48:25 -07:00
Brian Paul	a5b8ede253	st/mesa: move some st_cb_drawpixels.c code, add comments	2016-02-09 17:47:42 -07:00
Nanley Chery	c624241ef4	mesa/readpix: Dedent former _mesa_readpixels() if block Formatting patch split out for easy reviewing. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-09 15:13:07 -08:00
Nanley Chery	b89a8a15c2	mesa/readpix: Don't clip in _mesa_readpixels() The clipping is performed higher up in the call-chain. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-09 15:13:07 -08:00
Nanley Chery	605832736a	mesa/readpix: Clip ReadPixels() area to the ReadBuffer's The fast path for Intel's ReadPixels() unintentionally omits clipping the specified area to a valid one. Rather than clip in various corner-cases, perform this operation in the API validation stage. The bug in intel_readpixels_tiled_memcpy() showed itself when the winsys ReadBuffer's height was smaller than the one specified by ReadPixels(). yoffset became negative, which was an invalid input for tiled_to_linear(). v2: Move clipping to validation stage (Jason) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92193 Reported-by: Marta Löfstedt <marta.lofstedt@intel.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-09 15:13:07 -08:00
Nanley Chery	55d56d34e0	mesa/image: Make _mesa_clip_readpixels() work with renderbuffers v2: Use gl_renderbuffer::{Width,Height} (Jason) Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-09 15:13:07 -08:00
Jason Ekstrand	d03e5d5255	i965/vec4: Plumb separate surfaces and samplers through from NIR Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	f88027f7bd	i965/vec4: Separate the sampler from the surface in generate_tex Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	b8ab9c8c86	i965/fs: Plumb separate surfaces and samplers through from NIR Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	c0c14de130	i965/fs: Separate the sampler from the surface in generate_tex Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	a37b8110c1	i965/fs: Add an enum for keeping track of texture instruction sources These logical texture instructions can have a lot of sources. It's much safer if we have symbolic names for them. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	5ec456375e	nir: Separate texture from sampler in nir_tex_instr This commit adds the capability to NIR to support separate textures and samplers. As it currently stands, glsl_to_nir only sets the texture deref and leaves the sampler deref alone as it did before and nir_lower_samplers assumes this. Backends can still assume that they are combined and look only at the texture index. Or, if they wish, they can assume that they are separate because nir_lower_samplers, tgsi_to_nir, and prog_to_nir all set both texture and sampler index whenever a sampler is required (the two indices are the same in this case). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	ee85014b90	nir/tex_instr: Rename sampler to texture We're about to separate the two concepts. When we do, the sampler will become optional. Doing a rename first makes the separation a bit more safe because drivers that depend on GLSL or TGSI behaviour will be fine to just use the texture index all the time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 15:00:17 -08:00
Jason Ekstrand	3f42184994	nir: Add some braces around loops and ifs	2016-02-09 15:00:17 -08:00
Kenneth Graunke	830b075e86	i965: Explicitly write the "TR DS Cache Disable" bit at TCS EOT. Bit 0 of the Patch Header is "TR DS Cache Disable". Setting that bit disables the DS Cache for tessellator-output topologies resulting in stitch-transition regions (but leaves it enabled for other cases). We probably shouldn't leave this to chance - the URB could contain garbage - which could result in the cache randomly being turned on or off. This patch makes the final EOT write 0 to the first DWord (which only contains this one bit). This ensures the cache is always on. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-02-09 14:54:26 -08:00
Rob Clark	8b0fb1c152	freedreno/ir3: use const_index helpers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-09 17:30:33 -05:00
Rob Clark	ced8d3e773	nir: use const_index helpers Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-09 17:30:33 -05:00
Rob Clark	6921762de6	ptn: use const_index helpers Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-09 17:30:33 -05:00
Rob Clark	ead05e8670	ttn: use const_index helpers Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-09 17:30:33 -05:00
Rob Clark	b1770235ed	ttn: small logic cleanup The only case where dim!=NULL is where op==load_ubo. But using op==load_ubo is less confusing. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-09 17:30:33 -05:00
Rob Clark	b6cf98bc82	gtn: use const_index helpers Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-09 17:30:33 -05:00
Rob Clark	1df3ecc1b8	nir: const_index helpers Direct access to intr->const_index[n], where different slots have different meanings, is somewhat confusing. Instead, let's put some extra info in nir_intrinsic_infos[] about which slots map to what, and add some get/set helpers. The helpers validate that the field being accessed (base/writemask/etc) is applicable for the intrinsic opc, for some extra safety. And nir_print can use this to dump out decoded const_index fields. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-02-09 17:30:33 -05:00
Kenneth Graunke	8b0f6de73d	glsl: Disallow transform feedback varyings with compute shaders. If the only stage is MESA_SHADER_COMPUTE, we should complain that there's nothing coming out of the geometry shader stage just as we would if the first stage were MESA_SHADER_FRAGMENT. Also, it's valid for tessellation shaders to be the stage producing transform feedback varyings, so mention those in the compiler error. Found by inspection. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-09 12:34:11 -08:00
Marek Olšák	329181ae33	radeonsi: enable denorms for 64-bit and 16-bit floats This fixes FP16 conversion instructions for VI, which has 16-bit floats, but not SI & CI, which can't disable denorms for those instructions. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	17fe3fa312	gallium: pass the robust buffer access context flag to drivers radeonsi will not do bounds checking for loads if this is not set. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	d611fce23d	gallium/radeon: add a function for adding llvm function attributes This will be used for setting the new InitialPSInputAddr attribute. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	de2e28366a	radeonsi: compile geometry shaders immediately they have only 1 variant Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	f7a8b6fff5	radeonsi: split out code for deleting si_shader Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	e21142087c	radeonsi: move code writing tess factors into a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	dc5fc3c2f6	radeonsi: make LLVM IR dumping less messy Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	c1041366db	radeonsi: move a few r600_can_dump_shader calls to where they're needed Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	b6d5666fbf	radeonsi: remove useless code that handles dx10_clamp_mode "enable-no-nans-fp-math" is a wrong string and there was a disagreement about fixing it. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	57271d5364	radeonsi: dump SPI_PS_INPUT values along with shader stats Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	5a53628f45	radeonsi: read SPI_PS_INPUT_ADDR from LLVM if it returns it Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	9483fcc7f2	radeonsi: don't force gl_SampleMaskIn to 1 for smoothing Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	c379c2540b	radeonsi: split PS input interpolation code into its own function This will be used by the fragment shader prolog. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	b9126dcda8	radeonsi: implement forcing per-sample_interpolation using the shader key only It was partly a state and partly emulated by shader code, but since we want to do this in a fragment shader prolog, we need to put it into the shader key, which will be used to generate the prolog. This also removes the spi_ps_input states and moves the registers to the PS state. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	4596f3c1b8	radeonsi: remove si_shader::ps_input_interpolate tgsi_shader_info has this too. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	6dda2455c8	radeonsi: move BCOLOR PS input locations after all other inputs BCOLOR inputs were immediately after COLOR inputs. Thus, all following inputs were offset by 1 if color_two_side was enabled, and not offset if it was not enabled, which is a variation that's problematic if we want to have 1 variant per shader and the variant doesn't care about color_two_side (that should be handled by other bytecode attached at the beginning). Instead, move BCOLOR inputs after all other inputs, so BCOLOR0 is at location "num_inputs" if it's present. BCOLOR1 is next. This also allows removing si_shader::nparam and si_shader::ps_input_param_offset, which are useless now. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	606e4185f3	radeonsi: move SPI_PS_INPUT_CNTL value computation to a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	90cbbe1c12	radeonsi: generate a color_two_side variant only if the shader reads colors Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	4bbbaaf191	radeonsi: move si_shader_context initialization into a separate function This will be re-used later. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	a3e9a5f9f8	st/mesa: remove st_is_program_native The default scenario sets GL_TRUE too. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-09 21:19:51 +01:00
Marek Olšák	7046c588eb	st/mesa: unify destroy_program_variants cases for TCS, TES, GS Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-09 21:19:50 +01:00
Marek Olšák	75be3ee9f9	st/mesa: unify get_variant functions for TCS, TES, GS Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-09 21:19:50 +01:00
Marek Olšák	b8d31fdedf	st/mesa: unify variants and delete functions for TCS, TES, GS no difference between those Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-09 21:19:50 +01:00
Brian Paul	fe14110f35	mesa: fix incorrect viewport position when GL_CLIP_ORIGIN = GL_LOWER_LEFT Ilia Mirkin found/fixed the mistake. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93813 Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-09 11:27:48 -07:00
Brian Paul	0193e20df5	mesa: rewrite save_CallLists() code When glCallLists() is compiled into a display list, preserve the call as a single glCallLists rather than 'n' glCallList calls. This will matter for an upcoming display list optimization project. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-09 11:27:48 -07:00
Brian Paul	711d5347cf	mesa: add missing error check in _mesa_CallLists() Generate GL_INVALID_VALUE if n < 0. Return early if n==0 or lists==NULL. v2: fix formatting, also check for lists==NULL. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-02-09 11:27:48 -07:00
Brian Paul	b1ddc03633	mesa: whitespace clean-ups in dlist.h And remove 'extern' qualifiers.	2016-02-09 11:27:48 -07:00
Brian Paul	7d18faf8e7	st/mesa: don't allocate bitmap drawing state until needed Most apps don't use glBitmap so don't allocate the bitmap cache or gallium state objects/shaders/etc until the first call to st_Bitmap(). v2: simplify a conditional, per Gustaw Smolarczyk. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 11:27:48 -07:00
Brian Paul	a5799de3dc	st/mesa: move the setup_bitmap_vertex_data() code into draw_bitmap_quad() Now all the code to setup the vertex data and draw it is in one place. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 11:27:48 -07:00
Brian Paul	130d34ce65	st/mesa: refactor some bitmap drawing code Move setup/restoration of rendering state into helper functions. This makes the draw_bitmap_quad() function much more concise. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 11:27:47 -07:00
Ilia Mirkin	922be4eab9	mesa: remove hack to fix up GL_ANY_SAMPLES_PASSED results Both st/mesa and i965 should return a true/false result now, and the only other driver implementing queries (radeon) doesn't support ARB_occlusion_query2 which added that pname. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 11:59:35 -05:00
Ilia Mirkin	7aca4bb9b1	st/mesa: make use of the occlusion predicate query Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-09 11:59:35 -05:00
Ilia Mirkin	50235ab3ab	nv50: add PIPE_QUERY_OCCLUSION_PREDICATE support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-02-09 11:59:35 -05:00
Ilia Mirkin	0cb1dda36e	nv30: add PIPE_QUERY_OCCLUSION_PREDICATE support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-09 11:59:35 -05:00
Ilia Mirkin	0d04ec2fd2	ilo: add PIPE_QUERY_OCCLUSION_PREDICATE support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2016-02-09 11:59:27 -05:00
Nicolai Hähnle	c260175677	draw: use util_pstipple_* function for stipple pattern textures and samplers This reduces code duplication. Suggested-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-09 10:01:57 -05:00
Nicolai Hähnle	452e51bf1e	draw: use util_pstipple_create_fragment_shader This reduces code duplication. It also adds support for drivers where the fragment position is a system value. Suggested-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-02-09 10:01:32 -05:00
Marek Olšák	83b4d701c0	winsys/radeon: fix a wrong NUM_TILE_PIPES value from the kernel Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94019 Tested-by: Nick Sarnie <commendsarnex@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-09 15:26:40 +01:00
Timothy Arceri	1aae5e8ced	nir: remove unused nir_variable fields These are used in GLSL IR to remove unused varyings and match transform feedback variables. There is no need to use these in NIR. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:49:06 +11:00
Timothy Arceri	6235b69134	glsl: remove unrequired forward declaration This was added in `2548092ad8` although I don't see why as it was already in the linker.h header. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:48:55 +11:00
Timothy Arceri	9dd6a4ea79	glsl: clean up and fix bug in varying linking rules The existing code was very hard to follow and has been the source of at least 3 bugs in the past year. The existing code also has a bug for SSO where if we have a multi-stage SSO for example a tes -> gs program, if we try to use transform feedback with gs the existing code would look for the transform feedback varyings in the tes stage and fail as it can't find them. V2: Add more code comments, always try to remove unused inputs to the first stage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:44:22 +11:00
Timothy Arceri	fd0b89ad8d	glsl: simplify ES Vertex/Fragment shader requirements We really just needed to skip the existing ES < 3.1 check if we have a compute shader, all other scenarios are already covered. * No shaders is a link error. * Geom or Tess without Vertex is a link error which means we always require a Vertex shader and hence a Fragment shader. * Finally a Compute shader linked with any other stage is a link error. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:44:15 +11:00
Timothy Arceri	55fa3c44bc	glsl: simplify required stages for linking rules Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:44:11 +11:00
Timothy Arceri	20823992b4	glsl: small tidy up now that link_shaders() exits early with 0 shaders Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:44:07 +11:00
Timothy Arceri	76cfb47207	glsl: don't attempt to link empty program Previously an empty program would go through the entire link_shaders() function and we would have to be careful not to cause a segfault. In core profile also now set link_status to false by generating an error, it was previously set to true. From Section 7.3 (PROGRAM OBJECTS) of the OpenGL 4.5 spec: "Linking can fail for a variety of reasons as specified in the OpenGL Shading Language Specification, as well as any of the following reasons: - No shader objects are attached to program." V2: Only generate an error in core profile and add spec quote (Ian) V3: generate error in ES too, remove previous check which was only applying the rule to GL 4.5/ES 3.1 and above. My understanding is that this spec change is clarifying previously undefined behaviour and therefore should be applied retrospectively. The ES CTS tests for this are in ES 2; I suspect it was passing because it would have generated an error for not having both a vertex and fragment shader. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-09 22:44:02 +11:00
Matt Turner	371c4b3c48	nir: Recognize open-coded bitfield_reverse. Helps 11 shaders in UnrealEngine4 demos. I seriously hope they would have given us bitfieldReverse() if we exposed GL 4.0 (but we do expose ARB_gpu_shader5, so why not use that anyway?). instructions in affected programs: 4875 -> 4633 (-4.96%) cycles in affected programs: 270516 -> 244516 (-9.61%) I suspect there's a lot of room to improve nir_search/opt_algebraic's handling of this. We'd actually like to match, e.g., step2 by matching step1 once and then doing a pointer comparison for the second instance of step1, but unfortunately we generate an enormous tuple instead. The .text size increases by 6.5% and the .data by 17.5%. text data bss dec hex filename 22957 45224 0 68181 10a55 nir_libnir_la-nir_opt_algebraic.o 24461 53160 0 77621 12f35 nir_libnir_la-nir_opt_algebraic.o I'd be happy to remove this if Unreal4 uses bitfieldReverse() if it is in a GL 4.0 context once we expose GL 4.0. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 21:20:58 -08:00
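For readers unfamiliar with what an "open-coded" bit reversal looks like, here is a minimal sketch of the classic swap-halves idiom for 32-bit values. It is an illustration only: the log does not show the exact expression that commit 371c4b3c48 matches, so the sequence of steps and the function names `bitfield_reverse32` and `reference_reverse32` are assumptions, not the pattern from nir_opt_algebraic.

```python
def bitfield_reverse32(x):
    """Reverse the bit order of a 32-bit value using mask-and-shift steps.

    Hypothetical illustration of an open-coded bitfieldReverse(); each step
    swaps adjacent groups of 16, 8, 4, 2 and finally 1 bit(s).
    """
    x &= 0xffffffff
    x = ((x & 0x0000ffff) << 16) | ((x & 0xffff0000) >> 16)
    x = ((x & 0x00ff00ff) << 8) | ((x & 0xff00ff00) >> 8)
    x = ((x & 0x0f0f0f0f) << 4) | ((x & 0xf0f0f0f0) >> 4)
    x = ((x & 0x33333333) << 2) | ((x & 0xcccccccc) >> 2)
    x = ((x & 0x55555555) << 1) | ((x & 0xaaaaaaaa) >> 1)
    return x


def reference_reverse32(x):
    """Straightforward reference: reverse the 32-character binary string."""
    return int(format(x & 0xffffffff, '032b')[::-1], 2)


if __name__ == "__main__":
    for value in (0x00000001, 0x80000000, 0x12345678, 0xff00ff00):
        assert bitfield_reverse32(value) == reference_reverse32(value)
        print(hex(value), "->", hex(bitfield_reverse32(value)))
```

Five mask-and-shift steps like these collapse into a single operation once the pattern is recognized, which is consistent with the instruction-count reduction the commit message reports.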
Matt Turner	2d0d9755da	nir: Handle large unsigned values in opt_algebraic. The next patch adds an algebraic rule that uses the constant 0xff00ff00. Without this change, the build fails with return hex(struct.unpack('I', struct.pack('i', self.value))[0]) struct.error: 'i' format requires -2147483648 <= number <= 2147483647 The hex() function handles integers of any size, and assigning a negative value to an unsigned does what we want in C. The pack/unpack is unnecessary (and as we see, buggy). Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>	2016-02-08 20:38:17 -08:00
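The build failure quoted in commit 2d0d9755da can be reproduced outside the Mesa tree. The sketch below is not the generator code itself; `hex_via_struct` and `hex_direct` are hypothetical names that mirror the expression quoted in the commit message and the simpler replacement it describes.

```python
import struct


def hex_via_struct(value):
    """Old approach: pack as signed 32-bit, unpack as unsigned, then format.

    Raises struct.error for values above 2147483647, such as 0xff00ff00.
    """
    return hex(struct.unpack('I', struct.pack('i', value))[0])


def hex_direct(value):
    """Replacement approach: hex() accepts Python integers of any size."""
    return hex(value)


if __name__ == "__main__":
    print(hex_direct(0xff00ff00))
    try:
        hex_via_struct(0xff00ff00)
    except struct.error as err:
        print("struct.error:", err)
```

Note that `hex_direct` returns a negative literal for a negative input; the commit message relies on C converting such a value when it is assigned to an unsigned variable.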
Matt Turner	7be8d07732	nir: Do opt_algebraic in reverse order. Walking the SSA definitions in order means that we consider the smallest algebraic optimizations before larger optimizations. So if a smaller rule is part of a larger rule, the smaller one will happen first, preventing the larger one from happening. instructions in affected programs: 32721 -> 32611 (-0.34%) helped: 106 In programs whose nir_optimize loop count changes (129 of them): before: 1164 optimization loops after: 1071 optimization loops Of the 129 affected, 16 programs' optimization loop counts increased. Prevents regressions and annoyances in the next commits. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 20:38:17 -08:00
Matt Turner	a8f0960816	nir: Recognize product of open-coded pow()s. Prevents regressions in the next commit. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 20:38:17 -08:00
Matt Turner	9f02e3ab03	nir: Add opt_algebraic rules for xor with zero. instructions in affected programs: 668 -> 664 (-0.60%) helped: 4 Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-02-08 20:38:17 -08:00
Timothy Arceri	3fd4280759	glsl: validate arrays of arrays on empty type declarations Fixes: dEQP-GLES31.functional.shaders.arrays_of_arrays.invalid.empty_declaration_without_var_name_fragment dEQP-GLES31.functional.shaders.arrays_of_arrays.invalid.empty_declaration_without_var_name_vertex Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-09 13:52:52 +11:00
Kenneth Graunke	74f956c416	i965: Use nir_lower_load_const_to_scalar(). I don't know why, but we never hooked up this pass Eric wrote. Otherwise, you can end up with stupid scalarized code such as: vec4 ssa_7 = load_const (0.0, 0.0, 0.0, 0.0) vec4 ssa_8 = ... vec1 ssa_9 = feq ssa_8, ssa_7 vec1 ssa_10 = feq ssa_8.y, ssa_7.y vec1 ssa_11 = feq ssa_8, ssa_7.z vec1 ssa_12 = feq ssa_8.y, ssa_7.w ssa_8.xyxy == <0, 0, 0, 0> should only take two feq instructions. shader-db on Skylake: total instructions in shared programs: 9121153 -> 9120749 (-0.00%) instructions in affected programs: 32421 -> 32017 (-1.25%) helped: 277 HURT: 69 total cycles in shared programs: 69003364 -> 69000912 (-0.00%) cycles in affected programs: 899186 -> 896734 (-0.27%) helped: 313 HURT: 403 This also prevents regressions when disabling channel expressions. v2: Don't call opt_cse afterwards (requested by Matt). It should happen in the optimization loop below anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-02-08 18:10:34 -08:00
Timothy Arceri	184afd8fd9	mesa: remove now unused sampler index handling code Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-09 12:03:02 +11:00
Timothy Arceri	edc108765e	mesa: compute sampler index in ir_to_mesa rather than using UniformHash The aim of this is to work towards removing UniformHash from the program struct so that we don't need to hold onto it in memory and pass it around outside the linker. Reviewed-by: Dave Airlie <airlied@redhat.com>	2016-02-09 12:02:58 +11:00
Kenneth Graunke	d0e1d6b7e2	i965: Don't add barrier deps for FB write messages. There are never render target reads, so there are no scheduling hazards. Giving the extra flexibility to the scheduler makes it possible to do FB writes as soon as their sources are available, reducing register pressure. It also makes it possible to do the payload setup for more than one FB write message at a time, which could better hide latency. shader-db results on Skylake: total instructions in shared programs: 9110254 -> 9110211 (-0.00%) instructions in affected programs: 2898 -> 2855 (-1.48%) helped: 3 HURT: 0 LOST: 0 GAINED: 1 A reduction in instruction counts is surprising, but legitimate: the three shaders helped were spilling, and reducing register pressure allowed us to issue fewer spills/fills. total cycles in shared programs: 69035108 -> 68928820 (-0.15%) cycles in affected programs: 4412402 -> 4306114 (-2.41%) helped: 4457 HURT: 213 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2016-02-08 16:59:35 -08:00
Dave Airlie	6502b3f60e	st/mesa: enable AoA for gallium drivers reporting GLSL 1.30 Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-09 10:52:09 +10:00
Dave Airlie	b74e8c89a6	st/mesa: add atomic AoA support reuse the sampler deref handling code to do the same thing for atomics. Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-09 10:52:09 +10:00
Dave Airlie	90bbe3d781	mesa: drop unused nonconst sampler functions. Since we fixed the glsl->tgsi conversion we no longer need this function. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-09 10:52:08 +10:00
Dave Airlie	bb8bbe34e3	st/mesa: handle indirect samplers in arrays/structs properly (v4.1) The state tracker never handled this properly, and it finally annoyed me for the second time so I decided to fix it properly. This is inspired by the NIR sampler lowering code and I only realised NIR seems to do its deref ordering different to GLSL at the last minute, once I got that things got much easier. it fixes a bunch of tests in tests/spec/arb_gpu_shader5/execution/sampler_array_indexing/ v2: fix AoA tests when forced on. I was right I didn't need all that code, fixing the AoA code meant cleaning up a chunk of code I didn't like in the array handling. v3: start generalising the code a bit more for atomics. v3.1: use UniformRemapTable v4: handle uniforms differently using the param_index, and go back to UniformStorage fix issues identified by Timothy with deref handling. v4.1: squash const fix and move handling 1D const out of recursive function. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-09 10:52:08 +10:00
Dave Airlie	52801766a0	glsl/ir: add param index to variable. We have a requirement to store the index into the mesa parameterlist for uniforms. Up until now we've overwritten var->data.location with this info. However this then stops us accessing UniformStorage, which is needed to do proper dereferencing. Add a new variable to ir_variable to store this value in, and change the two uses to use it correctly. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-09 10:52:08 +10:00
Francisco Jerez	53739fddc6	i965: Rename define for the PIPE_CONTROL DC flush bit. Its previous name was somewhat misleading, this really behaves like a RW cache flush rather than an invalidation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-08 15:48:00 -08:00
Francisco Jerez	10d84ba9f0	i965: Invalidate state cache before L3 partitioning set-up. The state cache is also L3-backed so it seems sensible to make sure it's clean as we do for other RO caches before repartitioning the L3. This wasn't part of my original L3 partitioning code because I was able to reproduce hangs on Gen7 hardware when the state cache invalidation happened asynchronously with previous 3D rendering, which should no longer be possible after the previous change. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-08 15:47:21 -08:00
Francisco Jerez	0aa4f99f56	i965: Fix cache pollution race during L3 partitioning set-up. We need to split the stalling flush from the RO cache invalidation into a different PIPE_CONTROL command to make sure that the top of the pipe invalidation happens after any previous rendering is complete. Otherwise it's possible for previous rendering to pollute the L3 cache in the short window of time between RO invalidation and the completion of the stalling flush. Fixes rendering artifacts on Unigine Heaven, Metro Last Light Redux and Metro 2033 Redux. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93540 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93599 Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de> Tested-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-08 15:45:44 -08:00
Francisco Jerez	1817e3c07a	i965/fs: Don't emit unnecessary SEL instruction from emit_image_atomic(). The SEL instruction with predication mode NONE emitted when the atomic operation doesn't need to be predicated is a no-op and might rely on undocumented hardware behaviour. Noticed by chance while looking at the assembly output. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-02-08 15:43:05 -08:00
Matt Turner	c300559fbf	i965/vec4: Update vec4 unit tests for commit `01dacc83ff`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94050	2016-02-08 15:32:12 -08:00
Brian Paul	01dacc83ff	dri/common: include debug_output.h to silence warning	2016-02-08 10:52:02 -07:00
Brian Paul	59251610ed	tgsi: minor whitespace fixes in tgsi_scan.c Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-08 09:29:38 -07:00
Brian Paul	42246ab1f5	tgsi: s/true/TRUE/ in tgsi_scan.c Just to be consistent. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-08 09:29:38 -07:00
Brian Paul	da6e879a6c	tgsi: use switches instead of big if/else ifs Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-08 09:29:38 -07:00
Brian Paul	37eb3f0400	tgsi: break gigantic tgsi_scan_shader() function into pieces New functions for examining instructions, declarations, etc. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-08 09:29:38 -07:00
Brian Paul	3c3ef69696	st/mesa: minor formatting fixes in st_cb_bitmap.c	2016-02-08 09:29:38 -07:00
Brian Paul	5fdbfb8d6f	mesa: move GL_ARB_debug_output code into new debug_output.c file The errors.c file had grown quite large so split off this extension code into its own file. This involved making a handful of functions non-static. Acked-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-02-08 09:29:38 -07:00
Brian Paul	6691ba1fe8	gallium/util: whitespace, formatting fixes in u_debug_stack.c	2016-02-08 09:29:38 -07:00
Brian Paul	5d2539cb49	gallium/util: whitespace, formatting fixes in u_staging.[ch] files Still some nonsensical comments.	2016-02-08 09:29:38 -07:00
Brian Paul	c84a8911fc	gallium/util: switch over to new u_debug_image.[ch] code Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-08 09:29:38 -07:00
Brian Paul	3917c8f3f9	gallium/util: put image dumping functions into separate file To try to reduce the clutter in u_debug.[ch] Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-08 09:29:38 -07:00
Brian Paul	6c7d4a7173	gallium/util: whitespace, formatting fixes in u_debug.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-08 09:29:38 -07:00
Samuel Pitoiset	efe5829578	trace: add missing pipe_context::clear_texture() This fixes a crash with bin/arb_clear_texture-base-formats and probably some other tests which use clear_texture(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-08 00:06:32 +01:00
Samuel Pitoiset	1dacbb7b46	trace: remove useless MALLOC() in trace_context_draw_vbo() There is no need to allocate memory when unwrapping the indirect buf. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-08 00:06:22 +01:00
Vinson Lee	ccaf734275	mesa/extensions: Fix NVX_gpu_memory_info lexicographical order. Fixes MesaExtensionsTest.AlphabeticallySorted. Fixes: `1d79b99580` ("mesa: implement GL_NVX_gpu_memory_info (v2)") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94016 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-07 14:42:00 -08:00
Ilia Mirkin	88519c6087	glsl: return cloned signature, not the builtin one The builtin data can get released with a glReleaseShaderCompiler call. We're careful everywhere to clone everything that comes out of builtins except here, where we accidentally return the signature belonging to the builtin version, rather than the locally-cloned one. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Tested-by: Rob Herring <robh@kernel.org> Cc: mesa-stable@lists.freedesktop.org	2016-02-07 17:23:58 -05:00
Ilia Mirkin	ac57577e29	glsl: make sure builtins are initialized before getting the shader The builtin function shader is part of the builtin state, released when glReleaseShaderCompiler is called. We must ensure that the builtins have been (re)initialized before attempting to link with the builtin shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Tested-by: Rob Herring <robh@kernel.org> Cc: mesa-stable@lists.freedesktop.org	2016-02-07 17:23:57 -05:00
Samuel Pitoiset	04c2ca5038	tgsi: use TGSI_WRITEMASK_XYZW instead of hardcoding the mask Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Serge Martin <edb+mesa@sigluy.net>	2016-02-06 20:24:41 +01:00
Timothy Arceri	ea7f64f74d	glsl: don't generate transform feedback candidate when not required If we are not even looking for one don't bother generating a candidate list. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-06 14:34:43 +11:00
Timothy Arceri	c1bbaff1e8	glsl: replace unreachable code with an assert() All interface blocks will have been lowered by this point so just use an assert. Returning false would have caused all sorts of problems if they were not lowered yet and there is an assert to catch this later anyway. We also update the tests to reflect this change. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-06 14:34:35 +11:00
Jan Vesely	e377037bef	r600, compute: Do not overwrite pipe_resource.screen found by inspection. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-05 21:17:15 -05:00
Jan Vesely	5b51b2e000	r600g: Ignore format for PIPE_BUFFER targets Fixes compute since `7dd31b81fe` gallium/radeon: support PIPE_CAP_SURFACE_REINTERPRET_BLOCKS Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 20:23:56 +01:00
Marek Olšák	d8e4908b63	mesa/get: fix a breakage after rebase trivial.	2016-02-05 19:39:13 +01:00
Matt Turner	9f2e22bf34	i965/vec4: don't copy ATTR into 3src instructions with complex swizzles The vec4 backend, at the end, does this: if (inst->is_3src()) { for (int i = 0; i < 3; i++) { if (inst->src[i].vstride == BRW_VERTICAL_STRIDE_0) assert(brw_is_single_value_swizzle(inst->src[i].swizzle)); So make sure that we use the same conditions when trying to copy-propagate. UNIFORMs will be converted to vstride 0 in convert_to_hw_regs, but so will ATTRs when interleaved (as will happen in a GS with multiple attributes). Since the vstride is not set at copy-prop time, infer it by inspecting dispatch_mode and reject ATTRs if they have non-scalar swizzles and are interleaved. Fixes assertion errors in dolphin-generated geometry shaders (or misrendering on opt builds) on Sandybridge or on IVB/HSW with INTEL_DEBUG=nodualobj. Co-authored-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93418 Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-02-05 09:33:19 -08:00
Marek Olšák	1106e79ed9	docs/relnotes: document memory info extensions	2016-02-05 17:47:59 +01:00
Marek Olšák	635555af6a	gallium/radeon: implement query_memory_info (v2) v2: don't use DIV_ROUND_UP (not so useful) also return eviction stats Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-05 17:31:58 +01:00
Marek Olšák	5f51a24a77	st/mesa: implement and enable memory info extensions (v2) v2: assert and return if query_memory_info is not set rebase Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-05 17:31:53 +01:00
Marek Olšák	837f74aa51	mesa: implement GL_ATI_meminfo (v2) v2: rebase Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-05 17:31:20 +01:00
Marek Olšák	1d79b99580	mesa: implement GL_NVX_gpu_memory_info (v2) v2: implement eviction queries properly add gl_memory_info structure Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-05 17:30:07 +01:00
Marek Olšák	d2e4c9e737	gallium: add interface for querying memory usage and sizes (v2) If you're worried about the duplication of some CAPs, we can remove them later. v2: add fields for memory eviction stats Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-02-05 17:29:38 +01:00
Marek Olšák	c577f2843a	gallium/radeon: remove radeon_info::r600_tiling_config Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-05 17:29:19 +01:00
Marek Olšák	4f96846d9d	gallium/radeon: get pipe_interleave_bytes AKA group_bytes from the winsys Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-05 17:28:59 +01:00
Marek Olšák	276621da45	gallium/radeon: set num_banks in the winsys amdgpu doesn't have to set this, because radeonsi gets it from tile mode arrays by default. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-05 17:28:40 +01:00
Marek Olšák	294ec530c9	gallium/radeon: just get num_tile_pipes from the winsys Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-05 17:28:24 +01:00
Marek Olšák	0f3556d308	winsys/amdgpu: add an assertion to cik_get_num_tile_pipes (v2) v2: print an error to stderr Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-05 17:28:18 +01:00
Marek Olšák	a2291f7b57	winsys/amdgpu: remove an r600-only setting Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-05 17:28:12 +01:00
Marek Olšák	1e864d7379	gallium/radeon: rename & reorder members of radeon_info Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-05 17:28:00 +01:00
Steinar H. Gunderson	feb53912f8	mesa: Fix locking of GLsync objects. GLsync objects had a race condition when used from multiple threads (which is the main point of the extension, really); it could be validated as a sync object at the beginning of the function, and then deleted by another thread before use, causing crashes. Fix this by changing all casts from GLsync to struct gl_sync_object to a new function _mesa_get_and_ref_sync() that validates and increases the refcount. In a similar vein, validation itself uses _mesa_set_search(), which requires synchronization -- it was called without a mutex held, causing spurious error returns and other issues. Since _mesa_get_and_ref_sync() now takes the shared context mutex, this problem is also resolved. Fixes bug #92757, found while developing Nageru, my live video mixer (due for release at FOSDEM 2016). v2: Marek: silence warnings, fix declaration after code Signed-off-by: Steinar H. Gunderson <sesse@google.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 17:18:17 +01:00
Nicolai Hähnle	156e81f305	radeonsi: add placeholder MC and SRBM performance counter groups Yet another change motivated by AMD GPUPerfStudio compatibility. These groups are not directly accessible from userspace, and AMD GPUPerfStudio does not actually query them - it just requires them to be there. Hence, adding a placeholder for now. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:25:33 -05:00
Nicolai Hähnle	988f4b31f3	radeonsi: re-order the SQ_xx performance counter blocks This is yet another change motivated by appeasing AMD GPUPerfStudio's hardcoding of performance counter group numbers. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:25:30 -05:00
Nicolai Hähnle	75affd73b0	radeonsi: re-order the perfcounter hardware blocks As documented in the comment, AMD GPUPerfStudio unfortunately hardcodes the order of performance counter groups. Let's do the pragmatic thing and present the same order as Catalyst/Crimson. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:25:27 -05:00
Nicolai Hähnle	b0e32548c8	gallium/radeon: add GPIN driver query group This group was used by older versions of AMD GPUPerfStudio (via AMD_performance_monitor) to identify the GPU family, and GPUPerfStudio still complains when it isn't available. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:24:59 -05:00
Nicolai Hähnle	4b672b8310	radeonsi: Allow dumping LLVM IR before optimization passes Set R600_DEBUG=preoptir to dump the LLVM IR before optimization passes, to allow diagnosing problems caused by optimization passes. Note that in order to compile the resulting IR with llc, you will first have to run at least the mem2reg pass, e.g. opt -mem2reg -S < shader.ll | llc -march=amdgcn -mcpu=bonaire Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> (original patch) Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (w/ debug flag) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:22:04 -05:00
Nicolai Hähnle	5aafc169ca	gallium/radeon: emit LLVM `ret void` before radeon_llvm_finalize_module This allows dumping a consumable LLVM module before the initial optimization passes are run. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:21:54 -05:00
Nicolai Hähnle	7e9670c8bc	st/mesa: bail out of try_pbo_upload_common when constant upload fails Also fixes a resource leak when an upload_mgr is used for constants. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:21:51 -05:00
Nicolai Hähnle	a01e44adcc	st/mesa: bail out of try_pbo_upload_common when vertex upload fails At the same time, fix a memory leak noticed by Ilia Mirkin. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:21:48 -05:00
Nicolai Hähnle	b27c79bd81	st/mesa: reduce the scope of sampler_view in try_pbo_upload_common We can get rid of our reference immediately, since the driver will hold onto it for us. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:21:44 -05:00
Nicolai Hähnle	13e21e3ec5	st/mesa: do uploads earlier in try_pbo_upload_common While rather unlikely, uploads _can_ fail. Doing them earlier means we'll have to restore less state when they do fail, and it's slightly easier to check the restore code. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-05 09:21:27 -05:00
Neil Roberts	eb9cf3cfc9	main: Use a derived value for the default sample count Previously the framebuffer default sample count was taken directly from the value given by the application. On the i965 driver on HSW if the value wasn't one that is supported by the hardware it would hit an assert when it tried to program the state for it. This patch fixes it by adding a derived sample count to the state for the default framebuffer. The driver can then quantize this to one of the valid values in its UpdateState handler when the _NEW_BUFFERS state changes. _mesa_geometric_samples is changed to use the new derived value. Fixes the piglit test arb_framebuffer_no_attachments-query Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93957 Cc: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-05 11:05:10 +00:00
Neil Roberts	5fd848f6c9	program: Use _mesa_geometric_samples to calculate gl_NumSamples Otherwise it won't take into account the default samples for framebuffers with no attachments. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-05 11:05:06 +00:00
Neil Roberts	4995d9c9a0	main: Use _mesa_geometric_samples to calculate GL_SAMPLE_BUFFERS Otherwise it won't take into account the default samples for framebuffers with no attachments. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-05 11:05:01 +00:00
Neil Roberts	d8d4661ddb	main: Use _mesa_geometric_samples to calculate the value of GL_SAMPLES Otherwise it won't take into account the default samples for framebuffers with no attachments. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-05 11:04:44 +00:00
Ilia Mirkin	2065e380b2	nvc0: avoid negatives in PUSH_SPACE argument Fixup to commit `03b3eb90d` - the number of buffers could be larger than the number of elements, in which case we'd pass a negative argument to PUSH_SPACE, which would be bad. While we're at it, merge it with the other PUSH_SPACE at the top of the function. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-02-05 00:49:51 -05:00
Ilia Mirkin	03b3eb90d7	nvc0: add some missing PUSH_SPACE's nvc0_vbo has explicit push space checking enabled, so we must run PUSH_SPACE by hand. A few spots missed that. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-02-05 00:41:43 -05:00
Ilia Mirkin	1a0fde1f52	nvc0/ir: fix converting between predicate and gpr The spill logic will insert convert ops when moving between files. It seems like the emission logic wasn't quite ready for these converts. Tested on fermi, and visually looked at nvdisasm output for maxwell. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-02-05 00:41:33 -05:00
Ilia Mirkin	2fed18b8a5	nvc0: add support for ARB_query_buffer_object Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-04 21:21:30 -05:00
Ilia Mirkin	9cd5bb9f9f	st/mesa: add query buffer support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Ilia Mirkin	f9e6f46335	gallium: add PIPE_CAP_QUERY_BUFFER_OBJECT Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Ilia Mirkin	40d7f02c67	gallium: add a way to store query result into buffer Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Ilia Mirkin	386a9ec77b	mesa: add core implementation of ARB_query_buffer_object Forwards query result writes to drivers. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Ilia Mirkin	7c3f4b2fd8	mesa: add driver interface for writing query results to buffers Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Rafal Mielniczuk	3efcd4df01	mesa: Handle QUERY_BUFFER_BINDING in GetIntegerv Signed-off-by: Rafal Mielniczuk <rafal.mielniczuk2@gmail.com> [imirkin: move to GL/GL_CORE section] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Rafal Mielniczuk	2d0ec0c272	mesa: Add QueryBuffer to context Add QueryBuffer and initialise it to NullBufferObj on start Signed-off-by: Rafal Mielniczuk <rafal.mielniczuk2@gmail.com> [imirkin: also release QueryBuffer on free] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Rafal Mielniczuk	c5bab061da	mesa: Add ARB_query_buffer_object extension flag Signed-off-by: Rafal Mielniczuk <rafal.mielniczuk2@gmail.com> [imirkin: add string to extensions.c] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Rafal Mielniczuk	4913d381a0	glapi: Add xml infrastructure for ARB_query_buffer_object Signed-off-by: Rafal Mielniczuk <rafal.mielniczuk2@gmail.com> [imirkin: move definition to gl_API.xml as it is very short] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-04 21:21:30 -05:00
Timothy Arceri	23e24e27ac	glsl: simplify setting of image access qualifiers Cc: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-05 10:05:40 +11:00
Timothy Arceri	815929bd15	mesa: remove dead program parameter functions Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-05 09:11:00 +11:00
Axel Davy	94d91c6707	st/nine: Use align_free when needed Use align_free to free memory allocated with align_malloc. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Axel Davy	6b12fe77ea	st/nine: Disallow non-argb8888 cursors Only argb8888 cursors are allowed. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Axel Davy	24ddadbba9	st/nine: Enforce centroid for color input when multisampling is on The color inputs must automatically use centroid whether multisampling is used or not. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Axel Davy	d5389bb92d	st/nine: Fix centroid flag sem.reg.mod & NINED3DSPDM_CENTROID is worth 4 when centroid is requested, whereas TGSI_INTERPOLATE_LOC_CENTROID is worth 1. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	ee31f0fed4	st/nine: Use fast clears more often for MRTs This enables the use of fast clears in the following case: pixel shader renders to 1 RT 4 RT bound clear new pixel shader bound that renders to 4 RTs Previously the fast clear path wouldn't be hit, because when trying the fast clear path, the framebuffer state would be configured for 1 RT, instead of 4. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Axel Davy	e85ef7d8e5	st/nine: Use linear filtering for shadow mapping Some docs say linear filtering is always used when app does shadow mapping. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	0b35da59de	st/nine: Respect block alignment on surface lock Respect block alignment for ATI1/ATI2 format when trying to lock a surface using LockRect(). Fixes failing WINE tests device.c test_surface_blocks() tests. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	56b4222b29	st/nine: Add Render state validation layer Testing Win behaviour seems to show wrong states are accepted, but then depending on the states some specific 'good' behaviours happen. This adds some validation to catch invalid states and have these 'good' behaviours when it happens. Also reorders SetRenderState to match the expected optimisation: (Value == previous Value) => return immediately, which affects D3D9 hacks too. Signed-off-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	7132617436	DRI_CONFIG: Add option to override vendor id Add config option override_vendorid to report a fake card in d3dadapter9 drm. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	1a893ac886	st/nine: Implement NineDevice9_GetAvailableTextureMem Implement a device private memory counter similar to Win 7. Only textures and surfaces increment vidmem and may return ERR_OUTOFVIDEOMEMORY. Vertexbuffer and indexbuffer creation always succeeds, even when out of video memory. Fixes "Vampire: The Masquerade - Bloodlines" allocating resources until crash. Fixes "Age of Conan" allocating resources until crash. Fixes failing WINE test device.c test_vidmem_accounting(). Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	a961ec335d	st/nine: Handle Window Occlusion Apps can know if the window is occluded by checking for specific error messages. The behaviour is different for Device9 and Device9Ex. This allows games to release the mouse and stop rendering until the focus is restored. In case of multiple swapchains we only care about the device one. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	e59908e57f	st/nine: Store minor version num To keep compatible with older ID3DPresent interfaces (used to talk with Wine), store the minor version num accessible to all statetracker functions (in the NineDevice9 structure). Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	0ac01a9fd7	st/nine: Call flush_resource before flush flush_resource needs to be called before flush (for fast clear resolve, etc). Removes useless computation of resource (it is already set correctly). Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	f481b9b952	st/nine: Fix remaining swapchain tests Return D3DERR_INVALIDCALL instead of E_POINTER. On error set ppBackBuffer to NULL. Multiple swapchains can only be created in windowed mode as windowed swapchain. Set backbuffer to NULL in NineDevice9_GetBackBuffer, but not in NineSwapChain9_GetBackBuffer. This fixes all WINE's device.c test_swapchain() tests. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	cbbd3c65cc	st/nine: Fix crash NineDevice9_CreateAdditionalSwapChain When no window is specified, we should revert to the focus window. This deserves more tests however (what if the device swapchain is already using the focus window ?) Fixes crash for FFXIV Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	996f76bd8a	st/nine: Fix possible crash on error In case swapchain creation fails This->swapchains[i] might be NULL and causes a crash. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	40a0b97ebd	st/nine: Test more presentation params Return errors in case of invalid presentation parameters. Fixes failing WINE tests device.c test_swapchain_parameters(). Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	827fee059e	st/nine: Fix resource9 private data Store a copy of GUID in the header that is under our control and use it as key for the hashtable instead of using the application provided pointer. The application might change the memory after leaving the function. Fixes a crash for issue https://github.com/iXit/Mesa-3D/issues/130 Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	5c79bd666b	st/nine: Print GUID instead of pointer To ease debugging print the GUID instead of the pointer to it. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	2a4d1509c8	st/nine: Fix use of uninitialized memory The values of box.z and box.depth weren't set and lead to a crash. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	924038c08f	st/nine: Fix clear for multisample mismatch depth-stencil Tests show in case of multisample mismatch between the depth-stencil buffer and the render target, then it is not cleared. Fixes failing WINE test visual.c test_multisample_mismatch(). Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	7f58ba45a8	st/nine: Fix Volumetexture9_LockBox Check for valid locked box dimensions. Fixes failing wine tests device.c test_lockbox_invalid. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	35047681ff	st/nine: Fix ATI2 pitch for non-square Fixes crash for non-square textures. We were using the height instead of the width for some calculations. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	eeeab8d6b4	st/nine: Support D3DFMT_R8G8B8 Add support for D3DFMT_R8G8B8. It allows format conversion for surfaces of pool scratch. Usually gallium formats equivalents for d3d9 formats have their names reversed. The gallium format PIPE_FORMAT_R8G8B8_UNORM is the right equivalent here, and its name is likely wrong (reversed). Fixes a crash in TmNationsForever. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	a3e7525ada	st/nine: Use cso for viewport Use CSO to catch redundant viewport changes. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	495727af6b	st/nine: Fix shade mode flat Shade mode flat is only working if pixelshaders have interpolate set to TGSI_INTERPOLATE_COLOR on color inputs. Fixes failing WINE tests visual.c test_shademode(). Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	fa887ba65b	st/nine: Clear rendertarget on creation Clear every rendertarget on creation. Fixes https://github.com/iXit/Mesa-3D/issues/139 Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	b142f61621	st/nine: Allow ColorFill on D3DFMT_NULL surfaces Report success instead of failing as there's no resource for those surfaces. Fixes a crash in Crysis: Warhead. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	04e22a04a6	st/nine: Introduce STREAMFREQ state Previous vertex elements code update was protected by 'if ((group & (NINE_STATE_VDECL | NINE_STATE_VS)) || state->changed.stream_freq & ~1)' itself protected by 'if (group & (NINE_STATE_COMMON | NINE_STATE_VS))' If no state is changed except the stream frequency, no update would happen. This patch solves the problem by adding a new NINE_STATE_STREAMFREQ state. Another way would be to add state->changed.stream_freq & ~1 check to the main test. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Axel Davy	15ce2778fb	st/nine: Catch redundant SetStreamSourceFreq calls Some apps do redundant SetStreamSourceFreq calls. Catch them to improve performance. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	ea3f504f7c	st/nine: Squash indexbuffer9 and vertexbuffer9 The indexbuffer9 codebase was lagging behind the one of vertexbuffer9. Add buffer9 as common code base for indexbuffer9 and vertexbuffer9. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	b6bb8d561a	st/nine: Unset vtxbuf on reset We forgot to reset vtxbuf. This fixes some crashes. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Axel Davy	b63c144d1e	st/nine: Use pipe_resource_reference for vtxbuf This seems cleaner to actually reference the resources for vtxbuf, rather than relying on the fact the bound d3d streams do. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	b5876e4762	st/nine: Use ff vertex shader when position_t is used When an application sets a vertex shader, we are supposed to use it, and when no vertex shader is set, we are supposed to revert to fixed function vertex shader. It seems there is an exception: when the vertex declaration has a position_t index, we should revert to fixed function vertex shader. Up to now we were checking if device->state.vs is set to know whether to use programmable shader or not. With this commit we determine whether we use programmable shader or not when vertex shader/declaration are set, but stateblocks do complicate things a bit. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	531acbc56b	st/nine: Don't increment refcount on VertexDeclaration creation failure NineUnknown_ctor increments the refcount even in case of an error. Restructure the code to prevent refcount increments. Fixes a couple of wine tests. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	b39fd5b1da	st/nine: Change StretchRect check order Textures in SYSTEMMEM don't have resources attached. Instead of returning an error for them, StretchRect was crashing. This changes the check order to fix that case. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Axel Davy	a82e67812a	st/nine: Initialize lights in stateblocks This fixes a crash. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	9c1d93f8e7	st/nine: Fix fixed-function blendweights The last weighted element is one minus the sum of all previous weights. Fixes WINE test visual.c test_vertex_blending. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	cc830dc214	st/nine: Always normalize hitDir Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	ed7e1046b6	st/nine: Replace r[0] with tmp Replace r[0] with tmp to ease code reading. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	9856203f5a	st/nine: Fix ff calculation of midVec In case of non local viewer the value has to be subtracted. Fixes failing WINE tests in test_specular_lighting() (visual.c) Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	921f3eac58	st/nine: Implement D3DRS_SPECULARENABLE Implement fixed function D3DRS_SPECULARENABLE. Fixes failing WINE tests in test_specular_lighting() (visual.c) Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	9c26fa1b13	st/nine: Fix D3DRS_LOCALVIEWER being ignored Set key->localviewer to D3DRS_LOCALVIEWER. Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Axel Davy	aa4454ae85	st/nine: Fix rounding issue with vs1.1 a0 reg vs1.1 rounds a0 to lowest integer, while other versions do round to closest. To use the same path as the other versions (with ARR), we were subtracting 0.5 for vs1.1 to get round to lowest. This gives wrong result if a0 is set to 0: round(0 - 0.5) = -1 Instead just use ARL for vs1.1 Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Axel Davy	dbb03f6b5b	st/nine: Fix D3DPMISCCAPS_FOGANDSPECULARALPHA support The documentation of the flag doesn't make sense. To sum up the doc, if not set, specular alpha contains fog, and if set specular alpha contains 0 (except for ff). However in practice when the flag is there, apps do use specular alpha as if it could be used normally, which makes much more sense than the doc. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Patrick Rudolph <siro@das-labor.org>	2016-02-04 22:12:17 +01:00
Patrick Rudolph	9298a0b81b	st/nine: Fix AlphaCmpCaps AlphaCmpCaps should advertise D3DPCMPCAPS_NEVER as well. Fixes https://github.com/iXit/Mesa-3D/issues/142 Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2016-02-04 22:12:17 +01:00
Marek Olšák	bff640b3e0	radeonsi: implement PK2H and UP2H opcodes Based on a gallivm patch by Ilia Mirkin. +8 piglit regressions due to precision issues (I blame the tests) The benefit is that we'll get v_cvt_f32_f16 and v_cvt_f16_f32 instead of emulation with integer instructions. They are GLSL 4.00 intrinsics. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-04 19:52:28 +01:00
Matt Turner	973ba3f4d4	glsl: Ensure glsl/ exists before making the lexer/parser. Reported-by: Jan Ziak <0xe2.0x9a.0x9b@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93989	2016-02-04 09:31:17 -08:00
Matt Turner	8c7a42b3e8	i965/fs: Allocate single register at a time for constants. No instruction counts changed, but: total cycles in shared programs: 64834502 -> 64781530 (-0.08%) cycles in affected programs: 16331544 -> 16278572 (-0.32%) helped: 4757 HURT: 4288 GAINED: 66 LOST: 20 I remember trying this when I first wrote the pass, but it wasn't helpful at the time. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-02-04 09:30:58 -08:00
Marek Olšák	8ec24678ac	radeonsi: fix Hyper-Z on Stoney Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-04 16:47:41 +01:00
Patrick Baggett	9c78cfd547	mesa: Use SSE prefetch instructions rather than 3DNow instructions 64-bit Pentium 4 CPUs don't have the 3DNow prefetch instructions which results in an Illegal instruction crash. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Tested-by: Timothy Arceri <t_arceri@yahoo.com.au> https://bugs.freedesktop.org/show_bug.cgi?id=27512	2016-02-04 22:02:31 +11:00
Ilia Mirkin	edd494ddf0	nv50/ir: make sure to fetch all sources before creating instruction We must fetch all sources into the instruction stream before generating the instruction that uses them. Otherwise we'll define values after using them, which won't work so well. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-02-03 18:40:38 -05:00
Ilia Mirkin	a9d5c64c34	nv50: avoid freeing the symbols if they're about to be stored Spotted by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-02-03 18:40:26 -05:00
Ilia Mirkin	9284fd9c0d	st/mesa: fix potential null deref if no shader is passed in Spotted by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-02-03 18:40:13 -05:00
Ilia Mirkin	5ac7f0433b	glx: update to updated version of EXT_create_context_es2_profile The EXT spec has been updated to: - logically combine the es2_profile and es_profile exts - allow any legal version to be requested dEQP tests request a specific ES version when using GLX, so this allows dEQP upstream to run against GLX with the appropriate X server patch (which had similar disabling logic). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com> (v1) Reviewed-by: Adam Jackson <ajax@redhat.com> (v3) v1 -> v2: - distinguish between DRI_API_GLES{,2,3} - add GLX_EXT_create_context_es_profile client-side support v2 -> v3: - fix error in computing mask	2016-02-03 15:44:51 -05:00
Ilia Mirkin	ad0e48e518	dir-locals.el: set case-label offset to 0 While this is the default, private .emacs files might have it set to something else. No harm in forcing it to 0. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2016-02-03 15:44:51 -05:00
Jose Fonseca	1c0f95f602	appveyor: Bump shallow clone depth. To prevent build failures when a large patch series is committed, as happened in https://ci.appveyor.com/project/jrfonseca-fdo/mesa/build/322, where 10 commits between `dac2964f3e` and `6f428328d3` were submitted before the build slave started the git clone. 100 commits should be bigger than any patch series seen in practice, and it takes practically the same time to download as 5 commits. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-03 19:37:19 +00:00
Rob Clark	029c89a0cc	Revert "compiler: removed unused Makefile.sources" Whoops, didn't mean to push this one. This reverts commit `78f4c555b9`.	2016-02-03 14:35:10 -05:00
Rob Clark	1be9184ff3	compiler: fix .gitignore for glsl_compiler Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-03 13:32:46 -05:00
Rob Clark	78f4c555b9	compiler: removed unused Makefile.sources We seem to end up w/ duplication between compiler/Makefile.sources and compiler/glsl/Makefile.sources. The latter appears unused. Delete it. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-02-03 13:19:45 -05:00
Nicolai Hähnle	43a401a792	gallium: fix the documentation of PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE This parameter is equivalent to the corresponding OpenGL implementation limit which is in texels, not bytes. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:12:37 +01:00
Nicolai Hähnle	7dd31b81fe	gallium/radeon: support PIPE_CAP_SURFACE_REINTERPRET_BLOCKS This is already used internally in si_resource_copy_region for compressed textures, so the only real change here is the adjusted surface size computation. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:37 +01:00
Nicolai Hähnle	4b02f16537	st/mesa: implement PBO upload for glCompressedTex(Sub)Image v2: - use st->pbo_upload.enabled flag Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:37 +01:00
Nicolai Hähnle	f38bb36f57	st/mesa: redirect CompressedTexSubImage to our own implementation This is where PBO upload will go. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:36 +01:00
Nicolai Hähnle	16c2ea1fcc	st/mesa: inline the implementation of _mesa_store_compressed_teximage We will write our own version of texsubimage for PBO uploads, and we will want to call that here as well. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:36 +01:00
Nicolai Hähnle	c99f2fe70e	st/mesa: implement PBO upload for multiple layers Use instancing to generate two triangles for each destination layer and use a geometry shader to route the layer index. v2: - directly write layer in VS if supported by the driver (Marek Olšák) Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:36 +01:00
Fredrik Höglund	757071ca7c	st/mesa: Accelerate PBO uploads Create a PIPE_BUFFER sampler view on the pixel-unpack buffer, and draw the image on the texture with a fragment shader that maps fragment coordinates to buffer coordinates. Modifications by Nicolai Hähnle: - various cleanups and fixes (e.g. error handling, corner cases) - split try_pbo_upload into two functions, which will allow code to be shared with compressed texture uploads - modify the source format selection to only test for support against the PIPE_BUFFER target v2: - update handling of TGSI_SEMANTIC_POSITION for recent changes in master - MaxTextureBufferSize is number of texels, not bytes (Ilia Mirkin) - only enable when integers are supported (Marek Olšák) - try harder to hit the TextureBufferOffsetAlignment - remove unnecessary MOV from the fragment shader Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:35 +01:00
Nicolai Hähnle	4a448a63ad	st/mesa: use the correct address generation functions in st_TexSubImage blit We need to tell the address generation functions about the dimensionality of the texture to correctly implement the part of Section 3.8.1 (Texture Image Specification) of the OpenGL 2.1 specification which says: "For the purposes of decoding the texture image, TexImage2D is equivalent to calling TexImage3D with corresponding arguments and depth of 1, except that ... * UNPACK SKIP IMAGES is ignored." Fixes a low impact bug that was found by chance while browsing the spec and extending piglit tests. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:35 +01:00
Nicolai Hähnle	6af6d7b08a	gallium: Add PIPE_CAP_SURFACE_REINTERPRET_BLOCKS This cap indicates whether pipe->create_surface can reinterpret a texture as a surface with a format of different block width/height (but equal block size). v2: fix whitespace Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:34 +01:00
Nicolai Hähnle	3abb548ef6	gallium: Add PIPE_CAP_BUFFER_SAMPLER_VIEW_RGBA_ONLY This cap indicates that the driver only supports R, RG, RGB and RGBA formats for PIPE_BUFFER sampler views. v2: move into "unsupported features" section for nouveau (Ilia Mirkin) Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-03 14:10:34 +01:00
Nicolai Hähnle	bc8a6842a9	mesa: add MESA_NO_MINMAX_CACHE environment variable When set to a truish value, this globally disables the minmax cache for all buffer objects. No #ifdef DEBUG guards because this option can be interesting for benchmarking. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:04:11 +01:00
Nicolai Hähnle	761c7d59c4	vbo: disable the minmax cache when the hit rate is low When applications stream their index buffers, the caches for those BOs become useless and add overhead, so we want to disable them. The tricky part is coming up with the right heuristic for when to disable them. The first question is which hit rate to aim for. Since I'm not aware of any interesting borderline applications that do something like "draw two or three times for each upload", I just kept it simple. The second question is how soon we should give up on the caching. Applications might have a warm-up phase where they fill a buffer gradually but then keep reusing it. For this reason, I count the number of indices that hit and miss (instead of the number of calls that hit or miss), since comparing that to the size of the buffer makes sense. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:04:06 +01:00
Nicolai Hähnle	115c643b16	mesa: add USAGE_DISABLE_MINMAX_CACHE flag to buffer UsageHistory Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:03:59 +01:00
Nicolai Hähnle	6b057f8ecc	vbo: cache/memoize the result of vbo_get_minmax_indices (v3) Some game developers are unaware that an index buffer in a VBO still needs to be read by the CPU if some varying data comes from a user pointer (unless glDrawRangeElements and friends are used). This is particularly bad when they tell us that the index buffer should live in VRAM. This cache helps, e.g. lifting This War Of Mine (a particularly bad offender) from under 10fps to slightly over 20fps on a Carrizo. Note that there is nothing prohibiting a user from rendering from multiple threads simultaneously with the same index buffer, hence the locking. (The internal buffer map taken for the buffer still leads to a race, but at least the locks are a move in the right direction.) v2: disable the cache on USAGE_TEXTURE_BUFFER as well (Chris Forbes) v3: - use bool instead of GLboolean for MinMaxCacheDirty (Ian Romanick) - replace the sticky USAGE_PERSISTENT_WRITE_MAP bit by a direct AccessFlags check Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:03:49 +01:00
Nicolai Hähnle	1a570d96a6	vbo: move vbo_get_minmax_indices into its own source file We will add more code for caching/memoization. Moving the existing code into its own file helps keep things modular. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:03:48 +01:00
Nicolai Hähnle	46b7a526f5	mesa/main: bail earlier for size == 0 in _mesa_clear_buffer_sub_data Note that the conversion of the clear data (when data != NULL) can fail due to an out of memory condition, but it does not check any error conditions mandated by the spec. Therefore, it is safe to skip when size == 0. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:03:46 +01:00
Nicolai Hähnle	fd7229b437	mesa/main: add USAGE_PIXEL_PACK_BUFFER flag to buffer UsageHistory We will want to disable minmax index caching for buffers that are used in this way. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:03:45 +01:00
Nicolai Hähnle	54c4a9803b	mesa/main: add USAGE_TRANSFORM_FEEDBACK_BUFFER flag to buffer UsageHistory We will want to disable minmax index caching for buffers that are used in this way. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:03:41 +01:00
Nicolai Hähnle	55fb921d69	util/hash_table: add _mesa_hash_table_num_entries Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-03 14:03:35 +01:00
Nicolai Hähnle	8b11d8cfbf	util/hash_table: add _mesa_hash_table_clear (v4) v4: coding style change (Matt Turner) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v3)	2016-02-03 14:03:25 +01:00
Leo Liu	6ad2e55a14	st/omx/dec/h264: fix corruption when scaling matrix present flag set The scaling list should be filled out with zig zag scan v2: integrate zig zag scan for list 4x4 to vl(Christian) v3: move list determination out from the loop(Ilia) Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-02-02 20:29:47 -05:00
Leo Liu	4f598f2173	vl: add zig zag scan for list 4x4 Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-02-02 20:29:43 -05:00
Roland Scheidegger	848a023c05	llvmpipe: use scissor_planes_needed helper function So it doesn't get out of sync in multiple places.	2016-02-03 01:25:45 +01:00
Jordan Justen	141ef75569	i965/gen8: Initialize aux_mode to GEN8_SURFACE_AUX_MODE_NONE GEN8_SURFACE_AUX_MODE_NONE is 0, so this is a no-op. Yet, this also makes it clear that we can compare aux_mode to the other GEN8_SURFACE_AUX_MODE_ values. We will want to compare to GEN8_SURFACE_AUX_MODE_HIZ. v2: Some very minor cherry-pick conflicts due to moving it around in the series. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>	2016-02-02 15:44:18 -08:00
Ilia Mirkin	18f688d62a	mesa: use default geometry's samples when there are no attachments Whether multisampling is turned on depends, in part, on whether attachments are themselves multisample surfaces. However when there are no attachments, we should rely on the default geometry for this. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-02 17:08:46 -05:00
Ilia Mirkin	095da3b550	mesa: invalidate framebuffer when changing parameters This fixes dEQP-GLES31.functional.fbo.completeness.no_attachments When the width or height are 0, the framebuffer is incomplete. We may also not have been passing the new state down to the driver when the widths/heights/etc changed. Make sure to dirty the state so that the framebuffer state is revalidated at draw time. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-02 17:08:46 -05:00
Ilia Mirkin	beac7b1b8b	mesa: use geometric helper for computing min samples In case we have a draw buffer without attachments, we should be looking at the default number of samples. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-02-02 17:08:46 -05:00
Ilia Mirkin	2d4976fa19	mesa: the _mesa_geometric_* functions require full types from mtypes.h Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-02 17:08:46 -05:00
Niels Ole Salscheider	fb44cfadce	winsys/radeon: Do not deinit the pb cache if it was not initialized This fixes a crash in pb_cache_release_all_buffers. Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-02-02 21:11:15 +01:00
Marek Olšák	84a6d2d7d6	tgsi/scan: add tgsi_shader_info::reads_samplemask	2016-02-02 21:04:52 +01:00
Marek Olšák	0d68b91220	radeonsi: rework RB+ for Stoney This fixes it. States which also need to be taken into account: - SPI color formats - each down-conversion format supports only a limited set of SPI formats - whether MSAA resolving and logic op are enabled These need special handling: - blending - disabled channels Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-02 21:03:19 +01:00
Marek Olšák	066d76c2f4	radeonsi: rename cb_target_mask state to cb_render_state and rename a variable in the function. SX_PS_DOWNCONVERT will be emitted here. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-02 21:03:19 +01:00
Marek Olšák	5f0f9a5619	radeonsi: treat intensity render targets exactly like red The motivation is to simplify the Stoney RB+ code. Intensity is already treated as red except here. No piglit regressions. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-02 21:03:18 +01:00
Marek Olšák	f96f94966d	tgsi: set correct src type for UP2H Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-02 21:02:26 +01:00
Connor Abbott	19db71807f	util/hash_table: don't compare deleted entries The equivalent of the last patch for the hash table. I'm not aware of any issues this fixes. v2: - use entry_is_deleted (Timothy) Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2016-02-02 14:42:40 -05:00
Connor Abbott	8fc2f652a2	util/set: don't compare against deleted entries When we delete entries in the hash set, we mark them "deleted" by setting their key to the deleted_key, which points to a dummy deleted_key_value. When searching for an entry, we normally skip over those, but set_add() had some code for searching for duplicate entries which forgot to skip over deleted entries. This led to a segfault inside the NIR vectorization pass, since its key comparison function interpreted the memory where deleted_key_value resides as a pointer and tried to dereference it. v2: - add better commit message (Timothy) - use entry_is_deleted (Timothy) Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2016-02-02 14:42:32 -05:00
Jordan Justen	bd97b62525	glsl: Disable tree grafting optimization for shared variables Fixes: * dEQP-GLES31.functional.compute.basic.shared_atomic_op_multiple_groups * dEQP-GLES31.functional.compute.basic.shared_atomic_op_multiple_invocation * dEQP-GLES31.functional.compute.basic.shared_atomic_op_single_group * dEQP-GLES31.functional.compute.basic.shared_atomic_op_single_invocation From https://android.googlesource.com/platform/external/deqp Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-02 10:50:40 -08:00
Jordan Justen	afef1422cb	glsl: Enable debug prints for do_common_optimization Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-02-02 10:50:40 -08:00
Roland Scheidegger	5e090079e1	Revert "i965: Provide sse2 version for rgba8 <-> bgra8 swizzle" This reverts commit `ab30426e33`. Apparently the memory isn't quite as aligned when this gets called as it should be, causing crashes. (Albeit this looks independent from this code, should crash just as well if ssse3 is enabled when compiling without this patch.) https://bugs.freedesktop.org/show_bug.cgi?id=93962	2016-02-02 15:45:59 +01:00
Dave Airlie	e7a27f70b9	virgl: mark function as static This is fallout from the previous changes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93961 Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-02 17:55:40 +10:00
Roland Scheidegger	7221b8aec6	gallivm: add PK2H/UP2H support Add support for these opcodes, the conversion functions were already there albeit need some new packing stuff. Just like the tgsi version, piglit won't like it for all the same reasons, so it's disabled (UP2H passes piglit arb_shader_language_packing tests, albeit since PK2H won't due to those rounding differences I don't know if that one works or not as the piglit test is rather difficult to deal with). Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-02 05:58:20 +01:00
Roland Scheidegger	5171ec9ca9	gallivm: add PK2H/UP2H support Add support for these opcodes, the conversion functions were already there albeit need some new packing stuff. Just like the tgsi version, piglit won't like it for all the same reasons, so it's disabled (UP2H passes piglit arb_shader_language_packing tests, albeit since PK2H won't due to those rounding differences I don't know if that one works or not as the piglit test is rather difficult to deal with).	2016-02-02 05:58:19 +01:00
Roland Scheidegger	dc16086e3b	tgsi: add PK2H/UP2H support The util functions handle the half-float conversion. Note that piglit won't like it much due to: a) The util functions use magic float mul conversion but when run inside softpipe/llvmpipe, denorms are flushed to zero, therefore when the conversion is from/to f16 denorm the result will be zero. This is a bug which should be fixed in these functions (should not rely on denorms being available), but will happen elsewhere just the same (e.g. conversion to f16 render targets). b) The util functions use trunc round mode rather than round-to-nearest. This is NOT a bug (as it is a d3d10 requirement). This will result in rounding non-representable finite values to MAX_F16 rather than INFINITY. My belief is the piglit tests are wrong here but it's difficult to tell (generally glsl rounding mode is undefined, however I'm not sure if rounding mode might need to be consistent for different operations). Nevertheless, for gl it would be better to use round-to-nearest, but using different rounding for GL and d3d10 is an unsolved problem (as it affects things like conversion to f16 render targets, clear colors, this shader opcode). Hence for now don't enable the cap bit (so the code is unused). (Code is from imirkin, comment from sroland) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmvware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-02 05:58:19 +01:00
Roland Scheidegger	99bd96abbb	llvmpipe: drop scissor planes early if the tri is fully inside them If the tri is fully inside a scissor edge (or rather, we just use the bounding box of the tri for the comparison), then we can drop these additional scissor "planes" early. We do not even need to allocate space for them in the tri. The math actually appears to be slightly iffy due to bounding boxes being rounded, but it doesn't matter in the end. Those scissor rects are costly - the 4 planes from the scissor are already more expensive to calculate than the 3 planes from the tri itself, and it also prevents us from using the specialized raster code for small tris. This helps openarena performance by about 8% or so. Of course, it helps there that while openarena often enables scissoring (and even moves the scissor rect around) I have not seen a single tri actually hit the scissor rect, ever. v2: drop individual scissor edges, and do it earlier, not even allocating space for them. v3: help the compiler a bit with simpler code, suggested by Brian. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-02 05:58:19 +01:00
Roland Scheidegger	9d2a34e105	llvmpipe: minor cleanup of sse2 for calc_fixed_position Just slightly simpler assembly. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-02 05:58:19 +01:00
Roland Scheidegger	8aa168eb8f	llvmpipe: use vector loads for (optimized) tri raster funcs When we switched to 64bit rasterization, we could no longer use straight aligned loads for loading the plane data. However, what the code actually does for loading 3 planes, is 12 scalar loads + 9 unpacks, and then there's another 8 unpacks for the transpose we need (!). It would be possible to do the (scalar) loads of course already transposed (at least saving the additional unpacks), however instead just use (un)aligned vector loads, and recalculate the eo values, which takes far fewer instructions (note in case of the triangle_32_3_4 case, the eo values are not even used, making the scalar loads + unpacks for them all the more pointless). This drops execution time of the triangle_32_3_4 function considerably, albeit it doesn't really make a measurable difference (for small tris we're essentially limited by vertex throughput in any case), for triangle_32_3_16 it's essentially noise (the loop is more costly than the initial code there). (I'm thinking about just ditching storing the eo values in the plane data, so could switch back to using aligned planes, however right now they are still used in the other raster functions dealing with planes with scalar code. Also not touching the ppc code, might not be that bad there in any case.) Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-02 05:58:19 +01:00
Roland Scheidegger	ab30426e33	i965: Provide sse2 version for rgba8 <-> bgra8 swizzle The existing code used ssse3, and because it isn't compiled in a separate file compiled with that, it is usually not used (that, of course, could be fixed...), whereas sse2 is always present at least with 64bit builds. This should be pretty much as fast as the pshufb version, albeit those code paths aren't really used on chips without llc in any case. v2: fix andnot argument order, add comments v3: use pshuflw/hw instead of shifts (suggested by Matt Turner), cut comments Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-02-02 05:58:19 +01:00
Roland Scheidegger	116e4dc995	mesa: fix typo in python scripts Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-02-02 05:58:19 +01:00
Rob Herring	f0f4259324	virgl: also build vtest for Android Enabling swrast on Android causes a link error because vtest is missing. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-02 09:58:51 +10:00
Rob Herring	2d3301e4d5	virgl: fix reference counting of prime handles The virgl reference counting of buffers is broken for prime fd buffers. Each prime fd passed into virgl_drm_winsys_resource_create_handle creates a new resource. The solution requires creating a separate hash table to track flink names separately from prime handles. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-02 09:58:29 +10:00
Rob Herring	f87330dbce	virgl: reuse screen when fd is already open It is necessary to share the screen between mesa and gralloc to properly ref count resources. This implements a hash lookup on the file description to re-use an already created screen. This is a similar implementation as freedreno and radeon. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-02-02 09:58:29 +10:00
Mauro Rossi	6711592c2f	nouveau/video: wrap assertion within #ifndef NDEBUG The change is necessary to avoid the following build error on Android: external/mesa/src/gallium/drivers/nouveau/nouveau_vp3_video_bsp.c: In function 'nouveau_vp3_bsp_next': external/mesa/src/gallium/drivers/nouveau/nouveau_vp3_video_bsp.c:269:14: error: 'bsp_bo' undeclared (first use in this function) assert(bsp_bo->size >= str_bsp->w0[0] + num_bytes[i]); ^ This matches the declaration of the variables in question. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-01 17:45:19 -05:00
Ilia Mirkin	047b917718	st/mesa: treat a write as a read for range purposes We use this logic to detect live ranges and then do plain renaming across the whole codebase. As such, to prevent WaW hazards, we have to treat a write as if it were also a read. For example, the following sequence was observed before this patch: 13: UIF TEMP[6].xxxx :0 14: ADD TEMP[6].x, CONST[6].xxxx, -IN[3].yyyy 15: RCP TEMP[7].x, TEMP[3].xxxx 16: MUL TEMP[3].x, TEMP[6].xxxx, TEMP[7].xxxx 17: ADD TEMP[6].x, CONST[7].xxxx, -IN[3].yyyy 18: RCP TEMP[7].x, TEMP[3].xxxx 19: MUL TEMP[4].x, TEMP[6].xxxx, TEMP[7].xxxx While after this patch it becomes: 13: UIF TEMP[7].xxxx :0 14: ADD TEMP[7].x, CONST[6].xxxx, -IN[3].yyyy 15: RCP TEMP[8].x, TEMP[3].xxxx 16: MUL TEMP[4].x, TEMP[7].xxxx, TEMP[8].xxxx 17: ADD TEMP[7].x, CONST[7].xxxx, -IN[3].yyyy 18: RCP TEMP[8].x, TEMP[3].xxxx 19: MUL TEMP[5].x, TEMP[7].xxxx, TEMP[8].xxxx Most importantly note that in the first example, the second RCP is done on the result of the MUL while in the second, the second RCP should have the same value as the first. Looking at the GLSL source, it is apparent that both of the RCP's should have had the same source. Looking at what's going on, the GLSL looks something like float tmin_8; float tmin_10; tmin_10 = tmin_8; ... lots of code ... tmin_8 = tmpvar_17; ... more code that never looks at tmin_8 ... And so we end up with a last_read somewhere at the beginning, and a first_write somewhere at the bottom. For some reason DCE doesn't remove it, but even if that were fixed, DCE doesn't handle 100% of cases, esp including loops. With the last_read somewhere high up, we overwrite the previously correct (and large) last_read with a low one, and then proceed to decide to merge all kinds of junk onto this temp. Even if that weren't the case, and there were just some writes after the last read, then we might still overwrite a merged value with one of those. 
As a result, we should treat a write as a last_read for the purpose of determining the live range. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: mesa-stable@lists.freedesktop.org	2016-02-01 17:40:18 -05:00
Matt Turner	75c9def8ee	i965/gen7+: Use NIR for lowering of pack/unpack opcodes. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	f4952421cd	i965/vec4: Implement nir_op_pack_uvec2_to_uint. And mark nir_op_pack_uvec4_to_uint unreachable, since it's only produced by lowering pack[SU]norm4x8 which the vec4 backend does not need. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	955d052058	nir: Add lowering support for unpacking opcodes. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	9b8786eba9	nir: Add lowering support for packing opcodes. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	1dc312e295	i965/fs: Implement support for extract_word. The vec4 backend will lower it. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	68f8c5730b	nir: Add opcodes to extract bytes or words. The uint versions zero extend while the int versions sign extend. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	8709dc0713	glsl: Remove 2x16 half-precision pack/unpack opcodes. i965/fs was the only consumer, and we're now doing the lowering in NIR. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	1a53a4fc7a	i965/fs: Switch from GLSL IR to NIR for un/packHalf2x16 scalarizing. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	9ce901058f	nir: Add lowering of nir_op_unpack_half_2x16. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	e4278a847e	i965: Make separate nir_options for scalar/vector stages. We'll want to have different lowering options set for scalar/vector stages. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	252d497d4c	i965: Move brw_compiler_create() to new brw_compiler.c. A future patch will want to use designated initializers, which aren't available in C++, but this is C. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Matt Turner	140a886c41	nir: Make argument order of unop_convert match binop_convert. Strangely the return and parameter types were reversed. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-02-01 10:43:57 -08:00
Marta Lofstedt	77a60ab5dc	mesa: enable enums for OES_geometry_shader Enable GL_OES_geometry_shader enums for OpenGL ES 3.1. V4: EXTRA tokens updated according to comments from Ilia Mirkin. V5: Account for check_extra not evaluating "or" lazily. Fix issues with EXTRA_EXT_FB_NO_ATTACH_CS. Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-01 09:30:50 +01:00
François Tigeot	a48afb92ff	gallium: Add DragonFly support Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-31 11:56:09 +00:00
Ilia Mirkin	7f19e29305	nv50/ir: get rid of memory stores with nop values This happens especially with exports and varying packing, where the last bits aren't always filled in. We end up trying to do quad-wide stores, which ends up being a lot of register moves that carefully preserve the nop value. Instead don't do the stores. total instructions in shared programs : 6131375 -> 6125267 (-0.10%) total gprs used in shared programs : 910139 -> 895501 (-1.61%) total local used in shared programs : 15328 -> 15328 (0.00%) local gpr inst helped 0 7442 4693 hurt 0 90 2687 Most of the helped/hurt instruction changes are by one or two ops because we can no longer do quad-wide stores in all cases. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-30 17:18:41 -05:00
Ilia Mirkin	3ca941d60e	nv50/ir: fix false global CSE on instructions with multiple defs If an instruction has multiple defs, we have to do a lot more checks to make sure that we can move it forward. Among other things, various code likes to do a, b = tex() if () c = a else c = b which means that a single phi node will have results pointing at the same instruction. We obviously can't propagate the tex in this case, but properly accounting for this situation is tricky. Just don't try for instructions with multiple defs. This fixes about 20 shaders in shader-db, including the dolphin efb2ram shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-01-30 17:18:41 -05:00
Ilia Mirkin	3ca2001b53	nv50,nvc0: fix buffer clearing to respect engine alignment requirements It appears that the nvidia render engine is quite picky when it comes to linear surfaces. It doesn't like non-256-byte aligned offsets, and apparently doesn't even do non-256-byte strides. This makes arb_clear_buffer_object-unaligned pass on both nv50 and nvc0. As a side-effect this also allows RGB32 clears to work via GPU data upload instead of synchronizing the buffer to the CPU (nvc0 only). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> # tested on GF108, GT215 Tested-by: Nick Sarnie <commendsarnex@gmail.com> # GK208 Cc: mesa-stable@lists.freedesktop.org	2016-01-30 16:01:41 -05:00
Rob Clark	f15447e7c9	freedreno/ir3: ignore clip-vertex varying Since we emulate clip-planes, the clip-vertex is used within the VS itself (thanks to nir_lower_clip). So just ignore it as a VS output. Fixes a boatload of piglit tests that were asserting on unknown varying slot. (Also unrelated spelling/typo fix.) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-30 12:29:21 -05:00
Rob Clark	f20cf22b54	freedreno/ir3: don't ignore local vars With glsl_to_nir we end up with local variables, instead of global, for arrays. Note that we'll eventually have to do something more clever, I think, when we support multiple functions, but that will probably take some work in a few places. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-30 12:27:57 -05:00
Rob Clark	8039a2a6b3	freedreno/ir3: handle tex instrs w/ const offset Something we start to see with glsl_to_nir. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-30 12:27:27 -05:00
Rob Clark	f212d7dc50	freedreno/ir3: support load_front_face intrinsic With tgsi_to_nir we get this as a normal input with VARYING_SLOT_FACE. But glsl_to_nir plus nir_lower_system_values this becomes an intrinsic. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-30 12:11:54 -05:00
Rob Clark	9e05e8cb75	freedreno: limit string marker to max packet size Experimentally derived max size. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-30 12:10:13 -05:00
Ilia Mirkin	438d421f8b	nvc0: avoid crashing when there are holes in vertex array bindings When using the "shared" vertex array configuration strategy, we bind each of the buffers as a separate array. However there can be holes in such vertex buffer lists, so just emit a disable for those. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-01-29 22:10:42 -05:00
Ilia Mirkin	899b1b98a4	nvc0: enable atomic counters and ssbo Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 22:10:42 -05:00
Ilia Mirkin	48cf392c0e	nv50/ir: handle new TGSI MEMBAR opcode Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 21:22:48 -05:00
Ilia Mirkin	df043f0764	nvc0/ir: fix atomic compare-and-swap arguments Teach the emitter that the two registers are sequential, and drop the second arg entirely, in favor of a double-wide first argument. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 21:22:48 -05:00
Ilia Mirkin	7b9a77b905	nv50/ir: add support for indirect buffer loading Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 21:22:48 -05:00
Ilia Mirkin	2c4eeb0b5c	nv50/ir: add SUQ op by reading the info from driver constbuf Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 21:22:47 -05:00
Ilia Mirkin	c3083c7082	nv50/ir: add support for BUFFER accesses This largely leaves the existing image logic alone. When image support is added this will have to be harmonized somehow. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 21:22:47 -05:00
Ilia Mirkin	abe427ebd2	nvc0: handle shader buffer memory barrier Issue a MEM_BARRIER. No idea if this is sufficient. As there are no tests for this, it'll have to do for now. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 21:22:38 -05:00
Ilia Mirkin	fe01be4ad5	nvc0: add state management for shader buffers (address, length) pairs are uploaded to the driver constbuf as well to make these values available to the shaders. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 21:06:07 -05:00
Ilia Mirkin	b4688c4615	nvc0: double per-shader stage driver constants area We need to store a lot more info now with per-buffer address/size. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-29 21:06:06 -05:00
Ilia Mirkin	ae725d5746	trace: add support for set_shader_buffers Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) v1 -> v2: add arg_begin/arg_end around buffer array Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2016-01-29 21:05:47 -05:00
Ilia Mirkin	fea25db925	st/mesa: enable ARB_shader_storage_buffer_object when supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-29 21:05:47 -05:00
Ilia Mirkin	6fb8fac853	st/mesa: add shader buffer barrier bit Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-29 21:05:47 -05:00
Ilia Mirkin	792bab24ac	st/mesa: add support for memory barrier intrinsics Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2) v1 -> v2: use TGSI_MEMBAR defines	2016-01-29 21:05:47 -05:00
Ilia Mirkin	c0e1c54a4f	st/mesa: use RESQ to find buffer size Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-29 21:05:47 -05:00
Ilia Mirkin	6880036694	st/mesa: add support for SSBO binding and GLSL intrinsics Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> v1 -> v2: some 80 char reformatting	2016-01-29 21:05:46 -05:00
Ilia Mirkin	9d6f9ccf6b	st/mesa: add atomic counter support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-29 21:05:46 -05:00
Ilia Mirkin	0fddb677e6	mesa: add PROGRAM_IMMEDIATE, PROGRAM_BUFFER This makes PROGRAM_IMMEDIATE a first-class gl_register_file type, and adds PROGRAM_BUFFER to the list. These are used purely inside glsl_to_tgsi conversion. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-29 21:05:35 -05:00
Ilia Mirkin	35f8488668	glsl: keep track of ssbo variable being accessed, add access params Currently any access params (coherent/volatile/restrict) are being lost when lowering to the ssbo load/store intrinsics. Keep track of the variable being used, and bake its access params in as the last arg of the load/store intrinsics. If the variable is accessed via an instance block, then 'variable' points to the instance block variable and not the field inside the instance block that we are accessing. In order to check access parameters for the field itself we need to detect this case and keep track of the corresponding field struct so we can extract the specific field access information from there instead. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) v1 -> v2: add tracking of struct field v2 -> v3: minor adjustments based on Iago's feedback Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-01-29 21:05:08 -05:00
Ilia Mirkin	2b089c7ffe	glsl: always initialize image_* fields, copy them on interface init Interfaces can have image properties set in case they are buffer interfaces. Make sure not to lose this information. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-29 21:04:56 -05:00
Ilia Mirkin	2ccc42fd2c	tgsi: add MEMBAR opcode to handle memoryBarrier* GLSL intrinsics Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) v1 -> v2: add defines for the various bits Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-29 21:04:36 -05:00
Michel Dänzer	30fcf241e1	winsys/amdgpu: Process RADEON_FLAG_* independently from RADEON_DOMAIN_* In particular, AMDGPU_GEM_CREATE_CPU_GTT_USWC can affect even BOs created in VRAM if they get evicted to GTT. In general there's no need to restrict any of the flags to any particular domains. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-01-29 16:06:06 +09:00
Michel Dänzer	62f837e2ea	winsys/amdgpu: Handle RADEON_FLAG_NO_CPU_ACCESS Failing to do this was resulting in the kernel driver unnecessarily leaving open the possibility of CPU access to tiled BOs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93862 (This change shouldn't be backported to stable branches, because released versions of xf86-video-amdgpu unnecessarily try to map the front buffer) Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-01-29 16:06:06 +09:00
Karol Herbst	29d09f8747	nv50/ir: optimize mad/fma with third argument 0 to mul Very modest effect, but it's clearly the right thing to do. total instructions in shared programs : 6131491 -> 6131398 (-0.00%) total gprs used in shared programs : 910157 -> 910131 (-0.00%) total local used in shared programs : 15328 -> 15328 (0.00%) local gpr inst bytes helped 0 55 85 85 hurt 0 26 20 20 Signed-off-by: Karol Herbst <nouveau@karolherbst.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-28 15:59:41 -05:00
Karol Herbst	3aa681449e	nv50/ir: run DCE backwards Reduces calls up to 50% Signed-off-by: Karol Herbst <nouveau@karolherbst.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-28 15:34:29 -05:00
Karol Herbst	978ae28ca2	nv50/ir: optimize shl(shr(a, c), c) to and(a, ~((1 << c) - 1)) Following shader-db results on GK110: total instructions in shared programs : 6141510 -> 6131491 (-0.16%) total gprs used in shared programs : 910187 -> 910157 (-0.00%) total local used in shared programs : 15328 -> 15328 (0.00%) local gpr inst bytes helped 0 18 821 821 hurt 0 0 0 0 Signed-off-by: Karol Herbst <nouveau@karolherbst.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-28 15:34:22 -05:00
Ilia Mirkin	089f605439	glsl: disallow implicit conversions in ESSL shaders Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-28 11:31:19 -05:00
Axel Davy	dda7a84986	radeonsi: Add option for SI scheduler Add a debug option to select the LLVM SI Machine Scheduler. R600_DEBUG=sisched Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-28 17:22:44 +01:00
Samuel Iglesias Gonsálvez	f9c43dd22f	glsl: double-precision values don't support interpolation ARB_gpu_shader_fp64 spec says: "This extension does not support interpolation of double-precision values; doubles used as fragment shader inputs must be qualified as "flat"." Fixes the regressions added by commit `781d278`: arb_gpu_shader_fp64-double-gettransformfeedbackvarying arb_gpu_shader_fp64-tf-interleaved arb_gpu_shader_fp64-tf-interleaved-aligned arb_gpu_shader_fp64-tf-separate Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93878 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-01-28 11:35:03 +01:00
Eric Anholt	3fba517bdd	vc4: Throttle outstanding rendering after submission. Just make sure that after we've submitted, we get to at least 5 (global) submits ago before we go on to do more. Prevents up to seconds of lag with window movement in X with xcompmgr -c. There may be useful tuning to do in the future, but for now this gets us usability. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2016-01-27 20:05:37 -08:00
Eric Anholt	2a449ce7c9	vc4: Don't record the seqno of a failed job submit. On an error return, the returned seqno will probably be unset, so we'd lose track of what we've submitted so far for waiting on in the future. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2016-01-27 20:05:37 -08:00
Ben Widawsky	0e06f76a84	i965/skl: Utilize new 5th bit for gateway messages Modify comment as spotted by Matt, and Chris Forbes Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-27 17:12:56 -08:00
Ilia Mirkin	34c2c7c61e	glsl: only expose double mod when doubles are available Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-27 15:15:10 -05:00
Karol Herbst	19ae5de981	nv50/ir: fix memory corruption when spilling and redoing RA When RA fails, and we spill, we have to clean everything up before doing RA again. We were forgetting to reset the hi/lo linked lists - at least the hi list is guaranteed to still have pointers to now-deleted RIG nodes. Signed-off-by: Karol Herbst <nouveau@karolherbst.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-01-26 17:55:06 -05:00
Timothy Arceri	d580a979a4	glsl: remove old FINISHME This should have been removed long ago. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-01-27 09:15:21 +11:00
Marek Olšák	98cebc913c	configure.ac: don't require EGL/DRM and GBM if OpenGL is disabled This allows building VDPAU/OMX/VA drivers without OpenGL and its dependencies. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-01-26 19:07:03 +01:00
Jan Vesely	efc4142acd	r600,compute: Plug few memory leaks v2: drop inline keyword drop radeon_llvm_dispose_kernel_module wrapper v3: move definitions to .c file use in radeonsi Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-01-26 19:04:38 +01:00
Jan Vesely	e1dcd333e4	r600: Typos and whitespace fixes Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-26 19:01:22 +01:00
Marek Olšák	2924ca131f	radeonsi: fix clover crash caused by `ce1e7784d0` Trivial.	2016-01-26 18:53:41 +01:00
Marek Olšák	af57507e4f	radeonsi: fix shader precompilation for shader-db The addition of spi_shader_col_format killed all color outputs in precompiled shaders. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1) v2: also set the alpha func (trivial)	2016-01-26 18:49:50 +01:00
Ilia Mirkin	38c63abf09	glsl: add GL_OES_geometry_point_size and conditionalize gl_PointSize For now this will be enabled in tandem with GL_OES_geometry_shader. Should a driver come along that wants to separate them out, another enable can be added. Also adds the missed GL_OES_geometry_shader define in glcpp. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2016-01-26 12:36:15 -05:00
Emil Velikov	eb63640c1d	glsl: move to compiler/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-26 16:08:33 +00:00
Emil Velikov	a39a8fbbaa	nir: move to compiler/ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-26 16:08:30 +00:00
Emil Velikov	f694da80c7	compiler: move the glsl_types C wrapper alongside their C++ brethren At a later stage we might want to split out the NIR specific [XXX: which one was it], as to make things more obvious and rename the files appropriately. This patch aims to split it out of nir. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-26 16:08:27 +00:00
Emil Velikov	24f984f64a	nir: move glsl_types.{cpp,h} to compiler Allows us to remove the SCons workaround :-) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-26 16:08:24 +00:00
Emil Velikov	1a882fd2ee	nir: move shader_enums.[ch] to compiler This way one can reuse it in glsl, nir or other infrastructure without pulling nir as dependency. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-26 16:08:20 +00:00
Emil Velikov	2f86383091	compiler: introduce a libcompiler static library Currently it's an empty library, although it'll be used to store common code between GLSL and NIR that is compiler specific (rather than generic as the one in src/util). XXX: strictly speaking we could add a python/mako parser to generate the relevant files instead of including builtin_type_macros.h in such a manner. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-26 16:07:27 +00:00
Nicolai Hähnle	41875ac4ed	gallium/ddebug: add 'verbose' option This currently just writes out the name of dump files, which can be useful to easily correlate those files with other log outputs (driver debug output, apitrace calls, etc.) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-26 09:58:55 -05:00
Nicolai Hähnle	f4c8fa4e49	gallium/ddebug: make 'noflush' also affect 'always' mode This changes the default behavior of 'always' mode to be consistent with hang detection mode. I have used this to more easily compare dumped command streams using diff. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-26 09:58:49 -05:00
Nicolai Hähnle	8894b5f008	radeonsi: use llvm.amdgcn.s.barrier instead of llvm.AMDGPU.barrier.local The new name for the intrinsic was introduced in LLVM r258558. v2: use ternary operator instead of preprocessor Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-26 09:57:06 -05:00
Ben Widawsky	a443b5b732	i965/bxt: Fix conservative wm thread counts. When setting the conservative thread counts, I halved everything. That isn't correct for the wm, which has nothing to do with actual thread counts. I suck. BXT only has 1 slice, and there is some ambiguity about subslices, so just reserve the max possible for now. It looks like this might fix: piglit.spec.glsl-1_50.execution.variable-indexing.gs-output-array-vec4-index-wr.bxtm64. I kind of question why that is, but it is what Jenkins says. Mark is currently running some of the other blacklisted tests on this patch. (it affects anything requiring scratch space). Cc: mesa-stable <mesa-stable@lists.freedesktop.org> Cc: Neil Roberts <neil@linux.intel.com> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Mark Janes <mark.a.janes@intel.com>	2016-01-25 15:51:17 -08:00
Ian Romanick	2542871387	meta: Use internal functions to set texture parameters _mesa_texture_parameteriv is used because (the more obvious) _mesa_texture_parameteri just stuffs the parameter in an array and calls _mesa_texture_parameteriv. This just cuts out the middleman. As a side bonus we no longer need to check that ARB_stencil_texturing is supported. The test doesn't allow non-supporting implementations to avoid any work, and it's redundant with the value-changed test. Fix bug #93717 because the state restore commands at the bottom of _mesa_meta_GenerateMipmap no longer depend on the bound state. Fixes piglit arb_direct_state_access-generatetexturemipmap with the changes recently sent to the piglit mailing list. See the bugzilla entry for more info. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93717 Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-25 10:43:47 -08:00
Ian Romanick	18b0ba340b	meta/blit: Restore GL_DEPTH_STENCIL_TEXTURE_MODE state for GL_TEXTURE_RECTANGLE Commit `c246828c` added the code to save and restore the stencil texturing mode. The restore, however, was erroneously inside the 'target != GL_TEXTURE_RECTANGLE' block. Fixes piglit test 'arb_stencil_texturing-blit_corrupts_state GL_TEXTURE_RECTANGLE'. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-25 10:43:47 -08:00
Ian Romanick	f7800fadff	meta/copy_image: Fix typo in comment Trivial. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-25 10:43:47 -08:00
Ian Romanick	bae8a4f05b	mesa: Don't include meta.h Commit `055093e` removed the call to _mesa_meta_in_progress, and meta.h has not been necessary in src/mesa/main/enable.c since. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-25 10:43:47 -08:00
Nicolai Hähnle	1067e6eb55	radeonsi: add DCC buffer for sampler views on new CS This fixes a VM fault and possible lockup in high memory pressure situations. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-25 10:16:12 -05:00
Nicolai Hähnle	0bacbf5b7e	radeonsi: emit rw_buffers for tes_shader only if tes_shader present Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-25 10:16:08 -05:00
Nicolai Hähnle	2385b253c6	radeonsi: do not set the shader->key for gs copy shaders The key for a geometry shader would be interpreted as the key for a vertex shader further down the line, which really doesn't make sense. This does not affect the contents of shader->key because geometry shaders don't have any key entries anyway. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-25 10:16:05 -05:00
Nicolai Hähnle	46c0ba60c6	radeonsi: si_llvm_emit_vs_epilogue is never used with gs copy shaders Hence remove the misleading branch on is_gs_copy_shader. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-25 10:16:02 -05:00
Nicolai Hähnle	c55b9499d5	radeonsi: move is_gs_copy_shader to si_shader_context It is only used during shader creation now, so no need to keep it around afterwards. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-25 10:16:00 -05:00
Nicolai Hähnle	a7754ffd31	radeonsi: replace use of is_gs_copy_shader in si_shader_vs We now have an explicit parameter that contains the same information, and this will allow us to get rid of is_gs_copy_shader in the si_shader struct. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-25 10:15:55 -05:00
Nicolai Hähnle	004fcd4230	radeonsi: ensure that VGT_GS_MODE is sent when necessary Specifically, when the API switches from using a GS to not using a GS and then back to using the same GS again, we do not have to re-send all the GS state, but we do have to send VGT_GS_MODE. So make VGT_GS_MODE consistently be a part of the VS state. This fixes a rendering bug in Dolphin, but surely other applications are affected as well. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93648 Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-25 10:15:31 -05:00
Nicolai Hähnle	9f89bd69df	radeonsi: extract the VGT_GS_MODE calculation into its own function Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-25 10:15:08 -05:00
Samuel Pitoiset	429371f22a	trace: fix a segfault when tracing indirect draw calls Like other resources, the indirect draw buffer must be unwrapped. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-24 19:53:53 +01:00
Marek Olšák	24ea81a491	Revert "mesa: enable enums for OES_geometry_shader" This reverts commit `67e3098703`. It breaks a bunch of geometry shader tests, such as "spec@!opengl 3.2@minmax" and others depending on the glGet queries.	2016-01-24 15:47:39 +01:00
Marek Olšák	e707b9d8ba	winsys/amdgpu: optionally use buffer lists with all allocated buffers Set RADEON_ALL_BOS=1 to use it. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-23 17:01:54 +01:00
Kenneth Graunke	ae9f73ea40	glsl: Conditionalize atan2 math. In the old hand-written implementation of atan2, the calculation of atan(y/x) was performed conditionally in the "then" block of the outermost if statement. I believe I accidentally lifted this out into unconditional code when converting to IR builder. For reference, the original hand-written IR is visible in commit `722eff674b`. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: Erik Faye-Lund <kusmabite@gmail.com>	2016-01-22 21:03:00 -08:00
Rob Herring	7ee8954753	virgl: enable building on Android This is just a copy-n-paste and rename of vc4 Android makefiles. Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-23 12:35:29 +10:00
Rob Herring	657dc4f533	virtio_gpu: Add PCI ID to driver map Add the virtio-gpu PCI ID so the driver probing works. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-23 12:35:24 +10:00
Kenneth Graunke	b3340cd32a	i965: Implement a drirc workaround for broken dual color blending. OpenGL's dual color blending feature was specified so that an implementation could support both multiple render targets (MRT) and dual source blending. Fragment shader outputs specify both "location" (the render target number) and "index" (either color 0 or 1). I believe DirectX only has the notion of "location" - if using dual color blending, location 0 or 1 will specify the operands. If not, then location means the render target index. The two features can't be used together. As such, some applications mistakenly try to use <loc = 0, index = 0> and <loc = 1, index = 0> in a shader used for dual color blending with a single render target, rather than the correct <loc = 0, index = 0> and <loc = 0, index = 1>. In particular, Unigine Heaven 4.0 and Valley 1.0 suffer from this bug. Unigine is aware of the problem, and quickly developed a fix, but has not bothered to change the download link on their website to a working copy in over a year. People were still using the broken version and complaining. We tried working around this by disabling dual color blending, but that apparently hurts performance, and people were once again unhappy. On i965, dual source blending is achieved by using different framebuffer write messages than normal rendering. So, we have to compile different code for the two cases. We're not being pedantic: we actually have to know in order to function. Normally, dual source blending is detectable in the shader: if a shader has an output with index = 1, then it's meant for blending, not MRT. With the broken inputs, they're indistinguishable, so we can only tell by looking at the current GL state. This patch implements a new drirc workaround: export dual_color_blend_by_location=true which makes the i965 driver detect when OpenGL state is configured for dual source blending, and recompile the fragment shader to use the right messages. 
In that case, we allow either location = 1 or index = 1 to specify the second source for the blending equations. It also re-enables GL_ARB_blend_func_extended for Unigine. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92233 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-22 14:14:26 -08:00
Marek Olšák	cd9c07e7cd	radeonsi: add ETC1 support for Stoney It's a subset of ETC2. Tested. For more information, see page 42 and onward: http://www.graphicshardware.org/previous/www_2007/presentations/strom-etc2-gh07.pdf Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-01-22 22:05:42 +01:00
Marek Olšák	b3bac55621	radeonsi: change LLVM intrinsics for BREV, CLAMP, EX2 Requested by Matt Arsenault. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 22:05:42 +01:00
Marek Olšák	ce1e7784d0	radeonsi: add max waves / SIMD to shader stats (v2) v2: account for LDS usage in PS the limit is per SIMD, not per CU Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 22:05:42 +01:00
Marek Olšák	5944f3d2fc	radeonsi: enable late VS allocation (v3) v2: take the number of CUs into account v3: change in LS allocation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 22:05:42 +01:00
Marek Olšák	97648229e4	radeonsi: allow using all CUs for tessellation and on-chip GS (v2) v2: After more discussion with hw teams, the kernel already contains the optimal settings allowing us to use all CUs. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 22:05:42 +01:00
Jeremy Huddleston Sequoia	7c99557f53	Revert "mesa: Deal with size differences between GLuint and GLhandleARB in GetAttachedObjectsARB" This reverts commit `739ac3d39d`. This will be done a different way. See http://lists.freedesktop.org/archives/mesa-dev/2016-January/105642.html	2016-01-22 13:02:01 -08:00
Ben Widawsky	315cda6715	i965/fs: Remove unused count from vs urb setup This was originally removed here: commit `031d350132` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Tue Aug 25 16:59:12 2015 -0700 i965/vs: Unify URB entry size/read length calculations between backends. Then added back: commit `bd198b9f0a` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Fri Aug 14 16:01:33 2015 -0700 i965/vs: Simplify fs_visitor's ATTR file. Note that the authorship dates are out of order, but the above reflects the order of the commit dates. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-22 10:38:41 -08:00
Nicolai Hähnle	d76bd85c35	Revert "radeonsi: fix discard-only fragment shaders (v2)" This reverts commit `843855bbf0`. It became redundant due to Marek's earlier pushed `8667a1ae` which achieves the same thing.	2016-01-22 12:40:26 -05:00
Nicolai Hähnle	843855bbf0	radeonsi: fix discard-only fragment shaders (v2) When a fragment shader is used that has no outputs but does conditional discard (KILL_IF), all fragments are killed without this patch. By comparing various register settings, my conclusion is that the exec mask is either not properly forwarded to the DB by NULL exports or ends up being unused, at least when there is _only_ a NULL export (the ISA documentation claims that NULL exports can be used to override a previously exported exec mask). Of the various approaches I have tried to work around the problem, this one seems to be the least invasive one. v2: take discard by alpha test into account as well Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93761 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-22 11:59:50 -05:00
Marta Lofstedt	3e640c256a	mesa: Update _mesa_has_geometry_shaders Updates the _mesa_has_geometry_shaders function to also look for OpenGL ES 3.1 contexts that have OES_geometry_shader enabled. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-22 17:13:55 +01:00
Marta Lofstedt	ae4e4ba06d	glsl: add support for GL_OES_geometry_shader This adds glsl support of GL_OES_geometry_shader for OpenGL ES 3.1. Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-22 17:13:55 +01:00
Marta Lofstedt	67e3098703	mesa: enable enums for OES_geometry_shader Enable GL_OES_geometry_shader enums for OpenGL ES 3.1. V4: EXTRA tokens updated according to comments from Ilia Mirkin. Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-22 17:13:55 +01:00
Marta Lofstedt	af5a14d1e0	glapi: add GL_OES_geometry_shader extension Add xml definitions for the GL_OES_geometry_shader extension and expose the extension for OpenGL ES 3.1. Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-22 17:13:55 +01:00
Emil Velikov	bb58b59998	docs: correct 11.1.1 release year Seems like I wasn't ready to let 2015 go :-) Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-01-22 15:50:48 +00:00
Emil Velikov	45c5000ffc	docs: add news item and link release notes for 11.0.9 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-01-22 15:49:47 +00:00
Emil Velikov	87b0a52de8	docs: add sha256 checksums for 11.0.9 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-01-22 15:47:12 +00:00
Emil Velikov	51e8152186	docs: add release notes for 11.0.9 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-01-22 15:47:11 +00:00
Marek Olšák	a9d5842ec0	radeonsi: add ETC2 support for Stoney Tested and working.	2016-01-22 15:36:14 +01:00
Marek Olšák	6f428328d3	radeonsi: implement SAMPLEPOS system value without a constant buffer load We always get per-sample input position. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	2b66bc87d4	winsys/amdgpu: compute num_good_compute_units correctly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	0d8e4f958f	gallium/radeon: rename max_compute_units -> num_good_compute_units radeon sets this correctly, but not amdgpu Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	99dfeb01bd	radeonsi: disable SPI color outputs the shader doesn't write Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	f6360de8c0	radeonsi: use all SPI color formats because not using SPI_SHADER_32_ABGR doubles fill rate. We should also get optimal performance if alpha isn't needed or blending isn't enabled. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	933e3c4145	radeonsi: use 32_AR for alpha-to-coverage without a color buffer This avoids the fp16 packing instructions. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	f1f0158837	radeonsi: add shader conversion code for all SPI color formats Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	e28b8530b9	radeonsi: set CB_SHADER_MASK according to SPI color formats Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	8667a1aea2	radeonsi: use SPI_SHADER_COL_FORMAT fields instead of export_16bpc This does change the behavior slightly: If a shader writes COLOR[i] and that color buffer isn't bound, the shader will export MRT_NULL instead and discard the IR tree that calculates the output. The only exception is alpha-to-coverage, which requires an alpha export. v2: - update a comment about 16BPC - account for MRTZ when fixing alpha-test/kill Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Marek Olšák	0446ea9d08	radeonsi: don't enable blending if colormask == 0 most likely useless, but doesn't hurt Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-22 15:02:40 +01:00
Ilia Mirkin	dac2964f3e	glsl: always compute proper varying type, irrespective of varying packing Normally there's a producer and consumer, and the producer var gets picked. In both the vertex->gs and tes->gs cases, that's the un-arrayed version. In the SSO case, however, there is no producer. So we picked the arrayed GS variable, and as a result, used more slots than we should. More critically, these slots would also no longer line up with the producer's calculation. To fix this, we need to fix up the type of the variable based on stage no matter what. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93650 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-01-22 08:48:27 -05:00
Emil Velikov	54702c2fa1	egl/dri2: expose srgb configs when KHR_gl_colorspace is available Otherwise the user has no way of using it, and we'll try to access the linear one. v2: - Bail out when KHR_gl_colorspace is missing and srgb is set (Marek) Cc: Chih-Wei Huang <cwhuang@android-x86.org> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Fixes: c2c2e9ab604(egl: implement EGL_KHR_gl_colorspace (v2)) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91596 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Mauro Rossi <issor.oruam@gmail.com>	2016-01-22 11:55:54 +00:00
Emil Velikov	f29a772a7e	targets/dri: android: use WHOLE static libraries By using whole static libraries the android buildsystem provides whole-archive (alike) solution. This means that we don't need to worry about the order of the static libraries and any reverse, recursive or circular dependencies that they have between one another. Without this the linker will discard any unused hunks of one library and we'll end up with unresolved symbols as those are required by another static library. This issue has become more prominent with the introduction of pipe-loader. Whole static libraries have been used in i915/i965 for a very long time, so we might do the same. v2: - Better commit message (Ilia) - Keep external dependencies as [normal] static libs (Mauro) Cc: mesa-stable@lists.freedesktop.org Cc: Mauro Rossi <issor.oruam@gmail.com> Reported-by: Mauro Rossi <issor.oruam@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-22 11:55:34 +00:00
Emil Velikov	72fda2b710	i915: correctly parse/set the context flags With an earlier commit we've split the flags parsing to a separate function, but forgot to update all the dri modules to use it. Noticed when we've enabled KHR_debug for every dri module - fdo#93048 Fixes: `38366c0c6e` "dri_util: Don't assume __DRIcontext->driverPrivate is a gl_context" Cc: Mark Janes <mark.a.janes@intel.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2016-01-22 11:54:01 +00:00
Iago Toral Quiroga	ab0c7c0829	glsl/lower_instructions: fix regression in dldexp_to_arith The commit `b4e198f47f` changed the offset and bits parameters of the bitfield insert operation from scalars to vectors. However, the lowering of ldexp on doubles operates on each vector component and emits scalar code (since it has to deal with the lower and upper 32-bit chunks of each double component), so it needs its bits and offset parameters to be scalars. Fixes fp64 regression (crash) in: spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-ldexp-dvec4.shader_test Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-22 08:14:11 +01:00
Eduardo Lima Mitev	263f829d2e	i965/vec4/tcs: Return NULL instead of false in brw_compile_tcs() brw_compile_tcs() is expected to return 'const unsigned *', so the compiler complains. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-21 16:16:26 -08:00
cstout	13b87e02b9	freedreno/a4xx: Add support for adreno 430 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-21 17:20:11 -05:00
Christian Gmeiner	66672e791c	freedreno: make opc array static const Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-21 17:20:11 -05:00
Rob Clark	bc1a37378c	freedreno: implement emit_string_marker Writes string to cmdstream in payload of a no-op packet. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-21 17:20:11 -05:00
Rob Clark	d6408372eb	gallium: add GREMEDY_string_marker Since the GREMEDY extensions are normally only exposed by the gremedy debugger (and could possibly trigger debug paths in the app), we don't expose the extension by default, but instead only with ST_DEBUG=gremedy. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-21 17:19:56 -05:00
Rob Clark	a6a99fbf05	mesa: wire up EmitStringMarker for KHR_debug The extension spec[1] describes DEBUG_TYPE_MARKER as "Annotation of the command stream". So for DEBUG_TYPE_MARKER, also pass the buf to the driver's EmitStringMarker() to be inserted in the command stream. [1] https://www.opengl.org/registry/specs/KHR/debug.txt Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-21 17:19:05 -05:00
Rob Clark	1f7a96e005	mesa: add GREMEDY_string_marker Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-21 17:19:05 -05:00
Neil Roberts	cbf0e64ee1	texobj: Remove redundant checks that the texture cube faces match size The texture mipmap completeness checking code was checking whether all of the faces have the same size. However this is pointless because the code just above it checks whether the face has the expected size calculated for the mipmap level anyway so the error condition could never be reached. This patch just removes it. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-21 21:45:53 +00:00
Neil Roberts	666d96d169	texobj: Fix the completeness checks for cube textures According to the GL 1.4 spec section 3.8.10, a cubemap texture is only complete if: • The level base arrays of each of the six texture images making up the cube map have identical, positive, and square dimensions. • The level base arrays were each specified with the same internal format. • The level base arrays each have the same border width. Previously the texture completeness code was only checking the first point. This patch makes it additionally check the other two. This fixes the following two dEQP tests: deqp-gles2.functional.texture.completeness.cube.format_mismatch_rgba_rgb_level_0_neg_z deqp-gles2.functional.texture.completeness.cube.format_mismatch_rgb_rgba_level_0_pos_z And also this Piglit test: spec/!opengl 2.0/incomplete-cubemap-format Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93792 Cc: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-21 21:45:18 +00:00
Grazvydas Ignotas	0153ff8379	r600g: don't leak driver const buffers The buffers are referenced from r600_update_driver_const_buffers() -> r600_set_constant_buffer() -> u_upload_data(), but nothing ever releases the reference. Similar case with driver_consts. Found using valgrind. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-21 15:36:24 -05:00
Jeremy Huddleston Sequoia	739ac3d39d	mesa: Deal with size differences between GLuint and GLhandleARB in GetAttachedObjectsARB Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Reviewed-by: Nicolai Hähnle <nhaehnle@gmail.com>	2016-01-21 09:18:06 -08:00
Jeremy Huddleston Sequoia	b20d6bf96d	mesa: Fix format warnings main/shaderapi.c:1318:51: warning: format specifies type 'unsigned int' but the argument has type 'GLhandleARB' (aka 'unsigned long') [-Wformat] _mesa_debug(ctx, "glDeleteObjectARB(%u)\n", obj); ~~ ^~~ %lu Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-21 09:18:06 -08:00
Jeremy Huddleston Sequoia	a087a09fa8	mesa: Fix some function prototype mismatching main/api_exec.c:543:36: warning: incompatible pointer types passing 'void (GLhandleARB, GLuint, const GLcharARB )' (aka 'void (unsigned long, unsigned int, const char )') to parameter of type 'void ()(GLuint, GLuint, const GLchar )' (aka 'void ()(unsigned int, unsigned int, const char )') [-Wincompatible-pointer-types] SET_BindAttribLocation(exec, _mesa_BindAttribLocation); ^~~~~~~~~~~~~~~~~~~~~~~~ ./main/dispatch.h:7590:88: note: passing argument to parameter 'fn' here static inline void SET_BindAttribLocation(struct _glapi_table disp, void (GLAPIENTRYP fn)(GLuint, GLuint, const GLchar )) { ^ main/api_exec.c:547:31: warning: incompatible pointer types passing 'void (GLhandleARB)' (aka 'void (unsigned long)') to parameter of type 'void ()(GLuint)' (aka 'void ()(unsigned int)') [-Wincompatible-pointer-types] SET_CompileShader(exec, _mesa_CompileShader); ^~~~~~~~~~~~~~~~~~~ ./main/dispatch.h:7612:83: note: passing argument to parameter 'fn' here static inline void SET_CompileShader(struct _glapi_table disp, void (GLAPIENTRYP fn)(GLuint)) { ^ main/api_exec.c:568:33: warning: incompatible pointer types passing 'void (GLhandleARB, GLuint, GLsizei, GLsizei , GLint , GLenum , GLcharARB )' (aka 'void (unsigned long, unsigned int, int, int , int , unsigned int , char )') to parameter of type 'void ()(GLuint, GLuint, GLsizei, GLsizei , GLint , GLenum , GLchar )' (aka 'void ()(unsigned int, unsigned int, int, int , int , unsigned int , char )') [-Wincompatible-pointer-types] SET_GetActiveAttrib(exec, _mesa_GetActiveAttrib); ^~~~~~~~~~~~~~~~~~~~~ ./main/dispatch.h:7711:85: note: passing argument to parameter 'fn' here static inline void SET_GetActiveAttrib(struct _glapi_table disp, void (GLAPIENTRYP fn)(GLuint, GLuint, GLsizei , GLsizei , GLint , GLenum , GLchar )) { ^ main/api_exec.c:571:35: warning: incompatible pointer types passing 'GLint (GLhandleARB, const GLcharARB )' (aka 'int (unsigned 
long, const char )') to parameter of type 'GLint ()(GLuint, const GLchar )' (aka 'int ()(unsigned int, const char )') [-Wincompatible-pointer-types] SET_GetAttribLocation(exec, _mesa_GetAttribLocation); ^~~~~~~~~~~~~~~~~~~~~~~ ./main/dispatch.h:7744:88: note: passing argument to parameter 'fn' here static inline void SET_GetAttribLocation(struct _glapi_table disp, GLint (GLAPIENTRYP fn)(GLuint, const GLchar )) { ^ main/api_exec.c:585:33: warning: incompatible pointer types passing 'void (GLhandleARB, GLsizei, GLsizei , GLcharARB )' (aka 'void (unsigned long, int, int , char )') to parameter of type 'void ()(GLuint, GLsizei, GLsizei , GLchar )' (aka 'void ()(unsigned int, int, int , char )') [-Wincompatible-pointer-types] SET_GetShaderSource(exec, _mesa_GetShaderSource); ^~~~~~~~~~~~~~~~~~~~~ ./main/dispatch.h:7788:85: note: passing argument to parameter 'fn' here static inline void SET_GetShaderSource(struct _glapi_table disp, void (GLAPIENTRYP fn)(GLuint, GLsizei, GLsizei , GLchar )) { ^ main/api_exec.c:597:29: warning: incompatible pointer types passing 'void (GLhandleARB)' (aka 'void (unsigned long)') to parameter of type 'void ()(GLuint)' (aka 'void ()(unsigned int)') [-Wincompatible-pointer-types] SET_LinkProgram(exec, _mesa_LinkProgram); ^~~~~~~~~~~~~~~~~ ./main/dispatch.h:7909:81: note: passing argument to parameter 'fn' here static inline void SET_LinkProgram(struct _glapi_table disp, void (GLAPIENTRYP fn)(GLuint)) { ^ main/api_exec.c:628:30: warning: incompatible pointer types passing 'void (GLhandleARB, GLsizei, const GLcharARB const , const GLint )' (aka 'void (unsigned long, int, const char const , const int )') to parameter of type 'void ()(GLuint, GLsizei, const GLchar const , const GLint )' (aka 'void ()(unsigned int, int, const char const , const int )') [-Wincompatible-pointer-types] SET_ShaderSource(exec, _mesa_ShaderSource); ^~~~~~~~~~~~~~~~~~ ./main/dispatch.h:7920:82: note: passing argument to parameter 'fn' here static inline void 
SET_ShaderSource(struct _glapi_table disp, void (GLAPIENTRYP fn)(GLuint, GLsizei, const GLchar const , const GLint )) { ^ main/api_exec.c:653:28: warning: incompatible pointer types passing 'void (GLhandleARB)' (aka 'void (unsigned long)') to parameter of type 'void ()(GLuint)' (aka 'void ()(unsigned int)') [-Wincompatible-pointer-types] SET_UseProgram(exec, _mesa_UseProgram); ^~~~~~~~~~~~~~~~ ./main/dispatch.h:8173:80: note: passing argument to parameter 'fn' here static inline void SET_UseProgram(struct _glapi_table disp, void (GLAPIENTRYP fn)(GLuint)) { ^ main/api_exec.c:655:33: warning: incompatible pointer types passing 'void (GLhandleARB)' (aka 'void (unsigned long)') to parameter of type 'void ()(GLuint)' (aka 'void ()(unsigned int)') [-Wincompatible-pointer-types] SET_ValidateProgram(exec, _mesa_ValidateProgram); ^~~~~~~~~~~~~~~~~~~~~ ./main/dispatch.h:8184:85: note: passing argument to parameter 'fn' here static inline void SET_ValidateProgram(struct _glapi_table disp, void (GLAPIENTRYP fn)(GLuint)) { main/dlist.c:9457:26: warning: incompatible pointer types passing 'void (GLhandleARB)' (aka 'void (unsigned long)') to parameter of type 'void ()(GLuint)' (aka 'void ()(unsigned int)') [-Wincompatible-pointer-types] SET_UseProgram(table, save_UseProgramObjectARB); ^~~~~~~~~~~~~~~~~~~~~~~~ ./main/dispatch.h:8173:80: note: passing argument to parameter 'fn' here static inline void SET_UseProgram(struct _glapi_table *disp, void (GLAPIENTRYP fn)(GLuint)) { ^ 1 warning generated. Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-21 09:18:06 -08:00
Andreas Boll	5d4b20267d	glapi: Build glapi_gentable.c only on Darwin Removes the public symbol _glapi_create_table_from_handle from libGL.so.1.2.0 on all platforms except Darwin. Since the symbol is not used on other platforms it makes sense to build glapi_gentable.c only on Darwin. As a side effect it accelerates the build a bit and reduces the size of libGL.so.1.2.0 as follows: size lib/libGL.so.1.2.0 on my system shows text data bss dec hex filename 469211 21848 2720 493779 788d3 lib/libGL.so.1.2.0 before 420988 11240 2720 434948 6a304 lib/libGL.so.1.2.0 after A little bit of history: _glapi_create_table_from_handle was introduced in commit `85937f4c0d` Author: Jeremy Huddleston <jeremyhu@apple.com> Date: Thu Jun 9 16:59:49 2011 -0700 glapi: Add API that can create a _glapi_table from a dlfcn handle Example usage: void *handle = dlopen(opengl_library_path, RTLD_LOCAL); struct _glapi_table *disp = _glapi_create_table_from_handle(handle, "gl"); Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com> and the only user in mesa was added in commit `f35913b96e` Author: Jeremy Huddleston <jeremyhu@apple.com> Date: Thu Jun 9 17:29:51 2011 -0700 apple: Use _glapi_create_table_from_handle to initialize our dispatch table Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com> gl_gentable.py was also used for XQuartz in xserver 1.11 - 1.14. v2: Fix typos in commit message Add missing XORG_GLAPI_OUTPUTS += \ into src/mapi/glapi/gen/Makefile.am Add glapi_gentable.c to EXTRA_DIST for inclusion in the release tarball v3: Fix commit message: s/gl_gentable.c/glapi_gentable.c/ Reported-by: Arlie Davis <arlied@google.com> Cc: Jeremy Huddleston <jeremyhu@apple.com> Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-21 15:04:02 +01:00
Arlie Davis	daa775b58e	mesa: Reduce libGL.so binary size by about 15% This patch significantly reduces the size of the libGL.so binary. It does not change the (externally visible) behavior of libGL.so at all. gl_gentable.py generates a function, _glapi_create_table_from_handle. This function allocates a large dispatch table, consisting of 1300 or so function pointers, and fills this dispatch table by doing symbol lookups on a given shared library. Previously, gl_gentable.py would generate a single, very large _glapi_create_table_from_handle function, with a short cluster of lines for each entry point (function). The idiom it generates was a NULL check, a call to snprintf, a call to dlsym / GetProcAddress, and then a store into the dispatch table. Since this function processes a large number of entry points, this code is duplicated many times over. We can encode the same information much more compactly, by using a lookup table. The previous total size of _glapi_create_table_from_handle on x64 was 125848 bytes. By using a lookup table, the size of _glapi_create_table_from_handle (and the related lookup tables) is reduced to 10840 bytes. In other words, this enormous function is reduced by 91%. The size of the entire libGL.so binary (measured when stripped) itself drops by 15%. So the purpose of this change is to reduce the binary size, which frees up disk space, memory, etc. size lib/libGL.so.1.2.0 on my system shows (Andreas) text data bss dec hex filename 565947 11256 2720 579923 8d953 lib/libGL.so.1.2.0 before 469211 21848 2720 493779 788d3 lib/libGL.so.1.2.0 after v2: Incorporate Matt's feedback. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Tested-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>	2016-01-21 15:03:53 +01:00
Ilia Mirkin	daa0fd7843	nv50/ir: 64-bit splitting fixes Take reading shader outputs into account, and use setFlagsDef for the carry since we rely on having i->flagsDef being set. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 19:37:34 -05:00
Ilia Mirkin	c0b66d96d7	gk110/ir: allow carry to be set/read by imad Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 19:37:34 -05:00
Ilia Mirkin	73c9ca7544	gm107/ir: add carry emission to LOP and IADD Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 19:37:34 -05:00
Ilia Mirkin	71a489633b	gm107/ir: add ATOM and CCTL support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 19:37:34 -05:00
Ilia Mirkin	57b0025814	gm107/ir: set LD/ST address width bit Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 19:37:34 -05:00
Ilia Mirkin	2e533ab74b	gk110/ir: fix double-wide vm address	2016-01-20 19:37:34 -05:00
Ilia Mirkin	8c2dfe05c5	gk110/ir: add OP_CCTL handling	2016-01-20 19:37:33 -05:00
Ilia Mirkin	7d9a97d6be	gk110/ir: add atomic op emission, fix gmem loads Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 19:37:33 -05:00
Roland Scheidegger	dc8b9bd0aa	llvmpipe: warn about illegal use of objects in different contexts Doing that is clearly a bug. We can't quite assert as st/mesa may hit this, but increase at least visibility of it a bit. (For the non-refcounted objects it would be illegal too, but we can't detect that unless we'd store the context ourselves. Plus, those don't tend to cause random crashes at context or object destruction time... So just sampler views, surfaces and so targets for now.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-21 00:09:55 +01:00
Roland Scheidegger	e925ec8811	llvmpipe,i915: add back NEW_RASTERIZER dependency when computing vertex info I removed this mistakenly in `2dbc20e456`. I actually thought it should not be necessary and a piglit run didn't show any differences, but this shouldn't have been in there. draw_prepare_shader_outputs() is in fact dependent on NEW_RASTERIZER. The new polygon-mode-facing test indeed shows why this is necessary, there's lots of invalid reads and writes with valgrind (also crashes without valgrind), because the pre-pipeline vertex size doesn't match the post-pipeline vertex size (note this won't help much with stages which don't have the prepare hook which can grow the vertex size, in particular the wide point stage, but this isn't used by llvmpipe). The test still won't pass, of course, but it is only usage of uninitialized values now, which is much less dangerous... (Albeit I'm pretty sure for i915 it really is not needed anymore as it doesn't care about the extra outputs and doesn't call draw_prepare_shader_outputs().) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-21 00:09:55 +01:00
Ilia Mirkin	dc3ac418bf	nv50/ir: don't flip SHL(ADD) into ADD(SHL) if ADD sources have modifiers Fixes: `31fde8fa` (nv50/ir: flip shl(add, imm) into add(shl, imm)) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 18:03:36 -05:00
Ilia Mirkin	3a63576168	gk110/ir: fix load from shared memory It was accidentally using the store opcode. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 17:16:09 -05:00
Ilia Mirkin	9f23007a7a	gk110/ir: add partial BAR support This is enough for the plain TGSI BARRIER implementation. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-20 17:16:09 -05:00
Tapani Pälli	f1152c3455	Revert "glsl: move uniform calculation to link_uniforms" This reverts commit `4475d8f916`.	2016-01-20 22:04:46 +02:00
Tapani Pälli	4475d8f916	glsl: move uniform calculation to link_uniforms Patch moves uniform calculation to happen during link_uniforms, this is possible with help of UniformRemapTable that has all the reserved locations. Location assignment for implicit locations is changed so that we utilize also the 'holes' that explicit uniform location assignment might have left in UniformRemapTable, this makes it possible to fit more uniforms as previously we were lazy here and wasting space. Fixes following CTS tests: ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max-array v2: code cleanups, increment NumUniformRemapTable correctly, fix find_empty_block to work properly and add some more comments. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2016-01-20 07:24:39 +02:00
Timothy Arceri	0a6a05c8ea	glsl: add missing explicit_image_format flag to has_layout() Fixes piglit regression after fixes to duplicate layout rules. Previously catching multiple layouts was relying on the code meant to catch duplicates within a single layout(...), this change triggers the rules for multiple layouts. Cc: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-01-20 15:45:56 +11:00
Roland Scheidegger	b21973acaa	llvmpipe: turn depth clears into full depth/stencil clears for d24x8 formats If we have a d24x8 format, there is no stencil. Therefore, we can always clear these bits too, which means this will be some kind of memset rather than read-modify-write. This is good for some 7% increase or so in gears with huge window size - seems to have a bigger effect if things aren't in caches. Of course, any real app won't spend nearly as much time comparatively in clearing depth buffer in the first place, so the speedup will be much lower. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-20 01:45:56 +01:00
Francisco Jerez	f8ac314cc2	i965: Implement compute sampler state atom. Fixes a number of GLES31 CTS failures and hangs on various hardware: ES31-CTS.texture_gather.plain-gather-depth-2d ES31-CTS.texture_gather.plain-gather-depth-2darray ES31-CTS.texture_gather.plain-gather-depth-cube ES31-CTS.texture_gather.offset-gather-depth-2d ES31-CTS.texture_gather.offset-gather-depth-2darray ES31-CTS.layout_binding.sampler2D_layout_binding_texture_ComputeShader ES31-CTS.layout_binding.sampler2DArray_layout_binding_texture_ComputeShader ES31-CTS.explicit_uniform_location.uniform-loc-types-samplers ES31-CTS.compute_shader.resources-texture Some of them were actually passing by luck on some generations even though we weren't uploading sampler state tables explicitly for the compute stage, most likely because they relied on the cached sampler state left from previous rendering to be close enough. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92589 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93312 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93325 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93407 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93725 Reported-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-19 16:11:04 -08:00
Francisco Jerez	9e4c8acd78	i965: Trigger CS state reemission when new sampler state is uploaded. This reuses the NEW_SAMPLER_STATE_TABLE state bit (currently only used on pre-Gen7 hardware) to signal that the sampler state tables have changed in order to make sure that the GPGPU interface descriptor is updated. Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-19 16:11:04 -08:00
Kenneth Graunke	4fc018576b	glsl: Don't abbreviate tessellation shader stage names. I have a patch that writes shaders as .shader_test files, and it uses this function to create the headers (i.e. [vertex shader]). [tess ctrl shader] isn't a valid shader_runner header - it's spelled out as [tessellation control shader]. There's no real reason to abbreviate it, so spell it out. v2: Rebase on Rob's patches to move the code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-01-19 14:57:42 -08:00
Timothy Arceri	11fc7ad62e	mesa: remove link validation that should be done elsewhere Even if re-linking fails rendering shouldn't fail as the previous successfully linked program will still be available. It also shouldn't be possible to have an unlinked program as part of the current rendering state. This fixes a subtest in: ES31-CTS.sepshaderobjs.StateInteraction This change should improve performance on CPU limited benchmarks as noted in commit `d6c6b186cf`. From Section 7.3 (Program Objects) of the OpenGL 4.5 spec: "If a program object that is active for any shader stage is re-linked unsuccessfully, the link status will be set to FALSE, but any existing executables and associated state will remain part of the current rendering state until a subsequent call to UseProgram, UseProgramStages, or BindProgramPipeline removes them from use. If such a program is attached to any program pipeline object, the existing executables and associated state will remain part of the program pipeline object until a subsequent call to UseProgramStages removes them from use. An unsuccessfully linked program may not be made part of the current rendering state by UseProgram or added to program pipeline objects by UseProgramStages until it is successfully re-linked." "void UseProgram(uint program); ... An INVALID_OPERATION error is generated if program has not been linked, or was last linked unsuccessfully. The current rendering state is not modified." V2: apply the rule to both core and compat. Cc: Tapani Pälli <tapani.palli@intel.com> Cc: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-20 09:35:04 +11:00
Timothy Arceri	6a660a5f5d	glsl: allow multiple layout qualifiers for a single declaration From the ARB_shading_language_420pack spec: "More than one layout qualifier may appear in a single declaration. If the same layout-qualifier-name occurs in multiple layout qualifiers for the same declaration, the last one overrides the former ones." The parser was already failing correctly when the extension is not available but testing for duplicates within a single layout qualifier was still causing this to fail when available as both cases share the same function for merging. Here we add a parameter to differentiate between the two uses and apply it to the duplicate test. Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-01-20 08:06:50 +11:00
Timothy Arceri	564009986f	glsl: update parser to allow duplicate default layout qualifiers In order to only create a single node for each default declaration we add a new boolean parameter to the in/out merge function to only create one once we reach the rightmost layout qualifier. From the ARB_shading_language_420pack spec: "More than one layout qualifier may appear in a single declaration. If the same layout-qualifier-name occurs in multiple layout qualifiers for the same declaration, the last one overrides the former ones." Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-01-20 08:06:45 +11:00
Timothy Arceri	a0a93470e3	glsl: move default layout qualifier rules out of the parser Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-01-20 08:06:40 +11:00
Timothy Arceri	fd612e4547	glsl: split layout_defaults into specific types This will allow merging of duplicate layout qualifiers as allowed by ARB_shading_language_420pack Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-01-20 08:06:35 +11:00
Timothy Arceri	c8b8c578d1	glsl: allow duplicate layout-qualifier-names This is added by ARB_enhanced_layouts although it doesn't fit into any of the six main changes so we enable this independently. From the ARB_enhanced_layouts spec: "More than one layout qualifier may appear in a single declaration. Additionally, the same layout-qualifier-name can occur multiple times within a layout qualifier or across multiple layout qualifiers in the same declaration" Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-01-20 08:06:29 +11:00
Matt Turner	866a6bf9f7	i965/vec4: Spaces around operators.	2016-01-19 12:12:38 -08:00
Matt Turner	e734fb0326	i965: Inform compiler of variable range to silence warning. Extends commit `6531ccb70` to silence the warning in release builds as well. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-19 12:08:59 -08:00
Matt Turner	a439788c59	glsl: Restore Mesa-style to shader_enums.c/h.	2016-01-19 12:08:59 -08:00
Christian König	f3b067af86	st/va: fix motion adaptive deinterlacing Signed-off-by: Christian König <christian.koenig@amd.com>	2016-01-19 17:28:38 +01:00
Nicolai Hähnle	e6281a2850	util/u_pstipple.c: copy immediates during transformation Apparently, nobody has combined stippling with a fragment shader containing immediates in almost five years... Fixes a bug in Kodi with radeonsi reported by Christian König. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Tested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-19 10:52:35 -05:00
Marta Lofstedt	2bcacc69b9	mesa: Move sanity check of BindVertexBuffer for OpenGL ES 3.1 Sanity check of BindVertexBuffer for OpenGL ES in _mesa_handle_bind_buffer_gen breaks OpenGL ES 2 conformance. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93426 Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-01-19 13:08:42 +01:00
Timothy Arceri	d018619d7f	glsl: fix interface block error message Print the stream value not the pointer to the expression, also use the unsigned format specifier. Cc: 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-19 14:51:31 +11:00
Ilia Mirkin	a31819cff8	nv50/ir: swap the least-ref'd source into src1 when both const/imm The whole point of inlining sources is to reduce loads. We can end up in a situation where one value is used a lot of times, and one value is used only once per instruction. The once-per-instruction one is the one that should get inlined, but with the previous algorithm, it was given no preference. This flips things around to preferring putting less-referenced values into src1 which increases the likelihood of them being inlined. While we're at it, adjust the heuristic to not treat 0 as an immediate, as well as (effectively) check for situations where LIMMs can't be loaded. All this yields improvements on nvc0: total instructions in shared programs : 6261157 -> 6255985 (-0.08%) total gprs used in shared programs : 945082 -> 943417 (-0.18%) total local used in shared programs : 30372 -> 30288 (-0.28%) total bytes used in shared programs : 50089256 -> 50047880 (-0.08%) local gpr inst bytes helped 21 822 3332 3332 hurt 0 278 565 565 And more importantly avoids generating really bad code with SSBOs, where we end up checking a lot of different values (usually immediates) against the length. On nv50 we get comparable results, and even improve packing (bytes went down more than instructions): total instructions in shared programs : 6346564 -> 6341277 (-0.08%) total gprs used in shared programs : 728719 -> 725131 (-0.49%) total local used in shared programs : 3552 -> 3552 (0.00%) total bytes used in shared programs : 43995688 -> 43932928 (-0.14%) local gpr inst bytes helped 0 1380 3252 3774 hurt 0 287 1710 1365 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-18 17:52:07 -05:00
Ilia Mirkin	af686e7de3	st/mesa: restore the stObj's size if it was cleared out An issue could still occur if the base level is set, but fixing that would require a lot more logic. This fixes the recently-failing texelFetch 3D tests because the mipmaps were no longer being generated, which in turn caused the copying logic to be hit, which in turn didn't work because of the broken width/height/depth. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-18 17:52:07 -05:00
Rob Clark	805e080ba0	freedreno/a4xx: use smaller threadsize for more registers Once we go past half of the "GPR" register file, it seems like we need to run frag shader with smaller threadsize. (The vertex shader already runs at TWO_QUADS, which is the minimum.) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-18 16:58:25 -05:00
Rob Clark	6062941e4d	freedreno: per-generation OUT_IB packet Some a4xx firmware doesn't implement the "PFD" (prefetch-disabled) version of the CP_INDIRECT_BUFFER packet. So allow for PFD vs PFE per generation. Switch a3xx and a4xx over to using prefetch-enabled version (which is also what blob does.. it seems only on a2xx we cannot use PFE). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-18 16:58:25 -05:00
Emil Velikov	c03f3dd0a5	gallium: bundle the compat header u_pwr8.h in the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-01-18 13:37:58 +02:00
Emil Velikov	7bc714509b	mapi: include gl.xml in the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-01-18 13:37:58 +02:00
Emil Velikov	a78e08e88f	i965: adding missing headers to the dist tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-01-18 13:37:58 +02:00
Christian König	eaf7ec9cfc	st/va: add motion adaptive deinterlacing v2 v2: minor cleanup Signed-off-by: Christian König <christian.koenig@amd.com>	2016-01-18 10:59:32 +01:00
Michel Dänzer	ad20be1f30	gallium/radeon: Rename do_invalidate_resource to invalidate_buffer And only call it from r600_invalidate_resource for buffer resources. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-18 17:39:37 +09:00
Michel Dänzer	0491dd1deb	st/dri: Don't call invalidate_resource for NULL depth/stencil buffers Fixes crash in 4 EGL piglit tests with radeonsi. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-18 17:39:37 +09:00
Michel Dänzer	a9ab7172a6	radeonsi: Avoid warning about LLVM generating R_0286D0_SPI_PS_INPUT_ADDR Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2016-01-18 17:39:37 +09:00
Michel Dänzer	4297259fc8	radeonsi: Print "LLVM emitted unknown config register" warning only once Say "LLVM" instead of "Compiler" for clarity. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-18 17:39:37 +09:00
Oded Gabbay	679a654a77	llvmpipe: use vpkswss when dst is signed This patch fixes a bug when building a pack instruction. For POWER (altivec), in case the destination is signed and the src width is 32, we need to use vpkswss. The original code used vpkuwus, which emits an unsigned result. This fixes the following piglit tests on ppc64le: - spec@arb_color_buffer_float@gl_rgba8-drawpixels - shaders@glsl-fs-fogscale I've also corrected some coding style issues in the function. v2: Returned else statements to vmware style Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-18 09:45:25 +02:00
Dave Airlie	119bef9543	glsl: fix subroutine lowering reusing actual parameters One of the oglconform tests was crashing here, and it was due to not cloning the actual parameters before creating the new call. This makes a call clone function that does the right things to make sure we clone all the needed info, and points the callee at it. (It differs from ->clone due to this). This may fix https://bugs.freedesktop.org/show_bug.cgi?id=93722, I had this patch in my cts fixes tree, but hadn't had time to make sure I liked it. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-18 15:02:34 +10:00
Timothy Arceri	9258d9f23d	glsl: remove special case for detecting stream duplicates Any duplicates in a single declaration will already fail the generic duplicates test due to the explicit_stream flag being set. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-01-18 13:09:28 +11:00
Timothy Arceri	eac2cece31	glsl: add missing explicit_stream flag to has_layout() This will allow the ARB_shading_language_420pack rules in glsl_parser.yy for catching duplicate layout qualifiers to be triggered for the stream identifier rather than relying on the code meant to catch duplicates within a single layout(...) Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-01-18 13:09:16 +11:00
Timothy Arceri	86677f1016	mesa: fix segfault in glUniformSubroutinesuiv() From Section 7.9 (SUBROUTINE UNIFORM VARIABLES) of the OpenGL 4.5 Core spec: "The command void UniformSubroutinesuiv(enum shadertype, sizei count, const uint *indices); will load all active subroutine uniforms for shader stage shadertype with subroutine indices from indices, storing indices[i] into the uniform at location i. The indices for any locations between zero and the value of ACTIVE_SUBROUTINE_UNIFORM_LOCATIONS minus one which are not used will be ignored." V2: simplify NULL check suggested by Jason. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: "11.0 11.1" mesa-stable@lists.freedesktop.org https://bugs.freedesktop.org/show_bug.cgi?id=93731	2016-01-18 11:53:24 +11:00
Timothy Arceri	50376e0c0e	glsl: fix segfault linking subroutine uniform with explicit location Reviewed-by: Dave Airlie <airlied@redhat.com> Cc: "11.0 11.1" mesa-stable@lists.freedesktop.org	2016-01-18 11:30:45 +11:00
Ilia Mirkin	4ac1274caa	gm107/ir: don't do indirect frag shader inputs on GM107 Apparently the IPA op decided to stop working with offsets. Need to figure out if we need to do an AL2P situation or something similar. For now just turn it back off. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-17 16:37:04 -05:00
Ilia Mirkin	3281ae96c8	tgsi: initialize Atomic field in tgsi_default_declaration Spotted by Coverity. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-17 16:37:04 -05:00
Ilia Mirkin	5a81b48ad0	nvc0: bsp_bo can't be null We already deref it earlier. And these are all allocated on load. Spotted by Coverity. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-17 16:37:04 -05:00
Oded Gabbay	529aa8249a	llvmpipe: fix arguments order given to vec_andc This patch fixes a classic "confuse the enemy" bug. _mm_andnot_si128 (SSE) and vec_andc (VMX) do the same operation, but the arguments are opposite. _mm_andnot_si128 performs "r = (~a) & b" while vec_andc performs "r = a & (~b)" To make sure this error won't return in another place, I added a wrapper function, vec_andnot_si128, in u_pwr8.h, which makes the swap inside. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-17 21:07:27 +02:00
Rob Clark	02ac91d717	freedreno/ir3: fix mad 3rd src delay calc In `fad158a0` ("freedreno/ir3: array rework") the src # (n) shifted by one, but missed updating delay-slot calc. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-17 12:21:45 -05:00
Rob Clark	2a6ec1e061	freedreno/ir3: better array register allocation Detect arrays which don't conflict with each other and allow overlapping register allocation. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:23:52 -05:00
Rob Clark	6a33c5c0df	freedreno/ir3: array offset can be negative It at least happens with some piglit tests, like $piglit/bin/vp-address-01 VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL CONST[0..7] DCL ADDR[0] 0: ARL ADDR[0].x, IN[1].xxxx 1: MOV_SAT OUT[1], CONST[ADDR[0].x-1] 2: DP4 OUT[0].x, CONST[4], IN[0] 3: DP4 OUT[0].y, CONST[5], IN[0] 4: DP4 OUT[0].z, CONST[6], IN[0] 5: DP4 OUT[0].w, CONST[7], IN[0] 6: END Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:23:20 -05:00
Rob Clark	ddede497b8	freedreno/ir3: workaround bug/feature Seems like in certain cases, we cannot use c<a0.x+0> as the third src to cat3 instructions. This may be slightly conservative, we may only have this restriction when the first src is also const. This fixes, for example, +24/-0 of the variable-indexing piglit tests. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:22:43 -05:00
Rob Clark	ebd3a1fc17	ttn: use writemask for store_var Only user is freedreno, and after array-rework it can cope. Avoids generating loads for a store. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:21:52 -05:00
Rob Clark	fad158a0e0	freedreno/ir3: array rework Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:21:08 -05:00
Rob Clark	cc7ed34df9	freedreno/ir3: refactor/simplify cp If we handle separately the special case of eliminating output mov (which includes keeps and various other cases where we don't have a consuming instruction's src register to collapse things into), we can simplify the logic. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:20:46 -05:00
Rob Clark	680664dff9	freedreno/ir3: fix incorrect decoding of mov instructions Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:20:37 -05:00
Rob Clark	2809c87f90	freedreno/ir3: remove unused tgsi tokens ptr Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:18:59 -05:00
Rob Clark	fc0d2f7e02	freedreno/ir3: bit of ra refactor Shuffle things slightly, passing instr-data to ra_name() to reduce the number of places where we need to add support for array names. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:18:47 -05:00
Rob Clark	d430f443de	freedreno/ir3: cosmetic de-indent Collapse two nested if's into one to reduce indent level. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-16 14:18:33 -05:00
Rob Clark	6f0377d651	ttn: add missing writemask on store_output Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-16 13:35:44 -05:00
Rob Clark	683794fd60	nir/print: const_index is signed Noticed this with $piglit/bin/vp-address-01 Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-01-16 13:35:44 -05:00
Rob Clark	211b0644e6	nir: few missing struct names nir.h is a bit inconsistent about 'typedef struct {} nir_foo' vs 'typedef struct nir_foo {} nir_foo'. But missing struct name tags is inconvenient when you need a fwd declaration without pulling in all of nir. So add missing struct name tag for nir_variable, and a couple other spots where it would likely be useful. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-16 13:35:43 -05:00
Ilia Mirkin	32a9fe013b	nv50/ir: add saturate support on ex2 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-16 00:10:56 -05:00
Jeff Muizelaar	e5fefe49f2	gallivm: avoid crashing in mod by 0 with llvmpipe This adds code that is basically the same as the code in umod, udiv and idiv. However, unlike idiv we return -1. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-16 03:36:29 +01:00
Kenneth Graunke	d54a70aa18	glsl: Allow implicit int -> uint conversions for bitwise operators (&, ^, |). The ARB has decided that implicit conversions should be performed for bitwise operators in future language revisions. Implementations of current language revisions may or may not perform them. This patch makes Mesa apply implicit conversions even on current language versions. Applications appear to expect this behavior, and there's really no downside to doing so. Fixes shader compilation in Shadow of Mordor. Bugzilla: https://www.khronos.org/bugzilla/show_bug.cgi?id=1405 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: mesa-stable@lists.freedesktop.org	2016-01-15 17:53:44 -08:00
Jason Ekstrand	61b0cfd84e	i965/fs: Always set channel 2 of texture headers in some stages In the vertex and fragment stages, the hardware is nice to us and leaves g0.2 zeroed out for us so we can use it for headers. However, in compute, geometry, and tessellation stages, the hardware is not so nice. In particular, for compute shaders on BDW, the hardware places some debug bits in 23:15. As it happens, bit 15 is interpreted by the sampler as the alpha channel mask. This means that if you use a texturing instruction with a header in a compute shader, you may randomly get the alpha channel disabled. Since channel masks affect the return length of the sampler message, this can lead the GPU to expect a different mlen to the one you specified in the shader and this, in turn, hangs your GPU. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-15 16:44:02 -08:00
Jason Ekstrand	9870f798be	i965/fs/generator: Take an actual shader stage rather than a string Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-15 16:44:02 -08:00
Jason Ekstrand	0a6811207f	i965/vec4: Use UW type for multiply into accumulator on GEN8+ BDW adds the following restriction: "When multiplying DW x DW, the dst cannot be accumulator." Cc: "11.1,11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-15 16:44:02 -08:00
Roland Scheidegger	03f66dfb4b	llvmpipe: ditch additional ref counting for vertex/geometry sampler views The cleaning up was quite a performance hog (making pipe_resource_reference the number two in profilers on the vertex path, and 3rd overall, with its cousin pipe_reference_described not far behind) if there were lots of tiny draw calls (ipers). Now the reason was really that it was blindly calling this for all potential shader views (so 32 each for vs and gs) even though the app never touched a single one which could have been fixed, however I can't come up with a good reason why we refcount these. We've got references, of course, in the sampler views, which should be quite sufficient as we do all vertex and geometry shader execution fully synchronous. (Calling prepare_shader_sampling for all draw calls even if there were no changes looks quite suboptimal too, but generally we don't really expect vs/gs shader sampling to be used much with llvmpipe, and there's even an early exit if there aren't any views to avoid the "null loop" albeit it's now no longer always trying to loop through all 32 slots. Maybe improve another time...). Of course, if we manage to make vertex loads run asynchronously some day, we need references again, but adding that back would be the least of the problems... Also only set LP_NEW_SAMPLER_VIEW for fragment sampler views. Nothing on the vertex side depends on it (I suppose we'd really wanted a separate flag in any case). (Good for a 3% improvement or so in ipers under the right conditions.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-15 20:13:45 +01:00
Roland Scheidegger	2f9a325b6a	llvmpipe: fix "leaking" textures This was not really a leak per se, but we were referencing the textures for longer than intended. If textures were set via llvmpipe_set_sampler_views() (for fs) and then picked up by lp_setup_set_fragment_sampler_views(), they were referenced in the setup state. However, the only way to unreference them was by replacing them with another texture, and not when the texture slot was replaced with a NULL sampler view. (They were then further also referenced by the scene too which might have additional minor side effects as we limit the memory size which is allowed to be referenced by a scene in a rather crude way.) Only setup destruction (at context destruction time) then finally would get rid of the references. Fix this by noting the number of textures the last time, and unreference things if the new view is NULL (avoiding having to unreference things always up to PIPE_MAX_SHADER_SAMPLER_VIEWS which would also have worked). Found by code inspection, no test... v2: rename var Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-15 20:13:45 +01:00
Samuel Iglesias Gonsálvez	781d2787bc	glsl: restrict consumer stage condition to modify interpolation type Only modify interpolation type for integer-based varyings or when the consumer is known and different than fragment shader. If we are linking separate shader programs and the consumer is unknown, the consumer could be added later and be a fragment shader. If we modify the interpolation type in this case, we could read wrong values in the fragment shader inputs, as shown in bug 93320. Fixes the following CTS test: ES31-CTS.vertex_attrib_binding.advanced-bindingUpdate Fixes the following dEQP tests: dEQP-GLES31.functional.separate_shader.random.102 dEQP-GLES31.functional.separate_shader.random.111 dEQP-GLES31.functional.separate_shader.random.115 dEQP-GLES31.functional.separate_shader.random.17 dEQP-GLES31.functional.separate_shader.random.22 dEQP-GLES31.functional.separate_shader.random.23 dEQP-GLES31.functional.separate_shader.random.3 dEQP-GLES31.functional.separate_shader.random.32 dEQP-GLES31.functional.separate_shader.random.39 dEQP-GLES31.functional.separate_shader.random.64 dEQP-GLES31.functional.separate_shader.random.73 dEQP-GLES31.functional.separate_shader.random.91 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93320 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-01-15 07:06:41 +01:00
Kenneth Graunke	3657cbf24f	i965: Apply add_const_offset_to_base for vec4 VS inputs too. This shouldn't hurt anything, and I'm about to introduce a pass that will want it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-14 21:32:59 -08:00
Kenneth Graunke	a3500f943e	i965: Make add_const_offset_to_base() work at the shader level. This makes it a pass, hiding the parameter structs and block callbacks so it's simpler to work with. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-14 21:32:59 -08:00
Kenneth Graunke	824d82025d	i965: Make an is_scalar boolean in brw_compile_vs(). Shorter than compiler->scalar_stage[MESA_SHADER_VERTEX], which can help with line-wrapping. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-14 21:32:59 -08:00
Kenneth Graunke	bb6612f06b	nir/builder: Add a nir_build_ivec4() convenience helper. nir_build_ivec4 is more readable and succinct than using nir_build_imm directly, even if you have C99. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-14 21:32:59 -08:00
Tapani Pälli	cf96bce0ca	glsl: mark explicit uniforms as explicit in other stages too If shader declares uniform explicit location in one stage but implicit in another, explicit location should be used. Patch marks implicit uniforms as explicit if they were explicit in previous stage. This makes sure that we don't treat them implicit later when assigning locations. Fixes following CTS test: ES31-CTS.explicit_uniform_location.uniform-loc-implicit-in-some-stages3 v2: move check to cross_validate_globals (Timothy) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-15 07:12:42 +02:00
Francisco Jerez	0556b87de4	i965/gen7.5+: Disable resource streamer during GPGPU workloads. The RS and hardware binding tables are only supported on the 3D pipeline and can lead to corruption if left enabled during a GPGPU workload. Disable it when switching to the GPGPU (or media) pipeline and re-enable it when switching back to the 3D pipeline. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2016-01-14 19:26:24 -08:00
Francisco Jerez	c8df0e7bf3	i965/gen7: Emit stall and dummy primitive draw after switching to the 3D pipeline. This hardware bug can supposedly lead to a hang on IVB and VLV. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-14 19:26:23 -08:00
Francisco Jerez	635be1402c	i965/gen4-5: Emit MI_FLUSH as required prior to switching pipelines. AFAIK brw_emit_select_pipeline() is only called once during context init on Gen4-5, at which point the pipeline is likely to be already idle so it may just happen to work by luck regardless of the MI_FLUSH. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-14 19:26:23 -08:00
Francisco Jerez	18c76551ee	i965/gen6-7: Implement stall and flushes required prior to switching pipelines. Switching the current pipeline while it's not completely idle or the read and write caches aren't flushed can lead to corruption. Fixes misrendering of at least the following Khronos CTS test: ES31-CTS.shader_image_load_store.basic-allTargets-store-fs The stall and flushes are no longer required on Gen8+. v2: Emit PIPE_CONTROL with non-zero post-sync op before the write cache flush on SNB due to hardware bug. (Ken) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93323 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-14 19:26:23 -08:00
Francisco Jerez	044acb9256	i965/gen8+: Invalidate color calc state when switching to the GPGPU pipeline. This hardware bug can cause a hang on context restore while the current pipeline is set to GPGPU (BDWGFX HSD 1909593). In addition to clearing the valid bit, mark the CC state as dirty to make sure that the CC indirect state pointer is re-emitted when we switch back to the 3D pipeline. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-14 19:26:23 -08:00
Francisco Jerez	22ac1f6922	i965: Add state bit to trigger re-emission of color calculator state. This will be used on Gen8+ to make sure that the color calculator state pointers are re-emitted when switching back to the 3D pipeline after some GPGPU workload due to a hardware workaround. There are other state bits already defined that could be used to achieve the same effect but they all cause a ton of unrelated state to be re-emitted (e.g. BRW_NEW_STATE_BASE_ADDRESS), so just define a new one, state bits are cheap. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-14 19:26:23 -08:00
Ilia Mirkin	fffb559129	nv50/ir: rebase indirect temp arrays to 0, so that we use less lmem space Reduces local memory usage in a lot of Metro 2033 Redux and a few KSP shaders: total local used in shared programs : 54116 -> 30372 (-43.88%) Probably modest advantage to execution, but it's an important prerequisite to dropping some of the TGSI optimizations done by the state tracker. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-14 20:14:01 -05:00
Ilia Mirkin	e231f59b6d	nv50/ir: only use FILE_LOCAL_MEMORY for temp arrays that use indirection Previously we were treating any indirect temp array usage to mean that everything should end up in lmem. The MemoryOpt pass would clean a lot of that up later, but in the meanwhile we would lose a lot of opportunity for optimization. This helps a lot of Metro 2033 Redux and a handful of KSP shaders: total instructions in shared programs : 6288373 -> 6261517 (-0.43%) total gprs used in shared programs : 944051 -> 945131 (0.11%) total local used in shared programs : 54116 -> 54116 (0.00%) A typical case is for register usage to double and for instructions to halve. A future commit can also optimize local memory usage size to be reduced with better packing. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-14 20:13:59 -05:00
Ilia Mirkin	37b67db6ae	nvc0/ir: be careful about propagating very large offsets into const load Indirect constbuf indexing works by using very large offsets. However if an indirect constbuf index load is const-propagated, it becomes a very large const offset. Take that into account when legalizing the SSA by moving the high parts of that offset into the file index. Also disallow very large (or small) indices on most other instructions. This fixes regressions in ubo_array_indexing/*-two-arrays piglit tests. Fixes: `abd326e81b` (nv50/ir: propagate indirect loads into instructions) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-14 18:20:27 -05:00
Ilia Mirkin	7a521ddf36	nvc0: allow fragment shader inputs to use indirect indexing Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-14 14:28:04 -05:00
Ilia Mirkin	e94ef885bb	st/mesa: use surface format to generate mipmaps when available This fixes the recently posted mipmap + texture views piglit test. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-14 14:28:04 -05:00
Marek Olšák	dc96a18d24	radeonsi: don't miss changes to SPI_TMPRING_SIZE I'm not sure about the consequences of this bug, but it's definitely dangerous. This applies to SI, CIK, VI. Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-14 19:55:41 +01:00
Charmaine Lee	6303231a1d	svga: add DXGenMips command support For those formats that support hw mipmap generation, use the DXGenMips command. Otherwise fall back to the mipmap generation utility. Tested with piglit, OpenGL apps (Heaven, Turbine, Cinebench) v2: make sure the texture surface was created with the render target bind flag; set relocation flag to SVGA_RELOC_WRITE for the texture surface Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-14 10:44:25 -07:00
Charmaine Lee	78e628ae43	svga: add num-generate-mipmap HUD query The actual increment of the num-generate-mipmap counter will be done in a subsequent patch when hw generate mipmap is supported. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-14 10:39:53 -07:00
Charmaine Lee	3038e8984d	gallium/st: add pipe_context::generate_mipmap() This patch adds a new interface to support hardware mipmap generation. PIPE_CAP_GENERATE_MIPMAP is added to allow a driver to specify if this new interface is supported; if not supported, the state tracker will fall back to mipmap generation by rendering/texturing. v2: add PIPE_CAP_GENERATE_MIPMAP to the disabled section for all drivers v3: add format to the generate_mipmap interface to allow mipmap generation using a format other than the resource format v4: fix return type of trace_context_generate_mipmap() Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-14 10:39:53 -07:00
Brian Paul	b1e11f4d71	st/mesa: declare struct pipe_screen in st_cb_bufferobjects.h To silence a compiler warning. Trivial.	2016-01-14 10:38:18 -07:00
Matt Turner	b82e26a6a4	nir: Lower bitfield_extract. The OpenGL specifications for bitfieldExtract() says: The result will be undefined if <offset> or <bits> is negative, or if the sum of <offset> and <bits> is greater than the number of bits used to store the operand. Therefore passing bits=32, offset=0 is legal and defined in GLSL. But the earlier SM5 ubfe/ibfe opcodes are specified to accept a bitfield width ranging from 0-31. As such, Intel and AMD instructions read only the low 5 bits of the width operand, making them not able to implement the GLSL-specified behavior directly. This commit adds ubfe/ibfe operations from SM5 and a lowering pass for bitfield_extract to handle the trivial case of <bits> = 32 as bitfieldExtract: bits > 31 ? value : bfe(value, offset, bits) Fixes: ES31-CTS.shader_bitfield_operation.bitfieldExtract.uvec3_0 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595 Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Tested-by: Marta Lofstedt <marta.lofstedt@intel.com>	2016-01-14 09:28:01 -08:00
Matt Turner	15640ee77a	nir: Handle <bits>=32 case in bitfield_insert lowering. The OpenGL specifications for bitfieldInsert() says: The result will be undefined if <offset> or <bits> is negative, or if the sum of <offset> and <bits> is greater than the number of bits used to store the operand. Therefore passing bits=32, offset=0 is legal and defined in GLSL. But the earlier SM5 bfi opcode is specified to accept a bitfield width ranging from 0-31. As such, Intel and AMD instructions read only the low 5 bits of the width operand, making them not able to implement the GLSL-specified behavior directly. This commit fixes the lowering of bitfield_insert to handle the trivial case of <bits> = 32 as bitfieldInsert: bits > 31 ? insert : bfi(bfm(bits, offset), insert, base) Fixes: ES31-CTS.shader_bitfield_operation.bitfieldInsert.uint_2 ES31-CTS.shader_bitfield_operation.bitfieldInsert.uvec4_3 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595 Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Tested-by: Marta Lofstedt <marta.lofstedt@intel.com>	2016-01-14 09:27:52 -08:00
Brian Paul	6470435190	st/mesa: add check for color logicop in blit_copy_pixels() We check that a bunch of raster operations are disabled in blit_copy_pixels(). We also need to check that color logicop is disabled. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-14 09:08:21 -07:00
Nicolai Hähnle	e976860638	gallium/radeon: do not reallocate user memory buffers The whole point of AMD_pinned_memory is that applications don't have to map buffers via OpenGL - but they're still allowed to, so make sure we don't break the link between buffer object and user memory unless explicitly instructed to. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-14 09:41:24 -05:00
Nicolai Hähnle	321140d563	gallium/radeon: implement PIPE_CAP_INVALIDATE_BUFFER Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-14 09:41:04 -05:00
Nicolai Hähnle	08c71740ad	gallium/radeon: reset valid_buffer_range on PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE This accommodates a streaming pattern where the discard flag is set when the application wraps back to the beginning of the buffer. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-14 09:40:00 -05:00
Nicolai Hähnle	70e66c57bb	st/mesa: implement Driver.InvalidateBufferSubData Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-14 09:39:57 -05:00
Nicolai Hähnle	9e2240e892	st/mesa: use pipe->invalidate_resource instead of buffer re-allocation Drivers are expected to avoid unnecessary work when possible in this code path. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-14 09:39:53 -05:00
Nicolai Hähnle	654670b404	gallium: add PIPE_CAP_INVALIDATE_BUFFER It makes sense to re-use pipe->invalidate_resource for the purpose of glInvalidateBufferData, but this function is already implemented in vc4 where it doesn't have the expected behavior. So add a capability flag to indicate that the driver supports the expected behavior. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-14 09:39:38 -05:00
Nicolai Hähnle	6f4ae81005	mesa: add Driver.InvalidateBufferSubData Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-14 09:39:30 -05:00
Nicolai Hähnle	53c77494aa	mesa: fix the checks in _mesa_InvalidateBuffer(Sub)Data Change the check to be in line with what the quoted spec fragment says. I have sent out a piglit test for this as well. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-14 09:39:22 -05:00
Nicolai Hähnle	cbcdef7b40	winsys/radeon: fix warnings about incompatible pointer types Some confusion between pb_buffer and radeon_bo as well as between radeon_drm_winsys and radeon_winsys. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-14 09:33:58 -05:00
Neil Roberts	06b526de05	texobj: Check completeness with InternalFormat rather than Mesa format The internal Mesa format used for a texture might not match the one requested in the internalFormat when the texture was created, for example if the driver is internally remapping RGB textures to RGBA. Otherwise it can cause false positives for completeness if one mipmap image is created as RGBA and the other as RGB because they would both have an RGBA Mesa format. If we check the InternalFormat instead then we are directly checking the API usage which I think better matches the intention of the check. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93700 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-01-14 12:18:24 +00:00
Ben Widawsky	f4ab7340ca	i965: Remove unused hw_must_use_separate_stencil I spotted this while looking for what needs updating in future platforms. I'm too lazy to go through the git logs, but it was probably missed by Jason when all the brw refactoring happened. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-13 16:41:04 -08:00
Matt Turner	138a7dc826	i965: Drop extra newline from shader compile messages. Ilia changed shader-db's run.c to not expect messages to contain a newline in shader-db commit 51bbc8035.	2016-01-13 16:19:18 -08:00
Matt Turner	74cff779eb	nir: Change bfm's semantics to match Intel/AMD/SM5. Intel/AMD's hardware instructions do not handle arguments of 32. Constant evaluation should not produce a result different from the hardware instruction. The s/1ull/1u/ change is intentional: previously we wanted defined behavior for the "1 << 32" case, but we're making this case undefined so we can make it 1u and save ourselves a 64-bit operation. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-13 11:22:40 -08:00
Matt Turner	a5fcff6628	glsl: Fix undefined shifts. Shifting into the sign bit is undefined, as is shifting by 32. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-13 11:22:11 -08:00
Matt Turner	966a0dd720	glsl: Handle failure of Python codegen scripts. If a Python codegen script failed, it would write a zero-byte file, which on subsequent invocations of make would trick it into thinking the file was appropriately generated. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-13 10:35:12 -08:00
Kenneth Graunke	84d6130c21	glsl, nir: Make ir_triop_bitfield_extract a vectorized operation. We would like to be able to combine result.x = bitfieldExtract(src0.x, src1.x, src2.x); result.y = bitfieldExtract(src0.y, src1.y, src2.y); result.z = bitfieldExtract(src0.z, src1.z, src2.z); result.w = bitfieldExtract(src0.w, src1.w, src2.w); into a single ivec4 bitfieldExtract operation. This should be possible with most drivers. This patch changes the offset and bits parameters from scalar ints to ivecN or uvecN. The type of all three operands will be the same, for simplicity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-13 10:35:12 -08:00
Kenneth Graunke	b4e198f47f	glsl, nir: Make ir_quadop_bitfield_insert a vectorized operation. We would like to be able to combine result.x = bitfieldInsert(src0.x, src1.x, src2.x, src3.x); result.y = bitfieldInsert(src0.y, src1.y, src2.y, src3.y); result.z = bitfieldInsert(src0.z, src1.z, src2.z, src3.z); result.w = bitfieldInsert(src0.w, src1.w, src2.w, src3.w); into a single ivec4 bitfieldInsert operation. This should be possible with most drivers. This patch changes the offset and bits parameters from scalar ints to ivecN or uvecN. The type of all four operands will be the same, for simplicity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-13 10:35:12 -08:00
Kenneth Graunke	b85a229e1f	glsl: Delete the ir_binop_bfm and ir_triop_bfi opcodes. TGSI doesn't use these - it just translates ir_quadop_bitfield_insert directly. NIR can handle ir_quadop_bitfield_insert as well. These opcodes were only used for i965, and with Jason's recent patches, we can do this lowering in NIR (which also gains us SPIR-V handling). So there's not much point to retaining this GLSL IR lowering code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-01-13 10:35:12 -08:00
Matt Turner	92f1773869	nir: Fix constant evaluation of bfm. NIR's bfm, like Intel/AMD's hardware instructions and GLSL IR's ir_binop_bfm takes <bits> as src0 and <offset> as src1. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-01-13 10:35:12 -08:00
Matt Turner	7dc2e5f940	i965/fs: Skip assertion on NaN. A shader in Unreal4 uses the result of divide by zero in its color output, producing NaN and triggering this assertion since NaN is not equal to itself. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93560 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-01-13 10:32:53 -08:00
Matt Turner	64800933b8	i965/fs: Add debugging to constant combining pass. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-01-13 10:32:53 -08:00
Brian Paul	9638c03a4e	meta: remove const qualifier on _mesa_meta_fb_tex_blit_begin() To silence a compiler warning about a const/non-const mismatch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-13 08:02:25 -07:00
Brian Paul	235a299534	st/mesa: fix incorrect buffer token passed to _mesa_BindFramebuffer() I added this code right at the end, and got it wrong. Only used by the WGL_ARB_render_texture code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-13 08:01:56 -07:00
Emil Velikov	2065ffb4cf	docs: add news item and link release notes for 11.1.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-01-13 15:27:50 +02:00
Emil Velikov	183b5ff109	docs: add sha256 checksums for 11.1.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `4b2d9f29e9`)	2016-01-13 15:25:32 +02:00
Emil Velikov	8f16739528	docs: add release notes for 11.1.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `330aa44a0d`)	2016-01-13 15:25:31 +02:00
Neil Roberts	cda886a485	i965/gen9: Don't allow the RGBX formats for texturing/rendering The RGBX surface formats aren't renderable so we internally remap them to RGBA when rendering. They are retained as RGBX when used as textures. However since the previous patch fast clears are disabled for surfaces that use a different format for rendering than for texturing. To avoid this situation we can just pretend not to support RGBX formats at all. This will cause the upper layers of mesa to pick an RGBA format internally instead. This should be safe because we always override the alpha component to 1.0 for RGBX in the texture swizzle anyway. We could also do this for all gens except that it's a bit more difficult when the hardware doesn't support texture swizzling. Gens using the blorp have further problems because that doesn't implement this swizzle override. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-01-13 12:16:31 +00:00
Marek Olšák	4ea0febcb0	radeonsi: move POSITION and FACE fragment shader inputs to system values And FACE becomes integer instead of float. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-13 12:27:28 +01:00
Marek Olšák	caf3c2abea	radeonsi: simplify gl_FragCoord behavior It will become a system value, not an input. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-13 12:27:28 +01:00
Samuel Iglesias Gonsálvez	69c4c75264	glsl: add image_format check in cross_validate_globals() Fixes CTS test: ES31-CTS.shader_image_load_store.negative-linkErrors Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93410 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-01-13 07:01:55 +01:00
Tapani Pälli	e937fd779f	mesa: do not validate io of non-compute and compute stage Fixes regression on SSO tests that have both non-compute and compute programs in a program pipeline. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93532 Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2016-01-13 07:31:57 +02:00
Tapani Pälli	6b0706b2aa	glsl: add packed varyings for outputs with single stage program Commit `8926dc8` added a check where we add packed varyings of output stage only when we have multiple stages, however duplicates are already handled by changes in commit `0508d950` and we want to add outputs also in case where we have only one stage. Fixes regression caused by `8926dc8` for following test: ES31-CTS.program_interface_query.separate-programs-vertex Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-13 07:30:46 +02:00
Roland Scheidegger	38cdcb000d	llvmpipe: (trivial) use cast wrapper for __m128d to __m128 casts some compiler was unhappy.	2016-01-13 04:48:41 +01:00
Roland Scheidegger	49ec647c3b	llvmpipe: avoid most 64 bit math in rasterization The trick here is to recognize that in the c + n * dcdx calculations, not only can the lower FIXED_ORDER bits not change (as the dcdx values have those all zero) but that this means the sign bit of the calculations cannot be different as well, that is sign(c + ndcdx) == sign((c >> FIXED_ORDER) + n(dcdx >> FIXED_ORDER)). That shaves off more than enough bits to never require 64bit masks. A shifted plane c value could still easily exceed 32 bits, however since we throw out planes which are trivial accept even before binning (and similarly don't even get to see tris for which there was a trivial reject plane) this is never a problem. The idea isn't all that revolutionary, in fact something similar was tried ages ago (`9773722c2b`) back when the values were only 32 bit anyway. I believe now it didn't quite work then because the adjustment needed for testing trivial reject / partial masks wasn't handled correctly. This still keeps the separate 32/64 bit paths for now, as the 32 bit one still looks minimally simpler (and also because if we'd pass in dcdx/dcdy/eo unscaled from setup which would be a good reason to ditch the 32 bit path, we'd need to change the special-purpose rasterization functions for small tris). This passes piglit triangle-rasterization (-fbo -auto -max_size -subpixelbits 8) and triangle-rasterization-overdraw (with some hacks to make it work correctly with large sizes) easily (full piglit as well of course, but most tests wouldn't use triangles large enough to be affected, that is tris with a bounding box over 128x128). The profiler says indeed time spent in rast_tri functions is reduced substantially, BUT of course only if the tris are large. I measured a 3% improvement in mesa gloss demo when supersized to twice the screen size... Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-13 03:50:57 +01:00
Roland Scheidegger	16530fdc82	llvmpipe: scale up bounding box planes to subpixel precision Otherwise some planes we get in rasterization have subpixel precision, others not. Doesn't matter so far, but will soon. (OpenGL actually supports viewports with subpixel accuracy, so could even do bounding box calcs with that). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-13 03:34:59 +01:00
Roland Scheidegger	0298f5aca7	llvmpipe: add sse code for fixed position calculation This is quite a few less instructions, albeit still do the 2 64bit muls with scalar c code (they'd need way more shuffles, plus fixup for the signed mul so it totally doesn't seem worth it - x86 can do 32x32->64bit signed scalar muls natively just fine after all (even on 32bit)). (This still doesn't have a very measurable performance impact in reality, although profiler seems to say time spent in setup indeed has gone down by 10% or so overall. Maybe good for a 3% or so improvement in openarena.) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-13 03:34:09 +01:00
Roland Scheidegger	9422999e40	draw: fix key comparison with uninitialized value Discovered by accident, valgrind was complaining (could have possibly caused us to create redundant geometry shader variants). v2: convinced by Brian and Jose, just use memset for both gs and vs keys, just as easy and less error prone.	2016-01-13 02:43:04 +01:00
Timothy Arceri	6143e2d651	mesa: print the invalid enum when CreateShader fails Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2016-01-13 09:46:56 +11:00
Kenneth Graunke	c034dbeda8	glsl: Make read_from_write_only_variable_visitor ignore .length(). .length() on an unsized SSBO variable doesn't actually read any data from the SSBO, and is allowed on variables marked 'writeonly'. Fixes compute shader compilation in Shadow of Mordor. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-12 12:20:02 -08:00
Kenneth Graunke	9095847c25	i965: Mark TCS URB writes as having side effects. This adds barrier dependencies around TCS_OPCODE_URB_WRITE, preventing reads and writes from being incorrectly scheduled. Fixes rendering in GFXBench 4.0's tessellation demo. For some reason, we haven't ever listed URB writes as having side-effects. This hasn't been a problem because in most stages, we never read from the URB, and only write to each location once. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93526 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2016-01-12 12:19:47 -08:00
Tom St Denis	56fc2986d5	st/omx: Avoid segfault in deconstructor if constructor fails If the constructor fails before the LIST_INIT calls the pointers will be null and the deconstructor will segfault. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2016-01-12 19:13:19 +01:00
Christian König	6f898f740c	vl: use preferred format for deinterlacing Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:42 +01:00
Christian König	5fdd4a5aef	vl: improve motion adaptive deinterlacer Handle other formats than YV12 as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:39 +01:00
Christian König	e945235aed	st/va: add BOB deinterlacing v2 Tested with MPV. v2: correctly handle compositor deinterlacing as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:35 +01:00
Christian König	3949cf0e02	st/va: add NV12 -> NV12 post processing v2 Useful for mpv and GStreamer. v2: use common functionality for size adjustment. Signed-off-by: Indrajit-kumar Das <Indrajit-kumar.Das@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:28 +01:00
Christian König	9f644295dc	st/va: use vl_video_buffer_adjust_size Use the new helper function instead of open coding it. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:24 +01:00
Christian König	da39637764	st/vdpau: use vl_video_buffer_adjust_size Use the new helper function instead of open coding it. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:21 +01:00
Christian König	52ca9a9b8b	vl/buffers: extract vl_video_buffer_adjust_size helper Useful for the state trackers as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2016-01-12 13:28:16 +01:00
Christian König	8479782361	st/va: make the implementation thread safe v2 Otherwise we might crash with MPV. v2: minor cleanups suggested on the list. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2016-01-12 13:26:24 +01:00
Tapani Pälli	8926dc87af	mesa: use gl_shader_variable in program resource list Patch changes linker to allocate gl_shader_variable instead of using ir_variable. This makes it possible to get rid of ir_variables and ir in memory after linking. v2: check that we do not create duplicate entries with packed varyings v3: document 'patch' bit (Ilia Mirkin) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-12 09:07:10 +02:00
Tapani Pälli	4985159ad6	glsl: track total amount of uniform locations used Linker missed a check for situation where we exceed max amount of uniform locations with explicit + implicit locations. Patch adds this check to already existing iteration over uniforms in linker. Fixes following CTS test: ES31-CTS.explicit_uniform_location.uniform-loc-negative-link-max-num-of-locations v2: use var->type->uniform_locations() (Timothy) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-12 07:52:44 +02:00
Erik Faye-Lund	395b53dad6	main: get rid of needless conditional We already check if the driver changed the completeness, we don't need to duplicate that check. Let's just early out there instead. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-12 11:02:16 +11:00
Erik Faye-Lund	2a15dc0dd5	gallium/util: removed unused header-file This hasn't been in use since `c476305` ("gallium/util: pregenerate half float tables"), where the last bit of run-time init using this was killed. So let's just get rid of the pointless header. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-12 11:02:08 +11:00
Samuel Pitoiset	e67f5cac79	nvc0: do not force re-binding of compute constbufs on Fermi Re-binding compute constant buffers after launching a grid has no effect because they are not currently validated and because dirty_cp is not updated accordingly. This might also prevent weird future behaviours when UBOs will be bound for compute. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-12 00:47:20 +01:00
Ian Romanick	5be700e5cc	meta: Unconditionally set GL_SKIP_DECODE_EXT The path that depends on this will be avoided (by fallback_required) if the extension is not supported. _mesa_set_sampler_srgb_decode does not generate GL errors (by design), so there are no problems there. I kept this change separate and last because it is one of the few in the series that is not a candidate for the stable branch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:04 -08:00
Ian Romanick	1799eddb51	meta: Only bind the sampler in one place All of the calls after the first _mesa_bind_sampler call are DSA style calls that don't depend on the current binding. I kept this change separate and last because it is one of the few in the series that is not a candidate for the stable branch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:04 -08:00
Ian Romanick	ae50157363	meta/decompress: Don't pollute the sampler object namespace tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:04 -08:00
Ian Romanick	b03ee127d8	meta/decompress: Save and restore the sampler using gl_sampler_object instead of GL API object handle Some meta operations can be called recursively. Future changes (the "Don't pollute the ... namespace" changes) will cause objects with invalid names to be used. If a nested meta operation tries to restore an object named 0xDEADBEEF, it will fail. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:04 -08:00
Ian Romanick	d4094f64c1	meta/decompress: Track sampler using gl_sampler_object instead of GL API object handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	1998af813a	meta/decompress: Use internal functions for sampler object access Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	b85c5fe526	meta/generate_mipmap: Don't pollute the sampler object namespace tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	d6782712a1	meta/generate_mipmap: Save and restore the sampler using gl_sampler_object instead of GL API object handle Some meta operations can be called recursively. Future changes (the "Don't pollute the ... namespace" changes) will cause objects with invalid names to be used. If a nested meta operation tries to restore an object named 0xDEADBEEF, it will fail. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	36f413209f	meta/generate_mipmap: Track sampler using gl_sampler_object instead of GL API object handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	b94e7f398d	meta/generate_mipmap: Use internal functions for sampler object access Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	963065b76c	meta/blit: Don't pollute the sampler object namespace in _mesa_meta_setup_sampler tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	533320e4d1	meta/blit: Save and restore the sampler using gl_sampler_object instead of GL API object handle Some meta operations can be called recursively. Future changes (the "Don't pollute the ... namespace" changes) will cause objects with invalid names to be used. If a nested meta operation tries to restore an object named 0xDEADBEEF, it will fail. v2: Add a comment explaining why samp_obj_save is set to NULL in _mesa_meta_fb_tex_blit_begin. This came out of review feedback from Jason. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	d796b491cc	meta/blit: Use internal functions for sampler object access This requires tracking the sampler object using the gl_sampler_object* instead of the object name. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	ad5b1b41ae	meta/blit: Group the SamplerParameteri calls with the other sampler operations Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	adb4b31bc3	mesa: Refactor _mesa_BindSampler to make _mesa_bind_sampler Pulls the parts of _mesa_BindSampler that aren't just parameter validation out into a function that can be called from other parts of Mesa (e.g., meta). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	4cf5c85ec7	mesa: Add _mesa_set_sampler_srgb_decode method Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	ecba76d3c0	mesa: Add _mesa_set_sampler_filters method v2: Add filter enum assertions. Suggested by Jason. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Ian Romanick	08822b4b43	mesa: Add _mesa_set_sampler_wrap method Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2016-01-11 15:38:03 -08:00
Samuel Pitoiset	3029d60de7	nvc0: remove useless goto in nvc0_launch_grid() Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-12 00:19:34 +01:00
Ian Romanick	5318bd351e	mesa: Mark Identity as const I was going to send this as review for `dce1e1a8`, but I missed that window. This saves 64 bytes of unshared data and replaces it with 96 bytes of shared text. My guess is that some of the calls to memcpy get optimized to something else. text data bss dec hex filename 7847613 220208 27432 8095253 7b8615 i965_dri.so before 7847709 220144 27432 8095285 7b8635 i965_dri.so after Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Brian Paul <brianp@vmware.com>	2016-01-11 14:34:38 -08:00
Oded Gabbay	647d8e95d1	configure.ac: always define __STDC_CONSTANT_MACROS The ISO C99 standard (7.18.4) specifies that C++ implementations should define UINT64_C only when __STDC_CONSTANT_MACROS is defined. Because we now use UINT64_C in our cpp files (since commit `208bfc493d`), we need to add this define. This also solves compilation errors with GCC 4.8.x on ppc64le machines. v2: add this define to SCons build system Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-11 23:28:23 +02:00
Kenneth Graunke	aa6aa39a8f	i965: Upload 3DSTATE_BINDING_TABLE_POINTERS_HS when !TCS on Gen9+. Gen9+ requires us to emit 3DSTATE_BINDING_TABLE_POINTERS_HS for the hull shader push constants to take effect. The passthrough TCS uses push constants for the default tessellation levels. So, when those change, we need to re-upload the binding table as well. Fixes five Piglit tests on Skylake: - spec/arb_tessellation_shader/vs-tes-vertex - spec/arb_tessellation_shader/vs-tes-tessinner-tessouter-inputs-quads - spec/arb_tessellation_shader/vs-tes-tessinner-tessouter-inputs-tris - spec/arb_tessellation_shader/tes-read-texture - spec/arb_tessellation_shader/tess_with_geometry Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-11 12:10:00 -08:00
Mark Janes	f2c8913536	Add missing platform information for KBL In testing KBL, I found: - urb size was not set for slices gt1.5, gt2, and gt3. The value I used for these slices (384) was taken from an earlier patch authored by Ben Widawsky. - slice count was missing. This field was added by `a403ad4f5a` With this commit, KBL passes piglit at parity with SKL. Note: As requested by Kristian, Sarah modified this patch to drop setting urb size for gt1.5, gt2, and gt3, since the correct default is set in the GEN9 macro by commit `c1e38ad370` "i965/skl: Use larger URB size where available." Signed-off-by: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2016-01-11 11:24:20 -08:00
Ilia Mirkin	f21df5c513	nv50/ir: the whole point of data array is to hand out regular registers Fixes: `0d3051f75a` (nv50/ir: Fix scratch allocation size and file) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-11 13:01:11 -05:00
Dave Airlie	a9eace326e	mesa/uniform_query: add IROUNDD and use for doubles->ints (v2) For the case where we convert a double to an int, we should round the same as we do for floats. This fixes GL41-CTS.gpu_shader_fp64.state_query v2: add IROUNDD (Ilia) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-11 02:27:51 +00:00
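The commit does not show how IROUNDD is defined. As a rough sketch only, assuming it rounds half away from zero in the same style as a float rounding helper, a double-to-int conversion could look like the following. The name `iround_double` is hypothetical and is not the Mesa macro.

```c
#include <assert.h>

/* Hypothetical double -> int rounding helper (not Mesa's IROUNDD).
 * Assumption: round to nearest, with halves rounded away from zero.
 * The caller must pass a value whose rounded result fits in an int. */
static int iround_double(double d)
{
   assert(d > -2147483648.0 && d < 2147483647.0);
   return (int)(d >= 0.0 ? d + 0.5 : d - 0.5);
}
```

A plain `(int)d` cast truncates toward zero instead, so a queried value such as 2.5 would come back as 2 rather than 3.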
Timothy Arceri	124c9c2b97	glsl: replace unreachable code path with assert The lower_named_interface_blocks() pass is called before we try assign locations to varyings so this shouldn't be reachable. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-11 09:24:05 +11:00
Timothy Arceri	cf757f48ea	Revert "glsl: replace unreachable code path with assert" This reverts commit `98270fd20d`. Something went terribly wrong the commit is not what the commit message says.	2016-01-11 09:20:39 +11:00
Timothy Arceri	98270fd20d	glsl: replace unreachable code path with assert The lower_named_interface_blocks() pass is called before we try assign locations to varyings so this shouldn't be reachable. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-11 09:18:51 +11:00
Timothy Arceri	e4c5ace6a9	glsl: combine if blocks Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-11 09:18:45 +11:00
Rhys Kidd	7b4f8c827d	mesa: Update todo regarding StencilOp and StencilOpSeparate. OpenGL 2.0 function StencilOp() is in part internally implemented via StencilOpSeparate(). This change happened some time ago, however the accompanying doxygen todo comment was not accordingly updated. Replace the outdated portion of this doxygen todo comment, leaving the remainder unchanged. Also better respect the 80 character suggested line length in this file. v2: Fully remove comment, following code review by t_arceri@yahoo.com.au Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-11 09:10:17 +11:00
Kenneth Graunke	5e3edd4b28	glsl: Make bitfield_insert/extract and bfi/bfm non-vectorizable. Currently, opt_vectorize() tries to combine: result.x = bitfieldInsert(src0.x, src1.x, src2.x, src3.x); result.y = bitfieldInsert(src0.y, src1.y, src2.y, src3.y); result.z = bitfieldInsert(src0.z, src1.z, src2.z, src3.z); result.w = bitfieldInsert(src0.w, src1.w, src2.w, src3.w); into a single ir_quadop_bitfield_insert opcode, which operates on ivec4s. However, GLSL IR's opcodes currently require the bits and offset parameters to be scalar integers. So, this breaks. We want to be able to vectorize this eventually, but for now, just chicken out and make opt_vectorize() bail by marking all the bitfield insert/extract related opcodes as horizontal. This is a relatively uncommon case today, so we'll do the simple fix for stable branches, and fix it properly on master. Fixes assertion failures when compiling Shadow of Mordor vertex shaders on i965 in vec4 mode (where OptimizeForAOS enables opt_vectorize()). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-01-09 15:46:37 -08:00
Pierre Moreau	0d3051f75a	nv50/ir: Fix scratch allocation size and file Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-09 12:58:21 -05:00
Nicolai Hähnle	da5d4583e5	mesa: merge bind_atomic_buffers_{base\|range} Reduced code duplication should make the code more maintainable. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-08 19:37:38 -05:00
Nicolai Hähnle	5eb104d6ab	mesa: merge bind_shader_storage_buffers_{base\|range} Reduced code duplication should make the code more maintainable. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-08 19:37:38 -05:00
Nicolai Hähnle	e8dd7cc303	mesa: merge bind_uniform_buffers_{base\|range} Reduced code duplication should make the code more maintainable. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-08 19:37:37 -05:00
Nicolai Hähnle	b3ca26cded	mesa: merge bind_xfb_buffers_{base\|range} Reduced code duplication should make the code more maintainable. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-08 19:37:37 -05:00
Kristian Høgsberg Kristensen	81f7fd3c54	glsl: Don't add nir files to libglsl_la_SOURCES SCons doesn't understand nir yet and doesn't want to compile the glsl to nir pass. Move the files to their own variable so we can add it only for automake. Tested-by: Brian Paul <brianp@vmware.com>	2016-01-08 16:15:49 -08:00
Ilia Mirkin	e3706a7118	nv50,nvc0: use a face sysval to avoid the useless back-and-forth conversion Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-08 17:40:52 -05:00
Kristian Høgsberg Kristensen	82ad571abf	glsl: Move _mesa_shader_stage_to_string/abbrev to shader_enums.c These are used by code that doesn't necessarily link to libglsl.la. Move them to shader_enums.[ch] where we keep similar helpers. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-08 14:26:20 -08:00
Kristian Høgsberg Kristensen	1d25ef6ae7	i965: Move GLSL lowering passes out of libi965_compiler.la The scope of libi965_compiler.la is to be able to take nir shaders and generate i965 EU code. As such, we don't want the GLSL IR lowering passes in the library. With this change, libi965_compiler.la no longer needs to link to libglsl.la. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-08 14:26:16 -08:00
Kristian Høgsberg Kristensen	e97caba1f6	glsl: Move glsl_to_nir files to LIBGLSL_FILES libglsl_la_SOURCES includes both NIR_FILES and LIBGLSL_FILES, so for libglsl.la consumers, this is a no-op. libnir.la however no longer uses any GLSL IR infrastructure and can be used without also linking to libglsl.la. Acked-by: Matt Turner <mattst88@gmail.com>	2016-01-08 14:26:12 -08:00
Jordan Justen	1d54ac6c9f	mesa: Use separate indices for UBO & SSBO during binding Previously we were treating the binding index for Uniform Buffer Objects and Shader Storage Buffer Objects as being part of the combined BufferInterfaceBlocks array. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93322 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-01-08 13:11:31 -08:00
Jordan Justen	cf66a8ffb7	mesa: Map program UBOs and SSBOs to Interface Blocks v2: * Fill UboInterfaceBlockIndex and SsboInterfaceBlockIndex in split_ubos_and_ssbos (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2016-01-08 13:10:28 -08:00
Sarah Sharp	5d349fab46	mesa: docs: Add link to planet.freedesktop.org The freedesktop.org blog feeds aren't mentioned on either mesa3d.org or any of the graphics project wikis (including the DRI wiki) on freedesktop.org. Fix that by linking to it from the sidebar. Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-08 12:18:12 -08:00
Ilia Mirkin	dff1caccac	freedreno: add ir3_compiler to gitignore Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-08 15:16:37 -05:00
Ilia Mirkin	90ba06618e	gallium: add a RESQ opcode to query info about a resource Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:33 -05:00
Ilia Mirkin	ebfb5446c7	gallium: add PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:33 -05:00
Ilia Mirkin	266d001261	gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERS Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:33 -05:00
Ilia Mirkin	8cb493acc7	tgsi: update atomic op docs Specify that the operation only applies to the x component, not per-component as previously specified. This is unnecessary for GL and creates additional complications for images which need to support these operations as well. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:33 -05:00
Ilia Mirkin	bdef02ff26	tgsi: add a is_store property Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:33 -05:00
Ilia Mirkin	50b8488926	tgsi: provide a way to encode memory qualifiers for SSBO Each load/store on most hardware can specify what caching to do. Since SSBO allows individual variables to also have separate caching modes, allow loads/stores to have the qualifiers instead of attempting to encode them in declarations. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:32 -05:00
Ilia Mirkin	888ddd632d	ureg: add buffer support to ureg Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:32 -05:00
Ilia Mirkin	8cc9a8aa2a	tgsi: add ureg support for image decls Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-08 15:10:32 -05:00
Jose Fonseca	208bfc493d	glsl: Ensure 64bits shift is used. I believe that `1u << x`, where x >= 32 yields undefined results according to the C standard. Particularly MSVC says `warning C4334: '<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)`. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:06:59 +00:00
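A minimal sketch of the issue this commit addresses: shifting a 32-bit `1u` by 32 or more is undefined, so the constant has to be widened before the shift. The helper name `bit64` is hypothetical. `UINT64_C` comes from `<stdint.h>`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a 64-bit mask with only bit `x` set. */
static uint64_t bit64(unsigned x)
{
   assert(x < 64);            /* a shift by >= the type width is undefined */
   return UINT64_C(1) << x;   /* the constant is 64-bit, so the shift is too */
}
```

Writing `1u << x` here would perform the shift in 32 bits and only then widen the result, which is what the MSVC warning C4334 quoted in the message points out.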
Jose Fonseca	e378184d9c	mesa/main: Avoid `void function returning a value` warning. Trivial. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:06:59 +00:00
Oded Gabbay	6613042c4e	configure.ac: add --enable-profile For profiling mesa's code, especially llvmpipe, PROFILE should be defined. Currently, this define can only be generated if mesa is built using scons. This patch makes it possible to generate this define also when building mesa through automake tools. v2: - Change --enable-llvmpipe-profile to --enable-profile - Add -fno-omit-frame-pointer to CFLAGS and CXXFLAGS when enabling profile Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-08 21:59:47 +02:00
Marek Olšák	1e463d20ba	nine: allow fragment shader POSITION and FACE to be system values Reported-by: Axel Davy <axel.davy@ens.fr>	2016-01-08 20:07:16 +01:00
Marek Olšák	d0cf66d835	vl: allow fragment shader POSITION to be a system value Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:07:16 +01:00
Marek Olšák	69f43c2cc9	util/pstipple: allow fragment shader POSITION to be a system value Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:07:16 +01:00
Marek Olšák	8a13ce14fd	st/mesa: add support for POSITION and FACE system values Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:07:15 +01:00
Marek Olšák	c00e534283	tgsi/scan: update for POSITION and FACE system values Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:07:15 +01:00
Marek Olšák	34738a92de	gallium: add caps for POSITION and FACE system values v2: document the integer behavior Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:07:15 +01:00
Marek Olšák	24737f2298	program: add a helper for rewriting FP position input to sysval Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:06:23 +01:00
Marek Olšák	4191c1a57c	glsl: optionally declare gl_FragCoord & gl_FrontFacing as system values Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 20:06:23 +01:00
Marek Olšák	c07cf5f5a9	tgsi/ureg: handle redundant declarations in ureg_DECL_system_value Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-08 20:06:22 +01:00
Marek Olšák	c886422656	tgsi/ureg: remove index parameter from ureg_DECL_system_value It can be trivially derived from the number of already declared system values. This allows ureg users not to worry about which index to choose. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-08 20:06:22 +01:00
Marek Olšák	91e8f2b0a5	st/mesa: remove dead code from mesa_to_tgsi These aren't part of ARB_fragment_program. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-08 20:06:22 +01:00
Edward O'Callaghan	cb513485a0	radeon, si: Use TGSI chan name defines in lp_build_emit_fetch() calls Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-08 12:18:36 -05:00
Edward O'Callaghan	b42254eff3	gallium/aux: Use TGSI chan name defines in place of literals Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-08 12:18:24 -05:00
Nicolai Hähnle	d6db7ceedf	mesa: check that internalformat of CopyTexImage*D is not 1, 2, 3, 4 The piglit copyteximage check has recently been augmented to test this, but apparently it hasn't been fixed in Mesa so far. This language also already appears in the OpenGL 2.1 spec (Ian). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-08 10:58:27 -05:00
Jason Ekstrand	040e314143	i965/compiler: Enable more lowering in NIR We don't need these for GLSL or ARB, but we need them for SPIR-V Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-07 16:14:42 -08:00
Jason Ekstrand	d00abcc283	nir/algebraic: Add more lowering This commit adds lowering options for the following opcodes: - nir_op_fmod - nir_op_bitfield_insert - nir_op_uadd_carry - nir_op_usub_borrow Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-07 16:14:38 -08:00
Jason Ekstrand	b0d4ee520e	nir/opcodes: Fix up uadd_carry and usub_borrow Both were defined as returning bool but the gpu_shader5 functions are defined to return int. Also, we had the parameters for usub_borrow backwards in the folding expression. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-07 16:14:25 -08:00
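For reference, the two operations can be written as scalar C. This is a sketch of the semantics only, under the assumption that they follow the GLSL `uaddCarry` and `usubBorrow` built-ins. The function names below are hypothetical and this is not the NIR folding expression itself.

```c
#include <stdint.h>

/* Carry out of a 32-bit unsigned add: 1 if a + b wraps, else 0.
 * Returned as an integer, not a boolean. */
static uint32_t uadd_carry(uint32_t a, uint32_t b)
{
   return (uint32_t)(a + b) < a ? 1u : 0u;
}

/* Borrow of a 32-bit unsigned subtract a - b: 1 if a < b, else 0.
 * Operand order matters: swapping a and b gives the wrong answer. */
static uint32_t usub_borrow(uint32_t a, uint32_t b)
{
   return a < b ? 1u : 0u;
}
```

For example, `usub_borrow(1, 2)` is 1 while `usub_borrow(2, 1)` is 0, which is why reversed parameters in a folding expression produce wrong constants.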
Ilia Mirkin	67b31b3c59	nvc0: add ARB_indirect_parameters support I chose to make separate macros for this due to the additional complexity and extra scratch usage. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	9a54ccf30a	st/mesa: expose ARB_indirect_parameters when the backend driver allows Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	e1eab5a76f	mesa: add support for ARB_indirect_parameters draw functions Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	9327e2d312	mesa: add parameter buffer, used for ARB_indirect_parameters Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	b3e2c21fe5	glapi: add ARB_indirect_parameters definitions Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	7ca67c752b	nvc0: add support for real ARB_multi_draw_indirect The draw groups are now split up into groups of 32 if there's a non-packed stride, or in groups of 400-500 if the draw data is packed. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	d3e43baffe	nvc0: adjust indirect draw macros to handle multiple draws at once These are still invoked one at a time, but the underlying macro can handle multiple draws. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	2860f20859	st/mesa: add support for new mesa indirect draw interface This shifts all indirect draws to go through the new function. If the driver doesn't have support for multi draws, we break those up and perform N draws. Otherwise, we pass everything through for just a single draw call. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	d67b9ba9a1	gallium: add caps to expose support for multi indirect draws Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	3e11656694	gallium: add sufficient draw interface to allow new indirect features This makes it possible to support indirect multidraws as well as having the number of such draws to come from a separate GPU resource. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-07 18:38:46 -05:00
Ilia Mirkin	60d0cfd429	vbo: create a new draw function interface for indirect draws All indirect draws are passed to the new draw function. By default there's a fallback implementation which pipes it right back to draw_prims, but eventually both the fallback and draw_prim's support for indirect drawing should be removed. This should allow a backend to properly support ARB_multi_draw_indirect and ARB_indirect_parameters. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-07 18:38:45 -05:00
Roland Scheidegger	2923c7a0ed	llvmpipe: do 64bit plane calculations in the sse path The sse path was pretty much disabled for practical purposes because the largest allowed fb size was 128x128. So, adapt it for 64bit plane calculations. This is actually not that difficult, though a problem is that we can't do a signed 32x32->64bit mul, only unsigned, so need to fix that up. Overall, the code still looks reasonable, though it's not like changes there in setup really make much of a difference in the end... Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 00:34:14 +01:00
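The "fix that up" step mentioned here can be illustrated in scalar C. This is a sketch of the arithmetic identity only, not the SSE code from the commit, and `mul_s32_via_u32` is a hypothetical name. Reinterpreting a negative 32-bit value as unsigned adds 2^32 to it, so the unsigned product overshoots by the other operand shifted left by 32, which is subtracted back out.

```c
#include <stdint.h>

/* Signed 32x32 -> 64-bit multiply built from an unsigned multiply.
 * All arithmetic is modulo 2^64, so the corrections wrap as intended. */
static int64_t mul_s32_via_u32(int32_t a, int32_t b)
{
   uint64_t ua = (uint32_t)a;      /* a + 2^32 if a is negative */
   uint64_t ub = (uint32_t)b;      /* b + 2^32 if b is negative */
   uint64_t prod = ua * ub;        /* unsigned 32x32 -> 64 product */

   if (a < 0)
      prod -= ub << 32;            /* undo the extra 2^32 * b */
   if (b < 0)
      prod -= ua << 32;            /* undo the extra 2^32 * a */

   /* Conversion to int64_t assumes two's complement representation. */
   return (int64_t)prod;
}
```

A vectorized version would typically replace the two branches with masks derived from the sign bits, but that detail is an assumption and is not taken from the commit.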
Roland Scheidegger	fad283ba9e	llvmpipe: don't store eo as 64bit int eo, just like dcdx and dcdy, cannot overflow 32bit. Store it as unsigned though just in case (it cannot be negative, but in theory twice as big as dcdx or dcdy so this gives it one more bit). This doesn't really change anything, albeit it might help minimally on 32bit archs. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 00:34:14 +01:00
Roland Scheidegger	b61b9a377e	llvmpipe: use aligned data for the assembly program in setup Back in the day (before `24678700ed`) the values were not actually in a struct but even then I can't see why we didn't simply align the values. Especially since it's trivial to do so. (Not that it actually matters since the code is pretty much unused for now.) Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>	2016-01-08 00:34:13 +01:00
Roland Scheidegger	9db7309595	draw: initialize prim header flags when clipping lines Otherwise, clipped lines would have undefined stippling reset bit if line stippling is enabled. (Untested, and I just assume copying over the bits from the original line is actually the right thing to do.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2016-01-08 00:34:13 +01:00
Roland Scheidegger	64da11f052	draw: fix line stippling with unfilled prims The unfilled stage was not filling in the prim header, and the line stage then decided to reset the stipple counter or not based on the uninitialized data. This causes some failures in conform linestipple test (albeit quite randomly happening depending on environment). So fill in the prim header in the unfilled stage - I am not entirely sure if anybody really needs determinant after that stage, but there's at least later stages (wide line for instance) which copy over the determinant as well. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-08 00:34:13 +01:00
Timothy Arceri	5cf156c6b4	glsl: replace null check with assert This was added in `54f583a20`; since then, error handling has improved. The test this was added to fix now fails earlier since `01822706ec`. Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-08 09:12:45 +11:00
Nicolai Hähnle	051603efd5	i965: use _mesa_delete_buffer_object This is more future-proof, plugs the memory leak of Label and properly destroys the buffer mutex. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-07 17:07:12 -05:00
Nicolai Hähnle	1b74c02e83	i915: use _mesa_delete_buffer_object This is more future-proof, plugs the memory leak of Label and properly destroys the buffer mutex. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-07 17:07:09 -05:00
Nicolai Hähnle	8882b46226	radeon: use _mesa_delete_buffer_object This is more future-proof, plugs the memory leak of Label and properly destroys the buffer mutex. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-07 17:07:03 -05:00
Nicolai Hähnle	1c2187b1c2	st/mesa: use _mesa_delete_buffer_object This is more future-proof than the current code. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-01-07 17:06:58 -05:00
Nicolai Hähnle	6aed083b93	mesa/bufferobj: make _mesa_delete_buffer_object externally accessible gl_buffer_object has grown more complicated and requires cleanup. Using this function from drivers will be more future-proof. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-07 17:05:54 -05:00
Oded Gabbay	f41b6cfb07	llvmpipe: use sse2 conv code for altivec In lp_build_conv() and lp_build_conv_auto(), there is a special case of conversion when sse2 is present. That code path is suitable without any changes to altivec, because all the functions that are called in that code path already support altivec. This patch increases the FPS on POWER arch across the board by 10%-25%. I checked ipers, glxgears, glxspheres64, openarena, xonotic and glmark2. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-07 22:07:02 +02:00
Marek Olšák	bca18057a3	radeonsi: adjust the parameters of si_shader_dump The function will be extended to dump all binaries shaders will consist of, so si_shader* makes sense here. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	0a51b010e5	radeonsi: move si_shader_dump call out of si_compile_llvm Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	b0df5f4c19	radeonsi: inline si_shader_binary_read Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	c9c031f3d0	radeonsi: move si_shader_dump call out of si_shader_binary_read Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	f8b34fe093	radeonsi: separate shader dumping code to si_shader_dump and *_dump_stats Eventually, I'd like to dump stats for several combined binaries, which is why you don't see a binary parameter in si_shader_dump_stats Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	ccd7d7e13d	radeonsi: add si_shader_destroy_binary Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	5c9f104567	radeonsi: don't pass si_shader to si_compile_llvm Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	54ed83669e	radeonsi: move si_shader_binary_upload out of si_compile_llvm Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	f20a76a4fd	radeonsi: always keep shader code, rodata, and relocs in memory We won't compile shaders in draw calls, but we will concatenate shader binaries according to states in draw calls, so keep the binaries. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	63345cfc3a	radeonsi: don't pass si_shader to si_shader_binary_read Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	2d3a96448a	radeonsi: don't pass si_shader to si_shader_binary_read_config Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	20b9b5d7f5	radeonsi: add struct si_shader_config There will be 1 config per variant, which will be a union of configs from {prolog, main, epilog}. For now, just add the structure. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	890873d106	radeonsi: move NULL exporting into a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	a72ed2f6bc	radeonsi: move MRT color exporting into a separate function This will be used by a fragment shader epilog. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	0ffe3d3772	radeonsi: use EXP_NULL for pixel shaders without outputs This never happens currently. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	677c65968b	radeonsi: only use LLVMBuildLoad once when updating color outputs at the end without LLVMBuildStore. So: - do LLVMBuildLoad - update the values as necessary - export Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	185267a6fd	radeonsi: export "undef" values for undefined PS outputs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	1ce659f820	radeonsi: move MRTZ export into a separate function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	5f3e6b5b0f	radeonsi: simplify setting the DONE bit for PS exports First find out what the last export is and simply set the DONE bit there. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	e00f3f23b1	radeonsi: set SPI color formats and CB_SHADER_MASK outside of compilation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	4e597c25c7	radeonsi: write all MRTs only if there is exactly one output This doesn't fix a known bug, but better safe than sorry. Also, simplify the expression in si_shader.c. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:06 +01:00
Marek Olšák	746a7a7498	radeonsi: determine SPI_SHADER_Z_FORMAT outside of shader compilation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:05 +01:00
Marek Olšák	2cb8bf90cd	radeonsi: determine DB_SHADER_CONTROL outside of shader compilation because the API pixel shader binary will not emulate alpha test one day, so the KILL_ENABLE bit must be determined elsewhere. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:05 +01:00
Marek Olšák	ff7e77724e	tgsi/scan: set which color components are read by a fragment shader This will be used by radeonsi. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:05 +01:00
Marek Olšák	18ec76730a	tgsi/scan: fix tgsi_shader_info::reads_z This has no users in Mesa. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:05 +01:00
Marek Olšák	f3658be108	tgsi/scan: set if a fragment shader writes sample mask This will be used by radeonsi. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-07 18:26:05 +01:00
Kenneth Graunke	3e8f644ed3	glsl: Disallow vectorization of vector_insert/extract. vector_insert takes a vector, a scalar location, and a scalar value, and produces a new vector with that component updated. As such, it can't be vectorized properly. vector_extract takes a vector and a scalar location, and returns that scalar component of the vector. Vectorization doesn't really make any sense. Treating both as horizontal operations makes sure the vectorizer won't try to touch these. Found by inspection. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-06 21:22:06 -08:00
Roland Scheidegger	8d4039ecdb	softpipe: tell draw about the vertex layout we want This makes it more similar to llvmpipe. It also allows us to let draw emit code handle things like getting zeros for non-existing vs outputs automatically. There probably isn't really any overhead either way, there isn't really any "simply copy everything" code in the emit path it would copy each attrib individually just the same. Likewise, we still do another mapping step in softpipe as the layout may still not match exactly (same as in llvmpipe, should probably nuke the pointless mapping in both drivers). This fixes the piglit arb_fragment_layer_viewport no_gs/no_write tests. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 02:00:04 +01:00
Roland Scheidegger	8e3a76791f	llvmpipe: use ints not unsigned for slots They can't actually be 0 (as position is there) but should avoid confusion. This was supposed to have been done by `af7ba989fb` but I accidentally pushed an older version of the patch in the end... Also prettify slightly. And make some notes about the confusing and useless fs input "map". Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 01:59:17 +01:00
Roland Scheidegger	2dbc20e456	draw: nuke the interp parameter from vertex_info draw emit couldn't care less what the interpolation mode is... This somehow looked like it would matter, all drivers more or less dutifully filled that in correctly. But this is only used for emit, if draw needs to know about interpolation mode (for clipping for instance) it will get that information from the vs anyway. softpipe actually used to depend on that interpolation parameter, as it abused that structure quite a bit but no longer. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 01:58:05 +01:00
Roland Scheidegger	892e2d1395	softpipe: don't abuse the draw vertex_info struct for something different softpipe would calculate two "vertex layouts". The second one was however just used for internal purposes, draw would know nothing about it even though it looked exactly the same as the other one we tell draw about. So, store that information separately as this was just confusing. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 01:57:21 +01:00
Roland Scheidegger	b64d008052	softpipe: fix mapping of "special" vs outputs Unlike llvmpipe, softpipe always tells draw to emit the vertices as-is. The two vertex layouts it calculates are a bit confusing, one which is just used to tell draw to emit vertices as-is, and the other which has draw written all over it but draw is completely unaware of and is used only to look up the correct interpolation info later in setup. Thus, the slots used are different to what llvmpipe does (I'm going to clean up the confusing two layout stuff). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 01:56:43 +01:00
Roland Scheidegger	01761a38e8	llvmpipe: scratch some special handling of vp_index/layer It was actually slightly buggy (missing initialization / setup not dependent on new vs albeit I didn't see issues), but the case of non-existing attributes is now handled by draw emit code so don't need that anymore. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 01:55:45 +01:00
Roland Scheidegger	afa035031f	draw: rework handling of non-existing outputs in emit code Previously the code would just redirect requests for attributes which don't exist to use output 0. Rework this to output all zeros instead which seems more useful - in particular some extensions like ARB_fragment_layer_viewport require 0 in the fs even if it wasn't output by previous stages. That way, drivers don't have to special case this depending if the vs/gs outputs some attribute or not. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 01:52:39 +01:00
Sarah Sharp	39c41be50d	mesa: Add KBL PCI IDs and platform information. Add PCI IDs for the Intel Kabylake platforms. The IDs are taken directly from the Linux kernel patches, which are under review: http://lists.freedesktop.org/archives/intel-gfx/2015-October/078967.html http://cgit.freedesktop.org/~vivijim/drm-intel/log/?h=kbl-upstream-v2 The Kabylake PCI IDs taken from the kernel are rearranged to be in order of GT type, then PCI ID. Please note that if this patch is backported, the following fixes will need to be added before this patch: commit `28ed1e08e8` "i965/skl: Remove early platform support" commit `c1e38ad370` "i965/skl: Use larger URB size where available." Thanks to Ben for fixing a bug around setting urb.size, and being patient with my questions about what the various fields mean. Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Suggested-by: Ben Widawsky <benjamin.widawsky@intel.com> Tested-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (KBL-GT2) Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2016-01-06 15:11:00 -08:00
Sinclair Yeh	0819287f56	svga: Rename SVGA_HINT_FLAG_DRAW_EMITTED Rename SVGA_HINT_FLAG_DRAW_EMITTED to SVGA_HINT_FLAG_CAN_PRE_FLUSH because preemptive flush can be unblocked by more commands than draw. Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 16:04:45 -07:00
Sinclair Yeh	9ccc716534	svga: allow preemptive flushing on DMA, update, and readback commands The existing code effectively turns off preemptive flushing for all but the regions used for draws. This turns out to be overly restrictive as some memory regions, e.g. GMR, may never get a draw when used as a DMA upload staging area, causing problems for apps that upload a large amount of textures, e.g. Unigine Heaven. This patch fixes the Unigine Heaven memory allocation error and has been verified to not cause a regression in the previous extended retina display issue. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 16:03:33 -07:00
Charmaine Lee	b074a5b02d	svga: skip vertex attribute instruction with zero usage_mask In emit_input_declarations(), we are skipping declarations for those registers that are not being used. But in emit_vertex_attrib_instructions(), we are still emitting instructions to tweak the vertex attributes even if they are not being used. This causes an assert in the backend because an input register is not declared in the shader. This patch fixes the problem by skipping the instruction if the vertex attribute is not being used. The changes in this patch originate from the code snippet from Jose as suggested in bug 1530161. Tested with piglit, Heaven, Turbine, glretrace. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 16:01:38 -07:00
Brian Paul	b59fad8478	st/mesa: minor clean-ups in st_atom.c Remove useless comment. Reformat code.	2016-01-06 15:53:47 -07:00
Brian Paul	85444ab08b	st/mesa: replace bitmap size checks with assertion The _mesa_Bitmap() caller already checks for zero-sized bitmaps.	2016-01-06 15:53:47 -07:00
Brian Paul	18038b9fd6	st/mesa: check texture target in allocate_full_mipmap() Some kinds of textures never have mipmaps. 3D textures seldom have mipmaps. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:47 -07:00
Brian Paul	c032ae85ee	st/mesa: move mipmap allocation check logic into a function Better readability and easier to extend. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	0d39b5fc3b	main: s/GLuint/GLbitfield for state bitmasks Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	c81ddc2092	vbo: s/GLuint/GLbitfield/ for state bitmasks Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	3c0521cd0f	st/mesa: use GLbitfield in st_state_flags, add comments Use GLbitfield instead of GLuint to be consistent with other variables. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	4cd1bd46ed	s/GLuint/GLbitfield/ for st_invalidate_state() parameter To match dd_function_table::UpdateState(). Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	2cc52801c0	st/mesa: be more careful about state validation in st_Bitmap() If the only dirty state is mesa's _NEW_PROGRAM_CONSTANTS flag, we can skip state validation before drawing a bitmap since that state doesn't affect bitmap rendering. This further increases the performance of the ipers demo on llvmpipe to about what it was before commit `36c93a6fae`. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	b6bcf08641	st/mesa: move bitmap cache flushing out of state validation Just do it where needed (before drawing, clearing, etc). Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	c28d72a347	st/mesa: check state->mesa in early return check in st_validate_state() We were checking the dirty->st flags but not the dirty->mesa flags. When we took the early return, we didn't clear the dirty->mesa flags so the next time we called st_validate_state() we'd often flush the glBitmap cache. And since st_validate_state() is called from st_Bitmap(), it meant we flushed the bitmap cache for every glBitmap() call. This change seems to recover most of the performance loss observed with the ipers demo on llvmpipe since commit `36c93a6fae`. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: José Fonseca <jfonseca@vmware.com>	2016-01-06 15:53:46 -07:00
Brian Paul	c75d00e054	st/mesa: protect debug printf() with a conditional instead of comment	2016-01-06 15:53:46 -07:00
Brian Paul	72d6bbca5b	st/mesa: fix comment indentation in st_flush_bitmap_cache()	2016-01-06 15:53:46 -07:00
Timothy Arceri	e58be8ac0e	glsl: fix varying slot allocation for blocks and structs with explicit locations Previously each member was being counted as using a single slot; count_attribute_slots() fixes the count for array and struct members. Also don't assign a negative value to the unsigned expl_location variable. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-07 09:44:32 +11:00
Timothy Arceri	47dde2bd45	glsl: don't try adding built-ins to explicit locations bitmask Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 09:06:26 +11:00
Timothy Arceri	ac6e2c2056	glsl: fix overlapping of varying locations for arrays and structs Previously we were only reserving a single location for arrays and structs. We also didn't take into account implicit locations clashing with explicit locations when assigning locations for their arrays or structs. This patch fixes both issues. V5: fix regression for patch inputs/outputs in tessellation shaders V4: just use count_attribute_slots() to get the number of slots, also calculate the correct number of slots to reserve for gs and tess stages by making use of the new get_varying_type() helper. V3: handle arrays of structs V2: also fix for arrays of arrays and structs. Acked-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 09:06:20 +11:00
Timothy Arceri	5907a02ab6	glsl: create helper to remove outer vertex index array used by some stages This will be used in the following patch for calculating array sizes correctly when reserving explicit varying locations. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 09:06:16 +11:00
Timothy Arceri	30991d7389	glsl: remove unused varyings before packing them Previously we would pack varyings before trying to remove them, this relied on the packing pass not packing varyings with a location of -1 to avoid packing varyings that should be removed. However this meant unused varyings with an explicit location would be packed before they could be removed when we enable packing of them in a later patch. V2: fix regression in V1 removing unused varyings in multi-stage SSO, fix regression with single stage programs. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-07 09:06:12 +11:00
Krzysztof Sobiecki	0d7477a289	gallium/r600: Replace ALIGN_DIVUP with DIV_ROUND_UP ALIGN_DIVUP is a driver-specific (r600g) macro that duplicates DIV_ROUND_UP functionality. Replacing it with DIV_ROUND_UP eliminates this duplication. Signed-off-by: Krzysztof A. Sobiecki <sobkas@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-06 16:09:12 -05:00
Eric Anholt	bbd29f1375	vc4: Fix driver build from last minute rebase fix. I had the driver all tested for the last series, and in my last build I noticed that get_swizzled_channel was unused now, and removed it... apparently without testing to find that I removed the wrong channel swizzle function.	2016-01-06 12:49:45 -08:00
Eric Anholt	25aa436e86	vc4: Optimize out a comparison for bcsel based on an ALU comparison We routinely have code like: vec1 ssa_220 = fge ssa_104, ssa_61 vec1 ssa_199 = bcsel ssa_220, ssa_106, ssa_105 and we would compare fge's args and choose between ~0 and 0 to generate ssa_220, then compare ssa_220 to 0 and choose between bcsel's args. Instead, try to notice the pattern and compare between fge's args to select between bcsel's args. total instructions in shared programs: 88019 -> 87574 (-0.51%) instructions in affected programs: 9985 -> 9540 (-4.46%) total estimated cycles in shared programs: 245752 -> 245237 (-0.21%) estimated cycles in affected programs: 17232 -> 16717 (-2.99%)	2016-01-06 12:43:09 -08:00
Eric Anholt	7a9eb76786	vc4: Add missing sRGB decode to texel fetches. We only see txf on MSAA textures, currently, and apparently this didn't impact any of our piglit tests.	2016-01-06 12:43:09 -08:00
Eric Anholt	f01ca9eeda	vc4: Add support for GL_ARB_texture_swizzle. We already had the code supporting it, since it's needed for the depth mode when doing shadow comparisons.	2016-01-06 12:43:09 -08:00
Eric Anholt	12519a972f	vc4: Use NIR texture lowering for texture swizzling. We can't use its other features currently (mostly because we don't want Newton-Raphson on rcps for texture coordinates), but it gets us started. This eliminates some comparisons with constants in GLB2.7 and ETQW traces at the QIR level by moving the comparisons into NIR, where they get constant-folded out. instructions in affected programs: 165 -> 156 (-5.45%) total uniforms in shared programs: 32087 -> 32085 (-0.01%) total estimated cycles in shared programs: 245762 -> 245752 (-0.00%) estimated cycles in affected programs: 461 -> 451 (-2.17%)	2016-01-06 12:43:08 -08:00
Eric Anholt	71db7d3dc5	vc4: Replace the SSA-style SEL operators with conditional MOVs. I'm moving away from QIR being SSA (since NIR is doing lots of SSA optimization for us now) and instead having QIR just be QPU operations with virtual registers. By making our SELs be composed of two MOVs, we could potentially coalesce the registers for the MOV's src and dst and eliminate the MOV. total instructions in shared programs: 88448 -> 88028 (-0.47%) instructions in affected programs: 39845 -> 39425 (-1.05%) total estimated cycles in shared programs: 246306 -> 245762 (-0.22%) estimated cycles in affected programs: 162887 -> 162343 (-0.33%)	2016-01-06 12:39:51 -08:00
Eric Anholt	0a89f307f9	vc4: Don't try the SF coalescing unless it's on a def. If you want the SF of the value of a register produced from a series of packing MOVs or conditional MOVs, we can't just SF on the last MOV into the register.	2016-01-06 12:39:27 -08:00
Edward O'Callaghan	1953cee6d7	gallium/drivers/svga: Use unsigned for loop index Fix a 's/unsigned int/unsigned/' consistency case while here. Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Edward O'Callaghan	8e2a8ec731	gallium/drivers/r600: Use unsigned for loop index Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Edward O'Callaghan	76a7d6f412	gallium/drivers/ilo: Use unsigned for loop index Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Edward O'Callaghan	5071c192cc	gallium: Use unsigned for loop index Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Edward O'Callaghan	bfabd5e74a	gallium/drivers: Remove unnecessary semicolons Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Edward O'Callaghan	67d4b4b28c	gallium: Remove unnecessary semicolons Fix a silly issue with MSVC case fall-through support needing an extra 'break;' Found-by: Coccinelle Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2016-01-06 08:04:03 -07:00
Oded Gabbay	9d59b9d00c	llvmpipe: Optimize lp_rast_triangle_32_3_16 for POWER8 This patch converts the SSE-optimized lp_rast_triangle_32_3_16() to VMX/VSX. I measured the results on a POWER8 machine with 32 cores at 3.4GHz and 16GB of RAM. FPS/Score Name Before After Delta ------------------------------------------------ openarena 16.35 16.7 2.14% xonotic 4.707 4.97 5.57% glmark2 didn't show a significant (more than 1%) difference. v2: Make sure code is built only on POWER8 LE machine Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-06 14:54:16 +02:00
Oded Gabbay	925c46cfc4	llvmpipe: Optimize BUILD_MASK(_LINEAR) for POWER8 This patch converts the SSE-optimized build_mask_32() and build_mask_linear_32() to VMX/VSX. I measured the results on a POWER8 machine with 32 cores at 3.4GHz and 16GB of RAM. FPS/Score Name Before After Delta ------------------------------------------------ glmark2 (score) 139.8 142.7 2.07% openarena and xonotic didn't show a significant (more than 1%) difference. v2: Make sure code is built only on POWER8 LE machine Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-06 14:54:16 +02:00
Oded Gabbay	3bbe16ea79	llvmpipe: Optimize do_triangle_ccw for POWER8 This patch converts the SSE optimization done in do_triangle_ccw to VMX/VSX. I measured the results on a POWER8 machine with 32 cores at 3.4GHz and 16GB of RAM. FPS/Score Name Before After Delta ------------------------------------------------ glmark2 (score) 136.6 139.8 2.34% openarena 16.14 16.35 1.30% xonotic 4.655 4.707 1.11% v2: - Convert loads to use aligned loads - Make sure code is built only on POWER8 LE machine Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-06 14:54:16 +02:00
Oded Gabbay	e99555ef0b	llvmpipe: add POWER8 portability file - u_pwr8.h This file provides a portability layer that will make it easier to convert SSE-based functions to VMX/VSX-based functions. All the functions implemented in this file are prefixed using "vec_". Therefore, when converting from an SSE-based function, one simply needs to replace the "_mm_" prefix of the SSE function being called with "vec_". Having said that, not all functions could be converted as such, due to the differences between the architectures. So, when such a conversion hurt performance, I preferred to implement a more ad-hoc solution. For example, converting _mm_shuffle_epi32 needed to be done using ad-hoc masks instead of a generic function. All the functions in this file support both little-endian and big-endian, but currently the file is built only on POWER8 LE machine. All of the functions are implemented using the Altivec/VMX intrinsics, except one where I needed to use inline assembly (due to a missing intrinsic). v2: - Use vec_vgbbd instead of __builtin_vec_vgbbd - Add an aligned load function - Don't use typeof() - Make file build only on POWER8 LE machine Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-06 14:54:16 +02:00
Oded Gabbay	afe88f66a8	configure.ac: Detect if running on POWER8 arch To determine if we could use special POWER8 assembly directives, we first need to detect whether we are running on POWER8 architecture. This patch adds this detection to configure.ac and adds the necessary compilation flags accordingly. v2: - Add option to disable POWER8 instructions generation - Detect whether building on BE or LE machine and build with -mpower8-vector only on LE machine - Make the printed messages more standard Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-06 14:54:16 +02:00
Kenneth Graunke	7295f4fcc2	nir: Add a lower_fdiv option, turn fdiv into fmul/frcp. The nir_opt_algebraic rule (('fadd', ('flog2', a), ('fneg', ('flog2', b))), ('flog2', ('fdiv', a, b))), can produce new fdiv operations, which need to be lowered on i965, as we don't actually implement fdiv. (Normally, we handle this in GLSL IR's lower_instructions pass, but in the above case we introduce an fdiv after that point. So, make NIR do it for us.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2016-01-05 19:22:11 -08:00
Kenneth Graunke	bd21b54607	i965: Only turn on ARB_compute_shader if we can write registers. Compute shaders require reconfiguring the L3 for shared local memory support. We have to be able to write the L3 registers to do that. This effectively turns off compute shaders prior to Kernel 4.2. (Previously, the extension enable was in an API_OPENGL_CORE conditional. However, that isn't necessary - core Mesa extension handling already restricts it properly. I've moved it out in this patch.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2016-01-05 18:07:27 -08:00
Kenneth Graunke	25b7e4a01f	i965: Use rcp in brw_lower_texture_gradients rather than 1.0 / x. That's what it's for. Plus, we actually implement rcp. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-05 18:07:27 -08:00
Timothy Arceri	3d402d4450	mesa: fix GL_MAX_NAME_LENGTH query for tessellation shaders This fixes some piglit subtests for ARB_program_interface_query. V3: remove some of the unnecessary parentheses V2: fix alignment Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-06 12:01:09 +11:00
Timothy Arceri	e1e1b67878	glsl: don't change the varying type in validation code There is a function dedicated to demoting unused varyings; let's trust it to do its job. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-06 10:52:58 +11:00
Timothy Arceri	21590a307c	glsl: move lowering after matching validation After lowering the matching flag is_unmatched_generic_inout is lost so we need to move this validation before lowering. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-06 10:52:54 +11:00
Timothy Arceri	0508d9504a	glsl: only add outward facing varyings to resource list for SSO An SSO program can have multiple stages and we only want to add the externally facing varyings. The current code was adding both the packed inputs and outputs for the first and last stage of each program. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-06 10:52:48 +11:00
Anuj Phogat	4d2a7f5111	i965/gen9: Modify the conditions to use blitter on skl+ Conditions modified allow skl+ to use blitter: - for all tiling formats - to write data to YF/YS tiled surfaces Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-01-05 13:43:32 -08:00
Anuj Phogat	0bf037c0fe	i965/gen9: Return false in place of assert in intelEmitCopyBlit() This allows the fallback paths to handle it correctly. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-05 13:43:32 -08:00
Anuj Phogat	5cbe01c83f	i965/gen9: Remove regions overlap check in fast copy blit Overlapping blits are anyway undefined in OpenGL. So no need of overlap check here. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-05 13:43:32 -08:00
Anuj Phogat	3c8b97a45b	i965/gen9: Don't use fast copy blit in case of non power of 2 cpp Fast copy blit is currently enabled for use only with Yf/Ys tiling. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-05 13:43:32 -08:00
Ian Romanick	ee4676aa57	i915/i965: Fix typo in perf_debug message Trivial Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2016-01-05 13:18:45 -08:00
Brian Paul	a13e9adbee	st/mesa: minor indentation fixes	2016-01-05 13:04:46 -07:00
Brian Paul	f4caa7d2fc	draw: minor indentation fix	2016-01-05 13:03:05 -07:00
Brian Paul	dce1e1a8eb	mesa: minor clean-up of some memcpy/sizeof() calls in m_matrix.c Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-05 13:03:05 -07:00
Brian Paul	95d412181d	util: add debug_dump_ubyte_rgba_bmp() Like debug_dump_float_rgba_bmp() but takes ubyte values. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-05 13:03:04 -07:00
Brian Paul	f04d7439a0	mesa: check for z=0 in _mesa_Vertex3dv() It's very rare that a GL app calls glVertex3dv(), but one in particular calls it a lot, always with Z = 0. Check for that condition and convert the call into glVertex2f. This reduces VBO memory used and reduces the number of times we have to switch between float[2] and float[3] vertex formats in the svga driver. This results in a small but measurable performance improvement. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-05 13:03:04 -07:00
Brian Paul	eec8d7e7e0	svga: fix test for SVGA_NEW_STIPPLE We only want to set the SVGA_NEW_STIPPLE dirty flag when the polygon stipple state changes. Before, we only set the flag when we were enabling stipple, but not disabling. We don't really have to add SVGA_NEW_STIPPLE to the dirty FS state set since it's a subset of SVGA_NEW_RAST, but let's be explicit. This doesn't fix any known bugs. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-05 13:03:04 -07:00
Brian Paul	993b04ee2c	svga: add some comments in svga_state_vs.c Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-05 13:03:04 -07:00
Brian Paul	fc07658895	svga: change svga_hw_view_state::dirty to boolean Since it's a true/false value. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-05 13:03:04 -07:00
Brian Paul	077aa3be93	svga: avoid emitting redundant SetVertexBuffers() commands Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-05 13:03:04 -07:00
Brian Paul	b11bd20889	svga: check for no-ops in svga_bind_sampler_states() and svga_set_sampler_views(). If there's no change, return early and don't set a SVGA_NEW_x dirty state flag. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-01-05 13:03:04 -07:00
Ilia Mirkin	6531ccb705	i965: quieten compiler warning about out-of-bounds access gcc 4.9.3 shows the following error: brw_vue_map.c:260:20: warning: array subscript is above array bounds [-Warray-bounds] return brw_names[slot - VARYING_SLOT_MAX]; This is because BRW_VARYING_SLOT_COUNT is a valid value for the enum type. Adding an assert will generate no additional code but will teach the compiler to not complain. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-01-05 12:07:53 -05:00
Julien Isorce	777d1453f1	build: enable st/va with nouveau driver vainfo fails in vaDriverInit because "dd_create_screen" does not reach strcmp(driver_name, "nouveau") code. Indeed when compiling the va target.c, the macro GALLIUM_NOUVEAU is not defined. This patch defines the macro the same way it is done for the dri and vdpau targets. Tested with: ./autogen.sh --enable-glx --enable-gles2 --enable-egl --enable-vdpau --enable-glx-tls=yes --enable-va --with-gallium-drivers=swrast,nouveau --with-dri-drivers=swrast,nouveau --with-egl-platforms=x11 LIBVA_DRIVER_NAME=gallium vainfo Output: vainfo: Driver version: mesa gallium vaapi vainfo: Supported profile and entrypoints VAProfileMPEG2Simple : VAEntrypointVLD VAProfileMPEG2Main : VAEntrypointVLD VAProfileMPEG4Simple : VAEntrypointVLD VAProfileMPEG4AdvancedSimple : VAEntrypointVLD VAProfileVC1Simple : VAEntrypointVLD VAProfileVC1Main : VAEntrypointVLD VAProfileVC1Advanced : VAEntrypointVLD VAProfileH264Baseline : VAEntrypointVLD VAProfileH264Main : VAEntrypointVLD VAProfileH264High : VAEntrypointVLD VAProfileNone : VAEntrypointVideoProc Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-05 12:07:53 -05:00
Julien Isorce	abb30b9c8b	nvc0: add support for st/va - split nvc0_decoder_bsp in begin/next/end - preserve content buffer when calling nvc0_decoder_bsp_next - implement pipe_video_codec::begin_frame/end_frame https://bugs.freedesktop.org/show_bug.cgi?id=89969 Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-05 12:07:53 -05:00
Julien Isorce	7ba27f60f7	nouveau: split nouveau_vp3_bsp in begin/next/end It allows to call nouveau_vp3_bsp_next multiple times between one begin/end. It is required to support st/va. https://bugs.freedesktop.org/show_bug.cgi?id=89969 Signed-off-by: Julien Isorce <j.isorce@samsung.com> [imirkin: create strparm_bsp function, simplified w0 calculation] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-05 12:07:53 -05:00
Julien Isorce	851e7e12aa	st/va: count number of slices The counter was not set but used by the nouveau driver. It is required otherwise visual output is garbage. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com>	2016-01-05 15:02:47 +00:00
Ilia Mirkin	14f21f53d5	i965/wm: use binding size for ubo/ssbo when automatic size is unset This fixes the same tests that commit `8cf2e892f` was attempting to fix: ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeOffset ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeSize as confirmed by Samuel. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-01-05 01:30:09 -05:00
Ilia Mirkin	a1d664a0b7	Revert "i965/wm: use proper API buffer size for the surfaces." This reverts commit `8cf2e892fc`. It's entirely bogus to attempt to store anything about the binding in the buffer object itself, which might be bound any number of times. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2016-01-05 01:29:49 -05:00
Nicolai Hähnle	2123bfcc9c	st/mesa: make KHR_debug output independent of context creation flags (v2) Instead, keep track of GL_DEBUG_OUTPUT and (un)install the pipe_debug_callback accordingly. Hardware drivers can still use the absence of the callback to skip more expensive operations in the normal case, and users can no longer be surprised by the need to set the debug flag at context creation time. v2: - re-add the proper initialization of debug contexts (Ilia Mirkin) - silence a potential warning (Ilia Mirkin) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-04 18:40:49 -05:00
Ilia Mirkin	b16c9be4a5	nvc0: scale up inter_bo size so that it's 16M for a 4K video Experimentally, 4M causes corruption and slowness, try to ramp it up with size instead. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-01-04 11:32:45 -05:00
Ilia Mirkin	b5f2f7073f	nv50,nvc0: fix crash when increasing bsp bo size for h264 H264 doesn't have a bitplane bo. We just need a device reference, so use the one from the client. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-01-04 11:32:45 -05:00
Samuel Iglesias Gonsálvez	8cf2e892fc	i965/wm: use proper API buffer size for the surfaces. Commit `5bb5eeea` fixes a bug indicating that the surfaces should have the API buffer size. However, it picked the wrong value. This patch adds a new variable, which takes into account glBindBufferRange() values. This patch fixes the following CTS regressions: ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeOffset ES31-CTS.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-std430-vec-bindrangeSize Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2016-01-04 07:52:24 +01:00
Marek Olšák	86fa48426c	radeonsi: remove unused parameter from si_shader_binary_read_config Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	b6d95248f0	radeonsi: move si_shader_binary_upload out of si_shader_binary_read Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	7fa6bb47e3	gallium/radeon: dump LLVM module outside of radeon_llvm_compile Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	fb98acb5a1	gallium/radeon: always add +DumpCode to the LLVM target machine for LLVM <= 3.5 It's the same behavior that we use for later LLVM. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	cd7f252b11	gallium/radeon: r600_can_dump_shader should get TGSI processor type directly Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	fd7000bd78	radeonsi: pass TGSI processor type to si_shader_binary_read for dumping the parameter will be used later Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	3ce0a2fd7f	radeonsi: pass TGSI processor type to si_compile_llvm for dumping the parameter will be used later Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Marek Olšák	dd79034ca6	radeonsi: rename shader parameter definitions and variables for more clarity Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-03 22:41:16 +01:00
Ilia Mirkin	34217018c4	nvc0/ir: add support for PK2H/UP2H Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-03 16:20:52 -05:00
Ilia Mirkin	20dee333f3	st/mesa: use PK2H/UP2H when supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-03 16:20:47 -05:00
Ilia Mirkin	e9f43d6333	gallium: add PIPE_CAP_TGSI_PACK_HALF_FLOAT to indicate UP2H/PK2H support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-03 16:20:41 -05:00
Ilia Mirkin	459e4532af	tgsi: update PK2H/UP2H channel behavior info Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-03 16:20:27 -05:00
Ilia Mirkin	6eb74b87b8	gallium: document PK2H/UP2H Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-01-03 16:19:57 -05:00
Samuel Pitoiset	0ab2c21b93	st/mesa: fix parameter names for tesseval/tessctrl prototypes Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-03 22:01:18 +01:00
Ilia Mirkin	bf34748b39	nouveau: fix double-const qualifier Reported by Tom^ on IRC. The original intent was to mark the pointer constant as well as the data being pointed to, so move the *. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-03 11:32:15 -05:00
Rob Clark	3684e899ea	freedreno/ir3: use NIR_PASS helper macros Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-03 09:11:27 -05:00
Rob Clark	317628dbb3	nir: extract out helper macros for running passes Note these are a bit uglier, due to avoidance of GNU C extensions. But drivers which do not need to be built with compilers that don't support the extension can wrap these macros with their own. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-01-03 09:11:27 -05:00
Rob Clark	23bd6affb2	freedreno/ir3: we require block_index metadata Found during NIR_TEST_CLONE=1 piglit run. We were using block->index but forgetting to require it. Causing things to not work with a cloned shader which didn't preserve block_index. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-03 09:11:27 -05:00
Rob Clark	74135f804a	freedreno/ir3: refactor NIR IR handling Immediately convert into NIR and do an initial key-agnostic lowering/ optimization pass. This should let us share most of the per-variant transformations between each variant, and hopefully minimize the draw- time variant creation part of the compilation process. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-03 09:11:27 -05:00
Rob Clark	ab4efb19dc	freedreno/ir3: drop unnecessary unreachable() case It will still hit a compile_assert() in emit_tex, which has the advantage of dumping out the offending shader. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2016-01-03 09:11:27 -05:00
Samuel Pitoiset	6a49fcfb1f	gallium/tests: fix build with clang compiler Nested functions are supported as an extension in GNU C, but Clang doesn't support them. This fixes compilation errors when (manually) building compute.c, or when passing --enable-gallium-tests to the configure script. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75165 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2016-01-03 12:18:00 +01:00
Samuel Pitoiset	53dddab78c	nv50,nvc0: optimize coherent buffer checking at draw time Instead of iterating over all the buffer resources looking for coherent buffers, we keep track of a context-wide count. This will save some iterations (and CPU cycles) in the 99.99% case, because coherent buffers are rarely used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-01-03 12:17:05 +01:00
Kenneth Graunke	28dea26626	i965: Make TCS precompile use the TES primitive mode when available. If there's a linked TES program, we should just use the actual primitive mode. If not, just guess triangles (as we did before). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-02 18:46:16 -08:00
Kenneth Graunke	4a1c8a3037	i965: Push most TES inputs in SIMD8 mode. Using the push model for inputs is much more efficient than pulling inputs - the hardware can simply copy a large chunk into URB registers at thread creation time, rather than having the thread send messages to request data from the L3 cache. Unfortunately, it's possible to have more TES inputs than fit in registers, so we have to fall back to the pull model in some cases. However, it turns out that most tessellation evaluation shaders are fairly simple, and don't use many inputs. An arbitrary cut-off of 32 vec4 slots (16 registers) is more than sufficient to ensure that 100% of TES inputs are pushed for Shadow of Mordor, Unigine Heaven, GPUTest/TessMark, and SynMark. Note that unlike most SIMD8 stages, this actually reads packed vec4 data, since that is what our vec4 TCS programs write. Improves performance in GPUTest's tessmark_x64 microbenchmark by 93.4426% +/- 5.35541% (n = 25) on my Lenovo X250 at 1024x768. Improves performance in Synmark's Gl40TerrainFlyTess microbenchmark by 22.74% +/- 0.309394% (n = 5). Improves performance in Shadow of Mordor at low settings with tessellation enabled at 1280x720 by 2.12197% +/- 0.478553% (n = 4). shader-db statistics for files containing tessellation shaders: total instructions in shared programs: 184358 -> 181181 (-1.72%) instructions in affected programs: 27971 -> 24794 (-11.36%) helped: 226 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-02 18:46:16 -08:00
Kenneth Graunke	b022150d70	i965: Use LOAD_PAYLOAD for SIMD8 TES input loads, not MOV. We need a MOV to replicate g0.0<0,1,0> to all 8 channels. Since the message payload is a single register, MOV seemed more sensible than LOAD_PAYLOAD. However, MOV cannot be CSE'd, while LOAD_PAYLOAD can. All input loads can use the same header - we don't need to re-expand g0 every time. CSE accomplishes this, saving instructions. shader-db statistics for files containing tessellation shaders: total instructions in shared programs: 186923 -> 184358 (-1.37%) instructions in affected programs: 30536 -> 27971 (-8.40%) helped: 226 HURT: 0 total cycles in shared programs: 1009850 -> 1005356 (-0.45%) cycles in affected programs: 168206 -> 163712 (-2.67%) helped: 226 HURT: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-02 18:46:16 -08:00
Kenneth Graunke	53a9b6223f	i965: Move 3-src subnr swizzle handling into the vec4 backend. While most align16 instructions only support a SubRegNum of 0 or 4 (using swizzling to control the other channels), 3-src instructions actually support arbitrary SubRegNums. When the RepCtrl bit is set, we believe it ignores the swizzle and uses the equivalent of a <0,1,0> region from the subnr. In the past, we adopted a vec4-centric approach of specifying subnr of 0 or 4 and a swizzle, then having brw_eu_emit.c convert that to a proper SubRegNum. This isn't a great fit for the scalar backend, where we don't set swizzles at all, and happily set subnrs in the range [0, 7]. This patch changes brw_eu_emit.c to use subnr and swizzle directly, relying on the higher levels to set them sensibly. This should fix problems where scalar sources get copy propagated into 3-src instructions in the FS backend. I've only observed this with TES push model inputs, but I suppose it could happen in other cases. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-02 18:46:16 -08:00
Eric Anholt	64253fdb2e	vc4: Fix build from upload changes.	2016-01-02 17:33:19 -08:00
Nicolai Hähnle	8f384d07a8	gallium/radeon: send LLVM diagnostics as debug messages Diagnostics sent during code generation and the error message reported by LLVMTargetMachineEmitToMemoryBuffer are disjoint reporting mechanisms. We take care of both and also send an explicit message indicating failure at the end, so that log parsers can more easily tell the boundary between shader compiles. Removed an fprintf that could never be triggered. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:24 -05:00
Nicolai Hähnle	255ccd1e99	gallium/radeon: pass pipe_debug_callback into radeon_llvm_compile (v2) This will allow us to send shader debug info via the context's debug callback. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:24 -05:00
Nicolai Hähnle	f8cd11403a	radeonsi: send shader info as debug messages in addition to stderr output The output via stderr is very helpful for ad-hoc debugging tasks, so that remains unchanged, but having the information available via debug messages as well will allow the use of parallel shader-db runs. Shader stats are always provided (if the context is a debug context, that is), but you still have to enable the appropriate R600_DEBUG flags to get disassembly (since it is rather spammy and is only generated by LLVM when we explicitly ask for it). Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:24 -05:00
Nicolai Hähnle	4bb1c8dfec	radeonsi: pass pipe_debug_callback down into si_shader_binary_read (v2) This will allow us to send shader debug info. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:23 -05:00
Nicolai Hähnle	b6847062dd	gallium/radeon: implement set_debug_callback Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-02 16:47:23 -05:00
Marek Olšák	ecb2da1559	u_upload_mgr: allow specifying PIPE_USAGE_* for the upload buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:45 +01:00
Marek Olšák	37d0aea772	u_upload_mgr: remove alignment parameter from u_upload_create Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:45 +01:00
Marek Olšák	1bb79c3a7b	u_upload_mgr: pass alignment to u_upload_buffer manually Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Marek Olšák	e0f932846c	u_upload_mgr: pass alignment to u_upload_data manually Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Marek Olšák	020009f7cc	u_upload_mgr: pass alignment to u_upload_alloc manually The fixed alignment of u_upload_mgr will go away. This is the first step. The motivation is that one u_upload_mgr can have multiple users, each allocating from the same buffer, but requiring a different alignment. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Marek Olšák	ffc4716e97	u_upload_mgr: rework the application of alignment The function only aligned the size, but not the offset. The offset was aligned only when the previous suballocation was aligned. That yielded the correct offset alignment if the alignment was constant for all suballocations. Instead, directly align the offset, but allow an unaligned size. There is no change in behavior, because the alignment is constant at the moment. This is a prerequisite for allowing a variable alignment for suballocations. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-02 15:15:44 +01:00
Marek Olšák	36c93a6fae	st/mesa: fix GLSL uniform updates for glBitmap & glDrawPixels (v2) Spotted by luck. The GLSL uniform storage is only associated once in LinkShader and can't be reallocated afterwards, because that would break the association. v2: don't remove st_upload_constants calls, clarify why they're needed Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>	2016-01-02 15:15:44 +01:00
Marek Olšák	294ed5cd13	program: add _mesa_reserve_parameter_storage The next commit will use this. Reviewed-by: Brian Paul <brianp@vmware.com> Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>	2016-01-02 15:15:44 +01:00
Jordan Justen	a2942d8f26	mesa: Fix warning with MESA_VERBOSE=api for BindBufferRange Reported-by: Dieter Nützel <Dieter@nuetzel-hh.de> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2016-01-01 17:27:14 -08:00
Ilia Mirkin	c1d14c6817	nv50,nvc0: make sure there's pushbuf space and that we ref the bo early First off, we can't flush in the middle of a command. Secondly requesting the extra push space might cause a flush to happen. If that flush happens, we'd have to do the PUSH_REFN again. So instead do PUSH_REFN after the push space request. This helps avoid rare crashes with supertuxkart in libdrm due to assertion failures. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2016-01-01 19:52:41 -05:00
Ilia Mirkin	33a415310b	st/mesa: sort extensions enablement array Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-01-01 19:50:02 -05:00
Rob Clark	816ddee6b8	nir/lower_clip: add missing writemask on store Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-01-01 15:32:46 -05:00
Jordan Justen	3dce7bf268	mesa: Add MESA_VERBOSE=api for GL_ARB_program_interface_query v2: * Add braces '{}' when the _mesa_debug call spans multiple lines (Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-01-01 12:00:51 -08:00
Jordan Justen	36db91c4c4	mesa: Add MESA_VERBOSE=api for several indexed BindBuffer variants v2: * Add braces '{}' when the _mesa_debug call spans multiple lines (Ken) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2016-01-01 12:00:51 -08:00
Dave Airlie	b835255992	st/glsl_to_tgsi: fix block movs for doubles While playing with fp64, I disable varying packing to debug something else, and noticed we never emitted half the output movs for double matrix arrays. We should be moving the left index two slots for dual source doubles, and the right index two slots for non-vs input doubles. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:54 +10:00
Dave Airlie	d214ce86cf	st/glsl_to_tgsi: handle different attrib size vertex inputs are counted differently in some cases, with vertex inputs we need to make sure we don't double count them. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:54 +10:00
Dave Airlie	dc7b33c1f3	st/glsl_to_tgsi: readd the double_reg2 for input index mapping Otherwise we end up emitting the wrong index for the second double. This fixes dmat-vs-gs-tcs-tes.shader_test and dvec3-vs-gs-tcs-tes.shader_test Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:54 +10:00
Dave Airlie	84dbf3c4ff	st/glsl_to_tgsi: when doing reladdr get vec4 of correct type This fixes fp64 relative addressing, in the upcoming dmat-vs-gs-tcs-tes.shader_test. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	d87894b98f	st/glsl_to_tgsi: handle double immediates in matrices properly. This handles matrix initialisation properly. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	7351c7684f	st/glsl_to_tgsi: setup writemask for double arrays and matrices. It's important for the double instruction emission code that the writemasks are correct going in for double so it knows which channels to replicate. This fixes it for the array and matrix cases. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	14506dcae2	st/glsl_to_tgsi: handle doubles in array shrinking code. This code takes into account double inputs in the array shrinking code. This fixes some issues with doubles and geom/tess inputs. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	aab0c6c9c4	st/glsl_to_tgsi: handle doubles outputs in arrays. This handles the case where a double output is stored in an array, and tracks it for use in the double instruction emit code. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Dave Airlie	fc890d703e	st/glsl_to_tgsi: store if dst is double in array This is just a precursor patch to a fix for doubles with tessellation that I've written. We need to descend into output arrays in that case and mark dst's as double. Signed-off-by: Dave Airlie <airlied@redhat.com>	2016-01-01 09:43:53 +10:00
Kenneth Graunke	65d3f85eb3	nvc0: Set winding order regardless of domain. Quads need to respect winding order, too - not just triangles. Fixes rendering in GFXBench 4.0's tessellation benchmark. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-30 16:04:12 -08:00
Kenneth Graunke	7cdc2b9ca0	glsl: Fix varying struct locations when varying packing is disabled. varying_matches::record tries to compute the number of components in each varying, which varying_matches::assign_locations uses to assign locations. With varying packing, it uses glsl_type::component_slots() to come up with a reasonable value. Without varying packing, it fell back to an open-coded computation that didn't bother to handle structs at all. I believe we can simply use 4 * glsl_type::count_attribute_slots(false), which already handles these cases correctly. Partially fixes rendering in GFXBench 4.0's tessellation benchmark. (NVE0 is almost right after this, but i965 is still mostly garbage.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-30 16:04:12 -08:00
Kenneth Graunke	4acf71c89b	drirc: Disable ARB_blend_func_extended for Heaven 4.0/Valley 1.0. Unigine Heaven 4.0 and Valley 1.0 use dual color blending but don't specify which fragment shader output is which, so there's at best a 50/50 chance of us guessing it correctly. This is invalid. Unigine fixed this in 4.1 and 1.1 versions over a year and a half ago, but hasn't actually released them for whatever reason. So, add the workaround back so that it works for most people. Fixes Heaven 4.0/Valley 1.0 rendering on Ivybridge. For whatever reason, Broadwell worked. 4.1 and 1.1 have always worked. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92233 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: mesa-stable@lists.freedesktop.org	2015-12-30 16:04:12 -08:00
Ilia Mirkin	5ac15f788b	glsl: add GL_ARB_shader_draw_parameters define Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-30 18:59:18 -05:00
Ilia Mirkin	517a93b346	nvc0: add ARB_shader_draw_parameters support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-30 16:55:57 -05:00
Ilia Mirkin	89bda9772d	st/mesa: add GL_ARB_shader_draw_parameters support Hooks up the new system values, passes the drawid in. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-12-30 16:55:56 -05:00
Ilia Mirkin	daaf0bdf46	gallium: add a drawid to pipe_draw_info This will allow the state tracker to inform the driver where in a broken-up multidraw we currently are. This can then be passed into the vertex shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-12-30 16:55:56 -05:00
Ilia Mirkin	87b4e4e29f	gallium: add PIPE_CAP_DRAW_PARAMETERS This allows the state tracker to know that the various draw parameters are available in vertex shaders. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-12-30 16:55:56 -05:00
Ilia Mirkin	bb52ea45cc	gallium: add baseinstance/drawid semantics Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-12-30 16:55:56 -05:00
Ilia Mirkin	d50e6128b8	nv50/ir: attempt to do more constant folding on mad -> add conversion The add might actually have a 0 as an argument, which would convert it into a mov. Make sure to detect that. Also avoid the hack of putting the immediate directly into the instruction, instead use a mov to put it into place and let the later LoadPropagation pass place it if possible. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-30 12:29:07 -05:00
Marta Lofstedt	97685ff10e	i965/gen8: Always use BRW_REGISTER_TYPE_UW for MUL on GEN8+ The imulExtended tests of the shader bitfield tests of the OpenGL ES 3.1 CTS fail on gen8+ when BRW_REGISTER_TYPE_W is used for SHADER_OPCODE_MULH. Also, remove unused helper function: static inline bool type_is_signed(unsigned type) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595 Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-30 09:29:14 +01:00
Timothy Arceri	0d4cd045c8	glsl: tidy up struct with a single member There used to be more members but they now share other fields in order to keep memory use low. Also making the naming more generic will allow us to reuse the field for explicit byte offsets within blocks for ARB_enhanced_layouts. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-30 11:52:05 +11:00
Emil Velikov	2c1a215409	glsl/linker: annotate static functions as such Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-30 11:51:58 +11:00
Emil Velikov	c704b89fe4	glsl: annotate ast_process_struct_or_iface_block_members() as static Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-30 11:51:51 +11:00
Jason Ekstrand	0119773ffc	nir/builder: Add an init function that creates a simple shader for you A hugely common case when using nir_builder is to have a shader with a single function called main. This adds a helper that gives you just that. This commit also makes us use it in the NIR control-flow unit tests as well as tgsi_to_nir and prog_to_nir. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-29 13:44:05 -08:00
Kristian Høgsberg Kristensen	55ca5b0e74	mesa/st: Pad out _mesa_sysval_to_semantic for new SYSTEM_VALUE_* enums GL_ARB_shader_draw_parameters added two new system values. This gets us back to mapping mesa system values to the right TGSI semantics. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-29 12:15:01 -08:00
Ilia Mirkin	724134f683	nv50/ir: float(s32 & 0xff) = float(u8), not s8 Make sure to make conversion unsigned when we're ANDing the high bits away. Fixes corruption in dolphin. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-29 15:08:20 -05:00
Kristian Høgsberg Kristensen	581f81860e	i965: Reemit vertex state between indirect multi draws If we're doing an indirect draw, prims[i].basevertex is always 0 and the real base vertex value is in the indirect parameter buffer. We try to avoid flagging BRW_NEW_VERTICES if prims[i].basevertex doesn't change, which then breaks down for indirect draws. Thus, if a program uses base vertex or base instance, and the draw call is indirect, always flag BRW_NEW_VERTICES. A new piglit test, spec/ARB_shader_draw_parameters/drawid-indirect-vertexid tests this. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen	f9283f2668	nir: Teach nir_opt_algebraic about adding and subtracting the same thing This optimizes a + b - b to just a. Modest shader-db results (BDW): total instructions in shared programs: 7842452 -> 7841862 (-0.01%) instructions in affected programs: 61938 -> 61348 (-0.95%) total loops in shared programs: 2131 -> 2131 (0.00%) helped: 263 HURT: 0 GAINED: 0 LOST: 0 but the optimization turns gl_VertexID - gl_BaseVertexARB into just a reference to SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, which the i965 hardware supports natively. That means we can avoid using the internal vertex buffer for gl_BaseVertexARB in this case. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen	cddfc2cefa	i965: Add support for gl_DrawIDARB and enable extension We have to break open a new vec4 for gl_DrawIDARB. We've used up all space in the vec4 we use for SGVS and gl_DrawIDARB has to come from its own separate vertex buffer anyway. This is because we point the vb for base vertex and base instance into the draw parameter BO for indirect draw calls, but the draw id is generated by mesa in a different buffer. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen	17ebb55a14	i965: Add support for gl_BaseVertexARB and gl_BaseInstanceARB We already have gl_BaseVertexARB in the .x component of the SGVS vec4 and plug gl_BaseInstanceARB into the last free component (.y). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen	b70616f3e7	i965: Assert that SYSTEM_VALUE_VERTEX_ID gets lowered fs_visitor::emit_vs_system_value() looks like it's trying to handle SYSTEM_VALUE_VERTEX_ID, but we should never see that value in the backend. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen	1a59aeaebd	mesa: Add core mesa support for GL_ARB_shader_draw_parameters Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-12-29 10:39:25 -08:00
Kristian Høgsberg Kristensen	42dd2c028d	mesa/vbo: Add draw_id field to struct _mesa_prim The drivers will need this for passing in gl_DrawIDARB. For indirect multidraw calls, we get the prim array and prim[i].draw_id == i and is redundant. But for non-indirect calls, we get one primitive at a time and need the draw_id field. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-29 10:39:25 -08:00
Aaron Watry	70d8dbc9a1	nir: Remove function overload in control flow test Fixes make check. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-29 09:42:14 -08:00
Nicolai Hähnle	7b8db37abb	radeonsi: add RADEON_REPLACE_SHADERS debug option This option allows replacing a single shader by a pre-compiled ELF object as generated by LLVM's llc, for example. This can be useful for debugging a deterministically occurring error in shaders (and has in fact helped find the causes of https://bugs.freedesktop.org/show_bug.cgi?id=93264). v2: drop the debug flag, use DEBUG_GET_ONCE_OPTION instead Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-29 09:07:04 -05:00
Nicolai Hähnle	7d1fc2cf51	radeonsi: count compilations in si_compile_llvm This changes the count slightly (because of si_generate_gs_copy_shader), but this is only relevant for the driver-specific num-compilations query. It sets the stage for the next commit. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-29 09:07:01 -05:00
Nicolai Hähnle	4711170239	gallium/util: add DEBUG_GET_ONCE_OPTION This is analogous to the already existing macros for BOOL, NUM, and FLAGS. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-29 09:06:57 -05:00
Grazvydas Ignotas	da0e216e06	r600: fix constant buffer size programming When buffer size is less than 16, zero ends up being programmed as size, which prevents the hardware from fetching the correct values. Fix it by combining shift and align so that the value is always rounded up. Cc: "11.1 11.0 10.6" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92229 Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-12-29 09:05:55 -05:00
Kenneth Graunke	dfce9759ab	docs: Mark ARB_tessellation_shader as done on all i965 platforms. We now support all Intel GPUs which can do tessellation. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:17:08 -08:00
Kenneth Graunke	381a89cf2a	i965: Enable ARB_tessellation_shader on Gen7-7.5. We've resolved all the GPU hangs, and everything seems to be working. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:17:05 -08:00
Kenneth Graunke	bd8ab8dedb	i965: Don't set interleave or complete on TCS EOT message. Setting interleave on the TCS EOT message causes Ivybridge hardware to GPU hang like crazy. Individual tests would pass, but running even a simple test like nop.shader_test in a loop would hang within 1-3 runs. Adding sleep delays worked around the problem, somehow. Interleave doesn't make much sense given that we only have one patch URB handle, not two. Complete doesn't seem useful either. There's no reason to actually set those bits. We were just being lazy. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:17:03 -08:00
Kenneth Graunke	b7793783b3	i965: Release input URB Handles on Gen7/7.5 when TCS threads finish. Pre-Broadwell hardware requires us to manually release the ICP Handles by issuing URB read messages with the "Complete" bit set. We can do this in pairs to use fewer URB read messages. Based heavily on work from Chris Forbes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:17:00 -08:00
Kenneth Graunke	6ceabb72ea	i965: Use proper TCS barrier ID bits for Ivybridge/Baytrail. Gen7 uses bits 15:12 while Gen7.5+ uses bits 16:13. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:16:57 -08:00
Kenneth Graunke	5898cbae24	i965: Use proper TCS Instance ID bits for Ivybridge/Baytrail. Gen7 uses 22:16 while Gen7.5+ uses 23:17. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:16:54 -08:00
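The bit ranges quoted in `6ceabb72ea` and `5898cbae24` amount to a generation-dependent field extraction. A minimal sketch follows, assuming the fields live in a single 32-bit payload dword and that the shifted ranges (16:13 and 23:17) apply to Haswell and later in both cases. The function names and the boolean parameter are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract the inclusive bit range [high:low] from a 32-bit dword. */
static inline uint32_t extract_bits(uint32_t dword, unsigned high, unsigned low)
{
    assert(low <= high && high < 32);
    unsigned width = high - low + 1;
    uint32_t mask = (width == 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
    return (dword >> low) & mask;
}

/* Barrier ID: bits 15:12 on Gen7, bits 16:13 on later hardware (assumed Gen7.5+). */
static inline uint32_t tcs_barrier_id(uint32_t dword, bool gen75_or_later)
{
    return gen75_or_later ? extract_bits(dword, 16, 13)
                          : extract_bits(dword, 15, 12);
}

/* Instance ID: bits 22:16 on Gen7, bits 23:17 on Gen7.5+. */
static inline uint32_t tcs_instance_id(uint32_t dword, bool gen75_or_later)
{
    return gen75_or_later ? extract_bits(dword, 23, 17)
                          : extract_bits(dword, 22, 16);
}
```

Both fields move up by one bit on the newer layout, so reading them with the wrong range returns a value that is off by a factor of two or picks up a neighbouring bit.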
Kenneth Graunke	1245724f72	i965: Port tessellation evaluation shaders to vec4 mode. This can be used on Broadwell by setting INTEL_SCALAR_TES=0. More importantly, it will be used for Ivybridge and Haswell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:16:48 -08:00
Kenneth Graunke	889d987904	i965: Emit a real 3DSTATE_DS on Gen7. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:16:45 -08:00
Kenneth Graunke	2c240b05e9	i965: Emit a real 3DSTATE_HS on Gen7. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:16:34 -08:00
Kenneth Graunke	74b83fe368	i965: Add the TCS/TES state upload atoms to the gen7_atoms list. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-28 13:16:19 -08:00
Jason Ekstrand	237f2f2d8b	nir: Get rid of function overloads When Connor originally drafted NIR, he copied the same function+overload system that GLSL IR had with a few names changed. However, this double-indirection is not really needed and has only served to confuse people. Instead, let's just have functions which may not have unique names and may or may not have an implementation. If someone wants to do overload resolving, they can have a hash table based function+overload system in the overload resolving pass. There's no good reason to keep it in core NIR. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> ir3 bits are Reviewed-by: Rob Clark <robclark@gmail.com>	2015-12-28 09:59:53 -08:00
Ilia Mirkin	109c348284	nvc0: don't forget to reset VTX_TMP bufctx slot after blit completion Also release the scratch allocation if any. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-27 21:33:36 -05:00
Ilia Mirkin	28e07fdd4a	nv50,nvc0: add a note when converting vertex elements using CPU Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-27 19:49:44 -05:00
Connor Abbott	41c7912d04	gallium/auxiliary: don't build NIR sources with MSVC2008 flags NIR has never been built with MSVC2008, so we shouldn't add MSVC2008_COMPAT_CFLAGS to anything that uses it. This allows us to get rid of the pragma in tgsi_to_nir.c. Build tested with freedreno. v2: Use MSVC2013_COMPAT_CFLAGS instead. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-12-23 20:46:48 -05:00
Anuj Phogat	52865efc41	i965: Add tr_mode and mip tail information in surface state dump Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-12-23 13:20:45 -08:00
Jordan Justen	8326eb13f2	i965/gen8/cs: Gen8 requires 64 byte alignment for push constant data The BDW PRM Vol2a: Command Reference: Instructions, section MEDIA_CURBE_LOAD, says that 'CURBE Total Data Length' and 'CURBE Data Start Address' are 64-byte aligned. This is different from previous gens, that were 32-byte aligned. v2 (Jordan): - CURBE Data Start Address is also 64-byte aligned. - The call to brw_state_batch should also use 64-byte alignment. - Improve PRM reference. v3: * New patch from Jordan. Always align base and size to 64 bytes. Fixes the following SSBO CTS tests on BDW: ES31-CTS.shader_storage_buffer_object.basic-atomic-case1-cs ES31-CTS.shader_storage_buffer_object.basic-operations-case1-cs ES31-CTS.shader_storage_buffer_object.basic-operations-case2-cs ES31-CTS.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs ES31-CTS.shader_storage_buffer_object.advanced-write-fragment-cs ES31-CTS.shader_storage_buffer_object.advanced-indirectAddressing-case2-cs ES31-CTS.shader_storage_buffer_object.advanced-matrix-cs And many other CS CTS tests as reported by Marta Lofstedt. (Commit message is from Iago, but in v3, code is from Jordan.) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-22 23:54:02 -08:00
Rob Clark	843cec6d3a	freedreno/ir3: spelling.. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-23 00:28:24 -05:00
Rob Clark	dc21747838	nir/print: print variable constant-initializers Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-23 00:28:01 -05:00
Kenneth Graunke	6524897606	docs: Clarify that ARB_tessellation_shader is only done on i965/gen8+. Requested by kisak on IRC.	2015-12-22 20:14:35 -08:00
Kenneth Graunke	209d130dd1	docs: Mark ARB_tessellation_shader as done on i965/gen8+.	2015-12-22 18:50:38 -08:00
Kenneth Graunke	7738f3a988	i965: Enable ARB_tessellation_shader on Gen8+. Everything is in place and I'm not aware of any further issues. Tested with: - Piglit - Tessmark - Unigine Heaven - Shadow of Mordor - GRID Autosport I have patches to backport this to Haswell, Ivybridge, and Baytrail as well (the first Intel hardware to support tessellation), but there are still a lot of GPU hangs left to debug. So that will come later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:14 -08:00
Kenneth Graunke	794eb9d727	i965: Handle mix-and-match TCS/TES with separate shader objects. GL_ARB_separate_shader_objects allows the application to mix-and-match TCS and TES programs separately. This means that the interface between the two stages isn't known until the final SSO pipeline is in place. This isn't a great match for our hardware: the TCS and TES have to agree on the Patch URB entry layout. Since we store data as per-patch slots followed by per-vertex slots, changing the number of per-patch slots can significantly alter the layout. This can easily happen with SSO. To handle this, we store the [Patch]OutputsWritten and [Patch]InputsRead bitfields in the TCS/TES program keys, introducing program recompiles. brw_upload_programs() decides the layout for both TCS and TES, and passes it to brw_upload_tcs/tes(), which store it in the key. When creating the NIR for a shader specialization, we override nir->info.inputs_read (and friends) to the program key's values. Since everything uses those, no further compiler changes are needed. This also replaces the hack in brw_create_nir(). To avoid recompiles, brw_precompile_tes() looks to see if there's a TCS in the linked shader. If so, it accounts for the TCS outputs, just as brw_upload_programs() would. This eliminates all recompiles in the non-SSO case. In the SSO case, there should only be recompiles when using a TCS and TES that have different input/output interfaces. Fixes Piglit's mix-and-match-tcs-tes test. v2: Pull the brw_upload_programs code into a brw_upload_tess_programs() helper function (requested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:11 -08:00
Kenneth Graunke	01b1b44d31	i965: Defer input lowering for tessellation stages until specialization. With tessellation shaders and SSO, we won't be able to always decide on VUE map layouts at LinkProgram time. Unfortunately, we have to delay it until shader specialization time. However, uniform lowering cannot be deferred - brw_codegen_*_prog() reads nir->num_uniforms. Fortunately, we don't need to defer it - uniform, system value, atomic, and sampler lowering can safely stay where it is. This patch moves those to brw_lower_nir()'s only caller, renames brw_lower_nir() to brw_nir_lower_io(), and introduces calls to that. For non-tessellation stages, I chose to call brw_nir_lower_io() from brw_create_nir(), so it's still done at the same time. There's no need to defer it, and doing it at LinkProgram time is nice. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:10 -08:00
Kenneth Graunke	8bc073d601	i965: Automatically create a passthrough TCS when needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:09 -08:00
Kenneth Graunke	4ec3f0f4b9	i965: Start program_string_id from 1, not 0. This way, I can safely use brw_tcs_prog_key::program_string_id == 0 to mean "not filled out because no program exists", which avoids the need for adding an extra boolean to that struct. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:08 -08:00
Kenneth Graunke	2432643e89	i965: Create and set a new brw_tcs_prog_data::outputs_written field. When the application hasn't supplied a TCS, and we have to create one, we need to know what VS outputs to copy to TES inputs. To do this, we create a new program key field, and set it to the TES InputsRead bitfield. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:06 -08:00
Kenneth Graunke	239a4bdcd4	i965: Upload HS push constants whenever default tess. levels change. When using tessellation on OpenGL without a TCS, default values for gl_TessLevelOuter/gl_TessLevelInner are provided via the API. Core Mesa will flag ctx->DriverFlags.NewDefaultTessLevels whenever those values change. We add a corresponding BRW_NEW_DEFAULT_TESS_LEVELS flag and hook it up to HS push constants (which will be used to upload these default values to the autogenerated TCS). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:05 -08:00
Kenneth Graunke	0d5cb4aef4	i965: Only call _mesa_load_state_parameters if prog exists. With the automatic-TCS creation, we won't have a prog, but still need to upload push constants. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:04 -08:00
Kenneth Graunke	a122af696c	i965: Switch TCS gl_program/gl_shader_program checks over to TES. Tessellation control shaders are optional, but evaluation shaders will always be present when using tessellation. However, we'll always enable the TCS (HS) hardware stage when using tessellation - we'll just create a program on the fly. That program, however, won't have a gl_program or gl_shader_program. So we shouldn't check brw->tess_ctrl_program or shader_prog->_LinkedShaders[MESA_SHADER_TESS_CTRL] - if we want to know whether tessellation is enabled, we should look for a TES. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:03 -08:00
Kenneth Graunke	9d35fecfb9	i965: Remove unnecessary brw->tess_ctrl_program assertions. This is trying to enforce the fact that the hardware requires HS, TE, and DS to be enabled or disabled together. But it's kind of an ad-hoc attempt, and not too useful. More importantly, we aren't going to have a gl_shader_program for the TCS which is automatically generated when none is present. (We'll just handle it in the driver backend.) So, these will trip for no reason. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:02 -08:00
Kenneth Graunke	f46dbfaed9	i965: Consolidate BRW_NEW_TESS_{CTRL,EVAL}_PROGRAM flags. For several reasons, I don't think it's particularly useful to have separate flags: 1. Most of the time, tessellation shaders are paired, so both will be replaced at the same time. 2. The data layout is tightly coupled. Both need to agree on the number of per-patch slots in the VUE map. Even adding extra TCS outputs that aren't read by the TES will trigger the need for recompiles. 3. The TCS is optional from an API perspective, but required by the hardware whenever tessellation is enabled. So, atoms that deal with the TCS must check brw->tess_eval_program (BRW_NEW_TESS_EVAL_PROGRAM?) rather than brw->tess_ctrl_program to tell whether tessellation is enabled. So, not only is it unlikely to be useful, it's a bit confusing to get right. Simply using one flag for both simplifies this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:22:00 -08:00
Kenneth Graunke	8498cb4a45	i965: Only call brw_upload_tcs/tes_prog when using tessellation. If there's no evaluation shader, tessellation is disabled. The upload functions would just bail. Instead, don't bother calling them. This will simplify the optional-TCS case a bit, as brw_upload_tcs can assume that we're doing tessellation. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:21:59 -08:00
Kenneth Graunke	2bcf989407	nir: Add a glsl_vec_type() helper. I need access to glsl_type::vec2_type from C. Wrapping vec() also gives us access to vec3 if we need it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 17:21:47 -08:00
Kenneth Graunke	0daf51e130	nir: Use writemasked store_vars in glsl_to_nir. Instead of performing the read-modify-write cycle in glsl->nir, we can simply emit a partial writemask. For locals, nir_lower_vars_to_ssa will do the equivalent read-modify-write cycle for us, so we continue to get the same SSA values we had before. Because glsl_to_nir calls nir_lower_outputs_to_temporaries, all outputs are shadowed with temporary values, and written out as whole vectors at the end of the shader. So, most consumers will still not see partial writemasks. However, nir_lower_outputs_to_temporaries bails for tessellation control shader outputs. So those remain actual variables, and stores to those variables now get a writemask. nir_lower_io passes that through. This means that TCS outputs should actually work now. This is a functional change for tessellation control shaders. v2: Relax the nir_validate assert to allow partial writemasks. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-22 15:57:59 -08:00
Kenneth Graunke	7d539080c1	nir: Add a writemask to store intrinsics. Tessellation control shaders need to be careful when writing outputs. Because multiple threads can concurrently write the same output variables, we need to only write the exact components we were told. Traditionally, for sub-vector writes, we've read the whole vector, updated the temporary, and written the whole vector back. This breaks down with concurrent access. This patch prepares the way for a solution by adding a writemask field to store_var intrinsics, as well as the other store intrinsics. It then updates all producers to emit a writemask of "all channels enabled". It updates nir_lower_io to copy the writemask to output store intrinsics. Finally, it updates nir_lower_vars_to_ssa to handle partial writemasks by doing a read-modify-write cycle (which is safe, because local variables are specific to a single thread). This should have no functional change, since no one actually emits partial writemasks yet. v2: Make nir_validate momentarily assert that writemasks cover the complete value - we shouldn't have partial writemasks yet (requested by Jason Ekstrand). v3: Fix accidental SSBO change that arose from merge conflicts. v4: Don't try to handle writemasks in ir3_compiler_nir - my code for indirects was likely wrong, and TTN doesn't generate partial writemasks today anyway. Change them to asserts as requested by Rob Clark. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v3]	2015-12-22 15:57:59 -08:00
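The two store strategies contrasted in `7d539080c1` can be illustrated on a plain four-component vector. This is a sketch under simplifying assumptions (float components, hypothetical function names), not NIR code. Both helpers produce the same result when only one thread touches the destination. Only the masked store stays correct when another thread may write the other components between the read and the write-back.

```c
#include <assert.h>

#define NUM_COMPONENTS 4

/* Masked store: touches only the components enabled in the writemask. */
static void store_masked(float dst[NUM_COMPONENTS],
                         const float src[NUM_COMPONENTS],
                         unsigned writemask)
{
    assert((writemask & ~0xfu) == 0);
    for (unsigned c = 0; c < NUM_COMPONENTS; c++) {
        if (writemask & (1u << c))
            dst[c] = src[c];
    }
}

/* Read-modify-write: reads the whole vector, merges, writes all four
 * components back. Safe for thread-local variables only, because the
 * write-back also overwrites components the caller did not ask to change. */
static void store_read_modify_write(float dst[NUM_COMPONENTS],
                                    const float src[NUM_COMPONENTS],
                                    unsigned writemask)
{
    float merged[NUM_COMPONENTS];

    assert((writemask & ~0xfu) == 0);
    for (unsigned c = 0; c < NUM_COMPONENTS; c++)
        merged[c] = (writemask & (1u << c)) ? src[c] : dst[c];
    for (unsigned c = 0; c < NUM_COMPONENTS; c++)
        dst[c] = merged[c];
}
```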
Tapani Pälli	50fc4a9256	mesa: update gl_HelperInvocation support status in docs Was enabled for i965 and nvc0 by following commits: `c875e3cdd2` `39f51ec96f` Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2015-12-22 15:14:02 +02:00
Tapani Pälli	f2be5b8ba4	mesa: fix interface matching done in validate_io Patch makes the following changes for interface matching: - do not try to match builtin variables - handle swizzle in input name, for example 'a.z' should match with 'a' - add matching by location - check that the number of inputs and outputs matches These changes make the interface matching tests work in: ES31-CTS.sepshaderobjs.StateInteraction The test still does not pass completely due to errors in rendering output. IMO this is unrelated to interface matching. Note that type matching is not done because varying packing changes the type of a variable; this can be added later on, preferably when we have a quicker way to iterate resources and have a complete list of all existing varyings (before packing) available. v2: add spec reference, return true on desktop since we do not have failing cases for it, the number of inputs and outputs does not need to match on desktop. v3: add some more spec reference, remove desktop specifics since not used for now on desktop, add match by location qualifier, rename input_stage and output_stage as producer and consumer as suggested by Timothy. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-22 14:50:25 +02:00
Iago Toral Quiroga	5f8bb6fbb1	mesa: add SSBOs to the list of fragment shader side effects The i965 driver uses this function to decide if it can disable the FS unit in the absence of color/depth writes. We don't want to disable the unit in the presence of SSBOs, since the fragment shader could be writing to it. We could go a step further and check not just for the presence of SSBOs but also if the shader code writes to them. Does not look worth the trouble though and we are not doing this for atomic buffers either anyway. v2: put this into a generic _mesa_active_fragment_shader_has_side_effects function instead of having one specific for SSBOs (Jason). Fixes the following CTS test: ES31-CTS.shader_storage_buffer_object.advanced-usage-sync-vsfs Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-12-22 12:38:48 +01:00
Iago Toral Quiroga	9bbdd0eda4	i965: Ensure FS execution in presence of atomic buffers On Haswell we need to set the UAV_ONLY WM state bit when there are no colour or depth buffer writes and on all hardware we should set the early depth/stencil control field to PSEXEC unless early fragment tests are enabled to make sure that the fragment shader is executed regardless of whether per-fragment tests pass or not as the spec requires. So far we have been doing this for images only, but we should apply the same treatment to all side effectful scenarios. Suggested by Curro. This is not strictly required for compliance with the original ARB_shader_atomic_counters extension, it's only necessary to get the execution semantics specified in GL4.2+ right. v2: - Mark active_fs_has_side_effects as constant. (Curro) - Mention that this is only necessary to get the execution semantics specified in GL4.2+ right. (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-12-22 12:38:48 +01:00
Iago Toral Quiroga	1a95b87dad	mesa: Add a _mesa_active_fragment_shader_has_side_effects helper Some drivers can disable the FS unit if there is nothing in the shader code that writes to an output (i.e. color, depth, etc). Right now, mesa has a function to check for atomic buffers and the i965 driver also checks for images. Refactor this logic into a generic function that we can use for any source of side effects in a fragment shader. Suggested by Jason. v2: - Use '_Shader', as suggested by Tapani, to fix the following CTS test: ES31-CTS.shader_atomic_counters.advanced-usage-many-draw-calls2 Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-12-22 12:38:48 +01:00
Kenneth Graunke	57f7c85dcf	i965: Implement gl_PatchVerticesIn by baking it into brw_tcs_prog_key. The hardware provides us no decent way of getting at the number of input vertices in the patch topology from the tessellation control shader. It's actually very surprising - normally this sort of information would be available in the thread payload. For the precompile, we guess that the number of vertices will be the same for both the input and output patches. This usually seems to be the case. On Gen8+, we could pass in an extra push constant containing this value. We may be able to do that on Haswell too. It's quite a bit trickier on Ivybridge, however. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 02:12:05 -08:00
Kenneth Graunke	24be658d13	i965: Add tessellation control shaders. The TCS is the first tessellation shader stage, and the most complicated. It has access to each of the control points in the input patch, and computes a new output patch. There is one logical invocation per output control point; all invocations run in parallel, and can communicate by reading and writing output variables. One of the main responsibilities of the TCS is to write the special gl_TessLevelOuter[] and gl_TessLevelInner[] output variables which control how much new geometry the hardware tessellation engine will produce. Otherwise, it simply writes outputs that are passed along to the TES. We run in SIMD4x2 mode, handling two logical invocations per EU thread. The hardware doesn't properly manage the dispatch mask for us; it always initializes it to 0xFF. We wrap the whole program in an IF..ENDIF block to handle an odd number of invocations, essentially falling back to SIMD4x1 on the last thread. v2: Update comments (requested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 02:12:05 -08:00
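The SIMD4x2 dispatch arithmetic described in `24be658d13` can be sketched as follows. This only models the counting, under the assumption that invocations are packed two per thread in order. The helper names are hypothetical, and the real driver expresses the guard as an IF..ENDIF block in the generated shader rather than as host-side C.

```c
#include <stdbool.h>

/* Two logical invocations per EU thread, so round the thread count up. */
static inline unsigned tcs_thread_count(unsigned invocations)
{
    return (invocations + 1u) / 2u;
}

/* Whether slot 0 or 1 of a given thread maps to a real invocation.
 * With an odd invocation count, slot 1 of the last thread is inactive,
 * which corresponds to the SIMD4x1 fallback mentioned in the commit. */
static inline bool tcs_slot_active(unsigned invocations, unsigned thread,
                                   unsigned slot)
{
    return thread * 2u + slot < invocations;
}
```

For three output control points this gives two threads, with only the first slot of the second thread doing work.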
Kenneth Graunke	a5038427c3	i965: Add tessellation evaluation shaders The TES is essentially a post-tessellator VS, which has access to the entire TCS output patch, and a special gl_TessCoord input. Otherwise, they're very straightforward. This patch implements SIMD8 tessellation evaluation shaders for Gen8+. The tessellator can generate a lot of geometry, so operating in SIMD8 mode (8 vertices per thread) is more efficient than SIMD4x2 mode (only 2 vertices per thread). I have another patch which implements SIMD4x2 mode for older hardware (or via an environment variable override). We currently handle all inputs via the pull model. v2: Improve comments (suggested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-22 02:12:05 -08:00
Timothy Arceri	54daffef16	nir: remove field only used in GLSL IR when assigning varying locations This field is used as a flag to optimise out any varyings that don't have a matching varying on the other side of the interface. The value should be the same for all varyings (except for SSO but we can't optimise those) by the time they reach NIR, so the field is no longer needed. Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-22 17:08:03 +11:00
Ben Skeggs	a8c4747602	nouveau: enable use of new kernel interfaces Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:17 +10:00
Ben Skeggs	5b614b141a	nvc0: remove use of deprecated sw class identifier Also emits a method to properly bind the class to a subchannel, which was missing previously. The kernel currently doesn't care, but this will break if it ever decides to (ie. to support multiple sw classes). Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:13 +10:00
Ben Skeggs	33a3ba8c59	nv50: fix g98+ vdec class allocation The kernel previously exposed incorrect classes for some of the chipsets that this code supports. It no longer does, but the older object ioctls have compatibility to avoid breaking userspace. This needs to be fixed before switching over to the newer interfaces. Rather than hardcoding chipset->class like the rest of the driver does, this makes use of (new) sclass queries to determine what's available. v2. - update to use symbolic class identifier from <nvif/class.h> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:10 +10:00
Ben Skeggs	791a3e1850	nouveau: remove use of deprecated nouveau_device_wrap() Switching to the newer libdrm entry-points tells libdrm that it's OK to make use of newer kernel interfaces. We want to be able to isolate any bugs to either the interfaces changes, or the use of NVIF itself. As such, this commit has a slight hack which forces libdrm to continue using the older kernel interfaces. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:08 +10:00
Ben Skeggs	323d4da372	nouveau: fix screen creation failure paths The winsys layer would attempt to cleanup the nouveau_device if screen init failed, however, in most paths the pipe driver would have already destroyed it, resulting in accesses to freed memory etc. This commit fixes the problem by allowing the winsys to detect whether the pipe driver's destroy function needs to be called or not. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:05 +10:00
Ben Skeggs	6c1bfff66c	nouveau: return nouveau_screen from hw-specific creation functions Kills off a void cast. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:03 +10:00
Ben Skeggs	1a9ec8e062	nouveau: remove use of deprecated nouveau_device::drm_version v2. update for libdrm nouveau_drm::lib_version removal Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:24:01 +10:00
Ben Skeggs	a458ffacba	nouveau: remove use of deprecated nouveau_device::fd Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:23:59 +10:00
Ben Skeggs	a8abdf2f35	nouveau: bump required libdrm version to 2.4.66 v2. forgot bump for non-gallium driver Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-22 13:23:27 +10:00
Dave Airlie	d19106649f	r600: fix viewport clipping handling (v2) If oViewport is written, vertex reuse needs to be turned off. If oViewport is constant, vertex reuse is fine, and VPORT_PROVOKE_DISABLE needs to be set. (we don't have enough info to program VPORT_PROVOKE). Fixes: arb_viewport_array-render-viewport-2 and some CTS tests. v2: drop vport provoke write, drop initial state writing this on evergreen, only program it on evergreen. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-22 09:09:56 +10:00
Dave Airlie	73e7c5fd7f	radeonsi: fix viewport clipping handling. (v2) If oViewport is written, vertex reuse needs to be turned off. If oViewport is constant, vertex reuse is fine, and VPORT_PROVOKE_DISABLE needs to be set. (We don't know if oViewport is constant so we skip this.) Fixes: arb_viewport_array-render-viewport-2 and some CTS tests. v2: drop writing to provoke disable, drop write in initial state. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-22 09:09:52 +10:00
Dave Airlie	847f91f4e5	r600: drop VTX_CNT_EN write from initial state we always program this in shader stages atom now. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-22 09:09:48 +10:00
Nicolai Hähnle	ea8c0b16ec	gallium/radeon: fix regression in a number of driver queries This rather silly mistake was introduced by commit `01910676`. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-21 15:47:10 -05:00
Ben Widawsky	0865088cca	i965: Only apply CS stall workaround pre-SKL As per the docs. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-21 10:42:42 -08:00
Ilia Mirkin	f7b7145123	glx/dri3: a drawable might not be bound at wait time A trace of Alien Isolation hit this on nouveau. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-21 06:43:58 -05:00
Emil Velikov	37186c43b5	docs: add news item and link release notes for 11.0.8 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-12-21 10:13:17 +00:00
Emil Velikov	1c1994da58	docs: add sha256 checksums for 11.0.8 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `b9b19162ee`)	2015-12-21 10:11:28 +00:00
Emil Velikov	bb5adf065f	docs: add release notes for 11.0.8 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `261daab6b4`)	2015-12-21 10:11:27 +00:00
Dave Airlie	97eee90547	glsl: count attributes for vertex inputs properly. This function deals with vertex inputs and fragment outputs, so we should count the attribute locations correctly for the vertex inputs. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-19 17:57:43 +10:00
Kenneth Graunke	14193e4643	ralloc: Fix ralloc_adopt() to the old context's last child's parent. I was cleverly using one iteration to obtain a pointer to the last item in ralloc's singly linked child list, while also setting parents. Unfortunately, I forgot to set the parent on that last item. Cc: "11.1 11.0 10.6" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-18 23:30:51 -08:00
Dave Airlie	b476c587e3	glsl: fix transform feedback for 64-bit outputs. This fixes the calculations for transform feedback for doubles. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-19 11:42:26 +10:00
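The ralloc_adopt() fix above concerns a loop that walks a singly linked list once, both setting parents and finding the last item. A minimal sketch of that pattern follows; `struct node` and `reparent_all` are hypothetical names for illustration, not ralloc's actual types or functions.

```c
#include <stddef.h>

/* Hypothetical list node; NOT ralloc's real header layout. */
struct node {
   struct node *parent;
   struct node *next;
};

/*
 * Set the parent of every node in the list and return the last node
 * (NULL for an empty list).
 *
 * The buggy shape described in the commit message would loop with a
 * condition like "n->next != NULL": that stops ON the last node, so the
 * last node is found but its parent is never updated. Looping while
 * "n != NULL" and remembering the previous node covers every item.
 */
static struct node *
reparent_all(struct node *first, struct node *new_parent)
{
   struct node *last = NULL;

   for (struct node *n = first; n != NULL; n = n->next) {
      n->parent = new_parent;
      last = n;
   }
   return last;
}
```

The returned last node is what a caller would use to splice the adopted list onto the new parent's existing children.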
Dave Airlie	64cfacf319	glsl: fix partial marking for fp64 types. This doubles the element width for the types that are greater than 2 elements wide. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-19 11:42:26 +10:00
Dave Airlie	1fc39dae22	glsl: only update doubles inputs for vertex inputs. This doesn't apply to other stages. This is only used in the mesa/st code, which needs further fixes. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-19 11:42:25 +10:00
Eric Anholt	f1fb85e544	vc4: Do instruction scheduling on the QIR to hide texture fetch latency. This is a rewrite of vc4_opt_qpu_schedule.c to operate on QIR. Texture fetch can probably take as much as the rest of the cycles of the program, so it's important to hide our other cycles during it (which is hard to do after register allocation). Also, we can queue up multiple texture requests before collecting the resulting samples, so that we keep the texture unit busy more of the time. High-settings openarena performance +2.35849% +/- 0.221154% (n=7). Also about 2-3% on the multiarb demo. 8 piglit tests (ext_framebuffer_multisample accuracy depthstencil) go from failing in rendering to failing in register allocation, but hopefully I can fix that up with some better register pressure handling here. total instructions in shared programs: 87723 -> 88448 (0.83%) instructions in affected programs: 78411 -> 79136 (0.92%) total estimated cycles in shared programs: 276583 -> 246306 (-10.95%) estimated cycles in affected programs: 265691 -> 235414 (-11.40%)	2015-12-18 17:12:10 -08:00
Eric Anholt	5278c64de5	vc4: Fix latency handling for QPU texture scheduling. There's only high latency between a complete texture fetch setup and collecting its result, not between each step of setting up the texture fetch request.	2015-12-18 17:09:03 -08:00
Eric Anholt	960f48809f	vc4: Keep sample mask writes from being reordered after TLB writes Fixes a regression I noticed after introducing scheduling on the QIR. Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-12-18 17:09:03 -08:00
Dave Airlie	5dc22cadb5	glsl: fix count_attribute_slots to allow for different 64-bit handling So vertex shader input attributes are handled differently than internal varyings between shader stages: dvec3 and dvec4 only count as one slot for vertex attributes, but for internal varyings, they count as 2. This patch comments all the uses of this API to clarify what we pass in, except one which needs further investigation. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-19 12:00:00 +11:00
Dave Airlie	69ea66231e	glsl: use dual slot helper in the linker code. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-19 11:59:55 +11:00
Dave Airlie	d97b060e6f	glsl/fp64: add helper for dual slot double detection. The old function didn't work for matrices, and we need this in other places to fix some other problems, so move to a helper in glsl type and fix the one user so far. A dual slot double is one that has 3 or 4 components in its base type. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-19 11:59:49 +11:00
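The slot-counting rule from the fp64 commits above (a dual slot double has 3 or 4 components; it takes one slot as a vertex attribute and two as an inter-stage varying) can be sketched as follows. `struct toy_type` and both functions are hypothetical stand-ins, not the glsl_type API, and arrays and matrix columns are deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical, simplified type description (scalars and vectors only). */
struct toy_type {
   bool is_double;        /* true for double, dvec2, dvec3, dvec4 */
   unsigned components;   /* 1..4 */
};

/* A dual slot double has 3 or 4 double components (dvec3, dvec4). */
static bool
is_dual_slot_double(const struct toy_type *t)
{
   return t->is_double && t->components > 2;
}

/*
 * Vertex shader inputs count every type as one slot; varyings between
 * stages count a dual slot double as two.
 */
static unsigned
count_slots(const struct toy_type *t, bool is_vertex_input)
{
   if (is_vertex_input)
      return 1;
   return is_dual_slot_double(t) ? 2 : 1;
}
```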
Dave Airlie	9fbcd8e847	glsl: pass stage into mark function Don't use a bool here, as for some 64-bit fixes we need the stage. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-19 11:59:42 +11:00
Rob Herring	b201a6ed9f	freedreno/ir3: fix 32-bit builds with pointer-to-int-cast error enabled Android builds with -Werror=pointer-to-int-cast causing an error on 32-bit builds. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-18 14:01:07 -05:00
Matt Turner	bb9eb59933	i965/vec4: Optimize predicate handling for any/all. For a select whose condition is any(v), instead of emitting cmp.nz.f0(8) null<1>D g1<0,4,1>D 0D mov(8) g7<1>.xUD 0x00000000UD (+f0.any4h) mov(8) g7<1>.xUD 0xffffffffUD cmp.nz.f0(8) null<1>D g7<4,4,1>.xD 0D (+f0) sel(8) g8<1>UD g4<4,4,1>UD g3<4,4,1>UD we now emit cmp.nz.f0(8) null<1>D g1<0,4,1>D 0D (+f0.any4h) sel(8) g9<1>UD g4<4,4,1>UD g3<4,4,1>UD Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-18 13:20:13 -05:00
Matt Turner	c8a74e3a4e	nir: Delete bany, ball, fany, fall. As in the previous patches, these can be implemented as any(v) -> any_nequal(v, false) all(v) -> all_equal(v, true) and their removal simplifies the code in the next patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-18 13:20:13 -05:00
Matt Turner	21cd298aec	glsl: Implement all(v) as all_equal(v, true). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-18 13:20:13 -05:00
Matt Turner	2268a50ffd	glsl: Remove ir_unop_any. The GLSL IR to TGSI/Mesa IR paths for any_nequal have the same optimizations the ir_unop_any paths had. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-18 13:20:12 -05:00
Matt Turner	249bb89617	glsl: Implement any(v) as any_nequal(v, false). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-18 13:20:12 -05:00
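The rewrites in the four commits above, any(v) -> any_nequal(v, false) and all(v) -> all_equal(v, true), can be checked with a small model. This is only an illustration over plain `bool` arrays with a broadcast scalar operand; the function names mirror the commit messages but the signatures are assumptions, not GLSL IR or NIR opcodes.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if any component differs from the broadcast scalar. */
static bool
any_nequal(const bool *v, size_t n, bool scalar)
{
   for (size_t i = 0; i < n; i++) {
      if (v[i] != scalar)
         return true;
   }
   return false;
}

/* True if every component equals the broadcast scalar. */
static bool
all_equal(const bool *v, size_t n, bool scalar)
{
   for (size_t i = 0; i < n; i++) {
      if (v[i] != scalar)
         return false;
   }
   return true;
}

/* any(v): some component is not false. */
static bool
vec_any(const bool *v, size_t n)
{
   return any_nequal(v, n, false);
}

/* all(v): every component is true. */
static bool
vec_all(const bool *v, size_t n)
{
   return all_equal(v, n, true);
}
```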
Nicolai Hähnle	0a6a17b9d7	gallium/radeon: only dispose locally created target machine in radeon_llvm_compile Unify the cleanup paths of the function rather than duplicating code. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-18 12:17:40 -05:00
Roland Scheidegger	61e5f8d073	gallium/util: (trivial) include p_shader_tokens.h in u_simple_shaders.h as it uses definition from it (enum tgsi_return_type).	2015-12-18 01:02:16 +01:00
Roland Scheidegger	6743c68a11	draw: fix clip test with NaNs NaNs mean it should be clipped, otherwise the NaNs might get passed to the next stages (if clipping didn't happen for another reason already), which might cause all kinds of problems. The llvm path got this right already (possibly by luck), but this isn't used when there's a gs active. Found by code inspection, verified with some hacked piglit test and some more hacked debug output. (Note the clipper can still itself incorrectly generate NaN and INF position values in its output prims (at least after w divide / viewport transform) even if the inputs weren't NaNs, if the position data of the vertices is "sufficiently bad".) Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-18 00:57:07 +01:00
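Why a clip test can let NaNs through: every ordered comparison with a NaN is false, so a test written as "is the coordinate outside?" never flags a NaN. A rough sketch for a single coordinate against the -w..w range, with hypothetical function names (this is not the draw module's code):

```c
#include <stdbool.h>
#include <math.h>

/* Flags the vertex only when it is provably outside; NaN slips through. */
static bool
clipped_naive(float x, float w)
{
   return x > w || x < -w;
}

/* Flags the vertex unless it is provably inside; NaN is clipped. */
static bool
clipped_nan_safe(float x, float w)
{
   return !(x <= w && x >= -w);
}
```

For ordinary finite inputs the two forms agree; they differ only when `x` or `w` is NaN.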
Roland Scheidegger	44e87b7b7b	draw: fix pstipple and aaline stages wrt sampler_views/samplers Those stages only really work for OGL-style texturing (so number of samplers and views mostly the same, certainly for the max values). These get often set up all at once, thus there might be max number of both even if all of them are just NULL. We must not set the max number of samplers and views to the same value since that will lead to terrible things if a driver supports more views than samplers (and the state tracker set up all the views). (This will not make these stages magically work if a shader uses dx10-style texturing, they might still replace an actually used sview in that case.) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-18 00:55:35 +01:00
Miklós Máté	6723b61753	swrast: move two global defines to the only place where they are used Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-17 12:09:58 -08:00
Miklós Máté	555f67c3d7	mesa: improve debug log in atifragshader Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-17 12:09:58 -08:00
Miklós Máté	5150d56ec4	program: fix comment about the fog formula Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-17 12:09:58 -08:00
Miklós Máté	7279453da5	mesa: Don't leak ATIfs instructions in DeleteFragmentShader Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-17 12:09:58 -08:00
Oded Gabbay	6e44bbe0f5	configure.ac: fix test for SSE4.1 assembler support This patch modifies the SSE4.1 test in configure.ac to use a global variable to initialize vector variables. In addition, we now return the value of the computation instead of 0. This is done so gcc 4.9 (and lower) won't optimize the SSE4.1 assembly instructions (when using -O1 and higher), because then the configure test might incorrectly pass even though the assembler doesn't support the SSE4.1 instructions (the test will pass because the compiler does support the intrinsics). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806 Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-12-17 09:37:24 +00:00
Jonathan Gray	4ef44bb484	configure: check for python2.7 for PYTHON2 Check for a 'python2.7' binary, 'python' and 'python2' are not provided by the OpenBSD python 2.7.x packages. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-12-17 09:37:24 +00:00
Jonathan Gray	7f585a6a98	configure.ac: use pkg-config for libelf Use PKG_CHECK_MODULES to get the flags to link libelf v2: keep AC_CHECK_LIB as a fallback for elfutils provided libelf that doesn't install a pkg-config file. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-12-17 09:37:24 +00:00
Jordan Justen	e97b207654	i965/screen: Allow OpenGLES 3.1 for gen8+ OpenGLES 3.1 cannot be enabled for gen 7 (Ivy Bridge, Haswell) since they are still missing ARB_stencil_texturing. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-12-16 20:37:40 -08:00
Jordan Justen	3b5d442661	i965: Enable compute shaders in more cases for OpenGLES 3.1 Previously we were checking the desktop OpenGL ARB_compute_shader requirements, but for OpenGLES 3.1, the requirements are lower. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-12-16 20:37:23 -08:00
Jordan Justen	3e8a6e468b	main/version: Don't require ARB_compute_shader for OpenGLES 3.1 The OpenGL ARB_compute_shader extension specification requires at least 1024 for GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, whereas OpenGLES 3.1 only required 128. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-16 20:36:16 -08:00
Jordan Justen	a9d934726e	main: Allow compute shaders to be compiled with OpenGLES 3.1 Previous OpenGLES 3.1 testing had been done when ARB_compute_shader was overridden to enabled. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-12-16 20:35:55 -08:00
Jordan Justen	3507d0b7f9	main: Add MESA_VERBOSE=api for LinkProgram & UseProgram Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-12-16 20:35:51 -08:00
Matt Turner	257fb76403	ir_to_mesa: Skip useless comparison instructions. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-16 19:59:05 -08:00
Kenneth Graunke	4a5cff24d7	glsl: Remove inverse() from GLSL 1.20 and 1.30. I apparently regressed this when rewriting the built-ins using ir_builder, in `76d2f73643`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93387 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-16 19:32:21 -08:00
Samuel Pitoiset	695ae816da	nv50: free memory allocated by the prog which reads MP perf counters This fixes a memory leak introduced in `6a9c151` ("nv50: add compute-related MP perf counters on G84+") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-12-16 21:52:43 -05:00
Brian Paul	f992d02ba2	st/osmesa: add OSMesaCreateContextAttribs() function As with the previous commit, except for gallium. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-16 19:39:05 -07:00
Brian Paul	a34e7612dc	osmesa: add new OSMesaCreateContextAttribs function This allows specifying a GL profile and version so one can get a core- profile context. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-16 19:38:51 -07:00
Brian Paul	c2c0983215	svga: don't use debug code in update_state() in release builds Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-16 19:38:15 -07:00
Samuel Pitoiset	aeee7f2a4d	nv50,nvc0: free memory allocated by performance metrics The destroy_query() helper was actually never called. This fixes a memory leak while monitoring performance metrics. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-12-16 23:03:08 +01:00
Samuel Pitoiset	9aca60bfb0	nvc0: free memory allocated by the prog which reads MP perf counters This fixes a long-standing memory leak (even before all my query related changes). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-16 22:00:57 +01:00
Samuel Pitoiset	8022c7480e	nvc0: fix metric-achieved_occupancy calculation on Kepler The maximum number of resident warps per multiprocessor is 64 on Kepler instead of 48 on Fermi. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-16 22:00:57 +01:00
Christian König	a87a1420d6	st/va: remove fence handling v3 It's nonsense to drain the pipeline like this. v2: keep the drain for DMA-buf exports. v3: flush before the export and after compositing and add TODO comment. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2015-12-16 21:13:42 +01:00
Neil Roberts	61cdb7665f	Revert "i965: Use MESA_FORMAT_B8G8R8X8_SRGB for RGB visuals" This reverts commit `839793680f`. The patch was breaking DRI3 because driGLFormatToImageFormat does not handle MESA_FORMAT_B8G8R8X8_SRGB which ended up making it fail to create the renderbuffer and it would later crash. It's not trivial to add this format because there is no __DRI_IMAGE_FORMAT nor __DRI_IMAGE_FOURCC define for the format either. I'm not sure how difficult adding this would be and whether adding a new format would require some sort of new version for DRI. Seeing as this might take a while to fix I think it makes sense to just revert the patch in the meantime in order to avoid regressing master. It is also not handled in intel_gles3_srgb_workaround and there may be other cases where it breaks. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93388 Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-16 17:35:33 +00:00
Neil Roberts	8c5310da9d	i965: Fix crash when calling glViewport with no surface bound If EGL_KHR_surfaceless_context is used then glViewport can be called with NULL for the draw and read surfaces. This was previously causing a crash because the i965 driver tries to use this point to invalidate the surfaces and it was dereferencing the NULL pointer. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93257 Cc: Nanley Chery <nanley.g.chery@intel.com> Cc: "11.1" <mesa-stable@lists.freedesktop.org> Tested-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2015-12-16 16:39:29 +00:00
Neil Roberts	4c7c9e4602	mesa/blit: Don't require the same format for multisample blits Previously the GL spec required that whenever glBlitFramebuffer is used with either buffer being multisampled, the internal formats must match. However the GL 4.4 spec was later changed to remove this restriction. In the section entitled “Changes in the released Specification of July 22, 2013” it says: “Relax BlitFramebuffer in section 18.3.1 so that format conversion can take place during multisample blits, since drivers already allow this and some apps depend on it.” If most drivers already allowed this in earlier versions I think it's safe to assume that this is a spec bug and it should also be allowed in all versions. This patch just removes the restriction on desktop GL. For GLES there are conformance tests that assert the previous behaviour so it is probably safer to leave it in. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92706 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-16 16:20:36 +00:00
Julien Isorce	89eb342def	st/va: retrieve size from the temporary img variable "image" is not ready yet since it will be set at the end of the function by: image = img; Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-12-16 14:12:31 +00:00
Roland Scheidegger	8e195a6251	draw: handle edge flags in llvm path We just ignored them altogether. While this feature is rather old-fashioned supporting it is actually rather trivial. This fixes the associated piglit tests (2 gl-1.0-edgeflag, 2 gl-2.0-edgeflag and all (7) of point-vertex-id). v2: comment fixes, and make the use of the edgeflag in clipmask consistent with when it's actually there (should be impossible to hit a case where the difference would actually matter but still...) Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-16 03:55:25 +01:00
Roland Scheidegger	13c0b1c780	draw: don't set start_instance and instance id for pt emit This just adds confusion, these parameters are used when fetching vertices by translate, but certainly not when emitting hw vertices for drivers, they make no sense there (setting them has no consequences otherwise since there won't be any elements with instance_divisor set). So just set them to 0 (the draw_pipe_vbuf code for emitting vertices when the draw pipeline is run already does exactly that). Also while here do some whitespace cleanup. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-16 03:55:14 +01:00
Jason Ekstrand	d7cb1634d2	nir/lower_system_values: Refactor and use the builder. Now that we have a helper in the builder for system values and a helper in core NIR to get the intrinsic opcode, there's really no point in having things split out into a helper function. This commit "modernizes" this pass to use helpers better and look more like newer passes. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-12-15 14:12:31 -08:00
Jason Ekstrand	f6910f072a	nir/builder: Add a load_system_value helper While we're at it, go ahead and make nir_lower_clip use it. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-12-15 14:12:31 -08:00
Jason Ekstrand	ca5be008bc	nir/lower_system_values: Stop supporting non-SSA The one user of this (i965) only ever calls it while in SSA form. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-12-15 14:12:31 -08:00
Samuel Pitoiset	276837cbe4	nvc0: remove old comment related to metric calculations I forgot to remove it when I refactored all performance metrics. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-15 22:49:37 +01:00
Eric Anholt	3858722740	vc4: Add support for dumping executed commands to a file. The VC4_DEBUG=cl,qpu is nice and all, but I want to be able to get more detailed dumps, and to replay the same exact commands in simulation. For that I need a dump with all of the VBOs, shaders, shader recs, etc. This dump can be parsed by vc4-gpu-tools. For now this is only doable from simulator mode, because otherwise we don't have access to the RCL contents generated by the kernel.	2015-12-15 12:05:48 -08:00
Eric Anholt	07570edb98	vc4: Import updated vc4_drm.h with hang state.	2015-12-15 12:02:54 -08:00
Eric Anholt	c5b886b028	vc4: Only update vc4->msaa when the framebuffer changes. Any update here should have been the same as in vc4_set_framebuffer_state(), except for the point where vc4_blit.c temporarily sets different state for its different buffers.	2015-12-15 12:02:53 -08:00
Eric Anholt	f2cf2a63f1	vc4: Don't consider nr_samples==1 surfaces to be MSAA. This is apparently a weirdness of gallium -- nr_samples==1 is occasionally used and means the same thing as nr_samples==0. Fixes a bunch of ARB_framebuffer_srgb blit cases in piglit.	2015-12-15 12:02:53 -08:00
Eric Anholt	da92f16c50	vc4: Fix min() wrapper definition for the simulator's kernel code.	2015-12-15 12:02:53 -08:00
Eric Anholt	02bcb443ee	vc4: Warn instead of abort()ing on exec ioctl failures. It's really harsh to abort() the X Server because of a momentary failure (particularly -ENOMEM). I don't see a way to pass an -ENOMEM up the stack from here, but we can at least log to stderr before proceeding on. Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-12-15 12:02:44 -08:00
Andreas Boll	a2140b0571	docs: Replace sourceforge logo with a text link Fixes the following Lintian (Debian package checker) error: privacy-breach-logo usr/share/doc/mesa-common-dev/contents.html (http://sourceforge.net/sflogo.php?group_id=3&type=1) usr/share/doc/mesa-common-dev/thanks.html (http://sourceforge.net/sflogo.php?group_id=3&type=1) The extended description of this tag is: This package creates a potential privacy breach by fetching a logo at runtime. Before using a local copy you should check that the logo is suitable for main. You can get help with determining this by posting a link to the logo and a copy of, or a link to, the logo copyright and license information to the debian-legal mailing list. Please replace any scripts, images, or other remote resources with non-remote resources. It is preferable to replace them with text and links but local copies of the remote resources are also acceptable as long as they don't also make calls to remote services. Please ensure that the remote resources are suitable for Debian main before making local copies of them. Severity: serious, Certainty: possible Check: files, Type: binary, udeb Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-15 17:57:25 +01:00
Nicolai Hähnle	c8d9d289ff	radeonsi: fix perfcounter selection for SI_PC_MULTI_BLOCK layouts The incorrectly computed register count caused lockups. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-15 11:23:40 -05:00
Nicolai Hähnle	149d049676	gallium/radeon: remove unnecessary test in r600_pc_query_add_result This test is a left-over of the initial development. It is unneeded and misleading, so let's get rid of it. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-15 11:23:40 -05:00
Nicolai Hähnle	819543adb4	mesa/main: use BITSET_FOREACH_SET in perf_monitor_result_size This should make the code both faster and slightly clearer. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-15 11:23:40 -05:00
Emil Velikov	9c0773958e	docs: add news item and link release notes for 11.1.0 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-12-15 15:07:03 +00:00
Emil Velikov	b8394ef3df	docs: add sha256 checksums for 11.0.1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `525f3c2c28`)	2015-12-15 15:07:02 +00:00
Emil Velikov	5497e119a5	docs: Update 11.1.0 release notes Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `5a616125ac`)	2015-12-15 15:07:02 +00:00
Rob Clark	e677b3047b	freedreno/a4xx: fix fragcoord.z + fragdepth It seems like disabling earlyz on a4xx also, by default, disables fragcoord.z to the FS. For frag shaders that both read fragcoord(.z) and write fragdepth, we need to set some extra bits to prevent a lockup. This lets us get rid of the hack of disabling fragcoord.z (which prevented 0ad from lockups, but resulted in rendering corruption). Also fixes fbo-depth-sample-compare. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-15 09:40:54 -05:00
Rob Clark	cad0920d11	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-15 09:39:10 -05:00
Rob Clark	249b2be3bc	freedreno/ir3/cmdline: don't dump nir by default By default we only want the disasm dumped, which we get anyways. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-15 09:39:10 -05:00
Christian König	10b7a7c344	st/va: remove nonsense HEVC picture id handling The picture id in this case is a VA-API surface handle, checking for a certain value can't be correct. Signed-off-by: Christian König <christian.koenig@amd.com>	2015-12-15 11:25:02 +01:00
Chris Forbes	af5ca43f26	i965: Allocate URB space for HS and DS stages when required. v2: (by Ken, incorporating feedback from Matt Turner): - Rewrite the push constant allocation code to be clearer. - Only apply the minimum VS entries workaround on Gen 8. v3: (by Ken) - Fix a bug in v2 where we failed to allocate the full push constant space when the number of enabled stages didn't divide the available push constant space evenly. (Any left over space is now allocated to the PS, as it was in v1.) - Fix an off-by-one error in v2's number of enabled stages calculation. - Use DIV_ROUND_UP for nicer formatting. - Line wrapping fixes. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-15 02:16:14 -08:00
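The v3 note in the URB commit above (space left over after an uneven division goes to the PS) amounts to the arithmetic sketched below. The function name, the unit-free sizes, and the assumption that the PS is always one of the enabled stages are illustrative only; this is not the i965 allocation code.

```c
/*
 * Split `total` units of push constant space across `stages` enabled
 * stages (stages >= 1, the PS being one of them). Every non-PS stage
 * gets an equal share; the PS gets its share plus any remainder, so the
 * full space is always handed out.
 */
static void
split_push_constants(unsigned total, unsigned stages,
                     unsigned *per_stage, unsigned *ps)
{
   *per_stage = total / stages;
   *ps = total - (stages - 1) * *per_stage;
}
```

With 32 units and 3 stages, for example, two stages get 10 units each and the PS gets 12, so nothing is left unallocated.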
Timothy Arceri	8c0963f9d3	docs: mark input/output block locations as DONE Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-15 13:10:51 +11:00
Timothy Arceri	0aeb9b3e5e	glsl: add support for explicit locations inside interface blocks This change also adds explicit location support for structs and interfaces which is currently missing in Mesa but is allowed with SSO and GLSL 1.50+. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-15 13:10:44 +11:00
Timothy Arceri	183c606066	glsl: simplify interface matching This makes the code easier to follow, should be more efficient, and will make it easier to add matching via explicit locations in the following patch. This patch also replaces the hash table with the newer resizable hash table; this should be more suitable as the table is likely to only contain a small number of entries. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-15 13:10:39 +11:00
Roland Scheidegger	8e264765a4	draw: remove clip_vertex from vertex header vertex header had both clip_pos and clip_vertex. We only really need one (clip_pos) because the draw llvm shader would overwrite the position output from the vs with the viewport transformed. However, we don't really need the second one, which was only really used for gl_ClipVertex - if the shader didn't have that the values were just duplicated to both clip_pos and clip_vertex. So, just use this from the vs output instead when we actually need it. Also change clip debug to output both the data from clip_pos and the clipVertex output (if available). Makes some things more complex, some things less complex, but seems easier to understand what clipping actually does (and what values it uses to do its magic). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-15 02:03:40 +01:00
Roland Scheidegger	1775400a20	draw: use clip_pos, not clip_vertex for the fake guardband xy point clipping Seems obvious now this should use the data from position and not clip_vertex (albeit might not really make a difference). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-15 02:03:40 +01:00
Roland Scheidegger	8575ddb644	draw: rename vertex header members clip -> clip_vertex and pre_clip_pos -> clip_pos. Looks more obvious to me what these values actually represent (so use something resembling the vs output names). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-15 02:03:40 +01:00
Roland Scheidegger	1b22815af6	draw: don't pretend have_clipdist is per-vertex This is just for code cleanup, conceptually the have_clipdist really isn't per-vertex state, so don't put it there (just dependent on the shader). Even though there wasn't really any overhead associated with this, we shouldn't store random shader information in the vertex header. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-15 02:03:40 +01:00
Roland Scheidegger	9e3f2af3c3	draw: use position not clipVertex output for xyz view volume clipping I'm pretty sure this should use position (i.e. pre_clip_pos) and not the output from clipVertex. Albeit piglit doesn't care. It is what we use in the clip test, and it is what every other driver does (as they don't even have clipVertex output and lower the additional planes to clip distances). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-15 02:03:40 +01:00
Kenneth Graunke	77cc2666b1	i965: Use DIV_ROUND_UP() in gen7_urb.c code. This is a newer convention, which we prefer over ALIGN(x, n) / n. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-14 14:56:14 -08:00
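The two spellings named in the commit above give the same count of n-sized chunks whenever n is a power of two. A small sketch with locally defined macros follows; these definitions follow the common idiom and are assumptions, not copies of Mesa's headers.

```c
/* Round value up to a multiple of a power-of-two alignment (assumed idiom). */
#define ALIGN_POT(value, alignment) \
   (((value) + (alignment) - 1) & ~((alignment) - 1))

/* Integer division that rounds up instead of truncating (assumed idiom). */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Older spelling: align first, then divide. */
static unsigned
chunks_align_then_divide(unsigned size, unsigned n)
{
   return ALIGN_POT(size, n) / n;
}

/* Newer spelling: one rounding division. */
static unsigned
chunks_div_round_up(unsigned size, unsigned n)
{
   return DIV_ROUND_UP(size, n);
}
```

Unlike the align-then-divide form, the rounding division also works when n is not a power of two.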
Kenneth Graunke	9f0944d15b	i965: Make TES inputs match TCS outputs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-14 14:48:29 -08:00
Kenneth Graunke	4fac950010	i965: Force VS -> TCS varyings to use the SSO VUE map layout. The compact VUE map only works when varying packing is in use. Unfortunately, varying packing is disabled for TCS inputs. This is needed to fix Piglit's tcs-input-read-array-interface test. v2: Make lines fit in 80 columns (caught by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-14 14:48:18 -08:00
Kenneth Graunke	bee42cc1f7	i965: Handle TCS outputs and TES inputs. TCS outputs and TES inputs both refer to a common "patch URB entry" shared across all invocations. First, there are some number of per-patch entries. Then, there are per-vertex entries accessed via an offset for the variable and a stride times the vertex index. Because these calculations need to be done in both the vec4 and scalar backends, it's simpler to just compute the offset calculations in NIR. It doesn't necessarily make much sense to use per-vertex intrinsics afterwards, but that at least means we don't lose the per-patch vs. per-vertex information. v2: Use is_input/is_output helpers (suggested by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-14 14:48:13 -08:00
Kenneth Graunke	31140d097a	i965: Handle TCS inputs and TES outputs. TES outputs work exactly like VS outputs, so we can simply add a case statement for those. TCS inputs are very similar to geometry shaders - they're arrays of per-vertex data. We use the same method I used for the scalar GS backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-14 14:48:07 -08:00
Kenneth Graunke	1f46163acb	i965: Add tessellation shader VUE map code. Based on a patch by Chris Forbes, but largely rewritten by Ken. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-14 14:48:01 -08:00
Kenneth Graunke	9f3917bf37	i965: Fix partial variable access for geometry shaders in SSO mode. Without varying packing, if a VS writes a compound variable, and the GS only reads part of it, the base location of the variable may not actually be in the VUE map. To cope with this, we do lowering in terms of varying slots, add any constant offsets to the base, and then do the VUE map remapping. This ensures we only look up VUE map entries for slots which actually exist. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-14 14:39:38 -08:00
Kenneth Graunke	8c4deb10df	i965: Separate base offset/constant offset combining from remapping. My tessellation branch has two additional remap functions. I don't want to replicate this logic there. v2: Handle inputs/outputs separately (suggested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-14 14:39:34 -08:00
Kenneth Graunke	106c3a8a48	nir: Fix number of indices on shared variable store intrinsics. Shared variables and input reworks landed around the same time. Presumably, this was some sort of mistake in rebase conflict resolution. This really only affects the num_indices field in nir_intrinsic_infos, which is rarely used. However, it's used by the printer. Found by inspection. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-14 14:27:38 -08:00
Ian Romanick	96dc732ed8	meta/generate_mipmap: Work-around GLES 1.x problem with GL_DRAW_FRAMEBUFFER GL_DRAW_FRAMEBUFFER does not exist in OpenGL ES 1.x, and since _mesa_meta_begin hasn't been called yet, we have to work-around API difficulties. The whole reason that GL_DRAW_FRAMEBUFFER is used instead of GL_FRAMEBUFFER is that the read framebuffer may be different. This is moot in OpenGL ES 1.x. I have another patch series that would also fix this (by removing the calls to _mesa_BindFramebuffer and friends), but it's not quite ready yet... and I think it may be a bit heavy for some stable branches. Consider this a stop-gap fix. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93215 Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-12-14 13:09:15 -08:00
Samuel Pitoiset	71135e275f	nvc0: check return value of nvc0_program_validate() Spotted by Coverity. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-14 19:08:42 +01:00
Samuel Pitoiset	54f58210c2	nv50: check return value of nouveau_object_new() When ret == 0, obj is not NULL. Spotted by Coverity. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-14 19:08:39 +01:00
Samuel Pitoiset	3f7462b792	nv50,nvc0: make use of unreachable() when invalid texture target happens Spotted by Coverity. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-14 19:08:25 +01:00
Christian König	8b52fa71ac	st/va: handle default post process regions Avoid referencing NULL pointers. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2015-12-14 11:54:55 +01:00
Christian König	f6dd31c1cf	st/va: fix unused variable warning Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com>	2015-12-14 11:54:55 +01:00
Christian König	025d97381e	st/va: clean up post process includes Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2015-12-14 11:54:54 +01:00
Christian König	27a276f625	st/va: cleanup filter color standard handling Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Tested-by: Julien Isorce <j.isorce@samsung.com>	2015-12-14 11:54:54 +01:00
Tapani Pälli	8b79258cfe	meta: clear_state structure cleanup Remove unused variables from clear_state and use a hardcoded location for color uniform to get rid of 2 more variables. Modify shaders to use explicit location for vertex attribute too as extension is enabled. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-14 08:01:49 +02:00
Ilia Mirkin	eca8f38dcf	glsl: assign varying locations to tess shaders when doing SSO GRID Autosport uses SSO shaders. When a tessellation evaluation shader is passed through this, it triggers assertion failures down the line with unassigned varying locations. Make sure to do this when the first shader in the pipeline is not a vertex shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-13 11:35:28 -05:00
Neil Roberts	839793680f	i965: Use MESA_FORMAT_B8G8R8X8_SRGB for RGB visuals Previously if the visual didn't have an alpha channel then it would pick a format that is not sRGB-capable. I don't think there's any reason not to always have an sRGB-capable visual. Since `28090b30` there are now visuals advertised without an alpha channel which means that games that don't request alpha bits in the config would end up without an sRGB-capable visual. This was breaking supertuxkart which assumes the winsys buffer is always sRGB-capable. The previous code always used an RGBA format if the visual config itself was marked as sRGB-capable regardless of whether the visual has alpha bits. I think we don't actually advertise any sRGB-capable visuals (but we just use sRGB formats anyway) so it shouldn't make any difference. However this patch also changes it to use RGBX if an sRGB-capable visual is requested without alpha bits for consistency. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92759 Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-13 14:29:42 +00:00
Neil Roberts	43f4be5f06	i965: Add B8G8R8X8_SRGB to the alpha format override brw_init_surface_formats overrides the render format for RGBX formats which aren't supported for rendering so that they internally use RGBA instead. However, B8G8R8X8_SRGB was missing so it wasn't marked as a renderable format. This patch just adds it. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-13 14:29:41 +00:00
Neil Roberts	c769efda93	i965: Add MESA_FORMAT_B8G8R8X8_SRGB to brw_format_for_mesa_format This will be used in a subsequent patch as the format for RGB visuals. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-13 14:29:38 +00:00
Ilia Mirkin	7752bbc44e	gk104/ir: simplify and fool-proof texbar algorithm With the current algorithm, we only look at tex uses. However there's a write-after-write hazard where we might decide to, on some path, not use a texture's output at all, but instead to write a different value to that register. However without the barrier, the texture might complete later and overwrite that value. This fixes Unreal Elemental demo on GK110/GK208, flightgear on GK10x, and likely other random-looking failures. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-12-12 18:10:16 -05:00
Ilia Mirkin	d35695096d	nv50/ir: combine sequences of conversions In some cases shaders want non-default rounding when converting float to integer. This can be done in one go, so merge the two ops. This comes up in the packUnorm4x8 & co functions, as well as a few random shaders. Overall shader-db impact is minimal, helping a handful of witcher2 and other misc shaders. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-12 18:10:16 -05:00
Ilia Mirkin	dbca0f3eba	nv50/ir: manually optimize multiplication expansion logic The conversion of 32-bit integer multiplies into 16-bit ones happens after the regular optimization loop. However it's fairly common to multiply by a small integer, rendering some of the expansion pointless. Firstly, propagate immediates when possible into mul ops, secondly just remove the ops when they are unnecessary. Including the change to generate imad immediates, the effect is: total instructions in shared programs : 6365463 -> 6351898 (-0.21%) total gprs used in shared programs : 728684 -> 728684 (0.00%) total local used in shared programs : 9904 -> 9904 (0.00%) total bytes used in shared programs : 44001576 -> 44036120 (0.08%) local gpr inst bytes helped 0 0 3288 4 hurt 0 0 0 842 It's easy for this to hurt bytes since we end up always generating the 8-byte form, while we can't always get rid of the immediate in question. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-12 18:10:16 -05:00
Ilia Mirkin	3af83c4bc7	nv50/ir: fix imul emission in the presence of an immediate Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-12 18:10:15 -05:00
Ilia Mirkin	a0b5d5beed	nv50/ir: teach post-ra immediate folding into mad about integers There will usually be a split before the mad op, peer through that and pick out the right word of the immediate. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-12 18:10:15 -05:00
Ilia Mirkin	ab70ea1353	nv50/ir: add short imad support Support emission of the short imad, but also include it in the various logic that tries to make it possible to emit. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-12 18:10:15 -05:00
Ilia Mirkin	6aca7fecb7	nv50/ir: can't have predication and immediates Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-12 18:10:15 -05:00
Ilia Mirkin	69e8b476d0	nv50/ir: fix texture grad for cubemaps We were ignoring the partial derivatives on the last dim. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-12 18:10:15 -05:00
Ilia Mirkin	a27548400e	nv50/ir: fix assumption that prog->maxGPR is in 32-bit reg units On NV50, we use 16-bit reg units (to make it all work with half-regs). A few places assumed that it was always in 32-bit units. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-12 18:10:15 -05:00
Nicolai Hähnle	d640f179d3	gallium/ddebug: regularly log the total number of draw calls This helps in the use of GALLIUM_DDEBUG_SKIP: first run a target application with skip set to a very large number and note how many draw calls happen before the bug. Then re-run, skipping the corresponding number of calls. Despite the additional run, this can still be much faster than not skipping anything. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-12 15:23:50 -05:00
Nicolai Hähnle	b86d5ccae2	gallium/ddebug: add GALLIUM_DDEBUG_SKIP option When we know that hangs occur only very late in a reproducible run (e.g. apitrace), we can save a lot of debugging time by skipping the flush and hang detection for earlier draw calls. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-12 15:23:34 -05:00
Roland Scheidegger	af7ba989fb	llvmpipe: fix layer/vp input into fs when not written by prior stages ARB_fragment_layer_viewport requires that if a fs reads layer or viewport index but it wasn't output by gs (or vs with other extensions), then it reads 0. This never worked for llvmpipe, and is surprisingly non-trivial to fix. The problem is the mechanism to handle non-existing outputs in draw is rather crude, it will simply redirect them to whatever is at output 0, thus later stages will just get garbage. So, rather than trying to fix this up (which looks non-trivial), fix this up in llvmpipe setup by detecting this case there and output a fixed zero directly. While here, also optimize the hw vertex layout a bit - previously if the gs outputted layer (or vp) and the fs read those inputs, we'd add them twice to the vertex layout, which is unnecessary. And do some minor cleanup, slots don't require that many bits, there was some bogus (but harmless) float/int mixup for psize slot too, make the slots all unsigned (we always put pos at pos zero thus everything else has to be positive if it exists), and make sure they are properly initialized (layer and vp index slot were not which looked fishy as they might not have got set back to zero when changing from a gs which outputs them to one which does not). This fixes the failures in piglit's arb_fragment_layer_viewport group (3 each for layer and vp). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-12 01:59:15 +01:00
Brian Paul	27d5be0b8f	svga: avoid emitting redundant SetSamplers() commands This greatly reduces the number of SetSamplers() commands for some applications. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-12-11 16:54:58 -07:00
Brian Paul	1291e910d5	svga: avoid emitting redundant SetIndexBuffer commands Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-12-11 16:54:44 -07:00
Brian Paul	71f19dd201	st/mesa: trivial indentation fix	2015-12-11 16:53:20 -07:00
Brian Paul	c877f1aeef	util/blitter: minor formatting fixes	2015-12-11 16:53:20 -07:00
Jason Ekstrand	b8425bb1e8	i965/fs: Use the correct source for local memory load offsets The offset for loads is in src[0]. This was a copy+paste error in the nir_intrinsic_load/store refactoring. This commit fixes a segfault in ES31-CTS.compute_shader.work-group-size. I have no idea how piglit failed to catch this... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93348 Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-11 13:56:34 -08:00
Kenneth Graunke	fadf378497	i965: Add Gen8+ tessellation control shader state (3DSTATE_HS). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-11 13:11:15 -08:00
Kenneth Graunke	b3c32f5f34	i965: Add Gen7+ tessellation engine state (3DSTATE_TE). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-11 13:11:15 -08:00
Kenneth Graunke	37b0b11cef	i965: Add Gen8+ tessellation evaluation shader state (3DSTATE_DS). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-11 13:11:15 -08:00
Kenneth Graunke	86a6eda9bc	i965: Add tessellation shader push constant support. Based on a patch by Chris Forbes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-11 13:11:15 -08:00
Kenneth Graunke	c59d1b1fd1	i965: Add tessellation shader sampler support. Based on code by Chris Forbes and Fabian Bieler. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-11 13:11:15 -08:00
Kenneth Graunke	f34c04fda6	i965: Add tessellation shader surface support. This is brw_gs_surface_state.c copy and pasted twice with search and replace. brw_binding_table.c code is similarly copy and pasted. v2: Drop dword_pitch related fields. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-11 13:11:15 -08:00
Kenneth Graunke	82455e5396	i965: Make fs_visitor::emit_urb_writes set EOT for TES as well. Tessellation evaluation shaders work almost identically to vertex shaders - we have a set of URB writes at the end of the program, and the last one should terminate it. Geometry shaders really are the special case, where multiple EmitVertex() calls trigger URB writes in the middle of the program. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-11 13:11:15 -08:00
Kenneth Graunke	7e0c22d461	i965: Don't hardcode g1 for URB handles in fs_visitor::emit_urb_writes(). Tessellation evaluation shaders will use g4 instead. For now, make an fs_reg called urb_handle and use that in place of hardcoding g1. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-11 13:11:15 -08:00
Kenneth Graunke	77b338d63b	i965: Make brw_set_message_descriptor() non-static. I want to use this directly from brw_vec4_generator.cpp. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-11 13:11:15 -08:00
Kristian Høgsberg Kristensen	c51f133197	i965: Move brw_cs_fill_local_id_payload() to libi965_compiler This is a helper function for setting up the local invocation ID payload according to the cs_prog_data generated by the compiler. It's intended to be available to users of libi965_compiler so move it there.	2015-12-11 13:07:25 -08:00
Eric Anholt	076551116e	vc4: Add quick algebraic optimization for clamping of unpacked values. GL likes to saturate your incoming color, but if that color's coming from unpacking from unorms, there's no point. Ideally we'd have a range propagation pass that cleans these up in NIR, but that doesn't seem to be going to land soon. It seems like we could do a one-off optimization in nir_opt_algebraic, except that doesn't want to operate on expressions involving unpack_unorm_4x8, since it's sized. total instructions in shared programs: 87879 -> 87761 (-0.13%) instructions in affected programs: 6044 -> 5926 (-1.95%) total estimated cycles in shared programs: 349457 -> 349252 (-0.06%) estimated cycles in affected programs: 6172 -> 5967 (-3.32%) No SSPD on openarena (which had the biggest gains, in its VS/CSes), n=15.	2015-12-11 12:36:16 -08:00
Eric Anholt	e3efc4b023	vc4: When doing algebraic optimization into a MOV, use the right MOV. If there were src unpacks, changing to the integer MOV instead of float (for example) would change the unpack operation.	2015-12-11 12:21:22 -08:00
Eric Anholt	2591beef89	vc4: Fix handling of src packs in qir_follow_movs(). The caller isn't going to expect it from a return, so it would probably get misinterpreted. If the caller had an unpack in its reg, that's fine, but don't lose track of it.	2015-12-11 12:21:22 -08:00
Eric Anholt	b70a2f4d81	vc4: Add missing progress note in opt_algebraic.	2015-12-11 12:21:22 -08:00
Eric Anholt	5989ef2b0f	vc4: Add debugging of the estimated time to run the shader to shader-db.	2015-12-11 12:21:22 -08:00
Eric Anholt	53b2523c6e	vc4: Fix handling of sample_mask output. I apparently broke this in a late refactor, in such a way that I decided its tests were some of those interminable ones that I should just blacklist from my testing. As a result, the refactors related to it were totally wrong.	2015-12-11 12:21:22 -08:00
Edward O'Callaghan	53609de762	softpipe: enable GL_ARB_viewport_array support, update GL3.txt doc Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-12-11 20:09:21 +01:00
Edward O'Callaghan	00f97ad5de	softpipe: implement some support for multiple viewports Mostly related to making sure the rasterizer can correctly pick out the correct scissor box for the current viewport. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-12-11 20:09:21 +01:00
Roland Scheidegger	6c2c1e0ffe	draw: don't assume fixed offset for data in struct vertex_info Otherwise, if struct vertex_info is changed, you're in for some surprises... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-11 20:09:21 +01:00
Neil Roberts	583a5778f4	i965/gen9: Don't do fast clears when GL_FRAMEBUFFER_SRGB is enabled When GL_FRAMEBUFFER_SRGB is enabled any single-sampled renderbuffers are resolved in intel_update_state because the hardware can't cope with fast clears on SRGB buffers. In that case it's pointless to do a fast clear because it will just be immediately resolved. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-12-11 18:05:56 +00:00
Neil Roberts	0033c81344	i965/gen9: Allow fast clears for non-MSRT SRGB buffers SRGB buffers are not marked as losslessly compressible so previously they would not be used for fast clears. However in practice the hardware will never actually see that we are using SRGB buffers for fast clears if we use the linear equivalent format when clearing and make sure to resolve the buffer as a linear format before sampling from it. This is an important use case because by default the window system framebuffers are created as SRGB so without this fast clears won't be used there. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-12-11 18:05:56 +00:00
Neil Roberts	82d459a423	i965/gen9: Resolve SRGB color buffers when GL_FRAMEBUFFER_SRGB enabled SKL can't cope with the CCS buffer for SRGB buffers. Normally the hardware won't see the SRGB formats because when GL_FRAMEBUFFER_SRGB is disabled these get mapped to their linear equivalents. In order to avoid relying on the CCS buffer when it is enabled this patch now makes it flush the renderbuffers. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-12-11 18:05:56 +00:00
Neil Roberts	eb291d7013	i965/gen8+: Don't upload the MCS buffer for single-sampled textures For single-sampled textures the MCS buffer is only used to implement fast clears. However the surface always needs to be resolved before being used as a texture anyway so the MCS buffer doesn't actually achieve anything. This is important for Gen9 because in that case SRGB surfaces are not supported for fast clears and we don't want the hardware to see the MCS buffer in that case. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-12-11 18:05:56 +00:00
Neil Roberts	44902ed1fa	i965/meta-fast-clear: Disable GL_FRAMEBUFFER_SRGB during clear Adds MESA_META_FRAMEBUFFER_SRGB to the meta save state so that GL_FRAMEBUFFER_SRGB will be disabled when performing the fast clear. That way the render surface state will be programmed with the linear equivalent format during the clear. This is important for Gen9 because the SRGB formats are not marked as losslessly compressible so in theory they aren't supported for fast clears. It shouldn't make any difference whether GL_FRAMEBUFFER_SRGB is enabled for the fast clear operation because the color is not actually written to the framebuffer so there is no chance for the hardware to apply the SRGB conversion on it anyway. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-12-11 18:05:56 +00:00
Marek Olšák	369afdb7b6	winsys/amdgpu: clear the buffer cache on mmap failure and try again Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	84a38bfc29	winsys/radeon: clear the buffer cache on mmap failure and try again Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	eb1e1af676	winsys/amdgpu: clear the buffer cache on allocation failure and try again Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	f9d6fe8001	winsys/radeon: clear the buffer cache on allocation failure and try again Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	cf811faeff	gallium/radeon: remove radeon_winsys_cs_handle "radeon_winsys_cs_handle cs_buf" is now equivalent to "pb_buffer buf". Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	cf422d20ff	winsys/radeon: use pb_cache instead of pb_cache_manager This is a prerequisite for the removal of radeon_winsys_cs_handle. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	ebc9497fcb	winsys/radeon: use radeon_bomgr less Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	a450f96ba9	winsys/radeon: rename radeon_bomgr_init_functions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	38ac20f7dd	winsys/radeon: move variables from radeon_bomgr to radeon_drm_winsys radeon_bomgr is going away. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:13 +01:00
Marek Olšák	3d090223ef	winsys/radeon: remove redundant radeon_bomgr::va Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	1e05812fcd	winsys/amdgpu: don't use the "rws" abbreviation for amdgpu_winsys Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	6f4e74d165	winsys/amdgpu: use pb_cache instead of pb_cache_manager This is a prerequisite for the removal of radeon_winsys_cs_handle. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	3fbf250dfa	gallium/pb_bufmgr_cache: use the new pb_cache module Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	2b396eeed9	gallium/pb_cache: add a copy of cache bufmgr independent of pb_manager This simplified (basically duplicated) version of pb_cache_manager will allow removing some ugly hacks from radeon and amdgpu winsyses and flatten and simplify their design. The difference is that winsyses must manually add buffers to the cache in "destroy" functions and the cache doesn't know about the buffers before that. The integration is therefore trivial and the impact on the winsys design is negligible. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	1a24f443b4	radeonsi: implement fast stencil clear Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	8ee96ce834	radeonsi: re-enable Hyper-Z for stencil Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	99e63338fb	r600g: remove a Hyper-Z workaround that's likely not needed anymore FORCE_OFF == 0, no need to set that Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	96e8d38ac4	r600g: re-enable Hyper-Z for stencil on Evergreen & Cayman Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	d3c08309ab	gallium/radeon: fix Hyper-Z hangs by programming PA_SC_MODE_CNTL_1 correctly This is the recommended setting according to hw people and it makes Hyper-Z stable. Just the two magic states. This fixes Evergreen, Cayman, SI, CI, VI (using the Cayman code). Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	7c29bf26bb	radeonsi: don't use the CP DMA workaround on Fiji and newer Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	787ada6bf6	radeonsi: apply the streamout workaround to Fiji as well Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	62d82193b8	radeonsi: also print hexadecimal values for register fields in the IB parser Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	de887ba90c	radeonsi: implement RB+ for Stoney (v2) v2: fix dual source blending Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	0f9519b938	radeonsi: don't call u_prims_for_vertices for patches and rectangles Both caused a crash due to a division by zero in that function. This is an alternative fix. Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	51603af390	radeonsi: use tgsi_shader_info::colors_written Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:11 +01:00
Marek Olšák	b5b87c4ed1	r600g: write all MRTs only if there is exactly one output (fixes a hang) This fixes a hang in piglit/arb_blend_func_extended-fbo-extended-blend-pattern_gles2 on REDWOOD. Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:11 +01:00
Marek Olšák	eb4813a952	tgsi/scan: add flag colors_written This is a prerequisite for the following r600g fix. Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:11 +01:00
Marek Olšák	37208c4fd7	Revert "radeonsi: disable DCC on Stoney" This reverts commit `32f05fadbb`. It turned out the problem with Stoney was caused by incorrect handling of a non-power-two VRAM size in the kernel driver. This is an optional BIOS setting and can be worked around by choosing a different VRAM size in the BIOS. Cc: 11.1 <mesa-stable@lists.freedesktop.org>	2015-12-11 15:25:11 +01:00
Timothy Arceri	4b9a79b7b8	nir: silence uninitialized warning Reviewed-by: Rob Clark <robdclark@gmail.com>	2015-12-11 19:26:20 +11:00
Dave Airlie	18ad641c3b	mesa/shader: return correct attribute location for double matrix arrays If we have a dmat2[4], then dmat2[0] is at 17, dmat2[1] at 19, dmat2[2] at 21 etc. The old code was returning 17,18,19. I think this code is also wrong for float matrices. There is now a piglit for the float case. This partly fixes: GL41-CTS.vertex_attrib_64bit.limits_test [airlied: update with Tapani suggestion to clean it up]. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-11 16:28:29 +10:00
Roland Scheidegger	64c59b0624	draw: fix clipping with linear interpolated values and gl_ClipVertex Discovered this when working on other clip code, apparently didn't work correctly - the combination of linear interpolated values and using gl_ClipVertex produced wrong values (failing all such combinations in piglits glsl-1.30 interpolation tests, named interpolation-noperspective-XXX-vertex). Use the pre-clip-pos values when determining the interpolation factor to fix this. No one really understands this code well, but everybody agrees this looks sane... This fixes all those failing tests (10 in total) both with the llvm and non-llvm draw paths, with no piglit regressions. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-12-11 02:21:39 +01:00
Dave Airlie	5362e53a06	r600: add missing return value check. Pointed out by coverity scan. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-11 09:37:20 +10:00
Jason Ekstrand	78b81be627	nir: Get rid of _indirect variants of input/output load/store intrinsics There is some special-casing needed in a competent back-end. However, they can do their special-casing easily enough based on whether or not the offset is a constant. In the meantime, having the _indirect variants adds special cases in a number of places where they don't need to be and, in general, only complicates things. To complicate matters, NIR had no way to convert an indirect load/store to a direct one in the case that the indirect was a constant so we would still not really get what the back-ends wanted. The best solution seems to be to get rid of the _indirect variants entirely. This commit is a bunch of different changes squashed together: - nir: Get rid of _indirect variants of input/output load/store intrinsics - nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect - nir/lower_io: Get rid of load/store_foo_indirect - i965/fs: Get rid of load/store_foo_indirect - i965/vec4: Get rid of load/store_foo_indirect - tgsi_to_nir: Get rid of load/store_foo_indirect - ir3/nir: Use the new unified io intrinsics - vc4: Do all uniform loads with byte offsets - vc4/nir: Use the new unified io intrinsics - vc4: Fix load_user_clip_plane crash - vc4: add missing src for store outputs - vc4: Fix state uniforms - nir/lower_clip: Update to the new load/store intrinsics - nir/lower_two_sided_color: Update to the new load intrinsic NIR and i965 changes are Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> NIR indirect declarations and vc4 changes are Reviewed-by: Eric Anholt <eric@anholt.net> ir3 changes are Reviewed-by: Rob Clark <robdclark@gmail.com> NIR changes are Acked-by: Rob Clark <robdclark@gmail.com>	2015-12-10 12:25:16 -08:00
Jason Ekstrand	f3970fad9e	i965/fs_nir: Refactor store_output, load_input, and load_uniform There was way too much incrementing of things going on. Instead, let's just start everything off at the right base location, and then increment in the loop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-10 12:25:16 -08:00
Patrick Rudolph	79bff488bc	gallium/util: return correct number of bound vertex buffers In case a state tracker unbinds every slot by a separate pipe->set_vertex_buffers() call, starting from slot zero, the number of bound buffers would not reach zero at all. The current algorithm does not account for pre-existing holes in the buffer list. Unbinding all buffers at once or starting at the top-most slot results in correct behaviour. Calculating the correct number of bound buffers fixes a NULL pointer dereference in nvc0_validate_vertex_buffers_shared(). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93004 Signed-off-by: Patrick Rudolph <siro@das-labor.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-10 13:55:53 -05:00
Neil Roberts	ba67739b66	blit: Don't take into account the Mesa format when checking MSRT blit According to the GLES3 spec, blitting between multisample FBOs with different internal formats should not be allowed. The compatible_resolve_formats function implements this check. Previously it had a shortcut where if the Mesa formats of the two renderbuffers were the same then it would assume the blit is ok. However some drivers map different internal formats to the same Mesa format, for example it might implement both GL_RGB and GL_RGBA textures with MESA_FORMAT_R8G8B8A_UNORM. The function is used to generate a GL error according to what the GL spec requires so the blit should not be allowed in that case. This patch just removes the shortcut so that it only ever looks at the internal format. Note that I posted a related patch to disable this check altogether for desktop GL. However this function is still used on GLES3 because there are conformance tests that require this behaviour so this patch is still useful. Cc: Marek Olšák <maraeo@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-10 11:03:58 +00:00
Neil Roberts	3f10774cba	i965: Check base format to determine whether to use tiled memcpy The tiled memcpy doesn't work for copying from RGBX to RGBA because it doesn't override the alpha component to 1.0. Commit `2cebaac479` added a check to disable it for RGBX formats by looking at the TexFormat. However a lot of the rest of the code base is written with the assumption that an RGBA texture can be used internally to implement a GL_RGB texture. If that is done then this check breaks. This patch makes it instead check the base format of the texture which I think more directly matches the intention. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-12-10 11:03:49 +00:00
Neil Roberts	9a31d9870b	i965/gen8: Allow rendering to B8G8R8X8 Since Gen8 this is allowed as a rendering target so we don't need to override it to B8G8R8A8. This is helpful on Gen9+ where using this override causes fast clears not to work. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2015-12-10 11:03:49 +00:00
Neil Roberts	d151338594	i965/gen9: Allow fast clear for MSRT formats matching render Previously fast clear was disallowed on Gen9 for MSRTs with the claim that some formats don't work but we didn't understand why. On further investigation it seems the formats that don't work are the ones where the render surface format is being overridden to a different format than the one used for texturing. The one used for texturing is not actually a renderable format. It arguably makes sense that the sampler hardware doesn't handle the fast color correctly in these cases because it shouldn't be possible to end up with a fast cleared surface that is non-renderable. This patch changes the limitation to prevent fast clear for surfaces where the format for rendering is overridden. Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2015-12-10 11:03:49 +00:00
Neil Roberts	e1a16b901b	i965/gen9/fast-clear: Handle linear→SRGB conversion If GL_FRAMEBUFFER_SRGB is enabled when writing to an SRGB-capable framebuffer then the color will be converted from linear to SRGB before being written. There is no chance for the hardware to do this itself because it can't modify the clear color that is programmed in the surface state so it seems pretty clear that the driver should be handling this itself. Note that this wasn't a problem before Gen9 because previously we were only able to do fast clears to 0 or 1 and those values are the same in linear and SRGB space. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-12-10 11:03:49 +00:00
Jordan Justen	83e8e07a2b	docs: Add ARB_compute_shader to 11.2.0 release notes Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	1c0d059c02	docs: Mark ARB_compute_shader as done for i965 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	d04612b60d	i965: Enable ARB_compute_shader extension on supported hardware Enable ARB_compute_shader on gen7+, on hardware that supports the OpenGL 4.3 requirements of a local group size of 1024. With SIMD16 support, this is limited to Ivy Bridge and Haswell. Broadwell will work with a local group size up to 896 on SIMD16 meaning programs that use this size or lower should run when setting MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	e288b4a133	i965/nir: Implement shared variable atomic operations v3: * Update based on latest SSBO code (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	d584b2313e	nir: Add nir intrinsics for shared variable atomic operations v3: * Update min/max based on latest SSBO code (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	fc21a7c26e	glsl: Disable several optimizations on shared variables Shared variables can be accessed by other threads within the same local workgroup. This prevents us from performing certain optimizations with shared variables. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	f821a3ec4f	glsl: Buffer atomics are supported for compute shaders Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	7333593cf3	glsl: Translate atomic intrinsic functions on shared variables When an intrinsic atomic operation is used on a shared variable, we translate it to a new 'shared variable' specific intrinsic function call. For example, a call to __intrinsic_atomic_add when used on a shared variable will be translated to a call to __intrinsic_atomic_add_shared. v3: * Fix stale comments copied from SSBOs (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	614ad9b40b	glsl: Check for SSBO variable in check_for_ssbo_store The compiler probably already blocks this earlier on, but we should be checking for an SSBO here. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	c2e6cfbd78	glsl: Check for SSBO variable in SSBO atomic lowering When an atomic function is called, we need to check to see if it is for an SSBO variable before lowering it to the SSBO specific intrinsic function. v2: * is_in_buffer_block => is_in_shader_storage_block (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	a108e14d1c	glsl: Replace atomic_ssbo and ssbo_atomic with atomic The atomic functions can also be used with shared variables in compute shaders. When lowering the intrinsic in lower_ubo_reference, we still create an SSBO specific intrinsic since SSBO accesses can be indirectly addressed, whereas all compute shader shared variables live in a single shared variable area. v2: * Also remove the _internal suffix from ssbo atomic intrinsic names (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	23da6aeb17	glsl: Allow atomic functions to be used with shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	d3625d4071	i965: Lower shared variable references to intrinsic calls Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	b1fe3af0da	i965: Enable shared local memory for CS shared variables v3: * Check shared variable size at link time Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	faddb301ff	i965/fs: Handle nir shared variable store intrinsic v4: * Apply similar optimization for shared variable stores as `0cb7d7b4b7`. This was causing a OpenGLES 3.1 CTS failure, but `867c436ca8` fixes that. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	8613206bd3	i965/fs: Handle nir shared variable load intrinsic v3: * Remove extra #includes (Iago) * Use recently added GEN7_BTI_SLM instead of BRW_SLM_SURFACE_INDEX (curro) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	e128a62318	i965: Disable vector splitting on shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	aa12a92626	nir: Translate glsl shared var store intrinsic to nir intrinsic Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	03b0439938	nir: Translate glsl shared var load intrinsic to nir intrinsic Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	1078d712d7	glsl: Add lowering pass for shared variable references In this lowering pass, shared variables are decomposed into intrinsic calls. v2: * Send mem_ctx as a parameter (Iago) v3: * Shared variables don't have an associated interface block (Iago) * Always use 430 packing (Iago) * Comment / whitespace cleanup (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Iago Toral Quiroga	f22ab2e8b3	glsl: Don't assert on shared variable matrices with 'inherited' layout We use column-major for shared variable matrices. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	66eaef7737	glsl: Don't lower_variable_index_to_cond_assign for shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	c43a7e605e	glsl: Remove mem_ctx as member variable in lower_ubo_reference_visitor Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	ee005df2f9	glsl ubo/ssbo: Move common code into lower_buffer_access::setup_buffer_access This code will also be usable by the pass to lower shared variables. Note, that const_offset is adjusted by setup_buffer_access so it must be initialized before calling setup_buffer_access. v2: Add comment for lower_buffer_access::setup_buffer_access Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	99c8196458	glsl ubo/ssbo: Move is_dereferenced_thing_row_major into lower_buffer_access Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	afa4129cf6	glsl ubo/ssbo: Add lower_buffer_access class This class has code that will be shared by lower_ubo_reference and lower_shared_reference. (lower_shared_reference will be used to support compute shader shared variables.) v2: * Add lower_buffer_access.h to makefile (Emil) * Remove static is_dereferenced_thing_row_major from lower_buffer_access.cpp. This will become a lower_buffer_access method in the next commit. * Pass mem_ctx as parameter rather than using a member variable (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	ad3c65e792	glsl ubo/ssbo: Split buffer access to insert_buffer_access This allows the code in emit_access to be generic enough to also be for lowering shared variables. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Jordan Justen	05667ecc52	glsl ubo/ssbo: Use enum to track current buffer access type v2: * Rename ssbo_get_array_length to ssbo_unsized_array_length_access (Iago) * Always use this-> when referencing buffer_access_type (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 23:50:38 -08:00
Tapani Pälli	8cc372b6d9	glsl: do not lose always_active_io when packing varyings Otherwise packed and inactive varyings get optimized away. This needs to be prevented when using separate shader objects where interface needs to be preserved. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-10 07:51:55 +02:00
Tapani Pälli	2377db2c4e	mesa: invalidate pipeline status after glUseProgramStages This will cause validation to run during next draw, this is done because possible changes in used stages and programs can cause invalid pipeline state. This fixes a subtest in following CTS test: ES31-CTS.sepshaderobjs.StateInteraction Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-10 07:51:40 +02:00
Dave Airlie	21abaad8fe	mesa/varray: set double arrays to non-normalised. Doesn't have any effect in practice I don't think, but CTS reads back using GetVertexAttrib. This fixes: GL41-CTS.vertex_attrib_64bit.get_vertex_attrib Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-10 13:51:44 +10:00
Michel Dänzer	b4a03e7f8f	clover: Fix build against LLVM 3.8 SVN >= r255078 Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-12-10 10:45:29 +09:00
Brian Paul	e1815bcc47	mesa: fix ID usage for buffer warnings We need a different ID pointer for each call site. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-09 16:06:35 -07:00
Brian Paul	de5bb7fe78	docs: remove stray <ul> tag from 11.0.5.html file to fix indentation	2015-12-09 15:55:11 -07:00
Serge Martin	2b930327e8	freedreno: little clean up in fd_create_surface in order to avoid returning an invalid address if CALLOC_STRUCT returns NULL. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-09 17:32:41 -05:00
Serge Martin	0149e7a944	freedreno: change to goto fail in fd_resource_transfer_map, like the other error cases Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-09 17:31:16 -05:00
Serge Martin	e63fec29a1	freedreno: fix bind_sampler_states when hwcso is NULL src/gallium/tests/trivial/compute.c expects samplers to be cleaned when the samplers list is NULL. Like in radeon, the function behaves as it does when the number of samplers parameter is set to 0. [small s/hwsco/hwcso/ typo fix] Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-09 17:30:58 -05:00
Edward O'Callaghan	f32f80e19d	gallium/util: Make u_prims_for_vertices() safe Let us avoid trapping in hardware from a SIGFPE and instead assert on a zero divisor. Hint: This can occur if a PIPE_PRIM_? is not handled in u_prim_vertex_count() that results in 'info' not being initialized in the expected manner. Further, we also fix a possible NULL pointer dereference from 'info' being NULL from a u_prim_vertex_count() call. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-09 22:51:56 +01:00
Andreas Boll	63fe600c7a	docs: add news item for mesa-demos 8.3.0 release Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>	2015-12-09 22:44:52 +01:00
Patrick Rudolph	432a798cf5	nv50,nvc0: fix use-after-free when vertex buffers are unbound Always reset the vertex bufctx to make sure there's no pointer to an already freed pipe_resource left after unbinding buffers. Fixes use after free crash in nvc0_bufctx_fence(). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93004 Signed-off-by: Patrick Rudolph <siro@das-labor.org> [imirkin: simplify nvc0 fix, apply to nv50] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-09 13:38:15 -05:00
Andreas Boll	f876346cdd	mesa: Fix a typo in a comment s/suports/supports/ Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-09 18:29:24 +01:00
Andreas Boll	0560e835f3	glx: Fix a typo in a comment s/suports/supports/ Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-09 18:29:21 +01:00
Andreas Boll	9246df2280	st/osmesa: Fix a typo in a comment s/suport/support/ Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-09 18:29:18 +01:00
Andreas Boll	7af9930ab4	meta: Fix a typo in a print message s/Unkown/Unknown/ Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-09 18:29:15 +01:00
Andreas Boll	c83e161c91	mesa: Fix typos in print messages s/inconsistant/inconsistent/ s/occurences/occurrences/ Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-09 18:29:11 +01:00
Andreas Boll	5c27cb3da3	glsl: Fix a typo in a comment s/suports/supports/ Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-09 18:26:47 +01:00
Brian Paul	aa9af32752	svga: initialize pipe_driver_query_info entries with a macro To be safe, set all the fields in case the enums ordering/values ever change. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-12-09 09:43:47 -07:00
Brian Paul	ab0651ccfd	mesa: detect inefficient buffer use and report through debug output When a buffer is created with GL_STATIC_DRAW, its contents should not be changed frequently. But that's exactly what one application I'm debugging does. This patch adds code to try to detect inefficient buffer use in a couple places. The GL_ARB_debug_output mechanism is used to report the issue. NVIDIA's driver detects these sort of things too. Other types of inefficient buffer use could also be detected in the future. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-12-09 09:43:47 -07:00
Emil Velikov	7d3df58125	docs: add news item and link release notes for 11.0.7 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-12-09 16:12:32 +00:00
Emil Velikov	61b91d0811	docs: add sha256 checksums for 11.0.7 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `f9715bc449`)	2015-12-09 16:11:12 +00:00
Emil Velikov	d432be32e2	docs: add release notes for 11.0.7 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `bec983b738`)	2015-12-09 16:11:11 +00:00
Francisco Jerez	595c818071	i965: Resolve color and flush for all active shader images in intel_update_state(). Fixes arb_shader_image_load_store/execution/load-from-cleared-image.shader_test. Couldn't reproduce any significant FPS regression in CPU-bound benchmarks from the Finnish benchmarking system on either VLV or BSW after 30 runs with 95% confidence level. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92849 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jason Ekstrand <jason.ekstrand@intel.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Tested-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 15:12:59 +02:00
Francisco Jerez	3dc97a1586	i965: Document inconsistent units the URB size is represented in. Every other gen the representation of the URB size was changed and previous ones weren't updated. I'd be willing to write a series normalizing this to be KB on all generations if anybody else cares.	2015-12-09 14:00:30 +02:00
Francisco Jerez	228d5a3f75	i965: Hook up L3 partitioning state atom. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:59:03 +02:00
Francisco Jerez	1fc797e8e4	i965: Work around L3 state leaks during context switches. This is going to require some rather intrusive kernel changes to fix properly, in the meantime (and forever on at least pre-v4.1 kernels) we'll have to restore the hardware defaults at the end of every batch in which the L3 configuration was changed to avoid interfering with the DDX and GL clients that use an older non-L3-aware version of Mesa. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> v2: Optimize look-up of the default configuration by assuming it's the first entry of the L3 config array in order to avoid an FPS regression in GpuTest Triangle and SynMark OglBatch2-7 on most affected platforms. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-09 13:57:40 +02:00
Francisco Jerez	09d9638dd0	i965: Add debug flag to print out the new L3 state during transitions. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	acc77947ca	i965: Implement L3 state atom. The L3 state atom calculates the target L3 partition weights when the program bound to some shader stage is modified, and in case they are far enough from the current partitioning it makes sure that the L3 state is re-emitted. v2: Fix for inconsistent units the context URB size is expressed in. Clamp URB size to 1008 KB on SKL due to FF hardware limitation. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	95ad0bd33b	i965: Calculate appropriate L3 partition weights for the current pipeline state. This calculates a rather conservative partitioning of the L3 cache based on the shaders currently bound to the pipeline and whether they use SLM, atomics, images or scratch space. The result is intended to be fine-tuned later on based on other pipeline state. Note that the L3 partitioning calculated for VLV in the non-SLM non-DC case differs from the hardware defaults in that it doesn't include a DC partition and has twice as much RO cache space -- This is an intentional functional change that improves performance in several bandwidth-bound benchmarks on VLV (5% significance): SynMark OglTexFilterAniso by 14.18%, SynMark OglTexFilterTri by 7.15%, Unigine Heaven by 4.91%, SynMark OglShMapPcf by 2.15%, GpuTest Fur by 1.83%, SynMark OglDrvRes by 1.80%, SynMark OglVsTangent by 1.71%, and a few other benchmarks from the Finnish system by less than 1%. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	fa1300f75e	i965: Implement selection of the closest L3 configuration based on a vector of weights. The input of the L3 set-up code is a vector giving the approximate desired relative size of each partition. This implements logic to compare the input vector against the table of validated configurations for the device and pick the closest compatible one. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	353abb294b	i965: Define and use REG_MASK macro to make masked MMIO writes slightly more readable. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	fa043698d2	i965/hsw: Enable L3 atomics. Improves performance of the arb_shader_image_load_store-atomicity piglit test by over 25x (which isn't a real benchmark it's just heavy on atomics -- the improvement in a microbenchmark I wrote a while ago seemed to be even greater). The drawback is one needs to be extra-careful not to hang the GPU (in fact the whole system). A DC partition must have been allocated on L3, the "convert L3 cycle for DC to UC" bit may not be set, the MOCS L3 cacheability bit must be set for all surfaces accessed using DC atomics, and the SCRATCH1 and ROW_CHICKEN3 bits must be kept in sync. A fairly recent kernel is required for the command parser to allow writes to these registers. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	6907175a4f	i965: Implement programming of the L3 configuration. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	b22bebe966	i965: Import tables enumerating the set of validated L3 configurations. It should be possible to use additional L3 configurations other than the ones listed in the tables of validated allocations ("BSpec » 3D-Media-GPGPU Engine » L3 Cache and URB [IVB+] » L3 Cache and URB [*] » L3 Allocation and Programming"), but it seems sensible for now to hard-code the tables in order to stick to the hardware docs. Instead of setting up the arbitrary L3 partitioning given as input, the closest validated L3 configuration will be looked up in these tables and used to program the hardware. The included tables should work for Gen7-9. Note that the quantities are specified in ways rather than in KB, this is because the L3 control registers expect the value in ways, and because by doing that we can re-use a single table for all GT variants of the same generation (and in the case of IVB/HSW and CHV/SKL across different generations) which generally have different L3 way sizes but allow the same combinations of way allocations. v2: Use slice count from the devinfo structure instead of the gt number to implement get_l3_way_size(). Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	a403ad4f5a	i965: Add slice count to the brw_device_info structure. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	c8ff045fdb	i965/gen8: Don't add workaround bits to PIPE_CONTROL stalls if DC flush is set. According to the hardware docs a DC flush is sufficient to make CS_STALL happy, there's no need to add STALL_AT_SCOREBOARD whenever it's present. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:05 +02:00
Francisco Jerez	2405b75bc9	i965: Define state flag to signal that the URB size has been altered. This will make sure that we recalculate the URB layout anytime the URB size is modified by the L3 partitioning code. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:04 +02:00
Francisco Jerez	4841cab01a	i965: Keep track of whether LRI is allowed in the context struct. This stores the result of can_do_pipelined_register_writes() in the context struct so we can find out later whether LRI can be used to program the L3 configuration. v2: * Split change of gen check in can_do_pipelined_register_writes (jljusten) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:04 +02:00
Francisco Jerez	50c2713726	i965: Adjust gen check in can_do_pipelined_register_writes Allow for pipelined register writes for gen < 7. v2: * Split from another patch and adjust comment (jljusten) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:04 +02:00
Francisco Jerez	5912da45a6	i965: Define symbolic constants for some useful L3 cache control registers. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-12-09 13:46:04 +02:00
Dave Airlie	e307cfa7d9	radeonsi: handle loading doubles as geometry shader inputs. This adds the double code to the geometry shader input handling. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-09 17:04:04 +10:00
Dave Airlie	8c9e40ac22	radeonsi: handle doubles in lds load path. This handles loading doubles from LDS properly. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-09 17:03:38 +10:00
Dave Airlie	cce3864046	r600: handle geometry dynamic input array index This fixes: glsl-1.50/execution/geometry/dynamic_input_array_index.shader_test my profanity. We need to load the AR register with the value from the index reg Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-09 15:07:53 +10:00
Dave Airlie	38542921c7	r600g: fix geom shader input indirect indexing. This fixes: gs-input-array-vec4-index-rd The others run out of gprs unfortunately. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-09 15:07:47 +10:00
Dave Airlie	e97ac006d7	r600g: fix outputting to non-0 buffers for stream 0. This fixes: arb_transform_feedback3-ext_interleaved_two_bufs_gs arb_transform_feedback3-ext_interleaved_two_bufs_gs_max transform-feedback-builtins If we are only emitting one ring, then emit all output buffers on it. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-09 15:07:01 +10:00
Edward O'Callaghan	1f61447ce1	r600: Add ARB_copy_image support [airlied: update relnotes] Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-09 14:41:46 +10:00
Edward O'Callaghan	d13ac27200	r600g: allow copying between compatible un/compressed formats See: `commit e82c527f1fc2f8ddc64954ecd06b0de3cea92e93` which is where a block in src maps to a pixel in dst and vice versa. e.g. DXT1 <-> R32G32_UINT DXT5 <-> R32G32B32A32_UINT Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-09 14:40:32 +10:00
Ilia Mirkin	f920f8eb02	nv50/ir: fix cutoff for using r63 vs r127 when replacing zero The only effect here is a space savings - 822 programs in shader-db affected with the following overall change: total bytes used in shared programs : 44154976 -> 44139880 (-0.03%) Fixes: `641eda0c` (nv50/ir: r63 is only 0 if we are using less than 63 registers) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-08 23:15:29 -05:00
Ilia Mirkin	44260d9080	nv50/ir: prefer to color mad def and src2 with the same color This allows us to use the short encoding, and potentially fold immediates in later on. total instructions in shared programs : 6379731 -> 6367861 (-0.19%) total gprs used in shared programs : 728502 -> 728683 (0.02%) total local used in shared programs : 9904 -> 9904 (0.00%) total bytes used in shared programs : 44661008 -> 44154976 (-1.13%) local gpr inst bytes helped 0 51 7267 20306 hurt 0 232 125 274 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-08 23:15:29 -05:00
Ilia Mirkin	c1c1248b94	nv50/ir: reduce degree limit on ops that can't encode large reg dests Operations that take immediates can only encode registers up to 64. This fixes a shader in a "Powered by Unity" intro. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-08 23:15:29 -05:00
Ilia Mirkin	99581ca393	nv50/ir: only unspill once ahead of a group of instructions We already semi-did this but the list of uses was unsorted, so it was unreliable. Sort the uses by bb and serial, and don't unspill for each instruction in a sequence. (And also don't unspill multiple times for a single instruction that uses the value in question multiple times.) This causes a minor reduction in generated instructions for shader-db (as few programs spill) but more importantly it brings determinism to each run's output. On SM10: total instructions in shared programs : 6387945 -> 6379359 (-0.13%) total gprs used in shared programs : 728544 -> 728544 (0.00%) total local used in shared programs : 9904 -> 9904 (0.00%) local gpr inst bytes helped 0 0 322 322 hurt 0 0 0 0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-08 23:15:29 -05:00
Ilia Mirkin	0f647bd65b	nv50/ir: check if the target supports the new offset before inlining Fixes: `abd326e81b` (nv50/ir: propagate indirect loads into instructions) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93300 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-08 23:15:29 -05:00
Dave Airlie	a13b14930d	llvmpipe: fix fp64 inputs to geom shader. This fixes the fetching of fp64 inputs to the geometry shader, this fixes the recently posted piglit's arb_gpu_shader_fp64/execution/gs-fs-vs-double-array.shader_test arb_vertex_attrib_64bit/execution/gs-fs-vs-attrib-double-array.shader_test Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-09 13:56:39 +10:00
Matt Turner	3a7f95b3aa	nir: Optimize useless comparisons against true/false. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v1] Reviewed-by: Eric Anholt <eric@anholt.net> [v1] v2: Move new rule to Boolean simplification section Add a a@bool != true simplification Suggested-by: Neil Roberts <neil@linux.intel.com>	2015-12-08 15:41:08 -08:00
Matt Turner	9e9e6fc8f1	glsl: Switch opcode and avail parameters to binop(). To make it match unop(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-08 15:39:47 -08:00
Matt Turner	dd3c16c94b	glsl_to_tgsi: Skip useless comparison instructions. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-08 15:38:03 -08:00
Matt Turner	eca846e7ae	glsl: Relax qualifier ordering restriction in ES 3.1. ... and allow the "binding" qualifier in ES 3.1 as well. GLSL ES 3.1 incorporates only a few features from the extension ARB_shading_language_420pack: the relaxed qualifier ordering requirements and the binding qualifier. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-08 15:36:57 -08:00
Matt Turner	79da7220db	glsl: Use has_420pack(). These features would not have been enabled with #version 420 otherwise. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-08 15:36:57 -08:00
Matt Turner	c200e606f7	glsl: Allow binding of image variables with 420pack. This interaction was missed in the addition of ARB_image_load_store. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93266 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-08 15:36:57 -08:00
Jose Fonseca	a9a0c693e5	appveyor: Cache winflexbison archive. Unfortunately the Appveyor -> SourceForge connection seems a bit unreliable, causing frequent build failures while downloading winflexbison (approx once every 2 days). Fetching winflexbison archive into Appveyor's cache should eliminate these. Fetching Python modules from PyPI doesn't seem to be a problem, so they are left alone for now, though they could eventually get the same treatment.	2015-12-08 22:49:38 +00:00
Eric Anholt	f61ceeb3fd	vc4: Enable MSAA. We still have several failures in the newly enabled tests in simulation: sRGB downsampling is done as if it was just linear, stencil blits are not supported on MSAA either, and derivatives are still not supported (breaking some MSAA simulation shaders). So, other than sRGB downsampling quality, things seem to be in good shape.	2015-12-08 10:09:52 -08:00
Eric Anholt	fc4a1bfb88	vc4: Add support for mapping of MSAA resources. The pipe_transfer_map API requires that we do an implicit downsample/upsample and return a mapping of that.	2015-12-08 09:49:56 -08:00
Eric Anholt	6b4dfd53ae	vc4: Add support for texel fetches from MSAA resources. This is the core of ARB_texture_multisample. Most of the piglit tests for GL_ARB_texture_multisample require GL 3.0, but exposing support for this lets us use the gallium blitter for multisample resolves. We can sometimes multisample resolve using just the RCL, but that requires that the blit is 1:1, unflipped, and aligned to tile boundaries.	2015-12-08 09:49:55 -08:00
Eric Anholt	a97b40dca4	vc4: Add support for multisample framebuffer operations. This includes GL_SAMPLE_COVERAGE, GL_SAMPLE_ALPHA_TO_ONE, and GL_SAMPLE_ALPHA_TO_COVERAGE. I haven't implemented a dithering function yet, and gallium doesn't give me a good chance to do so for GL_SAMPLE_COVERAGE.	2015-12-08 09:49:54 -08:00
Eric Anholt	edc3305de7	vc4: Add a workaround for HW-2905, and additional failure I saw with MSAA. I only stumbled on this while experimenting due to reading about HW-2905. I don't know if the EZ disable in the Z-clear is actually necessary, but go with it for now.	2015-12-08 09:49:54 -08:00
Eric Anholt	edfd4d853a	vc4: Add support for drawing in MSAA.	2015-12-08 09:49:53 -08:00
Eric Anholt	e7c8ad0a6c	vc4: Add kernel RCL support for MSAA rendering.	2015-12-08 09:49:53 -08:00
Eric Anholt	568d3a8e32	vc4: Rename color_ms_write to color_write. I was thinking this was the only MSAA resolve thing, so it should be noted separately, but actually load/store general also do MSAA resolve.	2015-12-08 09:49:52 -08:00
Eric Anholt	bf92017ace	vc4: Allow RCL blits to the edge of the surface. The recent unaligned fix successfully prevented RCL blits that weren't aligned inside of the surface, but we also want to be able to do RCL blits for the whole surface when the width or height of the surface aren't aligned (we don't care what renders inside of the padding).	2015-12-08 09:49:52 -08:00
Eric Anholt	fb4877dbab	vc4: Add disabled debug printf for describing blits. I keep typing variants of this while debugging RCL blits for MSAA.	2015-12-08 09:49:51 -08:00
Eric Anholt	2792d118f1	vc4: Fix check for tile RCL blits with mismatched y. This was a typo in `3a508a0d94` that didn't show up in testcases at that moment.	2015-12-08 09:49:51 -08:00
Eric Anholt	1529f138ff	vc4: Fix compiler warning from size_t change. I missed this when bringing over the kernel changes.	2015-12-08 09:49:50 -08:00
Olivier Pena	a5256012ef	scons: support for LLVM 3.7. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-08 13:53:31 +00:00
Dave Airlie	bd47fcd57b	docs/GL3.txt: consolidate r600 GL4.1. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-08 20:13:14 +10:00
Jason Ekstrand	18069dce4a	i965: Make uniform offsets be in terms of bytes This commit makes uniform offsets be in terms of bytes starting with nir_lower_io. They get converted to be in terms of vec4s or floats when we cram them in the UNIFORM register file but reladdr remains in terms of bytes all the way down to the point where we lower it to a pull constant load. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:51:23 -08:00
Jason Ekstrand	813f0eda8e	i965/nir_uniforms: Replace comps_per_unit with an is_scalar boolean Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:51:23 -08:00
Jason Ekstrand	22c273de2b	i965/nir: Remove unused indirect handling The one and only place where the FS backend allows reladdr is on uniforms. For locals, inputs, and outputs, we lower it away before the backend ever sees it. This commit gets rid of the dead indirect handling code. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:51:23 -08:00
Jason Ekstrand	abb569ca18	i965/state: Get rid of dword_pitch arguments to buffer functions Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:51:23 -08:00
Jason Ekstrand	05bdc21f84	i965/vec4: Use a stride of 1 and byte offsets for UBOs Cc: "11.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92909 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:51:23 -08:00
Jason Ekstrand	13ad8d03f2	i965/fs: Use a stride of 1 and byte offsets for UBOs Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:51:23 -08:00
Jason Ekstrand	e3e70698c3	i965/vec4: Use byte offsets for UBO pulls on Sandy Bridge Previously, the VS_OPCODE_PULL_CONSTANT_LOAD opcode operated on vec4-aligned byte offsets on Iron Lake and below and worked in terms of vec4 offsets on Sandy Bridge. On Ivy Bridge, we add a new *LOAD_GEN7 variant which works in terms of vec4s. We're about to change the GEN7 version to work in terms of bytes, so this is a nice unification. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:51:23 -08:00
Ben Widawsky	6ef8149bcd	i965: Fix texture views of 2d array surfaces It is legal to have a texture view of a single layer from a 2D array texture; you can sample from it, or render to it. Intel hardware needs to be made aware when it is using a 2d array surface in the surface state. The texture view is just a 2d surface with the backing miptree actually being a 2d array surface. As a result, the previous code would not set the right bit in the surface state since it wasn't considered an array texture. I spotted this early on in debug but brushed it off because it is clearly not needed on other platforms (since they all pass). I have no idea how this works properly on other platforms (I think gen7 introduced the bit in the state, but I am too lazy to check). As such, I have opted not to modify gen7, though I believe the current code is wrong there as well. Thanks to Chris for helping me debug this. v2: Just use the underlying mt's target type to make the array determination. This replaces a bug in the first patch which was incorrectly relying only on non-zero depth (not sure how that had no failures). (Ilia) Cc: Chris Forbes <chrisf@ijw.co.nz> Reported-by: Mark Janes <mark.a.janes@intel.com> (Jenkins) References: https://www.opengl.org/registry/specs/ARB/texture_view.txt Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92609 Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-12-07 18:47:04 -08:00
Nicolai Hähnle	d5a5dbd71f	radeonsi: last_gfx_fence is a winsys fence Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-12-07 21:15:59 -05:00
Ilia Mirkin	f97f755192	nvc0/ir: fix up mul+add -> mad algebraic opt, enable for integers For some reason this has been disabled for integers ever since codegen was merged, despite there being emission code for IMAD. Seems to work. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-07 18:49:28 -05:00
Ilia Mirkin	1d708aacb7	gk110/ir: fix imad sat/hi flag emission for immediate args According to nvdisasm both the immediate and non-imm cases use the same bits. Both of these flags are quite rarely set though. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-07 18:49:28 -05:00
Kenneth Graunke	87a1166310	i965: Add brw_device_info::min_ds_entries field. From the 3DSTATE_URB_DS documentation: "Project: IVB, HSW If Domain Shader Thread Dispatch is Enabled then the minimum number of handles that must be allocated is 10 URB entries." "Project: BDW+ If Domain Shader Thread Dispatch is Enabled then the minimum number of handles that must be allocated is 34 URB entries." When the HS is run in SINGLE_PATCH mode (the only mode we support today), there is no minimum for HS - it's just zero. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-07 14:48:55 -08:00
Chris Forbes	42ca675cc9	i965: Add state bits for tess stages Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-07 14:48:55 -08:00
Chris Forbes	80ea18d1a1	i965: Add backend structures for tess stages Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-07 14:48:55 -08:00
Chris Forbes	5340f37902	i965: Set core tessellation-related limits Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-07 14:48:54 -08:00
Kenneth Graunke	a9e6a56a02	i965: Request lowering of gl_TessLevel* from float[] to vec4s. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-07 14:48:54 -08:00
Kenneth Graunke	7a17356800	i965: Create new files for HS/DS/TE state upload code. For now, this just splits the existing code to disable these stages into separate atoms/files. We can then replace it with real code. v2: Bump the render atoms in this patch so it compiles (in my branch, I'd bumped it in an earlier patch). 61 seems to be the minimum that works, which doesn't match the old value + the number of atoms I added in this patch, so apparently we had some slop before. v3: Actually disable the DS unit on Gen8+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> [v1] Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-12-07 14:48:54 -08:00
Ilia Mirkin	63b850403c	gk104/ir: sampler doesn't matter for txf We actually leave the sampler unset for OP_TXF, which caused the GK104+ logic to treat some texel fetches as indirect. While this works, it's incredibly wasteful. This only happened when the texture was > 0 (since sampler remained == 0). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-07 16:22:54 -05:00
Marek Olšák	32f05fadbb	radeonsi: disable DCC on Stoney Cc: 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-07 22:01:08 +01:00
Sonny Jiang	2618886600	winsys/amdgpu: addrlib - port a Fiji bug fix Fiji: Fixed tiled resource failures Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> v2: fix a compile failure (typo) - Marek	2015-12-07 21:58:42 +01:00
Sonny Jiang	338d7bf053	winsys/amdgpu: addrlib - port Checks mip 0 for czDispCompatible Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-07 21:58:42 +01:00
Sonny Jiang	676bc25140	winsys/amdgpu: addrlib - port fix error for workaround for 1D tiling Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-07 21:58:42 +01:00
Christian König	a2c5200a4b	st/va: disable MPEG4 by default v2 The workarounds are too hacky to enable them by default and otherwise MPEG4 doesn't work reliably. v2: add docs/envvars.html, CC stable and fix typos Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> (v1) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1) Cc: "11.1.0" <mesa-stable@lists.freedesktop.org>	2015-12-07 20:34:17 +01:00
Christian König	ca3e2b76c0	st/va: move HEVC functions into separate file v2 v2: actually copy all of it Signed-off-by: Christian König <christian.koenig@amd.com>	2015-12-07 20:34:17 +01:00
Alejandro Piñeiro	3d260cc653	mesa: remove _mesa_tex_target_is_array _mesa_is_array_texture provides the same functionality and: 1. it returns bool instead of GLboolean 2. it's not related to the texture format (texformat.c) 3. the name's a little shorter v2: remove _mesa_tex_target_is_array instead (Brian Paul) Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-07 20:31:20 +01:00
Alejandro Piñeiro	b16e0ff34e	i965: use _mesa_is_array_texture instead of _mesa_tex_target_is_array Both methods provide the same functionality, so one would be removed. v2: use _mesa_is_array_texture and not the other way (Brian Paul) Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-07 20:30:24 +01:00
Ilia Mirkin	db072d2086	gk110/ir: fix imul hi emission with limm arg The elemental demo hits this case. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-07 13:30:17 -05:00
Brian Paul	32a6e081c3	svga: use the debug callback to report issues to the state tracker Use the new debug callback hook to report conformance, performance and fallbacks to the state tracker. The state tracker, in turn, can report these issues to the user via the GL_ARB_debug_output extension. More issues can be reported in the future; this is just a start. v2: remove conditionals around pipe_debug_message() calls since the check is now done in the macro itself. v3: remove unneeded dummy %s substitutions Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>, Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-12-07 08:57:49 -07:00
Brian Paul	5effc3ae74	gallium/util: check callback pointers for non-null in pipe_debug_message() So the callers don't have to do it. v2: also check cb!=NULL in the macro Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-12-07 08:56:51 -07:00
Abdiel Janulgue	b19546abf3	i965: Add defines for gather push constants v2 (Francisco Jerez): - Rename HSW_GATHER_CONSTANTS_RESERVED to HSW_GATHER_POOL_ALLOC_MUST_BE_ONE. - Rename BRW_GATHER_* prefix to HSW_GATHER_CONSTANT_*. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2015-12-07 14:58:12 +02:00
Timothy Arceri	9214664aed	mesa: move GLES checks for SSO input/output validation This function is unfinished; there are a bunch more validation rules that need to be applied here. We will still want to call it for desktop GL; we just don't want to validate precision, so move the ES check to reflect this. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-12-07 21:41:14 +11:00
Timothy Arceri	ad02621854	mesa: move GL_INVALID_OPERATION error to rendering call The validation api doesn't trigger this error so just move it to the code called during rendering. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:41:09 +11:00
Timothy Arceri	4dd096d741	mesa: move pipeline input/output validation inside _mesa_validate_program_pipeline() This allows validation to be done on rendering calls also. Fixes 3 dEQP-GLES31.functional.separate tests. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org>	2015-12-07 21:41:05 +11:00
Timothy Arceri	da1a01361b	glsl: re-validate program pipeline after sampler change Cc: "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> https://bugs.freedesktop.org/show_bug.cgi?id=93180	2015-12-07 21:41:00 +11:00
Dave Airlie	41e82f4f96	r600: apply SIMD workaround to cayman also. At least on ARUBA this is required to stop tessellation hanging in heaven. This removes one of the SIMDs from use by the HS/LS. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Tested-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 18:57:34 +10:00
Dave Airlie	6bf6bdbc2b	r600: fix regression introduced with ring emit changes. This was adding one after a CUT which broke end primitive	2015-12-07 05:44:55 +00:00
Dave Airlie	fc276bda22	r600: remove stale tessellation comment pointed out by Marek. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 11:04:48 +10:00
Dave Airlie	5ca9825758	docs: consolidate r600 entry in GL3.txt Though fp64 emulation still needs to be done for a lot of the evergreen hw.	2015-12-07 10:06:44 +10:00
Dave Airlie	7fa2914b06	docs: update with r600 tessellation status. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:02 +10:00
Dave Airlie	33404f1415	r600: enable tessellation for evergreen/cayman (v2) This enables tessellation for evergreen/cayman, This will need changes before committing depending on what hw works etc. working are CAYMAN/REDWOOD/BARTS/TURKS/SUMO/CAICOS v2: only enable on evergreen and above.	2015-12-07 09:59:02 +10:00
Dave Airlie	a2885d9cf9	r600g: reduce number of ps thread on caicos this allows tess apps to start Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:02 +10:00
Dave Airlie	fe64a0c8bf	r600g: adjust ls/hs thread counts for sumo these stop tess hangs here. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:02 +10:00
Dave Airlie	e7ce9e3bb8	r600/asm: enable nstack check for tess ctrl/eval shaders. This just makes sure they register at least one stack usage frame like vertex shaders. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:02 +10:00
Dave Airlie	bb44c1f036	r600/asm: handle lds read operations. Reads from the queue shouldn't be merged for now, or put in T slots. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:02 +10:00
Dave Airlie	8ec2cb13e5	r600/asm: add LDS ops and barrier to the once per group restriction. LDS ops must be scheduled in X slot, and barrier should be on its own in a group. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:02 +10:00
Dave Airlie	18871ac576	r600: move VGT_VTX_CNT_EN into shader stages atom. This should be enabled for tessellation shaders as well. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:02 +10:00
Dave Airlie	958d617d98	r600: enable tcs/tes dumping for R600_DUMP_SHADERS. Trivial patch just to enable dumping more. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	b8df7d03c8	r600: handle SIMD allocation issue with HS/LS At least one SIMD must be kept away from the HS/LS stages in order to avoid a hw issue on evergreen/cayman. This patch implements this workaround. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	7b5878ee04	r600/shader: increase number of inputs/outputs to 64. Tessellation exceeds these sometimes, so increase them for now. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Edward O'Callaghan	22058f69fb	r600: handle barrier opcode. This handles the barrier opcode for EG/CM. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	9662a43d23	r600/shader: handle tess related system-values. This adds handling for TESSINNER/TESSOUTER in the TES where they need to be fetched from LDS, and TESSCOORD which comes in via r0. It also handles primitive ID and invocation ID. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	92fbf856f4	r600/shader: allow multi-dimension arrays for tcs/tes inputs/outputs. This just allows multi-dim arrays to be processed. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	30d56d1c00	r600/shader: handle TES exports and streamout. When tessellation is enabled, the TES shader is responsible for handling streamout and exports. This adds the streamout and export workarounds to TES, and also makes sure TES sets up spi_sid. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	2239f3eaff	r600/shader: emit tessellation factors to GDS at end of TCS. When we are finished with the shader, we read back all the tess factors from LDS and write them to special global memory storage using GDS instructions. This also handles adding NOP when GDS or ENDLOOP end the TCS. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	cfc2818e23	r600/shader: handle TCS output writing. TCS outputs whenever they are written in the shader, need to be written to LDS not temporaries, this handles this case. It also fixes up the case where the output is a relative addressed output, so we don't try to apply the relative address at the wrong time. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	892cc65fa3	r600/shader: handle VS shader writing to the LDS outputs. (v1.1) This writes the VS shaders outputs to the LDS memory in the correct places. v1.1: use 24-bit Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	8b2024196f	r600/shader: handle fetching tcs/tes inputs and tcs outputs This handles the logic for doing fetches from LDS for TCS and TES. For TCS we need to fetch both inputs and outputs, for TES only inputs need to be fetched. v2: use 24-bit ops. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	4477be2404	r600/shader: add get_lds_offset0 helper This retrieves the offset into the LDS for a patch or non-patch variable, it takes the RelPatch channel and a temporary register. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	2a9639e41f	r600/shader: add function to get tess constants info This function retrieves the tess input/output info from the tess constant buffer that is bound to the shader. This uses a vfetch to get the values into the shader. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	0696ebc899	r600/shader: add utility functions to do single slot arithmetic These utilities are to be used to do things like integer adds and multiplies to be used in calculating the LDS offsets etc. It handles CAYMAN MULLO differences as well. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	09d25a9b37	r600/eg: workaround bug with tess shader and dynamic GPRs. When using tessellation on eg/ni chipsets, we must disable dynamic GPRs to workaround a hw bug where the GPU hangs when too many things get queued. This implements something like the r600 code to emit the transition between static and dynamic GPRs, and to statically allocate GPRs when tessellation is enabled. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	d87f54f225	r600/shader: move get_temp and last_instruction helpers up These are required for tess to be used earlier. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:01 +10:00
Dave Airlie	7933ba4d9c	r600: bind geometry shader ring to the correct place When tess/gs are enabled, the geom shader ring needs to bind to the tess eval not the vertex shader. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	e3ecc28e99	r600: create fixed function tess control shader fallback. If we have no tess control shader, then we have to use a fallback one that just writes the tessellation factors. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	731ff3766f	r600: create LDS info constants buffer and write LDS registers. (v2) This creates a constant buffer with the information about the layout of the LDS memory that is given to the vertex, tess control and tess evaluation shaders. This also programs the LDS size and the LS_HS_CONFIG registers, on evergreen only. v2: calculate lds hs num waves properly (Marek) Emit the state only when something has changed (airlied). Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	38b5ee4796	r600/eg: update shader stage emission/tf param for tess. This updates the setting of the shader stages register when tess is enabled and adds the setting of the VGT_TF_PARAM register from the tess shader properties. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	8874725c84	r600: hook TES/TCS shaders to the selection logic. This hooks the TES/TCS bindings to the HW stages up. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	79d88afd5c	r600: work out bitmask for the used tcs inputs/outputs. This is used later to set up the constants to be given to the tessellation shaders. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	839dae0dc0	r600: port over the get_lds_unique_index from radeonsi On r600 this needs to subtract 9 due to texcoord interactions. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	420afe06d1	r600: add set_tess_state callback. This just stores the values in the context to be used later when emitting the constant buffers. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	7db24b740c	r600/eg: init tess registers to defaults (v1.1) This initialises the tess min/max using fglrx values, and also initialises a number of other registers related to tessellation. v1.1: caicos doesn't have some registers. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	25f96c1120	r600: hook up constants/samplers/sampler view for tessellation This hooks the resources to the correct hw shaders when tess is enabled. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	9f86741863	r600: add create/bind/delete shader hooks for tessellation This hooks up the gallium API for the tessellation shaders. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	797012bb67	r600/sb: add LS/HS hw shader types. This just adds printing for the hw shader types, and hooks it up. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	382e2a2901	r600/blit: add tcs/tes shader saves. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	bdf7dadda8	r600: disable SB for now on tess related shaders. Note we have to disable on vertex shaders when we are operating in tes mode. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	8849867b8a	r600: update correct hw shaders depending on configuration. This updates the tess hw shaders from the sw ones routing things correctly. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:59:00 +10:00
Dave Airlie	b1da110b71	r600: add shader key entries for tcs and tes. with tessellation vs can now run on ls, and tes can run on vs or es, tcs runs on hs. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	a131ac73e6	r600: add PATCHES to the pipe conversion. This just converts the value to the hw value. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	0b08a8ade6	r600: add functions to update ls/hs state. This just adds the two functions, these will get hooked up later in the shader code. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Glenn Kennard	b2fa64b161	r600g/sb: Support LDS ops in SB bytecode I/O This just adds the LDS ops to the SB bytecode reader/writers. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	816bb30245	r600: add support for LDS instruction encoding. These are used in tessellation shaders to read/write values between VS/TCS/TES. This splits the eg alu assembler out to handle these instructions. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	fe4eb49df9	r600/sb: add support for GDS to the sb decoder/dump. (v1.1) This just adds support to the decoder, not actual SB support. v1.1: fixup GDS relative mode. (Glenn). Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	2b25d9ac7f	r600: add support for GDS clause to the assembler. This just adds enough for the tessellation shaders, which require TF_WRITE to work. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	4f83184eff	r600: use macros for updating the various stages. These macros will make things easier to see when tess is added to the mix. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	85131a5490	r600: add SET_NULL_SHADER macro. This is used to set a hw shader to NULL. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	f395ed8d4c	r600: move clip misc and streamout stream updates to a single place This will be updated in a macro later. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	8a0e21fc5a	r600: move selecting shaders into earlier code. select the ps/gs/vs in that order then process the results. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	3a7232a9a9	r600: use a macro to remove common shader selection code. This function is going to get a lot messier with tessellation so I'm going to use some macros to try and clean some bits of common code up. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	19799a5928	r600: move to using hw stages array for hw stage atoms This moves to using an array of hw stages for the atoms. Note this drops the 23 from the vertex shader, this value is calculated internally when shaders are bound, so not required here. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	bb2b8778cb	r600: make adjust_gprs use hw stages. This changes the r600 specific GPR adjustment code to use the stage defines, and arrays. This is prep work for the tess changes later. Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:59 +10:00
Dave Airlie	d1b90839c0	r600: introduce HW shader stage defines Add a list of defines for the HW stages. We will use this for GPR calculations amongst other things. Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:58 +10:00
Dave Airlie	bd71f3e4fe	r600: fix masks for two of the unused evergreen regs. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-07 09:58:58 +10:00
Edward O'Callaghan	d108b69d2c	gallium: Remove redundant NULL ptr checks Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:10:23 +01:00
Edward O'Callaghan	13eb5f596b	gallium/drivers: Sanitize NULL checks into canonical form Use NULL tests of the form `if (ptr)' or `if (!ptr)'. They do not depend on the definition of the symbol NULL. Further, they provide the opportunity for the accidental assignment, are clear and succinct. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:10:23 +01:00
Edward O'Callaghan	150c289f60	gallium/auxiliary: Sanitize NULL checks into canonical form Use NULL tests of the form `if (ptr)' or `if (!ptr)'. They do not depend on the definition of the symbol NULL. Further, they provide the opportunity for the accidental assignment, are clear and succinct. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:10:23 +01:00
Edward O'Callaghan	147fd00bb3	gallium/auxiliary: Trivial code style cleanup Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:10:22 +01:00
Edward O'Callaghan	25b3d554c4	gallium/drivers: Trivial code-style cleanup Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:10:22 +01:00
Edward O'Callaghan	34782eec31	gallium/auxiliary: Fix zero integer literal to pointer comparison Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:10:02 +01:00
Edward O'Callaghan	3edae10601	winsys/amdgpu: Make use of ARRAY_SIZE macro Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:09:54 +01:00
Edward O'Callaghan	82871081fc	svga: Make use of ARRAY_SIZE macro Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:09:52 +01:00
Edward O'Callaghan	70d2d3ef7f	llvmpipe: Make use of ARRAY_SIZE macro Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:09:47 +01:00
Edward O'Callaghan	be51020f2a	gallium/drivers/nouveau: Make use of ARRAY_SIZE macro Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 17:03:17 +01:00
Edward O'Callaghan	7e43a28079	gallium/radeon*: Remove useless casts These are unnecessary and are likely just left overs from prior work. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-06 11:52:16 +01:00
Ilia Mirkin	0ef5c8ab74	nv50/ir: fold shl + mul with immediates On SM20 this gives: total instructions in shared programs : 6299222 -> 6294240 (-0.08%) total gprs used in shared programs : 944139 -> 944068 (-0.01%) total local used in shared programs : 54116 -> 54116 (0.00%) local gpr inst bytes helped 0 126 2781 2781 hurt 0 55 11 11 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-05 18:56:43 -05:00
Ilia Mirkin	abd326e81b	nv50/ir: propagate indirect loads into instructions This way $r1 = $r0 + 4; c1[$r1] becomes c1[$r0+4]. On SM35: total instructions in shared programs : 6206257 -> 6185058 (-0.34%) total gprs used in shared programs : 911045 -> 910722 (-0.04%) total local used in shared programs : 39072 -> 39072 (0.00%) local gpr inst bytes helped 0 417 4195 4195 hurt 0 280 0 0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-05 17:50:23 -05:00
Ilia Mirkin	31fde8faba	nv50/ir: flip shl(add, imm) into add(shl, imm) This works when the add also has an immediate. This often happens in address calculations. These addresses can then be inlined as well. On code targeted to SM35: total instructions in shared programs : 6223346 -> 6206257 (-0.27%) total gprs used in shared programs : 911075 -> 911045 (-0.00%) total local used in shared programs : 39072 -> 39072 (0.00%) local gpr inst bytes helped 0 119 3664 3664 hurt 0 74 15 15 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-05 17:50:23 -05:00
Eric Anholt	a4eff86f4a	vc4: Fix accidental scissoring when scissor is disabled. Even if the rasterizer has scissor disabled, we'll have whatever vc4->scissor bounds were last set when someone set up a scissor, so we shouldn't clip to them in that case. Fixes piglit fbo-blit-rect, and a lot of MSAA tests once they're enabled.	2015-12-05 13:12:27 -08:00
Eric Anholt	d16d666776	vc4: Disable RCL blitting when scissors are enabled. We could potentially handle scissored blits when they're tile aligned, but it doesn't seem worth it. If you're doing a scissored blit, you're probably a testcase. Fixes piglit's fbo-scissor-blit fbo	2015-12-05 13:12:27 -08:00
Eric Anholt	0afe83078d	vc4: Bring over cleanups from submitting to the kernel.	2015-12-05 13:12:27 -08:00
Samuel Pitoiset	9f6ff76fdc	nvc0: expose a group of performance metrics for SM30 (Kepler) This allows monitoring these performance metrics through GL_AMD_performance_monitor. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-05 19:23:34 +01:00
Samuel Pitoiset	0afd8f7bd7	nvc0: re-introduce performance metrics for SM30 (Kepler) This implements more performance metrics than the previous support, but some other metrics still need to be figured out. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-05 19:23:34 +01:00
Samuel Pitoiset	af275b8839	nvc0: remove useless counting operations for MP counters Those bits were related to old performance metrics support. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-05 19:23:34 +01:00
Samuel Pitoiset	6667355d4b	nvc0: remove old performance metrics support on Kepler These performance metrics will be re-introduced in an upcoming patch that will follow the same design as Fermi. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-05 19:23:34 +01:00
Samuel Pitoiset	662eb434ee	nvc0: remove wrong inst_issued HW SM perf counter on Kepler inst_issued is a performance metric, not a hardware event, on Kepler (SM30). It will be re-introduced in an upcoming patch. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-05 19:23:34 +01:00
Samuel Pitoiset	342ea31193	nvc0: add missing HW SM perf counters for SM30 (Kepler) SM30 is the compute capability version for GK104/GK106/GK107. This also introduces a new signal group selection called UNK0F. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-05 19:23:34 +01:00
Samuel Pitoiset	7f42688017	nvc0: fix the comment that describes MP counters storage on Kepler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-12-05 19:23:34 +01:00
Rob Clark	58efff89a2	freedreno/ir3: nir shader prints with 'disasm' debug option Move these to 'disasm' instead of the more verbose 'optmsgs' since, like the tgsi dumps, it is useful without the more verbose compiler logging enabled. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-05 08:48:19 -05:00
Ilia Mirkin	a3f90ef0a6	gallium/util: fix pipe_debug_message macro to allow 0 args Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-12-04 15:24:17 -05:00
Eric Anholt	a69ac4e89c	vc4: Add debug dumping of MSAA surfaces.	2015-12-04 09:24:36 -08:00
Eric Anholt	3c3b1184eb	vc4: Add support for laying out MSAA resources. For MSAA, we store full resolution tile buffer contents, which have their own tiling format. Since they're full resolution buffers, we have to align their size to full tiles.	2015-12-04 09:24:36 -08:00
Eric Anholt	74c4b3b80c	vc4: Add support for storing sample mask. From the API perspective, writing 1 bits can't turn on pixels that were off, so we AND it with the sample mask from the payload.	2015-12-04 09:23:55 -08:00
Eric Anholt	3a508a0d94	vc4: Fix up tile alignment checks for blitting using just an RCL. We were checking that the blit started at 0 and was 1:1, but not that it went to the full width of the surface, or that the width was aligned to a tile. We then told it to blit to the full width/height of the surface, causing contents to be stomped in a bunch of MSAA tests that happen to include half-screen-width blits to 0,0.	2015-12-04 09:10:53 -08:00
Eric Anholt	a664233042	vc4: Add support for loading sample mask.	2015-12-04 09:10:53 -08:00
Rob Clark	4b18d51756	freedreno/ir3: convert scheduler back to recursive algo I've played with a few different approaches to tweak instruction priority according to how much they increase/decrease register pressure, etc. But nothing seems to change the fact that compared to original (pre-multiple-block-support) scheduler, in some edge cases we are generating shaders w/ 5-6x higher register usage. The problem is that the priority queue approach completely loses the dependency between instructions, and ends up scheduling all paths at the same time. Original reason for switching was that recursive approach relied on starting from the shader outputs array. But we can achieve more or less the same thing by starting from the depth-sorted list. shader-db results: total instructions in shared programs: 113350 -> 105183 (-7.21%) total dwords in shared programs: 219328 -> 211168 (-3.72%) total full registers used in shared programs: 7911 -> 7383 (-6.67%) total half registers used in shader programs: 109 -> 109 (0.00%) total const registers used in shared programs: 21294 -> 21294 (0.00%) half full const instr dwords helped 0 322 0 711 215 hurt 0 163 0 38 4 The shaders hurt tend to gain a register or two. While there are also a lot of helped shaders that only lose a register or two, the more complex ones tend to lose significantly more registers used. In some more extreme cases, like glsl-fs-convolution-1.shader_test it is more like 7 vs 34 registers! Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-04 10:27:09 -05:00
Rob Clark	ad2cc7bddc	freedreno/ir3: don't reuse a0.x across blocks It causes confusion in sched if we need to split_addr() since otherwise we wouldn't easily know which block the new addr instr will be scheduled in. So just side-step the whole situation. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-04 10:27:09 -05:00
Rob Clark	8e52344dc1	freedreno/ir3: rename ir3_block::bd We'll need to add similar for ir3_instruction, but following the pattern to use 'id' seems confusing. Let's just go w/ generic 'data' as the name. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-12-04 10:27:09 -05:00
Giuseppe Bilotta	d566382a98	util: fix comment typo Undefining the NDEBUG is relevant for release build, as they are the ones that set it. [Emil Velikov: split from previous patch] Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-12-04 14:06:41 +00:00
Giuseppe Bilotta	efaac624af	xvmc: force assertion in XvMC tests This follows the src/util/u_atomic_test.c model of undefining NDEBUG unconditionally throughout the XvMC tests, to force asserts regardless of debug mode. The comment on u_atomic_test.c is also fixed (read 'debug' where it should have been 'release'). v2: s/debug/release/ in relevant comments Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> [Emil Velikov: keep the src/util/ hunk as separate patch] Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-12-04 14:06:41 +00:00
Giuseppe Bilotta	4839353634	radeon: const correctness Add missing `const` specifier for pointer pointing to a const struct. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-12-04 14:06:41 +00:00
Giuseppe Bilotta	d61802b5e0	radeon: whitespace cleanup Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-12-04 14:06:38 +00:00
Emil Velikov	1074e38fbb	mesa/tests: add KHR_debug GLES glGetPointervKHR entry points Should have been part of commit `f53f9eb8d4` "glapi: add GetPointervKHR to the ES dispatch". v2: comment out the ES1.1 symbol and use the same description (pattern) as elsewhere (Matt) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93235 Fixes: `f53f9eb8d4` "glapi: add GetPointervKHR to the ES dispatch". Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Vinson Lee <vlee@freedesktop.org> (v1) Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-04 13:56:43 +00:00
Jason Ekstrand	b715e6d528	i965/vec4: Stop pretending to support indirect output stores Since we're using nir_lower_outputs_to_temporaries to shadow all our outputs, it's impossible to actually get an indirect store. The code we had to "handle" this was pretty bogus as it created a register with a reladdr and then stuffed it in a fixed varying slot without so much as a MOV. Not only does this not do the MOV, it also puts the indirect on the wrong side of the transaction. Let's just delete the broken dead code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-03 20:58:12 -08:00
Jason Ekstrand	aa35b0c2c7	i965/vec4: Get rid of the nir_inputs array It's not really buying us anything at this point. It's just a way of remapping one offset namespace onto another. We can just use the location namespace the whole way through. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-03 20:58:12 -08:00
Jason Ekstrand	c6bcc23369	nir/lower_io: Pass the builder and type_size into get_io_offset Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-03 20:58:12 -08:00
Ilia Mirkin	204f803ce0	nv50/ir: replace zeros in movs as well The original change to put zeroes directly into instructions created conditional mov's with the zero immediate. However that can't be emitted, so make sure to replace the zero with r63. Fixes: `52a800a68` (nv50/ir: allow immediate 0 to be loaded anywhere) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-03 23:46:02 -05:00
Ilia Mirkin	a3722b81f5	nv50/ir: fold fma/mad when all 3 args are immediates This happens pretty rarely, but might as well do it when it does. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-03 23:02:57 -05:00
Ilia Mirkin	2b98914fe0	nv50/ir: avoid looking at uninitialized srcMods entries Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-03 23:02:57 -05:00
Ilia Mirkin	49692f86a1	nv50/ir: fix DCE to not generate 96-bit loads A situation where there's a 128-bit load where the last component gets DCE'd causes a 96-bit load to be generated, which no GPU can actually emit. Avoid generating such instructions by scaling back to 64-bit on the first load when splitting. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-03 23:02:57 -05:00
Roland Scheidegger	51140f452a	draw: fix clipping of layer/vp index outputs This was just plain broken. It used always the value from v0 (for vp_index) but would pass the value from the provoking vertex to later stages - but only if there was a corresponding fs input, otherwise the layer/vp index would get lost completely (as it would try to interpolate the (unsigned) values as floats). So, make it obey provoking vertex rules (drivers relying on draw will need to do the same). And make sure that the default interpolation mode (when no corresponding fs input is found) for them is constant. Also, change the code a bit so constant inputs aren't interpolated then copied over later. Fixes the new piglit test gl-layer-render-clipped. v2: more consistent whitespaces fixes for function defs, and more tab killing (overall still not quite right however). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-04 03:42:19 +01:00
Roland Scheidegger	5ea5b169e9	softpipe: use provoking vertex for layer Same as for llvmpipe, albeit softpipe only really handles multiple layers, not multiple viewports/scissors. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-04 03:42:19 +01:00
Roland Scheidegger	ddaf8d7b10	llvmpipe: use provoking vertex for layer/viewport d3d10 actually requires using provoking (first) vertex. GL is happy with any vertex (as long as we say it's undefined in the corresponding queries). Up to now we actually used vertex 0 for viewport index, and vertex 1 for layer (for tris), which really didn't make sense (probably a typo). Also, since we reorder vertices of clockwise triangle, that actually meant we used a different vertex depending if the triangle was cw or ccw (still ok by gl). However, it should be consistent with what draw (clip) does, and using provoking vertex seems like the sensible choice (draw clip will be fixed next as it is totally broken there). While here, also use the correct viewport always even when not needed in setup (we pass it down to jit fragment shader it might be needed there for getting correct near/far depth values). No piglit changes. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-04 03:42:19 +01:00
Eric Anholt	83e65ca831	vc4: Add the RCL to CL debug dumping when in simulator mode. We can't dump it in the real driver, since the kernel doesn't give us a handle to it (except after a GPU hang, using a root ioctl). In the simulator we can.	2015-12-03 18:20:39 -08:00
Marek Olšák	dd27825c8c	radeonsi: fix Fiji for LLVM <= 3.7 Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-03 23:55:23 +01:00
Marek Olšák	bfc14796b0	radeonsi: fix occlusion queries on Fiji Tested.	2015-12-03 23:46:37 +01:00
Marek Olšák	0b03f2def0	radeonsi: dump init_config IBs Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Marek Olšák	3a6de8c86e	radeonsi: print framebuffer info into ddebug logs Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Marek Olšák	a0bfb2798d	gallium/radeon: print more info about HTILE Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Marek Olšák	1cca259d99	gallium/radeon: print more info about CMASK Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Marek Olšák	84fbb0aff9	gallium/radeon: rename fmask::pitch -> pitch_in_pixels Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Marek Olšák	19eaceb6ed	gallium/radeon: print more information about textures Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Marek Olšák	2d712d35c5	gallium/radeon: move printing texture info into a separate function Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Marek Olšák	c60d49161e	gallium/radeon: remove unused r600_texture::pitch_override Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Marek Olšák	75d64698f0	gallium/radeon: remove DBG_TEXMIP we don't need 2 flags for dumping texture info Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-03 23:41:23 +01:00
Edward O'Callaghan	a5055e2f86	gallium/aux/util: Trivial, we already have format use it No need to dereference again, fixup for clarity. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-03 23:41:23 +01:00
Jose Fonseca	5294debfa4	automake: Fix typo in MSVC2008 compat flags. It should be MSVC2008_COMPAT_CFLAGS and not MSVC2008_COMPAT_CXXFLAGS. This is why the recent util_blitter breakage went unnoticed on autotools builds. Trivial.	2015-12-03 22:00:49 +00:00
Jose Fonseca	071af9a511	ttn: Whitelist from -Werror=declaration-after-statement. nir is the exception among gallium/auxiliary -- we don't need to compile it with MSVC2008 yet. And this enables us to use -Werror=declaration-after-statement in the next commit as we should, without complicated fixes to tgsi_to_nir module. Trivial. Tested with GCC and Clang.	2015-12-03 22:00:49 +00:00
Emil Velikov	5a23f6bd8d	mesa: rework the meaning of gl_debug_message::length Currently it stores strlen(buf) whenever the user originally provided a negative value for length. Although I've not seen any explicit text in the spec, CTS requires that the very same length (be that negative value or not) is returned back on Pop. So let's push down the length < 0 checks, tweak the meaning of gl_debug_message::length and fix GetDebugMessageLog to add and count the null terminators, as required by the spec. v2: return correct total length in GetDebugMessageLog v3: rebase (drop _mesa_shader_debug hunk). Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 19:21:19 +00:00
Emil Velikov	622186fbdf	mesa: errors: validate the length of null terminated string We're about to rework the meaning of gl_debug_message::length to only store the user provided data. Thus we should add an explicit validation for null terminated strings. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 19:21:08 +00:00
Emil Velikov	66fea8bd96	mesa: accept TYPE_PUSH/POP_GROUP with glDebugMessageInsert These new (relative to ARB_debug_output) tokens have been explicitly separated from the existing ones in the spec text, with the reference to glDebugMessageInsert dropped. At the same time, further down the spec says: "The value of <type> must be one of the values from Table 5.4" ... and these two are listed in Table 5.4. The GL 4.3 and GLES 3.2 do not give any hints on the former 'definition', plus CTS requires that the tokens are valid values for glDebugMessageInsert. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 19:21:08 +00:00
Emil Velikov	53be28107b	mesa: add SEVERITY_NOTIFICATION to default state As per the spec quote: "All messages are initially enabled unless their assigned severity is DEBUG_SEVERITY_LOW" We already had MEDIUM and HIGH set, let's toggle NOTIFICATION as well. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 19:21:07 +00:00
Emil Velikov	078dd6a0b4	mesa: return the correct value for GroupStackDepth We already have one group (the default) as specified in the spec. So let's return its size, rather than the index of the current group. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 19:20:58 +00:00
Emil Velikov	f39954bf7c	mesa: rename GroupStackDepth to CurrentGroup The variable is used as the actual index, rather than the size of the group stack - rename it to reflect that. Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 19:17:48 +00:00
Emil Velikov	1ca735701b	mesa: do not enable KHR_debug for ES 1.0 The extension requires (cough implements) GetPointervKHR (alias of GetPointerv) which in itself is available for ES 1.1 enabled mesa. Anyone willing to fish around and implement it for ES 1.0 is more than welcome to revert this commit. Until then let's restrict things. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93048 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 19:17:48 +00:00
Emil Velikov	f53f9eb8d4	glapi: add GetPointervKHR to the ES dispatch The KHR_debug extension implements this. Strictly speaking it could be used with ES 1.0, although as the original function is available on ES 1.1, I'm inclined to lift the KHR_debug requirement to ES 1.1. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93048 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 19:17:48 +00:00
Nanley Chery	808e752796	mesa/version: Update gl_extensions::Version during version override Commit `a16ffb743c`, which introduced gl_extensions::Version, updates the field when the context version is computed and when entering/exiting meta. Update this field when the version is overridden as well. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-12-03 10:20:34 -08:00
Brian Paul	a0f1bc18e5	mesa: print enum names rather than hexadecimal values in error messages Trivial.	2015-12-03 09:40:43 -07:00
Brian Paul	72a913ceb8	st/wgl: add new stw_ext_rendertexture.c file This should have been included in the previous commit. Signed-off-by: Brian Paul <brianp@vmware.com>	2015-12-03 09:33:55 -07:00
Brian Paul	e832b5b7fa	st/wgl: add support for WGL_ARB_render_texture There are a few legacy OpenGL apps on Windows which need this extension. We basically use glCopyTex[Sub]Image to implement wglBindTexImageARB (see the implementation notes for details). v2: refactor code to use st_copy_framebuffer_to_texture() helper function. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-12-03 09:12:20 -07:00
Brian Paul	47b9ef872b	st/mesa: add new st_copy_framebuffer_to_texture() function This helper is used by the WGL state tracker to implement the wglBindTexImageARB() function. This is basically a new "meta" function. However, we're not putting it in the src/mesa/drivers/common/ directory because that code is not linked with gallium-based drivers. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-12-03 08:34:24 -07:00
Juha-Pekka Heikkila	d6d90750f1	glsl: remove useless null checks and make match_explicit_outputs_to_inputs() static match_explicit_outputs_to_inputs() cannot get null inputs, and if it ever did, triggering the first null check would cause a segfault later in the function. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> CC: timothy.arceri@collabora.com Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-12-03 10:56:35 +02:00
Tapani Pälli	231db5869c	i965: use _Shader to get fragment program when updating surface state Atomic counters and Images were using ctx::Shader that does not take into account program pipeline changes, ctx::_Shader must be used for SSO to work. Commit `c0347705` already changed UBOs to use this. Fixes failures seen with following Piglit test: arb_separate_shader_object-atomic-counter Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-03 08:08:07 +02:00
Ilia Mirkin	6c6f28c35e	nv50/ir: fix moves to/from flags Noticed this when looking at a trace that caused flags to spill to/from registers. The flags source/destination wasn't encoded correctly according to both envydis and nvdisasm. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-02 20:41:38 -05:00
Ilia Mirkin	101e315cc1	nv50/ir: don't forget to mark flagsDef on cvt in txb lowering Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-02 20:41:38 -05:00
Ilia Mirkin	06055121e6	nv50/ir: fix instruction permutation logic Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-02 20:41:38 -05:00
Ilia Mirkin	11fcf46590	nv50/ir: the mad source might not have a defining instruction For example if it's $r63 (aka 0), there won't be a definition. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-02 20:41:37 -05:00
Ilia Mirkin	52b68375ae	nv50/ir: make sure entire graph is reachable The algorithm expects the entire CFG to be reachable, so make sure that we hit every node. Otherwise we will end up with uninitialized data, memory corruption, etc. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-02 18:51:15 -05:00
Ilia Mirkin	adcc547bfb	nv50/ir: deal with loops with no breaks For example if there are only returns, the break bb will not end up part of the CFG. However there will have been a prebreak already emitted for it, and when hitting the RET that comes after, we will try to insert the current (i.e. break) BB into the graph even though it will be unreachable. This makes the SSA code sad. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-02 18:51:15 -05:00
Ilia Mirkin	ff61ac4838	nvc0/ir: fold postfactor into immediate SM20-SM50 can't emit a post-factor in the presence of a long immediate. Make sure to fold it in. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-12-02 18:51:15 -05:00
Ilia Mirkin	52a800a687	nv50/ir: allow immediate 0 to be loaded anywhere There's a post-RA fixup to replace 0's with $r63 (or $r127 if too many regs are used), so just as nvc0, let an immediate 0 be loaded anywhere. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-02 18:51:15 -05:00
Kenneth Graunke	043d427538	i965: Add INTEL_DEBUG=perf information for GS recompiles. Surprisingly, this didn't exist at all. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-12-02 15:23:01 -08:00
Kenneth Graunke	b6d4f051a5	i965: De-duplicate key_debug() function. This appeared in brw_vs.c and brw_wm.c, should have appeared in brw_gs.c, and was soon going to have to be in brw_tcs.c and brw_tes.c as well. So, instead, move it to a central location (which has to know about both struct brw_context and perf_debug()). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-12-02 15:22:58 -08:00
Samuel Pitoiset	8482763d35	nv50/ir/gk110: add memory barriers support for GK110 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-02 22:44:53 +01:00
Samuel Pitoiset	c672bf3b04	nv50/ir: do not call textureMask() for surface ops That texture mask thing doesn't seem to be needed for surface ops, so just as nve4+, let's do that only for texture ops. This fixes a segfault with 'test_surface_st' from gallium/tests/trivial/compute.c on Fermi because this test uses sustp. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-02 22:10:44 +01:00
Jose Fonseca	9e6af56666	appveyor: Initial integration. AppVeyor doesn't require an appveyor.yml in the repos (in fact it has some limitations as noted in comments below), but doing so has two great advantages over the web UI: - appveyor.yml can be revisioned together with the code, so instructions should always be in sync with the code - appveyor.yml can be reused for people's private repositories (be it on fdo or GitHub, etc.) Acked-by: Roland Scheidegger <sroland@vmware.com>	2015-12-02 19:40:53 +00:00
Jose Fonseca	4a3d388834	util/blitter: Fix "ISO C90 forbids mixed declarations and code". Trivial.	2015-12-02 17:49:20 +00:00
Brian Paul	d31065cbf6	mesa: print enum string in compressed_subtexture_error_check() error msg Trivial.	2015-12-02 10:28:15 -07:00
Edward O'Callaghan	772f429f0a	gallium/util: Fix util_blitter_clear_depth_stencil() for num_layers>1 Previously util_blitter_clear_depth_stencil() could not clear more than the first layer. We need to generalise this as we did for util_blitter_clear_render_target(). Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-02 18:23:43 +01:00
Edward O'Callaghan	8f2c5e281d	gallium/util: Fix util_blitter_clear_render_target() for num_layers>1 Previously util_blitter_clear_render_target() could not clear more than the first layer. We need to generalise this so that ARB_clear_texture can pass the 3d piglit test. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-12-02 18:23:43 +01:00
Roland Scheidegger	09f74e6ef4	mesa: fix VIEWPORT_INDEX_PROVOKING_VERTEX and LAYER_PROVOKING_VERTEX queries These are implementation-dependent queries, but so far we just returned the value of whatever the current provoking vertex convention was set to, which was clearly wrong. Just make this a variable in the context constants like for other things which are implementation dependent (I assume all drivers will want to set this to the same value for both queries), and set it to GL_UNDEFINED_VERTEX which is correct for everybody (and drivers can override it). Reviewed-by: Brian Paul <brianp@vmware.com> CC: <mesa-stable@lists.freedesktop.org>	2015-12-02 18:20:57 +01:00
Jose Fonseca	56aff6bb4e	Remove Sun CC specific code. Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Alan Coopersmith <alan.coopersmith@oracle.com>	2015-12-02 07:51:04 +00:00
Jose Fonseca	51564f04b7	configure.ac: Refuse to build with Sun C compiler. https://bugs.freedesktop.org/show_bug.cgi?id=93189 Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Alan Coopersmith <alan.coopersmith@oracle.com>	2015-12-02 07:51:04 +00:00
Eric Anholt	18f8da7865	travis: Add a test build with scons. Since I just broke the scons build, I figured I'd make Travis test that I don't break it again in the future. The script runs the builds in parallel across VMs, so it still takes just 5 minutes to turn around results. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-12-01 15:09:56 -08:00
Kenneth Graunke	975b1299dd	i965: Increase BRW_MAX_UBO to 14. The NVIDIA binary driver and Intel's closed source driver both expose 14 here, rather than the GL minimum of 12. Let's follow suit. Without this, Shadow of Mordor fails to render correctly and triggers OpenGL errors: Mesa: User error: GL_INVALID_VALUE in glBindBufferBase(index=68) Mesa: User error: GL_INVALID_VALUE in glUniformBlockBinding(block binding 68 >= 60) There are 5 stages (VS, TCS, TES, GS, FS), and 12 * 5 = 60 is too small. 14 * 5 = 70 will work just fine. Tapani believes this will also help Alien Isolation. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2015-12-01 14:55:33 -08:00
Matt Turner	7e6a6f3e61	i965: Do dead-code elimination in a single pass. The first pass marked dead instructions as opcode = NOP, and a second pass deleted those instructions so that the live ranges used in the first pass wouldn't change. But since we're walking the instructions in reverse order, we can just do everything in one pass. The only thing we have to do is walk the blocks in reverse as well. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-12-01 14:48:55 -08:00
Matt Turner	5a6f0bf5b8	glsl: Rename safe_reverse -> reverse_safe. To match existing foreach_in_list_reverse_safe. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-12-01 14:48:55 -08:00
Matt Turner	48b4e88d3d	i965: Don't mark dead instructions' sources live. Removes dead code from glsl-mat-from-int-ctor-03.shader_test. Reported-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-12-01 14:48:55 -08:00
Dave Airlie	0e49151dcf	r600: set mega fetch count to 16 for gs copy shader Seems like MFC should be set for this shader. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-02 08:25:13 +10:00
Dave Airlie	4ebcf5194d	r600: increment ring index after emit vertex not before. The docs say we should send the emit after the ring writes, so let's do that and not have an ALU in between. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-02 08:25:13 +10:00
Dave Airlie	13b134a443	r600: add alu + cf nop to copy shader on r600 SB suggests we do this for r600, so let's do it for the copy shader. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-02 08:25:13 +10:00
Dave Airlie	af4013d26b	r600: SMX returns CONTEXT_DONE early workaround streamout, gs rings bug on certain r600s, requires a wait idle before each surface sync. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "10.6 11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-02 08:25:00 +10:00
Dave Airlie	b63944e8b9	r600: do SQ flush ES ring rolling workaround Need to insert a SQ_NON_EVENT whenever geometry shaders are enabled. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "10.6 11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-02 08:24:32 +10:00
Samuel Pitoiset	ea33920f7e	nv50,nvc0: allow to create resources other than buffers For the compute support, we might stick buffers as surfaces. This fixes an assertion when executing src/gallium/tests/trivial/compute. To avoid using these "restricted" surfaces as render targets, these assertions have been moved. Note that it's already handled for the framebuffer thing on nvc0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-12-01 22:55:14 +01:00
Brian Paul	f391b95105	glapi: work-around MSVC 65K string length limitation for enums.c String literals cannot exceed 65535 characters for MSVC. Instead of emitting a string, emit an array of characters. v2: fix indentation and add comment in the gl_enums.py file about this ugliness. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 14:28:45 -07:00
Eric Anholt	148c2f5b17	mapi: Fix enums.c build with other build systems. Tested with scons (by both myself and Mark Janes), Android is just copy and paste.	2015-12-01 12:19:02 -08:00
Eric Anholt	1c0ac1976a	travis: Initial import of travis instructions. This just builds/installs our dependencies, and runs "make check". I'm interested in integrating more tests into it, but this seems like a pretty easy first start. If your personal branches of Mesa are on github, you can enable it on your account and the repository (see https://docs.travis-ci.com/user/for-beginners), then any pushes you do will get their HEAD commit tested, and any pull requests to your tree will get their merge commits tested.	2015-12-01 11:08:57 -08:00
Eric Anholt	4922a3ae52	mesa: Drop the blacklisting of new GL enums. Now when people need new extensions, they can skip the entire enum-definition process, and we can stop reviewing new extension XML for its enum content. This also brings in a new enum that I wanted to use in enum_strings.cpp for testing the code generator. v2: Drop comment about disabled GL_1PASS_EXT test. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:42 -08:00
Eric Anholt	b65e44f55d	mesa: Use a 32-bit offset for the enums.c string offset table. With GLES 3.1, GL 4.5, and many new vendor extensions about to get their enums added, we jump up to 85k of table. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:41 -08:00
Eric Anholt	c75cfe1c8a	mesa: Prefer newer names to older ones among names present in core. Sometimes GL likes to rename an old enum when it grows a more general purpose, and we should prefer the new name. Changes from this: GL_POINT/LINE_SIZE_* (1.1) -> GL_SMOOTH_POINT/LINE_SIZE_* (1.2) GL_FOG_COORDINATE_* (1.4) -> GL_FOG_COORD_* (1.5) GL_SOURCE[012]_RGB/ALPHA (1.3) -> GL_SRC0_RGB (1.5) GL_COPY_READ/WRITE_BUFFER (3.1) -> GL_COPY_READ_BUFFER_BINDING (4.2) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:38 -08:00
Eric Anholt	710762b64a	mesa: Drop bitfield "enums" from the enum-to-string table. Asking the table for bitfield names doesn't make any sense. For 0x10, do you want GL_GLYPH_HORIZONTAL_BEARING_ADVANCE_BIT_NV or GL_COLOR_BUFFER_BIT4_QCOM or GL_POLYGON_STIPPLE_BIT or GL_SHADER_GLOBAL_ACCESS_BARRIER_BIT_NV? Giving a useful answer would depend on a whole lot of context. This also fixes a bad enum table entry, where we chose GL_HINT_BIT instead of GL_ABGR_EXT for 0x8000, so we can now fix its entry in the enum_strings test. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:36 -08:00
Eric Anholt	cbabf5f9dc	mesa: Switch to using the Khronos registry for generating enums. I've used a bunch of python code to cut out new enums so that the two generated files can be diffed. I'll remove all that hardcoding in the following commits. All remaining differences between the generated code: - GL_TEXTURE_BUFFER_FORMAT didn't appear in GL3 when TBOs got merged to core, so it now gets an _ARB suffix instead. - Blacklisting can't keep EXT_sso's GL_ACTIVE_PROGRAM_EXT from becoming GL_ACTIVE_PROGRAM -- in our hash table, GL_ACTIVE_PROGRAM_EXT points at the GLES2 enum's value (aka GL_CURRENT_PROGRAM). By not blacklisting the core name, we get both enums translated. - GL_DRAW_FRAMEBUFFER_BINDING and GL_FRAMEBUFFER_BINDING both appeared in GL3 as synonyms, and the new code happens to choose GL_FRAMEBUFFER_BINDING instead. - GL_TEXTURE_COMPONENTS and GL_TEXTURE_INTERNAL_FORMAT both appear in 1.1, and the new code chooses GL_TEXTURE_INTERNAL_FORMAT instead (which seems better, to me) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:34 -08:00
Eric Anholt	f72923aaea	mesa: Remove the python mode bits from gl_enums.py. emacs whines at me every time I open the file about these unsafe variables, and the file was reformatted from 8 space to 4 space long ago. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:31 -08:00
Eric Anholt	741f642a6f	mesa: Drop apparently typoed GL_ALL_CLIENT_ATTRIB_BITS. GL_ALL_ATTRIB_BITS is a thing, and GL_CLIENT_ALL_ATTRIB_BITS, but I don't see GL_ALL_CLIENT_ATTRIB_BITS in my grepping of khronos XML, GL extension specs, GL 1.1, GL 2.2, and GL 4.4. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-12-01 10:24:22 -08:00
Eric Anholt	5cb9dc45c7	mesa: Drop enums that had been removed in later revs of specs. Mesa hasn't been using these enums and the finalized specs don't reference them, so losing them from our generated enum-to-string code should be fine. Reduces diffs to generating from Khronos XML, which has these enums noted defined but commented out from any consumers. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:18 -08:00
Eric Anholt	5a7e5d8bb6	mesa: Fix a typo in AMD_performance_monitor enum. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:16 -08:00
Eric Anholt	bfc64b9688	mesa: Fix enum definition of CULL_VERTEX_EYE/OBJECT_POSITION In converting to using the Khronos XML, I found that our XML had these two swapped, and the text spec agreed. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:15 -08:00
Eric Anholt	76ec0b9038	mesa: Add a copy of the Khronos gl.xml (SVN #31705). The intention here is to keep a pristine copy of the upstream gl.xml that can be updated at any time with a new version, and use that to generate Mesa code from instead of our private XML. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:13 -08:00
Eric Anholt	edc8850436	mesa: Cut enum_strings.cpp test down to a few hand-chosen enums. The previous contents appeared to be the output of some form of code generation for all enums, with a few entries hand-edited to deal with oddness. The downside to this was that when an enum gets promoted from vendor to _EXT or _EXT to _ARB or _ARB to core, make check starts failing even when the committer has done nothing wrong. Instead of black-box testing the code generation, pick a few enums that intentionally poke the interesting cases of code generation. People editing the code generator should be diffing the generated code anyway. This should catch when they fail to do so, without throwing false negatives when people update the GL XML. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-12-01 10:24:02 -08:00
Tom Stellard	9adbb9e713	clover: Handle NULL devices returned by pipe_loader_probe() v2 When probing for devices, clover will call pipe_loader_probe() twice. The first time to retrieve the number of devices, and the second time to retrieve the device structures. We currently assume that the return value of both calls will be the same, but this will not be the case if a device happens to disappear between the two calls. When a device disappears, the pipe_loader_probe() will add a NULL device to the device list, so we need to handle this. v2: - Keep range for loop Reviewed-by: Francisco Jerez <currojerez@riseup.net> Acked-by: Emil Velikov <emil.l.velikov@gmail.com> CC: <mesa-stable@lists.freedesktop.org>	2015-12-01 16:00:54 +00:00
Jonathan Gray	99cd600835	automake: fix some occurrences of hardcoded -ldl and -lpthread Correct some occurrences of -ldl and -lpthread to use $(DLOPEN_LIBS) and $(PTHREAD_LIBS) respectively. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-12-01 16:53:40 +00:00
Iago Toral Quiroga	241f15ac80	glsl/lower_ubo_reference: split struct copies into element copies Improves register pressure, since otherwise we end up emitting loads for all the elements in the RHS and then emitting stores for all elements in the LHS. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-01 13:30:42 +01:00
Iago Toral Quiroga	867c436ca8	glsl/lower_ubo_reference: split array copies into element copies Improves register pressure, since otherwise we end up emitting loads for all the elements in the RHS and then emitting stores for all elements in the LHS. v2: - Mark progress properly. This also fixes some instances where the added nodes with individual element copies were not being lowered, which is expected behavior as explained in the documentation for visit_list_elements. - Only need to do this if the RHS is a buffer-backed variable. - We can also have arrays inside structs. A later patch will make it so we also split struct copies and end up with multiple ir_dereference_record assignments, so make sure that if any of these is an array copy, we also split it. Fixes the following piglit tests: tests/spec/arb_shader_storage_buffer_object/execution/large-field-copy.shader_test tests/spec/arb_shader_storage_buffer_object/linker/copy-large-array.shader_test Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-12-01 13:29:57 +01:00
Julien Isorce	e483cba9f5	st/va: also retrieve reference frames info for h264 Hardware other than AMD requires parsing: VAPictureParameterBufferH264.ReferenceFrames[16] Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-12-01 08:21:37 +00:00
Julien Isorce	b4fb6d7616	st/va: delay decoder creation until max_references is known In general max_references cannot be based on num_render_targets. This patch allows allocating buffers with an accurate size, i.e. no more than necessary. For other codecs it is a fixed value 2. This is similar behaviour to vaapi/vdpau-driver. For now the HEVC case defaults to num_render_targets as before. But it could also benefit from this change by setting a more accurate max_references number in handlePictureParameterBuffer. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-12-01 08:21:20 +00:00
Iago Toral Quiroga	750393ff7d	glsl/dead_builtin_varyings: Fix gl_FragData array lowering The current implementation looks for array dereferences on gl_FragData and immediately proceeds to lower them, however this is not enough because we can have array access on vector variables too, like in this code: out vec4 color; void main() { int i; for (i = 0; i < 4; i++) color[i] = 1.0; } Fix it by making sure that the actual variable being dereferenced is an array. Fixes a crash in: spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-ldexp-dvec4.shader_test Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-12-01 08:30:52 +01:00
Dave Airlie	4f34722575	r600: workaround empty geom shader. We need to emit at least one cut/emit in every geometry shader, the easiest workaround is to stick a single CUT at the top of each geom shader. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "10.6 11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-01 12:58:43 +10:00
Dave Airlie	04efcc6c7a	r600: rv670 use at least 16es/gs threads This is specified in the docs for rv670 to work properly. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "10.6 11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-01 12:58:34 +10:00
Dave Airlie	8168dfdd4e	r600: geometry shader gsvs itemsize workaround On some chips the GSVS itemsize needs to be aligned to a cacheline size. This only applies to some of the r600 family chips. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "10.6 11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-01 12:57:55 +10:00
Gregory Hainaut	2ab9cd0c4d	glsl: don't sort varying in separate shader mode This fixes an issue where the addition of the FLAT qualifier in varying_matches::record() can break the expected varying order. It also avoids a future issue with the relaxing of interpolation qualifier matching constraints in GLSL 4.50. V2: (by Timothy Arceri) * reworked comment slightly Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-12-01 12:46:37 +11:00
Gregory Hainaut	8117f46f49	glsl: don't dead code remove SSO varyings marked as active GL_ARB_separate_shader_objects allows matching by variable name or block interface. Input varyings can't be removed because that would impact the location assignment. This fixes bug 79783 and likely any application that uses the GL_ARB_separate_shader_objects extension. V2 (by Timothy Arceri): * simplify now that builtins are not set as always active Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> https://bugs.freedesktop.org/show_bug.cgi?id=79783	2015-12-01 12:46:32 +11:00
Gregory Hainaut	618612f867	glsl: add always_active_io attribute to ir_variable The value will be set in a separate-shader program when an input/output must remain active, e.g. when deadcode removal isn't allowed because it would create an interface location/name-matching mismatch. v3: * Rename the attribute * Use ir_variable directly instead of ir_variable_refcount_visitor * Move the foreach IR code in the linker file v4: * Fix variable name in assert v5 (by Timothy Arceri): * Rename functions and reword comments * Don't set always active on builtins Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-12-01 12:46:26 +11:00
Timothy Arceri	76c09c1792	glsl: copy how_declared when lowering interface blocks Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-12-01 12:45:07 +11:00
Timothy Arceri	12ba6cfba7	glsl: optimise inputs/outputs with explicit locations This change allows user-defined inputs/outputs with explicit locations to be removed if they are detected to not be used between shaders at link time. To enable this we change the is_unmatched_generic_inout field to be flagged when we have a user-defined varying. Previously explicit_location was assumed to be set only in builtins, however SSO allows the user to set an explicit location. We then add a function to match explicit locations between shaders. V2: call match_explicit_outputs_to_inputs() after is_unmatched_generic_inout has been initialised. Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-12-01 12:45:03 +11:00
Dave Airlie	4d64459a92	r600/shader: split address get out to a function. This will be used in the tess shaders. Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-12-01 08:10:21 +10:00
Marta Lofstedt	44944a66ce	doc: Set GL_OES_geometry_shader as started Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-11-30 10:47:21 +01:00
Marta Lofstedt	1d5b88e33b	gles2: Update gl2ext.h to revision: 32120 This is needed to be able to implement the accepted OES extensions. Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-30 10:46:15 +01:00
Julien Isorce	10c14919c8	vl/buffers: fixes vl_video_buffer_formats for RGBX Fixes: `42a5e143a8` "vl/buffers: add RGBX and BGRX to the supported formats" Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-30 09:02:29 +00:00
Samuel Iglesias Gonsálvez	a348fe89af	i965/fs: remove unused fs_reg offset Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2015-11-30 10:00:40 +01:00
Kenneth Graunke	83dedb6354	i965: Add src/dst interference for certain instructions with hazards. When working on tessellation shaders, I created some vec4 virtual opcodes for creating message headers through a sequence like: mov(8) g7<1>UD 0x00000000UD { align1 WE_all 1Q compacted }; mov(1) g7.5<1>UD 0x00000100UD { align1 WE_all }; mov(1) g7<1>UD g0<0,1,0>UD { align1 WE_all compacted }; mov(1) g7.3<1>UD g8<0,1,0>UD { align1 WE_all }; This is done in the generator since the vec4 backend can't handle align1 regioning. From the visitor's point of view, this is a single opcode: hs_set_output_urb_offsets vgrf7.0:UD, 1U, vgrf8.xxxx:UD Normally, there's no hazard between sources and destinations - an instruction (naturally) reads its sources, then writes the result to the destination. However, when the virtual instruction generates multiple hardware instructions, we can get into trouble. In the above example, if the register allocator assigned vgrf7 and vgrf8 to the same hardware register, then we'd clobber the source with 0 in the first instruction, and read back the wrong value in the last one. It occurred to me that this is exactly the same problem we have with SIMD16 instructions that use W/UW or B/UB types with 0 stride. The hardware implicitly decodes them as two SIMD8 instructions, and with the overlapping regions, the first would clobber the second. Previously, we handled that by incrementing the live range end IP by 1, which works, but is excessive: the next instruction doesn't actually care about that. It might also be the end of control flow. This might keep values alive too long. What we really want is to say "my source and destinations interfere". This patch creates new infrastructure for doing just that, and teaches the register allocator to add interference when there's a hazard. For my vec4 case, we can determine this by switching on opcodes. For the SIMD16 case, we just move the existing code there. I audited our existing virtual opcodes that generate multiple instructions; I believe FS_OPCODE_PACK_HALF_2x16_SPLIT needs this treatment as well, but no others. v2: Rebased by mattst88. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-30 00:34:07 -08:00
Kenneth Graunke	1ac1581f38	i965: Fix JIP to properly skip over unrelated control flow. We've apparently always been botching JIP for sequences such as: do cmp.f0.0 ... (+f0.0) break ... if ... else ... endif ... while Normally, UIP is supposed to point to the final destination of the jump, while in nested control flow, JIP is supposed to point to the end of the current nesting level. It essentially bounces out of the current nested control flow, to an instruction that has a JIP which bounces out another level, and so on. In the above example, when setting JIP for the BREAK, we call brw_find_next_block_end(), which begins a search after the BREAK for the next ENDIF, ELSE, WHILE, or HALT. It ignores the IF and finds the ELSE, setting JIP there. This makes no sense at all. The break is supposed to skip over the whole if/else/endif block entirely. They have a sibling relationship, not a nesting relationship. This patch fixes brw_find_next_block_end() to track depth as it does its search, and ignore anything not at depth 0. So when it sees the IF, it ignores everything until after the ENDIF. That way, it finds the end of the right block. I noticed this while reading some assembly code. We believe jumping earlier is harmless, but makes the EU walk through a bunch of disabled instructions for no reason. I noticed that GLBenchmark Manhattan had a shader that contained a BREAK with a bogus JIP, but didn't measure any performance improvement (it's likely minuscule, if there is any). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-30 00:27:16 -08:00
Dave Airlie	d72299c531	r600: move per-type settings into a switch statement This will allow adding tess stuff much cleaner later. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-30 11:08:00 +10:00
Dave Airlie	58e0122d86	r600: split out common alu_writes pattern. This just splits out a common pattern into an inline function to make things cleaner to read. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-30 11:07:18 +10:00
Dave Airlie	26332ef797	r600/llvm: fix r600/llvm build Reported on irc by gryffus Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-30 11:05:42 +10:00
Dave Airlie	9eff9f6134	r600: fixes for register definitions. Forgot to add these. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-30 09:35:37 +10:00
Dave Airlie	c2e701c7ca	r600: add missing register to initial state We really should initialise HS/LS_2 and SQ_LDS_ALLOC exists on all evergreen not just cayman, so we should initialise it as well. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-30 09:14:16 +10:00
Dave Airlie	bcdc748fe2	r600: define registers required for tessellation This adds the defines for a bunch of registers and shader values that are required to implement tessellation. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-30 09:14:16 +10:00
Dave Airlie	b502bae610	r600: consolidate clip state updates Move some common code into one place, tess will also need to use this function. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-30 09:14:16 +10:00
Samuel Pitoiset	b8c524ff88	nv50/ir: always display the opcode number for unknown instructions This helps in debugging unknown instructions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-29 16:40:12 +01:00
Emil Velikov	d37ebed470	mesa: remove len argument from _mesa_shader_debug() There was only a single user which was using strlen(buf). As this function is not user facing (i.e. we don't need to feed back original length via a callback), we can simplify things. Suggested-by: Timothy Arceri <timothy.arceri@collabora.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-11-29 14:41:40 +00:00
Emil Velikov	e714c971ae	drivers/x11: scons: partially revert `b9b40ef9b7` As glsl_types.{cpp,h} were moved out of the sconscript (commit `b23a4859f4` "scons: Build nir/glsl_types.cpp once.") remove the dangling includes. Cc: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-11-29 14:41:39 +00:00
Emil Velikov	31ed3fc57d	nir: remove recursive inclusion in builtin_type_macros.h The header is already included by glsl_types.{cpp,h}. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-29 14:41:39 +00:00
Emil Velikov	fc16942cf7	nir: remove unneeded include Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-29 14:41:39 +00:00
Emil Velikov	b92ecdcc79	mesa/program: remove dead function declarations Dead since `5e9aa9926b` (2011) - _mesa_ir_compile_shader `69e07bdeb4` (2009) - _mesa_get_program_register Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-29 14:41:39 +00:00
Emil Velikov	5d294d9fa3	auxiliary/vl/dri: fd management cleanups Analogous to previous commit, minus the extra dup. We are the one opening the device thus we can directly use the fd. Spotted by Coverity (CID `1339867`, 1339877) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-29 14:41:00 +00:00
Emil Velikov	151290c154	auxiliary/vl/drm: fd management cleanups Analogous to previous commit. Spotted by Coverity (CID 1339868) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-29 14:40:26 +00:00
Emil Velikov	fe71059388	st/xa: fd management cleanups Analogous to previous commit. Spotted by Coverity (CID 1339866) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-29 14:39:51 +00:00
Emil Velikov	d90ba57c08	st/dri: fd management cleanups Add some checks if the original/dup'd fd is valid and ensure that we don't leak it on error. The former is implicitly handled within the pipe_loader, although let's make things explicit and check beforehand. Spotted by Coverity (CID 1339865) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-29 14:39:03 +00:00
Emil Velikov	5f92906b87	pipe-loader: check if winsys.name is non-null prior to strcmp In theory this wouldn't be an issue, as we'll find the correct name and break out of the loop before we hit the sentinel. Let's fix this and avoid issues in the future. Spotted by Coverity (CID 1339869, 1339870, 1339871) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-29 14:38:22 +00:00
Emil Velikov	866a1f7fdd	st/va: add missing break statement Earlier commit factored out the mpeg4 IQ matrix handling into separate function, although it forgot to add a break in its case statement. Thus the data ended up partially overwritten as the mpeg4 and h265 structs are members of the desc union. Spotted by Coverity (CID 1341052) Fixes: `64761a841d` "st/va: move MPEG4 functions into separate file" Cc: Julien Isorce <j.isorce@samsung.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-11-29 14:31:14 +00:00
Ilia Mirkin	0396eaaf80	mesa: support GL_RED/GL_RG in ES2 contexts when driver support exists Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93126 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-11-28 17:24:34 -05:00
Nicolai Hähnle	9e5e702cfb	radeon: only suspend queries on flush if they haven't been suspended yet Non-timer queries are suspended during blits. When the blits end, the queries are resumed, but this resume operation itself might run out of CS space and trigger a flush. When this happens, we must prevent a duplicate suspend during preflush suspend, and we must also prevent a duplicate resume when the CS flush returns back to the original resume operation. This fixes a regression that was introduced by: commit `8a125afa6e` Author: Nicolai Hähnle <nhaehnle@gmail.com> Date: Wed Nov 18 18:40:22 2015 +0100 radeon: ensure that timing/profiling queries are suspended on flush The queries_suspended_for_flush flag is redundant because suspended queries are not removed from their respective linked list. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reported-by: Axel Davy <axel.davy@ens.fr> Cc: "11.1" <mesa-stable@lists.freedesktop.org> Tested-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-28 11:08:49 +01:00
Jose Fonseca	ea3f394e4a	scons: Use LD version script for libgl-xlib. Trivial.	2015-11-27 14:14:25 +00:00
Jose Fonseca	a11955b9f9	svga: Don't return value from void function. Addresses MSVC warning C4098: 'svga_destroy_query' : 'void' function returning a value. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-11-27 14:14:25 +00:00
Jose Fonseca	c127e6a3ea	gallium: Make pipe_query_result::batch array length non-zero. Zero length arrays are non standard: warning C4200: nonstandard extension used : zero-sized array in struct/union Cannot generate copy-ctor or copy-assignment operator when UDT contains a zero-sized array And all code does `N * sizeof query_result->batch[0]`, so it should work exactly the same. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-11-27 14:14:25 +00:00
Neil Roberts	bc2470d5d3	util: Tiny optimisation for the linear→srgb conversion When converting 0.0 it would be nice if it didn't do any arithmetic. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-11-27 10:55:22 +01:00
Eduardo Lima Mitev	27a88a947c	docs: Update GL3.txt to add ARB_internalformat_query2 Added to OpenGL 4.3 section, tagged as 'in progress (elima)'. See https://bugs.freedesktop.org/show_bug.cgi?id=92687. Thanks to Thomas H.P. Andersen for reminding me about this. v1: - Update the already existing entry in section 4.3 instead (Ilia Mirkin). - Added my BZ nickname as contact person (Felix Schwarz). Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-26 23:53:16 +01:00
Timothy Arceri	c3ec12ec3c	glsl: don't generate extra errors in ValidateProgramPipeline From Section 11.1.3.11 (Validation) of the GLES 3.1 spec: "An INVALID_OPERATION error is generated by any command that transfers vertices to the GL or launches compute work if the current set of active program objects cannot be executed, for reasons including:" It then goes on to list the rules we validate in the _mesa_validate_program_pipeline() function. For ValidateProgramPipeline the only mention of generating an error is: "An INVALID_OPERATION error is generated if pipeline is not a name returned from a previous call to GenProgramPipelines or if such a name has since been deleted by DeleteProgramPipelines," Which we handle separately. This fixes: ES31-CTS.sepshaderobjs.PipelineApi No regressions on the dEQP 3.1 tests. Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-11-27 08:44:37 +11:00
Rob Clark	57fc0dd8d5	freedreno/ir3: assign varying locations later Rather than assigning inloc up front, when we don't yet know if it will be unused, assign it last thing before the legalize pass. Also, realize when inputs are unused (since for frag shaders we can't rely on them being removed from ir->inputs[]). This doesn't make sense if we don't also dynamically assign the inlocs, since we could end up telling the hw the wrong # of varyings (since we currently assume that the # of varyings and max-inloc are related..) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-26 12:35:10 -05:00
Rob Clark	2181f2cd58	freedreno/ir3: use instr flag to mark unused instructions Rather than magic depth value, which won't be available in later stages. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-26 12:35:10 -05:00
Rob Clark	2fbe4e7d2f	freedreno/a4xx: rework vinterp/vpsrepl Same as previous commit, for a4xx. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-26 12:35:10 -05:00
Rob Clark	5adf4a5cda	freedreno/a3xx: rework vinterp/vpsrepl Make the interpolation / point-sprite replacement mode setup deal with varying packing. In a later commit, we switch to packing just the varying components that are actually used by the frag shader, so we won't be able to assume everything is vec4's aligned to vec4. Which would highly confuse the previous vinterp/vpsrepl logic. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-26 12:35:10 -05:00
Serge Martin	b7c958b7b7	clover: fix tgsi compiler crash with invalid src Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-26 15:30:25 +02:00
Francisco Jerez	55ffa64daf	i965/gen9+: Switch thread scratch space to non-coherent stateless access. The thread scratch space is thread-local so using the full IA-coherent stateless surface index (255 since Gen8) is unnecessary and potentially expensive. On Gen8 and early steppings of Gen9 this is not a functional change because the kernel already sets bit 4 of HDC_CHICKEN0 which overrides all HDC memory access to be non-coherent in order to workaround a hardware bug. This happens to fix a full system hang when running any spilling code on a pre-production SKL GT4e machine I have on my desk (forcing all HDC access to non-coherent from the kernel up to stepping F0 might be a good idea though regardless of this patch), and improves performance of the OglPSBump2 SynMark benchmark run with INTEL_DEBUG=spill_fs by 33% (11 runs, 5% significance) on a production SKL GT2 (on which HDC IA-coherency is apparently functional so it wouldn't make sense to disable globally). Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-11-26 14:07:58 +02:00
Francisco Jerez	bc8182808a	i965/fs: Don't use Gen7-style scratch block reads on Gen9+. Unfortunately Gen7 scratch block reads and writes seem to be hardwired to BTI 255 even on Gen9+ where that index causes the dataport to do an IA-coherent read or write. This change is required for the next patch to be correct, since otherwise we would be writing to the scratch space using non-coherent access and then reading it back using IA-coherent reads, which wouldn't be guaranteed to return the value previously written to the same location without introducing an additional HDC flush in between. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-11-26 14:07:58 +02:00
Francisco Jerez	3e6d0d2ca4	i965: Add symbolic defines for some magic dataport surface indices. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-11-26 14:07:58 +02:00
Nicolai Hähnle	6b5268d202	radeon: use PIPE_DRIVER_QUERY_FLAG_DONT_LIST for perfcounters Since the query names are not very enlightening, and there are thousands of them, GALLIUM_HUD=help should only show the first and last query name for each hardware block. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-26 10:57:44 +01:00
Nicolai Hähnle	f36d9857cd	gallium: add PIPE_DRIVER_QUERY_FLAG_DONT_LIST This allows the driver to give a hint to the HUD so that GALLIUM_HUD=help is less spammy. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-26 10:57:43 +01:00
Nicolai Hähnle	80a16dece6	radeon: delay the generation of driver query names until first use This shaves a bit more time off the startup of programs that don't actually use performance counters. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-26 10:57:43 +01:00
Julien Isorce	ca976e6900	st/va: add missing profiles in PipeToProfile's switch. Otherwise assert is raised from vlVaQueryConfigProfiles's for loop. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-11-26 08:21:45 +00:00
Marta Lofstedt	63b49e1711	mesa: remove ARB_geometry_shader4 No drivers currently implement ARB_geometry_shader4, nor are there any plans to implement it. We only support the version of geometry shaders that was incorporated into OpenGL 3.2 / GLSL 1.50. Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-26 08:40:46 +01:00
Tapani Pälli	c2e146f487	mesa: error out in indirect draw when vertex bindings mismatch Patch adds additional mask for tracking which vertex arrays have associated vertex buffer binding set. This array can be directly compared to which vertex arrays are enabled and should match when drawing. Fixes following CTS tests: ES31-CTS.draw_indirect.negative-noVBO-arrays ES31-CTS.draw_indirect.negative-noVBO-elements v2: update mask in vertex_array_attrib_binding v3: rename mask and make it track _BoundArrays which matches what was actually originally wanted (Fredrik Höglund) v4: code cleanup, check for GLES 3.1 (Fredrik Höglund) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2015-11-26 08:01:31 +02:00
Michel Dänzer	22d2dda03b	targets/xvmc: use the non-inline sw helpers This was missed in commit `59cfb21d` ("targets: use the non-inline sw helpers"). Fixes build failure: CXXLD libXvMCgallium.la ../../../../src/gallium/auxiliary/pipe-loader/.libs/libpipe_loader_static.a(libpipe_loader_static_la-pipe_loader_sw.o):(.data.rel.ro+0x0): undefined reference to `sw_screen_create' collect2: error: ld returned 1 exit status Makefile:756: recipe for target 'libXvMCgallium.la' failed make[3]: *** [libXvMCgallium.la] Error 1 Trivial.	2015-11-26 12:14:28 +09:00
Emil Velikov	72c33f0dd5	targets/nine: remove freedreno target Analogous to previous commit. As we no longer have anyone who uses NIR we can drop the link. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robdclark@gmail.com>	2015-11-25 20:29:44 +00:00
Emil Velikov	aa335bb01b	targets/nine: remove vc4 target There are no users for it. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-11-25 20:28:38 +00:00
Emil Velikov	b78259c4b5	gallium: remove unused function declarations Unused as of commit `23fb11455b` "{st,targets}/dri: use static/dynamic pipe-loader" Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-25 20:26:52 +00:00
Emil Velikov	59cfb21d46	targets: use the non-inline sw helpers Previously (with the inline ones) things were embedded into the pipe-loader, which means that we cannot control/select what we want in each target. That also meant that at runtime we ended up with the empty sw_screen_create() as the GALLIUM_SOFTPIPE/LLVMPIPE were not set. v2: Cover all the targets, not just dri. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Edward O'Callaghan <edward.ocallaghan@koparo.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Oded Gabbay <oded.gabbay@gmail.com> Tested-by: Nick Sarnie <commendsarnex@gmail.com>	2015-11-25 20:25:29 +00:00
Emil Velikov	fbc6447c3d	target-helpers: add non inline sw helpers Feeling rather dirty copying the inline ones, yet we need the inline ones for swrast only targets like libgl-xlib, osmesa. Cc: "11.1" <mesa-stable@lists.freedesktop.org> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Edward O'Callaghan <edward.ocallaghan@koparo.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Oded Gabbay <oded.gabbay@gmail.com> Tested-by: Nick Sarnie <commendsarnex@gmail.com>	2015-11-25 20:25:14 +00:00
Emil Velikov	f623517188	pipe-loader: fix off-by-one error With earlier commit we've dropped the manual iteration over the fixed size array and preemptively set the variable storing the size, that is to be returned. Yet we forgot to adjust the comparison, as before we were comparing the index, now we're comparing the size. Fixes: `ff9cd8a67c` "pipe-loader: directly use pipe_loader_sw_probe_null() at probe time" Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93091 Reported-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2015-11-25 20:22:35 +00:00
Emil Velikov	0572e5fea5	nir: include what we want/need Swap core.h with macros.h, as the latter provides the required MAX2 macro. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-25 20:19:47 +00:00
Kenneth Graunke	3810c15614	i965: Fix scalar vertex shader struct outputs. While we correctly set output[] for composite varyings, we set completely bogus values for output_components[], making emit_urb_writes() output zeros instead of the actual values. Unfortunately, our simple approach goes out the window, and we need to recurse into structs to get the proper value of vector_elements for each field. Together with the previous patch, this fixes rendering in an upcoming game from Feral Interactive. v2: Use pointers instead of pass-by-mutable-reference (Jason, Matt). Cc: "11.1 11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-25 11:47:47 -08:00
Kenneth Graunke	3e9003e9cf	i965: Fix fragment shader struct inputs. Apparently we have literally no support for FS varying struct inputs. This is somewhat surprising, given that we've had tests for that very feature that have been passing for a long time. Normally, varying packing splits up structures for us, so we don't see them in the backend. However, with SSO, varying packing isn't around to save us, and we get actual structs that we have to handle. This patch changes fs_visitor::emit_general_interpolation() to work recursively, properly handling nested structs/arrays/and so on. (It's easier to read with diff -b, as indentation changes.) When using the vec4 VS backend, this fixes rendering in an upcoming game from Feral Interactive. (The scalar VS backend requires additional bug fixes in the next patch.) v2: Use pointers instead of pass-by-mutable-reference (Jason, Matt). Cc: "11.1 11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-25 11:47:47 -08:00
Tom Stellard	89851a2965	radeonsi/compute: Use the compiler's COMPUTE_PGM_RSRC* register values The compiler has more information and is able to optimize the bits it sets in these registers. Reviewed-by: Marek Olšák <marek.olsak@amd.com> CC: <mesa-stable@lists.freedesktop.org>	2015-11-25 11:03:05 -05:00
Tom Stellard	95e0510916	radeonsi: Rename si_shader::ls_rsrc{1,2} to si_shader::rsrc{1,2} In the future, these will be used by other shaders types. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 11:03:05 -05:00
Samuel Iglesias Gonsálvez	98ceb60177	docs: minimum required python mako version is 0.3.4 Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-25 16:50:53 +01:00
Nicolai Hähnle	07bddff460	docs: update relnotes with AMD_performance_monitor for radeonsi Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:52:09 +01:00
Nicolai Hähnle	ad22006892	radeonsi: implement AMD_performance_monitor for CIK+ Expose most of the performance counter groups that are exposed by Catalyst. Ideally, the driver will work with GPUPerfStudio at some point, but we are not quite there yet. In any case, this is the reason for grouping multiple instances of hardware blocks in the way it is implemented. The counters can also be shown using the Gallium HUD. If one is interested to see how work is distributed across multiple shader engines, one can set the environment variable RADEON_PC_SEPARATE_SE=1 to obtain finer-grained performance counter groups. Part of the implementation is in radeon because an implementation for older hardware would largely follow along the same lines, but exposing a different set of blocks which are programmed slightly differently. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:52:09 +01:00
Nicolai Hähnle	b9fc01aee7	radeon: scale query buffer size to result size Performance monitor queries can become very big, especially considering that instances of a block in different shader engines are queried separately. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:28:09 +01:00
Nicolai Hähnle	592928065c	radeonsi/sid: add performance counter registers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:28:06 +01:00
Nicolai Hähnle	9823048e0b	radeonsi/sid: add hardware constants for COPY_DATA packet Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:28:03 +01:00
Nicolai Hähnle	1aa3b48c12	radeon: extend CIK_UCONFIG_REG_END for performance counters Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:28:00 +01:00
Nicolai Hähnle	b589e18a98	radeon: add perfcounter-related EVENT_TYPEs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:27:56 +01:00
Nicolai Hähnle	30462b1826	radeon: additional constants for WAIT_REG_MEM and EVENT_WRITE_EOP Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:27:34 +01:00
Nicolai Hähnle	bfddd005ea	st/mesa: remove outdated comment The enable of AMD_performance_monitor is no longer related to whether queries are run by the GPU since the commit mentioned below. Suggested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> commit `ddf27a3dd0` Author: Nicolai Hähnle <nhaehnle@gmail.com> Date: Tue Nov 10 13:35:01 2015 +0100 gallium: remove pipe_driver_query_group_info field type	2015-11-25 15:27:34 +01:00
Nicolai Hähnle	babf655ab2	st/mesa: delay initialization of performance counters Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-25 15:27:33 +01:00
Nicolai Hähnle	27a06e0bbe	mesa/main: allow delayed initialization of performance monitors Most applications never use performance counters, so allow drivers to skip potentially expensive initialization steps. A driver that wants to use this must enable the appropriate extension(s) at context initialization and set the InitPerfMonitorGroups driver function which will be called the first time information about the performance monitor groups is actually used. The init_groups helper is called for API functions that can be called before a monitor object exists. Functions that require an existing monitor object can rely on init_groups having been called before. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-25 15:27:33 +01:00
Tapani Pälli	315c4c315e	glsl: handle case where index is array deref in optimize_split_arrays Previously pass did not traverse to those array dereferences which were used as indices to arrays. This fixes Synmark2 Gl42CSCloth application issues. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2015-11-25 11:25:57 +02:00
Julien Isorce	63c344d179	nouveau: move interlaced assert down in nouveau_vp3_video_buffer_create templat->interlaced is 0 if not NV12 which is the case currently when using VPP. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-25 08:17:39 +00:00
Iago Toral Quiroga	2bba2152e4	i965: remove trailing spaces in various files Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-25 08:12:08 +01:00
Iago Toral Quiroga	1af0d9d939	glsl: remove trailing spaces in various files Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-25 08:09:17 +01:00
Matt Turner	f1b7fefd4e	i965: Pass brw_context pointer, not gl_context pointer. Fixes a warning introduced by commit `dcadd855`.	2015-11-24 21:27:57 -08:00
Timothy Arceri	7436d7c33b	glsl: only call dead code pass when new inputs/outputs demoted This will help avoid eliminating inputs/outputs needed by SSOs. Cc: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2015-11-25 09:50:13 +11:00
Timothy Arceri	404ac4bf9e	glsl: move and reuse code to find first and last shaders Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2015-11-25 09:49:48 +11:00
Matt Turner	0ce370a84b	mesa: Use unreachable() instead of a default case. (And add an unreachable() in one place that didn't have a default case)	2015-11-24 13:27:20 -08:00
Ian Romanick	47b3a0d235	meta: Don't save or restore the active client texture This setting is only used by glTexCoordPointer and related glEnable calls. Since the preceding commits removed all of those, it is not necessary to save, reset to default, or restore this state. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:30 -08:00
Ian Romanick	c63f9c735d	meta: Don't save or restore the VBO binding Nothing left in meta does anything with the VBO binding, so we don't need to save or restore it. The VAO binding is still modified. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:30 -08:00
Ian Romanick	58aa56d40b	meta/TexSubImage: Don't pollute the buffer object namespace tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:30 -08:00
Ian Romanick	76cfe2bc44	meta: Don't pollute the buffer object namespace in _mesa_meta_DrawTex tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:30 -08:00
Ian Romanick	a222d4cbc3	meta: Use internal functions for buffer object and VAO access in _mesa_meta_DrawTex Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:30 -08:00
Ian Romanick	b8a7369fb7	meta: Track VBO using gl_buffer_object instead of GL API object handle in _mesa_meta_DrawTex Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:30 -08:00
Ian Romanick	d5225ee5d9	meta: Partially convert _mesa_meta_DrawTex to DSA Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	37d11b13ce	meta: Don't pollute the buffer object namespace in _mesa_meta_setup_vertex_objects tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	b1b73a42c8	meta: Use internal functions for buffer object and VAO access Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	52921f8e08	meta: Use DSA functions for VBOs in _mesa_meta_setup_vertex_objects The fixed-function attribute paths don't get the DSA treatment because there are no DSA entry-points for fixed-function attributes. These could have been added, but this is a temporary patch intended to make later patches easier to review. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	1035e00a81	meta: Track VBO using gl_buffer_object instead of GL API object handle Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	3b5a7d450d	meta: Don't leave the VBO bound after _mesa_meta_setup_vertex_objects Meta currently does this, but future changes will make this impossible. Explicitly do it as a step in the patch series now to catch any possible kinks. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	ed0bd6573b	i965: Use _mesa_NamedBufferSubData for users of _mesa_meta_setup_vertex_objects Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	7f2f300071	meta: Use _mesa_NamedBufferData and _mesa_NamedBufferSubData for users of _mesa_meta_setup_vertex_objects Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	89a61afdd7	meta: Use DSA functions for PBO in create_texture_for_pbo Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	4e6b9c11fc	i965: Don't pollute the buffer object namespace in brw_meta_fast_clear tl;dr: For many types of GL object, we can NEVER use the Gen function. In OpenGL ES (all versions!) and OpenGL compatibility profile, applications don't have to call Gen functions. The GL spec is very clear about how you can mix-and-match generated names and non-generated names: you can use any name you want for a particular object type until you call the Gen function for that object type. Here's the problem scenario: - Application calls a meta function that generates a name. The first Gen will probably return 1. - Application decides to use the same name for an object of the same type without calling Gen. Many demo programs use names 1, 2, 3, etc. without calling Gen. - Application calls the meta function again, and the meta function replaces the data. The application's data is lost, and the app fails. Have fun debugging that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92363 Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	e62799bd4e	i965: Use internal functions for buffer object access Instead of going through the GL API implementation functions, use the lower-level functions. This means that we have to keep track of a pointer to the gl_buffer_object and the gl_vertex_array_object. This has two advantages. First, it avoids a bunch of CPU overhead in looking up objects and validating API parameters. Second, and much more importantly, it will allow us to stop calling _mesa_GenBuffers / _mesa_CreateBuffers and polluting the buffer namespace (next patch). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	1c5423d3a0	i965: Use DSA functions for VBOs in brw_meta_fast_clear Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	dcadd855f1	i965: Pass brw_context instead of gl_context to brw_draw_rectlist Future patches will use the brw_context instead. Keeping this non-functional change separate should make the function changes easier to review. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	4a644f1caa	mesa: Refactor enable_vertex_array_attrib to make _mesa_enable_vertex_array_attrib Pulls the parts of enable_vertex_array_attrib that aren't just parameter validation out into a function that can be called from other parts of Mesa (e.g., meta). _mesa_enable_vertex_array_attrib can also be used to enable fixed-function arrays. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:29 -08:00
Ian Romanick	a336fcd36a	mesa: Refactor update_array_format to make _mesa_update_array_format_public Pulls the parts of update_array_format that aren't just parameter validation out into a function that can be called from other parts of Mesa (e.g., meta). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:28 -08:00
Ian Romanick	8fae494df2	mesa: Make bind_vertex_buffer available outside varray.c Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-24 11:31:28 -08:00
Kenneth Graunke	03d6949630	Revert "i965: Combine assembly annotations if possible." This reverts commit `a280e83d71`. It breaks INTEL_DEBUG=fs output. For example, glsl-fs-discard-01.shader_test has 11 instructions but only prints 5. Acked-by: Matt Turner <mattst88@gmail.com>	2015-11-24 10:21:37 -08:00
Matt Turner	5369efe311	glsl: Pass ast_type_qualifier by const reference. Coverity noticed that we were passing this by value, and it's 152 bytes. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-24 10:05:33 -08:00
Matt Turner	f36993b469	i965: Clean up #includes in the compiler. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	1eb11e64b3	i965: Move brw_new_shader and brw_link_shader prototypes from brw_wm.h. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	6ba700c3c3	i965: Compile brw_cs_fill_local_id_payload() as C. It's only called from C, it compiles as C, so just compile it as C. Notice the missing extern "C" on the definition of the function, which would screw things up if the prototype wasn't parsed before the definition. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	6b525d9f2b	i965: Move MRF macros from brw_inst.h to brw_eu.h. brw_inst.h is only for the brw_inst/brw_compact_inst functions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	76732932ec	i965: Drop #include of main/glheader.h. It's never used. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	ecac1aab53	i965: Push down inclusion of brw_program.h. We were including it in headers, which then caused it to be included in tons of places it wasn't needed. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	64cc7572c1	i965: Mark functions called from C as extern "C". These functions' prototypes are marked with extern "C", which apparently overrides a lack of extern "C" at the definition site if the prototype has been seen first. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	fb86f0e75a	i965: Push down inclusion of vbo/vbo.h. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	6fe9ea78fa	i965: Remove duplicate #includes. Added in commits `36fd65381` and `337dad8ce` even though the existing include was in view. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:33 -08:00
Matt Turner	c06f3d5d54	i965: Remove unneeded forward declarations. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:32 -08:00
Matt Turner	e768c498bf	i965: Mark count_trailing_one_bits() static. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:32 -08:00
Matt Turner	836aaa4394	i965: Remove useless gen6_blorp.h/gen7_blorp.h headers. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-24 10:05:32 -08:00
Matt Turner	d956335a0b	util: Include assert.h in macros.h.	2015-11-24 10:05:32 -08:00
Matt Turner	fafbf994cf	util: Include <stdbool.h> in debug.h.	2015-11-24 10:05:32 -08:00
Matt Turner	2d8c529903	i965: Prevent implicit upcasts to brw_reg. Now that backend_reg inherits from brw_reg, we have to be careful to avoid the object slicing problem. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-24 09:58:33 -08:00
Matt Turner	799f924073	i965: Use scope operator to ensure brw_reg is interpreted as a type. In the next patch, I make backend_reg's inheritance from brw_reg private, which confuses clang when it sees the type "struct brw_reg" in the derived class constructors, thinking it is referring to the privately inherited brw_reg: brw_fs.cpp:366:23: error: 'brw_reg' is a private member of 'brw_reg' fs_reg::fs_reg(struct brw_reg reg) : ^ brw_shader.h:39:22: note: constrained by private inheritance here struct backend_reg : private brw_reg ^~~~~~~~~~~~~~~ brw_reg.h:232:8: note: member is declared here struct brw_reg { ^ Avoid this by marking brw_reg with the scope resolution operator.	2015-11-24 09:58:33 -08:00
Matt Turner	f093c842e6	i965: Use implicit backend_reg copy-constructor. In order to do this, we have to change the signature of the backend_reg(brw_reg) constructor to take a reference to a brw_reg in order to avoid unresolvable ambiguity about which constructor is actually being called in the other modifications in this patch. As far as I understand it, the rule in C++ is that if multiple constructors are available for parent classes, the one closest to you in the class hierarchy is chosen, but if one of them didn't take a reference, that screws things up. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-24 09:58:33 -08:00
Matt Turner	309a44d63c	i965: Add and use backend_reg::equals(). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-24 09:58:33 -08:00
Roland Scheidegger	6c6a439e98	softpipe/llvmpipe: don't advertize support for ASTC `3333977556` added support for ASTC textures to gallium. They don't have any helpers hooked up for software decoding, however, so cannot support them in drivers relying on util code for decoding. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-24 18:26:11 +01:00
Roland Scheidegger	97eed8dcb9	llvmpipe: don't test for unsupported formats in lp_test_format Removing the fake format helpers (`1c7d0a6aa4`) caused this to fail. These formats were never supported, but previously they would have asserted in the generated jit functions (which, due to lack of test cases for these formats, were never called) whereas we now assert when trying to build the jit function. So, skip them completely. This fixes https://bugs.freedesktop.org/show_bug.cgi?id=93092 Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-24 18:26:11 +01:00
Ian Romanick	9b41489cb5	docs: add missed i965 feature to relnotes Trivial. GL_ARB_fragment_layer_viewport support was added in `8c902a58` by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-11-24 09:03:39 -08:00
Rob Clark	d278e31459	util: move brw_env_var_as_boolean() to util Kind of a handy function. And I'll want it available outside of i965 for common nir-pass helpers. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Nicolai Hähnle <nhaehnle@gmail.com>	2015-11-24 10:02:55 -05:00
Christian König	d3e2c48dfa	st/va: fix indentation Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-11-24 15:31:48 +01:00
Christian König	64761a841d	st/va: move MPEG4 functions into separate file Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-11-24 15:31:45 +01:00
Christian König	9fe7924328	st/va: move VC-1 functions into separate file Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-11-24 15:31:41 +01:00
Christian König	da173344a6	st/va: move H264 functions into separate file Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-11-24 15:31:38 +01:00
Christian König	c9cb22392b	st/va: move MPEG12 functions into separate file Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-11-24 15:31:35 +01:00
Christian König	ec6ef1cbfe	st/va: move post processing function into own file Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-11-24 15:31:31 +01:00
Christian König	3d6386fdc5	st/va: fix post process dirty area handling The dirty area in this call isn't related to the screen at all. v2: set clear dirty area to false as well Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-11-24 15:31:11 +01:00
Timothy Arceri	2571a768d6	glsl: implement recent spec update to SSO validation Enables 200+ dEQP SSO tests to proceed past validation, and fixes a ES31-CTS.sepshaderobjs.PipelineApi subtest. V2: split out change that reverts a previous patch into its own commit, move variable declaration to top of function, and fix some formatting all suggested by Ian. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-11-24 20:59:48 +11:00
Timothy Arceri	3c4aa7aff2	Revert "mesa: return initial value for VALIDATE_STATUS if pipe not bound" This reverts commit `ba02f7a3b6`. The commit checked whether the pipeline was currently bound instead of checking whether it had ever been bound. The previous setting of Validated during object creation makes this unnecessary. The real problem was that Validated was not properly set to false elsewhere in the code. This is fixed by a later patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-11-24 20:59:44 +11:00
Michel Dänzer	d094631936	radeon/llvm: Use llvm.AMDIL.exp intrinsic again for now llvm.exp2.f32 doesn't work in some cases yet. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92709 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-24 18:07:48 +09:00
Boyuan Zhang	f55f134a03	radeon/uvd: uv pitch separation for stoney v2: set the behaviour default for future ASICs. Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Cc: mesa-stable@lists.freedesktop.org	2015-11-23 17:34:43 -05:00
Dave Airlie	237bcdbab5	texgetimage: consolidate 1D array handling code. This should fix the getteximage-depth test that currently asserts. I was hitting a problem with virgl as well in this area. This moves the 1D array handling code to a single place. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Cc: "10.6 11.0 11.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-24 06:43:21 +10:00
Jason Ekstrand	d9b8fde963	i965: Use NIR for lowering texture swizzle Now that nir_lower_tex can do texture swizzle lowering, we can use that instead of repeating more-or-less the same code in both backends. This both allows us to share code and means that things like the tg4 work-arounds are somewhat simpler because they don't have to take the swizzle into account. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-23 11:07:32 -08:00
Jason Ekstrand	8537b4ab76	nir/lower_tex: Add support for lowering texture swizzle Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-23 11:04:49 -08:00
Jason Ekstrand	6921b17107	nir: Add a tex_instr_is_query helper Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-23 11:04:49 -08:00
Jason Ekstrand	7e83fd85aa	nir: Add a ssa_def_rewrite_uses_after helper Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-23 11:04:49 -08:00
Jason Ekstrand	384396a69b	nir: Use instr/if_rewrite in nir_ssa_def_rewrite_uses nir_ssa_def_rewrite_uses is one of the older helpers in NIR and predated both of those. Now it can be substantially simplified. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-23 11:04:49 -08:00
Jason Ekstrand	03c9ad900e	nir/validate: Validated dests after sources Previously, if someone accidentally made an instruction that refers to its own SSA destination, the validator wouldn't catch it. The reason for this is that it validated the destination too early and, by the time it got to the source, the destination SSA value was already added to the set of seen SSA values so it would assume that it came from some previous instruction. By moving destination validation to be after source validation, the SSA value is not in the list of seen values and the validator will catch self-referential instructions. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-23 11:04:49 -08:00
Jason Ekstrand	6c8ba59cff	i965: Use nir_lower_tex for texture coordinate lowering Previously, we had a rescale_texcoords helper in the FS backend for handling rescaling of texture coordinates. Now that we can do variants in NIR, we can use nir_lower_tex to do the rescaling for us. This allows us to delete the i965-specific code and gives us proper TEXTURE_RECTANGLE and GL_CLAMP handling in vertex and geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-23 11:04:49 -08:00
Jason Ekstrand	d065a93a3f	i965/fs: Stomp the texture return type to UINT32 for resinfo messages Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-23 11:02:15 -08:00
Jason Ekstrand	042fa75e48	nir/lower_tex: Set the dest_type for txs instructions Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-23 11:02:15 -08:00
Jason Ekstrand	1417f6a216	nir/lower_tex: Report progress Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-23 11:02:15 -08:00
Jason Ekstrand	ce767bbdff	i965: Move postprocess_nir to codegen time This allows us to insert NIR passes between initial NIR compilation and optimization (link time) and actual backend code-gen. In particular, it will allow us to do shader variants in NIR and share some of that shader variant code between backends. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-23 11:02:15 -08:00
Jason Ekstrand	9cf108193b	i965/nir: Split shader optimization and lowering into three stages At the moment, brw_create_nir just calls the three stages in sequence so there's not much difference. Soon, however, we will want to start doing variants in NIR at which point the postprocessing step will have to move from shader create time to codegen time. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-23 11:02:15 -08:00
Jason Ekstrand	9d703de85a	i965: Use ull immediates in brw_inst_bits This fixes a regression introduced in `b1a83b5d1` that caused basically all shaders to fail to compile on 32-bit platforms. Reported-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-23 10:55:38 -08:00
Ilia Mirkin	e4c1221d36	docs: add missed freedreno features to relnotes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1" <mesa-stable@lists.freedesktop.org>	2015-11-23 12:32:54 -05:00
Ilia Mirkin	33dc9aac07	docs: update relnotes with new freedreno/a4xx support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 12:32:54 -05:00
Jose Fonseca	c9651f0264	svga: Add ASTC formats to format table. Fixes build. Otherwise untested. Trivial.	2015-11-23 16:45:28 +00:00
Ilia Mirkin	754b26e76d	freedreno/ir3: add support for a few gs5 ops Tested on a4xx. This is part of the builtins added by ARB_gpu_shader5 and GLSL ES 3.10. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:16 -05:00
Ilia Mirkin	cca8dd4e93	ttn: fix UMSB conversion Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:16 -05:00
Ilia Mirkin	190acb34ca	freedreno/a4xx: add ARB_texture_query_lod support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	f0e670bdd7	ttn: add LODQ support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	9761d5146f	freedreno/a4xx: re-emit program on dirty framebuffer The program emit depends on certain fb details. Make sure those get updated when the fb changes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	81b16350fa	freedreno/a4xx: use a factor of 32767 for snorm8 blending It appears that the hardware wants the integer to be scaled the same way that the hardware representation is. snorm16 uses one of the float factors, so this is only relevant for snorm8. This fixes a number of subcases of bin/fbo-blending-formats GL_EXT_texture_snorm Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-23 11:17:15 -05:00
Ilia Mirkin	6f17f19b17	freedreno/a4xx: only compute texture offset once for the view Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	f10bb0ac9e	freedreno/a4xx: add ARB_texture_view support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	1b9992b803	freedreno/a4xx: add formats for ARB_texture_buffer_object_rgb32 support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	f9549d0a0f	freedreno/a4xx: add ARB_texture_rgb10_a2ui support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	93905a8df1	freedreno/a4xx: add astc formats Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	6b21d3c92e	st/mesa: add astc support This doesn't account for the ldr/hdr distinction... that will probably have to be exposed via a separate cap. When relevant hardware appears, this can be worked out. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	3333977556	gallium: add ASTC formats Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-11-23 11:17:15 -05:00
Ilia Mirkin	1c7d0a6aa4	gallium/util: remove the fake format helpers for bptc and etc2 This was a silly hack that kept growing and growing. Instead, just write NULLs for those functions. No need to have helpers that just assert(0) when you call them. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-11-23 11:17:14 -05:00
Ilia Mirkin	c65bc2e805	freedreno/a4xx: support 16384 texels in buffer texture Looks like the width field's bitmask was off-by-one. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:14 -05:00
Ilia Mirkin	99f12a3f1a	freedreno/a4xx: add ARB_texture_buffer_range support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:14 -05:00
Ilia Mirkin	d4c40f99ab	freedreno/a4xx: add polygon mode support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-23 11:17:14 -05:00
Emil Velikov	b89d1b2ccf	configure.ac: default to disabled dri3 when --disable-dri is set Not too long ago, the dri3 code was living in src/glx, which in itself was guarded by HAVE_DRI_GLX. As the name suggests we didn't dive into the folder when dri was disabled, thus we missed that dri3 does not consider/honour --enable-dri. Cc: mesa-stable@lists.freedesktop.org Fixes: `6bd9ba7d07` "loader: Add dri3 helper" Cc: Pali Rohár <pali.rohar@gmail.com> Reported-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-23 12:08:04 +00:00
Emil Velikov	b9b0a1f58e	loader: unconditionally add AM_CPPFLAGS to libloader_la_CPPFLAGS It seems that due to the conditional autotools is getting confused and forgetting to add AM_CPPFLAGS when building libloader (when HAVE_DRICOMMON is not set). Cc: mesa-stable@lists.freedesktop.org Fixes: `5a79e0a8e3` "automake: loader: rework the CPPFLAGS" Reported-by: Pali Rohár <pali.rohar@gmail.com> Tested-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-23 12:07:50 +00:00
Emil Velikov	8a6d476588	pipe-loader: link against libloader regardless of libdrm presence Whether or not the loader has libdrm support is up to it. Anyone using the loader should just include it whenever they depend on it. Cc: mesa-stable@lists.freedesktop.org Fixes: `0f39f9cb7a` "pipe-loader: add a dummy 'static' pipe-loader" Reported-by: Jon TURNEY <jon.turney@dronecode.org.uk> Tested-by: Jon TURNEY <jon.turney@dronecode.org.uk> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-23 12:07:09 +00:00
Neil Roberts	2010de4015	i965: Handle lum, intensity and missing components in the fast clear It looks like the sampler hardware doesn't take into account the surface format when sampling a cleared color after a fast clear has been done. So for example if you clear a GL_RED surface to 1,1,1,1 then the sampling instructions will return 1,1,1,1 instead of 1,0,0,1. This patch makes it override the color that is programmed in the surface state in order to swizzle for luminance and intensity as well as overriding the missing components. Fixes the ext_framebuffer_multisample-fast-clear Piglit test. v2: Handle luminance and intensity formats Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>	2015-11-23 10:44:01 +01:00
Jason Ekstrand	f58813842b	nir: s/nir_type_unsigned/nir_type_uint v2: do the same in tgsi_to_nir (Samuel) v3: added missing cases after rebase (Iago) v4: Add a blank space after '#' in one of the comments (Matt) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-23 08:36:12 +01:00
Connor Abbott	fb93dd7aa8	nir/builder: only read meaningful channels in nir_swizzle() This way the caller doesn't have to initialize all 4 channels when they aren't using them. v2: Fix signed/unsigned comparison warning (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-23 08:36:12 +01:00
Connor Abbott	d982922b18	i965/fs: add stride restrictions for copy propagation There are various restrictions on what the hstride can be that depend on the Gen, and now that we're using hstride == 2 for packing/unpacking doubles, we're going to run into these restrictions a lot more often. Pull them out into a separate function, and move the one restriction we checked previously into it. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-23 08:30:30 +01:00
Connor Abbott	95ac3b1dae	i965/fs: don't propagate cmod when the exec sizes differ This can happen when the source of the compare was split by the SIMD lowering pass. Potentially, we could allow the case where the exec size of scan_inst is larger, and scan_inst has the right quarter selected, but doing that seems a little more risky. v2: Merge the bail condition into the previous if/break block (Matt) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-23 08:30:30 +01:00
Connor Abbott	70171a9c89	i965/fs: respect force_sechalf/force_writemask_all in CSE Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-23 08:30:30 +01:00
Connor Abbott	b1a83b5d1b	i965: fix 64-bit immediates in brw_inst(_set)_bits If we tried to get/set something that was exactly 64 bits, we would try to do (1 << 64) - 1 to calculate the mask which doesn't give us all 1's like we want. v2 (Iago) - Replace ~0 by ~0ull - Removed unnecessary parenthesis v3 (Kristian) - Avoid the conditional Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-11-23 08:30:30 +01:00
Connor Abbott	718b9f52dd	i965/fs: print non-1 strides when dumping instructions v2: - Simplify code (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-23 08:30:30 +01:00
Ilia Mirkin	4deb118d06	nv50/ir: fix (un)spilling of 3-wide results There are no 96-bit load/store operations, so we have to split it up into 32-bit parts, with a split/merge around it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90348 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-11-22 23:27:22 -05:00
Timothy Arceri	6463d36394	glsl: fix max binding validation for uniform blocks Regression as of `64710db664` We can't use the type returned by get_interface_type() as the interface type has arrays removed. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-23 13:47:19 +11:00
Ilia Mirkin	ad5f6b03e7	nv50,nvc0: properly handle buffer storage invalidation on dsa buffer In case that the buffer has no bind at all, assume it can be a regular buffer. This can happen on buffers created through the ARB_dsa interfaces. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-11-22 21:08:16 -05:00
Ilia Mirkin	079f713754	nouveau: use the buffer usage to determine placement when no binding With ARB_direct_state_access, buffers can be created without any binding hints at all. We still need to allocate these buffers to VRAM or GART, as we don't have logic down the line to place them into GPU-mappable space. Ideally we'd be able to shift these things around based on usage. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92438 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>	2015-11-22 20:58:56 -05:00
Eric Anholt	1b62a4e885	vc4: Take precedence over ilo when in simulator mode. They're exclusive at build time, but the ilo entry is always present, so we'd try to use it and fail out. v2: Add comment in the code, from Emil. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-22 13:15:43 -08:00
Eric Anholt	a39eac80fd	vc4: Just put USE_VC4_SIMULATOR in DEFINES. In the pipe-loader reworks, it was missed in one of the new directories it was used. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-22 13:15:40 -08:00
Nanley Chery	d1212abf50	mesa/teximage: Fix S3TC regression due to ASTC interaction A prior, literal reading of the ASTC spec led to the prohibition of some compressed formats being used against the targets: TEXTURE_CUBE_MAP_ARRAY and TEXTURE_3D. Since the spec does not specify interactions with other extensions for specific compressed textures, remove such interactions. Fixes the following Piglit tests on Gen9: piglit.spec.arb_direct_state_access.getcompressedtextureimage piglit.spec.arb_get_texture_sub_image.arb_get_texture_sub_image-getcompressed piglit.spec.arb_texture_cube_map_array.fbo-generatemipmap-cubemap array s3tc_dxt1 piglit.spec.ext_texture_compression_s3tc.getteximage-targets cube_array s3tc v2. Don't interact with other specific compressed formats (Ian). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91927 Suggested-by: Neil Roberts <neil@linux.intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-22 12:29:09 -08:00
Nanley Chery	21d43fe51a	mesa/extensions: Enable overriding permanently enabled extensions Provide the ability to prevent any permanently enabled extension from appearing in the string returned by glGetString[i](). Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-11-22 12:19:45 -08:00
Igor Gnatenko	05eed0eca7	virgl: pipe_virgl_create_screen is not static Cc: mesa-stable@lists.freedesktop.org Fixes: `17d3a5f857` "target-helpers: add a non-inline drm_helper.h" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93063 Signed-off-by: Igor Gnatenko <i.gnatenko.brain@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-22 11:17:17 +00:00
Kenneth Graunke	86fc97da06	i965: Fix num_uniforms count for scalar GS. I noticed that brw_vs.c does this. I believe the point is that nir->num_uniforms is either counted in scalar components (in scalar mode), or vec4 slots (in vector mode). But we want param_count to be in scalar components regardless, so we have to scale up in vector mode. We don't have to scale up in scalar mode, though. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-22 00:03:21 -08:00
Eric Anholt	4cff16bc3a	vc4: Use nir_channel() to simplify all of our nir_swizzle() cases.	2015-11-21 18:55:31 -08:00
Eric Anholt	81544f231a	vc4: Fix point size lookup. I think I may have regressed this in the NIR conversion. TGSI-to-NIR is putting the PSIZ in the .x channel, not .w, so we were grabbing some garbage for point size, which ended up meaning just not drawing points. Fixes glean pointAtten and pointsprite.	2015-11-21 18:55:31 -08:00
Jose Fonseca	4befd82a64	pipe-loader: Fix PATH_MAX define on MSVC.	2015-11-21 23:03:20 +00:00
Jose Fonseca	02afbd2476	scons: Conditionally use DRM module on pipe-loader. Fixes non Linux builds. Trivial.	2015-11-21 21:20:12 +00:00
Ilia Mirkin	22aeb0c568	freedreno/a4xx: disable blending and alphatest for integer rt0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-21 09:08:16 -05:00
Ilia Mirkin	4c170d9e1d	freedreno/a4xx: fix independent blend This fixes the ext_draw_buffers2 and arb_draw_buffers_blend tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-21 09:08:16 -05:00
Ilia Mirkin	801b55c2ee	freedreno/a4xx: enable ARB_base_instance support We already pass in start_instance in fd4_draw. Expose the extension. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-21 09:08:16 -05:00
Ilia Mirkin	f54c89f13e	freedreno/a4xx: set fetchsize in mem2gmem texture restore Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-21 09:08:16 -05:00
Ilia Mirkin	7426d9581a	freedreno/a4xx: add 11_11_10_float vertex type support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-21 09:08:16 -05:00
Ilia Mirkin	740eb63aa7	freedreno/a4xx: fix 3d texture setup Same fix as on a3xx - set the second (tiny) layer size bitfield to the smallest level's size so that the hw knows not to minify beyond that. This fixes texelFetch sampler3D piglits. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-21 09:08:16 -05:00
Ilia Mirkin	ecb0dcd34c	freedreno/a4xx: only align slices in non-layer_first textures When layer is the container, slices are tightly packed inside of each layer. We don't need any additional alignment. On a3xx, each slice contains all the layers, so having alignment makes sense. This fixes a whole slew of array-related piglits, including texelFetch and tex-miplevel-selection varieties. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-21 09:08:16 -05:00
Emil Velikov	428146522b	docs: add 11.2.0-devel release notes template, bump version Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 14:10:08 +00:00
Emil Velikov	623f64efc1	util: use RTLD_LOCAL with util_dl_open() Otherwise we risk things blowing up due to conflicting symbols. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:21 +00:00
Emil Velikov	8943a562e2	targets/nine: remove unused static functions Dead code since commit `8f50614910` Cc: Axel Davy <axel.davy@ens.fr> Cc: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:21 +00:00
Emil Velikov	42dde5aa24	targets/nine: add note about messy header inclusion order Cc: Axel Davy <axel.davy@ens.fr> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:21 +00:00
Emil Velikov	0942250781	targets/nine: add note about fd ownership v2: - move autotools hunk into correct patch - correct the note based on Axel's feedback Cc: Axel Davy <axel.davy@ens.fr> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:21 +00:00
Emil Velikov	f8a1665542	auxiliary/vl: Don't close the drm fd on failure Ported from an identically named commit in st/xa commit `35cf3831d7` Author: Thomas Hellstrom <thellstrom@vmware.com> Date: Thu Jul 3 02:07:36 2014 -0700 st/xa: Don't close the drm fd on failure v2 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:21 +00:00
Emil Velikov	e43a771dfa	st/dri: NULL check the pscreen earlier We delay the null check only to jump through hoops to work around that. Check early to make our lives easier. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:20 +00:00
Emil Velikov	13bccee87d	st/dri: Don't close the drm fd on failure Ported from an identically named commit in st/xa commit `35cf3831d7` Author: Thomas Hellstrom <thellstrom@vmware.com> Date: Thu Jul 3 02:07:36 2014 -0700 st/xa: Don't close the drm fd on failure v2 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:20 +00:00
Emil Velikov	b7f5c2ee48	target-helpers: remove inline_drm_helper.h As of earlier all the targets use the non inline version. Don't forget to remove the function prototypes/declarations. v2: rebase on top of virgl support. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:20 +00:00
Emil Velikov	dddedbec0e	{st,targets}/nine: use static/dynamic pipe-loader Analogous to previous commits. v2: add the missing winsys libs linkage Cc: Axel Davy <axel.davy@ens.fr> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:20 +00:00
Emil Velikov	611ef64ed5	{st,targets}/xa: use static/dynamic pipe-loader Analogous to previous commits. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:20 +00:00
Emil Velikov	1eb6e8a23c	{auxiliary,targets}/vl: use static/dynamic pipe-loader Analogous to previous commit. v2: rebase on top of vl_winsys_drm.c addition Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:20 +00:00
Emil Velikov	23fb11455b	{st,targets}/dri: use static/dynamic pipe-loader Convert DRI to use only the pipe-loader interface. With drisw_create_screen and kms_swrast_create_screen replaced by their pipe-loader equivalent, we can now drop them. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:20 +00:00
Emil Velikov	c4d337146a	pipe-loader: add preliminary Android support Add a 'static' pipe-loader build, which will be used with follow-up commits. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:20 +00:00
Emil Velikov	234b03cc23	pipe-loader: add preliminary scons support Add a 'static' pipe-loader build, which will be used with follow-up commits. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:20 +00:00
Emil Velikov	7999e6ddba	pipe-loader: don't mix code and variable declarations We cannot use this C99 feature here quite yet, as the code needs to be built with MSVC prior to 2013. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:20 +00:00
Emil Velikov	17d3a5f857	target-helpers: add a non-inline drm_helper.h Unlike the inline ones, here we'd want to have an extern definition of the functions. This is required as with follow-up commits, we'll gradually start using the static pipe-loader, with the latter needing the symbols. These are direct copy from the inline version. v2: - rebase on top of virgl support - add "driver missing" printfs (Nicolai) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:19 +00:00
Emil Velikov	af031deed6	target-helpers: move the DRI specifics to the target Rather than having all targets include the file, with only some defining the relevant guard macro, just move things where they are used. v2: rebase on top of virgl support. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:19 +00:00
Emil Velikov	950e06a29b	automake: remove no longer needed HAVE_LOADER_GALLIUM conditional As of the last few commits we have a static and a dynamic pipe-loader, either of which will be used with (almost) all targets. We can look into allowing the user to select which way the targets are built, be that 'static for all' or 'per target' in follow up commits. After which we can look into building only the static or dynamic version, although building both shouldn't cause any issues. Hack/workaround alert: Control the standalone pipe-drivers via HAVE_CLOVER. Will need to be fixed as the targets are converted/configure knobs are in. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:19 +00:00
Emil Velikov	be78f73b37	pipe-loader: wire up the 'static' sw pipe-loader Analogous to previous commit with a small catch. As the sw inline helpers are mere wrappers, and the screen <> winsys split is more prominent (with the latter not being part of the final pipe-driver), things will just work. v2: rebase on top of earlier 'consolidate teardown' changes Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:19 +00:00
Emil Velikov	1b589207de	pipe-loader: wire up the 'static' drm pipe-loader Add a list of driver descriptors and select one from the list, during probe time. As we'll need to have all the driver pipe_foo_screen_create() functions provided externally (i.e. from another static lib) we need a separate (non-inline) drm_helper, which contains the function declarations. v2: rebase on top of virgl support. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:19 +00:00
Emil Velikov	0f39f9cb7a	pipe-loader: add a dummy 'static' pipe-loader It is to be used in contrast of the dynamic one. The state-tracker does not need to know if the pipe-driver is built into the final blob or a separate object. This will allow us to move the logic to the final step (in target) where the appropriate pipe-loader will be chosen. Cc: Tom Stellard <thomas.stellard@amd.com> Cc: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:19 +00:00
Emil Velikov	ad12027d8f	gallium: rename libpipe_loader to libpipe_loader_dynamic With the next commits we'll introduce a 'static' version, which will essentially load the statically linked-in pipe-drivers, rather than the standalone pipe-$foo.so ones. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:19 +00:00
Emil Velikov	3ca12ee976	pipe-loader: dlopen/dlsym the pipe-driver at probe time Rather than giving false hopes that things might work, just check at probe time. This allows us to remove the duplication and consolidate the code wrt the upcoming static pipe-loader. Cc: Tom Stellard <thomas.stellard@amd.com> Cc: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:19 +00:00
Emil Velikov	e465de5a51	pipe-loader: annotate the ops as const data Already defined as such in struct pipe_loader_device::ops. Cc: Tom Stellard <thomas.stellard@amd.com> Cc: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:19 +00:00
Emil Velikov	46991ab9aa	pipe-loader: teardown the winsys, if create_screen fails i.e. plug some (hard to hit) memory leaks. v2: fix rebase fallout - really teardown the winsys (Brian) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:19 +00:00
Emil Velikov	d54ca54faa	pipe-loader: rework the sw backend Move the winsys into the pipe-target, similar to the hardware pipe-driver. v2: - move int declaration outside of loop (Brian) - fold the teardown into a goto + separate function. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:18 +00:00
Emil Velikov	f58a6f7be3	gallium: keep the libdrm link alongside libkmsdri.la Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:18 +00:00
Emil Velikov	ff9cd8a67c	pipe-loader: directly use pipe_loader_sw_probe_null() at probe time Due to the nature of the other sw winsys' we cannot use them during the generic probe stage. As such there is little point in keeping the abstraction layer. Cc: Tom Stellard <thomas.stellard@amd.com> Cc: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:18 +00:00
Emil Velikov	4e3c06a501	pipe-loader: add pipe_loader_sw_probe_init_common() helper Allows us to fold the duplication in pipe_loader_sw_probe_*(). Cc: Tom Stellard <thomas.stellard@amd.com> Cc: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:18 +00:00
Emil Velikov	6d68d714c0	gallium/tests: remove unneeded include paths The tests don't (and shouldn't) need to have anything driver and/or winsys specific. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:18 +00:00
Emil Velikov	74d41a32bc	gallium: remove library_path argument from pipe_loader_create_screen() Currently the location is determined at configure/build time and consistently copied across gallium. Just remove the extra argument, and use PIPE_SEARCH_DIR where appropriate. This will allow us to remove the duplication in the configuration and screen_create APIs by moving util_dl_get_proc_address() and friends to probe time. v2: rebase on top of vl_winsys_drm.c addition Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:18 +00:00
Emil Velikov	cbc4d9730a	targets/nine: remove the custom pipe-driver path management Since the up-streaming of nine, the static target was used by default. The dynamic pipe-drivers being available only via manual tweak of configure.ac. As we'll be removing the library_path argument from the pipe-loader with follow-up commits, we can remove D3D9_DRIVERS_PATH/D3D9_DRIVERS_DIR. Everyone doing local hacking on nine, or wishing to have a env override can bring them back within the pipe-loader. Cc: Axel Davy <axel.davy@ens.fr> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:18 +00:00
Emil Velikov	149454bb13	pipe-loader: remove HAVE_DRM_LOADER_GALLIUM and HAVE_PIPE_LOADER_DRM ... in favour of HAVE_LIBDRM. After all we solely want to build the code when the latter is available. In the not too distant future we will remove the libudev/sysfs dependency and simplify configure.ac even further. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:18 +00:00
Emil Velikov	33f1db1eb4	pipe-loader: add pipe_loader_sw_probe_kms() implementation Will be used as a counterpart for target-helpers' kms_swrast_create_screen(). Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:18 +00:00
Emil Velikov	be430726e2	configure: use HAVE_DRISW_KMS when handling kms swrast Using HAVE_DRI2 to manage it seems counter-intuitive. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:18 +00:00
Emil Velikov	f9c9471b76	targets/nine: use the existing sw_screen_wrap() over our custom version Cc: Axel Davy <axel.davy@ens.fr> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:17 +00:00
Emil Velikov	6bcd5f0d02	automake: use GALLIUM_PIPE_LOADER_DEFINES only where applicable As of last commit we no longer need the defines in order to have the function prototypes. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:17 +00:00
Emil Velikov	b7875ca493	pipe-loader: remove HAVE_PIPE_LOADER_foo function prototype guards They serve little to no purpose, as we don't need any additional dependencies (headers and/or symbols). On the other hand dropping them will allow us to use GALLIUM_PIPE_LOADER_DEFINES in only one single place - the pipe-loader. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:17 +00:00
Emil Velikov	c751d33a20	gallium/trace: remove useless NULL check from trace_screen_create() Currently every target makes sure that the screen is non-null prior to using the debug (trace including) wrappers. If that no longer holds true we want to know and fix this ASAP rather than silently bailing out. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:17 +00:00
Emil Velikov	e762a46a07	configure: remove obsolete _CLIENT comment The referenced variable(s) have been removed with commit `abc20120e4` (automake: pipe-loader: remove the 'client' pipe-loader) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2015-11-21 12:52:17 +00:00
Emil Velikov	1a18457a52	docs: add news item and link release notes for 11.0.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 12:42:48 +00:00
Emil Velikov	da2cb8a2ee	docs: add sha256 checksums for 11.0.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `2555e000fc`)	2015-11-21 12:41:22 +00:00
Emil Velikov	380aec1703	docs: add release notes for 11.0.6 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `04fd3a6f62`)	2015-11-21 12:41:21 +00:00
Ilia Mirkin	d8c26969d5	freedreno/a4xx: add missing formats to enable ARB_vertex_type_2_10_10_10_rev Same as commit `84d087aea` but for a4xx. The RE'd enums had the same issue too. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-20 20:41:39 -05:00
Matt Turner	f6986a81c9	i965: Test that nonrepresentable floats cannot be converted to VF. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-20 17:39:34 -08:00
Matt Turner	f450030f66	i965: Use ldexpf() in VF float test set up. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-20 17:39:34 -08:00
Matt Turner	0684aed8ab	i965/vec4: Initialize nir_inputs with src_reg(). nir_locals, nir_ssa_values, and nir_system_values are all dst_reg (not that that makes a whole lot of sense to me), and only nir_inputs is a src_reg. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-20 17:39:34 -08:00
Matt Turner	c875e3cdd2	i965/fs: Add support for gl_HelperInvocation system value. In most cases (when the negate is copy propagated and the MOV removed), this is two instructions on Gen >= 8 and only two instructions on earlier platforms -- and it doesn't use the flag register. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-20 17:39:33 -08:00
Matt Turner	4b15281295	i965: Add brw_imm_uv().	2015-11-20 17:39:33 -08:00
Matt Turner	ce11d4f369	i965: Don't bother setting regioning on immediates. The region fields are unioned with the immediate storage.	2015-11-20 17:39:33 -08:00
Matt Turner	c28b574170	nir: Add support for gl_HelperInvocation system value. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-11-20 17:39:33 -08:00
Ilia Mirkin	fe29330406	freedreno/a4xx: use hardware RGTC texture samplers a4xx hardware has real support for RGTC so there's no need to fake it like we do on a3xx. Undo the hacks, and keep track of an "internal format" of a resource, which on a3xx will be different, triggering the transfer-time conversions to take place. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-20 19:46:21 -05:00
Ilia Mirkin	39fa5c8419	freedreno/a4xx: hook up RGB565 format Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-20 19:46:21 -05:00
Ilia Mirkin	3b77826cc1	freedreno/a4xx: logic op handling Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-20 19:46:21 -05:00
Ilia Mirkin	e1319dcdd6	freedreno/a4xx: add 16-bit unorm/snorm format texturing/rendering Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-20 19:46:21 -05:00
Ilia Mirkin	ff9450ecd1	freedreno/a4xx: point regid to "red" even for alpha-only rb formats Looks like a4xx hw does this in a more standard way and we don't need to hack around it like we do on a3xx. Fixes GL_ALPHA formats in fbo-blending-formats, fbo-colormask-formats, and fbo-alphatest-formats. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-20 18:15:15 -05:00
Ilia Mirkin	4fd24caf92	ttn: add TEX2 support This fixes CubeArrayShadow tests (where the shadow comes in via a second arg to the TEX2 instruction). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2015-11-20 17:45:08 -05:00
Ilia Mirkin	c1babbd85c	freedreno: always set all border colors Instead of playing the guessing game as to which texture format reads from which border color encoding type, just write both of them always. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-20 17:44:10 -05:00
Ilia Mirkin	ec106e9f62	freedreno/a4xx: fix dst_alpha blend for RGBX render targets There are no native RGBX render formats, so we must manually force dst_alpha to be one, same as for a3xx. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-20 17:44:10 -05:00
Nicolai Hähnle	5bda3d0958	radeon: re-prepare query buffers on begin_query for predicate queries The point of prepare_buffer is to ensure that the query buffer contains valid initial data for conditional rendering: as long as the buffer is initialized correctly, the GPU is able to tell whether query results have been written already (and wait or fall back to unconditional rendering if desired). This means prepare_buffer needs to be called again when a buffer is reused. Conversely, for queries that cannot be used for conditional rendering (notably pipeline statistics), we can re-use buffers immediately, and they do not need to be initialized. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Andy Furniss <adf.lists@gmail.com>	2015-11-20 22:46:11 +01:00
Nicolai Hähnle	6f4fe8e76a	radeon: reset query buffers for PIPE_QUERY_TIMESTAMP Since begin_query is not called for this query type, we need to reset the query buffer state in end_query instead. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93015 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Andy Furniss <adf.lists@gmail.com> Tested-by: Mathias Tillman <master.homer@gmail.com>	2015-11-20 22:46:11 +01:00
Brian Paul	47fae842d0	mesa: update some old-style (K&R?) function pointer calls Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-20 14:09:15 -07:00
Brian Paul	1def5ef958	docs: mention GL 3.3 support for VMware driver in Mesa 11.1 relnotes Signed-off-by: Brian Paul <brianp@vmware.com>	2015-11-20 14:06:25 -07:00
Brian Paul	527466d9a1	svga: add num-bytes-uploaded HUD query To graph the number of bytes uploaded to GPU per frame (vertex buffer data, constant buffer data, texture data, etc). Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-20 13:40:06 -07:00
Brian Paul	e96d7a1489	svga: add some sanity check assertions in svga_buffer_transfer_map() Make sure y and z values of buffers are as expected. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-20 13:40:06 -07:00
Timothy Arceri	b109cd3c27	docs: mark compile-time constant expressions as done Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 07:30:18 +11:00
Timothy Arceri	f7af69c350	glsl: add subroutine index qualifier support ARB_explicit_uniform_location allows the index for subroutine functions to be explicitly set in the shader. This patch reduces the restriction on the index qualifier in validate_layout_qualifiers() to allow it to be applied to subroutines and adds the new subroutine qualifier validation to ast_function::hir(). ast_fully_specified_type::has_qualifiers() is updated to allow the index qualifier on subroutine functions when explicit uniform locations are available. A new check is added to ast_type_qualifier::merge_qualifier() to stop multiple function qualifiers from being defined; before this patch this would cause a segfault. Finally a new variable is added to ir_function_signature to store the index. This value is validated and the non explicit values assigned in link_assign_subroutine_types(). Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-11-21 07:30:12 +11:00
Timothy Arceri	02d2ab2378	glsl: add support for compile-time constant expressions This patch replaces the old integer constant qualifiers with either the new ast_layout_expression type if the qualifier requires merging or ast_expression if the qualifier can't have multiple declarations or if all but the newest qualifier is simply ignored. We also update the process_qualifier_constant() helper to be similar to the one in the ast_layout_expression class, but in this case it will be used to process the ast_expression qualifiers. Global shader layout qualifier validation is moved out of the parser in this change as we now need to evaluate any constant expression before doing the validation. V2: Fix minimum value check for vertices (Emil) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 07:28:06 +11:00
Timothy Arceri	0954b813a3	glsl: add new type for compile time constants In this patch we introduce a new ast type for holding the new compile-time constant expressions. The main reason for this is that we can no longer do merging of layout qualifiers before they have been converted into GLSL IR so we need to store them to be processed later. The new type has two helper functions: - process_qualifier_constant() Used to merge and then evaluate qualifier expressions - merge_qualifier() Simply appends a qualifier to a list to be merged later by process_qualifier_constant() In order to avoid cascading error messages the process_qualifier_constant() helpers return a bool Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 07:27:56 +11:00
Timothy Arceri	4196af4ce7	glsl: call set_shader_inout_layout() earlier This will allow us to add error checking to this function in a later patch, if we don't move it the error messages will go missing. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 07:27:49 +11:00
Timothy Arceri	e74fe2a844	glsl: replace binding layout min boundary check Use new helper that will in a later patch allow for compile time constants. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 07:27:42 +11:00
Timothy Arceri	64710db664	glsl: encapsulate binding validation and setting This change moves the binding layout handling code into an apply function to be consistent with other helper functions in the ast code, and to encapsulate the code so that when we introduce compile time constants the code will be much cleaner. One small downside is for unnamed interface blocks we will now be revalidating the binding for each member it's applied to. However this seems a small sacrifice in order to have code which is readable. We also remove the incorrect comment in the named interface code about propagating bindings to members which seems to have been copied from the unnamed interface code. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 07:27:30 +11:00
Timothy Arceri	db3c36aedf	glsl: move stream layout max validation This validation is moved later so we can validate the max value when compile time constant support is added in a later patch. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 07:27:21 +11:00
Timothy Arceri	17e224e8ec	glsl: move stream layout qualifier validation We are moving this out of the parser in preparation for compile time constant support. The reason a validation function is used rather than an apply function like what is used with bindings is because glsl allows streams to be defined on members of blocks even though they must match the stream that's associated with the current block, this means we need access to the value after validation to do this comparison. V2: Fix typo in comment (Emil) Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 07:27:15 +11:00
Timothy Arceri	efa34e4a1d	glsl: replace index layout min boundary check Use new helper that will in a later patch allow for compile time constants. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 07:27:09 +11:00
Timothy Arceri	1d87d6f9ca	glsl: remove duplicate validation for index layout qualifier The minimum value for index is validated in apply_explicit_location() and we want to remove validation from the parser so we can add compile time constant support. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 07:27:04 +11:00
Timothy Arceri	d1f23545a1	glsl: move location layout qualifier validation We are moving this out of the parser in preparation for compile time constant support. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 07:27:00 +11:00
Timothy Arceri	de8f0c9ab9	glsl: add process_qualifier_constant() helper For now this just validates that a qualifier is inside its minimum boundary, in a later patch we will expand it to evaluate compile time constants. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-21 07:26:55 +11:00
Samuel Pitoiset	f57285c8fc	docs: mark GL_AMD_performance_monitor for nv50 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-20 21:03:14 +01:00
Samuel Pitoiset	aede8ca9a7	nv50: expose two groups of compute-related MP perf counters This turns on GL_AMD_performance_monitor. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-20 21:03:14 +01:00
Ben Widawsky	0288f92e7b	i965/gen9: Support fast clears for 32b float SKL supports the ability to do fast clears and resolves of 32b RGBA as both integer and floats. This patch only enables float color clears because we haven't yet enabled integer color clears, (HW support for that was added in BDW). v2: Remove LUMINANCE16F and INTENSITY16F special cases since they are now handled by Neil's patch to disable MSAA fast clears. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-20 11:45:44 -08:00
Ben Widawsky	7c690da29c	Revert "i965/gen9: Enable rep clears on gen9" This reverts commit `8a0c85b258`. It's not a strict revert because I don't want to bring back the gen < 9 check at this point in time. Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-11-20 11:45:32 -08:00
Ben Widawsky	f838e53c70	Revert "i965/gen9: Disable MCS for 1x color surfaces" This reverts commit `dcd59a9e32`. Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-11-20 11:45:32 -08:00
Ben Widawsky	c4edc048c6	i965/meta/gen9: Individually fast clear color attachments The impetus for this patch comes from a seemingly benign statement within the spec (quoted within the patch). It is very important for clearing multiple color buffer attachments and can be observed in the following piglit tests: spec/arb_framebuffer_object/fbo-drawbuffers-none glclear spec/ext_framebuffer_multisample/blit-multiple-render-targets 0 v2: Doing the framebuffer binding only once (Chad) Directly use the renderbuffers from the mt (Chad) v3: Patch from Neil whose feedback I originally missed. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-11-20 11:45:32 -08:00
Ben Widawsky	6fa1130cd2	i965/skl: skip fast clears for certain surface formats Some of the information originally in this commit message is now in the patch before this. SKL adds compressible render targets and as a result mutates some of the programming for fast clears and resolves. There is a new internal surface type called the CCS. The old AUX_MCS bit becomes AUX_CCS_D. "Auxiliary Surfaces For Sampled Tiled Resource". The formats which are supported are defined in the table titled "Render Target Surface Types [SKL+]". There is no PRM yet to reference. The previously implemented helper function already does the right thing provided the table is correct. v2: Use better English in commit message (Matt) s/compressable/compressible/ (Matt) Don't compare bools to true (Matt) Use the helper function and don't increase the context size - this is mostly implemented in the patch just before this (Chad, Neil) Remove an "invalid" assert (Chad) Fix assertion to check num_samples > 1, instead of num_samples (Chad) v3: Use Matt's code as Requested-by: Chad. I didn't even look at it since Chad said he was fine with that, and presumably Matt is fine with it. v4: Use better quote from spec (Topi) Cc: Chad Versace <chad.versace@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-11-20 11:45:32 -08:00
Ben Widawsky	9d94eeb8a4	i965: Add lossless compression to surface format table Background: Prior to Skylake and since Ivybridge Intel hardware has had the ability to use a MCS (Multisample Control Surface) as auxiliary data in "compression" operations on the surface. This reduces memory bandwidth. This hardware was either used for MSAA compression, or fast clear operations. On Gen8, a similar mechanism exists to allow the hiz buffer to be sampled from, and therefore this feature is sometimes referred to more generally as "AUX buffers". Skylake adds the ability to have the display engine directly source compressed surfaces on top of the ability to sample from them. Inference dictates that enabling this display features adds a restriction to the formats which could actually be compressed. This is backed up by a blurb in the AUX_CCS_D section from the RENDER_SURFACE_STATE: "In addition, if the surface is bound to the sampling engine, Surface Format must be supported for Render Target Compression for surfaces bound to the sampling engine." The current set of surfaces seems to be a subset as compared to previous gens (see the next patch). Also, if I had to guess I would guess that future gens add support for more surface formats. To make handling this a bit easier to read, and more future proof, the support for this is moved into the surface formats table. Along with the modifications to the table, a helper function is also provided to determine if a surface is CCS_E compatible. Because fast clears are currently disabled on SKL, we can plumb the helper all the way through here, and not actually have anything break. v2: - rename ccs to ccs_e; Requested-by: Chad - rename lossless_compression to lossless_compression Requested-by: Chad - change meaning of brw_losslessly_compressible_format Requested-by: Chad - related changes to the code to reflect this. - remove excess ccs (Chad) v3: - Commit message changes (Topi) - Const some things which could be const (Topi) Requested-by: Chad Versace <chad.versace@intel.com> Requested-by: Neil Roberts <neil@linux.intel.com> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-20 11:45:32 -08:00
Ben Widawsky	d23aa634e0	i965/skl: Add fast color clear infrastructure Patch was originally called: i965/skl: Enable fast color clears on SKL Skylake introduces some differences in the way that fast clears are programmed and in the restrictions for using fast clears. Since some of these are non-obvious, and fast clears are currently disabled globally, we can enable the simple stuff here and leave the weirder stuff and separately reviewable work. Based on a patch originally from Kristian. Note that within this patch the change in scaling factors could be achieved with this hunk instead. I've opted to keep things more like how the docs describe it however. --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -150,9 +150,13 @@ intel_get_non_msrt_mcs_alignment(struct brw_context *brw, /* In release builds, fall through */ case I915_TILING_Y: *width_px = 32 / mt->cpp; - *height = 4; + if (brw->gen >= 9) + *height = 2; + else + *height = 4; v2: Add braces for the multiline (Matt + Chad) Comment updates (requested by Chad) Modified commit message Commit message from Chad explaining the MCS height change (Chad) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-20 11:45:32 -08:00
Ian Romanick	2f7d2fd997	docs: Add GL_EXT_shader_samples_identical to the release notes Trivial Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-20 11:38:11 -08:00
Leo Liu	8762570cc5	radeon/vce: disable two pipe mode for stoney Only one encoding pipe available for Stoney Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-11-20 13:21:54 -05:00
Leo Liu	99d92de5d0	radeon/vce: add new firmware interface support Add new interface to create and encode Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-11-20 13:21:54 -05:00
Emil Velikov	8fdb548799	egl: don't forget to ship platform_x11_dri3.h into the tarball Should have been a part of `f35198bade` Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-20 18:08:04 +00:00
Emil Velikov	ae6d6941f6	glsl: move builtin_type_macros.h into the correct list Commit `b9b40ef9b7` moved the file, but forgot to update the reference in the makefile. Thus the out of tree build was busted :\ Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-20 18:07:58 +00:00
Emil Velikov	c45b4257c2	automake: use static llvm for make distcheck With llvm 3.7 semi-dropping the autoconf build, we rely on their cmake build. With the latter of which annoyingly using another (busted?) SONAME. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-20 18:07:52 +00:00
Brian Paul	0743e14aee	mesa: remove unused var in _mesa_PushDebugGroup() Trivial.	2015-11-20 09:35:18 -07:00
Brian Paul	108013b8e5	mesa: whitespaces fixes in _mesa_one_time_init_extension_overrides() Trivial.	2015-11-20 09:35:05 -07:00
Nicolai Hähnle	8a125afa6e	radeon: ensure that timing/profiling queries are suspended on flush The queries_suspended_for_flush flag is redundant because suspended queries are not removed from their respective linked list. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-20 17:27:40 +01:00
Nicolai Hähnle	6a14a39fab	st/mesa: add support for batch driver queries to perfmon v2 + v3: forgot null-pointer checks (spotted by Samuel Pitoiset) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-20 17:27:36 +01:00
Nicolai Hähnle	424a614ff1	gallium/hud: add support for batch queries v2 + v3: be more defensive about allocations Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-20 17:27:32 +01:00
Nicolai Hähnle	d61d4df02e	gallium: add the concept of batch queries Some drivers (in particular radeon[si], but also freedreno judging from a quick grep) may want to expose performance counters that cannot be individually enabled or disabled. Allow such drivers to mark driver-specific queries as requiring a new type of batch query object that is used to start and stop a list of queries simultaneously. v3: adjust recently added nv50 queries v2: documentation for create_batch_query Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-20 17:27:28 +01:00
Nicolai Hähnle	c235300bfc	st/mesa: maintain active perfmon counters in an array It is easy enough to pre-determine the required size, and arrays are generally better behaved especially when they get large. v2: make sure init_perf_monitor returns true when no counters are active (spotted by Samuel Pitoiset) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-20 17:27:23 +01:00
Nicolai Hähnle	afa6121b4e	st/mesa: use BITSET_FOREACH_SET to loop through active perfmon counters Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-20 17:27:18 +01:00
Nicolai Hähnle	0aea83dc4a	st/mesa: store mapping from perfmon counter to query type Previously, when a performance monitor was initialized, an inner loop through all driver queries with string comparisons for each enabled performance monitor counter was used. This hurts when a driver exposes lots of queries. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-20 17:27:09 +01:00
Nicolai Hähnle	4e1339691d	st/mesa: map semantic driver query types to underlying type Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-20 17:26:59 +01:00
Nicolai Hähnle	050db20d37	gallium/hud: remove unused field in query_info Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-20 17:26:50 +01:00
Nicolai Hähnle	ddf27a3dd0	gallium: remove pipe_driver_query_group_info field type This was only used to implement an unnecessarily restrictive interpretation of the spec of AMD_performance_monitor. The spec says A performance monitor consists of a number of hardware and software counters that can be sampled by the GPU and reported back to the application. I guess one could take this as a requirement that counters _must_ be sampled by the GPU, but then why are they called _software_ counters? Besides, there's not much reason _not_ to expose all counters that are available, and this simplifies the code. v3: add a missing change in the nouveau driver (thanks Samuel Pitoiset) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-20 17:26:39 +01:00
Roland Scheidegger	24dc0316b4	gallivm: use sampler index 0 for texel fetches texel fetches don't use any samplers. Previously we just set the same number for both texture and sampler unit (as per "ordinary" gl style sampling where the numbers are always the same) however this would trigger some assertions checking that the sampler index isn't over PIPE_MAX_SAMPLERS limit elsewhere with d3d10, so just set to 0. (Fixing the assertion instead isn't really an option, the sampler isn't really used but might still pass an out-of-bound pointer around and even copy some things from it.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-11-20 17:00:15 +01:00
Ilia Mirkin	9a93da4e83	freedreno/a4xx: add BPTC support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-20 09:25:39 -05:00
François Tigeot	8a94ba5e0c	xmlconfig: Add support for DragonFly Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-20 11:18:47 +00:00
Mauro Rossi	480ba46bcb	android: export the path of glsl nir headers The change is necessary to avoid building errors in glsl and i965 modules due to missing glsl_types.h header Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-20 11:18:19 +00:00
Boyan Ding	b8547a5063	mesa: re-enable KHR_debug for ES contexts With the earlier issues resolved we can expose the extension. Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-20 11:14:05 +00:00
Boyan Ding	ab7294668c	main: Don't restrict several KHR_debug enum to desktop GL In preparation for supporting GL_KHR_debug in OpenGL ES v2: add a missing hunk in _mesa_IsEnabled (Emil) Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-20 11:14:05 +00:00
Emil Velikov	af27236854	mesa: use the correct string for the ES GL_KHR_debug functions As defined in the spec when implemented in an OpenGL ES context, all entry points defined by this extension must have a "KHR" suffix. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-20 11:14:05 +00:00
Gregory Hainaut	9108a785a0	glsl: avoid linker and user varying location to overlap Current behavior on the interface matching: layout (location = 0) out0; // Assigned to VARYING_SLOT_VAR0 by user out1; // Assigned to VARYING_SLOT_VAR0 by the linker New behavior on the interface matching: layout (location = 0) out0; // Assigned to VARYING_SLOT_VAR0 by user out1; // Assigned to VARYING_SLOT_VAR1 by the linker v4: * Fix variable name in assert Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-11-20 22:04:02 +11:00
Emil Velikov	3afb253e9b	auxiliary/vl/dri2: coding style fixes Rewrap long(ish) lines, add space between struct foo and *. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-20 10:58:45 +00:00
Emil Velikov	b31f092bfb	auxiliary/vl/dri2: hide internal functions Analogous to previous commit. While we're here prefix all functions identically -> vl_dri2_foo Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-20 10:58:45 +00:00
Emil Velikov	4533c022f4	auxiliary/vl/drm: hide internal functions As of last commit everyone is using the vl_screen dispatch, thus we can hide this function from the headers and make it static. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-20 10:58:45 +00:00
Emil Velikov	abbfda60d8	st/vdpau: use the vl_screen dispatch Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-20 10:58:45 +00:00
Emil Velikov	4307155127	st/xvmc: use the vl_screen dispatch Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-20 10:58:45 +00:00
Emil Velikov	422356ed2f	st/va: use the vl_screen dispatch Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-20 10:58:45 +00:00
Emil Velikov	9eb109f4d3	st/omx: use the vl_screen dispatch Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-20 10:58:44 +00:00
Emil Velikov	32094979f7	auxiliary/vl/dri2: setup the dispatch Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-20 10:58:44 +00:00
Emil Velikov	6150d8d4bd	auxiliary/vl/drm: use a label for the error path ... just like every other place in gallium. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-20 10:58:44 +00:00
Emil Velikov	d03d9ecafa	auxiliary/vl/drm: setup the dispatch Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-20 10:58:44 +00:00
Emil Velikov	6b152ee7b6	auxiliary/vl: add dispatch table As mentioned previously, it will allow us to use different vl backend in a generic way from either video state-tracker. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-20 10:58:41 +00:00
Emil Velikov	2bd9116b82	auxiliary/vl: rename vl_screen_create to vl_dri2_screen_create In a preparation of having proper multi-platform/backend handling in VL. With follow up commits we'll introduce a dispatch within vl_screen similar to the one in pipe_screen. This way any VL state-tracker can operate seamlessly, considering the backend/platform is properly setup. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-20 10:56:34 +00:00
Emil Velikov	c31218cdb3	st/va: trivial cleanup Drop the temporary variable and fold the two conditional. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-20 10:56:17 +00:00
Emil Velikov	a8f45e0161	st/omx: straighten get/put_screen The current code is busted in a number of ways. - initially checks for omx_display (rather than omx_screen), which may or may not be around. - blindly feeds the empty env variable string to loader_open_device() - reads the env variable every time get_screen is called - the latter manifests into memory leaks, and other issues as one sets the variable between two get_screen calls. Additionally it cleans up a couple of extra bits - drops unneeded set/check of omx_display. - make the teardown (put_screen) order was not symmetrical to the setup (get_screen) v2: Drop the "is empty string" check (Leo) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-11-20 10:56:10 +00:00
Emil Velikov	7157085140	automake: loader: don't create an empty dri3 helper Seems that creating an empty one does not fair too well with MacOSX's ar. Considering that all the users of the helper include it only when needed, let's reshuffle the makefile. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92985 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-20 10:40:35 +00:00
Emil Velikov	115f179852	automake: loader: honour the XCB_DRI3 cflags Without this the compilation will fail, as the headers are installed in a non-default location. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-20 10:40:29 +00:00
Emil Velikov	166314dd88	automake: egl: add symbols test Should help us catch issues where we expose any extra symbols by mistake. Just like the ones fixes with previous commit. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Matt Turner <mattst88@gmail.com>	2015-11-20 10:40:23 +00:00
Emil Velikov	5a79e0a8e3	automake: loader: rework the CPPFLAGS Rather than duplicating things, just use the generic AM_CPPFLAGS. This has the fortunate side-effect of adding VISIBILITY_CFLAGS for the dri3 helper. The latter of which was erroneously exposing some internal symbols. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reported-by: Kai Wasserbäch <kai@dev.carbon-project.org> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-20 10:40:11 +00:00
Ian Romanick	99840eb983	i965: Enable EXT_shader_samples_identical On the vec4 backend, textureSamplesIdentical() will always return false. There are currently no test cases for the vec4 backend, so we don't have much confidence in any implementation. We also don't think anyone is likely to miss it. v2: Handle immediate value for MCS smarter. Rebase on changes to nir_texop_sampels_identical (missing second parameter). Suggested by Jason. v3: Add Neil's code to handle 16x MSAA in the FS. Also rebase on top of `f9a9ba5e`. Stub out the vec4 implementation. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v2] Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2]	2015-11-19 20:17:16 -08:00
Ian Romanick	84b6c64efc	i965/vec4: Handle nir_tex_src_ms_index more like the scalar v2: Rebase on top of `f9a9ba5e`. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-11-19 20:17:16 -08:00
Ian Romanick	457bb290ef	nir: Add nir_texop_samples_identical opcode This is the NIR analog to GLSL IR ir_samples_identical. v2: Don't add the second nir_tex_src_ms_index parameter. Suggested by Ken and Jason. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-11-19 20:17:16 -08:00
Ian Romanick	06c56f443a	glsl: Add textureSamplesIdenticalEXT built-in functions Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-11-19 20:17:16 -08:00
Ian Romanick	8343583557	glsl: Add ir_samples_identical opcode Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-11-19 20:17:16 -08:00
Ian Romanick	ef54434c52	glsl: Extension tracking for EXT_shader_samples_indentical Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-11-19 20:17:16 -08:00
Ian Romanick	ff59700d29	mesa: Extension tracking for EXT_shader_samples_indentical Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-11-19 20:17:15 -08:00
Ian Romanick	b1b9f68d4c	Import current draft of EXT_shader_samples_identical spec v2: Add Neil to the list of contributors. I meant to do that before, but Matt reminded me. v3: Fix typos noticed by Nicolai. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-11-19 20:17:15 -08:00
Rob Clark	acca6c65d3	nir: add nir_ssa_for_alu_src() Using something like: numer = nir_ssa_for_src(bld, alu->src[0].src, nir_ssa_alu_instr_src_components(alu, 0)); for alu src's with swizzle, like: vec1 ssa_10 = intrinsic load_uniform () () (0, 0) vec2 ssa_11 = intrinsic load_uniform () () (1, 0) vec2 ssa_2 = udiv ssa_10.xx, ssa_11 ends up turning into something like: vec1 ssa_10 = intrinsic load_uniform () () (0, 0) vec2 ssa_11 = intrinsic load_uniform () () (1, 0) vec2 ssa_13 = imov ssa_10 ... because nir_ssa_for_src() ignore's the original nir_alu_src's swizzle. Instead for alu instructions, nir_src_for_alu_src() should be used to ensure the original alu src's swizzle doesn't get lost in translation: vec1 ssa_10 = intrinsic load_uniform () () (0, 0) vec2 ssa_11 = intrinsic load_uniform () () (1, 0) vec2 ssa_13 = imov ssa_10.xx ... v2: check for abs/neg, and re-use existing nir_alu_src Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-19 20:03:32 -05:00
Rob Clark	c73f40c473	nir: fix missing increments of num_inputs/num_outputs Note: not quite perfect, we should use type_size vfunc (in compiler_options or nir_shader?) to determine how much we increment num_inputs/outputs/uniforms. But we don't have that yet, so let's at least fix things for the existing users of these passes. Signed-off-by: Rob Clark <robclark@freedesktop.org> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-19 20:03:32 -05:00
Rob Clark	fec9367deb	nir/print: show # of uniforms/inputs/outputs Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-19 20:03:32 -05:00
Rob Clark	01e94d8d5d	nir/print: show shader name/label if set Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-19 20:03:32 -05:00
Rob Clark	006e4f070f	nir: add nir_var_all enum Otherwise, passing -1 gets you: error: invalid conversion from 'int' to 'nir_variable_mode' [-fpermissive] Signed-off-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-19 20:03:32 -05:00
Ilia Mirkin	769b3ab6c5	freedreno/a4xx: fix 5_5_5_1 texture sampler format This fixes teximage-colors, fbo-generatemipmap-formats, and probably others (in relation to the RGB5 formats, others still fail). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-19 19:00:18 -05:00
Ilia Mirkin	a05e5491c3	freedreno/a4xx: add depth clamp and halfz clip Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-19 19:00:18 -05:00
Ilia Mirkin	b17a405609	freedreno/a4xx: allow seamless cubemap filtering to be enabled per-texture Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-19 19:00:18 -05:00
Ilia Mirkin	0a4462ad6e	freedreno/a4xx: support lod_bias The lower layers assume that we support this, and it's been core since GL 1.4. This fixes a slew of piglit tests, especially around tex-miplevel-selection. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-19 19:00:17 -05:00
Samuel Pitoiset	0cfc1304be	nv50: allow using inline vertex data submit when gl_VertexID is used The hardware can actually generates vertexid when vertices come from a client-side buffer like when glDrawElements is used. This doesn't fix (or break) any piglit tests but it improves the previous attempt of Ilia (`c830d19` "nv50: avoid using inline vertex data submit when gl_VertexID is used") The only disadvantage is that only works on G84+, but we don't really care of that weird and old NV50 chipset. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-19 21:11:38 +01:00
Samuel Pitoiset	9e40a621c1	nv50: add NV84_3D macro Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-19 21:11:27 +01:00
Matt Turner	a5b3115f0a	i965: Drop IMM fs_reg/src_reg -> brw_reg conversions. The previous two commits make this unnecessary. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-19 11:12:24 -08:00
Matt Turner	f9a9ba5eac	i965/vec4: Replace src_reg(imm) constructors with brw_imm_*(). Cuts 1.5k of .text. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-19 11:12:24 -08:00
Matt Turner	9b978046eb	i965/fs: Use brw_imm_uw(). W/UW immediates are 16-bits, but those 16-bits must be replicated in the high 16-bits of the 32-bit field. Remove the useless W/UW immediate saturating code, since we'll now be using the appropriate immediate (and W/UW immediates in the IR can now no longer be larger than 16-bits). Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-19 11:12:24 -08:00
Matt Turner	3ccc41ecfc	i965/fs: Replace fs_reg(imm) constructors with brw_imm_*(). Cuts 10k of .text, of which only 776 bytes are the fs_reg constructor implementations themselves. text data bss dec hex filename 5204535 214112 27784 5446431 531b1f i965_dri.so before 5193977 214112 27784 5435873 52f1e1 i965_dri.so after Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-19 11:12:24 -08:00
Matt Turner	c15a407eb4	i965: Make brw_imm_vf4() take 8-bit restricted floats. This partially reverts commit `bbf8239f92`. I didn't like that commit to begin with -- computing things at compile time is fine -- but for purposes of verifying that the resulting values are correct, looking up 0x00 and 0x30 in a table is a lot better than evaluating a recursive function. Anyway, by making brw_imm_vf4() take the actual 8-bit restricted floats directly (instead of only integral values that would be converted to restricted float), we can use this function as a replacement for the vector float src_reg/fs_reg constructors. brw_float_to_vf() is not currently an inline function, so it will not be evaluated at compile time. I'll address that in a follow-up patch. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-19 11:12:24 -08:00
Nanley Chery	e8c5ef3eca	mesa: Add test for sorted extension table Enable developers to know if the table's alphabetical sorting is maintained or lost. v2: Move "" next to pointer name (Matt) Include extensions_table.h instead of extensions.h (Ian) Remove extra " " in comment (Ian) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-19 11:12:45 -08:00
Nanley Chery	f030227f46	mesa/extensions: Sort the extension table alphabetically Make it easier to determine where to add new extensions. Performed with the vim sort command. v2: Insert newline after last #define (Matt) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-19 10:35:20 -08:00
Ilia Mirkin	bcda79676a	docs: GL3.1 for a3xx and a4xx Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-19 12:26:28 -05:00
Ryan Houdek	0ec218d167	mesa: enable EXT_blend_func_extended if the driver supports the ARB version Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-19 11:39:51 -05:00
Ryan Houdek	f7c23f225f	mesa: allow MAX_DUAL_SOURCE_DRAW_BUFFERS to be available to ES Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-19 11:39:51 -05:00
Ryan Houdek	4b549f0d8c	mesa: enable usage of blend_func_extended blend factors in GLES2 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-19 11:39:51 -05:00
Ryan Houdek	33ddc8e865	glsl: add a parse check to check for the index layout qualifier This can only be used if EXT_blend_func_extended is enabled Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-19 11:39:51 -05:00
Ryan Houdek	ef9e6d1ec8	glsl: add GL_EXT_blend_func_extended preprocessor define Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-19 11:39:51 -05:00
Ryan Houdek	1d1d02f2ac	glsl: add support for EXT_blend_func_extended builtins gl_MaxDualSourceDrawBuffersEXT - Maximum dual-source draw buffers supported For ESSL 1.0, it provides two builtins since you can't have user-defined color output variables: gl_SecondaryFragColorEXT gl_SecondaryFragDataEXT[MaxDSDrawBuffers] Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-19 11:39:51 -05:00
Ryan Houdek	ceecb0876f	glsl: add EXT_blend_func_extended parser enables This adds a state for the maximum dual source draw variables available and the variable for determining if the extension has been enabled in the program shaders. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-19 11:39:51 -05:00
Ryan Houdek	625414f78c	glapi: add EXT_blend_func_extended XML definitions Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-19 11:39:51 -05:00
Brian Paul	15f8dc7b23	os: check for GALLIUM_PROCESS_NAME to override os_get_process_name() Useful for debugging and for glretrace. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-11-19 09:23:04 -07:00
Connor Abbott	f1ba0a5ea0	glsl: fix ir_constant::equals() for doubles Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-11-19 09:16:18 +01:00
Connor Abbott	84ed3819a4	glsl: fix isinf() for doubles Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-11-19 09:16:18 +01:00
Connor Abbott	7820b2c071	nir: fix constant folding of bfi Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-11-19 09:16:18 +01:00
Brian Paul	1cfffb95eb	hud: fix Windows build break Protect signal-related code with PIPE_OS_UNIX test. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-11-19 07:57:09 +00:00
Ian Romanick	2f55476153	glsl: Fix off-by-one error in array size check assertion Apparently, this has been a bug since 2010 (`c30f6e5d`). Also use ARRAY_SIZE instead of open coding it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2015-11-18 18:35:56 -08:00
Ian Romanick	0aded03046	mesa: Don't expose GL_EXT_shader_integer_mix in GLES 1.x There are no shaders, so it doesn't even make sense to expose the extension. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Nanley Chery <nanley.g.chery@intel.com>	2015-11-18 18:35:56 -08:00
Ian Romanick	37c2cfa6bc	glsl: Silence unused parameter warnings builtin_functions.cpp:5289:52: warning: unused parameter 'num_arguments' [-Wunused-parameter] unsigned num_arguments, ^ builtin_functions.cpp:5290:52: warning: unused parameter 'flags' [-Wunused-parameter] unsigned flags) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-18 18:35:56 -08:00
Ian Romanick	c82498c4da	glsl: Silence ignored qualifier warning I think the intention was to mark the "this" parameter as const, but const goes on the other end to do that. In file included from glsl_symbol_table.cpp:26:0: ast.h:339:35: warning: type qualifiers ignored on function return type [-Wignored-qualifiers] const bool is_single_dimension() ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-11-18 18:35:56 -08:00
Kenneth Graunke	fc19a0d2e4	i965: Allow indirect GS input indexing in the scalar backend. This allows arbitrary non-constant indices on GS input arrays, both for the vertex index, and any array offsets beyond that. All indirects are handled via the pull model. We could potentially handle indirect addressing of pushed data as well, but it would add additional code complexity, and we usually have to pull inputs anyway due to the sheer volume of input data. Plus, marking pushed inputs as live due to indirect addressing could exacerbate register pressure problems pretty badly. We'd need to be careful. v2: Use updated MOV_INDIRECT opcode. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-11-18 15:42:36 -08:00
Jimmy Berry	09d610796c	gallium/hud: document GALLIUM_HUD_PERIOD in envvars.html. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-11-19 00:02:34 +01:00
Jimmy Berry	56a1c10bb8	gallium/hud: control visibility at startup and runtime. - env GALLIUM_HUD_VISIBLE: control default visibility - env GALLIUM_HUD_SIGNAL_TOGGLE: toggle visibility via signal Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-11-19 00:02:33 +01:00
Jason Ekstrand	0bee3acc2a	i965/nir: Add hooks for testing nir_shader_clone This commit adds code for testing nir_shader_clone by running it after each and every optimization pass and throwing away the old shader. Testing nir_shader_clone is hidden behind a new INTEL_CLONE_NIR environment variable. Reviewed-by: Rob Clark <robclark@freedesktop.org> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-18 12:28:55 -08:00
Jason Ekstrand	9fbd390dd4	nir: Add support for cloning shaders This commit is heavily based on one by Rob Clark <robdclark@gmail.com> but reworked to re-use nir_create functions and do less hashing. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2015-11-18 12:28:32 -08:00
Kenneth Graunke	9ff71b649b	i965/nir: Validate that NIR passes call nir_metadata_preserve(). Failing to call nir_metadata_preserve() can have nasty consequences: some pass breaks dominance information, but leaves it marked as valid, causing some subsequent pass to go haywire and probably crash. This pass adds a simple validation mechanism to ensure passes handle this properly. We add a new bogus metadata flag that isn't used for anything in particular, set it before each pass, and ensure it isn't still set after the pass. nir_metadata_preserve will reset the flag, so correct passes will work, and bad passes will assert fail. (I would have made these functions static inline, but nir.h is included in C++, so we can't bit-or enums without lots of casting...) Thanks to Dylan Baker for the idea. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-18 12:28:32 -08:00
Kenneth Graunke	7bc0978999	i965/nir: Add OPT() and OPT_V() macros for invoking NIR passes. OPT() is the normal macro for passes that return booleans, while OPT_V() is a variant that works for passes that don't properly report progress. (Such passes should be fixed to return a boolean, eventually.) These macros take care of calling nir_validate_shader() and setting progress appropriately. In the future, it would be easy to add shader dumping similar to INTEL_DEBUG=optimizer by extending the macro. v2 (Jason Ekstrand): - Fix an unused variable warning Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-18 12:28:32 -08:00
Rob Clark	d27ae2cf8c	nir: add array length field This will simplify things somewhat in clone. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-18 12:28:32 -08:00
Rob Clark	624ec66653	nir: remove nir_variable::max_ifc_array_access No users. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-18 12:28:32 -08:00
Rob Clark	4671c13852	freedreno/a4xx: add fake RGTC support (required for GL3) The a4xx bits corresponding to 'freedreno/a3xx: add fake RGTC support (required for GL3)' TODO some more r/e.. maybe we get lucky and hw supports some of this directly? For now this will help us enable gl3. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-18 14:31:13 -05:00
Rob Clark	2379cc9fe0	freedreno/a4xx: add compressed texture formats Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-18 14:31:13 -05:00
Rob Clark	fadd39442b	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-18 14:31:13 -05:00
Ilia Mirkin	4607b2b9b6	freedreno: expose GLSL 140 and fake MSAA for GL3.0/3.1 support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-18 14:31:13 -05:00
Ilia Mirkin	9c409c8df3	freedreno/a3xx: fix texture buffers, enable offsets The main issue is that the current logic looked into cso->u.tex, which is the wrong side of the union to look into for texture buffers. While I was at it, it was easy enough to add the logic to handle offsets (first_element). - reduce texture buffer size limit (determined experimentally) - don't look at first/last levels, instead look at first/last element - include the first element offset - set offset alignment to 16 (determined experimentally) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-18 14:31:13 -05:00
Ilia Mirkin	d69e557f2a	freedreno: add support for conditional rendering, required for GL3.0 A smarter implementation would make it possible to attach this to emit state for the BY_REGION versions to avoid breaking the tiling. But this is a start. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-18 14:31:13 -05:00
Ilia Mirkin	059da344ec	freedreno/a3xx: add fake RGTC support (required for GL3) Also throw in LATC while we're at it (same exact format). This could be made more efficient by keeping a shadow compressed texture to use for returning at map time. However... it's not worth it for now... presumably compressed textures are not updated often. Lastly fix up Z32S8 transfers to non-0 layers. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-18 14:31:13 -05:00
Ilia Mirkin	84d087aea2	freedreno/a3xx: add missing formats to enable ARB_vertex_type_2_10_10_10_rev The previously RE'd formats were from an ES driver implementing OES_vertex_type_10_10_10_2 and thus backwards. A future change could add the 2_10_10_10 support. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-18 14:31:13 -05:00
Rob Clark	8106fec74c	freedreno/a3xx+a4xx: fix for stk binning pass hang We'd end up in a state where shader uses no inputs, yet num_elements is greater than zero. Triggered by a TF vertex shader which did: gl_Position = vec4(0.0, 0.0, 0.0, 0.0); resulting in a binning pass variant with no inputs. Includes equiv fix in a4xx, even though we don't have binning-pass enabled yet on a4xx. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-18 14:31:13 -05:00
Rob Clark	b24c9a8aee	freedreno/a3xx+a4xx: fix GL_POINTS lockup w/ GLES point_size_per_vertex is always TRUE for GLES, causing us to configure the hw as if gl_PointSize was written, even if it was not. Which makes for grumpy hw. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-18 14:31:13 -05:00
Ilia Mirkin	b40e144a66	nir: fix typo in idiv lowering, causing large-udiv-udiv failures In nv50, and in the python script that Rob circulated, we do: bld.mkCmp(OP_SET, CC_GE, TYPE_U32, (s = bld.getSSA()), TYPE_U32, m, b); Do the same in the nir div lowering pass. This fixes the large-udiv-udiv piglit tests on freedreno. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-18 14:31:13 -05:00
Oded Gabbay	4581f8428e	llvmpipe: disable VSX in ppc due to LLVM PPC bug This patch disables the use of VSX instructions, as they cause some piglit tests to fail For more details, see: https://llvm.org/bugs/show_bug.cgi?id=25503#c7 With this patch, ppc64le reaches parity with x86-64 as far as piglit test suite is concerned. v2: - Added check that we have at least LLVM 3.4 - Added the LLVM bug URL as a comment in the code v3: - Only disable VSX if Altivec is supported, because if Altivec support is missing, then VSX support doesn't exist anyway. - Change original patch description. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-11-18 21:27:29 +02:00
Ilia Mirkin	8e68113c1a	nvc0/ir: actually emit AFETCH on kepler Looks like this was forgotten in the commit which added the AFETCH logic. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-18 14:26:16 -05:00
Kenneth Graunke	2631bfd62c	nir: Store the size of the TCS output patch in nir_shader_info. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-18 10:49:18 -08:00
Kenneth Graunke	b196f1fff3	i965: Add enums for 3DSTATE_TE field values. 3DSTATE_TE has partitioning, output topology, and domain fields, each of which has several enumerated values. We'll also need to switch on the domain, so enums (rather than #defines) seem like a natural fit. I chose to put these in brw_compiler.h because they'll be stored in struct brw_tes_prog_data, which will live there. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-11-18 10:49:18 -08:00
Ian Romanick	72e232374e	meta/generate_mipmap: Don't leak the framebuffer object Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-11-18 09:38:21 -08:00
Brian Paul	1a48326a84	svga: use more VGPU10 formats We always want to prefer the VGPU10 formats over the VGPU9 ones when we have VGPU10 support. Original patch by Jose and updated by Brian. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-11-18 09:16:12 -07:00
Brian Paul	1a90e3e1e3	svga: add/use new svga_sampler_format() function This is important for the case of sampling from a depth texture. In that case, we need to sample the texture as if it were a single-channel color texture. For other/color formats, we can use the format as-is. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-11-18 09:15:54 -07:00
Nicolai Hähnle	27ce75ed12	radeon: count cs dwords separately for query begin and end This will be important for perfcounter queries. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-18 12:27:13 +01:00
Nicolai Hähnle	ffd01b7781	radeon: expose r600_query_hw functions for reuse Reviewed-by: Marek Olšák <marek.olsak@amd.com> [Fixed a rebase conflict and re-tested before pushing.]	2015-11-18 12:27:13 +01:00
Nicolai Hähnle	50f0f938e3	radeon: implement r600_query_hw_get_result via function pointers We will need the clear_result override for the batch query implementation. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-18 12:27:13 +01:00
Nicolai Hähnle	c207c55fc0	radeon: split hw query buffer handling from cs emit The idea here is that driver queries implemented outside of common code will use the same query buffer handling with different logic for starting and stopping the corresponding counters. Reviewed-by: Marek Olšák <marek.olsak@amd.com> [Fixed a rebase conflict and re-tested before pushing.]	2015-11-18 12:27:13 +01:00
Nicolai Hähnle	1d10b3d01e	radeon: convert hardware queries to the new style Move r600_query and r600_query_hw into the header because we will want to reuse the buffer handling and suspend/resume logic outside of the common radeon code. Reviewed-by: Marek Olšák <marek.olsak@amd.com> [Fixed a rebase conflict and re-tested before pushing.]	2015-11-18 12:27:12 +01:00
Nicolai Hähnle	019106760d	radeon: convert software queries to the new style Software queries are all queries that do not require suspend/resume and explicit handling of result buffers. Reviewed-by: Marek Olšák <marek.olsak@amd.com> [Fixed a rebase conflict and re-tested before pushing.]	2015-11-18 12:27:12 +01:00
Nicolai Hähnle	829a9808a9	radeon: add query handler function pointers The goal here is to be able to move the implementation details of hardware- specific queries (in particular, performance counters) out of the common code. Reviewed-by: Marek Olšák <marek.olsak@amd.com> [Fixed a rebase conflict and re-tested before pushing.]	2015-11-18 12:27:12 +01:00
Nicolai Hähnle	50cab4788d	radeon: move R600_QUERY_* constants into a new query header file More query-related structures will have to be moved into their own header file to support hardware-specific performance counters. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-18 12:27:12 +01:00
Nicolai Hähnle	c56e83e518	radeon: cleanup driver query list Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-18 12:27:12 +01:00
Nicolai Hähnle	e117e74baf	radeon: move get_driver_query_info to r600_query.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-18 12:27:11 +01:00
Neil Roberts	5dfb4dbc05	i965: Prevent fast clears for MSRTs on SKL There are currently a bunch of formats that behave strangely when sampling the cleared color from the MCS buffer on SKL. They seem to mostly be formats that don't have an alpha component, although it's not all of them, and we haven't yet found anything in the specs which would explain this. For now to be on the safe side this patch just prevents fast clears for MSRTs on SKL altogether so that when fast clears are eventually enabled it will only be for single-sampled surfaces. The assumption is that clears are probably more likely to be used in single-sampled applications anyway so we can at least get them working and we can enable MSRTs later once we understand the problem better. This patch should have no functional effect other than perhaps receiving fewer perf_debug messages on SKL+. v2: Improve the commit message to avoid saying the patch disables fast clears because it will be merged before fast clears are enabled for any surfaces so it doesn't actually disable anything. Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-18 10:29:07 +01:00
Eric Anholt	dd05ffebfc	vc4: Don't bother lowering uniforms when the same value is used twice. DEQP likes to do math on uniforms, and the "fmaxabs dst, uni, uni" to get the absolute value would get lowered. The lowering doesn't bother to try to restrict the lifetime of the lowered uniforms, so we'd end up register allocation failing due to this on 5 of the tests (More tests still fail in RA, which look like we'll need to reduce lowered uniform lifetimes to fix). No changes on shader-db, though fewer extra MOVs are generated on even glxgears (MOVs pair well enough that it ends up being the same instruction count).	2015-11-17 17:45:23 -08:00
Eric Anholt	dffe7260cd	vc4: Fix uniform reordering to support reading the same uniform twice. This does actually happen in the wild (particularly fabs of a uniform), so we'd like to support it.	2015-11-17 17:45:23 -08:00
Eric Anholt	d18d1ba587	vc4: Fix documentation on vc4_qir_lower_uniforms.c.	2015-11-17 17:45:23 -08:00
Eric Anholt	a4bf28178f	vc4: Add support for nir_op_uge, using the carry bit on QPU_A_SUB. It looks like nir_lower_idiv is going to use it soon, so add support. With Ilia's change, this fixes one case in fs-op-div-large-uint-uint (with GL 3.0 forced on). Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-11-17 17:45:23 -08:00
Kenneth Graunke	27b1d34438	i965: Fix PIPE_CONTOL typo. PIPE_CONTOL!!!	2015-11-17 16:33:48 -08:00
Ben Widawsky	c531d40927	i965: Add assertion for src_stencil payload size This helps address a coverity warning and prevents future questions about this code. Reported-by: Coverity (via Ilia) Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-17 14:47:33 -08:00
Kenneth Graunke	2bec154b47	i965: Implement ARB_pipeline_statistics_query tessellation counters. We basically just need to uncomment Ben's code. v2: Fix obvious bugs caught by Ben. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-11-17 14:23:26 -08:00
Timothy Arceri	d4fbf11b58	glsl: rename location layout helper Change name from validate -> apply to more accurately describe what the function does. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-11-18 07:30:23 +11:00
Timothy Arceri	03bbddd139	glsl: don't validate binding when it's not needed Checking that the flag has been set is all the validation that's needed here. Also not calling the binding validation function will make things much simpler when adding compile time constant support as we won't need to resolve the binding value. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-18 07:30:19 +11:00
Timothy Arceri	4f4ca6b90a	glsl: remove temp variable to make code easier to read Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-18 07:30:07 +11:00
Timothy Arceri	a01b8c7e77	glsl: cleanup and fix validate matrix function for arrays Previously if the member was an array of matrices then a warning message would be incorrectly given. Also the struct case could never be met so it has been removed. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-18 07:29:58 +11:00
Timothy Arceri	f8b5cc827e	glsl: use better location in struct and block error messages Previously we only gave the location for some members and never gave the variable location. In those cases we were just giving the location of the struct/block. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-18 07:29:53 +11:00
Timothy Arceri	c54865db78	glsl: only do type and qualifier validation once per declaration For struct and block members previously we were doing it for every variable declaration. So for example struct S { atomic_uint x, y, z; }; Would previously generate three error messages when one is sufficient. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-18 07:29:47 +11:00
Timothy Arceri	14d343b024	glsl: rename function that processes struct and iface members As of the previous commit this function handles only struct/iface members. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-18 07:29:37 +11:00
Timothy Arceri	8cf795dc7c	glsl: move block validation outside function that validates members Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-18 07:29:32 +11:00
Timothy Arceri	649803742d	glsl: move ast layout qualifier handling code into its own function We now also only apply these rules to variables rather than also trying to apply them to function params. V2: move code for handling stream layout qualifier Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-18 07:29:25 +11:00
Kenneth Graunke	5b596f3878	i965: Add INTEL_DEBUG=shader_time support for tessellation shaders. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-11-17 10:34:04 -08:00
Kenneth Graunke	df87cb837f	i965: Add INTEL_DEBUG=tcs,tes and hs,ds flags for tessellation shaders. Even though both tessellation shader stages must be used together, I still think it makes sense to add separate debug flags for each stage. It makes it possible to read the TCS/HS, rule out problems, then read the TES/DS separately, without sifting through as much printed text. I decided to add both the GL names (tcs/tes) and hardware names (hs/ds) so they can be used interchangeably. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-11-17 10:33:54 -08:00
Kenneth Graunke	e9b0fa496c	i965: Add more MAX_*_URB_ENTRY_SIZE_BYTES #defines. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2015-11-17 10:18:08 -08:00
Kenneth Graunke	874a1ed813	i965: Add missing stdio.h include to brw_compiler.h. This is needed for the FILE * type in brw_print_vue_map(). Apparently, all files that include brw_compiler.h already pick this up via some include chain, so this isn't actually a build fix. However, I have patches which introduce new consumers of brw_compiler.h that fail to build because of the missing #include. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-17 10:18:08 -08:00
Martin Peres	4518eea065	egl: make it clear which platform x11 backend is being used (dri2 or 3) Signed-off-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>	2015-11-17 17:26:20 +02:00
Boyan Ding	fcdc798515	egl/x11_dri3: Implement EGL_KHR_image_pixmap v2: from Martin Peres - Replace a tab with spaces v3: from Martin Peres - disable EGL_KHR_image_pixmap when is_different_gpu is set (Axel Davy) Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>	2015-11-17 17:26:20 +02:00
Boyan Ding	bd6131a8d1	loader/dri3: Expose function to create __DRIimage from pixmap Used to support EGL_KHR_image_pixmap. Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>	2015-11-17 17:26:20 +02:00
Boyan Ding	f35198bade	egl/x11: Implement dri3 support with loader's dri3 helper v2: From Martin Peres - Tell we are compiling the dri3 backend in configure.ac - Update the Makefile.am - get rid of the LIBDRM_HAS_RENDERNODE_SUPPORT macro - fix some warnings related to EGLuint64KHR to int64_t conversions - use dri2_get_dri_config to get the __DRIconfig instead of open-coding it - replace the occasional tabs with spaces v3: From Martin Peres - fix an indent problem (Matt Turner) - drop the authenticate function, use NULL in the vtable instead (Emil) - drop some useless includes (Emil Velikov) - mandate libdrm (Emil Velikov) - link to xcb-dri3 (Kristian Høgsberg) - convert to the new loader interface for drawable (Kristian) - remove some dead code after the dropping of some vfuncs (Kristian) - add a comment on the topic of rendering to the frontbuffer v4: From Martin Peres - do not expose the preserved swap behavior (Acked by Eric Anholt) Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>	2015-11-17 17:26:20 +02:00
Boyan Ding	a25df54571	egl_dri2: Add a function to let platform code return dri drawable from _EGLSurface dri3 for EGL will use different struct other than dri2_egl_surface for an EGL surface, the common code only uses __DRIdrawable from that struct, so instead of converting _EGLSurface to dri2_egl_surface, let the platform code return the __DRIdrawable by its own (although the current platforms use the same function). v2: From Martin Peres - convert to the new drawable interface (Kristian) Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>	2015-11-17 17:26:20 +02:00
Boyan Ding	fdacbc439e	glx/dri3: Convert to use dri3 helper in loader library v2: From Martin Peres - convert to the new drawable interface - delete dead code after the dropping of some vfuncs - delete the width and height attributes since they are found in the helper Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>	2015-11-17 17:26:20 +02:00
Boyan Ding	6bd9ba7d07	loader: Add dri3 helper v2: From Martin Peres - Try to fit in the 80-col limit as much as possible v3: From Martin Peres - introduce loader_dri3_helper.la to avoid dragging the xcb dep everywhere (Kristian & Emil) - get rid of the width, height, dri_screen and is_different_gpu vfuncs (Kristian) - replace the create/destroy functions with init/fini for dri3 drawables - prefix static functions with dri3_ and exported ones with loader_dri3 (Emil) - keep the function definition consistent (Emil) Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>	2015-11-17 17:26:20 +02:00
Eduardo Lima Mitev	252b143e9e	i965: Return the correct value type from brw_compile_gs() brw_compile_gs() should return a pointer to unsigned, but it is returning the bool 'false' at some point, hence annoying us with a compiler warning: In function 'const unsigned int* brw::brw_compile_gs(const brw_compiler*, void*, void*, const brw_gs_prog_key*, brw_gs_prog_data*, const nir_shader*, gl_shader_program*, int, unsigned int*, char**)': brw_vec4_gs_visitor.cpp:776:14: warning: converting 'false' to pointer type 'const unsigned int*' [-Wconversion-null] return false; ^ Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-11-17 12:50:09 +01:00
Samuel Iglesias Gonsálvez	dfa60e7057	glsl: copy each field's precision information in glsl_types's structure constructor Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-11-17 10:36:42 +01:00
Samuel Iglesias Gonsálvez	688b58c40c	glsl: copy each field's precision information from the old gl_PerVertex interface block Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-11-17 10:36:42 +01:00
Samuel Iglesias Gonsálvez	cfe32cfa8e	glsl: copy each field's precision information when generating varying variables Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-11-17 10:36:42 +01:00
Samuel Iglesias Gonsálvez	91eefe8505	glsl: initialize data.precision value in ir_variable constructor Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-11-17 10:36:42 +01:00
Samuel Iglesias Gonsálvez	58954e4daa	glsl/nir: initialize precision field in glsl_struct_field constructor Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-11-17 10:36:42 +01:00
Samuel Iglesias Gonsálvez	a96afaced8	nir: reduce memory footprint of glsl_struct_field's precision Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-11-17 10:36:41 +01:00
Tapani Pälli	f4f30ad730	mesa: do runtime validation of precision varyings only on ES Precision qualifier should be ignored on desktop OpenGL. v2: include spec quote (Samuel) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-11-17 09:23:54 +02:00
Tapani Pälli	023fd58fd6	glsl: initialize precision when adding per vertex record fields Fixes issues with tessellation builtin variables since precision was introduced to IR with commit `f84bc57d7d`. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-17 07:37:13 +02:00
Kenneth Graunke	292df19401	i965: Set MaxCombinedUniformBlocks properly. Up until now, we've been letting core Mesa initialize it to 36 for us (which is presumably BRW_MAX_UBO (12) * (VS+GS+FS stages -> 3)). With compute and tessellation, we need to increase this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-11-16 16:24:44 -08:00
Kenneth Graunke	5ee5dfddea	i965: Clean up context constant initialization code. This was getting pretty out of hand, and with compute partially in place and tessellation on the way, it was only going to get worse. This patch makes a "stage exists?" predicate and a "number of stages" count and uses them to clean up a lot of calculations. We can just loop over shader stages and set things for the ones that exist. For combined counts, we can just multiply by the number of stages. It also tries to organize a little bit. We should probably use _mesa_has_geometry_shaders/tessellation/compute here, but we can't because ctx->Version isn't initialized yet. Perhaps that could be fixed in the future. No change in "glxinfo -l" on Broadwell. v2: Drop stray compute shader hunk. Mark stage_exists as const. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-11-16 16:24:44 -08:00
Kenneth Graunke	44d6c0c805	i965: Convert scalar_* flags to a scalar_stage array. I was going to add scalar_tcs and scalar_tes flags, and then thought better of it and decided to convert this to an array. Simpler. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-11-16 16:24:44 -08:00
Roland Scheidegger	a2611ffe4b	r200: fix bgrx8/xrgb8 blits Since `779cabfc7d` the same txformat table entries are used for "normal" texturing as well as for blits. However, I forgot to put in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing path can't hit them because the radeon tex format chooser will never choose them, but we get that format from the dri buffers (at least I assume we got it from there). This is untested but essentially addressing the same bug as for radeon. (I don't think that the second entry per le/be table is actually necessary, but shouldn't hurt...) Tested-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-11-17 01:04:09 +01:00
Roland Scheidegger	983614dbed	radeon: fix bgrx8/xrgb8 blits Since `d21320f625` the same txformat table entries are used for "normal" texturing as well as for blits. However, I forgot to put in an entry for the bgrx8 (le) and xrgb8 (be) formats - the normal texturing path can't hit them because the radeon tex format chooser will never choose them, but we get that format from the dri buffers (at least I assume we got it from there). This caused lots of piglit regressions (and probably lots of trouble outside piglit too). This fixes bug https://bugs.freedesktop.org/show_bug.cgi?id=92900. Tested-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-11-17 01:01:38 +01:00
Ian Romanick	c40a88b6c5	meta/generate_mipmap: Only modify the draw framebuffer binding in fallback_required Previously GL_FRAMEBUFFER was used. However, if GL_EXT_framebuffer_blit is supported (note: it is supported by every Mesa driver), this is sometimes an alias for GL_DRAW_FRAMEBUFFER (getters) and sometimes an alias for both GL_DRAW_FRAMEBUFFER and GL_READ_FRAMEBUFFER (setters). As a result, the code saved one binding but modified both. If the bindings were different, the GL_READ_FRAMEBUFFER would be incorrect on exit. Fixes the piglit fbo-generatemipmap-versus-READ_FRAMEBUFFER test. Ideally this function would use DSA functions and not modify the binding at all. However, that would be a much more intrusive change because _mesa_meta_bind_fbo_image would also need to be modified. _mesa_meta_bind_fbo_image has a lot of callers. Much of this code is about to get a major rework due to bug #92363, so I don't think it matters too much. In fact, I discovered this bug while working on the other bug. Le bon temps! Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-11-16 10:30:10 -08:00
Matt Turner	d564b5b58e	nir/glsl: Fix copy-n-paste mistakes from commit `213f864`. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-16 09:05:53 -08:00
Alex Deucher	00f554abba	radeonsi: enable optimal raster config setting for fiji (v2) Requires proper kernel tiling configuration so check the tiling config registers. v2: send the right version of the patch Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: mesa-stable@lists.freedesktop.org	2015-11-16 10:09:47 -05:00
Alex Deucher	5b37d8b50c	radeonsi: use proper GRBM_GFX_INDEX offset for CI+ The offset is different on CI and newer. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-16 10:09:34 -05:00
Neil Roberts	2ca018cb65	docs: Add 16x MSAA on i965 to the release notes Signed-off-by: Neil Roberts <neil@linux.intel.com>	2015-11-16 14:36:27 +01:00
Emil Velikov	1780a562bc	nv50: add missing header into the sources list Otherwise it won't end up in the tarball. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-16 10:49:14 +00:00
Juan A. Suarez Romero	40c2acef5c	nir/glsl_to_nir: use _mesa_fls() to compute num_textures Replace the current loop by a direct call to _mesa_fls() function. It also fixes an implicit bug in the current code where num_textures seems to be one value less than it should be when sh->Program->SamplersUsed > 0. For instance, num_textures is 0 instead of 1 when sh->Program->SamplersUsed is 1. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-16 09:24:28 +01:00
Iago Toral Quiroga	3f34afa0aa	nir/copy_propagate: do not copy-propagate MOV srcs with source modifiers If a source operand in a MOV has source modifiers, then we cannot copy-propagate it from the parent instruction and remove the MOV. v2: remove the check for source modifiers from is_move() (Jason) v3: Put the check for source modifiers back into is_move() since this function is called from copy_prop_alu_src(). Add source modifiers checks to is_vec() instead. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-16 08:11:13 +01:00
Ilia Mirkin	ff17b3ccf4	nv50,nvc0: disable render condition around clear_* functions Only the regular "clear" call is supposed to respect the render condition. The rest should ignore it. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-14 20:15:22 -05:00
Kenneth Graunke	d2f089ba17	i965: Introduce a MOV_INDIRECT opcode. The geometry and tessellation control shader stages both read from multiple URB entries (one per vertex). The thread payload contains several URB handles which reference these separate memory segments. In GLSL, these inputs are represented as per-vertex arrays; the outermost array index selects which vertex's inputs to read. This array index does not necessarily need to be constant. To handle that, we need to use indirect addressing on GRFs to select which of the thread payload registers has the appropriate URB handle. (This is before we can even think about applying the pull model!) This patch introduces a new opcode which performs a MOV from a source using VxH indirect addressing (which allows each of the 8 SIMD channels to select distinct data.) Based on a patch by Jason Ekstrand. v2: Rename from INDIRECT_THREAD_PAYLOAD_MOV to MOV_INDIRECT; make it a bit more generic. Use regs_read() instead of hacking up the register allocator. (Suggested by Jason Ekstrand.) v3: Fix regs_read() to be more accurate for small unaligned regions. Also rebase on Matt's work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v3] Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> [v1]	2015-11-14 16:41:37 -08:00
Samuel Pitoiset	848fa3101d	nv50: add support for performance metrics on G84+ Currently only one metric is exposed but more will be added later. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-14 23:42:46 +01:00
Samuel Pitoiset	6a9c151dbb	nv50: add compute-related MP perf counters on G84+ These compute-related MP performance counters have been reverse engineered using CUPTI which is part of NVIDIA CUDA. As for nvc0, we use a compute kernel to read out those performance counters, and the command stream to configure them. Note that Tesla only exposes 4 MP performance counters, while Fermi has 8. Only G84+ is supported because G80 is an old and weird card. Tested on G84, G96, G200, MCP79 and GT218 with glxgears, glxspheres64, xonotic-glx, heaven and valley. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-14 23:42:42 +01:00
Samuel Pitoiset	ff72440b40	nv50: implement basic compute support This adds the ability to launch simple compute kernels like the one I will use to read out MP performance counters in the upcoming patch. This compute support is based on the work of Francisco Jerez (aka curro) that he did as part of his EVoC project in 2011/2012 to get OpenCL working on Tesla. His original work can be found here: https://github.com/curro/mesa/commits/nv50-compute I made some improvements to the original code, like fixing the simultaneous use of 3D and COMPUTE, improving global buffers binding, and making the code closer to what nvc0 already does. This compute support has been tested by Pierre Moreau and myself with some compute kernels. This is a step towards OpenCL. Speaking about this, it seems like compute programs overlap fragment programs when both are used. To fix this, we need to re-validate fragment programs when binding compute programs and vice versa. Note that textures, samplers and surfaces still need to be implemented. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-14 23:42:15 +01:00
Samuel Pitoiset	7167a058ba	nv50: free interpolation parameters in nv50_program_destroy() As for nvc0, we need to free memory allocated by interpolation parameters. This fixes a memory leak spotted by valgrind. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-14 23:16:12 +01:00
Samuel Pitoiset	69271bba06	nvc0: reduce the number of GPR used when reading MP perf counters No need to allocate more GPR than used in the compute kernel which reads MP performance counters on Fermi. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-14 17:38:57 +01:00
Ilia Mirkin	f94e1d9738	nouveau: don't expose HEVC decoding support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-14 10:32:10 -05:00
Vinson Lee	3a0fef0005	nir: Silence GCC maybe-uninitialized warnings. nir/nir_control_flow.c: In function ‘split_block_cursor.isra.11’: nir/nir_control_flow.c:460:15: warning: ‘after’ may be used uninitialized in this function [-Wmaybe-uninitialized] _after = after; ^ nir/nir_control_flow.c:458:16: warning: ‘before’ may be used uninitialized in this function [-Wmaybe-uninitialized] _before = before; ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-13 16:19:11 -08:00
Kenneth Graunke	5480bbd90e	i965: Add a SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT opcode. We need to use per-slot offsets when there's non-uniform indexing, as each SIMD channel could have a different index. We want to use them for any non-constant index (even if uniform), as it lives in the message header instead of the descriptor, allowing us to set offsets in GRFs rather than immediates. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2015-11-13 16:11:02 -08:00
Kenneth Graunke	511de1a80c	glsl: Allow implicit int -> uint conversions for the % operator. GLSL 4.00 and GL_ARB_gpu_shader5 introduced a new int -> uint implicit conversion rule and updated the rules for modulus to use them. (In earlier languages, none of the implicit conversion rules did anything relevant, so there was no point in applying them.) This allows expressions such as: int foo; uint bar; uint mod = foo % bar; Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-13 16:09:58 -08:00
Kenneth Graunke	a4ba476c30	i965: Print input/output VUE maps on INTEL_DEBUG=vs, gs. I've been carrying around a patch to do this for the last few months, and it's been exceedingly useful for debugging GS and tessellation problems. I've caught lots of bugs by inspecting the interface expectations of two adjacent stages. It's not that much spam, so I figure we may as well just print it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Matt Turner <mattst88@gmail.com>	2015-11-13 16:08:51 -08:00
Kenneth Graunke	f88c175a29	i965: Make convert_attr_sources_to_hw_regs handle stride == 0. This makes expressions like component(fs_reg(ATTR, n), 7) get a proper <0,1,0> region instead of the invalid <0,8,0>. Nobody uses this today, but I plan to. v2: Rebase on Matt's changes; simplify. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]	2015-11-13 15:17:58 -08:00
Kenneth Graunke	26f9469a46	nir: Add helpers for getting input/output intrinsic sources. With the many variants of IO intrinsics, particular sources are often in different locations. It's convenient to say "give me the indirect offset" or "give me the vertex index" and have it just work, without having to think about exactly which kind of intrinsic you have. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-13 15:15:46 -08:00
Kenneth Graunke	d12bde0944	nir: Don't lower TCS outputs to temporaries. We'd like to shadow these when possible, but the current code doesn't work properly for TCS outputs. For now, disable it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-13 15:15:46 -08:00
Kenneth Graunke	134728fdae	nir: Allow output reads and add the relevant intrinsics. Normally, we rely on nir_lower_outputs_to_temporaries to create shadow variables for outputs, buffering the results and writing them all out at the end of the program. However, this is infeasible for tessellation control shader outputs. Tessellation control shaders can generate multiple output vertices, and write per-vertex outputs. These are arrays indexed by the vertex number; each thread only writes one element, but can read any other element - including those being concurrently written by other threads. The barrier() intrinsic synchronizes between threads. Even if we tried to shadow every output element (which is of dubious value), we'd have to read updated values in at barrier() time, which means we need to allow output reads. Most stages should continue using nir_lower_outputs_to_temporaries(), but in theory drivers could choose not to if they really wanted. v2: Rebase to accommodate Jason's review feedback. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-13 15:15:41 -08:00
Kenneth Graunke	c51d7d5fe3	nir/lower_io: Introduce nir_store_per_vertex_output intrinsics. Similar to nir_load_per_vertex_input, but for outputs. This is not useful in geometry shaders, but will be useful in tessellation shaders. v2: Change stage_uses_per_vertex_outputs() to is_per_vertex_output(), taking a nir_variable (requested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-13 15:15:10 -08:00
Kenneth Graunke	0df452cd0d	nir/lower_io: Use load_per_vertex_input intrinsics for TCS and TES. Tessellation control shader inputs are an array indexed by the vertex number, like geometry shader inputs. There aren't per-patch TCS inputs. Tessellation evaluation shaders have both per-vertex and per-patch inputs. Per-vertex inputs get the new intrinsics; per-patch inputs continue to use the ordinary load_input intrinsics, as they already work like we want them to. v2: Change stage_uses_per_vertex_inputs into is_per_vertex_input(), which takes a variable (requested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-13 15:15:10 -08:00
Ian Romanick	1cb49eedb5	i965: Silence unused parameter warnings in get_buffer_rect brw_meta_fast_clear.c: In function 'get_buffer_rect': brw_meta_fast_clear.c:318:37: warning: unused parameter 'brw' [-Wunused-parameter] get_buffer_rect(struct brw_context brw, struct gl_framebuffer fb, ^ brw_meta_fast_clear.c:319:44: warning: unused parameter 'irb' [-Wunused-parameter] struct intel_renderbuffer irb, struct rect rect) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-13 12:29:57 -08:00
Ian Romanick	758f12fd98	meta/generate_mipmap: Don't leak the sampler object Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-13 12:29:56 -08:00
Matt Turner	7a879e422b	i965: Remove unneeded #includes. Some of these are no longer needed since all the backends switched to NIR.	2015-11-13 12:16:48 -08:00
Matt Turner	386759b02d	i965: Silence warning. intel_asm_annotation.c: In function ‘annotation_insert_error’: intel_asm_annotation.c:214:18: warning: ‘ann’ may be used uninitialized in this function [-Wmaybe-uninitialized] ann->error = ralloc_strdup(annotation->mem_ctx, error); ^ I initially tried changing the type of ann_count to unsigned (is currently int), since that in addition to the check that it's non-zero at the beginning of the function seems sufficient to prove that it must be greater than zero. Unfortunately that wasn't sufficient.	2015-11-13 12:13:14 -08:00
Juha-Pekka Heikkila	8b145d6a3d	i965: Don't write beyond allocated memory. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2015-11-13 12:06:11 -08:00
Matt Turner	0eb3db117b	i965: Use BRW_MRF_COMPR4 macro in more places. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:51 -08:00
Matt Turner	49b3215d70	i965: Combine register file field. The first four values (2-bits) are hardware values, and VGRF, ATTR, and UNIFORM remain values used in the IR. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:51 -08:00
Matt Turner	b3315a6f56	i965: Replace HW_REG with ARF/FIXED_GRF. HW_REGs are (were!) kind of awful. If the file was HW_REG, you had to look at different fields for type, abs, negate, writemask, swizzle, and a second file. They also caused annoying problems like immediate sources being considered scheduling barriers (commit `6148e94e2`) and other such nonsense. Instead use ARF/FIXED_GRF/MRF for fixed registers in those files. After a sufficient amount of time has passed since "GRF" was used, we can rename FIXED_GRF -> GRF, but doing so now would make rebasing awful. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:51 -08:00
Matt Turner	4b0fbebf02	i965/fs: Set stride correctly for immediates in fs_reg(brw_reg). The fs_reg() constructors for immediates set stride to 0, except for vector-immediates, which set stride to 1. This patch makes the fs_reg constructor that takes a brw_reg do likewise, so that stride is set correctly for cases such as fs_reg(brw_imm_v(...)). The generator asserts that this is true (and presumably it's useful in some optimization passes?) and the VF fs_reg constructors did this (by virtue of the fact that it doesn't override what init() does). In the next commit, calling this constructor with brw_imm_* will generate an IMM file register rather than a HW_REG, making this change necessary to avoid breakage with existing uses of brw_imm_v(). Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:51 -08:00
Matt Turner	b99e1fd547	i965/fs: Handle type-V immediates in brw_reg_from_fs_reg(). We use brw_imm_v() to produce type-V immediates, which generates a brw_reg with fs_reg's .file set to HW_REG. The next commit will rid us of HW_REGs, so we need to handle BRW_REGISTER_TYPE_V in the IMM case. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:51 -08:00
Matt Turner	b163aa0148	i965: Rename GRF to VGRF. The 2-bit hardware register file field is ARF, GRF, MRF, IMM. Rename GRF to VGRF (virtual GRF) so that we can reuse the GRF name to mean an assigned general purpose register. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:50 -08:00
Matt Turner	5a23b31c75	i965: Move BAD_FILE from the beginning of enum register_file. I'm going to begin using brw_reg's file field in backend_reg and its derivatives, and in order to keep the hardware value for ARF as 0, we have to do something different. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:50 -08:00
Matt Turner	dba309fc14	i965: Initialize registers. The test (file == BAD_FILE) works on registers for which the constructor has not run because BAD_FILE is zero. The next commit will move BAD_FILE in the enum so that it's no longer zero. In the case of this->outputs, the constructor was being run implicitly, and we were unnecessarily memsetting it to zero. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:50 -08:00
Matt Turner	7638e75cf9	i965: Use brw_reg's nr field to store register number. In addition to combining another field, we get to replace silliness like "reg.reg" with something that actually makes sense, "reg.nr"; and no one will ever wonder again why dst.reg isn't a dst_reg. Moving the now 16-bit nr field to a 16-bit boundary decreases code size by about 3k. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:50 -08:00
Matt Turner	3048053908	i965: Unwrap some lines. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:50 -08:00
Matt Turner	58fa9d47b5	i965/vec4: Remove swizzle/writemask fields from src/dst_reg. Also allows us to handle HW_REGs in the swizzle() and writemask() functions. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:50 -08:00
Matt Turner	94b1031703	i965: Remove fixed_hw_reg field from backend_reg. Since backend_reg now inherits brw_reg, we can use it in place of the fixed_hw_reg field. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:50 -08:00
Matt Turner	1392e45bfb	i965: Use immediate storage in inherited brw_reg. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:50 -08:00
Matt Turner	d74dd703f8	i965: Add and use enum brw_reg_file. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:50 -08:00
Matt Turner	977df90d65	i965: Reorganize brw_reg fields. Put fields that are meaningless with an immediate in the same storage with the immediate. This leaves fields type, file, nr, subnr in the first dword where there's now extra room for expansion. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:50 -08:00
Matt Turner	e42fb0c2a6	i965: Make 'dw1' and 'bits' unnamed structures in brw_reg. Generated by sed -i -e 's/\.bits\././g' .c .h .cpp sed -i -e 's/dw1\.//g' .c .h .cpp and then reverting changes to comments in gen7_blorp.cpp and brw_fs_generator.cpp. There wasn't any utility offered by forcing the programmer to list these to access their fields. Removing them will reduce churn in future commits. This is C11 (and gcc has apparently supported it for some time for "compatibility with other compilers") See https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:50 -08:00
Matt Turner	182f137521	i965: Delete type field from backend_reg. Switching from an implicitly-sized type field to a field with an explicit bit width is safe because we have fewer than 2^4 types, and gcc will warn if you attempt to set a value that will not fit. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:50 -08:00
Matt Turner	433df2e03c	i965: Delete abs/negate fields from backend_reg. Instead use the ones provided by brw_reg. Also allows us to handle HW_REGs in the negate() functions. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:50 -08:00
Matt Turner	c7ed5d1d1c	i965: Make backend_reg inherit from brw_reg. Some fields (file, type, abs, negate) in brw_reg are shadowed by backend_reg. Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-13 11:27:50 -08:00
Matt Turner	88f349c4e1	i965/fs: Replace nested ternary with if ladder. Since the types of the expression were bool ? src_reg : (bool ? brw_reg : brw_reg) the result of the second (nested) ternary would be implicitly converted to a src_reg by the src_reg(struct brw_reg) constructor. I.e., bool ? src_reg : src_reg(bool ? brw_reg : brw_reg) In the next patch, I make backend_reg (the parent of src_reg) inherit from brw_reg, which changes this expression to return brw_reg, which throws away any fields that exist in the classes derived from brw_reg. I.e., src_reg(bool ? brw_reg(src_reg) : bool ? brw_reg : brw_reg) Generally this code was gross, and wasn't actually shorter or easier to read than an if ladder. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-11-13 11:27:50 -08:00
Marek Olšák	3694d58e6c	radeonsi: remove dead code after ES-GS linkage change Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	d79a3449a7	radeonsi: link ES-GS just like LS-HS This reduces the shader key for ES. Use a fixed attrib location based on (semantic name, index). The ESGS item size is determined by the physical index of the highest ES output, so it's almost always larger than before, but I think that shouldn't matter as long as the ESGS ring buffer is large enough. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	b1c5f3faa9	radeonsi: calculate optimal GS ring sizes to fix GS hangs on Tonga I discovered that increasing the ESGS ring size fixes GS hangs on Tonga, so let's do it properly. There is now a separate init_config_gs_rings state that is not immutable, because GS rings are resized when needed. This also saves some memory. Most apps won't need more than 1MB per ring per shader engine. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	2f5d911ba2	radeonsi: rename si_update_gs_rings Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	4acd856088	radeonsi: calculate ESGS_RING_ITEMSIZE in create_shader Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	a0cf589961	radeonsi: move maximum gs stream calculation into create_shader Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	3ab0c49f04	radeonsi: clean up small duplication in si_shader_gs Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	eb0d3e8a90	gallium/radeon: shorten render_cond variable names and ..._cond -> ..._invert Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	70c40cc989	gallium/radeon: remove predicate_drawing flag Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	12596cfd4c	gallium/radeon: atomize render condition (SET_PREDICATION) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	3521907622	gallium/radeon: simplify restoring render condition after flush Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	600e212d87	gallium/radeon: don't use PREDICATION_OP_CLEAR Not setting the predication bit is sufficient. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	6eff5415e4	gallium/radeon: simplify disabling render condition for u_blitter just disable it by not setting the predication bit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	8dd1ee6ff3	r600g: don't set predication on non-draw packets This has no effect. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	6cc8f6c6a7	gallium/radeon: inline the r600_rings structure Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	3d963abc81	radeonsi: prevent recursion in si_context_gfx_flush The recursion can only occur if you modify need_cs_space to always flush. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	8569f9a87e	gallium/radeon: remove the IB flushing flag Not needed anymore. A similar flag will be introduced in the next commit, which will be private in radeonsi. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	81d412e02c	gallium/radeon: move GFX/DMA flushing from add_to_buffer_list to need_cs_space need_cs_space isn't invoked so often and is called before all commands too. This is a lot cleaner. The code in radeon_add_to_buffer_list always seemed dodgy to me. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	c6012a6650	radeonsi: rename cache flushing flags once more KCACHE, TC L1 and TC L2 are renamed to: - SMEM L1 - VMEM L1 - GLOBAL L2 You can easily tell what they are used for now. Shaders must deal with coherency issues between both L1s manually, e.g. by setting GLC=1 or by using s_dcache_*. BOTH_ICACHE_KCACHE was an unused definition. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	10130ccd8c	radeonsi: set the DISABLE_WR_CONFIRM flag on CI-VI as well I missed this in commit `c3e527f93d` radeonsi: only enable write confirmation on the last CP DMA packet Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	40912dd91e	radeonsi: initialize SX_PS_DOWNCONVERT to 0 on Stoney otherwise the SX or CB blocks can go bananas Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: mesa-stable@lists.freedesktop.org	2015-11-13 19:54:41 +01:00
Marek Olšák	f7757100f2	radeonsi: add glClearBufferSubData acceleration 8-bit and 16-bit clears which are not aligned to dwords are done in software. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	19773f9805	radeonsi: add SI_SAVE_FRAGMENT_STATE blitter flag Buffer clears via transform feedback won't set this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	19a9c1ecc7	gallium/u_blitter: add support for multi-dword clear values in clear_buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	e15c5c7a06	radeonsi: fix a future crash in emit_cb_target_mask This can't crash currently, but it would crash if clear_buffer from u_blitter were used with a clean context. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	65d0c558d5	radeonsi: fix unaligned clear_buffer fallback This is unreachable currently, but it will be used by unaligned 8-bit and 16-bit fills. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:40 +01:00
Marek Olšák	7f1e34e6c8	r600g: fix clear_buffer fallback with offset != 0 Discovered by luck. This code path hasn't been exercised since transform feedback was implemented. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:40 +01:00
Marek Olšák	01526136ba	gallium/radeon: fix PIPE_QUERY_GPU_FINISHED Broken by the addition of r600_multi_fence in `3b37155a68` Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89014 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-11-13 19:54:40 +01:00
Brian Paul	40663864d2	mesa: minor comment fix in blend.c	2015-11-13 08:02:19 -07:00
Brian Paul	5a5efbf804	docs: add link to Coverity on developer utilities page Signed-off-by: Brian Paul <brianp@vmware.com>	2015-11-13 08:02:19 -07:00
Brian Paul	00046393f8	docs: update VMware driver instructions Use a LIBDIR variable, set per-platform. Update the Mesa configuration flags. Run update-initramfs or dracut, update /etc/modules Signed-off-by: Brian Paul <brianp@vmware.com>	2015-11-13 08:02:19 -07:00
Daniel Stone	d1314de293	egl/wayland: Ignore rects from SwapBuffersWithDamage eglSwapBuffersWithDamage accepts damage-region rectangles to hint the compositor that it only needs to redraw certain areas, which was passed through the wl_surface_damage request, as designed. Wayland also offers a buffer transformation interface, e.g. to allow users to render pre-rotated buffers. Unfortunately, there is no way to query buffer transforms, and the damage region was provided in surface, rather than buffer, co-ordinate space. Users could in theory account for this themselves, but EGL also requires co-ordinates to be passed in GL/mathematical co-ordinate space, with an inversion to Wayland's natural/scanout co-ordinate space, so transformations other than a 180-degree rotation will fail as EGL attempts to subtract the region from (its view of the) surface height. Pending creation and acceptance of a wl_surface.buffer_damage request, which will accept co-ordinates in buffer co-ordinate space, pessimise to always sending full-surface damage. `bce64c6c` provides the explanation for why we send maximum-range damage, rather than the full size of the surface: in the presence of buffer transformations, full-surface damage may not actually cover the entire surface. Signed-off-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-13 10:09:23 +00:00
Iago Toral Quiroga	a29d922c1a	Revert "nir/copy_propagate: do not copy-propagate MOV srcs with source modifiers" The change proposed in the review leads to piglit regressions because is_move() is used in other places and relies on the checks for source modifiers to be there. Revert this until we agree on a better solution.	2015-11-13 08:53:10 +01:00
Samuel Iglesias Gonsálvez	5f004fd197	glsl: fix 'shared' layout qualifier related regressions Commit `8b28b35` added 'shared' as a keyword for compute shaders but it broke the existing 'shared' layout qualifier support for uniform and shader storage blocks. This patch fixes 578 dEQP-GLES31.functional.ssbo.* tests. v2: - Move SHARED to interface_block_layout_qualifier (Timothy) - Don't remove "shared" case insensitive check (Timothy) - Remove the clearing of shared_storage flag (Timothy) Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-11-13 08:04:49 +01:00
Iago Toral Quiroga	8610cd6b8c	nir/copy_propagate: do not copy-propagate MOV srcs with source modifiers If a source operand in a MOV has source modifiers, then we cannot copy-propagate it from the parent instruction and remove the MOV. v2: remove the check for source modifiers from is_move() (Jason) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-13 07:54:33 +01:00
Jason Ekstrand	5f43e074d4	nir/vars_to_ssa: Delete dead output set code This was a remnant of an early attempt to handle output reads in vars_to_ssa. That attempt was abandoned a long time ago but these few lines were apparently left in the pass and managed to evade review. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-12 22:08:43 -08:00
Jason Ekstrand	226ba889a0	nir/vars_to_ssa: Rework copy set handling in lower_copies_to_load_store Previously, we walked through a given deref_node's copies and, after lowering the copy away, removed it from both the source and destination copy sets. This commit changes this to only remove it from the other node's copy set (not the one we're lowering). At the end of the loop, we just throw away the copy set for the node we're lowering since that node no longer has any copies. This has two advantages: 1) It's more efficient because we're doing potentially half as many set search operations. 2) It now properly handles copies from a node to itself. Previously, it would delete the copy from the set when processing the destination and then assert-fail when we couldn't find it for the source. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92588 Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-12 22:08:43 -08:00
Jason Ekstrand	4bbf2ac06e	nir/validate: Allow subroutine types for the tails of derefs The shader-subroutine code creates uniforms of type SUBROUTINE for subroutines that are then read as integers in the backends. If we ever want to do any optimizations on these, we'll need to come up with a better plan where they are actual scalars or something, but this works for now. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92859 Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-12 22:08:43 -08:00
Nanley Chery	79f68306d2	mesa: Replace gl_extensions::EXT_texture3D with ::dummy_true Mesa unconditionally sets this driver flag to true in _mesa_init_extensions(). There is therefore no need for the driver to communicate support for this extension. Replace the driver capability flag with ::dummy_true. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-12 21:31:05 -08:00
Brian Paul	2de2e1702b	mesa: fix MSVC build break in extensions.h Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-12 16:57:18 -07:00
Ilia Mirkin	39f51ec96f	nvc0/ir: add support for TGSI_SEMANTIC_HELPER_INVOCATION Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-12 17:58:42 -05:00
Ilia Mirkin	e3d9dbe304	gallium: add support for gl_HelperInvocation semantic Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-11-12 17:58:23 -05:00
Ilia Mirkin	20748318c5	glsl: add gl_HelperInvocation system value Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-12 17:58:23 -05:00
Jordan Justen	b52cb9ec6a	glsl: Correctly handle vector extract on function parameter This commit accidentally used a '==' when '=' was intended. commit `96b22fb080` Author: Kristian Høgsberg Kristensen <krh@bitplanet.net> Date: Wed Nov 4 14:58:54 2015 -0800 glsl: Use array deref for access to vector components Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-11-12 14:11:16 -08:00
Nanley Chery	a16ffb743c	mesa: In helpers, only check driver capability for meta Make API context and version checks done by the helper functions pass unconditionally while meta is in progress. This transparently makes extension checks solely dependent on struct gl_extensions while in meta. v2: Use an 8-bit data type instead of a GLuint Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-12 13:10:37 -08:00
Nanley Chery	5645770742	mesa/extensions: Prefix global struct and extension type Rename the following types and variables: * struct extension -> struct mesa_extension, like the mesa_format type. * extension_table -> _mesa_extension_table, like the _mesa_extension_override_{enables,disables} structs. Suggested-by: Marek Olšák <marek.olsak@amd.com> Suggested-by: Chad Versace <chad.versace@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-12 13:10:37 -08:00
Nanley Chery	ab129a44ae	mesa: Generate a helper function for each extension Generate functions which determine if an extension is supported in the current context. Initially, enums were going to be explicitly used with _mesa_extension_supported(). The idea to embed the function and enums into generated helper functions was suggested by Kristian Høgsberg. For performance, the function body no longer uses _mesa_extension_supported() and, as suggested by Chad Versace, the functions are also declared static inline. v2: Place function qualifiers on separate line (Chad) v3: Move function curly brace to new line (Chad) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-12 13:10:37 -08:00
Nanley Chery	eda15abd84	mesa/extensions: Replace extension::api_set with ::version The api_set field has no users outside of _mesa_extension_supported(). Remove it and allow the version field to take its place. The brunt of the transformation was performed with the following vim commands: s/$GL [^,]\+$,\s\d,\s\d$,\s\d$$,\s\d$/\1, GLL, GLC\2\3/g s/$GLL [^,]\+$\,\s\d/\1, GLL/g s/$GLC [^,]\+$$,\s\d$,\s\d$,\s\d$$,\s\d$/\1\2, GLC\3\4/g s/$ ES1[^,]$$,\s\(\w\\|\d$\+\)$,\s\(\w\\|\d$\+\),\s\d/\1\2\4, ES1/g s/$ ES2[^,]$$,\s\(\w\\|\d$\+\)$,\s\(\w\\|\d$\+\)$,\s\(\w\\|\d$\+\),\s\d*/\1\2\4\6, ES2/g Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-12 13:10:37 -08:00
Nanley Chery	a82bc779af	mesa/extensions: Use _mesa_extension_supported() Replace open-coded checks for extension support with _mesa_extension_supported(). Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-12 13:10:37 -08:00
Nanley Chery	f6a818e76d	mesa/extensions: Create _mesa_extension_supported() Create a function which determines if an extension is supported in the current context. v2: Use common variable names (Emil) Insert new line between variables and return statement (Chad) Rename api_set variable to api_bit (Chad) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-12 13:10:37 -08:00
Nanley Chery	f47df8f729	mesa/extensions: Add extension::version Enable limiting advertised extension support by context version with finer granularity. This new field is currently unused and is set to 0 everywhere. When it is used, a value of 0 will indicate that the extension is supported for any version of a context. v2: Use uint*t type for version and note the expected values (Emil) Use an 8-bit data type Reformat macro for better readability (Chad) v3: Note preparatory nature of commit (Chad) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-12 13:10:37 -08:00
Nanley Chery	8bd82a91c0	mesa/extensions: Move entries to separate file With this infrastructure set in place, we can now reuse the entries to generate useful code. v2: Add the new file into Makefile.sources (Emil) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-12 13:10:37 -08:00
Nanley Chery	c0b568f3db	mesa/extensions: Wrap array entries in macros Now that we're using macros, remove the redundant text from each entry. Remove comments between the entries to make editing easier and separate the sections with blank lines. Structure the EXT macros in a way that helps reviewers verify that no meaning has been altered. v2: Indent the entries (Chad) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-12 13:10:37 -08:00
Nanley Chery	e5af09f9ba	mesa/extensions: Remove array sentinel Simplify future updates to the extension struct array by removing the sentinel. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-11-12 13:10:37 -08:00
Matt Turner	74e48e9544	i965: Check instructions appear only on supported hardware. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-12 11:06:08 -08:00
Matt Turner	0b45d47f71	i965: Add initial assembly validation pass. Initially just checks that sources are non-NULL, which would have alerted us to the problem fixed by commit `6c846dc5`. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-12 11:06:04 -08:00
Matt Turner	34ed45557e	i965: Add annotation_insert_error() and support for printing errors. Will allow annotations to contain error messages (indicating an instruction violates a rule for instance) that are printed after the disassembly of the block. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-12 11:00:10 -08:00
Matt Turner	a280e83d71	i965: Combine assembly annotations if possible. Often annotations are identical between sets of consecutive instructions. We can perhaps avoid some memory allocations by reusing the previous annotation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-12 11:00:10 -08:00
Matt Turner	93e371c140	i965: Set annotation_info's mem_ctx. It was being memset to 0 previously. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-11-12 11:00:10 -08:00
Matt Turner	9ab45b4df9	i965: Don't consider control flow instructions to have sources. And why did IFF have a destination? I suspect that once upon a time the disassembler used this information to know which fields to find the jump targets in. The jump targets have moved, so the disassembler has to know how to handle these per-generation anyway. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-12 11:00:10 -08:00
Matt Turner	0865e743c1	i965: Fill out instruction list. Add some instructions: illegal, movi, sends, sendsc. Remove some instructions with reused opcodes: msave, mrestore, push, pop, goto. I did have some gross code for disassembling opcodes per-generation, but there's very little meaningful overlap so it's probably not needed. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-12 11:00:10 -08:00
Matt Turner	238877207e	ralloc: Set start in ralloc_vasprintf_rewrite_tail() if str is NULL. We were leaving it undefined, even though we were writing a string to str. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-12 11:00:10 -08:00
Matt Turner	903050694b	i965: Consolidate is_3src() functions. Otherwise I'll have to add another later in this series.	2015-11-12 11:00:10 -08:00
Brian Paul	3e74038280	st/wgl: add a comment about recursive locking in stw_make_current() Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-12 11:21:25 -07:00
Brian Paul	f45b644e11	st/wgl: add a lock assertion in stw_framebuffer_from_hwnd_locked() Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-12 11:21:25 -07:00
José Fonseca	a1c9feafd5	st/wgl: add some mutex checking code This would have caught the locking bug that was fixed in the earlier "st/wgl: fix locking issue in stw_st_framebuffer_present_locked()" patch. v2: minor coding style changes by Brian. Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-12 11:21:25 -07:00
Brian Paul	166769fe4b	st/wgl: rename stw_framebuffer_release() to stw_framebuffer_unlock() To match the new stw_framebuffer_lock() function. Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-12 11:21:25 -07:00
Brian Paul	dabc423ed0	st/wgl: reimplement stw_framebuffer::mutex with CRITICAL_SECTION v2: update comments on the stw_framebuffer::mutex field regarding locking order. Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-12 11:21:25 -07:00
Brian Paul	f71508ae79	st/wgl: include u_debug.h To get declaration for debug_printf() directly instead of getting it indirectly through os_thread.h Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-12 11:21:25 -07:00
Brian Paul	fce68832c5	st/wgl: reimplement stw_device::fb_mutex with CRITICAL_SECTION Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-12 11:21:25 -07:00
Brian Paul	fa30de7643	st/wgl: re-implement stw_device::ctx_mutex with CRITICAL_SECTION This is Windows-only code so we can use the native Win32 functions for critical sections. This will also allow us to (cleanly) add some mutex check/debug code in subsequent patches. Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-12 11:21:24 -07:00
Brian Paul	a02385cd69	gallium/hud: add cpu graph support for Windows We support "cpu" but not "cpu#" because there's no good way of querying per-cpu usage. Also, the cpu usage is for the process, not the whole system. Original code cobbled together by Brian and then fixed/polished by Jose. Signed-off-by: Brian Paul <brianp@vmware.com>	2015-11-12 09:11:15 -07:00
Tapani Pälli	f2fe607261	glsl: set matrix_stride for non matrices with atomic counter buffers Patch sets matrix_stride as 0 for non matrix uniforms that are in an atomic counter buffer. Matrix stride calculation for actual matrix uniforms is done during link_assign_uniform_locations. From ARB_program_interface_query specification: GL_MATRIX_STRIDE: "For active variables not declared as a matrix or array of matrices, zero is written to <params>. For active variables not backed by a buffer object, -1 is written to <params>, regardless of the variable type." Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-11-12 14:15:29 +02:00
Tapani Pälli	7e6dac1186	mesa: validate precision of varyings during ValidateProgramPipeline Fixes following failing ES3.1 CTS tests: ES31-CTS.sepshaderobjs.InterfacePrecisionMatchingFloat ES31-CTS.sepshaderobjs.InterfacePrecisionMatchingInt ES31-CTS.sepshaderobjs.InterfacePrecisionMatchingUInt Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-12 09:50:14 +02:00
Tapani Pälli	5bd122cad9	glsl: do not lose precision information when packing varyings This information will be used by cross stage validation of varyings for pipeline objects. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-12 09:50:14 +02:00
Iago Toral Quiroga	f84bc57d7d	glsl: Add precision information to ir_variable We will need this later on when we implement proper support for precision qualifiers in the drivers and also to do link time checks for uniforms as indicated by the spec. This patch also adds compile-time checks for variables without precision information (currently, Mesa only checks that a default precision is set for floats in fragment shaders). As indicated by Ian, the addition of the precision information to ir_variable has been done using a bitfield and pahole to identify an available hole so that memory requirements for ir_variable stay the same. v2 (Ian): - Avoid if-ladders by defining arrays of supported sampler names and indexing into them with type->sampler_array + 2 * type->sampler_shadow - Make the code that selects the precision qualifier to use a utility function - Fix a typo v3 (Tapani): - rebased - squashed in "Precision qualifiers are not allowed on structs" - fixed select_gles_precision for sampler arrays - fixed precision_qualifier_allowed for arrays of structs v4 (Tapani): - add atomic_uint handling - do not allow precision qualifier on images (issues reported by Marta) v5 (Tapani): - support precision qualifier on image types v6 (Tapani): - set precision qualifier on interface block members Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-11-12 09:50:13 +02:00
Iago Toral Quiroga	9a00e1a69d	glsl: Move the definition of precision_qualifier_allowed We will need this to build later patches Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-11-12 09:50:13 +02:00
Iago Toral Quiroga	e6629d814f	glsl: Add user-defined default precision qualifiers to the symbol table Notice that the spec requires that a default precision has been set for every type used by a shader that can use a precision qualifier and does not have a predefined precision, however, at the moment, Mesa only checks this for floats in the fragment shader. This is probably because the GLSL ES 1.0 spec mentions this case specifically, but GLSL ES 3.0 clarifies that the same applies to other types: "The fragment language has no default precision qualifier for floating point types. Hence for float, floating point vector and matrix variable declarations, either the declaration must include a precision qualifier or the default float precision must have been previously declared. Similarly, there is no default precision qualifier for the following sampler types in either the vertex or fragment language: sampler3D; samplerCubeShadow; sampler2DShadow; sampler2DArray; sampler2DArrayShadow; isampler2D; isampler3D; isamplerCube; isampler2DArray; usampler2D; usampler3D; usamplerCube; usampler2DArray;" we will fix this in a later patch. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-11-12 09:50:13 +02:00
Iago Toral Quiroga	e3082fb273	glsl: Add default precision qualifiers to the symbol table The GLSL ES spec specifies default precision qualifiers for certain types, so populate the symbol table with these. Notice that the desktop GLSL spec also indicates defaults for some types but this is not really useful since precision qualifiers are completely ignored in desktop GLSL. v2: simplify and add samplerExternalOES, specified by OES_EGL_image_external (Tapani) v3: add atomic_uint (reported missing by Marta) Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-11-12 09:50:13 +02:00
Iago Toral Quiroga	d6a6167354	glsl: Add API to put default precision qualifiers in the symbol table These have scoping rules that match the ones defined for other things such as variables, so we want them in the symbol table. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-11-12 09:50:13 +02:00
Samuel Iglesias Gonsálvez	d4fdb84f80	i965/fs/nir: fix the number of register written by FS_OPCODE_GET_BUFFER_SIZE FS_OPCODE_GET_BUFFER_SIZE is calculated with a resinfo's sampler message. This patch adjusts the number of registers written by the opcode following what the PRM spec says about the number of registers written by the SIMD8 and SIMD16's writeback messages for sampler messages. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-12 08:39:14 +01:00
Ben Widawsky	55314c5be4	i965/skl/gt4: Fix URB programming restriction. The comment in the code details the restriction. Thanks to Ken for having a very helpful conversation with me, and spotting the blurb in the link I sent him :P. There are still stability problems for me on GT4, but this definitely helps with some of the failures. v2: Comment fixes Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-11 18:13:19 -08:00
Ilia Mirkin	c4182bb9b0	nv50,nvc0: add ARB_clear_texture support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-11 19:20:41 -05:00
Ilia Mirkin	ae39b0fda8	st/mesa: implement ARB_clear_texture Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-11 19:20:41 -05:00
Ilia Mirkin	3695b253f9	gallium: add PIPE_CAP_CLEAR_TEXTURE and clear_texture prototype Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-11 19:20:41 -05:00
Timothy Arceri	725fcdfbb1	glsl: add helper to check for enhanced layouts support Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>	2015-11-12 10:18:14 +11:00
Timothy Arceri	82e4f22d1e	mesa: add ARB_enhanced_layouts Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>	2015-11-12 10:18:08 +11:00
Dave Airlie	df8af7d751	r600: initialised PGM_RESOURCES_2 for ES/GS This fixes the corruption on rendering that we are seeing in certain geometry shaders. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=91780 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested / Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Cc: "10.6" "11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-12 09:03:13 +10:00
Kenneth Graunke	918bda23dd	i965: Split nir_emit_intrinsic by stage with a general fallback. Many intrinsics only apply to a particular stage (such as discard). In other cases, we may want to interpret them differently based on the stage (such as load_primitive_id or load_input). The current method isn't that pretty - we handle all intrinsics in one giant function. Sometimes we assert on stage, sometimes we forget. Different behaviors are handled via if-ladders based on stage. This commit introduces new nir_emit_<stage>_intrinsic() functions, and makes nir_emit_instr() call those. In turn, those fall back to the generic nir_emit_intrinsic() function for cases they don't want to handle specially. This makes it clear which intrinsics only exist in one stage, and makes it easy to handle inputs/outputs differently for various stages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-11-11 11:57:37 -08:00
Ilia Mirkin	912babba7b	mesa/copyimage: allow width/height to not be multiples of block For compressed textures, the image size is not necessarily a multiple of the block size (e.g. the last mip levels). Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core Profile spec says: An INVALID_VALUE error is generated if the dimensions of either subregion exceeds the boundaries of the corresponding image object, or if the image format is compressed and the dimensions of the subregion fail to meet the alignment constraints of the format. and Section 8.7 (Compressed Texture Images) says: An INVALID_OPERATION error is generated if any of the following conditions occurs: * width is not a multiple of four, and width + xoffset is not equal to the value of TEXTURE_WIDTH. * height is not a multiple of four, and height + yoffset is not equal to the value of TEXTURE_HEIGHT. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92860 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org	2015-11-11 14:37:55 -05:00
Jason Ekstrand	80890eb0d3	i965/brw_reg: Add a brw_VxH_indirect helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-11 10:52:30 -08:00
Brian Paul	68993f77cd	mesa: remove old comments in arrayobj.c	2015-11-11 09:38:22 -07:00
Brian Paul	9870a5c6c9	st/wgl: clarify code in stw_framebuffer_from_hwnd_locked() Just a minor code change to make it obvious that NULL is returned when we don't find the given HWND. Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-11-11 09:38:22 -07:00
Brian Paul	004ed6f4a9	st/wgl: improve some function comments In particular, explain when stw_framebuffer objects are locked/unlocked/etc. Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-11-11 09:38:22 -07:00
Brian Paul	b93cb6c1dc	st/wgl: whitespace/formatting fixes	2015-11-11 09:38:22 -07:00
Brian Paul	eb812921ac	st/wgl: fix locking issue in stw_st_framebuffer_present_locked() When stw_st_framebuffer_present_locked() is called, the stw_framebuffer's mutex will already be locked. Normally, the stw_framebuffer_present_locked() function calls stw_framebuffer_release() to unlock the mutex when it's done. But if for some reason the 'resource' pointer in stw_st_framebuffer_present_locked() is null, we'd return without unlocking the stw_framebuffer. This fixes that to avoid potential deadlocks. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-11 09:38:22 -07:00
Kenneth Graunke	e42a29531a	i965: Print force_writemask_all in dump_instructions(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-11 08:35:15 -08:00
Kenneth Graunke	ecb5e0a986	i965: Combine BRW_NEW_*_BINDING_TABLE dirty bits. A while back, we moved to directly emitting the Gen7+ state when constructing the binding tables. These flags are only used on Gen4-6, which emit all the binding table pointers at once. We gain nothing by having separate flags, so combine them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-11-11 08:33:58 -08:00
Kenneth Graunke	a2987ff57f	i965: Map GL_PATCHES to 3DPRIM_PATCHLIST_n. Inspired by a patch by Fabian Bieler. Fabian defined a _3DPRIM_PATCHLIST_0 macro (which isn't actually a valid topology type); I instead chose to make a macro that takes an argument. He also took the number of patch vertices from _mesa_prim (which was set to ctx->TessCtrlProgram.patch_vertices) - I chose to use it directly to avoid the need for the VBO patch. v2: Change macro to 0x20 + (n - 1) instead of 0x1F + n to better match the documentation (suggested by Ian). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-11 08:33:48 -08:00
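The v2 note in a2987ff57f describes the macro as `0x20 + (n - 1)` rather than `0x1F + n`. A small sketch of that mapping is shown below; the macro and function names are hypothetical, and the assumed valid range of 1 to 32 patch vertices is an assumption of this sketch, not something stated in the commit message.

```c
#include <assert.h>

/* Hypothetical macro name. Maps a patch vertex count n to a topology value,
 * using the 0x20 + (n - 1) form mentioned in the commit message. */
#define PRIM_PATCHLIST(n) (0x20 + ((n) - 1))

/* Hypothetical wrapper that guards the assumed valid range of 1..32. */
static unsigned
patchlist_topology(unsigned patch_vertices)
{
   assert(patch_vertices >= 1 && patch_vertices <= 32);
   return PRIM_PATCHLIST(patch_vertices);
}
```

Both forms give the same value for every n; the `0x20 + (n - 1)` spelling just makes it visible that a one-vertex patch list starts at 0x20.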
Emil Velikov	cbb7d90e57	docs: add news item and link release notes for 11.0.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-11-11 11:18:32 +00:00
Emil Velikov	6435d8ac5a	docs: add sha256 checksums for 11.0.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `66c949d0a1`)	2015-11-11 11:16:43 +00:00
Emil Velikov	07948b03fb	docs: add release notes for 11.0.5 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `ee57c22141`)	2015-11-11 11:16:42 +00:00
Glenn Kennard	3f45d29fe4	r600g: Pass conservative depth parameters to hw Supported on R700 and up. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-11 09:06:25 +10:00
Dave Airlie	b3e793f2db	Revert "r600g: Pass conservative depth parameters to hw" This reverts commit `a1fc78911e`. I pushed the wrong patch.	2015-11-11 09:05:50 +10:00
Glenn Kennard	c878d61124	r600g: Implement ARB_texture_view Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-11 08:36:08 +10:00
Glenn Kennard	a1fc78911e	r600g: Pass conservative depth parameters to hw Supported on R700 and up. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-11 08:32:35 +10:00
Eduardo Lima Mitev	de51676b41	i965/nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and fmul is a const When both fadd and fmul instructions have at least one operand that is a constant and it is only used once, the total number of instructions can be reduced from 3 (1 ffma + 2 load_const) to 2 (1 fmul + 1 fadd); because the constants will be propagated as immediate operands of fmul and fadd. This patch detects these situations and prevents fusing fmul+fadd into ffma. Shader-db results on i965 Haswell: total instructions in shared programs: 6235835 -> 6225895 (-0.16%) instructions in affected programs: 1124094 -> 1114154 (-0.88%) total loops in shared programs: 1979 -> 1979 (0.00%) helped: 7612 HURT: 843 GAINED: 4 LOST: 0 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-10 21:13:35 +01:00
Eduardo Lima Mitev	fb3b5669ce	util: Add list_is_singular() helper function Returns whether the list has exactly one element. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-10 21:13:35 +01:00
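A "list has exactly one element" test like the one fb3b5669ce describes could look roughly like this for a circular doubly linked list with a sentinel head. The struct and function names below are hypothetical stand-ins chosen for the sketch, not the definitions from Mesa's util headers.

```c
#include <stdbool.h>

/* Hypothetical circular doubly linked list; an empty list has the head
 * pointing at itself in both directions. */
struct node {
   struct node *prev;
   struct node *next;
};

static void
node_init_head(struct node *head)
{
   head->prev = head;
   head->next = head;
}

static void
node_add_tail(struct node *item, struct node *head)
{
   item->next = head;
   item->prev = head->prev;
   head->prev->next = item;
   head->prev = item;
}

/* True when the list is non-empty and its first element links straight
 * back to the head, i.e. it holds exactly one element. */
static bool
node_list_is_singular(const struct node *head)
{
   return head->next != head && head->next->next == head;
}
```

The check is constant time: it never walks the list, it only looks at the first element and its successor.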
Eduardo Lima Mitev	94ff35204d	nir/nir_opt_peephole_ffma: Move this lowering pass to the i965 driver Because the next patch will add an optimization that is specific to i965, we want to move this lowering pass to that driver altogether. This is safe because i965 is the only consumer. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-10 21:13:35 +01:00
Kristian Høgsberg Kristensen	96b22fb080	glsl: Use array deref for access to vector components We've assumed that we could lower per-component vector access from vec[i] = scalar to vec = ir_triop_vector_insert(vec, scalar, i) but with SSBOs (and compute shader SLM and tessellation outputs) this is no longer valid. If a vector is "externally visible", multiple threads can write independent components simultaneously. With lowering to ir_triop_vector_insert, each thread reads the entire vector, changes one component, then writes out the entire vector. This is racy. Instead of generating an ir_binop_vector_extract when we see v[i], we generate ir_dereference_array. We then add a separate lowering pass that lowers the ir_dereference_array to ir_binop_vector_extract for rvalues and to ir_triop_vector_insert for lvalues. The resulting IR is the same as before, but we now have a window between ast->ir conversion and the lowering pass where v[i] appears in the IR as an array deref. This lets us run lowering passes that lower the vector access to I/O (eg for SSBO load/store) before we lower the per-component access to full vector writes. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-11-10 12:02:46 -08:00
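The race described in 96b22fb080 can be shown with a deterministic, single-threaded simulation of one bad interleaving. Everything here is a hypothetical model written for illustration: `struct vec4`, `vector_insert` and the two scenario functions are not GLSL IR code, and the two "threads" are just two sequences of statements.

```c
/* Hypothetical model of an externally visible four-component vector. */
struct vec4 {
   float c[4];
};

/* Models the whole-vector lowering: return a copy of v with component i
 * replaced by s. */
static struct vec4
vector_insert(struct vec4 v, float s, int i)
{
   v.c[i] = s;
   return v;
}

/* Thread A writes component 0 and thread B writes component 1, each through
 * a read-modify-write of the whole vector. Both read before either writes. */
static struct vec4
interleaved_whole_vector_writes(struct vec4 shared)
{
   struct vec4 a = shared;              /* thread A reads the vector */
   struct vec4 b = shared;              /* thread B reads the vector */
   shared = vector_insert(a, 1.0f, 0);  /* thread A writes it back   */
   shared = vector_insert(b, 2.0f, 1);  /* thread B clobbers A's update */
   return shared;
}

/* The same two updates expressed as independent per-component stores. */
static struct vec4
per_component_writes(struct vec4 shared)
{
   shared.c[0] = 1.0f;
   shared.c[1] = 2.0f;
   return shared;
}
```

Starting from an all-zero vector, the interleaved version ends with component 0 still at zero because thread B wrote back a stale copy, while the per-component version keeps both updates.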
Kristian Høgsberg Kristensen	60dd5287ff	glsl: Lower UBO and SSBO access in glsl linker All GLSL IR consumers run this lowering pass so we can move it to the linker. This moves the pass up quite a bit, but that's the point: it needs to run before we throw away information about per-component vector access. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-11-10 12:02:46 -08:00
Kristian Høgsberg Kristensen	f0e95c2500	glsl: Drop exec_list argument to lower_ubo_reference We always pass in shader->ir and we already pass in the shader, so just drop the exec_list. Most passes either take just a exec_list or a shader, so this seems more consistent. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-11-10 12:02:46 -08:00
Connor Abbott	213f86416f	nir/glsl: switch to using the builder v2: use nir_builder_cf_insert (Ken) Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-10 13:56:43 -05:00
Connor Abbott	fbbfb7c025	nir/glsl: make emit() take nir_ssa_def * sources Again, this matches what the builder will have to do. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-10 13:56:35 -05:00
Connor Abbott	a60e990dd2	nir/glsl: convert nir_visitor::result to a nir_ssa_def * Its only user now returns a nir_ssa_def *, and we'll need this since the builder returns a nir_ssa_def *. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-10 13:55:54 -05:00
Connor Abbott	30fe8eaa8e	nir/glsl: make evaluate_rvalue() return a nir_ssa_def * A long time ago, before NIR was even merged to master, glsl_to_nir used registers and these sources were actually register sources. But nowadays everything in glsl_to_nir is an SSA value, so stop pretending that by evaluating an rvalue we can get an arbitrary nir_src. Most importantly, we need this since the builder takes nir_ssa_def * sources directly. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-10 13:55:14 -05:00
Jose Fonseca	6f42162329	st/mesa: Destroy buffer object's mutex. Ideally we should have a _mesa_cleanup_buffer_object function in src/mesa/bufferobj.c so that the destruction logic resided in a single place. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-11-10 11:04:28 +00:00
Kenneth Graunke	db54673b54	nir: Store PatchInputsRead and PatchOutputsWritten in nir_shader_info. These tessellation shader related fields need plumbing through NIR. v2: Use uint32_t instead of uint64_t to match the source type of GLbitfield (caught by Iago Toral). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-10 01:03:43 -08:00
Eric Anholt	437d7b6119	vc4: Avoid loading undefined (newly-allocated) FBO contents. Since X has undefined contents in new pixmaps, it will allocate new textures for an FBO and draw to them without an explicit clear. For VC4, it's much faster to emit a clear than the load of the actual undefined memory contents, so just do that instead.	2015-11-09 19:17:36 -08:00
Eric Anholt	5980389bbf	vc4: Return NULL when we can't make our shadow for a sampler view. I'm not sure what the caller does is appropriate (just have a NULL sampler at this slot), but it fixes the immediate crash. Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-11-09 19:17:36 -08:00
Eric Anholt	eb8fb0064d	vc4: Return GL_OUT_OF_MEMORY when buffer allocation fails. I was afraid our callers weren't prepared for this, but it looks like at least for resource creation, mesa/st throws an error appropriately. Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-11-09 19:17:36 -08:00
Eric Anholt	84608e07e7	vc4: Add CL dumping for GL_ARRAY_PRIMITIVE.	2015-11-09 19:17:36 -08:00
Eric Anholt	855a3ca598	vc4: Fix a compiler warning.	2015-11-09 19:17:36 -08:00
Jordan Justen	fb3da129d1	glsl: Use shared storage variable type for shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-11-09 17:21:24 -08:00
Jordan Justen	32746fc9b4	glsl: Add shared variable type Shared variables are stored in a common pool accessible by all threads in a compute shader local work group. These variables are similar to OpenCL's local/__local variables. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-11-09 17:21:24 -08:00
Jordan Justen	c0ac4740a7	glsl: Add space to shader_storage in print_visitor Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-11-09 17:21:17 -08:00
Jordan Justen	007d96730e	glsl: Align comments on variables types v2: * Split from patch to add ir_var_shader_shared (tarceri) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-11-09 17:21:17 -08:00
Jordan Justen	8b28b35531	glsl: Parse shared keyword for compute shader variables v2: * Move shared parsing under storage qualifiers (tarceri) * Fail to compile if shared is used in non-compute shader (tarceri) * Use separate shared_storage bit for shared variables (tarceri) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-11-09 17:21:12 -08:00
Timothy Arceri	a4a46fe3fa	glsl: simplify interface block stream qualifier validation Qualifiers on member variables are redundant; all we need to do is check whether each one matches the stream associated with the block and throw an error if it does not. Reviewed-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Cc: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-10 12:02:30 +11:00
Ilia Mirkin	3ea3727998	docs: note that ARB_copy_image was added to nv50, nvc0 in this release Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-09 07:14:07 -05:00
Brian Paul	28f6faca51	st/wgl: add null pointer check for HUD texture Fixes crash when using HUD with Nobel Clinician Viewer. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-11-09 11:25:59 +00:00
Brian Paul	75d1e363ff	st/wgl: fix double-present on swapbuffers bug The stw_st_framebuffer_present_locked() function was getting called twice per SwapBuffers. First, when st_context_iface::flush() was called from DrvSwapBuffers() because the ST_FLUSH_FRONT flag was given. Second, by stw_st_swap_framebuffer_locked() which does the actual SwapBuffers. Two code changes: 1. Pass ST_FLUSH_END_OF_FRAME, instead of ST_FLUSH_FRONT. 2. Move the implementation of stw_flush_current_locked() into DrvSwapBuffers() since it's not called anywhere else. Not much change in perf for benchmarks like Lightsmark, but some simple Mesa demos are measurably faster. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-11-09 11:25:59 +00:00
Brian Paul	8083943e2e	st/wgl: reorder pixel formats to put MSAA formats last And put 8-bit/channel formats before 5/6/5 formats. The ChoosePixelFormat() function seems to be finicky about format selection. Putting the MSAA formats after the non-MSAA formats means most apps get a low-numbered format. Now we generally get the same pixel format regardless of whether using vgpu9 or 10. VMware bug 1455030 Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-11-09 11:25:59 +00:00
José Fonseca	e524df5ef3	st/wgl: Don't rely on GDI to bookkeep pixelformat for us. This allows using apitrace's retracediff script on Windows to retrace and compare two builds of a Mesa based opengl32.dll/ICD side-by-side. See also `e4a4f15f5b`	2015-11-09 11:08:27 +00:00
Michel Dänzer	24abbaff9a	winsys/radeon: Use CPU page size instead of hardcoding 4096 bytes v3 Fixes GPUVM conflicts with non-4K page size. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92738 v2: Replace sanitization of VM base address alignment with comment why that's not necessary. v3: Use unsigned instead of long as the type for the size_align member. (Marek) Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Christian König <christian.koenig@amd.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-09 17:24:32 +09:00
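The idea behind 24abbaff9a, querying the CPU page size instead of assuming 4096 bytes, can be sketched with the POSIX `sysconf` call. The helper names and the 4096-byte fallback are assumptions of this sketch; the commit message does not say how the winsys obtains or stores the value beyond using an `unsigned` member.

```c
#include <unistd.h>

/* Hypothetical helper: ask the OS for the CPU page size and fall back to
 * 4096 only if the query fails. */
static unsigned
cpu_page_size(void)
{
   long size = sysconf(_SC_PAGESIZE);
   return size > 0 ? (unsigned)size : 4096u;
}

/* Hypothetical helper: round a size up to a multiple of the page size,
 * which must be a power of two. */
static unsigned long long
align_to_page(unsigned long long size, unsigned page_size)
{
   unsigned long long mask = (unsigned long long)page_size - 1;
   return (size + mask) & ~mask;
}
```

On a system with 64 KiB pages, a one-byte allocation rounds up to 65536 bytes here, which is the case a hardcoded 4096 would get wrong.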
Christian König	df4f9b0236	radeon/uvd: add H.265/HEVC to legal notes Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-08 18:16:01 -05:00
Leo Liu	519502d08f	st/omx: add headless support This will allow dec/enc/transcode without X v2: use env override even with X, use loader_open_device instead of open v3: clean up Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-11-08 18:15:57 -05:00
Leo Liu	25526d77b1	st/va: use vl screen drm support from vl_wys_drm v2: move the dup to vl_wys_drm for pipe loader Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-11-08 18:15:57 -05:00
Leo Liu	7da86e0ec0	vl: add drm support for vl_screen This will allow the state trackers to use render nodes with screen creation v2: dup fd for pipe loader Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-11-08 18:15:57 -05:00
Leo Liu	d115e47099	st/va: fix build fails with pipe loader There is no dev in drv, and dev should be from vl_screen here Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-11-08 18:15:57 -05:00
Samuel Pitoiset	ffb60e7788	nvc0: enable compute support on Fermi Although the compute support is still not complete because textures and surfaces need to be implemented, it allows launching very simple compute kernels, such as one that reads MP performance counters. This turns on PIPE_CAP_COMPUTE and PIPE_SHADER_COMPUTE. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-08 16:47:59 +01:00
Ilia Mirkin	e06238cb9e	nv50/ir: fix emission of s[] args in certain situations There might only be a single arg (e.g. cvt), so use mode rather than looking at the source directly. Also we don't want to rely on the type of the value, which can be unreliable, but instead use the instruction's. This works out well since mkSplit doesn't adjust the type. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-07 18:58:58 -05:00
Ilia Mirkin	af218217d7	nv50/ir: only take abs value when computing high result Not reachable from TGSI since it only has UMUL, no IMUL. However it's surprising that setting argument types to s32 will cause sign to get lost. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-07 18:58:58 -05:00
Ilia Mirkin	53cbb11707	nouveau: avoid queueing too much work onto a single fence Force the fence to get kicked off, which won't actually wait for its completion, but any additional work will be put onto a fresh list. This fixes crashes in teximage-colors --benchmark with too many active maps. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-07 18:58:58 -05:00
Dave Airlie	0f5b1409fd	llvmpipe: disable front updates for now As pointed out by Emil, this sometimes hangs. It appears to be due to threading; we need to rethink how this stuff works for llvmpipe. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-08 07:55:17 +10:00
Dave Airlie	87711183ac	virgl: wrap ret assignment with braces to do correct thing Coverity reported that ret could only be 0 or 1, since it was setting ret = fn() > 0, instead of doing (ret = fn()) > 0. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-08 06:27:02 +10:00
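The precedence problem fixed in 87711183ac comes from `>` binding tighter than `=` in C. The sketch below reproduces both spellings with a hypothetical stand-in function, `fake_call`, in place of the real call.

```c
/* Hypothetical stand-in for fn(): returns a positive count. */
static int
fake_call(void)
{
   return 7;
}

/* ret = fn() > 0 parses as ret = (fn() > 0), so ret can only be 0 or 1. */
static int
assign_without_parens(void)
{
   int ret;
   ret = fake_call() > 0;
   return ret;
}

/* (ret = fn()) > 0 stores the real return value, then compares it. */
static int
assign_with_parens(void)
{
   int ret;
   int positive = (ret = fake_call()) > 0;
   (void)positive;
   return ret;
}
```

The first function returns 1 and loses the value 7; the second returns 7, which is the behavior the commit restores.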
Jason Ekstrand	6c731d8566	nir: Add a nir_deref_tail helper Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-07 12:09:44 -08:00
Jason Ekstrand	7d90e570f3	nir/types: Add an is_vector_or_scalar helper Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-07 12:09:38 -08:00
Jason Ekstrand	d43e16b163	i965/fs: Use regs_read/written for post-RA scheduling in calculate_deps Previously, we were assuming that everything read/wrote exactly 1 logical GRF (1 in SIMD8 and 2 in SIMD16). This isn't actually true. In particular, the PLN instruction reads 2 logical registers in one of the components. This commit changes post-RA scheduling to use regs_read and regs_written instead so that we add enough dependencies. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92770 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-07 08:41:48 -08:00
Jason Ekstrand	c839174d55	nir/validate: Add better validation of load/store types Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-07 08:41:35 -08:00
Marek Olšák	d57ede92b7	radeonsi: add register definitions for Stoney There are a few non-stoney changes too. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-07 10:22:13 +01:00
Marek Olšák	2658777f46	radeonsi: add workarounds for CP DMA to stay on the fast path v2: set emit_scratch_reloc, add a NULL check Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-11-07 10:22:13 +01:00
Marek Olšák	fc0416ef5d	radeonsi: unify CP DMA preparation logic Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-11-07 10:22:13 +01:00
Marek Olšák	89da3b4458	radeonsi: unify CP DMA code determining various flags v2: don't call get_flush_flags twice per function Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-11-07 10:22:12 +01:00
Marek Olšák	c3e527f93d	radeonsi: only enable write confirmation on the last CP DMA packet This should improve performance for big copies that need to be split. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-11-07 10:22:12 +01:00
Ilia Mirkin	8e9ade7eb3	nv50/ir: allow emission of immediates in imul/imad ops Nothing actually uses this yet (due to complications), but the emission logic is right. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-07 00:42:15 -05:00
Ilia Mirkin	393d0c336b	nv50/ir: properly set the type of the constant folding result This removes the hack used for merge, which only covers a fraction of the cases. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-06 19:39:32 -05:00
Ilia Mirkin	2f9aaed749	nv50/ir: add support for const-folding OP_CVT with F64 source/dest Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-06 19:39:32 -05:00
Ilia Mirkin	76957389fc	nv50/ir: add fp64 opcode emission support for G200 (NVA0) Need to emulate rcp/rsq before providing full fp64 support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-06 18:36:25 -05:00
Hans de Goede	f979d3cfec	nv50/ir: Add support for 64bit immediates to checkSwapSrc01 Now that we support 64 bit immediates in insnCanLoad, we need to swap 64 bit immediate sources too for optimal effect. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-06 18:13:31 -05:00
Hans de Goede	9f2f8bda6e	nvc0/ir: Teach insnCanLoad about double immediates Teach insnCanLoad about double immediates, together with the "Add support for merge-s to the ConstantFolding pass" This turns the following (nvc0) code: 1: mov u32 $r2 0x00000000 (8) 2: mov u32 $r3 0x3fe00000 (8) 3: add f64 $r0d $r0d $r2d (8) Into: 1: add f64 $r0d $r0d 0.500000 (8) Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-06 18:13:31 -05:00
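In the listing quoted by 9f2f8bda6e, the two 32-bit moves load the low and high halves of one 64-bit value. Assuming IEEE 754 binary64, which is an assumption of this sketch, joining 0x00000000 (low) and 0x3fe00000 (high) gives the 0.500000 shown in the folded instruction. The helper name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret two 32-bit halves as one double. */
static double
merge_u32_to_f64(uint32_t lo, uint32_t hi)
{
   uint64_t bits = ((uint64_t)hi << 32) | lo;
   double value;

   /* memcpy is the portable way to reinterpret the bit pattern. */
   memcpy(&value, &bits, sizeof value);
   return value;
}
```

Calling `merge_u32_to_f64(0x00000000, 0x3fe00000)` yields 0.5, matching the immediate in the single `add f64` instruction.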
Hans de Goede	428506ece2	nv50/ir: Add support for merge-s to the ConstantFolding pass This allows later passes like LoadPropagation to properly deal with 64 bit immediates. If the new 64 bit load this introduces does not get optimized away then split64BitOpPostRA() will split this into 2 instructions again. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-06 18:13:31 -05:00
Ilia Mirkin	2437f00853	nv50/ir: disallow 64-bit immediates on nv50 targets No instructions are able to load short immediates like nvc0 can. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-06 18:13:31 -05:00
Ilia Mirkin	11e3dac36e	nv50/ir: allow movs with TYPE_F64 destinations to be split Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-06 18:13:31 -05:00
Hans de Goede	b487b55f7d	gm107/ir: Add support for double immediates Add support for encoding double immediates (up to 20 bits of precision) into the generated gm107 machine-code. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-06 17:22:40 -05:00
Hans de Goede	12c850d01c	nvc0/ir: Add support for double immediates Add support for encoding double immediates (up to 20 bits of precision) into the generated nvc0 machine-code. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-06 17:22:40 -05:00
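Both double-immediate commits (b487b55f7d and 12c850d01c) mention "up to 20 bits of precision". One plausible reading, and it is only an assumption of this sketch, is that the instruction stores the top 20 bits of the 64-bit pattern, so a value is encodable only when the remaining low 44 bits are zero. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical check: a double fits a 20-bit immediate field if only the
 * top 20 bits of its bit pattern (sign, exponent, 8 mantissa bits) are
 * set. This layout is an assumption, not taken from the commit message. */
static bool
f64_fits_20bit_immediate(double value)
{
   uint64_t bits;
   memcpy(&bits, &value, sizeof bits);
   return (bits & 0x00000fffffffffffULL) == 0;
}

/* Hypothetical encoder: keep only the top 20 bits. */
static uint32_t
f64_to_20bit_immediate(double value)
{
   uint64_t bits;
   memcpy(&bits, &value, sizeof bits);
   return (uint32_t)(bits >> 44);
}
```

Under that assumption, values such as 0.5 and 1.0 are encodable, while 0.1 is not because its mantissa needs more than 8 bits.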
Francisco Jerez	5169407221	i965/nir/fs: Add comment for no-op memory barrier functions Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-11-06 13:19:56 -08:00
Jordan Justen	faa1193070	i965/nir/fs: Implement new barrier functions for compute shaders For these nir intrinsics, we emit the same code as nir_intrinsic_memory_barrier: * nir_intrinsic_memory_barrier_atomic_counter * nir_intrinsic_memory_barrier_buffer * nir_intrinsic_memory_barrier_image We treat these nir intrinsics as no-ops: * nir_intrinsic_group_memory_barrier * nir_intrinsic_memory_barrier_shared v3: * Add comment for no-op cases (curro) v4: * Moving comment to a separate patch authored by curro Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-06 13:16:11 -08:00
Jordan Justen	9d65f3208b	nir: Add new barrier functions for compute shaders When these functions are called in glsl-ir, we create a corresponding nir intrinsic function call. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-06 13:15:16 -08:00
Jordan Justen	91f188710a	glsl: Add new barrier functions for compute shaders When these functions are called in GLSL code, we create an intrinsic function call: * groupMemoryBarrier => __intrinsic_group_memory_barrier * memoryBarrierAtomicCounter => __intrinsic_memory_barrier_atomic_counter * memoryBarrierBuffer => __intrinsic_memory_barrier_buffer * memoryBarrierImage => __intrinsic_memory_barrier_image * memoryBarrierShared => __intrinsic_memory_barrier_shared v2: * Consolidate with memoryBarrier function/intrinsic creation (curro) v3: * Instead of add_memory_barrier_function, add an intrinsic_name parameter to _memory_barrier (curro) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-06 13:14:44 -08:00
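The name mapping listed in 91f188710a can be written down as a small lookup table. The table contents come from the commit message; the struct, array and function names are hypothetical and this is not how the GLSL builtin code is organized.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table pairing each GLSL barrier function with the intrinsic
 * name given in the commit message. */
struct barrier_mapping {
   const char *glsl_name;
   const char *intrinsic_name;
};

static const struct barrier_mapping barrier_table[] = {
   { "groupMemoryBarrier",         "__intrinsic_group_memory_barrier" },
   { "memoryBarrierAtomicCounter", "__intrinsic_memory_barrier_atomic_counter" },
   { "memoryBarrierBuffer",        "__intrinsic_memory_barrier_buffer" },
   { "memoryBarrierImage",         "__intrinsic_memory_barrier_image" },
   { "memoryBarrierShared",        "__intrinsic_memory_barrier_shared" },
};

/* Returns the intrinsic name for a GLSL barrier function, or NULL if the
 * name is not one of the five listed above. */
static const char *
barrier_intrinsic_name(const char *glsl_name)
{
   size_t count = sizeof barrier_table / sizeof barrier_table[0];

   for (size_t i = 0; i < count; i++) {
      if (strcmp(barrier_table[i].glsl_name, glsl_name) == 0)
         return barrier_table[i].intrinsic_name;
   }
   return NULL;
}
```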
Boyuan Zhang	6bad554d98	radeon/uvd: fix VC-1 simple/main profile decode v2 We just needed to set the extra width/height fields to get this working. v2 (chk): rebased, CC stable added, commit message added, fixed coding style Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-11-06 20:07:23 +01:00
Boyuan Zhang	ed55def44f	st/vaapi: fix vaapi VC-1 simple/main corruption v2 Apply the start code fix only to advanced profile. v2 (chk): add commit message Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-11-06 20:07:23 +01:00
Julien Isorce	cc1e5c972e	st/va: add support for RGBX and BGRX in VPP Before it was only possible to convert a NV12 surface to RGBA or BGRA. This patch uses the same post processing function, "handleVAProcPipelineParameterBufferType", but adds definitions for RGBX and BGRX. This patch also makes vlVaQuerySurfaceAttributes more generic to avoid copy and pasting the same lines. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-06 17:33:45 +00:00
Julien Isorce	42a5e143a8	vl/buffers: add RGBX and BGRX to the supported formats Useful if one wants to create RGBX or BGRX surfaces. The infrastructure is such that it required just a few definitions to support these formats. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-06 17:33:38 +00:00
Julien Isorce	bf6acbb2db	st/va: properly use brackets in vlVaAcquireBufferHandle's switch In "switch (mem_type)" the brackets were surrounding "case+default" instead of "case" only. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-06 17:33:16 +00:00
Julien Isorce	bfc245e9ac	st/va: properly indent buffer.c, config.c, image.c and picture.c Some lines were using 4 indentation spaces instead of 3. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-06 17:33:01 +00:00
Rob Clark	6459e780ae	freedreno/a4xx: fix blend color Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-06 11:19:04 -05:00
Rob Clark	7465e16124	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-06 11:18:47 -05:00
Guillaume Charifi	6f5e0c08a4	freedreno: add a305 support Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-06 11:17:58 -05:00
Boyan Ding	8f55ebe802	freedreno/ir3: Use nir_foreach_variable Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-11-06 11:17:53 -05:00
Rob Clark	99597d033a	nir: some small cleanups The various cf nodes all get allocated w/ shader as their ralloc_parent, so let's make this more explicit. Plus a couple of other corrections/clarifications. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-06 11:15:41 -05:00
Ilia Mirkin	d68226087c	nvc0: reintroduce BGRA4 format support Commit `342e68dc60` (nvc0: remove BGRA4 format support) removed the support to fix a WoW trace. However after further experimentation, I was able to get the blit to work by using a different "fake" format in the 2d engine. The reason why this worked on nv50 is that nv50 falls back to the 3d blit path in case either the src or the dst aren't "faithfully" supported, while nvc0 only does it for the dst format. RG8 is better supported by the nvc0 2d engine than R16. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-06 00:47:44 -05:00
Brian Paul	581111c4d6	mesa: report enum name in glClientActiveTexture() error string As we do for glActiveTexture(). Trivial.	2015-11-05 20:12:33 -07:00
Julien Isorce	497bde6727	st/va: fix memory leak on error in vlVaCreateSurfaces2 Found by coverity: CID #1337953 Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-05 23:39:45 +00:00
Julien Isorce	e0b896c86c	st/va: indent vlVaQuerySurfaceAttributes and vlVaCreateSurfaces2 Some lines were using 4 indentation spaces instead of 3. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-05 23:39:43 +00:00
Kenneth Graunke	8dcf807cb4	i965: Fix scalar VS float[] and vec2[] output arrays. The scalar VS backend has never handled float[] and vec2[] outputs correctly (my original code was broken). Outputs need to be padded out to vec4 slots. In fs_visitor::nir_setup_outputs(), we tried to process each vec4 slot by looping from 0 to ALIGN(type_size_scalar(type), 4) / 4. However, this is wrong: type_size_scalar() for a float[2] would return 2, or for vec2[2] it would return 4. This looked like a single slot, even though in reality each array element would be stored in separate vec4 slots. Because of this bug, outputs[] and output_components[] would not get initialized for the second element's VARYING_SLOT, which meant emit_urb_writes() would skip writing them. Nothing used those values, and dead code elimination threw a party. To fix this, we introduce a new type_size_vec4_times_4() function which pads array elements correctly, but still counts in scalar components, generating correct indices in store_output intrinsics. Normally, varying packing avoids this problem by turning varyings into vec4s. So this doesn't actually fix any Piglit or dEQP tests today. However, if varying packing is disabled, things would be broken. Tessellation shaders can't use varying packing, so this fixes various tcs-input Piglit tests on a branch of mine. v2: Shorten the implementation of type_size_4x to a single line (caught by Connor Abbott), and rename it to type_size_vec4_times_4() (renaming suggested by Jason Ekstrand). Use type_size_vec4 rather than using type_size_vec4_times_4 and then dividing by 4. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-05 15:26:07 -08:00
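The slot-counting bug described in 8dcf807cb4 can be reproduced with plain arithmetic. The sketch below models only arrays of scalars or small vectors; the struct and helper names are hypothetical, and only the formula `ALIGN(type_size_scalar(type), 4) / 4` and the float[2] and vec2[2] examples are taken from the commit message.

```c
/* Hypothetical model of an output type: an array of `length` elements,
 * each with `components` scalar components (1 for float, 2 for vec2). */
struct array_type {
   unsigned components;
   unsigned length;
};

#define ALIGN4(x) (((x) + 3u) & ~3u)

/* Total scalar components, ignoring any padding. */
static unsigned
size_scalar(struct array_type t)
{
   return t.components * t.length;
}

/* Slot count as the old loop bound computed it. */
static unsigned
old_slot_count(struct array_type t)
{
   return ALIGN4(size_scalar(t)) / 4;
}

/* Each array element occupies its own vec4 slot. */
static unsigned
size_vec4(struct array_type t)
{
   return t.length;
}

/* Padded size, still counted in scalar components. */
static unsigned
size_vec4_times_4(struct array_type t)
{
   return 4 * size_vec4(t);
}
```

For float[2] the old formula gives one slot although the array spans two vec4 slots, and the same happens for vec2[2]; that is why the second element's slot was never initialized.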
Roland Scheidegger	5ae37ae615	llvmpipe: disable texture cache There are some weird problems with 8-wide vectors.	2015-11-05 18:00:42 +01:00
Ilia Mirkin	ba093a099a	nouveau: send back a debug message when waiting for a fence to complete Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-05 11:22:19 -05:00
Ilia Mirkin	4f6cd5fad0	nv50,nvc0: provide debug messages with shader compilation stats Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-05 11:22:19 -05:00
Ilia Mirkin	4335b28840	nouveau: add support for sending debug messages via KHR_debug Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-05 11:22:19 -05:00
Ilia Mirkin	6706cc1671	st/clover: provide a path for drivers to call through to pfn_notify Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> [ Francisco Jerez: Clean up clover::context interface by passing around a function object. ]	2015-11-05 11:22:19 -05:00
Ilia Mirkin	c93c9d220b	st/mesa: set debug callback for debug contexts Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-11-05 11:22:19 -05:00
Ilia Mirkin	fc76cc05e3	gallium: expose a debug message callback settable by context owner This will allow gallium drivers to send messages to KHR_debug endpoints Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-05 11:22:18 -05:00
Ilia Mirkin	e587590a83	st/mesa: account for texture views when doing CopyImageSubData Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-05 11:22:18 -05:00
Iago Toral Quiroga	eea3c907cc	i965/fs: Do not mark used surfaces in FS_OPCODE_GET_BUFFER_SIZE Do it in the visitor, like we do for other opcodes. v2: use const, get rid of useless surf_index temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-05 16:11:52 +01:00
Iago Toral Quiroga	eca4c43a33	i965/vec4: Do not mark used surfaces in VS_OPCODE_GET_BUFFER_SIZE Do it in the visitor, like we do for other opcodes. v2: use const, get rid of useless surf_index temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-05 16:11:52 +01:00
Iago Toral Quiroga	6105d1d0a0	i965/vec4: Do not mark used direct surfaces in VS_OPCODE_PULL_CONSTANT_LOAD Right now the generator marks direct surfaces as used but leaves marking of indirect surfaces to the caller. Just make the callers handle marking in both cases for consistency. v2: Use const, do not add unnecessary temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-05 16:11:52 +01:00
Iago Toral Quiroga	d7013988fb	i965/fs: Do not mark used direct surfaces in UNIFORM_PULL_CONSTANT_LOAD Right now the generator marks direct surfaces as used but leaves marking of indirect surfaces to the caller. Just make the callers handle marking in both cases for consistency. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-05 16:11:52 +01:00
Iago Toral Quiroga	027b64a55a	i965/fs: Do not mark direct used surfaces in VARYING_PULL_CONSTANT_LOAD Right now the generator marks direct surfaces as used but leaves marking of indirect surfaces to the caller. Just make the callers handle marking in both cases for consistency. v2: Use const and remove useless surf_index temporary (Curro) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-05 16:11:52 +01:00
Neil Roberts	6c5f371a27	i965/skl+: Enable support for 16x multisampling Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-11-05 10:33:21 +01:00
Neil Roberts	aa3f9aaf31	mesa/meta: Use interpolateAtOffset for 16x MSAA copy blit Previously there was a problem in i965 where if 16x MSAA is used then some of the sample positions are exactly on the 0 x or y axis. When the MSAA copy blit shader interpolates the texture coordinates at these sample positions it was possible that it would jump to a neighboring texel due to rounding errors. It is likely that these positions would be used on 16x MSAA because that is where they are defined to be in D3D. To fix that this patch makes it use interpolateAtOffset in the blit shader whenever 16x MSAA is used and the GL_ARB_gpu_shader5 extension is available. This forces it to interpolate the texture coordinates at the pixel center to avoid these problematic positions. This fixes ext_framebuffer_multisample-unaligned-blit and ext_framebuffer_multisample-clip-and-scissor-blit with 16x MSAA on SKL+. v2: Use interpolateAtOffset instead of interpolateAtSample v3: Always try to enable GL_ARB_gpu_shader5 in the shader [Ian Romanick] Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-05 10:33:16 +01:00
Neil Roberts	b080b3d54d	meta/blit: Always try to enable GL_ARB_sample_shading Previously this extension was only enabled when blitting between two multisampled buffers. However I don't think it does any harm to just enable it all the time. The ‘enable’ option is used instead of ‘require’ so that the shader will still compile if the extension isn't available in the cases where it isn't used. This will make the next patch simpler because it wants to add another optional extension. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-05 10:33:16 +01:00
Neil Roberts	2dd76ec16e	meta: Support 16x MSAA in the multisample scaled blit shader v2: Fix the x_scale in the shader. Remove the doubts in the commit message. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-05 10:33:16 +01:00
Neil Roberts	1a22b12fc5	i965/meta: Support 16x MSAA in the meta stencil blit The destination rectangle is now drawn at 4x4 the size and the shader code to calculate the sample number is adjusted accordingly. Acked-by: Ben Widawsky <ben@bwidawsk.net>	2015-11-05 10:33:16 +01:00
Neil Roberts	a680465428	i965/fs/skl+: Fix calculating gl_SampleID for 16x MSAA In order to accommodate 16x MSAA, the starting sample pair index is now 3 bits rather than 2 on SKL+. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-11-05 10:33:16 +01:00
Neil Roberts	bf6bd7eaf0	i965: Support allocating the MCS buffer for 16x MSAA When 16 samples are used the MCS buffer needs 64 bits per pixel. Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-11-05 10:33:16 +01:00
Neil Roberts	b4c2e6054f	i965: Support calculating the bits needed to set up 16x MSAA The gen7_surface_msaa_bits function already returns the right values for 16 samples but it just needs its assert to be relaxed. Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-11-05 10:33:16 +01:00
Neil Roberts	1a97cac767	i965/fs: Add a sampler program key for whether the texture is 16x MSAA When 16x MSAA is used for sampling with texelFetch the compiler needs to use a different instruction which passes more arguments for the MCS data. Previously on skl+ it was unconditionally using this new instruction. However since 16x MSAA is probably going to be pretty rare, it is probably worthwhile to avoid using this instruction for the other sample counts. In order to do that this patch adds a new member to brw_sampler_prog_key_data to track when a sampler refers to a buffer with 16 samples. Note that this isn't done for the vec4 backend because it wouldn't change how many registers it uses. Acked-by: Ben Widawsky <ben@bwidawsk.net>	2015-11-05 10:33:16 +01:00
Neil Roberts	4ef27745c8	i965/vec4/skl+: Use ld2dms_w instead of ld2dms In order to support 16x MSAA, skl+ has a wider version of ld2dms that takes two parameters for the MCS data. The MCS data in the response still fits in a single register so we just need to ensure we copy both values rather than just the lower one. Acked-by: Ben Widawsky <ben@bwidawsk.net>	2015-11-05 10:33:16 +01:00
Neil Roberts	e386fb0dee	i965/fs/skl+: Use ld2dms_w instead of ld2dms In order to support 16x MSAA, skl+ has a wider version of ld2dms that takes two parameters for the MCS data. The MCS data retrieved from the ld_mcs instruction already returns 4 or 8 registers and is documented to return zeroes for the mcsh value when the sample count is less than 16. v2: Use get_lowered_simd_width to fall back to SIMD8 instructions when the message length would be too long in SIMD16. Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-11-05 10:33:16 +01:00
Neil Roberts	20250e854e	i965: Program 16x MSAA sample positions. This is the standard pattern used by the other 3D graphics API. BDW has slots for these values, but they aren't actually used until SKL. Even though the documentation for BDW says they must be zero, it doesn't seem to cause any harm to program them anyway. The comment above for the 8x sample positions says that the hardware implements centroid interpolation by picking the centre-most sample that is inside the primitive. That implies that it might be worthwhile to pick a pattern that includes 0.5,0.5. However by experimentation this doesn't seem to actually be the case. With the sample positions in this patch, if I modify the piglit test below so that it instead reports the centroid position, it reports 0.492188,0.421875 which doesn't match any of the positions. If I modify the sample positions so that they include one at exactly 0.5,0.5 it doesn't help and it reports another position which is even further from the center for some reason. arb_gpu_shader5-interpolateAtSample-different Kenneth Graunke experimented with some other patterns that have a higher standard deviation but I think after some discussion it was decided that it would be better to pick the same pattern as the other graphics API in case there are games that rely on this pattern. (Based on a patch by Kenneth Graunke) Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben at bwidawsk.net>	2015-11-05 10:33:15 +01:00
Kenneth Graunke	5048da974e	i965: Handle 16x MSAA in IMS dimension munging code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-11-05 10:33:15 +01:00
Kenneth Graunke	b9f8e729c8	nir: Rename nir_live_variables.c to nir_liveness.c. It doesn't actually operate on variables. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-05 00:09:40 -08:00
Kenneth Graunke	5c6f21579d	nir: Rename live_variables to live_ssa_defs. This computes liveness of SSA values, not nir_variables. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-05 00:09:40 -08:00
Alejandro Piñeiro	56774e6302	i965/vec4: select predicate based on writemask for sel emissions Equivalent to commit `8ac3b525c` but with sel operations. In this case we select the PredCtrl based on the writemask. This patch helps on cases like this: 1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F 2: cmp.nz.f0.0 null:D, vgrf40.xxxx:D, 0D 3: (+f0.0) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD In this case, cmod propagation can't optimize instruction #2, because instructions #1 and #2 have different writemasks, and we can't update directly instruction #2 writemask because our code thinks that sel at instruction #3 reads all four channels of the flag, when it actually only reads .x. So, with this patch, the previous case becomes this: 1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F 2: cmp.nz.f0.0 null:D, vgrf40.xxxx:D, 0D 3: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD Now only the x channel of the flag is used, allowing dead code elimination to update the writemask at the second instruction: 1: cmp.l.f0.0 vgrf40.0.x:F, vgrf0.zzzz:F, vgrf7.xxxx:F 2: cmp.nz.f0.0 null.x:D, vgrf40.xxxx:D, 0D 3: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD So now cmod propagation can simplify out #2: 1: cmp.l.f0.0 vgrf40.0.x:F, attr18.wwww:F, vgrf7.xxxx:F 2: (+f0.0.x) sel vgrf41.0.x:UD, vgrf6.xxxx:UD, vgrf5.xxxx:UD Shader-db numbers: total instructions in shared programs: 6235835 -> 6228008 (-0.13%) instructions in affected programs: 219850 -> 212023 (-3.56%) total loops in shared programs: 1979 -> 1979 (0.00%) helped: 1192 HURT: 0	2015-11-05 08:57:23 +01:00
Ilia Mirkin	bb73fc4cb8	nouveau: relax fence emit space assert We also have the "reserved for kick" space available. Some of my earlier changes can probably be removed, but this is a quick fix for some of the rarer fallout. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2015-11-04 22:43:56 -05:00
Eric Anholt	6d3a24bce8	vc4: When the create ioctl fails, free our cache and try again. This greatly increases the pressure you can put on the driver before create fails. Ultimately we need to let the kernel take control of our cached BOs and just take them from us (and other clients) directly, but this is a very easy patch for the moment. Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-11-04 14:04:14 -08:00
Eric Anholt	3f7c96c36c	vc4: Print the rounded shader size in debug output. It's surprising to see "0kb" printed for debug on short shaders, while 4kb alignment won't be surprising.	2015-11-04 13:32:07 -08:00
Eric Anholt	4a951f1c08	vc4: Fix dumping the size of BOs allocated/cached. 60MB of cached BOs are a lot less scary than 600MB.	2015-11-04 13:32:07 -08:00
Ilia Mirkin	5bbd522452	mesa/tests: add glBufferStorageEXT to ES 3.1 dispatch list I thought that aliased functions didn't need to be added, but that might only be if the function aliases something in the same {desktop,ES} space. Resolves the dispatch sanity test failure. Fixes: `13b19aa81` (mesa: expose support for GL_EXT_buffer_storage) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92824 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-04 14:28:57 -05:00
Brian Paul	bdf6cef033	vbo: fix another GL_LINE_LOOP bug Very long line loops which spanned 3 or more vertex buffers were not handled correctly and could result in stray lines. The piglit lineloop test draws 10000 vertices by default, and is not long enough to trigger this. Even 'lineloop -count 100000' doesn't trigger the bug. For future reference, the issue can be reproduced by changing Mesa's VBO_VERT_BUFFER_SIZE to 4096 and changing the piglit lineloop test to use glVertex2f(), draw 3 loops instead of 1, and specifying -count 1023. Acked-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-11-04 11:51:59 -07:00
Brian Paul	d31481e70a	svga: implement 'white_fragments' option for VGPU10 fragment shaders When we emulate XOR logicop mode with blend-subtract, we need to ensure that the fragment shader always emits white. We had this implemented for VGPU9, but not VGPU10. VMware bug 1545492. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-04 11:51:41 -07:00
Brian Paul	149ac1fe43	u_vbuf: minor code reformatting / line wrapping Trivial.	2015-11-04 11:51:41 -07:00
Brian Paul	e450d4371a	u_vbuf: add some const qualifiers Trivial.	2015-11-04 11:51:40 -07:00
Brian Paul	3f98c812b3	svga: use new enum indices_mode type Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-04 11:51:40 -07:00
Brian Paul	fa6efbd27d	util/indices: replace #define tokens with enum type To ease debugging in gdb. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-11-04 11:51:40 -07:00
Alejandro Piñeiro	c3d7caa1e0	i965: check inst->predicate when clearing flag_live at dead code eliminate Detected by Matt Turner while reviewing commit `a59359ecd2` Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-04 19:33:56 +01:00
Roland Scheidegger	c19443bc8b	gallivm: fix sampling for s3tc srgb formats when using texture cache This actually stored the values as 8bit linear values in the cache, then did another srgb->linear conversion... We don't want to do the former (decoding 8bit srgb values to 8bit linear completely defeats the purpose of srgb in the first place), so just decode to 8bit srgb. Fixes piglit.spec.ext_texture_srgb.texwrap formats-s3tc tests.	2015-11-04 14:21:43 +01:00
Ben Widawsky	d56a1478a8	i965/meta: Assert fast clears and rep clears never overlap There is nothing wrong with the code today, but as one modifies the code it turns out to be not too difficult to mess up the code, and this easy assertion should catch such driver implementation failures quickly. Cc: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-11-03 21:54:11 -08:00
Ryan Houdek	13b19aa815	mesa: expose support for GL_EXT_buffer_storage This extension requires ES 3.1 since it relies on glMemoryBarrier. For testing purposes I temporarily moved glMemoryBarrier to be an ES 3.0 function. This has been tested with the piglit in the ML and the Dolphin emulator. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-04 00:01:03 -05:00
Timothy Arceri	8e4cf900f0	glsl: make sure to only add subroutines to resource list Overlooked in `763cd8c080`. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-11-04 15:43:12 +11:00
Timothy Arceri	f6b3c163f9	glsl: remove old TODO SSBO support now exists as of commits f24e5e and `f408a13dd3`. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2015-11-04 15:40:38 +11:00
Timothy Arceri	6e3b380387	docs: Mark AoA as done for i965 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-04 13:41:16 +11:00
Timothy Arceri	5b75dbd7be	i965: enable ARB_arrays_of_arrays Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-11-04 13:39:08 +11:00
Timothy Arceri	fb77da89f5	i965: add support for image AoA V3: clamp array index to the correct size (the size of the current array rather than the inner array) Francisco Jerez. V2: avoid useless zero-initialization and addition for the first AoA level, avoid redundant temporary, make use of type_size_scalar(), rename aoa_size to element_size, assign the indirect indexing temporary directly to image.reladdr, and replace while loop with a for loop. All suggested by Francisco Jerez. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-11-04 13:38:32 +11:00
Roland Scheidegger	9285ed98f7	llvmpipe: add cache for compressed textures compressed textures are very slow because decoding is rather complex (and because there's no jit code to decode them too for non-technical reasons). Thus, add some texture cache which holds a couple of decoded blocks. Right now this handles only s3tc format albeit it could be extended to work with other formats rather trivially as long as the result of decode fits into 32bit per texel (ideally, rgtc actually would decode to more than 8 bits per channel, but even then making it work for it shouldn't be too difficult). This can improve performance noticeably but don't expect wonders (uncompressed is unsurprisingly still faster). It's also possible it might be slower in some cases (using nearest filtering for example or if there's otherwise not many cache hits, the cache is only direct mapped which isn't great). Also, actual decode of a block relies on util code, thus even though always full blocks are decoded it is done texel by texel - this could obviously benefit greatly from simd-optimized code decoding full blocks at once... Note the cache is per (raster) thread, and currently only used for fragment shaders. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-11-04 02:51:02 +01:00
Oded Gabbay	39b4dfe6ab	llvmpipe: use simple coeffs calc for 128bit vectors There are currently two methods in llvmpipe code to calculate coeffs to be used as inputs for the fragment shader. The two methods use slightly different ways to do the floating point calculations and thus produce slightly different results. The decision which method to use is determined by the size of the vector that is used by the platform. For vectors with size of more than 128bit, a single-step method is used, in which coeffs_init_simple() + attribs_update_simple() are called. For vectors with size of 128bit or less, a two-step method is used, in which coeffs_init() + attribs_update() are called. This causes some piglit tests (clip-distance-bulk-copy, interface-vs-unnamed-to-fs-unnamed) to fail when using platforms with 128bit vectors (such as ppc64le or x86-64 without AVX). This patch makes platforms with 128bit vectors use the single-step method (aka "simple" method) instead of the two-step method. This would make the resulting coeffs identical between more platforms, make sure the piglit tests passes, and make debugging and maintainability a bit easier as the generated LLVM IR will be the same for more platforms. 
The performance impact is negligible for x86-64 without AVX, and basically non-existent for ppc64le, as it can be seen from the following benchmarking results: - glxspheres, on ppc64le: - original code: 4.892745317 frames/sec 5.460303857 Mpixels/sec - with the patch: 4.932083873 frames/sec 5.504205571 Mpixels/sec - Additional 0.8% performance boost - glxspheres, on x86-64 without AVX: - original code: 20.16418809 frames/sec 22.50323395 Mpixels/sec - with the patch: 20.31328989 frames/sec 22.66963152 Mpixels/sec - Additional 0.74% performance boost - glmark2, on ppc64le: - original code: score of 58 - with my change: score of 57 - glmark2, on x86-64 without AVX: - original code: score of 175 - with the patch: score of 167 - Impact of -4.5% on performance - OpenArena, on ppc64le: - original code: 3398 frames 1719.0 seconds 2.0 fps 255.0/505.9/2773.0/0.0 ms - with the patch: 3398 frames 1690.4 seconds 2.0 fps 241.0/497.5/2563.0/0.2 ms - 29 seconds faster with the patch, which is about 2% - OpenArena, on x86-64 without AVX: - original code: 3398 frames 239.6 seconds 14.2 fps 38.0/70.5/719.0/14.6 ms - with the patch: 3398 frames 244.4 seconds 13.9 fps 38.0/71.9/697.0/14.3 ms - 0.3 fps slower with the patch (about 2%) Additional details can be found at: http://lists.freedesktop.org/archives/mesa-dev/2015-October/098635.html Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-11-04 02:38:53 +01:00
Kenneth Graunke	59bbe2681b	nir: Properly invalidate metadata in nir_opt_remove_phis(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2015-11-03 17:06:48 -08:00
Kenneth Graunke	bc3942e297	nir: Properly invalidate metadata in nir_lower_vec_to_movs(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2015-11-03 17:06:48 -08:00
Kenneth Graunke	0f037bd71f	nir: Properly invalidate metadata in nir_opt_copy_prop(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2015-11-03 17:06:48 -08:00
Kenneth Graunke	4cb7546066	nir: Properly invalidate metadata in nir_remove_dead_variables(). v2: Preserve live_variables too (Jason). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-11-03 17:06:48 -08:00
Kenneth Graunke	8bb44510fc	nir: Properly invalidate metadata in nir_split_var_copies(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Cc: mesa-stable@lists.freedesktop.org	2015-11-03 17:06:48 -08:00
Kenneth Graunke	aea40091f0	nir: Properly invalidate metadata in nir_lower_global_vars_to_local(). v2: Preserve nir_metadata_live_variables as well (caught by Jason). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-11-03 17:06:48 -08:00
Jason Ekstrand	531be601d5	nir: Unexpose _impl versions of copy_prop and dce Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-03 17:06:48 -08:00
Jordan Justen	4bc16ad217	mesa: rename UniformBlockStageIndex to InterfaceBlockStageIndex Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: Iago Toral <itoral@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2015-11-03 16:44:22 -08:00
Matt Turner	cf3121ed18	i965/vec4: Send from GRF in atomic operations. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-03 16:38:36 -08:00
Marek Olšák	3b37155a68	gallium/radeon: allow returning SDMA fences from pipe->flush pipe->flush never returned SDMA fences. This fixes it. This is only an issue on amdgpu where fences can signal out of order. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-11-04 00:43:14 +01:00
Marek Olšák	7f9122c968	gallium/radeon: always return the last SDMA fence on SDMA flush if needed Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-11-04 00:43:14 +01:00
Kenneth Graunke	36fd653817	i965: Add scalar geometry shader support. This is hidden behind INTEL_SCALAR_GS=1 for now, as we don't yet support instanced geometry shaders, and Orbital Explorer's shader spills like crazy. But the infrastructure is in place, and it's largely working. v2: Lots of rebasing. v3: (feedback from Kristian Høgsberg) - Handle stride and subreg_offset correctly for ATTRs; use a helper. - Fix missing emit_shader_time_end() call. - Delete dead code after early EOT in static vertex case to avoid tripping asserts in emit_shader_time_end(). - Use proper D/UD type in intexp2(). - Fix "EndPrimitve" and "to that" typos. - Assert that invocations == 1 so we know this is missing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-11-03 15:08:49 -08:00
Kenneth Graunke	c9541a74e4	i965: Add scalar GS input lowering code. We really ought to compute the VUE map at link time and stash it, rather than recomputing it here, but with the mess of program structures I wasn't sure where to put it. We can improve that later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-11-03 15:08:49 -08:00
Kenneth Graunke	4861835d1c	i965: Fix the fs_visitor GS constructor to take shader_time_index. Jason reworked this so it isn't simply ST_GS anymore...it's either -1 (not enabled) or an actual offset. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-11-03 15:08:49 -08:00
Ben Widawsky	5d4b019d2a	i965/gen8+: Extract color clear surface state On future generation platforms the color clear value is stored elsewhere in the surface state. By extracting this logic, we can cleanly implement the difference in an upcoming patch. Should have no functional impact. v2: Move hunk from the next patch into this patch (Matt) Whitespace fix (Ben) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-11-03 13:49:21 -08:00
Ben Widawsky	f3223ebd6c	i965/gen8+: Remove redundant zeroing of surface state The allocate_surface_state already zeroes out the surface state, and doing it later in the function is destructive for what we want to accomplish when we split out support for gen9 fast clears (next patch). NOTE: Only dword 12 actually needed to be fixed, but it seemed more consistent to remove the other instances as well. I can make an argument both ways (open coding it, vs. not). I can rework the next patch if required. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-11-03 13:49:21 -08:00
Samuel Pitoiset	e887407491	nvc0: add missing compute parameters required by clover This fixes crashes with some piglit OpenCL tests. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-03 22:17:00 +01:00
Samuel Pitoiset	e640ba41ed	nvc0: handle NULL pointer in nvc0_get_compute_param() To get the size (in bytes) of a compute parameter, clover first calls get_compute_param() with a NULL data pointer. The RET() macro is based on nv50. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-03 22:16:45 +01:00
Ben Widawsky	dde33fc23c	i965/skl: PCI ID cleanup and brand strings A few new PCI ids are added here, and one is removed (0x190B) because it no longer seems to exist anywhere. v2-4: Only use ascii characters (Ilia) 0x1921 is no longer marked as f Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>	2015-11-03 10:00:17 -08:00
Ben Widawsky	7cbd6608f5	i965/skl: Add GT4 PCI IDs Like other gen8+ hardware, the hardware automatically scales up thread counts. We must be careful about the URB sizes since GT4 adds another slice. One of the existing PCI IDs is actually mislabeled as GT3. Arguably this is a real bug since the URB size will be wrong. Because this patch is simply meant to add the missing IDs, that will be fixed in a later patch. v2: No longer relevant. v3: Update the wm thread count to support GT4. The WM thread count is used to determine the maximum scratch space required. Currently the code always allocates the maximum amount even though lower GT SKUs require less. The formula is threads_per_psd * subslices_per_slice * slices Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>	2015-11-03 09:45:04 -08:00
Jordan Justen	55365a7ad5	mesa: Add spec citations for DispatchCompute* Note: The OpenGL 4.3 - 4.5 specification language for DispatchCompute appears to have an error regarding the max allowed values. When adding the specification citation, we note why the code does not match the specification language. v2: * Updates based on review from Iago Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Iago Toral Quiroga <itoral@igalia.com> Cc: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-11-02 15:25:37 -08:00
Jordan Justen	44c399f20a	mesa: Update DispatchComputeIndirect errors for indirect parameter There is some discrepancy between the return values for some error cases for the DispatchComputeIndirect call in the ARB_compute_shader specification. Regarding the indirect parameter, in one place the extension spec lists that the error returned for invalid values should be INVALID_OPERATION, while later it specifies INVALID_VALUE. The OpenGL 4.3 and OpenGLES 3.1 specifications appear to be consistent in requiring the INVALID_VALUE error return in this case. Here we update the code to match the main specifications, and update the citations to use the main specification rather than the extension specification. v2: * Updates based on review from Iago Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Iago Toral Quiroga <itoral@igalia.com> Cc: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-11-02 15:25:37 -08:00
Matt Turner	0b19f65195	i965/fs: Clean up FBH code. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-02 09:33:31 -08:00
Matt Turner	c22d62f599	i965/vec4: Clean up FBH code. It did a bunch of unnecessary stuff, emitting an extra MOV included. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-02 09:33:31 -08:00
Matt Turner	7c81a6a647	i965: Replace default case with list of enum values. If we add a new file type, we'd like to get warnings if it's not handled. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-11-02 09:33:31 -08:00
Matt Turner	d9b09f8a30	i965/vec4: Don't disable channels in any/all comparisons. We've made a mistake in calling the Channel Enable bits "writemask", because they do more than control which channels of the destination are written -- they actually control which channels are enabled (surprise! surprise!) So, if we emit cmp.z.f0(8) null.xy<1>D g10<4,4,1>.xyzzD g2<0,4,1>.xyzzD mov(8) g12<1>.xUD 0x00000000UD (+f0.all4h) mov(8) g12<1>.xUD 0xffffffffUD where the CMP instruction has only .xy channel enables, it won't write the .zw channels of the flag register, which are of course read by the +f0.all4 predicate. We need to always emit CMP instructions whose flag result might be read by such a predicate with all channels enabled. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-02 09:33:31 -08:00
Tapani Pälli	f4466c856f	mesa: fix uniforms calculation in glGetProgramiv Since the introduction of SSBO, UniformStorage contains not just uniforms but also buffer variables; this needs to be taken into account when calculating active uniforms with GL_ACTIVE_UNIFORMS and GL_ACTIVE_UNIFORM_MAX_LENGTH. No Piglit regressions. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-11-02 11:22:10 +02:00
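The counting rule described in f4466c856f can be sketched as follows. This is a minimal illustration, not Mesa's code: `struct storage_entry` and both functions are hypothetical stand-ins for the real UniformStorage handling, and the only assumption is that each entry can be told apart as a uniform or a buffer variable.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one UniformStorage entry. */
struct storage_entry {
    bool is_buffer_variable;  /* true for SSBO buffer variables */
    size_t name_length;       /* name length including the terminating NUL */
};

/* Count only real uniforms, skipping buffer variables. */
unsigned count_active_uniforms(const struct storage_entry *entries, size_t n)
{
    unsigned count = 0;
    for (size_t i = 0; i < n; i++) {
        if (!entries[i].is_buffer_variable)
            count++;
    }
    return count;
}

/* Longest name among real uniforms only. */
size_t max_uniform_name_length(const struct storage_entry *entries, size_t n)
{
    size_t max_len = 0;
    for (size_t i = 0; i < n; i++) {
        if (!entries[i].is_buffer_variable && entries[i].name_length > max_len)
            max_len = entries[i].name_length;
    }
    return max_len;
}
```

Without the `is_buffer_variable` check, both queries would also count the buffer variables that share the same storage array.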
Tapani Pälli	efb333acb7	mesa: fix program resource queries for atomic counter buffers gl_active_atomic_buffer contains index to UniformStorage, we need to calculate resource index for that gl_uniform_storage. Fixes following CTS tests: ES31-CTS.program_interface_query.atomic-counters ES31-CTS.program_interface_query.atomic-counters-one-buffer No Piglit regressions. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-11-02 11:22:06 +02:00
Juha-Pekka Heikkila	c2c124f891	glsl: join calculate_array_size() and calculate_array_stride() These helpers are run for the same case in the same loop. Their operation is joined here so the loop is run just once. Also fixed an out-of-memory condition here. v2: Make the loop simpler to read as per Tapani's suggestion Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Tested-by: Tapani Pälli <tapani.palli@intel.com>	2015-11-02 10:03:32 +02:00
Ryan Houdek	af7c98a9c7	mesa: expose support for OES/EXT_draw_elements_base_vertex to OpenGL ES This has been tested with the piglits in the mailing list and on the Dolphin emulator. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-01 23:02:06 -05:00
Ilia Mirkin	985b51551a	nouveau: set MaxDrawBuffers to the same value as MaxColorAttachments Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-11-01 20:15:15 -05:00
Samuel Pitoiset	00bb524716	nv50: use correct heaps for FP and GP code segments This is just a cosmetic change. Trivial. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-11-01 23:29:20 +01:00
Jordan Justen	39bb59a566	mesa/sso: Add compute shader support Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> [itoral@igalia.com: Reviewed-by for all except the ctx->_Shader change] Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-01 01:10:01 -07:00
Jordan Justen	6e11855050	mesa/sso: Add MESA_VERBOSE=api trace support v2: * Use %u for unsigned values (Iago) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-01 01:09:20 -07:00
Jordan Justen	5bfe2835c2	i965: Setup pull constant state for compute programs Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-01 00:35:12 -07:00
Jordan Justen	a4a416f567	main/get: Add MAX_COMBINED_COMPUTE_UNIFORM_COMPONENTS Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-11-01 00:11:42 -07:00
Jordan Justen	218e94906d	glsl: OpenGLES GLSL 3.1 precision qualifiers ordering rules The OpenGLES GLSL 3.1 specification uses the precision qualifier ordering rules from ARB_shading_language_420pack. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-10-31 23:17:06 -07:00
Jordan Justen	b6e9b2b7a0	glsl: Add compute shader builtin variables for OpenGLES 3.1 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-10-31 23:08:09 -07:00
Ilia Mirkin	67635a0a71	nouveau: get rid of tabs Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-31 19:58:14 -04:00
Connor Abbott	0ef8c5cb96	i965/sched: don't calculate live intervals for post-RA scheduling For some reason, this causes assertions on gm965 only. In any case, it's unnecessary since we don't need liveness information in the post-RA scheduler. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92744 Cc: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-31 08:05:52 -07:00
Dave Airlie	425d8c2578	virgl/vtest: fix extra malloc This somehow got added twice, drop the first one. Reported by Coverity. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-10-31 18:05:33 +10:00
Dave Airlie	8d731ebd33	virgl: free sampler view on failure path Reported by Coverity. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-10-31 16:16:44 +10:00
Dave Airlie	7153b12651	gallium/swrast: fixup build breakage and warnings The front buffer rendering changes broke an interface, I didn't fix up all of them. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-10-31 16:16:44 +10:00
Dave Airlie	2b67657096	gallium/swrast: fix front buffer blitting. (v2) So I've known this was broken before, cogl has a workaround for it from what I know, but with the gallium based swrast drivers BlitFramebuffer from back to front or vice-versa was pretty broken. The legacy swrast driver tracks when a front buffer is used and does the get/put images when it is mapped/unmapped, so this patch attempts to add the same functionality to the gallium drivers. It creates a new context interface to denote when a front buffer is being created, and passes a private pointer to it, this pointer is then used to decide on map/unmap if the contents should be updated from the real frontbuffer using get/put image. This is primarily to make gtk's gl code work, the only thing I've tested so far is the glarea test from https://github.com/ebassi/glarea-example.git v2: bump extension version, check extension version before calling get image. (Ian) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91930 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-10-31 16:04:36 +10:00
Timothy Arceri	103de0225b	glsl: set image access qualifiers for AoA Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-10-31 08:37:08 +11:00
Ian Romanick	7b3684877c	i965: Do legacy userclipping in OpenGL ES 1.x contexts. Commit `fba4823a` disabled user clipping for everything except compatibility profile. Core profile and OpenGL ES 2.0+ have all removed the classic, OpenGL 1.0 user clip planes. ES 1.x, however, still has them. Fixes OpenGL ES 1.1 conformance mustpass.c and userclip.c Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Olivier Berthier <olivierx.berthier@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92639 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92641	2015-10-30 14:25:33 -07:00
Emmanuel Gil Peyrot	f3d4d10a1d	gbm.h: Add a missing stddef.h include for size_t. This was causing compilation issues when one of its providers wasn’t already included before gbm.h. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-10-30 19:12:14 +00:00
Emil Velikov	7bac333508	winsys/virgl: rework line wrapping/indent Wrap some of the 'omg it's getting out of hand' long lines, and re-indent where things feel off. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:09 +00:00
Emil Velikov	493e410d55	virgl: unwrap the includes Include what you want, rather than relying on a header foo.h N levels down the include chain, to provide something that you need. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:09 +00:00
Emil Velikov	7154d48c6e	winsys/virgl: remove temporary ret variable Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:09 +00:00
Emil Velikov	bdcb005788	winsys/virgl: always memset prior to ioctl Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:09 +00:00
Emil Velikov	e992715da2	winsys/virgl: use MALLOC to match FREE The uppercase versions are wrappers which must be matched. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:09 +00:00
Emil Velikov	72d7d1e224	winsys/virgl: remove calloc/malloc casts Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:09 +00:00
Emil Velikov	1ce685f05e	winsys/virgl: throw in some inline wrappers Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:09 +00:00
Emil Velikov	78be78b681	virgl: introduce virgl_query() inline wrapper Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:09 +00:00
Emil Velikov	dafcb21405	virgl: use virgl_screen/surface upcast wrappers Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:09 +00:00
Emil Velikov	7af46b9c74	virgl: introduce and use virgl_transfer/texture/resource inline wrappers The only two remaining cases of (struct virgl_resource *) require a closer look. Either the error checking is missing or the arguments provided feel wrong. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:09 +00:00
Emil Velikov	6b123fa07f	virgl: add virgl_context/sampler_view/so_target() upcast wrappers Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:08 +00:00
Emil Velikov	1f43e4e1a3	winsys/virgl/drm: drop unneeded forward declaration Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:08 +00:00
Emil Velikov	e0056228f6	virgl: remove sw_winsys pointer from virgl_screen The screen already has a pointer to the (base) winsys object, which is implemented/sub-classed as either a drm or sw based one, depending on the target. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:08 +00:00
Emil Velikov	0c82c2fb0b	virgl: rename virgl.h to virgl_screen.h Provide a more meaningful name considering its purpose. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:08 +00:00
Emil Velikov	87f7d61e19	virgl: move virgl_hw.h into the driver dir Strictly speaking virgl_hw.h should reside in the driver folder, as it describes the hardware. Moving it allows us to nuke the following strange dependency winsys/vtest > driver > winsys/drm Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:08 +00:00
Emil Velikov	014f8ef2ff	virgl: straighten the includes confusion Use the relevant GALLIUM_foo_CFLAGS which has all the requirements (not to mention VISIBILITY_CFLAGS) and keep ../ out of the include directives. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:08 +00:00
Emil Velikov	2c705d2220	virgl: remove the _FILE_OFFSET_BITS defines The build already sets it as needed. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:08 +00:00
Emil Velikov	a05648fd7e	winsys/virgl/drm: add all files to the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:37:08 +00:00
Emil Velikov	8b9e69e2ea	winsys/virgl/vtest: list all files in Makefile.sources Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:36:46 +00:00
Emil Velikov	73308ca802	virgl: move sources list to Makefile.sources ... and add the missing files while we're at it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:33:11 +00:00
Emil Velikov	c1bf71f77c	virgl: fix drm.h include path The drm/ prefix is required if using the kernel provided headers. As most distros don't ship them and we already depend on libdrm (which adds the relevant -I flag), just drop the drm/ from the include. Once a libdrm release with the virtgpu_drm.h header is released, we can drop our local copy of the file. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-30 17:29:01 +00:00
Emil Velikov	60418a28ea	i965: enable ARB_shader_clock on gen7+ Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-30 17:23:18 +00:00
Emil Velikov	4379ca22f1	i965: Implement nir_intrinsic_shader_clock v2: - Add a few const qualifiers for good measure. - Drop unneeded retype()s (Matt) - Convert timestamp to SIMD8/16, as fs_visitor::get_timestamp() returns SIMD4 (Connor) v3: - Remove unneeded temporary + MOV (Connor) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-30 17:22:40 +00:00
Emil Velikov	6a15517242	i965/fs: move the fs_reg::smear() from get_timestamp() to the callers We're about to reuse get_timestamp() for the nir_intrinsic_shader_clock. In the latter the generalisation does not apply, so move the smear() where needed. This also makes the function analogous to the vec4 one. v2: Tweak the comment - The caller -> We (Matt, Connor). v3: More comment tweaks (Connor) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-30 17:22:36 +00:00
Emil Velikov	7682844f34	nir: add shader_clock intrinsic v2: Add flags and inline comment/description. v3: None of the input/outputs are variables v4: Drop clockARB reference, relate code motion barrier comment wrt intrinsic flag. v5: Drop the "thus we can eliminate..." comment (Connor) Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-30 17:22:32 +00:00
Emil Velikov	f1d98fc90a	glsl: add support for the clock2x32ARB function v2: correctly set the return type Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-30 17:22:29 +00:00
Emil Velikov	51265c1b85	glsl: add ARB_shader_clock infrastructure Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-30 17:22:27 +00:00
Emil Velikov	e916d5e013	mesa: add infra for ARB_shader_clock Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-30 17:22:23 +00:00
Samuel Pitoiset	0d0329df8f	nv50: do not create an invalid HW query type While we are at it, store the rotate offset for occlusion queries to nv50_hw_query like on nvc0. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2015-10-30 17:57:15 +01:00
Samuel Pitoiset	5f1eeb799b	nv50: move HW queries to nv50_query_hw.c/h files Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2015-10-30 17:57:15 +01:00
Samuel Pitoiset	76b48ceee9	nv50: move nva0_so_target_save_offset() to its correct location Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>	2015-10-30 17:57:15 +01:00
Samuel Pitoiset	2e3fe0379e	nv50: add a header file for nv50_query Like for nvc0, this will allow splitting different types of queries and prepare the way for both global performance counters and MP counters. While we are at it, make use of nv50_query struct instead of pipe_query. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-10-30 17:57:15 +01:00
Julien Isorce	e7ed3963ed	st/va: add support to export a surface as dmabuf I.e. implements: VaAcquireBufferHandle VaReleaseBufferHandle for memory of type VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME And apply related changes to: vlVaMapBuffer vlVaUnMapBuffer vlVaDestroyBuffer Implementation inspired from cgit.freedesktop.org/vaapi/intel-driver Tested with gstreamer-vaapi with nouveau driver. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:21:20 +01:00
Julien Isorce	802ba6f865	st/va: implement VaDeriveImage And apply related changes to: vlVaBufferSetNumElements vlVaCreateBuffer vlVaMapBuffer vlVaUnmapBuffer vlVaDestroyBuffer vlVaPutImage It is unfortunate that there is no proper va buffer type and struct for this. It is only possible to use VAImageBufferType, which is normally used for normal user data arrays. One of the consequences is that VaDeriveImage is only useful on surfaces backed with contiguous planes. Implementation inspired from cgit.freedesktop.org/vaapi/intel-driver Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:21:11 +01:00
Julien Isorce	5e763aaa21	st/va: add more errors checks in vlVaBufferSetNumElements and vlVaMapBuffer Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:20:41 +01:00
Julien Isorce	86eb4131a9	st/va: add headless support, i.e. VA_DISPLAY_DRM This patch allows using gallium vaapi without requiring an X server running for your second graphics card. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:20:35 +01:00
Julien Isorce	1bdea0e579	st/va: handle Video Post Processing for configs Add support for VA_PROFILE_NONE and VAEntrypointVideoProc in the 4 following functions: vlVaQueryConfigProfiles vlVaQueryConfigEntrypoints vlVaCreateConfig vlVaQueryConfigAttributes Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:20:29 +01:00
Julien Isorce	0b868807e4	st/va: add colorspace conversion through Video Post Processing Add support for VPP in the following functions: vlVaCreateContext vlVaDestroyContext vlVaBeginPicture vlVaRenderPicture vlVaEndPicture Add support for VAProcFilterNone in: vlVaQueryVideoProcFilters vlVaQueryVideoProcFilterCaps vlVaQueryVideoProcPipelineCaps Add handleVAProcPipelineParameterBufferType helper. One application is: VASurfaceNV12 -> gstvaapipostproc -> VASurfaceRGBA Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:20:10 +01:00
Julien Isorce	05b6ce4209	st/va: implement dmabuf import for VaCreateSurfaces2 For now it is limited to RGBA, BGRA, RGBX, BGRX surfaces. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:20:03 +01:00
Julien Isorce	adf1133118	st/va: implement VaCreateSurfaces2 and VaQuerySurfaceAttributes Inspired from http://cgit.freedesktop.org/vaapi/intel-driver/ especially src/i965_drv_video.c::i965_CreateSurfaces2. This patch is mainly to support gstreamer-vaapi and tools that use this newer libva API. The first advantage of using VaCreateSurfaces2 over the existing VaCreateSurfaces is that it is possible to select the pixel format for the surface. Indeed, with the simple VaCreateSurfaces function it is only possible to create an NV12 surface. It can be useful to create an RGBA surface to use with video post processing. The available pixel formats can be queried with VaQuerySurfaceAttributes. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:19:54 +01:00
Julien Isorce	d42029d2d9	st/va: do not destroy old buffer when new one failed If formats are not the same, vlVaPutImage re-creates the video buffer with the right format. But if the creation of this new video buffer fails, then the surface loses its current buffer. Let's just destroy the previous buffer on success. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:19:47 +01:00
Julien Isorce	87109e5f88	st/va: properly define VAImageFormat formats and improve VaCreateImage Added PIPE_VIDEO_CHROMA_FORMAT_NONE in p_format.h and return it by default in ChromaToPipe. Renamed YCbCrToPipe to VaFourccToPipeFormat because it now contains RGB. Implemented PipeFormatToVaFourcc which will be used later in VlVaDeriveImage. Note that gstreamer-vaapi checks all the VAImageFormat fields. Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-30 13:05:23 +01:00
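The "create first, destroy on success" ordering from d42029d2d9 can be sketched like this. The types and functions below are hypothetical stand-ins, not the st/va code; `buffer_create()` is assumed to fail for negative formats purely so the failure path can be exercised.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for a video buffer and the surface owning it. */
struct buffer { int format; };
struct surface { struct buffer *buffer; };

/* Assumed allocator: fails (returns NULL) for negative formats. */
struct buffer *buffer_create(int format)
{
    struct buffer *buf;

    if (format < 0)
        return NULL;
    buf = malloc(sizeof(*buf));
    if (buf)
        buf->format = format;
    return buf;
}

/* Switch the surface to a new format without losing the old buffer
 * when creating the replacement fails. Returns 0 on success, -1 on error. */
int surface_set_format(struct surface *surf, int format)
{
    struct buffer *new_buf;

    if (surf->buffer && surf->buffer->format == format)
        return 0;

    new_buf = buffer_create(format);
    if (!new_buf)
        return -1;          /* old buffer is still attached and valid */

    free(surf->buffer);     /* destroy the previous buffer only on success */
    surf->buffer = new_buf;
    return 0;
}
```

Destroying the old buffer before attempting the allocation would leave the surface with no buffer at all on the error path.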
Samuel Iglesias Gonsalvez	7b8cc37585	main: fix basename match's check if it's an array or struct Commit `4565b6f` did not update the basename match's check for the case that string would exactly match the name of the variable if the suffix "[0]" were appended to it. Fixes two dEQP-GLES31 tests: dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array_single_element v2: - Change the position of rname_has_array_index_zero to avoid an out-of-bounds read. Reported by Tapani Pälli. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-30 08:12:53 +01:00
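The "[0]" matching rule mentioned in 7b8cc37585 can be illustrated with a small sketch. `name_matches()` is a hypothetical helper, not the function changed by the commit; it only shows the rule that a queried string matches a resource name either exactly or when appending "[0]" to the query yields the resource name.

```c
#include <stdbool.h>
#include <string.h>

/* True when `query` equals `rname`, or when `rname` is `query` + "[0]". */
bool name_matches(const char *rname, const char *query)
{
    size_t qlen = strlen(query);

    /* The prefix must match first; this also guarantees that rname has
     * at least qlen characters, so rname[qlen] below is a valid read. */
    if (strncmp(rname, query, qlen) != 0)
        return false;

    return rname[qlen] == '\0' || strcmp(rname + qlen, "[0]") == 0;
}
```

Checking the prefix before looking at `rname[qlen]` keeps the comparison inside the string, which is the kind of ordering issue the v2 note about an out-of-bounds read refers to.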
Kristian Høgsberg	f7f1bc6cca	i965: Fix invalid memory accesses after resizing brw_codegen's store table Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-30 07:49:10 +01:00
Connor Abbott	73caa26e43	i965/sched: use liveness analysis for computing register pressure Previously, we were using some heuristics to try and detect when a write was about to begin a live range, or when a read was about to end a live range. We never used the liveness analysis information used by the register allocator, though, which meant that the scheduler's and the allocator's ideas of when a live range began and ended were different. Not only did this make our estimate of the register pressure benefit of scheduling an instruction wrong in some cases, but it was preventing us from knowing the actual register pressure when scheduling each instruction, which we want to have in order to switch to register pressure scheduling only when the register pressure is too high. This commit rewrites the register pressure tracking code to use the same model as our register allocator currently uses. We use the results of liveness analysis, as well as the compute_payload_ranges() function that we split out in the last commit. This means that we compute live ranges twice on each round through the register allocator, although we could speed it up by only recomputing the ranges and not the live in/live out sets after scheduling, since we only shuffle around instructions within a single basic block when we schedule. Shader-db results on bdw: total instructions in shared programs: 7130187 -> 7129880 (-0.00%) instructions in affected programs: 1744 -> 1437 (-17.60%) helped: 1 HURT: 1 total cycles in shared programs: 172535126 -> 172473226 (-0.04%) cycles in affected programs: 11338636 -> 11276736 (-0.55%) helped: 876 HURT: 873 LOST: 8 GAINED: 0 v2: use regs_read() in more places. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-30 02:19:43 -04:00
Connor Abbott	c1860299b8	i965/fs: split out calculation of payload live ranges We'll need this for the scheduler too, since it wants to know when the live ranges of payload registers end in order to model them in our register pressure calculations. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-30 02:19:33 -04:00
Connor Abbott	45cd76e342	i965: dump scheduling cycle estimates The heuristic we're using is rather lame, since it assumes everything is non-uniform and loops execute 10 times, but it should be enough for measuring improvements in the scheduler that don't result in a change in the number of instructions. v2: - Switch loops and cycle counts to be compatible with older shader-db. - Make loop heuristic 10x to match with spilling code. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-30 02:19:24 -04:00
Connor Abbott	486268bdb0	i965: always run the post-RA scheduler Before, we would only do scheduling after register allocation if we spilled, despite the fact that the pre-RA scheduler was only supposed to be for register pressure and set the latencies of every instruction to 1. This meant that unless we spilled, which we rarely do, then we never considered instruction latencies at all, and we usually never bothered to try and hide texture fetch latency. Although a later commit removes the setting the latency to 1 part, we still want to always run the post-RA scheduler since it's able to take the false dependencies that the register allocator creates into account, and it can be more aggressive than the pre-RA scheduler since it doesn't have to worry about register pressure at all. Test master post-ra-sched diff %diff bench_OglPSBump2 396.730 402.386 5.656 +1.400% bench_OglPSBump8 244.370 247.591 3.221 +1.300% bench_OglPSPhong 241.117 242.002 0.885 +0.300% bench_OglPSPom 59.555 59.725 0.170 +0.200% bench_OglShMapPcf 86.149 102.346 16.197 +18.800% bench_OglVSTangent 388.849 395.489 6.640 +1.700% bench_trex 65.471 65.862 0.390 +0.500% bench_trexoff 69.562 70.150 0.588 +0.800% bench_heaven 25.179 25.254 0.074 +0.200% Reviewed-by: Jason Ekstrand <jasoan.ekstrand@intel.com>	2015-10-30 02:19:00 -04:00
Connor Abbott	85fce2d2f5	i965/sched: write-after-read dependencies are free Although write-after-write dependencies have the same latency as read-after-write dependencies due to how the register scoreboard works, write-after-read dependencies aren't checked by the EU at all, so they're purely a constraint on how the scheduler can order the instructions. v2: fix accumulator dependencies too. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-30 02:18:56 -04:00
Connor Abbott	6f231fddff	i965: fix cycle estimates when there's a pipeline stall The issue time for an instruction is how many cycles it takes to actually put it into the pipeline. If there's a pipeline stall that causes the instruction to be delayed, we should first take that into account to figure out when the instruction would start executing and then add the issue time. The old code had it backwards, and so we would underestimate the total time whenever we thought there would be a pipeline stall by up to the issue time of the instruction. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-30 02:18:53 -04:00
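The ordering problem described in 6f231fddff can be shown with a minimal sketch. The flat cycle model and both function names are hypothetical, not the i965 scheduler's code; `ready` stands for the cycle at which the stalling dependency resolves.

```c
/* Hypothetical cycle model, not the actual i965 scheduler code. */

static int max_int(int a, int b)
{
    return a > b ? a : b;
}

/* Backwards ordering: add the issue time first, then apply the stall.
 * When the stall dominates, the issue time is effectively dropped. */
int next_cycle_old(int now, int ready, int issue_time)
{
    return max_int(now + issue_time, ready);
}

/* Fixed ordering: wait out the stall first, then pay the issue time. */
int next_cycle_fixed(int now, int ready, int issue_time)
{
    return max_int(now, ready) + issue_time;
}
```

With no stall (`ready <= now`) the two agree; with a stall the old form underestimates by at most `issue_time`, which matches the "by up to the issue time" wording in the commit message.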
Eric Anholt	04c42f3ab5	vc4: Allow user index buffers, to avoid slow readback for shadow IBs. Improves low-settings openarena performance by 31.9975% +/- 0.659931% (n=7).	2015-10-29 22:58:01 -07:00
Ilia Mirkin	06fa2e864a	nv50: mark contexts shareable, compile at creation time Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-29 23:25:08 -04:00
Ilia Mirkin	f768eaa87d	nv50: allow per-sample interpolation to be forced via rast Uses the same technique as for nvc0 of fixups before upload, and evicting in case of state change. Removes one source of variants kept by st/mesa. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-29 22:42:38 -04:00
Matt Turner	85ee2f7fcf	i965: Add INTEL_DEBUG=nocompact to disable instruction compaction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-29 17:51:16 -07:00
Matt Turner	93268939e4	i965: Add INTEL_DEBUG=hex to print the hex with the disassembly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-29 17:51:16 -07:00
Matt Turner	18b194925f	i965: Print the type and writemask on null destinations. These are often useful in debugging, and the writemask (actually "Channel Enables") determines more than just what goes into the destination. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-29 17:51:16 -07:00
Matt Turner	bcdf664682	i965: Test fixed_hw_reg.file against BRW_IMMEDIATE_VALUE, not IMM. No functional change, since they were both 3, but BRW_IMMEDIATE_VALUE is the hardware value and IMM was the IR value -- and you can see that BRW_IMMEDIATE_VALUE was correctly used in the context of this patch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-29 17:51:16 -07:00
Matt Turner	ee46c1e626	i965/vec4: Test against BRW_IMMEDIATE_VALUE, not IMM. No functional change, since they were both 3, but BRW_IMMEDIATE_VALUE is the hardware value and IMM was the IR value -- and you can see that BRW_IMMEDIATE_VALUE was correctly used in the context of this patch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-29 17:51:16 -07:00
Matt Turner	8c4151b866	i965/fs: Use group(4, 0) to emit an exec-size 4 MOV. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-29 17:51:16 -07:00
Matt Turner	9115fa1d13	i965/cfg: Handle no-idom case in cfg_t::dump_domtree(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-29 17:51:16 -07:00
Matt Turner	5916b073f6	i965/disasm: Remove unused _addr_mode argument from src_ia1(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-29 17:51:16 -07:00
Matt Turner	e09e5f992e	i965: Set correct field for indirect align16 addrimm. This has been wrong since the initial import of the i965 driver. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-29 17:51:16 -07:00
Matt Turner	fa142773d9	i965/vec4: Drop brw_set_default_* before popping insn state. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-29 17:51:16 -07:00
Matt Turner	11a7b6bbaa	i965/vec4: Remove unnecessary #includes from the generator. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-29 17:51:16 -07:00
Dave Airlie	744cc036b9	r600: enable SB for geom shaders on pre-evergreen I've checked with piglit and one test fails, but it fails on evergreen as well, so will get fixed later. Otherwise SB seems to be working fine for geom shaders on my rv635. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-10-30 10:40:59 +10:00
Kenneth Graunke	c6b24448b5	i965/vec4: Eliminate the vec4_generator class altogether. We really weren't taking advantage of vec4_generator being a class. By adding a "p" parameter to the helper methods, and "prog_data" to ones which need binding table information, we can convert everything to static functions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-29 16:56:41 -07:00
Kenneth Graunke	1a094a2ee2	i965/vec4: Move vec4_generator class definition into the .cpp file. The public API for the generator is brw_vec4_generate_code(); nobody actually needs to use the class. This means we can extend it without triggering the recompiles associated with altering brw_vec4.h. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-29 16:56:41 -07:00
Kenneth Graunke	4cba8f5d21	i965/vec4: Wrap vec4_generator in a C function. vec4_generator is a class for convenience, but only exports a single method as its public API. It makes much more sense to just export a single function. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-29 16:56:41 -07:00
Kenneth Graunke	73ff0ead36	i965/vec4: Convert src_reg/dst_reg to brw_reg at the end of the visitor. This patch makes the visitor convert registers to the HW_REG file at the very end, after register allocation, post-RA scheduling, and dependency control flagging. After that, everything is in fixed brw_regs. This simplifies the code generator, as it can just use the hardware registers rather than having to interpret our abstract files. In particular, interpreting the UNIFORM file meant reading prog_data to figure out where push constants are supposed to start. Having the part of the code that performs register allocation also translate everything to hardware registers seems sensible. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-29 16:56:41 -07:00
Ivan Kalvachev	f75f21a24a	r600g: Fix special negative immediate constants when using ABS modifier. Some constants (like 1.0 and 0.5) could be inlined as immediate inputs without using their literal value. The r600_bytecode_special_constants() function emulates the negative of these constants by using NEG modifier. However some shaders define -1.0 constant and want to use it as 1.0. They do so by using ABS modifier. But r600_bytecode_special_constants() set NEG in addition to ABS. Since the NEG modifier has priority over the ABS one, we get -\|1.0\| as result, instead of \|1.0\|. The patch simply prevents the additional switching of NEG when ABS is set. [According to Ivan Kalvachev, this bug was found via https://github.com/iXit/Mesa-3D/issues/126 and https://github.com/iXit/Mesa-3D/issues/127] Signed-off-by: Ivan Kalvachev <ikalvachev@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> CC: <mesa-stable@lists.freedesktop.org>	2015-10-29 23:56:57 +01:00
Nicolai Hähnle	24c90888ae	st/mesa: fix mipmap generation for immutable textures with incomplete pyramids Without the clamping by NumLevels, the state tracker would reallocate the texture storage (incorrect) and even fail to copy the base level image after reallocation, leading to the graphical glitch of https://bugs.freedesktop.org/show_bug.cgi?id=91993 . A piglit test has been submitted for review as well (subtest of arb_texture_storage-texture-storage). v2: also bypass all calls to st_finalize_texture (suggested by Marek Olšák) Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-29 23:56:57 +01:00
Nanley Chery	65f6caf43e	mesa: Enable ASTC in GLES' [NUM_]COMPRESSED_TEXTURE_FORMATS queries In OpenGL ES, the COMPRESSED_TEXTURE_FORMATS query returns the set of supported specific compressed formats. Since ASTC formats fit within that category, include them in the set and update the NUM_COMPRESSED_TEXTURE_FORMATS query as well. This enables GLES2-based ASTC dEQP tests to run. See the Bugzilla for more info. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92193 Reported-by: Tapani Pälli <tapani.palli@intel.com> Suggested-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-29 15:40:37 -07:00
Nanley Chery	8090a1c326	mesa/texcompress: Restrict FXT1 format to desktop GL subset In agreement with the extension spec and commit `dd0eb00487`, filter FXT1 formats to the desktop GL profiles. Now we no longer advertise such formats as supported in an ES context and then throw an INVALID_ENUM error when the client tries to use such formats with CompressedTexImage2D. Fixes the following 26 dEQP tests: * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_invalid_border * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_invalid_border_cube_neg_x * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_invalid_border_cube_neg_y * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_invalid_border_cube_neg_z * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_invalid_border_cube_pos_x * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_invalid_border_cube_pos_y * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_invalid_border_cube_pos_z * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_invalid_size * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_level_max_cube_pos * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_level_max_tex2d * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_level_cube * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_level_tex2d * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_width_height_cube_neg_x * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_width_height_cube_neg_y * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_width_height_cube_neg_z * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_width_height_cube_pos_x * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_width_height_cube_pos_y * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_width_height_cube_pos_z * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_neg_width_height_tex2d * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_width_height_max_cube_neg_x * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_width_height_max_cube_neg_y * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_width_height_max_cube_neg_z * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_width_height_max_cube_pos_x * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_width_height_max_cube_pos_y * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_width_height_max_cube_pos_z * dEQP-GLES2.functional.negative_api.texture.compressedteximage2d_width_height_max_tex2d v2. Use _mesa_is_desktop_gl() (Ilia, Ian) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-10-29 15:31:59 -07:00
Samuel Pitoiset	0260620ab3	nvc0: expose a group of performance metrics on Fermi This allows monitoring those performance metrics through GL_AMD_performance_monitor. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-10-29 23:01:29 +01:00
Ilia Mirkin	6166a8e369	st/mesa: create temporary textures with the same nr_samples as source Not sure if this is actually reachable in practice (to have a complex copy with MS textures). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-29 13:20:45 -04:00
Tapani Pälli	afbe8b6085	glsl: add fragdata arrays to program resource list This makes sure that user is still able to query properties about variables that have gotten removed by opt_dead_builtin_varyings pass. Fixes following OpenGL ES 3.1 test: ES31-CTS.program_interface_query.output-layout No Piglit regressions. v2: cleanup, drop extra parenthesis (Topi) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-10-29 17:17:42 +02:00
Tapani Pälli	6ce0857e30	mesa: add fragdata_arrays list to gl_shader This is required to store information about fragdata arrays, currently these variables get lost and cannot be retrieved later in sensible way for program interface queries. List will be utilized by next patch. Patch also modifies opt_dead_builtin_varyings pass to build list when lowering fragdata arrays. This is identical approach as taken with packed varyings pass. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-10-29 17:16:22 +02:00
Samuel Iglesias Gonsalvez	85f1f04413	glsl: fix GL_BUFFER_DATA_SIZE value for shader storage blocks with unsized arrays From ARB_program_interface_query: "For the property of BUFFER_DATA_SIZE, then the implementation-dependent minimum total buffer object size, in basic machine units, required to hold all active variables associated with an active uniform block, shader storage block, or atomic counter buffer is written to <params>. If the final member of an active shader storage block is array with no declared size, the minimum buffer size is computed assuming the array was declared as an array with one element." Fixes the following dEQP-GLES31 tests: dEQP-GLES31.functional.program_interface_query.shader_storage_block.buffer_data_size.named_block dEQP-GLES31.functional.program_interface_query.shader_storage_block.buffer_data_size.unnamed_block dEQP-GLES31.functional.program_interface_query.shader_storage_block.buffer_data_size.block_array v2: - Fix comment's indentation and explain that the parser already checked that unsized array is in last element of a shader storage block (Iago). - Add assert (Iago). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-29 08:29:06 +01:00
Kenneth Graunke	cf93251bed	docs: Mark GL_ARB_fragment_layer_viewport as done on i965.	2015-10-28 22:05:08 -07:00
Kenneth Graunke	8c902a580a	i965: Implement ARB_fragment_layer_viewport. Normally, we could read gl_Layer from bits 26:16 of R0.0. However, the specification requires that bogus out-of-range 32-bit values written by previous stages need to appear in the fragment shader as-written. Instead, we pass in the full 32-bit value from the VUE header as an extra flat-shaded varying. We have the SF override the value to 0 when the previous stage didn't actually write a value (it's actually defined to return 0). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-10-28 22:05:08 -07:00
Kenneth Graunke	5392328a32	i965: Make calculate_attr_overrides return the URB read offset. Traditionally, we've hardcoded "URB Entry Read Offset" to 1 (which represents 2 vec4 varying slots) to skip over the 8 DWord VUE header. In order to support ARB_fragment_layer_viewport, we'll need to read from that header. This patch adds the basic plumbing necessary to calculate a value dynamically and hook it up in the SBE packets. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-10-28 22:05:08 -07:00
Kenneth Graunke	b3d19d20f2	glsl: Mark gl_ViewportIndex and gl_Layer varyings as flat. Integer varyings need to be flat qualified - all others were already. I think we just missed this. Presumably some hardware passes this via sideband and ignores attribute interpolation, so no one has noticed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-10-28 22:05:08 -07:00
Kenneth Graunke	b94cdcdada	i965/fs: Properly check for PAD in fragment shaders with > 16 varyings. Commit `268008f98c` changed unused VUE map slots to be initialized with BRW_VARYING_SLOT_PAD, not COUNT. I missed updating this. It also means that commit message was wrong, as some code did rely slots being initialized to COUNT. This may fix a bug with SSO programs with > 16 FS input varyings. I think we probably just emitted extra pointless code, but probably didn't break anything. We might also just have no tests for that. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-10-28 22:05:08 -07:00
Kenneth Graunke	6ae47a3eb4	i965: Update stale comment about unused VUE map slots. I changed this from COUNT to PAD in commit `268008f98c`. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-10-28 22:05:08 -07:00
Ilia Mirkin	5227e91580	nv50/ir: adapt to new method for passing in cull/clip distance masks Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-29 00:44:22 -04:00
Ilia Mirkin	a5bae7b31d	nvc0: share shaders between contexts and build immediately Avoid deferring building shaders until draw time, should hopefully reduce any stuttering, as well as enable shader-db style analysis. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-29 00:44:22 -04:00
Ilia Mirkin	b75fff70d8	nvc0: do upload-time fixups for interpolation parameters Unfortunately flatshading is an all-or-nothing proposition on nvc0, while GL 3.0 calls for the ability to selectively specify explicit interpolation parameters on gl_Color/gl_SecondaryColor which would override the flatshading setting. This allows us to fix up the interpolation settings after shader generation based on rasterizer settings. While we're at it, we can add support for dynamically forcing all (non-flat) shader inputs to be interpolated per-sample, which allows st/mesa to not generate variants for these. Fixes the remaining failing glsl-1.30/execution/interpolation piglits. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-29 00:44:22 -04:00
Kenneth Graunke	77f58c04cc	nir: Copy "patch" flag from ir_variable to nir_variable. This was introduced in GLSL IR after NIR development had branched. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-28 21:15:29 -07:00
Kenneth Graunke	9c8208f2c1	nir: Add intrinsics for tessellation shader system values. nir_intrinsic_load_patch_vertices_in corresponds to gl_PatchVerticesIn, a special input in both the TCS and TES stages. nir_intrinsic_load_tess_coord corresponds to gl_TessCoord, a special tessellation evaluation shader input. nir_intrinsic_load_tess_level_outer/inner correspond to the gl_TessLevelOuter[] and gl_TessLevelInner[] evaluation shader inputs, which we treat as system values because they're stored specially. (These intrinsics are only for the TES - the TCS uses output variables.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-28 21:14:53 -07:00
Kenneth Graunke	bf05af3f0e	i965: Fix missing BRW_NEW__PROG_DATA flagging caused by cache reuse. Consider the case of two nearly identical GLSL fragment shaders: out vec4 color; void main() { color = vec4(1); } and layout(early_fragment_tests) in; out vec4 color; void main() { color = vec4(1); } These shaders compile to the exact same assembly, but have distinct values for brw_wm_prog_data::early_fragment_tests. Since these are two independent GLSL shaders, they have different program keys - notably, brw_wm_prog_key::program_string_id differs. When uploading the second, brw_upload_cache will find an existing copy of the assembly in the cache BO, which means matching_data will be non-NULL. Although we create a second cache item (with the new key and prog_data), we set item->offset to the existing copy and avoid re-uploading duplicate assembly. However, brw_search_cache() would only flag BRW_NEW__PROG_DATA if item->offset differed from the supplied offset. With reuse, both programs have the same offset, but prog_data changed. We have to flag it, but failed to. To fix this, we simply need to check if the aux (prog_data) pointer changed. If either the assembly or the prog_data differs, flag it. This fixes a regression since `1bba29ed40`, where Topi fixed brw_upload_cache() to actually reuse identical assembly. Prior to that, reuse basically never happened due to bugs. Unfortunately, this code apparently wasn't prepared to handle reuse! Fixes GPU hangs in Dolphin on Broadwell. Huge thanks to Pierre Bourdon and Ilia Mirkin for debugging this and helping track down the real issue. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92623 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Tested-by: Pierre Bourdon <delroth@gmail.com>	2015-10-28 21:13:54 -07:00
Laurent Carlier	37402014e8	clover: fix building with clang-3.8 https://bugs.freedesktop.org/show_bug.cgi?id=92705 v2.1: use Linker::Flags::None instead of 0 and emplace_back() Signed-off-by: Laurent Carlier <lordheavym@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-10-29 12:35:37 +09:00
Ilia Mirkin	d0693d7515	nv50: add ARB_copy_image support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-28 20:53:30 -04:00
Ilia Mirkin	ebbd7b41c0	nvc0: add ARB_copy_image support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-28 20:42:59 -04:00
Julien Isorce	3bbb8715ac	nvc0: fix crash when nv50_miptree_from_handle fails Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-10-28 18:26:20 +01:00
Brian Paul	2bf224b3f9	vbo: replace assertion with conditional in vbo_compute_max_verts() With just the right sequence of per-vertex commands and state changes, it's possible for this assertion to fail (such as with viewperf11's lightwave-06-1 test). Instead of asserting, return 0 so that the caller knows the VBO is full and needs to be flushed. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-10-28 11:03:27 -06:00
Brian Paul	8e9c3070bf	mesa: minor formatting fix in get_tex_rgba_compressed()	2015-10-28 11:03:21 -06:00
Marek Olšák	f04f13622f	st/mesa: implement ARB_copy_image I wonder if the craziness was worth it. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-28 11:52:17 +01:00
Marek Olšák	ce9db16e1c	gallium: add PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS For ARB_copy_image. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-28 11:52:17 +01:00
Marek Olšák	e82c527f1f	radeonsi: allow copying between compatible compressed and uncompressed formats which is where a block in src maps to a pixel in dst and vice versa. e.g. DXT1 <-> R32G32_UINT DXT5 <-> R32G32B32A32_UINT Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-28 11:52:17 +01:00
Marek Olšák	6a4dc1ad49	mesa: set TargetIndex in VDPAURegister*SurfaceNV (v2) We initialized Target, but not TargetIndex. This is required since `7d7dd18711`. v2: do it in the right place. Noticed by Brian Paul. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92645 Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-28 11:52:17 +01:00
Emil Velikov	bfc73ff10e	i965: remove unneeded src_reg copy in emit_shader_time_write The variable is already of type src_reg. creating a new instance only to destroy it seems unnecessary. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-28 02:28:38 -07:00
Emil Velikov	0325f68228	i965: remove cache_aux_free_func array There is only one function that can be called, which is well known at compilation time. The abstraction used here seems unnecessary, so let's use a direct call to brw_stage_prog_data_free() when appropriate, cut down the size of struct brw_cache. Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-28 02:28:38 -07:00
Samuel Iglesias Gonsalvez	74fcc4c41f	main: fix GL_MAX_NUM_ACTIVE_VARIABLES value for shader storage blocks The maximum number of active variables for shader storage blocks should take into account the specific rules for shader storage blocks, i.e. for an active shader storage block member declared as an array, an entry will be generated only for the first array element, regardless of its type. Fixes 3 dEQP-GLES31.functional.* tests: dEQP-GLES31.functional.program_interface_query.shader_storage_block.active_variables.named_block dEQP-GLES31.functional.program_interface_query.shader_storage_block.active_variables.unnamed_block dEQP-GLES31.functional.program_interface_query.shader_storage_block.active_variables.block_array Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-28 08:40:51 +01:00
Boyuan Zhang	03c92ffbf6	st/vdpau: disable RefPicList for Vdpau HEVC Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-10-27 19:09:55 -04:00
Boyuan Zhang	ad2752e94b	st/va: add VAAPI HEVC decode support Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-10-27 19:09:55 -04:00
Boyuan Zhang	38c3d7cfc4	radeon/uvd: implement and add flag for VAAPI HEVC decode Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-10-27 19:09:55 -04:00
Boyuan Zhang	231605d14d	vl: add RefPicList defines for VAAPI HEVC decode Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2015-10-27 19:09:55 -04:00
Marta Lofstedt	16c49da63a	mesa: Draw indirect is not allowed if the default VAO is bound. From OpenGL ES 3.1 specification, section 10.5: "DrawArraysIndirect requires that all data sourced for the command, including the DrawArraysIndirectCommand structure, be in buffer objects, and may not be called when the default vertex array object is bound." Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-27 12:16:23 +01:00
Marek Olšák	93eb4f9287	winsys/amdgpu: remove the dcc_enable surface flag dcc_size is sufficient and doesn't need a further comment in my opinion. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-10-27 10:49:24 +01:00
Marek Olšák	3aebc596b3	radeonsi: add debug flags that disable DCC and DCC fast clear For debugging, bug reports, etc. This is not in the radeonsi directory, but it is about radeonsi. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-10-27 10:49:24 +01:00
Marek Olšák	235d38584c	radeonsi: properly check if DCC is enabled and allocated Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-10-27 10:49:24 +01:00
Marek Olšák	5bc5dca0cb	radeonsi: simplify DCC handling in si_initialize_color_surface Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-10-27 10:49:24 +01:00
Marta Lofstedt	3daa7e5147	mesa: Draw indirect is not allowed when xfb is active and unpaused OpenGL ES 3.1 specification, section 10.5: "An INVALID_OPERATION error is generated if transform feedback is active and not paused." Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-27 08:49:21 +01:00
Marta Lofstedt	2c91e08656	mesa: Draw Indirect returns wrong error code on unaligned From OpenGL 4.4 specification, section 10.4 and OpenGL ES 3.1 section 10.5: "An INVALID_VALUE error is generated if indirect is not a multiple of the size, in basic machine units, of uint." However, the current code follows the ARB_draw_indirect: https://www.opengl.org/registry/specs/ARB/draw_indirect.txt "INVALID_OPERATION is generated by DrawArraysIndirect and DrawElementsIndirect if commands source data beyond the end of a buffer object or if <indirect> is not word aligned." V2: After discussions on the list, it was suggested to only keep the INVALID_VALUE error. Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-27 08:49:21 +01:00
Samuel Iglesias Gonsalvez	4565b6f4fb	main: Remove interface block array index for doing the name comparison From ARB_program_interface_query spec: "uint GetProgramResourceIndex(uint program, enum programInterface, const char *name); [...] If <name> exactly matches the name string of one of the active resources for <programInterface>, the index of the matched resource is returned. Additionally, if <name> would exactly match the name string of an active resource if "[0]" were appended to <name>, the index of the matched resource is returned. [...]" "A string provided to GetProgramResourceLocation or GetProgramResourceLocationIndex is considered to match an active variable if: [...] if the string identifies the base name of an active array, where the string would exactly match the name of the variable if the suffix "[0]" were appended to the string; [...] " Fixes the following two dEQP-GLES31 tests: dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array dEQP-GLES31.functional.program_interface_query.shader_storage_block.resource_list.block_array_single_element v2: - Add AoA support (Timothy) - Apply it too for GetUniformLocation(), GetUniformName() and others because ARB_program_interface_query says that they are equivalent to GetProgramResourceLocation() and GetProgramResourceName() (Tapani) Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-27 08:10:04 +01:00
Eric Anholt	3359ad6cda	vc4: Add support for copy propagation with unpack flags present. total instructions in shared programs: 89251 -> 87862 (-1.56%) instructions in affected programs: 52971 -> 51582 (-2.62%)	2015-10-26 16:48:34 -07:00
Eric Anholt	01ca4f207e	vc4: Rewrite the pack instructions as a MOV with a dst pack flag Another step in reducing the special-casing of instructions.	2015-10-26 16:48:34 -07:00
Eric Anholt	72fa2ae20b	vc4: Move dst pack setup out to a helper function with more asserts.	2015-10-26 16:48:34 -07:00
Eric Anholt	99a9a5a345	vc4: Switch the unpack ops to being unpack flags on a mov. This paves the way for copy propagating our unpacks. We end up with a small change on shader-db: total instructions in shared programs: 89390 -> 89251 (-0.16%) instructions in affected programs: 19041 -> 18902 (-0.73%) which appears to be because we no longer convert MOVs for an FMAX dst, r4.unpack, r4.unpack (instead of the previous MOV dst, r4.unpack), and this ends up with a slightly better schedule.	2015-10-26 16:48:34 -07:00
Eric Anholt	548b05d53f	vc4: Drop some confused code about pack/unpack handling. At one point I thought packs and unpacks were in the same field of the instruction. They aren't. These instructions therefore never cause a pack. total instructions in shared programs: 89472 -> 89390 (-0.09%) instructions in affected programs: 15261 -> 15179 (-0.54%)	2015-10-26 16:48:34 -07:00
Eric Anholt	a7b424e835	vc4: Reduce MOV special-casing in QIR-to-QPU. I'm going to introduce some more types of MOV, which also want the elision of raw MOVs.	2015-10-26 16:48:34 -07:00
Eric Anholt	652a864b25	vc4: Fix up the test for whether the unpack can be from r4. We can do 16a/16b from float as well. No difference on shader-db.	2015-10-26 16:48:34 -07:00
Eric Anholt	3d7a088608	vc4: Don't try to follow MOVs across a pack.	2015-10-26 16:48:34 -07:00
Eric Anholt	6eb0760f48	vc4: Only copy propagate raw MOVs. No problems being fixed, but needed for the new unpack changes.	2015-10-26 16:48:34 -07:00
Eric Anholt	0ccacfa017	vc4: If a QIR source has an unpack set, print it. Not used yet, but will be.	2015-10-26 16:48:34 -07:00
Kenneth Graunke	8034e7d6f1	glsl: Convert TES gl_PatchVerticesIn into a constant when using a TCS. When a TCS is present, the TES input gl_PatchVerticesIn is actually a constant - it's simply the # of output vertices specified by the TCS layout qualifiers. So, we can replace the system value with a constant, which may allow further optimization, and will likely be more efficient. If the TCS is absent, we can't do this optimization. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-26 16:37:07 -07:00
Ian Romanick	8f84a8e257	i965: Add missing close-parenthesis in error messages Trivial. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-26 16:15:55 -07:00
Ian Romanick	7070c8879a	i965: Fix is-renderable check in intel_image_target_renderbuffer_storage Previously we could create a renderbuffer with format MESA_FORMAT_R8G8B8A8_UNORM, convert that renderbuffer to an EGLImage, then FAIL to convert the EGLImage back to a renderbuffer because reasons. Just use the same check in intel_image_target_renderbuffer_storage that brw_render_target_supported uses. There are more checks in brw_render_target_supported, but I don't think they are necessary here. A different approach would be to refactor brw_render_target_supported to take rb->Format and rb->NumSamples as parameters (instead of a gl_renderbuffer) and use the new function here. Fixes: ES2-CTS.gtf.GL2ExtensionTests.egl_image.egl_image Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92476 Cc: "10.3 10.4 10.5 10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-10-26 16:15:55 -07:00
Timothy Arceri	a3d0359aff	glsl: keep track of intra-stage indices for atomics This is more optimal as it means we no longer have to upload the same set of ABO surfaces to all stages in the program. This also fixes a bug where since commit c0cd5b var->data.binding was being used as a replacement for atomic buffer index, but they don't have to be the same value they just happened to end up the same when binding is 0. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Alejandro Piñeiro <apinheiro@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90175	2015-10-27 07:03:05 +11:00
Roland Scheidegger	711489648b	gallivm: disable f16c when not using AVX f16c intrinsic can only be emitted when AVX is used. So when we disable AVX due to forcing 128bit vectors we must not use this intrinsic (depending on llvm version, this worked previously because llvm used AVX even when we didn't tell it to, however I've seen this fail with llvm 3.3 since `718249843b` which seems to have the side effect of disabling avx in llvm albeit it only touches sse flags really, but with `ea421e919a` it's now really disabled). Albeit being able to use AVX with 128bit vectors also would have its uses, the code as is really was meant to emulate jit code creation for less capable cpus. v2: add some (ifdefed out) missing de-featuring options for simulating less capable cpus. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-10-26 16:45:49 +01:00
Julien Isorce	a61be1a798	st/va: pass picture desc to begin and decode At least vl_mpeg12_decoder uses the picture desc in begin_frame and decode_bitstream. https://bugs.freedesktop.org/show_bug.cgi?id=92634 Signed-off-by: Julien Isorce <j.isorce@samsung.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-26 13:53:10 +01:00
Tapani Pälli	8ae4317c36	mesa: add additional checks for uniform location query Patch adds additional check to make sure we don't return locations for structures or arrays of structures. From page 79 of the OpenGL 4.2 spec: "A valid name cannot be a structure, an array of structures, or any portion of a single vector or a matrix." v2: use without-array() to simplify code (Timothy) No Piglit or CTS regressions observed. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-26 12:52:17 +02:00
Emil Velikov	a305d59baa	docs: add news item and link release notes for 11.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-10-25 10:17:14 +00:00
Emil Velikov	47dd80a35d	docs: add sha256 checksums for 11.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `ec14e6f8fd`)	2015-10-25 10:14:04 +00:00
Emil Velikov	bddb7a51c3	docs: add release notes for 11.0.4 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `31bf247031`)	2015-10-25 10:14:03 +00:00
Kenneth Graunke	fcb39f5b6a	i965: Make brw_varying_to_offset take a const pointer to the VUE map. It doesn't modify it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-24 20:30:14 -07:00
Eric Anholt	a2eba3362f	vc4: Fix names of the 16-bit unpacks They're only f16-to-f32 on a float operation, otherwise they're i16-to-i32.	2015-10-24 17:55:55 -07:00
Eric Anholt	a238ad372d	vc4: Don't try to register coalesce into the VPM across non-raw MOVs. No known bugs, just something I noticed while updating optimization code for other changes.	2015-10-24 17:55:38 -07:00
Eric Anholt	ae1d3322cc	vc4: Take advantage of the 8888 pack function in pack_unorm_4x8. One instruction instead of four, and it turns out you do this a lot for the Over operator. total uniforms in shared programs: 32168 -> 32087 (-0.25%) uniforms in affected programs: 318 -> 237 (-25.47%) total instructions in shared programs: 89830 -> 89472 (-0.40%) instructions in affected programs: 6434 -> 6076 (-5.56%)	2015-10-24 17:55:22 -07:00
Eric Anholt	f09ed63f43	vc4: Fix the test for skipping raw MOVs. I don't know what previous test was trying to do, but it dates back to the first add of vc4_qpu_emit.c. No change to shader-db.	2015-10-24 17:55:22 -07:00
Ben Widawsky	9ecfc6baf1	i965: Remove unused devinfo revision I left the function to obtain the revision because it is, and will continue to be useful in the future. I'd rather not have to dig it up every time we need it. Comments left at the implementation to say as much. This was accidentally left here when I moved the early platform support: commit `28ed1e08e8` Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Fri Aug 7 13:58:37 2015 -0700 i965/skl: Remove early platform support Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-24 12:48:01 -07:00
Fabio Pedretti	b0342f48d0	docs/index.html: fix typo Reviewed-by: Boyan Ding <boyan.j.ding@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-10-24 19:27:24 +01:00
Rob Clark	1e8d0cc628	freedreno: remove unnecessary null checks According to piglit/xonotic/neverball/stc, blend/rasterize/zsa state will always be bound (never null). And the null checks were in- consistent anyways, so remove them. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-24 12:38:33 -04:00
Bas Nieuwenhuizen	6529daca39	radeonsi: Implement DCC fast clear. Uses the DCC buffer instead of the CMASK buffer. The ELIMINATE_FAST_CLEAR still works. Furthermore, with DCC compression we can directly clear to a limited set of colors such that we do not need a postprocessing step. v2 Marek: check dcc_buffer && dirty_level_mask in set_sampler_view Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-10-24 17:46:08 +02:00
Roland Scheidegger	205a3ce5c1	gallivm: fix tex offsets with mirror repeat linear Can't see why anyone would ever want to use this, but it was clearly broken. This fixes the piglit texwrap offset test using this combination. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-10-24 03:00:33 +02:00
Roland Scheidegger	71ff5af5dd	gallivm: fix sampling with texture offsets in SoA path When using nearest filtering and clamp / clamp to edge wrapping results could be wrong for negative offsets. Fix this by adding the offset before doing the conversion to int coords (could also use floor instead of trunc int conversion but probably more complex on "typical" cpu). This fixes the piglit texwrap offset failures with this filter/wrap combo (which only leaves the linear/mirror repeat combination broken). Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-10-24 03:00:33 +02:00
Roland Scheidegger	fb586e1edb	softpipe: fix using non-zero layer in non-array view from array resource For vertex/geometry shader sampling, this is the same as for llvmpipe - just use the original resource target. For fragment shader sampling though (which does not use first-layer based mip offsets) adjust the sampling code to use first_layer in the non-array cases. While here also fix up some code which looked wrong wrt buffer texel fetch (no piglit change). Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-10-24 03:00:33 +02:00
Roland Scheidegger	fe707c0373	llvmpipe: fix using non-zero layer in non-array view from array resource Just need to use resource target not view target when calculating first-layer based mip offsets. (This is a gl specific problem since d3d10 does not distinguish between non-array and array resources neither at the resource nor view level, only at the shader level.) Fixes new piglit arb_texture_view sampling-2d-array-as-2d-layer test. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-10-24 03:00:33 +02:00
Alex Deucher	830e57b82d	radeonsi: add Stoney to si_init_gs_info() This patch was originally written before stoney support was merged. Add stoney. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2015-10-23 18:56:45 -04:00
Bas Nieuwenhuizen	48b5f104ac	radeonsi: Enable DCC. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-10-24 00:42:30 +02:00
Bas Nieuwenhuizen	81ebd6a882	radeonsi: Add FLUSH_AND_INV_CB_DATA_TS for DCC. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-10-24 00:42:28 +02:00
Bas Nieuwenhuizen	bb77467df9	radeonsi: Disable operations that do not work with DCC. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-10-24 00:42:24 +02:00
Bas Nieuwenhuizen	afa357c3b0	radeonsi: Allocate buffers for DCC. As the alignment requirements can be 32 KiB or more, also adding an aligned buffer creation function. DCC is disabled for textures that can be shared as sharing the DCC buffers has not been implemented yet. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-10-24 00:42:01 +02:00
Marek Olšák	edf6a4537c	radeonsi: only apply the SNORM blit workaround to *8_SNORM Like the comment says. This fixes DCC, which doesn't like blitting RG16 as RGBA8. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-24 00:01:20 +02:00
Marek Olšák	e1c098f238	util/format: add helper util_format_is_snorm8 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-24 00:01:20 +02:00
Marek Olšák	06083046a4	radeonsi: add another requirement for PARTIAL_ES_WAVE Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-24 00:01:20 +02:00
Marek Olšák	0d2cb35f68	radeonsi: merge two ifs setting WD_SWITCH_ON_EOP Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-24 00:01:20 +02:00
Marek Olšák	ca18f12dbb	radeonsi: make PARTIAL_ES_WAVE globally dependent on SWITCH_ON_EOI This catches the other cases that enable SWITCH_ON_EOI. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-24 00:01:20 +02:00
Marek Olšák	2070af2fb1	radeonsi: add one more SWITCH_ON_EOI requirement for Hawaii and VI The VI condition depends on geometry shaders and MAX_PRIMGRP_IN_WAVE. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-24 00:01:20 +02:00
Marek Olšák	a6b5684e99	radeonsi: only apply the instancing bug workaround to Bonaire Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-24 00:01:20 +02:00
Marek Olšák	96d5879d38	radeonsi: add SWITCH_ON_EOI requirement for 4 SE parts Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-24 00:01:20 +02:00
Marek Olšák	7e056f872f	radeonsi: remove unnecessary PARTIAL_VS_WAVE setting for streamout hardware does this automatically Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-24 00:01:20 +02:00
Marek Olšák	3a157e6e68	radeonsi: allow unbinding vertex shaders Draw calls without a vertex shader are skipped. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-24 00:01:20 +02:00
Marek Olšák	07b3cc6ecf	radeonsi: allow unbinding pixel shaders and remove the dummy shader Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-24 00:01:20 +02:00
Marek Olšák	50bb2decf7	radeonsi: add draw_vbo check for a NULL pixel shader Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-24 00:01:20 +02:00
Marek Olšák	ed95cb3a31	radeonsi: add checks for a NULL pixel shader This will allow removing the dummy PS. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-24 00:01:20 +02:00
Marek Olšák	d842d2f251	gallium/util: add a test for NULL fragment shaders Just to validate that radeonsi doesn't crash. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-24 00:01:20 +02:00
Marek Olšák	dd05824b89	st/mesa: don't load state parameters if there are none Out of 7063 shaders from my shader-db: - 6564 (93%) shaders don't have any state parameters. - 347 (5%) shaders have 1 state parameter for WPOS lowering. - The remaining 2% have more state parameters, usually matrices. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-24 00:01:20 +02:00
Samuel Li	98546bfd03	radeonsi: add Stoney pci ids Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Samuel Li <samuel.li@amd.com> Cc: mesa-stable@lists.freedesktop.org	2015-10-23 17:53:48 -04:00
Samuel Li	bf0d0ce0d5	radeonsi: add support for Stoney asics (v3) v2 (agd): rebase on mesa master, split pci ids to separate commit v3 (agd): use carrizo for llvm processor name for llvm 3.7 and older Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Samuel Li <samuel.li@amd.com> Cc: mesa-stable@lists.freedesktop.org	2015-10-23 17:53:14 -04:00
Ilia Mirkin	e05021ff72	nvc0: respect edgeflag attribute width The edgeflag comes in as ubyte with glEdgeFlagPointer but as float with plain immediate glEdgeFlag. Avoid reading bytes that weren't meant for the edgeflag in the pointer case. Fixes intermittent failures with gl-2.0-edgeflag piglit (and valgrind complaints about reading uninitialized memory). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-10-23 16:43:06 -04:00
Jose Fonseca	ea421e919a	gallivm: Explicitly disable unsupported CPU features. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92214 CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-10-23 20:25:19 +01:00
Eric Anholt	70b06fb5d5	vc4: Convert blending to being done in 4x8 unorm normally. We can't do this all the time, because you want blending to be done in linear space, and sRGB would lose too much precision being done in 4x8. The win on instructions is pretty huge when you can, though. total uniforms in shared programs: 32065 -> 32168 (0.32%) uniforms in affected programs: 327 -> 430 (31.50%) total instructions in shared programs: 92644 -> 89830 (-3.04%) instructions in affected programs: 15580 -> 12766 (-18.06%) Improves openarena performance at 1920x1080 from 10.7fps to 11.2fps.	2015-10-23 18:11:21 +01:00
Eric Anholt	8e701fda49	vc4: Add QIR/QPU support for the 8-bit vector instructions.	2015-10-23 18:11:21 +01:00
Eric Anholt	817a7eb588	vc4: Don't try to CSE non-SSA instructions. This can happen when we're doing destination packing -- we don't know what's in the rest of the register. Signed-off-by: Eric Anholt <eric@anholt.net>	2015-10-23 18:11:21 +01:00
Eric Anholt	5b2fb138bc	nir: Add opcodes for saturated vector math. This corresponds to instructions used on vc4 for its blending inside of shaders. I've seen these opcodes on other architectures before, but I think it's the first time these are needed in Mesa. v2: Rename to 'u' instead of 'i', since they're all 'u'norm (from review by jekstrand)	2015-10-23 18:11:21 +01:00
Eric Anholt	1066a372d8	vc4: Add dumping of VC4_PACKET_GL_INDEXED_PRIMITIVE.	2015-10-23 18:11:21 +01:00
Eric Anholt	7d7fbcdf4e	vc4: Add a workaround for HW-2116 (state counter wrap fails). I haven't proven that this happens (I've got other GPU hangs in the way), but the closed driver also does this and it's documented as an errata.	2015-10-23 18:11:21 +01:00
Eric Anholt	73f6104532	vc4: Fix missing \n in a perf_debug().	2015-10-23 18:11:21 +01:00
Kristian Høgsberg Kristensen	8f60dc83f7	i965/fs: Allow copy propagating into new surface access opcodes Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-23 09:42:28 -07:00
Kristian Høgsberg Kristensen	0cb7d7b4b7	i965/fs: Optimize ssbo stores Reviewed-by: Francisco Jerez <currojerez@riseup.net> Write groups of enabled components together. Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-23 09:42:28 -07:00
Kristian Høgsberg Kristensen	feff21d1a6	i965/fs: Drop offset_reg temporary in ssbo load Now that we don't read each component one-by-one, we don't need the temporary vgrf for the offset. More importantly, this register was type UD while the nir source was type D. This broke copy propagation and left a redundant MOV in the generated code. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-23 09:42:28 -07:00
Kristian Høgsberg Kristensen	0a5a738252	i965/fs: Avoid scalar destinations in emit_uniformize() The scalar destination registers break copy propagation. Instead compute the results to a regular register and then reference a component when we later use the result as a source. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-23 09:42:28 -07:00
Kristian Høgsberg Kristensen	a19bf6d3cc	i965/fs: Don't uniformize surface index twice The emit_untyped_read and emit_untyped_write helpers already uniformize the surface index argument. No need to do it before calling them. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-23 09:42:28 -07:00
Kristian Høgsberg Kristensen	aedc0aab19	i965/fs: Use unsigned immediate 0 when eliminating SHADER_OPCODE_FIND_LIVE_CHANNEL The destination for SHADER_OPCODE_FIND_LIVE_CHANNEL is always a UD register. When we replace the opcode with a MOV, make sure we use a UD immediate 0 so copy propagation doesn't bail because of non-matching types. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-23 09:42:28 -07:00
Kristian Høgsberg Kristensen	24a3a697e5	i965/fs: Read all components of a SSBO field with one send Instead of looping through single-component reads, read all components in one go. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-23 09:42:28 -07:00
Kristian Høgsberg Kristensen	de5a450bd3	i965: Don't use message headers for untyped reads We always set the mask to 0xffff, which is what it defaults to when no header is present. Let's drop the header instead. v2: Only remove header for untyped reads. Typed reads always need the header. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-23 09:42:28 -07:00
Alejandro Piñeiro	2f1bc1da86	i965/vec4: check opcode on vec4_instruction::reads_flag(channel) Commit f17b78 added an alternative reads_flag(channel) that returned whether the instruction was reading a specific channel flag. By mistake it only took into account the predicate, but when the opcode is VS_OPCODE_UNPACK_FLAGS_SIMD4X2 there isn't any predicate, yet the flags are used. That mistake caused some regressions on old hw. More information on this bug: https://bugs.freedesktop.org/show_bug.cgi?id=92621 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-23 18:11:09 +02:00
Eric Anholt	fb064901e9	vc4: Use Rob's NIR-based user clip lowering.	2015-10-23 14:30:15 +01:00
Eric Anholt	b3797a8f88	vc4: Also dump the decimation mode for resolved stores.	2015-10-23 14:30:15 +01:00
Eric Anholt	7516cbd261	vc4: Use VC4_GET_FIELD and other defines in dumping VC4_RENDER_CONFIG.	2015-10-23 14:30:15 +01:00
Eric Anholt	b0963ce758	vc4: Add a sentinel after simulator buffers for buffer overflow detection. This is a little bit like the mprotect-based fencing I've experimented with, but it's simple and low overhead. The downside is that only catches writes, not reads. It didn't catch any bad writes on a current piglit run, but may be useful in the future.	2015-10-23 14:29:07 +01:00
Samuel Iglesias Gonsalvez	f408a13dd3	glsl: fix shader storage block member rules when adding program resources Commit f24e5e did not take into account arrays of named shader storage blocks. Fixes 20 dEQP-GLES31.functional.ssbo.* tests: dEQP-GLES31.functional.ssbo.layout.single_struct_array.per_block_buffer.shared_instance_array dEQP-GLES31.functional.ssbo.layout.single_struct_array.per_block_buffer.packed_instance_array dEQP-GLES31.functional.ssbo.layout.single_struct_array.per_block_buffer.std140_instance_array dEQP-GLES31.functional.ssbo.layout.single_struct_array.per_block_buffer.std430_instance_array dEQP-GLES31.functional.ssbo.layout.single_struct_array.single_buffer.shared_instance_array dEQP-GLES31.functional.ssbo.layout.single_struct_array.single_buffer.packed_instance_array dEQP-GLES31.functional.ssbo.layout.single_struct_array.single_buffer.std140_instance_array dEQP-GLES31.functional.ssbo.layout.single_struct_array.single_buffer.std430_instance_array dEQP-GLES31.functional.ssbo.layout.single_nested_struct_array.per_block_buffer.shared_instance_array dEQP-GLES31.functional.ssbo.layout.single_nested_struct_array.per_block_buffer.packed_instance_array dEQP-GLES31.functional.ssbo.layout.single_nested_struct_array.per_block_buffer.std140_instance_array dEQP-GLES31.functional.ssbo.layout.single_nested_struct_array.per_block_buffer.std430_instance_array dEQP-GLES31.functional.ssbo.layout.single_nested_struct_array.single_buffer.shared_instance_array dEQP-GLES31.functional.ssbo.layout.single_nested_struct_array.single_buffer.packed_instance_array dEQP-GLES31.functional.ssbo.layout.single_nested_struct_array.single_buffer.std140_instance_array dEQP-GLES31.functional.ssbo.layout.single_nested_struct_array.single_buffer.std430_instance_array dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers.2 dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers.29 dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers.33 dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.3 V2: - Rename some variables (Timothy) Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-10-23 13:12:43 +02:00
Chia-I Wu	582ecb3b91	ilo: add support for scratch spaces When a kernel reports a non-zero per-thread scratch space size, make sure the hardware state is correctly set up, and a scratch bo is allocated.	2015-10-23 17:29:58 +08:00
Chia-I Wu	4a7d18296a	ilo: fix scratch space setup in core Move scratch_size out of ilo_state_shader_kernel_info and ilo_state_compute_interface_info. A scratch space is shared by all kernels/interfaces. Update builder to emit relocs for scratch bos.	2015-10-23 17:29:58 +08:00
Timothy Arceri	3994ef5f1b	glsl: remove excess location qualifier validation Location has never been able to be a negative value because it has always been validated in the parser. Also the linker doesn't check for negatives like the comment claims. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-23 17:05:56 +11:00
Dave Airlie	5145243018	docs: update relnotes to mention virgl driver.	2015-10-23 14:40:07 +10:00
Dave Airlie	b3b82fe8ea	virgl/vtest: add vtest driver virgl/vtest is a swrast driver that allows the virgl acceleration to be tested without having a virtual machine. The backend has a unix socket server that this connects to. This is run by setting LIBGL_ALWAYS_SOFTWARE=y GALLIUM_DRIVER=virpipe In this mode all rendering is sent over a socket to the remote renderer, and the results are read back and copied to the screen using drisw. This works well enough to develop new features and to help debug. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-10-23 14:40:07 +10:00
Dave Airlie	a8987b88ff	virgl: add driver for virtio-gpu 3D (v2) virgl is the 3D acceleration backend for the virtio-gpu shipping with qemu. The 3D acceleration is designed around gallium and TGSI as the virtualisation layer. The backend renderer translates the virgl interface into OpenGL currently. This is the initial import of the driver to mesa. The kernel driver portions are lined up for drm-next. Currently this driver supports up to GL3.3 and some misc extensions if the host driver exposes it. It is planned to iterate the virgl API to new GL levels as mesa host drivers gain features. v2: fix resource tracking across flushes to avoid ->bind hack in mapping. consolidate mapping and waiting code for transfers. use u_range for dirt tracking. handle larger shaders in protocol. include virtgpu_drm.h in mesa for now. add translation layer for gallium tgsi to virgl tgsi. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-10-23 14:40:07 +10:00
Dave Airlie	531f5d1270	tgsi: try and handle overflowing shaders. (v2) This is used to detect error in virgl if we overflow the shader dumping buffers. v2: return a bool. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-10-23 11:57:56 +10:00
Dave Airlie	041081dc21	tgsi: add option to dump floats as hex values This adds support to the parser to accept hex values as floats, and then adds support to the dumper to allow the user to select to dump float as 32-bit hex numbers. This is required to get accurate values for virgl use of TGSI. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-10-23 11:55:02 +10:00
Sinclair Yeh	231d539239	svga: Condition preemptive flush on draw emission On ultra high resolution modes, the preemptive flush flag can be set midway through command submission, a condition that cannot be recovered from a flush-retry, causing rendering artifacts. This patch prevents a preemptive_flush until a draw has been emitted. Signed-off-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-22 17:19:20 -06:00
Brian Paul	99effaa965	svga: try to avoid index generation for some primitive types The svga device doesn't directly support quads, quad strips or polygons so we have to convert those types to indexed triangle lists. But we can sometimes avoid that if we're drawing flat/constant-colored prims and we don't have to worry about provoking vertex. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-10-22 17:19:20 -06:00
Brian Paul	129d34da49	svga: avoid provoking vertex conversion when possible Provoking vertex comes into play when doing flat shading. But if we know that all fragments in a primitive are the same color, the provoking vertex doesn't matter. Check for that case and use whichever provoking vertex convention is supported by the device. This avoids generating an index buffer to do the PV conversion. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-10-22 17:19:20 -06:00
Brian Paul	1082735bb6	svga: detect constant color writes in fragment shaders Examine the fragment shader to try to detect TGSI shaders which use "MOV OUT[0], CONST[i]" to write a constant value for the fragment color. In this case, all fragments will have the same color (unless blending is enabled). This is a common case for OpenGL code such as: glColor(), glBegin(), glVertex(), ..., glEnd() when lighting/fog/etc are disabled. In this case, the Mesa/gallium state tracker actually generates a simple "MOV OUT[0], CONST[i]" fragment shader. This will be used by the next commit to avoid provoking vertex conversion (creating/rewriting an index buffer) when drawing flat-shaded primitives. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-10-22 17:19:20 -06:00
Brian Paul	df0f817e31	mesa: check for unchanged line width before error checking Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-22 17:19:20 -06:00
Brian Paul	990afdc045	st/mesa: use _mesa_RasterPos() when possible The st_RasterPos() function goes to great pains to implement the rasterpos transformation. It basically uses gallium's draw module to execute the vertex shader to draw a point, then capture that point's attributes. But glRasterPos isn't typically used with a vertex shader so we can usually use the old/fixed-function implementation which is a lot simpler and faster. This can add up for legacy apps that make a lot of calls to glRasterPos. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-10-22 17:19:20 -06:00
Brian Paul	af0399a1ce	tnl: remove t_rasterpos.c Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-10-22 17:19:20 -06:00
Brian Paul	234d5320bb	drivers/common: use _mesa_RasterPos instead of _tnl_RasterPos Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-10-22 17:19:20 -06:00
Brian Paul	614a743767	mesa: copy rasterpos evaluation code into core Mesa We'll remove it from the tnl module next. By lifting this code into core Mesa we can use it from the gallium state tracker. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-10-22 17:19:20 -06:00
Brian Paul	9919f56099	vbo: optimize vertex copying when 'wrapping' Instead of calling memcpy() 'n' times, we can do it all at once since the source and dest regions are all contiguous. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-22 17:19:20 -06:00
Alex Deucher	7b63658125	radeon/uvd: don't expose HEVC on old UVD hw (v3) The section for UVD 2 and older was not updated when HEVC support was added. Reported by Kano on irc. v2: integrate the UVD2 and older checks into the main switch statement. v3: handle encode checking as well. Encode is already checked in the top case statement, so drop encode checks in the lower case statement. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: mesa-stable@lists.freedesktop.org	2015-10-22 16:22:44 -04:00
Alejandro Piñeiro	8cf84a7e47	i965/vec4: print predicate control at brw_vec4 dump_instruction v2: externalize pred_ctrl_align16 from brw_disasm.c instead of adding a copy on brw_vec4.c, as suggested by Matt Turner Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-22 21:58:03 +02:00
Alejandro Piñeiro	92ae101ed0	i965/vec4: use an envvar to decide to print the assembly on cmod_propagation tests The complete way to do this would be parse INTEL_DEBUG and print the output if DEBUG_VS (or a new one) is present (see intel_debug.c). But that seems like an overkill for the unit tests, that after all, the most common use case is being run when calling make check. v2: use the same idea for the fs counterpart too, as suggested by Matt Turner Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-22 21:58:03 +02:00
Alejandro Piñeiro	8fc8fcc04f	i965/vec4: Add unit tests for cmod propagation pass This includes the same tests coming from test_fs_cmod_propagation, (non vector glsl types included) plus some new with vec4 types, inspired by the regressions found while the optimization was a work in progress. Additionally, the check of number of instructions after the optimization was changed from EXPECT_EQ to ASSERT_EQ. This was done to avoid a crash on failing tests that expected no optimization, as after checking the number of instructions, there were some checks related to this last instruction opcode/conditional mod. v2: update tests after Matt Turner's review of the optimization pass v3: tweaks on the tests (mostly on the comments), after Matt Turner's review Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-22 21:58:03 +02:00
Alejandro Piñeiro	627f94b72e	i965/vec4: adding vec4_cmod_propagation optimization vec4 port of fs_cmod_propagation. Shader-db results (no vec4 grepping): total instructions in shared programs: 6240413 -> 6235841 (-0.07%) instructions in affected programs: 401933 -> 397361 (-1.14%) total loops in shared programs: 1979 -> 1979 (0.00%) helped: 2265 HURT: 0 v2: remove extra space and combine two if blocks, as suggested by Matt Turner v3: add condition check to bail out if current inst and inst being scanned has different writemask, as pointed by Matt Turner v3: updated shader-db numbers v4: remove block from foreach_inst_in_block_*_starting_from after commit `801f151917` Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-22 21:58:03 +02:00
Alejandro Piñeiro	a59359ecd2	i965/vec4: track and use independently each flag channel vec4_live_variables tracks now each flag channel independently, so vec4_dead_code_eliminate can update the writemask of null registers, based on which components are alive at the moment. This would allow vec4_cmod_propagation to optimize out several movs involving null registers. v2: added support to track each flag channel independently at vec4 live_variables, as v1 assumed that it was already doing it, as pointed by Francisco Jerez v3: general cleaning after Matt Turner's review Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-22 21:58:03 +02:00
Alejandro Piñeiro	8ac3b525c7	i965/vec4: nir_emit_if doesn't need to predicate based on all the channels v2: changed comment, as suggested by Matt Turner Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-22 21:58:03 +02:00
Matt Turner	1095d837dc	i965/vec4/gs: Fix signed/unsigned comparison warning.	2015-10-22 12:27:04 -07:00
Matt Turner	e2707c8765	i965/fs: Emit a single ADD instruction for SET_SAMPLE_ID on Gen8+. Gen8+ lifted the register region restriction that an instruction whose destination spans two registers must have sources that also span two registers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-10-22 12:27:00 -07:00
Matt Turner	0f74796e33	i965/fs: Drop unnecessary write-enable-all from SET_SAMPLE_ID. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-10-22 12:26:57 -07:00
Matt Turner	e2344e11ce	i965/fs: Trim unneeded channels in SampleID setup. The AND and SHR produce a scalar value that we had been replicating across $dispatch_width channels. The immediate MOV produces only four useful channels of data. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-10-22 12:26:54 -07:00
Matt Turner	e10fc055e7	i965/fs: Use type-W for immediate in SampleID setup. Not a functional difference, but register is loaded with a signed immediate (V) and added to a signed type (D) producing a signed result (D). Also change the type of g0 to allow for compaction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-10-22 12:26:49 -07:00
Matt Turner	cfb67c3d06	i965/vec4: Initialize LOD to 0.0f for textureQueryLevels() and texture(). We implement textureQueryLevels (which takes no arguments, save the sampler) using the resinfo message (which takes an argument of LOD). Without initializing it, we'd generate a MOV from the null register to load the LOD argument. Essentially the same logic applies to texture. A vertex shader cannot compute derivatives and so cannot produce an LOD, so TXL with an LOD of 0.0 is used. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-22 10:16:52 -07:00
Matt Turner	65ffaf2740	i965: Note that the UV immediate type is Gen6+.	2015-10-22 10:16:52 -07:00
Jose Fonseca	718249843b	gallivm: Translate all util_cpu_caps bits to LLVM attributes. This should prevent disparity between features Mesa and LLVM believe are supported by the CPU. http://lists.freedesktop.org/archives/mesa-dev/2015-October/thread.html#96990 Tested on a i7-3720QM w/ LLVM 3.3 and 3.6. v2: Increase SmallVector initial size as suggested by Gustaw Smolarczyk. Reviewed-by: Roland Scheidegger <sroland@vmware.com> CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-10-22 11:11:40 +01:00
Jordan Justen	627c15cde4	i965/fs: Disable CSE optimization for untyped & typed surface reads An untyped surface read is volatile because it might be affected by a write. In the ES31-CTS.compute_shader.resources-max test, two back to back read/modify/writes of an SSBO variable looked something like this: r1 = untyped_surface_read(ssbo_float) r2 = r1 + 1 untyped_surface_write(ssbo_float, r2) r3 = untyped_surface_read(ssbo_float) r4 = r3 + 1 untyped_surface_write(ssbo_float, r4) And after CSE, we had: r1 = untyped_surface_read(ssbo_float) r2 = r1 + 1 untyped_surface_write(ssbo_float, r2) r4 = r1 + 1 untyped_surface_write(ssbo_float, r4) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-22 00:36:37 -07:00
Chia-I Wu	13a5805b64	ilo: make sure there is HiZ before resolving We do not want to perform a depth resolve on an MCS enabled surface.	2015-10-22 14:06:21 +08:00
Chia-I Wu	0b6f6ee50f	ilo: fix max thread count for HS on Gen8 It is in DW2 on Gen8.	2015-10-22 14:06:21 +08:00
Ben Widawsky	8eefdacb38	i965: Advertise ARB_shader_stencil_export (gen9+) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 21:14:44 -07:00
Ben Widawsky	1db44252d0	i965: Implement ARB_shader_stencil_export (gen9+) v2: remove useless source_stencil_to_render_target (Ken) Squash in the actual packing function, which also got to v2: Move the definition of the OPCODE outside of FB_WRITE opcodes (Matt) Reorder the regioning to be in VWH order (Matt) Don't retype src in the backend, just assert instead (Matt) Rename the debug prints to something better (Matt) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 21:14:44 -07:00
Ben Widawsky	5fa7114652	i965/fs: Enumerate logical fb writes arguments Gen9 adds the ability to write out a stencil value, so we need to expand the virtual payload by one. Abstracting this now makes that change easier to read. I was admittedly confused early on about some of the hardcoding. If people believe the resulting code is inferior, I am not super attached to the patch. v2: Remove explicit numbering from the enumeration (Matt). Use a real naming scheme, and reference it in the opcode definition (Curro) Add a missed hardcoded logical position in get_lowered_simd_width (Ben) Add an assertion to make sure the component numbering is correct (Ben) Cc: Matt Turner <mattst88@gmail.com> Cc: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 21:14:44 -07:00
Brian Paul	18a631eb90	svga: fix clip plane regression after recent tgsi_scan change Before the change "tgsi/scan: use properties for clip/cull distance writemasks", the tgsi_shader_info::num_written_clipdistance field was a multiple of four, now it's an accurate count. In the svga driver, we need a minor change to the loop test. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-10-21 17:12:19 -06:00
Kenneth Graunke	48c76eae8e	i965: Implement gl_InvocationID. It's stored in bits 31:27 of g1 (along with the URB handles). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:58 -07:00
Kenneth Graunke	c5ae34f38f	i965: Implement nir_intrinsic_load_primitive. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:56 -07:00
Kenneth Graunke	b3ebf03b84	i965: Add a fs_visitor constructor that takes a brw_gs_compile. Unlike the vs/wm structs, brw_gs_compile is actually useful: it contains the input VUE map and information about the control data headers. Passing this in allows us to share that code in brw_gs.c, and calculate them before deciding on vec4 vs. scalar mode, as it's independent of that choice. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:54 -07:00
Kenneth Graunke	55dfd39b5f	i965: Add a brw->scalar_gs flag controlled by INTEL_SCALAR_GS=1. This patch introduces a brw->scalar_gs flag, similar to brw->scalar_vs, which controls whether or not to use SIMD8 geometry shaders. For now, we control it via a new environment variable, INTEL_SCALAR_GS. This provides a convenient way to try it out. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:53 -07:00
Kenneth Graunke	ac0a33666b	i965: Make emit_urb_writes() reserve space for GS header information. Geometry shaders have additional header data at the beginning of their output URB entries. Shaders that use EndPrimitive() or multiple streams have a control data header; shaders with a dynamic vertex count have an additional vec4 slot to hold the 32-bit vertex count (and 96 bits of padding). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:52 -07:00
Kenneth Graunke	cb755996d9	i965: Make emit_urb_writes() only set EOT for the VS. The GS will emit a bunch of vertices, and we don't want to do an EOT prematurely. We'll emit GS_OPCODE_THREAD_END when we want to terminate the thread. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:50 -07:00
Kenneth Graunke	6ae419b94d	i965: Make fs_visitor::emit_urb_writes reusable for scalar GS. GS doesn't have ClampVertexColor, and we don't want to go through VS structures. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:49 -07:00
Kenneth Graunke	72d84ae7ce	i965: Introduce a brw_vue_prog_data::include_vue_handles flag. Tessellation shaders and SIMD8 geometry shaders may need to resort to the pull model for inputs at times. When set, the state upload code will tell the hardware to provide URB handles for input data. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:48 -07:00
Kenneth Graunke	ac98888afd	i965: Introduce a new SHADER_OPCODE_URB_READ_SIMD8 opcode. In scalar mode, geometry shader inputs can easily take up hundreds of registers. This makes pushing VUE entries impractical; we'll need to resort to the pull model in some cases. To support this, we introduce a new opcode corresponding to the "URB Read SIMD8" message. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:46 -07:00
Kenneth Graunke	bea7522782	i965: Introduce new SHADER_OPCODE_URB_WRITE_SIMD8_MASKED/PER_SLOT opcodes. In the vec4 backend, we have a vec4_instruction::urb_write_flags field. There are many kinds of flags for SIMD4x2 messages. However, there are really only two (per-slot offset, use channel masks) for SIMD8 messages. Rather than adding a boolean flag for per-slot offsets (polluting all instructions), I decided to just make three new opcodes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-21 14:27:41 -07:00
Jason Ekstrand	0e57694745	i965/gs: Do prog_data setup and other calculations in brw_compile_gs This commit moves the large pile of setup calculations we have to do for geometry shaders out of brw_gs_emit and into brw_compile_gs. This has a couple of nice implications. First, it's less work that the caller of brw_compile_gs has to do. Second, it's consistent with the vertex and fragment stages. Finally, it allows us to put brw_gs_compile back behind the API boundary where it belongs. v2 (Jason Ekstrand): - Pull the changes to use nir info into a separate patch - Put brw_gs_compile into brw_shader.h rather than brw_vec4_gs_visitor.h so that we can use it for scalar GS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 14:20:32 -07:00
Jason Ekstrand	f3bc73073a	i965/gs: Use NIR info for setting up prog_data Previously, we were pulling bits from GL data structures in order to set up the prog_data. However, in this brave new world of NIR, we want to be pulling it out of the NIR shader whenever possible. This way, we can move all this setup code into brw_compile_gs without depending on the old GL stuff. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 14:20:32 -07:00
Jason Ekstrand	fac9b21e03	i965/gs: Pull prog_data out of brw_gs_compile Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 14:20:32 -07:00
Jason Ekstrand	6ac2bbec16	i965/gs: Use NIR instead of the brw_geometry_program for GS metadata With this, we can remove the geometry program from brw_gs_compile. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 14:20:32 -07:00
Jason Ekstrand	72148de217	i965/gs: Move the mem_ctx argument to brw_compile_gs This makes it better match the other brw_compile_* functions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 14:20:32 -07:00
Jason Ekstrand	8e8b527b27	i965/gs: Set static_vertex_count unconditionally on GEN8+ We always have NIR, so there's no reason for the check. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 14:20:32 -07:00
Jason Ekstrand	2686477d37	nir: Constify nir_gs_count_vertices Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 14:20:32 -07:00
Jason Ekstrand	4eb84a03be	nir/info: Add more information about geometry shaders Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 14:20:32 -07:00
Ben Widawsky	3c5d24363a	i965: (trivial) rename computes stencil to gen9 All the documentation I can find says that this bit (and functionality) only exists on SKL+. Since the bit isn't yet used, there is no real impact here. The original code was added by Ken here (a surprisingly long time ago): commit `f3c6d6f1e1` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Thu Nov 29 21:00:27 2012 -0800 i965: Update 3DSTATE_PS, 3DSTATE_WM, and add 3DSTATE_PS_EXTRA. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-21 11:00:03 -07:00
Ben Widawsky	c643518452	i965: Correct the comment about fb write payload Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-10-21 11:00:00 -07:00
Nanley Chery	f1147a238a	mesa/glformats: Undo code changes from _mesa_base_tex_format() move The refactoring commit, `c6bf1cd`, accidentally reverted `cd49b97` and `99b1f47`. These changes caused more code to be added to the function and removed the existing support for ASTC. This patch reverts those modifications. v2. Actually include ASTC support again. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92221 Cc: "11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2015-10-21 10:36:31 -07:00
Matt Turner	2ce659b5e4	i965: Mark compacted 3-src instructions as Gen8+. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-21 10:17:38 -07:00
Matt Turner	05cc56cca3	i965: Add const to brw_compact_inst_bits. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-21 10:17:38 -07:00
Matt Turner	b29f92daec	i965: Add mask_control_ex field and handle it in compaction. Documentation is sparse, but it appears to have existed on G45 and ILK as a second bit extension of the mask_control field. Setting the pair of bits to 0b11 enables "NoCMask". Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-21 10:17:38 -07:00
Matt Turner	3ec9d96d43	i965: Add devinfo->gen assertions for acc_wr_control. ... and for flag_subreg_nr since it's right nearby. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-21 10:17:38 -07:00
Matt Turner	d14907b946	i965: Prepare for next commit by adding more whitespace. We're going to add a field with a longer name that wouldn't align with the rest. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-21 10:17:38 -07:00
Matt Turner	35f3f06c8a	i965: Compact acc_wr_control only on Gen6+. It only exists on Gen6+, and the next patches will add compaction support for the (unused) field in the same location on earlier platforms. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-21 10:17:37 -07:00
Matt Turner	ee868c46e8	i965: Add devinfo parameter to brw_compact_inst_* funcs. The next commit will add assertions dependent on devinfo->gen. Use compact()/uncompact() macros where possible, like the 3-src code does. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-21 10:17:37 -07:00
Matt Turner	4a132349c3	i965/vec4: Don't emit MOVs for unused URB slots. Otherwise we'd emit a MOV from the null register (which isn't allowed). Helps 24 programs in shader-db (the geometry shaders in GSCloth): instructions in affected programs: 302 -> 262 (-13.25%) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-21 10:17:37 -07:00
Nigel Stewart	04703762e5	osmesa: Expose GL entry points for Windows build via DEF file. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92437 CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jose Fonseca <jfonseca@vmware.com>	2015-10-21 14:06:58 +01:00
Jonathan Gray	99c4079c37	configure.ac: ensure RM is set GNU make predefines RM to rm -f but this is not required by POSIX so ensure that RM is set. This fixes "make clean" on OpenBSD. v2: use AC_CHECK_PROG Signed-off-by: Jonathan Gray <jsg@jsg.id.au> CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-10-21 14:09:38 +01:00
Neil Roberts	ee77796a5c	i965/fs: Disable opt_sampler_eot for more message types In `bfdae9149e` I disabled the opt_sampler_eot optimisation for TG4 message types because I found by experimentation that it doesn't work. I wrote in the comment that I couldn't find any documentation for this problem. However I've now found the documentation and it has additional restrictions on further message types so this patch updates the comment and adds the others. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-21 11:08:37 +02:00
Neil Roberts	801f151917	i965: Remove block arg from foreach_inst_in_block_*_starting_from Since `49374fab5d` these macros no longer actually use the block argument. I think this is worth doing to make the macros easier to use because they already have really long names and a confusing set of arguments. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-21 11:07:04 +02:00
Timothy Arceri	38ceeeadaa	glsl: check for arrays of arrays when assigning explicit locations This fixes assigning explicit locations in the CTS test: ES31-CTS.explicit_uniform_location.uniform-loc-arrays-of-arrays Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-21 15:49:32 +11:00
Timothy Arceri	9a04057ef1	glsl: add is_array_of_arrays() helper As suggested by Ian Romanick Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-21 15:49:17 +11:00
Kenneth Graunke	156b7d3113	glsl: Fix bad indentation in bit_logic_result_type(). The first level of indentation was using 4 spaces. Mesa uses 3. Trivial. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-20 21:25:11 -07:00
Timothy Arceri	fd01840c0b	glsl: add AoA support to subroutines process_parameters() will now be called earlier because we need actual_parameters processed earlier so we can use it with match_subroutine_by_name() to get the subroutine variable, we need to do this inside the recursive function generate_array_index() because we can't create the ir_dereference_array() until we have gotten to the outermost array. For the remainder of the array dimensions the type doesn't matter so we can just use the existing _mesa_ast_array_index_to_hir() function to process the ast. Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-10-21 14:56:57 +11:00
Tapani Pälli	a59c1adcc6	glsl: fix record type detection in explicit location assign Check current_var directly instead of using the passed in record_type. This fixes the following failing CTS test: ES31-CTS.explicit_uniform_location.uniform-loc-types-structs No Piglit regressions. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-21 06:12:15 +03:00
Tapani Pälli	1f48ea1193	glsl: do not try to reserve explicit locations for buffer variables Explicit locations are only used with uniform variables. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-21 06:11:38 +03:00
Tapani Pälli	96bbb3707f	glsl: skip buffer variables when filling UniformRemapTable UniformRemapTable is used only for remapping user specified uniform locations to driver internally used ones, shader storage buffer variables should not utilize uniform locations. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-21 06:10:52 +03:00
Brian Paul	f1682fdafa	svga: add switch case for PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT A third instance of this was needed but missed in the previous commit. Return 32 as for the two other cases. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-10-20 19:14:51 -06:00
Brian Paul	b48e16fa2f	draw: fix splitting of line loops (v2) When the draw module splits long line loops, the sections are emitted as line strips. But the primitive type wasn't set correctly so each section was being drawn as a loop, introducing extra line segments. To fix this, we pass a new DRAW_LINE_LOOP_AS_STRIP flag to the run() function. The linear/elt_run() functions have to check for this flag and set their primitive type accordingly. No piglit regressions. Fixes piglit's lineloop with -count 4097 or higher. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81174 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-10-20 19:14:51 -06:00
Anuj Phogat	876d07d837	i965/gen9: Remove temporary variable 'bpp' in tr_mode_..._texture_alignment() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-20 13:26:25 -07:00
Anuj Phogat	06ec19bca4	i965/gen9: Remove temporary variable 'align_yf' in tr_mode_..._texture_alignment() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-20 13:26:25 -07:00
Anuj Phogat	8f8c450bc7	i965/gen9: Remove parameter 'brw' from tr_mode_..._texture_alignment() V2: Rebased on master. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-20 13:26:25 -07:00
Anuj Phogat	a5a00bd747	i965/gen9: Reuse YF alignment tables in tr_mode_..._texture_alignment() Patch just does some refactoring to make the code look better. No functional changes in here. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-20 13:26:25 -07:00
Brian Paul	f221580937	vbo: convert display list GL_LINE_LOOP prims to GL_LINE_STRIP When a long GL_LINE_LOOP prim was split across primitives we drew stray lines. See previous commit for details. This patch converts GL_LINE_LOOP prims into GL_LINE_STRIP prims so that drivers don't have to worry about the _mesa_prim::begin/end flags. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81174 Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Acked-by: Sinclair Yeh <syeh@vmware.com>	2015-10-20 12:52:41 -06:00
Brian Paul	d79595bf02	vbo: fix GL_LINE_LOOP stray line bug When long GL_LINE_LOOP primitives don't fit in one vertex buffer they have to be split across buffers. The code to do this was basically correct but drivers had to pay special attention to the _mesa_prim::begin,end flags in order to draw the sections of the line loop properly. Apparently, the only drivers to do this were those using the old 'tnl' module for software vertex processing. Now we convert the split pieces of GL_LINE_LOOP prims into GL_LINE_STRIP primitives so that drivers don't have to worry about the special begin/end flags. The only time a driver will get a GL_LINE_LOOP prim is when the whole thing fits in one vertex buffer. Mostly fixes bug 81174, but not completely. There's another bug somewhere in the src/gallium/auxiliary/draw/ code. If the piglit lineloop test is run with -count 4096, rendering is correct, but with -count 4097 there are stray lines. 4096 is a magic number in the draw code (search for "4096"). Also note that this does not fix long line loops in display lists. The next patch fixes that. v2: fix incorrect -1 in vbo_compute_max_verts(), per Charmaine. Remove incorrect assertion which was added in vbo_copy_vertices(). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81174 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49779 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28130 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2015-10-20 12:52:41 -06:00
Brian Paul	03d2f08539	vbo: add new vbo_compute_max_verts() helper function Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2015-10-20 12:52:41 -06:00
Brian Paul	002c5c1da3	vbo: simplify some code in vbo_exec_End() Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2015-10-20 12:52:41 -06:00
Brian Paul	d916175c4d	vbo: simplify some code in vbo_copy_vertices() As before, use a new 'last_prim' pointer to simplify things. Plus, add some const qualifiers. v2: use 'sz' in another place, per Sinclair. And update subject line. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2015-10-20 12:52:41 -06:00
Brian Paul	d24c3a680e	vbo: simplify some code in vbo_exec_wrap_buffers() Use a new 'last_prim' pointer to simplify things. v2: remove unneeded assert(exec->vtx.prim_count > 0) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2015-10-20 12:52:41 -06:00
Brian Paul	1637cec8f8	vbo: replace the comment on vbo_copy_vertices() Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2015-10-20 12:52:41 -06:00
Brian Paul	e05ffcf1d9	vbo: make vbo_exec_vtx_wrap() static Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2015-10-20 12:52:41 -06:00
Brian Paul	971b56c643	vbo: remove unneeded ctx parameter for merge_prims() Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2015-10-20 12:52:41 -06:00
Brian Paul	6cc596c66b	tnl: add some comments in render_line_loop code And remove '(void) flags' line which is not needed. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2015-10-20 12:52:40 -06:00
Brian Paul	f7272032be	mesa: simple whitespace fix in texstore.c	2015-10-20 12:52:40 -06:00
Brian Paul	f6d4e20d10	vbo: reduce number of vertex buffer mappings for vertex attributes Whenever we got a glColor, glNormal, glTexCoord, etc. call outside a glBegin/End pair, we'd immediately map a vertex buffer to begin accumulating vertex data. In some cases, such as with display lists, this led to excessive vertex buffer mapping. For example, if we have a display list such as: glNewList(42, GL_COMPILE); glBegin(prim); glVertex2f(); ... glVertex2f(); glEnd(); glEndList(); Then did: glColor3f(); glCallList(42); We'd map a vertex buffer as soon as we saw glColor3f but we'd never actually write anything to it. Note that the vertex position data was put into a vertex buffer during display list compilation. With this change, we delay mapping the vertex buffer until we actually have a vertex to write to it (triggered by a glVertex() call). In the above case, we no longer map a vertex buffer when setting the color and calling the list. For drivers such as VMware's, reducing buffer mappings gives improved performance. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-20 12:52:40 -06:00
Brian Paul	d11fefa961	st/mesa: optimize 4-component ubyte glDrawPixels If we didn't find a gallium surface format that exactly matched the glDrawPixels format/type combination, we used some other 32-bit packed RGBA format and swizzled the whole image in the mesa texstore/format code. That slow path can be avoided in some common cases by using the pipe_sampler_view's swizzle terms to do the swizzling at texture sampling time instead. For now, only GL_RGBA/ubyte and GL_BGRA/ubyte combinations are supported. In the future other formats and types like GL_UNSIGNED_INT_8_8_8_8 could be added. v2: fix incorrect swizzle setup (need to invert the tex format's swizzle) Reviewed by: Jose Fonseca <jfonseca@vmware.com>	2015-10-20 12:52:40 -06:00
Brian Paul	cf405922eb	mesa: make memcpy_texture() non-static So that we can use it directly from the mesa/gallium state tracker. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2015-10-20 12:52:40 -06:00
Brian Paul	31ae52acce	st/mesa: check for out-of-memory in st_DrawPixels() Before, if make_texture() or st_create_texture_sampler_view() failed we silently no-op'd the glDrawPixels. Now, set GL_OUT_OF_MEMORY. This also allows us to un-nest a bunch of code. v2: also check if allocation of sv[1] fails, per Jose. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-20 12:52:40 -06:00
Brian Paul	c5de38abc9	st/mesa: use MAX3() instead of MAX2(MAX2) in draw_textured_quad() Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-20 12:52:40 -06:00
Brian Paul	e24d04e436	mesa: fix incorrect opcode in save_BlendFunci() Fixes assertion failure with new piglit arb_draw_buffers_blend-state_set_get test. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-10-20 12:52:40 -06:00
Brian Paul	b1f8ef5ae3	mesa: add more cases to print_list() in dlist.c Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-10-20 12:52:40 -06:00
Emil Velikov	6994d8ec01	i965: silence incompatible pointer type warning src/mesa/drivers/dri/i965/brw_program.c:94:39: warning: passing argument 1 of ‘_mesa_init_gl_program’ from incompatible pointer type [-Wincompatible-pointer-types] return _mesa_init_gl_program(&prog->program, target, id); ^ Runtime was unaffected as brw_geometry_program is subclassed from gl_geometry_program, thus the address passed was the same. Fixes: `bcb56c2c69` (program: convert _mesa_init_gl_program() to take struct gl_program *) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-20 18:37:22 +01:00
Marek Olšák	814f31457e	gallium: add PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINT This avoids a serious r600g bug leading to a GPU hang. The chances this bug will get fixed are pretty low now. I deeply regret listening to others and not pushing this patch, leaving other users with a GPU-crashing driver. Yes, it should be fixed in the compiler and it's ugly, but users couldn't care less about that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86720 Cc: 11.0 10.6 <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-20 18:27:11 +02:00
Eric Anholt	921feb8782	vc4: Switch our vertex attr lowering to being NIR-based. This exposes more information to NIR's optimization, and should be particularly useful when we do range-based optimization. total uniforms in shared programs: 32066 -> 32065 (-0.00%) uniforms in affected programs: 21 -> 20 (-4.76%) total instructions in shared programs: 93104 -> 92630 (-0.51%) instructions in affected programs: 31901 -> 31427 (-1.49%)	2015-10-20 12:47:27 +01:00
Eric Anholt	85b946478c	vc4: Add limited support for ibfe/ubfe. This is just enough to cover our unpack modes, which will be used by some new NIR-based lowering in the next commit.	2015-10-20 12:47:27 +01:00
Marek Olšák	8910ebd8e8	tgsi/scan: use properties for clip/cull distance writemasks No changes needed for drivers already relying on tgsi_shader_info. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-20 12:58:25 +02:00
Marek Olšák	7c75f23cb9	st/mesa: pass the clip distance array size to drivers Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-20 12:58:25 +02:00
Marek Olšák	e70c66197e	gallium: add new properties for clip and cull distance usage The TGSI usage mask can't be used, because these are declared as an output array of 2 elements. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-20 12:58:25 +02:00
Marek Olšák	67f489ded3	mesa: replace UsesClipDistance with ClipDistanceArraySize This is more practical and needed by gallium. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-20 12:58:25 +02:00
Marek Olšák	8339585b12	radeonsi: enable BC_OPTIMIZE if centroid isn't used This solution was recommended by a Catalyst developer. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-20 12:56:46 +02:00
Marek Olšák	38391835b5	radeonsi: fix the export_prim_id field size in the shader key Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-20 12:56:40 +02:00
Marek Olšák	9b54ce3362	radeonsi: support thread-safe shaders shared by multiple contexts The "current" shader pointer is moved from the CSO to the context, so that the CSO is mostly immutable. The only drawback is that the "current" pointer isn't saved when unbinding a shader and it must be looked up when the shader is bound again. This is also a prerequisite for multithreaded shader compilation. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-20 12:51:51 +02:00
Marek Olšák	e57dd7a08b	st/mesa: create shaders which have only one variant immediately (v2) v2: fix the condition when lacking sample shading Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-20 12:51:51 +02:00
Marek Olšák	b99645f819	st/mesa: negate the can_force_persample_interp flag Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-20 12:51:51 +02:00
Marek Olšák	f4e938e9ae	st/mesa: decouple shaders from contexts if they are shareable Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-20 12:51:51 +02:00
Marek Olšák	d74e7b6fb9	gallium: add PIPE_CAP_SHAREABLE_SHADERS I'll let drivers figure out how to do it. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-20 12:51:51 +02:00
Marek Olšák	12321966ae	radeonsi: add support for ARB_texture_view All tests pass. We don't need to do much - just set CUBE if the view target is CUBE or CUBE_ARRAY, otherwise set the resource target. The reason this can be so simple is that texture instructions have a greater effect on the target than the sampler view. Thanks Glenn for the piglit test. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-20 12:25:19 +02:00
Boyan Ding	6bd9e03512	vc4: Use nir_foreach_variable Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-10-20 09:54:53 +01:00
Timothy Arceri	2832ca95ec	glsl: fix stream qualifier for blocks with an instance name This also removes the validation from the parser as it is not required and once arb_enhanced_layouts comes along we won't be able to do validation on the stream qualifier in the parser anyway as it adds constant expression support to the stream qualifier. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Cc: 11.0 <mesa-stable@lists.freedesktop.org>	2015-10-20 11:58:28 +11:00
Timothy Arceri	aa9f06b3ea	glsl: fix regression when building interface field name for SSBOs Fixes regression caused by `bb5aeb8549` We don't care about the swizzle when building the name so just skip over it. Tested-by: Markus Wick <markus@selfnet.de> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-20 11:54:09 +11:00
Leo Liu	867284a8f0	st/omx/dec/h264: fix field picture type 0 poc disorder Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-10-19 20:43:03 -04:00
Anuj Phogat	2eed9e6b75	i965/gen9: Handle the GL_TEXTURE_{1D, 1D_ARRAY} targets inside switch Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 13:43:44 -07:00
Matt Turner	de862f03ac	i965/fs: Localize variables' scopes.	2015-10-19 10:19:32 -07:00
Matt Turner	35a2d259f2	i965/fs: Consider type mismatches in saturate propagation. NIR considers bcsel to produce and consume unsigned types, leading to SEL instructions operating on unsigned types when the data is really floating-point. Previous to this patch, saturate propagation would happily transform (+f0) sel g20:UD, g30:UD, g40:UD mov.sat g50:F, g20:F into (+f0) sel.sat g20:UD, g30:UD, g40:UD mov g50:F, g20:F But since the meaning of .sat is dependent on the type of the destination register, this is not valid. Instead, allow saturate propagation to change the types of dest/source on instructions that are simply copying data in order to propagate the saturate modifier. Fixes bad code gen in 158 programs. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-19 10:19:32 -07:00
Matt Turner	9e17c36b8b	i965: Extract can_change_source_types() functions. Make them members of fs_inst/vec4_instruction for use elsewhere. Also fix the fs version to check that dst.type == src[1].type and for !saturate. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-19 10:19:32 -07:00
Jason Ekstrand	41c474df53	i965/vs: Move URB entry_size and read_length calculations to compile_vs Reviewed-By: Eduardo Lima Mitev <elima@igalia.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	6980372010	i965: Move the entire compiler API into a single file At this point, the compiler API has been substantially simplified. In the spirit of Kristian's making a compiler library, this commit makes a single header file that contains, more-or-less, the entire compiler API. There's still a bit of cleanup to do particularly in the area of geometry shaders. However, this gets us much closer to having a separate compiler. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	4467344c82	i965: Rename brw_foo_emit to brw_compile_foo Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	67db9072b9	i965/fs: Move some of the prog_data setup into brw_wm_emit This commit moves the common/modern stuff. Some legacy stuff such as setting use_alt_mode was left because it needs to know whether or not we're an ARB program. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	4e711872d0	i965/cs: Rework cs_emit to take a nir_shader and a brw_compiler This commit removes all dependence on GL state by getting rid of the brw_context parameter and the GL data structures. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	657863bb5c	i965/gs: Rework gs_emit to take a nir_shader and a brw_compiler This commit removes all dependence on GL state by getting rid of the brw_context parameter and the GL data structures. Unfortunately, we still have to pass in the gl_shader_program for gen6 because it's needed for transform feedback. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	5d8bf6de61	i965/vs: Rework vs_emit to take a nir_shader and a brw_compiler This commit removes all dependence on GL state by getting rid of the brw_context parameter and the GL data structures. v2 (Jason Ekstrand): - Patch use_legacy_snorm_formula through as a function argument rather than trying to go through the shader key. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	22ad44910e	i965/fs: Rework wm_fs_emit to take a nir_shader and a brw_compiler This commit removes all dependence on GL state by getting rid of the brw_context parameter and the GL data structures. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	0ca401327e	i965: Use a const nir_shader in backend_shader Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	8f1d968704	i965/vec4: Remove gl_program and gl_shader_program from the generator Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	5e86f5b3d2	i965/fs: Remove the gl_program from the generator Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	688d2e4585	nir/info: Add a few bits of info for fragment shaders Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	4889c73dd1	nir/info: Add compute shader local size to nir_shader_info Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	fe399f3a69	nir/info: Move the GS info into a stage-specific info union This way we can have other stage-specific info without consuming too much extra space. While we're at it, we make sure that the geometry info is only set if we're actually a geometry shader. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	16619477bc	mesa: Move gl_frag_depth_layout from mtypes.h to shader_enums.h Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:47:03 -07:00
Jason Ekstrand	5d4bc5ec13	nir: Add a label to nir_shader_info Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:45:14 -07:00
Jason Ekstrand	e00314bc57	i965/asm: Explicitly use a nir_instr for IR annotations Now that everything goes through NIR, we don't need this to be a void pointer anymore. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-19 08:45:14 -07:00
Jose Fonseca	b23a4859f4	scons: Build nir/glsl_types.cpp once. Undoes early hacks, and ensures nir/glsl_types.cpp is built once, and only once. The root problem is that SCons doesn't know about NIR nor any source file in the NIR_FILES source list. Tested with libgl-gdi and libgl-xlib scons targets. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-19 15:59:59 +01:00
Brian Paul	530eb39c71	svga: fix incorrect round-down arithmetic Spotted by Roland. Luckily, this code should never really be hit since the const buffer size and offset should already be multiples of 16. I could probably add more assertions to that effect, but let's just fix the arithmetic for now. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-10-19 08:54:42 -06:00
Samuel Iglesias Gonsalvez	6f3954618b	glsl: fix segfault when indirect indexing a buffer variable which is an array Fixes a regression added by `bb5aeb8549`. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-10-19 11:16:50 +02:00
Indrajit Das	b0a44f1017	st/va: Added support for NV12 to IYUV conversion in vlVaGetImage Reviewed-by: Christian König <christian.koenig@amd.com>	2015-10-19 09:47:33 +02:00
Indrajit Das	381c17d695	st/va: Used correct parameter to derive the value of the "h" variable in vlVaCreateImage Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-10-19 09:47:24 +02:00
Iago Toral Quiroga	36c93e9659	glsl_to_tgsi: Use {Num}UniformBlocks instead of {Num}BufferInterfaceBlocks The latter holds both UBOs and SSBOs, but here we only want UBOs. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-19 08:20:40 +02:00
Iago Toral Quiroga	5a9ff87d0f	st/mesa: Use {Num}UniformBlocks instead of {Num}BufferInterfaceBlocks The latter holds both UBOs and SSBOs, but here we only want UBOs. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-19 08:20:40 +02:00
Iago Toral Quiroga	55403665b6	i965: Do not use NumBufferInterfaceBlocks This is the only place in the driver where we use this. Since we now work with separate index spaces, always use NumUniformBlocks and NumShaderStorageBlocks instead of NumBufferInterfaceBlocks to be more consistent with the rest of the code. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-19 08:20:40 +02:00
Iago Toral Quiroga	14c3db7bc5	main: GL_ACTIVE_UNIFORM_BLOCK_MAX_NAME_LENGTH is about UBOS, not SSBOs Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-19 08:20:40 +02:00
Iago Toral Quiroga	fba582efc7	main: Use NumUniformBlocks to count UBOs Now that we have separate index spaces for UBOs and SSBOs we do not need to iterate through BufferInterfaceBlocks any more, we can just take the UBO count directly from NumUniformBlocks. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-19 08:20:40 +02:00
Chia-I Wu	86ccb2a16f	ilo: set VME for 3DSTATE_PS When the bit is not set, we can see sampling artifacts on triangle edges when the mip filter is not GEN6_MIPFILTER_NONE.	2015-10-18 21:35:16 +08:00
Chia-I Wu	d04126a773	ilo: ignore prefer_linear_threshold when zero This was the intended behavior but it did not work as intended until now.	2015-10-18 21:04:52 +08:00
Chia-I Wu	a445e0f7ef	ilo: remove some unused kernel params	2015-10-18 21:04:52 +08:00
Chia-I Wu	6e132f4730	ilo: remove unused ilo_shader_get_type()	2015-10-18 21:04:52 +08:00
Chia-I Wu	29a0f7479d	ilo: remove u_debug.h inclusion from ilo_core.h Move it to ilo_debug.h.	2015-10-18 21:04:52 +08:00
Chia-I Wu	3fe568e2a4	ilo: remove u_memory.h inclusion from ilo_core.h We do not make allocations generally in the core.	2015-10-18 21:04:52 +08:00
Samuel Pitoiset	fc5ae0c13f	nvc0: do not bind input params at compute state init on Fermi It looks like binding a constant buffer on compute overwrites the 3D state. To avoid that, we already re-bind all the 3D constant buffers after launching a compute grid but this is not enough. Binding the constant buffer of input parameters for the compute state at initialization corrupts the 3D constant buffers, and it's just useless to bind it because this is not needed until we really launch a grid. This fixes some piglit regressions related to interpolation tests introduced in "nvc0: enable compute support by default on Fermi". Fixes: `00d6186` (nvc0: enable compute support by default on Fermi) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-18 14:25:05 +02:00
Kenneth Graunke	ca2b807ca3	i965/vs: Drop hack that created NIR for fixed function vertex programs. Marek made core Mesa call ProgramStringNotify(), which solves this properly. The hack is no longer needed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-17 17:26:11 -07:00
Kenneth Graunke	dbac0a6352	i965/nir: Switch on shader stage in nir_lower_outputs(). VS, GS, and FS continue doing the same thing they did before. We can simplify the FS code a bit because it is always scalar. Compute shaders now assert that there are no outputs instead of doing a loop over 0 outputs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-17 17:26:11 -07:00
Marek Olšák	7c10af6425	radeonsi: don't use the AMDGPU intrinsic for CMP No difference according to shader-db. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-10-17 21:40:04 +02:00
Marek Olšák	f2cdb68c8b	radeonsi: use LRP from gallivm Totals: SGPRS: 344552 -> 344368 (-0.05 %) VGPRS: 197132 -> 197552 (0.21 %) Code Size: 7375376 -> 7366304 (-0.12 %) bytes LDS: 91 -> 91 (0.00 %) blocks Scratch: 1679360 -> 1615872 (-3.78 %) bytes per wave Totals from affected shaders: SGPRS: 47736 -> 47552 (-0.39 %) VGPRS: 27952 -> 28372 (1.50 %) Code Size: 1392724 -> 1383652 (-0.65 %) bytes LDS: 39 -> 39 (0.00 %) blocks Scratch: 513024 -> 449536 (-12.38 %) bytes per wave Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:04 +02:00
Marek Olšák	eb11efc989	radeonsi: don't emit AMDGPU intrinsics for integer abs, min, max No difference according to shader-db. (with the new S_ABS_I32 pattern) Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-10-17 21:40:04 +02:00
Marek Olšák	d72a26ec5d	radeonsi: don't emit AMDGPU intrinsics for EX2, ROUND, TRUNC No difference according to shader-db. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-10-17 21:40:04 +02:00
Marek Olšák	6660ca7121	radeonsi: initialize output, temp, and address registers to "undef" This removes "v_mov v0, 0" which typically occurs before exports. Totals: SGPRS: 345216 -> 344552 (-0.19 %) VGPRS: 197684 -> 197132 (-0.28 %) Code Size: 7390408 -> 7375376 (-0.20 %) bytes LDS: 91 -> 91 (0.00 %) blocks Scratch: 1842176 -> 1679360 (-8.84 %) bytes per wave Totals from affected shaders: SGPRS: 101336 -> 100672 (-0.66 %) VGPRS: 53920 -> 53368 (-1.02 %) Code Size: 2170176 -> 2155144 (-0.69 %) bytes LDS: 2 -> 2 (0.00 %) blocks Scratch: 1015808 -> 852992 (-16.03 %) bytes per wave Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	529c5e7740	gallivm: implement the correct version of LRP The previous version has precision issues. This can be a problem with tessellation. Sadly, I can't find the article where I read it anymore. I'm not sure if the unsafe-fp-math flag would be enough to revert this. v2: added the comment	2015-10-17 21:40:03 +02:00
Marek Olšák	a2197cac7f	gallivm: set correct opcode info from unary/binary/ternary emits and clear the emit_data structure. The new radeonsi min/max opcode implementation requires this. (it looks good according to Roland S.)	2015-10-17 21:40:03 +02:00
Marek Olšák	5bc871a4ca	radeonsi: implement vertex color clamping This is only supported in the compatibility profile (without GS and tess). Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	208d1ed38d	radeonsi: implement fragment color clamping using the shader key for now. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	acc6a07874	radeonsi: clean up other scratch buffer functions Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	9098d7e9bd	radeonsi: clean up copy-pasted scratch buffer updates Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	938a1bee34	radeonsi: unify shader create functions The shader specifies the processor type, so use that instead. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	b0167809f1	radeonsi: unify shader delete functions Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	aa060e276c	radeonsi: fix a GS copy shader leak Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	c4f086f399	radeonsi: remove an unused ctx parameter in si_shader_destroy Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	4f4f477d6d	radeonsi: print export_prim_id from the shader key Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	b11edf8872	radeonsi: disable NaNs for LS and HS They're disabled for all other shaders except compute, but I forgot to do this for tess stages. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	73e3fba335	radeonsi: clean up si_llvm_init_export_args Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Marek Olšák	82335978bb	tgsi: move pipe_shader_from_tgsi_processor function to util Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-17 21:40:03 +02:00
Brian Paul	8c5647db5e	mesa: remove FLUSH_VERTICES() in _mesa_MatrixMode() Changing the matrix mode alone has no effect on rendering and does not need to trigger a flush or state validation. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-10-17 19:36:46 +02:00
Marek Olšák	3c6156a4a7	st/mesa: fix clip state dependencies This allows removing FLUSH_VERTICES in MatrixMode. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-17 19:36:44 +02:00
Marek Olšák	006fcc0da6	gallium/hud: fix possible NULL pointer dereference Trivial.	2015-10-17 19:06:27 +02:00
Brian Paul	3272f632ee	scons: fix MSVC, MinGW build Duplicate the glsl_types_hack.cpp work-around from the libgl-xlib target.	2015-10-17 10:06:49 -06:00
Rob Clark	7e6aafd6ab	build: fix make-check after `a6a6a71` commit `a6a6a71092` Author: Rob Clark <robclark@freedesktop.org> AuthorDate: Sat Oct 10 14:13:50 2015 -0400 glsl: (mostly) remove libglsl_util Was a bit too ambitious on removal of libglsl_util.. it is still needed by some of the tests. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-17 09:51:29 -04:00
Rob Clark	b7963b6926	build: fix out-of-tree build after `b9b40ef` commit `b9b40ef9b7` Author: Rob Clark <robclark@freedesktop.org> AuthorDate: Sat Oct 10 13:55:07 2015 -0400 nir: remove dependency on glsl broke things for i965 out of tree build. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-17 09:51:29 -04:00
Samuel Pitoiset	c188235d1b	nvc0: add support for performance monitoring metrics on Fermi As explained in the CUDA toolkit documentation, "a metric is a characteristic of an application that is calculated from one or more event values." Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-17 10:50:00 +02:00
Rob Clark	a6a6a71092	glsl: (mostly) remove libglsl_util Now that NIR does not depend on glsl, we can (mostly[]) get rid of the libglsl_util hack. [] glsl_compiler is the one remaining user of libglsl_util Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-16 19:33:38 -04:00
Rob Clark	b9b40ef9b7	nir: remove dependency on glsl Move glsl_types into NIR, now that the dependency on glsl_symbol_table has been split out. Possibly makes sense to rename things at this point, but if we do that I'd like to keep it split out into a separate patch to make git history easier to follow (IMHO). v2: fix android build v3: I f***ing hate scons.. but at least it builds Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-16 19:33:38 -04:00
Rob Clark	183db3a645	glsl: move half<->float conversion to util Needed in NIR too, so move out of mesa/main/imports.c Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-16 19:33:37 -04:00
Rob Clark	60690cb3b3	glsl: move builtin vector types to glsl_types.cpp First step at untangling NIR's dependency on glsl_types without bringing in the dependency on glsl_symbol_table. The builtin types are now in glsl_types (which will end up in NIR), but adding them to the symbol- table stays in builtin_types.cpp (which will not be part of NIR). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-16 19:33:37 -04:00
Rob Clark	33de998230	glsl: couple shader_enums cleanups Add missing enum to gl_system_value_name() and move VARYING_SLOT_MAX / FRAG_RESULT_MAX / etc into shader_enums.h as suggested by Emil. v2: add STATIC_ASSERT()'s Reported-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-16 19:33:37 -04:00
Timothy Arceri	698cdbf492	glsl: initialise record array count to 1 This was only being done in one of the two process methods. Fixes an issue with samplers using the array size of a previous record. Tested-by: Marek Olšák <marek.olsak@amd.com> Cc: Jason Ekstrand <jason@jlekstrand.net>	2015-10-17 08:50:40 +11:00
Timothy Arceri	3c87377d0b	nir: add atomic lowering support for AoA Cc: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-17 08:43:21 +11:00
Timothy Arceri	2e1798f183	nir: wrapper for glsl_type arrays_of_arrays_size() Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-17 08:43:15 +11:00
Ilia Mirkin	fd5e0581dd	configure: show which gallium drivers/sts are built Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-16 17:18:43 -04:00
Brian Paul	2023906667	tgsi: initialize ctx.file in tgsi_dump_instruction() Fixes segfault because of uninitialized file pointer. Trivial.	2015-10-16 14:32:09 -06:00
Samuel Pitoiset	a3b1757551	nvc0: add a note about MP counters on GF100/GF110 MP counters on GF100/GF110 (compute capability 2.0) are buggy because there is a context-switch problem that we need to fix. Results might be wrong sometimes, be careful! Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	0461260d77	nvc0: add MP counters variants for GF100/GF110 GF100 and GF110 chipsets are compute capability 2.0, while the other Fermi chipsets are compute capability 2.1. That is why some MP counters are different between these chipsets and we need to handle variants. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	ec5001d25b	nvc0: move SW/HW queries info to their respective files This will help for handling HW SM queries variants on Fermi. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	00d61869a5	nvc0: enable compute support by default on Fermi Compute support was not enabled by default because weird effects on 3D state happened, but I can't reproduce them anymore. This also enables MP performance counters by default on Fermi. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	8cd4b8478a	nvc0: allow only one active query for the MP counters group Because we can't expose the number of hardware counters needed for each different query, we don't want to allow more than one active query simultaneously to avoid failure when the maximum number of counters is reached. Note that these groups of GPU counters are currently only used by AMD_performance_monitor. Like for Kepler, this limits the maximum number of active queries to 1 on Fermi. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	cef22f3490	nvc0: read MP counters of all GPCs on Fermi When a card has more than one GPC, the grid used by the compute kernel which reads MP performance counters seems to be too small. The consequence is that the kernel is not launched on all TPCs. Increasing the grid size using the number of GPCs now launches enough blocks and we can read MP performance counters of all TPCs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	1825898e04	nvc0: store the number of GPCs to nvc0_screen NOUVEAU_GETPARAM_GRAPH_UNITS param returns the number of GPCs, the total number of TPCs and the number of ROP units. Note that when the DRM version is too old the default number of GPCs is fixed to 4. This will be used to launch the compute kernel which is used to read MP performance counters over all GPCs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	c4896c99cb	nvc0: fix unaligned mem access when reading MP counters on Fermi Memory accesses have to be aligned to 128 bits. Note that this doesn't happen when the card only has TPC. This patch fixes the following dmesg fail: gr: GPC0/TPC1/MP trap: global 00000004 [MULTIPLE_WARP_ERRORS] warp 000f [UNALIGNED_MEM_ACCESS] Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	7abd707251	nvc0: fix monitoring multiple MP counters queries on Fermi For strange reasons, the signal id depends on the slot selected on Fermi but not on Kepler. Fortunately, the signal ids are just offset by the slot id! Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	4fcb661711	nvc0: fix queries which use multiple MP counters on Fermi Queries which use more than one MP counter were misconfigured and computing the final result was also wrong because sources need to be configured on different hardware counters instead. According to the blob, computing the result is now as follows: FOR i..n val += ctr[i] * pow(2, i) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	6353f620cd	nvc0: allow to use 8 MP counters on Fermi On Fermi, we have one domain of 8 MP counters while we have two domains of 4 MP counters on Kepler. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	cac897197b	nvc0: fix sequence field init for MP counters on Fermi Sequence fields are located at MP[i] + 0x20 in the buffer object. This is used to check if result is available for MP[i]. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	409658c367	nvc0: correctly enable the MP counters' multiplexer on Fermi Writing 0x408000 to 0x419e00 (like on Kepler) has no effect on Fermi because we only have one domain of 8 counters. Instead, we have to write 0x80000000. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	c3570c3fb9	nvc0: rip off the kepler MP-enabling logic from the Fermi codepath Writing 0x1fcb to 0x419eac is definitely not related to MP counters and has no effect on Fermi (although this enables MP counters on Kepler). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	dab7e0ed09	nvc0: split out begin_query() hook used by MP counters The way we configure MP performance counters is going to be pretty different between Fermi and Kepler. Having two separate functions is much better. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Samuel Pitoiset	d4ecc2bce4	nvc0: remove useless call to query_get_cfg() in nvc0_hw_sm_query_end() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-16 21:57:44 +02:00
Brian Paul	efe37519b0	svga: only count hardware buffer mappings for HUD Don't count client memory buffer mappings since they're basically free. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-10-16 11:44:00 -06:00
Neha Bhende	9bc7e3105a	svga: add new GALLIUM_HUD queries Add new GALLIUM_HUD queries for: num-shaders num-resources num-state-objects num-validations map-buffer-time num-surface-views num-resources-mapped num-flushes Most of this patch was originally written by Neha. Additional clean-ups and num-flushes counter added by Brian Paul. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-10-16 11:43:28 -06:00
Brian Paul	f413f1a17c	svga: use new svga_new_shader_variant() function To simplify upcoming new HUD shader count implementation. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-10-16 11:43:28 -06:00
Brian Paul	8d0d5dca5b	svga: pass context to svga_tgsi_vgpu9_translate() Will be used for upcoming change. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-10-16 11:43:28 -06:00
Brian Paul	615b37a0e2	svga: remove svga_tgsi_vgpu9_translate() call in GS path We can never have geometry shaders with vgpu9. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-10-16 11:43:28 -06:00
Brian Paul	cb473c46fe	glsl: silence warning about unhandled ast_unsized_array_dim case in switch Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-10-16 11:34:05 -06:00
Brian Paul	afff809fea	st/mesa: fix incorrect pointer type arguments in st_new_program() Silences 5 warnings of the type: state_tracker/st_cb_program.c: In function 'st_new_program': state_tracker/st_cb_program.c:108:7: warning: passing argument 1 of '_mesa_init_gl_program' from incompatible pointer type [enabled by default] return _mesa_init_gl_program(&prog->Base, target, id); ^ Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-10-16 09:15:05 -06:00
Brian Paul	4627e8058e	Revert "mesa: remove FLUSH_VERTICES() in _mesa_MatrixMode()" This reverts commit `0de5e0f3fb`. Michel Dänzer spotted two piglit regressions from the change. I suspect that removing the FLUSH_VERTICES() actually exposed a bug elsewhere but I don't have time to hunt down the root issue at this time.	2015-10-16 09:10:22 -06:00
Samuel Iglesias Gonsalvez	ccbb52ac11	glsl: fix check SSBOs support for builtin functions has_shader_storage_buffer_objects() returns true also if the OpenGL context is 4.30 or ES 3.1. Previously, we were saying that all atomic() GLSL builtin functions for SSBOs were not available when OpenGL ES 3.1 context was in use. Fixes 48 dEQP-GLES31 tests: dEQP-GLES31.functional.ssbo.atomic. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-16 12:47:58 +02:00
Tapani Pälli	dc8c221e28	mesa: Set api prefix to version string when overriding version Otherwise there are problems when user overrides version and application such as Piglit wants to detect used api with glGetString(GL_VERSION). This makes it currently impossible to run glslparsertest tests for OpenGL ES when using version override. Below is example when using MESA_GLES_VERSION_OVERRIDE=3.1. Before: "3.1 Mesa 11.1.0-devel (git-24a1a15)" After: "OpenGL ES 3.1 Mesa 11.1.0-devel (git-78042ff)" v2: only include api prefix for OpenGL ES (Boyan Ding) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-10-16 12:58:52 +03:00
Iago Toral Quiroga	c8f5274b52	nir: Get the number of SSBOs and UBOs right Before `d31f98a272` and `56e2bdbca3` we had a single index space for UBOs and SSBOs, so NumBufferInterfaceBlocks would contain the combined number of blocks, not just one kind. This means that for shader programs using both UBOs and SSBOs, we were setting num_ssbos and num_ubos to a larger number than we should. Since the above commits we have separate index spaces for each so we can just get the right numbers. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-16 10:12:44 +02:00
Iago Toral Quiroga	f534f331ca	i965/vec4: Use the right number of UBOs Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-16 10:12:44 +02:00
Iago Toral Quiroga	6f9ca30266	i965/fs: use the right number of UBOs Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-16 10:12:44 +02:00
Rob Clark	ef7a563829	freedreno: add debug option to dirty state after draw Similar to "dclear", "ddraw" will mark all state dirty after each draw. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-15 18:04:18 -04:00
Rob Clark	6206da736c	freedreno/a3xx: cache-flush is needed after MEM_WRITE Otherwise the mem2gmem blit would see potentially bogus texture coordinates. Fixes an issue that shows up with glamor. CC: "11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-15 18:04:17 -04:00
Rob Clark	fefffdc2b2	gallium/util: fix debug_get_flags_option on 32-bit harder (yes, we want PRI?64, but we want the x version rather than the u version) Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-15 18:04:17 -04:00
Chih-Wei Huang	7599f8b167	nv30: include the header of ffs prototype It fixes a building error of the android 6.0 64-bit target. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-10-15 15:00:14 -04:00
Chih-Wei Huang	d31005e3e5	nv50/ir: use C++11 standard std::unordered_map if possible Note Android version before Lollipop is not supported. Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-10-15 14:59:59 -04:00
Jason Ekstrand	5f106153f5	nir/prog: Don't double-insert the fog-coord variable nir_variable_create already inserts it in the right list for us so inserting it again causes a linked list corruption. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-15 10:48:21 -07:00
Jason Ekstrand	b705005584	nir/glsl: Use shader_prog->Name for naming the NIR shader This has the better name to use. Apparently, sh->Name is usually 0. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-10-15 07:31:09 -07:00
Jason Ekstrand	eb893c220c	nir: Add helpers for creating variables and adding them to lists Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-15 07:31:09 -07:00
Jason Ekstrand	635daef76e	nir/prog: Use nir_foreach_variable Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-15 07:31:09 -07:00
Brian Paul	5d954fd5cb	mesa: wrap a ridiculously long line in es1_conversion.c Reviewed-by: Eric Anholt <eric@anholt.net>	2015-10-15 07:21:08 -06:00
Brian Paul	d8c23d156d	mesa: add num_buffers() helper in blend.c Reviewed-by: Eric Anholt <eric@anholt.net>	2015-10-15 07:21:08 -06:00
Brian Paul	dfbd62e772	mesa: optimize _UsesDualSrc blend flag setting For glBlendFunc and glBlendFuncSeparate(), the _UsesDualSrc flag will be the same for all buffers, so no need to compute it N times. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-10-15 07:21:08 -06:00
Brian Paul	d21e17f48f	mesa: fix incorrect error string in _mesa_BlendEquationiARB() Reviewed-by: Eric Anholt <eric@anholt.net>	2015-10-15 07:21:07 -06:00
Brian Paul	1d75165501	mesa: move validate_blend_factors() call after no-change check A redundant call to glBlendFuncSeparateiARB() is more likely than getting invalid values, so do the no-op check first. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-10-15 07:21:07 -06:00
Brian Paul	34de3c4c16	mesa: optimize no-change check in _mesa_BlendEquationSeparate() Reviewed-by: Eric Anholt <eric@anholt.net>	2015-10-15 07:21:07 -06:00
Brian Paul	2dfedf105d	mesa: optimize no-change check in _mesa_BlendEquation() Same story as preceding change to _mesa_BlendFuncSeparate(). Reviewed-by: Eric Anholt <eric@anholt.net>	2015-10-15 07:21:07 -06:00
Brian Paul	6fd29e6c31	mesa: optimize no-change check in _mesa_BlendFuncSeparate() Streamline the checking for no state change in _mesa_BlendFuncSeparate() (and _mesa_BlendFunc()). If _BlendFuncPerBuffer is false, we only need to check the 0th buffer state. Move argument validation after the no-op check. I'm looking at an app that issues about 1000 redundant glBlendFunc() calls per frame! Reviewed-by: Eric Anholt <eric@anholt.net>	2015-10-15 07:21:07 -06:00
Brian Paul	083b3f5cb4	mesa: short-cut new_state == _NEW_LINE in _mesa_update_state_locked() We can skip to the end of _mesa_update_state_locked() if only the _NEW_LINE flag is set since none of the derived state depends on it (just like _NEW_CURRENT_ATTRIB). Note that we still call the ctx->Driver.UpdateState() function, of course. v2: use bitmask-based test, per Eric. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-10-15 07:21:07 -06:00
Brian Paul	0de5e0f3fb	mesa: remove FLUSH_VERTICES() in _mesa_MatrixMode() Changing the matrix mode alone has no effect on rendering and does not need to trigger a flush or state validation. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-10-15 07:21:07 -06:00
Chih-Wei Huang	67d8518a0e	mesa: android: Fix the incorrect path of sse_minmax.c Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Fixes: `669cfc267a` (android: mesa: fix the path of the SSE4_1 optimisations) Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-10-15 13:41:02 +01:00
Mauro Rossi	45f0392ceb	i965: android: add the i965_compile_FILES sources to the driver i965_compile_FILES are needed otherwise we'll error out as below: target SharedLib: i915_dri (out/target/product/x86/obj/SHARED_LIBRARIES/i915_dri_intermediates/LINKED/i915_dri.so) external/mesa/src/mesa/drivers/dri/i965/brw_ir_fs.h:181: error: undefined reference to 'fs_inst::~fs_inst()' ... ... external/mesa/src/mesa/drivers/dri/i965/intel_screen.c:1484: error: undefined reference to 'brw_compiler_create' collect2: error: ld returned 1 exit status build/core/shared_library.mk:81: recipe for target 'out/target/product/x86/obj/SHARED_LIBRARIES/i965_dri_intermediates/LINKED/i965_dri.so' failed make: *** [out/target/product/x86/obj/SHARED_LIBRARIES/i965_dri_intermediates/LINKED/i965_dri.so] Error 1 [Emil Velikov: tweak commit message] Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-10-15 13:35:19 +01:00
Emil Velikov	bcb56c2c69	program: convert _mesa_init_gl_program() to take struct gl_program * Rather than accepting a void pointer, only to down and up cast around it, convert the function to take the base (struct gl_program) pointer. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-15 13:30:52 +01:00
Emil Velikov	2034bdd46c	nir: include nir_instr_set.h in the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-10-15 13:30:22 +01:00
Timothy Arceri	8da9e154b7	glsl: Allow arrays of arrays in GLSL ES 3.10 and GLSL 4.30 V3: use a check__allowed style function for requirements checking rather than has_ which doesn't encapsulate the error message V2: add missing 's' to the extension name in error messages and add decimal place in version string Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-10-15 21:42:24 +11:00
Timothy Arceri	f22b7933e2	glsl: allow for AoA in calculating offset to ubo start region Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-15 21:42:24 +11:00
Timothy Arceri	bb5aeb8549	glsl: build ubo name and indexing offset for AoA V2: split out unrelated change as suggested by Samuel Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-15 21:42:24 +11:00
Timothy Arceri	8cf1333b18	glsl: link uniform block arrays of arrays This adds support for setting up the UniformBlock structures for AoA and also adds support for resizing AoA blocks with a packed layout. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-15 21:42:24 +11:00
Timothy Arceri	d9f1f2bbc6	glsl: Add AoA support when checking for non-const index When checking for non-const indexing of interfaces take into account arrays of arrays Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-15 21:42:24 +11:00
Timothy Arceri	082b1ca2fe	glsl: Add support for lowering interface block arrays of arrays V2: make array processing functions static Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-15 21:42:24 +11:00
Timothy Arceri	132b9e9dd9	glsl: add AoA support for an interface with unsized array members Add support for setting the max access of an unsized member of an interface array of arrays. For example ifc[j][k].foo[i] where foo is unsized. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-15 21:42:24 +11:00
Timothy Arceri	d1d05c0f85	glsl: add AoA support for linking interface blocks with unsized members Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-15 21:42:24 +11:00
Timothy Arceri	dd89880dc0	glsl: avoid hitting assert for arrays of arrays Also add TODO comment about adding proper support Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-15 21:21:33 +11:00
Timothy Arceri	2d7a98de18	glsl: add AoA support for atomic counters This marks all counters in an AoA as active. For AoA all but the innermost array are treated as separate counters/uniforms. The Nvidia binary also goes further and finds inactive counters in the AoA, in future we should do this too, however this gets things working for the time being. This change also removes the use of UniformHash for atomic counters, this avoids having to generate name strings used as hash keys. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-15 21:21:27 +11:00
Timothy Arceri	261a434996	glsl: add std140 layout support for AoA Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-15 20:44:33 +11:00
Timothy Arceri	176e6930e6	i965: add arrays of arrays support for varyings V2: get the correct vector elements value for outputs Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-15 20:44:26 +11:00
Timothy Arceri	be822b89ac	glsl: calculate AoA uniform offset correctly for structs This allows the correct offset to be calculated for use in indirect indexing of samplers. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-15 20:44:20 +11:00
Timothy Arceri	410609c968	glsl: remove dead code in a single pass Currently only one ir assignment is removed for each var in a single dead code optimisation pass. This means if a var has more than one assignment, then it requires all the glsl optimisations to be run again for each additional assignment to be removed. Another pass is also required to remove the variable itself. With this change all assignments and the variable are removed in a single pass. Some of the arrays of arrays conformance tests that were looping through 8 dimensions ended up with a var with hundreds of assignments. This change helps ES31-CTS.arrays_of_arrays.InteractionFunctionCalls1 go from around 3 min 20 sec -> 2 min ES31-CTS.arrays_of_arrays.InteractionFunctionCalls2 went from around 9 min 20 sec to 7 min 30 sec I had difficulty getting the public shader-db to give a consistent result with or without this change but the results seemed unchanged at between 15-20 seconds. Thomas Helland measured change with shader-db on his machine from approx 117 secs to 112 secs. V3: Simplify freeing of list as suggested by Ian, and spelling fixes. V2: Add assert to be sure references are counted before assignments. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-By: Thomas Helland <thomashelland90@gmail.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-15 20:36:14 +11:00
Timothy Arceri	d337da81f2	glsl: don't allow gl_PerVertex to be redeclared as an array of arrays V3: move patch after fixes to ast for AoA and add const to helper as suggested by Ian V2: move single dimensional array detection into a helper Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-15 20:36:01 +11:00
Timothy Arceri	dea0af8f82	glsl: check that only the outermost array is unsized Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-15 20:35:54 +11:00
Timothy Arceri	3129359ed7	glsl: allow AoA to be sized by initializer or constructor V2: Split out unsized array validation to its own patch as suggested by Samuel. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-15 20:35:45 +11:00
Timothy Arceri	296a7ea471	glsl: add support for initialising sampler AoA Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-15 20:35:40 +11:00
Timothy Arceri	db280e951a	glsl: Add support for linking uniform arrays of arrays V3: Fix setting of data.location for struct AoA UBO members V2: Handle arrays of arrays in the same way structures are handled The ARB_arrays_of_arrays spec doesn't give very many details on how AoA uniforms are intended to be implemented. However in the ARB_program_interface_query spec there are details that show AoA are intended to be handled in a similar way to structs. Issues 7 from the ARB_program_interface_query spec: We define rules consistent with our enumeration rules for other complex types. For existing one-dimensional arrays, we enumerate a single entry if the array is an array of basic types, or separate entries for each array element if the array is an array of structures. We follow similar rules here. For a uniform array such as: uniform vec4 a[5][4][3]; we enumerate twenty different entries ("a[0][0][0]" through "a[4][3][0]"), each of which is treated as an array with three elements. This is morally equivalent to what you'd get if you worked around the limitation in current GLSL via: struct ArrayBottom { vec4 c[3]; }; struct ArrayMid { ArrayBottom b[3]; }; uniform ArrayMid a[5]; which would enumerate "a[0].b[0].c[0]" through "a[4].b[3].c[0]". Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-15 20:35:35 +11:00
Kenneth Graunke	ff31c243e3	i965: Don't hardcode FS in "validation failed!" message. Instead, print "Scalar VS" or "Scalar FS". Otherwise it's really confusing which stage is broken. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-14 14:07:47 -07:00
Jordan Justen	a274eff9ff	glsl: Support uint index in lower_vector_insert The ES31-CTS.compute_shader.pipeline-compute-chain test case generates an unsigned index by using gl_LocalInvocationID.x and gl_LocalInvocationID.y as array indices. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-14 13:16:35 -07:00
Jordan Justen	ab04adcf63	glsl: Support uint index in do_vec_index_to_cond_assign The ES31-CTS.compute_shader.pipeline-compute-chain test case generates an unsigned index by using gl_LocalInvocationID.x and gl_LocalInvocationID.y as array indices. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-14 13:16:34 -07:00
Jordan Justen	0d1eef536b	i965/fs: Ignore compute shaders in brw_nir_lower_inputs The commit shown below caused compute shaders to hit the unreachable in the default of the switch block. Since compute shaders don't have any inputs, we can make brw_nir_lower_inputs a no-op for CS. commit `2953c3d761` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Fri Aug 14 15:15:11 2015 -0700 i965/vs: Map scalar VS input locations properly; avoid tons of MOVs. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-14 13:16:30 -07:00
Jordan Justen	63728dac57	i965/fs: Simplify FS in brw_nir_lower_inputs to only support scalar mode Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-14 13:16:29 -07:00
Brian Paul	9abbf65d0a	mesa: remove unused functions in program.c replace_registers() and adjust_param_indexes() were unused. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-14 12:47:15 -06:00
Brian Paul	9d4ce80736	mesa: minor indentation fix in _mesa_BindTextureUnit()	2015-10-14 12:47:15 -06:00
Brian Paul	77eef81370	mesa: remove unused texUnit local in _mesa_BindTextureUnit() The texture unit is error-checked before this and the texUnit var is unused, so remove it. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-10-14 12:47:15 -06:00
Krzysztof Sobiecki	14f7ce4248	st/fbo: use pipe_surface_release instead of pipe_surface_reference pipe_surface_reference has problems with deleted contexts, so use of pipe_surface_release might be more appropriate. Fixes Wasteland 2 Director's Cut crash on start. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-14 12:47:07 -06:00
Marta Lofstedt	93267887a0	glsl: Enable split of lower UBOs and SSBO also for compute shaders The split of Uniform blocks and shader storage block only loops up to MESA_SHADER_FRAGMENT and ignores compute shaders. This causes a segfault when running the OpenGL ES 3.1 CTS tests with GL_ARB_compute_shader enabled. V2: Changed to use MESA_SHADER_STAGES instead of MESA_SHADER_COMPUTE Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>	2015-10-14 16:05:42 +02:00
Jose Fonseca	5423c1e855	glsl: Include util/strndup.h. Fixes Windows builds. Trivial.	2015-10-14 11:50:06 +01:00
Tapani Pälli	ac257f1070	glsl: calculate TOP_LEVEL_ARRAY_SIZE and STRIDE when adding resources Patch moves existing calculation code from shader_query.cpp to happen during program resource list creation. No Piglit or CTS regressions were observed during testing. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-14 12:39:04 +03:00
Tapani Pälli	b76159b096	glsl: add top level array size and stride to gl_uniform_storage Patch adds 2 new fields to gl_uniform_storage so that we don't need to calculate these values during runtime shader queries. This is required by upcoming changes to free GLSL IR after linking. Patch moves 3 booleans inside structure so that structure size stays the same after this change. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-14 09:32:58 +03:00
Iago Toral Quiroga	d3f4588804	i965: Adapt SSBOs to work with their own separate index space Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-14 08:11:13 +02:00
Iago Toral Quiroga	56e2bdbca3	glsl/lower_ubo_reference: lower UBOs and SSBOs to separate index spaces Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-14 08:11:13 +02:00
Iago Toral Quiroga	d31f98a272	mesa: Add {Num}UniformBlocks and {Num}ShaderStorageBlocks to gl_shader{_program} These arrays provide backends with separate index spaces for UBOS and SSBOs. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-14 08:11:13 +02:00
Iago Toral Quiroga	27dccf097d	mesa: Rename {Num}UniformBlocks to {Num}BufferInterfaceBlocks Currently, these arrays in gl_shader and gl_shader_program hold both UBOs and SSBOs, so this looks like a better name. We were already using NumBufferInterfaceBlocks in gl_shader_program, so this makes things more consistent as well. In a later patch we will add {Num}UniformBlocks and {Num}ShaderStorageBlocks which will contain only references to UBOs and SSBOs respectively that will provide backends with a separate index space for both types of objects. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-14 08:11:13 +02:00
Iago Toral Quiroga	9de651b261	glsl: Fix variable_referenced() for vector_{extract,insert} expressions We get these when we operate on vector variables with array accessors (i.e. things like a[0] where 'a' is a vec4). When we call variable_referenced() on these expressions we want to return a reference to 'a' instead of NULL. This fixes a problem where we pass a[0] as the first argument to an atomic SSBO function that expects a buffer variable. In order to check this, we use variable_referenced(), but that is currently returning NULL in this case, since the underlying rvalue is a vector_extract expression. Tested-by: Markus Wick <markus@selfnet.de> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-10-14 08:08:12 +02:00
Iago Toral Quiroga	baee16bf02	nir: split SSBO min/max atomic intrinsics into signed/unsigned versions NIR is typeless so this is the only way to keep track of the type to select the proper atomic to use. v2: - Use imin,imax,umin,umax for the intrinsic names (Connor Abbott) - Change message for unreachable paths (Michael Schellenberger) Tested-by: Markus Wick <markus@selfnet.de> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-14 08:03:58 +02:00
Iago Toral Quiroga	be800ef6d8	i965/vec4: fix indentation in vec4_visitor::calculate_live_intervals Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-10-14 08:01:58 +02:00
Iago Toral Quiroga	9d2bbca98d	i965/fs: Fix indentation in fs_live_variables::compute_start_end Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-10-14 08:01:46 +02:00
Brian Paul	4a168ad797	mesa: clean up comments for gl_current_attrib struct Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-13 08:28:23 -06:00
Brian Paul	a7b6e6192a	vbo: make void vbo_exec_BeginVertices() static Not called from any other file. Rename and move before use. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-13 08:28:23 -06:00
Brian Paul	84719ad9df	vbo: document vbo_exec_context fields Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-13 08:28:23 -06:00
Brian Paul	d65b029dc2	vbo: minor clean-ups for vbo_exec_fixup_vertex() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-13 08:28:23 -06:00
Brian Paul	7f67bfaa74	vbo: add assertion in ATTR_UNION macro Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-13 08:28:23 -06:00
Brian Paul	3491ec5930	vbo: add comments, braces in ATTR_UNION() in vbo_exec_api.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-13 08:28:23 -06:00
Brian Paul	e729f36c09	vbo: fix whitespace in vbo_exec_draw.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-13 08:28:23 -06:00
Brian Paul	8fbb72c297	vbo: move 'tmp' var initialization Improve readability a bit. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-13 08:28:23 -06:00
Brian Paul	a1cbf85de0	vbo: improve fprintf() formatting Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-13 08:28:23 -06:00
Brian Paul	a639bbf098	vbo: simplify vertex array initializations in vbo_context.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-13 08:28:23 -06:00
Brian Paul	20f31ae37c	vbo: get rid of needless NR_MAT_ATTRIBS constant Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-13 08:28:23 -06:00
Brian Paul	dd293d8aae	vbo: fix incorrect switch statement in init_mat_currval() The variable 'i' is a value in [0, MAT_ATTRIB_MAX-1] so subtracting VERT_ATTRIB_GENERIC0 gave a bogus value and we executed the default switch clause for all loop iterations. This doesn't fix any known issues but was clearly incorrect. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-13 08:28:22 -06:00
Brian Paul	c73c481c4a	mesa: pass caller name to create_textures() Simpler than the dsa flag approach.	2015-10-13 08:28:22 -06:00
Samuel Iglesias Gonsalvez	6a506689db	glsl: fix matrix stride calculation for std430's row_major matrices with two columns This is the result of applying several rules: From OpenGL 4.3 spec, section 7.6.2.2 "Standard Uniform Block Layout": "2. If the member is a two- or four-component vector with components consuming N basic machine units, the base alignment is 2N or 4N, respectively." [...] "4. If the member is an array of scalars or vectors, the base alignment and array stride are set to match the base alignment of a single array element, according to rules (1), (2), and (3), and rounded up to the base alignment of a vec4." [...] "7. If the member is a row-major matrix with C columns and R rows, the matrix is stored identically to an array of R row vectors with C components each, according to rule (4)." [...] "When using the std430 storage layout, shader storage blocks will be laid out in buffer storage identically to uniform and shader storage blocks using the std140 layout, except that the base alignment and stride of arrays of scalars and vectors in rule 4 and of structures in rule 9 are not rounded up a multiple of the base alignment of a vec4." In summary: vec2 has a base alignment of 2N, a row-major mat2xY is stored like an array of Y row vectors with 2 components each. Because of std430 storage layout, the base alignment of the array of vectors is not rounded up to vec4, so it is still 2N. 
Fixes 15 dEQP tests: dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2 dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2 dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2 dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2x3 dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2x3 dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2x3 dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2x4 dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2x4 dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2x4 dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2 dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2x3 dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2x4 dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2 dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2x3 dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2x4 v2: - Add spec quote in both commit log and code (Timothy) Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-10-13 15:58:54 +02:00
Christian König	685335639a	r600/vce: enable VCE for trinity/richland Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-10-13 14:32:52 +02:00
Christian König	83de93309e	r600/uvd: disable UVD tiling by default It has only minimal advantages for post processing and doesn't work with VCE. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-10-13 14:32:48 +02:00
Glenn Kennard	24a1a157a6	r600g: Enable GL_ARB_gpu_shader5 extension Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-10-13 08:55:42 +10:00
Glenn Kennard	1befb7ed98	r600g/sb: SB support for UBO indexing Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-10-13 08:55:33 +10:00
Glenn Kennard	80c5062abf	r600g/sb: Support gs5 sampler indexing (v2) [airlied: v2 cayman fixups] Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-10-13 08:53:35 +10:00
Kenneth Graunke	bd198b9f0a	i965/vs: Simplify fs_visitor's ATTR file. Previously, ATTR was indexed by VERT_ATTRIB_* slots; at the end of compilation, assign_vs_urb_setup() translated those into GRF units, and converted ATTR to HW_REGs. This patch moves the translation earlier, making ATTR work in terms of GRF units from the beginning. assign_vs_urb_setup() simply has to add the number of payload registers and push constants to obtain the final hardware GRF number. (We can't do this earlier as those values aren't known.) ATTR still supports reg_offset; however, it's simply added to reg. It's not clear whether this is valuable or not. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-12 14:33:26 -07:00
Ilia Mirkin	bf97f8d467	nouveau: avoid double-emitting fence The act of ensuring that there is space can cause a flush to happen, which will emit the current screen fence. If that is the fence we're trying to wait on, then it will have been emitted as a result of doing the PUSH_SPACE. Don't attempt to emit it a second time. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Fixes: `8053c9208f` (nouveau: avoid emitting new fences unnecessarily) Cc: mesa-stable@lists.freedesktop.org	2015-10-12 17:21:29 -04:00
Ian Romanick	eeb444bc99	glsl: Never allow the sequence operator anywhere in an array size Fixes: spec/glsl-1.20/compiler/structure-and-array-operations/array-size-sequence-in-parenthesis.vert spec/glsl-es-1.00/compiler/array-sized-by-sequence-in-parenthesis.vert spec/glsl-es-3.00/compiler/array-sized-by-sequence-in-parenthesis.vert Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-12 10:15:14 -07:00
Ian Romanick	92635a84a7	glsl: In later GLSL versions, sequence operator cannot be a constant expression Fixes: ES3-CTS.shaders.negative.constant_sequence spec/glsl-es-3.00/compiler/global-initializer/from-sequence.vert spec/glsl-es-3.00/compiler/global-initializer/from-sequence.frag v2: Fix a couple of copy-and-paste mistakes in the spec quotations. Suggested by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-10-12 10:15:14 -07:00
Ian Romanick	05e4601c6b	glsl: Add method to determine whether an expression contains the sequence operator This will be used in the next patch to enforce some language semantics. v2: Fix inverted logic in ast_function_expression::has_sequence_subexpression. The method originally had a different name and a different meaning. I fixed the logic in ast_to_hir.cpp, but I only changed the names in ast_function.cpp. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> [v1] Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-10-12 10:15:13 -07:00
Ian Romanick	bb329f2ff6	glsl: Restrict initializers for global variables to constant expression in ES v2: Combine this check with the existing const and uniform checks. This change depends on the previous patch (glsl: Only set ir_variable::constant_value for const-decorated variables). Fixes: ES2-CTS.shaders.negative.initialize ES3-CTS.shaders.negative.initialize spec/glsl-es-1.00/compiler/global-initializer/from-attribute.vert spec/glsl-es-1.00/compiler/global-initializer/from-uniform.vert spec/glsl-es-1.00/compiler/global-initializer/from-uniform.frag spec/glsl-es-1.00/compiler/global-initializer/from-global.vert spec/glsl-es-1.00/compiler/global-initializer/from-global.frag spec/glsl-es-1.00/compiler/global-initializer/from-varying.frag spec/glsl-es-3.00/compiler/global-initializer/from-uniform.vert spec/glsl-es-3.00/compiler/global-initializer/from-uniform.frag spec/glsl-es-3.00/compiler/global-initializer/from-in.vert spec/glsl-es-3.00/compiler/global-initializer/from-in.frag spec/glsl-es-3.00/compiler/global-initializer/from-global.vert spec/glsl-es-3.00/compiler/global-initializer/from-global.frag Note: spec/glsl-es-3.00/compiler/global-initializer/from-sequence.* still fail because the result of a sequence operator is still considered to be a constant expression. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92304 Reviewed-by: Tapani Pälli <tapani.palli@intel.com> [v1] Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v1] Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-10-12 10:15:13 -07:00
Ian Romanick	3524d6df33	glsl: Only set ir_variable::constant_value for const-decorated variables Right now we're also setting for uniforms, and that doesn't seem to hurt things. The next patch will make general global variables in GLSL ES, and those definitely should not have constant_value set! Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-10-12 10:15:13 -07:00
Ian Romanick	5bc68f0f2b	glsl: Use constant_initializer instead of constant_value to determine whether to keep an unused uniform This even matches the comment "uniform initializers are precious, and could get used by another stage." Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-10-12 10:15:13 -07:00
Ian Romanick	313372cae8	glsl/linker: Use constant_initializer instead of constant_value to initialize uniforms Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-10-12 10:15:13 -07:00
Ian Romanick	8acce5d53a	ff_fragment_shader: Use binding to set the sampler unit This is the way layout(binding=xxx) works from GLSL. The old method just happened to work (and significantly predated support for layout(binding=xxx)), but future changes will break this. v2: Remove some stale comments. Suggested by Matt and Chris Forbes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-10-12 10:15:13 -07:00
Ian Romanick	43b07eb60f	glsl: Allow built-in functions as constant expressions in OpenGL ES 1.00 In `d4a24745` (August 2012), Paul made function calls not be constant expressions in GLSL ES 1.00. Since this feature was added in desktop GLSL 1.20, we believed that it was added in GLSL ES 3.00. That turns out to be completely wrong. Built-in functions have always been allowed as constant expressions in GLSL ES, and the patch adds the (many) spec quotations to prove it. While we never previously encountered this, a later patch enforces a GLSL ES 1.00 rule that global variable initializers must be constant expressions. Without this fix, several dEQP tests fail. Fixes: tests/spec/glsl-es-1.00/compiler/const-initializer/from-function.frag tests/spec/glsl-es-1.00/compiler/const-initializer/from-function.vert tests/spec/glsl-es-1.00/compiler/const-initializer/from-sequence-in-function.frag tests/spec/glsl-es-1.00/compiler/const-initializer/from-sequence-in-function.vert Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.0 10.1 10.2 10.3 10.4 10.5 10.6 11.0" <mesa-stable@lists.freedesktop.org> Yes, I know we don't maintain stable branches that far back, but that is how far back this bug goes!	2015-10-12 10:15:13 -07:00
Nicolai Hähnle	45ed627d89	u_vbuf: fix vb slot assignment for translated buffers Vertex attributes of different categories (constant/per-instance/ per-vertex) go into different buffers for translation, and this is now properly reflected in the vertex buffers passed to the driver. Fixes e.g. piglit's point-vertex-id divisor test. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-10-12 16:46:30 +02:00
Iago Toral Quiroga	7a1143f29e	glsl: include variable name in error messages about initializers Also fix style / wrong indentation along the way and make the messages more uniform. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-12 08:31:08 +02:00
Iago Toral Quiroga	f09c229cc6	glsl: shader outputs cannot have initializers GLSL Spec 4.20.8, 4.3 Storage Qualifiers: "Initializers in global declarations may only be used in declarations of global variables with no storage qualifier, with a const qualifier or with a uniform qualifier." We do this for input variables, but not for output variables. AMD and NVIDIA proprietary drivers don't allow this either. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-12 08:31:08 +02:00
Iago Toral Quiroga	8281a7c533	i965: Fix unsafe pointer when dumping VS/FS IR For the VS and FS stages that use ARB_vertex_program or ARB_fragment_program we don't have a shader program, however, when debugging is enabled, we call brw_dump_ir like this: brw_dump_ir("vertex", prog, &vs->base, &vp->program.Base); where vs will be NULL (since prog is NULL). As pointed out by Chris, this &vs->base is not really a dereference, it simply computes a new address that just happens to be 0x0 because the offset of base in brw_shader is 0. Then brw_dump_ir will see a NULL pointer and not do anything. This is why this does not crash at the moment. However, this does not look very safe (it would crash for any location of base that is not the first in brw_shader), so patch it to prevent a potential (even if unlikely) problem in the future. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-10-12 08:30:57 +02:00
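The address arithmetic behind the `&vs->base` observation above can be modelled with a short Python sketch. The struct layouts below (`ShaderBaseFirst`, `ShaderBaseSecond`, `Base`) are hypothetical stand-ins and do not reproduce the real `brw_shader` definition; the sketch only shows that a member address is the struct address plus the member offset, so a NULL struct pointer yields a NULL member pointer only when the member comes first.

```python
import ctypes


class Base(ctypes.Structure):
    # Hypothetical stand-in for the embedded "base" member.
    _fields_ = [("id", ctypes.c_int)]


class ShaderBaseFirst(ctypes.Structure):
    # "base" is the first member, so its offset is 0.
    _fields_ = [("base", Base), ("stage", ctypes.c_int)]


class ShaderBaseSecond(ctypes.Structure):
    # "base" follows another member, so its offset is non-zero.
    _fields_ = [("stage", ctypes.c_int), ("base", Base)]


def member_address(struct_address, struct_type, member):
    """Model of &ptr->member: struct address plus the member's offset."""
    return struct_address + getattr(struct_type, member).offset


NULL = 0
print(member_address(NULL, ShaderBaseFirst, "base"))   # still a NULL address
print(member_address(NULL, ShaderBaseSecond, "base"))  # small non-NULL address
```

With the second layout, a callee that only checks for NULL would go on to read from a bogus low address, which is the potential crash the commit guards against.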
Dave Airlie	bcfaab3885	mesa/uniforms: fix get_uniform for doubles (v2) The initial glGetUniformdv support didn't cover all the casting cases that are apparently legal, and cts seems to test for them. I've updated the piglit test to cover these cases now. v2: fix indentation - it's all broken in this file (Ilia) fix src/dst index tracking in light of fp64 support (Ilia) cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-10-12 13:10:59 +10:00
Chia-I Wu	c8083b1adc	ilo: improve Gen8 defines based on its PRMs	2015-10-12 10:15:28 +08:00
Matt Turner	4642d53a03	i965/vec4: Implement b2f and b2i using negation. Curro added this in commit `3ee2daf23d` (before the vec4/NIR backend was added) but it was missed in the new NIR backend. Add it there as well. instructions in affected programs: 1857 -> 1810 (-2.53%) helped: 15 Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-10-11 16:19:52 -07:00
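As a rough illustration of why negation is enough for `b2i` and `b2f`, the Python sketch below assumes that a boolean is stored as a 32-bit value that is either 0 (false) or all ones (true, i.e. -1 when read as a signed integer). That encoding is an assumption of this sketch, and the helper names are hypothetical rather than taken from the driver.

```python
TRUE = 0xFFFFFFFF   # assumed encoding of "true": all bits set
FALSE = 0x00000000  # assumed encoding of "false"


def as_signed32(bits):
    """Reinterpret a 32-bit pattern as a signed two's-complement integer."""
    return bits - (1 << 32) if bits & 0x80000000 else bits


def b2i(b):
    """Boolean to integer: negating -1 gives 1, negating 0 gives 0."""
    return -as_signed32(b)


def b2f(b):
    """Boolean to float: same negation, then an integer-to-float conversion."""
    return float(-as_signed32(b))


print(b2i(TRUE), b2i(FALSE), b2f(TRUE), b2f(FALSE))
```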
Ilia Mirkin	9fe458335f	nv50,nvc0: don't base decisions on available pushbuf space We still have to push everything out, might as well kick earlier and flip pushbufs when we know we'll need it. This resolves some issues with the new policy of making sure that we always leave a bit of room at the end for fences. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `47d11990b` (nouveau: make sure there's always room to emit a fence) Cc: mesa-stable@lists.freedesktop.org	2015-10-11 17:57:04 -04:00
Ilia Mirkin	8053c9208f	nouveau: avoid emitting new fences unnecessarily Right now we emit on every kick, but this is only necessary if something will ever be able to observe that the fence completed. If there are no refs, leave the fence alone and emit it another day. This also happens to work around an issue for the kick handler -- a kick can be a result of e.g. nouveau_bo_wait or explicit kick, or it can be due to lack of space in the pushbuf. We want the emit to happen in the current batch, so we want there to always be enough space. However an explicit kick could take the reserved space for the implicitly-triggered kick's fence emission if it happened right after. With the new mechanism, hopefully there's no way to cause two fences to be emitted into the same reserved space. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `47d11990b` (nouveau: make sure there's always room to emit a fence) Cc: mesa-stable@lists.freedesktop.org	2015-10-11 17:57:04 -04:00
Samuel Pitoiset	06abd1a25e	nvc0: make use of NVC0_COMPUTE_CLASS for GF110 In theory, GF110+ should also support NVC8_COMPUTE_CLASS but, in practice, an ILLEGAL_CLASS dmesg failure appears when using it. This fixes compute support and MP performance counters on GF110. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-10 22:11:03 +02:00
Kenneth Graunke	a23bdd1fae	i965/gs: Make MAX_GS_INPUT_VERTICES a #define in brw_context.h. For scalar VS, I'll need this in brw_fs.cpp as well. It seems silly to redeclare it in three places. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-10 11:40:19 -07:00
Kenneth Graunke	2953c3d761	i965/vs: Map scalar VS input locations properly; avoid tons of MOVs. Previously, we used nir_lower_io with the scalar type_size function, which mapped VERT_ATTRIB_* locations to...some numbers. Then, in fs_visitor::nir_setup_inputs(), we created temporaries indexed by those numbers, and emitted MOVs from the actual ATTR registers to those temporaries. Virtually all of these were copy propagated away, but it's still ugly. This patch reworks our input lowering to produce NIR lower_input intrinsics that properly index into the ATTR file, so we can access it directly. No changes in shader-db. v2: Fix unreachable() message (Ken), update commit message (Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-10 11:40:19 -07:00
Kenneth Graunke	6842ad7912	i965/vs: Fix a subtlety in the nr_attributes == 0 workaround. nr_attributes is used to compute first_non_payload_grf, which is the first register we're allowed to use for ordinary register allocation. The hardware requires us to read at least one pair of values, but we're completely free to overwrite that garbage register with whatever we like. Instead of altering nr_attributes, we should alter urb_read_length, which only affects the amount we ask the VF to read. This should save us a register in trivial cases (which admittedly isn't very useful). While we're at it, improve the explanation in the comments. v2: Actually do what I said (caught by Ilia). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-10 11:40:19 -07:00
Kenneth Graunke	031d350132	i965/vs: Unify URB entry size/read length calculations between backends. Both the vec4 and scalar VS backends had virtually identical URB entry size and read length calculations. We can move those up a level to backend-agnostic code and reuse it for both. Unfortunately, the backends need to know nr_attributes to compute first_non_payload_grf, so I had to store that in prog_data. We could use urb_read_length, but that's nr_attributes rounded up to a multiple of two, so doing so would waste a register in some cases. There's more code to be removed in the vec4 backend, but that will come in a follow-on patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-10 11:40:19 -07:00
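The register arithmetic discussed in the two URB commits above can be sketched as follows. This is a simplified model under stated assumptions: the read length is counted in pairs of attributes, each attribute occupies one payload register, and `payload_start` is a hypothetical parameter for the first register available to attributes. None of the function bodies are taken from the driver.

```python
def urb_read_length(nr_attributes):
    """Read length in attribute pairs; at least one pair is always read."""
    return max((nr_attributes + 1) // 2, 1)


def first_non_payload_grf(payload_start, nr_attributes):
    """First register free for allocation when counting real attributes."""
    return payload_start + nr_attributes


def first_non_payload_grf_from_read_length(payload_start, nr_attributes):
    """Same computation derived from the rounded-up read length instead."""
    return payload_start + 2 * urb_read_length(nr_attributes)


for n in (0, 1, 2, 3):
    exact = first_non_payload_grf(2, n)
    rounded = first_non_payload_grf_from_read_length(2, n)
    print(n, urb_read_length(n), exact, rounded, rounded - exact)
```

In this model the rounded variant wastes one register for odd attribute counts and two for zero attributes, which is why the commits keep `nr_attributes` separately and only adjust the read length for the zero-attribute workaround.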
Kenneth Graunke	a4e988f481	i965/cfg: Fix cfg_t::dump() when a block has no immediate dominator. Switch statements introduce a bogus loop with an unconditional break at the end of the loop, just before the while...so the while is unreachable and has no immediate dominator. v2: With less exuberance Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-10 11:40:19 -07:00
Emil Velikov	2496cfd771	docs: add news item and link release notes for 11.0.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-10-10 17:10:26 +01:00
Emil Velikov	55a8f072ea	docs: add sha256 checksums for 11.0.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `b4bfea0094`)	2015-10-10 17:10:26 +01:00
Emil Velikov	8337a31bcc	docs: add release notes for 11.0.3 Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (cherry picked from commit `914966befc`)	2015-10-10 17:10:26 +01:00
Chad Versace	82b324c24b	i965/gen8: Remove gen<8 checks in gen8 code Some assertions in gen8_surface_state.c checked for gen < 8. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-10-09 14:24:12 -07:00
Chad Versace	8a0c85b258	i965/gen9: Enable rep clears on gen9 The (gen < 9) check in brw_clear() was too broad. It disabled all types of fast color clears: a. singlesample rep clears b. singlesample MCS fast clears c. multisample MCS fast clears The MCS clears are still buggy, but the rep clear works well. So let's enable it. Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-10-09 14:24:12 -07:00
Chad Versace	dcd59a9e32	i965/gen9: Disable MCS for 1x color surfaces Fast color clears are disabled for gen9 (see the checks in brw_meta_fast_clear), so there is no reason to allocate the MCS and track its clear/resolve state. Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-10-09 14:24:12 -07:00
Roland Scheidegger	4c4ba5a8c3	tgsi: (trivial) kill c99-ism.	2015-10-09 23:12:14 +02:00
Marek Olšák	d695c676ea	program: remove _mesa_init_*_program wrappers They didn't do anything useful. Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:19 +02:00
Marek Olšák	092f0427dc	program: remove other unused functions Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	5042a3eef8	program: remove unused cloning and combining functions Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	c947a3a4c4	program: remove unused function _mesa_find_line_column Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	ee01942eb5	st/mesa: release the glsl_to_tgsi visitor after translation Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	e5073e8d0c	st/mesa: translate tessellation shaders into TGSI when we get them Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	897177020b	st/mesa: translate geometry shaders into TGSI when we get them Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	a907b5dd16	st/mesa: translate fragment shaders into TGSI when we get them Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	46021ace51	st/mesa: translate vertex shaders into TGSI when we get them The translate function is split into two: - translation to TGSI - creating the variant (TGSI transformations only) Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	de6a004035	st/mesa: fix glDrawPixels with a texture The samplers for DrawPixels data and the pixel map are assigned to slots which don't overlap with the existing sampler slots. The texture coordinates for the user texture are uploaded as a constant. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	f15bb3e633	st/mesa: implement DrawPixels shader transformation using tgsi_transform_shader Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	b55b986dc9	st/mesa: make Z/S drawpix shaders independent of variants, don't use Mesa IR v2 - there is no connection to user fragment shaders, so having these as shader variants makes no sense - don't use Mesa IR, use TGSI - don't create gl_fragment_program, just create the shader CSO v2: generate exactly the same shader as before to fix llvmpipe Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	f4ec81032b	st/mesa: implement glBitmap shader transformation using tgsi_transform_shader Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	3eedb63371	st/mesa: remove old emulation for VS and FS variants Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	c04e91a0e9	st/mesa: use TGSI utility to emulate features for FS variants Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	941721ee2a	st/mesa: use TGSI utility to emulate features for VS variants Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	4bbe418b4b	st/mesa: decrease the size of st_vertex_program The other variables can't be moved. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	4a21edf067	st/mesa: inline st_prepare_vertex_program No other shader stage has a "prepare" function. This will allow removing some variables from st_vertex_program. Also, prepare_fragment_program was a dead prototype. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	c80c19a9d5	tgsi/scan: add info about declared samplers (v2) v2: get it from declarations, not instructions	2015-10-09 22:02:18 +02:00
Marek Olšák	417927ebde	tgsi: add a utility for emulating some GL features st/mesa will use this, but drivers can use it too. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Marek Olšák	9ea2a86809	mesa: call ProgramStringNotify for fixed-function vertex programs Drivers weren't notified about this at all. This allows disabling on-demand compilation in drivers. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2015-10-09 22:02:18 +02:00
Rob Clark	c9b982b72d	glsl: move shader_enums into nir First step towards inverting the dependency between glsl and nir (so nir can be used without glsl). Also solves this issue with 'make distclean' Making distclean in mesa make[2]: Entering directory '/mnt/sdb1/Src64/Mesa-git/mesa/src/mesa' Makefile:2486: ../glsl/.deps/shader_enums.Plo: No such file or directory make[2]: *** No rule to make target '../glsl/.deps/shader_enums.Plo'. Stop. make[2]: Leaving directory '/mnt/sdb1/Src64/Mesa-git/mesa/src/mesa' Makefile:684: recipe for target 'distclean-recursive' failed make[1]: *** [distclean-recursive] Error 1 make[1]: Leaving directory '/mnt/sdb1/Src64/Mesa-git/mesa/src' Makefile:615: recipe for target 'distclean-recursive' failed make: *** [distclean-recursive] Error 1 Reported-by: Andy Furniss <adf.lists@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-10-09 15:03:28 -04:00
Francisco Jerez	7e441bf025	mesa: Get rid of texture-dependent image unit derived state. The point is to avoid having to re-validate all image units when _NEW_TEXTURE is flagged, which can be expensive if the driver exposes a large number of image units. This has been reported to fix a 36% performance regression in the Synmark2 Multithread benchmark on the i965 driver which exposes 192 image units. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91788 Reported-by: Wendy Wang <wendy.wang@intel.com> Tested-by: Ye Tian <yex.tian@intel.com> CC: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-09 17:49:01 +03:00
Francisco Jerez	2d97a78b37	i965: Use _mesa_is_image_unit_valid() instead of gl_image_unit::_Valid. gl_image_unit::_Valid will be removed in a future commit. Tested-by: Ye Tian <yex.tian@intel.com> CC: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-09 17:48:52 +03:00
Francisco Jerez	25d3338be3	mesa: Skip redundant texture completeness checking during image validation. The call to _mesa_test_texobj_completeness() is unnecessary if the texture is already known to be complete. If the texture object is dirtied in the meantime _BaseComplete and _MipmapComplete will be reset to false. _mesa_is_image_unit_valid() will start to be called more frequently in a future commit, so it seems desirable to avoid the unnecessary work. Tested-by: Ye Tian <yex.tian@intel.com> CC: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-09 17:48:46 +03:00
Francisco Jerez	5152db415f	mesa: Expose function to calculate whether a shader image unit is valid. A future commit will remove all texture object-dependent derived state from the image unit struct to make validation unnecessary on texture state changes. Instead of checking gl_image_unit::_Valid drivers will be required to call this function when needed to find out whether an image unit is in a valid state and whether access from the shader is allowed. Tested-by: Ye Tian <yex.tian@intel.com> CC: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-09 17:48:28 +03:00
Francisco Jerez	5346c11670	i965: Don't tell the hardware about our UAV access. The hardware documentation relating to the UAV HW-assisted coherency mechanism and UAV access enable bits is scarce and sometimes contradictory, and there's quite some guesswork behind this commit, so let me summarize the background first: HSW and later hardware have infrastructure to support a stricter form of data coherency between shader invocations from separate primitives. The mechanism is controlled by the "Accesses UAV" bits on 3DSTATE_VS, _HS, _DS, _GS and _PS (or _PS_EXTRA on BDW+), and the "UAV Coherency Required" bit on the 3DPRIMITIVE command. Regardless of whether "UAV Coherency Required" is set, the hardware fixed-function units will increment a per-stage semaphore for each request received if "Accesses UAV" is set for the same or any lower stage. An implicit DC flush is emitted by the lowermost stage with "Accesses UAV" set once it's done processing the request, this also happens regardless of the value of "UAV Coherency Required". The completion of the DC flush will cause the same stage and all previous ones to decrement the semaphore, marking the UAV accesses for the primitive as coherent with L3. The "UAV Coherency Required" 3DPRIMITIVE bit will cause a pipeline stall before any threads are dispatched for the first FF stage with "Accesses UAV" set until the semaphore is cleared for the same stage. Effectively this guarantees that UAV memory accesses performed by previous primitives from any stage will be strictly ordered (and thanks to the implicit DC flush visible in memory) with UAV accesses from the following primitives. 
None of this is required by the usual image, atomic counter and SSBO GL APIs which have very relaxed cross-primitive coherency and ordering requirements, so we don't actually ever set the "UAV Coherency Required" bit -- Ordering with respect to shader invocations from previous stages on the same primitive where there is a data dependency is of course already guaranteed as the spec requires, regardless of this mechanism being enabled. We do set the "Accesses UAV" bits though since my commit `ac7664e493` (which this patch partially reverts), mainly because of comments like the following from the BDW PRM: > 3DSTATE_GS >[...] > 12 Accesses UAV > Format: Enable > This field must be set when GS has a UAV access. There are similar comments in the documentation for the other 3DSTATE_*S commands. The "must" part is misleading and unjustified AFAIK. Most of the "Accesses UAV" bits don't seem to have any side effects other than the implicit DC flushes and the related book-keeping in anticipation for a subsequent primitive with "UAV Coherency Required" set, so in most cases they are unnecessary and may incur a performance penalty. There is an exception though. On Gen8+ the PS_EXTRA UAV access bit influences the calculation of the PS UAV-only and ThreadDispatchEnable signals which on previous generations were set explicitly by the driver, so we cannot always avoid enabling it on the PS stage. The primary motivation for this change is that in fact the hardware coherency mechanism is buggy and will cause a rather non-deterministic hang on Gen8 when VS is the only stage with "Accesses UAV" set and the processing of a request terminates immediately after the implicit DC flush is sent for a previous primitive with no additional vertices being emitted for the second primitive, which will cause the hardware to skip sending a second DC flush and cause the VS to stall indefinitely waiting for a response from the DC (BDWGFX HSD 1912017).
This hardware bug can be reproduced on current master with the spec@arb_shader_image_load_store@host-mem-barrier@Indirect/RaW piglit subtest (if you have the patience to run it a few dozen times). The proposed workaround is to insert CS STALLs speculatively between 3DPRIMITIVE commands when "Accesses UAV" is enabled for the VS stage only. Because this would affect one of the hottest paths in the driver and likely decrease performance even further due to the unnecessary serialization, and because we don't actually need the implicit DC flushes, it seems better to just disable them. Cc: 11.0 <mesa-stable@lists.freedesktop.org>	2015-10-09 17:48:26 +03:00
Connor Abbott	bb59ba8634	nir/instr_set: remove unnecessary check in nir_instrs_equal() This was originally added to nir_instrs_equal() instead of nir_instr_can_cse() incorrectly, but this was fixed when moving to the instruction set API (as it had to be, otherwise hashing wouldn't work). Now, this is dead code since instr_can_rewrite() will only return true for texture instructions that use an index, so we can turn the check into an assert. This also means that now nir_instrs_equal(instr, instr) will always return true unless it assert-fails. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-10-09 10:15:28 -04:00
Connor Abbott	bf5f931aee	nir: make nir_instrs_equal() static This was previously tied to CSE, since it would only work for instructions where nir_can_cse() (now instr_can_rewrite()) returned true. Now that CSE uses the instruction set abstraction which only uses this internally, we can make it local to nir_instr_set.c. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-10-09 10:15:15 -04:00
Connor Abbott	e8308d0523	nir/cse: use the instruction set API This replaces an O(n^2) algorithm with an O(n) one, while allowing us to import most of the infrastructure required for GVN. The idea is to walk the dominance tree depth-first, similar to when converting to SSA, and remove the instructions from the set when we're done visiting the sub-tree of the dominance tree so that the only instructions in the set are the instructions that dominate the current block. No piglit regressions. No shader-db changes. Compilation time for full shader-db: Difference at 95.0% confidence -35.826 +/- 2.16018 -6.2852% +/- 0.378975% (Student's t, pooled s = 3.37504) v2: - rebase on start_block removal - remove useless state struct - change commit message Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-10-09 10:14:42 -04:00
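The dominance-tree walk described in the CSE commit above can be sketched in Python. This is a simplified model, not the NIR implementation: instructions are hypothetical `(dest, op, srcs)` tuples in SSA form, a plain dictionary stands in for the hashed instruction set, and the `Block` type with its `children` list (the dominance-tree children) is invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    instrs: list                                   # (dest, op, srcs) tuples
    children: list = field(default_factory=list)   # dominance-tree children


def cse(block, available=None, rewrites=None):
    """Remove redundant instructions, visiting the dominance tree depth-first."""
    available = {} if available is None else available
    rewrites = {} if rewrites is None else rewrites
    added, kept = [], []
    for dest, op, srcs in block.instrs:
        # Apply earlier rewrites so equal expressions hash to the same key.
        srcs = tuple(rewrites.get(s, s) for s in srcs)
        key = (op, srcs)
        if key in available:
            rewrites[dest] = available[key]   # reuse the dominating value
        else:
            available[key] = dest
            added.append(key)
            kept.append((dest, op, srcs))
    block.instrs = kept
    for child in block.children:
        cse(child, available, rewrites)
    # Leaving this sub-tree: its instructions no longer dominate what follows.
    for key in added:
        del available[key]
    return rewrites


left = Block([("b", "add", ("x", "y")), ("c", "mul", ("b", "z"))])
right = Block([("d", "mul", ("a", "z"))])
entry = Block([("a", "add", ("x", "y"))], [left, right])
print(cse(entry), left.instrs, right.instrs)
```

Each instruction is hashed once and looked up once, which is where the roughly linear cost comes from. In the example, `d` in the right block is kept even though the left block computes the same product, because the left block does not dominate the right one.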
Connor Abbott	523a28d3fe	nir: add an instruction set API This will replace direct usage of nir_instrs_equal() in the CSE pass, which replaces an O(n^2) algorithm with an effectively O(n) one. It'll also be useful for implementing GVN on top of GCM. v2: - Add texture support. - Add more comments. - Rename instr_can_hash() to instr_can_rewrite() since it's really more about whether its uses can be rewritten, and it's implicitly used by nir_instrs_equal() as well. - Rename nir_instr_set_add() to nir_instr_set_add_or_rewrite() (Jason). - Make the HASH() macro less magical (Topi). - Rewrite the commit message. v3: - For sorting phi sources, use a VLA, store pointers to the sources, and compare the predecessor pointer directly (Jason). Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-10-09 10:14:35 -04:00
Connor Abbott	005c2efb7b	nir: constify instruction comparison functions v2: rebase, don't constify nir_srcs_equal() as it's pass-by-value anyways Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-10-09 10:14:28 -04:00
Connor Abbott	d6bc35934f	nir: constify nir_ssa_alu_instr_src_components() Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-10-09 10:14:20 -04:00
Connor Abbott	20d6d812dc	nir: split out instruction comparison functions Right now nir_instrs_equal() is tied pretty tightly to CSE, but we're going to introduce the idea of an instruction set and tie it to that instead. In anticipation of that, move this into its own file where we'll add the rest of the instruction set implementation later. v2: Rebase on texture support. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-10-09 10:13:27 -04:00
Neil Roberts	da361acd1c	i965/fs: Handle non-const sample number in interpolateAtSample If a non-const sample number is given to interpolateAtSample it will now generate an indirect send message with the sample ID similar to how non-const sampler array indexing works. Previously non-const values were ignored and instead it ended up using a constant 0 value. The generator will try to determine if the sample ID is dynamically uniform via nir_src_is_dynamically_uniform. If not it will query the pixel interpolator in a loop, once for each different live sample number. The next live sample number is found using emit_uniformize. If multiple live channels have the same sample number then they will be handled in a single iteration of the loop. The loop is necessary because the indirect send message doesn't seem to have a way to specify a different value for each fragment. This fixes the following two Piglit tests: arb_gpu_shader5-interpolateAtSample-nonconst arb_gpu_shader5-interpolateAtSample-dynamically-nonuniform v2: Handle dynamically non-uniform sample ids. v3: Remove the BREAK instruction and predicate the WHILE directly. Make the tokens arrays const. (Matt Turner) v4: Iterate over the live channels instead of each possible sample number. v5: Don't special case immediate values in brw_pixel_interpolator_query. Make a better wrapper for the function to set up the PI send instruction. Ensure that the SHL instructions are scalar. (Francisco Jerez). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-10-09 15:13:40 +02:00
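The loop over live sample numbers described in the interpolateAtSample commit above can be modelled in Python. This is only a simulation of the control flow: `sample_ids` holds one hypothetical sample number per live channel, `query` stands in for the pixel interpolator message, and picking the lowest pending channel plays the role that the commit assigns to emit_uniformize.

```python
def query_per_sample(sample_ids, query):
    """Issue one query per distinct live sample number.

    Returns the per-channel results and the number of loop iterations.
    """
    results = [None] * len(sample_ids)
    pending = set(range(len(sample_ids)))
    iterations = 0
    while pending:
        # Take the sample number of the first channel still waiting.
        sample = sample_ids[min(pending)]
        # Every pending channel with that sample number is served together.
        batch = [ch for ch in pending if sample_ids[ch] == sample]
        value = query(sample)
        for ch in batch:
            results[ch] = value
        pending.difference_update(batch)
        iterations += 1
    return results, iterations


results, iterations = query_per_sample([2, 0, 2, 1, 0, 2, 1, 0], lambda s: s * 10)
print(results, iterations)
```

The iteration count equals the number of distinct sample numbers among the live channels, so a dynamically uniform input finishes after a single pass.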
Neil Roberts	728d7bc85f	i965: Add a second successor to BRW_OPCODE_WHILE It is possible to directly predicate the WHILE instruction. In this case there will be a second successor block because the execution can resume from the instruction after the loop. This will be used in a subsequent patch. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-09 15:13:22 +02:00
Neil Roberts	886d46b089	nir: Add a function to determine if a source is dynamically uniform Adds nir_src_is_dynamically_uniform which returns true if the source is known to be dynamically uniform. This will be used in a later patch to add a workaround for cases that only work with dynamically uniform sources. Note that the function is not definitive, it can return false negatives (but not false positives). Currently it only detects constants and uniform accesses. It could easily be extended to include more cases. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-09 15:10:40 +02:00
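A conservative check of the kind described in the commit above might look like the Python sketch below. The dictionary-based source representation and the `kind` values are hypothetical and chosen only for illustration; the point is that anything not recognised is reported as "not known to be uniform", so the answer can be a false negative but never a false positive.

```python
def is_dynamically_uniform(src):
    """Return True only when the source is certainly dynamically uniform."""
    if src["kind"] == "constant":
        return True
    if src["kind"] == "uniform":
        return True
    # Everything else is treated as possibly divergent, even when a deeper
    # analysis could prove it uniform.
    return False


const = {"kind": "constant", "value": 3}
uniform = {"kind": "uniform", "name": "u_sample"}
alu = {"kind": "alu", "op": "iadd", "srcs": [const, const]}

print(is_dynamically_uniform(const))    # recognised as uniform
print(is_dynamically_uniform(uniform))  # recognised as uniform
print(is_dynamically_uniform(alu))      # false negative: const + const
```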
Samuel Pitoiset	7129cbf5f4	nvc0: move HW SM queries to nvc0_query_hw_sm.c/h files Global performance counters (PCOUNTER) will be added to nvc0_query_hw_pm.c/h files. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-10-09 14:09:57 +02:00
Samuel Pitoiset	224fec05ea	nvc0: move HW queries to nvc0_query_hw.c/h files Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-10-09 14:09:57 +02:00
Samuel Pitoiset	77b6990d14	nvc0: move SW queries to nvc0_query_sw.c/h files Loosely based on freedreno driver. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-10-09 14:09:57 +02:00
Samuel Pitoiset	0678530b9e	nvc0: move nvc0_so_target_save_offset() to its correct location Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-10-09 14:09:57 +02:00
Samuel Pitoiset	0644196ab1	nvc0: add a header file for nvc0_query This will allow splitting SW and HW queries in an upcoming patch. While we are at it, make use of nvc0_query struct instead of pipe_query. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-10-09 14:09:57 +02:00
Samuel Iglesias Gonsalvez	3da58730ee	main: fix length of values written to glGetProgramResourceiv() for ACTIVE_VARIABLES Return the number of values written. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-10-09 08:13:55 +02:00
Samuel Iglesias Gonsalvez	d0992fa15a	main: buffer array variables can have array size of 0 if they are unsized From ARB_program_interface_query: For the property ARRAY_SIZE, a single integer identifying the number of active array elements of an active variable is written to <params>. The array size returned is in units of the type associated with the property TYPE. For active variables not corresponding to an array of basic types, the value one is written to <params>. If the variable is a shader storage block member in an array with no declared size, the value zero is written to <params>. v2: - Unsized arrays of arrays have an array size different than zero v3: - Arrays and unsized arrays will have an array_stride > 0. Use it instead of is_unsized_array flag (Timothy). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-10-09 08:13:55 +02:00
Samuel Iglesias Gonsalvez	66ca8e6632	main: consider that unsized arrays have at least one active element From ARB_shader_storage_buffer_object: "When using the ARB_program_interface_query extension to enumerate the set of active buffer variables, only the first element of arrays (sized or unsized) will be enumerated" _mesa_program_resource_array_size() is used when getting the name (and name length) of the active variables. When it is an unsized array, we want to indicate it has one active element so the returned name would have "[0]" at the end. v2: - Use array_stride > 0 and array_elements == 0 to detect unsized arrays. Because of that, we don't need is_unsized_array flag (Timothy) Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-10-09 08:13:55 +02:00
Samuel Iglesias Gonsalvez	77c0b64ce3	main: fix TOP_LEVEL_ARRAY_SIZE and TOP_LEVEL_ARRAY_STRIDE When the active variable is an array which is already a top-level shader storage block member, don't return its array size and stride when querying TOP_LEVEL_ARRAY_SIZE and TOP_LEVEL_ARRAY_STRIDE respectively. Fixes the following 12 dEQP-GLES31 tests: dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.mat3x4 dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.row_major_mat3x4 dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.column_major_mat3x4 dEQP-GLES31.functional.ssbo.layout.single_basic_array.packed.mat3x4 dEQP-GLES31.functional.ssbo.layout.single_basic_array.packed.row_major_mat3x4 dEQP-GLES31.functional.ssbo.layout.single_basic_array.packed.column_major_mat3x4 dEQP-GLES31.functional.ssbo.layout.single_basic_array.std140.mat3x4 dEQP-GLES31.functional.ssbo.layout.single_basic_array.std140.row_major_mat3x4 dEQP-GLES31.functional.ssbo.layout.single_basic_array.std140.column_major_mat3x4 dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.mat3x4 dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat3x4 dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.column_major_mat3x4 v2: - Fix check when the shader storage block is instanced - Write auxiliary function to do the check. v3: - Check if full_instanced_name is NULL just after allocation (Ilia) - Remove () from one strcmp() in the if statement (Ilia) Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-09 08:13:49 +02:00
Samuel Iglesias Gonsalvez	5be9bf2746	main: fix goto in program_resource_top_level_array_stride Use found_top_level_array_stride instead of found_top_level_array_size. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-09 08:12:10 +02:00
Tapani Pälli	d8d0e4a81e	mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span Patch adds missing type (used with NV_read_depth) so that it gets handled correctly. This fixes errors seen with following CTS test: ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-10-09 09:11:14 +03:00
Brian Paul	7d7dd18711	mesa,meta: move gl_texture_object::TargetIndex initializations Before, we were unconditionally assigning the TargetIndex field in _mesa_BindTexture(), even if it was already set properly. Now we initialize TargetIndex wherever we initialize the Target field, in _mesa_initialize_texture_object(), finish_texture_init(), etc. v2: also update the meta_copy_image code. In make_view() the view_tex_obj->Target field was set, but not the TargetIndex field. Also, remove a second, redundant assignment to view_tex_obj->Target. Add sanity check assertions too. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2015-10-08 13:53:33 -06:00
Brian Paul	d61f492aba	mesa: remove unused _mesa_create_nameless_texture() Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2015-10-08 13:53:33 -06:00
Brian Paul	b373c77693	mesa: remove unneeded error check in create_textures() Callers of create_texture() will either pass target=0 or a validated GL texture target enum so no need to do another error check inside the loop. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2015-10-08 13:53:33 -06:00
Kristian Høgsberg Kristensen	c71f0d45e6	i965: Link compiler unit tests to libi965_compiler.la We can now link the unit tests against just libi965_compiler.la. This lets us drop a lot of DRI driver dependencies, but we still pull in all of libmesa and more. This also provides a few standalone users of libi965_compiler.la, which will help us avoid accidentally using i965_dri.so functions from the compiler. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-08 12:15:03 -07:00
Kristian Høgsberg Kristensen	08d890d3bb	i965: Break out backend compiler to its own library This introduces a new libtool helper library, libi965_compiler.la. This library is moderately self-contained, but still needs to link to all of libmesa.la among other things. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-08 12:15:03 -07:00
Kristian Høgsberg Kristensen	9a2573e5fc	i965/cs: Get max_cs_threads from brw_compiler devinfo Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-08 12:15:03 -07:00
Kristian Høgsberg Kristensen	ee0f0108c8	i965: Move brw_get_shader_time_index() call out of emit functions brw_get_shader_time_index() is all tangled up in brw_context state and we can't call it from the compiler. Thanks to Jason's recent refactoring, we can just get the index and pass it to the emit functions instead. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-08 12:15:03 -07:00
Kristian Høgsberg Kristensen	ffc841cae5	i965: Move brw_select_clip_planes() to brw_shader.cpp We call this from the compiler so move it to brw_shader.cpp. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-08 12:15:03 -07:00
Kristian Høgsberg Kristensen	365e5d7892	i965: Use util_next_power_of_two() for brw_get_scratch_size() This function computes the next power of two, but at least 1024. We can do that by bitwise or'ing in 1023 and calling util_next_power_of_two(). We use brw_get_scratch_size() from the compiler so we need it out of brw_program.c. We could move it to brw_shader.cpp, but let's make it a small inline function instead. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-08 12:15:03 -07:00
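The bit trick from `365e5d7892` can be sketched in a few lines. This is a minimal stand-alone illustration, not the Mesa source: `next_power_of_two()` is a local stand-in for `util_next_power_of_two()` (assumed here to return the smallest power of two greater than or equal to its argument), and `scratch_size()` is a hypothetical name for the inline helper.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for util_next_power_of_two(): smallest power of two >= x. */
static uint32_t next_power_of_two(uint32_t x)
{
    assert(x <= 0x80000000u); /* a larger result would not fit in 32 bits */
    if (x <= 1)
        return 1;
    x--;
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return x + 1;
}

/* OR-ing in 1023 sets the low ten bits, so the argument is never below 1023
 * and the rounded result is never below 1024. */
static uint32_t scratch_size(uint32_t size)
{
    return next_power_of_two(size | 1023u);
}
```

Under these assumptions, sizes from 0 to 1023 all map to 1024, and 3000 maps to 4096. One consequence of the OR is that an input that is already a power of two of at least 1024 moves up to the next power (1024 becomes 2048), because `1024 | 1023` is 2047.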
Kristian Høgsberg Kristensen	cc4683992b	i965: Move brw_mark_surface_used() to brw_shader.cpp brw_program.c won't be part of the compiler library, but we need brw_mark_surface_used() in the compiler. Move to brw_shader.cpp. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-08 12:15:03 -07:00
Kristian Høgsberg Kristensen	469d0e449b	i965/cs: Split out helper for building local id payload The initial motivation for this patch was to avoid calling brw_cs_prog_local_id_payload_dwords() in gen7_cs_state.c from the compiler. This commit ends up refactoring things a bit more so as to split out the logic to build the local id payload to brw_fs.cpp. This moves the payload building closer to the compiler code that uses the payload layout and makes it available to other users of the compiler. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-08 12:15:02 -07:00
Kristian Høgsberg Kristensen	4f33700f5a	i965: Move brw_link_shader() and friends to new file brw_link.cpp We want to use the rest of brw_shader.cpp with the rest of the compiler without pulling in the GLSL linking code. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-08 12:14:44 -07:00
Kristian Høgsberg Kristensen	99ca2256c1	i965: Configure bufmgr debug options from intel_screen.c We need the debug flag parsing and INTEL_DEBUG in the compiler, but we don't want the dependency on bufmgr (libdrm_intel) in there. Move to intel_screen.c. There are now only two lines left in brw_process_intel_debug_variable(), but we keep it in intel_debug.h to avoid having to expose 'debug_control' as a global variable. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-08 12:13:31 -07:00
Kristian Høgsberg Kristensen	04158fb0f6	util: Move DRI parse_debug_string() to util We want to use intel_debug.c in code that doesn't link to dri common. v2: Remove unnecessary stddef.h include (Topi), use util/debug.h in all DRI driver and remove driParseDebugString() (Iago). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-08 12:13:31 -07:00
Kristian Høgsberg Kristensen	ba71d581ae	i965: Move brw_dump_ir() out of brw_*_emit() functions We move these calls one level up into the codegen functions. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-10-08 12:13:31 -07:00
Emil Velikov	1fda56cdb2	gallium/ddebug: add missing dd_util.h to sources list Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-10-08 18:13:24 +01:00
Emil Velikov	62741ff052	gallium/ddebug: automake: sort sources alphabetically Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2015-10-08 18:13:24 +01:00
Jason Ekstrand	9c528f5dfa	nir/sweep: Reparent the shader name Previously the name of the nir shader was being freed prematurely during nir_sweep. Since `756613ed35` the name was later being used to generate filenames for the optimiser debug output and these would end up with garbage from the dangling pointer. Co-authored-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-08 08:20:31 -07:00
Jan Vesely	c8031a879a	c11/threads: initialize timeout structure Signed-off-by: Jan Vesely <jano.vesely@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-08 14:05:57 +01:00
Boyan Ding	89ae41ab4c	docs/relnotes: document EGL_KHR_create_context on llvmpipe and softpipe Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>	2015-10-08 14:05:36 +01:00
Iago Toral Quiroga	1efbb8151b	i965/gs/gen6: Maximum allowed size of SEND messages is 15 (4 bits) Commit `d48ac93066` addressed this for VS, but we forgot to do the same for URB writes generated by the gen6 GS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-08 11:28:16 +02:00
Iago Toral Quiroga	3141906fa3	i965: Define FIRST_SPILL_MRF and FIRST_PULL_LOAD_MRF only once and in one place That should make tracking where we do spills and pull loads a bit easier. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-08 11:28:16 +02:00
Iago Toral Quiroga	36e82b137d	i965: make pull constant loads in gen6 start at MRFs 16/17 So they do not conflict with our (un)spills (MRF 21..23) or our URB writes (MRF 1..15) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-08 11:28:16 +02:00
Iago Toral Quiroga	0c2add7751	i965: Fix remove_duplicate_mrf_writes so it can handle 24 MRFs in gen6 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-08 11:28:16 +02:00
Tapani Pälli	aee28a0aa3	mesa: include bad type in error string of _mesa_pack_depth_span Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-08 09:25:16 +03:00
Tapani Pälli	4e7fd66cf0	glsl: add varyings to resource list only with SSO Varyings can be considered inputs or outputs of a program only when SSO is in use. With multi-stage programs, inputs contain only inputs for first stage and outputs contains outputs of the final shader stage. I've tested that fix works for Assault Android Cactus (demo version) and does not cause Piglit or CTS regressions in glGetProgramiv tests. Following ES 3.1 CTS separate shader tests that do query properties of varyings in SSO shader programs pass: ES31-CTS.program_interface_query.separate-programs-vertex ES31-CTS.program_interface_query.separate-programs-fragment Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92122	2015-10-08 07:43:11 +03:00
Jason Ekstrand	6ad9ebb073	mesa: Correctly handle GL_BGRA_EXT in ES3 format_and_type checks The EXT_texture_format_BGRA8888 extension (which mesa supports unconditionally) adds a new format and internal format called GL_BGRA_EXT. Previously, this was not really handled at all in _mesa_es3_error_check_format_and_type. When the checks were tightened in commit `f15a7f3c`, we accidentally tightened things too far and GL_BGRA_EXT would always cause an error to be thrown. There were two primary issues here. The first is that _mesa_es3_effective_internal_format_for_format_and_type didn't handle the GL_BGRA_EXT format. The second is that it blindly uses _mesa_base_tex_format, which returns GL_RGBA for GL_BGRA_EXT. This commit fixes both of these issues as well as adds explicit checks that GL_BGRA_EXT is only ever used with GL_BGRA_EXT and GL_UNSIGNED_BYTE. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92265 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-10-07 20:32:53 -07:00
Emil Velikov	bbf728f11b	Revert "mesa: enable KHR_debug for ES contexts" This reverts commit `b69cfbdf18`. This isn't quite baked yet. Seems that despite building the ES piglits, none of them got executed.	2015-10-07 21:49:50 +01:00
Matt Turner	164c8277f0	egl/dri2: Properly dereference array. Fixes a regression that broke EGL since commit `858f2f2ae6` Author: Emil Velikov <emil.l.velikov@gmail.com> Date: Sun Sep 13 12:25:27 2015 +0100 egl/dri2: ease srgb __DRIconfig conditionals	2015-10-07 11:48:49 -07:00
Marek Olšák	13e69805ea	radeonsi: fix a GS hang on VI Broken by one of the cleanups: `0d46c3bc9d` Not applicable to stable. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-10-07 19:18:50 +02:00
Marek Olšák	5749676d03	radeonsi: remove TC L2 cache flush for index buffers on VI Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-10-07 19:18:50 +02:00
Brian Paul	6ed8fd3d67	svga: whitespace fixes in svga_sampler_view.c	2015-10-07 08:45:56 -06:00
Brian Paul	70c4cde453	svga: whitespace fixes in svga_resource_buffer.c	2015-10-07 08:45:56 -06:00
Stefan Dösinger	a2bc4a7b04	mesa: Remove GL_ARB_sampler_object depth compare error checking. Version 3: Simplify the code comment, word wrap commit description. Version 2: Return GL_FALSE if ARB_shadow is unsupported instead of pretending to store the value as suggested by Brian Paul. This fixes a GL error warning on r200 in Wine. The GL_ARB_sampler_objects extension does not specify a dependency on GL_ARB_shadow or GL_ARB_depth_texture for setting the depth texture compare mode and function. Silently ignore attempts to change these settings. They won't matter without a depth texture being assigned anyway. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-07 08:45:56 -06:00
Brian Paul	2bad030ac9	svga: round UBO constant buffer size up/down to multiple of 16 bytes The svga3d device requires constant buffers to be a multiple of 16 bytes in size. OpenGL UBOs may not fit that restriction. As a work-around, round the size up if possible, else round down. Note that this patch only affects UBO constant buffers (index 1 or higher), not the 0th/default constant buffer. Fixes the game Grim Fandango Remastered. VMware bug 1510130. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-10-07 08:45:56 -06:00
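The rounding rule in `2bad030ac9` might look roughly like the sketch below. The commit message does not say what "if possible" means; this example assumes it means the rounded-up range still fits inside the underlying buffer. The function `align_cbuf_size` and its parameters are hypothetical and are not taken from the svga driver.

```c
#include <stdint.h>

/*
 * Round a constant buffer binding size to a multiple of 16 bytes.
 * Assumption: rounding up is "possible" when offset + rounded size does not
 * run past the end of the backing buffer; otherwise round down.
 */
static uint32_t align_cbuf_size(uint32_t size, uint32_t offset,
                                uint32_t buffer_size)
{
    uint32_t up = (size + 15u) & ~15u;   /* next multiple of 16 */
    uint32_t down = size & ~15u;         /* previous multiple of 16 */

    /* 64-bit arithmetic keeps the bounds check from wrapping around. */
    if ((uint64_t)offset + up <= buffer_size)
        return up;
    return down;
}
```

For a 20-byte binding at offset 0 of a 64-byte buffer this returns 32; if the buffer itself is only 20 bytes long it falls back to 16.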
Emil Velikov	4ea5ed9f51	egl/dri2: enable EGL_KHR_gl_colorspace for swrast No driver changes needed for softpipe/llvmpipe - things just work. v2: Whitespace fixes. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Boyan Ding <boyan.j.ding@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-10-07 15:18:03 +01:00
Emil Velikov	858f2f2ae6	egl/dri2: ease srgb __DRIconfig conditionals One can simplify the if-else chain by declaring the driconfigs as a two-element array, whilst using srgb as an index to the correct entry. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-10-07 15:17:57 +01:00
Emil Velikov	b69cfbdf18	mesa: enable KHR_debug for ES contexts Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-10-07 15:08:50 +01:00
Matthew Waters	70643a1389	main/get: make KHR_debug enums available everywhere Move all the enums but CONTEXT_FLAGS. The spec seems quite explicit about the latter (wrt OpenGL ES) "In OpenGL ES versions prior to and including ES 3.1 there is no CONTEXT_FLAGS state and therefore the CONTEXT_FLAG_DEBUG_BIT cannot be queried." v2 [Emil Velikov] Rebase. v3 [Emil Velikov] Drop the CONTEXT_FLAGS hunk - not applicable for GLES Signed-off-by: Matthew Waters <ystreet00@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-10-07 15:07:01 +01:00
Matthew Waters	ae6ff72f5a	glapi: add function pointers for KHR_debug for gles v2 [Emil Velikov] - Rebase. - Correct version in gles11 dispatch_sanity. - Move the extension enable to a separate patch. Signed-off-by: Matthew Waters <ystreet00@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-10-07 15:07:01 +01:00
Varad Gautam	deb1765ec6	egl: move memcpy to bring conf->base operations together Signed-off-by: Varad Gautam <varadgautam@gmail.com> Suggested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-10-07 15:05:28 +01:00
Varad Gautam	f988eff379	egl: restore surface type before linking config to its display commit `c2c2e9a` (egl: implement EGL_KHR_gl_colorspace (v2)) leaves _EGLConfig->SurfaceType set incorrectly before calling _eglLinkConfig(), and the bad value is passed around to platform_android. set it to zero as earlier. v2: Set SurfaceType to 0, rather than surface_type (Suggested by Emil) Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91596 Signed-off-by: Varad Gautam <varadgautam@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-10-07 15:05:20 +01:00
Ilia Mirkin	47d11990b2	nouveau: make sure there's always room to emit a fence I started seeing a lot of situations on nv30 where fence emission wouldn't fit into the previous buffer (causing assertions). This ensures that whenever checking for space, we always leave a bit of extra room for the fence emission commands. Adjusts the nv30 and nvc0 fence emission logic to bypass the space checking as well. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-10-07 04:30:05 -04:00
Boyan Ding	64d9d4b730	vc4: use nir two-sided-color lowering Similar to `9ffc1049ca` (freedreno/ir3: use nir two-sided-color lowering). No piglit regression. Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-10-06 16:34:07 -07:00
Eric Anholt	b6cd39fc47	vc4: Fix a leak of the last color read/write surface on context destroy.	2015-10-06 16:32:03 -07:00
Eric Anholt	922e0680f9	vc4: Fix a memory leak in the simulator case. We validate per draw call, and need to free the shader per draw call, too.	2015-10-06 16:29:14 -07:00
Mark Janes	3861010213	mesa: remove unneeded #include of colormac.h Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-06 12:36:32 -07:00
Mark Janes	3475b68abd	radeon/r200: remove unneeded #include of colormac.h Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-06 12:36:32 -07:00
Mark Janes	eb6b80842f	i965: remove unneeded #include of colormac.h Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-06 12:36:32 -07:00
Mark Janes	83f9f911b2	i915: remove unneeded #include of colormac.h Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-06 12:36:32 -07:00
Ville Syrjälä	3bcc780126	i915: Drop broken front_buffer_reading/drawing optimization Bring the following commit over to i915: commit `ec542d7457` Author: Eric Anholt <eric@anholt.net> Date: Mon Mar 3 10:43:10 2014 -0800 i965: Drop broken front_buffer_reading/drawing optimization. Not sure if it might fix anything, but since the i965 and i915 used to share a bunch of that code, it would seem reasonable the same problems could be present in the i915 code still, and the i965 approach is well tested by now so bringing it over seems fairly safe. No piglit regressions on 855. v2: Rebase on _mesa_is_front_buffer_* refactor. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-06 11:36:37 -07:00
Ian Romanick	ea8b77e892	mesa/i965: Refactor brw_is_front_buffer_{drawing,reading} to common code There are multiple similar implementations of these functions, and a later patch was going to add another. v2: Move removing intel_framebuffer to a different patch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-06 11:36:37 -07:00
Ian Romanick	5c4ef9f1d2	st/mesa: Don't override NewFramebuffer just to call _mesa_new_framebuffer v2: Since state_tracker does not call _mesa_init_driver_functions, we need to initialize the dd::NewFramebuffer pointer to _mesa_new_framebuffer here. Suggested by Brian. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-06 11:36:37 -07:00
Ian Romanick	df75babf74	radeon: Don't override NewFramebuffer just to call _mesa_new_framebuffer Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-10-06 11:36:32 -07:00
Ian Romanick	e32a6590a4	i915: Don't override NewFramebuffer just to call _mesa_new_framebuffer Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-06 11:28:00 -07:00
Ian Romanick	ed7f00f564	i965: Don't override NewFramebuffer just to call _mesa_new_framebuffer Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-06 11:27:45 -07:00
Ville Syrjälä	021f15816e	i830: Fix culling with user fbos on gen2 Flip the cull bits when rendering to a user fbo on gen2. This was already done on gen3 (since before git history starts) but was missing from the gen2 code. Fixes rendering of the driver+kart model in supertuxkart kart selection screen. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-06 11:16:19 -07:00
Ville Syrjälä	3e2c7ca773	i915: Adjust line size limits The hardware can draw lines 0.5 to 7.5 pixels wide. Adjust the limits to 1.0-7.0. The old limits seem to be from the era when i915 and i965 were sharing this code. Not really sure if 1.0-7.0 is correct. Maybe it could be 0.5-7.5 as those are the hw limits, or maybe some combination of the two? Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-06 11:16:19 -07:00
Ville Syrjälä	00ee403883	i915: Enable intel_render path for points The sub-pixel adjustment for points was killed off in commit `60d762aa62` Author: Xiang, Haihao <haihao.xiang@intel.com> Date: Wed Jan 2 11:38:51 2008 +0800 i915: Needn't adjust pixel centers. fix #12944 so if we don't need it in intel_tris.c we don't need it in intel_render.c either, which means we can allow intel_render.c to render points. No apparent regressions on PNV in ES1 or ES2 conformance. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-06 11:16:19 -07:00
Ville Syrjälä	0febd0ecfd	i915: Use COPY_DWORDS for points The sub-pixel adjustment for points was killed off in commit `60d762aa62` Author: Xiang, Haihao <haihao.xiang@intel.com> Date: Wed Jan 2 11:38:51 2008 +0800 i915: Needn't adjust pixel centers. fix #12944 so we can just as well use COPY_DWORDS(). Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-06 11:16:19 -07:00
Ville Syrjälä	bcf650496f	i915: Use _tnl_RenderClippedPolygon and _tnl_RenderClippedLine _tnl_RenderClippedPolygon and _tnl_RenderClippedLine already do most of what we want so use them. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-06 11:16:19 -07:00
Ville Syrjälä	303895655c	i915: Handle provoking vertex in intelFastRenderClippedPoly() intelFastRenderClippedPoly() renders the polygon using triangles. For polygons the provoking vertex is always the first one, and currently this function assumes that the provoking vertex for triangles is the last one. In case the user changed the provoking vertex convention, the hardware may be configured to treat the first vertex of triangles as the provoking vertex. So check the convention and emit the triangles in the appropriate order to avoid having to change the hardware provoking vertex convention for rendering polygons. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-06 11:16:19 -07:00
Ville Syrjälä	0886426503	t_dd_dmatmp: Check provoking vertex convention when rendering quads When drawing quads using triangles we need to be careful to make the provoking vertices match when flat shading. v2: Major rebase on top of Ian's other t_dd_dmatmp.h work. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-06 11:16:19 -07:00
Ville Syrjälä	83d511e190	t_dd_dmatmp: Disallow flat shading when rendering quad strips via tri strips When rendering quad strips via tri strips we can't get the provoking vertex right, so disallow flat shading. v2: Major rebase on top of Ian's other t_dd_dmatmp.h work. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-06 11:16:19 -07:00
Ville Syrjälä	b15b4581d1	t_dd_dmatmp: Allow flat shaded polygons with tri fans We can allow rendering flat shaded polygons using tri fans if we check the provoking vertex convention. v2 (idr): Remove _EXT suffixes from GL_FIRST_VERTEX_CONVENTION. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-06 11:16:19 -07:00
Ian Romanick	5ca00e0b8d	t_dd_dmatmp: Replace fprintf with unreachable From http://lists.freedesktop.org/archives/mesa-dev/2015-May/084883.html: "There are no real error cases here, just dead code. validate_render() is supposed to make sure we never call these functions if the code can't actually render the primitives. The fprintf()+return branches should really just contain assert(0) or equivalent." I also rearranged the if-else-block in render_quad_strip_verts to look more like the other functions. A future patch is going to change a bunch of that code anyway. v2: Make "unreachable" message more descriptive. Suggested by Iago. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-06 10:44:00 -07:00
Ian Romanick	46b13666d8	radeon: Use C99 initializers for primitive arrays Using C99 initializers for the primitive arrays makes things more readable. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-06 10:41:56 -07:00
Ian Romanick	68976a5a00	i965: Use C99 initializers for primitive arrays Using C99 initializers for the primitive arrays makes things more readable. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-06 10:41:56 -07:00
Ville Syrjälä	fad5fd3a25	i915: Use C99 initializers for primitive arrays Using C99 initializers for the primitive arrays makes things more readable. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-06 10:41:56 -07:00
Brian Paul	3801fa65c1	tgsi: add const qualifier to silence warning Trivial.	2015-10-06 08:51:33 -06:00
Brian Paul	b7766a95e1	glsl: whitespace/formatting/typo fixes in link_uniforms.cpp	2015-10-06 08:51:33 -06:00
Samuel Iglesias Gonsalvez	50d5a36f35	main: array stride for unsized arrays of arrays are calculated like records Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-10-06 14:28:26 +02:00
Samuel Iglesias Gonsalvez	82db642042	glsl: add std430 layout support for AoA Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-10-06 14:02:13 +02:00
Timothy Arceri	6483183279	docs: Mark GL_ARB_enhanced_layouts as in progress	2015-10-06 14:04:23 +11:00
Ilia Mirkin	dbae576f7f	i965: add EXT_polygon_offset_clamp support to gen4/gen5 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-05 14:39:38 -07:00
Matt Turner	833fa9a8cd	meta: Update comment about unsupported texture types. Ken added support for 2DArray (commit `ec23d5197e`) and 1DArray (commit `14ca61125`) last year. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-10-05 14:35:13 -07:00
Matt Turner	d4ff638504	glx: Drop CRAY support. It couldn't have worked anyway. There were calls to undefined functions. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-10-05 14:34:16 -07:00
Matt Turner	617eb5e6c3	glsl: Remove CSE pass. With NIR, it actually hurts things. total instructions in shared programs: 6529329 -> 6528888 (-0.01%) instructions in affected programs: 14833 -> 14392 (-2.97%) helped: 299 HURT: 1 In all affected programs I inspected (including the single hurt one) the pass CSE'd some multiplies and caused some reassociation (e.g., caused (A * B) * C to be A * (B * C)) when the original intermediate result was reused elsewhere. Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-05 14:31:26 -07:00
Matt Turner	5a360dcad1	i965: Generalize predicated break pass for use in vec4 backend. instructions in affected programs: 44204 -> 43762 (-1.00%) helped: 221 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-05 13:42:58 -07:00
Matt Turner	4098a756b5	i965/fs: Use backend_instruction in predicated break peephole. We're not using any fs_inst fields, and the next commit will make the peephole used by the vec4 backend. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-05 13:42:58 -07:00
Matt Turner	5964419921	i965/fs: Remove SNB embedded-comparison support from optimizations. We never emit IF instructions with an embedded comparison (lost in the switch to NIR), so this code is not used. If we want to readd support, we should have a pass that merges a CMP instruction with an IF or a WHILE instruction after other optimizations have run. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-05 13:42:58 -07:00
Matt Turner	36ea9922ad	mesa: Add missing _mm_mfence() before streaming loads. According to the Intel Software Development Manual (Volume 1: Basic Architecture, 12.10.3 Streaming Load Hint Instruction): Streaming loads may be weakly ordered and may appear to software to execute out of order with respect to other memory operations. Software must explicitly use fences (e.g. MFENCE) if it needs to preserve order among streaming loads or between streaming loads and other memory operations. That is, a memory fence is needed to preserve the order between the GPU writing the buffer and the streaming loads reading it back. Reported-by: Joseph Nuzman <joseph.nuzman@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-10-05 12:06:33 -07:00
Chad Versace	93161be9e7	i965: Fix intel_miptree_is_fast_clear_capable() There are three types of fast clears: a. fast depth clears b. fast singlesample color clears c. fast multisample color clears Function intel_miptree_is_fast_clear_capable() checks if a miptree supports fast clears of type (b). Rename the function to disambiguate what it does: old: intel_miptree_is_fast_clear_capable new: intel_miptree_supports_non_msrt_fast_clear The function accidentally rejected multisampled color surfaces because it thought they were singlesample array surfaces. Fix that by explicitly rejecting surfaces with samples > 1. This fix would have been needed before we enabled layered fast singlesample color clears (introduced in gen8), which we want to do eventually. For now, though, this patch changes no behavior; it just fixes how the driver chooses its behavior. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-10-05 11:14:04 -07:00
Chad Versace	125a04b474	i965/mt: Declare some functions as static intel_tiling_supports_non_msrt_mcs() and intel_miptree_is_fast_clear_capable() are not used outside of intel_mipmap_tree.c. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-10-05 11:10:11 -07:00
Iago Toral Quiroga	73e0dfbaca	i965: Make vec4_visitor's destructor virtual We need a virtual destructor when at least one of the class' methods is virtual. Failure to do so might lead to undefined behavior when destructing derived classes. Fixes the following warning: brw_vec4_gs_visitor.cpp: In function 'const unsigned int* brw::brw_gs_emit(brw_context, gl_shader_program, brw_gs_compile, void, unsigned int)': brw_vec4_gs_visitor.cpp:703:11: warning: deleting object of polymorphic class type 'brw::vec4_gs_visitor' which has non-virtual destructor might cause undefined behaviour [-Wdelete-non-virtual-dtor] delete gs; Curro: This shouldn't be causing any actual bugs at the moment because gen6_gs_visitor is the only subclass of vec4_visitor destroyed through a pointer of a base class (vec4_gs_visitor) and its destructor is basically the same as its parent's. Anyway it seems sensible to change this so it doesn't bite us in the future. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-10-05 13:50:15 +02:00
Tapani Pälli	a90feb581a	glsl: set glsl error if binding qualifier used on global scope Fixes following Piglit test: global-scope-binding-qualifier.frag Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-05 14:44:24 +03:00
Iago Toral Quiroga	102f6c446b	i965: Assert on the number of combined UBO and SSBO binding table entries In theory we can't break this assertion since the compiler frontend checks that we don't exceed any of the individual limits, but it does not hurt to be extra safe. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-05 08:19:34 +02:00
Iago Toral Quiroga	20cbe3688a	i965: Reserve binding table space for SSBO surfaces These share the space with UBO surfaces but we need to make sure we allocate enough space for both sets (12 of each) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-05 08:12:17 +02:00
Iago Toral Quiroga	41c4d45e08	i965: Define BRW_MAX_SSBO Instead of using hard-coded values. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-05 08:12:17 +02:00
Iago Toral Quiroga	440f9348c1	i965: Define BRW_MAX_UBO Instead of using hard-coded values. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-05 08:12:17 +02:00
Matt Turner	4caa10193f	i965/vec4: Remove more dead visitor/vertex program code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-04 23:03:59 -07:00
Matt Turner	cd7fa1034a	i965: Don't print line numbers with INTEL_DEBUG=optimizer. The thing you want to do with the output files is diff them, which is made more difficult by line numbers changing. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2015-10-04 23:03:59 -07:00
Ilia Mirkin	78ec9e28ec	nv30: always go through translate module on big-endian It seems like things are either coming in slightly wrong, or perhaps uploaded incorrectly, but either way passing them through the translate module seems to fix everything. Eventually we should figure out what's going wrong and fix it "for real", but this should do for now. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-10-04 21:50:41 -04:00
Ilia Mirkin	1fec05d114	nv30: pretend to have packed texture/surface formats This puts us in line with what the DDX/DRI2 st are expecting. It also happens to work... no idea why, but seems better to have it work than to ask lots of questions. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-10-04 21:50:41 -04:00
Michel Dänzer	87c3c9acd2	st/dri: Use packed RGB formats Fixes Gallium based DRI drivers failing to load on big endian hosts because they can't find any matching fbconfigs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71789 Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-04 21:50:31 -04:00
Timothy Arceri	763cd8c080	glsl: reduce memory footprint of uniform_storage struct The uniform will only be of a single type so store the data for opaque types in a single array. Cc: Francisco Jerez <currojerez@riseup.net> Cc: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-05 10:53:24 +11:00
Kenneth Graunke	b85757bc72	i965: Remove shader_prog from vec4_gs_visitor. Unfortunately it has to stay in gen6_gs_visitor. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-04 14:00:01 -07:00
Kenneth Graunke	21585048a2	i965: Use nir->has_transform_feedback_varyings to avoid shader_prog. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-04 14:00:01 -07:00
Kenneth Graunke	7768b802e5	nir: Add a nir_shader_info::has_transform_feedback_varyings flag. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-04 14:00:01 -07:00
Kenneth Graunke	5d7f8cb5a5	nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics. Geometry and tessellation shaders process multiple vertices; their inputs are arrays indexed by the vertex number. While GLSL makes this look like a normal array, it can be very different behind the scenes. On Intel hardware, all inputs for a particular vertex are stored together - as if they were grouped into a single struct. This means that consecutive elements of these top-level arrays are not contiguous. In fact, they may sometimes be in completely disjoint memory segments. NIR's existing load_input intrinsics are awkward for this case, as they distill everything down to a single offset. We'd much rather keep the vertex ID separate, but build up an offset as normal beyond that. This patch introduces new nir_intrinsic_load_per_vertex_input intrinsics to handle this case. They work like ordinary load_input intrinsics, but have an extra source (src[0]) which represents the outermost array index. v2: Rebase on earlier refactors. v3: Use ssa defs instead of nir_srcs, rebase on earlier refactors. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-04 14:00:01 -07:00
Kenneth Graunke	f2a4b40cf1	nir/lower_io: Make get_io_offset() return a nir_ssa_def * for indirects. get_io_offset() already walks the dereference chain and discovers whether or not we have an indirect; we can just return that rather than computing it a second time via deref_has_indirect(). This means moving the call a bit earlier. By returning a nir_ssa_def *, we can pass back both an existence flag (via NULL checking the pointer) and the value in one parameter. It also simplifies the code somewhat. nir_lower_samplers works in a similar fashion. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-10-04 14:00:01 -07:00
Timothy Arceri	6994ca20aa	glsl: fix whitespace Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-04 17:42:41 +11:00
Marek Olšák	814b7d1ab9	radeonsi: enable PIPE_CAP_FORCE_PERSAMPLE_INTERP Now st/mesa won't generate 2 variants for this state. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:09 +02:00
Marek Olšák	b3c55fc669	radeonsi: do force_persample_interp in shaders for non-trivial cases Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:09 +02:00
Marek Olšák	9652bfcf2d	radeonsi: implement the simple case of force_persample_interp Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:09 +02:00
Marek Olšák	214de2d815	radeonsi: move SPI_PS_INPUT_ENA/ADDR registers to a separate state This will be a derived state used for changing center->sample and centroid->sample at runtime. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:09 +02:00
Marek Olšák	55d406b71e	tgsi/scan: add interpolation info into tgsi_shader_info Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:09 +02:00
Marek Olšák	6b0f21cb28	st/mesa: automatically set per-sample interpolation if using SampleID/Pos Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-03 22:06:09 +02:00
Marek Olšák	4e9fc7e4e2	st/mesa: set force_persample_interp if ARB_sample_shading is used This is only a half of the work. The next patch will handle gl_SampleID/SamplePos, which is the other half of ARB_sample_shading. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-03 22:06:09 +02:00
Marek Olšák	f3b37e321f	gallium: add per-sample interpolation control into rasterizer state Required by ARB_sample_shading for drivers that don't want a shader variant in st/mesa. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Roland Scheidegger <sroland@vmware.com>	2015-10-03 22:06:09 +02:00
Marek Olšák	d8932a355d	st/mesa: add ST_DEBUG=precompile support for tessellation shaders Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-03 22:06:09 +02:00
Marek Olšák	dd340b34f3	mesa: remove Driver.BindImageTexture Nothing sets it. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-03 22:06:09 +02:00
Marek Olšák	92709dcb9b	mesa: remove Driver.DeleteSamplerObject Nothing overrides it. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-03 22:06:09 +02:00
Marek Olšák	00f6beed02	mesa: remove Driver.EndCallList Nothing overrides it. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-03 22:06:09 +02:00
Marek Olšák	ef6c0714af	mesa: remove Driver.BeginCallList Nothing overrides it. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-03 22:06:09 +02:00
Marek Olšák	f457964885	mesa: remove Driver.EndList Nothing overrides it. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-03 22:06:09 +02:00
Marek Olšák	55735cad00	mesa: remove Driver.NewList Nothing overrides it. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-03 22:06:09 +02:00
Marek Olšák	7a54939728	mesa: remove Driver.NotifySaveBegin Nothing overrides it. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-03 22:06:09 +02:00
Marek Olšák	4b8bb2f559	mesa: remove Driver.SaveFlushVertices Nothing overrides it. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	72a5dff9cb	mesa: remove Driver.FlushVertices Nothing overrides it. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	91799880b3	mesa: remove Driver.BeginVertices Nothing overrides it. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	82a950f187	mesa: remove Driver.BindArrayObject Nothing sets it. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	d1269a844f	mesa: remove Driver.DeleteArrayObject Nothing reimplements it. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	7401807e8d	mesa: remove Driver.NewArrayObject Nothing reimplements it. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	1044f99812	mesa: remove Driver.Hint Nothing sets it. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	8de82faf95	mesa: remove Driver.ColorMaskIndexed Nothing sets it. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	379255298f	mesa: remove some Driver.Blend* hooks Nothing sets them. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	a6cc895e93	mesa: remove Driver.Accum Nothing calls it. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	a4fca24484	mesa: remove Driver.ResizeBuffers Nothing overrides it. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	6863d5b02a	mesa: remove Driver.DeleteShaderProgram Nothing overrides it. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	b37dcb8c18	mesa: remove Driver.NewShaderProgram Nothing overrides it. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	95e0303312	mesa: remove Driver.DeleteShader Nothing overrides it. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	18123a732b	egl/dri2: don't require a context for ClientWaitSync (v2) The spec doesn't require it. This fixes a crash on Android. v2: don't set any flags if ctx == NULL v3: add the spec note Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Albert Freeman <albertwdfreeman@gmail.com> Reviewed-by: Frank Binns <frank.binns@imgtec.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	b78336085b	st/dri: don't use _ctx in client_wait_sync Not needed and it can be NULL. v2: fix dri2_get_fence_from_cl_event - thanks Albert Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Albert Freeman <albertwdfreeman@gmail.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	27b102e7fd	r600g: only do depth-only or stencil-only in-place decompression instead of always doing both. Usually, only depth is needed, so stencil decompression is useless. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	c23c92c965	radeonsi: only do depth-only or stencil-only in-place decompression instead of always doing both. Usually, only depth is needed, so stencil decompression is useless. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	5804c6adf8	gallium/radeon: add separate stencil level dirty flags We will only do depth-only or stencil-only decompress blits, whichever is needed by textures, instead of always doing both. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	cc92b90375	radeonsi: dump buffer lists while debugging Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	eb55610c89	winsys/radeon: implement cs_get_buffer_list This is more complicated, because tracking priority_usage needed changing the relocs_bo type. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:07 +02:00
Marek Olšák	6f48e2bee1	winsys/amdgpu: add winsys function cs_get_buffer_list For debugging. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:07 +02:00
Marek Olšák	93641f4341	gallium/radeon: stop using "reloc" in a few places Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:07 +02:00
Marek Olšák	2edb060639	gallium/radeon: tell the winsys the exact resource binding types Use the priority flags and expand them. This information will be used for debugging. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:07 +02:00
Marek Olšák	9bd7928a35	radeonsi: add an option for debugging VM faults Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:07 +02:00
Marek Olšák	4502d0bf88	radeonsi: move dumping the last IB into its own function v2: indentation fix Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:07 +02:00
Marek Olšák	89f73827d0	ddebug: separate creation of debug files This will be used by radeonsi for logging. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:07 +02:00
Emil Velikov	3cd5395206	docs: add news item and link release notes for 10.6.9 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-10-03 13:23:13 +01:00
Emil Velikov	61c35ce4f9	docs: add sha256 checksums for 10.6.9 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `8957b696f9`)	2015-10-03 13:20:08 +01:00
Emil Velikov	b2a987fc12	docs: add release notes for 10.6.9 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `ab9aacce2d`)	2015-10-03 13:20:06 +01:00
Matthew Waters	11cabc45b7	egl: rework handling EGL_CONTEXT_FLAGS As of version 15 of the EGL_KHR_create_context spec, debug contexts are allowed for ES contexts. We should allow creation instead of erroring. While we're here provide a more comprehensive checking for the other two flags - ROBUST_ACCESS_BIT_KHR and FORWARD_COMPATIBLE_BIT_KHR v2 [Emil Velikov] Rebase. Minor tweak in commit message. Cc: Boyan Ding <boyan.j.ding@gmail.com> Cc: Chad Versace <chad.versace@intel.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91044 Signed-off-by: Matthew Waters <ystreet00@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-10-03 12:30:13 +01:00
Jason Ekstrand	443d3bf340	i965/wm: Make compute_barycentric_interp_modes take a nir_shader and a devinfo Now that everything comes in through NIR, we can pick this directly out of the shader source and don't need to reference the gl_fragment_program. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 21:21:20 -07:00
Jason Ekstrand	1e3c1b107e	i965: Use nir_foreach_variable Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 21:21:18 -07:00
Jason Ekstrand	050e4787d3	nir: Add a nir_foreach_variable macro This is a common enough operation that it's nice to not have to think about the arguments to foreach_list_typed every time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 21:21:16 -07:00
Jason Ekstrand	ca941799ce	i965/nir: Remove the prog parameter from brw_nir_lower_inputs Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 21:21:00 -07:00
Tom Stellard	a2e1e3d325	radeon/llvm: Initialize gallivm targets when initializing the AMDGPU target v2 This fixes a race condition in the glx-multithreaded-shader-compile test. v2: - Replace gallivm_init_llvm_{begin,end}() with gallivm_init_llvm_targets(). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-10-02 23:41:27 +00:00
Tom Stellard	76cfd6f1da	gallivm: Allow drivers and state trackers to initialize gallivm LLVM targets v2 Drivers and state trackers that use LLVM for generating code, must register the targets they use with LLVM's global TargetRegistry. The TargetRegistry is not thread-safe, so all targets must be added to the registry before it can be queried for target information. When drivers and state trackers initialize their own targets, they need a way to force gallivm to initialize its targets at the same time. Otherwise, there can be a race condition in some multi-threaded applications (e.g. glx-multithreaded-shader-compile in piglit), when one thread creates a context for a driver that uses LLVM (e.g. radeonsi) and another thread creates a gallivm context (glxContextCreate does this). The race happens when the driver thread initializes its LLVM targets and then starts using the registry before the gallivm thread has a chance to register its targets. This patch allows users to force gallivm to register its targets by calling the gallivm_init_llvm_targets() function. v2: - Use call_once and remove mutexes and static initializations. - Replace gallivm_init_llvm_{begin,end}() with gallivm_init_llvm_targets(). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-10-02 23:41:26 +00:00
Tom Stellard	3219b48ae5	gallium/radeon: Use call_once() when initializing LLVM targets Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-10-02 23:19:01 +00:00
Jason Ekstrand	bf7b6fd3fd	i965/shader: Get rid of the shader, prog, and shader_prog fields Unfortunately, we can't get rid of them entirely. The FS backend still needs gl_program for handling TEXTURE_RECTANGLE. The GS vec4 backend still needs gl_shader_program for handling transform feedback. However, the VS needs neither and we can substantially reduce the amount they are used. One day we will be free from their tyranny. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 14:22:54 -07:00
Jason Ekstrand	404419ee1a	i965/fs,vec4: Get rid of the sanity_param_count It doesn't exist for anything other than an assert that, as far as I can tell, isn't possible to trip. Soon, we will remove prog from the visitor entirely and this will become even more impossible to hit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 14:22:53 -07:00
Jason Ekstrand	ca6a436f12	i965/vec4: Use nir info instead of pulling things out of [shader_]prog Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 14:22:53 -07:00
Jason Ekstrand	756613ed35	i965/fs: Use the nir info instead of pulling things out of [shader_]prog Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 14:22:53 -07:00
Jason Ekstrand	b62e36d18f	i965/fs: Move sampler unit lookup into rescale_texcoord The texunit variable we create and assign in nir_emit_texture gets passed through two more layers of function calls before it gets to its sole use in rescale_texcoord. The best part is that we already pass the sampler into rescale_texcoord so we can just look it up there. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 14:22:53 -07:00
Jason Ekstrand	7b974c5f90	i965/cs: Remove the prog argument from local_id_payload_dwords Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 14:22:53 -07:00
Jason Ekstrand	7926c3ea7d	i965/backend_shader: Add a field to store the NIR shader Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 14:22:53 -07:00
Jason Ekstrand	7a8d06b6dd	nir: Move GS data to nir_shader_info Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 14:22:53 -07:00
Jason Ekstrand	e4fea486da	nir: Add a nir_shader_info struct This commit also adds code to glsl_to_nir and prog_to_nir to fill it out. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 14:22:53 -07:00
Jason Ekstrand	cd1ae6ebfa	nir/glsl: Take a gl_shader_program and a stage rather than a gl_shader Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 14:22:53 -07:00
Jason Ekstrand	30c6357113	i965: Move prog_data uniform setup to the codegen level As of now, uniform setup is more-or-less unified between vec4 and fs and no longer requires the fs_visitor. This makes uniform setup more of a language/API thing than a backend compiler thing. This commit moves setting up the stage_prog_data.params arrays to the same place as we set up the rest of stage_prog_data. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 14:22:53 -07:00
Jason Ekstrand	ea006c4cb5	i965: Move binding table setup to codegen time. Setting up binding tables really has little to do with the actual process of turning shaders into instructions; it's more part of setting up prog_data. This commit moves it out of the visitors and with the rest of the prog_data setup stuff. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 14:22:53 -07:00
Jason Ekstrand	28709e37d9	i965/shader: Pull assign_common_binding_table_offsets out of backend_shader This really has nothing to do with the backend compiler and we'd like to eventually be able to set this up earlier in the compile process. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 14:22:52 -07:00
Jason Ekstrand	cdf314cb21	i965/nir: Simplify uniform setup Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-02 14:19:39 -07:00
Jason Ekstrand	7fee8b6f05	i965/nir: Pull GLSL uniform handling into a common function The way we deal with GLSL uniforms and builtins is basically the same in both the vec4 and the fs backend. This commit takes the best parts of both implementations and pulls the common code into a shared helper function. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-02 14:19:39 -07:00
Jason Ekstrand	03c4171b57	i965/nir: Pull common ARB program uniform handling into a common function The way we deal with ARB program uniforms is basically the same in both the vec4 and the fs backend. This commit takes the best parts of both implementations and pulls the common code into a shared helper function. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-02 14:19:39 -07:00
Jason Ekstrand	390b48fc4a	i965/vec4: Use the uniform count from nir_assign_var_locations Previously, we were counting up uniforms as we set them up. However, this count should be exactly identical to shader->num_uniforms provided by nir_assign_var_locations. (If it's not, we're in trouble anyway because that means that locations don't match up.) This matches what the fs backend is already doing. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-02 14:19:39 -07:00
Jason Ekstrand	3de81508ea	i965/shader: Get rid of the setup_vec4_uniform_value helper It's not used by anything anymore Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-02 14:19:39 -07:00
Jason Ekstrand	58cea0c2b6	i965/shader: Pull setup_image_uniform_values out of backend_shader I tried to do this once before but Curro pointed out that having it in backend_shader meant it could use the setup_vec4_uniform_values helper which did different things in vec4 and fs. Now the setup_uniform_values function differs only by an assert in the two backends so there's no real good reason to be using it anymore. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-02 14:19:39 -07:00
Jason Ekstrand	5609e0d7b4	i965/vec4: Get rid of the uniform_vector_size array The uniform_vector_size array was only ever used by pack_uniform_registers which no longer needs it. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-02 14:19:39 -07:00
Jason Ekstrand	ea35fb0fbe	i965/vec4: Use the actual channels used in pack_uniform_registers Previously, pack_uniform_registers worked based on the size of the uniform as given to us when we initially set up the uniforms. However, we have to walk through the uniforms and figure out liveness anyway, so we might as well record the number of channels used as we go. This may also allow us to pack things tighter in a few cases. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-02 14:19:39 -07:00
Jason Ekstrand	cd2132f45b	glsl/types: Make subroutine types have a single matrix column That way, if we do the usual thing of multiplying vector_elements by matrix_columns we get the actual number of components in the type as per component_slots(). While we're at it, we also switch to using the actual C++ field initializers for vector_elements and matrix_columns. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-02 14:19:39 -07:00
Jason Ekstrand	a7e0f755bc	i965: Pull stage_prog_data.nr_params out of the NIR shader Previously, we had a bunch of code in each stage to figure out how many slots we needed in stage_prog_data.param. This code was mostly identical across the stages and had been copied and pasted around. Unfortunately, this meant that any time you did something special, you had to add code for it to each of these places. In particular, none of the stages took subroutines into account; they were working entirely by accident. By taking this data from the NIR shader, we know the exact number of entries we need and everything goes a bit smoother. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-02 14:19:39 -07:00
Jason Ekstrand	fc3f45234b	i965/vs: Move lazy NIR creation to codegen_vs_prog The next commit will add code to codegen_vs_prog that requires the NIR shader to be there in all cases. It doesn't hurt anything to just move it from brw_vs_emit to its only caller. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-02 14:19:38 -07:00
Jason Ekstrand	64b145422b	i965/vec4: Delete the old vec4_vp code Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-02 14:19:36 -07:00
Jason Ekstrand	1153f12076	i965/vec4: Delete the old ir_visitor code Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-02 14:19:34 -07:00
Jason Ekstrand	b85761d11d	i965/vec4: Always use NIR GLSL IR vs. NIR shader-db results for vec4 programs on i965: total instructions in shared programs: 1499328 -> 1388354 (-7.40%) instructions in affected programs: 1245199 -> 1134225 (-8.91%) helped: 7469 HURT: 2440 GLSL IR vs. NIR shader-db results for vec4 programs on G4x: total instructions in shared programs: 1436799 -> 1325825 (-7.72%) instructions in affected programs: 1205599 -> 1094625 (-9.20%) helped: 7469 HURT: 2440 GLSL IR vs. NIR shader-db results for vec4 programs on Iron Lake: total instructions in shared programs: 1436654 -> 1325682 (-7.72%) instructions in affected programs: 1205503 -> 1094531 (-9.21%) helped: 7468 HURT: 2440 GLSL IR vs. NIR shader-db results for vec4 programs on Sandy Bridge: total instructions in shared programs: 2016249 -> 1787033 (-11.37%) instructions in affected programs: 1850547 -> 1621331 (-12.39%) helped: 14856 HURT: 1481 GLSL IR vs. NIR shader-db results for vec4 programs on Ivy Bridge: total instructions in shared programs: 1848027 -> 1648216 (-10.81%) instructions in affected programs: 1660279 -> 1460468 (-12.03%) helped: 14668 HURT: 1369 GLSL IR vs. NIR shader-db results for vec4 programs on Bay Trail: total instructions in shared programs: 1848027 -> 1648216 (-10.81%) instructions in affected programs: 1660279 -> 1460468 (-12.03%) helped: 14668 HURT: 1369 GLSL IR vs. NIR shader-db results for vec4 programs on Haswell: total instructions in shared programs: 1848027 -> 1648216 (-10.81%) instructions in affected programs: 1660279 -> 1460468 (-12.03%) helped: 14668 HURT: 1369 I also ran our full suite of benchmarks on a Haswell and had the following statistically significant (according to ministat) changes: Test master-glsl master-nir diff bench_OglGeomPoint 461.556 463.006 1.450 bench_OglTerrainFlyInst 184.484 187.574 3.090 bench_OglTerrainPanInst 132.412 136.307 3.895 bench_OglTexFilterAniso 19.653 19.645 -0.008 bench_OglTexFilterTri 58.333 58.009 -0.324 bench_OglVSInstancing 65.049 65.327 0.278 bench_trexoff 69.474 69.694 0.220 bench_valley 40.708 41.125 0.417 v2 (Jason Ekstrand): - Remove more uses of NirOptions as a switch - New shader-db numbers - Added benchmark numbers Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-02 14:18:46 -07:00
Ilia Mirkin	4e0a8e0a50	i965: don't forget to free image_param on prog_data free Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-02 14:14:27 -04:00
Ilia Mirkin	19598aaa5d	glsl: avoid leaking hiddenUniforms map when there are no uniforms Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-02 14:14:27 -04:00
Ilia Mirkin	da2fdf950f	mesa: avoid leaking closure when iterating over a string_to_uint_map Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-02 14:14:27 -04:00
Chris Wilson	6b7036498a	nir: Fix uninitialized 'progress' variable in nir_lower_system_values. Commit `0a1adaf11d` (nir: Report progress from nir_lower_system_values().) introduced a bug caught by Valgrind: ==823== Conditional jump or move depends on uninitialised value(s) ==823== at 0xB09020C: convert_block (nir_lower_system_values.c:68) ==823== by 0xB079FB8: foreach_cf_node (nir.c:1310) ==823== by 0xB07A0AF: nir_foreach_block (nir.c:1336) ==823== by 0xB09026B: convert_impl (nir_lower_system_values.c:79) ... ==823== Uninitialised value was created by a stack allocation ==823== at 0xB090249: convert_impl (nir_lower_system_values.c:76) which is trivially fixed by initializing progress. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-10-02 10:44:28 -07:00
Connor Abbott	33da78adee	nir/remove_phis: handle trivial back-edges Some loops may have phi nodes that look like: foo = ... loop { bar = phi(foo, bar) ... } in which case we can remove the phi node and replace all uses of 'bar' with 'foo'. In particular, there are some L4D2 vertex shaders with loops that, after optimization, look like: /* succs: block_1 */ loop { block block_1: /* preds: block_0 block_4 */ vec1 ssa_2195 = phi block_0: ssa_2136, block_4: ssa_994 vec1 ssa_7321 = phi block_0: ssa_8195, block_4: ssa_7321 vec1 ssa_7324 = phi block_0: ssa_8198, block_4: ssa_7324 vec1 ssa_7327 = phi block_0: ssa_8174, block_4: ssa_7327 vec1 ssa_8139 = intrinsic load_uniform () () (232) vec1 ssa_588 = ige ssa_2195, ssa_8139 /* succs: block_2 block_3 */ if ssa_588 { block block_2: /* preds: block_1 */ break /* succs: block_5 */ } else { block block_3: /* preds: block_1 */ /* succs: block_4 */ } block block_4: /* preds: block_3 */ vec1 ssa_994 = iadd ssa_2195, ssa_2150 /* succs: block_1 */ } where after removing the second, third, and fourth phi nodes, the loop becomes entirely dead, and this patch will cause the loop to be deleted entirely. No piglit regressions. Shader-db results on bdw: instructions in affected programs: 5824 -> 5664 (-2.75%) total loops in shared programs: 2234 -> 2202 (-1.43%) helped: 32 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-10-02 13:19:45 -04:00
Kyle Brenneman	d35391cfda	glx: Don't hard-code the name "libGL.so.1" in driOpenDriver (v3) Add a macro GL_LIB_NAME to hold the filename that configure comes up with based on the --with-gl-lib-name and --enable-mangling options. In driOpenDriver, use the GL_LIB_NAME macro instead of hard-coding "libGL.so.1". v2: Add an #ifndef/#define for GL_LIB_NAME so that non-autoconf builds will work. v3: Fix the library filename in the Makefile. Signed-off-by: Kyle Brenneman <kbrenneman@nvidia.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-10-02 13:25:05 +01:00
Kyle Brenneman	798f260a2f	mapi: Make _glapi_get_stub work with "gl" or "mgl" prefix. When USE_MGL_NAMESPACE is defined, _glapi_get_stub will check for the "m" prefix before trying to skip it, so that "glFoo" and "mglFoo" are equivalent. This should let it work with all the places where something calls _glapi_get_proc_offset with a hard-coded name that starts with the normal "gl" prefix. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55552 Signed-off-by: Kyle Brenneman <kbrenneman@nvidia.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-10-02 13:23:18 +01:00
Kyle Brenneman	a27f2d991b	glx: Fix build errors with --enable-mangling (v2) Rearranged the GLX_ALIAS macro in glextensions.h so that it will pick up the renames from glx_mangle.h. Fixed the alias attribute for glXGetProcAddress when USE_MGL_NAMESPACE is defined. v2: Add a comment clarifying why GLX_ALIAS needs two macros. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55552 Signed-off-by: Kyle Brenneman <kbrenneman@nvidia.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-10-02 13:22:46 +01:00
Tapani Pälli	85313ff8ab	glsl: validate binding qualifier on block members Fixes following Piglit test: member-invalid-binding-qualifier.frag Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-02 10:50:42 +03:00
Samuel Iglesias Gonsalvez	f42466322a	glsl: emit row_major matrix's SSBO stores only for components in writemask When writing to a column of a row-major matrix, each component of the vector is stored to non-consecutive memory addresses, so we generate one instruction per component. This patch skips the disabled components in the writemask, which saves some store instructions and avoids storing wrong data to each disabled component. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-10-02 08:34:25 +02:00
Tapani Pälli	a552b77dcc	glsl: error out if non-constant indexing of SSBO arrays with GLSL ES Fixes a failing subtest in: ES31-CTS.shader_storage_buffer_object.negative-glsl-compileTime Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-10-02 08:37:02 +03:00
Daniel Scharrer	b3f9c5cc0f	mesa: Add abs input modifier to base for POW in ffvertex_prog The result of POW for a negative base is undefined. Even when the result is multiplied by zero (which is the case here whenever the base is negative), the Inf and NaNs can propagate past that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342 Signed-off-by: Daniel Scharrer <daniel@constexpr.org> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-10-01 16:37:55 -04:00
Kenneth Graunke	604ce8253a	i965/fs: Print reg and reg_offset separately for ATTR files. Reading this output was really confusing. reg represents attribute slots; reg_offset is the x/y/z/w component (0..3) within a vec4 slot. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-01 11:01:58 -07:00
Kenneth Graunke	193d29516d	i965/nir: Refactor input/output lowering setup into helpers. The code for input lowering is going to get significantly more complicated shortly, so I wanted to pull it out. Vertex shader inputs are handled nearly identically regardless of vec4/scalar mode, so I opted to not split that. I thought about having each function actually do the lowering, but one pass through nir_lower_io that handles all types (which weren't handled earlier) is probably more efficient. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-01 10:58:30 -07:00
Kenneth Graunke	39a1d36a67	nir: Allow nir_lower_io() to only lower one type of variable. We may want to use different type_size functions for (e.g.) inputs vs. uniforms. Passing in -1 for mode ignores this, handling all modes as before. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-10-01 10:58:30 -07:00
Brian Paul	1c6689bf03	mesa: fix incorrect error in _mesa_BindTextureUnit() If the texture object exists, but the Name field is zero, it means the object was created but never bound to a target. Trying to bind it in _mesa_BindTextureUnit() should generate GL_INVALID_OPERATION. Fixes piglit's arb_direct_state_access-bind-texture-unit test. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-01 07:45:43 -06:00
Brian Paul	a9408f3ca1	mesa: remove _mesa_get_tex_unit_err() and fix error handling This helper was only called from _mesa_BindTextureUnit(). It's simpler to just inline it. The error check / code / message in the helper was incorrect. It was written for glBindTextures(), not glBindTextureUnit(). The correct error for a bad texture unit number is GL_INVALID_VALUE. The error message now reports the unit number rather than a GL_TEXTUREi enum. Fixes a failure in piglit's arb_direct_state_access-bind-texture-unit test. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-01 07:45:43 -06:00
Brian Paul	c277fa3940	mesa: consolidate texture binding code Before, we were doing the actual _mesa_reference_texobj() call and ctx->Driver.BindTexture() and misc housekeeping in three different places. This consolidates the common code in a new bind_texture() function. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-01 07:45:43 -06:00
Brian Paul	78f908c54b	mesa: fix indentation in _mesa_create_nameless_texture()	2015-10-01 07:45:43 -06:00
Brian Paul	aa249190a5	st/mesa: clean up #includes in st_draw.c Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-01 07:45:43 -06:00
Brian Paul	82e3d8ba8b	mesa: clean up #includes in sampler.cpp Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-01 07:45:43 -06:00
Brian Paul	32a4999ee7	mesa: clean up #includes in ir_to_mesa.cpp Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-01 07:45:43 -06:00
Brian Paul	b9b13d873a	mesa: clean up #includes in uniforms.h Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-01 07:45:43 -06:00
Brian Paul	e13b515044	mesa: clean up #includes in uniform_query.cpp Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-01 07:45:42 -06:00
Brian Paul	85ea125620	mesa: clean up #includes in pipelineobj.c Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-01 07:45:42 -06:00
Brian Paul	1a22550725	mesa: clean up #includes in ff_fragment_shader.cpp Get rid of "../glsl/" paths. Sort alphabetically. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-01 07:45:42 -06:00
Iago Toral Quiroga	7455324030	main: Fix block index when mixing UBO and SSBO blocks Since we store both in UniformBlocks, we can't just compute the index by subtracting the array address start, we need to count the number of buffers of the appropriate type. v2: - Just fall back to calc_resource_index (Tapani) Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-10-01 09:25:30 +02:00
Tapani Pälli	ca2e16d26e	mesa: use strtok_s for strtok_r on windows https://msdn.microsoft.com/en-us/library/ftsafwz3.aspx v2: use _WIN32 instead of _MSC_VER (Brian Paul) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92183 Reviewed-by: Brian Paul <brianp@vmware.com>	2015-10-01 08:01:03 +03:00
Ian Romanick	9bd9cf1fa4	meta: Handle array textures in scaled MSAA blits The old code had some significant problems with respect to sampler2DArray textures. The biggest problem was that some of the code would use vec3 for the texture coordinate type, and other parts of the code would use vec2. The resulting shader would not even compile. Since there were no tests for this path, nobody noticed. The input to the fragment shader is always treated as a vec3. If the source data is only vec2, the vertex puller will supply 0 for the .z component. The texture coordinate passed to the fragment shader is always a vec2 that comes from the .xy part of the vertex shader input. The layer, taken from the .z of the vertex shader input is passed separately as a flat integer. If the generated fragment shader does not use the layer integer, the GLSL linker will eliminate all the dead code in the vertex shader. Fixes the new piglit tests "blit-scaled samples=2 with gl_texture_2d_multisample_array", etc. on i965. Note for stable maintainer: This patch may depend on `46037237`, and that patch should be safe for stable. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-30 16:22:56 -07:00
Chad Versace	b217e6f035	i965/miptree: Add PRM references for most struct members (v2) Add comments that link the driver's miptree structures to the hardware structures documented in the PRM. This provides sorely needed orientation to developers new to the miptree code. And for miptree veterans, this clarifies some of the more obscure miptree data. For each driver struct field that closely corresponds to a hardware struct field, add a PRM reference to that hardware field's name. For example, struct intel_mipmap_tree { ... /** * @brief One of GL_TEXTURE_2D, GL_TEXTURE_2D_ARRAY, etc. * * @see RENDER_SURFACE_STATE.SurfaceType * @see RENDER_SURFACE_STATE.SurfaceArray * @see 3DSTATE_DEPTH_BUFFER.SurfaceType */ GLenum target; ... }; Also annotate the INTEL_MSAA_LAYOUT_* enums with the name of the PRM sections that documents the layout. v2: Replace "2D subimage" with "slice", and define what a "slice" is. For Ben. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v1) Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com> (v1)	2015-09-30 15:32:03 -07:00
Chad Versace	f7fe9fb0f1	i965/miptree: Rename align_w,align_h -> halign,valign The values of intel_mipmap_tree::align_w and ::align_h correspond to the hardware enums HALIGN_* and VALIGN_*. See the confusion? align_h != HALIGN align_h == VALIGN Reduce the confusion by renaming the variables to match the hardware enum names: git ls-files | xargs sed -i -e 's/align_w/halign/g' \ -e 's/align_h/valign/g' Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-09-30 15:31:06 -07:00
Chad Versace	56367b0290	i965/miptree: Rename intel_miptree_map::mt -> ::linear_mt (v2) Because that's what it is. It's an untiled, linear miptree. v2: - Add space after /*. - Use one comment per function argument. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Acked-by: Ben Widawsky <benjamin.widawsky@intel.com>	2015-09-30 15:31:04 -07:00
Chad Versace	b7882ae677	i965/miptree: Fix comments for map mode The comment for intel_miptree_map::mode claimed that it was a bitmask of GL_MAP_{READ,WRITE,INVALIDATE}_BIT. In reality, the bitmask may include any of {GL,BRW}_MAP_*_BIT. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Acked-by: Ben Widawsky <benjamin.widawsky@intel.com>	2015-09-30 15:31:03 -07:00
Chad Versace	bd191b7cc6	i965/miptree: More comments for BRW_MAP_DIRECT_BIT (v2) Clarify that this bit extends the set of GL_MAP_*_BIT enums. Also fix typo of "temporary". Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Acked-by: Ben Widawsky <benjamin.widawsky@intel.com>	2015-09-30 15:30:55 -07:00
Kenneth Graunke	651395b6e8	i965: Remove duplicate copy of is_scalar_shader_stage(). Jason open coded this in `60befc63` when cleaning up some ugly code; using our existing helper tidies it up a bit more. v2: Drop inline (suggested by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-30 13:56:24 -07:00
Ville Syrjälä	a1a3f0961b	i915: Remember to call intel_prepare_render() before blitting Bring over the following fix from i965: commit `fb3d62fe3d` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Tue Aug 6 14:36:09 2013 -0700 i965: Remember to call intel_prepare_render() before blitting. Fixes a crash in the following piglit tests: bin/fbo-sys-blit -auto bin/fbo-sys-sub-blit -auto Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-09-30 13:10:03 -07:00
Ville Syrjälä	c349031c27	i915: Fix texcoord vs. varying collision in fragment programs i915 fragment programs utilize the texture coordinate registers for both texture coordinates and varyings. Unfortunately the code doesn't check if the same index might be in use for both. It just naively uses the index to pick a texture unit, which could lead to collisions. Add an extra mapping step to allocate non conflicting texture units for both uses. The issue can be reproduced with a pair of simple shaders like these: attribute vec4 in_mod; varying vec4 mod; void main() { mod = in_mod; gl_TexCoord[0] = gl_MultiTexCoord0; gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex; } varying vec4 mod; uniform sampler2D tex; void main() { gl_FragColor = texture2D(tex, vec2(gl_TexCoord[0])) * mod; } Fixes many piglit tests on i915: glsl-link-varyings-2 glsl-orangebook-ch06-bump interpolation-none-gl_frontcolor-smooth-fixed interpolation-none-gl_frontcolor-smooth-none interpolation-none-gl_frontcolor-smooth-vertex interpolation-none-gl_frontsecondarycolor-smooth-fixed interpolation-none-gl_frontsecondarycolor-smooth-vertex interpolation-none-gl_frontsecondarycolor-smooth-none interpolation-none-other-flat-fixed interpolation-none-other-flat-none interpolation-none-other-flat-vertex interpolation-none-other-smooth-fixed interpolation-none-other-smooth-none interpolation-none-other-smooth-vertex v2 [idr]: Minor formatting tweaks. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-09-30 13:10:03 -07:00
Ville Syrjälä	9504740f3e	i830: Fix collision between I830_UPLOAD_RASTER_RULES and I830_UPLOAD_TEX(0) I830_UPLOAD_RASTER_RULES and I830_UPLOAD_TEX(0) are trying to occupy the same bit. Move the texture bits upwards a bit to make room for I830_UPLOAD_RASTER_RULES. Now the driver will actually upload the raster rules which is rather important to get the provoking vertex right. Fixes the appearance of glxgears teeth on gen2. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-30 12:49:28 -07:00
Jordan Justen	7b391142e9	i965/cs: Upload UBO/SSBO surfaces Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-09-30 11:28:12 -07:00
Rhys Kidd	83018f5c20	mesa: Fix format specifier warning in mesa_DispatchComputeIndirect() Commit `1665d29ee3` introduced an incorrect format specifier that operates on GLintptr indirect within the function _mesa_DispatchComputeIndirect(). This patch mitigates the introduced GCC warning: src/mesa/main/compute.c: In function '_mesa_DispatchComputeIndirect': src/mesa/main/compute.c:53:7: warning: format '%d' expects argument of type 'int', but argument 3 has type 'GLintptr' [-Wformat=] _mesa_debug(ctx, "glDispatchComputeIndirect(%d)\n", indirect); ^ v2: Amend for Boyan Ding <boyan.j.ding@gmail.com> feedback. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-09-30 10:13:41 -07:00
Jason Ekstrand	3948ac19a4	i965: Get rid of prog_data compare functions They are no longer used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-30 08:35:32 -07:00
Jason Ekstrand	bfdc76c133	i965/state_cache: Remove the aux_compare fields They haven't been used since `1bba29ed40` so there's no good reason to keep them around. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-30 08:35:32 -07:00
Jason Ekstrand	a4734b34b3	i965/copy_image: Fix a copy+paste error Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-30 08:35:32 -07:00
Chris Wilson	70e91d61fd	i965: Remove early release of DRI2 miptree intel_update_winsys_renderbuffer_miptree() will release the existing miptree when wrapping a new DRI2 buffer, so we can remove the early release and so prevent a NULL mt dereference should importing the new DRI2 name fail for any reason. (Reusing the old DRI2 name will result in the rendering going astray, to a stale buffer, and not shown on the screen, but it allows us to issue a warning and not crash much later in innocent code.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86281 Reviewed-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-09-30 10:52:30 +03:00
Samuel Iglesias Gonsalvez	e21bb9e7bd	glsl: assert base_alignment > 0 for records From GLSL 1.50 spec, section 4.1.8 "Structures": "Structures must have at least one member declaration." So the base_alignment should be higher than zero. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-30 08:13:07 +02:00
Samuel Iglesias Gonsalvez	f3afcbecc6	util: use strnlen() in strndup() implementations If the string being copied is not NULL-terminated the result of strlen() is undefined. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-09-30 08:13:07 +02:00
Samuel Iglesias Gonsalvez	023165a734	i965/vec4/nir: add nir_intrinsic_memory_barrier support Fix OpenGL ES 3.1 conformance tests: advanced-readWrite-case1-vsfs and advanced-matrix-vsfs. v2: - Fix SHADER_OPCODE_MEMORY_FENCE emission and the allocation of 'tmp' (Francisco). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-09-30 08:13:07 +02:00
Samuel Iglesias Gonsalvez	f24e5e68d6	glsl: apply shader storage block member rules when adding program resources From ARB_program_interface_query: "For an active shader storage block member declared as an array, an entry will be generated only for the first array element, regardless of its type. For arrays of aggregate types, the enumeration rules are applied recursively for the single enumerated array element." v2: - Simplify 'if' conditions and return true if it is not a buffer variable, because these rules only apply to buffer variables (Timothy). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-09-30 08:13:07 +02:00
Jordan Justen	4810d02112	nir: Don't set dest in SSBO store glsl_to_nir conversion This matches the function signature created in lower_ubo_reference_visitor::ssbo_store which has a void return. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-09-29 17:17:20 -07:00
Kenneth Graunke	476e6d732f	nir: Use a system value for gl_PrimitiveIDIn. At least on Intel hardware, gl_PrimitiveIDIn comes in as a special part of the payload rather than a normal input. This is typically what we use system values for. Dave and Ilia also agree that a system value would be nicer. At some point, we should change it at the GLSL IR level as well. But that requires changing most of the drivers. For now, let's at least make NIR do the right thing, which is easy. v2: Add a comment about not creating a temporary (suggested by Iago). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-09-29 14:19:32 -07:00
Brian Paul	cb758b892a	st/mesa: try PIPE_BIND_RENDER_TARGET when choosing float texture formats For 8-bit RGB(A) texture formats we set the PIPE_BIND_RENDER_TARGET flag to try to get a hardware format which also supports rendering (for FBO textures). Do the same thing for floating point formats. This allows the Redway3D Flat demo to run. Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-09-29 11:52:22 -06:00
Brian Paul	daf23bd4cb	st/mesa: add some debugging code in st_ChooseTextureFormat() I've temporarily added code like this many times. Wrap it in a conditional that can be enabled when needed. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-09-29 11:52:03 -06:00
Brian Paul	7147f7098e	mesa: clean up #includes in shaderapi.c Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-29 11:51:56 -06:00
Brian Paul	b24c6d3fef	mesa: clean up the #includes in shader_query.cpp Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-29 11:51:51 -06:00
Brian Paul	3bbff1e26e	mesa: remove an extern "C" wrapper in shader_query.cpp The shaderapi.h header already has the extern "C" wrapper. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-29 11:51:38 -06:00
Jordan Justen	681b4badae	i965/cs: Generate code to load gl_NumWorkGroups This code also sets cs_prog_data->uses_num_work_groups which is later used by state setup to indicate that the gl_NumWorkGroups surface needs to be setup. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-29 08:23:47 -07:00
Jordan Justen	4c6ddd3397	nir: Convert SYSTEM_VALUE_NUM_WORK_GROUPS to a nir intrinsic Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-29 08:23:47 -07:00
Jordan Justen	f6ae914069	glsl/cs: Add gl_NumWorkGroups as a system value Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-29 08:23:47 -07:00
Jordan Justen	63d7b33f51	i965/cs: Setup surface binding for gl_NumWorkGroups This will only be setup when the prog_data uses_num_work_groups boolean is set. At this point nothing will set uses_num_work_groups, but soon code will set it when emitting code for the intrinsic that loads gl_NumWorkGroups. We can't emit this surface information earlier at the start of the DispatchCompute* call because we may not have generated the program yet. Until we generate the program, we don't know if the gl_NumWorkGroups variable is accessed. We also can't emit the surface as part of the brw_cs_state atom, because we might not need the surface if gl_NumWorkGroups is not used by the program. Lastly, we cannot emit the surface later (after state upload) in the DispatchCompute* call, because it needs to be run before the brw_cs_state atom is emitted, since it changes the surface state. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-29 08:23:47 -07:00
Jordan Justen	d1be9d2126	i965/cs: Add a binding table entry for gl_NumWorkGroups If glDispatchComputeIndirect is used, then the value for this variable must be read from the indirect BO. To allow the same generated code to support indirect and glDispatchCompute, we will also setup a BO for the number of work groups using the intel_upload_data mechanism. This will only be required if the gl_NumWorkGroups variable is accessed. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-29 08:23:47 -07:00
Jordan Justen	d57a85f32b	i965/cs: Store compute invocation information in brw context We will need this in an atom to setup a surface to read the gl_NumWorkGroups values from. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-29 08:23:47 -07:00
Jordan Justen	60cf84dea7	i965/cs: Re-emit cs_state when surfaces have changed Unlike rendering (BINDING_TABLE_POINTERS_*S), compute doesn't have a binding table pointers command. Instead it is part of the MEDIA_INTERFACE_DESCRIPTOR structure loaded by the brw_cs_state atom. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-29 08:23:47 -07:00
Jordan Justen	2ec5f3e1d5	i965/cs: Re-emit push constants and cs_state on new batches We need to re-emit push constants when a new batch is started since the push constants are stored in the batch. We also need to re-emit the MEDIA_INTERFACE_DESCRIPTOR (in brw_cs_state) since it is stored in the batch. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-29 08:23:47 -07:00
Jordan Justen	1665d29ee3	mesa/cs: Add MESA_VERBOSE=api support in DispatchCompute* Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-29 08:23:47 -07:00
Jose Fonseca	952366a60e	util: Fix strndup prototype on C++. Trivial.	2015-09-29 16:01:56 +01:00
Tapani Pälli	c0722be9f5	mesa: fix ARRAY_SIZE query for GetProgramResourceiv Patch also refactors name length queries which were using array size in computation; this has to be done at the same time to avoid a regression in the arb_program_interface_query-resource-query Piglit test. Fixes rest of the failures with ES31-CTS.program_interface_query.no-locations v2: make additional check only for GS inputs v3: create helper function for resource name length so that it gets calculated only in one place Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Martin Peres <martin.peres@linux.intel.com>	2015-09-29 12:46:28 +03:00
Iago Toral Quiroga	12d510ab74	glsl: Fix forward NULL dereference coverity warning The comment says that it should be impossible for decl_type to be NULL here, so don't try to handle the case where it is, simply add an assert. >>> CID 1324977: Null pointer dereferences (FORWARD_NULL) >>> Comparing "decl_type" to null implies that "decl_type" might be null. No piglit regressions observed. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-09-29 10:53:08 +02:00
Iago Toral Quiroga	1dc2db7a4d	glsl: Fix null return coverity warning Add an assert on the result of as_dereference() not being NULL: >>> CID 1324978: Null pointer dereferences (NULL_RETURNS) >>> Dereferencing a null pointer "deref_record->record->as_dereference()". Since we are introducing a new variable to hold the result of as_dereference(), take the opportunity to rename deref_record_type to interface_type and just name the new variable interface_deref, which is less confusing. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-29 10:53:08 +02:00
Iago Toral Quiroga	6bf718fec2	glsl: Fix unused value warning reported by Coverity We don't use param in this part of the code, so no point in advancing the pointer forward: >>> CID 1324983: Code maintainability issues (UNUSED_VALUE) >>> Assigning value from "param->get_next()" to "param" here, but that stored value is overwritten before it can be used. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-29 10:53:08 +02:00
Samuel Iglesias Gonsalvez	bea66d22f2	util: implement strndup for WIN32 v2: - Add strndup.h to Makefile.sources (Emil) - Use calloc instead of malloc (Emil). - Check if allocation fails (Emil, Jose) - Add '#pragma once' and include stdlib.h to strndup.h (Jose) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92124 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-09-29 10:03:47 +02:00
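The v2 notes in the strndup commit above (use calloc, check for allocation failure) can be illustrated with a minimal sketch. This is not the code from the commit: the name `sketch_strndup` is hypothetical, and the NULL-input check is an assumption made for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a strndup replacement on platforms that
 * lack one. Returns a newly allocated copy of at most max characters
 * of str, or NULL on failure. */
char *
sketch_strndup(const char *str, size_t max)
{
   size_t n = 0;
   char *ptr;

   if (!str)
      return NULL;

   /* Length of str, capped at max. */
   while (n < max && str[n] != '\0')
      n++;

   /* calloc zero-fills, so the terminating NUL is already in place. */
   ptr = calloc(n + 1, sizeof(char));
   if (!ptr)
      return NULL;   /* allocation failed */

   memcpy(ptr, str, n);
   return ptr;
}
```

The caller owns the returned buffer and releases it with `free()`.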
Samuel Iglesias Gonsalvez	7efb235019	glsl: use correct number of uniform blocks in error message Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-09-29 10:03:47 +02:00
Samuel Iglesias Gonsalvez	6668eb5a45	mesa: rename gl_shader_program's NumUniformBlocks to NumBufferInterfaceBlocks Because it counts shader storage blocks too. v2: - Use NumBufferInterfaceBlocks instead (Jordan). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-29 10:03:47 +02:00
Samuel Iglesias Gonsalvez	38004eb17c	main: fix ACTIVE_UNIFORM_BLOCKS value NumUniformBlocks also counts shader storage blocks. NumUniformBlocks variable will be renamed in a later patch to avoid misunderstandings. v2: - Modify the condition to use !IsShaderStorage and the list of uniform blocks (Timothy) Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-09-29 10:03:47 +02:00
Emil Velikov	589249a792	docs: add news item and link release notes for 11.0.2 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-29 00:22:32 +01:00
Emil Velikov	dda02d202e	docs: add sha256 checksums for 11.0.2 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `4c0b484612`)	2015-09-29 00:21:14 +01:00
Emil Velikov	58e02b2a4e	docs: add release notes for 11.0.2 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `51e0b06d99`)	2015-09-29 00:21:12 +01:00
Anuj Phogat	945592f92c	i965/gen9: Add a condition for starting pixel in fast copy blit This condition restricts the use of fast copy blit to cases where starting pixel of src and dst is oword (16 byte) aligned. Many piglit tests (if using fast copy blit in Mesa) failed earlier because I missed adding this condition. Fast copy blit is currently enabled for use only with Yf/Ys tiling. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-09-28 15:00:53 -07:00
Ilia Mirkin	1d8cba9b51	nouveau: wait to unref the transfer's bo until it's no longer used The bo will often come from a slab in which case it doesn't matter. But for larger allocations this will be in its own bo, and we have to make sure to wait until it's no longer used in order for it to be freed. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org Tested-by: Marcin Ślusarz <marcin.slusarz@gmail.com>	2015-09-28 17:28:54 -04:00
Ilia Mirkin	3a6b9a7830	nouveau: delay deleting buffer with unflushed fence If there is an unflushed fence on the bo, then the resource may still be used in commands built up in the local pushbuf. Flushing can cause all sorts of unwanted effects, so just free the bo when the relevant fence is hit. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org Tested-by: Marcin Ślusarz <marcin.slusarz@gmail.com>	2015-09-28 17:28:54 -04:00
Ilia Mirkin	d4e650b07b	nouveau: be more careful about freeing temporary transfer buffers Deleting a buffer does not flush the command stream. Make sure that we wait for the copies to finish before deleting the temporary bo. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org Tested-by: Marcin Ślusarz <marcin.slusarz@gmail.com>	2015-09-28 17:28:54 -04:00
Anuj Phogat	4c5308bbf4	i965: Rename intel_miptree_get_dimensions_for_image() This function isn't specific to miptrees. So, drop the "miptree" from function name. V3: Add a comment explaining how the 1D Array texture height and depth is interpreted by Intel hardware. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-09-28 12:43:43 -07:00
Anuj Phogat	0bfd914f9f	i965/gen9: Fix {src, dst}_pitch alignment check for XY_FAST_COPY_BLT I misinterpreted the alignment restriction in XY_FAST_COPY_BLT earlier. Instead of checking pitch for 64KB alignment we need to check it for tile width alignment. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-09-28 12:43:43 -07:00
Anuj Phogat	0fa39bff19	i965: Fix {src, dst}_pitch alignment check for XY_SRC_COPY_BLT Current code checks the alignment restrictions only for Y tiling. From Broadwell PRM vol 10: "pitch is of 512Byte granularity for Tile-X: This means the tiled-x surface pitch can be (512, 1024, 1536, 2048...)/4 (in Dwords)." This patch adds the restriction for X tiling as well. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-09-28 12:43:43 -07:00
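The pitch restriction quoted in the XY_SRC_COPY_BLT commit above can be sketched as a small predicate. The 512-byte figure for Tile-X comes from the PRM quote in the commit message; the 128-byte figure for Tile-Y and all names here (`sketch_tiling`, `sketch_pitch_is_aligned`) are assumptions for illustration, not the driver's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tiling enum for this sketch only. */
enum sketch_tiling {
   SKETCH_TILING_NONE,
   SKETCH_TILING_X,
   SKETCH_TILING_Y
};

/* Returns true if pitch_bytes satisfies the tiled-pitch granularity. */
bool
sketch_pitch_is_aligned(uint32_t pitch_bytes, enum sketch_tiling tiling)
{
   switch (tiling) {
   case SKETCH_TILING_X:
      /* From the quoted PRM text: 512-byte granularity for Tile-X. */
      return pitch_bytes % 512 == 0;
   case SKETCH_TILING_Y:
      /* Assumption: 128-byte granularity for Tile-Y. */
      return pitch_bytes % 128 == 0;
   default:
      /* Linear surfaces are not checked in this sketch. */
      return true;
   }
}
```

A pitch of 1536 bytes passes for Tile-X (1536 = 3 * 512), while 768 bytes does not.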
Anuj Phogat	e83b07aa7b	i965: Move conversion of {src, dst}_pitch to dwords outside if/else Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-09-28 12:43:43 -07:00
Anuj Phogat	485285498f	i965: Delete temporary variable 'src_pitch' Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-09-28 12:43:43 -07:00
Anuj Phogat	bbbc9fd8e5	i965: Use helper function intel_get_tile_dims() in surface setup It takes care of using the correct tile width if we later use other tiling patterns for aux miptree. V2: Remove the comment about using Yf for aux miptree. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-09-28 12:43:43 -07:00
Anuj Phogat	1dc41be9eb	i965: Use intel_get_tile_dims() to get tile masks This will require change in the parameters passed to intel_miptree_get_tile_masks(). V2: Rearrange the order of parameters. (Ben) Change the name to intel_get_tile_masks(). (Topi) V3: Use temporary variables in intel_get_tile_masks() for clarity. Fix mask_y computation. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-09-28 12:43:43 -07:00
Anuj Phogat	21fdc59d34	i965: Add a helper function intel_get_tile_dims() V2: - Do the tile width/height computations in the new helper function and use it later in intel_miptree_get_tile_masks(). - Change the name to intel_get_tile_dims(). V3: Return the tile_h in number of rows in place of bytes. Document the units of tile_w, tile_h parameters. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-09-28 12:43:43 -07:00
Eduardo Lima Mitev	5edd9961c1	mesa: Use the effective internal format instead for validation When validating format+type+internalFormat for texture pixel operations on GLES3, the effective internal format should be used if the one specified is an unsized internal format. Page 127, section "3.8 Texturing" of the GLES 3.0.4 spec says: "if internalformat is a base internal format, the effective internal format is a sized internal format that is derived from the format and type for internal use by the GL. Table 3.12 specifies the mapping of format and type to effective internal formats. The effective internal format is used by the GL for purposes such as texture completeness or type checks for CopyTex* commands. In these cases, the GL is required to operate as if the effective internal format was used as the internalformat when specifying the texture data." v2: Per the spec, Luminance8Alpha8, Luminance8 and Alpha8 should not be considered sized internal formats. Return the corresponding unsized format instead. v4: * Improved comments in _mesa_es3_effective_internal_format_for_format_and_type(). * Split patch to separate chunk about reordering of error_check_subtexture_dimensions() error check, which is not directly related with this patch. v5: Dropped the split patch because it was actually a workaround for 3 dEQP tests that are buggy: dEQP-GLES2.functional.negative_api.texture.texsubimage2d_neg_offset dEQP-GLES2.functional.negative_api.texture.texsubimage2d_offset_allowed dEQP-GLES2.functional.negative_api.texture.texsubimage2d_neg_wdt_hgt Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2015-09-28 11:39:53 -07:00
Eduardo Lima Mitev	c6bf1cd146	mesa: Move _mesa_base_tex_format() from teximage to glformats files This function will be needed as part of validating the combination of format, type and internal format of texture pixel operations, which happens in glformats files. Specifically, we want to be able to obtain the base format of a resolved effective internal format, to compare it with the original internal format passed. Also, since this function deals solely with GL formats, it fits better in glformats where the rest of similar format functionality rests. The function is moved as-is, without any modification. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2015-09-28 11:39:53 -07:00
Eduardo Lima Mitev	15ab968f62	mesa: Fix order of format+type and internal format checks for glTexImageXD ops The more specific GLES constraints should be checked after the general validation performed by _mesa_error_check_format_and_type(). This is also for consistency with the error checks order of glTexSubImage ops. v3: The change of order uncovered a bug that regresses a couple of piglit tests written against OpenGL-ES 1.1 spec, which expects an INVALID_VALUE instead of the INVALID_ENUM returned by _mesa_error_check_format_and_type() when an invalid format is passed to glTexImage2D. This version of the patch accounts for those cases. Fixes 1 dEQP test: * dEQP-GLES3.functional.negative_api.texture.teximage2d Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2015-09-28 11:39:53 -07:00
Alexander von Gluck IV	7cdd818d2a	egl: Fix missing Haiku include path	2015-09-28 13:58:25 -04:00
Alexander von Gluck IV	255a225265	state_trackers/hgl: Fix missing include path	2015-09-28 13:58:24 -04:00
Francisco Jerez	b61292296b	i965/fs: Fix hang on IVB and VLV with image format mismatch. IVB and VLV hang sporadically when an untyped surface read or write message is used to access a surface of format other than RAW, as may happen when there is a mismatch between the format qualifier of the image uniform and the format of the actual image bound to the pipeline. According to the spec this condition gives undefined results but may not lead to program termination (which is one of the possible outcomes of the hang). Fix it by checking at runtime whether the surface is of the right type. Fixes the "arb_shader_image_load_store.invalid/format mismatch" piglit subtest. Reported-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91718 CC: mesa-stable@lists.freedesktop.org Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-09-28 18:10:39 +03:00
Serge Martin	2518645f63	clover: Implement clCreateImage?D w/ clCreateImage. Replace clCreateImage2D and clCreateImage3D implementation with a call to clCreateImage. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-09-28 18:10:39 +03:00
Serge Martin	f2c52e392b	clover: Implement CL1.2 clCreateImage(). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-09-28 18:10:39 +03:00
Francisco Jerez	92666b90c0	clover: Move down canonicalization of memory object flags into validate_flags(). This will be used to share the same logic between buffer and image creation. v2: Make memory flag set constants local to validate_flags. (Serge Martin)	2015-09-28 18:10:39 +03:00
Samuel Iglesias Gonsalvez	2b9248dc58	docs: mention ARB_shader_storage_buffer_object on 11.1.0 release notes Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-09-28 16:34:24 +02:00
Iago Toral Quiroga	e7ae6d9e14	glsl: revert "glsl: atomic counters can be declared as buffer-qualified variables" This reverts commit `586142658e`. The specs are not explicit about any restrictions related to the types allowed on buffer variables, however, the description of opaque types (like atomic counters) is in conflict with the purpose of buffer variables: "The opaque types declare variables that are effectively opaque handles to other objects. These objects are accessed through built-in functions, not through direct reading or writing of the declared variable. (...) Opaque variables cannot be treated as l-values;(...)" Also, Mesa is already disallowing opaque types in interface blocks anyway, so that commit was not really achieving anything. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-09-28 14:23:26 +02:00
Ilia Mirkin	5bff12ecb4	gallium/util: avoid unreferencing random memory on buffer alloc failure Found by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Albert Freeman <albertwdfreeman@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-09-28 02:38:58 -04:00
Ilia Mirkin	6dd059fefe	mesa: don't leak interface_name Found by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-09-28 02:38:58 -04:00
Timothy Arceri	e413d2fbc4	glsl: fix component size calculation for tessellation and geom shaders Broken in commit `abdab88b30` when adding arrays of arrays support Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-09-28 11:31:50 +10:00
Boyan Ding	3c63a2d2f0	docs/GL3.txt: fix typo Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Albert Freeman <albertwdfreeman@gmail.com>	2015-09-27 17:54:23 -07:00
Kenneth Graunke	d6a41b5f70	i965/gs: Optimize away the EOT write on Gen8+ with static vertex count. With static vertex counts, the final EOT write doesn't actually write any data - it's just there to end the thread. Typically, the last thing before ending the thread will be an EmitVertex() call, resulting in a URB write. We can just set EOT on that. Note that this isn't always possible - there might be an intervening SSBO write/image store, or the URB write may have been in a loop. shader-db statistics for geometry shaders only: total instructions in shared programs: 3173 -> 3149 (-0.76%) instructions in affected programs: 176 -> 152 (-13.64%) helped: 8 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-26 12:02:34 -07:00
Kenneth Graunke	08fe5799e6	i965/gs: Allow src0 immediates in GS_OPCODE_SET_WRITE_OFFSET. GS_OPCODE_SET_WRITE_OFFSET is a MUL with a constant src[1] and special strides. We can easily make the generator handle constant src[0] arguments by instead generating a MOV with the product of both operands. This isn't necessarily a win in and of itself - instead of a MUL, we generate a MOV, which should be basically the same cost. However, we can probably avoid the earlier MOV to put src[0] into a register. shader-db statistics for geometry shaders only: total instructions in shared programs: 3207 -> 3173 (-1.06%) instructions in affected programs: 3207 -> 3173 (-1.06%) helped: 11 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-26 12:02:31 -07:00
Kenneth Graunke	f0a618ee7c	i965: Implement "Static Vertex Count" geometry shader optimization. Broadwell's 3DSTATE_GS contains new "Static Output" and "Static Vertex Count" fields, which control a new optimization. Normally, geometry shaders can output arbitrary numbers of vertices, which means that resource allocation has to be done on the fly. However, if the number of vertices is statically known, the hardware can pre-allocate resources up front, which is more efficient. Thanks to the new NIR GS intrinsics, this is easy. We just call the function introduced in the previous commit to get the vertex count. If it obtains a count, we stop emitting the extra 32-bit "Vertex Count" field in the VUE, and instead fill out the 3DSTATE_GS fields. Improves performance of Gl32GSCloth by 5.16347% +/- 0.12611% (n=91) on my Lenovo X250 laptop (Broadwell GT2) at 1024x768. shader-db statistics for geometry shaders only: total instructions in shared programs: 3227 -> 3207 (-0.62%) instructions in affected programs: 242 -> 222 (-8.26%) helped: 10 v2: Don't break non-NIR paths (just skip this optimization). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-09-26 12:01:58 -07:00
Kenneth Graunke	bcef2abad7	i965: Move GS_THREAD_END mlen calculations out of the generator. The visitor was setting a mlen that was wrong for Broadwell, but the generator was ignoring it and doing the right thing regardless. We may as well move the logic fully into the visitor. This will be useful in the next commit as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-09-26 12:01:57 -07:00
Kenneth Graunke	02530c5dc5	nir: Add a function to count the number of vertices a GS emits. Some hardware (such as Broadwell) can run geometry shaders more efficiently when the number of vertices emitted is statically known. This pass provides a way to obtain the constant vertex count, or -1 indicating that the vertex count is unknown/non-constant. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-09-26 12:01:53 -07:00
Kenneth Graunke	df221f65e2	i965: Simplify handling of VUE map changes. The old code was disastrously complex - spread across multiple atoms which may not even run, inspecting the dirty bits to try and decide whether it was necessary to do checks...storing VS information in brw_context...extra flagging... This code tripped me and Carl up very badly when working on the shader cache code. It's very fragile and hard to maintain. Now that geometry shaders only depend on their inputs and don't have to worry about the VS VUE map, we can dramatically simplify this: just compute the VUE map coming out of the geometry shader stage in brw_upload_programs. If it changes, flag it. Done. v2: Also check vue_map.separable. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-26 11:59:56 -07:00
Kenneth Graunke	6301af22bb	i965/gs: Remove the dependency on the VS VUE map. Because we only support geometry shaders in core profile, we can safely ignore any driver-extending of VS outputs. Those are: - Legacy userclipping (doesn't exist in core profile) - Edgeflag copying (Gen4-5 only, no GS support) - Point coord replacement (Gen4-5 only, no GS support) - front/back color hacks (Gen4-5 only, no GS support) v2: Rebase; leave a comment about why SSO works. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-26 11:59:56 -07:00
Kenneth Graunke	99df02ca26	i965: Don't re-layout varyings for separate shader programs. Previously, our VUE map code always assigned slots to varyings sequentially, in one contiguous block. This was a bad fit for separate shaders - the GS input layout depended on the VS output layout, so if we swapped out vertex shaders, we might have to recompile the GS on the fly - which rather defeats the point of using separate shader objects. (Tessellation would suffer from this as well - we could have to recompile the HS, DS, and GS.) Instead, this patch makes the VUE map for separate shaders use a fixed layout, based on the input/output variable's location field. (This is either specified by layout(location = ...) or assigned by the linker.) Corresponding inputs/outputs will match up by location; if there's a mismatch, we're allowed to have undefined behavior. This may be less efficient - depending what locations were chosen, we may have empty padding slots in the VUE. But applications presumably use small consecutive integers for locations, so it hopefully won't be much worse in practice. 3% of Dota 2 Reborn shaders are hurt, but only by 2 instructions. This seems like a small price to pay for avoiding recompiles. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-26 11:59:56 -07:00
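The fixed, location-based layout described in the separate-shader commit above can be sketched as follows. This is a simplified illustration, not the i965 VUE map code: it assumes each varying's slot is simply its location, and `SKETCH_SLOT_PAD`, `SKETCH_MAX_SLOTS`, and `sketch_fixed_vue_layout` are hypothetical names.

```c
#define SKETCH_SLOT_PAD  (-1)   /* stand-in for a padding-slot marker */
#define SKETCH_MAX_SLOTS 32     /* arbitrary limit for this sketch */

/* Places varying i in the slot given by locations[i] and marks every
 * other slot as padding. slot_to_varying must hold SKETCH_MAX_SLOTS
 * entries. Returns the number of slots the layout spans. */
int
sketch_fixed_vue_layout(const int *locations, int count,
                        int *slot_to_varying)
{
   int i, num_slots = 0;

   /* Pre-fill with padding so holes between locations stay marked. */
   for (i = 0; i < SKETCH_MAX_SLOTS; i++)
      slot_to_varying[i] = SKETCH_SLOT_PAD;

   for (i = 0; i < count; i++) {
      int slot = locations[i];

      if (slot < 0 || slot >= SKETCH_MAX_SLOTS)
         continue;   /* out of range for this sketch */

      slot_to_varying[slot] = i;
      if (slot + 1 > num_slots)
         num_slots = slot + 1;
   }

   return num_slots;
}
```

Because a slot depends only on a varying's own location, swapping the producing shader leaves the consumer's input layout unchanged. The cost is the padding: locations 0 and 3 span four slots, two of them empty.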
Kenneth Graunke	1e5180316c	i965/vue: Make assign_vue_map() take an explicit slot. Our plan of assigning consecutive slots doesn't work properly for separate shader objects - at least, if we want to avoid recompiling them whenever the interface changes. As a first step, make assign_vue_map take an explicit slot parameter, rather than implicitly incrementing it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-26 11:59:56 -07:00
Kenneth Graunke	268008f98c	i965: Initialize unused VUE map slots to BRW_VARYING_SLOT_PAD. Nothing actually relies on unused slots being initialized to BRW_VARYING_SLOT_COUNT. Soon, we're going to have VUE maps with holes in them, at which point pre-filling with BRW_VARYING_SLOT_PAD makes a lot more sense. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-26 11:59:56 -07:00
Kenneth Graunke	39d4b553a8	i965: Fix BRW_VARYING_SLOT_PAD handling in the scalar VS backend. We can't just break for padding slots. Instead, treat them like unwritten output variables, so we handle flushing and incrementing urb_offset correctly. Paul introduced the concept of padding slots back in 2011, but we've never actually used them for anything. So it's unsurprising that the scalar VS backend didn't handle them quite right. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-26 11:59:56 -07:00
Samuel Iglesias Gonsalvez	511a86383b	main/tests: Enable glShaderStorageBlockBinding() check in dispatch_sanity test Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-26 16:54:02 +02:00
Emil Velikov	d2d4f00a2c	docs: add news item and link release notes for 11.0.1 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-26 14:25:19 +01:00
Emil Velikov	5d08669e2f	docs: add sha256 checksums for 11.0.1 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `7f1a77ae66`)	2015-09-26 14:23:00 +01:00
Emil Velikov	aeec994954	docs: add release notes for 11.0.1 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `bcb9e1d26b`)	2015-09-26 14:22:59 +01:00
Timothy Arceri	abdab88b30	glsl: calculate component size for arrays of arrays when varying packing disabled Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-09-26 22:48:49 +10:00
Timothy Arceri	1d401f9ce4	glsl: validate binding qualifier for AoA Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-09-26 22:28:05 +10:00
Timothy Arceri	9bad7afbc2	glsl: add helper for calculating size of AoA V2: return 0 if not array rather than -1 Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-09-26 22:27:47 +10:00
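The helper described in the commit above returns the total element count of an array of arrays, or 0 for a non-array type (per the V2 note). A minimal sketch of that idea, using a hypothetical `sketch_type` structure rather than Mesa's `glsl_type`:

```c
#include <stddef.h>

/* Hypothetical, heavily simplified type description. */
struct sketch_type {
   unsigned length;                     /* element count; unused for non-arrays */
   const struct sketch_type *element;   /* element type; NULL for non-arrays */
};

/* Returns the product of all array dimensions, or 0 if t is not an array. */
unsigned
sketch_arrays_of_arrays_size(const struct sketch_type *t)
{
   unsigned size;

   if (t->element == NULL)
      return 0;   /* not an array */

   size = t->length;

   /* Multiply in each nested array dimension. */
   for (t = t->element; t->element != NULL; t = t->element)
      size *= t->length;

   return size;
}
```

For a type shaped like `float[2][3]`, the sketch yields 6. Returning 0 instead of -1 for non-arrays keeps the result usable as an unsigned count.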
Timothy Arceri	776a3845d6	glsl: clean-up link uniform code These changes are also needed to allow linking of struct and interface arrays of arrays. Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2015-09-26 22:27:24 +10:00
Marek Olšák	9932142192	radeonsi: add scratch buffer to the buffer list when it's re-allocated Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Cc: mesa-stable@lists.freedesktop.org	2015-09-26 01:51:05 +02:00
Leo Liu	1e97b41893	radeon/vce: fix vui time_scale zero error If the app passes 0 as frame_rate_num, it should not be encoded to the VUI. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-25 18:47:14 -04:00
Matt Turner	1dd943d7fb	mesa: Add locking to programs. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-25 14:08:31 -07:00
Matt Turner	3c57a102eb	mesa: Add locking to sampler objects. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-25 14:08:31 -07:00
Matt Turner	d4b0e0b717	mesa: Remove debugging code from _mesa_reference_*. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-25 14:08:31 -07:00
Matt Turner	c8dc04d4c0	c11/threads: Assert that mtx is non-NULL and check return values. Passing NULL to C11 threads functions isn't safe, so there's no need for our implementation to handle it. Cuts about 1k of .text. text data bss dec hex filename 5009514 198440 26328 5234282 4fde6a i965_dri.so before 5008346 198440 26328 5233114 4fd9da i965_dri.so after Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-25 14:08:31 -07:00
Tapani Pälli	266d05a3a0	glsl: fix packed varyings interface type and add default case fixes Piglit test: arb_program_interface_query/linker/query-varyings.shader_test Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-25 12:19:36 +03:00
Antia Puentes	e92c35a872	glsl: Mark as active all elements of shared/std140 block arrays Commit `1ca25ab` (glsl: Do not eliminate 'shared' or 'std140' blocks or block members) considered as active 'shared' and 'std140' uniform blocks and uniform block arrays, but did not include the block array elements. Because of that, it was possible to have an active uniform block array without any elements marked as used, making the assertion ((b->num_array_elements > 0) == b->type->is_array()) in link_uniform_blocks() fail. Fixes the following 5 dEQP tests: * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.18 * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.24 * dEQP-GLES3.functional.ubo.random.nested_structs_arrays_instance_arrays.19 * dEQP-GLES3.functional.ubo.random.all_per_block_buffers.49 * dEQP-GLES3.functional.ubo.random.all_shared_buffer.36 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83508 Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Iago Toral Quiroga	065e7d37f1	docs: Mark ARB_shader_storage_buffer_object as done for i965 v2: - Mark it too for GLES 3.1 Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Samuel Iglesias Gonsalvez	614b5307fd	i965: Enable ARB_shader_storage_buffer_object extension for gen7+ Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Samuel Iglesias Gonsalvez	5b080e3ddf	mesa: enable ARB_shader_storage_buffer_object extension for GLES 3.1 Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Samuel Iglesias Gonsalvez	10b5c6491f	mesa: Add getters for the GL_ARB_shader_storage_buffer_object max constants v2: - Add tessellation shader constants support v3: - Add GLES 3.1 support. v4: - Move the getters to the proper place Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Samuel Iglesias Gonsalvez	91191af6d6	glapi: add ARB_shader_storage_block_buffer_object Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Samuel Iglesias Gonsalvez	26011fa22a	main/tests: add ARB_shader_storage_buffer_object tokens to enum_strings Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Samuel Iglesias Gonsalvez	9b477ad49d	main: Add SHADER_STORAGE_BLOCK and BUFFER_VARIABLE support for ARB_program_interface_query Including TOP_LEVEL_ARRAY_SIZE and TOP_LEVEL_ARRAY_STRIDE queries. v2: - Use std430_array_stride() to get top level array stride following std430's rules. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Iago Toral Quiroga	0f18945cb6	glsl: Do not allow reads from write-only buffer variables The error location won't be right, but fixing that would require to check for this as we process each type of AST node that can involve a variable read. v2: - Limit the check to buffer variables, image variables have different semantics involved. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Iago Toral Quiroga	995a719499	glsl: Do not allow assignments to read-only buffer variables v2: - Merge the error check for the readonly qualifier with the already existing check for variables flagged as readonly (Timothy). - Limit the check to buffer variables, image variables have different semantics involved (Curro). Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Samuel Iglesias Gonsalvez	6ef82f039c	glsl: Allow memory qualifiers on shader storage buffer blocks v2: - Memory qualifiers on shader storage buffer objects do not come in the form of layout qualifiers, they are block-level qualifiers. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Iago Toral Quiroga	f1b647fdd1	glsl: Apply memory qualifiers to buffer variables v2: - Save memory qualifier info in the top level members of a shader storage block. - Add checks to record_compare(), which is used when comparing shader storage buffer declarations in different shaders. - Always report an error for incompatible readonly/writeonly definitions, whether they are present at block or field level. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Iago Toral Quiroga	f4c8c01a3d	glsl: Allow use of memory qualifiers with ARB_shader_storage_buffer_object. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Samuel Iglesias Gonsalvez	3b2037f88c	glsl: fix UNIFORM_BUFFER_START or UNIFORM_BUFFER_SIZE query when no buffer object is bound According to ARB_uniform_buffer_object spec: "If the parameter (starting offset or size) was not specified when the buffer object was bound (e.g. if bound with BindBufferBase), or if no buffer object is bound to <index>, zero is returned." Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Iago Toral Quiroga	2e16dd1350	mesa: Add queries for GL_SHADER_STORAGE_BUFFER These handle querying the buffer name attached to a given binding point as well as the start offset and size of that buffer. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Samuel Iglesias Gonsalvez	4b7b1cf3c0	mesa: add glShaderStorageBlockBinding() Defined in ARB_shader_storage_buffer_object extension. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Iago Toral Quiroga	a07d0c2657	glsl: First argument to atomic functions must be a buffer variable v2: - Add ssbo_in the names of the static functions so it is clear that this is specific to SSBO atomics. v3: - Move the check after the loop (Kristian Høgsberg) Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Iago Toral Quiroga	5ef169034c	i965/nir/vec4: Implement nir_intrinsic_ssbo_atomic_* Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Iago Toral Quiroga	14af6f4698	i965/nir/fs: Implement nir_intrinsic_ssbo_atomic_* Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Iago Toral Quiroga	9d5c0be5d5	nir: Implement lowered SSBO atomic intrinsics The original GLSL IR intrinsics have been lowered to an internal version that accepts a block index and an offset instead of a SSBO reference. v2 (Connor): - Document the sources used by the atomic intrinsics. Reviewed-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:23 +02:00
Iago Toral Quiroga	d2719b6e4f	glsl: lower SSBO atomic intrinsics The first argument to SSBO atomics is a reference to a SSBO buffer variable so we want to compute its block index and offset and provide these values to an internal version of the intrinsic that takes them instead of the buffer variable reference. v2: - Support single components of integer vectors to be passed in as arguments. - Get interface packing information from interface's type. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez	da659087b9	glsl: use ir_rvalue instead of ir_dereference in auxiliary functions In a later commit we will need to handle ir_swizzle nodes too, which are not an ir_dereference. That can happen, for example, when we pass a component of an integer vector as argument to any of the SSBO atomic functions. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Iago Toral Quiroga	ea0a1f5beb	glsl: Add atomic functions from ARB_shader_storage_buffer_object Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Iago Toral Quiroga	2cacebaad3	glsl: Rename atomic counter functions Shader Storage Buffer Object will add new atomic functions that are not associated with counters, so better have atomic counter-specific functions explicitly include the word "counter" in their names. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez	586142658e	glsl: atomic counters can be declared as buffer-qualified variables Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Iago Toral Quiroga	475d9c32d1	nir/glsl_to_nir: ignore an instruction's dest if it hasn't any Reviewed-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Iago Toral Quiroga	e3f9c7829c	i965/nir/vec4: Implement nir_intrinsic_load_ssbo Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Iago Toral Quiroga	5b186aafe7	i965/nir/fs: Implement nir_intrinsic_load_ssbo Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Iago Toral Quiroga	e59ae238b6	nir: Implement __intrinsic_load_ssbo v2: - Fix ssbo loads with boolean variables. v3: - Simplify the changes (Kristian) Reviewed-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez	3e70c968de	nir: modify the instruction insertion in nir_visitor::visit(ir_call ir) This patch moves nir_instr_insert_after_cf_list call into each case in the intrinsics switch at nir_visitor::visit(ir_call ir) and define a nir_dest variable which will be used when handling ir->return_deref after the switch. This patch simplifies the code for nir_intrinsic_load_ssbo implementation changes we are going to do next. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Iago Toral Quiroga	922b3d1bb1	i965/nir/vec4: Implement nir_intrinsic_store_ssbo Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Iago Toral Quiroga	337dad8cee	i965/nir/fs: Implement nir_intrinsic_store_ssbo Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Iago Toral Quiroga	9bb7d9ecf8	nir: Implement __intrinsic_store_ssbo v2 (Connor): - Make the STORE() macro take arguments for the extra sources (and their size) and any extra indices required. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Francisco Jerez	f17c6b9066	i965/vec4: Import surface message builder functions. Implement helper functions that can be used to construct and send untyped and typed surface read, write and atomic messages to the shared dataport unit. v2: Split from the FS implementation. v3: Rewrite to avoid evil array_reg, emit_collect and emit_zip. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Francisco Jerez	d5503ce39f	i965/vec4: Import helpers to convert vectors into arrays and back. These functions handle the conversion of a vec4 into the form expected by the dataport unit in message and message return payloads. The conversion is not always trivial because some messages don't support SIMD4x2 for some generations, in which case a strided copy may be necessary. v2: Split from the FS implementation. v3: Rewrite to avoid evil array_reg, emit_collect and emit_zip. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Francisco Jerez	402cb7ce13	i965/vec4: Introduce VEC4 IR builder. See "i965/fs: Introduce FS IR builder." for the rationale. v2: Drop scalarizing VEC4 builder. v3: Take a backend_shader as constructor argument. Improve handling of debug annotations and execution control flags. Rename "instr" variable. Initialize cursor to NULL by default and add method to explicitly point the builder at the end of the program. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez	203cd1bf28	glsl: shader storage blocks use different max block size values than uniforms Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez	eb9a9b62b1	glsl: ignore buffer variables when counting uniform components Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez	138e4ae8ae	glsl: number of active shader storage blocks must be within allowed limits Notice that we should differentiate between shader storage blocks and uniform blocks, since they have different limits. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez	a7b4ab45d0	glsl: a shader storage buffer must be smaller than the maximum size allowed Otherwise, generate a link time error as per the ARB_shader_storage_buffer_object spec. v2: - Fix error message (Jordan) v3: - Move std140_size() changes to its own patch (Kristian) Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez	e854a98001	glsl: add std430 interface packing support to ssbo related operations v2: - Get interface packing information from interface's type, not the variable type. - Simplify is_std430 condition in emit_access() for readability (Jordan) - Add a comment explaining why the array of three-component vectors case is different in std430 than the rest of cases. - Add calls to std430_array_stride(). v3: - Simplify size_mul change for std430's case (Jordan) - Fix commit log lines length (Jordan) - Pass 'packing' instead of 'is_std430' to emit_access() (Kristian) Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez	1be180b941	glsl: Add std430 support to program_resource_visitor's member functions They are used to calculate the offset, array stride of uniform/shader storage buffer variables. Take into account this info to get the right value for std430. v2: - Fix commit log line length and indention. (Jordan) Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez	8f0167c65b	glsl: Add parser/compiler support for std430 interface packing qualifier v2: - Fix a missing check in has_layout() v3: - Mention shader storage block in error message for layout qualifiers (Kristian). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:22 +02:00
Samuel Iglesias Gonsalvez	35476c2bae	glsl: Add std430 related member functions to glsl_type class They are used to calculate size, base alignment and array stride values for a glsl_type following std430 rules. v2: - Paste OpenGL 4.3 spec wording as it mentions stride of array. (Jordan) Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	a40f917c4b	glsl: allow default qualifiers for shader storage block definitions This kind of definitions: layout(xxx) buffer; was not supported by commit `84fc5fece0`. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	3763a0e0a7	glsl: Move interface block processing to glsl_parser_extras.cpp No functional changes. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	9c1f10b1bc	glsl: ignore default qualifier declarations when checking for duplicate layout qualifiers Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	130031168d	glsl: layout qualifier can appear more than once since OpenGL 4.20 Also if GL_ARB_shading_language_420pack extension is enabled. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	5bb5eeea00	i965/wm: surfaces should have the API buffer size, not the drm buffer size The returned drm buffer object has a size multiple of 4096 but that should not be exposed to the API user, which is working with a different size. As far as I can see this problem is only visible in the calculation of the length of unsized arrays used in SSBOs, as the implementation of this needs to query the underlying buffer size via a message. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	eaa6f01c8d	i965/wm: emit null buffer surfaces when null buffers are attached Otherwise we can expect odd things to happen if, for example, we ask for the size of the attached buffer from shader code, since that might query this value from the surface we uploaded and get random results. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	f5dd2c1822	i965/fs/nir: implement nir_intrinsic_get_buffer_size v2: - Remove inst->regs_written assignment as the instruction only writes to one register. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	b23eb643eb	i965/fs: Implement FS_OPCODE_GET_BUFFER_SIZE Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	65d7f5fe9f	i965/vec4/nir: implement nir_intrinsic_get_buffer_size Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	6485880232	i965/vec4: Implement VS_OPCODE_GET_BUFFER_SIZE Notice that Skylake needs to include a header in the sampler message so it will need some tweaks to work there. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	003ce30e36	nir: Implement ir_unop_get_buffer_size This is how backends provide the buffer size required to compute the size of unsized arrays in the previous patch Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	750c694474	glsl: implement unsized array length v2: - Reduce the number of lines over 80 character line width limit. (Thomas Hellan) v3: - Inject the formula to compute the array length in the IR, backends only need to provide the buffer size (Curro) - Create an auxiliary function to simplify code (Jordan Justen) - Rename variables (Jordan Justen) Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	273f61a005	glsl: Add parser/compiler support for unsized array's length() The unsized array length is computed with the following formula: array.length() = max((buffer_object_size - offset_of_array) / stride_of_array, 0) Of these, only the buffer size needs to be provided by the backends, the frontend already knows the values of the two other variables. This patch identifies the cases where we need to get the length of an unsized array, injecting ir_unop_ssbo_unsized_array_length expressions that will be lowered (in a later patch) to inject the formula mentioned above. It also adds the ir_unop_get_buffer_size expression that drivers will implement to provide the buffer length. v2: - Do not define a triop that will force backends to implement the entire formula, they should only need to provide the buffer size since the other values are known by the frontend (Curro). v3: - Call state->has_shader_storage_buffer_objects() in ast_function.cpp instead of using state->ARB_shader_storage_buffer_object_enable (Tapani). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	1440d2a683	glsl: Add unsized array support to glsl_type::std140_size() Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	68f5a4e6d2	glsl: fix indention in glsl_types.cpp No functional changes. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	f3f64cd0c4	glsl: add support for unsized arrays in shader storage blocks They can only be defined in the last position of a shader storage block. When an unsized array is used in different shaders, it might be converted into arrays of different sizes; avoid getting a linker error in that case. v2: - Rework error condition and error messages (Timothy Arceri) v3: - Move OpenGL ES check to its own patch. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:21 +02:00
Samuel Iglesias Gonsalvez	f45d39f6af	glsl: return error if unsized arrays are found in OpenGL ES Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:20 +02:00
Iago Toral Quiroga	6335c79236	i965/fs: Do not split buffer variables Buffer variables are the same as uniforms, only that read/write, so we want the same treatment. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:20 +02:00
Iago Toral Quiroga	2773a7cf1d	i965: handle visiting of ir_var_shader_storage variables Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:20 +02:00
Iago Toral Quiroga	37da6a2acd	i965: Upload Shader Storage Buffer Object surfaces Since these are a special kind of UBOs we emit them together reusing the same infrastructure, however, we use a RAW surface so we can reuse existing untyped read/write/atomic messages which include a pixel mask header that we need to set to obtain correct behavior with helper invocations of the fragment shader. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:20 +02:00
Iago Toral Quiroga	bdbabc57e3	i965: Set MaxShaderStorageBuffers for compute shaders v2: - Set it after the driver's MaxShaderStorageBuffers value assignment. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:20 +02:00
Samuel Iglesias Gonsalvez	36f392c4ef	i965: set ARB_shader_storage_buffer_object related constant values v2: - Add tessellation shader constants assignment v3: - Set MaxShaderStorageBufferBindings to 36. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:20 +02:00
Iago Toral Quiroga	dfdeb94a5a	i965: Implement DriverFlags.NewShaderStorageBuffer We use the same dirty state for SSBOs and UBOs because they share the same infrastructure. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:20 +02:00
Iago Toral Quiroga	332ff009ff	i965: Use 64-byte offset alignment for shader storage buffers This should be a cacheline (64 bytes) so that we can safely have the CPU and GPU writing the same SSBO on non-cachecoherent systems (our Atom CPUs). With UBOs, the GPU never writes, so there's no problem. For an SSBO, the GPU and the CPU can be updating disjoint regions of the buffer simultaneously and that will break if the regions overlap the same cacheline. v2: - Use cacheline size (64 bytes) instead of 16 bytes (Kristian). - Update commit log and add a comment in the code explaining why we use cacheline size (Ben). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:20 +02:00
Samuel Iglesias Gonsalvez	4cf908f9cb	mesa: set MAX_SHADER_STORAGE_BUFFERS to 16. v2: - Set the value to 16 and drop the comment. (Kristian) Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-25 08:39:20 +02:00
Tapani Pälli	4639cea292	glsl: add packed varyings to program resource list This makes sure that user is still able to query properties about variables that have gotten packed by lower_packed_varyings pass. Fixes following OpenGL ES 3.1 test: ES31-CTS.program_interface_query.separate-programs-vertex v2: fix 'name included in packed list' check (Ilia Mirkin) v3: iterate over instances of name using strtok_r (Ilia Mirkin) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-09-25 08:14:41 +03:00
Tapani Pälli	a6b55beb78	mesa: add packed_varyings list to gl_shader This is required to store information about packed varyings, currently these variables get lost and cannot be retrieved later in sensible way for program interface queries. List will be utilized by next patch. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-09-25 08:05:59 +03:00
Jordan Justen	ebbe6cdad7	i965/cs: Implement DispatchComputeIndirect support Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-24 19:15:13 -07:00
Jordan Justen	d11d018ce3	mesa/cs: Implement glDispatchComputeIndirect Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-24 19:15:13 -07:00
Jordan Justen	12cf91db02	mesa/cs: Support GL_DISPATCH_INDIRECT_BUFFER v2: * Use _mesa_has_compute_shaders (Ilia) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-24 19:15:13 -07:00
Jordan Justen	4a1ba7e6bd	mesa/cs: Add _mesa_validate_DispatchCompute Move API validation to _mesa_validate_DispatchCompute in api_validate.c. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-24 19:15:13 -07:00
Roland Scheidegger	19604d30e1	mesa: fix mipmap generation for immutable, compressed textures If the immutable compressed texture didn't have the full mip pyramid, this didn't work, because it tried to generate mip levels for non-existing levels. _mesa_prepare_mipmap_level() would correctly handle this by returning FALSE if the mip level didn't exist, however we actually created the non-existing mip level right before that because we used _mesa_get_tex_image() before calling _mesa_prepare_mipmap_level(). It would then proceed to crash (we allocated the mip level, which is a bad idea on an immutable texture, but didn't initialize the values, leading to assertion failures or segfaults). Fix this by using _mesa_select_tex_image() instead and call it after _mesa_prepare_mipmap_level(), as that function will allocate missing mip levels for non-immutable textures already. This fixes a (2 year old) crash with astromenace which was hack-fixed in ubuntu packages instead: http://bugs.debian.org/718680 (I guess most apps do full mip chains - I believe this app not doing it is actually unintentional, always one level less than full mip chain...). Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-25 00:06:10 +02:00
Matt Turner	d6bb46bbe8	glsl: Expose gl_MaxTess{Control,Evaluation}AtomicCounters. ... with only ARB_shader_atomic_counters. I expected to see interactions with ARB_tessellation_shader in the ARB_shader_atomic_counters spec, but they do not exist. It seems that we should unconditionally expose these variables in the presence of ARB_shader_atomic_counters: gl_MaxTessControlAtomicCounters gl_MaxTessEvaluationAtomicCounters This partially reverts commit `da7adb99e8`. The commit also affected gl_MaxTessControlImageUniforms and gl_MaxTessEvaluationImageUniforms similarly but the ARB_shader_image_load_store spec does list an interaction with ARB_tessellation_shader. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92095 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-24 12:15:47 -07:00
Alejandro Piñeiro	7fee23569b	i965/vec4: check swizzle before discarding a uniform on a 3src operand Without this commit, copy propagation is discarded if it involves a uniform with an instruction that has 3 sources. But 3 sourced instructions can access scalar values. For example, this is what vec4_visitor::fix_3src_operand() is already doing: if (src.file == UNIFORM && brw_is_single_value_swizzle(src.swizzle)) return src; Shader-db results (unfiltered) on NIR: total instructions in shared programs: 6259650 -> 6241985 (-0.28%) instructions in affected programs: 812755 -> 795090 (-2.17%) helped: 7930 HURT: 0 Shader-db results (unfiltered) on IR: total instructions in shared programs: 6445822 -> 6441788 (-0.06%) instructions in affected programs: 296630 -> 292596 (-1.36%) helped: 2533 HURT: 0 v2: - Updated commit message, using Matt Turner suggestions - Move the check after we've created the final value, as Jason Ekstrand suggested - Clean up the condition v3: - Move the check back to the original place, to keep things tidy, as suggested by Jason Ekstrand v4: - Fixed missing is_single_value_swizzle() as pointed by Jason Ekstrand Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-24 21:12:53 +02:00
Mauro Rossi	1d040160f8	android: radeonsi: fix sid_tables.h missing LOCAL_MODULE_CLASS Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-09-24 20:05:41 +02:00
Benjamin Bellec	ebcc886d87	gallium/radeon: remove the percentage symbol from HUD temperature The HUD adds '%' if max == 100. Signed-off-by: Benjamin Bellec <b.bellec@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-09-24 19:54:50 +02:00
Marek Olšák	7bbce21e45	gallium/u_blitter: handle allocation failures Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:43 +02:00
Marek Olšák	ae418a7b56	radeonsi: handle dummy constant buffer allocation failure Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:43 +02:00
Marek Olšák	b737d9c1dc	radeonsi: don't forget to update scratch relocations for LS, HS, ES shaders Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:43 +02:00
Marek Olšák	d556346b35	radeonsi: skip drawing if updating the scratch buffer fails Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:43 +02:00
Marek Olšák	1f99b0be7e	radeonsi: skip drawing if PS fails to compile or upload Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:43 +02:00
Marek Olšák	237d7cccce	radeonsi: skip drawing if VS, TCS, TES, GS fail to compile or upload Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:43 +02:00
Marek Olšák	9b6d9dd7d8	radeonsi: handle fixed-func TCS shader create failure Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:43 +02:00
Marek Olšák	5dbadb0257	radeonsi: handle shader precompile failures Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:43 +02:00
Marek Olšák	263f5a2cf9	radeonsi: skip drawing if GS ring allocations fail Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:43 +02:00
Marek Olšák	22d3ccf5a8	radeonsi: skip drawing if the tess factor ring allocation fails Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:42 +02:00
Marek Olšák	5c219ab552	radeonsi: add malloc fail paths to si_create_shader_state Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:42 +02:00
Marek Olšák	394d67a58f	radeonsi: report alloc failure from si_shader_binary_read Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:42 +02:00
Marek Olšák	dea834e639	gallium/radeon: add a fail path for depth MSAA texture readback Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:42 +02:00
Marek Olšák	f95e695059	gallium/radeon: handle buffer alloc failures in r600_draw_rectangle Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:42 +02:00
Marek Olšák	282b378012	gallium/radeon: handle buffer_map staging buffer failures better Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:42 +02:00
Marek Olšák	cd27ff6a0f	radeonsi: handle constant buffer alloc failures Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:42 +02:00
Marek Olšák	29dff6f676	radeonsi: handle index buffer alloc failures Cc: 11.0 <mesa-stable@lists.freedesktop.org> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-24 19:51:42 +02:00
Marek Olšák	f3a0819533	st/mesa: fix front buffer regression after dropping st_validate_state in Blit Broken by: `d082c53249` Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92072 Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-24 19:51:42 +02:00
Kristian Høgsberg Kristensen	21c1c7ff81	wayland: Add copyright notice for wayland-egl.c Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-09-24 10:51:10 -07:00
Kristian Høgsberg Kristensen	2ea16966ae	i965: Respect stride and subreg_offset for ATTR registers When we assign hw regs to attributes, we don't incorporate the stride and subreg_offset from the fs_reg. It's rarely used, but the integer multiplication lowering uses an unusual stride and subreg_offset combination that breaks when one source is an attribute. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91970 Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-24 10:17:27 -07:00
Brian Paul	200aee4247	mesa: rework Driver.CopyImageSubData() and related code Previously, core Mesa's _mesa_CopyImageSubData() created temporary textures to wrap renderbuffer sources/destinations. This caused a bit of a mess in the Mesa/gallium state tracker because we had to basically undo that wrapping. Instead, change ctx->Driver.CopyImageSubData() to take both gl_renderbuffer and gl_texture_image src/dst pointers (one being null, the other non-null) so the driver can handle renderbuffer vs. texture as needed. For the i965 driver, we basically moved the code that wrapped textures around renderbuffers from copyimage.c down into the meta and driver code. The old code in copyimage.c also made some questionable calls to _mesa_BindTexture(), etc. which weren't undone at the end. v2 (Jason Ekstrand): Rework the intel bits v3 (Brian Paul): Update the temporary st_CopyImageSubData() function. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Tested-by: Nick Sarnie <commendsarnex@gmail.com>	2015-09-24 07:52:42 -06:00
Thomas Hellstrom	c8cb5ed93c	st/xa: Fixups for PIPE_FORMAT_R8_UNORM A8 usage v2. Check for PIPE_FORMAT_R8_UNORM when setting up the copy shader. Also re-enable the dest alpha blending with A8 destination that actually turned out to be correct. Verified using rendercheck that the composite operators overreverse, in, out, atop, atopreverse and xor seem to work fine with a8 destination. v2: Fix a copy-paste error. Reported-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-09-24 04:47:48 -07:00
Ilia Mirkin	1614c39a8f	st/mesa: keep track of saturated writes when eliminating dead code It doesn't matter whether a write is saturated or not, in another implementation it might even have been a separate opcode. This code was most likely copied from the copy-propagation pass (where one does have to distinguish saturation). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-09-24 00:19:55 -04:00
Timothy Arceri	827d794834	glsl: correctly detect inactive UBO arrays Previously the code was trying to get the packing type from the array not the interface. Cc: Ian Romanick <ian.d.romanick@intel.com> Cc: Antia Puentes <apuentes@igalia.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>	2015-09-24 10:07:42 +10:00
Ilia Mirkin	71e187430c	i965: add ARB_texture_barrier support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 15:49:54 -04:00
Kenneth Graunke	31a36ffbc8	i965/gs: Fix extra level of indentation left by the previous commit. I left a bunch of code indented a level in the previous patch to make the diff easier to read. But now we should fix that. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 11:00:00 -07:00
Kenneth Graunke	df31c1850d	i965/gs: Use new NIR intrinsics. By performing the vertex counting in NIR, we're able to elide a ton of useless safety checks around every EmitVertex() call: total instructions in shared programs: 3952 -> 3720 (-5.87%) instructions in affected programs: 3491 -> 3259 (-6.65%) helped: 11 HURT: 0 Improves performance in Gl32GSCloth by 0.671742% +/- 0.142202% (n=621) on Haswell GT3e at 1024x768. This should also make it easier to implement Broadwell's "Static Vertex Count" feature someday. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 11:00:00 -07:00
Kenneth Graunke	542d40d698	nir: Add new GS intrinsics that maintain a count of emitted vertices. This patch also introduces a lowering pass to convert the simple GS intrinsics to the new ones. See the comments above that for the rationale behind the new intrinsics. This should be useful for i965; it's a generic enough mechanism that I could see other drivers potentially using it as well, so I don't feel too bad about putting it in the generic code. v2: - Use nir_after_block_before_jump for the cursor (caught by Jason Ekstrand - I'd mistakenly used nir_after_block when rebasing this code onto the new NIR control flow API). - Remove the old emit_vertex intrinsic at the end, rather than in the middle (requested by Jason). - Use state->... directly rather than locals (requested by Jason). - Report progress from nir_lower_gs_intrinsics() (requested by me). - Remove "Authors:" section from file comment (requested by Michael Schellenberger Costa). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 11:00:00 -07:00
Kenneth Graunke	0a040975ec	nir: Add unit tests for control flow graphs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com> Acked-by: Connor Abbott <cwabbott0@gmail.com>	2015-09-23 11:00:00 -07:00
Kenneth Graunke	fbaa1b19d7	nir/cf: Fix dominance metadata in the dead control flow pass. The NIR control flow modification API churns the block structure, splitting blocks, stitching them back together, and so on. Preserving information about block dominance is hard (and probably not worthwhile). This patch makes nir_cf_extract() throw away all metadata, like we do when adding/removing jumps. We then make the dead control flow pass compute dominance information right before it uses it. This is necessary because earlier work by the pass may have invalidated it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 11:00:00 -07:00
Kenneth Graunke	6560838703	nir/cf: Fix unlink_block_successors to actually unlink the second one. Calling unlink_blocks(block, block->successors[0]) will successfully unlink the first successor, but then will shift block->successors[1] down to block->successors[0]. So the successors[1] != NULL check will always fail. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 11:00:00 -07:00
Kenneth Graunke	024e5ec977	nir/cf: Alter block successors before adding a fake link. Consider the case of "while (...) { break }". Or in NIR: block block_0 (0x7ab640): ... /* succs: block_1 */ loop { block block_1: /* preds: block_0 */ break /* succs: block_2 */ } block block_2: Calling nir_handle_remove_jump(block_1, nir_jump_break) will remove the break. Unfortunately, it would mangle the predecessors and successors. Here, block_2->predecessors->entries == 1, so we would create a fake link, setting block_1->successors[1] = block_2, and adding block_1 to block_2's predecessor set. This is illegal: a block cannot specify the same successor twice. In particular, adding the predecessor would have no effect, as it was already present in the set. We'd then call unlink_block_successors(), which would delete the fake link and remove block_1 from block_2's predecessor set. It would then delete successors[0], and attempt to remove block_1 from block_2's predecessor set a second time...except that it wouldn't be present, triggering an assertion failure. The fix appears to be simple: simply unlink the block's successors and recreate them to point at the correct blocks first. Then, add the fake link. In the above example, removing the break would cause block_1 to have itself as a successor (as it becomes an infinite loop), so adding the fake link won't cause a duplicate successor. v2: Add comments (requested by Connor Abbott) and fix commit message. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 10:59:59 -07:00
Kenneth Graunke	0991b2eb35	nir/cf: Conditionally do block_add_normal_succs() in unlink_jump(); There is a bug where we mess up predecessors/successors due to the ordering of unlinking/recreating edges/adding fake edges. In order to fix that, I need everything in one routine. However, calling block_add_normal_succs() isn't safe from cleanup_cf_node() - it would crash trying to insert phi undefs. So unfortunately I need to add a parameter. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 10:59:59 -07:00
Kenneth Graunke	9674c76c0e	nir/cf: Don't break outer-block successors in split_block_beginning(). Consider the following NIR: block block_0; /* succs: block_1 block_2 */ if (...) { block block_1; ... } else { block block_2; } Calling split_block_beginning() on block_1 would break block_0's successors: link_block() sets both successors of a block, so calling link_block(block_0, new_block, NULL) would throw away the second successor, leaving only /* succ: new_block */. This is invalid: the block before an if statement must have two successors. Changing the call to link_block(pred, new_block, pred->successors[0]) would correctly leave both successors in place, but because unlink_block may shift successor[1] to successor[0], it may not preserve the original order. NIR maintains a convention that successor[0] must point to the "then" block, while successor[1] points to the "else" block, so we need to take care to preserve this ordering. This patch creates a new function that swaps out one successor for another, preserving the ordering. It then uses this to fix the issue. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 10:59:59 -07:00
Kenneth Graunke	e2637db618	nir/cf: Make a helper function for removing a predecessor. I need to do this in a second place, and I'd rather make a helper function than cut and paste the code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 10:59:59 -07:00
Kenneth Graunke	6a67ede6b3	nir: Validate that a block doesn't have two identical successors. This is invalid, and causes disasters if we try to unlink successors: removing the first will work, but removing the second copy will fail because the block isn't in the successor's predecessor set any longer. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-23 10:59:59 -07:00
Jason Ekstrand	8dcbca5957	nir/lower_vec_to_movs: Don't emit unneeded movs It's possible, if a vecN operation is involved in a phi node, that we could end up moving from a register to itself. If swizzling is involved, we need to emit the move. However, if there is no swizzling, then the mov is a no-op and we might as well not bother emitting it. Shader-db results on Haswell: total instructions in shared programs: 6262536 -> 6259558 (-0.05%) instructions in affected programs: 184780 -> 181802 (-1.61%) helped: 838 HURT: 0 Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-23 10:12:39 -07:00
Jason Ekstrand	65e80ce5b5	nir/lower_vec_to_movs: Properly handle source modifiers on vecN ops I don't know of any piglit tests that are currently broken. However, there is nothing stopping a vecN instruction from getting source modifiers and lower_vec_to_movs is run after we lower to source modifiers. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-23 10:12:39 -07:00
Ville Syrjälä	aae0c88797	i915: Make hw_prim[] const The table used to map the GL primitive to the hw primitive never changes so make it const. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-09-23 09:57:46 -07:00
Ville Syrjälä	84fec757de	t_dd_dmatmp: Make the render_tab[]s const These tables hold function pointers and they never change so make them const. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-09-23 09:57:46 -07:00
Ian Romanick	abbaf3301f	mesa: Remove unused HAVE_TRI_STRIP_1 defines Defined to 0 in a few places, but it's not used anywhere. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:57:42 -07:00
Ian Romanick	d830965057	t_dd_dmatmp: Constify dmasz Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:57:38 -07:00
Ian Romanick	8e9968f184	t_dd_dmatmp: Silence comparison between signed and unsigned integer expression warnings ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:83:28: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:83:55: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:116:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:116:52: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:140:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:140:52: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h: In function 'intel_render_line_loop_verts': ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:174:28: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:174:55: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:224:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:224:52: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:255:25: warning: comparison between signed and unsigned integer 
expressions [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:255:52: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:281:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - j + 1); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:281:56: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - j + 1); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h: In function 'intel_render_poly_verts': ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:313:28: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - j + 1); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:313:59: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - j + 1); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:365:28: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - nr); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:365:56: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - nr); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:83:28: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:83:55: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:116:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:116:52: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = 
MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:140:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:140:52: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h: In function 'radeon_dma_render_line_loop_verts': ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:174:28: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:174:55: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:224:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:224:52: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:255:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:255:52: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - j); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:281:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - j + 1); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:281:56: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - j + 1); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h: In function 'radeon_dma_render_poly_verts': ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:313:28: warning: comparison between 
signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - j + 1); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:313:59: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - j + 1); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:365:28: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] nr = MIN2(currentsz, count - nr); ^ ../../../../../src/mesa/tnl_dd/t_dd_dmatmp.h:365:56: warning: signed and unsigned type in conditional expression [-Wsign-compare] nr = MIN2(currentsz, count - nr); ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:57:26 -07:00
Ian Romanick	d663d8f5d4	t_dd_dmatmp: Use stdbool.h No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:57:24 -07:00
Ian Romanick	b7259fc6b0	t_dd_dmatmp: General indentation and formatting fixes No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:57:22 -07:00
Ian Romanick	57ae5c237d	t_dd_dmatmp: Indentation and formatting fixes after HAVE_ELTS change No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:57:20 -07:00
Ian Romanick	25b42f13bd	t_dd_dmatmp: Remove HAVE_ELTS support Two drivers use this file, and neither supports ELTs. No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:57:17 -07:00
Ian Romanick	1f374958fd	t_dd_dmatmp: Indentation and formatting fixes after HAVE_TRI_FANS change No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:57:15 -07:00
Ian Romanick	03c3208c18	t_dd_dmatmp: Require HAVE_TRI_FANS Two drivers use this file, and both support triangle fans. No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:57:13 -07:00
Ian Romanick	2e19ed3cb5	t_dd_dmatmp: Indentation and formatting fixes after HAVE_TRI_STRIPS change v2: Fix '- nr' typo noticed by Marius. No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> [v1]	2015-09-23 09:57:11 -07:00
Ian Romanick	fd97a05508	t_dd_dmatmp: Require HAVE_TRI_STRIPS Two drivers use this file, and both support triangle strips. No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:57:08 -07:00
Ian Romanick	22b73f3c2a	t_dd_dmatmp: Require HAVE_TRIANGLES Two drivers use this file, and both support triangles. No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:57:06 -07:00
Ian Romanick	dcd8e49962	t_dd_dmatmp: Indentation and formatting fixes after HAVE_LINE_STRIPS change No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:57:04 -07:00
Ian Romanick	1ecdf956ac	t_dd_dmatmp: Require HAVE_LINE_STRIPS Two drivers use this file, and both support line strips. No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:57:01 -07:00
Ian Romanick	1ab8a69a3b	t_dd_dmatmp: Indentation and formatting fixes after HAVE_LINES change No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:56:59 -07:00
Ian Romanick	b8461e03f0	t_dd_dmatmp: Require HAVE_LINES Two drivers use this file, and both support lines. No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:56:56 -07:00
Ian Romanick	265624c5af	t_dd_dmatmp: Indentation and formatting fixes after HAVE_QUADS change No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:56:53 -07:00
Ian Romanick	4ecc387a93	t_dd_dmatmp: Remove HAVE_QUADS support Two drivers use this file, and neither supports quads. No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:56:51 -07:00
Ian Romanick	249ba09f59	t_dd_dmatmp: Remove HAVE_QUAD_STRIPS support Two drivers use this file, and neither supports quad strips. No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-23 09:56:48 -07:00
Ian Romanick	25543d8ec5	t_dd_dmatmp: Use addition instead of subtraction in loop bounds This is used everywhere else in this file because it avoids problems when count is zero (due to trimming). No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38109 Reviewed-by: Brian Paul <brianp@vmware.com> Cc: Marius Predut <marius.predut@intel.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-23 09:56:46 -07:00
Ian Romanick	c0b3b2f760	t_dd_dmatmp: Pull out common 'count -= count & 3' code This was missing in the HAVE_TRIANGLES path, and that could cause incorrect rendering. No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38109 Reviewed-by: Brian Paul <brianp@vmware.com> Cc: Marius Predut <marius.predut@intel.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-23 09:56:43 -07:00
Ian Romanick	0d475ee2b9	t_dd_dmatmp: Use '& 3' instead of '% 4' everywhere No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-23 09:56:36 -07:00
Ian Romanick	fad8d54de7	t_dd_dmatmp: Clean up improper code formatting from previous patch No piglit regressions on i915 (G33) or radeon (Radeon 7500). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-23 09:56:34 -07:00
Ian Romanick	d7bf7969b9	t_dd_dmatmp: Make "count" actually be the count The value passed in count previously was "vertex after the last vertex to be processed." Calling that "count" was misleading and kind of mean. Looking at the code, many functions immediately do "count-start" to get back the true count. That's just silly. If it is better for the loops to be 'for (j = start; j < (start + count); j++)', GCC will do that transformation. NOTE: There is some strange formatting left by this patch. That was done to make it more obvious that the before and after code is equivalent. These will be fixed in the next patch. No piglit regressions on i915 (G33) or radeon (Radeon 7500). v2: Fix a remaining (count-start) in render_quad_strip_verts. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> [v1] Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-23 09:56:01 -07:00
Antia Puentes	f2e75ac88a	i965/vec4: Don't coalesce regs in Gen6 MATH ops if reswizzle/writemask needed Gen6 MATH instructions can not execute in align16 mode, so swizzles or writemasking are not allowed. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92033 Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-23 13:12:25 +02:00
Iago Toral Quiroga	cf439951b7	mesa: Fix GL_FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE for default framebuffer. From section 9.2. Binding and Managing Framebuffer Objects: "Upon successful return from Get*FramebufferAttachmentParameteriv, if pname is FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE, then params will contain one of NONE, FRAMEBUFFER_DEFAULT, TEXTURE, or RENDERBUFFER, identifying the type of object which contains the attached image." And then it clarifies further: "If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is NONE, then either no framebuffer is bound to target; or the default framebuffer is bound, attachment is DEPTH or STENCIL, and the number of depth or stencil bits, respectively, is zero" Currently, if the default framebuffer is bound, we always return GL_FRAMEBUFFER_DEFAULT for FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE, but according to the spec, when GL_DEPTH or GL_STENCIL attachments are the ones being queried, we should return GL_NONE if they don't exist. Fixes the following dEQP test: dEQP-GLES3.functional.state_query.fbo.framebuffer_attachment_x_size_initial Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.6" <mesa-stable@lists.freedesktop.org>	2015-09-23 12:50:00 +02:00
Tapani Pälli	89524e7171	glsl: bail out early in _mesa_ShaderSource if no shaderobj Patch fixes a crash in conformance test that tries out different invalid arguments for glShaderSource and glGetShaderSource: ES2-CTS.gtf.GL.glGetShaderSource.getshadersource_programhandle This is a regression from commit: `04e201d0c0` Additions in v2 also fix following failing deqp test: dEQP-GLES[2|3].functional.negative_api.shader.shader_source v2: cleanup function, do check earlier (Iago Toral) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-09-23 08:45:00 +03:00
Matt Turner	10da96887c	i965/vec4: Detect and delete useless MOVs. With NIR: instructions in affected programs: 111508 -> 109193 (-2.08%) helped: 507 Without NIR: instructions in affected programs: 28763 -> 28474 (-1.00%) helped: 186 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-22 21:20:29 -07:00
Jason Ekstrand	e7496fed2a	prog_to_nir: Use nir_op_dph Shader-db results on HSW: instructions in affected programs: 72 -> 56 (-22.22%) Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-22 20:37:35 -07:00
Jason Ekstrand	999ff3c77d	nir/lower_alu_to_scalar: Add support for nir_op_fdph Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-22 20:37:35 -07:00
Jason Ekstrand	2e5423ad63	i965/vec4: Add support for fdph_replicated Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-22 20:37:35 -07:00
Jason Ekstrand	e5a9346d00	nir: Add fdph and fdph_replicated opcodes Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-22 20:37:35 -07:00
Jason Ekstrand	0f9bf64770	nir/lower_alu_to_scalar: Return after lower_reduction We don't use any of the code after the switch anyway. Since we check for num_components == 1 and early-return, it doesn't get executed so everything's ok. However, it makes it much clearer what's going on if we simply do an early return. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-22 20:37:35 -07:00
Jason Ekstrand	2b79db2c02	nir/lower_alu_to_scalar: Use the builder Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-22 20:37:35 -07:00
Chris Forbes	f5991ebf34	i965: Add defines for tessellation stages v2 (Ken): - Squash together commits for HS, DS, and TE, as well as fixes. - Add INTEL_MASK variants so we can use SET_FIELD if we want. - Rename GEN7_HS_INSTANCE_CONTROL to GEN7_HS_INSTANCE_COUNT to match the documentation. - Add some more fields from the PRMs. - Add Broadwell variants. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-22 20:23:46 -07:00
Grazvydas Ignotas	8ae8feca84	r600g: update num_dw in scissor_enable workaround "r600g: apply disable workaround on all scissors" forgot to update num_dw, fix it. Fixes: `fbb423b433` "r600g: apply disable workaround on all scissors" Reported-and-tested-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-23 09:09:04 +10:00
Alejandro Piñeiro	1bd89db921	i965/vec4: refactor brw_vec4_copy_propagation. Now it is more similar to brw_fs_copy_propagation, with three clear stages: 1) Build up the value we are propagating as if it were the source of a single MOV: 2) Check that we can propagate that value 3) Build the final value Previously everything was somewhat messed up, making the implementation on some specific cases, like knowing if you can propagate from a previous instruction even with type mismatches, even messier (for example, with the need of maintaining more of one has_source_modifiers). The refactoring clears stuff, and gives support to this mentioned use case without doing anything extra (for example, only one has_source_modifiers is used). Shader-db results for vec4 programs on Haswell: total instructions in shared programs: 1683842 -> 1669037 (-0.88%) instructions in affected programs: 739837 -> 725032 (-2.00%) helped: 6237 HURT: 0 v2: using 'arg' index to get the from inst was wrong v3: rebased against last change on the previous patch of the series v4: don't need to track instructions on struct copy_entry, as we only set the source on a direct copy v5: change the approach for a refactoring v6: tweaked comments Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-22 19:30:18 +02:00
Brian Paul	4a03066e5a	st/mesa: remove st_bind_framebuffer() The function was a no-op and if the ctx->Driver.BindFramebuffer pointer is null, Mesa won't try to use it. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-22 10:15:32 -06:00
Brian Paul	b590ffd0f9	mesa: const-qualify _mesa_is_legal_tex_storage_format ctx param Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-22 10:15:32 -06:00
Brian Paul	acee1a322d	mesa: const-qualify _mesa_base_tex_format() ctx param Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-22 10:15:31 -06:00
Brian Paul	4879b76601	mesa: const-qualify buffer_object_subdata_range_good() bufObj parameter Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-22 10:15:30 -06:00
Brian Paul	76dbab0a69	mesa: whitespace, comment fixes in texstorage.c	2015-09-22 09:10:10 -06:00
Marta Lofstedt	419210005a	mesa/es3.1: Enable GL_ARB_vertex_attrib_binding functionality for GLES 3.1 Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-09-22 12:22:13 +02:00
Marta Lofstedt	cf293e518e	mesa/es3.1: Allow query of Vertex bindings for GLES 3.1 Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-09-22 12:22:06 +02:00
Marta Lofstedt	6c3de8996f	mesa/es3.1 : Align OpenGL ES 3.1 glBindVertexBuffer error handling with OpenGL Core According to OpenGL ES 3.1 specification 10.3.1: "An INVALID_OPERATION error is generated if buffer is not zero or a name returned from a previous call to GenBuffers, or if such a name has since been deleted with DeleteBuffers." This error check was previously limited to OpenGL Core. Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-09-22 12:21:59 +02:00
Tapani Pälli	7f8815bcb9	i965: fix textureGrad for cubemaps Fixes bugs exposed by commit `2b1cdb0edd` in: ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_frag No regressions observed in deqp, CTS or Piglit. v2: address review feedback from Iago Toral: - move rho calculation to else branch - optimize dx and dy calculation - fix documentation inconsistencies Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91114 Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-22 08:14:20 +03:00
Kenneth Graunke	5cede90f62	nir: Report progress from nir_normalize_cubemap_coords(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-21 13:54:34 -07:00
Kenneth Graunke	d7ffd90ecb	nir: Add braces around multi-line loop. This was correct but not our usual style. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-21 13:47:01 -07:00
Kenneth Graunke	0a1adaf11d	nir: Report progress from nir_lower_system_values(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-21 13:47:00 -07:00
Kenneth Graunke	dc18b9357b	nir: Report progress from nir_split_var_copies(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-21 13:46:59 -07:00
Kenneth Graunke	cfae0f8a3a	nir: Report progress from nir_lower_locals_to_regs(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-21 13:46:57 -07:00
Kenneth Graunke	1adde5b87e	nir: Report progress from nir_remove_dead_variables(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-21 13:46:55 -07:00
Jason Ekstrand	9f5e7ae9d8	nir: Report progress from lower_vec_to_movs(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-21 13:46:54 -07:00
Kenneth Graunke	967a5ddb88	nir: Report progress from nir_lower_globals_vars_to_local(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-21 13:46:45 -07:00
Jason Ekstrand	60befc6347	i965: Clean up GLSL compiler option setup The only functional change here is that we now set EmitNoIndirectOutput and EmitNoIndirectTemp for compute shaders. Compute shaders don't have outputs per-se and we should have been setting EmitNoIndirectTemp all along. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-21 13:26:52 -07:00
Jeremy Huddleston	6dfc5e28f7	configure.ac: Add support to enable read-only text segment on x86. Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.gentoo.org/240956 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-09-21 12:47:09 -07:00
Ben Widawsky	c1e38ad370	i965/skl: Use larger URB size where available. All SKL SKUs except the lowest one which has half the L3 size actually have 384K of URB per slice. For once, I can explain how this mistake was made and how it was missed in review... Historically when we enable a platform and put the production sizes, you can simply look at the "smallest" SKU and see what its URB size is (and we assumed it was the 1 slice variant). Since on newer platforms the URB sizes are scaled automatically by HW, this was sufficient. On SKL, this is a bit different as the lowest SKU actually has half of the L3 fused off. GT2 is the 1 slice (not GT1) variant and it has 384K. There are no Jenkins tests fixed (or regressions) and we don't expect any fixes here because you can always run with less URB size. Thanks to Sarah for bringing this to my attention. Cc: Sarah Sharp <sarah.a.sharp@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-09-21 11:27:08 -07:00
Jason Ekstrand	46362db4a6	nir/builder: Don't use designated initializers Designated initializers are not allowed in C++ (not even C++11). Since nir_lower_samplers is now using nir_builder, and nir_lower_samplers is in C++, this breaks the build on some compilers. Apparently, GCC 5 allows it to some limited extent because mesa still builds on my system without this patch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92052 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-21 10:41:43 -07:00
Jason Ekstrand	d513388c8a	nir: Move system value -> intrinsic mapping into nir.c This way they're right next to the map going the other direction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-21 09:49:40 -07:00
Emil Velikov	de7ffdb383	nir: rename nir_lower_samplers.c{pp,} With the only C++ function having its own wrapper we can 'demote' this file to a normal C one. This allows us to get rid of extern C { #include <foo.h> } 'hacks'. Plus some of the headers may use C99 initializers, which are not supported by the ISO standard. This may cause build issue on incremental builds. If so run the following: sed -i -e 's\|samplers\.cpp\|samplers.c\|' src/glsl/nir/.deps/nir_lower_samplers.Plo Fixes: ef8eebc6ad5(nir: support indirect indexing samplers in struct arrays) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reported-by: Gottfried Haider <gottfried.haider@gmail.com> Tested-by: Gottfried Haider <gottfried.haider@gmail.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-09-21 17:02:06 +01:00
Emil Velikov	d130cda453	nir: add C wrapper around glsl_type::record_location_offset This will allow us to convert nir_lower_sampler.cpp to C. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Gottfried Haider <gottfried.haider@gmail.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-09-21 17:01:56 +01:00
Emil Velikov	bdb1faf44e	nir: move stdio.h inclusion before extern C Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Gottfried Haider <gottfried.haider@gmail.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-09-21 17:01:32 +01:00
Kenneth Graunke	c1070550c2	i965: Fix MRF register number assertions for compr4. compr4 is represented by setting the high bit on the MRF number. We need to mask it out before sanity checking the register number. Fixes ~8000 assert fails on Ironlake and G45. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92066 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-21 07:45:14 -07:00
Ilia Mirkin	72ebd532a1	radeonsi: implement TXQS support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Fredrik Bruhn <f@unibap.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-09-21 08:31:29 -04:00
Ilia Mirkin	7d5162bdc0	radeonsi: load fmask ptr relative to the resources array res_ptr already contains the resource values. fmask_ptr needs to be looked up relative to the start of the resource params. Note that this only affects indirect loads of MS sampler arrays. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-09-21 08:30:51 -04:00
Iago Toral Quiroga	5d23ce2f15	i965/vec4: Use MRF registers 21-23 for spilling in gen6 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-21 12:48:05 +02:00
Iago Toral Quiroga	6789a32075	i965/fs: Use MRF registers 21-23 for spilling in gen6 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-21 12:47:56 +02:00
Iago Toral Quiroga	f50645d05c	i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generation There are some bug reports about shaders failing to compile in gen6 because MRF 14 is used when we need to spill. For example: https://bugs.freedesktop.org/show_bug.cgi?id=86469 https://bugs.freedesktop.org/show_bug.cgi?id=90631 Discussion in bugzilla pointed to the fact that gen6 might actually have 24 MRF registers available instead of 16, so we could use other MRF registers and avoid these conflicts (we still need to investigate why some shaders need up to MRF 14 anyway, since this is not expected). Notice that the hardware docs are not clear about this fact: SNB PRM Vol4 Part2's "Table 5-4. MRF Registers Available in Device Hardware" says "Number per Thread" - "24 registers" However, SNB PRM Vol4 Part1, 1.6.1 Message Register File (MRF) says: "Normal threads should construct their messages in m1..m15. (...) Regardless of actual hardware implementation, the thread should not assume that MRF addresses above m15 wrap to legal MRF registers." Therefore experimentation was necessary to evaluate if we had these extra MRF registers available or not. This was tested in gen6 using MRF registers 21..23 for spilling and doing a full piglit run (all.py) forcing spilling of everything on the FS backend. It was also tested by doing spilling of everything on both the FS and the VS backends with a piglit run of shader.py. In both cases no regressions were observed. In fact, many of these tests were helped in the cases where we forced spilling, since that triggered the same underlying problem described in the bug reports. Here are some results using INTEL_DEBUG=spill_fs,spill_vec4 for a shader.py run on gen6 hardware: Using MRFs 13..15 for spilling: crash: 2, fail: 113, pass: 6621, skip: 5461 Using MRFs 21..23 for spilling: crash: 2, fail: 12, pass: 6722, skip: 5461 This patch sets the ground for later patches to implement spilling using MRF registers 21..23 in gen6. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-21 12:47:45 +02:00
Iago Toral Quiroga	0858610836	i965: Move MRF register asserts out of brw_reg.h In a later patch we will make BRW_MAX_MRF return a different value depending on the hardware generation, but it is inconvenient to add a gen parameter to the brw_reg functions only for the assertions, so move these to places where we have the hardware generation available. Ken suggested to add the asserts to brw_set_src0 and brw_set_dest since that would make sure that we catch all uses of MRF registers, even those coming from modules that generate native code directly, like blorp. Unfortunately, this is very late in the process which can make things harder to debug, so add asserts to the generator as well. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-21 12:47:35 +02:00
Iago Toral Quiroga	d48ac93066	i965: Maximum allowed size of SEND messages is 15 (4 bits) Until now we only used MRFs 1..15 for regular SEND messages, so the message length could not possibly exceed the maximum size. Soon we'll allow to use MRF registers 1..23 in gen6, so we need to be careful not to build messages that can go beyond the limit. That could occur, specifically, when building URB write messages, which we may need to split in chunks due to their size. Previously we would simply go and create a new message when we reached MRF 13 (since 13..15 were reserved for spilling), now we also want to check the size of the message explicitly. Besides adding that condition to split URB write messages properly, this patch also adds asserts in the generator. Notice that brw_inst_set_mlen already asserts for this, but asserting in the generators is easy and can make debugging easier in some cases. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-21 12:47:03 +02:00
Rob Clark	b65f91dd32	nir/print: fix coverity error Not something actually hit in real life (now state is never non-null, but only case state->syms is null is if nir_print_instr() path). But it was something I overlooked the first time, so might as well fix it. *** CID 1324642: Null pointer dereferences (REVERSE_INULL) /src/glsl/nir/nir_print.c: 299 in print_var_decl() 293 294 fprintf(fp, " (%s, %u)", loc, var->data.driver_location); 295 } 296 297 fprintf(fp, "\n"); 298 >>> CID 1324642: Null pointer dereferences (REVERSE_INULL) >>> Null-checking "state" suggests that it may be null, but it has already been dereferenced on all paths leading to the check. 299 if (state) { 300 _mesa_set_add(state->syms, name); 301 _mesa_hash_table_insert(state->ht, var, name); 302 } 303 } 304 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-20 14:04:06 -04:00
Eduardo Lima Mitev	6ba291db4b	i965/vec4/nir: Remove all "this->" snippets For consistency, either we have all class members dereferenced, or none. In this case, very few are so lets get rid of them all. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-20 17:11:49 +02:00
Marcin Ślusarz	8f6fd57db2	dri/common: fix gbm-symbols-check regression Broken by commit `c228514c72` "dri/common: use sysconfdir when looking for drirc". Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92054 Signed-off-by: Marcin Ślusarz <marcin.slusarz@gmail.com>	2015-09-20 13:44:07 +02:00
Emil Velikov	1e01db0fa9	docs: add news item and link release notes for 10.6.8 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-20 11:59:24 +01:00
Emil Velikov	278a32374c	docs: add sha256 checksums for 10.6.8 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `02387926ad`)	2015-09-20 11:58:04 +01:00
Emil Velikov	72d407da10	docs: add release notes for 10.6.8 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `91c6302734`)	2015-09-20 11:58:03 +01:00
Nanley Chery	99b1f4751f	mesa/teximage: reuse compressed format utility functions for base_format Reuse utility functions instead of reimplementing the same logic. * _mesa_is_compressed_format() performs the required checking to determine format support in the current context. * _mesa_gl_compressed_format_base_format() returns the base format. As a side effect, we now check that we're in a desktop context when determining support for the FXT1 and RGTC formats. This is in agreement with our extension table and the glext headers. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-09-19 13:27:15 -07:00
Nanley Chery	db2777091d	mesa/texcompress: add compressed formats to base format utility function Add S3TC and PALETTE formats. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-09-19 13:27:10 -07:00
Nanley Chery	29835fe19e	mesa/glformats: refactor compressed format support function Instead of case statements, use _mesa_get_format_layout() to determine if a GL format is part of a family of compressed formats. v2. restrict LATC formats to API_OPENGL_COMPAT (Ilia). rename the variable mFormat to m_format. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-09-19 13:26:55 -07:00
Nanley Chery	31a5135cd7	mesa/formats: add MESA_LAYOUT_LATC This enables us to predicate statements on a compressed format being a type of LATC format. Also, remove the comment that lists the enum (it was getting a tad long). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-09-19 13:25:59 -07:00
Marcin Ślusarz	c228514c72	dri/common: use sysconfdir when looking for drirc Useful when locally installed mesa has more quirks than the system one. Signed-off-by: Marcin Ślusarz <marcin.slusarz@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-19 19:17:34 +02:00
Rob Clark	9ffc1049ca	freedreno/ir3: use nir two-sided-color lowering With this, we completely switch over to nir lowering passes instead of tgsi_lowering. So one step closer to supporting direct glsl or spirv to nir support for freedreno a3xx/a4xx. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-18 21:07:50 -04:00
Rob Clark	e13ed3ffb4	nir: add two-sided-color lowering pass Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-09-18 21:07:50 -04:00
Rob Clark	e4dfcdcbec	nir/build: add nir_vec() helper Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-09-18 21:07:50 -04:00
Rob Clark	c71cb670ba	freedreno/ir3: lower txp/clamp in NIR Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-18 21:07:50 -04:00
Rob Clark	3745c38425	nir/lower_tex: add support to clamp texture coords Some hardware needs to clamp texture coordinates to [0.0, 1.0] in the shader to emulate GL_CLAMP. This is added to lower_tex_proj since, in the case of projected coords, the clamping needs to happen after projection. v2: comments/suggestions from Ilia and Eric, use txs to get texture size and clamp RECT textures to their dimensions rather than [0.0, 1.0] to avoid having to lower RECT textures to 2D. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-18 21:07:49 -04:00
Rob Clark	1ce8060c25	nir/lower_tex: support for lowering RECT textures v2: comments/suggestions from Ilia and Eric, split out get_texture_size() helper so we can use it in the next commit for clamping RECT textures. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-18 21:07:49 -04:00
Rob Clark	faf5f174dd	nir/lower_tex: support projector lowering per sampler type Some hardware, such as adreno a3xx, supports txp on some but not all sampler types. In this case we want more fine grained control over which texture projectors get lowered. v2: split out nir_lower_tex_options struct to make it easier to add the additional parameters coming in the following patches Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-18 21:07:49 -04:00
Rob Clark	f83ba7bc41	nir/lower_tex: split out project_src() helper Split this out to reduce noise in later patches. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-18 21:07:49 -04:00
Rob Clark	d9b9ff76f1	nir: rename nir_lower_tex_projector Since the following patches will add additional tex-lowering related functionality, which doesn't make sense to split out into a separate pass (as they would require duplication of the projector lowering logic), let's give this pass a more generic name. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-18 21:07:49 -04:00
Alejandro Piñeiro	06d31dceae	i965/vec4: Change types as needed to propagate source modifiers using current instruction SEL and MOV instructions, as long as they don't have source modifiers, are just copying bits around. So those kind of instruction could be propagated even if there are type mismatches. This is needed because NIR generates integer SEL and MOV instructions whenever it doesn't know what else to generate. This commit adds support for copy propagation using current instruction as reference. Equivalent to commit 472ef9 but for vec4. v2: include check for saturate, as Jason Ekstrand suggested v3: check that the dst.type and the src type are the same, in order to solve (among others) the following deqp regression with v2: dEQP-GLES3.functional.shaders.operator.unary_operator.minus.lowp_uint_vertex Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-19 00:31:25 +02:00
Iago Toral Quiroga	f7ca52dd6d	i965/fs: Fix comparison between signed and unsigned integer expressions brw_fs_visitor.cpp: In member function 'void fs_visitor::emit_urb_writes()': brw_fs_visitor.cpp:977:58: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-09-18 13:37:25 +02:00
Tapani Pälli	afa1efdc85	mesa: fix errors when reading depth with glReadPixels OpenGL ES 3.0 spec 3.7.2 "Transfer of Pixel Rectangles" specifies DEPTH_COMPONENT, UNSIGNED_INT as a valid couple, validation for internal format is checked by is_float_depth(). Fix regression caused by `81d2fd91a9` in: ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels Test uses GL_DEPTH_COMPONENT, UNSIGNED_INT only when GL_NV_read_depth extension is present. v2: change check in _mesa_error_check_format_and_type to be explicit for ES 2.0+, desktop OpenGL does not allow this behaviour + uses this function for both glReadPixels and glDrawPixels validation. (No Piglit regressions seen with v2.) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92009 Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-18 07:41:47 +03:00
Rob Clark	2e4ab489b5	nir/builder: fix c++11 compiler warning Fixes: In file included from nir/nir_lower_samplers.cpp:27:0: nir/nir_builder.h: In function 'nir_ssa_def* nir_channel(nir_builder, nir_ssa_def, int)': nir/nir_builder.h:222:37: warning: narrowing conversion of 'c' from 'int' to 'unsigned int' inside { } is ill-formed in C++11 [-Wnarrowing] unsigned swizzle[4] = {c, c, c, c}; Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-17 21:08:25 -04:00
Rob Clark	7c72f593ad	nir: really actually fix comment this time Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-17 21:06:11 -04:00
Rob Clark	5305603b9d	nir/print: print variable names Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-09-17 20:26:12 -04:00
Rob Clark	ba78260b0f	nir: some comment fixups Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-09-17 20:25:33 -04:00
Rob Clark	c70ed86172	freedreno/ir3: add --gpu arg to cmdline compiler Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-17 19:57:52 -04:00
Rob Clark	c970ec0577	freedreno/a4xx: wire up ucp support Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-17 19:57:52 -04:00
Rob Clark	91ec210ea8	freedreno/ir3: add support for ucp Use nir_lower_clip pass for adding the VS/FS instructions to handle user-clip-planes and CLIPDIST. Wire up support for load_user_clip_plane intrinsic to fetch ucp[plane] values as driver-params (passed as const's to the shader). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-17 19:57:52 -04:00
Rob Clark	509e0c4505	nir: add lowering stage for user-clip-planes / clipdist The vertex shader lowering adds calculation for CLIPDIST, if needed (ie. user-clip-planes), and the frag shader lowering adds conditional kills based on CLIPDIST value (which should be treated as a normal interpolated varying by the driver). Note that this won't quite do the right thing in the face of MSAA plus user-clip-planes, since all the samples would be killed or not (rather than potentially only a portion of them). But it's better than no UCP support at all for drivers that don't have this in hw. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-09-17 19:57:21 -04:00
Rob Clark	53671a3723	nir: add sysval for user-clip-planes For lowering user-clip-planes, we need a way to pass the enabled/used user-clip-planes in to shader. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-09-17 19:55:43 -04:00
Rob Clark	c4572b7dfe	freedreno/ir3: convert from tgsi semantic/index to varying-slot Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-17 19:55:43 -04:00
Rob Clark	4a121e1a90	glsl: add SYSTEM_VALUE_VERTEX_CNT Used internally in freedreno/ir3 to calc stream-out position. Seems like a generic enough way to implement stream-out (using str instrs), plus it avoids compiler warnings by sneaking in a non-enum value in switch statements. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-17 19:55:43 -04:00
Rob Clark	e523f69b1d	freedreno/ir3: switch to shader_enums.h interp constants A small step towards un-TGSI'ifying ir3. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-17 19:55:43 -04:00
Ilia Mirkin	e844e1007d	nv50,nvc0: flush texture cache in presence of coherent bufs This fixes the newly-added arb_texture_buffer_object-bufferstorage piglit test. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-09-17 19:50:47 -04:00
Ilia Mirkin	323c912506	nv50,nvc0: detect underlying resource changes and update tic When updating texture buffers, we might end up replacing the whole buffer. Check that the tic address matches the resource address, and if not, update the tic and reupload it. This fixes: arb_direct_state_access-texture-buffer arb_texture_buffer_object-data-sync Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-09-17 19:50:47 -04:00
Boyan Ding	8d3b92af21	vc4: Try to pair up instructions when only one of them has PM bit Instructions with difference in PM field can actually be paired up if the one without PM doesn't do packing/unpacking and non-NOP packing/unpacking operations from PM instruction aren't added to the other without PM. total instructions in shared programs: 48209 -> 47460 (-1.55%) instructions in affected programs: 11688 -> 10939 (-6.41%) Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-09-17 14:57:46 -04:00
Jason Ekstrand	fc11dbe13f	i965/vec4: Use nir_move_vec_src_uses_to_dest The idea here is not that it gives register coalescing a little bit of a helping hand. It doesn't actually fix the coalescing problems, but it seems to help a good bit. Shader-db results for vec4 programs on Haswell: total instructions in shared programs: 1746280 -> 1683959 (-3.57%) instructions in affected programs: 1259166 -> 1196845 (-4.95%) helped: 11363 HURT: 148 v2 (Jason Ekstrand): - Run nir_move_vec_src_uses_to_dest after going out of SSA - New shader-db numbers Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-17 08:21:31 -07:00
Jason Ekstrand	a6c467d6c5	nir: Add a pass to rewrite uses of vecN sources to the vecN destination v2 (Jason Ekstrand): - Handle non-SSA sources and destinations Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-17 08:19:48 -07:00
Jason Ekstrand	ddffe30f40	nir: Add comments to nir_index_instrs and nir_index_ssa_defs The provided indices have the very nice property that if A dominates B then A->index <= B->index. We should document that somewhere. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-17 08:16:01 -07:00
Jason Ekstrand	8ecaef967d	nir: Add a generic instruction index Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-17 08:16:01 -07:00
Ulrich Weigand	bd016a2601	mesa: Fix texture compression on big-endian systems Various pieces of code to create compressed textures will first generate an uncompressed RGBA texture into a temporary buffer, and then read from that buffer while creating the final compressed texture in the requested format. The code reading from the temporary buffer assumes the buffer is formatted as an array of bytes in RGBA order. However, the buffer is filled using a _mesa_texstore call with MESA_FORMAT_R8G8B8A8_UNORM format -- this is defined as an array of integers holding the RGBA values in packed format (least-significant to most-significant). This means incorrect bytes are accessed on big-endian systems. This patch fixes this by using the MESA_FORMAT_A8B8G8R8_UNORM format instead on big-endian systems when filling the buffer. This fixes about 100 piglit test case failures on s390x for me. Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com> Tested-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "10.6" "11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@gmail.com>	2015-09-17 21:23:45 +10:00
Thomas Hellstrom	7e28650649	st/xa: Use PIPE_FORMAT_R8_UNORM when available XA has been using L8_UNORM for a8 and yuv component surfaces. This commit instead makes XA prefer R8_UNORM since it's assumed to have a higher availability. Also neither of these formats are suitable as destination formats using destination alpha blending, so reject those operations. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-17 00:03:00 -07:00
Tapani Pälli	ba02f7a3b6	mesa: return initial value for VALIDATE_STATUS if pipe not bound From OpenGL 4.5 Core spec (7.13): "If pipeline is a name that has been generated (without subsequent deletion) by GenProgramPipelines, but refers to a program pipeline object that has not been previously bound, the GL first creates a new state vector in the same manner as when BindProgramPipeline creates a new program pipeline object." I interpret this as "If GetProgramPipelineiv gets called without a bound (but valid) pipeline object, the state should reflect initial state of a new pipeline object." This is also expected behaviour by ES31-CTS.sepshaderobjs.PipelineApi conformance test. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-09-17 08:26:33 +03:00
Tapani Pälli	d9689be5c6	mesa: return initial value for PROGRAM_SEPARABLE when not linked From OpenGL ES 3.1 spec (7.12): "Most properties set within program objects are specified not to take effect until the next call to LinkProgram or ProgramBinary. Some properties further require a successful call to either of these commands before taking effect. GetProgramiv returns the properties currently in effect for program, which may differ from the properties set within program since the most recent call to LinkProgram or ProgramBinary, which have not yet taken effect. If there has been no such call putting changes to pname into effect, initial values are returned." Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-09-17 08:26:33 +03:00
Tapani Pälli	8f1ae9abeb	mesa: enable query of PROGRAM_PIPELINE_BINDING for ES 3.1 Specified in OpenGL ES 3.1 spec, Table 23.32: Program Object State. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>	2015-09-17 08:26:33 +03:00
Timothy Arceri	ef8eebc6ad	nir: support indirect indexing samplers in struct arrays As a bonus we get indirect support for arrays of arrays for free. V5: couple of small clean-ups suggested by Jason. V4: fix struct member location calculation, use nir_ssa_def rather than nir_src for the indirect as suggested by Jason V3: Use nir_instr_rewrite_src() with empty src rather than clearing the use_link list directly for the old indirects as suggested by Jason V2: Fixed validation error in debug build Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-17 11:28:34 +10:00
Timothy	0ad44ce373	glsl: add helper for calculating offsets for struct members V2: update comments Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-17 11:28:27 +10:00
Timothy Arceri	12af915e27	glsl: make variables private Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-17 11:28:21 +10:00
Timothy Arceri	dcd9cd0383	glsl: store uniform slot id in var location field This will allow us to access the uniform later on without resorting to building a name string and looking it up in UniformHash. V3: remove line wrap change from this patch V2: store slot number for all non-UBO uniforms to make code more consistent, renamed explicit_binding to explicit_location and added comment about what it does. Store the location at every shader stage. Updated data.location comments in ir/nir.h. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-17 11:28:14 +10:00
Timothy Arceri	9788700caf	glsl: assign hidden uniforms their slot id earlier This is required so that the next patch can safely assign the slot id to the var. The ids are now assigned in the order we want before allocating storage so there is no need to sort the storage array and move things around. V2: rename variable to make code easier to follow as suggested by Jason Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-17 11:26:45 +10:00
Timothy Arceri	874a0217fd	glsl: order indices for samplers inside a struct array This allows the correct offset to be easily calculated for indirect indexing when a struct array contains multiple samplers, or any crazy nesting. The indices for the following struct will now look like this: Sampler index: 0 Name: s[0].tex Sampler index: 1 Name: s[1].tex Sampler index: 2 Name: s[0].si.tex Sampler index: 3 Name: s[1].si.tex Sampler index: 4 Name: s[0].si.tex2 Sampler index: 5 Name: s[1].si.tex2 Before this change it looked like this: Sampler index: 0 Name: s[0].tex Sampler index: 3 Name: s[1].tex Sampler index: 1 Name: s[0].si.tex Sampler index: 4 Name: s[1].si.tex Sampler index: 2 Name: s[0].si.tex2 Sampler index: 5 Name: s[1].si.tex2 struct S_inner { sampler2D tex; sampler2D tex2; }; struct S { sampler2D tex; S_inner si; }; uniform S s[2]; V3: Update comments with suggestions from Jason V2: rename struct array counter to have better name Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-17 11:26:39 +10:00
Dave Airlie	b5df52b112	Revert "mesa/extensions: restrict GL_OES_EGL_image to GLES" This reverts commit `48961fa3ba`. glamor/Xwayland use this, the spec saying something when it was written, and the fact that the comment says Mesa relies on it hasn't changed. I also don't have a copy of this patch in my mail archive, which seems weird, did it get posted to mesa-dev? Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-17 06:58:51 +10:00
Eric Anholt	f5b26b4744	vc4: Only build in simulator mode if we find pkg-config for it. This will let other developers build it on x86 for build-testing purposes.	2015-09-16 15:54:00 -04:00
Ilia Mirkin	37d0becfd9	freedreno/a3xx: use NUM_USER_CLIP_PLANES helper instead of magic number Use the helper from the newly-updated generated header file. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-16 15:42:55 -04:00
Ilia Mirkin	545a3cbb01	freedreno/a3xx: fix blending of L8 format Even though luminance formats don't have alpha, we still want the alpha output to go to the blender. This fixes the luminance blending tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-09-16 15:42:55 -04:00
Ilia Mirkin	ee6b95c82c	freedreno/a3xx: add support for dual-source blending Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-16 15:42:54 -04:00
Eric Anholt	cfa980f493	vc4: convert from tgsi semantic/index to varying-slot (originally part of previous patch, split out to separate patch by Rob) v2: squash in some fixes from Eric v3: Another fix from Eric for point coords. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-16 15:07:08 -04:00
Eric Anholt	8fd3e53f3d	gallium/ttn: Convert to using VARYING_SLOT_* / FRAG_RESULT_*. This avoids exceeding the size of the .index bitfield since it got truncated, and should make our NIR look more like the NIR that the rest of the NIR developers are working on. v2: split out vc4 updates, first patch uses varying_slot_to_tgsi_semantic() helper, and second patch does the actual conversion. v3: add frag_result_to_tgsi_semantic() helper and don't try to map frag_results to semantic name/index as if they were varying_slot's v4: use VERT_ATTRIB_ for VS inputs v5: Fix vc4 build. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-16 15:03:53 -04:00
Ilia Mirkin	7a275fcda8	nv50, nvc0: fix max texture buffer size to 128M elements This is what the hardware supports, there never was any sort of 64K limit. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-16 12:51:58 -04:00
Ilia Mirkin	eb081681df	st/mesa: avoid integer overflows with buffers >= 512MB This fixes failures with the newly-submitted max-size texture buffer piglit test for GPUs exposing >= 128M max texels. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-09-16 12:51:58 -04:00
Brian Paul	1aff899a87	mesa: move GL_APPLE_object_purgeable functions to new file Move this code out of bufferobj.c since it's not strongly connected to buffer objects. Acked-by: Matt Turner <mattst88@gmail.com>	2015-09-16 09:02:40 -06:00
Brian Paul	8faed71830	mesa: remove trailing whitespace in bufferobj.c Trivial.	2015-09-16 08:53:21 -06:00
Brian Paul	edc01c6704	mesa: whitespace, line wrap fixes in varray.c Trivial.	2015-09-16 08:53:21 -06:00
Rob Clark	aecbc93f2d	nir/print: print symbolic names from shader-enum v2: split out moving of FILE *fp into state structure into its own (more complete patch) to reduce the noise in this one Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-09-16 10:15:35 -04:00
Rob Clark	840df72f93	nir/print: bit of state refactoring Rename print_var_state to print_state, and stuff FILE ptr into the state object. This avoids passing around an extra parameter everywhere. v2: even more extensive conversion.. use state everywhere instead of FILE ptr, and convert nir_print_instr() to use state as well Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-09-16 10:15:17 -04:00
Rob Clark	f2533f2f8c	glsl: shader-enum to name debug fxns Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-09-16 10:04:13 -04:00
Rob Clark	5bb41d9094	freedreno: one screen to rule them all Similar to `fee0686c21`, but in this case to ensure that drm_gralloc and libGLES_mesa are sharing a single screen. Bumps libdrm_freedreno version dependency, as it requires the new fd_device_fd() API. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-16 09:14:39 -04:00
Rob Clark	b3958f9f83	freedreno/ir3: use NIR to lower ffract instead of tgsi_lowering Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-16 08:28:18 -04:00
Rob Clark	d9efe40dc9	nir: add lowering for ffract Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-09-16 08:27:36 -04:00
Jordan Justen	47e18a5957	i965/fs: The barrier send uses only 1 payload register When preparing the barrier payload, the instructions should operate in simd8 mode since we only use 1 payload register. fs_inst::regs_read is also updated to indicate that it only reads one register for SHADER_OPCODE_BARRIER. These issues were flagged by: commit `cadd7dd384` Author: Jason Ekstrand <jason.ekstrand@intel.com> Date: Thu Jul 2 15:41:02 2015 -0700 i965/fs: Add a very basic validation pass Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-15 15:41:07 -07:00
Jason Ekstrand	cb503c3227	nir/builder: Use a normal temporary array in nir_channel C++ gets cranky if we take references of temporaries. This isn't a problem yet in master because nir_builder is never used from C++. However, it will be in the future so we should fix it now. Reviewed-by: Rob Clark <robclark@freedesktop.org>	2015-09-15 14:51:05 -07:00
Rob Clark	18385bc3ac	freedreno/a4xx: more texture formats Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-15 17:29:01 -04:00
Rob Clark	d85267c4bb	freedreno/a4xx: border-color support Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-15 17:29:01 -04:00
Rob Clark	f8222724f5	freedreno/a4xx: wire up texture clamp lowering Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-15 17:29:01 -04:00
Rob Clark	9124a49d54	freedreno: helper for a3xx/a4xx border-colors Both use the same layout for the buffer containing border-color values, so rather than duplicating the logic in a4xx, split it out into a helper. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-15 17:29:01 -04:00
Rob Clark	76977222af	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-15 17:29:00 -04:00
Jason Ekstrand	29348631fe	nir/lower_vec_to_movs: Coalesce into destinations of fdot instructions Now that we have a replicating fdot instruction, we can actually coalesce into the destinations of vec4 instructions. We couldn't really do this before because, if the destination had to end up in .z, we couldn't reswizzle the instruction. With a replicated destination, the result ends up in all channels so we can just set the writemask and we're done. Shader-db results for vec4 programs on Haswell: total instructions in shared programs: 1747753 -> 1746280 (-0.08%) instructions in affected programs: 143274 -> 141801 (-1.03%) helped: 667 HURT: 0 It turns out that dot-products matter... Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-15 12:38:48 -07:00
Jason Ekstrand	a88ce0c1c4	i965/vec4: Use the replicated fdot instruction in NIR Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-15 12:38:48 -07:00
Jason Ekstrand	47739c7df4	nir: Add a fdot instruction that replicates the result to a vec4 Fortunately, nir_constant_expr already auto-splats if "dst" never shows up in the constant expression field so we don't need to do anything there. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-15 12:38:48 -07:00
Jason Ekstrand	2458ea95c5	nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible The old pass blindly inserted a bunch of moves into the shader with no concern for whether or not it was really needed. This adds code to try and coalesce into the destination of the instruction providing the value. Shader-db results for vec4 shaders on Haswell: total instructions in shared programs: 1754420 -> 1747753 (-0.38%) instructions in affected programs: 231230 -> 224563 (-2.88%) helped: 1017 HURT: 2 This approach is heavily based on a different patch by Eduardo Lima Mitev <elima@igalia.com>. Eduardo's patch did this in a separate pass as opposed to integrating it into nir_lower_vec_to_movs. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-15 12:38:07 -07:00
Jason Ekstrand	2b2f1f16a0	nir/lower_vec_to_movs: Get rid of start_idx and swizzle compacting Previously, we did this thing with keeping track of a separate start_idx which was different from the iteration variable. I think this was a relic of the way that GLSL IR implements writemasks. In NIR, if a given bit in the writemask is unset then that channel is just "unused", not missing. In particular, a vec4 operation with a writemask of 0xd will use sources 0, 2, and 3 and leave source 1 alone. We can simplify things a good deal (and make them correct) by removing this "compacting" step. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-09-15 11:13:48 -07:00
Jason Ekstrand	c951bb8305	i965/vec4_nir: Use partial SSA form rather than full non-SSA We made this switch in the FS backend some time ago and it seems to make a number of things a bit easier. In particular, supporting SSA values takes very little work in the backend and allows us to take advantage of the majority of the SSA information even after we've gotten rid of Phi nodes. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-15 11:13:48 -07:00
Jason Ekstrand	c3f8cde964	nir/lower_vec_to_movs: Handle partially SSA shaders v2 (Jason Ekstrand): - Use nir_instr_rewrite_dest - Pass the impl directly into lower_vec_to_movs_block Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-15 11:13:45 -07:00
Jason Ekstrand	b7eeced3c7	nir/lower_vec_to_movs: Pass the shader around directly Previously, we were passing the shader around, we were just calling it "mem_ctx". However, the nir_shader is (and must be for the purposes of mark-and-sweep) the mem_ctx so we might as well pass it around explicitly. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-15 11:13:40 -07:00
Jason Ekstrand	cadd7dd384	i965/fs: Add a very basic validation pass Currently the validation pass only validates that regs_read and regs_written are consistent with the sizes of VGRF's. We can add more as we find it to be useful. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-15 11:11:50 -07:00
Jason Ekstrand	0c6df7a1cb	i965/fs_surface_builder: Only apply predicate to components that exist In certain conditions, we have to do bounds-checking in the shader for image_load_store. The way this works for image loads is that we do a predicated load and then emit a series of selects, one per component, that gives us 0 or the loaded value depending on whether or not you're in bounds. However, we were hard-coding 4 components which may not be correct. Instead, we should be using size which is the number of components read. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-09-15 11:09:48 -07:00
Jason Ekstrand	5182400054	i965/fs: Only read output_components many components when writing an output Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-15 11:08:12 -07:00
Jason Ekstrand	f55836f567	i965/fs: Set output_components for lowered clip distance outputs Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-15 11:07:54 -07:00
Nanley Chery	8200793649	mesa/teximage: restrict GL_ETC1_RGB8_OES support to GLES According to the extensions table and our glext headers, OES_compressed_ETC1_RGB8_texture is only supported in GLES1 and GLES2. Since we may give users a GLES3 context when a GLES2 context is requested, we also allow this extension for GLES3 as well. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-09-15 10:11:14 -07:00
Nanley Chery	48961fa3ba	mesa/extensions: restrict GL_OES_EGL_image to GLES Driver vendors do this as well. The extension specification lists GLES 1.1 or 2.0 as requirements. Reviewed-by: Chad Versace <chad.versace@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-09-15 10:00:00 -07:00
Nanley Chery	fe796a1831	mesa/extensions: restrict luminance alpha formats to API_OPENGL_COMPAT According to the GL 3.1 spec, luminance alpha formats are deprecated. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-09-15 10:00:00 -07:00
Thomas Hellstrom	edfb7ed109	gallium/svga: Enable PIPE_FORMAT_L8_UNORM for vgpu10 It's extensively used by XA for a8- and planar yuv component surfaces. This fixes broken XA yuv blits using vgpu10 contexts. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-15 09:25:02 -07:00
Emil Velikov	a1ac742f70	egl/dri2: don't leak the fd on dri2_terminate Currently the check was incorrect as it did not consider the (unlikely) case of fd == 0. In order to fix this we should first correctly initialize it to -1, as the swrast implementations leave it set to zero (props to calloc()). Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Boyan Ding <boyan.j.ding@gmail.com>	2015-09-15 12:39:02 +01:00
Emil Velikov	bd5bcb5b8c	egl/dri2/drm: compact existing device mgmt Move the fcntl(dupfd_cloexec) to the else branch where it belongs. Otherwise it's not immediately obvious that the code is hit, only when an existing device is used. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Boyan Ding <boyan.j.ding@gmail.com>	2015-09-15 12:37:27 +01:00
Matt Turner	e4f0d26c8c	egl/dri2: Close file descriptor on error. v2: [Emil Velikov] Rework the error path to a common goto, close only if we own the fd. v3: [Emil Velikov] Always close the fd (we either opened the device or dup'd) (Boyan, Ian) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Boyan Ding <boyan.j.ding@gmail.com>	2015-09-15 12:37:26 +01:00
Ray Strode	4bf151e662	gbm: convert gbm bo format to fourcc format on dma-buf import At the moment if a gbm buffer is imported and the gbm buffer has an old-style GBM_BO_FORMAT format, the import will crash, since it's passed directly to DRI functions that expect a fourcc format (as provided by the newer GBM_FORMAT definitions) This commit addresses the problem in two ways: 1) it prevents invalid formats from leading to a crash by returning EINVAL if the image couldn't be created 2) it translates GBM_BO_FORMAT formats into the comparable GBM_FORMAT formats. Reference: https://bugzilla.gnome.org/show_bug.cgi?id=753531 CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-15 12:27:45 +01:00
Alejandro Piñeiro	a26e82b81d	docs: document INTEL_DEBUG 'optimizer' envvar Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-15 08:33:35 +02:00
Kristian Høgsberg Kristensen	a548c75e31	i965: Move perf_debug code to brw_codegen_*_prog() We're trying to avoid a libdrm dependency in the core compiler, so let's move the perf_debug code one level up from the brw_*_emit() helpers to the brw_codegen_*_prog() helpers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-09-14 16:56:59 -07:00
Kristian Høgsberg Kristensen	84f2ed2cfd	i965: Move brw_fs_precompile() to brw_wm.c All other precompile functions live in the brw_<stage>.c files, make fs follow the convention. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-09-14 16:55:49 -07:00
Kristian Høgsberg Kristensen	dc70c86b9b	i965: Move compute shader code around This moves the compute shader code around in order to make the way the code is split up more consistent. There should be no functional changes. Typically we have a few files per stage: brw_vs.c, brw_wm.c brw_gs.c: code to drive code generation and implement precompiling and cache search. genX_<stage>_state.c gen specific implementation of the state emission for the shader stage. The brw_*_emit() functions are all in the same files as the visitor classes they use (with the exception of VS, which may use either vec4 or fs). To make compute follow this convention, we move the brw_cs_emit() function into brw_fs.cpp. We can then rename brw_cs.cpp to brw_cs.c and do this in C like the other similar files. Finally, move state setup and atoms to gen7_cs_state.c. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-09-14 16:52:42 -07:00
Anuj Phogat	64e25167ed	meta: Abort meta pbo path if TexSubImage needs signed/unsigned conversion See similar fix for Readpixels in mesa commit `0d20790`. Jason suggested we need that for TexSubImage as well. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-14 15:22:37 -07:00
Ilia Mirkin	5877a594d5	nvc0/ir: start offset at texBindBase for txq, like regular texturing Curiously this has no actual effect. I think it's because the first 8 textures are bound in multiple slots for some reason. However seems prudent to use these the same way as regular texturing, esp in the case where there are more than 8 textures bound. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-14 17:26:25 -04:00
Eric Anholt	64aee8fe9f	vc4: Fix build from recent NIR cleanups.	2015-09-14 11:21:07 -04:00
Antia Puentes	b8d2263c83	i965/vec4_nir: Load constants as integers Loads constants using integer as their register type, like it is done in FS backend. No shader-db changes in HSW. Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91716 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-14 12:11:46 +02:00
Antia Puentes	79f1a7ae28	i965/vec4: Fix saturation errors when coalescing registers If the register types do not match and the instruction that contains the final destination is saturated, register coalescing generated non-equivalent code. This did not happen when using IR because types usually matched, but it is visible in nir-vec4. For example, mov vgrf7:D vgrf2:D mov.sat m4:F vgrf7:F is coalesced to: mov.sat m4:D vgrf2:D The patch prevents coalescing in such scenario, unless the instruction we want to coalesce into is a MOV (without type conversion implied). In that case, the patch sets the register types to the type of the final destination. Shader-db results in HSW (only vec4 instructions shown): total instructions in shared programs: 1754415 -> 1754416 (0.00%) instructions in affected programs: 74 -> 75 (1.35%) helped: 0 HURT: 1 GAINED: 0 LOST: 0 Only one extra instruction in one of the shaders, that comes from eliminating a saturation error by preventing register coalesce. Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-14 12:11:46 +02:00
Tapani Pälli	d1bce52e13	docs: cleanups + mark some work as done Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-14 09:29:30 +03:00
Ilia Mirkin	f0b9d53262	docs: only astc ldr required for ES3.2, not hdr Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-09-14 02:08:42 -04:00
Ilia Mirkin	67d2d3ba43	st/mesa: emit TXQS, support ARB_shader_texture_image_samples The image component of the ext is a no-op since there is no image support in gallium (yet). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-13 18:24:45 -04:00
Ilia Mirkin	ec3fe42b3a	r600g: add support for TXQS tgsi opcode Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-09-13 18:24:44 -04:00
Ilia Mirkin	4294db90b1	nv50/ir: add support for TXQS tgsi opcode Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-13 18:24:44 -04:00
Ilia Mirkin	f46a53ffa5	gallium: add PIPE_CAP_TGSI_TXQS to let st know if TXQS is supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-09-13 18:24:37 -04:00
Ilia Mirkin	d173c5e77d	tgsi: add a TXQS opcode to retrieve the number of texture samples Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-09-13 18:24:01 -04:00
Jordan Justen	c4cf824658	glsl/cs: Initialize gl_LocalInvocationIndex in main() We initialize gl_LocalInvocationIndex based on the extension spec formula: gl_LocalInvocationIndex = gl_LocalInvocationID.z * gl_WorkGroupSize.x * gl_WorkGroupSize.y + gl_LocalInvocationID.y * gl_WorkGroupSize.x + gl_LocalInvocationID.x; https://www.opengl.org/registry/specs/ARB/compute_shader.txt Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-09-13 09:53:17 -07:00
Jordan Justen	6823e12d5a	glsl/cs: Exclude gl_LocalInvocationIndex from builtin variable stripping We lower gl_LocalInvocationIndex based on the extension spec formula: gl_LocalInvocationIndex = gl_LocalInvocationID.z * gl_WorkGroupSize.x * gl_WorkGroupSize.y + gl_LocalInvocationID.y * gl_WorkGroupSize.x + gl_LocalInvocationID.x; https://www.opengl.org/registry/specs/ARB/compute_shader.txt We need to set this variable in main(), even if gl_LocalInvocationIndex is not referenced by the shader. (It may be used by a linked shader.) Therefore, we can't eliminate it as a dead variable. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-09-13 09:53:16 -07:00
Jordan Justen	2b6cc0395b	glsl/cs: Initialize gl_GlobalInvocationID in main() We initialize gl_GlobalInvocationID based on the extension spec formula: gl_GlobalInvocationID = gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID https://www.opengl.org/registry/specs/ARB/compute_shader.txt Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-09-13 09:53:16 -07:00
Jordan Justen	c4d049f646	glsl: Move link_get_main_function_signature to a common location Also rename to _mesa_get_main_function_signature. We will call it near the end of compilation to insert some code into main for initializing some compute shader global variables. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-13 09:53:16 -07:00
Jordan Justen	34e187ec38	glsl/cs: Don't strip gl_GlobalInvocationID and dependencies We lower gl_GlobalInvocationID based on the extension spec formula: gl_GlobalInvocationID = gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID https://www.opengl.org/registry/specs/ARB/compute_shader.txt We need to set this variable in main(), even if gl_GlobalInvocationID is not referenced by the shader. (It may be used by a linked shader.) Therefore, we can't eliminate these as dead variables. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-09-13 09:53:16 -07:00
Jordan Justen	c5743a5d7f	i965/nir: Support gl_WorkGroupID variable Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-13 09:53:16 -07:00
Jordan Justen	4e454cb7c6	i965/cs: Initialize gl_WorkGroupID variable from payload Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-13 09:53:16 -07:00
Jordan Justen	4f178f0d8b	nir: Add gl_WorkGroupID system variable Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-13 09:53:16 -07:00
Jordan Justen	f5bb5a1bf1	glsl/cs: Add gl_WorkGroupID variable Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-13 09:53:16 -07:00
Jordan Justen	49f999b9cb	i965/nir: Support gl_LocalInvocationID variable Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-13 09:53:16 -07:00
Jordan Justen	43624361df	i965/cs: Initialize gl_LocalInvocationID from payload Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-13 09:53:16 -07:00
Jordan Justen	b94b57f7c5	i965/cs: Initialize gl_LocalInvocationID in push constant data Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-13 09:53:16 -07:00
Jordan Justen	c7161a3c35	i965/cs: Reserve local invocation id in payload regs Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-09-13 09:53:16 -07:00
Jordan Justen	62e011d593	nir: Add gl_LocalInvocationID variable Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-13 09:53:16 -07:00
Jordan Justen	bf8d6e501c	glsl/cs: Add gl_LocalInvocationID variable Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-13 09:53:16 -07:00
Krzesimir Nowak	08ceb5e076	softpipe: Change faces type to uint This is to avoid needless float<->int conversions, since all face-related computations are made on integers. Spotted by Emil Velikov. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-13 09:50:21 -06:00
Rob Clark	59519c2283	freedreno/ir3: fix compile warn after `1807a08e` New enum to add to switch so compiler doesn't complain. commit `1807a08e4f` Author: Ilia Mirkin <imirkin@alum.mit.edu> AuthorDate: Thu Aug 27 23:05:03 2015 -0400 Commit: Ilia Mirkin <imirkin@alum.mit.edu> CommitDate: Thu Sep 10 17:38:33 2015 -0400 nir: add nir_texop_texture_samples and convert from glsl Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-13 11:31:45 -04:00
Rob Clark	bf45a7d28e	freedreno/ir3: fix compile break after `a4aa25be` Following commit dropped the unused memctx arg: commit `a4aa25be1e` Author: Jason Ekstrand <jason.ekstrand@intel.com> AuthorDate: Wed Sep 9 13:24:35 2015 -0700 Commit: Jason Ekstrand <jason.ekstrand@intel.com> CommitDate: Fri Sep 11 09:21:20 2015 -0700 nir: Remove the mem_ctx parameter from ssa_def_rewrite_uses Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-13 11:31:30 -04:00
Rob Clark	b88aeff4f5	nir: add nir_channel() to get at single components of vec's Rather than make yet another copy of channel(), let's move it into nir. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-13 11:08:27 -04:00
Rob Clark	86358e949e	tgsi/scan: add support to figure out max nesting depth Sometimes a useful thing for compilers (or, for example, tgsi_to_nir) to know. And pretty trivial for scan to figure this out for us. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-13 11:08:27 -04:00
Kai Wasserbäch	d6fbcf6ee2	r600: Fix llvm build since const buffer changes In commit `f9caabe8f1`: One place in r600_llvm.c was forgotten when replacing R600_UCP_CONST_BUFFER with R600_BUFFER_INFO_CONST_BUFFER. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91985 Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Signed-off-by: Dave Airlie <airlied@gmail.com>	2015-09-13 07:09:08 +10:00
Jason Ekstrand	1037e0a84f	i965/vec4: Don't reswizzle hardware registers Cc: "11.0 10.6" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91719 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-12 10:46:26 -07:00
Jason Ekstrand	dd7290cf59	i965/emit: Add assertions for accumulator restrictions Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-12 10:46:26 -07:00
Emil Velikov	7852a44e3c	docs: add news item and link release notes for 11.0.0 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-12 13:50:33 +01:00
Emil Velikov	c34ed46217	docs: add sha256 checksums for 11.0.0 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `c4bae5792b`)	2015-09-12 13:48:15 +01:00
Emil Velikov	09223bfa9b	docs: Update 11.0.0 release notes Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `4f1e500150`)	2015-09-12 13:48:14 +01:00
Glenn Kennard	ce34048b57	r600: Enable fp64 on chips with native support Cypress/Cayman/Aruba, earlier r6xx/r7xx chips only support a subset of the needed fp64 ops, and don't do GL4 anyway. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-12 07:32:08 +01:00
Glenn Kennard	d2ca9afd5d	r600g: Support I2D/U2D/D2I/D2U Only for Cypress/Cayman/Aruba, older chips have only partial fp64 support. Uses float intermediate values so only accurate for int24 range, which matches what the blob does. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-12 07:30:10 +01:00
Dave Airlie	f9caabe8f1	r600g: lower number of driver const buffers I'm going to want a driver constant buffer for tess to coordinate LDS storage, so before I go tackling that I decided to merge the clip/samplepos and texture info buffers into one. So I can steal the spare one. This creates a single constant buffer between the two, with clip/samplepos taking up a reserved 128 bytes at the start. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-12 06:56:58 +01:00
Dave Airlie	0337a9b2af	r600: define some values for the fetch constant offsets. This just puts these in one place and #defines them. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-12 06:56:51 +01:00
Thomas Helland	2e7e3fe55f	docs: Update with GLES3.2 entries and status V2: -Change to "not started" for most entries -Add status for multisample_2d_array -Change shader_multisample_interpolation to "not_stared" V3 (idr): Move the GLES 3.2 section after the "Additional functions" section from GLES 3.1. Note that GL_KHR_texture_compression_astc_hdr is done for i965 on gen9+ hardware. Note that GL_OES_shader_io_blocks is based on some features from GLSL 1.50. Signed-off-by: Thomas Helland <thomashelland90@gmail.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> [v2] Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-09-11 18:46:43 -07:00
Krzesimir Nowak	2135aba8d9	softpipe: Constify variables This commit makes a lot of variables constant - this is basically done by moving the computation to variable definition. Some of them are moved into lower scopes (like in img_filter_2d_ewa). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-09-11 15:37:00 -06:00
Krzesimir Nowak	231687c19b	softpipe: Constify sp_tgsi_sampler Add a small inline function doing the casting - this is to make sure we don't do a cast from some completely unrelated type. This commit does not make tgsi_sampler parameters const in vfuncs themselves for now - probably llvmpipe would need looking at before making such a change. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-09-11 15:36:54 -06:00
Krzesimir Nowak	ac23116de5	softpipe: Constify sampler and view parameters in mip filters Those functions actually could always take them as constants. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-09-11 15:36:47 -06:00
Krzesimir Nowak	ea764baa61	softpipe: Constify sampler and view parameters in img filters Those functions actually could always take them as constants. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-09-11 15:36:43 -06:00
Krzesimir Nowak	ba72e6cfb8	tgsi, softpipe: Constify tgsi_sampler in query_lod vfunc A followup from previous commit - since all functions called by query_lod take pointers to const sp_sampler_view and const sp_sampler, which are taken from tgsi_sampler subclass, we can the tgsi_sampler as const itself now. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-09-11 15:36:38 -06:00
Krzesimir Nowak	ea0fecd1a3	softpipe: Constify some sampler and view parameters This is to prepare for making tgsi_sampler parameter in query_lod a const too. These functions do not modify anything in either sampler or view anymore. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-09-11 15:36:32 -06:00
Krzesimir Nowak	4ca2896e8e	softpipe: Move the faces array from view to filter_args With that, sp_sampler_view instances are not abused anymore as a local storage, so we can later make them constant. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-09-11 15:36:23 -06:00
Jason Ekstrand	ca11c3c0a4	nir/from_ssa: Use instr_rewrite_dest Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-11 09:21:20 -07:00
Jason Ekstrand	cee29220e3	nir: Add a function for rewriting instruction destinations Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-11 09:21:20 -07:00
Jason Ekstrand	106a3b2cc3	nir: Only unlink sources that are actually valid Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2015-09-11 09:21:20 -07:00
Jason Ekstrand	a4aa25be1e	nir: Remove the mem_ctx parameter from ssa_def_rewrite_uses Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2015-09-11 09:21:20 -07:00
Jason Ekstrand	8c8fc5f833	nir: Fix a bunch of ralloc parenting errors As of `a10d4937`, we would really like things associated with an instruction to be allocated out of that instruction and not out of the shader. In particular, you should be passing the instruction that will ultimately be holding the source into nir_src_copy rather than an arbitrary memory context. We also change the prototypes of nir_dest_copy and nir_alu_src/dest_copy to explicitly take an instruction so we catch this earlier in the future. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2015-09-11 09:21:04 -07:00
Jason Ekstrand	794355e771	nir/lower_outputs_to_temporaries: Reparent the output name We copy the output, make the old output the temporary, and give the temporary a new name. The copy keeps the pointer to the old name. This works just fine up until the point where we lower things to SSA and delete the old variable and, with it, the name. Instead, we should re-parent to the copy. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2015-09-11 08:55:51 -07:00
Alejandro Piñeiro	d4e29af234	i965/vec4: check writemask when bailing out at register coalesce opt_register_coalesce stopped to check previous instructions to coalesce with if somebody else was writing on the same destination. This can be optimized to check if somebody else was writing to the same channels of the same destination using the writemask. Shader DB results (taking into account only vec4): total instructions in shared programs: 1781593 -> 1734957 (-2.62%) instructions in affected programs: 1238390 -> 1191754 (-3.77%) helped: 12782 HURT: 0 GAINED: 0 LOST: 0 v2: removed some parenthesis, fixed indentation, as suggested by Matt Turner v3: added brackets, for consistency, as suggested by Eduardo Lima Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-11 17:43:22 +02:00
Brian Paul	2c52c794d7	tgsi,softpipe: capitalize the tgsi_sampler_control enum values We use capitalized enum values everywhere else. This improves understanding a bit too. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-09-11 08:50:10 -06:00
Kenneth Graunke	b811085b79	nir: Store some geometry shader data in nir_shader. This makes it possible for NIR shaders to know the number of output vertices and the number of invocations. Drivers could also access these directly without going through gl_program. We should probably add InputType and OutputType here too, but currently those are stored as GL_* enums, and I wanted to avoid using those in NIR, as I suspect Vulkan/SPIR-V will use different enums. (We should probably make our own.) We could add VerticesIn, but it's easily computable from the input topology, so I'm not sure whether it's worth it. It's also currently not stored in gl_shader (only gl_shader_program), which would require changes to the glsl_to_nir interface or require us to store it there. This is a bit of duplication of data...ideally, we would factor these substructs out of gl_program, gl_shader_program, and nir_shader, creating a gl_geometry_info class...but it would need to go in a new place (in src/glsl?) that isn't mtypes.h nor nir.h. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-11 00:05:09 -07:00
Kenneth Graunke	cb2b118e40	nir/builder: Add nir_load_var() and nir_store_var() helpers. These provide a convenient way to do simple variable loads and stores. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-11 00:04:17 -07:00
Kenneth Graunke	4654439fdd	glsl: Use hash tables for opt_constant_propagation() kill sets. Cuts compile/link time of the fragment shader in #91857 by 19% (16.28 -> 13.05). I didn't bother with the acp sets because they're smaller, but it might be worth doing as well. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91857 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Tested-by: Tapani Pälli <tapani.palli@intel.com>	2015-09-11 00:01:24 -07:00
Kenneth Graunke	e20f30eb51	i965: Use hash tables for brw_fs_vector_splitting(). Cuts compile/link time of the fragment shader in #91857 by 25% (21.64 -> 16.28). v2: Drop unnecessary _mesa_hash_table_destroy call, and use refs.ht->entries == 0 rather than ad-hoc checking (suggested by Timothy Arceri). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91857 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Tested-by: Tapani Pälli <tapani.palli@intel.com>	2015-09-11 00:01:24 -07:00
Kenneth Graunke	2fc0ce293a	glsl: Use hash tables in opt_constant_variable(). Cuts compile/link time of the fragment shader in bug #91857 by 31% (31.79 -> 21.64). It has over 8,000 variables so linked lists are terrible. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91857 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Tested-by: Tapani Pälli <tapani.palli@intel.com>	2015-09-11 00:01:24 -07:00
Ian Romanick	4603723722	meta: Use result of texture coordinate clamping operation Previously the result of the complicated clamp() expression just dropped on the floor: clamp does not modify any of its parameters. Looking at the surrounding code, I believe this is supposed to modify the value of tex_coord. This change (along with a change to avoid the use of brw_blorp_framebuffer) does not affect any existing piglit tests. I'm not sure what this clamp is trying to accomplish, so I'm not sure how to write a test to exercise this path. I also noticed another bug in this code. There is no way the array texture case could possibly work. This will generate code for the TEXEL_FETCH macro like: #define TEXEL_FETCH(coord) texelFetch(texSampler, ivec3(coord), sample_map[int(2 * fract(coord.x))]); Since the coord parameter of this macro is a vec2 at all invocations, no expansion of this macro will even compile. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: Jordan Justen <jordan.l.justen@intel.com>	2015-09-10 20:29:51 -07:00
Ian Romanick	767c33e881	meta: Always bind the texture We may have been called from glGenerateTextureMipmap with CurrentUnit still set to 0, so we don't know when we can skip binding the texture. Assume that _mesa_BindTexture will be fast if we're rebinding the same texture. v2: Remove currentTexUnitSave because it is now unused. Suggested by both Neil and Anuj. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91847 Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-09-10 20:29:51 -07:00
Ian Romanick	86c0a2d574	i915, i965: Silence unused parameter warnings in intel_batchbuffer_advance These only occurred in release builds, but they occurred in every file that included intel_batchbuffer.h. Lots of spam. :( intel_batchbuffer.h: In function 'intel_batchbuffer_advance': intel_batchbuffer.h:153:47: warning: unused parameter 'brw' [-Wunused-parameter] intel_batchbuffer_advance(struct brw_context *brw) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-09-10 20:29:51 -07:00
Ian Romanick	307d5e5849	i915: Silence unused parameter warning in intel_miptree_create_layout The for_bo parameter of intel_miptree_create_layout appears to be unused since `27eedca` when Eric removed some Gen5 code (after the i915 and i965 drivers parted ways). intel_mipmap_tree.c: In function 'old_intel_miptree_create_layout': intel_mipmap_tree.c:77:35: warning: unused parameter 'for_bo' [-Wunused-parameter] bool for_bo) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-09-10 20:29:51 -07:00
Ian Romanick	5c8aa21309	i915, i965: Silence unused parameter warnings in intel_miptree_unmap_gtt intel_mipmap_tree.c: In function 'intel_miptree_unmap_gtt': intel_mipmap_tree.c:777:34: warning: unused parameter 'map' [-Wunused-parameter] struct intel_miptree_map *map, ^ intel_mipmap_tree.c:778:17: warning: unused parameter 'level' [-Wunused-parameter] unsigned int level, ^ intel_mipmap_tree.c:779:17: warning: unused parameter 'slice' [-Wunused-parameter] unsigned int slice) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-09-10 20:29:51 -07:00
Ian Romanick	0412231266	i915: Silence unused parameter warnings intel_mipmap_tree.c: In function 'old_intel_miptree_unmap_raw': intel_mipmap_tree.c:726:51: warning: unused parameter 'intel' [-Wunused-parameter] intel_miptree_unmap_raw(struct intel_context *intel, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-09-10 20:29:51 -07:00
Ian Romanick	20915dd2e0	i915: Remove prototype for nonexistent brw_miptree_layout Hasn't existed in the i915 source since the i915 and i965 drivers parted ways. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-09-10 20:29:51 -07:00
Ian Romanick	31f0967fb5	i965: Make intel_miptree_map_raw static This hasn't been used outside intel_mipmap_tree.c since `d5d4ba9` started using meta instead of the blitter for PBO TexSubImage. While we're here, remove the unused brw parameter from the function formerly known as intel_miptree_unmap_raw. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-09-10 20:29:51 -07:00
Ian Romanick	68b44dd5b2	i915, i965: Silence unused parameter warnings in intel_mipmap_tree.h These only occurred in release builds, but they occurred in every file that included intel_mipmap_tree.h. Lots of spam. :( intel_mipmap_tree.h: In function 'intel_miptree_check_level_layer': intel_mipmap_tree.h:595:59: warning: unused parameter 'mt' [-Wunused-parameter] intel_miptree_check_level_layer(struct intel_mipmap_tree *mt, ^ intel_mipmap_tree.h:596:42: warning: unused parameter 'level' [-Wunused-parameter] uint32_t level, ^ intel_mipmap_tree.h:597:42: warning: unused parameter 'layer' [-Wunused-parameter] uint32_t layer) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-09-10 20:29:51 -07:00
Ian Romanick	094877f9d2	i965: Silence unused parameter warnings in intel_mipmap_tree.c The target parameter of compute_msaa_layout appears to be unused since `83b83fb` when support for CMS textures was added for Gen7. The brw parameter of intel_get_non_msrt_mcs_alignment appears to be unused since `e92fbdc` when the GEN check (along with the "can we fast clear" decision) was moved to a different function. intel_mipmap_tree.c: In function 'compute_msaa_layout': intel_mipmap_tree.c:62:73: warning: unused parameter 'target' [-Wunused-parameter] compute_msaa_layout(struct brw_context *brw, mesa_format format, GLenum target, ^ intel_mipmap_tree.c: In function 'intel_get_non_msrt_mcs_alignment': intel_mipmap_tree.c:143:54: warning: unused parameter 'brw' [-Wunused-parameter] intel_get_non_msrt_mcs_alignment(struct brw_context *brw, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: Ben Widawsky <benjamin.widawsky@intel.com>	2015-09-10 20:29:50 -07:00
Ian Romanick	38e412d548	i965: Silence unused parameter warnings in intel_fbo.c intel_fbo.c: In function 'intel_alloc_window_storage': intel_fbo.c:415:48: warning: unused parameter 'ctx' [-Wunused-parameter] intel_alloc_window_storage(struct gl_context * ctx, struct gl_renderbuffer *rb, ^ intel_fbo.c: In function 'intel_nop_alloc_storage': intel_fbo.c:428:74: warning: unused parameter 'rb' [-Wunused-parameter] intel_nop_alloc_storage(struct gl_context * ctx, struct gl_renderbuffer *rb, ^ intel_fbo.c:429:32: warning: unused parameter 'internalFormat' [-Wunused-parameter] GLenum internalFormat, GLuint width, GLuint height) ^ intel_fbo.c:429:55: warning: unused parameter 'width' [-Wunused-parameter] GLenum internalFormat, GLuint width, GLuint height) ^ intel_fbo.c:429:69: warning: unused parameter 'height' [-Wunused-parameter] GLenum internalFormat, GLuint width, GLuint height) ^ intel_fbo.c: In function 'intel_blit_framebuffer_with_blitter': intel_fbo.c:790:61: warning: unused parameter 'filter' [-Wunused-parameter] GLbitfield mask, GLenum filter) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-09-10 20:29:50 -07:00
Dave Airlie	b46cbc3607	st/mesa: set the vbuffer to NULL if we are skipping it If we skip a vbuffer we need to make sure we NULL out the contents, otherwise when it gets passed to the driver it will get confused. This was hit by: GL41-CTS.gpu_shader_fp64.varyings Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-11 03:05:42 +01:00
Jordan Justen	34cff76fc2	i965/cs: Enable barrier in MEDIA_INTERFACE_DESCRIPTOR Enable barrier in MEDIA_INTERFACE_DESCRIPTOR if the program uses the barrier() GLSL function. On Ivy Bridge and Haswell, this allows the piglit test tests/spec/arb_compute_shader/execution/simple-barrier-atomics.shader_test to pass. On gen8, this enables a similar test with a local group size of 896 to pass. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-10 16:46:29 -07:00
Jordan Justen	b01d047391	i965/cs: Emit texture surfaces to enable CS sampling Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-10 16:46:29 -07:00
Jordan Justen	1180b79487	i965: Set up sampler state for compute shaders Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-10 16:46:29 -07:00
Jordan Justen	af48612b88	i965/fs: Set first_non_payload_grf in assign_curb_setup first_non_payload_grf may be updated in assign_urb_setup for FS or assign_vs_urb_setup for VS. We need to set this in assign_curb_setup for compute shaders since cs does not have an assign_cs_urb_setup like assign_urb_setup (fs) or assign_vs_urb_setup (vs). Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-10 16:46:29 -07:00
Jordan Justen	75d04e561b	i965: Support compute shaders in is_scalar_shader_stage() Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-10 16:46:29 -07:00
Jordan Justen	2b9c35945a	i965: Support CS in update_stage_texture_surfaces Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-09-10 16:46:29 -07:00
Ilia Mirkin	bfc5ace5bd	i965: enable ARB_shader_texture_image_samples Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-10 17:39:46 -04:00
Ilia Mirkin	55ebaa6d00	i965: add handling for imageSamples Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-10 17:38:55 -04:00
Ilia Mirkin	56238305e5	nir: convert glsl imageSamples into a new intrinsic Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-10 17:38:52 -04:00
Ilia Mirkin	37c5c86281	glsl: add support for the imageSamples function Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-10 17:38:49 -04:00
Ilia Mirkin	0b91bcea98	i965: add support for textureSamples function Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> [v2: kayden-supplied code in fs_nir replacing need for logical opcode] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-10 17:38:45 -04:00
Ilia Mirkin	0c7fbcb844	glsl: add support for the textureSamples function Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-10 17:38:41 -04:00
Ilia Mirkin	fb18ee9ba6	glsl: add ARB_shader_texture_image_samples infrastructure Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-10 17:38:37 -04:00
Ilia Mirkin	1807a08e4f	nir: add nir_texop_texture_samples and convert from glsl Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-10 17:38:33 -04:00
Ilia Mirkin	f9052914e9	glsl: add ir_texture_samples texture opcode Will be used for textureSamples() Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-10 17:38:29 -04:00
Ilia Mirkin	6efae687b7	mesa: add infra for ARB_shader_texture_image_samples Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-10 17:37:05 -04:00
Ian Romanick	284dcad20a	i965: Fix typos in license grep -lr 'sub license' | while read f; do \ sed --in-place -e 's/sub license/sublicense/' $f ;\ done grep -lr 'NON-INFRINGEMENT' | while read f; do \ sed --in-place -e 's/NON-INFRINGEMENT/NONINFRINGEMENT/' $f ;\ done As noted by Matt, both of these changes match the MIT license text found at http://opensource.org/licenses/MIT. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2015-09-10 11:36:30 -07:00
Ian Romanick	aa1a5c0c9e	i965: Remove horizontal bars from file header comments Why was that ever a thing? Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2015-09-10 11:36:03 -07:00
Brian Paul	a9b143a648	svga: clean up the compile_vs/gs/fs() functions Simplify structure and remove gotos. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-09-10 12:23:46 -06:00
Brian Paul	289804515f	svga: fix shader variant memory leak Fixes a small leak in a seldom-hit corner case for VS/FS compilation. Found with coverity. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-09-10 12:23:46 -06:00
Brian Paul	ece33f9687	svga: remove useless MAX2() call The sum of two unsigned ints is always >= 0. Found with Coverity. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-09-10 12:23:46 -06:00
Brian Paul	bc75fe214d	winsys/svga: remove useless assertion An unsigned int is always >= 0. Found with Coverity. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-09-10 12:23:46 -06:00
Emil Velikov	9de62819c9	docs: add news item and link release notes for 10.6.7 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-10 19:12:38 +01:00
Emil Velikov	ded289e348	docs: add sha256 checksums for 10.6.7 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `8789dd627c`)	2015-09-10 19:10:58 +01:00
Emil Velikov	e3c5aeee71	docs: add release notes for 10.6.7 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `32efdc87cb`)	2015-09-10 19:10:57 +01:00
Krzesimir Nowak	423a1dca2f	docs: Update wrt. textureQueryLod on softpipe Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-10 09:45:14 -06:00
Krzesimir Nowak	60905f2b19	softpipe: Implement and enable textureQueryLod Passes the shader piglit tests and introduces no regressions. This commit finally makes use of the refactoring in previous commits. v2: - adapted the code to changes in previous commits (renames, need_cube_convert stuff) - splitted too long lines Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-10 09:45:14 -06:00
Krzesimir Nowak	263d4a7406	tgsi: Add code for handling lodq opcode This introduces new vfunc in tgsi_sampler just for this opcode. I decided against extending get_samples vfunc to return the mipmap level and LOD - the function's prototype is already too scary and doing the sampling for textureQueryLod would be a waste of time. v2: - splitted too long lines Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-10 09:45:14 -06:00
Krzesimir Nowak	d71a3be860	softpipe: Add functions for computing relative mipmap level These functions will be used by textureQueryLod. v2: - renamed mip_level_* funcs to mip_rel_level_* to indicate that these functions return mip level relative to base level and documented them - renamed a level member in sp_filter_funcs struct to relative_level - changed mip_rel_level_none and mip_rel_level_nearest to return mip level relative to base level, mip_rel_level_linear already did that - documented clamp_lod function Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-10 09:45:14 -06:00
Krzesimir Nowak	ac3637dda0	softpipe: Split 3D to 2D coords conversion into separate function This is to avoid tying the conversion to the sampling - textureQueryLod will need to do the conversion too, but it does not do any sampling. So instead of a "get_samples" vfunc, there is just a bool saying whether the conversion is needed or not. This solution keeps a nice property of not adding any overhead for the common case (2D textures). v2: - replaced the "convert_coords" vfunc with a "need_cube_convert" boolean to avoid overhead of copying arrays in common case - removed an unused typedef - splitted too long lines in convert_cube - const fixes in convert_cube Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-10 09:45:14 -06:00
Krzesimir Nowak	380a3c0804	softpipe: Split code getting a filter into separate function This function will be later used by textureQueryLod. The img_filter_func are optional, because textureQueryLod will not need them. v2: - adapted to changes in previous commit (renames) - simplified conditions a bit - updated docs - splitted too long lines Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-10 09:45:14 -06:00
Krzesimir Nowak	b9bc6c42c9	softpipe: Put mip_filter_func inside a struct Putting this function pointer into a struct enables grouping of several related functions in a single place. For now it is just a single function, but the struct will be later extended with a mip_level_func for returning relative mip level. v2: - renamed sp_mip struct to sp_filter_funcs - renamed sp_filter_funcs instances from mip_foo to funcs_foo - splitted too long lines - sp_sampler now holds a pointer to sp_filter_funcs instead of an instance of it - some const fixes Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-10 09:45:14 -06:00
Krzesimir Nowak	16084cd2cf	softpipe: Split compute_lambda_lod into two functions textureQueryLod returns a vec2 with a mipmap information and a LOD. The latter needs to be not clamped. v2: - changed the "not_clamped" part to "unclamped" - corrected "clamp into" to "clamp to" - splitted too long lines Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-10 09:45:14 -06:00
Krzesimir Nowak	bdc69552ca	softpipe: Fix textureLod with nonzero GL_TEXTURE_LOD_BIAS value The level-of-detail bias wasn't simply added in the explicit LOD case. This case seems to be tested only in piglit's fs-texturequerylod-nearest-biased test, which is currently skipped, as softpipe does not support textureQueryLod at the moment. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-10 09:45:13 -06:00
Krzesimir Nowak	85500fe2e1	tgsi: Remove trailing backslash in comment It clearly is here by accident. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-10 09:45:13 -06:00
Marek Olšák	b409524fef	gallium/radeon: handle PIPE_TRANSFER_FLUSH_EXPLICIT Basically, do the same thing as for buffer_unmap, but use the explicit range instead. It's for apps which want to map a whole buffer and mark touched ranges explicitly. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-10 17:14:15 +02:00
Marek Olšák	60ec8fb448	radeonsi: don't update polygon offset state if it has no effect Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-10 17:14:15 +02:00
Marek Olšák	afa752d3f0	radeonsi: decrease the size of si_pm4_state Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-10 17:14:15 +02:00
Marek Olšák	6a684ff67e	radeonsi/compute: add buffers to the CS directly Packets are emitted immediately anyway. Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-10 17:14:15 +02:00
Marek Olšák	2176b3b09f	radeonsi: only use new versions of LLVM image and sample intrinsics Just a cleanup I had made a long time ago and forgot about. v2: use tgsi_is_shadow_target Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-09-10 17:14:15 +02:00
Marek Olšák	e6d3846dd0	gallium/radeon: drop support for LLVM 3.4 This allows using the new tex intrinsics unconditionally. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-10 17:14:15 +02:00
Marek Olšák	5fbfd8dd23	r600/llvm: remove dead code for LLVM 3.3 LLVM 3.3 has been unsupported for quite a while. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-10 17:14:15 +02:00
Marek Olšák	5c6c5b5246	r600g: use pipe_resource::width0 instead of pb_buffer::size pb_buffer::size was aligned by `29aaab2b5f`, which broke the CMASK code I think. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91881 Cc: 11.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-09-10 17:14:15 +02:00
Marek Olšák	7956eae1c7	radeonsi: enable VGPR spilling on VI This fixes corruption in Unigine Heaven on VI Cc: 11.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-09-10 17:14:15 +02:00
Marek Olšák	c6502e880b	winsys/amdgpu: calculate the maximum number of compute units Required for register spilling. Cc: 11.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-09-10 17:14:15 +02:00
Jon TURNEY	adeba943e1	Use IMP_LIB_EXT when checking for LLVM shared libraries When checking for LLVM shared libraries, use IMP_LIB_EXT for the extension for shared libraries appropriate to the target, rather than hardcoding '.so' Also add some comments to explain why we have this circus of pain. Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-10 15:09:30 +01:00
Rhys Kidd	2c3007652d	i965: Resolve GCC sign-compare warning. mesa/src/mesa/drivers/dri/i965/brw_eu_compact.c: In function 'set_3src_control_index': mesa/src/mesa/drivers/dri/i965/brw_eu_compact.c:805:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = 0; i < ARRAY_SIZE(gen8_3src_control_index_table); i++) { ^ mesa/src/mesa/drivers/dri/i965/brw_eu_compact.c: In function 'set_3src_source_index': mesa/src/mesa/drivers/dri/i965/brw_eu_compact.c:839:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = 0; i < ARRAY_SIZE(gen8_3src_source_index_table); i++) { ^ mesa/src/mesa/drivers/dri/i965/brw_state_dump.c: In function 'dump_sampler_state': mesa/src/mesa/drivers/dri/i965/brw_state_dump.c:382:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < size / 16; i++) { ^ mesa/src/mesa/drivers/dri/i965/brw_state_upload.c: In function 'brw_pipeline_state_finished': mesa/src/mesa/drivers/dri/i965/brw_state_upload.c:801:13: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (i != pipeline) { ^ mesa/src/mesa/drivers/dri/i965/intel_mipmap_tree.c: In function 'intel_gen7_hiz_buf_create': mesa/src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1544:47: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int level = mt->first_level; level <= mt->last_level; ++level) { ^ mesa/src/mesa/drivers/dri/i965/intel_mipmap_tree.c: In function 'intel_gen8_hiz_buf_create': mesa/src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1638:44: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int level = mt->first_level; level <= mt->last_level; ++level) { ^ mesa/src/mesa/drivers/dri/i965/intel_mipmap_tree.c: In function 'intel_miptree_alloc_hiz': mesa/src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1771:44: warning: comparison between signed and unsigned 
integer expressions [-Wsign-compare] for (int level = mt->first_level; level <= mt->last_level; ++level) { ^ mesa/src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1775:33: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int layer = 0; layer < mt->level[level].depth; ++layer) { ^ Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-10 14:56:41 +01:00
Rhys Kidd	1c194840fd	mesa: Resolve GCC sign-compare warning. mesa/src/mesa/program/prog_to_nir.c: In function 'setup_registers_and_variables': /mesa/src/mesa/program/prog_to_nir.c:1059:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = 0; i < c->prog->NumTemporaries; i++) { ^ Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-10 14:56:41 +01:00
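The sign-compare warnings resolved in the commits around this point come from loops that compare a signed index against an unsigned bound. The following is a minimal sketch of that pattern and one common fix. The table and function names are hypothetical and are not taken from the Mesa sources.

```c
#include <stddef.h>

/* Hypothetical table standing in for a driver lookup table. */
static const unsigned example_table[] = { 10, 20, 30, 40 };

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* ARRAY_SIZE() yields a size_t, so an 'int i' loop index would trigger
 * -Wsign-compare. An unsigned index keeps both operands unsigned. */
int find_index(unsigned value)
{
   for (unsigned i = 0; i < ARRAY_SIZE(example_table); i++) {
      if (example_table[i] == value)
         return (int)i;
   }
   return -1;
}

/* Why the warning matters: in 'a < b' with int a and unsigned b, a is
 * converted to unsigned first, so a negative a becomes a very large value.
 * The cast below spells out that implicit conversion. */
int less_after_conversion(int a, unsigned b)
{
   return (unsigned)a < b;
}
```

With these definitions, `less_after_conversion(-1, 1u)` evaluates to 0, which is the surprising result the compiler warning is meant to flag.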
Rhys Kidd	32cdb49fe2	glsl: Resolve GCC sign-compare warning. mesa/src/glsl/nir/nir_lower_tex_projector.c: In function 'nir_lower_tex_projector_block': mesa/src/glsl/nir/nir_lower_tex_projector.c:63:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = 0; i < tex->num_srcs; i++) { ^ mesa/src/glsl/nir/nir_lower_tex_projector.c: In function 'nir_lower_tex_projector_block': mesa/src/glsl/nir/nir_lower_tex_projector.c:114:38: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = proj_index + 1; i < tex->num_srcs; i++) { ^ mesa/src/glsl/nir/nir_lower_tex_projector.c: In function 'nir_lower_tex_projector_block': mesa/src/glsl/nir/nir_lower_tex_projector.c:53:39: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (proj_index = 0; proj_index < tex->num_srcs; proj_index++) { ^ mesa/src/glsl/nir/nir_lower_tex_projector.c:57:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (proj_index == tex->num_srcs) ^ mesa/src/glsl/nir/nir_search.c: In function 'match_value': mesa/src/glsl/nir/nir_search.c:84:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = 0; i < num_components; ++i) ^ mesa/src/glsl/nir/nir_search.c: In function 'match_value': mesa/src/glsl/nir/nir_search.c:110:28: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (int i = 0; i < num_components; ++i) { ^ mesa/src/glsl/nir/nir_search.c: In function 'match_value': mesa/src/glsl/nir/nir_search.c:139:19: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (i < num_components) ^ mesa/src/glsl/nir/nir_opt_peephole_ffma.c: In function 'get_mul_for_src': mesa/src/glsl/nir/nir_opt_peephole_ffma.c:130:27: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (unsigned i = 0; i < num_components; i++) ^ 
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-10 14:56:41 +01:00
Rhys Kidd	548bf70fd2	mesa: Resolve GCC missing field initializer warning. Resolve a series of missing field initializer warnings within get_hash_params.py Of the form: In file included from mesa/src/mesa/main/get.c:495:0: mesa/src/mesa/main/get_hash.h:180:5: warning: missing initializer for field 'extra' of 'const struct value_desc' [-Wmissing-field-initializers] { GL_POINT_SIZE_ARRAY_BUFFER_BINDING_OES, LOC_CUSTOM, TYPE_INT, 0 }, ^ mesa/src/mesa/main/get.c:165:15: note: 'extra' declared here const int extra; ^ This patch addresses some likely code rot around the extra field, where the initialization is via C code generated indirectly from a Python script. It resolves a number of warnings reported by GCC when configured to be pedantic. $ gcc --version gcc (Ubuntu 4.9.2-10ubuntu13) 4.9.2 No piglit regressions on Ironlake. v2: - Squash series into a single patch. Signed-off-by: Rhys Kidd <rhyskidd@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-10 14:56:41 +01:00
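As an illustration of the missing-field-initializer warning described above, here is a small sketch. The struct is a hypothetical stand-in loosely modeled on the warning text; the real `struct value_desc` in get.c and its generated table entries may differ.

```c
/* Hypothetical descriptor; field names other than 'extra' are assumptions. */
struct example_desc {
   int pname;
   int location;
   int type;
   int offset;
   int extra;
};

/* Leaving out the trailing 'extra' would still be valid C (the field is
 * zero-initialized), but GCC reports it under -Wmissing-field-initializers.
 * Listing every field explicitly silences the warning. */
const struct example_desc example_entry = {
   1, /* pname: placeholder value */
   2, /* location: placeholder value */
   3, /* type: placeholder value */
   0, /* offset */
   0  /* extra: now stated explicitly */
};
```

The resulting object is identical either way; only the diagnostic changes.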
Albert Freeman	1691ead1b8	clover: Avoid using typename to allow compilation of clover by clang When parsing a variable declaration qualified with the typename keyword, clang attempted to declare a variable with the type of non type member "enum type type" of module::argument (within the header file clover/core/module.hpp) instead of the typed member of module::argument "enum type". Replaced "typename" with "enum" to force clang to declare the variable marg_type with type "enum type" of module::argument. CC: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Albert Freeman <albertwdfreeman@gmail.com>	2015-09-10 14:56:40 +01:00
Kenneth Graunke	bf58a2c362	i965: Advertise 65536 for GL_MAX_UNIFORM_BLOCK_SIZE. Our old value of 16384 is the minimum value. DirectX apparently requires 65536 at a minimum; that's also what nVidia and the Intel Windows driver advertise. AMD advertises MAX_INT. Ilia Mirkin noticed that "Shadow Warrior" uses UBOs larger than 16k on Nouveau, which advertises 65536 bytes for this limit. Traces captured on Nouveau don't work on i965 because our lower limit causes the GLSL linker to reject the captured shaders. While this isn't important in and of itself, it does suggest that raising the limit would be beneficial. We can read linear buffers up to 2^27 bytes in size, so raising this should be safe; we could probably even go larger. For now, matching nVidia and Intel/Windows seems like a good plan. We have to reinitialize MaxCombinedUniformComponents as core Mesa will have set it based on a stale value for MaxUniformBlockSize. According to Tapani, there's an unreleased game that asserts on this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-09-10 02:26:26 -07:00
Ilia Mirkin	74b86b971f	nv50/ir: don't fold immediate into mad if registers are too high Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91551 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-09-10 05:03:24 -04:00
Ilia Mirkin	ce28ca7133	nv50/ir: fix emission of 8-byte wide interp instruction This can come up if the target register number is > 63, which is fairly rare. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91551 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-09-10 04:30:45 -04:00
Ilia Mirkin	641eda0c79	nv50/ir: r63 is only 0 if we are using less than 63 registers It is advantageous to use r63 instead of r127 since r63 can fit into the shorter encoding. However if we've RA'd over 63 registers, we must use r127 as the replacement instead. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-09-10 04:30:45 -04:00
Ilia Mirkin	a072ef8748	nv50/ir: make edge splitting fix up phi node sources Unfortunately nv50_ir phi nodes aren't directly connected to the CFG, so the mapping between source and the actual BB is by inbound edge order. So when manipulating edges one has to be extremely careful. We were insufficiently careful when splitting critical edges which resulted in the phi nodes being confused as to where their sources were coming from. This primarily manifests itself with the TXL-lowering logic on nv50, when it is inside of a conditional. I've been unable to trigger the issue anywhere else so far. This resolves rendering failures in a number of games like Two Worlds 2, Trine: Enchanted Edition, Trine 2, XCOM:Enemy Unknown, Stacking. It also improves the situation in Hearthstone, Sonic Generations, and The Raven: Legacy of a Master Thief. However more work needs to be done there (splitting a lot more edges solves it, so it's some other sort of RA-related issue). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90887 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-09-10 03:11:31 -04:00
Ian Romanick	13a974f9ae	glsl: Remove ADD_VARYING macro The purpose of the macro was to create the name_as_gs_input from name. The previous commit removed the name_as_gs_input from add_varying, so the macro is unnecessary. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-09 19:15:15 -07:00
Ian Romanick	bd0245b8b2	glsl: Silence unused parameter warnings builtin_variables.cpp:1062:53: warning: unused parameter 'name_as_gs_input' [-Wunused-parameter] const char name_as_gs_input) ^ builtin_functions.cpp:4774:47: warning: unused parameter 'intrinsic_name' [-Wunused-parameter] const char intrinsic_name, ^ builtin_functions.cpp:4907:66: warning: unused parameter 'state' [-Wunused-parameter] _mesa_glsl_find_builtin_function_by_name(_mesa_glsl_parse_state state, ^ builtin_functions.cpp:4915:49: warning: unused parameter 'num_arguments' [-Wunused-parameter] unsigned num_arguments, ^ builtin_functions.cpp:4916:49: warning: unused parameter 'flags' [-Wunused-parameter] unsigned flags) ^ ir_print_visitor.cpp:589:37: warning: unused parameter 'ir' [-Wunused-parameter] ir_print_visitor::visit(ir_barrier ir) ^ linker.cpp:3212:48: warning: unused parameter 'ctx' [-Wunused-parameter] build_program_resource_list(struct gl_context ctx, ^ standalone_scaffolding.cpp:65:57: warning: unused parameter ‘id’ [-Wunused-parameter] _mesa_shader_debug(struct gl_context , GLenum, GLuint *id, ^ v2: Rebase on top of GL_ARB_shader_image_size work (especially `58a86897`). Silence more warnings added by that work. v3: Remove mention of the removed parameter from comments. Suggested by Iago. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> [v1] Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: "Martin Peres <martin.peres@linux.intel.com>"	2015-09-09 19:15:15 -07:00
Ilia Mirkin	342e68dc60	nvc0: remove BGRA4 format support Something is wrong with the support somewhere. I couldn't get the blob driver to use it either, although it happily used RGB5_A1. teximage-colors works, but WoW seems to fail in the menus for drawing text. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91526 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-09 21:54:47 -04:00
Rob Clark	9ce2e30726	gallium/ttn: fix cursor handling vs builder After inserting instructions the cursor.option becomes _after_instr (even if it started life as an _after_block). So we cannot simply stash the current cursor on the if/loop_stack. Otherwise we end up inserting instructions after the endif/endloop in the block preceding the if/loop. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-09 17:34:47 -04:00
Ilia Mirkin	e50c01d5af	nvc0: keep track of cb bindings per buffer, use for upload settings CB updates to bound buffers need to go through the CB_DATA endpoints, otherwise the shader may not notice that the updates happened. Furthermore, these updates have to go in to the same address as the bound buffer, otherwise, again, the shader may not notice updates. So we keep track of all the places where a constbuf is bound, and iterate over all of them when updating data. If a binding is found that encompasses the region to be updated, then we use the settings of that binding for the upload. Otherwise we upload as a regular data update. This fixes piglit 'arb_uniform_buffer_object-rendering offset' as well as blurriness in Witcher2. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91890 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-09-09 16:29:21 -04:00
Jason Ekstrand	b828f7a27b	nir/glsl: Use lower_outputs_to_temporaries instead of relying on GLSL IR Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-09-09 12:29:38 -07:00
Jason Ekstrand	1dbe4af9c9	nir: Add a pass to lower outputs to temporary variables This pass can be used as a helper for NIR producers so they don't have to worry about creating the temporaries themselves. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-09-09 12:29:21 -07:00
Jason Ekstrand	f5e08ab6b1	nir/cursor: Add a constructor for the end of a block but before the jump Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-09-09 12:28:51 -07:00
Hans de Goede	3e9df0e3af	nv30: Disable msaa unless requested from the env by NV30_MAX_MSAA Some modern apps try to use msaa without keeping in mind the restrictions on videomem of older cards, resulting in dmesg saying: [ 1197.850642] nouveau E[soffice.bin[3785]] fail ttm_validate [ 1197.850648] nouveau E[soffice.bin[3785]] validating bo list [ 1197.850654] nouveau E[soffice.bin[3785]] validate: -12 This happens because we are running out of video memory, after which the program using the msaa visual freezes, and eventually the entire system freezes. To work around this we do not allow msaa visuals by default and allow the user to override this via NV30_MAX_MSAA. Signed-off-by: Hans de Goede <hdegoede@redhat.com> [imirkin: move env var lookup to screen so that it's only done once] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-09 12:10:20 -04:00
Hans de Goede	ac066bf65c	nv30: Fix color resolving for nv3x cards We do not have a generic blitter on nv3x cards, so we must use the sifm object for color resolving. This commit divides the sources and dest surfaces in to tiles which match the constraints of the sifm object, so that color resolving will work properly on nv3x cards. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-09-09 11:57:34 -04:00
Rob Clark	30a915bd17	gallium/docs: clarify dmabuf fd ownership Since debugging issues w/ fd's close()d at the wrong time can be quite fun, this should probably be made more explicit in the docs. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-09-09 11:24:56 -04:00
Mauro Rossi	c12ffb30b4	android: radeonsi: add support for sid_tables.h generated sources This patch is necessary to avoid building error on android, due to missing sid_tables.h generated sources v2:[Emil Velikov] Correctly split the lists. Fixes: fbbebeae10f(radeonsi: inline si_cmd_context_control) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-09 15:27:31 +01:00
Mauro Rossi	8056b3ffeb	android: Always define __STDC_LIMIT_MACROS. Analogous to commit `02a4fe22b1` (configure.ac: Always define __STDC_LIMIT_MACROS.) v2: [Emil Velikov] keep the LLVM specific __STDC_FORMAT_MACROS Cc: 11.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-09 15:26:46 +01:00
Mauro Rossi	5235bfe7b7	android: rename LLVM_VERSION_PATCH to MESA_LLVM_VERSION_PATCH Fixes: 797f4eacea8(configure.ac: rename LLVM_VERSION_PATCH to avoid conflict with llvm-config.h) Cc: 11.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-09 15:26:06 +01:00
Mauro Rossi	e838d91b94	nouveau: android: add space before PRIx64 macro Otherwise the android build fails with error : unable to find string literal operator ‘operator"" PRIx64’ There are several resources referring to the problem, which is related to c++11, in our case used when building mesa for lollipop. http://comments.gmane.org/gmane.comp.graphics.opensg.user/5883 I've not investigated all the semantics, some people even suggested a bug in the gcc compiler, I just saw the build error was solved with one little space for lollipop, with no side effect when c++11 is not used. v2: [Emil Velikov] add an alternative commit message from Mauro. Cc: 11.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-09 15:25:35 +01:00
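The PRIx64 spacing issue above can be shown with a short sketch. In C++11, a string literal immediately followed by an identifier is parsed as a user-defined literal suffix, so `"..."PRIx64` fails to compile, while `"..." PRIx64` concatenates as intended. The helper below is hypothetical and is written in C, where either spelling compiles; it uses the spaced form that is safe in both languages.

```c
#include <inttypes.h>
#include <stdio.h>

/* Formats a 64-bit value as hexadecimal into buf and returns the number of
 * characters snprintf would have written. Note the space between the
 * closing quote and PRIx64. */
int format_hex64(char *buf, size_t size, uint64_t value)
{
   return snprintf(buf, size, "0x%" PRIx64, value);
}
```

Keeping the space costs nothing in C and avoids the C++11 parse error when the same header or source is compiled as C++.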
Emil Velikov	d9df8c2fa2	svga: pick all the files into the tarball Signed-off-by: Emil Velikov <emil.velikov@collabora.co.uk>	2015-09-09 14:52:34 +01:00
Emil Velikov	0d39279448	auxiliary: rework the python generated sources rules There are a few bits this commit aims to resolve: One can generalise the mkdir rule to a simple MKDIR_P $(@D) which will expand appropriately even if we change the subdir name, and/or add new rules. We can also drop the explicit $(srcdir) prefix for the dependency rules, as they are not strictly required, nor used elsewhere in mesa. Finally replace $< with explicit filename to be consistent through the file, and honour PYTHON_FLAGS. v2: Add comprehensive commit summary/message (Ian, Matt) Cc: 11.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-09 12:48:50 +01:00
Emil Velikov	c373eaedfc	glsl: build: remove bogus dependency v2: rebase on top of the previous commit - don't touch the LOCAL_PATH prefix for nir_constant_expressions.h Cc: 11.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-09 12:48:47 +01:00
Emil Velikov	a3b05e0492	glsl: build: use makefile.sources variables when possible Rather than folding one variable within the other only to unwrap them, just use the ones we need. v2: bring back LOCAL_PATH prefix for nir_constant_expressions.h Cc: 11.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)	2015-09-09 12:48:43 +01:00
Emil Velikov	da5e4559ee	glsl: automake: reuse $(NIR_GENERATED_FILES) where possible Cc: 11.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-09 12:48:39 +01:00
Emil Velikov	9e0594418d	glsl: automake: rework the sources generation rules The glsl equivalent of "mesa: automake: rework the source generation rules". Plus let's make things consistent and always explicitly provide the header name. v2: Rebase on top of reverted "remove custom AM_V_LEX/YACC" (Matt) Cc: 11.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-09 12:48:33 +01:00
Emil Velikov	fd913f47b7	mesa: automake: rework the source generation rules Same logic as previous commit applies. Additionally remove the odd (set -e/mv/INDENT) from the rules. The last one is the only one we remotely care about, if reading the generated sources. Upcoming work from DylanB which will replace the existing python scripts with ones that produce more readable output anyway. Cc: 11.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-09 12:48:29 +01:00
Emil Velikov	96509aa804	mapi: automake: rework the source generation rules Same logic as previous commit applies. Also fix bogus MESA_MAPI_DIR - the sources are located in the source dir (duh). Cc: 11.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-09 12:48:25 +01:00
Emil Velikov	449ce5d64f	mapi: automake: rework the *api/glapi_mapi_tmp.h rules Same logic as previous commit applies. v2: Merge with "inline glapi_gen_mapi define" (Matt) Cc: 11.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-09 12:48:18 +01:00
Emil Velikov	d65bd7a7be	util: automake: rework the format_srgb.c rule A handful of changes/cleanups paving the way to bmake support: - Remove optional $(srcdir)/ prefix for files in the prereq list. - Drop the space after the AM_V_GEN variable. - Using $< in a non-suffix rule is a GNU make idiom. - Use $(@D) over $(dir $@). The former is a POSIX standard. v2: Cosmetic tweaks in the commit summary. Cc: 11.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)	2015-09-09 12:48:09 +01:00
Emil Velikov	c8984a7a46	xmlpool: 'promote' LOCALEDIR variable This is the only place in mesa that uses this construct which seems to be GNUmake-ism. Attempting to build with POSIX make implementations (bmake) would fail as below. --- options.h --- LOCALEDIR := . sh: line 2: LOCALEDIR: command not found *** [options.h] Error code 127 So let's keep things consistent and compatible by making the variable non target specific. v2: - Bring back LOCALEDIR. - Reword the commit message - Change mesa-stable tag 10.6 > 11.0 Cc: 11.0 <mesa-stable@lists.freedesktop.org> Cc: Jonathan Gray <jsg@jsg.id.au> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-09 12:48:04 +01:00
Boyan Ding	63c4b7ee1e	egl_dri2: Add support for EGL_KHR_create_context when using swrast This requires swrast version >= 3. Also EGL_EXT_create_context_robustness is supported if __DRI2_ROBUSTNESS extension is found. Reference: https://bugs.freedesktop.org/show_bug.cgi?id=80821 Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>	2015-09-09 11:26:48 +01:00
Boyan Ding	6345d2da60	egl_dri2: Use createContextAttribs if swrast version >= 3 v2: Change return type of the new function from int to bool Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>	2015-09-09 11:25:55 +01:00
Boyan Ding	b9ea608c1a	egl_dri2: Move filling context_attrib array in a separate function v2: Change return type of the new function from int to bool Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>	2015-09-09 11:25:18 +01:00
Marta Lofstedt	b8d6de87f6	mesa: Allow query of GL_VERTEX_BINDING_BUFFER According to table 20.2 of the OpenGL ES 3.1 specification and table 23.4 of the OpenGL 4.4 specification, the glGetIntegeri_v functions should report the name of the buffer bound when called with GL_VERTEX_BINDING_BUFFER. Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-09-09 09:29:04 +02:00
Marta Lofstedt	ea69ae04db	mesa/es3.1: Enable GL_MAX_VERTEX_ATTRIB enums for GLES 3.1 Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-09-09 09:29:04 +02:00
Kenneth Graunke	0cc331dddd	i965/nir: Use nir_system_value_from_intrinsic to reduce duplication. This code is all pretty much identical. We just needed the translation from one enum value to the other. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-09-08 18:02:16 -07:00
Kenneth Graunke	d5d74d0b86	nir: Add a nir_system_value_from_intrinsic() function. This converts NIR intrinsics that load system values into Mesa's SYSTEM_VALUE_* enumerations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-09-08 18:02:08 -07:00
Kenneth Graunke	8fbc4ae330	i965: Mark topologies with adjacency information as G45+. These didn't exist on the original 965. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-08 18:00:42 -07:00
Kenneth Graunke	aa18fa30c5	i965: Fix value of _3DPRIM_TRIFAN_NOSTIPPLE. TRIFAN_NOSTIPPLE has always been 0x16 - 0x15 is marked "Reserved" on all platforms. See the 965 PRM, Volume 2, Table 3-1, "3D Primitive Topology Type Encoding" for a list. We don't currently use this, and I don't expect we will, but we may as well not leave the bogus value around. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-08 18:00:40 -07:00
Chris Forbes	70650094ef	i965: Add 64-bit dirty flag handling to brw_upload_pull_constants Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-08 18:00:36 -07:00
Chris Forbes	a9df772e0e	i965: Add defines for all new Gen7/8 URB opcodes Tessellation needs to emit URB reads and atomics. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-08 17:57:54 -07:00
Ben Widawsky	e8a219ab46	i965/gen8+: Skip depth stalls on state change Docs suggest this is no longer required starting with Gen8. Perf (no regressions in n=20) OglMultithread 0.67% OglTerrainPanInst 0.12% trex 0.45% warsow 0.64% Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-09-08 16:09:52 -07:00
Dave Airlie	6d2ceb10cd	r600: don't use shader key without verifying shader type (v2) Since `7a32652231` r600: Turn 'r600_shader_key' struct into union we were accessing key fields that might be aliased in the union with other fields, so we should check what shader type we are compiling for before using key values from it. v1.1: make it compile v2: have caffeine, make it work - we don't set type until later, so don't reference it until we've set it. Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-09 08:42:06 +10:00
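The aliasing hazard described in the r600 shader key commit can be sketched as follows. This is a minimal illustration with hypothetical type and field names, assuming a key union with one struct per shader stage; it is not the actual `r600_shader_key` layout.

```c
/* Hypothetical shader stages and key layout for illustration only. */
enum example_shader_type { EXAMPLE_VS, EXAMPLE_PS };

union example_shader_key {
   struct { unsigned as_es; } vs;
   struct { unsigned color_two_side; } ps;
};

/* Because vs and ps share storage, reading key->ps on a vertex shader
 * would return whatever was stored through key->vs. Check the shader type
 * first and only then read the matching member. */
unsigned example_uses_two_side(enum example_shader_type type,
                               const union example_shader_key *key)
{
   if (type != EXAMPLE_PS)
      return 0;
   return key->ps.color_two_side;
}
```

The same guard only works once the type is known, which matches the v2 note about not referencing the type until it has been set.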
Ben Widawsky	f5509874aa	i965/skl: Use more compact hiz dimensions I meant to do this here, but it was in the wrong place: commit `c1151b18f2` Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Wed Jun 24 20:07:54 2015 -0700 i965/skl: Use more compact hiz dimensions NOTE: Jordan did go back and look at the original mailing list post. I mailed the right thing, and pushed the wrong one. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-09-08 15:36:01 -07:00
Ilia Mirkin	458e55d7c5	st/mesa: increase viewport bounds limits for GL4 hw According to the ARB_viewport_array spec, GL4 limit is higher than the GL3 limit. Also take this opportunity to fix the GL3 limit. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-09-08 17:15:02 -04:00
Ilia Mirkin	39df725f73	nvc0: always emit a full shader colormask Indications are that if the colormask indicates a single bit set on fermi, that value will always be read from $r0 instead of a potentially higher register (if e.g. green is set). Not to upset the counting logic, always set the header up with a full color mask for each RT. Such a situation can basically only ever happen with generated blit shaders. Fixes the following piglit on Fermi (Kepler is unaffected): fbo-stencil blit GL_DEPTH32F_STENCIL8 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-08 17:13:12 -04:00
Brian Paul	a3b0b3fda5	docs: fix date formatting in index.html	2015-09-08 08:47:01 -06:00
Iago Toral Quiroga	205ff843ff	nir: UBO loads no longer use const_index[1] Commit `2126c68e5c` killed the array elements parameter on load/store intrinsics that was stored in const_index[1]. It looks like that patch missed removing this assignment in the UBO path. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-08 09:06:34 +02:00
Hans de Goede	87073c69f3	nv30: Fix max width / height checks in nv30 sifm code The sifm object has a limit of 1024x1024 for its input size and 2048x2048 for its output. The code checking this was trying to be clever resulting in it seeing a surface of e.g 1024x256 being outside of the input size limit. This commit fixes this. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-07 16:10:23 -04:00
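The sifm size limits quoted above (1024x1024 for input, 2048x2048 for output) suggest checking each dimension independently. The sketch below shows that straightforward form; the macro and function names are hypothetical, and the actual nv30 code may structure the check differently.

```c
#include <stdbool.h>

/* Limits taken from the commit message; names are assumptions. */
#define EXAMPLE_SIFM_MAX_INPUT  1024
#define EXAMPLE_SIFM_MAX_OUTPUT 2048

/* A surface fits if neither dimension exceeds the limit, so a 1024x256
 * input is accepted. */
bool example_sifm_input_fits(unsigned width, unsigned height)
{
   return width <= EXAMPLE_SIFM_MAX_INPUT &&
          height <= EXAMPLE_SIFM_MAX_INPUT;
}

bool example_sifm_output_fits(unsigned width, unsigned height)
{
   return width <= EXAMPLE_SIFM_MAX_OUTPUT &&
          height <= EXAMPLE_SIFM_MAX_OUTPUT;
}
```

Testing width and height separately avoids rejecting surfaces that are at the limit in one dimension but well under it in the other.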
Chris Wilson	be519c2d50	i965: Disallow fast blit paths for CopyTexImage with PixelTransfer ops glCopyTexImage behaves similarly to glReadPixels with respect to the pixel transfer operations. Therefore if any are set we cannot use the simple blit-only fast paths. (Though it would be possible to relax the blorp path to handle pixel zoom, or we can just enhance meta.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral <itoral@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2015-09-07 20:50:07 +01:00
Jon TURNEY	a1575b55c2	mesa/tests: Remove unneeded X11_CFLAGS X11_CFLAGS is never defined. Path to X11 headers is not needed here, so just remove. Future work: Using AM_CFLAGS here looks wrong, as this Makefile only builds C++ files Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-07 10:43:32 +01:00
Jon TURNEY	5f9c72ad23	glxl/tests: Use X11_INCLUDES instead of X11_CFLAGS X11_CFLAGS is undefined, so these tests will fail to build if x11proto is installed in a non-standard location. (See also commits `35189d76`, `bc93c3798`, `54b028ba`, `d901d7e08`, etc.) Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-07 10:43:32 +01:00
Thomas Hellstrom	f1ef89eaab	svga: Fix surface view error handling Make sure errors are correctly propagated. Also don't flush during state emission if emission fails. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-07 01:25:08 -07:00
Rob Clark	1432a18241	xa: add xa_surface_from_handle2 v2 Like xa_surface_from_handle(), but takes a handle type, rather than hard-coding 'shared' handle. This is needed to fix bugs seen with xf86-video-freedreno with xrandr rotation, for example. The root issue is that doing a GEM_OPEN ioctl on a bo that already has a GEM handle associated with the drm_file will result in two unique handles for the same bo. Which causes all sorts of follow-on fail. v2: - Add support for fd handles. - Avoid duplicating code. - Bump xa version minor. Signed-off-by: Rob Clark <robclark@freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>	2015-09-07 01:25:08 -07:00
Alejandro Piñeiro	00c568f679	i965/nir/vec4: removed unneeded tex src swizzle set At that point the swizzle should be correct. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-07 10:10:42 +02:00
Ilia Mirkin	ae535cb0bf	util: make mesa-sha1.c completely empty when there are no SHA1 impls My earlier attempt to fix this missed the fact that there was a #else clause that assumes that you have openssl. This moves the whole thing under #ifdef HAVE_SHA1 which should avoid this issue. Fixes: `13bfa5201` (util: always include sha1 into the build) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91898 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@gmail.com>	2015-09-07 00:18:12 -04:00
Ilia Mirkin	13bfa52011	util: always include sha1 into the build SHA1 is now used in all builds when HAVE_SHA1 is defined. Adjust src to do the same thing, rather than predicating on shader cache. Fixes: `04e201d0c0` ("mesa: change 'SHADER_SUBST' facility to work with env variables") Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@gmail.com>	2015-09-06 16:11:24 -04:00
Ilia Mirkin	e40f32d562	st/mesa: don't fall back to 16F when 32F is requested Nothing in the spec allows for the reduced precision, and this also fixes st_QuerySamplesForFormat for nv50, which does not allow MS8 on RGBA32F. Now this will be respected instead of reporting MS8 as supported with an assumption that the format used will be RGBA16F. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-09-06 14:15:59 -04:00
Ilia Mirkin	bfd3d5244b	st/mesa: properly handle u_upload_alloc failure vbuf is never null. We want to make sure that a resource was allocated for the vbuf, which is *vbuf. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-09-06 11:32:07 -04:00
Ilia Mirkin	a778831735	nouveau: don't mark full range as used on unmap with explicit flush Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-09-05 23:04:23 -04:00
Ilia Mirkin	c830d193db	nv50: avoid using inline vertex data submit when gl_VertexID is used The hardware only generates vertexid when vertices come from a VBO. This fixes: vertexid-drawelements vertexid-drawarrays Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-09-05 23:04:21 -04:00
Ilia Mirkin	4a025c6bc8	nv50: don't flush vertex arrays when index buffer changes The index buffer is fed in inline over a pushbuf. It's not related to vertices or any caching that might be done on them. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-09-05 23:04:18 -04:00
Ilia Mirkin	1f62d36ae2	nv50: rebind bo to bufctx when invalidating idxbuf storage There is nothing to be done on a dirty idxbuf, but the bo may have changed, so we have to rebind it to the bufctx. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-09-05 23:04:15 -04:00
Ilia Mirkin	114cc18b98	nv50: clear buffer status on all vertex bufs, not just the first one Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-09-05 23:04:08 -04:00
Ilia Mirkin	75e34d1df8	nv50: fix drawing from tfb, direct-to-pushbuf submits The stride was being set to 0, which is illegal (and also non-sensical). Also we must wait for the buffer to become available for reading as otherwise a wrong value may be prefetched. Since we must wait for the buffer anyways, and it's mapped and in GART, we may as well avoid the annoyance of the indirect pushbuf submit. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2015-09-05 23:03:52 -04:00
Ben Widawsky	5165e464f2	i965: Remove base miplevel from sampler state. Gen9 changes the meaning of this to coarse LOD quality mode. Although that's a desirable thing to be setting, it doesn't match the gen8 behavior and this was unintentional. More importantly, we don't ever use this field. So instead of getting it "wrong" drop it entirely. This is a respin of a patch which only [incorrectly] tried to address gen9. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-04 16:05:02 -07:00
Emil Velikov	509ba61d5a	docs: add news item and link release notes for 10.6.6 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-04 23:11:40 +01:00
Emil Velikov	f39bc1c828	docs: add sha256 checksums for 10.6.6 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `e3e2a3e0e5`)	2015-09-04 23:10:09 +01:00
Emil Velikov	5685ed72b8	docs: add release notes for 10.6.6 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `4b05739e9d`)	2015-09-04 23:10:07 +01:00
Oded Gabbay	4f2290d161	llvmpipe: convert double to long long instead of unsigned long long round(val*dscale) produces a double result, as val and dscale are double. However, LLVMConstInt receives unsigned long long, so there is an implicit conversion from double to unsigned long long. This is an undefined behavior. Therefore, we need to first explicitly convert the round result to long long, and then let the compiler handle conversion from that to unsigned long long. This bug manifests itself in POWER, where all IMM values of -1 are being converted to 0 implicitly, causing a wrong LLVM IR output. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-09-04 17:37:17 -04:00
Hans de Goede	3c6c4d4f29	nv30: Implement color resolve for msaa Note this is not ideal. Since the sifm can only do source sizes up to 1024x1024 we end up using the blitter on nv4x, which is not that fast. And on nv3x we end up using the cpu which is really slow. Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-04 16:07:08 -04:00
Hans de Goede	3329703eb1	nv30: Fix creation of scanout buffers Scanout buffers on nv30 must always be non-swizzled and have special width alignment constraints. These constraints have been taken from the xf86-video-nouveau src/nv_accel_common.c: nouveau_allocate_surface() function. nouveau_allocate_surface() applies these width constraints only when a tiled attribute is set, which it sets for all surfaces allocated via dri, and this "tiling" is not the same as swizzling, scanout surfaces must be linear / have a uniform_pitch or only complete garbage is shown. This commit fixes dri3 on nv30 showing a garbled display, with dri3 the scanout buffers are allocated by mesa, rather than by the ddx, and the wrong stride of these buffers was causing the garbled display. Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-04 16:07:08 -04:00
Boyan Ding	48de40ce9c	vc4: Initialize pack field of qreg to 0 in qir_get_temp This avoids generation of undefined packing in qir and qpu instructions, fixing a lot of rendering errors. Fixes `8b36d107fd` (vc4: Pack the unorm-packing bits into a src MUL instruction when possible.) Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-09-04 12:16:07 -07:00
Chris Wilson	099f5b3a62	i965: Disallow PixelTransfer operations for tiled-memcpy TexImage/ReadPixels The tiled memcpy fast paths perform a simple blit (with only a couple of trivial pixel conversion routines) and do not accommodate PixelTransfer operations. Therefore if any are set, fallback to the regular routines. Note that PixelTransfer only applies to TexImage and ReadPixels, not to GetTexImage. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2015-09-04 20:11:15 +01:00
Iago Toral Quiroga	96ea166308	i965/vec4: Don't unspill the same register in consecutive instructions If we have spilled/unspilled a register in the current instruction, avoid emitting unspills for the same register in the same instruction or consecutive instructions following the current one as long as they keep reading the spilled register. This should allow us to avoid emitting costly unspills that come with little benefit to register allocation. v2: - Apply the same logic when evaluating spilling costs (Curro). v3: - Abstract the logic that decides if a register can be reused in a function that can be used from both spill_reg and evaluate_spill_costs (Curro). v4: - Do not disallow reusing scratch_reg in predicated reads (Curro). - Track if previous sources in the same instruction read scratch_reg (Curro). - Return prev_inst_read_scratch_reg at the end (Curro). - No need to explicitly skip scratch read/write opcodes in spill_reg (Curro). - Fix the comments explaining what happens when we hit an instruction that does not read or write scratch_reg (Curro) - Return true early when the current or previous instructions read scratch_reg with a compatible mask. v5: - Do not return true early, the loop should not be expensive anyway and this adds more complexity (Curro). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-09-04 15:13:49 +02:00
Iago Toral Quiroga	bd6e516fc2	i965: Add a debug option for spilling everything in vec4 code Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-09-04 12:49:36 +02:00
Francisco Jerez	6cf4142db8	dri/common: Tokenize driParseDebugString() argument before matching debug flags. Fixes debug string parsing when one of the supported flags is a substring of another. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-09-04 12:49:36 +02:00
Francisco Jerez	3d4f75506c	dri/common: Fix codestyle of driParseDebugString(). Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-09-04 12:49:36 +02:00
Tapani Pälli	08e9049e3d	glsl: error out on ES 3.1 if VS or FS present but not both Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-04 09:22:24 +03:00
Tapani Pälli	69678953d1	glsl: error on linking if no shaders are attached to program This applies to OpenGL Core >= 4.5 and OpenGL ES >= 3.1. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-04 09:01:00 +03:00
Kenneth Graunke	4323e78d3f	i965: Improve disassembly of data port read messages. We now print out the name of the message instead of its numerical value, and label the message control and surface numbers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-03 22:31:04 -07:00
Kenneth Graunke	0e23c246c0	i965: Optimize VUE map comparisons. The entire VUE map is computed based on the slots_valid bitfield; calling brw_compute_vue_map on the same bitfield will return the same result. So we can simply compare those. struct brw_vue_map is 136 bytes; doing a single 8-byte comparison is much cheaper and should work just as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-03 22:31:04 -07:00
Kenneth Graunke	6e03377daf	i965/gs: Don't reserve space for clip plane uniforms. These were only for legacy userclipping, which we no longer support in geometry shaders. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-03 22:31:04 -07:00
Kenneth Graunke	fba4823a91	i965: Don't do legacy userclipping in non-compatibility contexts. According to the GLSL 1.50 specification, page 76: "The shader must also set all values in gl_ClipDistance that have been enabled via the OpenGL API, or results are undefined." With this patch, we only enable clip distance writes when the shader actually writes them. We no longer force a value to be written when clip planes are enabled in the API. This could mean the first varying slot would be used as clip distances - I believe it should be the safe kind of undefined behavior. Empirically, it doesn't seem to cause a problem. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-03 22:31:04 -07:00
Kenneth Graunke	4f4b7c4711	i965: Remove the brw_vue_prog_key base class. The legacy userclip fields are only used for the vertex shader, and at that point there's only program_string_id and the tex struct, which are common to all keys. So there's no need for a "VUE" key base class. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-03 22:31:04 -07:00
Kenneth Graunke	3239621825	i965: Virtualize vec4_visitor::emit_urb_slot(). This avoids a downcast of key, which won't exist in the base class soon. I'm not a huge fan of this patch, but given that we're currently using inheritance, this seems like the "right" way to do it. The alternative is to make key a void pointer in the parent class and continue downcasting. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-03 22:31:03 -07:00
Kenneth Graunke	27e83b62bb	i965: Store a key_tex pointer in vec4_visitor. I'm about to remove the base class for VS/GS/HS/DS program keys, at which point we won't be able to use key->tex anymore. Instead, we'll need to store a direct pointer (like we do in the FS backend). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-03 22:31:03 -07:00
Kenneth Graunke	014b90221a	i965: Move legacy clip plane handling to vec4_vs_visitor. This is now only used for the vertex shader, so it makes sense to get it out of any paths run by the geometry shader. Instead of passing the gl_clip_plane array into the run() method (which is shared among all subclasses), we add it as a vec4_vs_visitor constructor parameter. This eliminates the bogus NULL parameter in the GS case. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-03 22:31:03 -07:00
Kenneth Graunke	082b7f1876	i965: Delete the brw_vue_program_key::userclip_active flag. There are two uses of this flag. The primary use is checking whether we need to emit code to convert legacy gl_ClipVertex/gl_Position clipping to clip distances. In this case, we also have to upload the clip planes as uniforms, which means setting nr_userclip_plane_consts to a positive value. Checking if it's > 0 works for detecting this case. Gen4-5 also wants to know whether we're doing clipping at all, so it can emit user clip flags. Checking if output_reg[VARYING_SLOT_CLIP_DIST0] is set to a real register suffices for this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-03 22:31:03 -07:00
Kenneth Graunke	294282aaa6	i965: Remove legacy clip plane handling from geometry shaders. We only support geometry shaders in core profiles, where gl_ClipVertex doesn't exist. Presumably the even older behavior of clipping to gl_Position isn't supported either. In fact, GLSL 1.50 page 76 claims: "The shader must also set all values in gl_ClipDistance that have been enabled via the OpenGL API, or results are undefined." So we don't need to handle legacy clipping in geometry shaders. I think Paul added this back when we were considering supporting the old GL_ARB_geometry_shader4 extension. This removes a non-orthogonal state dependency on GS compilation. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-03 22:31:03 -07:00
Kenneth Graunke	a2151560b8	i965: Move brw_setup_tex_for_precompile to brw_program.[ch]. This living in brw_fs.{h,cpp} is a historical artifact of us supporting texturing for fragment shaders before any other stages. It's kind of awkward given that we use it for all stages. This avoids having to include brw_fs.h in geometry shader code in order to access this function. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-09-03 22:31:03 -07:00
Tapani Pälli	04e201d0c0	mesa: change 'SHADER_SUBST' facility to work with env variables Patch modifies existing shader source and replace functionality to work with environment variables rather than enable dumping on compile time. Also instead of _mesa_str_checksum, _mesa_sha1_compute is used to avoid collisions. Functionality is controlled via two environment variables: MESA_SHADER_DUMP_PATH - path where shader sources are dumped MESA_SHADER_READ_PATH - path where replacement shaders are read v2: cleanups, add strerror if fopen fails, put all functionality inside HAVE_SHA1 since sha1 is required Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Suggested-by: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-04 08:22:37 +03:00
Tapani Pälli	0db323a624	build: add HAVE_SHA1 define when using --with-sha1 option Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Brian Paul <brianp@vmware.com>	2015-09-04 08:05:24 +03:00
Kenneth Graunke	2ace64fd59	i965: Fix copy propagation type changes. commit `472ef9a02f` introduced code to change the types of SEL and MOV instructions for moves that simply "copy bits around". It didn't account for type conversion moves, however. So it would happily turn this: mov(8) vgrf6:D, -vgrf5:D mov(8) vgrf7:F, vgrf6:UD into this: mov(8) vgrf6:D, -vgrf5:D mov(8) vgrf7:D, -vgrf5:D which erroneously drops the conversion to float. Cc: "11.0 10.6" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-03 21:12:54 -07:00
Dave Airlie	5fa5a012b1	r600: fix loop overrun in cayman_mul_double_instr Coverity warned about this. Ilia pointed it out. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-04 08:02:14 +10:00
Ben Widawsky	b05619c627	i965/gen9: Annotate input coverage mask change As far as I can tell, the behavior is preserved from the previous generations. Before we set a single bit to tell the FS whether or not we'll be using an input coverage mask. Now we have some options which are implementing various extensions. These bits are used for the various conservative rasterization mechanisms (for collision detection, binning, and whatever else). I believe that the behavior is preserved because the problem which conservative rasterization is attempting to fix would go away with the "NORMAL" mode (at the cost of performance, I believe). This patch serves as documentation of the change by creating the enums, as well as giving some of the history with the links here so that the next person who comes along and looks at it doesn't spend as long as I had to in order to determine if there is an issue or not. Previously, this algorithm had been done in software, and this can still be used as long as we don't export an extension stating otherwise. References: https://www.opengl.org/registry/specs/NV/conservative_raster.txt References: https://http.developer.nvidia.com/GPUGems2/gpugems2_chapter42.html Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-03 11:55:31 -07:00
Brian Paul	70dbdca15f	svga: update call to u_upload_alloc() u_upload_alloc() no longer returns a return value. Trivial.	2015-09-03 11:24:24 -06:00
Marek Olšák	efea7c3a3f	winsys/radeon: remove exported buffers from the cache Cc: 11.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-09-03 18:41:45 +02:00
Marek Olšák	54964c7751	winsys/amdgpu: remove exported buffers from the cache Cc: 11.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-09-03 18:41:42 +02:00
Marek Olšák	35d0f12797	gallium/pb_bufmgr_cache: add a way to remove buffers from the cache explicitly This must be done before exporting a buffer as dmabuf fds, because we lose track of who is using it and can't trust the reference counter. Cc: 11.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-09-03 18:41:40 +02:00
Marek Olšák	44dbaa1746	u_upload_mgr: remove the return value from u_upload_data Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-03 18:14:50 +02:00
Marek Olšák	0c5df863ba	u_upload_mgr: remove the return value from u_upload_buffer Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-03 18:14:48 +02:00
Marek Olšák	b4f7639955	u_upload_mgr: remove the return value from u_upload_alloc_buffer Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-03 18:14:43 +02:00
Marek Olšák	8c6ff05517	u_upload_mgr: remove the return value from u_upload_alloc The return buffer or the returned pointer can be used instead. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-03 18:14:09 +02:00
Marek Olšák	6c1e368cf3	u_upload_mgr: optimize u_upload_alloc This is probably the most called util function. It does almost nothing, yet it can consume 10% of the CPU on the profile. This drops it down to 5%. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-03 18:09:13 +02:00
Grazvydas Ignotas	722ce74743	gallium/radeon: remove 'dirty' member from r600_atom It's no longer used by both r600 and radeonsi now. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-09-03 18:06:51 +02:00
Grazvydas Ignotas	ccbc7952a4	r600g: simplify dirty atom tracking Now that R600_NUM_ATOMS is below 64, dirty atom tracking can be simplified. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-09-03 18:06:42 +02:00
Grazvydas Ignotas	6ef4572937	r600g: start numbering atoms from 1 There doesn't seem any reason to start from 4. Start from 1 instead (0 is left reserved to catch uninitialized atoms). Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-09-03 18:06:29 +02:00
Grazvydas Ignotas	4d9af438bc	r600g: make all viewport states use single atom Similarly to scissor states, we can use a single atom to track all viewport states. This will allow simplifying dirty atom handling later. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-09-03 18:06:14 +02:00
Grazvydas Ignotas	fbb423b433	r600g: apply disable workaround on all scissors During review of the "r600g: make all scissor states use single atom" patch Marek Olšák noticed that scissor disable workaround should be applied on all scissor states and not just first one, so let's do so. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-09-03 18:05:58 +02:00
Grazvydas Ignotas	7d475bad66	r600g: make all scissor states use single atom As suggested by Marek Olšák, we can use a single atom to track all scissor states. This will allow simplifying dirty atom handling later. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-09-03 18:05:54 +02:00
Neil Roberts	ce181aea6c	mesa/pbo: Handle zero width, height or depth when validating access It's legal to call glTexSubImage with zero values for the width, height or depth. Previously this was breaking the PBO access validation because it tries to work out the last pixel accessed by getting the pixel at height-1 and depth-1 which would end up with bogus values. This was causing GL errors to be generated during the Piglit texsubimage test, although the test was passing anyway. v2: Also check for width == 0. Don't validate the start pointer if any of the dimensions are zero. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-03 17:00:54 +01:00
Kenneth Graunke	30e84530a0	glsl: Remove unused total_attribs_size variable. Accidentally left behind by my previous patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-03 00:56:18 -07:00
Kenneth Graunke	c3294ca5a1	glsl: Handle attribute aliasing in attribute storage limit check. In various versions of OpenGL and GLSL, it's possible to declare multiple VS input variables with aliasing attribute locations. So, when computing the storage requirements for vertex attributes, we can't simply add up the sizes. Instead, we need to look at the enabled slots. This patch begins tracking which attributes are double types that are larger than 128-bits (i.e. take up two vec4 slots). We then count normal attributes once, and count the double-size attributes a second time. Fixes deQP functional.attribute_location.bind_aliasing.max_cond_* tests on i965, which regressed with commit `ad208d975a`. No Piglit changes on llvmpipe (which actually supports dvecs). Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-02 23:28:20 -07:00
Ian Romanick	6e37304521	i965/meta: Fix typo in comment Trivial. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2015-09-02 16:24:18 -07:00
Ian Romanick	7237c937af	mesa: Don't allow wrong type setters for matrix uniforms Previously we would allow glUniformMatrix4fv on a dmat4 and glUniformMatrix4dv on a mat4. Both are illegal. The latter also overwrites the storage for the mat4 and causes bad things to happen. Should fix the (new) arb_gpu_shader_fp64-wrong-type-setter piglit test. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Cc: Dave Airlie <airlied@redhat.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-02 16:24:17 -07:00
Ian Romanick	a6976f0972	mesa: Pass the type to _mesa_uniform_matrix as a glsl_base_type This matches _mesa_uniform, and it enables the bug fix in the next patch. v2: s/type/basicType/ in the assert in _mesa_uniform_matrix. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> [v1] Cc: Dave Airlie <airlied@redhat.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-09-02 16:24:17 -07:00
Ian Romanick	882aab00ab	mesa: Silence unused parameter warnings in bufferobj.c main/bufferobj.c: In function 'count_buffer_size': main/bufferobj.c:520:26: warning: unused parameter 'key' [-Wunused-parameter] count_buffer_size(GLuint key, void *data, void *userData) ^ main/bufferobj.c: In function 'flush_mapped_buffer_range_fallback': main/bufferobj.c:740:56: warning: unused parameter 'index' [-Wunused-parameter] gl_map_buffer_index index) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-02 16:24:17 -07:00
Ian Romanick	8ba3b7661b	mesa: Remove target parameter from _mesa_handle_bind_buffer_gen main/bufferobj.c: In function '_mesa_handle_bind_buffer_gen': main/bufferobj.c:915:37: warning: unused parameter 'target' [-Wunused-parameter] GLenum target, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-02 16:24:17 -07:00
Ian Romanick	1e4d3d25ff	i965: Make gen7_enable_hw_binding_tables static All of the other state upload functions are static because the only use is in the brw_tracked_state structure. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2015-09-02 16:24:17 -07:00
Ian Romanick	97ce8bd437	i965: Make gen8_upload_state_base_address static All of the other state upload functions are static because the only use is in the brw_tracked_state structure. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-09-02 16:24:17 -07:00
Ian Romanick	4ff9e599cb	linker: Silence GCC unused parameter warnings linker.cpp:320:55: warning: unused parameter 'ir' [-Wunused-parameter] virtual ir_visitor_status visit_leave(ir_function *ir) ^ linker.cpp:327:53: warning: unused parameter 'ir' [-Wunused-parameter] virtual ir_visitor_status visit_leave(ir_return *ir) ^ linker.cpp:333:49: warning: unused parameter 'ir' [-Wunused-parameter] virtual ir_visitor_status visit_enter(ir_if *ir) ^ linker.cpp:339:49: warning: unused parameter 'ir' [-Wunused-parameter] virtual ir_visitor_status visit_leave(ir_if *ir) ^ linker.cpp:345:51: warning: unused parameter 'ir' [-Wunused-parameter] virtual ir_visitor_status visit_enter(ir_loop *ir) ^ linker.cpp:351:51: warning: unused parameter 'ir' [-Wunused-parameter] virtual ir_visitor_status visit_leave(ir_loop *ir) ^ linker.cpp:2824:53: warning: unused parameter 'ctx' [-Wunused-parameter] link_calculate_subroutine_compat(struct gl_context *ctx, struct gl_shader_program *prog) ^ linker.cpp:2854:47: warning: unused parameter 'ctx' [-Wunused-parameter] check_subroutine_resources(struct gl_context *ctx, struct gl_shader_program *prog) ^ linker.cpp:3368:49: warning: unused parameter 'ctx' [-Wunused-parameter] link_assign_subroutine_types(struct gl_context *ctx, ^ Also make link_assign_subroutine_types static since it is only called from this file. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-02 16:24:17 -07:00
Ian Romanick	8fafb0a67f	mesa: Fix warning about static being in the wrong place Because the compiler already has enough things to complain about. grep -rl 'const static' src/ | while read f do sed --in-place -e 's/const static/static const/g' $f done brw_eu_emit.c: In function 'brw_reg_type_to_hw_type': brw_eu_emit.c:98:7: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] const static int imm_hw_types[] = { ^ brw_eu_emit.c:120:7: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] const static int hw_types[] = { ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-02 16:24:17 -07:00
Jordan Justen	06ada493fb	i965/cs: Setup push constant data for uniforms brw_upload_cs_push_constants was based on gen6_upload_push_constants. v2: * Add FINISHME comments about more efficient ways to push uniforms Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-09-02 14:17:24 -07:00
Jordan Justen	4bdd5e09c3	meta: Save/restore compute shaders Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-09-02 14:17:24 -07:00
Charmaine Lee	4a9480b64a	svga: fix referencing a NULL framebuffer cbuf Check for a valid framebuffer cbuf pointer before accessing its associated surface. Fix piglit test fbo-drawbuffers-none. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-02 13:22:42 -06:00
Charmaine Lee	5a5e5e3959	svga: increment texture age when surface is to be marked as dirty Commit b9ba8492 removes an unneeded pipe_surface_release() from st_render_texture(). This implies a surface can now be reused for a render buffer. Currently, when we render to a texture, we mark the surface as dirty. But in svga_mark_surface_dirty(), if the surface is already marked as dirty, it does not increment the texture age. Any view to this texture might not be updated properly then. With this patch, the texture age is incremented regardless of whether the surface is already marked as dirty or not. Fix bug 1499181. Reviewed-by: Sinclair Yeh <syeh@vmware.com>	2015-09-02 13:22:42 -06:00
Charmaine Lee	b2fd41ce46	svga: fix backed surface view regression Commit b9ba8492 removes an unneeded pipe_surface_release() from st_render_texture() and exposes a bug in the backed surface view creation. Currently a backed surface view for a conflicted surface view is created at framebuffer emit time. But if shader sampler views are changed but framebuffer surface views remain unchanged, emit_framebuffer() will not be called and conflicted surface views will not be detected. To fix this, also check for conflicted surface views when setting sampler views. If there are any conflicted surface views, enable the framebuffer dirty bit so that the framebuffer emit code has a chance to create a backed surface view for the conflicted surface view. Fix cinebench-r11-test regression. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-02 13:22:42 -06:00
Matt Turner	9390cb8459	i965/fs: Handle MRF destinations in lower_integer_multiplication(). The lowered code reads from the destination, which isn't possible from message registers. Fixes the following dEQP tests on SNB: dEQP-GLES3.functional.shaders.precision.int.highp_mul_fragment dEQP-GLES3.functional.shaders.precision.int.mediump_mul_fragment dEQP-GLES3.functional.shaders.precision.int.lowp_mul_fragment Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-02 11:52:10 -07:00
Brian Paul	4fd314852c	docs: document VMware OpenGL 3.3 support Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:27:43 -06:00
Brian Paul	e054251ed1	svga: update driver for version 10 GPU interface This is a squash commit of roughly two years of development work. Authors include: Brian Paul Charmaine Lee Thomas Hellstrom Jakob Bornecrantz Sinclair Yeh Mingcheng Chen Kai Ninomiya MengLin Wu The driver supports OpenGL 3.3. Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:27:43 -06:00
Brian Paul	656dac120d	svga: add new version 10 device command prototypes Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:27:43 -06:00
Brian Paul	e8c20d97eb	svga: add new svga_streamout.h file Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:05:24 -06:00
Brian Paul	8ddf98d671	svga: add new svga_state_tgsi_transform.c file Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:05:24 -06:00
Brian Paul	26d8bae889	svga: add new svga_state_sampler.c file Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:05:23 -06:00
Brian Paul	a633948e7e	svga: add new svga_state_gs.c file Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:05:23 -06:00
Brian Paul	ff85bcdba2	svga: add new svga_pipe_streamout.c file Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:05:23 -06:00
Brian Paul	7ce20cf59a	svga: add new svga_pipe_gs.c file Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:05:23 -06:00
Brian Paul	9cb2d9ddfa	svga: add new svga_link.[ch] files Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:05:23 -06:00
Brian Paul	53d07910c3	svga: add new svga_cmd_vgpu10.c file Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:05:23 -06:00
Brian Paul	35bb29d499	svga: add new svga_tgsi_vgpu10.c file Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:05:23 -06:00
Brian Paul	1c5468e9c0	svga: remove unused SVGA3D_* command functions Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:05:23 -06:00
Brian Paul	133a47107c	gallium/st: add pipe_context::get_timestamp() The VMware svga driver doesn't directly support pipe_screen::get_timestamp() but we can do a work-around. However, we need a gallium context to do so. This patch adds a new pipe_context::get_timestamp() function that will only be called if the pipe_screen::get_timestamp() function is NULL. Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:05:23 -06:00
Brian Paul	e2a1d21cb6	svga/winsys: Add support for VGPU10 This involves a few driver modifications to keep things building. The driver may not actually run properly at this point. Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:05:23 -06:00
Brian Paul	c191b507cb	svga: update the svga3d device header files Remove some obsolete svga_dump.c code for items which no longer exist. Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:05:23 -06:00
Brian Paul	3a92526704	svga: add new version 10 device header files Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:05:23 -06:00
Brian Paul	75f92e28b4	winsys/svga: add new vmw_query.c[h] files Functions for creating, destroying, getting queries, etc. Signed-off-by: Brian Paul <brianp@vmware.com>	2015-09-02 09:05:23 -06:00
Chris Wilson	f30cf3258e	meta: Compute correct buffer size with SkipRows/SkipPixels If the user is specifying a subregion of a buffer using SKIP_ROWS and SKIP_PIXELS, we must compute the buffer size carefully as the end of the last row may be much shorter than stride*image_height*depth. The current code tries to memcpy from beyond the end of the user data, for example causing: ==28136== Invalid read of size 8 ==28136== at 0x4C2D94E: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:915) ==28136== by 0xB4ADFE3: brw_bo_write (brw_batch.c:1856) ==28136== by 0xB5B3531: brw_buffer_data (intel_buffer_objects.c:208) ==28136== by 0xB0F6275: _mesa_buffer_data (bufferobj.c:1600) ==28136== by 0xB0F6346: _mesa_BufferData (bufferobj.c:1631) ==28136== by 0xB37A1EE: create_texture_for_pbo (meta_tex_subimage.c:103) ==28136== by 0xB37A467: _mesa_meta_pbo_TexSubImage (meta_tex_subimage.c:176) ==28136== by 0xB5C8D61: intelTexSubImage (intel_tex_subimage.c:195) ==28136== by 0xB254AB4: _mesa_texture_sub_image (teximage.c:3654) ==28136== by 0xB254C9F: texsubimage (teximage.c:3712) ==28136== by 0xB2550E9: _mesa_TexSubImage2D (teximage.c:3853) ==28136== by 0x401CA0: UploadTexSubImage2D (teximage.c:171) ==28136== Address 0xd8bfbe0 is 0 bytes after a block of size 1,024 alloc'd ==28136== at 0x4C28C20: malloc (vg_replace_malloc.c:296) ==28136== by 0x402014: PerfDraw (teximage.c:270) ==28136== by 0x402648: Draw (glmain.c:182) ==28136== by 0x8385E63: ??? 
(in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0) ==28136== by 0x83896C8: fgEnumWindows (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0) ==28136== by 0x838641C: glutMainLoopEvent (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0) ==28136== by 0x8386C1C: glutMainLoop (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0) ==28136== by 0x4019C1: main (glmain.c:262) ==28136== ==28136== Invalid read of size 8 ==28136== at 0x4C2D940: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:915) ==28136== by 0xB4ADFE3: brw_bo_write (brw_batch.c:1856) ==28136== by 0xB5B3531: brw_buffer_data (intel_buffer_objects.c:208) ==28136== by 0xB0F6275: _mesa_buffer_data (bufferobj.c:1600) ==28136== by 0xB0F6346: _mesa_BufferData (bufferobj.c:1631) ==28136== by 0xB37A1EE: create_texture_for_pbo (meta_tex_subimage.c:103) ==28136== by 0xB37A467: _mesa_meta_pbo_TexSubImage (meta_tex_subimage.c:176) ==28136== by 0xB5C8D61: intelTexSubImage (intel_tex_subimage.c:195) ==28136== by 0xB254AB4: _mesa_texture_sub_image (teximage.c:3654) ==28136== by 0xB254C9F: texsubimage (teximage.c:3712) ==28136== by 0xB2550E9: _mesa_TexSubImage2D (teximage.c:3853) ==28136== by 0x401CA0: UploadTexSubImage2D (teximage.c:171) ==28136== Address 0xd8bfbe8 is 8 bytes after a block of size 1,024 alloc'd ==28136== at 0x4C28C20: malloc (vg_replace_malloc.c:296) ==28136== by 0x402014: PerfDraw (teximage.c:270) ==28136== by 0x402648: Draw (glmain.c:182) ==28136== by 0x8385E63: ??? 
(in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0) ==28136== by 0x83896C8: fgEnumWindows (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0) ==28136== by 0x838641C: glutMainLoopEvent (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0) ==28136== by 0x8386C1C: glutMainLoop (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0) ==28136== by 0x4019C1: main (glmain.c:262) ==28136== Fixes regression from commit `7f396189f0` Author: Jason Ekstrand <jason.ekstrand@intel.com> Date: Mon Jan 5 18:17:04 2015 -0800 meta: Add a BlitFramebuffers-based implementation of TexSubImage v2: However, the teximage we create does need to be width x full_height x 1 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Neil Roberts <neil@linux.intel.com> Reviewed-by Neil Roberts <neil@linux.intel.com>	2015-09-02 10:08:39 +01:00
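The size computation described in the commit above can be sketched as follows. This is only an illustration of the arithmetic for a single 2D image, not the Mesa code: `align_up` and `subregion_size` are hypothetical helpers, and the parameter names merely mirror the GL pixel-store state (row length, skip pixels, skip rows, alignment).

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def subregion_size(width, height, bytes_per_pixel,
                   row_length=0, skip_pixels=0, skip_rows=0, alignment=1):
    """Bytes of client data actually touched by a 2D subregion.

    Hypothetical helper: the last row ends at (skip_pixels + width) pixels,
    which can be well short of a full stride.
    """
    if width == 0 or height == 0:
        return 0
    pixels_per_row = row_length if row_length > 0 else width
    stride = align_up(pixels_per_row * bytes_per_pixel, alignment)
    last_row_start = (skip_rows + height - 1) * stride
    last_row_end = (skip_pixels + width) * bytes_per_pixel
    return last_row_start + last_row_end


# A 16x16 RGBA region taken from rows that are 32 pixels wide.
naive = align_up(32 * 4, 1) * 16          # stride * height
exact = subregion_size(16, 16, 4, row_length=32)
```

Here `naive` overshoots `exact` by the unused tail of the last row, which is the kind of over-read the valgrind trace reports.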
Alejandro Piñeiro	4de86e1371	i965/vec4: fill src_reg type using the constructor type parameter The src_reg constructor that received the glsl_type was using it only to build the swizzle, but not to fill this->type as dst_reg is doing. This caused some type mismatch between movs and alu operations on the NIR path, so copy propagation optimization was not applied to remove unneeded movs if negate modifier was involved. This was first detected on minus (negate+add) operations. Shader DB results (taking into account only vec4): total instructions in shared programs: 20019 -> 19934 (-0.42%) instructions in affected programs: 2918 -> 2833 (-2.91%) helped: 79 HURT: 0 GAINED: 0 LOST: 0 Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-02 09:59:47 +02:00
Glenn Kennard	d2cab815b4	r600g: Add doubles support for CYPRESS This doesn't enable the support, just adds some of the code, so we don't have to keep rebasing. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-02 16:34:39 +10:00
Dave Airlie	3be5ee1574	r600g: add doubles support for CAYMAN Only a subset of AMD GPUs supported by r600g support doubles, CAYMAN and CYPRESS are probably all we'll try and support, however I don't have a CYPRESS so ignore that for now. This disables SB support for doubles, as we think we need to make the scheduler smarter to introduce delay slots. [airlied: pushing this to avoid pain of rebasing, it mostly works on cayman only so far, Glenn has some ideas about delay slot issues we need to look into. turned off by default for now] Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-02 16:06:18 +10:00
Dave Airlie	ee67fd70c2	tgsi/scan: add uses_doubles to tgsi scanner This allows drivers to work out if a shader contains any double opcodes easily. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-02 16:06:13 +10:00
Glenn Kennard	3bfa345c1e	r600g: add multiple stream support for geom shaders This patch is taken from work by Glenn and myself, and I've spent some time making it all work here. This adds support for the multiple streams part of ARB_gpu_shader5 to r600g. It doesn't enable ARB_gpu_shader5 yet. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-02 15:55:47 +10:00
Dave Airlie	3d497e0d91	r600g/sb: add support for multiple streams to SB backend This adds a peephole and removes an assert that isn't actually valid with some of the stream emit instructions. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-02 15:55:47 +10:00
Dave Airlie	d503bbbf30	r600g: add support for streams to the assembler. This just adds support to the assembler dumper and allows stream instructions to be generated. Also fix up the stream debugging to add stream info. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-02 15:55:47 +10:00
Dave Airlie	90ac5fb6bb	r600g/sb: dump sampler/resource index modes for textures. This just aids debugging. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-02 15:55:47 +10:00
Dave Airlie	32769ac016	mesa/readpixels: check strides are equal before skipping conversion The CTS packed_pixels test checks that readpixels doesn't write into the space between rows, however we fail that here unless we check the format and stride match. This fixes all the core mesa problems with CTS packed_pixels tests. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-02 09:34:21 +10:00
Dave Airlie	b4a70401f5	texcompress_s3tc/fxt1: fix stride checks (v1.1) The fastpath currently checks the RowLength != width, but if you have a RowLength of 7, and Alignment of 4, then that shouldn't match. align the rowlength to the pack alignment before comparing. This fixes compressed cases in CTS packed_pixels_pixelstore test when SKIP_PIXELS is enabled, which causes row length to get set. v1.1: add fxt1 fix (Iago) Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-02 09:32:26 +10:00
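The stride check from the commit above can be illustrated with a small sketch. It assumes one byte per pixel so that row length and byte stride coincide; `align_up` and `rows_are_tightly_packed` are hypothetical names, not Mesa functions.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def rows_are_tightly_packed(width, row_length, alignment):
    """True if consecutive source rows have no padding between them.

    Assumes 1 byte per pixel; a row_length of 0 means "use width".
    """
    effective_length = row_length if row_length > 0 else width
    return align_up(effective_length, alignment) == width


# RowLength 7 with Alignment 4 pads each row to 8 bytes, so the fast path
# must not be taken even though RowLength == width.
fast_path_ok = rows_are_tightly_packed(width=7, row_length=7, alignment=4)
```

Comparing the unaligned row length against the width would wrongly accept this case, which is the situation the commit message describes.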
Dave Airlie	6a3e1fb958	st/readpixels: fix accel path for skipimages. We don't need to use the 3d image address here as that will include SKIP_IMAGES, and we are only blitting a single 2D anyways, so just use the 2D path. This fixes some memory overruns under CTS packed_pixels.packed_pixels_pixelstore when PACK_SKIP_IMAGES is used. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-02 09:30:48 +10:00
Dave Airlie	c3c242070e	mesa/formats: 8-bit channel integer formats addition Add enough 8-bit channel formats to handle all the different things CTS throws at us. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-02 09:26:34 +10:00
Dave Airlie	8185a02316	mesa/formats: add some formats from GL3.3 GL3.3 added GL_ARB_texture_rgb10_a2ui, which specifies a lot more things than just rgb10/a2ui. While playing with ogl conform one of the tests must have attempted all valid formats for GL3.3 and hits the unreachable here. This adds the first chunk of formats that hit the assert. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-02 09:26:13 +10:00
Dave Airlie	5b6c7da460	mesa: handle SwapBytes in compressed texture get code. This case just wasn't handled, so add support for it. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-02 09:17:29 +10:00
Dave Airlie	0ad3a475ef	mesa: fix SwapBytes handling in numerous places In a number of places the SwapBytes handling didn't handle cases with GL_(UN)PACK_ALIGNMENT set and 7 byte width cases aligned to 8 bytes. This adds a common routine to swap bytes in a 2D image and uses this code in texture storage, texture get, readpixels and swrast drawpixels. [airlied: updated with Brian's nitpicks]. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-02 09:16:43 +10:00
José Fonseca	60aea30115	auxiliary/os: Don't implement os_get_option() on embedded builds. Let it be defined externally instead, allowing setting mechanisms other than environment variables. Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Matthew McClure <mcclurem@vmware.com>	2015-09-01 16:29:17 -06:00
Brian Paul	84e71ef2ee	util: add a couple primitive restart helper functions The first function translates prim restart indexes to be 0xffff or 0xffffffff. The second splits indexed primitives with restart indexes into sub-primitives without restart indexes. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-09-01 16:29:17 -06:00
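The two helpers described in the commit above might behave roughly like the sketch below. The function names, the list-based representation and the 2-or-4-byte index size are assumptions made for illustration; the actual gallium helpers work on mapped index buffers.

```python
def translate_restart_index(indices, restart_index, index_size):
    """Rewrite every occurrence of restart_index to the fixed all-ones value.

    index_size is the index width in bytes; only 2 and 4 are handled here.
    """
    fixed = {2: 0xFFFF, 4: 0xFFFFFFFF}[index_size]
    return [fixed if i == restart_index else i for i in indices]


def split_on_restart(indices, restart_index):
    """Split an index list into sub-primitives that contain no restart index."""
    runs, current = [], []
    for i in indices:
        if i == restart_index:
            if current:          # skip empty runs from back-to-back restarts
                runs.append(current)
            current = []
        else:
            current.append(i)
    if current:
        runs.append(current)
    return runs


strip = [0, 1, 2, 99, 3, 4, 5, 99, 99, 6, 7, 8]
sub_primitives = split_on_restart(strip, restart_index=99)
```

Each entry of `sub_primitives` could then be drawn as a separate primitive on hardware that lacks primitive restart.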
Charmaine Lee	14f35194d8	tgsi: add tgsi utility to transform a fragment shader to support aa point This adds a tgsi utility tgsi_add_aa_point to transform a fragment shader to support anti-aliased wide point by computing the fragment distance from the point center. This utility assumes the geometry shader is emitting an extra generic output with point coord data. The semantic index of this generic output is passed to the tgsi_add_aa_point utility. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-01 16:29:17 -06:00
Charmaine Lee	bca238d4f5	tgsi: adds tgsi utility to transform a shader to support point sprite This adds a tgsi utility tgsi_add_point_sprite to transform a geometry shader to emulate wide points by drawing quads. This utility adds an extra output for the original point position if the point position is to be written to a stream output buffer. It also assumes the driver will add a constant for inverse viewport scale after the user defined constants. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-01 16:29:17 -06:00
Brian Paul	a65bdf5f47	tgsi: add new tgsi_two_side.c utility code This could be used by any driver where the device doesn't directly support two-sided lighting. This code modifies a fragment shader to accept back-face colors and choose between the front/back colors depending on the triangle's front-face sign.	2015-09-01 16:29:17 -06:00
Brian Paul	da33c2434b	util: add util_strcasecmp() wrapper Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-09-01 16:29:17 -06:00
Charmaine Lee	0c4b621590	gallium/util: add a utility to create geometry passthrough shader Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-01 16:29:17 -06:00
Roland Scheidegger	1754208617	gallium/util: fix returning empty box for rectangle intersection These functions deal with inclusive coordinates, hence a 0/0/0/0 rect returned when there's no intersection doesn't actually represent an empty rectangle. Hence return 0/-1/0/-1 instead. This fixes some problems in llvmpipe with empty scissor rects (which up to now didn't really matter because while the intersect test returned the wrong result all pixels were scissored away later anyway).	2015-09-01 16:29:17 -06:00
Roland Scheidegger	fec4f5de67	gallium/util: return FALSE for intersection if there's empty rectangles It isn't really obvious if the intersection test should take into account empty rectangles or if the caller should do it. It looks like most callers actually verified one of the rects but not the other, and now that an empty rect is returned correctly, that other rect could actually be empty, leading to more bugs. Hence just verify both rects for emptiness in the intersection test itself, which makes the code easier in the caller (though it will be slower if the caller knows the rectangles are non-empty). Reviewed-by: Zack Rusin <zackr@vmware.com>	2015-09-01 16:29:17 -06:00
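The two rectangle commits above rely on inclusive coordinates, which the following sketch tries to make concrete. The tuple layout (x0, x1, y0, y1) and the function names are assumptions for illustration; only the 0/-1/0/-1 empty value and the "check both rects" rule come from the commit messages.

```python
# Inclusive coordinates: (x0, x1, y0, y1) covers x0..x1 and y0..y1, so
# (0, 0, 0, 0) is a single pixel, not an empty rectangle.
EMPTY_RECT = (0, -1, 0, -1)


def rect_is_empty(rect):
    x0, x1, y0, y1 = rect
    return x0 > x1 or y0 > y1


def rect_intersects(a, b):
    """Intersection test that rejects empty inputs on either side."""
    if rect_is_empty(a) or rect_is_empty(b):
        return False
    return not (a[0] > b[1] or a[1] < b[0] or a[2] > b[3] or a[3] < b[2])


def rect_intersection(a, b):
    """Return the overlap of a and b, or EMPTY_RECT if there is none."""
    if not rect_intersects(a, b):
        return EMPTY_RECT
    return (max(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), min(a[3], b[3]))


overlap = rect_intersection((0, 3, 0, 3), (2, 5, 2, 5))
disjoint = rect_intersection((0, 1, 0, 1), (2, 3, 2, 3))
```

Returning (0, 0, 0, 0) for the disjoint case would describe one live pixel at the origin, which is the bug the first commit fixes.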
Charmaine Lee	1775687637	tgsi: add some more helper functions This patch adds some more helper functions such as tgsi_transform_temps_decl, tgsi_transform_output_decl, tgsi_transform_dst_reg and tgsi_transform_src_reg. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-09-01 16:29:17 -06:00
Brian Paul	f8da1e1459	tgsi: added tgsi_is_shadow_target() helper	2015-09-01 16:29:17 -06:00
Brian Paul	bd883c9070	tgsi: add negate parameter to tgsi_transform_kill_inst() Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2015-09-01 16:29:17 -06:00
Brian Paul	56852e925e	util: added ffsll() function v2: fix errant _GNU_SOURCE test, per Matt Turner. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-09-01 16:29:17 -06:00
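For readers unfamiliar with `ffsll()`, the conventional semantics (the 1-based position of the least significant set bit of a 64-bit value, or 0 if no bit is set) can be sketched as below. This is a model of the usual libc behaviour, not the Mesa fallback itself.

```python
def ffsll(value):
    """1-based index of the lowest set bit in a 64-bit value, 0 if none."""
    value &= (1 << 64) - 1            # model a 64-bit integer
    # value & -value isolates the lowest set bit; bit_length gives its position.
    return (value & -value).bit_length()
```

For example, `ffsll(0b1000)` evaluates to 4 and `ffsll(0)` to 0.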
Brian Paul	84dad65088	util: added util_set_index_buffer() Like util_set_vertex_buffers_count(), this basically just copies a pipe_index_buffer object, taking care of refcounting.	2015-09-01 16:29:17 -06:00
Jason Ekstrand	47b4efc710	mesa: Move gl_vert_attrib from mtypes.h to shader_enums.h It is a shader enum after all... Acked-by: Brian Paul <brianp@vmware.com>	2015-09-01 14:45:37 -07:00
Matt Turner	e34834f059	glapi: Inline x86_64_current_tls(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-09-01 13:23:13 -07:00
Edward O'Callaghan	d351bab9c5	r600g: Simplify out a couple of unnecessary branches Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>	2015-09-01 21:55:23 +02:00
Marek Olšák	2d8f7d3c15	radeonsi: use an indirect buffer for init_config Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	df12ddb55d	radeonsi: add IB2 indirect buffer support for pm4 states Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	8a9ab86ca6	winsys/radeon: add a flag telling how gfx IBs should be padded This is always false on amdgpu (set by calloc). Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	ba79ff7fa8	winsys/amdgpu: remove IB padding for SI SI is unsupported by amdgpu Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	0f4688fbe7	radeonsi: remove unused macro si_pm4_set_state Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	b89fa63d45	radeonsi: remove si_pm4_cleanup All remaining pm4 states are created and destroyed by state trackers. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	a9971e85d9	radeonsi: rework uploading border colors The border colors are uploaded only once when the state is created. This brings truly immutable sampler descriptors, because they don't have to be updated every time a sampler state is re-bound. It also moves the TA_BC_BASE_ADDR registers to init_config, removing one more state. The catch is there is now a limit: only 4096 border colors can be used by one context. I don't think that will be a problem. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	5e2619ef30	radeonsi: use all built-in border colors Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	fbbebeae10	radeonsi: inline si_cmd_context_control Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	77f80a20be	radeonsi: remove unused si_pm4_state code Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	228e80123a	radeonsi: reorder si_context variables Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	28b34b474e	radeonsi: don't send IB dword usage to si_need_cs_space Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	aad43f0768	radeonsi: don't set number of IB dwords for states Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	ec9d5e181e	radeonsi: don't count IB space for states, just use an upper bound Since we don't put any resource descriptors in IBs, the space used by draw calls is quite small. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	fc95058add	radeonsi: convert SPI state to an atom Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	7ff2991e34	gallium/radeon: rename r600_context_bo_reloc -> radeon_add_to_buffer_list this name should be easy to understand without other knowledge Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	d2e63ac042	gallium/radeon: rename write_*_reg functions e.g. radeon_set_context_reg is nicer and looks consistent next to radeon_emit(). Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	0da159ecac	radeonsi: rename and precalculate polygon offset states one less calloc and state construction while drawing Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	45e549fcbc	radeonsi: convert CB_TARGET_MASK setup to an atom Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	8a67e78bb8	radeonsi: don't set VGT_VTX_CNT_EN twice in init_config Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	e21418f221	radeonsi: convert stencil ref state into an atom Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	c44de30979	radeonsi: convert blend color state into an atom Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	74aa64876b	radeonsi: convert sample mask state into an atom Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	12b205341a	radeonsi: convert clip state into an atom Reducing calloc overhead. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	0c2eed0ede	radeonsi: avoid redundant CB and DB register updates The main idea is to avoid setting CB_COLORi_INFO = 0 for i>0 repeatedly when those colorbuffers aren't used. This is mainly for glamor. Same for DB. Z_INFO and STENCIL_INFO need to be cleared only once. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	c2a42d1f9f	radeonsi: don't rebind GSVS ring buffers every draw call using GS Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	c9a3196b14	radeonsi: don't clear the tessellation factor ring buffer Leftover from the bring-up. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	a2c6ae07b4	radeonsi: remove the tf_ring state, add the registers to init_config One less state to worry about. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	0d46c3bc9d	radeonsi: remove the gs_rings state, add the registers to init_config Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	87c1e9e19c	radeonsi: use a bitmask for tracking dirty atoms This mainly removes the cache misses when checking the dirty flags. Not much else though. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	2fe040ee61	radeonsi: initialize atom IDs for external atoms Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	5bb0ad7ccc	radeonsi: call si_init_atom for remaining radeonsi atoms I need to initialize more atom IDs. This adds 4 more si_init_atom calls, which simplifies the code. (si_init_atom needs a different context type of the emit functions though) Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:13 +02:00
Marek Olšák	e191c58324	radeonsi: initialize atom IDs Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:13 +02:00
Marek Olšák	ba7a6cf626	radeonsi: define the state atom array separately Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:13 +02:00
Marek Olšák	8a97528b3a	radeonsi: optimize viewport states same as scissors Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:13 +02:00
Marek Olšák	f6a10f60b7	radeonsi: optimize scissor states - convert 16 states to 1 atom - only emit 1 scissor if VIEWPORT_INDEX isn't written - use only one packet when emitting consecutive scissors Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:13 +02:00
Marek Olšák	02c8e06497	radeonsi: add SI_MAX_ATTRIBS PIPE_MAX_ATTRIBS is 32, but we currently only support 16. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:13 +02:00
Marek Olšák	05af645a95	radeonsi: fix memory usage checking for big IBs Cc: 11.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:13 +02:00
Marek Olšák	08775a2196	radeonsi: set all 16 viewport Z bounds for GL 4.1 Cc: 11.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:13 +02:00
Marek Olšák	9b510a9652	radeonsi: fix a Unigine Heaven hang when drirc is missing Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:13 +02:00
Marek Olšák	b1e5451211	winsys/amdgpu: use small IBs for better performance on VI Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:13 +02:00
Marek Olšák	fc292b5821	gallium/util: add u_bit_scan_consecutive_range Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:13 +02:00
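The commit above gives only the helper's name. Assuming it extracts the lowest run of consecutive set bits from a 32-bit mask and clears that run, its behaviour could be modelled as below; the return convention (a tuple instead of output pointers) is purely an illustration choice.

```python
def bit_scan_consecutive_range(mask):
    """Return (start, count, remaining_mask) for the lowest run of set bits.

    Assumed semantics for a non-zero 32-bit mask: 'start' is the bit index of
    the first set bit, 'count' the length of the run beginning there, and
    'remaining_mask' the input with that run cleared.
    """
    mask &= 0xFFFFFFFF
    if mask == 0:
        raise ValueError("mask must have at least one bit set")
    start = (mask & -mask).bit_length() - 1
    shifted = mask >> start
    count = 0
    while shifted & 1:
        shifted >>= 1
        count += 1
    remaining = mask & ~(((1 << count) - 1) << start)
    return start, count, remaining


# Walk all runs in a mask, e.g. to emit consecutive dirty slots in one packet.
ranges = []
pending = 0b01110110
while pending:
    start, count, pending = bit_scan_consecutive_range(pending)
    ranges.append((start, count))
```

A loop of this shape fits the "use only one packet when emitting consecutive scissors" idea mentioned in the scissor commit of the same series.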
Chris Wilson	d38a560106	i965: Prevent coordinate overflow in intel_emit_linear_blit Fixes regression from commit `8c17d53823` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Wed Apr 15 03:04:33 2015 -0700 i965: Make intel_emit_linear_blit handle Gen8+ alignment restrictions. which adjusted the coordinates to be relative to the nearest cacheline. However, this then offsets the coordinates by up to 63 and this may then cause them to overflow the BLT limits. For the well aligned large transfer case, we can use 32bpp pixels and so reduce the coordinates by 4 (versus the current 8bpp pixels). We also have to be more careful doing the last line just in case it may exceed the coordinate limit. Reported-and-tested-by: kaillasse91@hotmail.fr Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90734 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Ian Romanick <ian.d.romanick@intel.com> Cc: Anuj Phogat <anuj.phogat@gmail.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-09-01 16:41:07 +01:00
Connor Abbott	1484d8c9aa	i965/nir: enable the dead control flow optimization total instructions in shared programs: 7541551 -> 7541381 (-0.00%) instructions in affected programs: 3054 -> 2884 (-5.57%) helped: 29 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-01 01:48:04 -07:00
Connor Abbott	aec6744501	nir/dead_cf: add support for removing useless loops v2: fix detecting if the loop has any phi nodes after it. v2: use nir_foreach_ssa_def() instead of nir_foreach_dest() when checking for values live after the loop to catch const_load instructions. v2: fix handling return instructions v2: add some documentation to loop_is_dead() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-09-01 00:58:17 -07:00
Connor Abbott	019eea1c4f	nir: add a helper for iterating over blocks in a cf node We were already doing this internally for iterating over a function implementation, so just expose it directly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-01 00:58:17 -07:00
Connor Abbott	89dc0626bd	nir: add nir_block_get_following_loop() helper Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-01 00:58:17 -07:00
Connor Abbott	f649afc9dd	nir/dead_cf: delete code that's unreachable due to jumps v2: use nir_cf_node_remove_after(). v2: use foreach_list_typed() instead of hardcoding a list walk. v3: update to new control flow modification helpers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-01 00:58:17 -07:00
Connor Abbott	1e6ad4b027	nir: add an optimization for removing dead control flow v2: use nir_cf_node_remove_after() instead of our own broken thing. v3: use the new control flow modification helpers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-09-01 00:58:17 -07:00
Dave Airlie	0de53ccc8c	r600g: fix calculation for gpr allocation I've been chasing a geom shader hang on rv635 since I wrote r600 geom code, and finally I hacked some values from fglrx in and I could run texelfetch without failures. This is totally my fault as well, maths fail 101. This makes geom shaders on r600 not fail heavily. Cc: "10.6" "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-01 16:43:22 +10:00
Marta Lofstedt	f8a938814e	mesa: Limit Framebuffer Parameter OpenGL ES 3.1 usage According to OpenGL ES 3.1 specification, section 9.2.1 for glFramebufferParameter and section 9.2.3 for glGetFramebufferParameteriv: "An INVALID_ENUM error is generated if pname is not FRAMEBUFFER_DEFAULT_WIDTH, FRAMEBUFFER_DEFAULT_HEIGHT, FRAMEBUFFER_DEFAULT_SAMPLES, or FRAMEBUFFER_DEFAULT_FIXED_SAMPLE_LOCATIONS." Therefore exclude OpenGL ES 3.1 from using the GL_FRAMEBUFFER_DEFAULT_LAYERS parameter. Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Kevin Rogovin <kevin.rogovin at intel.com>	2015-09-01 08:24:37 +03:00
Marta Lofstedt	d770e2746c	mesa: Expose GL_ARB_framebuffer_no_attachments to GLES 3.1 V2: Conform to new standard for exposing enums for OpenGL ES 3.1. Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-09-01 08:19:11 +03:00
Jason Ekstrand	e16531fbe3	nir/builder: Use nir_after_instr to advance the cursor This should ensure that the cursor gets properly advanced in all cases. We had a problem before where, if the cursor was created using nir_after_cf_node on a non-block cf_node, that would call nir_before_block on the block following the cf node. Instructions would then get inserted in backwards order at the top of the block which is not at all what you would expect from nir_after_cf_node. By just resetting to after_instr, we avoid all these problems. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-31 18:17:07 -07:00
Nanley Chery	f3a483069a	i965: advertise ASTC support for Skylake v2: remove OES ASTC extension reference. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-31 17:29:36 -07:00
Nanley Chery	be7f640257	mesa/glformats: recognize ASTC formats as color formats ASTC formats contain RGBA components. Reviewed-by: Chad Versace <chad.versace@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-31 17:23:10 -07:00
Nanley Chery	76f17266ec	mesa/texformat: use format conversion function in _mesa_choose_tex_format This function's cases for non-generic compressed formats duplicate the GL to MESA translation in _mesa_glenum_to_compressed_format(). This patch replaces the switch cases with a call to the translation function. This change teaches this function about ASTC, thus enabling ASTC for glTexStorage() calls. Reviewed-by: Chad Versace <chad.versace@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-31 15:03:21 -07:00
Nanley Chery	01024ded1e	mesa/texcompress: correct mapping of S3TC formats in conversion function MESA_FORMAT_RGBA_DXT5 should actually be reserved for GL_RGBA[4]_DXT5_S3TC. Also, Gallium and other dri drivers (radeon and nouveau) follow this mapping scheme. Reviewed-by: Chad Versace <chad.versace@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-31 15:03:08 -07:00
Dave Airlie	3063913f77	r600/sb: update last_cf for finalize if. As Glenn did for finalize_loop we need to update_cf when we add a POP at the end of a shader. I think this fixes one of the earlier shader going off end of memory problems we've stopped. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Cc: "10.6" "11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-09-01 07:39:24 +10:00
Matt Turner	a4ba41638d	i965/fs: Use greater-equal cmod to implement maximum. The docs specifically call out SEL with .l and .ge as the implementations of MIN and MAX respectively. Among other things, SEL with these conditional mods are commutative. See commit `3b7f683f`. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-08-31 11:51:59 -07:00
Ben Widawsky	d2e3638ef9	i965/chv\|skl: Apply sampler bypass w/a Certain compressed formats require this setting. The docs don't go into much detail as to why it's needed exactly. This patch introduces no piglit regressions on gen9 (bsw is untested). Note that the SKL "regressions" are fixed tests, and the egl_khr_gl_colorspace tests are WTF. The patch also fixes nothing I can find. http://otc-mesa-ci.jf.intel.com/job/Leeroy/127820/ v2: Reworded commit message (Matt); Added piglit results link. Restructured condition (Matt) Moved check out to function (Nanley). I left the setting of the bit in the surface state open coded because it seems to go better with the existing code. v3: Use an inline function only in gen8_emit_texture_surface_state() (Matt). Cc: Matt Turner <mattst88@gmail.com> Cc: Nanley Chery <nanleychery@gmail.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-08-31 10:08:43 -07:00
Dave Airlie	78027c965a	st/mesa: move to renumbering registers in a group This can be done with a single pass for the instruction base, and takes renumber_registers out of its spot on the profile. Acked-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-31 11:27:33 +01:00
Dave Airlie	aee73f2942	st/mesa: reduce time spent in calculating temp read/writes The glsl->tgsi converter does some temporary register reduction; however, in profiling shader-db this shows up quite highly, so optimise things to reduce the number of loops through all the instructions we do. This drops merge_registers from 4-5% on the profile to 1%. I think this can be reduced further by possibly optimising the renumber pass. Acked-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-31 11:27:18 +01:00
Dave Airlie	46968c1140	st/mesa: cache tgsi opcode info in the instruction Instead of looking this up lots, let's just cache it in the instruction translation up front. I just noticed this function was high in a profile of shader-db on radeonsi. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-31 11:26:23 +01:00
Dave Airlie	03b7ec8778	r600: move prim convert from geom shader to function. This should avoid C++ fail including this header. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-31 19:45:13 +10:00
Timothy Arceri	c8bc8d7235	glsl: remove special case subroutine type counting Unlike samplers, we can get the correct value for subroutines from component_slots() Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-08-31 13:10:44 +10:00
Edward O'Callaghan	0d19dc302f	r600g: Use TGSI parse results instead of manually exfiltrating This makes better use of the work that the TGSI API has done for us. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-08-30 11:41:14 +02:00
Edward O'Callaghan	3eed81a97b	r600g: Set geometry properties in r600_create_shader_state() The selector is shared by all shader variants, so the individual shaders shouldn't change it. Use tgsi_shader_scan() results to set geometry properties within a r600_create_shader_state() call and treat said properties in the selector as read-only within r600_shader_from_tgsi(). Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-08-30 11:41:00 +02:00
Edward O'Callaghan	b4dee1b636	r600g: Move geometry properties state from shader to selector Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-08-30 11:40:44 +02:00
Edward O'Callaghan	7b6369eb69	r600g: Remove dead assignment to 'gs_input_prim' in shader state Note that 'geometry shader properties' should be carried in the selector state over the shader state in any case. Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-08-30 11:40:26 +02:00
Marek Olšák	7dc8a3497f	radeonsi: don't use the emit qt keyword in si_init_atom It confuses my editor.	2015-08-29 23:18:23 +02:00
Marek Olšák	379e3382e8	radeonsi: remove no-op 32-bit masking Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-29 23:03:21 +02:00
Marek Olšák	437cb1e3f4	gallium/radeon: fix the ADDRESS_HI mask for EVENT_WRITE CIK packets Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-29 23:03:08 +02:00
Marek Olšák	e321596e9f	winsys/radeon: handle non-zero finite timeout when waiting for buffers Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-29 23:03:06 +02:00
Ilia Mirkin	a5a96118ed	freedreno/a3xx: implement half-z clipping Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-29 16:18:04 -04:00
Ilia Mirkin	58e24b4761	freedreno/a3xx: add basic clip plane support The hardware is capable of dealing with GL1-style user clip planes. No clip vertex, no clip distances. Fixes a number of ucp tests, as well as neverball. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-08-29 16:18:04 -04:00
Samuel Pitoiset	c8a61ea4fb	nvc0: change prefix of MP performance counters to HW_SM According to NVIDIA, local performance counters (MP) are prefixed with SM, while global performance counters (PCOUNTER) are called PM. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-08-29 11:04:00 +02:00
Samuel Pitoiset	21bdb4d8f3	nvc0: sort performance counter queries by name Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-08-29 10:24:50 +02:00
Samuel Pitoiset	ebca85423c	nvc0: make names of performance counter queries consistent Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-08-29 10:24:44 +02:00
Samuel Pitoiset	981f46aa95	nvc0: use enumerations for driver queries Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-08-29 10:24:40 +02:00
Samuel Pitoiset	0eac599001	nvc0: remove commented out code related to PCOUNTER queries Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2015-08-29 10:24:35 +02:00
Dave Airlie	6941883175	r600: port si_conv_prim_to_gs_out from radeonsi This code was broken by the tess merge, and I totally missed it until now. I'm not sure this fixes anything but it stops the assert. Cc: "11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-29 09:06:04 +10:00
Dave Airlie	c149d84d45	r600g: use PRIi64 for some compute debug printfs Otherwise this will crash on 32-bit, and it gets rid of warnings building on 32-bit. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-29 09:06:04 +10:00
Dave Airlie	8d6d0cc17d	gallium/util: fix debug_get_flags_option on 32-bit On 32-bit we need to use PRIu64 flags for printfs, otherwise this segfaults in R600_DEBUG=help. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-29 09:06:04 +10:00
Ilia Mirkin	275c5810ca	glsl: provide the option of using BFE for unpack builtin lowering This greatly improves generated code, especially for the snorm variants, since it is able to get rid of the lshift/rshift for sext, as well as replacing each shift + mask with a single op. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-08-28 18:28:04 -04:00
Ilia Mirkin	889a946a45	glsl: use bitfield_insert instead of and + shift + or for packing It is fairly tricky to detect the proper conditions for using bitfield insert, but easy to just use it up front. This removes a lot of instructions on nvc0 when invoking the packing builtins. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-08-28 18:28:04 -04:00
Matt Turner	c676c432f3	i965/fs: Remove fs_visitor::try_replace_with_sel(). No shader-db changes on g4x, snb, hsw, or bdw. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-28 11:30:47 -07:00
Matt Turner	64e312d7fa	i965/fs: Replace awful variable names. start_to -> dst_start end_to -> dst_end start_from -> src_start end_from -> src_end var_to -> dst_var var_from -> src_var reg_to -> dst_reg reg_to_offset -> dst_reg_offset reg_from -> src_reg Not sure how these made sense to me before. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-28 11:30:47 -07:00
Matt Turner	a2ff1e95a4	i965/fs: Skip blocks in register coalescing interference check. No need to walk through instructions in blocks we know don't contain our registers' live ranges. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-28 11:30:47 -07:00
Matt Turner	f2f8c43af9	i965/fs: Improve register coalescing interference check. I always thought that the is_control_flow() -> return false check was a bad hack, and some previous attempts to remove it have failed and have been reverted. The previous two patches fix some problems that caused register coalescing to not notice some interference between registers, which the is_control_flow() check apparently works around. With that fixed, we can calculate interference more accurately. total instructions in shared programs: 6261319 -> 6257917 (-0.05%) instructions in affected programs: 346282 -> 342880 (-0.98%) helped: 1552 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-28 11:30:47 -07:00
Matt Turner	f3d0a894af	i965/fs: Use overwrites_reg() instead of dst.equals(). equals() returns false for registers with different types, so using it isn't appropriate to determine whether an instruction is overwriting a register. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-28 11:30:47 -07:00
Matt Turner	8765f1d7dd	i965: Only consider fixed_hw_reg in equals() if file is HW_REG/IMM. Noticed when debugging things that lead to the next patch. On G45 (and presumably ILK) this helps register coalescing: total instructions in shared programs: 4077373 -> 4077340 (-0.00%) instructions in affected programs: 43751 -> 43718 (-0.08%) helped: 52 HURT: 2 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-28 11:30:47 -07:00
Marta Lofstedt	2581fe931a	i965/fs: Do not set the size for zero-size uniforms Zero sized uniforms can exist in the list, but they don't get any space allocated in prog_data->params or in the param_size array, so the size should not be set for them. This was previously fixed in: commit: `781dc7c0e1`. However, commit: `259f7291de` removed the fix. Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-08-28 09:52:59 -07:00
Daniel Scharrer	0516159613	mesa: return old name for deleted samplers for SAMPLER_BINDING queries If the sampler object has been deleted in the same context the binding will have been cleared. If it has been deleted in another context, the spec does not say what should be returned. None of the other binding point queries check for deletion in another context. Also, as names of deleted objects are free for reuse, the current code didn't even work reliably. Reviewed-by: Fredrik Höglund <fredrik@kde.org> Signed-off-by: Fredrik Höglund <fredrik@kde.org>	2015-08-28 18:08:39 +02:00
Daniel Scharrer	5aaaaebf22	mesa: add missing queries for ARB_direct_state_access This adds index queries (glGeti_v) for GL_TEXTURE_BINDING_ and GL_SAMPLER_BINDING, as well as texture queries (glGetTex{,ture}Parameter*) for GL_TEXTURE_TARGET. CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Fredrik Höglund <fredrik@kde.org> Signed-off-by: Fredrik Höglund <fredrik@kde.org>	2015-08-28 18:08:26 +02:00
Neil Roberts	2dbc6a0ad9	docs: Fix a typo in GL3.txt concerning GL_KHR_context_flush_control	2015-08-28 14:29:22 +01:00
Ilia Mirkin	b319fd7c14	mesa: fix dispatch sanity with GL_OES_texture_storage_multisample_2d_array Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91785 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Matt Turner <mattst88@gmail.com>	2015-08-28 03:12:05 -04:00
Vinson Lee	2ef5a4f830	ABI-check: Use more portable bash invocation. Fixes 'make check' on FreeBSD. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-08-27 23:48:43 -07:00
Boyan Ding	86c57ebe0e	i965/nir: Make use of nir_opt_undef Shader-db result on Ivy Bridge: total instructions in shared programs: 145484 -> 145445 (-0.03%) instructions in affected programs: 225 -> 186 (-17.33%) helped: 5 HURT: 0 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>	2015-08-27 23:33:49 -07:00
Matt Turner	559b8842fa	glapi: Remove _x86_64_get_get_dispatch symbol from x86-64 assembly. Never used. Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2015-08-27 22:28:49 -07:00
Ilia Mirkin	4a6a47ed05	glsl: clean up textureSize prototype Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-08-27 23:49:13 -04:00
Glenn Kennard	608c7b4a63	r600g/sb: Don't crash on empty if jump target Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-28 12:32:36 +10:00
Glenn Kennard	a830225adb	r600g/sb: Don't read junk after EOP Shaders that contain instruction data after an instruction with EOP could end up parsing that as an instruction, leading to various crashes and asserts in SB as it gets very confused if it sees for instance a loop start instruction jumping off to some random point. Add a couple of asserts, and print EOP bit if set in old asm printer. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-28 12:32:32 +10:00
Glenn Kennard	36f1999a87	r600g/sb: Handle undef in read port tracker e8e443 missed adding check for undef values also in unreserve function, leading to an assert triggering. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-28 12:32:14 +10:00
Brian Paul	52f7487923	mesa: rename rowStride to imageStride in texturesubimage() Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-27 15:22:01 -06:00
Ilia Mirkin	2259b11100	mesa: only copy the requested teximage faces Cube maps are special in that they have separate teximages for each face. We handled that by copying the data to them separately, but in case zoffset != 0 or depth != 6 we would read off the end of the client array or modify the wrong images. zoffset/depth have already been verified by the time the code gets to this stage, so no need to double-check. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-08-27 17:18:43 -04:00
Kenneth Graunke	0a913a9d85	nir: Convert the builder to use the new NIR cursor API. The NIR cursor API is exactly what we want for the builder's insertion point. This simplifies the API, the implementation, and is actually more flexible as well. This required a bit of reworking of TGSI->NIR's if/loop stack handling; we now store cursors instead of cf_node_lists, for better or worse. v2: Actually move the cursor in the after_instr case. v3: Take advantage of nir_instr_insert (suggested by Connor). v4: vc4 build fixes (thanks to Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v1] Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v4] Acked-by: Connor Abbott <cwabbott0@gmail.com> [v4]	2015-08-27 13:36:57 -07:00
Kenneth Graunke	3e3cb77901	nir: Convert the NIR instruction insertion API to use cursors. This patch implements a general nir_instr_insert() function that takes a nir_cursor for the insertion point. It then reworks the existing API to simply be a wrapper around that for compatibility. This largely involves moving the existing code into a new function. Suggested by Connor Abbott. v2: Make the legacy functions static inline in nir.h (requested by Connor Abbott). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Acked-by: Connor Abbott <cwabbott0@gmail.com>	2015-08-27 13:36:57 -07:00
Kenneth Graunke	f90c6b1ce0	nir: Move nir_cursor to nir.h. We want to use this for normal instruction insertion too, not just control flow. Generally these functions are going to be extremely useful when working with NIR, so I want them to be widely available without having to include a separate file. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Acked-by: Connor Abbott <cwabbott0@gmail.com>	2015-08-27 13:36:57 -07:00
Kenneth Graunke	c44d507752	nir: Strengthen "no jumps" assertions in instruction insertion API. Jumps must be the last instruction in a block, so inserting another instruction after a jump is illegal. Previously, we only checked this when the new instruction being inserted was a jump. This is a red herring - inserting any kind of instruction after a jump is illegal. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Acked-by: Connor Abbott <cwabbott0@gmail.com>	2015-08-27 13:36:57 -07:00
Brian Paul	bcae4640c8	st/mesa: use PROGRAM_ARRAY for storing structs containing arrays Previously, we used PROGRAM_ARRAY only for variables which were arrays or matrices. But if the variable is a structure containing an array or matrix, we need to use PROGRAM_ARRAY for that too. Before, we failed an assertion: state_tracker/st_glsl_to_tgsi.cpp:4900: Assertion `src_reg->file != PROGRAM_TEMPORARY' failed. when running the piglit test glsl-1.20/execution/fs-const-array-of-struct-of-array.shader_test Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-08-27 13:11:26 -06:00
Brian Paul	42c7be5877	glsl: fix comment typo: s/filed/field/	2015-08-27 13:11:26 -06:00
Brian Paul	3c256f572b	gallium/util: fix code formatting in u_blitter.h Trivial.	2015-08-27 13:11:26 -06:00
Jason Ekstrand	fee0c5af11	i965/fs: Split VGRFs after lowering pull constants The split_virtual_grfs code doesn't properly rewrite reladdr so we need to make sure that any uniform indirects are lowered away first. This fixes the glsl-fs-uniform-indexed-by-swizzled-vec4.shader_test in piglit Cc: "10.6" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-27 12:09:36 -07:00
Jason Ekstrand	f2e667172a	i965/fs: Refactor assign_constant_locations Now that all constant locations are assigned in a single function, we can refactor it a bit to unify things. In particular, we now handle pull_constant_loc and push_constant_loc more similarly and we only modify stage_prog_data->params[] in one place at the end of the function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-27 12:09:24 -07:00
Kenneth Graunke	885a9b058c	i965: Rename INTEL_DEBUG=vec4vs to INTEL_DEBUG=vec4. driParseDebugString() doesn't have actual code to parse comma separated lists (or any other supported options?); instead it dumbly uses strstr(). This means that INTEL_DEBUG="vec4vs" will trigger both DEBUG_VEC4VS and DEBUG_VS, as "vs" is also a substring. We should probably improve the driconf parsing, but for now, just rename the option so it's usable in the meantime. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com> Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2015-08-27 11:38:50 -07:00
Tapani Pälli	16ad1d2a8d	mesa: enable enums for OES_texture_storage_multisample_2d_array v2: use _mesa_is_gles31(ctx) for verifying we are on ES 3.1, remove _es31 usage from get_hash_params.py Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-27 10:58:10 +03:00
Tapani Pälli	c2c64fd269	glsl: add support for OES_texture_storage_multisample_2d_array v2: use ARB_texture_multisample enable bit Patch adds extension enable bit and enables required keywords and builtin functions for the extension. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-27 10:54:41 +03:00
Tapani Pälli	b9101b1443	mesa: Add extension enable for OES_texture_storage_multisample_2d_array v2: use ARB_texture_multisample bit to enable extension Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-27 10:53:15 +03:00
Tapani Pälli	f4280b740d	glapi: add GL_OES_texture_storage_multisample_2d_array extension Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-27 10:52:46 +03:00
Nanley Chery	9a759a6ee0	swrast: add a new macro, FETCH_COMPRESSED This patch creates a new macro, FETCH_COMPRESSED - similar in nature to the other FETCH_* macros. This reduces repetition in the code that deals with compressed textures. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:43 -07:00
Nanley Chery	42ee16176d	mesa: return bool instead of GLboolean in compressedteximage_only_format() In agreement with the coding style, functions that aren't directly visible to the GL API should prefer the use of bool over GLboolean. Suggested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:43 -07:00
Nanley Chery	43d5b4db96	i965: refactor miptree alignment calculation code Remove redundant checks and comments by grouping our calculations for align_w and align_h wherever possible. v2: reintroduce brw. don't include functional changes. don't adjust function parameters or create a new function. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:43 -07:00
Nanley Chery	a687734135	i965: change the meaning of cpp for compressed textures An ASTC block takes up 16 bytes for all block width and height configurations. This size is not integrally divisible by all ASTC block widths. Therefore cpp is changed to mean bytes per block if the texture is compressed. Because the original definition was bytes per block divided by block width, all references to the mipmap width must be divided by the block width. This keeps the address calculation formulas consistent. For example, the units for miptree_level x_offset and miptree total_width have changed from pixels to blocks. v2: reuse preexisting ALIGN_NPOT macro located in an i965 driver file. v3: move ALIGN_NPOT into separate commit. simplify cpp assignment in copy_image_with_blitter(). update miptree width and offset variables in: intel_miptree_copy_slice(), intel_miptree_map_gtt(), and brw_miptree_layout_texture_3d(). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:43 -07:00
Nanley Chery	1a9ceed4ba	i965: correct mt->align_h for 2D textures on Skylake In agreement with commit `4ab8d59a23`, vertical alignment values are equal to four times the block height on Gen9+. v2: add newlines to separate declarations, statements, and comments. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Neil Roberts <neil@linux.intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:43 -07:00
Nanley Chery	10ff64fd3d	i965: use ALIGN_NPOT for setting ASTC mipmap layouts ALIGN is changed to ALIGN_NPOT because alignment values are sometimes not powers of two when working with ASTC. v2: handle texture arrays and LDR-only systems. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:43 -07:00
Nanley Chery	54d2aa4258	mesa/macros: move ALIGN_NPOT to macros.h Aligning with a non-power-of-two number is a general task that can be used in various places. This commit is required for the next one. v2: add greater than 0 assertion (Anuj). convert the macro to a static inline function. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:43 -07:00
Nanley Chery	97f4efd573	mesa/macros: add power-of-two assertions for alignment macros ALIGN and ROUND_DOWN_TO both require that the alignment value passed into the macro be a power of two in the comments. Using software assertions verifies this to be the case. v2: use static inline functions instead of gcc-specific statement expressions (Brian). v3: fix indentation (Brian). v4: add greater than zero requirement (Anuj). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:43 -07:00
Nanley Chery	8b1f008e9a	i965/surface_formats: add support for 2D ASTC surface formats Define two-thirds of the 2D Intel ASTC surface formats (LDR-only). This allows a 1-to-1 mapping from the mesa format to the Intel format. ASTC textures will default to being processed in LDR mode. If there is hardware support for HDR/Full mode and the texture is not sRGB, add the format bit necessary to process it in HDR/Full mode. v2: remove extra newlines. v3: follow existing coding style in translate_tex_format(). v4: expound on the GEN9_SURFACE_ASTC_HDR_FORMAT_BIT comment. update SF table - ASTC is actually supported in Gen8. v5: conform the ASTC MESA_FORMAT enums to the existing naming convention. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:43 -07:00
Nanley Chery	cd49b97a8a	mesa/teximage: return the base internal format of the ASTC formats This is necessary to initialize the gl_texture_image struct. From the KHR_texture_compression_astc_ldr spec: "Added to Section 3.8.6, Compressed Texture Images Add the tokens specified above to Table 3.16, Compressed Internal Formats. In all cases, the base internal format will be RGBA. The encoding allows images to be encoded with fewer channels, but this is always presented as RGBA to the sampler." v2. use _mesa_is_astc_format(). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:43 -07:00
Nanley Chery	12b519b457	mesa/teximage: accept ASTC formats for 3D texture specification The ASTC spec was revised as follows: Revision 2, April 28, 2015 - added CompressedTex{Sub,}Image3D to commands accepting ASTC format tokens in the New Tokens section [...]. Support only exists in the HDR submode: Add a second new column "3D Tex." which is empty for all non-ASTC formats. If only the LDR profile is supported by the implementation, this column is also empty for all ASTC formats. If both the LDR and HDR profiles are supported only, this column is checked for all ASTC formats. LDR-only systems should generate an INVALID_OPERATION error when attempting to call CompressedTexImage3D with the TEXTURE_3D target. v2. return the proper error for LDR-only systems. v3. update is_astc_format(). v4. use _mesa_is_astc_format(). v5. place logic in _mesa_target_can_be_compressed. v6. fix issues handling ASTC formats. Reviewed-by: Chad Versace <chad.versace@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:43 -07:00
Nanley Chery	23c9cd5a96	mesa/texcompress: enable translation between MESA and GL ASTC formats v3. conform the ASTC MESA_FORMAT enums to the existing naming convention. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:43 -07:00
Nanley Chery	692578ed13	mesa/glformats: recognize ASTC formats as compressed Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:42 -07:00
Nanley Chery	4143511b15	mesa: add ASTC extensions to the extensions table v2: alphabetize the extensions. remove OES ASTC extension. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:42 -07:00
Nanley Chery	582ce1ea97	mesa: don't enable online compression for ASTC formats In agreement with the ASTC spec, this makes calls to TexImage*D unsuccessful. Implied by the spec, Generate[Texture]Mipmap and [Copy]Tex[Sub]Image*D calls must be unsuccessful as well. v2. actually force attempts to compress online to fail. v3. indentation (Matt). v4. update copytexture_error_check to account for CopyTexImage*D (Chad). Reviewed-by: Chad Versace <chad.versace@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:42 -07:00
Nanley Chery	e9fd8e154f	glapi: add support for KHR_texture_compression_astc_ldr v2: correct the spelling of the sRGB variants. remove spaces around "=" when setting the enum value. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:42 -07:00
Nanley Chery	8ae37365f3	mesa/formats: define the 2D ASTC formats Define the mesa formats and make changes necessary for compilation without errors. Also add support for _mesa_get_srgb_format_linear(). v2. conform the ASTC MESA_FORMAT enums to the existing naming convention. v3. remove ASTC cases for _mesa_get_uncompressed_format(). This function is only used for generating mipmaps - something ASTC formats do not support due to lack of online compression. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-26 14:36:42 -07:00
Ilia Mirkin	c4cbaca327	nouveau: avoid build failures since `0fc21ecf` Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-26 14:04:41 -04:00
Marek Olšák	6924ecac77	gallium/radeon: read_registers should return bool meaning success or failure Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:20 +02:00
Marek Olšák	16e5d8ad38	radeonsi: add IB parser support for CP DMA packets If the packet encoding is defined in the same format as register definitions, the python script can process them automatically and the parser support becomes trivial. Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:19 +02:00
Marek Olšák	2c14a6d3b1	radeonsi: add IB tracing support for debug contexts This adds trace points to all IBs and the parser prints them and also prints which trace points were reached (executed) by the CP. This can help pinpoint a problematic packet, draw call, etc. Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:19 +02:00
Marek Olšák	189953ee13	radeonsi: remove old CS tracing code Some of it is left there and it will be re-used in the next commit. Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:19 +02:00
Marek Olšák	df6a5666b6	radeonsi: parse and dump status registers on GPU hang GPU hang detection must be enabled by setting: GALLIUM_DDEBUG=[timeout in ms] This may print too much information that we might not understand yet, but some of the bits are very useful. Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:19 +02:00
Marek Olšák	61df4f0cd3	radeonsi: add an IB parser Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:19 +02:00
Marek Olšák	be6dc87776	radeonsi: save the contents of indirect buffers for debug contexts This will be used by the IB parser. Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:19 +02:00
Marek Olšák	a6a6c68955	radeonsi: generate register and packet tables for an IB parser from sid.h This makes writing a good IB parser a lot easier. It generates 2 tables: - packet3 table - register table with all registers, fields, and named values Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:19 +02:00
Marek Olšák	d15b71b4bd	radeonsi: remove duplicated register definitions and instruction definitions Instruction encoding isn't needed in Mesa. The border color address registers were duplicated. Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:19 +02:00
Marek Olšák	c59ad265df	r600g,radeonsi: remove unused ill-formed register field definitions Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:18 +02:00
Marek Olšák	110873ed11	radeonsi: add an initial dump_debug_state implementation dumping shaders This is usually called after a draw call. Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:18 +02:00
Marek Olšák	93d97db349	radeonsi: allow si_dump_key to write to a file Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:18 +02:00
Marek Olšák	525921ed51	gallium/ddebug: new pipe for hang detection and driver state dumping (v2) v2: lots of improvements This is like identity or trace, but simpler. It doesn't wrap most states. Run with: GALLIUM_DDEBUG=1000 [executable] where "executable" is the app and "1000" is in milliseconds, meaning that the context will be considered hung if a fence fails to signal in 1000 ms. If that happens, all shaders, context states, bound resources, draw parameters, and driver debug information (if any) will be dumped into: /home/$username/dd_dumps/$processname_$pid_$index. Note that the context is flushed after every draw/clear/copy/blit operation and then waited for to find the exact call that hangs. You can also do: GALLIUM_DDEBUG=always to do the dumping after every draw/clear/copy/blit operation without flushing and waiting. Examples of driver states that can be dumped are: - Hardware status registers saying which hw block is busy (hung). - Disassembled shaders in a human-readable form. - The last submitted command buffer in a human-readable form. v2: drop pipe-loader changes, drop SConscript rename dd.h -> dd_pipe.h Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:18 +02:00
Marek Olšák	0fc21ecfc0	gallium: add flags parameter to pipe_screen::context_create This allows creating compute-only and debug contexts. Reviewed-by: Brian Paul <brianp@vmware.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:18 +02:00
Marek Olšák	7b5c92391f	gallium: add an interface for dumping debug driver state Reviewed-by: Brian Paul <brianp@vmware.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:18 +02:00
Ilia Mirkin	a3b617a258	mesa: remove pointless es31 checks, fix indirect to only be in es31 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-08-26 12:37:38 -04:00
Ilia Mirkin	332fb341dd	mesa: uncomment checks in es31 computation, add texture_ms Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Martin Peres <martin.peres@linux.intel.com>	2015-08-26 12:37:17 -04:00
Marek Olšák	f432ae899f	mesa: create multisample fallback textures like normal textures This works if drivers upsample on upload (like all radeon ones do). The alternative is an unexpected GL error from anything calling _mesa_update_state and possibly other issues. Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-08-26 15:42:26 +02:00
Grazvydas Ignotas	f8b01ae47c	radeonsi: mark unreachable paths to avoid warnings Otherwise we get: warning: 'num_user_sgprs' may be used uninitialized in this function ... Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-08-26 15:42:26 +02:00
Tapani Pälli	e0c2ea0337	mesa: GetTexLevelParameter{if}v changes for OpenGL ES 3.1 Patch refactors existing parameters check to first check common enums between desktop GL and GLES 3.1 and modifies get_tex_level_parameter_image to be compatible with enums specified in 3.1. v2: remove extra is_gles31() checks (suggested by Ilia) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v1) Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> (v1) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-08-26 08:38:25 +03:00
Marta Lofstedt	ae8d0e7abe	mesa/es3.1: Allow GL_COMPUTE_WORK_GROUP_SIZE for OpenGL ES 3.1 According to OpenGL ES specification section 7.12, GL_COMPUTE_WORK_GROUP_SIZE, is supported by the glGetProgramiv function. Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-08-26 08:25:07 +03:00
Marta Lofstedt	c2a766880d	mesa/es3.1: Enable getting MAX_COMPUTE_WORK_GROUP_ values for OpenGL ES 3.1 According to the OpenGL ES 3.1 specification chapter 17, the MAX_COMPUTE_WORK_GROUP_COUNT and MAX_COMPUTE_WORK_GROUP_SIZE is available for glGetIntegeri_v. Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-08-26 08:25:07 +03:00
Dave Airlie	73e5adc4b2	mesa/formats: pass correct parameter to _mesa_is_format_compressed commit `26c549e69d` Author: Nanley Chery <nanley.g.chery@intel.com> Date: Fri Jul 31 10:26:36 2015 -0700 mesa/formats: remove compressed formats from matching function caused a regression in my CTS testing, this looks like a clear thinko. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-26 14:13:27 +10:00
Roland Scheidegger	48e6404c04	gallium/auxiliary: optimize rgb9e5 helper some more I used this as some testing ground for investigating some compiler bits initially (e.g. lrint calls etc.), figured I could do much better in the end just for fun... This is mathematically equivalent, but uses some tricks to avoid doubles and also replaces some float math with ints. Good for another performance doubling or so. As a side note, some quick tests show that llvm's loop vectorizer would be able to properly vectorize this version (which it failed to do earlier due to doubles, producing a mess), giving another 3 times performance increase with sse2 (more with sse4.1), but this may not apply to mesa. No piglit change. Acked-by: Marek Olšák <marek.olsak@amd.com>	2015-08-26 02:57:38 +02:00
Roland Scheidegger	941346a803	gallium/auxiliary: optimize rgb9e5 helper a bit This code (lifted straight from the extension) was doing things the most inefficient way you could think of. This drops some of the more expensive float operations, in particular - int-cast floors (pointless, values always positive) - 2 raised to (signed) integers (replace with simple exponent manipulation), getting rid of a misguided comment in the process (implement with table...) - float division (replace with mul of reverse of those exponents) This is like 3 times faster (measured for float3_to_rgb9e5), though it depends (e.g. llvm is clever enough to replace exp2 with ldexp whereas gcc is not, division is not too bad on cpus with early-exit divs). Note that keeping the double math for now (float x + 0.5), as the results may otherwise differ. Acked-by: Marek Olšák <marek.olsak@amd.com>	2015-08-26 02:57:37 +02:00
Dave Airlie	c1452983b4	mesa/texgetimage: fix missing stencil check GetTexImage can read to stencil8 but only from a stencil or depthstencil textures. This fixes a bunch of failures in CTS GL33-CTS.gtf32.GL3Tests.packed_pixels Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-26 10:22:09 +10:00
Nanley Chery	1d2a844e7d	mesa/teximage: Add GL error parameter to _mesa_target_can_be_compressed Enables _mesa_target_can_be_compressed to return the appropriate GL error depending on its inputs. Use the parameter to return the appropriate GL error for ETC2 formats on GLES3. Suggested-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-25 15:53:46 -07:00
Nanley Chery	26c549e69d	mesa/formats: remove compressed formats from matching function All compressed formats return GL_FALSE and there isn't any evidence to support that this behaviour would change. Remove all switch cases for compressed formats. v2. Since the exhaustive switch is removed, add a gtest to ensure all formats are handled. v3. Ensure that GL_NO_ERROR is set before returning. v4. Fix an arg to _mesa_uncompressed_format_to_type_and_comps(); fix formatting and misc improvements (Chad). Reviewed-by: Chad Versace <chad.versace@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-25 15:45:17 -07:00
Nanley Chery	8e581747d2	mesa/formats: make format testing a gtest We currently check that our format info table is sane during context initialization in debug builds. Perform this check during `make check` instead. This enables format testing in release builds and removes the requirement of an exhaustive switch for _mesa_uncompressed_format_to_type_and_comps(). v2. indentation and conditional inclusion fixes (Chad). allow tests to continue running if any format fails and display the failing format name. Reviewed-by: Chad Versace <chad.versace@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-25 15:45:13 -07:00
Kenneth Graunke	1bec29d04d	gallium/ttn: Use nir_builder_insert() rather than poking at cf_list. I intend to remove nir_builder::cf_node_list, so I can't have this code poking at it directly. The proper way is to set the insertion point and then simply insert things there. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-08-25 11:12:35 -07:00
Kenneth Graunke	78856194c1	prog_to_nir: Use nir_builder_insert() rather than poking at cf_list. I intend to remove nir_builder::cf_node_list, so I can't have this code poking at it directly. The proper way is to set the insertion point and then simply insert things there. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-08-25 11:12:35 -07:00
Kenneth Graunke	5f14c417c8	nir: Use nir_shader::stage rather than passing it around. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-08-25 11:12:35 -07:00
Kenneth Graunke	d4d5b430a5	nir: Store gl_shader_stage in nir_shader. This makes it easy for NIR passes to inspect what kind of shader they're operating on. Thanks to Michel Dänzer for helping me figure out where TGSI stores the shader stage information. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-08-25 11:12:35 -07:00
Jason Ekstrand	dfacae3a56	i965/fs: Combine assign_constant_locations and move_uniform_array_access_to_pull_constants The comment above move_uniform_array_access_to_pull_constants was completely bogus because it has nothing to do with lowering instructions. Instead, it's assigning locations of pull constants. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-25 10:18:27 -07:00
Jason Ekstrand	c999a58f50	nir/lower_io: Remove assign_var_locations_direct_first This is no longer used so we might as well get rid of it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-25 10:18:27 -07:00
Jason Ekstrand	259f7291de	i965/fs: Rework uniform handling Previously, we treated the entire UNIFORM file as if it had two elements: One for direct things and one for indirect. This is substantially different from how the old visitor code handled it where each element was effectively its own uniform. This commit makes the NIR path more like the old ir_visitor path where each uniform is separate. This should allow us to more easily make decisions about what to push. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-25 10:18:27 -07:00
Jason Ekstrand	cfa056c6a5	i965/vec4_nir: Get rid of the uniform_driver_location tracking Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-25 10:18:27 -07:00
Jason Ekstrand	ce5e9139aa	nir/lower_io: Separate driver_location and base offset for uniforms Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-25 10:18:27 -07:00
Jason Ekstrand	0db8e87b4a	nir/intrinsics: Add a second const index to load_uniform In the i965 backend, we want to be able to "pull apart" the uniforms and push some of them into the shader through a different path. In order to do this effectively, we need to know which variable is actually being referred to by a given uniform load. Previously, it was completely flattened by nir_lower_io which made things difficult. This adds more information to the intrinsic to make this easier for us. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-25 10:18:27 -07:00
Kenneth Graunke	6c33d6bbf9	nir: Pass a type_size() function pointer into nir_lower_io(). Previously, there were four type_size() functions in play - the i965 compiler backend defined scalar and vec4 type_size() functions, and nir_lower_io contained its own similar functions. In fact, the i965 driver used nir_lower_io() and then looped over the components using its own type_size - meaning both were in play. The two are /basically/ the same, but not exactly in obscure cases like subroutines and images. This patch removes nir_lower_io's functions, and instead makes the driver supply a function pointer. This gives the driver ultimate flexibility in deciding how it wants to count things, reduces code duplication, and improves consistency. v2 (Jason Ekstrand): - One side-effect of passing in a function pointer is that nir_lower_io is now aware of and properly allocates space for image uniforms, allowing us to drop hacks in the backend Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> v2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-25 10:18:27 -07:00
Kenneth Graunke	a23f82053d	prog_to_nir: Don't allocate nir_variable with type vec4[0] for uniforms. If there are no parameters, we don't need to create a nir_variable to hold them...and allocating an array of length 0 is pretty bogus. Should avoid i965 backend assertions in future patches Jason and I are working on. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-08-25 10:18:27 -07:00
Kenneth Graunke	640c472fd0	i965: Move type_size() methods out of visitor classes. I want to use C function pointers to these, and they don't use anything in the visitor classes anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-08-25 10:18:27 -07:00
Jason Ekstrand	c56899f41a	i965: Make setup_vec4_uniform_value and _image_uniform_values take an offset This way they don't implicitly increment the uniforms variable and don't have to be called in-sequence during uniform setup. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-25 10:18:27 -07:00
Jason Ekstrand	8d8b8f5854	i965: Rename setup_vector_uniform_values to setup_vec4_uniform_value The new name more accurately represents what it does: Set up a single vec4 uniform value. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-25 10:18:27 -07:00
Rob Clark	0ab29751b6	freedreno/ir3: fix compile break after splitting out nir_control_flow.h The commit: commit `b49371b8ed` Author: Connor Abbott <cwabbott0@gmail.com> AuthorDate: Tue Jul 21 19:54:18 2015 -0700 nir: move control flow modification to its own file split out some control flow related APIs into a separate header, but did not update drivers. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-08-25 08:17:30 -04:00
Rob Clark	8b2d0bb844	freedreno/ir3: fix compile break after fxn->start_block removal The commit: commit `8e0d4ef341` Author: Kenneth Graunke <kenneth@whitecape.org> AuthorDate: Thu Aug 6 18:18:40 2015 -0700 nir: Delete the nir_function_impl::start_block field. removed the start_block field without fixing up drivers.. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-08-25 08:13:04 -04:00
Dave Airlie	529acab22a	mesa: enable texture stencil8 for multisample This fixes GL45-CTS.gtf44.GL31Tests.texture_stencil8.texture_stencil8_gl44 from the ogl conform suite. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-25 11:06:58 +10:00
Brian Paul	e089ca26e1	mesa: make _mesa_bind_texture_unit() static It's only called from the file it's defined in. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-08-24 18:23:19 -06:00
Nanley Chery	8f378d1083	mesa/formats: store whether or not a format is sRGB in gl_format_info v2: remove extra newline. v3: use bool instead of GLboolean. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-24 16:08:01 -07:00
Kenneth Graunke	4f2cdd8497	nir: Use !block_ends_in_jump() in a few places rather than open-coding. Connor introduced this helper recently; we should use it here too. I had to move the function earlier in the file for it to be available. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-08-24 15:10:55 -07:00
Connor Abbott	d7971b41ce	nir/cf: reimplement nir_cf_node_remove() using the new API This gives us some testing of it. Also, the old nir_cf_node_remove() wasn't handling phi nodes correctly and was calling cleanup_cf_node() too late. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	fc7f2d2364	nir/cf: add new control modification API's These will help us do a number of things, including: - Early return elimination. - Dead control flow elimination. - Various optimizations, such as replacing: if (foo) { ... } if (!foo) { ... } with: if (foo) { ... } else { ... } Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	476eb5e4a1	nir/cf: use a cursor for inserting control flow Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	d356f84d4c	nir/cf: add split_block_cursor() This is a helper that will be shared between the new control flow insertion and modification code. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	58a360c6b8	nir/cf: add split_block_before_instr() Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	6e47a34b29	nir/cf: add a cursor structure For now, it allows us to refactor the control flow insertion API's so that there's a single entrypoint (with some wrappers). More importantly, it will allow us to reduce the combinatorial explosion in the extract function. There, we need to specify two points to extract, which may be at the beginning of a block, the end of a block, or in the middle of a block. And then there are various wrappers based off of that (before a control flow node, before a control flow list, etc.). Rather than having 9 different functions, we can have one function and push the actual logic of determining which variant to use down to the split function, which will be shared with nir_cf_node_insert(). In the future, we may want to make the instruction insertion API's as well as the builder use this, but that's a future cleanup. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	6f5c81f86f	nir/cf: fix link_blocks() when there are no successors When we insert a single basic block A into another basic block B, we will split B into C and D, insert A in the middle, and then splice together C, A, and D. When we splice together C and A, we need to move the successors of A into C -- except A has no successors, since it hasn't been inserted yet. So in move_successors(), we need to handle the case where the block whose successors are to be moved doesn't have any successors. Fixing link_blocks() here prevents a segfault and makes it work correctly. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	6d028749ac	nir/cf: clean up jumps when cleaning up CF nodes We may delete a control flow node which contains structured jumps to other parts of the program. We need to remove the jump as a predecessor, as well as remove any phi node sources which reference it. Right now, the same problem exists for blocks that don't end in a jump instruction, but with the new API it shouldn't be an issue, since blocks that don't end in a jump must either point to another block in the same extracted CF list or not point to anything at all. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	211c79515d	nir/cf: remove uses of SSA definitions that are being deleted Unlike calling nir_instr_remove(), calling nir_cf_node_remove() (and later in the series, the nir_cf_list_delete()) implies that you're removing instructions that may still have uses, except those instructions are never executed so any uses will be undefined. When cleaning up a CF node for deletion, we must clean up any uses of the deleted instructions by making them point to undef instructions instead. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	633cbbc068	nir/cf: handle jumps better in stitch_blocks() In particular, handle the case where the earlier block ends in a jump and the later block is empty. In that case, we want to preserve the jump and remove any traces of the later block. Before, we would only hit this case when removing a control flow node after a jump, which wasn't a common occurrence, but we'll need it to handle inserting a control flow list which ends in a jump, which should be more common/useful. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	940873bf22	nir/cf: handle jumps in split_block_end() Before, we would only split a block with a jump at the end if we were inserting something after a block with a jump, which never happened in practice. But now, we want to use this to extract control flow lists which may end in a jump, in which case we really need to do the correct patching up. As a side effect, when removing jumps we now correctly insert undef phi sources in some corner cases, which can't hurt. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	f596e4021c	nir/cf: add block_ends_in_jump() Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	788d45cb47	nir/cf: handle phi nodes better in split_block_beginning() Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	747ddc3cdd	nir/cf: split up and improve nir_handle_remove_jumps() Before, the process of removing a jump and wiring up the remaining block correctly was atomic, but with the new control flow modification it's split into two parts: first, we extract the jump, which creates a new block with re-wired successors as well as a free-floating jump, and then we delete the control flow containing the jump, which removes the entry in the predecessors and any phi node sources. Split up nir_handle_remove_jumps() to accommodate this, and add the missing support for removing phi node sources. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:42 -07:00
Connor Abbott	13482111d0	nir/cf: add remove_phi_src() helper Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:41 -07:00
Connor Abbott	f41e108d8b	nir: add nir_foreach_phi_src_safe() Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:41 -07:00
Connor Abbott	762ae436ea	nir/cf: add insert_phi_undef() helper Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:41 -07:00
Connor Abbott	b49371b8ed	nir: move control flow modification to its own file We want to start reworking and expanding this code, but it'll be a lot easier to do once we disentangle it from the rest of the stuff in nir.c. Unfortunately, there are a few unavoidable dependencies in nir.c on methods we'd rather not expose publicly, since if not used in very specific situations they can cause Bad Things (tm) to happen. Namely, we need to do some magical control flow munging when adding/removing jumps. In the future, we may disallow adding/removing jumps in nir_instr_insert_*() and nir_instr_remove(), and use separate functions that are part of the control flow modification code, but for now we expose them and put them in a separate, private header. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:41 -07:00
Connor Abbott	1c53f89696	nir: make cleanup_cf_node() not use remove_defs_uses() cleanup_cf_node() is part of the control flow modification code, which we're going to split into its own file, but remove_defs_uses() is an internal function used by nir_instr_remove(). Break the dependency by making cleanup_cf_node() use nir_instr_remove() instead, which simply calls remove_defs_uses() and then removes the instruction from the list. nir_instr_remove() does do extra things for jumps, though, so we avoid calling it on jumps which matches the previous behavior (this will be fixed later in the series). Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:41 -07:00
Connor Abbott	9d5944053c	nir: inline block_add_pred() a few places It was being used to initialize function impls and loops, even though it's really a control flow modification helper. It's pretty trivial, so just inline it to avoid the dependency. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:41 -07:00
Connor Abbott	c7df141c71	nir/validate: check successors/predecessors more carefully We should be checking almost everything now. Signed-off-by: Connor Abbott <connor.w.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-08-24 13:31:41 -07:00
Kenneth Graunke	8e0d4ef341	nir: Delete the nir_function_impl::start_block field. It's simply the first nir_cf_node in the nir_function_impl::body list, which is easy enough to access - we don't need to store a pointer to it explicitly. Removing it means we don't need to maintain the pointer when, say, splitting the start block when modifying control flow. Thanks to Connor Abbott for suggesting this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-08-24 13:31:41 -07:00
Nanley Chery	9f00af672b	mesa/formats: only do type and component lookup for uncompressed formats Only uncompressed formats have a non-void type and actual components per pixel. Rename _mesa_format_to_type_and_comps to _mesa_uncompressed_format_to_type_and_comps and require callers to check if the format is not compressed. v2. include compressed format cases to avoid gcc warnings (Chad). Reviewed-by: Chad Versace <chad.versace@intel.com> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2015-08-24 11:27:46 -07:00
Rob Clark	000e225360	freedreno/a4xx: formats update Fixes glamor, which wants to use R8 integer textures. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-08-24 13:16:27 -04:00
Rob Clark	afb6c24a20	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-08-24 13:15:57 -04:00
Chris Wilson	4e5752e2b7	i965: Always re-emit the pipeline select during invariant state emission On the older platforms where we don't have logical contexts preserving state across batches, we emit the invariant state setup on every batch using the brw_invariant_state atom. This includes the pipeline selection which is cached with the introduction of commit `0e0e23ef53` Author: Jordan Justen <jordan.l.justen@intel.com> Date: Wed Apr 22 11:43:50 2015 -0700 i965/state: Emit pipeline select when changing pipelines However, we do not reset the cache between batches on context-less platforms, so the pipeline selection is not set, which can cause GPU hangs if a media pipeline was loaded in the meantime (e.g. mixing mplayer/gstreamer using libva and gnome-shell). A simple solution is to just forcibly re-emit the pipeline select along with the invariant state and reset the cache at that point. Reported-and-tested-by: Tomasz C. <tomaszc@o2.pl> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91254 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>	2015-08-24 08:57:55 +01:00
Marek Olšák	a83c36b5c0	Revert "radeon/winsys: increase the IB size for VM" This reverts commit `567394112d`. It regressed performance. It looks like smaller IBs are better, because the GPU goes idle quicker and there is less waiting for buffers and fences. Cc: 11.0 <mesa-stable@lists.freedesktop.org>	2015-08-23 19:01:15 +02:00
Ilia Mirkin	e18c29b031	nv50: fix 2d engine blits for 64- and 128-bit formats This fixes bin/ext_framebuffer_multisample-formats all_samples Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-08-23 03:12:07 -04:00
Ilia Mirkin	a6ad49cbbd	nv50: account for the int RT0 rule for alpha-to-one/cov Same as commit `1af0641db` but for nvc0. If an integer texture is bound to RT0, don't do alpha-to-one or alpha-to-coverage. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-08-23 02:58:58 -04:00
Dave Airlie	45971fd0df	mesa/arb_gpu_shader_fp64: add support for glGetUniformdv This was missed when I did fp64, I've sent a piglit test to cover the case as well. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Cc: "11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-08-23 15:56:35 +10:00
Ilia Mirkin	abbf05cfc2	nv50,nvc0: disable depth bounds test on blit Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-08-23 01:39:29 -04:00
Neil Roberts	3a1ab23480	i965/bdw: Fix 3DSTATE_VF_INSTANCING when the edge flag is used When the edge flag element is enabled then the elements are slightly reordered so that the edge flag is always the last one. This was confusing the code to upload the 3DSTATE_VF_INSTANCING state because that is uploaded with a separate loop which has an instruction for each element. The indices used in these instructions weren't taking into account the reordering so the state would be incorrect. v2: Use nr_elements instead of brw->vb.nr_enabled so that it will cope when gl_VertexID is used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91292 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: Mark Janes <mark.a.janes@intel.com>	2015-08-22 22:25:39 -07:00
Neil Roberts	fb02b4ec48	i965: Swap the order of the vertex ID and edge flag attributes The edge flag data on Gen6+ is passed through the fixed function hardware as an extra attribute. According to the PRM it must be the last valid VERTEX_ELEMENT structure. However if the vertex ID is also used then another extra element is added to source the VID. This made it so the vertex ID is in the wrong register in the vertex shader and the edge attribute is no longer in the last element. v2: Also implement for BDW+ v3 [by Ben]: Remove 10.5 tag. Too late. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84677 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: Mark Janes <mark.a.janes@intel.com>	2015-08-22 22:20:33 -07:00
Glenn Kennard	50932268aa	r600g: Fix assert in tgsi_cmp Fixes https://bugs.freedesktop.org/show_bug.cgi?id=91726 Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Cc: "11.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@gmail.com>	2015-08-23 09:31:12 +10:00
Alexander von Gluck IV	5abbd1cacc	egl: scons: fix the haiku build, do not build the dri2 backend Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-08-22 10:13:31 -05:00
Emil Velikov	a8c5c62359	docs: add 11.1.0-devel release notes template, bump version Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-08-22 13:28:16 +01:00

2261 changed files with 204836 additions and 67621 deletions

									
										1

.dir-locals.el
									
												
				@@ -5,6 +5,7 @@

				  (c-file-style . "stroustrup")

				  (fill-column . 78)

				  (eval . (progn

					    (c-set-offset 'case-label '0)

					    (c-set-offset 'innamespace '0)

					    (c-set-offset 'inline-open '0)))

				  )

									
										101

.travis.yml
									
										Normal file
									
												
				@@ -0,0 +1,101 @@

				language: c

				sudo: false

				cache:

				  directories:

				    - $HOME/.ccache

				addons:

				  apt:

				    packages:

				      - libdrm-dev

				      - libudev-dev

				      - x11proto-xf86vidmode-dev

				      - libexpat1-dev

				      - libxcb-dri2-0-dev

				      - libx11-xcb-dev

				      - llvm-3.4-dev

				      - scons

				env:

				  global:

				    - XORG_RELEASES=http://xorg.freedesktop.org/releases/individual

				    - XCB_RELEASES=http://xcb.freedesktop.org/dist

				    - XORGMACROS_VERSION=util-macros-1.19.0

				    - GLPROTO_VERSION=glproto-1.4.17

				    - DRI2PROTO_VERSION=dri2proto-2.8

				    - DRI3PROTO_VERSION=dri3proto-1.0

				    - PRESENTPROTO_VERSION=presentproto-1.0

				    - LIBPCIACCESS_VERSION=libpciaccess-0.13.4

				    - LIBDRM_VERSION=libdrm-2.4.65

				    - XCBPROTO_VERSION=xcb-proto-1.11

				    - LIBXCB_VERSION=libxcb-1.11

				    - LIBXSHMFENCE_VERSION=libxshmfence-1.2

				    - PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig

				  matrix:

				    - BUILD=make

				    - BUILD=scons

				install:

				  - export PATH="/usr/lib/ccache:$PATH"

				  - pip install --user mako

				  # Install dependencies where we require specific versions (or where

				  # disallowed by Travis CI's package whitelisting).

				  - wget $XORG_RELEASES/util/$XORGMACROS_VERSION.tar.bz2

				  - tar -jxvf $XORGMACROS_VERSION.tar.bz2

				  - (cd $XORGMACROS_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XORG_RELEASES/proto/$GLPROTO_VERSION.tar.bz2

				  - tar -jxvf $GLPROTO_VERSION.tar.bz2

				  - (cd $GLPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XORG_RELEASES/proto/$DRI2PROTO_VERSION.tar.bz2

				  - tar -jxvf $DRI2PROTO_VERSION.tar.bz2

				  - (cd $DRI2PROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XORG_RELEASES/proto/$DRI3PROTO_VERSION.tar.bz2

				  - tar -jxvf $DRI3PROTO_VERSION.tar.bz2

				  - (cd $DRI3PROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XORG_RELEASES/proto/$PRESENTPROTO_VERSION.tar.bz2

				  - tar -jxvf $PRESENTPROTO_VERSION.tar.bz2

				  - (cd $PRESENTPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2

				  - tar -jxvf $XCBPROTO_VERSION.tar.bz2

				  - (cd $XCBPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XCB_RELEASES/$LIBXCB_VERSION.tar.bz2

				  - tar -jxvf $LIBXCB_VERSION.tar.bz2

				  - (cd $LIBXCB_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XORG_RELEASES/lib/$LIBPCIACCESS_VERSION.tar.bz2

				  - tar -jxvf $LIBPCIACCESS_VERSION.tar.bz2

				  - (cd $LIBPCIACCESS_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget http://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2

				  - tar -jxvf $LIBDRM_VERSION.tar.bz2

				  - (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix && make install)

				  - wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2

				  - tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2

				  - (cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)

				# Disabled LLVM (and therefore r300 and r600) because the build fails

				# with "undefined reference to `clock_gettime'" and "undefined

				# reference to `setupterm'" in llvmpipe.

				script:

				  - if test "x$BUILD" = xmake; then

				      ./autogen.sh --enable-debug

				        --disable-gallium-llvm

				        --with-egl-platforms=x11,drm

				        --with-dri-drivers=i915,i965,radeon,r200,swrast,nouveau

				        --with-gallium-drivers=svga,swrast,vc4,virgl

				        ;

				      make && make check;

				    elif test x$BUILD = xscons; then

				      scons;

				    fi
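The install steps in this file repeat one sequence per dependency: download the tarball, unpack it, then configure and install it into `$HOME/prefix`. A minimal sketch of how that sequence could be factored out follows. The helper names `dep_url` and `build_dep` are hypothetical and do not appear in the file; only the URL layout and the commands are taken from the steps above.

```shell
#!/bin/sh
# Hypothetical helper: compose a tarball URL the way the install steps do,
# i.e. <base>/<subdir>/<name-version>.tar.bz2 (the subdir may be empty).
dep_url() {
    base=$1; subdir=$2; pkg=$3
    if [ -n "$subdir" ]; then
        echo "$base/$subdir/$pkg.tar.bz2"
    else
        echo "$base/$pkg.tar.bz2"
    fi
}

# Hypothetical helper: fetch, unpack, and install one dependency into $PREFIX.
# Stops at the first failing step.
build_dep() {
    url=$1; pkg=$2
    wget "$url" &&
    tar -jxf "$pkg.tar.bz2" &&
    (cd "$pkg" && ./configure --prefix="$PREFIX" && make install)
}

XORG_RELEASES=http://xorg.freedesktop.org/releases/individual
PREFIX=${PREFIX:-${HOME:-.}/prefix}

# Example call (not executed here, since it needs network access):
#   build_dep "$(dep_url "$XORG_RELEASES" proto glproto-1.4.17)" glproto-1.4.17
```

Keeping the steps chained with `&&` means a failed download is not followed by an attempt to unpack a missing archive.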

									
										12

Android.common.mk
									
												
				@@ -21,13 +21,8 @@

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER

				# DEALINGS IN THE SOFTWARE.

				# use c99 compiler by default

				ifeq ($(LOCAL_CC),)

				ifeq ($(LOCAL_IS_HOST_MODULE),true)

				LOCAL_CC := $(HOST_CC) -std=c99 -D_GNU_SOURCE

				else

				LOCAL_CC := $(TARGET_CC) -std=c99

				endif

				LOCAL_CFLAGS += -D_GNU_SOURCE

				endif

				LOCAL_C_INCLUDES += \

				@@ -37,6 +32,7 @@ LOCAL_C_INCLUDES += \

				MESA_VERSION := $(shell cat $(MESA_TOP)/VERSION)

				# define ANDROID_VERSION (e.g., 4.0.x => 0x0400)

				LOCAL_CFLAGS += \

					-Wno-unused-parameter \

					-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \

					-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\" \

					-DANDROID_VERSION=0x0$(MESA_ANDROID_MAJOR_VERSION)0$(MESA_ANDROID_MINOR_VERSION)

				@@ -60,6 +56,10 @@ LOCAL_CFLAGS += \

					-fvisibility=hidden \

					-Wno-sign-compare

				# mesa requires at least c99 compiler

				LOCAL_CONLYFLAGS += \

					-std=c99

				ifeq ($(strip $(MESA_ENABLE_ASM)),true)

				ifeq ($(TARGET_ARCH),x86)

				LOCAL_CFLAGS += \

									
										5

Android.mk
									
												
				@@ -24,7 +24,7 @@

				# BOARD_GPU_DRIVERS should be defined.  The valid values are

				#

				#   classic drivers: i915 i965

				#   gallium drivers: swrast freedreno i915g ilo nouveau r300g r600g radeonsi vc4 vmwgfx

				#   gallium drivers: swrast freedreno i915g ilo nouveau r300g r600g radeonsi vc4 virgl vmwgfx

				#

				# The main target is libGLES_mesa.  For each classic driver enabled, a DRI

				# module will also be built.  DRI modules will be loaded by libGLES_mesa.

				@@ -46,7 +46,7 @@ MESA_COMMON_MK := $(MESA_TOP)/Android.common.mk

				MESA_PYTHON2 := python

				classic_drivers := i915 i965

				gallium_drivers := swrast freedreno i915g ilo nouveau r300g r600g radeonsi vmwgfx vc4

				gallium_drivers := swrast freedreno i915g ilo nouveau r300g r600g radeonsi vmwgfx vc4 virgl

				MESA_GPU_DRIVERS := $(strip $(BOARD_GPU_DRIVERS))

				@@ -86,6 +86,7 @@ ifneq ($(strip $(MESA_GPU_DRIVERS)),)

				SUBDIRS := \

					src/loader \

					src/mapi \

					src/compiler \

					src/glsl \

					src/mesa \

					src/util \

									
										1

Makefile.am
									
												
				@@ -51,7 +51,6 @@ noinst_HEADERS = \

					include/c99_alloca.h \

					include/c99_compat.h \

					include/c99_math.h \

					include/c99 \

					include/c11 \

					include/D3D9 \

					include/HaikuGL \

2

VERSION


@@ -1 +1 @@
 11.0.9
 11.2.0-devel

									
										73

appveyor.yml
									
										Normal file
									
												
				@@ -0,0 +1,73 @@

				# http://www.appveyor.com/docs/appveyor-yml

				#

				# To setup AppVeyor for your own personal repositories do the following:

				# - Sign up

				# - Add a new project

				# - Select Git and fill in the Git clone URL

				# - Setup a Git hook as explained in

				#   https://github.com/appveyor/webhooks#installing-git-hook

				# - Check 'Settings > General > Skip branches without appveyor.yml'

				# - Check 'Settings > General > Rolling builds'

				# - Setup the global or project notifications to your liking

				#

				# Note that kicking (or restarting) a build via the web UI will not work, as it

				# will fail to find appveyor.yml .  The Git hook is the most practical way to

				# kick a build.

				#

				# See also:

				# - http://help.appveyor.com/discussions/problems/2209-node-grunt-build-specify-a-project-or-solution-file-the-directory-does-not-contain-a-project-or-solution-file

				# - http://help.appveyor.com/discussions/questions/1184-build-config-vs-appveyoryaml

				version: '{build}'

				branches:

				  except:

				  - /^travis.*$/

				# Don't download the full Mesa history to speed up cloning.  However the clone

				# depth must not be too small, otherwise builds might fail when lots of patches

				# are committed in succession, because the desired commit is not found on the

				# truncated history.

				#

				# See also:

				# - https://www.appveyor.com/blog/2014/06/04/shallow-clone-for-git-repositories

				clone_depth: 100

				cache:

				- win_flex_bison-2.4.5.zip

				- llvm-3.3.1-msvc2013-mtd.7z

				environment:

				  WINFLEXBISON_ARCHIVE: win_flex_bison-2.4.5.zip

				  LLVM_ARCHIVE: llvm-3.3.1-msvc2013-mtd.7z

				install:

				# Check pip

				- python --version

				- python -m pip --version

				# Install Mako

				- python -m pip install --egg Mako

				# Install SCons

				- python -m pip install --egg scons==2.4.1

				- scons --version

				# Install flex/bison

				- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "http://downloads.sourceforge.net/project/winflexbison/%WINFLEXBISON_ARCHIVE%"

				- 7z x -y -owinflexbison\ "%WINFLEXBISON_ARCHIVE%" > nul

				- set Path=%CD%\winflexbison;%Path%

				- win_flex --version

				- win_bison --version

				# Download and extract LLVM

				- if not exist "%LLVM_ARCHIVE%" appveyor DownloadFile "https://people.freedesktop.org/~jrfonseca/llvm/%LLVM_ARCHIVE%"

				- 7z x -y "%LLVM_ARCHIVE%" > nul

				- mkdir llvm\bin

				- set LLVM=%CD%\llvm

				build_script:

				- scons -j%NUMBER_OF_PROCESSORS% MSVC_VERSION=12.0 llvm=1

				# It's possible to setup notification here, as described in

				# http://www.appveyor.com/docs/notifications#appveyor-yml-configuration , but

				# doing so would cause the notification settings to be replicated across all

				# repos, which is most likely undesired.  So it's better to rely on the

				# Appveyor global/project notification settings.

14

bin/.cherry-ignore


@@ -1,14 +0,0 @@
 # The commit base differs greatly between 11.0 and master
 ca95ecce064c7d841a3a374c2179f56161be glsl: fix stream qualifier for blocks with an instance name
 # Somewhat of a mixed feature/bugfix patch, causing some 200 piglit regressions
 b676570960277d47477822ffeccc672613f9142 gallium/swrast: fix front buffer blitting. (v2)
 # causes regression in xwayland, kde/plasma, mpv, steam ... fdo#92759
 839793680f99b8387bee9489733d5071c10f3ace i965: Use MESA_FORMAT_B8G8R8X8_SRGB for RGB visuals
 # already picked as commit 94ac4b3e84737b8c5faa371834670fd25502e024
 b5b87c4ed1dfd58aec8905e0514c9ba92ba83e1d r600g: write all MRTs only if there is exactly one output (fixes a hang)
 # patch not applicable on branch (null check already exists)
 f7b71451231c75c36771e8b7b0d78f05e0d50f65 glx/dri3: a drawable might not be bound at wait time

									
										35

bin/get-extra-pick-list.sh
									
											
				@@ -1,35 +0,0 @@

				#!/bin/sh

				# Script for generating a list of candidates which fix commits that have been

				# previously cherry-picked to a stable branch.

				#

				# Usage examples:

				#

				# $ bin/get-extra-pick-list.sh

				# $ bin/get-extra-pick-list.sh > picklist

				# $ bin/get-extra-pick-list.sh | tee picklist

				# Use the last branchpoint as our limit for the search

				# XXX: there should be a better way for this

				latest_branchpoint=`git branch | grep \* | cut -c 3-`-branchpoint

				# Grep for commits with "cherry picked from commit" in the commit message.

				git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\

					grep "cherry picked from commit" |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' |\

					cut -c -8 |\

				while read sha

				do

					# Check if the original commit is referenced in master

					git log -n1 --pretty=oneline --grep=$sha $latest_branchpoint..origin/master |\

						cut -c -8 |\

					while read candidate

					do

						# Check if the potential fix, hasn't landed in branch yet.

						found=`git log -n1 --pretty=oneline --reverse --grep=$candidate $latest_branchpoint..HEAD |wc -l`

						if test $found = 0

						then

							echo Commit $candidate might need to be picked, as it references $sha

						fi

					done

				done
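The extraction stage of the removed script reduces each "(cherry picked from commit ...)" trailer to an 8-character SHA prefix before searching for it. A minimal sketch of that stage in isolation follows. The function name `short_picked_sha` is hypothetical; the `sed` expressions are the ones from the script, and the sample SHA is one listed in `bin/.cherry-ignore` above.

```shell
#!/bin/sh
# Hypothetical helper: strip the "(cherry picked from commit" prefix and the
# closing parenthesis from each input line, then keep the first 8 characters.
short_picked_sha() {
    sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' |
        cut -c 1-8
}

echo "    (cherry picked from commit b5b87c4ed1dfd58aec8905e0514c9ba92ba83e1d)" |
    short_picked_sha
# prints: b5b87c4e
```

An 8-character prefix is short enough that unrelated commits could occasionally match, which is presumably why the script words its output as "might need to be picked".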

									
										2

bin/get-pick-list.sh
									
												
				@@ -14,7 +14,7 @@ git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked

				# Grep for commits that were marked as a candidate for the stable tree.

				git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*11\.0.*mesa-stable\)' HEAD..origin/master |\

				git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)' HEAD..origin/master |\

				while read sha

				do

					# Check to see whether the patch is on the ignore list.
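The hunk above drops the `11\.0` requirement from the stable-candidate pattern, so any `CC:` line naming mesa-stable now qualifies. A rough sketch of the effect follows, using `grep -E -i` on sample trailer lines. The patterns are extended-regex translations of the ones in the script, the helper `count_matches` is hypothetical, and `git log --grep` may differ from `grep` in details.

```shell
#!/bin/sh
# ERE translations (an assumption, for use with grep -E) of the old and new patterns.
old_pat='^([[:space:]]*NOTE: .*[Cc]andidate|CC:.*11\.0.*mesa-stable)'
new_pat='^([[:space:]]*NOTE: .*[Cc]andidate|CC:.*mesa-stable)'

# Hypothetical helper: count the input lines matching a pattern, ignoring case.
count_matches() {
    grep -E -i -c -e "$1"
}

# Sample trailers in the style of the commit log above; the last one is made up.
sample='Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Cc: <mesa-stable@lists.freedesktop.org>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Someone <someone@example.com>'

printf '%s\n' "$sample" | count_matches "$old_pat"   # prints: 2
printf '%s\n' "$sample" | count_matches "$new_pat"   # prints: 3
```

The unversioned `Cc: <mesa-stable@...>` line is the one that only the new pattern picks up.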

402

configure.ac


@@ -9,6 +9,7 @@ dnl Copyright © 2009-2014 Jon TURNEY
 dnl Copyright © 2011-2012 Benjamin Franzke
 dnl Copyright © 2008-2014 David Airlie
 dnl Copyright © 2009-2013 Brian Paul
 dnl Copyright © 2003-2007 Keith Packard, Daniel Stone
 dnl
 dnl Permission is hereby granted, free of charge, to any person obtaining a
 dnl copy of this software and associated documentation files (the "Software"),
@@ -71,16 +72,16 @@ LIBDRM_REQUIRED=2.4.60
 LIBDRM_RADEON_REQUIRED=2.4.56
 LIBDRM_AMDGPU_REQUIRED=2.4.63
 LIBDRM_INTEL_REQUIRED=2.4.61
 LIBDRM_NVVIEUX_REQUIRED=2.4.33
 LIBDRM_NOUVEAU_REQUIRED=2.4.62
 LIBDRM_FREEDRENO_REQUIRED=2.4.64
 LIBDRM_NVVIEUX_REQUIRED=2.4.66
 LIBDRM_NOUVEAU_REQUIRED=2.4.66
 LIBDRM_FREEDRENO_REQUIRED=2.4.67
 DRI2PROTO_REQUIRED=2.6
 DRI3PROTO_REQUIRED=1.0
 PRESENTPROTO_REQUIRED=1.0
 LIBUDEV_REQUIRED=151
 GLPROTO_REQUIRED=1.4.14
 LIBOMXIL_BELLAGIO_REQUIRED=0.0
 LIBVA_REQUIRED=0.35.0
 LIBVA_REQUIRED=0.38.0
 VDPAU_REQUIRED=1.1
 WAYLAND_REQUIRED=1.2.0
 XCB_REQUIRED=1.9.3
@@ -196,6 +197,13 @@ if test "x$GCC" = xyes -a "x$acv_mesa_CLANG" = xno; then
     fi
 fi
 dnl We don't support building Mesa with Sun C compiler
 dnl https://bugs.freedesktop.org/show_bug.cgi?id=93189
 AC_CHECK_DECL([__SUNPRO_C], [SUNCC=yes], [SUNCC=no])
 if test "x$SUNCC" = xyes; then
     AC_MSG_ERROR([Building with Sun C compiler is not supported, use GCC instead.])
 fi
 dnl Check for compiler builtins
 AX_GCC_BUILTIN([__builtin_bswap32])
 AX_GCC_BUILTIN([__builtin_bswap64])
@@ -237,7 +245,7 @@ _SAVE_LDFLAGS="$LDFLAGS"
 _SAVE_CPPFLAGS="$CPPFLAGS"
 dnl Compiler macros
 DEFINES="-D__STDC_LIMIT_MACROS"
 DEFINES="-D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS"
 AC_SUBST([DEFINES])
 case "$host_os" in
 linux*|*-gnu*|gnu*)
@@ -297,8 +305,7 @@ if test "x$GCC" = xyes; then
     # Flags to help ensure that certain portions of the code -- and only those
     # portions -- can be built with MSVC:
     # - src/util, src/gallium/auxiliary, and src/gallium/drivers/llvmpipe needs
     #   to build with Windows SDK 7.0.7600, which bundles MSVC 2008
     # - src/util, src/gallium/auxiliary, rc/gallium/drivers/llvmpipe, and
     # - non-Linux/Posix OpenGL portions needs to build on MSVC 2013 (which
     #   supports most of C99)
     # - the rest has no compiler compiler restrictions
@@ -315,9 +322,6 @@ if test "x$GCC" = xyes; then
 		    AC_MSG_RESULT([yes])],
 		    AC_MSG_RESULT([no]));
     CFLAGS="$save_CFLAGS"
     MSVC2008_COMPAT_CFLAGS="$MSVC2013_COMPAT_CFLAGS -Werror=declaration-after-statement"
     MSVC2008_COMPAT_CXXFLAGS="$MSVC2013_COMPAT_CXXFLAGS"
 fi
 if test "x$GXX" = xyes; then
     CXXFLAGS="$CXXFLAGS -Wall"
@@ -345,8 +349,6 @@ fi
 AC_SUBST([MSVC2013_COMPAT_CFLAGS])
 AC_SUBST([MSVC2013_COMPAT_CXXFLAGS])
 AC_SUBST([MSVC2008_COMPAT_CFLAGS])
 AC_SUBST([MSVC2008_COMPAT_CXXFLAGS])
 dnl even if the compiler appears to support it, using visibility attributes isn't
 dnl going to do anything useful currently on cygwin apart from emit lots of warnings
@@ -388,6 +390,61 @@ fi
 AM_CONDITIONAL([SSE41_SUPPORTED], [test x$SSE41_SUPPORTED = x1])
 AC_SUBST([SSE41_CFLAGS], $SSE41_CFLAGS)
 dnl Check for Endianness
 AC_C_BIGENDIAN(
    little_endian=no,
    little_endian=yes,
    little_endian=no,
    little_endian=no
 )
 dnl Check for POWER8 Architecture
 PWR8_CFLAGS="-mpower8-vector"
 have_pwr8_intrinsics=no
 AC_MSG_CHECKING(whether gcc supports -mpower8-vector)
 save_CFLAGS=$CFLAGS
 CFLAGS="$PWR8_CFLAGS $CFLAGS"
 AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
 #if defined(__GNUC__) && (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8))
 #error "Need GCC >= 4.8 for sane POWER8 support"
 #endif
 #include <altivec.h>
 int main () {
     vector unsigned char r;
     vector unsigned int v = vec_splat_u32 (1);
     r = __builtin_vec_vgbbd ((vector unsigned char) v);
     return 0;
 }]])], have_pwr8_intrinsics=yes)
 CFLAGS=$save_CFLAGS
 AC_ARG_ENABLE(pwr8,
    [AC_HELP_STRING([--disable-pwr8-inst],
                    [disable POWER8-specific instructions])],
    [enable_pwr8=$enableval], [enable_pwr8=auto])
 if test "x$enable_pwr8" = xno ; then
    have_pwr8_intrinsics=disabled
 fi
 if test $have_pwr8_intrinsics = yes && test $little_endian = yes ; then
    DEFINES="$DEFINES -D_ARCH_PWR8"
    CXXFLAGS="$CXXFLAGS $PWR8_CFLAGS"
    CFLAGS="$CFLAGS $PWR8_CFLAGS"
 else
    PWR8_CFLAGS=
 fi
 AC_MSG_RESULT($have_pwr8_intrinsics)
 if test "x$enable_pwr8" = xyes && test $have_pwr8_intrinsics = no ; then
    AC_MSG_ERROR([POWER8 compiler support not detected])
 fi
 if test $have_pwr8_intrinsics = yes && test $little_endian = no ; then
    AC_MSG_WARN([POWER8 optimization is enabled only on POWER8 Little-Endian])
 fi
 AC_SUBST([PWR8_CFLAGS], $PWR8_CFLAGS)
 dnl Can't have static and shared libraries, default to static if user
 dnl explicitly requested. If both disabled, set to static since shared
 dnl was explicitly requested.
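The POWER8 hunk above applies `-mpower8-vector` and `-D_ARCH_PWR8` only when two independent results agree: the compile test succeeded and the target is little-endian. A minimal sketch of that decision as a plain shell function follows. The name `pwr8_flags_applied` is hypothetical, and the arguments stand in for `$have_pwr8_intrinsics` and `$little_endian`.

```shell
#!/bin/sh
# Hypothetical helper: report whether the POWER8 flags would be applied, given
# the compile-test result (yes/no/disabled) and the endianness result (yes/no).
pwr8_flags_applied() {
    if [ "$1" = yes ] && [ "$2" = yes ]; then
        echo yes
    else
        echo no
    fi
}

pwr8_flags_applied yes yes        # prints: yes
pwr8_flags_applied yes no         # prints: no (configure warns in this case)
pwr8_flags_applied disabled yes   # prints: no
```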
@@ -413,8 +470,29 @@ AC_ARG_ENABLE([debug],
     [enable_debug="$enableval"],
     [enable_debug=no]
 )
 AC_ARG_ENABLE([profile],
     [AS_HELP_STRING([--enable-profile],
         [enable profiling of code @<:@default=disabled@:>@])],
     [enable_profile="$enableval"],
     [enable_profile=no]
 )
 if test "x$enable_profile" = xyes; then
     DEFINES="$DEFINES -DPROFILE"
     if test "x$GCC" = xyes; then
         CFLAGS="$CFLAGS -fno-omit-frame-pointer"
     fi
     if test "x$GXX" = xyes; then
         CXXFLAGS="$CXXFLAGS -fno-omit-frame-pointer"
     fi
 fi
 if test "x$enable_debug" = xyes; then
     DEFINES="$DEFINES -DDEBUG"
     if test "x$enable_profile" = xyes; then
         AC_MSG_WARN([Debug and Profile are enabled at the same time])
     fi
     if test "x$GCC" = xyes; then
         if ! echo "$CFLAGS" | grep -q -e '-g'; then
             CFLAGS="$CFLAGS -g"
@@ -535,15 +613,32 @@ AM_CONDITIONAL(HAVE_COMPAT_SYMLINKS, test "x$HAVE_COMPAT_SYMLINKS" = xyes)
 dnl
 dnl library names
 dnl
 dnl Unfortunately we need to do a few things that libtool can't help us with,
 dnl so we need some knowledge of shared library filenames:
 dnl
 dnl LIB_EXT is the extension used when creating symlinks for alternate
 dnl filenames for a shared library which will be dynamically loaded
 dnl
 dnl IMP_LIB_EXT is the extension used when checking for the presence of a
 dnl the file for a shared library we wish to link with
 dnl
 case "$host_os" in
 darwin* )
     LIB_EXT='dylib' ;;
     LIB_EXT='dylib'
     IMP_LIB_EXT=$LIB_EXT
     ;;
 cygwin* )
     LIB_EXT='dll' ;;
     LIB_EXT='dll'
     IMP_LIB_EXT='dll.a'
     ;;
 aix* )
     LIB_EXT='a' ;;
     LIB_EXT='a'
     IMP_LIB_EXT=$LIB_EXT
     ;;
 * )
     LIB_EXT='so' ;;
     LIB_EXT='so'
     IMP_LIB_EXT=$LIB_EXT
     ;;
 esac
 AC_SUBST([LIB_EXT])
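The hunk above adds `IMP_LIB_EXT` next to `LIB_EXT`; the two differ only on Cygwin, where the import library is a `.dll.a` file. A small sketch of the same mapping as a standalone shell function follows. The name `lib_exts` is hypothetical, and its argument plays the role of `$host_os`.

```shell
#!/bin/sh
# Hypothetical helper mirroring the case statement: print "LIB_EXT IMP_LIB_EXT"
# for a given host_os string.
lib_exts() {
    case "$1" in
    darwin*) echo "dylib dylib" ;;
    cygwin*) echo "dll dll.a" ;;
    aix*)    echo "a a" ;;
    *)       echo "so so" ;;
    esac
}

lib_exts cygwin      # prints: dll dll.a
lib_exts linux-gnu   # prints: so so
```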
@@ -750,6 +845,11 @@ linux*)
     dri3_default=no
     ;;
 esac
 if test "x$enable_dri" = xno; then
     dri3_default=no
 fi
 AC_ARG_ENABLE([dri3],
     [AS_HELP_STRING([--enable-dri3],
         [enable DRI3 @<:@default=auto@:>@])],
@@ -849,7 +949,7 @@ GALLIUM_DRIVERS_DEFAULT="r300,r600,svga,swrast"
 AC_ARG_WITH([gallium-drivers],
     [AS_HELP_STRING([--with-gallium-drivers@<:@=DIRS...@:>@],
         [comma delimited Gallium drivers list, e.g.
         "i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4"
         "i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl"
         @<:@default=r300,r600,svga,swrast@:>@])],
     [with_gallium_drivers="$withval"],
     [with_gallium_drivers="$GALLIUM_DRIVERS_DEFAULT"])
@@ -939,8 +1039,13 @@ gnu*|cygwin*)
     dri_platform='drm' ;;
 esac
 if test "x$enable_dri" = xyes -a "x$dri_platform" = xdrm -a "x$have_libdrm" = xyes; then
     have_drisw_kms='yes'
 fi
 AM_CONDITIONAL(HAVE_DRICOMMON, test "x$enable_dri" = xyes )
 AM_CONDITIONAL(HAVE_DRISW, test "x$enable_dri" = xyes )
 AM_CONDITIONAL(HAVE_DRISW_KMS, test "x$have_drisw_kms" = xyes )
 AM_CONDITIONAL(HAVE_DRI2, test "x$enable_dri" = xyes -a "x$dri_platform" = xdrm -a "x$have_libdrm" = xyes )
 AM_CONDITIONAL(HAVE_DRI3, test "x$enable_dri3" = xyes -a "x$dri_platform" = xdrm -a "x$have_libdrm" = xyes )
 AM_CONDITIONAL(HAVE_APPLEDRI, test "x$enable_dri" = xyes -a "x$dri_platform" = xapple )
@@ -975,10 +1080,6 @@ if test -n "$with_gallium_drivers" -a "x$enable_glx$enable_xlib_glx" = xyesyes;
     NEED_WINSYS_XLIB="yes"
 fi
 if test "x$enable_dri" = xyes; then
     enable_gallium_loader="$enable_shared_pipe_drivers"
 fi
 if test "x$enable_gallium_osmesa" = xyes; then
     if ! echo "$with_gallium_drivers" | grep -q 'swrast'; then
         AC_MSG_ERROR([gallium_osmesa requires the gallium swrast driver])
@@ -990,6 +1091,149 @@ fi
 AC_SUBST([MESA_LLVM])
 # SHA1 hashing
 AC_ARG_WITH([sha1],
         [AS_HELP_STRING([--with-sha1=libc|libmd|libnettle|libgcrypt|libcrypto|libsha1|CommonCrypto|CryptoAPI],
         [choose SHA1 implementation])])
 case "x$with_sha1" in
 x | xlibc | xlibmd | xlibnettle | xlibgcrypt | xlibcrypto | xlibsha1 | xCommonCrypto | xCryptoAPI)
   ;;
 *)
         AC_MSG_ERROR([Illegal value for --with-sha1: $with_sha1])
 esac
 AC_CHECK_FUNC([SHA1Init], [HAVE_SHA1_IN_LIBC=yes])
 if test "x$with_sha1" = x && test "x$HAVE_SHA1_IN_LIBC" = xyes; then
 	with_sha1=libc
 fi
 if test "x$with_sha1" = xlibc && test "x$HAVE_SHA1_IN_LIBC" != xyes; then
 	AC_MSG_ERROR([sha1 in libc requested but not found])
 fi
 if test "x$with_sha1" = xlibc; then
 	AC_DEFINE([HAVE_SHA1_IN_LIBC], [1],
 		[Use libc SHA1 functions])
 	SHA1_LIBS=""
 fi
 AC_CHECK_FUNC([CC_SHA1_Init], [HAVE_SHA1_IN_COMMONCRYPTO=yes])
 if test "x$with_sha1" = x && test "x$HAVE_SHA1_IN_COMMONCRYPTO" = xyes; then
 	with_sha1=CommonCrypto
 fi
 if test "x$with_sha1" = xCommonCrypto && test "x$HAVE_SHA1_IN_COMMONCRYPTO" != xyes; then
 	AC_MSG_ERROR([CommonCrypto requested but not found])
 fi
 if test "x$with_sha1" = xCommonCrypto; then
 	AC_DEFINE([HAVE_SHA1_IN_COMMONCRYPTO], [1],
 		[Use CommonCrypto SHA1 functions])
 	SHA1_LIBS=""
 fi
 dnl stdcall functions cannot be tested with AC_CHECK_LIB
 AC_CHECK_HEADER([wincrypt.h], [HAVE_SHA1_IN_CRYPTOAPI=yes], [], [#include <windows.h>])
 if test "x$with_sha1" = x && test "x$HAVE_SHA1_IN_CRYPTOAPI" = xyes; then
 	with_sha1=CryptoAPI
 fi
 if test "x$with_sha1" = xCryptoAPI && test "x$HAVE_SHA1_IN_CRYPTOAPI" != xyes; then
 	AC_MSG_ERROR([CryptoAPI requested but not found])
 fi
 if test "x$with_sha1" = xCryptoAPI; then
 	AC_DEFINE([HAVE_SHA1_IN_CRYPTOAPI], [1],
 		[Use CryptoAPI SHA1 functions])
 	SHA1_LIBS=""
 fi
 AC_CHECK_LIB([md], [SHA1Init], [HAVE_LIBMD=yes])
 if test "x$with_sha1" = x && test "x$HAVE_LIBMD" = xyes; then
 	with_sha1=libmd
 fi
 if test "x$with_sha1" = xlibmd && test "x$HAVE_LIBMD" != xyes; then
 	AC_MSG_ERROR([libmd requested but not found])
 fi
 if test "x$with_sha1" = xlibmd; then
 	AC_DEFINE([HAVE_SHA1_IN_LIBMD], [1],
 	          [Use libmd SHA1 functions])
 	SHA1_LIBS=-lmd
 fi
 PKG_CHECK_MODULES([LIBSHA1], [libsha1], [HAVE_LIBSHA1=yes], [HAVE_LIBSHA1=no])
 if test "x$with_sha1" = x && test "x$HAVE_LIBSHA1" = xyes; then
    with_sha1=libsha1
 fi
 if test "x$with_sha1" = xlibsha1 && test "x$HAVE_LIBSHA1" != xyes; then
 	AC_MSG_ERROR([libsha1 requested but not found])
 fi
 if test "x$with_sha1" = xlibsha1; then
 	AC_DEFINE([HAVE_SHA1_IN_LIBSHA1], [1],
 	          [Use libsha1 for SHA1])
 	SHA1_LIBS=-lsha1
 fi
 AC_CHECK_LIB([nettle], [nettle_sha1_init], [HAVE_LIBNETTLE=yes])
 if test "x$with_sha1" = x && test "x$HAVE_LIBNETTLE" = xyes; then
 	with_sha1=libnettle
 fi
 if test "x$with_sha1" = xlibnettle && test "x$HAVE_LIBNETTLE" != xyes; then
 	AC_MSG_ERROR([libnettle requested but not found])
 fi
 if test "x$with_sha1" = xlibnettle; then
 	AC_DEFINE([HAVE_SHA1_IN_LIBNETTLE], [1],
 	          [Use libnettle SHA1 functions])
 	SHA1_LIBS=-lnettle
 fi
 AC_CHECK_LIB([gcrypt], [gcry_md_open], [HAVE_LIBGCRYPT=yes])
 if test "x$with_sha1" = x && test "x$HAVE_LIBGCRYPT" = xyes; then
 	with_sha1=libgcrypt
 fi
 if test "x$with_sha1" = xlibgcrypt && test "x$HAVE_LIBGCRYPT" != xyes; then
 	AC_MSG_ERROR([libgcrypt requested but not found])
 fi
 if test "x$with_sha1" = xlibgcrypt; then
 	AC_DEFINE([HAVE_SHA1_IN_LIBGCRYPT], [1],
 	          [Use libgcrypt SHA1 functions])
 	SHA1_LIBS=-lgcrypt
 fi
 # We don't need all of the OpenSSL libraries, just libcrypto
 AC_CHECK_LIB([crypto], [SHA1_Init], [HAVE_LIBCRYPTO=yes])
 PKG_CHECK_MODULES([OPENSSL], [openssl], [HAVE_OPENSSL_PKC=yes],
                   [HAVE_OPENSSL_PKC=no])
 if test "x$HAVE_LIBCRYPTO" = xyes || test "x$HAVE_OPENSSL_PKC" = xyes; then
 	if test "x$with_sha1" = x; then
 		with_sha1=libcrypto
 	fi
 else
 	if test "x$with_sha1" = xlibcrypto; then
 		AC_MSG_ERROR([OpenSSL libcrypto requested but not found])
 	fi
 fi
 if test "x$with_sha1" = xlibcrypto; then
 	if test "x$HAVE_LIBCRYPTO" = xyes; then
 		SHA1_LIBS=-lcrypto
 	else
 		SHA1_LIBS="$OPENSSL_LIBS"
 		SHA1_CFLAGS="$OPENSSL_CFLAGS"
 	fi
 fi
 AC_MSG_CHECKING([for SHA1 implementation])
 AC_MSG_RESULT([$with_sha1])
 AC_SUBST(SHA1_LIBS)
 AC_SUBST(SHA1_CFLAGS)
 # Enable a define for SHA1
 if test "x$with_sha1" != "x"; then
 	DEFINES="$DEFINES -DHAVE_SHA1"
 fi
 # Allow user to configure out the shader-cache feature
 AC_ARG_ENABLE([shader-cache],
     AS_HELP_STRING([--disable-shader-cache], [Disable binary shader cache]),
     [enable_shader_cache="$enableval"],
     [if test "x$with_sha1" != "x"; then
         enable_shader_cache=yes
      else
         enable_shader_cache=no
      fi])
 if test "x$with_sha1" = "x"; then
     if test "x$enable_shader_cache" = "xyes"; then
         AC_MSG_ERROR([Cannot enable shader cache (no SHA-1 implementation found)])
     fi
 fi
 AM_CONDITIONAL([ENABLE_SHADER_CACHE], [test x$enable_shader_cache = xyes])
 case "$host_os" in
 linux*)
     need_pci_id=yes ;;
@@ -1066,7 +1310,8 @@ xyesno)
             if test x"$enable_dri3" = xyes; then
                PKG_CHECK_EXISTS([xcb >= $XCB_REQUIRED], [], AC_MSG_ERROR([DRI3 requires xcb >= $XCB_REQUIRED]))
                dri_modules="$dri_modules xcb-dri3 xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED"
                dri3_modules="xcb-dri3 xcb-present xcb-sync xshmfence >= $XSHMFENCE_REQUIRED"
                PKG_CHECK_MODULES([XCB_DRI3], [$dri3_modules])
             fi
         fi
         if test x"$dri_platform" = xapple ; then
@@ -1407,6 +1652,12 @@ if test "x$enable_egl" = xyes; then
             if test "x$enable_shared_glapi" = xno; then
                 AC_MSG_ERROR([egl_dri2 requires --enable-shared-glapi])
             fi
             if test "x$enable_dri3" = xyes; then
                 HAVE_EGL_DRIVER_DRI3=1
                 if test "x$enable_shared_glapi" = xno; then
                     AC_MSG_ERROR([egl_dri3 requires --enable-shared-glapi])
                 fi
             fi
         else
             # Avoid building an "empty" libEGL. Drop/update this
             # when other backends (haiku?) come along.
@@ -1418,6 +1669,8 @@ fi
 AM_CONDITIONAL(HAVE_EGL, test "x$enable_egl" = xyes)
 AC_SUBST([EGL_LIB_DEPS])
 gallium_st="mesa"
 dnl
 dnl XA configuration
 dnl
@@ -1430,7 +1683,7 @@ if test "x$enable_xa" = xyes; then
           enabling XA.
           Example: ./configure --enable-xa --with-gallium-drivers=svga...])
     fi
     enable_gallium_loader=$enable_shared_pipe_drivers
     gallium_st="$gallium_st xa"
 fi
 AM_CONDITIONAL(HAVE_ST_XA, test "x$enable_xa" = xyes)
@@ -1475,25 +1728,25 @@ AM_CONDITIONAL(NEED_GALLIUM_VL_WINSYS, test "x$need_gallium_vl_winsys" = xyes)
 if test "x$enable_xvmc" = xyes; then
     PKG_CHECK_MODULES([XVMC], [xvmc >= $XVMC_REQUIRED])
     enable_gallium_loader=$enable_shared_pipe_drivers
     gallium_st="$gallium_st xvmc"
 fi
 AM_CONDITIONAL(HAVE_ST_XVMC, test "x$enable_xvmc" = xyes)
 if test "x$enable_vdpau" = xyes; then
     PKG_CHECK_MODULES([VDPAU], [vdpau >= $VDPAU_REQUIRED])
     enable_gallium_loader=$enable_shared_pipe_drivers
     gallium_st="$gallium_st vdpau"
 fi
 AM_CONDITIONAL(HAVE_ST_VDPAU, test "x$enable_vdpau" = xyes)
 if test "x$enable_omx" = xyes; then
     PKG_CHECK_MODULES([OMX], [libomxil-bellagio >= $LIBOMXIL_BELLAGIO_REQUIRED])
     enable_gallium_loader=$enable_shared_pipe_drivers
     gallium_st="$gallium_st omx"
 fi
 AM_CONDITIONAL(HAVE_ST_OMX, test "x$enable_omx" = xyes)
 if test "x$enable_va" = xyes; then
     PKG_CHECK_MODULES([VA], [libva >= $LIBVA_REQUIRED])
     enable_gallium_loader=$enable_shared_pipe_drivers
     gallium_st="$gallium_st va"
 fi
 AM_CONDITIONAL(HAVE_ST_VA, test "x$enable_va" = xyes)
@@ -1515,7 +1768,7 @@ if test "x$enable_nine" = xyes; then
         AC_MSG_WARN([using nine together with wine requires DRI3 enabled system])
     fi
     enable_gallium_loader=$enable_shared_pipe_drivers
     gallium_st="$gallium_st nine"
 fi
 AM_CONDITIONAL(HAVE_ST_NINE, test "x$enable_nine" = xyes)
@@ -1561,8 +1814,7 @@ if test "x$enable_opencl" = xyes; then
         AC_SUBST([LIBCLC_LIBEXECDIR])
     fi
     # XXX: Use $enable_shared_pipe_drivers once converted to use static/shared pipe-drivers
     enable_gallium_loader=yes
     gallium_st="$gallium_st clover"
     if test "x$enable_opencl_icd" = xyes; then
         OPENCL_LIBNAME="MesaOpenCL"
@@ -1842,10 +2094,6 @@ AC_SUBST([XVMC_LIB_INSTALL_DIR])
 dnl
 dnl Gallium Tests
 dnl
 if test "x$enable_gallium_tests" = xyes; then
     # XXX: Use $enable_shared_pipe_drivers once converted to use static/shared pipe-drivers
     enable_gallium_loader=yes
 fi
 AM_CONDITIONAL(HAVE_GALLIUM_TESTS, test "x$enable_gallium_tests" = xyes)
 dnl Directory for VDPAU libs
@@ -1900,18 +2148,17 @@ gallium_require_llvm() {
 }
 gallium_require_drm_loader() {
     if test "x$enable_gallium_loader" = xyes; then
         if test "x$need_pci_id$have_pci_id" = xyesno; then
             AC_MSG_ERROR([Gallium drm loader requires libudev >= $LIBUDEV_REQUIRED or sysfs])
         fi
         enable_gallium_drm_loader=yes
     fi
     if test "x$enable_va" = xyes && test "x$7" != x; then
          GALLIUM_TARGET_DIRS="$GALLIUM_TARGET_DIRS $7"
     if test "x$need_pci_id$have_pci_id" = xyesno; then
         AC_MSG_ERROR([Gallium drm loader requires libudev >= $LIBUDEV_REQUIRED or sysfs])
     fi
 }
 dnl This is for Glamor. Skip this if OpenGL is disabled.
 require_egl_drm() {
     if test "x$enable_opengl" = xno; then
         return 0
     fi
     case "$with_egl_platforms" in
         *drm*)
             ;;
@@ -1933,7 +2180,7 @@ radeon_llvm_check() {
     if test "x$enable_gallium_llvm" != "xyes"; then
         AC_MSG_ERROR([--enable-gallium-llvm is required when building $1])
     fi
     llvm_check_version_for "3" "4" "2" $1
     llvm_check_version_for "3" "6" "0" $1
     if test true && $LLVM_CONFIG --targets-built | grep -iqvw $amdgpu_llvm_target_name ; then
         AC_MSG_ERROR([LLVM $amdgpu_llvm_target_name not enabled in your LLVM build.])
     fi
@@ -2021,11 +2268,16 @@ if test -n "$with_gallium_drivers"; then
             gallium_require_drm "vc4"
             gallium_require_drm_loader
             case "$host_cpu" in
                 i?86 | x86_64 | amd64)
                 USE_VC4_SIMULATOR=yes
                 ;;
             esac
             PKG_CHECK_MODULES([SIMPENROSE], [simpenrose],
                               [USE_VC4_SIMULATOR=yes;
                                DEFINES="$DEFINES -DUSE_VC4_SIMULATOR"],
                               [USE_VC4_SIMULATOR=no])
             ;;
         xvirgl)
             HAVE_GALLIUM_VIRGL=yes
             gallium_require_drm "virgl"
             gallium_require_drm_loader
             require_egl_drm "virgl"
             ;;
         *)
             AC_MSG_ERROR([Unknown Gallium driver: $driver])
@@ -2043,12 +2295,19 @@ dnl in LLVM_LIBS.
 if test "x$MESA_LLVM" != x0; then
     if ! $LLVM_CONFIG --libs ${LLVM_COMPONENTS} >/dev/null; then
        AC_MSG_ERROR([Calling ${LLVM_CONFIG} failed])
     fi
     LLVM_LIBS="`$LLVM_CONFIG --libs ${LLVM_COMPONENTS}`"
     dnl llvm-config may not give the right answer when llvm is a built as a
     dnl single shared library, so we must work the library name out for
     dnl ourselves.
     dnl (See https://llvm.org/bugs/show_bug.cgi?id=6823)
     if test "x$enable_llvm_shared_libs" = xyes; then
         dnl We can't use $LLVM_VERSION because it has 'svn' stripped out,
         LLVM_SO_NAME=LLVM-`$LLVM_CONFIG --version`
         AS_IF([test -f "$LLVM_LIBDIR/lib$LLVM_SO_NAME.so"], [llvm_have_one_so=yes])
         AS_IF([test -f "$LLVM_LIBDIR/lib$LLVM_SO_NAME.$IMP_LIB_EXT"], [llvm_have_one_so=yes])
         if test "x$llvm_have_one_so" = xyes; then
             dnl LLVM was built using auto*, so there is only one shared object.
@@ -2056,7 +2315,7 @@ if test "x$MESA_LLVM" != x0; then
         else
             dnl If LLVM was built with CMake, there will be one shared object per
             dnl component.
             AS_IF([test ! -f "$LLVM_LIBDIR/libLLVMTarget.so"],
             AS_IF([test ! -f "$LLVM_LIBDIR/libLLVMTarget.$IMP_LIB_EXT"],
                     [AC_MSG_ERROR([Could not find llvm shared libraries:
 	Please make sure you have built llvm with the --enable-shared option
 	and that your llvm libraries are installed in $LLVM_LIBDIR
@@ -2094,26 +2353,19 @@ AM_CONDITIONAL(HAVE_GALLIUM_FREEDRENO, test "x$HAVE_GALLIUM_FREEDRENO" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_SOFTPIPE, test "x$HAVE_GALLIUM_SOFTPIPE" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_LLVMPIPE, test "x$HAVE_GALLIUM_LLVMPIPE" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_VC4, test "x$HAVE_GALLIUM_VC4" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_VIRGL, test "x$HAVE_GALLIUM_VIRGL" = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_STATIC_TARGETS, test "x$enable_shared_pipe_drivers" = xno)
 # NOTE: anything using xcb or other client side libs ends up in separate
 #       _CLIENT variables.  The pipe loader is built in two variants,
 #       one that is standalone and does not link any x client libs (for
 #       use by XA tracker in particular, but could be used in any case
 #       where communication with xserver is not desired).
 if test "x$enable_gallium_loader" = xyes; then
     if test "x$enable_dri" = xyes; then
         GALLIUM_PIPE_LOADER_DEFINES="$GALLIUM_PIPE_LOADER_DEFINES -DHAVE_PIPE_LOADER_DRI"
     fi
     if test "x$enable_gallium_drm_loader" = xyes; then
         GALLIUM_PIPE_LOADER_DEFINES="$GALLIUM_PIPE_LOADER_DEFINES -DHAVE_PIPE_LOADER_DRM"
     fi
     AC_SUBST([GALLIUM_PIPE_LOADER_DEFINES])
 if test "x$enable_dri" = xyes; then
     GALLIUM_PIPE_LOADER_DEFINES="$GALLIUM_PIPE_LOADER_DEFINES -DHAVE_PIPE_LOADER_DRI"
 fi
 if test "x$have_drisw_kms" = xyes; then
     GALLIUM_PIPE_LOADER_DEFINES="$GALLIUM_PIPE_LOADER_DEFINES -DHAVE_PIPE_LOADER_KMS"
 fi
 AC_SUBST([GALLIUM_PIPE_LOADER_DEFINES])
 AM_CONDITIONAL(HAVE_I915_DRI, test x$HAVE_I915_DRI = xyes)
 AM_CONDITIONAL(HAVE_I965_DRI, test x$HAVE_I965_DRI = xyes)
 AM_CONDITIONAL(HAVE_NOUVEAU_DRI, test x$HAVE_NOUVEAU_DRI = xyes)
@@ -2127,8 +2379,6 @@ AM_CONDITIONAL(NEED_RADEON_DRM_WINSYS, test "x$HAVE_GALLIUM_R300" = xyes -o \
 AM_CONDITIONAL(NEED_WINSYS_XLIB, test "x$NEED_WINSYS_XLIB" = xyes)
 AM_CONDITIONAL(NEED_RADEON_LLVM, test x$NEED_RADEON_LLVM = xyes)
 AM_CONDITIONAL(USE_R600_LLVM_COMPILER, test x$USE_R600_LLVM_COMPILER = xyes)
 AM_CONDITIONAL(HAVE_LOADER_GALLIUM, test x$enable_gallium_loader = xyes)
 AM_CONDITIONAL(HAVE_DRM_LOADER_GALLIUM, test x$enable_gallium_drm_loader = xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_COMPUTE, test x$enable_opencl = xyes)
 AM_CONDITIONAL(HAVE_MESA_LLVM, test x$MESA_LLVM = x1)
 AM_CONDITIONAL(USE_VC4_SIMULATOR, test x$USE_VC4_SIMULATOR = xyes)
@@ -2188,6 +2438,7 @@ CXXFLAGS="$CXXFLAGS $USER_CXXFLAGS"
 dnl Substitute the config
 AC_CONFIG_FILES([Makefile
 		src/Makefile
 		src/compiler/Makefile
 		src/egl/Makefile
 		src/egl/main/egl.pc
 		src/egl/wayland/wayland-drm/Makefile
@@ -2197,6 +2448,7 @@ AC_CONFIG_FILES([Makefile
 		src/gallium/auxiliary/Makefile
 		src/gallium/auxiliary/pipe-loader/Makefile
 		src/gallium/drivers/freedreno/Makefile
 		src/gallium/drivers/ddebug/Makefile
 		src/gallium/drivers/i915/Makefile
 		src/gallium/drivers/ilo/Makefile
 		src/gallium/drivers/llvmpipe/Makefile
@@ -2211,6 +2463,7 @@ AC_CONFIG_FILES([Makefile
 		src/gallium/drivers/svga/Makefile
 		src/gallium/drivers/trace/Makefile
 		src/gallium/drivers/vc4/Makefile
 		src/gallium/drivers/virgl/Makefile
 		src/gallium/state_trackers/clover/Makefile
 		src/gallium/state_trackers/dri/Makefile
 		src/gallium/state_trackers/glx/xlib/Makefile
@@ -2251,9 +2504,10 @@ AC_CONFIG_FILES([Makefile
 		src/gallium/winsys/sw/wrapper/Makefile
 		src/gallium/winsys/sw/xlib/Makefile
 		src/gallium/winsys/vc4/drm/Makefile
 		src/gallium/winsys/virgl/drm/Makefile
 		src/gallium/winsys/virgl/vtest/Makefile
 		src/gbm/Makefile
 		src/gbm/main/gbm.pc
 		src/glsl/Makefile
 		src/glx/Makefile
 		src/glx/apple/Makefile
 		src/glx/tests/Makefile
@@ -2344,6 +2598,9 @@ if test "$enable_egl" = yes; then
     if test "x$HAVE_EGL_DRIVER_DRI2" != "x"; then
         egl_drivers="$egl_drivers builtin:egl_dri2"
     fi
     if test "x$HAVE_EGL_DRIVER_DRI3" != "x"; then
         egl_drivers="$egl_drivers builtin:egl_dri3"
     fi
     echo "        EGL drivers:    $egl_drivers"
 fi
@@ -2359,11 +2616,18 @@ fi
 echo ""
 if test -n "$with_gallium_drivers"; then
     echo "        Gallium:         yes"
     echo "        Gallium drivers: $gallium_drivers"
     echo "        Gallium st:      $gallium_st"
 else
     echo "        Gallium:         no"
 fi
 dnl Shader cache
 echo ""
 echo "        Shader cache:    $enable_shader_cache"
 if test "x$enable_shader_cache" = "xyes"; then
     echo "        With SHA1 from:  $with_sha1"
 fi
 dnl Libraries
 echo ""
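
The SHA-1 probing added to configure.ac above follows a first-match-wins pattern. Each backend is probed in a fixed order, the first one found becomes the default when `--with-sha1` is unset, and an explicitly requested backend that was not found is a hard error. A minimal shell sketch of that selection logic, outside of autoconf, might look like the following. The function names `pick_sha1` and `shader_cache_default`, and the idea of passing the detected backends as arguments, are illustrative assumptions and not part of the Mesa build system.

```shell
#!/bin/sh
# Hypothetical helper (not in configure.ac): choose a SHA-1 backend.
#   $1    = value of --with-sha1 ("" if the user gave none)
#   $2... = backends that were detected, in probe order
# Prints the chosen backend (empty if none) and returns 1 if the
# requested backend was not detected.
pick_sha1() {
    requested=$1
    shift
    if [ -z "$requested" ]; then
        # No explicit request: the first detected backend wins.
        echo "${1:-}"
        return 0
    fi
    for found in "$@"; do
        if [ "$found" = "$requested" ]; then
            echo "$requested"
            return 0
        fi
    done
    echo "$requested requested but not found" >&2
    return 1
}

# Hypothetical helper: the shader cache defaults to "yes" only when a
# SHA-1 backend was selected, mirroring the AC_ARG_ENABLE default.
shader_cache_default() {
    if [ -n "$1" ]; then echo yes; else echo no; fi
}
```

For instance, `pick_sha1 "" libmd libnettle` selects `libmd`, while `pick_sha1 libgcrypt libmd` fails, which corresponds to the `AC_MSG_ERROR` branches in the hunk. The real script additionally special-cases libcrypto by falling back to the flags reported by pkg-config.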

docs/GL3.txt

@@ -92,50 +92,50 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
   GL_ARB_vertex_type_2_10_10_10_rev                     DONE ()
 GL 4.0, GLSL 4.00 --- all DONE: nvc0, radeonsi
 GL 4.0, GLSL 4.00 --- all DONE: nvc0, r600, radeonsi
   GL_ARB_draw_buffers_blend                            DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_draw_indirect                                 DONE (i965, r600, llvmpipe, softpipe)
   GL_ARB_draw_buffers_blend                            DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_draw_indirect                                 DONE (i965, llvmpipe, softpipe)
   GL_ARB_gpu_shader5                                   DONE (i965)
   - 'precise' qualifier                                DONE
   - Dynamically uniform sampler array indices          DONE (r600, softpipe)
   - Dynamically uniform UBO array indices              DONE (r600)
   - Dynamically uniform sampler array indices          DONE (softpipe)
   - Dynamically uniform UBO array indices              DONE ()
   - Implicit signed -> unsigned conversions            DONE
   - Fused multiply-add                                 DONE ()
   - Packing/bitfield/conversion functions              DONE (r600, softpipe)
   - Enhanced textureGather                             DONE (r600, softpipe)
   - Geometry shader instancing                         DONE (r600, llvmpipe, softpipe)
   - Packing/bitfield/conversion functions              DONE (softpipe)
   - Enhanced textureGather                             DONE (softpipe)
   - Geometry shader instancing                         DONE (llvmpipe, softpipe)
   - Geometry shader multiple streams                   DONE ()
   - Enhanced per-sample shading                        DONE (r600)
   - Interpolation functions                            DONE (r600)
   - Enhanced per-sample shading                        DONE ()
   - Interpolation functions                            DONE ()
   - New overload resolution rules                      DONE
   GL_ARB_gpu_shader_fp64                               DONE (llvmpipe, softpipe)
   GL_ARB_sample_shading                                DONE (i965, nv50, r600)
   GL_ARB_shader_subroutine                             DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_tessellation_shader                           DONE ()
   GL_ARB_texture_buffer_object_rgb32                   DONE (i965, r600, llvmpipe, softpipe)
   GL_ARB_texture_cube_map_array                        DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_texture_gather                                DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_texture_query_lod                             DONE (i965, nv50, r600)
   GL_ARB_transform_feedback2                           DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_transform_feedback3                           DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_sample_shading                                DONE (i965, nv50)
   GL_ARB_shader_subroutine                             DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_tessellation_shader                           DONE (i965)
   GL_ARB_texture_buffer_object_rgb32                   DONE (i965, llvmpipe, softpipe)
   GL_ARB_texture_cube_map_array                        DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_texture_gather                                DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_texture_query_lod                             DONE (i965, nv50, softpipe)
   GL_ARB_transform_feedback2                           DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_transform_feedback3                           DONE (i965, nv50, llvmpipe, softpipe)
 GL 4.1, GLSL 4.10 --- all DONE: nvc0, radeonsi
 GL 4.1, GLSL 4.10 --- all DONE: nvc0, r600, radeonsi
   GL_ARB_ES2_compatibility                             DONE (i965, nv50, r600, llvmpipe, softpipe)
   GL_ARB_ES2_compatibility                             DONE (i965, nv50, llvmpipe, softpipe)
   GL_ARB_get_program_binary                            DONE (0 binary formats)
   GL_ARB_separate_shader_objects                       DONE (all drivers)
   GL_ARB_shader_precision                              DONE (all drivers that support GLSL 4.10)
   GL_ARB_vertex_attrib_64bit                           DONE (llvmpipe, softpipe)
   GL_ARB_viewport_array                                DONE (i965, nv50, r600, llvmpipe)
   GL_ARB_viewport_array                                DONE (i965, nv50, llvmpipe, softpipe)
 GL 4.2, GLSL 4.20:
   GL_ARB_texture_compression_bptc                      DONE (i965, nvc0, r600, radeonsi)
   GL_ARB_compressed_texture_pixel_storage              DONE (all drivers)
   GL_ARB_shader_atomic_counters                        DONE (i965)
   GL_ARB_shader_atomic_counters                        DONE (i965, nvc0)
   GL_ARB_texture_storage                               DONE (all drivers)
   GL_ARB_transform_feedback_instanced                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_base_instance                                 DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
@@ -149,27 +149,27 @@ GL 4.2, GLSL 4.20:
 GL 4.3, GLSL 4.30:
   GL_ARB_arrays_of_arrays                              started (Timothy)
   GL_ARB_arrays_of_arrays                              DONE (all drivers that support GLSL 1.30)
   GL_ARB_ES3_compatibility                             DONE (all drivers that support GLSL 3.30)
   GL_ARB_clear_buffer_object                           DONE (all drivers)
   GL_ARB_compute_shader                                in progress (jljusten)
   GL_ARB_copy_image                                    DONE (i965) (gallium - in progress, VMware)
   GL_ARB_compute_shader                                DONE (i965)
   GL_ARB_copy_image                                    DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_KHR_debug                                         DONE (all drivers)
   GL_ARB_explicit_uniform_location                     DONE (all drivers that support GLSL)
   GL_ARB_fragment_layer_viewport                       DONE (nv50, nvc0, r600, radeonsi, llvmpipe)
   GL_ARB_fragment_layer_viewport                       DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe)
   GL_ARB_framebuffer_no_attachments                    DONE (i965)
   GL_ARB_internalformat_query2                         not started
   GL_ARB_internalformat_query2                         in progress (elima)
   GL_ARB_invalidate_subdata                            DONE (all drivers)
   GL_ARB_multi_draw_indirect                           DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_program_interface_query                       DONE (all drivers)
   GL_ARB_robust_buffer_access_behavior                 not started
   GL_ARB_shader_image_size                             DONE (i965)
   GL_ARB_shader_storage_buffer_object                  in progress (Iago Toral, Samuel Iglesias)
   GL_ARB_shader_storage_buffer_object                  DONE (i965, nvc0)
   GL_ARB_stencil_texturing                             DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_texture_buffer_range                          DONE (nv50, nvc0, i965, r600, radeonsi, llvmpipe)
   GL_ARB_texture_query_levels                          DONE (all drivers that support GLSL 1.30)
   GL_ARB_texture_storage_multisample                   DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_texture_view                                  DONE (i965, nv50, nvc0, llvmpipe, softpipe)
   GL_ARB_texture_view                                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_vertex_attrib_binding                         DONE (all drivers)
@@ -177,10 +177,16 @@ GL 4.4, GLSL 4.40:
   GL_MAX_VERTEX_ATTRIB_STRIDE                          DONE (all drivers)
   GL_ARB_buffer_storage                                DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_clear_texture                                 DONE (i965) (gallium - in progress, VMware)
   GL_ARB_enhanced_layouts                              not started
   GL_ARB_clear_texture                                 DONE (i965, nv50, nvc0)
   GL_ARB_enhanced_layouts                              in progress (Timothy)
   - compile-time constant expressions                  DONE
   - explicit byte offsets for blocks                   in progress
   - forced alignment within blocks                     in progress
   - specified vec4-slot component numbers              in progress
   - specified transform/feedback layout                in progress
   - input/output block locations                       DONE
   GL_ARB_multi_bind                                    DONE (all drivers)
   GL_ARB_query_buffer_object                           not started
   GL_ARB_query_buffer_object                           DONE (nvc0)
   GL_ARB_texture_mirror_clamp_to_edge                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_texture_stencil8                              DONE (nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_vertex_type_10f_11f_11f_rev                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
@@ -194,25 +200,25 @@ GL 4.5, GLSL 4.50:
   GL_ARB_derivative_control                            DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_direct_state_access                           DONE (all drivers)
   GL_ARB_get_texture_sub_image                         DONE (all drivers)
   GL_ARB_shader_texture_image_samples                  not started
   GL_ARB_texture_barrier                               DONE (nv50, nvc0, r600, radeonsi)
   GL_KHR_context_flush_control                         DONE (all - but needs GLX/EXT extension to be useful)
   GL_ARB_shader_texture_image_samples                  DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_texture_barrier                               DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_KHR_context_flush_control                         DONE (all - but needs GLX/EGL extension to be useful)
   GL_KHR_robust_buffer_access_behavior                 not started
   GL_KHR_robustness                                    90% done (the ARB variant)
   GL_EXT_shader_integer_mix                            DONE (all drivers that support GLSL)
 These are the extensions cherry-picked to make GLES 3.1
 GLES3.1, GLSL ES 3.1
   GL_ARB_arrays_of_arrays                              started (Timothy)
   GL_ARB_compute_shader                                in progress (jljusten)
   GL_ARB_arrays_of_arrays                              DONE (all drivers that support GLSL 1.30)
   GL_ARB_compute_shader                                DONE (i965)
   GL_ARB_draw_indirect                                 DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_explicit_uniform_location                     DONE (all drivers that support GLSL)
   GL_ARB_framebuffer_no_attachments                    DONE (i965)
   GL_ARB_program_interface_query                       DONE (all drivers)
   GL_ARB_shader_atomic_counters                        DONE (i965)
   GL_ARB_shader_atomic_counters                        DONE (i965, nvc0)
   GL_ARB_shader_image_load_store                       DONE (i965)
   GL_ARB_shader_image_size                             DONE (i965)
   GL_ARB_shader_storage_buffer_object                  in progress (Iago Toral, Samuel Iglesias)
   GL_ARB_shader_storage_buffer_object                  DONE (i965, nvc0)
   GL_ARB_shading_language_packing                      DONE (all drivers)
   GL_ARB_separate_shader_objects                       DONE (all drivers)
   GL_ARB_stencil_texturing                             DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
@@ -223,10 +229,35 @@ GLES3.1, GLSL ES 3.1
   GS5 Packing/bitfield/conversion functions            DONE (i965, nvc0, r600, radeonsi)
   GL_EXT_shader_integer_mix                            DONE (all drivers that support GLSL)
   Additional functions not covered above:
       glMemoryBarrierByRegion
       glGetTexLevelParameter[fi]v - needs updates to restrict to GLES enums
       glGetBooleani_v - needs updates to restrict to GLES enums
   Additional functionality not covered above:
       glMemoryBarrierByRegion                          DONE
       glGetTexLevelParameter[fi]v - needs updates      DONE
       glGetBooleani_v - restrict to GLES enums
       gl_HelperInvocation support                      DONE (i965, nvc0, r600)
 GLES3.2, GLSL ES 3.2
   GL_EXT_color_buffer_float                            DONE (all drivers)
   GL_KHR_blend_equation_advanced                       not started
   GL_KHR_debug                                         DONE (all drivers)
   GL_KHR_robustness                                    90% done (the ARB variant)
   GL_KHR_texture_compression_astc_ldr                  DONE (i965/gen9+)
   GL_OES_copy_image                                    not started (based on GL_ARB_copy_image, which is done for some drivers)
   GL_OES_draw_buffers_indexed                          not started
   GL_OES_draw_elements_base_vertex                     DONE (all drivers)
   GL_OES_geometry_shader                               started (Marta)
   GL_OES_gpu_shader5                                   not started (based on parts of GL_ARB_gpu_shader5, which is done for some drivers)
   GL_OES_primitive_bounding box                        not started
   GL_OES_sample_shading                                not started (based on parts of GL_ARB_sample_shading, which is done for some drivers)
   GL_OES_sample_variables                              not started (based on parts of GL_ARB_sample_shading, which is done for some drivers)
   GL_OES_shader_image_atomic                           not started (based on parts of GL_ARB_shader_image_load_store, which is done for some drivers)
   GL_OES_shader_io_blocks                              not started (based on parts of GLSL 1.50, which is done)
   GL_OES_shader_multisample_interpolation              not started (based on parts of GL_ARB_gpu_shader5, which is done)
   GL_OES_tessellation_shader                           not started (based on GL_ARB_tessellation_shader, which is done for some drivers)
   GL_OES_texture_border_clamp                          not started (based on GL_ARB_texture_border_clamp, which is done)
   GL_OES_texture_buffer                                not started (based on GL_ARB_texture_buffer_object, GL_ARB_texture_buffer_range, and GL_ARB_texture_buffer_object_rgb32 that are all done)
   GL_OES_texture_cube_map_array                        not started (based on GL_ARB_texture_cube_map_array, which is done for all drivers)
   GL_OES_texture_stencil8                              DONE (all drivers that support GL_ARB_texture_stencil8)
   GL_OES_texture_storage_multisample_2d_array          DONE (all drivers that support GL_ARB_texture_multisample)
 More info about these features and the work involved can be found at
 http://dri.freedesktop.org/wiki/MissingFunctionality

docs/README.UVD

@@ -2,8 +2,8 @@ The software may implement third party technologies (e.g. third party
 libraries) that are not licensed to you by AMD and for which you may need
 to obtain licenses from other parties.  Unless explicitly stated otherwise,
 these third party technologies are not licensed hereunder.  Such third
 party technologies include, but are not limited, to H.264, MPEG-2, MPEG-4,
 AVC, and VC-1.
 party technologies include, but are not limited, to H.264, H.265, HEVC, MPEG-2,
 MPEG-4, AVC, and VC-1.
 For MPEG-2 Encoding Products ANY USE OF THIS PRODUCT IN ANY MANNER OTHER
 THAN PERSONAL USE THAT COMPLIES WITH THE MPEG-2 STANDARD FOR ENCODING VIDEO

									
docs/autoconf.html

@@ -87,6 +87,13 @@ created in a <code>lib64</code> directory at the top of the Mesa source
 tree.</p>
 </dd>
 <dt><code>--sysconfdir=DIR</code></dt>
 <dd><p>This option specifies the directory where the configuration
 files will be installed. The default is <code>${prefix}/etc</code>.
 Currently there's only one config file provided when dri drivers are
 enabled - it's <code>drirc</code>.</p>
 </dd>
 <dt><code>--enable-static, --disable-shared</code></dt>
 <dd><p>By default, Mesa
 will build shared libraries. Either of these options will force static
@@ -217,7 +224,7 @@ GLX.
 <dt><code>--with-expat=DIR</code>
 <dd><p><strong>DEPRECATED</strong>, use <code>PKG_CONFIG_PATH</code> instead.</p>
 <p>The DRI-enabled libGL uses expat to
 parse the DRI configuration files in <code>/etc/drirc</code> and
 parse the DRI configuration files in <code>${sysconfdir}/drirc</code> and
 <code>~/.drirc</code>. This option allows a specific expat installation
 to be used. For example, <code>--with-expat=/usr/local</code> will
 search for expat headers and libraries in <code>/usr/local/include</code>

									
docs/contents.html

@@ -90,14 +90,14 @@
 <li><a href="http://www.opengl.org" target="_parent">OpenGL website</a>
 <li><a href="http://dri.freedesktop.org" target="_parent">DRI website</a>
 <li><a href="http://www.freedesktop.org" target="_parent">freedesktop.org</a>
 <li><a href="http://planet.freedesktop.org" target="_parent">Developer blogs</a>
 </ul>
 <b>Hosted by:</b>
 <br>
 <blockquote>
 <a href="http://sourceforge.net"
 target="_parent"><img src="http://sourceforge.net/sflogo.php?group_id=3&amp;type=1"
 width="88" height="31" align="bottom" alt="Sourceforge.net" border="0"></a>
 target="_parent">sourceforge.net</a>
 </blockquote>
 </body>

									
										45

docs/envvars.html
									
												
				@@ -91,11 +91,20 @@ This is only valid for versions &gt;= 3.0.

				<li> Mesa may not really implement all the features of the given version.

				(for developers only)

				</ul>

				<li>MESA_GLES_VERSION_OVERRIDE - changes the value returned by

				glGetString(GL_VERSION) for OpenGL ES.

				<ul>

				<li> The format should be MAJOR.MINOR

				<li> Examples: 2.0, 3.0, 3.1

				<li> Mesa may not really implement all the features of the given version.

				(for developers only)

				</ul>

				<li>MESA_GLSL_VERSION_OVERRIDE - changes the value returned by

				glGetString(GL_SHADING_LANGUAGE_VERSION). Valid values are integers, such as

				"130".  Mesa will not really implement all the features of the given language version

				if it's higher than what's normally reported. (for developers only)

				<li>MESA_GLSL - <a href="shading.html#envvars">shading language compiler options</a>

				<li>MESA_NO_MINMAX_CACHE - when set, the minmax index cache is globally disabled.

				</ul>
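The version-override variables in this hunk have simple formats: `MESA_GLES_VERSION_OVERRIDE` takes `MAJOR.MINOR`, and `MESA_GLSL_VERSION_OVERRIDE` takes an integer such as `130`. The sketch below builds an environment fragment that respects those formats. The function name `version_override_env` is hypothetical, and the validation is an assumption based only on the formats stated above, not on how Mesa parses the values.

```python
import re


def version_override_env(gles_version=None, glsl_version=None):
    """Build environment entries for the documented override variables.

    Hypothetical helper; it only checks the formats described in the docs.
    """
    env = {}
    if gles_version is not None:
        # Documented format: MAJOR.MINOR, for example 2.0, 3.0, 3.1.
        if not re.fullmatch(r"\d+\.\d+", gles_version):
            raise ValueError("expected MAJOR.MINOR, got %r" % (gles_version,))
        env["MESA_GLES_VERSION_OVERRIDE"] = gles_version
    if glsl_version is not None:
        # Documented format: an integer such as "130".
        env["MESA_GLSL_VERSION_OVERRIDE"] = str(int(glsl_version))
    return env
```

The result can be merged into a child process environment, for instance `subprocess.run(cmd, env={**os.environ, **version_override_env("3.1")})`. As the docs note, the override changes only the reported version, so it is meant for developers.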

				@@ -153,6 +162,7 @@ See the <a href="xlibdriver.html">Xlib software driver page</a> for details.

				   <li>no16 - suppress generation of 16-wide fragment shaders. useful for debugging broken shaders</li>

				   <li>blorp - emit messages about the blorp operations (blits &amp; clears)</li>

				   <li>nodualobj - suppress generation of dual-object geometry shader code</li>

				   <li>optimizer - dump shader assembly to files at each optimization pass and iteration that make progress</li>

				</ul>

				</ul>

				@@ -178,6 +188,14 @@ Mesa EGL supports different sets of environment variables.  See the

				<li>GALLIUM_HUD - draws various information on the screen, like framerate,

				    cpu load, driver statistics, performance counters, etc.

				    Set GALLIUM_HUD=help and run e.g. glxgears for more info.

				<li>GALLIUM_HUD_PERIOD - sets the hud update rate in seconds (float). Use zero

				    to update every frame. The default period is 1/2 second.

				<li>GALLIUM_HUD_VISIBLE - control default visibility, defaults to true.

				<li>GALLIUM_HUD_TOGGLE_SIGNAL - toggle visibility via user specified signal.

				    Especially useful to toggle hud at specific points of application and

				    disable for unencumbered viewing the rest of the time. For example, set

				    GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_TOGGLE_SIGNAL to 10 (SIGUSR1).

				    Use kill -10 <pid> to toggle the hud as desired.

				<li>GALLIUM_LOG_FILE - specifies a file for logging all errors, warnings, etc.

				    rather than stderr.

				<li>GALLIUM_PRINT_OPTIONS - if non-zero, print all the Gallium environment

				@@ -214,7 +232,7 @@ See src/mesa/state_tracker/st_debug.c for other options.

				<li>LP_PERF - a comma-separated list of options to selectively no-op various

				    parts of the driver.  See the source code for details.

				<li>LP_NUM_THREADS - an integer indicating how many threads to use for rendering.

				    Zero turns of threading completely.  The default value is the number of CPU

				    Zero turns off threading completely.  The default value is the number of CPU

				    cores present.

				</ul>

				@@ -229,6 +247,31 @@ for details.

				</ul>

				<h3>VA-API state tracker environment variables</h3>

				<ul>

				<li>VAAPI_MPEG4_ENABLED - enable MPEG4 for VA-API, disabled by default.

				</ul>

				<h3>VC4 driver environment variables</h3>

				<ul>

				<li>VC4_DEBUG - a comma-separated list of named flags, which do various things:

				<ul>

				   <li>cl - dump command list during creation</li>

				   <li>qpu - dump generated QPU instructions</li>

				   <li>qir - dump QPU IR during program compile</li>

				   <li>nir - dump NIR during program compile</li>

				   <li>tgsi - dump TGSI during program compile</li>

				   <li>shaderdb - dump program compile information for shader-db analysis</li>

				   <li>perf - print during performance-related events</li>

				   <li>norast - skip actual hardware execution of commands</li>

				   <li>always_flush - flush after each draw call</li>

				   <li>always_sync - wait for finish after each flush</li>

				   <li>dump - write a GPU command stream trace file (VC4 simulator only)</li>

				</ul>

				</ul>

				<p>

				Other Gallium drivers have their own environment variables.  These may change

				frequently so the source code should be consulted for details.
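The HUD workflow described in this file's diff (start with the HUD hidden, then toggle it with a signal) can be sketched in Python as follows. The helper names `hud_env` and `toggle_hud` are hypothetical, `fps` is used only as an example HUD specification, and the sketch assumes a POSIX system where `signal.SIGUSR1` exists. Its numeric value is platform-dependent, so it is looked up rather than hard-coded as 10.

```python
import os
import signal


def hud_env(hud_spec="fps", toggle_signal=signal.SIGUSR1, period_seconds=0.5):
    """Environment entries that start the HUD hidden and allow signal toggling.

    Hypothetical helper; variable names are taken from the documentation above.
    """
    return {
        "GALLIUM_HUD": hud_spec,
        # Update period in seconds (float); the documented default is 1/2 second.
        "GALLIUM_HUD_PERIOD": str(period_seconds),
        # Start hidden so the HUD appears only when toggled.
        "GALLIUM_HUD_VISIBLE": "false",
        # Mesa expects the signal number, so pass its integer value.
        "GALLIUM_HUD_TOGGLE_SIGNAL": str(int(toggle_signal)),
    }


def toggle_hud(pid, toggle_signal=signal.SIGUSR1):
    """Send the toggle signal to a running process (equivalent to kill -<n> <pid>)."""
    os.kill(pid, toggle_signal)
```

A launcher would merge `hud_env()` into the application's environment and call `toggle_hud(process.pid)` at the points of interest.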

									
										125

docs/index.html
									
												
				@@ -16,25 +16,142 @@

				<h1>News</h1>

				<h2>August 22 2015</h2>

				<h2>February 10, 2016</h2>

				<p>

				<a href="relnotes/11.1.2.html">Mesa 11.1.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>January 22, 2016</h2>

				<p>

				<a href="relnotes/11.0.9.html">Mesa 11.0.9</a> is released.

				This is a bug-fix release.

				<br>

				NOTE: It is anticipated that 11.0.9 will be the final release in the 11.0

				series. Users of 11.0 are encouraged to migrate to the 11.1 series in order

				to obtain future fixes.

				</p>

				<h2>January 13, 2016</h2>

				<p>

				<a href="relnotes/11.1.1.html">Mesa 11.1.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>December 21, 2015</h2>

				<p>

				<a href="relnotes/11.0.8.html">Mesa 11.0.8</a> is released.

				This is a bug-fix release.

				</p>

				<h2>December 15, 2015</h2>

				<p>

				<a href="relnotes/11.1.0.html">Mesa 11.1.0</a> is released.  This is a new

				development release.  See the release notes for more information about

				the release.

				</p>

				<h2>December 9, 2015</h2>

				<p>

				<a href="relnotes/11.0.7.html">Mesa 11.0.7</a> is released.

				This is a bug-fix release.

				</p>

				<p>

				Mesa demos 8.3.0 is also released.

				See the <a href="http://lists.freedesktop.org/archives/mesa-announce/2015-December/000191.html">announcement</a> for more information about the release.

				You can download it from <a href="ftp://ftp.freedesktop.org/pub/mesa/demos/8.3.0/">ftp.freedesktop.org/pub/mesa/demos/8.3.0/</a>.

				</p>

				<h2>November 21, 2015</h2>

				<p>

				<a href="relnotes/11.0.6.html">Mesa 11.0.6</a> is released.

				This is a bug-fix release.

				</p>

				<h2>November 11, 2015</h2>

				<p>

				<a href="relnotes/11.0.5.html">Mesa 11.0.5</a> is released.

				This is a bug-fix release.

				</p>

				<h2>October 24, 2015</h2>

				<p>

				<a href="relnotes/11.0.4.html">Mesa 11.0.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>October 10, 2015</h2>

				<p>

				<a href="relnotes/11.0.3.html">Mesa 11.0.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>October 3, 2015</h2>

				<p>

				<a href="relnotes/10.6.9.html">Mesa 10.6.9</a> is released.

				This is a bug-fix release.

				<br>

				NOTE: It is anticipated that 10.6.9 will be the final release in the 10.6

				series. Users of 10.6 are encouraged to migrate to the 11.0 series in order

				to obtain future fixes.

				</p>

				<h2>September 28, 2015</h2>

				<p>

				<a href="relnotes/11.0.2.html">Mesa 11.0.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>September 26, 2015</h2>

				<p>

				<a href="relnotes/11.0.1.html">Mesa 11.0.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>September 20, 2015</h2>

				<p>

				<a href="relnotes/10.6.8.html">Mesa 10.6.8</a> is released.

				This is a bug-fix release.

				</p>

				<h2>September 12, 2015</h2>

				<p>

				<a href="relnotes/11.0.0.html">Mesa 11.0.0</a> is released.  This is a new

				development release.  See the release notes for more information about

				the release.

				</p>

				<h2>September 10, 2015</h2>

				<p>

				<a href="relnotes/10.6.7.html">Mesa 10.6.7</a> is released.

				This is a bug-fix release.

				</p>

				<h2>September 4, 2015</h2>

				<p>

				<a href="relnotes/10.6.6.html">Mesa 10.6.6</a> is released.

				This is a bug-fix release.

				</p>

				<h2>August 22, 2015</h2>

				<p>

				<a href="relnotes/10.6.5.html">Mesa 10.6.5</a> is released.

				This is a bug-fix release.

				</p>

				<h2>August 11 2015</h2>

				<h2>August 11, 2015</h2>

				<p>

				<a href="relnotes/10.6.4.html">Mesa 10.6.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>July 26 2015</h2>

				<h2>July 26, 2015</h2>

				<p>

				<a href="relnotes/10.6.3.html">Mesa 10.6.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>July 11 2015</h2>

				<h2>July 11, 2015</h2>

				<p>

				<a href="relnotes/10.6.2.html">Mesa 10.6.2</a> is released.

				This is a bug-fix release.

									
										5

docs/install.html
									
												
				@@ -39,7 +39,7 @@ Version 2.6.4 or later should work.

				</li>

				<br>

				<li><a href="http://www.makotemplates.org/">Python Mako module</a> -

				Python Mako module is required. Version 0.7.3 or later should work.

				Python Mako module is required. Version 0.3.4 or later should work.

				</li>

				</br>

				<li><a href="http://www.scons.org/">SCons</a> is required for building on

				@@ -58,6 +58,9 @@ On Windows with MinGW, install flex and bison with:

				For MSVC on Windows, install

				<a href="http://winflexbison.sourceforge.net/">Win flex-bison</a>.

				</li>

				<br>

				<li>For building on Windows, Microsoft Visual Studio 2013 or later is required.

				</li>

				</ul>

									
										17

docs/relnotes.html
									
												
				@@ -21,6 +21,23 @@ The release notes summarize what's new or changed in each Mesa release.

				</p>

				<ul>

				<li><a href="relnotes/11.1.2.html">11.1.2 release notes</a>

				<li><a href="relnotes/11.0.9.html">11.0.9 release notes</a>

				<li><a href="relnotes/11.1.1.html">11.1.1 release notes</a>

				<li><a href="relnotes/11.0.8.html">11.0.8 release notes</a>

				<li><a href="relnotes/11.1.0.html">11.1.0 release notes</a>

				<li><a href="relnotes/11.0.7.html">11.0.7 release notes</a>

				<li><a href="relnotes/11.0.6.html">11.0.6 release notes</a>

				<li><a href="relnotes/11.0.5.html">11.0.5 release notes</a>

				<li><a href="relnotes/11.0.4.html">11.0.4 release notes</a>

				<li><a href="relnotes/11.0.3.html">11.0.3 release notes</a>

				<li><a href="relnotes/10.6.9.html">10.6.9 release notes</a>

				<li><a href="relnotes/11.0.2.html">11.0.2 release notes</a>

				<li><a href="relnotes/11.0.1.html">11.0.1 release notes</a>

				<li><a href="relnotes/10.6.8.html">10.6.8 release notes</a>

				<li><a href="relnotes/11.0.0.html">11.0.0 release notes</a>

				<li><a href="relnotes/10.6.7.html">10.6.7 release notes</a>

				<li><a href="relnotes/10.6.6.html">10.6.6 release notes</a>

				<li><a href="relnotes/10.6.5.html">10.6.5 release notes</a>

				<li><a href="relnotes/10.6.4.html">10.6.4 release notes</a>

				<li><a href="relnotes/10.6.3.html">10.6.3 release notes</a>

									
										164

docs/relnotes/10.6.6.html
									
										Normal file
									
												
				@@ -0,0 +1,164 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.6.6 Release Notes / September 04, 2015</h1>

				<p>

				Mesa 10.6.6 is a bug fix release which fixes bugs found since the 10.6.5 release.

				</p>

				<p>

				Mesa 10.6.6 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				416517aa9df4791f97d34451a9e4da33c966afcd18c115c5769b92b15b018ef5  mesa-10.6.6.tar.gz

				570f2154b7340ff5db61ff103bc6e85165b8958798b78a50fa2df488e98e5778  mesa-10.6.6.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84677">Bug 84677</a> - Triangle disappears with glPolygonMode GL_LINE</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90734">Bug 90734</a> - glBufferSubData is corrupting data when buffer is &gt; 32k</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90748">Bug 90748</a> - [BDW Bisected]dEQP-GLES3.functional.fbo.completeness.renderable.texture.depth.rg_half_float_oes fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90902">Bug 90902</a> - [bsw][regression] dEQP: &quot;Found invalid pixel values&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90925">Bug 90925</a> - &quot;high fidelity&quot;: Segfault in _mesa_program_resource_find_name</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91254">Bug 91254</a> - (regresion) video using VA-API on Intel slow and freeze system with mesa 10.6 or 10.6.1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91292">Bug 91292</a> - [BDW+] glVertexAttribDivisor not working in combination with glPolygonMode</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91673">Bug 91673</a> - Segfault when calling glTexSubImage2D on storage texture to bound FBO</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91726">Bug 91726</a> - R600 asserts in tgsi_cmp/make_src_for_op3</li>

				</ul>

				<h2>Changes</h2>

				<p>Chris Wilson (2):</p>

				<ul>

				  <li>i965: Prevent coordinate overflow in intel_emit_linear_blit</li>

				  <li>i965: Always re-emit the pipeline select during invariant state emission</li>

				</ul>

				<p>Daniel Scharrer (1):</p>

				<ul>

				  <li>mesa: add missing queries for ARB_direct_state_access</li>

				</ul>

				<p>Dave Airlie (8):</p>

				<ul>

				  <li>mesa/arb_gpu_shader_fp64: add support for glGetUniformdv</li>

				  <li>mesa/texgetimage: fix missing stencil check</li>

				  <li>st/readpixels: fix accel path for skipimages.</li>

				  <li>texcompress_s3tc/fxt1: fix stride checks (v1.1)</li>

				  <li>mesa/readpixels: check strides are equal before skipping conversion</li>

				  <li>mesa: enable texture stencil8 for multisample</li>

				  <li>r600/sb: update last_cf for finalize if.</li>

				  <li>r600g: fix calculation for gpr allocation</li>

				</ul>

				<p>David Heidelberg (1):</p>

				<ul>

				  <li>st/nine: Require gcc &gt;= 4.6</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: add sha256 checksums for 10.6.5</li>

				  <li>get-pick-list.sh: Require explicit "10.6" for nominating stable patches</li>

				</ul>

				<p>Glenn Kennard (4):</p>

				<ul>

				  <li>r600g: Fix assert in tgsi_cmp</li>

				  <li>r600g/sb: Handle undef in read port tracker</li>

				  <li>r600g/sb: Don't read junk after EOP</li>

				  <li>r600g/sb: Don't crash on empty if jump target</li>

				</ul>

				<p>Ilia Mirkin (5):</p>

				<ul>

				  <li>st/mesa: fix assignments with 4-operand arguments (i.e. BFI)</li>

				  <li>st/mesa: pass through 4th opcode argument in bitmap/pixel visitors</li>

				  <li>nv50,nvc0: disable depth bounds test on blit</li>

				  <li>nv50: fix 2d engine blits for 64- and 128-bit formats</li>

				  <li>mesa: only copy the requested teximage faces</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>i965/fs: Split VGRFs after lowering pull constants</li>

				</ul>

				<p>Kenneth Graunke (3):</p>

				<ul>

				  <li>i965: Fix copy propagation type changes.</li>

				  <li>Revert "i965: Advertise a line width of 40.0 on Cherryview and Skylake."</li>

				  <li>i965: Momentarily pretend to support ARB_texture_stencil8 for blits.</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>gallium/radeon: fix the ADDRESS_HI mask for EVENT_WRITE CIK packets</li>

				  <li>mesa: create multisample fallback textures like normal textures</li>

				  <li>radeonsi: fix a Unigine Heaven hang when drirc is missing</li>

				</ul>

				<p>Matt Turner (1):</p>

				<ul>

				  <li>i965/fs: Handle MRF destinations in lower_integer_multiplication().</li>

				</ul>

				<p>Neil Roberts (2):</p>

				<ul>

				  <li>i965: Swap the order of the vertex ID and edge flag attributes</li>

				  <li>i965/bdw: Fix 3DSTATE_VF_INSTANCING when the edge flag is used</li>

				</ul>

				<p>Tapani Pälli (5):</p>

				<ul>

				  <li>mesa: update fbo state in glTexStorage</li>

				  <li>glsl: build stageref mask using IR, not symbol table</li>

				  <li>glsl: expose build_program_resource_list function</li>

				  <li>glsl: create program resource list after LinkShader</li>

				  <li>mesa: add GL_RED, GL_RG support for floating point textures</li>

				</ul>

				</div>

				</body>

				</html>
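Each of these release notes lists SHA256 checksums for the release tarballs. A downloaded archive can be checked against the listed value with a few lines of Python. This is a minimal sketch using only the standard library; the function names `sha256_of` and `matches_release_notes` are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_release_notes(path, expected_hex):
    """Compare a file's digest with a checksum copied from the release notes."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_release_notes("mesa-10.6.6.tar.xz", "570f2154b7340ff5db61ff103bc6e85165b8958798b78a50fa2df488e98e5778")` should return `True` for an intact download of that tarball.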

									
										75

docs/relnotes/10.6.7.html
									
										Normal file
									
												
				@@ -0,0 +1,75 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.6.7 Release Notes / September 10, 2015</h1>

				<p>

				Mesa 10.6.7 is a bug fix release which fixes bugs found since the 10.6.6 release.

				</p>

				<p>

				Mesa 10.6.7 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				4ba10c59abee30d72476543a57afd2f33803dabf4620dc333b335d47966ff842  mesa-10.6.7.tar.gz

				feb1f640b915dada88a7c793dfaff0ae23580f8903f87a6b76469253de0d28d8  mesa-10.6.7.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90751">Bug 90751</a> - [BDW Bisected]dEQP-GLES3.functional.fbo.completeness.renderable.texture.stencil.stencil_index8 fails</li>

				</ul>

				<h2>Changes</h2>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>mesa/teximage: use correct extension for accept stencil texture.</li>

				</ul>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>docs: add sha256 checksums for 10.6.6</li>

				  <li>Revert "i965: Momentarily pretend to support ARB_texture_stencil8 for blits."</li>

				  <li>Update version to 10.6.7</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>glsl: Handle attribute aliasing in attribute storage limit check.</li>

				</ul>

				</div>

				</body>

				</html>

									
										136

docs/relnotes/10.6.8.html
									
										Normal file
									
												
				@@ -0,0 +1,136 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.6.8 Release Notes / September 20, 2015</h1>

				<p>

				Mesa 10.6.8 is a bug fix release which fixes bugs found since the 10.6.7 release.

				</p>

				<p>

				Mesa 10.6.8 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				1f34dba2a8059782e3e4e0f18b9628004e253b2c69085f735b846d2e63c9e250  mesa-10.6.8.tar.gz

				e36ee5ceeadb3966fb5ce5b4cf18322dbb76a4f075558ae49c3bba94f57d58fd  mesa-10.6.8.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90621">Bug 90621</a> - Mesa fail to build from git</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91526">Bug 91526</a> - World of Warcraft (on Wine) has UI corruption with nouveau</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91719">Bug 91719</a> - [SNB,HSW,BYT] dEQP regressions associated with using NIR for vertex shaders</li>

				</ul>

				<h2>Changes</h2>

				<p>Alejandro Piñeiro (1):</p>

				<ul>

				  <li>i965/vec4: fill src_reg type using the constructor type parameter</li>

				</ul>

				<p>Antia Puentes (1):</p>

				<ul>

				  <li>i965/vec4: Fix saturation errors when coalescing registers</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: add sha256 checksums for 10.6.7</li>

				  <li>cherry-ignore: add commit non applicable for 10.6</li>

				</ul>

				<p>Hans de Goede (4):</p>

				<ul>

				  <li>nv30: Fix creation of scanout buffers</li>

				  <li>nv30: Implement color resolve for msaa</li>

				  <li>nv30: Fix max width / height checks in nv30 sifm code</li>

				  <li>nv30: Disable msaa unless requested from the env by NV30_MAX_MSAA</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>mesa: Pass the type to _mesa_uniform_matrix as a glsl_base_type</li>

				  <li>mesa: Don't allow wrong type setters for matrix uniforms</li>

				</ul>

				<p>Ilia Mirkin (5):</p>

				<ul>

				  <li>st/mesa: don't fall back to 16F when 32F is requested</li>

				  <li>nvc0: always emit a full shader colormask</li>

				  <li>nvc0: remove BGRA4 format support</li>

				  <li>st/mesa: avoid integer overflows with buffers &gt;= 512MB</li>

				  <li>nv50, nvc0: fix max texture buffer size to 128M elements</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>i965/vec4: Don't reswizzle hardware registers</li>

				</ul>

				<p>Jose Fonseca (1):</p>

				<ul>

				  <li>gallivm: Workaround LLVM PR23628.</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>i965: Momentarily pretend to support ARB_texture_stencil8 for blits.</li>

				</ul>

				<p>Oded Gabbay (1):</p>

				<ul>

				  <li>llvmpipe: convert double to long long instead of unsigned long long</li>

				</ul>

				<p>Ray Strode (1):</p>

				<ul>

				  <li>gbm: convert gbm bo format to fourcc format on dma-buf import</li>

				</ul>

				<p>Ulrich Weigand (1):</p>

				<ul>

				  <li>mesa: Fix texture compression on big-endian systems</li>

				</ul>

				<p>Vinson Lee (1):</p>

				<ul>

				  <li>gallivm: Do not use NoFramePointerElim with LLVM 3.7.</li>

				</ul>

				</div>

				</body>

				</html>

									
										130

docs/relnotes/10.6.9.html
									
										Normal file
									
												
				@@ -0,0 +1,130 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.6.9 Release Notes / October 03, 2015</h1>

				<p>

				Mesa 10.6.9 is a bug fix release which fixes bugs found since the 10.6.8 release.

				</p>

				<p>

				Mesa 10.6.9 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				3406876aac67546d0c3e2cb97da330b62644c313e7992b95618662e13c54296a  mesa-10.6.9.tar.gz

				b04c4de6280b863babc2929573da17218d92e9e4ba6272d548d135415723e8c3  mesa-10.6.9.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=38109">Bug 38109</a> - i915 driver crashes if too few vertices are submitted (Mesa 7.10.2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=55552">Bug 55552</a> - Compile errors with --enable-mangling</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86281">Bug 86281</a> - brw_meta_fast_clear (brw=brw&#64;entry=0x7fffd4097a08, fb=fb&#64;entry=0x7fffd40fa900, buffers=buffers&#64;entry=2, partial_clear=partial_clear&#64;entry=false)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91970">Bug 91970</a> - [BSW regression] dEQP-GLES3.functional.shaders.precision.int.highp_mul_vertex</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92072">Bug 92072</a> - Wine breakage since d082c5324 (st/mesa: don't call st_validate_state in BlitFramebuffer)</li>

				</ul>

				<h2>Changes</h2>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>st/mesa: try PIPE_BIND_RENDER_TARGET when choosing float texture formats</li>

				</ul>

				<p>Chris Wilson (1):</p>

				<ul>

				  <li>i965: Remove early release of DRI2 miptree</li>

				</ul>

				<p>Emil Velikov (4):</p>

				<ul>

				  <li>docs: add sha256 checksums for 10.6.8</li>

				  <li>cherry-ignore: add commit non applicable for 10.6</li>

				  <li>cherry-ignore: add commit non applicable for 10.6</li>

				  <li>Update version to 10.6.9</li>

				</ul>

				<p>Iago Toral Quiroga (1):</p>

				<ul>

				  <li>mesa: Fix GL_FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE for default framebuffer.</li>

				</ul>

				<p>Ian Romanick (5):</p>

				<ul>

				  <li>t_dd_dmatmp: Make "count" actually be the count</li>

				  <li>t_dd_dmatmp: Clean up improper code formatting from previous patch</li>

				  <li>t_dd_dmatmp: Use '&amp; 3' instead of '% 4' everywhere</li>

				  <li>t_dd_dmatmp: Pull out common 'count -= count &amp; 3' code</li>

				  <li>t_dd_dmatmp: Use addition instead of subtraction in loop bounds</li>

				</ul>

				<p>Jeremy Huddleston (1):</p>

				<ul>

				  <li>configure.ac: Add support to enable read-only text segment on x86.</li>

				</ul>

				<p>Kristian Høgsberg Kristensen (1):</p>

				<ul>

				  <li>i965: Respect stride and subreg_offset for ATTR registers</li>

				</ul>

				<p>Kyle Brenneman (3):</p>

				<ul>

				  <li>glx: Fix build errors with --enable-mangling (v2)</li>

				  <li>mapi: Make _glapi_get_stub work with "gl" or "mgl" prefix.</li>

				  <li>glx: Don't hard-code the name "libGL.so.1" in driOpenDriver (v3)</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>radeon/vce: fix vui time_scale zero error</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>st/mesa: fix front buffer regression after dropping st_validate_state in Blit</li>

				</ul>

				<p>Roland Scheidegger (1):</p>

				<ul>

				  <li>mesa: fix mipmap generation for immutable, compressed textures</li>

				</ul>

				</div>

				</body>

				</html>

									
										2

docs/relnotes/11.0.5.html
									
												
				@@ -45,8 +45,6 @@ because compatibility contexts are not supported.

				<ul>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91993">Bug 91993</a> - Graphical glitch in Astromenace (open-source game).</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92214">Bug 92214</a> - Flightgear crashes during splashboot with R600 driver, LLVM 3.7.0 and mesa 11.0.2</li>

									
										281

docs/relnotes/11.1.0.html
									
										Normal file
									
												
				@@ -0,0 +1,281 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.1.0 Release Notes / 15 December 2015</h1>

				<p>

				Mesa 11.1.0 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 11.1.1.

				</p>

				<p>

				Mesa 11.1.0 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				e3bc44be4df5e4dc728dfda7b55b1aaeadfce36eca6a367b76cc07598070cb2d  mesa-11.1.0.tar.gz

				9befe03b04223eb1ede177fa8cac001e2850292c8c12a3ec9929106afad9cf1f  mesa-11.1.0.tar.xz

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>OpenGL 3.1 support on freedreno (a3xx, a4xx)</li>

				<li>OpenGL 3.3 support for VMware guest VM driver (supported by Workstation 12

				    and Fusion 8).

				<li>GL_AMD_performance_monitor on nv50</li>

				<li>GL_ARB_arrays_of_arrays on i965</li>

				<li>GL_ARB_blend_func_extended on freedreno (a3xx)</li>

				<li>GL_ARB_clear_texture on nv50, nvc0</li>

				<li>GL_ARB_clip_control on freedreno/a4xx</li>

				<li>GL_ARB_copy_image on nv50, nvc0, radeonsi</li>

				<li>GL_ARB_depth_clamp on freedreno/a4xx</li>

				<li>GL_ARB_fragment_layer_viewport on i965 (gen6+)</li>

				<li>GL_ARB_gpu_shader_fp64 on r600 for Cypress/Cayman/Aruba chips</li>

				<li>GL_ARB_gpu_shader5 on r600 for Evergreen and later chips</li>

				<li>GL_ARB_seamless_cubemap_per_texture on freedreno/a4xx</li>

				<li>GL_ARB_shader_clock on i965 (gen7+)</li>

				<li>GL_ARB_shader_stencil_export on i965 (gen9+)</li>

				<li>GL_ARB_shader_storage_buffer_object on i965</li>

				<li>GL_ARB_shader_texture_image_samples on i965, nv50, nvc0, r600, radeonsi</li>

				<li>GL_ARB_texture_barrier / GL_NV_texture_barrier on i965</li>

				<li>GL_ARB_texture_buffer_range on freedreno/a3xx</li>

				<li>GL_ARB_texture_compression_bptc on freedreno/a4xx</li>

				<li>GL_ARB_texture_query_lod on softpipe</li>

				<li>GL_ARB_texture_view on radeonsi and r600 (for evergreen and newer)</li>

				<li>GL_ARB_vertex_type_2_10_10_10_rev on freedreno (a3xx, a4xx)</li>

				<li>GL_EXT_blend_func_extended on all drivers that support the ARB version</li>

				<li>GL_EXT_buffer_storage implemented for when ES 3.1 support is gained</li>

				<li>GL_EXT_draw_elements_base_vertex on all drivers</li>

				<li>GL_EXT_texture_compression_rgtc / latc on freedreno (a3xx &amp; a4xx)</li>

				<li>GL_KHR_debug (GLES)</li>

				<li>GL_NV_conditional_render on freedreno</li>

				<li>GL_OES_draw_elements_base_vertex on all drivers</li>

				<li>EGL_KHR_create_context on softpipe, llvmpipe</li>

				<li>EGL_KHR_gl_colorspace on softpipe, llvmpipe</li>

				<li>new virgl gallium driver for qemu virtio-gpu</li>

				<li>16x multisampling on i965 (gen9+)</li>

				<li>GL_EXT_shader_samples_identical on i965</li>

				</ul>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=28130">Bug 28130</a> - vbo: premature flushing breaks GL_LINE_LOOP</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=38109">Bug 38109</a> - i915 driver crashes if too few vertices are submitted (Mesa 7.10.2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=49779">Bug 49779</a> - Extra line segments in GL_LINE_LOOP</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=55552">Bug 55552</a> - Compile errors with --enable-mangling</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71789">Bug 71789</a> - [r300g] Visuals not found in (default) depth = 24</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79783">Bug 79783</a> - Distorted output in obs-studio where other vendors &quot;work&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80821">Bug 80821</a> - When LIBGL_ALWAYS_SOFTWARE is set, KHR_create_context is not supported</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81174">Bug 81174</a> - Gallium: GL_LINE_LOOP broken with more than 512 points</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83508">Bug 83508</a> - [UBO] Assertion for array of blocks</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84677">Bug 84677</a> - Triangle disappears with glPolygonMode GL_LINE</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86281">Bug 86281</a> - brw_meta_fast_clear (brw=brw&#64;entry=0x7fffd4097a08, fb=fb&#64;entry=0x7fffd40fa900, buffers=buffers&#64;entry=2, partial_clear=partial_clear&#64;entry=false)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86469">Bug 86469</a> - Unreal Engine demo doesn't run</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86720">Bug 86720</a> - [radeon] Europa Universalis 4 freezing during game start (10.3.3+, still broken on 11.0.2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89014">Bug 89014</a> - PIPE_QUERY_GPU_FINISHED is not acting as expected on SI</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90175">Bug 90175</a> - [hsw bisected][PATCH] atomic counters doesn't work for a binding point different to zero</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90348">Bug 90348</a> - Spilling failure of b96 merged value</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90631">Bug 90631</a> - Compilation failure for fragment shader with many branches on Sandy Bridge</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90734">Bug 90734</a> - glBufferSubData is corrupting data when buffer is &gt; 32k</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=90887">Bug 90887</a> - PhiMovesPass in register allocator broken</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91044">Bug 91044</a> - piglit spec/egl_khr_create_context/valid debug flag gles* fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91114">Bug 91114</a> - ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_vert fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91254">Bug 91254</a> - (regresion) video using VA-API on Intel slow and freeze system with mesa 10.6 or 10.6.1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91292">Bug 91292</a> - [BDW+] glVertexAttribDivisor not working in combination with glPolygonMode</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91342">Bug 91342</a> - Very dark textures on some objects in indoors environments in Postal 2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91526">Bug 91526</a> - World of Warcraft (on Wine) has UI corruption with nouveau</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91551">Bug 91551</a> - DXTn compressed normal maps produce severe artifacts on all NV5x and NVDx chipsets</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91596">Bug 91596</a> - EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91716">Bug 91716</a> - [bisected] piglit.shaders.glsl-vs-int-attrib regresses on 32 bit BYT, HSW, IVB, SNB</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91718">Bug 91718</a> - piglit.spec.arb_shader_image_load_store.invalid causes intermittent GPU HANG</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91719">Bug 91719</a> - [SNB,HSW,BYT] dEQP regressions associated with using NIR for vertex shaders</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91726">Bug 91726</a> - R600 asserts in tgsi_cmp/make_src_for_op3</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91780">Bug 91780</a> - Rendering issues with geometry shader</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91785">Bug 91785</a> - make check DispatchSanity_test.GLES31 regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91788">Bug 91788</a> - [HSW Regression] Synmark2_v6 Multithread performance case FPS reduced by 36%</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91847">Bug 91847</a> - glGenerateTextureMipmap not working (no errors) unless glActiveTexture(GL_TEXTURE1) is called before</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91857">Bug 91857</a> - Mesa 10.6.3 linker is slow</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91881">Bug 91881</a> - regression: GPU lockups since mesa-11.0.0_rc1 on RV620 (r600) driver</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91890">Bug 91890</a> - [nve7] witcher2: blurry image &amp; DATA_ERRORs (class 0xa097 mthd 0x2380/0x238c)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91898">Bug 91898</a> - src/util/mesa-sha1.c:250:25: fatal error: openssl/sha.h: No such file or directory</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91927">Bug 91927</a> - [SKL] [regression] piglit compressed textures tests fail  with kernel upgrade</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91930">Bug 91930</a> - Program with GtkGLArea widget does not redraw</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91970">Bug 91970</a> - [BSW regression] dEQP-GLES3.functional.shaders.precision.int.highp_mul_vertex</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91985">Bug 91985</a> - [regression, bisected] FTBFS with commit f9caabe8f1: R600_UCP_CONST_BUFFER is undefined</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91993">Bug 91993</a> - Graphical glitch in Astromenace (open-source game).</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92009">Bug 92009</a> - ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92033">Bug 92033</a> - [SNB,regression,dEQP,bisected] functional.shaders.random tests regressed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92052">Bug 92052</a> - nir/nir_builder.h:79: error: expected primary-expression before ‘.’ token</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92054">Bug 92054</a> - make check gbm-symbols-check regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92066">Bug 92066</a> - [ILK,G45,regression] New assertion on BRW_MAX_MRF breaks ilk and g45</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92072">Bug 92072</a> - Wine breakage since d082c5324 (st/mesa: don't call st_validate_state in BlitFramebuffer)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92095">Bug 92095</a> - [Regression, bisected] arb_shader_atomic_counters.compiler.builtins.frag</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92122">Bug 92122</a> - [bisected, cts] Regression with Assault Android Cactus</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92124">Bug 92124</a> - shader_query.cpp:841:34: error: ‘strndup’ was not declared in this scope</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92183">Bug 92183</a> - linker.cpp:3187:46: error: ‘strtok_r’ was not declared in this scope</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92193">Bug 92193</a> - [SKL] ES2-CTS.gtf.GL2ExtensionTests.compressed_astc_texture.compressed_astc_texture fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92214">Bug 92214</a> - Flightgear crashes during splashboot with R600 driver, LLVM 3.7.0 and mesa 11.0.2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92221">Bug 92221</a> - Unintended code changes in _mesa_base_tex_format commit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92265">Bug 92265</a> - Black windows in weston after update mesa to 11.0.2-1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92304">Bug 92304</a> - [cts] cts.shaders.negative conformance tests fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92363">Bug 92363</a> - [BSW/BDW] ogles1conform Gets test fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92437">Bug 92437</a> - osmesa: Expose GL entry points for Windows build, via .def file</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92438">Bug 92438</a> - Segfault in pushbuf_kref when running the android emulator (qemu) on nv50</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92476">Bug 92476</a> - [cts] ES2-CTS.gtf.GL2ExtensionTests.egl_image.egl_image fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92588">Bug 92588</a> - [HSW,BDW,BSW,SKL-Y][GLES 3.1 CTS] ES31-CTS.arrays_of_arrays.InteractionFunctionCalls2 - assert</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92621">Bug 92621</a> - [G965 ILK G45] Regression: 24 piglit regressions in glsl-1.10</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92623">Bug 92623</a> - Differences in prog_data ignored when caching fragment programs (causes hangs)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92634">Bug 92634</a> - gallium's vl_mpeg12_decoder does not work with st/va</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92639">Bug 92639</a> - [Regression bisected] Ogles1conform mustpass.c fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92641">Bug 92641</a> - [SKL BSW] [Regression] Ogles1conform userclip.c fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92645">Bug 92645</a> - kodi vdpau interop fails since  mesa,meta: move gl_texture_object::TargetIndex initializations</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92705">Bug 92705</a> - [clover] fail to build with llvm-svn/clang-svn 3.8</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92709">Bug 92709</a> - &quot;LLVM triggered Diagnostic Handler: unsupported call to function ldexpf in main&quot; when starting race in stuntrally</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92738">Bug 92738</a> - Randon R7 240 doesn't work on 16KiB page size platform</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92744">Bug 92744</a> - [g965 Regression bisected] Performance regression and piglit assertions due to liveness analysis</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92770">Bug 92770</a> - [SNB, regression, dEQP] deqp-gles3.functional.shaders.discard.dynamic_loop_texture</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92824">Bug 92824</a> - [regression, bisected] `make check` dispatch-sanity broken by GL_EXT_buffer_storage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92849">Bug 92849</a> - [IVB HSW BDW] piglit image load/store load-from-cleared-image.shader_test fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92859">Bug 92859</a> - [regression, bisected] validate_intrinsic_instr: Assertion triggered</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92860">Bug 92860</a> - [radeonsi][bisected] st/mesa: implement ARB_copy_image - Corruption in ARK Survival Evolved</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92900">Bug 92900</a> - [regression bisected] About 700 piglit regressions is what could go wrong</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92909">Bug 92909</a> - Offset/alignment issue with layout std140 and vec3</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92985">Bug 92985</a> - Mac OS X build error &quot;ar: no archive members specified&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93015">Bug 93015</a> - Tonga Elemental segfault + VM faults since  radeon: implement r600_query_hw_get_result via function pointers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93048">Bug 93048</a> - [CTS regression] mesa af2723 breaks GL Conformance for debug extension</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93063">Bug 93063</a> - drm_helper.h:227:1: error: static declaration of ‘pipe_virgl_create_screen’ follows non-static declaration</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93091">Bug 93091</a> - [opencl] segfault when running any opencl programs (like clinfo)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93126">Bug 93126</a> - wrongly claim supporting GL_EXT_texture_rg</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93180">Bug 93180</a> - [regression] arb_separate_shader_objects.active sampler conflict fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93235">Bug 93235</a> - [regression] dispatch sanity broken by GetPointerv</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93266">Bug 93266</a> - gl_arb_shading_language_420pack does not allow binding of image variables</li>

				</ul>

				<h2>Changes</h2>

				<ul>

				<li>MPEG4 decoding has been disabled by default in the VAAPI driver</li>

				</ul>

				</div>

				</body>

				</html>
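The SHA256 checksums listed in the 11.1.0 release notes above are meant to be compared against a downloaded tarball. A minimal sketch of that check in Python follows; the helper names `sha256_of` and `verify_tarball` are hypothetical and not part of Mesa, and the sketch assumes the tarball keeps the file name given in the notes.

```python
import hashlib
from pathlib import Path

# Checksums copied from the 11.1.0 release notes above.
PUBLISHED_SHA256 = {
    "mesa-11.1.0.tar.gz": "e3bc44be4df5e4dc728dfda7b55b1aaeadfce36eca6a367b76cc07598070cb2d",
    "mesa-11.1.0.tar.xz": "9befe03b04223eb1ede177fa8cac001e2850292c8c12a3ec9929106afad9cf1f",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tarball(path, published=PUBLISHED_SHA256):
    """Return True if the file's digest matches the published one for its name.

    Hypothetical helper: raises KeyError when the file name has no
    published checksum in the table.
    """
    expected = published[Path(path).name]
    return sha256_of(path) == expected.lower()
```

Calling `verify_tarball("mesa-11.1.0.tar.xz")` on a downloaded file returns `False` when the contents differ from the published release, in which case the download should not be trusted.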

									
										197

docs/relnotes/11.1.1.html
									
												
				@@ -0,0 +1,197 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.1.1 Release Notes / January 13, 2016</h1>

				<p>

				Mesa 11.1.1 is a bug fix release which fixes bugs found since the 11.1.0 release.

				</p>

				<p>

				Mesa 11.1.1 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				b15089817540ba0bffd0aad323ecf3a8ff6779568451827c7274890b4a269d58  mesa-11.1.1.tar.gz

				64db074fc514136b5fb3890111f0d50604db52f0b1e94ba3fcb0fe8668a7fd20  mesa-11.1.1.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91806">Bug 91806</a> - configure does not test whether assembler supports sse4.1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92229">Bug 92229</a> - [APITRACE] SOMA have serious graphical errors</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92233">Bug 92233</a> - Unigine Heaven 4.0 silhuette run</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93004">Bug 93004</a> - Guild Wars 2 crash on nouveau DX11 cards</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93215">Bug 93215</a> - [Regression bisected] Ogles1conform Automatic mipmap generation test is fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93257">Bug 93257</a> - [SKL, bisected] ASTC dEQP tests segfault</li>

				</ul>

				<h2>Changes</h2>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>st/mesa: check state-&gt;mesa in early return check in st_validate_state()</li>

				</ul>

				<p>Dave Airlie (6):</p>

				<ul>

				  <li>mesa/varray: set double arrays to non-normalised.</li>

				  <li>mesa/shader: return correct attribute location for double matrix arrays</li>

				  <li>glsl: pass stage into mark function</li>

				  <li>glsl/fp64: add helper for dual slot double detection.</li>

				  <li>glsl: fix count_attribute_slots to allow for different 64-bit handling</li>

				  <li>glsl: only update doubles inputs for vertex inputs.</li>

				</ul>

				<p>Emil Velikov (4):</p>

				<ul>

				  <li>docs: add sha256 checksums for 11.0.1</li>

				  <li>cherry-ignore: drop the "re-enable" DCC on Stoney</li>

				  <li>cherry-ignore: don't pick a specific i965 formats patch</li>

				  <li>Update version to 11.1.1</li>

				</ul>

				<p>Eric Anholt (2):</p>

				<ul>

				  <li>vc4: Warn instead of abort()ing on exec ioctl failures.</li>

				  <li>vc4: Keep sample mask writes from being reordered after TLB writes</li>

				</ul>

				<p>Grazvydas Ignotas (1):</p>

				<ul>

				  <li>r600: fix constant buffer size programming</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>meta/generate_mipmap: Work-around GLES 1.x problem with GL_DRAW_FRAMEBUFFER</li>

				</ul>

				<p>Ilia Mirkin (9):</p>

				<ul>

				  <li>nv50/ir: can't have predication and immediates</li>

				  <li>gk104/ir: simplify and fool-proof texbar algorithm</li>

				  <li>glsl: assign varying locations to tess shaders when doing SSO</li>

				  <li>glx/dri3: a drawable might not be bound at wait time</li>

				  <li>nvc0: don't forget to reset VTX_TMP bufctx slot after blit completion</li>

				  <li>nv50/ir: float(s32 &amp; 0xff) = float(u8), not s8</li>

				  <li>nv50,nvc0: make sure there's pushbuf space and that we ref the bo early</li>

				  <li>nv50,nvc0: fix crash when increasing bsp bo size for h264</li>

				  <li>nvc0: scale up inter_bo size so that it's 16M for a 4K video</li>

				</ul>

				<p>Jonathan Gray (2):</p>

				<ul>

				  <li>configure.ac: use pkg-config for libelf</li>

				  <li>configure: check for python2.7 for PYTHON2</li>

				</ul>

				<p>Kenneth Graunke (5):</p>

				<ul>

				  <li>ralloc: Fix ralloc_adopt() to the old context's last child's parent.</li>

				  <li>drirc: Disable ARB_blend_func_extended for Heaven 4.0/Valley 1.0.</li>

				  <li>glsl: Fix varying struct locations when varying packing is disabled.</li>

				  <li>nvc0: Set winding order regardless of domain.</li>

				  <li>nir: Add a lower_fdiv option, turn fdiv into fmul/frcp.</li>

				</ul>

				<p>Marek Olšák (7):</p>

				<ul>

				  <li>tgsi/scan: add flag colors_written</li>

				  <li>r600g: write all MRTs only if there is exactly one output (fixes a hang)</li>

				  <li>radeonsi: don't call of u_prims_for_vertices for patches and rectangles</li>

				  <li>radeonsi: apply the streamout workaround to Fiji as well</li>

				  <li>gallium/radeon: fix Hyper-Z hangs by programming PA_SC_MODE_CNTL_1 correctly</li>

				  <li>program: add _mesa_reserve_parameter_storage</li>

				  <li>st/mesa: fix GLSL uniform updates for glBitmap &amp; glDrawPixels (v2)</li>

				</ul>

				<p>Mark Janes (1):</p>

				<ul>

				  <li>Add missing platform information for KBL</li>

				</ul>

				<p>Miklós Máté (1):</p>

				<ul>

				  <li>mesa: Don't leak ATIfs instructions in DeleteFragmentShader</li>

				</ul>

				<p>Neil Roberts (3):</p>

				<ul>

				  <li>i965: Add MESA_FORMAT_B8G8R8X8_SRGB to brw_format_for_mesa_format</li>

				  <li>i965: Add B8G8R8X8_SRGB to the alpha format override</li>

				  <li>i965: Fix crash when calling glViewport with no surface bound</li>

				</ul>

				<p>Nicolai Hähnle (2):</p>

				<ul>

				  <li>gallium/radeon: only dispose locally created target machine in radeon_llvm_compile</li>

				  <li>gallium/radeon: fix regression in a number of driver queries</li>

				</ul>

				<p>Oded Gabbay (1):</p>

				<ul>

				  <li>configura.ac: fix test for SSE4.1 assembler support</li>

				</ul>

				<p>Patrick Rudolph (2):</p>

				<ul>

				  <li>nv50,nvc0: fix use-after-free when vertex buffers are unbound</li>

				  <li>gallium/util: return correct number of bound vertex buffers</li>

				</ul>

				<p>Rob Herring (1):</p>

				<ul>

				  <li>freedreno/ir3: fix 32-bit builds with pointer-to-int-cast error enabled</li>

				</ul>

				<p>Samuel Pitoiset (3):</p>

				<ul>

				  <li>nvc0: free memory allocated by the prog which reads MP perf counters</li>

				  <li>nv50,nvc0: free memory allocated by performance metrics</li>

				  <li>nv50: free memory allocated by the prog which reads MP perf counters</li>

				</ul>

				<p>Sarah Sharp (1):</p>

				<ul>

				  <li>mesa: Add KBL PCI IDs and platform information.</li>

				</ul>

				</div>

				</body>

				</html>
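The release notes above say that the version reported by glGetString(GL_VERSION) depends on the driver in use, so an application cannot assume OpenGL 4.1 is present. As a rough sketch, assuming the string has already been read from a live context, the check could look like the Python below. The helper names `parse_gl_version` and `supports_gl` are hypothetical, the sample strings are illustrative rather than captured from real drivers, and the sketch only handles desktop GL strings, which begin with `major.minor`.

```python
import re


def parse_gl_version(version_string):
    """Extract (major, minor) from a desktop GL_VERSION string.

    Desktop GL strings start with "major.minor", optionally followed by a
    release number and vendor-specific text. OpenGL ES strings start with
    "OpenGL ES" and are rejected by this sketch.
    """
    match = re.match(r"\s*(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized GL_VERSION string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def supports_gl(version_string, required=(4, 1)):
    """Return True if the reported version is at least `required`."""
    return parse_gl_version(version_string) >= required
```

Tuple comparison keeps the check correct across major versions: `(3, 3) >= (4, 1)` is false even though the minor number is larger.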

									
										182

docs/relnotes/11.1.2.html
									
												
				@@ -0,0 +1,182 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.1.2 Release Notes / February 10, 2016</h1>

				<p>

				Mesa 11.1.2 is a bug fix release which fixes bugs found since the 11.1.1 release.

				</p>

				<p>

				Mesa 11.1.2 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				ba0e7462b2936b86e6684c26fbb55519f8d9ad31d13a1c1e1afbe41e73466eea  mesa-11.1.2.tar.gz

				8f72aead896b340ba0f7a4a474bfaf71681f5d675592aec1cb7ba698e319148b  mesa-11.1.2.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91596">Bug 91596</a> - EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93628">Bug 93628</a> - Exception: attempt to use unavailable module DRM when building MesaGL 11.1.0 on windows</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93648">Bug 93648</a> - Random lines being rendered when playing Dolphin (geometry shaders related, w/ apitrace)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93650">Bug 93650</a> - GL_ARB_separate_shader_objects is buggy (PCSX2)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93717">Bug 93717</a> - Meta mipmap generation can corrupt texture state</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93722">Bug 93722</a> - Segfault when compiling shader with a subroutine that takes a parameter</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93731">Bug 93731</a> - glUniformSubroutinesuiv segfaults when subroutine uniform is bound to a specific location</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93761">Bug 93761</a> - A conditional discard in a fragment shader causes no depth writing at all</li>

				</ul>

				<h2>Changes</h2>

				<p>Ben Widawsky (1):</p>

				<ul>

				  <li>i965/bxt: Fix conservative wm thread counts.</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>glsl: fix subroutine lowering reusing actual parmaters</li>

				</ul>

				<p>Emil Velikov (6):</p>

				<ul>

				  <li>docs: add sha256 checksums for 11.1.1</li>

				  <li>cherry-ignore: drop the i965/kbl .num_slices patch</li>

				  <li>i915: correctly parse/set the context flags</li>

				  <li>targets/dri: android: use WHOLE static libraries</li>

				  <li>egl/dri2: expose srgb configs when KHR_gl_colorspace is available</li>

				  <li>Update version to 11.1.2</li>

				</ul>

				<p>Eric Anholt (2):</p>

				<ul>

				  <li>vc4: Don't record the seqno of a failed job submit.</li>

				  <li>vc4: Throttle outstanding rendering after submission.</li>

				</ul>

				<p>François Tigeot (1):</p>

				<ul>

				  <li>gallium: Add DragonFly support</li>

				</ul>

				<p>Grazvydas Ignotas (1):</p>

				<ul>

				  <li>r600g: don't leak driver const buffers</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>meta/blit: Restore GL_DEPTH_STENCIL_TEXTURE_MODE state for GL_TEXTURE_RECTANGLE</li>

				  <li>meta: Use internal functions to set texture parameters</li>

				</ul>

				<p>Ilia Mirkin (6):</p>

				<ul>

				  <li>st/mesa: use surface format to generate mipmaps when available</li>

				  <li>glsl: always compute proper varying type, irrespective of varying packing</li>

				  <li>nvc0: avoid crashing when there are holes in vertex array bindings</li>

				  <li>nv50,nvc0: fix buffer clearing to respect engine alignment requirements</li>

				  <li>nv50/ir: fix false global CSE on instructions with multiple defs</li>

				  <li>st/mesa: treat a write as a read for range purposes</li>

				</ul>

				<p>Jason Ekstrand (3):</p>

				<ul>

				  <li>i965/vec4: Use UW type for multiply into accumulator on GEN8+</li>

				  <li>i965/fs/generator: Take an actual shader stage rather than a string</li>

				  <li>i965/fs: Always set channel 2 of texture headers in some stages</li>

				</ul>

				<p>Jose Fonseca (2):</p>

				<ul>

				  <li>scons: Conditionally use DRM module on pipe-loader.</li>

				  <li>pipe-loader: Fix PATH_MAX define on MSVC.</li>

				</ul>

				<p>Karol Herbst (1):</p>

				<ul>

				  <li>nv50/ir: fix memory corruption when spilling and redoing RA</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>glsl: Make bitfield_insert/extract and bfi/bfm non-vectorizable.</li>

				  <li>glsl: Allow implicit int -&gt; uint conversions for bitwise operators (&amp;, ^, |).</li>

				</ul>

				<p>Leo Liu (2):</p>

				<ul>

				  <li>vl: add zig zag scan for list 4x4</li>

				  <li>st/omx/dec/h264: fix corruption when scaling matrix present flag set</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>radeonsi: don't miss changes to SPI_TMPRING_SIZE</li>

				</ul>

				<p>Nicolai Hähnle (11):</p>

				<ul>

				  <li>mesa/bufferobj: make _mesa_delete_buffer_object externally accessible</li>

				  <li>st/mesa: use _mesa_delete_buffer_object</li>

				  <li>radeon: use _mesa_delete_buffer_object</li>

				  <li>i915: use _mesa_delete_buffer_object</li>

				  <li>i965: use _mesa_delete_buffer_object</li>

				  <li>util/u_pstipple.c: copy immediates during transformation</li>

				  <li>radeonsi: extract the VGT_GS_MODE calculation into its own function</li>

				  <li>radeonsi: ensure that VGT_GS_MODE is sent when necessary</li>

				  <li>radeonsi: add DCC buffer for sampler views on new CS</li>

				  <li>st/mesa: use the correct address generation functions in st_TexSubImage blit</li>

				  <li>radeonsi: fix discard-only fragment shaders (11.1 version)</li>

				</ul>

				<p>Timothy Arceri (4):</p>

				<ul>

				  <li>glsl: fix segfault linking subroutine uniform with explicit location</li>

				  <li>mesa: fix segfault in glUniformSubroutinesuiv()</li>

				  <li>glsl: fix interface block error message</li>

				  <li>glsl: create helper to remove outer vertex index array used by some stages</li>

				</ul>

				</div>

				</body>

				</html>

									
										85

docs/relnotes/11.2.0.html
									
												
				@@ -0,0 +1,85 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 11.2.0 Release Notes / TBD</h1>

				<p>

				Mesa 11.2.0 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 11.2.1.

				</p>

				<p>

				Mesa 11.2.0 implements the OpenGL 4.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 4.1.  OpenGL

				4.1 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				TBD.

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>GL_ARB_arrays_of_arrays on all gallium drivers that provide GLSL 1.30</li>

				<li>GL_ARB_base_instance on freedreno/a4xx</li>

				<li>GL_ARB_compute_shader on i965</li>

				<li>GL_ARB_copy_image on r600</li>

				<li>GL_ARB_indirect_parameters on nvc0</li>

				<li>GL_ARB_query_buffer_object on nvc0</li>

				<li>GL_ARB_shader_atomic_counters on nvc0</li>

				<li>GL_ARB_shader_draw_parameters on i965, nvc0</li>

				<li>GL_ARB_shader_storage_buffer_object on nvc0</li>

				<li>GL_ARB_tessellation_shader on i965 and r600 (evergreen/cayman only)</li>

				<li>GL_ARB_texture_buffer_object_rgb32 on freedreno/a4xx</li>

				<li>GL_ARB_texture_buffer_range on freedreno/a4xx</li>

				<li>GL_ARB_texture_query_lod on freedreno/a4xx</li>

				<li>GL_ARB_texture_rgb10_a2ui on freedreno/a4xx</li>

				<li>GL_ARB_texture_view on freedreno/a4xx</li>

				<li>GL_ARB_vertex_type_10f_11f_11f_rev on freedreno/a4xx</li>

				<li>GL_KHR_texture_compression_astc_ldr on freedreno/a4xx</li>

				<li>GL_AMD_performance_monitor on radeonsi (CIK+ only)</li>

				<li>GL_ATI_meminfo on r600, radeonsi</li>

				<li>GL_NVX_gpu_memory_info on r600, radeonsi</li>

				<li>New OSMesaCreateContextAttribs() function (for creating core profile

				    contexts)</li>

				</ul>

				<h2>Bug fixes</h2>

				TBD.

				<h2>Changes</h2>

				Microsoft Visual Studio 2013 or later is now required for building

				on Windows.

				Previously, Visual Studio 2008 and later were supported.

				TBD.

				</div>

				</body>

				</html>

									
										14

docs/shading.html
									
												
				@@ -63,6 +63,20 @@ execution.  These are generally used for debugging.

				Example:  export MESA_GLSL=dump,nopt

				</p>

				<p>

				Shaders can be dumped and replaced at runtime for debugging purposes. Mesa

				needs to be configured with '--with-sha1' to enable this functionality. This

				feature is not currently supported by the SCons build.

				This is controlled via the following environment variables:

				<ul>

				<li><b>MESA_SHADER_DUMP_PATH</b> - path where shader sources are dumped

				<li><b>MESA_SHADER_READ_PATH</b> - path where replacement shaders are read

				</ul>

				Note that each path must already exist before the application is run for

				dumping or replacing to work. When both are set, the paths should differ

				so that dumped shaders do not clobber the replacement shaders.

				</p>
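The two constraints stated in the paragraph above (each path must exist, and the two paths should differ) can be checked before launching an application. The following is a minimal sketch, not part of Mesa: `dir_exists()` and `shader_paths_ok()` are hypothetical helper names, and it assumes each variable names a directory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: true if `path` names an existing directory. */
static bool
dir_exists(const char *path)
{
    struct stat st;

    assert(path != NULL);
    return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/*
 * Hypothetical pre-flight check for MESA_SHADER_DUMP_PATH and
 * MESA_SHADER_READ_PATH.  A NULL argument means "variable not set".
 * Returns true when every path that is set exists and, if both are set,
 * the two strings differ.  The comparison is textual only, so two
 * different spellings of the same directory are not detected.
 */
static bool
shader_paths_ok(const char *dump_path, const char *read_path)
{
    if (dump_path != NULL && !dir_exists(dump_path))
        return false;
    if (read_path != NULL && !dir_exists(read_path))
        return false;
    if (dump_path != NULL && read_path != NULL &&
        strcmp(dump_path, read_path) == 0)
        return false;   /* dumped shaders would clobber the replacements */
    return true;
}
```

A caller would typically pass `getenv("MESA_SHADER_DUMP_PATH")` and `getenv("MESA_SHADER_READ_PATH")` (from `<stdlib.h>`) as the two arguments.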

				<h2 id="support">GLSL Version</h2>

176

docs/specs/EXT_shader_samples_identical.txt Normal file


@@ -0,0 +1,176 @@
 Name
     EXT_shader_samples_identical
 Name Strings
     GL_EXT_shader_samples_identical
 Contact
     Ian Romanick, Intel (ian.d.romanick 'at' intel.com)
 Contributors
     Chris Forbes, Mesa
     Magnus Wendt, Intel
     Neil S. Roberts, Intel
     Graham Sellers, AMD
 Status
     XXX - Not complete yet.
 Version
     Last Modified Date: November 19, 2015
     Revision: 6
 Number
     TBD
 Dependencies
     OpenGL 3.2, or OpenGL ES 3.1, or ARB_texture_multisample is required.
     This extension is written against the OpenGL 4.5 (Core Profile)
     Specification
 Overview
     Multisampled antialiasing has become a common method for improving the
     quality of rendered images.  Multisampling differs from supersampling in
     that the color of a primitive that covers all or part of a pixel is
     resolved once, regardless of the number of samples covered.  If a large
     polygon is rendered, the colors of all samples in each interior pixel will
     be the same.  This suggests a simple compression scheme that can reduce
     the necessary memory bandwidth requirements.  In one such scheme, each
     sample is stored in a separate slice of the multisample surface.  An
     additional multisample control surface (MCS) contains a mapping from pixel
     samples to slices.
     If all the values stored in the MCS for a particular pixel are the same,
     then all the samples have the same value.  Applications can take advantage
     of this information to reduce the bandwidth of reading multisample
     textures.  A custom multisample resolve filter could optimize resolving
     pixels where every sample is identical by reading the color once.
     color = texelFetch(sampler, coordinate, 0);
     if (!textureSamplesIdenticalEXT(sampler, coordinate)) {
         for (int i = 1; i < MAX_SAMPLES; i++) {
             vec4 c = texelFetch(sampler, coordinate, i);
             //... accumulate c into color
         }
     }
 New Procedures and Functions
     None.
 New Tokens
     None.
 Additions to the OpenGL 4.5 (Core Profile) Specification
     None.
 Modifications to The OpenGL Shading Language Specification, Version 4.50.5
     Including the following line in a shader can be used to control the
     language features described in this extension:
         #extension GL_EXT_shader_samples_identical
     A new preprocessor #define is added to the OpenGL Shading Language:
         #define GL_EXT_shader_samples_identical
     Add to the table in section 8.7 "Texture Lookup Functions"
     Syntax:
         bool textureSamplesIdenticalEXT(gsampler2DMS sampler, ivec2 coord)
         bool textureSamplesIdenticalEXT(gsampler2DMSArray sampler,
                                         ivec3 coord)
     Description:
         Returns true if it can be determined that all samples within the texel
         of the multisample texture bound to <sampler> at <coord> contain the
          same values or false if this cannot be determined.
 Additions to the AGL/EGL/GLX/WGL Specifications
     None
 Errors
     None
 New State
     None
 New Implementation Dependent State
     None
 Issues
 1) What should the new functions be called?
     RESOLVED: textureSamplesIdenticalEXT.  Initially
     textureAllSamplesIdenticalEXT was considered, but
     textureSamplesIdenticalEXT is more similar to the existing textureSamples
     function.
 2) It seems like applications could implement additional optimization if
        they were provided with raw MCS data.  Should this extension also
        provide that data?
     There are a number of challenges in providing raw MCS data.  The biggest
     problem being that the amount of MCS data depends on the number of
     samples, and that is not known at compile time.  Additionally, without new
     texelFetch functions, applications would have difficulty utilizing the
     information.
     Another option is to have a function that returns an array of tuples of
     sample number and count.  This also has difficulties with the maximum
     array size not being known at compile time.
     RESOLVED: Do not expose raw MCS data in this extension.
 3) Should this extension also extend SPIR-V?
     RESOLVED: Yes, but this has not yet been written.
 4) Is it possible for textureSamplesIdenticalEXT to report false negatives?
     RESOLVED: Yes.  It is possible that the underlying hardware may not detect
     that separate writes of the same color to different samples of a pixel are
     the same.  The shader function is at the whim of the underlying hardware
     implementation.  It is also possible that a compressed multisample surface
     is not used.  In that case the function will likely always return false.
 Revision History
     Rev  Date        Author    Changes
     ---  ----------  --------  ---------------------------------------------
       1  2014/08/20  cforbes   Initial version
       2  2015/10/23  idr       Change from MESA to EXT.  Rebase on OpenGL 4.5,
                                and add dependency on OpenGL ES 3.1.  Initial
                                draft of overview section and issues 1 through
                                3.
       3  2015/10/27  idr       Typo fixes.
       4  2015/11/10  idr       Rename extension from EXT_shader_multisample_compression
                                to EXT_shader_samples_identical.
                                Add issue #4.
       5  2015/11/18  idr       Fix some typos spotted by gsellers.  Change the
                                name of the function to
                                textureSamplesIdenticalEXT.
       6  2015/11/19  idr       Fix more typos spotted by Nicolai Hähnle.
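The resolve filter sketched in the Overview of this spec can be modeled on the CPU to show where the bandwidth saving comes from. This is plain C, not GLSL and not driver code: `struct pixel`, `resolve_pixel()` and the value of `MAX_SAMPLES` are hypothetical, and the `identical` flag merely stands in for what textureSamplesIdenticalEXT would report. As the last issue notes, a false result only means sameness could not be determined, so the slow path must still give the correct answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SAMPLES 4   /* illustrative sample count, not taken from the spec */

/* Hypothetical CPU-side stand-in for one multisampled texel. */
struct pixel {
    float sample[MAX_SAMPLES];
    bool  identical;   /* models textureSamplesIdenticalEXT(); may be a false negative */
};

/*
 * Box-filter resolve.  Reads one sample when the texel is known to be
 * uniform, otherwise reads and averages all of them.  `*fetches` reports
 * how many samples were read.
 */
static float
resolve_pixel(const struct pixel *p, int *fetches)
{
    assert(p != NULL && fetches != NULL);

    float color = p->sample[0];
    *fetches = 1;

    if (!p->identical) {
        for (int i = 1; i < MAX_SAMPLES; i++) {
            color += p->sample[i];   /* accumulate the remaining samples */
            (*fetches)++;
        }
        color /= MAX_SAMPLES;
    }
    return color;
}
```

A uniform texel whose flag is false still resolves to the same color; it only costs the extra fetches.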

									
										4

docs/thanks.html
									
												
				@@ -42,9 +42,7 @@ Tungsten Graphics, Inc. have supported the ongoing development of Mesa.

				<li>The

				<a href="http://www.mesa3d.org">Mesa</a>

				website is hosted by

				<a href="http://sourceforge.net">

				<img src="http://sourceforge.net/sflogo.php?group_id=3&amp;type=1"

				width="88" height="31" align="bottom" alt="Sourceforge.net" border="0"></a>

				<a href="http://sourceforge.net">sourceforge.net</a>.

				<br>

				<br>

									
										4

docs/utilities.html
									
												
				@@ -30,6 +30,10 @@

				  <dt><a href="http://www.valgrind.org">Valgrind</a></dt>

				  <dd>is a very useful tool for tracking down

				  memory-related problems in your code.</dd>

				  <dt><a href="http://scan.coverity.com/projects/mesa">Coverity</a></dt>

				  <dd>provides static code analysis of Mesa.  If you create an account

				  you can see the results and try to fix outstanding issues.</dd>

				</dl>

				</div>

									
										99

docs/vmware-guest.html
									
												
				@@ -26,6 +26,31 @@ VMware Workstation running on Linux or Windows and VMware Fusion running on

				MacOS are all supported.

				</p>

				<p>

				With the August 2015 Workstation 12 / Fusion 8 releases, OpenGL 3.3

				is supported in the guest.

				This requires:

				<ul>

				<li>The VM is configured for virtual hardware version 12.

				<li>The host OS, GPU and graphics driver support DX11 (Windows) or

				    OpenGL 4.0 (Linux, Mac)

				<li>On Linux, the vmwgfx kernel module must be version 2.9.0 or later.

				<li>A recent version of Mesa with the updated svga gallium driver.

				</ul>

				</p>

				<p>

				Otherwise, OpenGL 2.1 is supported.

				</p>

				<p>

				OpenGL 3.3 support can be disabled by setting the environment variable

				SVGA_VGPU10=0.

				You will then have OpenGL 2.1 support.

				This may be useful to work around application bugs (such as incorrect use

				of the OpenGL 3.x core profile).

				</p>

				<p>

				Most modern Linux distros include the SVGA3D driver so end users shouldn't

				be concerned with this information.

				@@ -123,10 +148,33 @@ To get the latest code from git:

				<h2>Building the Code</h2>

				<ul>

				<li>Build libdrm: If you're on a 32-bit system, you should skip the --libdir configure option. Note also the comment about toolchain libdrm above. 

				<li>

				Determine where the GL-related libraries reside on your system and set

				the LIBDIR environment variable accordingly.

				<br><br>

				For 32-bit Ubuntu systems:

				<pre>

				  export LIBDIR=/usr/lib/i386-linux-gnu

				</pre>

				For 64-bit Ubuntu systems:

				<pre>

				  export LIBDIR=/usr/lib/x86_64-linux-gnu

				</pre>

				For 32-bit Fedora systems:

				<pre>

				  export LIBDIR=/usr/lib

				</pre>

				For 64-bit Fedora systems:

				<pre>

				  export LIBDIR=/usr/lib64

				</pre>

				</li>

				<li>Build libdrm:

				  <pre>

				  cd $TOP/drm

				  ./autogen.sh --prefix=/usr --libdir=/usr/lib64

				  ./autogen.sh --prefix=/usr --libdir=${LIBDIR}

				  make

				  sudo make install

				  </pre>

				@@ -137,12 +185,9 @@ The libxatracker library is used exclusively by the X server to do render,

				copy and video acceleration:

				<br>

				The following configure options don't build the EGL system.

				<br>

				As before, if you're on a 32-bit system, you should skip the --libdir

				configure option.

				  <pre>

				  cd $TOP/mesa

				  ./autogen.sh --prefix=/usr --libdir=/usr/lib64 --with-gallium-drivers=svga --with-dri-drivers= --enable-xa --disable-dri3

				  ./autogen.sh --prefix=/usr --libdir=${LIBDIR} --with-gallium-drivers=svga --with-dri-drivers=swrast --enable-xa --disable-dri3 --enable-glx-tls

				  make

				  sudo make install

				  </pre>

				@@ -152,25 +197,39 @@ if they're not installed in your system.  You should be told what's missing.

				<br>

				<br>

				<li>xf86-video-vmware: Now, once libxatracker is installed, we proceed with building and replacing the current Xorg driver. First check if your system is 32- or 64-bit. If you're building for a 32-bit system, you will not be needing the --libdir=/usr/lib64 option to autogen. 

				<li>xf86-video-vmware: Now, once libxatracker is installed, we proceed with

				building and replacing the current Xorg driver.

				First check if your system is 32- or 64-bit.

				  <pre>

				  cd $TOP/xf86-video-vmware

				  ./autogen.sh --prefix=/usr --libdir=/usr/lib64

				  ./autogen.sh --prefix=/usr --libdir=${LIBDIR}

				  make

				  sudo make install

				  </pre>

				<li>vmwgfx kernel module. First make sure that any old version of this kernel module is removed from the system by issuing

				  <pre>

				<pre>

				  sudo rm /lib/modules/`uname -r`/kernel/drivers/gpu/drm/vmwgfx.ko*

				  </pre>

				Then 

				  <pre>

				</pre>

				Build and install:

				<pre>

				  cd $TOP/vmwgfx

				  make

				  sudo make install

				  sudo cp 00-vmwgfx.rules /etc/udev/rules.d

				  sudo depmod -ae

				  </pre>

				  sudo depmod -a

				</pre>

				If you're using an Ubuntu OS:

				<pre>

				  sudo update-initramfs -u

				</pre>

				If you're using a Fedora OS:

				<pre>

				  sudo dracut --force

				</pre>

				Add 'vmwgfx' to the /etc/modules file:

				<pre>

				  echo vmwgfx | sudo tee -a /etc/modules

				</pre>

				Note: some distros put DRM kernel drivers in different directories.

				For example, sometimes vmwgfx.ko might be found in

				@@ -227,6 +286,16 @@ If you don't see this, try setting this environment variable:

				then rerun glxinfo and examine the output for error messages.

				</p>

				<p>

				If OpenGL 3.3 is not working (you only get OpenGL 2.1):

				</p>

				<ul>

				<li>Make sure the VM uses hardware version 12.

				<li>Make sure the vmwgfx kernel module is version 2.9.0 or later.

				<li>Check the vmware.log file for errors.

				<li>Run 'dmesg | grep vmwgfx' and look for "DX: yes".

				</ul>

				</div>

				</body>

				</html>

									
										1

include/D3D9/d3d9types.h
									
												
				@@ -227,6 +227,7 @@ typedef struct _RGNDATA {

				#define D3DERR_DRIVERINVALIDCALL         MAKE_D3DHRESULT(2157)

				#define D3DERR_DEVICEREMOVED             MAKE_D3DHRESULT(2160)

				#define D3DERR_DEVICEHUNG                MAKE_D3DHRESULT(2164)

				#define S_PRESENT_OCCLUDED               MAKE_D3DSTATUS(2168)

				/********************************************************

				 * Bitmasks                                             *

									
										11

include/GL/internal/dri_interface.h
									
												
				@@ -495,7 +495,7 @@ struct __DRIdamageExtensionRec {

				 * SWRast Loader extension.

				 */

				#define __DRI_SWRAST_LOADER "DRI_SWRastLoader"

				#define __DRI_SWRAST_LOADER_VERSION 2

				#define __DRI_SWRAST_LOADER_VERSION 3

				struct __DRIswrastLoaderExtensionRec {

				    __DRIextension base;

				@@ -528,6 +528,15 @@ struct __DRIswrastLoaderExtensionRec {

				    void (*putImage2)(__DRIdrawable *drawable, int op,

				                      int x, int y, int width, int height, int stride,

				                      char *data, void *loaderPrivate);

				   /**

				     * Get image from readable

				     *

				     * \since 3

				     */

				   void (*getImage2)(__DRIdrawable *readable,

						     int x, int y, int width, int height, int stride,

						     char *data, void *loaderPrivate);

				};

				/**

									
										45

include/GL/osmesa.h
									
												
				@@ -58,8 +58,8 @@ extern "C" {

				#include <GL/gl.h>

				#define OSMESA_MAJOR_VERSION 10

				#define OSMESA_MINOR_VERSION 0

				#define OSMESA_MAJOR_VERSION 11

				#define OSMESA_MINOR_VERSION 2

				#define OSMESA_PATCH_VERSION 0

				@@ -95,6 +95,18 @@ extern "C" {

				#define OSMESA_MAX_WIDTH	0x24  /* new in 4.0 */

				#define OSMESA_MAX_HEIGHT	0x25  /* new in 4.0 */

				/*

				 * Accepted in the OSMesaCreateContextAttribs() attribute list.

				 */

				#define OSMESA_DEPTH_BITS            0x30

				#define OSMESA_STENCIL_BITS          0x31

				#define OSMESA_ACCUM_BITS            0x32

				#define OSMESA_PROFILE               0x33

				#define OSMESA_CORE_PROFILE          0x34

				#define OSMESA_COMPAT_PROFILE        0x35

				#define OSMESA_CONTEXT_MAJOR_VERSION 0x36

				#define OSMESA_CONTEXT_MINOR_VERSION 0x37

				typedef struct osmesa_context *OSMesaContext;

				@@ -127,6 +139,35 @@ OSMesaCreateContextExt( GLenum format, GLint depthBits, GLint stencilBits,

				                        GLint accumBits, OSMesaContext sharelist);

				/*

				 * Create an Off-Screen Mesa rendering context with attribute list.

				 * The list is composed of (attribute, value) pairs and terminated with

				 * attribute==0.  Supported Attributes:

				 *

				 * Attributes                    Values

				 * --------------------------------------------------------------------------

				 * OSMESA_FORMAT                 OSMESA_RGBA*, OSMESA_BGRA, OSMESA_ARGB, etc.

				 * OSMESA_DEPTH_BITS             0*, 16, 24, 32

				 * OSMESA_STENCIL_BITS           0*, 8

				 * OSMESA_ACCUM_BITS             0*, 16

				 * OSMESA_PROFILE                OSMESA_COMPAT_PROFILE*, OSMESA_CORE_PROFILE

				 * OSMESA_CONTEXT_MAJOR_VERSION  1*, 2, 3

				 * OSMESA_CONTEXT_MINOR_VERSION  0+

				 *

				 * Note: * = default value

				 *

				 * We return a context version >= what's specified by OSMESA_CONTEXT_MAJOR/

				 * MINOR_VERSION for the given profile.  For example, if you request a GL 1.4

				 * compat profile, you might get a GL 3.0 compat profile.

				 * Otherwise, null is returned if the version/profile is not supported.

				 *

				 * New in Mesa 11.2

				 */

				GLAPI OSMesaContext GLAPIENTRY

				OSMesaCreateContextAttribs( const int *attribList, OSMesaContext sharelist );
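The attribute list described in the comment above is a flat array of (attribute, value) pairs ending with attribute 0. The sketch below builds such a list for a 3.3 core profile and walks it the way a consumer might. It is illustrative only: the token values are copied from this hunk so the block builds without Mesa installed, and `core_33_attribs` and `attrib_value()` are hypothetical names, not part of the OSMesa API.

```c
#include <assert.h>
#include <stddef.h>

/* Token values copied from the osmesa.h hunk so this builds without Mesa. */
#ifndef OSMESA_PROFILE
#define OSMESA_DEPTH_BITS            0x30
#define OSMESA_PROFILE               0x33
#define OSMESA_CORE_PROFILE          0x34
#define OSMESA_COMPAT_PROFILE        0x35
#define OSMESA_CONTEXT_MAJOR_VERSION 0x36
#define OSMESA_CONTEXT_MINOR_VERSION 0x37
#endif

/* (attribute, value) pairs, terminated by attribute == 0. */
static const int core_33_attribs[] = {
    OSMESA_DEPTH_BITS,            24,
    OSMESA_PROFILE,               OSMESA_CORE_PROFILE,
    OSMESA_CONTEXT_MAJOR_VERSION, 3,
    OSMESA_CONTEXT_MINOR_VERSION, 3,
    0
};

/*
 * Hypothetical helper: return the value paired with `attrib`, or `def`
 * when the attribute is absent.  Stepping by two means a value of 0
 * (e.g. zero depth bits) is not mistaken for the terminator.  The list
 * must be well-formed: every attribute needs a value.
 */
static int
attrib_value(const int *list, int attrib, int def)
{
    assert(list != NULL);
    for (size_t i = 0; list[i] != 0; i += 2) {
        if (list[i] == attrib)
            return list[i + 1];
    }
    return def;
}
```

With Mesa's header and library available, the same array would be passed as `OSMesaCreateContextAttribs(core_33_attribs, NULL)`, and a NULL return treated as "version/profile not supported".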

				/*

				 * Destroy an Off-Screen Mesa rendering context.

				 *

									
										54

include/c11/threads_posix.h
									
												
				@@ -102,9 +102,8 @@ call_once(once_flag *flag, void (*func)(void))

				static inline int

				cnd_broadcast(cnd_t *cond)

				{

				    if (!cond) return thrd_error;

				    pthread_cond_broadcast(cond);

				    return thrd_success;

				    assert(cond != NULL);

				    return (pthread_cond_broadcast(cond) == 0) ? thrd_success : thrd_error;

				}

				// 7.25.3.2

				@@ -119,18 +118,16 @@ cnd_destroy(cnd_t *cond)

				static inline int

				cnd_init(cnd_t *cond)

				{

				    if (!cond) return thrd_error;

				    pthread_cond_init(cond, NULL);

				    return thrd_success;

				    assert(cond != NULL);

				    return (pthread_cond_init(cond, NULL) == 0) ? thrd_success : thrd_error;

				}

				// 7.25.3.4

				static inline int

				cnd_signal(cnd_t *cond)

				{

				    if (!cond) return thrd_error;

				    pthread_cond_signal(cond);

				    return thrd_success;

				    assert(cond != NULL);

				    return (pthread_cond_signal(cond) == 0) ? thrd_success : thrd_error;

				}

				// 7.25.3.5

				@@ -139,7 +136,14 @@ cnd_timedwait(cnd_t *cond, mtx_t *mtx, const xtime *xt)

				{

				    struct timespec abs_time;

				    int rt;

				    if (!cond || !mtx || !xt) return thrd_error;

				    assert(mtx != NULL);

				    assert(cond != NULL);

				    assert(xt != NULL);

				    abs_time.tv_sec = xt->sec;

				    abs_time.tv_nsec = xt->nsec;

				    rt = pthread_cond_timedwait(cond, mtx, &abs_time);

				    if (rt == ETIMEDOUT)

				        return thrd_busy;

				@@ -150,9 +154,9 @@ cnd_timedwait(cnd_t *cond, mtx_t *mtx, const xtime *xt)

				static inline int

				cnd_wait(cnd_t *cond, mtx_t *mtx)

				{

				    if (!cond || !mtx) return thrd_error;

				    pthread_cond_wait(cond, mtx);

				    return thrd_success;

				    assert(mtx != NULL);

				    assert(cond != NULL);

				    return (pthread_cond_wait(cond, mtx) == 0) ? thrd_success : thrd_error;

				}

				@@ -161,7 +165,7 @@ cnd_wait(cnd_t *cond, mtx_t *mtx)

				static inline void

				mtx_destroy(mtx_t *mtx)

				{

				    assert(mtx);

				    assert(mtx != NULL);

				    pthread_mutex_destroy(mtx);

				}

				@@ -170,7 +174,7 @@ static inline int

				mtx_init(mtx_t *mtx, int type)

				{

				    pthread_mutexattr_t attr;

				    if (!mtx) return thrd_error;

				    assert(mtx != NULL);

				    if (type != mtx_plain && type != mtx_timed && type != mtx_try

				      && type != (mtx_plain|mtx_recursive)

				      && type != (mtx_timed|mtx_recursive)

				@@ -188,9 +192,8 @@ mtx_init(mtx_t *mtx, int type)

				static inline int

				mtx_lock(mtx_t *mtx)

				{

				    if (!mtx) return thrd_error;

				    pthread_mutex_lock(mtx);

				    return thrd_success;

				    assert(mtx != NULL);

				    return (pthread_mutex_lock(mtx) == 0) ? thrd_success : thrd_error;

				}

				static inline int

				@@ -203,7 +206,9 @@ thrd_yield(void);

				static inline int

				mtx_timedlock(mtx_t *mtx, const xtime *xt)

				{

				    if (!mtx || !xt) return thrd_error;

				    assert(mtx != NULL);

				    assert(xt != NULL);

				    {

				#ifdef EMULATED_THREADS_USE_NATIVE_TIMEDLOCK

				    struct timespec ts;

				@@ -233,7 +238,7 @@ mtx_timedlock(mtx_t *mtx, const xtime *xt)

				static inline int

				mtx_trylock(mtx_t *mtx)

				{

				    if (!mtx) return thrd_error;

				    assert(mtx != NULL);

				    return (pthread_mutex_trylock(mtx) == 0) ? thrd_success : thrd_busy;

				}

				@@ -241,9 +246,8 @@ mtx_trylock(mtx_t *mtx)

				static inline int

				mtx_unlock(mtx_t *mtx)

				{

				    if (!mtx) return thrd_error;

				    pthread_mutex_unlock(mtx);

				    return thrd_success;

				    assert(mtx != NULL);

				    return (pthread_mutex_unlock(mtx) == 0) ? thrd_success : thrd_error;

				}

				@@ -253,7 +257,7 @@ static inline int

				thrd_create(thrd_t *thr, thrd_start_t func, void *arg)

				{

				    struct impl_thrd_param *pack;

				    if (!thr) return thrd_error;

				    assert(thr != NULL);

				    pack = (struct impl_thrd_param *)malloc(sizeof(struct impl_thrd_param));

				    if (!pack) return thrd_nomem;

				    pack->func = func;

				@@ -329,7 +333,7 @@ thrd_yield(void)

				static inline int

				tss_create(tss_t *key, tss_dtor_t dtor)

				{

				    if (!key) return thrd_error;

				    assert(key != NULL);

				    return (pthread_key_create(key, dtor) == 0) ? thrd_success : thrd_error;

				}
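The pattern across these hunks is to assert on NULL arguments and to report the result of the underlying pthread call instead of returning success unconditionally. A minimal sketch of why that matters follows. The `demo_` names are hypothetical stand-ins for the thrd_/mtx_ API, and the error-checking mutex type is an assumption made only so that a failure can be provoked deterministically from a single thread.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* exposes PTHREAD_MUTEX_ERRORCHECK under strict -std= modes */
#endif

#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for thrd_success / thrd_error. */
enum { demo_success = 0, demo_error = 1 };

/* Initialize an error-checking mutex so misuse is reported, not ignored. */
static int
demo_mtx_init_errorcheck(pthread_mutex_t *mtx)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(mtx != NULL);
    if (pthread_mutexattr_init(&attr) != 0)
        return demo_error;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(mtx, &attr);
    pthread_mutexattr_destroy(&attr);
    return (rc == 0) ? demo_success : demo_error;
}

/* Same shape as the new mtx_lock(): assert, then propagate the result. */
static int
demo_mtx_lock(pthread_mutex_t *mtx)
{
    assert(mtx != NULL);
    return (pthread_mutex_lock(mtx) == 0) ? demo_success : demo_error;
}

/* Same shape as the new mtx_unlock(). */
static int
demo_mtx_unlock(pthread_mutex_t *mtx)
{
    assert(mtx != NULL);
    return (pthread_mutex_unlock(mtx) == 0) ? demo_success : demo_error;
}
```

With the old code shape, a relock by the owning thread or an unlock of a mutex the caller does not hold would have been reported as success; with the return value propagated, both surface as errors.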

									
										305

include/c99/inttypes.h
									
											
				@@ -1,305 +0,0 @@

				// ISO C9x  compliant inttypes.h for Microsoft Visual Studio

				// Based on ISO/IEC 9899:TC2 Committee draft (May 6, 2005) WG14/N1124 

				// 

				//  Copyright (c) 2006 Alexander Chemeris

				// 

				// Redistribution and use in source and binary forms, with or without

				// modification, are permitted provided that the following conditions are met:

				// 

				//   1. Redistributions of source code must retain the above copyright notice,

				//      this list of conditions and the following disclaimer.

				// 

				//   2. Redistributions in binary form must reproduce the above copyright

				//      notice, this list of conditions and the following disclaimer in the

				//      documentation and/or other materials provided with the distribution.

				// 

				//   3. The name of the author may be used to endorse or promote products

				//      derived from this software without specific prior written permission.

				// 

				// THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED

				// WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF

				// MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO

				// EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,

				// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,

				// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;

				// OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 

				// WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR

				// OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF

				// ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

				// 

				///////////////////////////////////////////////////////////////////////////////

				#ifndef _MSC_VER // [

				#error "Use this header only with Microsoft Visual C++ compilers!"

				#endif // _MSC_VER ]

				#ifndef _MSC_INTTYPES_H_ // [

				#define _MSC_INTTYPES_H_

				#if _MSC_VER > 1000

				#pragma once

				#endif

				#include "stdint.h"

				// 7.8 Format conversion of integer types

				typedef struct {

				   intmax_t quot;

				   intmax_t rem;

				} imaxdiv_t;

				// 7.8.1 Macros for format specifiers

				#if !defined(__cplusplus) || defined(__STDC_FORMAT_MACROS) // [   See footnote 185 at page 198

				// The fprintf macros for signed integers are:

				#define PRId8       "d"

				#define PRIi8       "i"

				#define PRIdLEAST8  "d"

				#define PRIiLEAST8  "i"

				#define PRIdFAST8   "d"

				#define PRIiFAST8   "i"

				#define PRId16       "hd"

				#define PRIi16       "hi"

				#define PRIdLEAST16  "hd"

				#define PRIiLEAST16  "hi"

				#define PRIdFAST16   "hd"

				#define PRIiFAST16   "hi"

				#define PRId32       "I32d"

				#define PRIi32       "I32i"

				#define PRIdLEAST32  "I32d"

				#define PRIiLEAST32  "I32i"

				#define PRIdFAST32   "I32d"

				#define PRIiFAST32   "I32i"

				#define PRId64       "I64d"

				#define PRIi64       "I64i"

				#define PRIdLEAST64  "I64d"

				#define PRIiLEAST64  "I64i"

				#define PRIdFAST64   "I64d"

				#define PRIiFAST64   "I64i"

				#define PRIdMAX     "I64d"

				#define PRIiMAX     "I64i"

				#define PRIdPTR     "Id"

				#define PRIiPTR     "Ii"

				// The fprintf macros for unsigned integers are:

				#define PRIo8       "o"

				#define PRIu8       "u"

				#define PRIx8       "x"

				#define PRIX8       "X"

				#define PRIoLEAST8  "o"

				#define PRIuLEAST8  "u"

				#define PRIxLEAST8  "x"

				#define PRIXLEAST8  "X"

				#define PRIoFAST8   "o"

				#define PRIuFAST8   "u"

				#define PRIxFAST8   "x"

				#define PRIXFAST8   "X"

				#define PRIo16       "ho"

				#define PRIu16       "hu"

				#define PRIx16       "hx"

				#define PRIX16       "hX"

				#define PRIoLEAST16  "ho"

				#define PRIuLEAST16  "hu"

				#define PRIxLEAST16  "hx"

				#define PRIXLEAST16  "hX"

				#define PRIoFAST16   "ho"

				#define PRIuFAST16   "hu"

				#define PRIxFAST16   "hx"

				#define PRIXFAST16   "hX"

				#define PRIo32       "I32o"

				#define PRIu32       "I32u"

				#define PRIx32       "I32x"

				#define PRIX32       "I32X"

				#define PRIoLEAST32  "I32o"

				#define PRIuLEAST32  "I32u"

				#define PRIxLEAST32  "I32x"

				#define PRIXLEAST32  "I32X"

				#define PRIoFAST32   "I32o"

				#define PRIuFAST32   "I32u"

				#define PRIxFAST32   "I32x"

				#define PRIXFAST32   "I32X"

				#define PRIo64       "I64o"

				#define PRIu64       "I64u"

				#define PRIx64       "I64x"

				#define PRIX64       "I64X"

				#define PRIoLEAST64  "I64o"

				#define PRIuLEAST64  "I64u"

				#define PRIxLEAST64  "I64x"

				#define PRIXLEAST64  "I64X"

				#define PRIoFAST64   "I64o"

				#define PRIuFAST64   "I64u"

				#define PRIxFAST64   "I64x"

				#define PRIXFAST64   "I64X"

				#define PRIoMAX     "I64o"

				#define PRIuMAX     "I64u"

				#define PRIxMAX     "I64x"

				#define PRIXMAX     "I64X"

				#define PRIoPTR     "Io"

				#define PRIuPTR     "Iu"

				#define PRIxPTR     "Ix"

				#define PRIXPTR     "IX"

				// The fscanf macros for signed integers are:

				#define SCNd8       "d"

				#define SCNi8       "i"

				#define SCNdLEAST8  "d"

				#define SCNiLEAST8  "i"

				#define SCNdFAST8   "d"

				#define SCNiFAST8   "i"

				#define SCNd16       "hd"

				#define SCNi16       "hi"

				#define SCNdLEAST16  "hd"

				#define SCNiLEAST16  "hi"

				#define SCNdFAST16   "hd"

				#define SCNiFAST16   "hi"

				#define SCNd32       "ld"

				#define SCNi32       "li"

				#define SCNdLEAST32  "ld"

				#define SCNiLEAST32  "li"

				#define SCNdFAST32   "ld"

				#define SCNiFAST32   "li"

				#define SCNd64       "I64d"

				#define SCNi64       "I64i"

				#define SCNdLEAST64  "I64d"

				#define SCNiLEAST64  "I64i"

				#define SCNdFAST64   "I64d"

				#define SCNiFAST64   "I64i"

				#define SCNdMAX     "I64d"

				#define SCNiMAX     "I64i"

				#ifdef _WIN64 // [

				#  define SCNdPTR     "I64d"

				#  define SCNiPTR     "I64i"

				#else  // _WIN64 ][

				#  define SCNdPTR     "ld"

				#  define SCNiPTR     "li"

				#endif  // _WIN64 ]

				// The fscanf macros for unsigned integers are:

				#define SCNo8       "o"

				#define SCNu8       "u"

				#define SCNx8       "x"

				#define SCNX8       "X"

				#define SCNoLEAST8  "o"

				#define SCNuLEAST8  "u"

				#define SCNxLEAST8  "x"

				#define SCNXLEAST8  "X"

				#define SCNoFAST8   "o"

				#define SCNuFAST8   "u"

				#define SCNxFAST8   "x"

				#define SCNXFAST8   "X"

				#define SCNo16       "ho"

				#define SCNu16       "hu"

				#define SCNx16       "hx"

				#define SCNX16       "hX"

				#define SCNoLEAST16  "ho"

				#define SCNuLEAST16  "hu"

				#define SCNxLEAST16  "hx"

				#define SCNXLEAST16  "hX"

				#define SCNoFAST16   "ho"

				#define SCNuFAST16   "hu"

				#define SCNxFAST16   "hx"

				#define SCNXFAST16   "hX"

				#define SCNo32       "lo"

				#define SCNu32       "lu"

				#define SCNx32       "lx"

				#define SCNX32       "lX"

				#define SCNoLEAST32  "lo"

				#define SCNuLEAST32  "lu"

				#define SCNxLEAST32  "lx"

				#define SCNXLEAST32  "lX"

				#define SCNoFAST32   "lo"

				#define SCNuFAST32   "lu"

				#define SCNxFAST32   "lx"

				#define SCNXFAST32   "lX"

				#define SCNo64       "I64o"

				#define SCNu64       "I64u"

				#define SCNx64       "I64x"

				#define SCNX64       "I64X"

				#define SCNoLEAST64  "I64o"

				#define SCNuLEAST64  "I64u"

				#define SCNxLEAST64  "I64x"

				#define SCNXLEAST64  "I64X"

				#define SCNoFAST64   "I64o"

				#define SCNuFAST64   "I64u"

				#define SCNxFAST64   "I64x"

				#define SCNXFAST64   "I64X"

				#define SCNoMAX     "I64o"

				#define SCNuMAX     "I64u"

				#define SCNxMAX     "I64x"

				#define SCNXMAX     "I64X"

				#ifdef _WIN64 // [

				#  define SCNoPTR     "I64o"

				#  define SCNuPTR     "I64u"

				#  define SCNxPTR     "I64x"

				#  define SCNXPTR     "I64X"

				#else  // _WIN64 ][

				#  define SCNoPTR     "lo"

				#  define SCNuPTR     "lu"

				#  define SCNxPTR     "lx"

				#  define SCNXPTR     "lX"

				#endif  // _WIN64 ]

				#endif // __STDC_FORMAT_MACROS ]

				// 7.8.2 Functions for greatest-width integer types

				// 7.8.2.1 The imaxabs function

				#define imaxabs _abs64

				// 7.8.2.2 The imaxdiv function

				// This is modified version of div() function from Microsoft's div.c found

				// in %MSVC.NET%\crt\src\div.c

				#ifdef STATIC_IMAXDIV // [

				static

				#else // STATIC_IMAXDIV ][

				_inline

				#endif // STATIC_IMAXDIV ]

				imaxdiv_t __cdecl imaxdiv(intmax_t numer, intmax_t denom)

				{

				   imaxdiv_t result;

				   result.quot = numer / denom;

				   result.rem = numer % denom;

				   if (numer < 0 && result.rem > 0) {

				      // did division wrong; must fix up

				      ++result.quot;

				      result.rem -= denom;

				   }

				   return result;

				}

				// 7.8.2.3 The strtoimax and strtoumax functions

				#define strtoimax _strtoi64

				#define strtoumax _strtoui64

				// 7.8.2.4 The wcstoimax and wcstoumax functions

				#define wcstoimax _wcstoi64

				#define wcstoumax _wcstoui64

				#endif // _MSC_INTTYPES_H_ ]
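For orientation, the deleted header aliased the C99 greatest-width functions to MSVC-specific routines (`_strtoi64`, `_abs64`, and a hand-written `imaxdiv`). The following is a minimal sketch of the standard behaviour those aliases stood in for, assuming a toolchain that ships its own `<inttypes.h>`. The helper names `demo_div` and `demo_parse_hex` are hypothetical and do not appear in Mesa.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>

/* Hypothetical helper (not part of Mesa): divide using the standard
 * imaxdiv(), whose quotient truncates toward zero as C99 requires. */
static imaxdiv_t demo_div(intmax_t numer, intmax_t denom)
{
   assert(denom != 0);   /* division by zero is undefined */
   return imaxdiv(numer, denom);
}

/* Hypothetical helper: parse a hexadecimal string with strtoimax(),
 * the function the removed header aliased to _strtoi64. */
static intmax_t demo_parse_hex(const char *text)
{
   assert(text != NULL);
   return strtoimax(text, NULL, 16);
}
```

With truncation toward zero, the remainder takes the sign of the numerator, so `demo_div(-7, 2)` yields a quotient of -3 and a remainder of -1.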

									
										247

include/c99/stdint.h
									
				@@ -1,247 +0,0 @@

				// ISO C9x  compliant stdint.h for Microsoft Visual Studio

				// Based on ISO/IEC 9899:TC2 Committee draft (May 6, 2005) WG14/N1124 

				// 

				//  Copyright (c) 2006-2008 Alexander Chemeris

				// 

				// Redistribution and use in source and binary forms, with or without

				// modification, are permitted provided that the following conditions are met:

				// 

				//   1. Redistributions of source code must retain the above copyright notice,

				//      this list of conditions and the following disclaimer.

				// 

				//   2. Redistributions in binary form must reproduce the above copyright

				//      notice, this list of conditions and the following disclaimer in the

				//      documentation and/or other materials provided with the distribution.

				// 

				//   3. The name of the author may be used to endorse or promote products

				//      derived from this software without specific prior written permission.

				// 

				// THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED

				// WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF

				// MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO

				// EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,

				// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,

				// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;

				// OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, 

				// WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR

				// OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF

				// ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

				// 

				///////////////////////////////////////////////////////////////////////////////

				#ifndef _MSC_VER // [

				#error "Use this header only with Microsoft Visual C++ compilers!"

				#endif // _MSC_VER ]

				#ifndef _MSC_STDINT_H_ // [

				#define _MSC_STDINT_H_

				#if _MSC_VER > 1000

				#pragma once

				#endif

				#include <limits.h>

				// For Visual Studio 6 in C++ mode and for many Visual Studio versions when

				// compiling for ARM we should wrap <wchar.h> include with 'extern "C++" {}'

				// or compiler give many errors like this:

				//   error C2733: second C linkage of overloaded function 'wmemchr' not allowed

				#ifdef __cplusplus

				extern "C" {

				#endif

				#  include <wchar.h>

				#ifdef __cplusplus

				}

				#endif

				// Define _W64 macros to mark types changing their size, like intptr_t.

				#ifndef _W64

				#  if !defined(__midl) && (defined(_X86_) || defined(_M_IX86)) && _MSC_VER >= 1300

				#     define _W64 __w64

				#  else

				#     define _W64

				#  endif

				#endif

				// 7.18.1 Integer types

				// 7.18.1.1 Exact-width integer types

				// Visual Studio 6 and Embedded Visual C++ 4 doesn't

				// realize that, e.g. char has the same size as __int8

				// so we give up on __intX for them.

				#if (_MSC_VER < 1300)

				   typedef signed char       int8_t;

				   typedef signed short      int16_t;

				   typedef signed int        int32_t;

				   typedef unsigned char     uint8_t;

				   typedef unsigned short    uint16_t;

				   typedef unsigned int      uint32_t;

				#else

				   typedef signed __int8     int8_t;

				   typedef signed __int16    int16_t;

				   typedef signed __int32    int32_t;

				   typedef unsigned __int8   uint8_t;

				   typedef unsigned __int16  uint16_t;

				   typedef unsigned __int32  uint32_t;

				#endif

				typedef signed __int64       int64_t;

				typedef unsigned __int64     uint64_t;

				// 7.18.1.2 Minimum-width integer types

				typedef int8_t    int_least8_t;

				typedef int16_t   int_least16_t;

				typedef int32_t   int_least32_t;

				typedef int64_t   int_least64_t;

				typedef uint8_t   uint_least8_t;

				typedef uint16_t  uint_least16_t;

				typedef uint32_t  uint_least32_t;

				typedef uint64_t  uint_least64_t;

				// 7.18.1.3 Fastest minimum-width integer types

				typedef int8_t    int_fast8_t;

				typedef int16_t   int_fast16_t;

				typedef int32_t   int_fast32_t;

				typedef int64_t   int_fast64_t;

				typedef uint8_t   uint_fast8_t;

				typedef uint16_t  uint_fast16_t;

				typedef uint32_t  uint_fast32_t;

				typedef uint64_t  uint_fast64_t;

				// 7.18.1.4 Integer types capable of holding object pointers

				#ifdef _WIN64 // [

				   typedef signed __int64    intptr_t;

				   typedef unsigned __int64  uintptr_t;

				#else // _WIN64 ][

				   typedef _W64 signed int   intptr_t;

				   typedef _W64 unsigned int uintptr_t;

				#endif // _WIN64 ]

				// 7.18.1.5 Greatest-width integer types

				typedef int64_t   intmax_t;

				typedef uint64_t  uintmax_t;

				// 7.18.2 Limits of specified-width integer types

				#if !defined(__cplusplus) || defined(__STDC_LIMIT_MACROS) // [   See footnote 220 at page 257 and footnote 221 at page 259

				// 7.18.2.1 Limits of exact-width integer types

				#define INT8_MIN     ((int8_t)_I8_MIN)

				#define INT8_MAX     _I8_MAX

				#define INT16_MIN    ((int16_t)_I16_MIN)

				#define INT16_MAX    _I16_MAX

				#define INT32_MIN    ((int32_t)_I32_MIN)

				#define INT32_MAX    _I32_MAX

				#define INT64_MIN    ((int64_t)_I64_MIN)

				#define INT64_MAX    _I64_MAX

				#define UINT8_MAX    _UI8_MAX

				#define UINT16_MAX   _UI16_MAX

				#define UINT32_MAX   _UI32_MAX

				#define UINT64_MAX   _UI64_MAX

				// 7.18.2.2 Limits of minimum-width integer types

				#define INT_LEAST8_MIN    INT8_MIN

				#define INT_LEAST8_MAX    INT8_MAX

				#define INT_LEAST16_MIN   INT16_MIN

				#define INT_LEAST16_MAX   INT16_MAX

				#define INT_LEAST32_MIN   INT32_MIN

				#define INT_LEAST32_MAX   INT32_MAX

				#define INT_LEAST64_MIN   INT64_MIN

				#define INT_LEAST64_MAX   INT64_MAX

				#define UINT_LEAST8_MAX   UINT8_MAX

				#define UINT_LEAST16_MAX  UINT16_MAX

				#define UINT_LEAST32_MAX  UINT32_MAX

				#define UINT_LEAST64_MAX  UINT64_MAX

				// 7.18.2.3 Limits of fastest minimum-width integer types

				#define INT_FAST8_MIN    INT8_MIN

				#define INT_FAST8_MAX    INT8_MAX

				#define INT_FAST16_MIN   INT16_MIN

				#define INT_FAST16_MAX   INT16_MAX

				#define INT_FAST32_MIN   INT32_MIN

				#define INT_FAST32_MAX   INT32_MAX

				#define INT_FAST64_MIN   INT64_MIN

				#define INT_FAST64_MAX   INT64_MAX

				#define UINT_FAST8_MAX   UINT8_MAX

				#define UINT_FAST16_MAX  UINT16_MAX

				#define UINT_FAST32_MAX  UINT32_MAX

				#define UINT_FAST64_MAX  UINT64_MAX

				// 7.18.2.4 Limits of integer types capable of holding object pointers

				#ifdef _WIN64 // [

				#  define INTPTR_MIN   INT64_MIN

				#  define INTPTR_MAX   INT64_MAX

				#  define UINTPTR_MAX  UINT64_MAX

				#else // _WIN64 ][

				#  define INTPTR_MIN   INT32_MIN

				#  define INTPTR_MAX   INT32_MAX

				#  define UINTPTR_MAX  UINT32_MAX

				#endif // _WIN64 ]

				// 7.18.2.5 Limits of greatest-width integer types

				#define INTMAX_MIN   INT64_MIN

				#define INTMAX_MAX   INT64_MAX

				#define UINTMAX_MAX  UINT64_MAX

				// 7.18.3 Limits of other integer types

				#ifdef _WIN64 // [

				#  define PTRDIFF_MIN  _I64_MIN

				#  define PTRDIFF_MAX  _I64_MAX

				#else  // _WIN64 ][

				#  define PTRDIFF_MIN  _I32_MIN

				#  define PTRDIFF_MAX  _I32_MAX

				#endif  // _WIN64 ]

				#define SIG_ATOMIC_MIN  INT_MIN

				#define SIG_ATOMIC_MAX  INT_MAX

				#ifndef SIZE_MAX // [

				#  ifdef _WIN64 // [

				#     define SIZE_MAX  _UI64_MAX

				#  else // _WIN64 ][

				#     define SIZE_MAX  _UI32_MAX

				#  endif // _WIN64 ]

				#endif // SIZE_MAX ]

				// WCHAR_MIN and WCHAR_MAX are also defined in <wchar.h>

				#ifndef WCHAR_MIN // [

				#  define WCHAR_MIN  0

				#endif  // WCHAR_MIN ]

				#ifndef WCHAR_MAX // [

				#  define WCHAR_MAX  _UI16_MAX

				#endif  // WCHAR_MAX ]

				#define WINT_MIN  0

				#define WINT_MAX  _UI16_MAX

				#endif // __STDC_LIMIT_MACROS ]

				// 7.18.4 Limits of other integer types

				#if !defined(__cplusplus) || defined(__STDC_CONSTANT_MACROS) // [   See footnote 224 at page 260

				// 7.18.4.1 Macros for minimum-width integer constants

				#define INT8_C(val)  val##i8

				#define INT16_C(val) val##i16

				#define INT32_C(val) val##i32

				#define INT64_C(val) val##i64

				#define UINT8_C(val)  val##ui8

				#define UINT16_C(val) val##ui16

				#define UINT32_C(val) val##ui32

				#define UINT64_C(val) val##ui64

				// 7.18.4.2 Macros for greatest-width integer constants

				#define INTMAX_C   INT64_C

				#define UINTMAX_C  UINT64_C

				#endif // __STDC_CONSTANT_MACROS ]

				#endif // _MSC_STDINT_H_ ]
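The deleted header exposed the limit macros (`UINT8_MAX`, `UINT64_MAX`, ...) and the constant macros (`UINT64_C`, ...) only in C, or in C++ when `__STDC_LIMIT_MACROS` / `__STDC_CONSTANT_MACROS` were defined. As a rough illustration of what code relies on from these macros, here is a sketch against the standard `<stdint.h>`. Both helper functions are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: add two bytes and clamp the result at UINT8_MAX
 * instead of wrapping around. */
static uint8_t demo_sat_add_u8(uint8_t a, uint8_t b)
{
   unsigned sum = (unsigned)a + (unsigned)b;
   return sum > UINT8_MAX ? UINT8_MAX : (uint8_t)sum;
}

/* Hypothetical helper: build a 64-bit mask with the low `bits` bits set.
 * UINT64_C() gives the literal a 64-bit type, so the shift is not
 * performed in a narrower int. */
static uint64_t demo_low_mask(unsigned bits)
{
   assert(bits <= 64);
   return bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}
```

The `bits == 64` case is handled separately because shifting a 64-bit value by 64 is undefined in C.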

									
										14

include/c99_compat.h
									
				@@ -36,17 +36,17 @@

				 */

				#if defined(_MSC_VER)

				#  if _MSC_VER < 1500

				#    error "Microsoft Visual Studio 2008 or higher required"

				#  if _MSC_VER < 1800

				#    error "Microsoft Visual Studio 2013 or higher required"

				#  endif

				   /*

				    * Visual Studio 2012 will complain if we define the `inline` keyword, but

				    * Visual Studio will complain if we define the `inline` keyword, but

				    * actually it only supports the keyword on C++.

				    *

				    * To avoid this the _ALLOW_KEYWORD_MACROS must be set.

				    */

				#  if (_MSC_VER >= 1700) && !defined(_ALLOW_KEYWORD_MACROS)

				#  if !defined(_ALLOW_KEYWORD_MACROS)

				#    define _ALLOW_KEYWORD_MACROS

				#  endif

				@@ -81,8 +81,6 @@

				     /* Intel compiler supports inline keyword */

				#  elif defined(__WATCOMC__) && (__WATCOMC__ >= 1100)

				#    define inline __inline

				#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)

				     /* C99 supports inline keyword */

				#  elif (__STDC_VERSION__ >= 199901L)

				     /* C99 supports inline keyword */

				#  else

				@@ -100,8 +98,6 @@

				#ifndef restrict

				#  if (__STDC_VERSION__ >= 199901L)

				     /* C99 */

				#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)

				     /* C99 */

				#  elif defined(__GNUC__)

				#    define restrict __restrict__

				#  elif defined(_MSC_VER)

				@@ -118,8 +114,6 @@

				#ifndef __func__

				#  if (__STDC_VERSION__ >= 199901L)

				     /* C99 */

				#  elif defined(__SUNPRO_C) && defined(__C99FEATURES__)

				     /* C99 */

				#  elif defined(__GNUC__)

				#    define __func__ __FUNCTION__

				#  elif defined(_MSC_VER)

									
										49

include/c99_math.h
									
				@@ -38,55 +38,16 @@

				#include "c99_compat.h"

				#if defined(_MSC_VER)

				/* This is to ensure that we get M_PI, etc. definitions */

				#if !defined(_USE_MATH_DEFINES)

				#if defined(_MSC_VER) && !defined(_USE_MATH_DEFINES)

				#error _USE_MATH_DEFINES define required when building with MSVC

				#endif 

				#if _MSC_VER < 1800

				#define isfinite(x) _finite((double)(x))

				#define isnan(x) _isnan((double)(x))

				#endif /* _MSC_VER < 1800 */

				#if _MSC_VER < 1800

				static inline double log2( double x )

				{

				   const double invln2 = 1.442695041;

				   return log( x ) * invln2;

				}

				static inline double

				round(double x)

				{

				   return x >= 0.0 ? floor(x + 0.5) : ceil(x - 0.5);

				}

				static inline float

				roundf(float x)

				{

				   return x >= 0.0f ? floorf(x + 0.5f) : ceilf(x - 0.5f);

				}

				#endif

				#ifndef INFINITY

				#include <float.h> // DBL_MAX

				#define INFINITY (DBL_MAX + DBL_MAX)

				#endif

				#ifndef NAN

				#define NAN (INFINITY - INFINITY)

				#endif

				#endif /* _MSC_VER */

				#if (defined(_MSC_VER) && _MSC_VER < 1800) || \

				    (!defined(_MSC_VER) && \

				     __STDC_VERSION__ < 199901L && \

				     (!defined(_XOPEN_SOURCE) || _XOPEN_SOURCE < 600) && \

				     !defined(__cplusplus))

				#if !defined(_MSC_VER) && \

				    __STDC_VERSION__ < 199901L && \

				    (!defined(_XOPEN_SOURCE) || _XOPEN_SOURCE < 600) && \

				    !defined(__cplusplus)

				static inline long int

				lrint(double d)

									
										3

include/d3dadapter/present.h
									
				@@ -69,6 +69,8 @@ typedef struct ID3DPresentVtbl

				    HRESULT (WINAPI *SetCursor)(ID3DPresent *This, void *pBitmap, POINT *pHotspot, BOOL bShow);

				    HRESULT (WINAPI *SetGammaRamp)(ID3DPresent *This, const D3DGAMMARAMP *pRamp, HWND hWndOverride);

				    HRESULT (WINAPI *GetWindowInfo)(ID3DPresent *This,  HWND hWnd, int *width, int *height, int *depth);

				    /* Available since version 1.1 */

				    BOOL (WINAPI *GetWindowOccluded)(ID3DPresent *This);

				} ID3DPresentVtbl;

				struct ID3DPresent

				@@ -96,6 +98,7 @@ struct ID3DPresent

				#define ID3DPresent_SetCursor(p,a,b,c) (p)->lpVtbl->SetCursor(p,a,b,c)

				#define ID3DPresent_SetGammaRamp(p,a,b) (p)->lpVtbl->SetGammaRamp(p,a,b)

				#define ID3DPresent_GetWindowInfo(p,a,b,c,d) (p)->lpVtbl->GetWindowSize(p,a,b,c,d)

				#define ID3DPresent_GetWindowOccluded(p) (p)->lpVtbl->GetWindowOccluded(p)

				typedef struct ID3DPresentGroupVtbl

				{
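The `present.h` hunks add a vtable slot and a matching accessor macro. A minimal sketch of that COM-style pattern in plain C follows. Every name in it (`DemoPresent`, `DemoPresentVtbl`, `demo_query`, and so on) is hypothetical and only mirrors the shape of `ID3DPresent`; it is not the d3dadapter API.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; they only mirror the shape of the
 * ID3DPresent vtable and its accessor macros. */
typedef struct DemoPresent DemoPresent;

typedef struct DemoPresentVtbl
{
   int (*GetWindowOccluded)(DemoPresent *This);
} DemoPresentVtbl;

struct DemoPresent
{
   const DemoPresentVtbl *lpVtbl;
   int occluded;
};

/* Accessor macro: the member named after lpVtbl-> must exist in the vtable. */
#define DemoPresent_GetWindowOccluded(p) (p)->lpVtbl->GetWindowOccluded(p)

/* Implementation that the vtable slot points at. */
static int demo_get_window_occluded(DemoPresent *This)
{
   return This->occluded;
}

static const DemoPresentVtbl demo_vtbl = { demo_get_window_occluded };

/* Hypothetical caller: dispatches through the macro and returns nonzero
 * when the window is reported as occluded. */
static int demo_query(int occluded)
{
   DemoPresent present = { &demo_vtbl, occluded };
   assert(present.lpVtbl != NULL);
   return DemoPresent_GetWindowOccluded(&present);
}
```

Because such macros are only checked when expanded, a macro that names a slot missing from the vtable compiles silently until its first use. In the hunk above, `ID3DPresent_GetWindowInfo` expands to `lpVtbl->GetWindowSize`, while the vtable member shown is `GetWindowInfo`.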

									
										66

include/pci_ids/i965_pci_ids.h
									
				@@ -109,25 +109,53 @@ CHIPSET(0x162A, bdw_gt3, "Intel(R) Iris Pro P6300 (Broadwell GT3e)")

				CHIPSET(0x162B, bdw_gt3, "Intel(R) Iris 6100 (Broadwell GT3)")

				CHIPSET(0x162D, bdw_gt3, "Intel(R) Broadwell GT3")

				CHIPSET(0x162E, bdw_gt3, "Intel(R) Broadwell GT3")

				CHIPSET(0x1902, skl_gt1, "Intel(R) Skylake DT  GT1")

				CHIPSET(0x1906, skl_gt1, "Intel(R) Skylake ULT GT1")

				CHIPSET(0x190A, skl_gt1, "Intel(R) Skylake SRV GT1")

				CHIPSET(0x190B, skl_gt1, "Intel(R) Skylake Halo GT1")

				CHIPSET(0x190E, skl_gt1, "Intel(R) Skylake ULX GT1")

				CHIPSET(0x1912, skl_gt2, "Intel(R) Skylake DT  GT2")

				CHIPSET(0x1916, skl_gt2, "Intel(R) Skylake ULT GT2")

				CHIPSET(0x191A, skl_gt2, "Intel(R) Skylake SRV GT2")

				CHIPSET(0x191B, skl_gt2, "Intel(R) Skylake Halo GT2")

				CHIPSET(0x191D, skl_gt2, "Intel(R) Skylake WKS GT2")

				CHIPSET(0x191E, skl_gt2, "Intel(R) Skylake ULX GT2")

				CHIPSET(0x1921, skl_gt2, "Intel(R) Skylake ULT GT2F")

				CHIPSET(0x1926, skl_gt3, "Intel(R) Skylake ULT GT3")

				CHIPSET(0x192A, skl_gt3, "Intel(R) Skylake SRV GT3")

				CHIPSET(0x192B, skl_gt3, "Intel(R) Skylake Halo GT3")

				CHIPSET(0x1932, skl_gt4, "Intel(R) Skylake GT4")

				CHIPSET(0x193A, skl_gt4, "Intel(R) Skylake GT4")

				CHIPSET(0x193B, skl_gt4, "Intel(R) Skylake GT4")

				CHIPSET(0x193D, skl_gt4, "Intel(R) Skylake GT4")

				CHIPSET(0x1902, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")

				CHIPSET(0x1906, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")

				CHIPSET(0x190A, skl_gt1, "Intel(R) Skylake GT1")

				CHIPSET(0x190B, skl_gt1, "Intel(R) HD Graphics 510 (Skylake GT1)")

				CHIPSET(0x190E, skl_gt1, "Intel(R) Skylake GT1")

				CHIPSET(0x1912, skl_gt2, "Intel(R) HD Graphics 530 (Skylake GT2)")

				CHIPSET(0x1913, skl_gt2, "Intel(R) Skylake GT2f")

				CHIPSET(0x1915, skl_gt2, "Intel(R) Skylake GT2f")

				CHIPSET(0x1916, skl_gt2, "Intel(R) HD Graphics 520 (Skylake GT2)")

				CHIPSET(0x1917, skl_gt2, "Intel(R) Skylake GT2f")

				CHIPSET(0x191A, skl_gt2, "Intel(R) Skylake GT2")

				CHIPSET(0x191B, skl_gt2, "Intel(R) HD Graphics 530 (Skylake GT2)")

				CHIPSET(0x191D, skl_gt2, "Intel(R) HD Graphics P530 (Skylake GT2)")

				CHIPSET(0x191E, skl_gt2, "Intel(R) HD Graphics 515 (Skylake GT2)")

				CHIPSET(0x1921, skl_gt2, "Intel(R) HD Graphics 520 (Skylake GT2)")

				CHIPSET(0x1923, skl_gt3, "Intel(R) Skylake GT3e")

				CHIPSET(0x1926, skl_gt3, "Intel(R) Iris Graphics 540 (Skylake GT3e)")

				CHIPSET(0x1927, skl_gt3, "Intel(R) Iris Graphics 550 (Skylake GT3e)")

				CHIPSET(0x192A, skl_gt4, "Intel(R) Skylake GT4")

				CHIPSET(0x192B, skl_gt3, "Intel(R) Iris Graphics 555 (Skylake GT3e)")

				CHIPSET(0x192D, skl_gt3, "Intel(R) Iris Graphics P555 (Skylake GT3e)")

				CHIPSET(0x1932, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4e)")

				CHIPSET(0x193A, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)")

				CHIPSET(0x193B, skl_gt4, "Intel(R) Iris Pro Graphics 580 (Skylake GT4e)")

				CHIPSET(0x193D, skl_gt4, "Intel(R) Iris Pro Graphics P580 (Skylake GT4e)")

				CHIPSET(0x5902, kbl_gt1, "Intel(R) Kabylake GT1")

				CHIPSET(0x5906, kbl_gt1, "Intel(R) Kabylake GT1")

				CHIPSET(0x590A, kbl_gt1, "Intel(R) Kabylake GT1")

				CHIPSET(0x590B, kbl_gt1, "Intel(R) Kabylake GT1")

				CHIPSET(0x590E, kbl_gt1, "Intel(R) Kabylake GT1")

				CHIPSET(0x5913, kbl_gt1_5, "Intel(R) Kabylake GT1.5")

				CHIPSET(0x5915, kbl_gt1_5, "Intel(R) Kabylake GT1.5")

				CHIPSET(0x5917, kbl_gt1_5, "Intel(R) Kabylake GT1.5")

				CHIPSET(0x5912, kbl_gt2, "Intel(R) Kabylake GT2")

				CHIPSET(0x5916, kbl_gt2, "Intel(R) Kabylake GT2")

				CHIPSET(0x591A, kbl_gt2, "Intel(R) Kabylake GT2")

				CHIPSET(0x591B, kbl_gt2, "Intel(R) Kabylake GT2")

				CHIPSET(0x591D, kbl_gt2, "Intel(R) Kabylake GT2")

				CHIPSET(0x591E, kbl_gt2, "Intel(R) Kabylake GT2")

				CHIPSET(0x5921, kbl_gt2, "Intel(R) Kabylake GT2F")

				CHIPSET(0x5926, kbl_gt3, "Intel(R) Kabylake GT3")

				CHIPSET(0x592A, kbl_gt3, "Intel(R) Kabylake GT3")

				CHIPSET(0x592B, kbl_gt3, "Intel(R) Kabylake GT3")

				CHIPSET(0x5932, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x593A, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x593B, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x593D, kbl_gt4, "Intel(R) Kabylake GT4")

				CHIPSET(0x22B0, chv,     "Intel(R) HD Graphics (Cherryview)")

				CHIPSET(0x22B1, chv,     "Intel(R) HD Graphics (Cherryview)")

				CHIPSET(0x22B2, chv,     "Intel(R) HD Graphics (Cherryview)")
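The `CHIPSET(...)` lines form an X-macro list: the header carries only data, and each consumer defines `CHIPSET` before including it. The sketch below shows one plausible consumer. It uses a hypothetical three-entry inline list (`DEMO_CHIPSET_LIST`) in place of including the real header, and the struct and function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical three-entry list standing in for i965_pci_ids.h; a real
 * consumer would #include the header instead. */
#define DEMO_CHIPSET_LIST \
   CHIPSET(0x1912, skl_gt2, "Intel(R) HD Graphics 530 (Skylake GT2)") \
   CHIPSET(0x5912, kbl_gt2, "Intel(R) Kabylake GT2") \
   CHIPSET(0x22B0, chv,     "Intel(R) HD Graphics (Cherryview)")

struct demo_chipset {
   int pci_id;
   const char *family;
   const char *name;
};

/* Expand each CHIPSET(id, family, name) entry into one table row. */
#define CHIPSET(id, family, name) { id, #family, name },
static const struct demo_chipset demo_chipsets[] = {
   DEMO_CHIPSET_LIST
};
#undef CHIPSET

/* Return the marketing name for a PCI id, or NULL if it is not listed. */
static const char *demo_chipset_name(int pci_id)
{
   size_t i;
   for (i = 0; i < sizeof(demo_chipsets) / sizeof(demo_chipsets[0]); i++) {
      if (demo_chipsets[i].pci_id == pci_id)
         return demo_chipsets[i].name;
   }
   return NULL;
}

/* Return nonzero when the PCI id is listed under the given family. */
static int demo_chipset_in_family(int pci_id, const char *family)
{
   size_t i;
   assert(family != NULL);
   for (i = 0; i < sizeof(demo_chipsets) / sizeof(demo_chipsets[0]); i++) {
      if (demo_chipsets[i].pci_id == pci_id)
         return strcmp(demo_chipsets[i].family, family) == 0;
   }
   return 0;
}
```

Keeping the list free of any struct definition means that adding an ID touches a single line, with no change to the consumers.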

									
										1

include/pci_ids/virtio_gpu_pci_ids.h
									
										Normal file
									
				@@ -0,0 +1 @@

				CHIPSET(0x0010, VIRTGL, VIRTGL)

									
										18

scons/gallium.py
									
				@@ -94,16 +94,8 @@ def msvc2013_compat(env):

				            '-Werror=pointer-arith',

				        ])

				def msvc2008_compat(env):

				    msvc2013_compat(env)

				    if env['gcc']:

				        env.Append(CFLAGS = [

				            '-Werror=declaration-after-statement',

				        ])

				def createMSVCCompatMethods(env):

				    env.AddMethod(msvc2013_compat, 'MSVC2013Compat')

				    env.AddMethod(msvc2008_compat, 'MSVC2008Compat')

				def num_jobs():

				@@ -300,7 +292,7 @@ def generate(env):

				    # C preprocessor options

				    cppdefines = []

				    cppdefines += ['__STDC_LIMIT_MACROS']

				    cppdefines += ['__STDC_LIMIT_MACROS', '__STDC_CONSTANT_MACROS']

				    if env['build'] in ('debug', 'checked'):

				        cppdefines += ['DEBUG']

				    else:

				@@ -479,20 +471,12 @@ def generate(env):

				        # See also:

				        # - http://msdn.microsoft.com/en-us/library/19z1t1wy.aspx

				        # - cl /?

				        if 'MSVC_VERSION' not in env or distutils.version.LooseVersion(env['MSVC_VERSION']) < distutils.version.LooseVersion('12.0'):

				            # Use bundled stdbool.h and stdint.h headers for older MSVC

				            # versions.  stdint.h was introduced in MSVC 2010, but stdbool.h

				            # was only introduced in MSVC 2013.

				            top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))

				            env.Append(CPPPATH = [os.path.join(top_dir, 'include/c99')])

				        if env['build'] == 'debug':

				            ccflags += [

				              '/Od', # disable optimizations

				              '/Oi', # enable intrinsic functions

				            ]

				        else:

				            if 'MSVC_VERSION' in env and distutils.version.LooseVersion(env['MSVC_VERSION']) < distutils.version.LooseVersion('11.0'):

				                print 'scons: warning: Visual Studio versions prior to 2012 are known to produce incorrect code when optimizations are enabled ( https://bugs.freedesktop.org/show_bug.cgi?id=58718 )'

				            ccflags += [

				                '/O2', # optimize for speed

				            ]

									
										14

scons/llvm.py
									
				@@ -106,7 +106,19 @@ def generate(env):

				        ])

				        env.Prepend(LIBPATH = [os.path.join(llvm_dir, 'lib')])

				        # LIBS should match the output of `llvm-config --libs engine mcjit bitwriter x86asmprinter`

				        if llvm_version >= distutils.version.LooseVersion('3.6'):

				        if llvm_version >= distutils.version.LooseVersion('3.7'):

				            env.Prepend(LIBS = [

				                'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',

				                'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',

				                'LLVMCodeGen', 'LLVMScalarOpts', 'LLVMProfileData',

				                'LLVMInstCombine', 'LLVMInstrumentation', 'LLVMTransformUtils', 'LLVMipa',

				                'LLVMAnalysis', 'LLVMX86Desc', 'LLVMMCDisassembler',

				                'LLVMX86Info', 'LLVMX86AsmPrinter', 'LLVMX86Utils',

				                'LLVMMCJIT', 'LLVMTarget', 'LLVMExecutionEngine',

				                'LLVMRuntimeDyld', 'LLVMObject', 'LLVMMCParser',

				                'LLVMBitReader', 'LLVMMC', 'LLVMCore', 'LLVMSupport'

				            ])

				        elif llvm_version >= distutils.version.LooseVersion('3.6'):

				            env.Prepend(LIBS = [

				                'LLVMBitWriter', 'LLVMX86Disassembler', 'LLVMX86AsmParser',

				                'LLVMX86CodeGen', 'LLVMSelectionDAG', 'LLVMAsmPrinter',

									
										5

src/Makefile.am
									
				@@ -21,8 +21,11 @@

				SUBDIRS = . gtest util mapi/glapi/gen mapi

				# include only conditionally ?

				SUBDIRS += compiler

				if NEED_OPENGL_COMMON

				SUBDIRS += glsl mesa

				SUBDIRS += mesa

				endif

				SUBDIRS += loader

									
										2

src/SConscript
									
				@@ -5,7 +5,7 @@ if env['platform'] == 'windows':

				    SConscript('getopt/SConscript')

				SConscript('util/SConscript')

				SConscript('glsl/SConscript')

				SConscript('compiler/SConscript')

				if env['hostonly']:

				    # We are just compiling the things necessary on the host for cross

1

src/compiler/.gitignore vendored Normal file

				@@ -0,0 +1 @@

				glsl_compiler

									
										46

src/glsl/Android.gen.mk → src/compiler/Android.gen.mk
									
				@@ -32,54 +32,16 @@ intermediates := $(call local-generated-sources-dir)

				LOCAL_SRC_FILES := $(LOCAL_SRC_FILES)

				LOCAL_C_INCLUDES += \

					$(intermediates)/glcpp \

					$(intermediates)/nir \

					$(MESA_TOP)/src/glsl/glcpp \

					$(MESA_TOP)/src/glsl/nir

					$(MESA_TOP)/src/compiler/nir

				LOCAL_EXPORT_C_INCLUDE_DIRS += \

					$(intermediates)/nir

					$(intermediates)/nir \

					$(MESA_TOP)/src/compiler/nir

				LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, \

					$(LIBGLCPP_GENERATED_FILES) \

					$(NIR_GENERATED_FILES) \

					$(LIBGLSL_GENERATED_CXX_FILES))

					$(NIR_GENERATED_FILES))

				define local-l-or-ll-to-c-or-cpp

					@mkdir -p $(dir $@)

					@echo "Mesa Lex: $(PRIVATE_MODULE) <= $<"

					$(hide) $(LEX) --nounistd -o$@ $<

				endef

				define glsl_local-y-to-c-and-h

					@mkdir -p $(dir $@)

					@echo "Mesa Yacc: $(PRIVATE_MODULE) <= $<"

					$(hide) $(YACC) -o $@ -p "glcpp_parser_" $<

				endef

				define local-yy-to-cpp-and-h

					@mkdir -p $(dir $@)

					@echo "Mesa Yacc: $(PRIVATE_MODULE) <= $<"

					$(hide) $(YACC) -p "_mesa_glsl_" -o $@ $<

					touch $(@:$1=$(YACC_HEADER_SUFFIX))

					echo '#ifndef '$(@F:$1=_h) > $(@:$1=.h)

					echo '#define '$(@F:$1=_h) >> $(@:$1=.h)

					cat $(@:$1=$(YACC_HEADER_SUFFIX)) >> $(@:$1=.h)

					echo '#endif' >> $(@:$1=.h)

					rm -f $(@:$1=$(YACC_HEADER_SUFFIX))

				endef

				$(intermediates)/glsl_lexer.cpp: $(LOCAL_PATH)/glsl_lexer.ll

					$(call local-l-or-ll-to-c-or-cpp)

				$(intermediates)/glsl_parser.cpp: $(LOCAL_PATH)/glsl_parser.yy

					$(call local-yy-to-cpp-and-h,.cpp)

				$(intermediates)/glcpp/glcpp-lex.c: $(LOCAL_PATH)/glcpp/glcpp-lex.l

					$(call local-l-or-ll-to-c-or-cpp)

				$(intermediates)/glcpp/glcpp-parse.c: $(LOCAL_PATH)/glcpp/glcpp-parse.y

					$(call glsl_local-y-to-c-and-h)

				nir_builder_opcodes_gen := $(LOCAL_PATH)/nir/nir_builder_opcodes_h.py

				nir_builder_opcodes_deps := \

									
										67

src/compiler/Android.mk
									
										Normal file
									
				@@ -0,0 +1,67 @@

				# Mesa 3-D graphics library

				#

				# Copyright (C) 2015 Intel Corporation

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice shall be included

				# in all copies or substantial portions of the Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER

				# DEALINGS IN THE SOFTWARE.

				LOCAL_PATH := $(call my-dir)

				include $(LOCAL_PATH)/Makefile.sources

				# ---------------------------------------

				# Build libmesa_compiler

				# ---------------------------------------

				include $(CLEAR_VARS)

				LOCAL_SRC_FILES := $(LIBCOMPILER_FILES)

				LOCAL_C_INCLUDES := \

					$(MESA_TOP)/src/mapi \

					$(MESA_TOP)/src/mesa \

					$(MESA_TOP)/src/gallium/include \

					$(MESA_TOP)/src/gallium/auxiliary

				LOCAL_MODULE := libmesa_compiler

				include $(MESA_COMMON_MK)

				include $(BUILD_STATIC_LIBRARY)

				# ---------------------------------------

				# Build libmesa_nir

				# ---------------------------------------

				include $(CLEAR_VARS)

				LOCAL_SRC_FILES := \

					$(NIR_FILES)

				LOCAL_C_INCLUDES := \

					$(MESA_TOP)/src/mapi \

					$(MESA_TOP)/src/mesa \

					$(MESA_TOP)/src/gallium/include \

					$(MESA_TOP)/src/gallium/auxiliary

				LOCAL_STATIC_LIBRARIES := libmesa_compiler

				LOCAL_MODULE := libmesa_nir

				include $(LOCAL_PATH)/Android.gen.mk

				include $(MESA_COMMON_MK)

				include $(BUILD_STATIC_LIBRARY)

									
										325

src/compiler/Makefile.am
									
										Normal file
									
				@@ -0,0 +1,325 @@

				#

				# Copyright © 2012 Jon TURNEY

				# Copyright (C) 2015 Intel Corporation

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice (including the next

				# paragraph) shall be included in all copies or substantial portions of the

				# Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

				# IN THE SOFTWARE.

				include Makefile.sources

				AM_CPPFLAGS = \

					-I$(top_srcdir)/include \

					-I$(top_srcdir)/src \

					-I$(top_srcdir)/src/mapi \

					-I$(top_srcdir)/src/mesa/ \

					-I$(top_builddir)/src/compiler/glsl\

					-I$(top_srcdir)/src/compiler/glsl\

					-I$(top_srcdir)/src/compiler/glsl/glcpp\

					-I$(top_srcdir)/src/gallium/include \

					-I$(top_srcdir)/src/gallium/auxiliary \

					-I$(top_srcdir)/src/gtest/include \

					$(DEFINES)

				AM_CFLAGS = \

					$(VISIBILITY_CFLAGS) \

					$(MSVC2013_COMPAT_CFLAGS)

				AM_CXXFLAGS = \

					$(VISIBILITY_CXXFLAGS) \

					$(MSVC2013_COMPAT_CXXFLAGS)

				noinst_LTLIBRARIES = libcompiler.la

				libcompiler_la_SOURCES = $(LIBCOMPILER_FILES)

				check_PROGRAMS =

				TESTS =

				BUILT_SOURCES =

				CLEANFILES =

				EXTRA_DIST = SConscript

				EXTRA_DIST += glsl/tests glsl/glcpp/tests glsl/README	\

					glsl/TODO glsl/glcpp/README			\

					glsl/glsl_lexer.ll				\

					glsl/glsl_parser.yy				\

					glsl/glcpp/glcpp-lex.l				\

					glsl/glcpp/glcpp-parse.y			\

					glsl/Makefile.sources				\

					glsl/SConscript

				TESTS += glsl/glcpp/tests/glcpp-test			\

					glsl/glcpp/tests/glcpp-test-cr-lf		\

					glsl/tests/blob-test				\

					glsl/tests/general-ir-test			\

					glsl/tests/optimization-test			\

					glsl/tests/sampler-types-test			\

					glsl/tests/uniform-initializer-test

				TESTS_ENVIRONMENT= \

					export PYTHON2=$(PYTHON2); \

					export PYTHON_FLAGS=$(PYTHON_FLAGS);

				check_PROGRAMS +=					\

					glsl/glcpp/glcpp				\

					glsl/glsl_test					\

					glsl/tests/blob-test				\

					glsl/tests/general-ir-test			\

					glsl/tests/sampler-types-test			\

					glsl/tests/uniform-initializer-test

				noinst_PROGRAMS = glsl_compiler

				glsl_tests_blob_test_SOURCES =				\

					glsl/tests/blob_test.c

				glsl_tests_blob_test_LDADD =				\

					glsl/libglsl.la

				glsl_tests_general_ir_test_SOURCES =			\

					glsl/standalone_scaffolding.cpp			\

					glsl/tests/builtin_variable_test.cpp		\

					glsl/tests/invalidate_locations_test.cpp	\

					glsl/tests/general_ir_test.cpp			\

					glsl/tests/varyings_test.cpp

				glsl_tests_general_ir_test_CFLAGS =			\

					$(PTHREAD_CFLAGS)

				glsl_tests_general_ir_test_LDADD =			\

					$(top_builddir)/src/gtest/libgtest.la		\

					glsl/libglsl.la		\

					$(top_builddir)/src/libglsl_util.la		\

					$(PTHREAD_LIBS)

				glsl_tests_uniform_initializer_test_SOURCES =		\

					glsl/tests/copy_constant_to_storage_tests.cpp	\

					glsl/tests/set_uniform_initializer_tests.cpp	\

					glsl/tests/uniform_initializer_utils.cpp	\

					glsl/tests/uniform_initializer_utils.h

				glsl_tests_uniform_initializer_test_CFLAGS =		\

					$(PTHREAD_CFLAGS)

				glsl_tests_uniform_initializer_test_LDADD =		\

					$(top_builddir)/src/gtest/libgtest.la		\

					glsl/libglsl.la		\

					$(top_builddir)/src/libglsl_util.la		\

					$(PTHREAD_LIBS)

				glsl_tests_sampler_types_test_SOURCES =			\

					glsl/tests/sampler_types_test.cpp

				glsl_tests_sampler_types_test_CFLAGS =			\

					$(PTHREAD_CFLAGS)

				glsl_tests_sampler_types_test_LDADD =			\

					$(top_builddir)/src/gtest/libgtest.la		\

					glsl/libglsl.la					\

					$(top_builddir)/src/libglsl_util.la		\

					$(PTHREAD_LIBS)

				noinst_LTLIBRARIES += glsl/libglsl.la glsl/libglcpp.la

				glsl_libglcpp_la_LIBADD =				\

					$(top_builddir)/src/util/libmesautil.la

				glsl_libglcpp_la_SOURCES =				\

					glsl/glcpp/glcpp-lex.c				\

					glsl/glcpp/glcpp-parse.c			\

					glsl/glcpp/glcpp-parse.h			\

					$(LIBGLCPP_FILES)

				glsl_glcpp_glcpp_SOURCES =				\

					glsl/glcpp/glcpp.c

				glsl_glcpp_glcpp_LDADD =				\

					glsl/libglcpp.la	\

					$(top_builddir)/src/libglsl_util.la		\

					-lm

				glsl_libglsl_la_LIBADD = \

					nir/libnir.la \

					glsl/libglcpp.la

				glsl_libglsl_la_SOURCES =				\

					glsl/glsl_lexer.cpp				\

					glsl/glsl_parser.cpp				\

					glsl/glsl_parser.h				\

					$(LIBGLSL_FILES)

				glsl_compiler_SOURCES = \

					$(GLSL_COMPILER_CXX_FILES)

				glsl_compiler_LDADD =					\

					glsl/libglsl.la					\

					$(top_builddir)/src/libglsl_util.la		\

					$(top_builddir)/src/util/libmesautil.la		\

					$(PTHREAD_LIBS)

				glsl_glsl_test_SOURCES = \

					glsl/standalone_scaffolding.cpp \

					glsl/test.cpp \

					glsl/test_optpass.cpp \

					glsl/test_optpass.h

				glsl_glsl_test_LDADD =					\

					glsl/libglsl.la					\

					$(top_builddir)/src/libglsl_util.la		\

					$(PTHREAD_LIBS)

				# We write our own rules for yacc and lex below. We'd rather use automake,

				# but automake makes it especially difficult for a number of reasons:

				#

				#  * < automake-1.12 generates .h files from .yy and .ypp files, but

				#    >=automake-1.12 generates .hh and .hpp files respectively. There's no

				#    good way of making a project that uses C++ yacc files compatible with

				#    both versions of automake. Strong work automake developers.

				#

				#  * Since we're generating code from .l/.y files in a subdirectory (glcpp/)

				#    we'd like the resulting generated code to also go in glcpp/ for purposes

				#    of distribution. Automake gives no way to do this.

				#

				#  * Since we're building multiple yacc parsers into one library (and via one

				#    Makefile) we have to use per-target YFLAGS. Using per-target YFLAGS causes

				#    automake to name the resulting generated code as <library-name>_filename.c.

				#    Frankly, that's ugly and we don't want a libglcpp_glcpp_parser.h file.

				# In order to make build output print "LEX" and "YACC", we reproduce the

				# automake variables below.

				AM_V_LEX = $(am__v_LEX_$(V))

				am__v_LEX_ = $(am__v_LEX_$(AM_DEFAULT_VERBOSITY))

				am__v_LEX_0 = @echo "  LEX     " $@;

				am__v_LEX_1 =

				AM_V_YACC = $(am__v_YACC_$(V))

				am__v_YACC_ = $(am__v_YACC_$(AM_DEFAULT_VERBOSITY))

				am__v_YACC_0 = @echo "  YACC    " $@;

				am__v_YACC_1 =

				MKDIR_GEN = $(AM_V_at)$(MKDIR_P) $(@D)

				YACC_GEN = $(AM_V_YACC)$(YACC) $(YFLAGS)

				LEX_GEN = $(AM_V_LEX)$(LEX) $(LFLAGS)

				glsl/glsl_parser.cpp glsl/glsl_parser.h: glsl/glsl_parser.yy

					$(MKDIR_GEN)

					$(YACC_GEN) -o $@ -p "_mesa_glsl_" --defines=$(builddir)/glsl/glsl_parser.h $(srcdir)/glsl/glsl_parser.yy

				glsl/glsl_lexer.cpp: glsl/glsl_lexer.ll

					$(MKDIR_GEN)

					$(LEX_GEN) -o $@ $(srcdir)/glsl/glsl_lexer.ll

				glsl/glcpp/glcpp-parse.c glsl/glcpp/glcpp-parse.h: glsl/glcpp/glcpp-parse.y

					$(MKDIR_GEN)

					$(YACC_GEN) -o $@ -p "glcpp_parser_" --defines=$(builddir)/glsl/glcpp/glcpp-parse.h $(srcdir)/glsl/glcpp/glcpp-parse.y

				glsl/glcpp/glcpp-lex.c: glsl/glcpp/glcpp-lex.l

					$(MKDIR_GEN)

					$(LEX_GEN) -o $@ $(srcdir)/glsl/glcpp/glcpp-lex.l

				# Only the parsers (specifically the header files generated at the same time)

				# need to be in BUILT_SOURCES. Though if we list the parser headers YACC is

				# called for the .c/.cpp file and the .h files. By listing the .c/.cpp files

				# YACC is only executed once for each parser. The rest of the generated code

				# will be created at the appropriate times according to standard automake

				# dependency rules.

				BUILT_SOURCES +=					\

					glsl/glsl_parser.cpp				\

					glsl/glsl_lexer.cpp				\

					glsl/glcpp/glcpp-parse.c			\

					glsl/glcpp/glcpp-lex.c

				CLEANFILES +=						\

					glsl/glcpp/glcpp-parse.h			\

					glsl/glsl_parser.h				\

					glsl/glsl_parser.cpp				\

					glsl/glsl_lexer.cpp				\

					glsl/glcpp/glcpp-parse.c			\

					glsl/glcpp/glcpp-lex.c

				clean-local:

					$(RM) -r subtest-cr subtest-cr-lf subtest-lf subtest-lf-cr

				dist-hook:

					$(RM) glsl/glcpp/tests/*.out

					$(RM) glsl/glcpp/tests/subtest*/*.out

				noinst_LTLIBRARIES += nir/libnir.la

				nir_libnir_la_CPPFLAGS = \

					$(AM_CPPFLAGS) \

					-I$(top_builddir)/src/compiler/nir \

					-I$(top_srcdir)/src/compiler/nir

				nir_libnir_la_LIBADD = \

					libcompiler.la

				nir_libnir_la_SOURCES =					\

					$(NIR_FILES)					\

					$(NIR_GENERATED_FILES)

				PYTHON_GEN = $(AM_V_GEN)$(PYTHON2) $(PYTHON_FLAGS)

				nir/nir_builder_opcodes.h: nir/nir_opcodes.py nir/nir_builder_opcodes_h.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_builder_opcodes_h.py > $@ || ($(RM) $@; false)

				nir/nir_constant_expressions.c: nir/nir_opcodes.py nir/nir_constant_expressions.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_constant_expressions.py > $@ || ($(RM) $@; false)

				nir/nir_opcodes.h: nir/nir_opcodes.py nir/nir_opcodes_h.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_opcodes_h.py > $@ || ($(RM) $@; false)

				nir/nir_opcodes.c: nir/nir_opcodes.py nir/nir_opcodes_c.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_opcodes_c.py > $@ || ($(RM) $@; false)

				nir/nir_opt_algebraic.c: nir/nir_opt_algebraic.py nir/nir_algebraic.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_opt_algebraic.py > $@ || ($(RM) $@; false)

				check_PROGRAMS += nir/tests/control_flow_tests

				nir_tests_control_flow_tests_CPPFLAGS = \

					$(AM_CPPFLAGS) \

					-I$(top_builddir)/src/compiler/nir \

					-I$(top_srcdir)/src/compiler/nir

				nir_tests_control_flow_tests_SOURCES =			\

					nir/tests/control_flow_tests.cpp

				nir_tests_control_flow_tests_CFLAGS =			\

					$(PTHREAD_CFLAGS)

				nir_tests_control_flow_tests_LDADD =			\

					$(top_builddir)/src/gtest/libgtest.la		\

					nir/libnir.la	\

					$(top_builddir)/src/util/libmesautil.la		\

					$(PTHREAD_LIBS)

				TESTS += nir/tests/control_flow_tests

				BUILT_SOURCES += $(NIR_GENERATED_FILES)

				CLEANFILES += $(NIR_GENERATED_FILES)

				EXTRA_DIST += \

					nir/nir_algebraic.py				\

					nir/nir_builder_opcodes_h.py			\

					nir/nir_constant_expressions.py			\

					nir/nir_opcodes.py				\

					nir/nir_opcodes_c.py				\

					nir/nir_opcodes_h.py				\

					nir/nir_opt_algebraic.py			\

					nir/tests					\

					nir/Makefile.sources
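
Note on the NIR generator rules above: each recipe redirects the script's output into `$@` and, if the script fails, deletes the target while still reporting failure (`|| ($(RM) $@; false)`). Without the cleanup, a failed run would leave an empty or partial file that make could treat as up to date on the next build. A minimal Python sketch of the same pattern follows; `generate` is a hypothetical helper written for illustration and is not part of Mesa.

```python
import os
import subprocess
import sys


def generate(cmd, target):
    """Run cmd with stdout redirected to target.

    Mirrors `cmd > target || (rm target; false)`: on a non-zero exit
    status the partially written target is removed and False is returned.
    """
    with open(target, "w") as out:
        result = subprocess.run(cmd, stdout=out)
    if result.returncode != 0:
        # The redirect already created the file, so remove it to make
        # sure the next build regenerates it.
        os.remove(target)
        return False
    return True


if __name__ == "__main__":
    # Hypothetical usage: the "generator" here is just a one-line script.
    ok = generate([sys.executable, "-c", "print('/* generated */')"], "demo_out.h")
    print("generated" if ok else "failed, target removed")
```

The helper returns a boolean rather than raising so the caller decides how to report the failure, much as make does with the recipe's exit status.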

									
										226

src/compiler/Makefile.sources
									
										Normal file
									
												
				@@ -0,0 +1,226 @@

				LIBCOMPILER_FILES = \

					builtin_type_macros.h \

					glsl_types.cpp \

					glsl_types.h \

					nir_types.cpp \

					nir_types.h \

					shader_enums.c \

					shader_enums.h

				# libglsl

				LIBGLSL_FILES = \

					glsl/ast.h \

					glsl/ast_array_index.cpp \

					glsl/ast_expr.cpp \

					glsl/ast_function.cpp \

					glsl/ast_to_hir.cpp \

					glsl/ast_type.cpp \

					glsl/blob.c \

					glsl/blob.h \

					glsl/builtin_functions.cpp \

					glsl/builtin_types.cpp \

					glsl/builtin_variables.cpp \

					glsl/glsl_parser_extras.cpp \

					glsl/glsl_parser_extras.h \

					glsl/glsl_symbol_table.cpp \

					glsl/glsl_symbol_table.h \

					glsl/hir_field_selection.cpp \

					glsl/ir_basic_block.cpp \

					glsl/ir_basic_block.h \

					glsl/ir_builder.cpp \

					glsl/ir_builder.h \

					glsl/ir_clone.cpp \

					glsl/ir_constant_expression.cpp \

					glsl/ir.cpp \

					glsl/ir.h \

					glsl/ir_equals.cpp \

					glsl/ir_expression_flattening.cpp \

					glsl/ir_expression_flattening.h \

					glsl/ir_function_can_inline.cpp \

					glsl/ir_function_detect_recursion.cpp \

					glsl/ir_function_inlining.h \

					glsl/ir_function.cpp \

					glsl/ir_hierarchical_visitor.cpp \

					glsl/ir_hierarchical_visitor.h \

					glsl/ir_hv_accept.cpp \

					glsl/ir_import_prototypes.cpp \

					glsl/ir_optimization.h \

					glsl/ir_print_visitor.cpp \

					glsl/ir_print_visitor.h \

					glsl/ir_reader.cpp \

					glsl/ir_reader.h \

					glsl/ir_rvalue_visitor.cpp \

					glsl/ir_rvalue_visitor.h \

					glsl/ir_set_program_inouts.cpp \

					glsl/ir_uniform.h \

					glsl/ir_validate.cpp \

					glsl/ir_variable_refcount.cpp \

					glsl/ir_variable_refcount.h \

					glsl/ir_visitor.h \

					glsl/linker.cpp \

					glsl/linker.h \

					glsl/link_atomics.cpp \

					glsl/link_functions.cpp \

					glsl/link_interface_blocks.cpp \

					glsl/link_uniforms.cpp \

					glsl/link_uniform_initializers.cpp \

					glsl/link_uniform_block_active_visitor.cpp \

					glsl/link_uniform_block_active_visitor.h \

					glsl/link_uniform_blocks.cpp \

					glsl/link_varyings.cpp \

					glsl/link_varyings.h \

					glsl/list.h \

					glsl/loop_analysis.cpp \

					glsl/loop_analysis.h \

					glsl/loop_controls.cpp \

					glsl/loop_unroll.cpp \

					glsl/lower_buffer_access.cpp \

					glsl/lower_buffer_access.h \

					glsl/lower_clip_distance.cpp \

					glsl/lower_const_arrays_to_uniforms.cpp \

					glsl/lower_discard.cpp \

					glsl/lower_discard_flow.cpp \

					glsl/lower_if_to_cond_assign.cpp \

					glsl/lower_instructions.cpp \

					glsl/lower_jumps.cpp \

					glsl/lower_mat_op_to_vec.cpp \

					glsl/lower_noise.cpp \

					glsl/lower_offset_array.cpp \

					glsl/lower_packed_varyings.cpp \

					glsl/lower_named_interface_blocks.cpp \

					glsl/lower_packing_builtins.cpp \

					glsl/lower_subroutine.cpp \

					glsl/lower_tess_level.cpp \

					glsl/lower_texture_projection.cpp \

					glsl/lower_variable_index_to_cond_assign.cpp \

					glsl/lower_vec_index_to_cond_assign.cpp \

					glsl/lower_vec_index_to_swizzle.cpp \

					glsl/lower_vector.cpp \

					glsl/lower_vector_derefs.cpp \

					glsl/lower_vector_insert.cpp \

					glsl/lower_vertex_id.cpp \

					glsl/lower_output_reads.cpp \

					glsl/lower_shared_reference.cpp \

					glsl/lower_ubo_reference.cpp \

					glsl/opt_algebraic.cpp \

					glsl/opt_array_splitting.cpp \

					glsl/opt_conditional_discard.cpp \

					glsl/opt_constant_folding.cpp \

					glsl/opt_constant_propagation.cpp \

					glsl/opt_constant_variable.cpp \

					glsl/opt_copy_propagation.cpp \

					glsl/opt_copy_propagation_elements.cpp \

					glsl/opt_dead_builtin_variables.cpp \

					glsl/opt_dead_builtin_varyings.cpp \

					glsl/opt_dead_code.cpp \

					glsl/opt_dead_code_local.cpp \

					glsl/opt_dead_functions.cpp \

					glsl/opt_flatten_nested_if_blocks.cpp \

					glsl/opt_flip_matrices.cpp \

					glsl/opt_function_inlining.cpp \

					glsl/opt_if_simplification.cpp \

					glsl/opt_minmax.cpp \

					glsl/opt_noop_swizzle.cpp \

					glsl/opt_rebalance_tree.cpp \

					glsl/opt_redundant_jumps.cpp \

					glsl/opt_structure_splitting.cpp \

					glsl/opt_swizzle_swizzle.cpp \

					glsl/opt_tree_grafting.cpp \

					glsl/opt_vectorize.cpp \

					glsl/program.h \

					glsl/s_expression.cpp \

					glsl/s_expression.h

				# glsl_compiler

				GLSL_COMPILER_CXX_FILES = \

					glsl/standalone_scaffolding.cpp \

					glsl/standalone_scaffolding.h \

					glsl/main.cpp

				# libglsl generated sources

				LIBGLSL_GENERATED_CXX_FILES = \

					glsl/glsl_lexer.cpp \

					glsl/glsl_parser.cpp

				# libglcpp

				LIBGLCPP_FILES = \

					glsl/glcpp/glcpp.h \

					glsl/glcpp/pp.c

				LIBGLCPP_GENERATED_FILES = \

					glsl/glcpp/glcpp-lex.c \

					glsl/glcpp/glcpp-parse.c

				NIR_GENERATED_FILES = \

					nir/nir_builder_opcodes.h \

					nir/nir_constant_expressions.c \

					nir/nir_opcodes.c \

					nir/nir_opcodes.h \

					nir/nir_opt_algebraic.c

				NIR_FILES = \

					nir/glsl_to_nir.cpp \

					nir/glsl_to_nir.h \

					nir/nir.c \

					nir/nir.h \

					nir/nir_array.h \

					nir/nir_builder.h \

					nir/nir_clone.c \

					nir/nir_constant_expressions.h \

					nir/nir_control_flow.c \

					nir/nir_control_flow.h \

					nir/nir_control_flow_private.h \

					nir/nir_dominance.c \

					nir/nir_from_ssa.c \

					nir/nir_gs_count_vertices.c \

					nir/nir_intrinsics.c \

					nir/nir_intrinsics.h \

					nir/nir_instr_set.c \

					nir/nir_instr_set.h \

					nir/nir_liveness.c \

					nir/nir_lower_alu_to_scalar.c \

					nir/nir_lower_atomics.c \

					nir/nir_lower_clip.c \

					nir/nir_lower_global_vars_to_local.c \

					nir/nir_lower_gs_intrinsics.c \

					nir/nir_lower_load_const_to_scalar.c \

					nir/nir_lower_locals_to_regs.c \

					nir/nir_lower_idiv.c \

					nir/nir_lower_io.c \

					nir/nir_lower_outputs_to_temporaries.c \

					nir/nir_lower_phis_to_scalar.c \

					nir/nir_lower_samplers.c \

					nir/nir_lower_system_values.c \

					nir/nir_lower_tex.c \

					nir/nir_lower_to_source_mods.c \

					nir/nir_lower_two_sided_color.c \

					nir/nir_lower_vars_to_ssa.c \

					nir/nir_lower_var_copies.c \

					nir/nir_lower_vec_to_movs.c \

					nir/nir_metadata.c \

					nir/nir_move_vec_src_uses_to_dest.c \

					nir/nir_normalize_cubemap_coords.c \

					nir/nir_opt_constant_folding.c \

					nir/nir_opt_copy_propagate.c \

					nir/nir_opt_cse.c \

					nir/nir_opt_dce.c \

					nir/nir_opt_dead_cf.c \

					nir/nir_opt_gcm.c \

					nir/nir_opt_global_to_local.c \

					nir/nir_opt_peephole_select.c \

					nir/nir_opt_remove_phis.c \

					nir/nir_opt_undef.c \

					nir/nir_print.c \

					nir/nir_remove_dead_variables.c \

					nir/nir_search.c \

					nir/nir_search.h \

					nir/nir_split_var_copies.c \

					nir/nir_sweep.c \

					nir/nir_to_ssa.c \

					nir/nir_validate.c \

					nir/nir_vla.h \

					nir/nir_worklist.c \

					nir/nir_worklist.h

									
										24

src/compiler/SConscript
									
										Normal file
									
												
				@@ -0,0 +1,24 @@

				Import('*')

				env = env.Clone()

				env.MSVC2013Compat()

				env.Prepend(CPPPATH = [

				    '#include',

				    '#src',

				    '#src/mapi',

				    '#src/mesa',

				    '#src/gallium/include',

				    '#src/gallium/auxiliary',

				])

				sources = env.ParseSourceList('Makefile.sources', 'LIBCOMPILER_FILES')

				compiler = env.ConvenienceLibrary(

				    target = 'compiler',

				    source = sources

				)

				Export('compiler')

				SConscript('glsl/SConscript')
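
Note on `env.ParseSourceList('Makefile.sources', 'LIBCOMPILER_FILES')` above: the SCons build reads the `LIBCOMPILER_FILES` list from `Makefile.sources` instead of repeating the file names. The sketch below illustrates the general idea of pulling one variable's words out of a Makefile-style file. It is not Mesa's implementation; `parse_source_list` is a hypothetical function that assumes plain `NAME = ...` assignments with backslash continuations and ignores `+=`, `:=` and variable references.

```python
def parse_source_list(text, name):
    """Return the list of words assigned to `name` in Makefile-style text."""
    # Join backslash-continued lines into single logical lines.
    logical = text.replace("\\\n", " ")
    for line in logical.splitlines():
        # Drop comments, then skip lines that are not assignments.
        line = line.split("#", 1)[0]
        if "=" not in line:
            continue
        var, _, value = line.partition("=")
        if var.strip() == name:
            return value.split()
    raise KeyError(name)


if __name__ == "__main__":
    sample = (
        "LIBCOMPILER_FILES = \\\n"
        "\tbuiltin_type_macros.h \\\n"
        "\tglsl_types.cpp \\\n"
        "\tshader_enums.h\n"
    )
    print(parse_source_list(sample, "LIBCOMPILER_FILES"))
```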

									
										3

src/glsl/builtin_type_macros.h → src/compiler/builtin_type_macros.h
									
												
				@@ -28,8 +28,6 @@

				 * language version or extension might provide them.

				 */

				#include "glsl_types.h"

				DECL_TYPE(error,  GL_INVALID_ENUM, GLSL_TYPE_ERROR, 0, 0)

				DECL_TYPE(void,   GL_INVALID_ENUM, GLSL_TYPE_VOID,  0, 0)

				@@ -80,6 +78,7 @@ DECL_TYPE(dmat3x4, GL_DOUBLE_MAT3x4, GLSL_TYPE_DOUBLE, 4, 3)

				DECL_TYPE(dmat4x2, GL_DOUBLE_MAT4x2, GLSL_TYPE_DOUBLE, 2, 4)

				DECL_TYPE(dmat4x3, GL_DOUBLE_MAT4x3, GLSL_TYPE_DOUBLE, 3, 4)

				DECL_TYPE(sampler,           GL_SAMPLER_1D,                   GLSL_TYPE_SAMPLER, GLSL_SAMPLER_DIM_1D,   0, 0, GLSL_TYPE_VOID)

				DECL_TYPE(sampler1D,         GL_SAMPLER_1D,                   GLSL_TYPE_SAMPLER, GLSL_SAMPLER_DIM_1D,   0, 0, GLSL_TYPE_FLOAT)

				DECL_TYPE(sampler2D,         GL_SAMPLER_2D,                   GLSL_TYPE_SAMPLER, GLSL_SAMPLER_DIM_2D,   0, 0, GLSL_TYPE_FLOAT)

				DECL_TYPE(sampler3D,         GL_SAMPLER_3D,                   GLSL_TYPE_SAMPLER, GLSL_SAMPLER_DIM_3D,   0, 0, GLSL_TYPE_FLOAT)

1

src/glsl/.gitignore → src/compiler/glsl/.gitignore vendored


@@ -1,4 +1,3 @@
 glsl_compiler
 glsl_lexer.cpp
 glsl_parser.cpp
 glsl_parser.h

									
										76

src/compiler/glsl/Android.gen.mk
									
										Normal file
									
												
				@@ -0,0 +1,76 @@

				# Mesa 3-D graphics library

				#

				# Copyright (C) 2010-2011 Chia-I Wu <olvaffe@gmail.com>

				# Copyright (C) 2010-2011 LunarG Inc.

				#

				# Permission is hereby granted, free of charge, to any person obtaining a

				# copy of this software and associated documentation files (the "Software"),

				# to deal in the Software without restriction, including without limitation

				# the rights to use, copy, modify, merge, publish, distribute, sublicense,

				# and/or sell copies of the Software, and to permit persons to whom the

				# Software is furnished to do so, subject to the following conditions:

				#

				# The above copyright notice and this permission notice shall be included

				# in all copies or substantial portions of the Software.

				#

				# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR

				# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING

				# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER

				# DEALINGS IN THE SOFTWARE.

				# included by glsl Android.mk for source generation

				ifeq ($(LOCAL_MODULE_CLASS),)

				LOCAL_MODULE_CLASS := STATIC_LIBRARIES

				endif

				intermediates := $(call local-generated-sources-dir)

				LOCAL_SRC_FILES := $(LOCAL_SRC_FILES)

				LOCAL_C_INCLUDES += \

					$(intermediates)/glcpp \

					$(MESA_TOP)/src/glsl/glcpp \

				LOCAL_GENERATED_SOURCES += $(addprefix $(intermediates)/, \

					$(LIBGLCPP_GENERATED_FILES) \

					$(LIBGLSL_GENERATED_CXX_FILES))

				define local-l-or-ll-to-c-or-cpp

					@mkdir -p $(dir $@)

					@echo "Mesa Lex: $(PRIVATE_MODULE) <= $<"

					$(hide) $(LEX) --nounistd -o$@ $<

				endef

				define glsl_local-y-to-c-and-h

					@mkdir -p $(dir $@)

					@echo "Mesa Yacc: $(PRIVATE_MODULE) <= $<"

					$(hide) $(YACC) -o $@ -p "glcpp_parser_" $<

				endef

				define local-yy-to-cpp-and-h

					@mkdir -p $(dir $@)

					@echo "Mesa Yacc: $(PRIVATE_MODULE) <= $<"

					$(hide) $(YACC) -p "_mesa_glsl_" -o $@ $<

					touch $(@:$1=$(YACC_HEADER_SUFFIX))

					echo '#ifndef '$(@F:$1=_h) > $(@:$1=.h)

					echo '#define '$(@F:$1=_h) >> $(@:$1=.h)

					cat $(@:$1=$(YACC_HEADER_SUFFIX)) >> $(@:$1=.h)

					echo '#endif' >> $(@:$1=.h)

					rm -f $(@:$1=$(YACC_HEADER_SUFFIX))

				endef

				$(intermediates)/glsl_lexer.cpp: $(LOCAL_PATH)/glsl_lexer.ll

					$(call local-l-or-ll-to-c-or-cpp)

				$(intermediates)/glsl_parser.cpp: $(LOCAL_PATH)/glsl_parser.yy

					$(call local-yy-to-cpp-and-h,.cpp)

				$(intermediates)/glcpp/glcpp-lex.c: $(LOCAL_PATH)/glcpp/glcpp-lex.l

					$(call local-l-or-ll-to-c-or-cpp)

				$(intermediates)/glcpp/glcpp-parse.c: $(LOCAL_PATH)/glcpp/glcpp-parse.y

					$(call glsl_local-y-to-c-and-h)
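
Note on `local-yy-to-cpp-and-h` above: after running yacc, the recipe builds the `.h` file by wrapping the yacc-generated header in an include guard. The guard name is the target file name with the suffix argument replaced by `_h`, so `glsl_parser.cpp` with `.cpp` gives `glsl_parser_h`. The sketch below mirrors the `echo`/`cat` steps in Python; both function names are hypothetical and exist only for illustration.

```python
def guard_name(target_file, suffix):
    """Mimic make's $(@F:suffix=_h) substitution for a single file name."""
    if not target_file.endswith(suffix):
        return target_file
    return target_file[: -len(suffix)] + "_h"


def wrap_with_guard(target_file, suffix, header_body):
    """Return header_body wrapped in #ifndef/#define/#endif lines."""
    guard = guard_name(target_file, suffix)
    return "#ifndef {0}\n#define {0}\n{1}#endif\n".format(guard, header_body)


if __name__ == "__main__":
    # The header body here is a stand-in for the yacc-generated declarations.
    print(wrap_with_guard("glsl_parser.cpp", ".cpp", "int _mesa_glsl_parse(void);\n"))
```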

									
										2

src/glsl/Android.mk → src/compiler/glsl/Android.mk
									
												
				@@ -44,6 +44,8 @@ LOCAL_C_INCLUDES := \

					$(MESA_TOP)/src/gallium/include \

					$(MESA_TOP)/src/gallium/auxiliary

				LOCAL_STATIC_LIBRARIES := libmesa_compiler

				LOCAL_MODULE := libmesa_glsl

				include $(LOCAL_PATH)/Android.gen.mk

									
										53

src/glsl/Makefile.am → src/compiler/glsl/Makefile.am
									
												
				@@ -27,9 +27,7 @@ AM_CPPFLAGS = \

					-I$(top_srcdir)/src/gallium/include \

					-I$(top_srcdir)/src/gallium/auxiliary \

					-I$(top_srcdir)/src/glsl/glcpp \

					-I$(top_srcdir)/src/glsl/nir \

					-I$(top_srcdir)/src/gtest/include \

					-I$(top_builddir)/src/glsl/nir \

					$(DEFINES)

				AM_CFLAGS = \

					$(VISIBILITY_CFLAGS) \

				@@ -43,13 +41,6 @@ EXTRA_DIST = tests glcpp/tests README TODO glcpp/README	\

					glsl_parser.yy					\

					glcpp/glcpp-lex.l				\

					glcpp/glcpp-parse.y				\

					nir/nir_algebraic.py				\

					nir/nir_builder_opcodes_h.py			\

					nir/nir_constant_expressions.py			\

					nir/nir_opcodes.py				\

					nir/nir_opcodes_c.py				\

					nir/nir_opcodes_h.py				\

					nir/nir_opt_algebraic.py			\

					SConscript

				include Makefile.sources

				@@ -66,7 +57,7 @@ TESTS_ENVIRONMENT= \

					export PYTHON2=$(PYTHON2); \

					export PYTHON_FLAGS=$(PYTHON_FLAGS);

				noinst_LTLIBRARIES = libnir.la libglsl.la libglcpp.la

				noinst_LTLIBRARIES = libglsl.la libglcpp.la

				check_PROGRAMS =					\

					glcpp/glcpp					\

					glsl_test					\

				@@ -134,29 +125,24 @@ glcpp_glcpp_LDADD =					\

					$(top_builddir)/src/libglsl_util.la		\

					-lm

				libglsl_la_LIBADD = libglcpp.la

				libglsl_la_LIBADD = \

					$(top_builddir)/src/compiler/nir/libnir.la \

					libglcpp.la

				libglsl_la_SOURCES =					\

					glsl_lexer.cpp					\

					glsl_parser.cpp					\

					glsl_parser.h					\

					$(LIBGLSL_FILES)				\

					$(NIR_FILES)					\

					$(NIR_GENERATED_FILES)

					$(LIBGLSL_FILES)

				libnir_la_SOURCES =					\

					glsl_types.cpp					\

					builtin_types.cpp				\

					glsl_symbol_table.cpp				\

					$(NIR_FILES)					\

					$(NIR_GENERATED_FILES)

				glsl_compiler_SOURCES = \

					$(GLSL_COMPILER_CXX_FILES)

				glsl_compiler_LDADD =					\

					libglsl.la					\

					$(top_builddir)/src/libglsl_util.la		\

					$(top_builddir)/src/util/libmesautil.la		\

					$(PTHREAD_LIBS)

				glsl_test_SOURCES = \

				@@ -228,8 +214,7 @@ BUILT_SOURCES =						\

					glsl_parser.cpp					\

					glsl_lexer.cpp					\

					glcpp/glcpp-parse.c				\

					glcpp/glcpp-lex.c				\

					$(NIR_GENERATED_FILES)

					glcpp/glcpp-lex.c

				CLEANFILES =						\

					glcpp/glcpp-parse.h				\

					glsl_parser.h					\

				@@ -241,25 +226,3 @@ clean-local:

				dist-hook:

					$(RM) glcpp/tests/*.out

					$(RM) glcpp/tests/subtest*/*.out

				PYTHON_GEN = $(AM_V_GEN)$(PYTHON2) $(PYTHON_FLAGS)

				nir/nir_builder_opcodes.h: nir/nir_opcodes.py nir/nir_builder_opcodes_h.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_builder_opcodes_h.py > $@

				nir/nir_constant_expressions.c: nir/nir_opcodes.py nir/nir_constant_expressions.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_constant_expressions.py > $@

				nir/nir_opcodes.h: nir/nir_opcodes.py nir/nir_opcodes_h.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_opcodes_h.py > $@

				nir/nir_opcodes.c: nir/nir_opcodes.py nir/nir_opcodes_c.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_opcodes_c.py > $@

				nir/nir_opt_algebraic.c: nir/nir_opt_algebraic.py nir/nir_algebraic.py

					$(MKDIR_GEN)

					$(PYTHON_GEN) $(srcdir)/nir/nir_opt_algebraic.py > $@

									
										42

src/glsl/Makefile.sources → src/compiler/glsl/Makefile.sources
									
												
				@@ -18,42 +18,52 @@ NIR_GENERATED_FILES = \

					nir/nir_opt_algebraic.c

				NIR_FILES = \

					nir/glsl_to_nir.cpp \

					nir/glsl_to_nir.h \

					nir/nir.c \

					nir/nir.h \

					nir/nir_array.h \

					nir/nir_builder.h \

					nir/nir_clone.c \

					nir/nir_constant_expressions.h \

					nir/nir_control_flow.c \

					nir/nir_control_flow.h \

					nir/nir_control_flow_private.h \

					nir/nir_dominance.c \

					nir/nir_from_ssa.c \

					nir/nir_gs_count_vertices.c \

					nir/nir_intrinsics.c \

					nir/nir_intrinsics.h \

					nir/nir_live_variables.c \

					nir/nir_instr_set.c \

					nir/nir_instr_set.h \

					nir/nir_liveness.c \

					nir/nir_lower_alu_to_scalar.c \

					nir/nir_lower_atomics.c \

					nir/nir_lower_clip.c \

					nir/nir_lower_global_vars_to_local.c \

					nir/nir_lower_gs_intrinsics.c \

					nir/nir_lower_load_const_to_scalar.c \

					nir/nir_lower_locals_to_regs.c \

					nir/nir_lower_idiv.c \

					nir/nir_lower_io.c \

					nir/nir_lower_outputs_to_temporaries.c \

					nir/nir_lower_phis_to_scalar.c \

					nir/nir_lower_samplers.cpp \

					nir/nir_lower_samplers.c \

					nir/nir_lower_system_values.c \

					nir/nir_lower_tex_projector.c \

					nir/nir_lower_tex.c \

					nir/nir_lower_to_source_mods.c \

					nir/nir_lower_two_sided_color.c \

					nir/nir_lower_vars_to_ssa.c \

					nir/nir_lower_var_copies.c \

					nir/nir_lower_vec_to_movs.c \

					nir/nir_metadata.c \

					nir/nir_move_vec_src_uses_to_dest.c \

					nir/nir_normalize_cubemap_coords.c \

					nir/nir_opt_constant_folding.c \

					nir/nir_opt_copy_propagate.c \

					nir/nir_opt_cse.c \

					nir/nir_opt_dce.c \

					nir/nir_opt_dead_cf.c \

					nir/nir_opt_gcm.c \

					nir/nir_opt_global_to_local.c \

					nir/nir_opt_peephole_ffma.c \

					nir/nir_opt_peephole_select.c \

					nir/nir_opt_remove_phis.c \

					nir/nir_opt_undef.c \

				@@ -64,12 +74,10 @@ NIR_FILES = \

					nir/nir_split_var_copies.c \

					nir/nir_sweep.c \

					nir/nir_to_ssa.c \

					nir/nir_types.h \

					nir/nir_validate.c \

					nir/nir_vla.h \

					nir/nir_worklist.c \

					nir/nir_worklist.h \

					nir/nir_types.cpp

					nir/nir_worklist.h

				# libglsl

				@@ -83,15 +91,12 @@ LIBGLSL_FILES = \

					blob.c \

					blob.h \

					builtin_functions.cpp \

					builtin_type_macros.h \

					builtin_types.cpp \

					builtin_variables.cpp \

					glsl_parser_extras.cpp \

					glsl_parser_extras.h \

					glsl_symbol_table.cpp \

					glsl_symbol_table.h \

					glsl_types.cpp \

					glsl_types.h \

					hir_field_selection.cpp \

					ir_basic_block.cpp \

					ir_basic_block.h \

				@@ -142,6 +147,8 @@ LIBGLSL_FILES = \

					loop_analysis.h \

					loop_controls.cpp \

					loop_unroll.cpp \

					lower_buffer_access.cpp \

					lower_buffer_access.h \

					lower_clip_distance.cpp \

					lower_const_arrays_to_uniforms.cpp \

					lower_discard.cpp \

				@@ -162,9 +169,11 @@ LIBGLSL_FILES = \

					lower_vec_index_to_cond_assign.cpp \

					lower_vec_index_to_swizzle.cpp \

					lower_vector.cpp \

					lower_vector_derefs.cpp \

					lower_vector_insert.cpp \

					lower_vertex_id.cpp \

					lower_output_reads.cpp \

					lower_shared_reference.cpp \

					lower_ubo_reference.cpp \

					opt_algebraic.cpp \

					opt_array_splitting.cpp \

				@@ -174,7 +183,6 @@ LIBGLSL_FILES = \

					opt_constant_variable.cpp \

					opt_copy_propagation.cpp \

					opt_copy_propagation_elements.cpp \

					opt_cse.cpp \

					opt_dead_builtin_variables.cpp \

					opt_dead_builtin_varyings.cpp \

					opt_dead_code.cpp \

				@@ -194,8 +202,12 @@ LIBGLSL_FILES = \

					opt_vectorize.cpp \

					program.h \

					s_expression.cpp \

					s_expression.h \

					shader_enums.h

					s_expression.h

				# glsl to nir pass

				GLSL_TO_NIR_FILES = \

					nir/glsl_to_nir.cpp \

					nir/glsl_to_nir.h

				# glsl_compiler

0

src/glsl/README → src/compiler/glsl/README


									
										2

src/glsl/SConscript → src/compiler/glsl/SConscript
									
												
				@@ -107,7 +107,7 @@ if env['platform'] == 'windows':

				        'user32',

				    ])

				env.Prepend(LIBS = [glsl])

				env.Prepend(LIBS = [compiler, glsl])

				glsl_compiler = env.Program(

				    target = 'glsl_compiler',

0

src/glsl/TODO → src/compiler/glsl/TODO


									
										110

src/glsl/ast.h → src/compiler/glsl/ast.h
									
												
				@@ -183,6 +183,7 @@ enum ast_operators {

				   ast_post_dec,

				   ast_field_selection,

				   ast_array_index,

				   ast_unsized_array_dim,

				   ast_function_call,

				@@ -324,16 +325,7 @@ public:

				class ast_array_specifier : public ast_node {

				public:

				   /** Unsized array specifier ([]) */

				   explicit ast_array_specifier(const struct YYLTYPE &locp)

				     : is_unsized_array(true)

				   {

				      set_location(locp);

				   }

				   /** Sized array specifier ([dim]) */

				   ast_array_specifier(const struct YYLTYPE &locp, ast_expression *dim)

				     : is_unsized_array(false)

				   {

				      set_location(locp);

				      array_dimensions.push_tail(&dim->link);

				@@ -344,17 +336,40 @@ public:

				      array_dimensions.push_tail(&dim->link);

				   }

				   bool is_single_dimension() const

				   {

				      return this->array_dimensions.tail_pred->prev != NULL &&

				             this->array_dimensions.tail_pred->prev->is_head_sentinel();

				   }

				   virtual void print(void) const;

				   /* If true, this means that the array has an unsized outermost dimension. */

				   bool is_unsized_array;

				   /* This list contains objects of type ast_node containing the

				    * sized dimensions only, in outermost-to-innermost order.

				    * array dimensions in outermost-to-innermost order.

				    */

				   exec_list array_dimensions;

				};

				class ast_layout_expression : public ast_node {

				public:

				   ast_layout_expression(const struct YYLTYPE &locp, ast_expression *expr)

				   {

				      set_location(locp);

				      layout_const_expressions.push_tail(&expr->link);

				   }

				   bool process_qualifier_constant(struct _mesa_glsl_parse_state *state,

				                                   const char *qual_indentifier,

				                                   unsigned *value, bool can_be_zero);

				   void merge_qualifier(ast_layout_expression *l_expr)

				   {

				      layout_const_expressions.append_list(&l_expr->layout_const_expressions);

				   }

				   exec_list layout_const_expressions;

				};

				/**

				 * C-style aggregate initialization class

				 *

				@@ -453,6 +468,7 @@ struct ast_type_qualifier {

					 unsigned patch:1;

					 unsigned uniform:1;

					 unsigned buffer:1;

					 unsigned shared_storage:1;

					 unsigned smooth:1;

					 unsigned flat:1;

					 unsigned noperspective:1;

				@@ -497,6 +513,7 @@ struct ast_type_qualifier {

					 /** \name Layout qualifiers for GL_ARB_uniform_buffer_object */

					 /** \{ */

				         unsigned std140:1;

				         unsigned std430:1;

				         unsigned shared:1;

				         unsigned packed:1;

				         unsigned column_major:1;

				@@ -561,7 +578,7 @@ struct ast_type_qualifier {

				   unsigned precision:2;

				   /** Geometry shader invocations for GL_ARB_gpu_shader5. */

				   int invocations;

				   ast_layout_expression *invocations;

				   /**

				    * Location specified via GL_ARB_explicit_attrib_location layout

				@@ -569,20 +586,20 @@ struct ast_type_qualifier {

				    * \note

				    * This field is only valid if \c explicit_location is set.

				    */

				   int location;

				   ast_expression *location;

				   /**

				    * Index specified via GL_ARB_explicit_attrib_location layout

				    *

				    * \note

				    * This field is only valid if \c explicit_index is set.

				    */

				   int index;

				   ast_expression *index;

				   /** Maximum output vertices in GLSL 1.50 geometry shaders. */

				   int max_vertices;

				   ast_layout_expression *max_vertices;

				   /** Stream in GLSL 1.50 geometry shaders. */

				   unsigned stream;

				   ast_expression *stream;

				   /**

				    * Input or output primitive type in GLSL 1.50 geometry shaders

				@@ -596,7 +613,7 @@ struct ast_type_qualifier {

				    * \note

				    * This field is only valid if \c explicit_binding is set.

				    */

				   int binding;

				   ast_expression *binding;

				   /**

				    * Offset specified via GL_ARB_shader_atomic_counter's "offset"

				@@ -605,14 +622,14 @@ struct ast_type_qualifier {

				    * \note

				    * This field is only valid if \c explicit_offset is set.

				    */

				   int offset;

				   ast_expression *offset;

				   /**

				    * Local size specified via GL_ARB_compute_shader's "local_size_{x,y,z}"

				    * layout qualifier.  Element i of this array is only valid if

				    * flags.q.local_size & (1 << i) is set.

				    */

				   int local_size[3];

				   ast_layout_expression *local_size[3];

				   /** Tessellation evaluation shader: vertex spacing (equal, fractional even/odd) */

				   GLenum vertex_spacing;

				@@ -624,7 +641,7 @@ struct ast_type_qualifier {

				   bool point_mode;

				   /** Tessellation control shader: number of output vertices */

				   int vertices;

				   ast_layout_expression *vertices;

				   /**

				    * Image format specified with an ARB_shader_image_load_store

				@@ -645,6 +662,9 @@ struct ast_type_qualifier {

				    */

				   glsl_base_type image_base_type;

				   /** Flag to know if this represents a default value for a qualifier */

				   bool is_default_qualifier;

				   /**

				    * Return true if and only if an interpolation qualifier is present.

				    */

				@@ -665,31 +685,20 @@ struct ast_type_qualifier {

				    */

				   bool has_auxiliary_storage() const;

				   /**

				    * \brief Return string representation of interpolation qualifier.

				    *

				    * If an interpolation qualifier is present, then return that qualifier's

				    * string representation. Otherwise, return null. For example, if the

				    * noperspective bit is set, then this returns "noperspective".

				    *

				    * If multiple interpolation qualifiers are somehow present, then the

				    * returned string is undefined but not null.

				    */

				   const char *interpolation_string() const;

				   bool merge_qualifier(YYLTYPE *loc,

							_mesa_glsl_parse_state *state,

							ast_type_qualifier q);

				                        const ast_type_qualifier &q,

				                        bool is_single_layout_merge);

				   bool merge_out_qualifier(YYLTYPE *loc,

				                           _mesa_glsl_parse_state *state,

				                           ast_type_qualifier q,

				                           ast_node* &node);

				                           const ast_type_qualifier &q,

				                           ast_node* &node, bool create_node);

				   bool merge_in_qualifier(YYLTYPE *loc,

				                           _mesa_glsl_parse_state *state,

				                           ast_type_qualifier q,

				                           ast_node* &node);

				                           const ast_type_qualifier &q,

				                           ast_node* &node, bool create_node);

				   ast_subroutine_list *subroutine_list;

				};

				@@ -706,6 +715,7 @@ public:

							  struct _mesa_glsl_parse_state *state);

				   const char *name;

				   ast_type_qualifier *layout;

				   /* List of ast_declarator_list * */

				   exec_list declarations;

				   bool is_declaration;

				@@ -752,7 +762,7 @@ public:

				class ast_fully_specified_type : public ast_node {

				public:

				   virtual void print(void) const;

				   bool has_qualifiers() const;

				   bool has_qualifiers(_mesa_glsl_parse_state *state) const;

				   ast_fully_specified_type() : qualifier(), specifier(NULL)

				   {

				@@ -1093,17 +1103,13 @@ public:

				class ast_tcs_output_layout : public ast_node

				{

				public:

				   ast_tcs_output_layout(const struct YYLTYPE &locp, int vertices)

				      : vertices(vertices)

				   ast_tcs_output_layout(const struct YYLTYPE &locp)

				   {

				      set_location(locp);

				   }

				   virtual ir_rvalue *hir(exec_list *instructions,

				                          struct _mesa_glsl_parse_state *state);

				private:

				   const int vertices;

				};

				@@ -1135,9 +1141,12 @@ private:

				class ast_cs_input_layout : public ast_node

				{

				public:

				   ast_cs_input_layout(const struct YYLTYPE &locp, const unsigned *local_size)

				   ast_cs_input_layout(const struct YYLTYPE &locp,

				                       ast_layout_expression *const *local_size)

				   {

				      memcpy(this->local_size, local_size, sizeof(this->local_size));

				      for (int i = 0; i < 3; i++) {

				         this->local_size[i] = local_size[i];

				      }

				      set_location(locp);

				   }

				@@ -1145,7 +1154,7 @@ public:

				                          struct _mesa_glsl_parse_state *state);

				private:

				   unsigned local_size[3];

				   ast_layout_expression *local_size[3];

				};

				/*@}*/

				@@ -1175,4 +1184,9 @@ extern void

				check_builtin_array_max_size(const char *name, unsigned size,

				                             YYLTYPE loc, struct _mesa_glsl_parse_state *state);

				extern void _mesa_ast_process_interface_block(YYLTYPE *locp,

				                                              _mesa_glsl_parse_state *state,

				                                              ast_interface_block *const block,

				                                              const struct ast_type_qualifier &q);

				#endif /* AST_H */

									
										49

src/glsl/ast_array_index.cpp → src/compiler/glsl/ast_array_index.cpp
									
				@@ -22,19 +22,16 @@

				 */

				#include "ast.h"

				#include "glsl_types.h"

				#include "compiler/glsl_types.h"

				#include "ir.h"

				void

				ast_array_specifier::print(void) const

				{

				   if (this->is_unsized_array) {

				      printf("[ ] ");

				   }

				   foreach_list_typed (ast_node, array_dimension, link, &this->array_dimensions) {

				      printf("[ ");

				      array_dimension->print();

				      if (((ast_expression*)array_dimension)->oper != ast_unsized_array_dim)

				         array_dimension->print();

				      printf("] ");

				   }

				}

				@@ -64,21 +61,29 @@ update_max_array_access(ir_rvalue *ir, int idx, YYLTYPE *loc,

				      }

				   } else if (ir_dereference_record *deref_record =

				              ir->as_dereference_record()) {

				      /* There are two possibilities we need to consider:

				      /* There are three possibilities we need to consider:

				       *

				       * - Accessing an element of an array that is a member of a named

				       *   interface block (e.g. ifc.foo[i])

				       *

				       * - Accessing an element of an array that is a member of a named

				       *   interface block array (e.g. ifc[j].foo[i]).

				       *

				       * - Accessing an element of an array that is a member of a named

				       *   interface block array of arrays (e.g. ifc[j][k].foo[i]).

				       */

				      ir_dereference_variable *deref_var =

				         deref_record->record->as_dereference_variable();

				      if (deref_var == NULL) {

				         if (ir_dereference_array *deref_array =

				             deref_record->record->as_dereference_array()) {

				            deref_var = deref_array->array->as_dereference_variable();

				         ir_dereference_array *deref_array =

				            deref_record->record->as_dereference_array();

				         ir_dereference_array *deref_array_prev = NULL;

				         while (deref_array != NULL) {

				            deref_array_prev = deref_array;

				            deref_array = deref_array->array->as_dereference_array();

				         }

				         if (deref_array_prev != NULL)

				            deref_var = deref_array_prev->array->as_dereference_variable();

				      }

				      if (deref_var != NULL) {

				@@ -226,19 +231,22 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,

				             * by the linker.

				             */

				         }

				         else {

				         else if (array->variable_referenced()->data.mode !=

				                  ir_var_shader_storage) {

				            _mesa_glsl_error(&loc, state, "unsized array index must be constant");

				         }

				      } else if (array->type->fields.array->is_interface()

				                 && array->variable_referenced()->data.mode == ir_var_uniform

				      } else if (array->type->without_array()->is_interface()

				                 && (array->variable_referenced()->data.mode == ir_var_uniform ||

				                     array->variable_referenced()->data.mode == ir_var_shader_storage)

				                 && !state->is_version(400, 0) && !state->ARB_gpu_shader5_enable) {

					 /* Page 46 in section 4.3.7 of the OpenGL ES 3.00 spec says:

					 /* Page 50 in section 4.3.9 of the OpenGL ES 3.10 spec says:

					  *

					  *     "All indexes used to index a uniform block array must be

					  *     constant integral expressions."

					  *     "All indices used to index a uniform or shader storage block

					  *     array must be constant integral expressions."

					  */

					 _mesa_glsl_error(&loc, state,

							  "uniform block array index must be constant");

					 _mesa_glsl_error(&loc, state, "%s block array index must be constant",

				                          array->variable_referenced()->data.mode

				                          == ir_var_uniform ? "uniform" : "shader storage");

				      } else {

					 /* whole_variable_referenced can return NULL if the array is a

					  * member of a structure.  In this case it is safe to not update

				@@ -311,10 +319,9 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,

				    * expression.

				    */

				   if (array->type->is_array()

				       || array->type->is_matrix()) {

				       || array->type->is_matrix()

				       || array->type->is_vector()) {

				      return new(mem_ctx) ir_dereference_array(array, idx);

				   } else if (array->type->is_vector()) {

				      return new(mem_ctx) ir_expression(ir_binop_vector_extract, array, idx);

				   } else if (array->type->is_error()) {

				      return array;

				   } else {
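
Note on the `update_max_array_access` hunk in this file: the new `while` loop peels array dereferences one at a time, so the same code reaches the interface block variable for `ifc.foo[i]`, `ifc[j].foo[i]` and `ifc[j][k].foo[i]`. The sketch below shows that traversal in isolation. `Kind`, `Deref` and `base_variable` are hypothetical stand-ins invented for illustration; they are not Mesa's IR classes, and the real code uses `as_dereference_array()` / `as_dereference_variable()` rather than a `kind` tag.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the IR dereference nodes; not Mesa's classes.
enum class Kind { Variable, Array };

struct Deref {
    Kind kind;
    std::string name;    // meaningful for Variable nodes
    const Deref *array;  // for Array nodes: the expression being indexed
};

// Step through nested array dereferences, remembering the last one seen,
// then report the variable that the innermost dereference indexes (if any).
const Deref *base_variable(const Deref *record)
{
    assert(record != nullptr);
    if (record->kind == Kind::Variable)
        return record;  // the ifc.foo[i] case

    const Deref *prev = nullptr;
    const Deref *cur = record;
    while (cur != nullptr && cur->kind == Kind::Array) {
        prev = cur;
        cur = cur->array;
    }

    // ifc[j].foo[i] and ifc[j][k].foo[i] both end here with cur == ifc.
    if (prev != nullptr && cur != nullptr && cur->kind == Kind::Variable)
        return cur;
    return nullptr;  // the record is not rooted in a plain variable
}
```

With an `ifc` variable node wrapped in one or two `Array` nodes, `base_variable` returns the same `ifc` node in every case, which is the property the three-bullet comment in the hunk describes.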

0

src/glsl/ast_expr.cpp → src/compiler/glsl/ast_expr.cpp


									
										325

src/glsl/ast_function.cpp → src/compiler/glsl/ast_function.cpp
									
				@@ -23,7 +23,7 @@

				#include "glsl_symbol_table.h"

				#include "ast.h"

				#include "glsl_types.h"

				#include "compiler/glsl_types.h"

				#include "ir.h"

				#include "main/core.h" /* for MIN2 */

				#include "main/shaderobj.h"

				@@ -142,6 +142,33 @@ verify_image_parameter(YYLTYPE *loc, _mesa_glsl_parse_state *state,

				   return true;

				}

				static bool

				verify_first_atomic_parameter(YYLTYPE *loc, _mesa_glsl_parse_state *state,

				                                   ir_variable *var)

				{

				   if (!var ||

				       (!var->is_in_shader_storage_block() &&

				        var->data.mode != ir_var_shader_shared)) {

				      _mesa_glsl_error(loc, state, "First argument to atomic function "

				                       "must be a buffer or shared variable");

				      return false;

				   }

				   return true;

				}

				static bool

				is_atomic_function(const char *func_name)

				{

				   return !strcmp(func_name, "atomicAdd") ||

				          !strcmp(func_name, "atomicMin") ||

				          !strcmp(func_name, "atomicMax") ||

				          !strcmp(func_name, "atomicAnd") ||

				          !strcmp(func_name, "atomicOr") ||

				          !strcmp(func_name, "atomicXor") ||

				          !strcmp(func_name, "atomicExchange") ||

				          !strcmp(func_name, "atomicCompSwap");

				}

				/**

				 * Verify that 'out' and 'inout' actual parameters are lvalues.  Also, verify

				 * that 'const_in' formal parameters (an extension in our IR) correspond to

				@@ -231,18 +258,10 @@ verify_parameter_modes(_mesa_glsl_parse_state *state,

							     actual->variable_referenced()->name);

					    return false;

					 } else if (!actual->is_lvalue()) {

				            /* Even though ir_binop_vector_extract is not an l-value, let it

				             * slop through.  generate_call will handle it correctly.

				             */

				            ir_expression *const expr = ((ir_rvalue *) actual)->as_expression();

				            if (expr == NULL

				                || expr->operation != ir_binop_vector_extract

				                || !expr->operands[0]->is_lvalue()) {

				               _mesa_glsl_error(&loc, state,

				                                "function parameter '%s %s' is not an lvalue",

				                                mode, formal->name);

				               return false;

				            }

				            _mesa_glsl_error(&loc, state,

				                             "function parameter '%s %s' is not an lvalue",

				                             mode, formal->name);

				            return false;

					 }

				      }

				@@ -256,6 +275,23 @@ verify_parameter_modes(_mesa_glsl_parse_state *state,

				      actual_ir_node  = actual_ir_node->next;

				      actual_ast_node = actual_ast_node->next;

				   }

				   /* The first parameter of atomic functions must be a buffer variable */

				   const char *func_name = sig->function_name();

				   bool is_atomic = is_atomic_function(func_name);

				   if (is_atomic) {

				      const ir_rvalue *const actual = (ir_rvalue *) actual_ir_parameters.head;

				      const ast_expression *const actual_ast =

				         exec_node_data(ast_expression, actual_ast_parameters.head, link);

				      YYLTYPE loc = actual_ast->get_location();

				      if (!verify_first_atomic_parameter(&loc, state,

				                                         actual->variable_referenced())) {

				         return false;

				      }

				   }

				   return true;

				}

				@@ -334,12 +370,8 @@ fix_parameter(void *mem_ctx, ir_rvalue *actual, const glsl_type *formal_type,

				   ir_rvalue *lhs = actual;

				   if (expr != NULL && expr->operation == ir_binop_vector_extract) {

				      rhs = new(mem_ctx) ir_expression(ir_triop_vector_insert,

				                                       expr->operands[0]->type,

				                                       expr->operands[0]->clone(mem_ctx, NULL),

				                                       rhs,

				                                       expr->operands[1]->clone(mem_ctx, NULL));

				      lhs = expr->operands[0]->clone(mem_ctx, NULL);

				      lhs = new(mem_ctx) ir_dereference_array(expr->operands[0]->clone(mem_ctx, NULL),

				                                              expr->operands[1]->clone(mem_ctx, NULL));

				   }

				   ir_assignment *const assignment_2 = new(mem_ctx) ir_assignment(lhs, rhs);

				@@ -528,7 +560,8 @@ done:

					    state->symbols->add_global_function(f);

					    emit_function(state, f);

					 }

					 f->add_signature(sig->clone_prototype(f, NULL));

					 sig = sig->clone_prototype(f, NULL);

					 f->add_signature(sig);

				      }

				   }

				   return sig;

				@@ -568,6 +601,37 @@ match_subroutine_by_name(const char *name,

				   return sig;

				}

				static ir_rvalue *

				generate_array_index(void *mem_ctx, exec_list *instructions,

				                     struct _mesa_glsl_parse_state *state, YYLTYPE loc,

				                     const ast_expression *array, ast_expression *idx,

				                     const char **function_name, exec_list *actual_parameters)

				{

				   if (array->oper == ast_array_index) {

				      /* This handles arrays of arrays */

				      ir_rvalue *outer_array = generate_array_index(mem_ctx, instructions,

				                                                    state, loc,

				                                                    array->subexpressions[0],

				                                                    array->subexpressions[1],

				                                                    function_name, actual_parameters);

				      ir_rvalue *outer_array_idx = idx->hir(instructions, state);

				      YYLTYPE index_loc = idx->get_location();

				      return _mesa_ast_array_index_to_hir(mem_ctx, state, outer_array,

				                                          outer_array_idx, loc,

				                                          index_loc);

				   } else {

				      ir_variable *sub_var = NULL;

				      *function_name = array->primary_expression.identifier;

				      match_subroutine_by_name(*function_name, actual_parameters,

				                               state, &sub_var);

				      ir_rvalue *outer_array_idx = idx->hir(instructions, state);

				      return new(mem_ctx) ir_dereference_array(sub_var, outer_array_idx);

				   }

				}

				static void

				print_function_prototypes(_mesa_glsl_parse_state *state, YYLTYPE *loc,

				                          ir_function *f)

				@@ -949,6 +1013,7 @@ process_array_constructor(exec_list *instructions,

				   }

				   bool all_parameters_are_constant = true;

				   const glsl_type *element_type = constructor_type->fields.array;

				   /* Type cast each parameter and, if possible, fold constants. */

				   foreach_in_list_safe(ir_rvalue, ir, &actual_parameters) {

				@@ -975,12 +1040,34 @@ process_array_constructor(exec_list *instructions,

					 }

				      }

				      if (result->type != constructor_type->fields.array) {

				      if (constructor_type->fields.array->is_unsized_array()) {

				         /* As the inner parameters of the constructor are created without

				          * knowledge of each other we need to check to make sure unsized

				          * parameters of unsized constructors all end up with the same size.

				          *

				          * e.g we make sure to fail for a constructor like this:

				          * vec4[][] a = vec4[][](vec4[](vec4(0.0), vec4(1.0)),

				          *                       vec4[](vec4(0.0), vec4(1.0), vec4(1.0)),

				          *                       vec4[](vec4(0.0), vec4(1.0)));

				          */

				         if (element_type->is_unsized_array()) {

				             /* This is the first parameter so just get the type */

				            element_type = result->type;

				         } else if (element_type != result->type) {

				            _mesa_glsl_error(loc, state, "type error in array constructor: "

				                             "expected: %s, found %s",

				                             element_type->name,

				                             result->type->name);

				            return ir_rvalue::error_value(ctx);

				         }

				      } else if (result->type != constructor_type->fields.array) {

					 _mesa_glsl_error(loc, state, "type error in array constructor: "

							  "expected: %s, found %s",

							  constructor_type->fields.array->name,

							  result->type->name);

				         return ir_rvalue::error_value(ctx);

				      } else {

				         element_type = result->type;

				      }

				      /* Attempt to convert the parameter to a constant valued expression.

				@@ -997,6 +1084,14 @@ process_array_constructor(exec_list *instructions,

				      ir->replace_with(result);

				   }

				   if (constructor_type->fields.array->is_unsized_array()) {

				      constructor_type =

					 glsl_type::get_array_instance(element_type,

								       parameter_count);

				      assert(constructor_type != NULL);

				      assert(constructor_type->length == parameter_count);

				   }

				   if (all_parameters_are_constant)

				      return new(ctx) ir_constant(constructor_type, &actual_parameters);

				@@ -1310,9 +1405,9 @@ emit_inline_matrix_constructor(const glsl_type *type,

				            zero.d[i] = 0.0;

				      ir_instruction *inst =

					 new(ctx) ir_assignment(new(ctx) ir_dereference_variable(rhs_var),

								new(ctx) ir_constant(rhs_var->type, &zero),

								NULL);

				         new(ctx) ir_assignment(new(ctx) ir_dereference_variable(rhs_var),

				                                new(ctx) ir_constant(rhs_var->type, &zero),

				                                NULL);

				      instructions->push_tail(inst);

				      ir_dereference *const rhs_ref = new(ctx) ir_dereference_variable(rhs_var);

				@@ -1327,36 +1422,36 @@ emit_inline_matrix_constructor(const glsl_type *type,

				       * columns than rows).

				       */

				      static const unsigned rhs_swiz[4][4] = {

					 { 0, 1, 1, 1 },

					 { 1, 0, 1, 1 },

					 { 1, 1, 0, 1 },

					 { 1, 1, 1, 0 }

				         { 0, 1, 1, 1 },

				         { 1, 0, 1, 1 },

				         { 1, 1, 0, 1 },

				         { 1, 1, 1, 0 }

				      };

				      const unsigned cols_to_init = MIN2(type->matrix_columns,

									 type->vector_elements);

				                                         type->vector_elements);

				      for (unsigned i = 0; i < cols_to_init; i++) {

					 ir_constant *const col_idx = new(ctx) ir_constant(i);

					 ir_rvalue *const col_ref = new(ctx) ir_dereference_array(var, col_idx);

				         ir_constant *const col_idx = new(ctx) ir_constant(i);

				         ir_rvalue *const col_ref = new(ctx) ir_dereference_array(var, col_idx);

					 ir_rvalue *const rhs_ref = new(ctx) ir_dereference_variable(rhs_var);

					 ir_rvalue *const rhs = new(ctx) ir_swizzle(rhs_ref, rhs_swiz[i],

										    type->vector_elements);

				         ir_rvalue *const rhs_ref = new(ctx) ir_dereference_variable(rhs_var);

				         ir_rvalue *const rhs = new(ctx) ir_swizzle(rhs_ref, rhs_swiz[i],

				                                                    type->vector_elements);

					 inst = new(ctx) ir_assignment(col_ref, rhs, NULL);

					 instructions->push_tail(inst);

				         inst = new(ctx) ir_assignment(col_ref, rhs, NULL);

				         instructions->push_tail(inst);

				      }

				      for (unsigned i = cols_to_init; i < type->matrix_columns; i++) {

					 ir_constant *const col_idx = new(ctx) ir_constant(i);

					 ir_rvalue *const col_ref = new(ctx) ir_dereference_array(var, col_idx);

				         ir_constant *const col_idx = new(ctx) ir_constant(i);

				         ir_rvalue *const col_ref = new(ctx) ir_dereference_array(var, col_idx);

					 ir_rvalue *const rhs_ref = new(ctx) ir_dereference_variable(rhs_var);

					 ir_rvalue *const rhs = new(ctx) ir_swizzle(rhs_ref, 1, 1, 1, 1,

										    type->vector_elements);

				         ir_rvalue *const rhs_ref = new(ctx) ir_dereference_variable(rhs_var);

				         ir_rvalue *const rhs = new(ctx) ir_swizzle(rhs_ref, 1, 1, 1, 1,

				                                                    type->vector_elements);

					 inst = new(ctx) ir_assignment(col_ref, rhs, NULL);

					 instructions->push_tail(inst);

				         inst = new(ctx) ir_assignment(col_ref, rhs, NULL);

				         instructions->push_tail(inst);

				      }

				   } else if (first_param->type->is_matrix()) {

				      /* From page 50 (56 of the PDF) of the GLSL 1.50 spec:

				@@ -1374,36 +1469,43 @@ emit_inline_matrix_constructor(const glsl_type *type,

				      /* If the source matrix is smaller, pre-initialize the relavent parts of

				       * the destination matrix to the identity matrix.

				       */

				      if ((src_matrix->type->matrix_columns < var->type->matrix_columns)

					  || (src_matrix->type->vector_elements < var->type->vector_elements)) {

				      if ((src_matrix->type->matrix_columns < var->type->matrix_columns) ||

				          (src_matrix->type->vector_elements < var->type->vector_elements)) {

					 /* If the source matrix has fewer rows, every column of the destination

					  * must be initialized.  Otherwise only the columns in the destination

					  * that do not exist in the source must be initialized.

					  */

					 unsigned col =

					    (src_matrix->type->vector_elements < var->type->vector_elements)

					    ? 0 : src_matrix->type->matrix_columns;

				         /* If the source matrix has fewer rows, every column of the destination

				          * must be initialized.  Otherwise only the columns in the destination

				          * that do not exist in the source must be initialized.

				          */

				         unsigned col =

				            (src_matrix->type->vector_elements < var->type->vector_elements)

				            ? 0 : src_matrix->type->matrix_columns;

					 const glsl_type *const col_type = var->type->column_type();

					 for (/* empty */; col < var->type->matrix_columns; col++) {

					    ir_constant_data ident;

				         const glsl_type *const col_type = var->type->column_type();

				         for (/* empty */; col < var->type->matrix_columns; col++) {

				            ir_constant_data ident;

					    ident.f[0] = 0.0;

					    ident.f[1] = 0.0;

					    ident.f[2] = 0.0;

					    ident.f[3] = 0.0;

				            if (!col_type->is_double()) {

				               ident.f[0] = 0.0f;

				               ident.f[1] = 0.0f;

				               ident.f[2] = 0.0f;

				               ident.f[3] = 0.0f;

				               ident.f[col] = 1.0f;

				            } else {

				               ident.d[0] = 0.0;

				               ident.d[1] = 0.0;

				               ident.d[2] = 0.0;

				               ident.d[3] = 0.0;

				               ident.d[col] = 1.0;

				            }

					    ident.f[col] = 1.0;

				            ir_rvalue *const rhs = new(ctx) ir_constant(col_type, &ident);

					    ir_rvalue *const rhs = new(ctx) ir_constant(col_type, &ident);

				            ir_rvalue *const lhs =

				               new(ctx) ir_dereference_array(var, new(ctx) ir_constant(col));

					    ir_rvalue *const lhs =

					       new(ctx) ir_dereference_array(var, new(ctx) ir_constant(col));

					    ir_instruction *inst = new(ctx) ir_assignment(lhs, rhs, NULL);

					    instructions->push_tail(inst);

					 }

				            ir_instruction *inst = new(ctx) ir_assignment(lhs, rhs, NULL);

				            instructions->push_tail(inst);

				         }

				      }

				      /* Assign columns from the source matrix to the destination matrix.

				@@ -1412,51 +1514,51 @@ emit_inline_matrix_constructor(const glsl_type *type,

				       * generate a temporary and copy the paramter there.

				       */

				      ir_variable *const rhs_var =

					 new(ctx) ir_variable(first_param->type, "mat_ctor_mat",

							      ir_var_temporary);

				         new(ctx) ir_variable(first_param->type, "mat_ctor_mat",

				                              ir_var_temporary);

				      instructions->push_tail(rhs_var);

				      ir_dereference *const rhs_var_ref =

					 new(ctx) ir_dereference_variable(rhs_var);

				         new(ctx) ir_dereference_variable(rhs_var);

				      ir_instruction *const inst =

					 new(ctx) ir_assignment(rhs_var_ref, first_param, NULL);

				         new(ctx) ir_assignment(rhs_var_ref, first_param, NULL);

				      instructions->push_tail(inst);

				      const unsigned last_row = MIN2(src_matrix->type->vector_elements,

								     var->type->vector_elements);

				                                     var->type->vector_elements);

				      const unsigned last_col = MIN2(src_matrix->type->matrix_columns,

								     var->type->matrix_columns);

				                                     var->type->matrix_columns);

				      unsigned swiz[4] = { 0, 0, 0, 0 };

				      for (unsigned i = 1; i < last_row; i++)

					 swiz[i] = i;

				         swiz[i] = i;

				      const unsigned write_mask = (1U << last_row) - 1;

				         const unsigned write_mask = (1U << last_row) - 1;

				      for (unsigned i = 0; i < last_col; i++) {

					 ir_dereference *const lhs =

					    new(ctx) ir_dereference_array(var, new(ctx) ir_constant(i));

					 ir_rvalue *const rhs_col =

					    new(ctx) ir_dereference_array(rhs_var, new(ctx) ir_constant(i));

				         ir_dereference *const lhs =

				            new(ctx) ir_dereference_array(var, new(ctx) ir_constant(i));

				         ir_rvalue *const rhs_col =

				            new(ctx) ir_dereference_array(rhs_var, new(ctx) ir_constant(i));

					 /* If one matrix has columns that are smaller than the columns of the

					  * other matrix, wrap the column access of the larger with a swizzle

					  * so that the LHS and RHS of the assignment have the same size (and

					  * therefore have the same type).

					  *

					  * It would be perfectly valid to unconditionally generate the

					  * swizzles, this this will typically result in a more compact IR tree.

					  */

					 ir_rvalue *rhs;

					 if (lhs->type->vector_elements != rhs_col->type->vector_elements) {

					    rhs = new(ctx) ir_swizzle(rhs_col, swiz, last_row);

					 } else {

					    rhs = rhs_col;

					 }

				         /* If one matrix has columns that are smaller than the columns of the

				          * other matrix, wrap the column access of the larger with a swizzle

				          * so that the LHS and RHS of the assignment have the same size (and

				          * therefore have the same type).

				          *

				          * It would be perfectly valid to unconditionally generate the

				          * swizzles, this this will typically result in a more compact IR tree.

				          */

				         ir_rvalue *rhs;

				         if (lhs->type->vector_elements != rhs_col->type->vector_elements) {

				            rhs = new(ctx) ir_swizzle(rhs_col, swiz, last_row);

				         } else {

				            rhs = rhs_col;

				         }

					 ir_instruction *inst =

					    new(ctx) ir_assignment(lhs, rhs, NULL, write_mask);

					 instructions->push_tail(inst);

				         ir_instruction *inst =

				            new(ctx) ir_assignment(lhs, rhs, NULL, write_mask);

				         instructions->push_tail(inst);

				      }

				   } else {

				      const unsigned cols = type->matrix_columns;

				@@ -1634,13 +1736,18 @@ ast_function_expression::handle_method(exec_list *instructions,

				      if (op->type->is_array()) {

				         if (op->type->is_unsized_array()) {

				            _mesa_glsl_error(&loc, state, "length called on unsized array");

				            goto fail;

				            if (!state->has_shader_storage_buffer_objects()) {

				               _mesa_glsl_error(&loc, state, "length called on unsized array"

				                                             " only available with "

				                                             "ARB_shader_storage_buffer_object");

				            }

				            /* Calculate length of an unsized array in run-time */

				            result = new(ctx) ir_expression(ir_unop_ssbo_unsized_array_length, op);

				         } else {

				            result = new(ctx) ir_constant(op->type->array_size());

				         }

				         result = new(ctx) ir_constant(op->type->array_size());

				      } else if (op->type->is_vector()) {

				         if (state->ARB_shading_language_420pack_enable) {

				         if (state->has_420pack()) {

				            /* .length() returns int. */

				            result = new(ctx) ir_constant((int) op->type->vector_elements);

				         } else {

				@@ -1649,7 +1756,7 @@ ast_function_expression::handle_method(exec_list *instructions,

				            goto fail;

				         }

				      } else if (op->type->is_matrix()) {

				         if (state->ARB_shading_language_420pack_enable) {

				         if (state->has_420pack()) {

				            /* .length() returns int. */

				            result = new(ctx) ir_constant((int) op->type->matrix_columns);

				         } else {

				@@ -1911,16 +2018,18 @@ ast_function_expression::hir(exec_list *instructions,

				      ir_variable *sub_var = NULL;

				      ir_rvalue *array_idx = NULL;

				      process_parameters(instructions, &actual_parameters, &this->expressions,

							 state);

				      if (id->oper == ast_array_index) {

				         func_name = id->subexpressions[0]->primary_expression.identifier;

					 array_idx = id->subexpressions[1]->hir(instructions, state);

				         array_idx = generate_array_index(ctx, instructions, state, loc,

				                                          id->subexpressions[0],

				                                          id->subexpressions[1], &func_name,

				                                          &actual_parameters);

				      } else {

				         func_name = id->primary_expression.identifier;

				      }

				      process_parameters(instructions, &actual_parameters, &this->expressions,

							 state);

				      ir_function_signature *sig =

					 match_function_by_name(func_name, &actual_parameters, state);

				@@ -1976,7 +2085,7 @@ ast_aggregate_initializer::hir(exec_list *instructions,

				   }

				   const glsl_type *const constructor_type = this->constructor_type;

				   if (!state->ARB_shading_language_420pack_enable) {

				   if (!state->has_420pack()) {

				      _mesa_glsl_error(&loc, state, "C-style initialization requires the "

				                       "GL_ARB_shading_language_420pack extension");

				      return ir_rvalue::error_value(ctx);
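
Note on the hunks above: they replace direct reads of `state->ARB_shading_language_420pack_enable` with a `state->has_420pack()` query. The diff does not show the body of that helper, so the following is only a minimal sketch of how such a query could combine the extension flag with a language-version check. The struct name `parse_state_sketch` and the choice of 420 as the core version are assumptions for illustration, not the code from this change.

```cpp
// Hypothetical, reduced stand-in for _mesa_glsl_parse_state.
struct parse_state_sketch {
   bool es_shader = false;
   unsigned language_version = 110;
   bool ARB_shading_language_420pack_enable = false;

   // A required version of 0 means "never available in this API flavour",
   // mirroring calls such as is_version(150, 0) seen in the diff.
   bool is_version(unsigned required_glsl, unsigned required_glsl_es) const
   {
      const unsigned required = es_shader ? required_glsl_es : required_glsl;
      return required != 0 && language_version >= required;
   }

   // Assumed shape of the helper: the feature is usable either through the
   // extension or because the language version already includes it.
   bool has_420pack() const
   {
      return ARB_shading_language_420pack_enable || is_version(420, 0);
   }
};
```

With a helper of this shape, call sites no longer need to know whether a feature arrived through an extension or a core version.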

2157

src/glsl/ast_to_hir.cpp → src/compiler/glsl/ast_to_hir.cpp

File diff suppressed because it is too large

									
										235

src/glsl/ast_type.cpp → src/compiler/glsl/ast_type.cpp
									
				@@ -38,13 +38,16 @@ ast_type_specifier::print(void) const

				}

				bool

				ast_fully_specified_type::has_qualifiers() const

				ast_fully_specified_type::has_qualifiers(_mesa_glsl_parse_state *state) const

				{

				   /* 'subroutine' isnt a real qualifier. */

				   ast_type_qualifier subroutine_only;

				   subroutine_only.flags.i = 0;

				   subroutine_only.flags.q.subroutine = 1;

				   subroutine_only.flags.q.subroutine_def = 1;

				   if (state->has_explicit_uniform_location()) {

				      subroutine_only.flags.q.explicit_index = 1;

				   }

				   return (this->qualifier.flags.i & ~subroutine_only.flags.i) != 0;

				}

				@@ -65,14 +68,17 @@ ast_type_qualifier::has_layout() const

				          || this->flags.q.depth_less

				          || this->flags.q.depth_unchanged

				          || this->flags.q.std140

				          || this->flags.q.std430

				          || this->flags.q.shared

				          || this->flags.q.column_major

				          || this->flags.q.row_major

				          || this->flags.q.packed

				          || this->flags.q.explicit_location

				          || this->flags.q.explicit_image_format

				          || this->flags.q.explicit_index

				          || this->flags.q.explicit_binding

				          || this->flags.q.explicit_offset;

				          || this->flags.q.explicit_offset

				          || this->flags.q.explicit_stream;

				}

				bool

				@@ -84,7 +90,8 @@ ast_type_qualifier::has_storage() const

				          || this->flags.q.in

				          || this->flags.q.out

				          || this->flags.q.uniform

				          || this->flags.q.buffer;

				          || this->flags.q.buffer

				          || this->flags.q.shared_storage;

				}

				bool

				@@ -95,23 +102,16 @@ ast_type_qualifier::has_auxiliary_storage() const

				          || this->flags.q.patch;

				}

				const char*

				ast_type_qualifier::interpolation_string() const

				{

				   if (this->flags.q.smooth)

				      return "smooth";

				   else if (this->flags.q.flat)

				      return "flat";

				   else if (this->flags.q.noperspective)

				      return "noperspective";

				   else

				      return NULL;

				}

				/**

				 * This function merges both duplicate identifies within a single layout and

				 * multiple layout qualifiers on a single variable declaration. The

				 * is_single_layout_merge param is used differentiate between the two.

				 */

				bool

				ast_type_qualifier::merge_qualifier(YYLTYPE *loc,

								    _mesa_glsl_parse_state *state,

								    ast_type_qualifier q)

				                                    const ast_type_qualifier &q,

				                                    bool is_single_layout_merge)

				{

				   ast_type_qualifier ubo_mat_mask;

				   ubo_mat_mask.flags.i = 0;

				@@ -123,6 +123,7 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,

				   ubo_layout_mask.flags.q.std140 = 1;

				   ubo_layout_mask.flags.q.packed = 1;

				   ubo_layout_mask.flags.q.shared = 1;

				   ubo_layout_mask.flags.q.std430 = 1;

				   ast_type_qualifier ubo_binding_mask;

				   ubo_binding_mask.flags.i = 0;

				@@ -150,7 +151,8 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,

				      allowed_duplicates_mask.flags.i |=

				         stream_layout_mask.flags.i;

				   if ((this->flags.i & q.flags.i & ~allowed_duplicates_mask.flags.i) != 0) {

				   if (is_single_layout_merge && !state->has_enhanced_layouts() &&

				       (this->flags.i & q.flags.i & ~allowed_duplicates_mask.flags.i) != 0) {

				      _mesa_glsl_error(loc, state,

						       "duplicate layout qualifiers used");

				      return false;

				@@ -166,41 +168,32 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,

				   }

				   if (q.flags.q.max_vertices) {

				      if (this->flags.q.max_vertices && this->max_vertices != q.max_vertices) {

					 _mesa_glsl_error(loc, state,

							  "geometry shader set conflicting max_vertices "

							  "(%d and %d)", this->max_vertices, q.max_vertices);

					 return false;

				      if (this->max_vertices) {

				         this->max_vertices->merge_qualifier(q.max_vertices);

				      } else {

				         this->max_vertices = q.max_vertices;

				      }

				   }

				   if (q.flags.q.subroutine_def) {

				      if (this->flags.q.subroutine_def) {

					 _mesa_glsl_error(loc, state,

							  "conflicting subroutine qualifiers used");

				      } else {

				         this->subroutine_list = q.subroutine_list;

				      }

				      this->max_vertices = q.max_vertices;

				   }

				   if (q.flags.q.invocations) {

				      if (this->flags.q.invocations && this->invocations != q.invocations) {

				         _mesa_glsl_error(loc, state,

				                          "geometry shader set conflicting invocations "

				                          "(%d and %d)", this->invocations, q.invocations);

				         return false;

				      if (this->invocations) {

				         this->invocations->merge_qualifier(q.invocations);

				      } else {

				         this->invocations = q.invocations;

				      }

				      this->invocations = q.invocations;

				   }

				   if (state->stage == MESA_SHADER_GEOMETRY &&

				       state->has_explicit_attrib_stream()) {

				      if (q.flags.q.stream && q.stream >= state->ctx->Const.MaxVertexStreams) {

				         _mesa_glsl_error(loc, state,

				                          "`stream' value is larger than MAX_VERTEX_STREAMS - 1 "

				                          "(%d > %d)",

				                          q.stream, state->ctx->Const.MaxVertexStreams - 1);

				      }

				      if (this->flags.q.explicit_stream &&

				          this->stream >= state->ctx->Const.MaxVertexStreams) {

				         _mesa_glsl_error(loc, state,

				                          "`stream' value is larger than MAX_VERTEX_STREAMS - 1 "

				                          "(%d > %d)",

				                          this->stream, state->ctx->Const.MaxVertexStreams - 1);

				      }

				      if (!this->flags.q.explicit_stream) {

				         if (q.flags.q.stream) {

				            this->flags.q.stream = 1;

				@@ -210,23 +203,15 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,

				            this->flags.q.stream = 1;

				            this->stream = state->out_qualifier->stream;

				         }

				      } else {

				         if (q.flags.q.explicit_stream) {

				            _mesa_glsl_error(loc, state,

				                             "duplicate layout `stream' qualifier");

				         }

				      }

				   }

				   if (q.flags.q.vertices) {

				      if (this->flags.q.vertices && this->vertices != q.vertices) {

					 _mesa_glsl_error(loc, state,

							  "tessellation control shader set conflicting "

							  "vertices (%d and %d)",

							  this->vertices, q.vertices);

					 return false;

				      if (this->vertices) {

				         this->vertices->merge_qualifier(q.vertices);

				      } else {

				         this->vertices = q.vertices;

				      }

				      this->vertices = q.vertices;

				   }

				   if (q.flags.q.vertex_spacing) {

				@@ -263,15 +248,11 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,

				   for (int i = 0; i < 3; i++) {

				      if (q.flags.q.local_size & (1 << i)) {

				         if ((this->flags.q.local_size & (1 << i)) &&

				             this->local_size[i] != q.local_size[i]) {

				            _mesa_glsl_error(loc, state,

				                             "compute shader set conflicting values for "

				                             "local_size_%c (%d and %d)", 'x' + i,

				                             this->local_size[i], q.local_size[i]);

				            return false;

				         if (this->local_size[i]) {

				            this->local_size[i]->merge_qualifier(q.local_size[i]);

				         } else {

				            this->local_size[i] = q.local_size[i];

				         }

				         this->local_size[i] = q.local_size[i];

				      }

				   }

				@@ -303,14 +284,36 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,

				bool

				ast_type_qualifier::merge_out_qualifier(YYLTYPE *loc,

				                                        _mesa_glsl_parse_state *state,

				                                        ast_type_qualifier q,

				                                        ast_node* &node)

				                                        const ast_type_qualifier &q,

				                                        ast_node* &node, bool create_node)

				{

				   void *mem_ctx = state;

				   const bool r = this->merge_qualifier(loc, state, q);

				   const bool r = this->merge_qualifier(loc, state, q, false);

				   if (state->stage == MESA_SHADER_TESS_CTRL) {

				      node = new(mem_ctx) ast_tcs_output_layout(*loc, q.vertices);

				   if (state->stage == MESA_SHADER_GEOMETRY) {

				      if (q.flags.q.prim_type) {

				         /* Make sure this is a valid output primitive type. */

				         switch (q.prim_type) {

				         case GL_POINTS:

				         case GL_LINE_STRIP:

				         case GL_TRIANGLE_STRIP:

				            break;

				         default:

				            _mesa_glsl_error(loc, state, "invalid geometry shader output "

				                             "primitive type");

				            break;

				         }

				      }

				      /* Allow future assigments of global out's stream id value */

				      this->flags.q.explicit_stream = 0;

				   } else if (state->stage == MESA_SHADER_TESS_CTRL) {

				      if (create_node) {

				         node = new(mem_ctx) ast_tcs_output_layout(*loc);

				      }

				   } else {

				      _mesa_glsl_error(loc, state, "out layout qualifiers only valid in "

				                       "tessellation control or geometry shaders");

				   }

				   return r;

				@@ -319,8 +322,8 @@ ast_type_qualifier::merge_out_qualifier(YYLTYPE *loc,

				bool

				ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,

				                                       _mesa_glsl_parse_state *state,

				                                       ast_type_qualifier q,

				                                       ast_node* &node)

				                                       const ast_type_qualifier &q,

				                                       ast_node* &node, bool create_node)

				{

				   void *mem_ctx = state;

				   bool create_gs_ast = false;

				@@ -414,15 +417,13 @@ ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,

				      state->in_qualifier->prim_type = q.prim_type;

				   }

				   if (this->flags.q.invocations &&

				       q.flags.q.invocations &&

				       this->invocations != q.invocations) {

				      _mesa_glsl_error(loc, state,

				                       "conflicting invocations counts specified");

				      return false;

				   } else if (q.flags.q.invocations) {

				   if (q.flags.q.invocations) {

				      this->flags.q.invocations = 1;

				      this->invocations = q.invocations;

				      if (this->invocations) {

				         this->invocations->merge_qualifier(q.invocations);

				      } else {

				         this->invocations = q.invocations;

				      }

				   }

				   if (q.flags.q.early_fragment_tests) {

				@@ -462,18 +463,72 @@ ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,

				      this->point_mode = q.point_mode;

				   }

				   if (create_gs_ast) {

				      node = new(mem_ctx) ast_gs_input_layout(*loc, q.prim_type);

				   } else if (create_cs_ast) {

				      /* Infer a local_size of 1 for every unspecified dimension */

				      unsigned local_size[3];

				      for (int i = 0; i < 3; i++) {

				         if (q.flags.q.local_size & (1 << i))

				            local_size[i] = q.local_size[i];

				         else

				            local_size[i] = 1;

				   if (create_node) {

				      if (create_gs_ast) {

				         node = new(mem_ctx) ast_gs_input_layout(*loc, q.prim_type);

				      } else if (create_cs_ast) {

				         node = new(mem_ctx) ast_cs_input_layout(*loc, q.local_size);

				      }

				      node = new(mem_ctx) ast_cs_input_layout(*loc, local_size);

				   }

				   return true;

				}

				bool

				ast_layout_expression::process_qualifier_constant(struct _mesa_glsl_parse_state *state,

				                                                  const char *qual_indentifier,

				                                                  unsigned *value,

				                                                  bool can_be_zero)

				{

				   int min_value = 0;

				   bool first_pass = true;

				   *value = 0;

				   if (!can_be_zero)

				      min_value = 1;

				   for (exec_node *node = layout_const_expressions.head;

				           !node->is_tail_sentinel(); node = node->next) {

				      exec_list dummy_instructions;

				      ast_node *const_expression = exec_node_data(ast_node, node, link);

				      ir_rvalue *const ir = const_expression->hir(&dummy_instructions, state);

				      ir_constant *const const_int = ir->constant_expression_value();

				      if (const_int == NULL || !const_int->type->is_integer()) {

				         YYLTYPE loc = const_expression->get_location();

				         _mesa_glsl_error(&loc, state, "%s must be an integral constant "

				                          "expression", qual_indentifier);

				         return false;

				      }

				      if (const_int->value.i[0] < min_value) {

				         YYLTYPE loc = const_expression->get_location();

				         _mesa_glsl_error(&loc, state, "%s layout qualifier is invalid "

				                          "(%d < %d)", qual_indentifier,

				                          const_int->value.i[0], min_value);

				         return false;

				      }

				      if (!first_pass && *value != const_int->value.u[0]) {

				         YYLTYPE loc = const_expression->get_location();

				         _mesa_glsl_error(&loc, state, "%s layout qualifier does not "

						          "match previous declaration (%d vs %d)",

				                          qual_indentifier, *value, const_int->value.i[0]);

				         return false;

				      } else {

				         first_pass = false;

				         *value = const_int->value.u[0];

				      }

				      /* If the location is const (and we've verified that

				       * it is) then no instructions should have been emitted

				       * when we converted it to HIR. If they were emitted,

				       * then either the location isn't const after all, or

				       * we are emitting unnecessary instructions.

				       */

				      assert(dummy_instructions.is_empty());

				   }

				   return true;
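
Note on `process_qualifier_constant()` above: it walks every expression recorded for one layout qualifier and applies three checks: the expression must fold to an integer constant, it must not fall below the minimum, and it must agree with the values seen earlier. The following is a standalone sketch of that control flow only. `folded_value` and `check_qualifier_constants` are hypothetical names, and constant folding is modelled by a `std::optional<int>` rather than by Mesa's IR classes.

```cpp
#include <cassert>
#include <cstdio>
#include <optional>
#include <vector>

// Hypothetical stand-in for one constant-folded layout expression.
// std::nullopt models "not an integral constant expression".
using folded_value = std::optional<int>;

// Returns true and stores the agreed value when every expression is valid.
bool check_qualifier_constants(const std::vector<folded_value> &exprs,
                               const char *qual_identifier,
                               bool can_be_zero,
                               unsigned *value)
{
   assert(value != nullptr);

   const int min_value = can_be_zero ? 0 : 1;
   bool first_pass = true;
   *value = 0;

   for (const folded_value &e : exprs) {
      if (!e) {
         std::fprintf(stderr, "%s must be an integral constant expression\n",
                      qual_identifier);
         return false;
      }

      if (*e < min_value) {
         std::fprintf(stderr, "%s layout qualifier is invalid (%d < %d)\n",
                      qual_identifier, *e, min_value);
         return false;
      }

      // Later declarations must repeat the value of the first one.
      if (!first_pass && *value != static_cast<unsigned>(*e)) {
         std::fprintf(stderr, "%s layout qualifier does not match previous "
                      "declaration (%u vs %d)\n", qual_identifier, *value, *e);
         return false;
      }

      first_pass = false;
      *value = static_cast<unsigned>(*e);
   }

   return true;
}
```

In this sketch, two declarations of `max_vertices = 4` are accepted, while 4 followed by 8 is rejected by the mismatch check.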

0

src/glsl/blob.c → src/compiler/glsl/blob.c

0

src/glsl/blob.h → src/compiler/glsl/blob.h

635

src/glsl/builtin_functions.cpp → src/compiler/glsl/builtin_functions.cpp

File diff suppressed because it is too large

									
										13

src/glsl/builtin_types.cpp → src/compiler/glsl/builtin_types.cpp
									
				@@ -34,7 +34,7 @@

				 * version and set of enabled extensions.

				 */

				#include "glsl_types.h"

				#include "compiler/glsl_types.h"

				#include "glsl_parser_extras.h"

				#include "util/macros.h"

				@@ -43,9 +43,7 @@

				 * convenience pointers (glsl_type::foo_type).

				 * @{

				 */

				#define DECL_TYPE(NAME, ...)                                    \

				   const glsl_type glsl_type::_##NAME##_type = glsl_type(__VA_ARGS__, #NAME); \

				   const glsl_type *const glsl_type::NAME##_type = &glsl_type::_##NAME##_type;

				#define DECL_TYPE(NAME, ...)

				#define STRUCT_TYPE(NAME)                                       \

				   const glsl_type glsl_type::_struct_##NAME##_type =           \

				@@ -114,7 +112,7 @@ static const struct glsl_struct_field gl_FogParameters_fields[] = {

				   glsl_struct_field(glsl_type::float_type, "scale"),

				};

				#include "builtin_type_macros.h"

				#include "compiler/builtin_type_macros.h"

				/** @} */

				/**

				@@ -127,7 +125,7 @@ static const struct glsl_struct_field gl_FogParameters_fields[] = {

				#define T(TYPE, MIN_GL, MIN_ES) \

				   { glsl_type::TYPE##_type, MIN_GL, MIN_ES },

				const static struct builtin_type_versions {

				static const struct builtin_type_versions {

				   const glsl_type *const type;

				   int min_gl;

				   int min_es;

				@@ -307,7 +305,8 @@ _mesa_glsl_initialize_types(struct _mesa_glsl_parse_state *state)

				      add_type(symbols, glsl_type::usamplerCubeArray_type);

				   }

				   if (state->ARB_texture_multisample_enable) {

				   if (state->ARB_texture_multisample_enable ||

				       state->OES_texture_storage_multisample_2d_array_enable) {

				      add_type(symbols, glsl_type::sampler2DMS_type);

				      add_type(symbols, glsl_type::isampler2DMS_type);

				      add_type(symbols, glsl_type::usampler2DMS_type);

									
										311

src/glsl/builtin_variables.cpp → src/compiler/glsl/builtin_variables.cpp
									
				@@ -22,6 +22,8 @@

				 */

				#include "ir.h"

				#include "ir_builder.h"

				#include "linker.h"

				#include "glsl_parser_extras.h"

				#include "glsl_symbol_table.h"

				#include "main/core.h"

				@@ -29,6 +31,8 @@

				#include "program/prog_statevars.h"

				#include "program/prog_instruction.h"

				using namespace ir_builder;

				static const struct gl_builtin_uniform_element gl_NumSamples_elements[] = {

				   {NULL, {STATE_NUM_SAMPLES, 0, 0}, SWIZZLE_XXXX}

				};

				@@ -323,6 +327,12 @@ per_vertex_accumulator::add_field(int slot, const glsl_type *type,

				   this->fields[this->num_fields].centroid = 0;

				   this->fields[this->num_fields].sample = 0;

				   this->fields[this->num_fields].patch = 0;

				   this->fields[this->num_fields].precision = GLSL_PRECISION_NONE;

				   this->fields[this->num_fields].image_read_only = 0;

				   this->fields[this->num_fields].image_write_only = 0;

				   this->fields[this->num_fields].image_coherent = 0;

				   this->fields[this->num_fields].image_volatile = 0;

				   this->fields[this->num_fields].image_restrict = 0;

				   this->num_fields++;

				}

				@@ -372,6 +382,11 @@ private:

				      return add_variable(name, type, ir_var_shader_out, slot);

				   }

				   ir_variable *add_index_output(int slot, int index, const glsl_type *type, const char *name)

				   {

				      return add_index_variable(name, type, ir_var_shader_out, slot, index);

				   }

				   ir_variable *add_system_value(int slot, const glsl_type *type,

				                                 const char *name)

				   {

				@@ -380,11 +395,12 @@ private:

				   ir_variable *add_variable(const char *name, const glsl_type *type,

				                             enum ir_variable_mode mode, int slot);

				   ir_variable *add_index_variable(const char *name, const glsl_type *type,

				                             enum ir_variable_mode mode, int slot, int index);

				   ir_variable *add_uniform(const glsl_type *type, const char *name);

				   ir_variable *add_const(const char *name, int value);

				   ir_variable *add_const_ivec3(const char *name, int x, int y, int z);

				   void add_varying(int slot, const glsl_type *type, const char *name,

				                    const char *name_as_gs_input);

				   void add_varying(int slot, const glsl_type *type, const char *name);

				   exec_list * const instructions;

				   struct _mesa_glsl_parse_state * const state;

				@@ -399,10 +415,12 @@ private:

				   const glsl_type * const bool_t;

				   const glsl_type * const int_t;

				   const glsl_type * const uint_t;

				   const glsl_type * const float_t;

				   const glsl_type * const vec2_t;

				   const glsl_type * const vec3_t;

				   const glsl_type * const vec4_t;

				   const glsl_type * const uvec3_t;

				   const glsl_type * const mat3_t;

				   const glsl_type * const mat4_t;

				@@ -416,12 +434,54 @@ builtin_variable_generator::builtin_variable_generator(

				   : instructions(instructions), state(state), symtab(state->symbols),

				     compatibility(!state->is_version(140, 100)),

				     bool_t(glsl_type::bool_type), int_t(glsl_type::int_type),

				     uint_t(glsl_type::uint_type),

				     float_t(glsl_type::float_type), vec2_t(glsl_type::vec2_type),

				     vec3_t(glsl_type::vec3_type), vec4_t(glsl_type::vec4_type),

				     uvec3_t(glsl_type::uvec3_type),

				     mat3_t(glsl_type::mat3_type), mat4_t(glsl_type::mat4_type)

				{

				}

				ir_variable *

				builtin_variable_generator::add_index_variable(const char *name,

				                                         const glsl_type *type,

				                                         enum ir_variable_mode mode, int slot, int index)

				{

				   ir_variable *var = new(symtab) ir_variable(type, name, mode);

				   var->data.how_declared = ir_var_declared_implicitly;

				   switch (var->data.mode) {

				   case ir_var_auto:

				   case ir_var_shader_in:

				   case ir_var_uniform:

				   case ir_var_system_value:

				      var->data.read_only = true;

				      break;

				   case ir_var_shader_out:

				   case ir_var_shader_storage:

				      break;

				   default:

				      /* The only variables that are added using this function should be

				       * uniforms, shader storage, shader inputs, and shader outputs, constants

				       * (which use ir_var_auto), and system values.

				       */

				      assert(0);

				      break;

				   }

				   var->data.location = slot;

				   var->data.explicit_location = (slot >= 0);

				   var->data.explicit_index = 1;

				   var->data.index = index;

				   /* Once the variable is created an initialized, add it to the symbol table

				    * and add the declaration to the IR stream.

				    */

				   instructions->push_tail(var);

				   symtab->add_variable(var);

				   return var;

				}

				ir_variable *

				builtin_variable_generator::add_variable(const char *name,

				@@ -573,6 +633,14 @@ builtin_variable_generator::generate_constants()

				         add_const("gl_MaxVaryingVectors",

				                   state->ctx->Const.MaxVarying);

				      }

				      /* EXT_blend_func_extended brings a built in constant

				       * for determining number of dual source draw buffers

				       */

				      if (state->EXT_blend_func_extended_enable) {

				         add_const("gl_MaxDualSourceDrawBuffersEXT",

				                   state->Const.MaxDualSourceDrawBuffers);

				      }

				   } else {

				      add_const("gl_MaxVertexUniformComponents",

				                state->Const.MaxVertexUniformComponents);

				@@ -604,7 +672,7 @@ builtin_variable_generator::generate_constants()

				      add_const("gl_MaxVaryingComponents", state->ctx->Const.MaxVarying * 4);

				   }

				   if (state->is_version(150, 0)) {

				   if (state->has_geometry_shader()) {

				      add_const("gl_MaxVertexOutputComponents",

				                state->Const.MaxVertexOutputComponents);

				      add_const("gl_MaxGeometryInputComponents",

				@@ -667,12 +735,11 @@ builtin_variable_generator::generate_constants()

				      add_const("gl_MaxAtomicCounterBindings",

				                state->Const.MaxAtomicBufferBindings);

				      /* When Mesa adds support for GL_OES_geometry_shader and

				       * GL_OES_tessellation_shader, this will need to change.

				       */

				      if (!state->es_shader) {

				      if (state->has_geometry_shader()) {

				         add_const("gl_MaxGeometryAtomicCounters",

				                   state->Const.MaxGeometryAtomicCounters);

				      }

				      if (!state->es_shader) {

				         add_const("gl_MaxTessControlAtomicCounters",

				                   state->Const.MaxTessControlAtomicCounters);

				         add_const("gl_MaxTessEvaluationAtomicCounters",

				@@ -690,12 +757,11 @@ builtin_variable_generator::generate_constants()

				      add_const("gl_MaxAtomicCounterBufferSize",

				                state->Const.MaxAtomicCounterBufferSize);

				      /* When Mesa adds support for GL_OES_geometry_shader and

				       * GL_OES_tessellation_shader, this will need to change.

				       */

				      if (!state->es_shader) {

				      if (state->has_geometry_shader()) {

				         add_const("gl_MaxGeometryAtomicCounterBuffers",

				                   state->Const.MaxGeometryAtomicCounterBuffers);

				      }

				      if (!state->es_shader) {

				         add_const("gl_MaxTessControlAtomicCounterBuffers",

				                   state->Const.MaxTessControlAtomicCounterBuffers);

				         add_const("gl_MaxTessEvaluationAtomicCounterBuffers",

				@@ -703,12 +769,17 @@ builtin_variable_generator::generate_constants()

				      }

				   }

				   if (state->is_version(430, 0) || state->ARB_compute_shader_enable) {

				      add_const("gl_MaxComputeAtomicCounterBuffers", MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS);

				      add_const("gl_MaxComputeAtomicCounters", MAX_COMPUTE_ATOMIC_COUNTERS);

				      add_const("gl_MaxComputeImageUniforms", MAX_COMPUTE_IMAGE_UNIFORMS);

				      add_const("gl_MaxComputeTextureImageUnits", MAX_COMPUTE_TEXTURE_IMAGE_UNITS);

				      add_const("gl_MaxComputeUniformComponents", MAX_COMPUTE_UNIFORM_COMPONENTS);

				   if (state->is_version(430, 310) || state->ARB_compute_shader_enable) {

				      add_const("gl_MaxComputeAtomicCounterBuffers",

				                state->Const.MaxComputeAtomicCounterBuffers);

				      add_const("gl_MaxComputeAtomicCounters",

				                state->Const.MaxComputeAtomicCounters);

				      add_const("gl_MaxComputeImageUniforms",

				                state->Const.MaxComputeImageUniforms);

				      add_const("gl_MaxComputeTextureImageUnits",

				                state->Const.MaxComputeTextureImageUnits);

				      add_const("gl_MaxComputeUniformComponents",

				                state->Const.MaxComputeUniformComponents);

				      add_const_ivec3("gl_MaxComputeWorkGroupCount",

				                      state->Const.MaxComputeWorkGroupCount[0],

				@@ -751,13 +822,16 @@ builtin_variable_generator::generate_constants()

				      add_const("gl_MaxCombinedImageUniforms",

				                state->Const.MaxCombinedImageUniforms);

				      if (state->has_geometry_shader()) {

				         add_const("gl_MaxGeometryImageUniforms",

				                   state->Const.MaxGeometryImageUniforms);

				      }

				      if (!state->es_shader) {

				         add_const("gl_MaxCombinedImageUnitsAndFragmentOutputs",

				                   state->Const.MaxCombinedShaderOutputResources);

				         add_const("gl_MaxImageSamples",

				                   state->Const.MaxImageSamples);

				         add_const("gl_MaxGeometryImageUniforms",

				                   state->Const.MaxGeometryImageUniforms);

				      }

				      if (state->is_version(450, 310)) {

				@@ -880,16 +954,27 @@ builtin_variable_generator::generate_uniforms()

				void

				builtin_variable_generator::generate_vs_special_vars()

				{

				   ir_variable *var;

				   if (state->is_version(130, 300))

				      add_system_value(SYSTEM_VALUE_VERTEX_ID, int_t, "gl_VertexID");

				   if (state->ARB_draw_instanced_enable)

				      add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceIDARB");

				   if (state->ARB_draw_instanced_enable || state->is_version(140, 300))

				      add_system_value(SYSTEM_VALUE_INSTANCE_ID, int_t, "gl_InstanceID");

				   if (state->AMD_vertex_shader_layer_enable)

				      add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");

				   if (state->AMD_vertex_shader_viewport_index_enable)

				      add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");

				   if (state->ARB_shader_draw_parameters_enable) {

				      add_system_value(SYSTEM_VALUE_BASE_VERTEX, int_t, "gl_BaseVertexARB");

				      add_system_value(SYSTEM_VALUE_BASE_INSTANCE, int_t, "gl_BaseInstanceARB");

				      add_system_value(SYSTEM_VALUE_DRAW_ID, int_t, "gl_DrawIDARB");

				   }

				   if (state->AMD_vertex_shader_layer_enable) {

				      var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");

				      var->data.interpolation = INTERP_QUALIFIER_FLAT;

				   }

				   if (state->AMD_vertex_shader_viewport_index_enable) {

				      var = add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");

				      var->data.interpolation = INTERP_QUALIFIER_FLAT;

				   }

				   if (compatibility) {

				      add_input(VERT_ATTRIB_POS, vec4_t, "gl_Vertex");

				      add_input(VERT_ATTRIB_NORMAL, vec3_t, "gl_Normal");

				@@ -947,9 +1032,14 @@ builtin_variable_generator::generate_tes_special_vars()

				void

				builtin_variable_generator::generate_gs_special_vars()

				{

				   add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");

				   if (state->is_version(410, 0) || state->ARB_viewport_array_enable)

				      add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");

				   ir_variable *var;

				   var = add_output(VARYING_SLOT_LAYER, int_t, "gl_Layer");

				   var->data.interpolation = INTERP_QUALIFIER_FLAT;

				   if (state->is_version(410, 0) || state->ARB_viewport_array_enable) {

				      var = add_output(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");

				      var->data.interpolation = INTERP_QUALIFIER_FLAT;

				   }

				   if (state->is_version(400, 0) || state->ARB_gpu_shader5_enable)

				      add_system_value(SYSTEM_VALUE_INVOCATION_ID, int_t, "gl_InvocationID");

				@@ -963,7 +1053,6 @@ builtin_variable_generator::generate_gs_special_vars()

				    * the specific case of gl_PrimitiveIDIn.  So we don't need to treat

				    * gl_PrimitiveIDIn as an {ARB,EXT}_geometry_shader4-only variable.

				    */

				   ir_variable *var;

				   var = add_input(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveIDIn");

				   var->data.interpolation = INTERP_QUALIFIER_FLAT;

				   var = add_output(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveID");

				@@ -977,14 +1066,23 @@ builtin_variable_generator::generate_gs_special_vars()

				void

				builtin_variable_generator::generate_fs_special_vars()

				{

				   add_input(VARYING_SLOT_POS, vec4_t, "gl_FragCoord");

				   add_input(VARYING_SLOT_FACE, bool_t, "gl_FrontFacing");

				   ir_variable *var;

				   if (this->state->ctx->Const.GLSLFragCoordIsSysVal)

				      add_system_value(SYSTEM_VALUE_FRAG_COORD, vec4_t, "gl_FragCoord");

				   else

				      add_input(VARYING_SLOT_POS, vec4_t, "gl_FragCoord");

				   if (this->state->ctx->Const.GLSLFrontFacingIsSysVal)

				      add_system_value(SYSTEM_VALUE_FRONT_FACE, bool_t, "gl_FrontFacing");

				   else

				      add_input(VARYING_SLOT_FACE, bool_t, "gl_FrontFacing");

				   if (state->is_version(120, 100))

				      add_input(VARYING_SLOT_PNTC, vec2_t, "gl_PointCoord");

				   if (state->is_version(150, 0)) {

				      ir_variable *var =

				         add_input(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveID");

				   if (state->has_geometry_shader()) {

				      var = add_input(VARYING_SLOT_PRIMITIVE_ID, int_t, "gl_PrimitiveID");

				      var->data.interpolation = INTERP_QUALIFIER_FLAT;

				   }

				@@ -998,6 +1096,19 @@ builtin_variable_generator::generate_fs_special_vars()

				                 array(vec4_t, state->Const.MaxDrawBuffers), "gl_FragData");

				   }

				   if (state->es_shader && state->language_version == 100 && state->EXT_blend_func_extended_enable) {

				      /* We make an assumption here that there will only ever be one dual-source draw buffer

				       * In case this assumption is ever proven to be false, make sure to assert here

				       * since we don't handle this case.

				       * In practice, this issue will never arise since no hardware will support it.

				       */

				      assert(state->Const.MaxDualSourceDrawBuffers <= 1);

				      add_index_output(FRAG_RESULT_DATA0, 1, vec4_t, "gl_SecondaryFragColorEXT");

				      add_index_output(FRAG_RESULT_DATA0, 1,

				                       array(vec4_t, state->Const.MaxDualSourceDrawBuffers),

				                       "gl_SecondaryFragDataEXT");

				   }

				   /* gl_FragDepth has always been in desktop GLSL, but did not appear in GLSL

				    * ES 1.00.

				    */

				@@ -1036,9 +1147,14 @@ builtin_variable_generator::generate_fs_special_vars()

				   }

				   if (state->is_version(430, 0) || state->ARB_fragment_layer_viewport_enable) {

				      add_input(VARYING_SLOT_LAYER, int_t, "gl_Layer");

				      add_input(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");

				      var = add_input(VARYING_SLOT_LAYER, int_t, "gl_Layer");

				      var->data.interpolation = INTERP_QUALIFIER_FLAT;

				      var = add_input(VARYING_SLOT_VIEWPORT, int_t, "gl_ViewportIndex");

				      var->data.interpolation = INTERP_QUALIFIER_FLAT;

				   }

				   if (state->is_version(450, 310)/* || state->ARB_ES3_1_compatibility_enable*/)

				      add_system_value(SYSTEM_VALUE_HELPER_INVOCATION, bool_t, "gl_HelperInvocation");

				}
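
Reviewer note: the hunks above gate several built-ins on calls such as `state->is_version(120, 100)`, `state->is_version(430, 0)` and `state->is_version(450, 310)`. As a reading aid, here is a minimal sketch of how such a two-argument check is usually read: the first argument is the minimum desktop GLSL version, the second is the minimum GLSL ES version, and 0 means "not available in that flavour". `ParseState` and its fields are hypothetical stand-ins for illustration only, not Mesa's actual parse-state type.

```cpp
#include <cassert>

// Hypothetical stand-in for the parse state; only the two fields that the
// version check needs are modelled.
struct ParseState {
    bool es_shader;             // true for GLSL ES, false for desktop GLSL
    unsigned language_version;  // e.g. 150 for "#version 150"

    // Assumed semantics: select the threshold for the current flavour.
    // A threshold of 0 means the feature is never available in that flavour.
    bool is_version(unsigned required_glsl, unsigned required_es) const {
        const unsigned required = es_shader ? required_es : required_glsl;
        return required != 0 && language_version >= required;
    }
};

// Self-check mirroring the calls that appear in the diff.
inline void is_version_demo() {
    const ParseState desktop150{false, 150};
    const ParseState es300{true, 300};
    assert(desktop150.is_version(120, 100));   // gl_PointCoord, desktop 1.50
    assert(es300.is_version(120, 100));        // gl_PointCoord, ES 3.00
    assert(!es300.is_version(150, 0));         // 0: never enabled on ES
    assert(!desktop150.is_version(450, 310));  // gl_HelperInvocation needs 4.50
}
```

Under this reading, `is_version(450, 310)` enables `gl_HelperInvocation` on desktop GLSL 4.50 and later or on GLSL ES 3.10 and later.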

				@@ -1048,20 +1164,23 @@ builtin_variable_generator::generate_fs_special_vars()

				void

				builtin_variable_generator::generate_cs_special_vars()

				{

				   /* TODO: finish this. */

				   add_system_value(SYSTEM_VALUE_LOCAL_INVOCATION_ID, uvec3_t,

				                    "gl_LocalInvocationID");

				   add_system_value(SYSTEM_VALUE_WORK_GROUP_ID, uvec3_t, "gl_WorkGroupID");

				   add_system_value(SYSTEM_VALUE_NUM_WORK_GROUPS, uvec3_t, "gl_NumWorkGroups");

				   add_variable("gl_GlobalInvocationID", uvec3_t, ir_var_auto, 0);

				   add_variable("gl_LocalInvocationIndex", uint_t, ir_var_auto, 0);

				}

				/**

				 * Add a single "varying" variable.  The variable's type and direction (input

				 * or output) are adjusted as appropriate for the type of shader being

				 * compiled.  For geometry shaders using {ARB,EXT}_geometry_shader4,

				 * name_as_gs_input is used for the input (to avoid ambiguity).

				 * compiled.

				 */

				void

				builtin_variable_generator::add_varying(int slot, const glsl_type *type,

				                                        const char *name,

				                                        const char *name_as_gs_input)

				                                        const char *name)

				{

				   switch (state->stage) {

				   case MESA_SHADER_TESS_CTRL:

				@@ -1089,32 +1208,34 @@ builtin_variable_generator::add_varying(int slot, const glsl_type *type,

				void

				builtin_variable_generator::generate_varyings()

				{

				#define ADD_VARYING(loc, type, name) \

				   add_varying(loc, type, name, name "In")

				   /* gl_Position and gl_PointSize are not visible from fragment shaders. */

				   if (state->stage != MESA_SHADER_FRAGMENT) {

				      ADD_VARYING(VARYING_SLOT_POS, vec4_t, "gl_Position");

				      ADD_VARYING(VARYING_SLOT_PSIZ, float_t, "gl_PointSize");

				      add_varying(VARYING_SLOT_POS, vec4_t, "gl_Position");

				      if (!state->es_shader ||

				          state->stage == MESA_SHADER_VERTEX ||

				          (state->stage == MESA_SHADER_GEOMETRY &&

				           state->OES_geometry_point_size_enable)) {

				         add_varying(VARYING_SLOT_PSIZ, float_t, "gl_PointSize");

				      }

				   }

				   if (state->is_version(130, 0)) {

				       ADD_VARYING(VARYING_SLOT_CLIP_DIST0, array(float_t, 0),

				       add_varying(VARYING_SLOT_CLIP_DIST0, array(float_t, 0),

				                   "gl_ClipDistance");

				   }

				   if (compatibility) {

				      ADD_VARYING(VARYING_SLOT_TEX0, array(vec4_t, 0), "gl_TexCoord");

				      ADD_VARYING(VARYING_SLOT_FOGC, float_t, "gl_FogFragCoord");

				      add_varying(VARYING_SLOT_TEX0, array(vec4_t, 0), "gl_TexCoord");

				      add_varying(VARYING_SLOT_FOGC, float_t, "gl_FogFragCoord");

				      if (state->stage == MESA_SHADER_FRAGMENT) {

				         ADD_VARYING(VARYING_SLOT_COL0, vec4_t, "gl_Color");

				         ADD_VARYING(VARYING_SLOT_COL1, vec4_t, "gl_SecondaryColor");

				         add_varying(VARYING_SLOT_COL0, vec4_t, "gl_Color");

				         add_varying(VARYING_SLOT_COL1, vec4_t, "gl_SecondaryColor");

				      } else {

				         ADD_VARYING(VARYING_SLOT_CLIP_VERTEX, vec4_t, "gl_ClipVertex");

				         ADD_VARYING(VARYING_SLOT_COL0, vec4_t, "gl_FrontColor");

				         ADD_VARYING(VARYING_SLOT_BFC0, vec4_t, "gl_BackColor");

				         ADD_VARYING(VARYING_SLOT_COL1, vec4_t, "gl_FrontSecondaryColor");

				         ADD_VARYING(VARYING_SLOT_BFC1, vec4_t, "gl_BackSecondaryColor");

				         add_varying(VARYING_SLOT_CLIP_VERTEX, vec4_t, "gl_ClipVertex");

				         add_varying(VARYING_SLOT_COL0, vec4_t, "gl_FrontColor");

				         add_varying(VARYING_SLOT_BFC0, vec4_t, "gl_BackColor");

				         add_varying(VARYING_SLOT_COL1, vec4_t, "gl_FrontSecondaryColor");

				         add_varying(VARYING_SLOT_BFC1, vec4_t, "gl_BackSecondaryColor");

				      }

				   }
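
Reviewer note: the new `generate_varyings()` code declares `gl_PointSize` only when the stage is not a fragment shader and, in addition, the shader is not ES, or is a vertex shader, or is a geometry shader with `OES_geometry_point_size` enabled. The sketch below restates that condition as a standalone predicate. The `Stage` enum and the name `declares_point_size` are made up for illustration and do not exist in Mesa.

```cpp
#include <cassert>

// Illustrative stage enum; placeholders for Mesa's MESA_SHADER_* values.
enum class Stage { Vertex, TessCtrl, TessEval, Geometry, Fragment };

// Mirrors the condition in the diff: gl_PointSize is never visible from
// fragment shaders; otherwise it is always declared in desktop GLSL, and in
// GLSL ES only for vertex shaders or for geometry shaders that have
// OES_geometry_point_size enabled.
constexpr bool declares_point_size(Stage stage, bool es_shader,
                                   bool oes_geometry_point_size_enable) {
    return stage != Stage::Fragment &&
           (!es_shader ||
            stage == Stage::Vertex ||
            (stage == Stage::Geometry && oes_geometry_point_size_enable));
}

// Self-check covering the interesting corners of the predicate.
inline void point_size_demo() {
    assert(declares_point_size(Stage::Geometry, false, false));  // desktop GS
    assert(!declares_point_size(Stage::Geometry, true, false));  // ES GS, no ext
    assert(declares_point_size(Stage::Geometry, true, true));    // ES GS + ext
    assert(declares_point_size(Stage::Vertex, true, false));     // ES VS
}
```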

				@@ -1163,6 +1284,7 @@ builtin_variable_generator::generate_varyings()

				         var->data.centroid = fields[i].centroid;

				         var->data.sample = fields[i].sample;

				         var->data.patch = fields[i].patch;

				         var->data.precision = fields[i].precision;

				         var->init_interface_type(per_vertex_out_type);

				      }

				   }

				@@ -1204,3 +1326,84 @@ _mesa_glsl_initialize_variables(exec_list *instructions,

				      break;

				   }

				}

				/**

				 * Initialize compute shader variables with values that are derived from other

				 * compute shader variable.

				 */

				static void

				initialize_cs_derived_variables(gl_shader *shader,

				                                ir_function_signature *const main_sig)

				{

				   assert(shader->Stage == MESA_SHADER_COMPUTE);

				   ir_variable *gl_GlobalInvocationID =

				      shader->symbols->get_variable("gl_GlobalInvocationID");

				   assert(gl_GlobalInvocationID);

				   ir_variable *gl_WorkGroupID =

				      shader->symbols->get_variable("gl_WorkGroupID");

				   assert(gl_WorkGroupID);

				   ir_variable *gl_WorkGroupSize =

				      shader->symbols->get_variable("gl_WorkGroupSize");

				   if (gl_WorkGroupSize == NULL) {

				      void *const mem_ctx = ralloc_parent(shader->ir);

				      gl_WorkGroupSize = new(mem_ctx) ir_variable(glsl_type::uvec3_type,

				                                                  "gl_WorkGroupSize",

				                                                  ir_var_auto);

				      gl_WorkGroupSize->data.how_declared = ir_var_declared_implicitly;

				      gl_WorkGroupSize->data.read_only = true;

				      shader->ir->push_head(gl_WorkGroupSize);

				   }

				   ir_variable *gl_LocalInvocationID =

				      shader->symbols->get_variable("gl_LocalInvocationID");

				   assert(gl_LocalInvocationID);

				   /* gl_GlobalInvocationID =

				    *    gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID

				    */

				   ir_instruction *inst =

				      assign(gl_GlobalInvocationID,

				             add(mul(gl_WorkGroupID, gl_WorkGroupSize),

				                 gl_LocalInvocationID));

				   main_sig->body.push_head(inst);

				   /* gl_LocalInvocationIndex =

				    *    gl_LocalInvocationID.z * gl_WorkGroupSize.x * gl_WorkGroupSize.y +

				    *    gl_LocalInvocationID.y * gl_WorkGroupSize.x +

				    *    gl_LocalInvocationID.x;

				    */

				   ir_expression *index_z =

				      mul(mul(swizzle_z(gl_LocalInvocationID), swizzle_x(gl_WorkGroupSize)),

				          swizzle_y(gl_WorkGroupSize));

				   ir_expression *index_y =

				      mul(swizzle_y(gl_LocalInvocationID), swizzle_x(gl_WorkGroupSize));

				   ir_expression *index_y_plus_z = add(index_y, index_z);

				   operand index_x(swizzle_x(gl_LocalInvocationID));

				   ir_expression *index_x_plus_y_plus_z = add(index_y_plus_z, index_x);

				   ir_variable *gl_LocalInvocationIndex =

				      shader->symbols->get_variable("gl_LocalInvocationIndex");

				   assert(gl_LocalInvocationIndex);

				   inst = assign(gl_LocalInvocationIndex, index_x_plus_y_plus_z);

				   main_sig->body.push_head(inst);

				}

				/**

				 * Initialize builtin variables with values based on other builtin variables.

				 * These are initialized in the main function.

				 */

				void

				_mesa_glsl_initialize_derived_variables(gl_shader *shader)

				{

				   /* We only need to set CS variables currently. */

				   if (shader->Stage != MESA_SHADER_COMPUTE)

				      return;

				   ir_function_signature *const main_sig =

				      _mesa_get_main_function_signature(shader);

				   if (main_sig == NULL)

				      return;

				   initialize_cs_derived_variables(shader, main_sig);

				}
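
Reviewer note: `initialize_cs_derived_variables()` builds IR for the two formulas quoted in its comments. The sketch below evaluates the same arithmetic on plain integers so the formulas can be checked by hand. The `UVec3` alias and both function names are illustrative assumptions, not Mesa APIs; the multiplication is component-wise, as it is for GLSL `uvec3`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using UVec3 = std::array<std::uint32_t, 3>;  // components in {x, y, z} order

// gl_GlobalInvocationID =
//    gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID
inline UVec3 global_invocation_id(const UVec3 &group_id,
                                  const UVec3 &group_size,
                                  const UVec3 &local_id) {
    UVec3 out{};
    for (std::size_t i = 0; i < 3; ++i)
        out[i] = group_id[i] * group_size[i] + local_id[i];
    return out;
}

// gl_LocalInvocationIndex =
//    local.z * size.x * size.y + local.y * size.x + local.x
inline std::uint32_t local_invocation_index(const UVec3 &local_id,
                                            const UVec3 &group_size) {
    return local_id[2] * group_size[0] * group_size[1] +
           local_id[1] * group_size[0] +
           local_id[0];
}

// Worked example: an 8x4x2 work group, group (1, 2, 0), local (3, 1, 1).
inline void cs_derived_demo() {
    const UVec3 size{8, 4, 2};
    const UVec3 global = global_invocation_id(UVec3{1, 2, 0}, size,
                                              UVec3{3, 1, 1});
    assert((global == UVec3{11, 9, 1}));                         // 8+3, 8+1, 0+1
    assert(local_invocation_index(UVec3{3, 1, 1}, size) == 43);  // 32 + 8 + 3
}
```

In the diff both assignments are pushed to the head of `main`, so the derived values are available before any user code runs.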

0 src/glsl/glcpp/.gitignore → src/compiler/glsl/glcpp/.gitignore vendored
0 src/glsl/glcpp/README → src/compiler/glsl/glcpp/README
0 src/glsl/glcpp/glcpp-lex.l → src/compiler/glsl/glcpp/glcpp-lex.l
27 src/glsl/glcpp/glcpp-parse.y → src/compiler/glsl/glcpp/glcpp-parse.y

@@ -2096,6 +2096,9 @@ _check_for_reserved_macro_name (glcpp_parser_t *parser, YYLTYPE *loc,
 	if (strncmp(identifier, "GL_", 3) == 0) {
 		glcpp_error (loc, parser, "Macro names starting with \"GL_\" are reserved.\n");
 	}
 	if (strcmp(identifier, "defined") == 0) {
 		glcpp_error (loc, parser, "\"defined\" cannot be used as a macro name");
 	}
 }
 static int
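
Reviewer note: the hunk above adds a second rejection rule to `_check_for_reserved_macro_name`, so that `#define defined ...` is diagnosed alongside names starting with `GL_`. The sketch below models only the two checks visible in this hunk and returns the messages instead of reporting them through the parser. The function name and the return type are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Returns the diagnostics a macro name would trigger; an empty vector means
// the name passes the two checks shown in the hunk. Other rules that the
// real function may apply are deliberately not modelled.
inline std::vector<std::string> check_reserved_macro_name(const char *identifier) {
    std::vector<std::string> errors;
    if (std::strncmp(identifier, "GL_", 3) == 0)
        errors.push_back("Macro names starting with \"GL_\" are reserved.");
    if (std::strcmp(identifier, "defined") == 0)
        errors.push_back("\"defined\" cannot be used as a macro name");
    return errors;
}

// Self-check: the comparison is case-sensitive and prefix-based for "GL_",
// but an exact match for "defined".
inline void reserved_name_demo() {
    assert(check_reserved_macro_name("GL_FOO").size() == 1);
    assert(check_reserved_macro_name("defined").size() == 1);
    assert(check_reserved_macro_name("defined_x").empty());
    assert(check_reserved_macro_name("gl_foo").empty());
}
```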
@@ -2382,9 +2385,21 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
 	         add_builtin_define(parser, "GL_OES_EGL_image_external", 1);
               if (extensions->OES_standard_derivatives)
                  add_builtin_define(parser, "GL_OES_standard_derivatives", 1);
               if (extensions->ARB_texture_multisample)
                  add_builtin_define(parser, "GL_OES_texture_storage_multisample_2d_array", 1);
               if (extensions->ARB_blend_func_extended)
                  add_builtin_define(parser, "GL_EXT_blend_func_extended", 1);
               if (version >= 310) {
                  if (extensions->OES_geometry_shader) {
                     add_builtin_define(parser, "GL_OES_geometry_point_size", 1);
                     add_builtin_define(parser, "GL_OES_geometry_shader", 1);
                  }
               }
 	   }
 	} else {
 	   add_builtin_define(parser, "GL_ARB_draw_buffers", 1);
            add_builtin_define(parser, "GL_ARB_enhanced_layouts", 1);
            add_builtin_define(parser, "GL_ARB_separate_shader_objects", 1);
 	   add_builtin_define(parser, "GL_ARB_texture_rectangle", 1);
            add_builtin_define(parser, "GL_AMD_shader_trinary_minmax", 1);
@@ -2424,6 +2439,9 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
 	      if (extensions->ARB_shader_bit_encoding)
 	         add_builtin_define(parser, "GL_ARB_shader_bit_encoding", 1);
 	      if (extensions->ARB_shader_clock)
 	         add_builtin_define(parser, "GL_ARB_shader_clock", 1);
 	      if (extensions->ARB_uniform_buffer_object)
 	         add_builtin_define(parser, "GL_ARB_uniform_buffer_object", 1);
@@ -2481,6 +2499,9 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
               if (extensions->ARB_shader_image_size)
                  add_builtin_define(parser, "GL_ARB_shader_image_size", 1);
               if (extensions->ARB_shader_texture_image_samples)
                  add_builtin_define(parser, "GL_ARB_shader_texture_image_samples", 1);
               if (extensions->ARB_derivative_control)
                  add_builtin_define(parser, "GL_ARB_derivative_control", 1);
@@ -2495,12 +2516,18 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio
               if (extensions->ARB_shader_subroutine)
                  add_builtin_define(parser, "GL_ARB_shader_subroutine", 1);
               if (extensions->ARB_shader_draw_parameters)
                  add_builtin_define(parser, "GL_ARB_shader_draw_parameters", 1);
 	   }
 	}
 	if (extensions != NULL) {
 	   if (extensions->EXT_shader_integer_mix)
 	      add_builtin_define(parser, "GL_EXT_shader_integer_mix", 1);
 	   if (extensions->EXT_shader_samples_identical)
 	      add_builtin_define(parser, "GL_EXT_shader_samples_identical", 1);
 	}
 	if (version >= 150)
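
Reviewer note: the first `_glcpp_parser_handle_version_declaration` hunk adds extension macros that appear to sit in the GLSL ES branch: two are keyed off desktop (`ARB_*`) capability flags, and the geometry-shader pair is additionally gated on `version >= 310`. The sketch below restates just those three additions. The `Extensions` struct is a hypothetical subset of the driver flags, and `es_builtin_defines` is an illustrative name, not a Mesa function.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical subset of the extension flags consulted by the hunk.
struct Extensions {
    bool ARB_texture_multisample = false;
    bool ARB_blend_func_extended = false;
    bool OES_geometry_shader = false;
};

// Collects the macro names this hunk would predefine for a GLSL ES shader.
inline std::set<std::string> es_builtin_defines(const Extensions &ext,
                                                int version) {
    std::set<std::string> defines;
    if (ext.ARB_texture_multisample)
        defines.insert("GL_OES_texture_storage_multisample_2d_array");
    if (ext.ARB_blend_func_extended)
        defines.insert("GL_EXT_blend_func_extended");
    // Both geometry-shader macros require ES 3.10 or later.
    if (version >= 310 && ext.OES_geometry_shader) {
        defines.insert("GL_OES_geometry_point_size");
        defines.insert("GL_OES_geometry_shader");
    }
    return defines;
}

// Self-check: the same flag yields nothing on ES 3.00 and two macros on 3.10.
inline void builtin_defines_demo() {
    Extensions ext;
    ext.OES_geometry_shader = true;
    assert(es_builtin_defines(ext, 300).empty());
    assert(es_builtin_defines(ext, 310).size() == 2);
}
```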

0 src/glsl/glcpp/glcpp.c → src/compiler/glsl/glcpp/glcpp.c
0 src/glsl/glcpp/glcpp.h → src/compiler/glsl/glcpp/glcpp.h
0 src/glsl/glcpp/pp.c → src/compiler/glsl/glcpp/pp.c
0 src/glsl/glcpp/tests/.gitignore → src/compiler/glsl/glcpp/tests/.gitignore vendored
0 src/glsl/glcpp/tests/000-content-with-spaces.c → src/compiler/glsl/glcpp/tests/000-content-with-spaces.c
0 src/glsl/glcpp/tests/000-content-with-spaces.c.expected → src/compiler/glsl/glcpp/tests/000-content-with-spaces.c.expected
0 src/glsl/glcpp/tests/001-define.c → src/compiler/glsl/glcpp/tests/001-define.c
0 src/glsl/glcpp/tests/001-define.c.expected → src/compiler/glsl/glcpp/tests/001-define.c.expected
0 src/glsl/glcpp/tests/002-define-chain.c → src/compiler/glsl/glcpp/tests/002-define-chain.c
0 src/glsl/glcpp/tests/002-define-chain.c.expected → src/compiler/glsl/glcpp/tests/002-define-chain.c.expected
0 src/glsl/glcpp/tests/003-define-chain-reverse.c → src/compiler/glsl/glcpp/tests/003-define-chain-reverse.c
0 src/glsl/glcpp/tests/003-define-chain-reverse.c.expected → src/compiler/glsl/glcpp/tests/003-define-chain-reverse.c.expected
0 src/glsl/glcpp/tests/004-define-recursive.c → src/compiler/glsl/glcpp/tests/004-define-recursive.c
0 src/glsl/glcpp/tests/004-define-recursive.c.expected → src/compiler/glsl/glcpp/tests/004-define-recursive.c.expected
0 src/glsl/glcpp/tests/005-define-composite-chain.c → src/compiler/glsl/glcpp/tests/005-define-composite-chain.c
0 src/glsl/glcpp/tests/005-define-composite-chain.c.expected → src/compiler/glsl/glcpp/tests/005-define-composite-chain.c.expected
0 src/glsl/glcpp/tests/006-define-composite-chain-reverse.c → src/compiler/glsl/glcpp/tests/006-define-composite-chain-reverse.c
0 src/glsl/glcpp/tests/006-define-composite-chain-reverse.c.expected → src/compiler/glsl/glcpp/tests/006-define-composite-chain-reverse.c.expected

Compare commits

4287 Commits 11.0 ... 11.2-branc

1 .dir-locals.el
101 .travis.yml Normal file
12 Android.common.mk
5 Android.mk
1 Makefile.am
2 VERSION
73 appveyor.yml Normal file
14 bin/.cherry-ignore
35 bin/get-extra-pick-list.sh
2 bin/get-pick-list.sh
402 configure.ac
119 docs/GL3.txt
4 docs/README.UVD
9 docs/autoconf.html
4 docs/contents.html
45 docs/envvars.html
125 docs/index.html
5 docs/install.html
17 docs/relnotes.html
164 docs/relnotes/10.6.6.html Normal file
75 docs/relnotes/10.6.7.html Normal file
136 docs/relnotes/10.6.8.html Normal file
130 docs/relnotes/10.6.9.html Normal file
2 docs/relnotes/11.0.5.html
281 docs/relnotes/11.1.0.html Normal file
197 docs/relnotes/11.1.1.html Normal file
182 docs/relnotes/11.1.2.html Normal file
85 docs/relnotes/11.2.0.html Normal file
14 docs/shading.html
176 docs/specs/EXT_shader_samples_identical.txt Normal file
4 docs/thanks.html
4 docs/utilities.html
99 docs/vmware-guest.html
1 include/D3D9/d3d9types.h
11 include/GL/internal/dri_interface.h
45 include/GL/osmesa.h
54 include/c11/threads_posix.h
305 include/c99/inttypes.h
247 include/c99/stdint.h
14 include/c99_compat.h
49 include/c99_math.h
3 include/d3dadapter/present.h
66 include/pci_ids/i965_pci_ids.h
1 include/pci_ids/virtio_gpu_pci_ids.h Normal file
18 scons/gallium.py
14 scons/llvm.py
5 src/Makefile.am
2 src/SConscript
1 src/compiler/.gitignore vendored Normal file
46 src/glsl/Android.gen.mk → src/compiler/Android.gen.mk
67 src/compiler/Android.mk Normal file
325 src/compiler/Makefile.am Normal file
226 src/compiler/Makefile.sources Normal file
24 src/compiler/SConscript Normal file
3 src/glsl/builtin_type_macros.h → src/compiler/builtin_type_macros.h
1 src/glsl/.gitignore → src/compiler/glsl/.gitignore vendored
76 src/compiler/glsl/Android.gen.mk Normal file
2 src/glsl/Android.mk → src/compiler/glsl/Android.mk
53 src/glsl/Makefile.am → src/compiler/glsl/Makefile.am
42 src/glsl/Makefile.sources → src/compiler/glsl/Makefile.sources
0 src/glsl/README → src/compiler/glsl/README
2 src/glsl/SConscript → src/compiler/glsl/SConscript
0 src/glsl/TODO → src/compiler/glsl/TODO
110 src/glsl/ast.h → src/compiler/glsl/ast.h
49 src/glsl/ast_array_index.cpp → src/compiler/glsl/ast_array_index.cpp
0 src/glsl/ast_expr.cpp → src/compiler/glsl/ast_expr.cpp
325 src/glsl/ast_function.cpp → src/compiler/glsl/ast_function.cpp
2157 src/glsl/ast_to_hir.cpp → src/compiler/glsl/ast_to_hir.cpp
235 src/glsl/ast_type.cpp → src/compiler/glsl/ast_type.cpp
0 src/glsl/blob.c → src/compiler/glsl/blob.c
0 src/glsl/blob.h → src/compiler/glsl/blob.h
635 src/glsl/builtin_functions.cpp → src/compiler/glsl/builtin_functions.cpp
13 src/glsl/builtin_types.cpp → src/compiler/glsl/builtin_types.cpp
311 src/glsl/builtin_variables.cpp → src/compiler/glsl/builtin_variables.cpp
0 src/glsl/glcpp/.gitignore → src/compiler/glsl/glcpp/.gitignore vendored
0 src/glsl/glcpp/README → src/compiler/glsl/glcpp/README
0 src/glsl/glcpp/glcpp-lex.l → src/compiler/glsl/glcpp/glcpp-lex.l
27 src/glsl/glcpp/glcpp-parse.y → src/compiler/glsl/glcpp/glcpp-parse.y


0 src/glsl/glcpp/glcpp.c → src/compiler/glsl/glcpp/glcpp.c