Comparing 72916698b0...a7fd463eca - mesa

fran/mesa

Author	SHA1	Message	Date
Ian Romanick	a06c9791d1	docs: Add missing release notes for ARB_separate_shader_objects Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-02 17:25:19 -07:00
Eric Anholt	20404e45c7	i965: Move push constant state packets to push constant update time. -0.553779% +/- 0.423394% effect on cairo-perf-trace runtime on glamor (n=612) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	113037148d	i965: Merge gen8_upload_constant_state into gen7_upload_constant_state. The two paths are really similar, and the extra conditionals will be dwarfed by the cost of the actual upload. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	51b79a6571	i965: Refactor gen7_upload_constant_state to look more like gen8. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	1515ceb8fd	i965: Drop unnecessary state flag for units on NEW_BINDING_TABLE. Commit `30259856a8` moved the state packets to table generation time, but forgot to make this change. Apparently the performance win there was about not reemitting the table pointers on unrelated state changes. No performance difference on cairo on glamor (n=118). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	f9a2679db5	i965/gen7+: Move sampler state packets to the stage sampler state table update. Now that we have the stage state coming into our setup of sampler states, it's easy to drop an identifier into it of which stage the stage_state is, and then look up which packet to emit in a little table. No performance difference on cairo on glamor (n=492). v2: Don't forget to do the workaround flush on IVB. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	680d202d49	i965/gen6: Don't update unit state when samplers change. There's no remaining dependency between these two packets that I can find. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	02a3449758	i965: Drop a NEW_SAMPLER annotation for use of sampler_count. The sampler count is set up from the gl_program at draw time, not at sampler change time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	57ad5a3103	i965: Simplify sampler setup by passing the stage state. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	9e363f0262	i965: Make batch dumping go to stderr, too. All our other debug goes there. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	55a049b9ae	i965: Fix a stale comment reference Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:39 -07:00
Armin K	0b307afd57	glx: Conditionally compile GLX_MESA_query_renderer DRI3 support Missed out with commit `625bdd64e5`. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-02 23:20:34 +01:00
Samuel Li	7f8f6790e4	radeonsi: add Mullins pci ids. Signed-off-by: Samuel Li <samuel.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-05-02 17:30:31 -04:00
Samuel Li	aad669b1e9	radeonsi: add support for Mullins asics. v2: name defaults to kabini for older llvm v3: fix llvm version check Signed-off-by: Samuel Li <samuel.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-05-02 17:30:27 -04:00
Alex Deucher	b26175b6c3	configure: bump up libdrm_radeon requirement to 2.4.54 Required for Mullins. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2014-05-02 17:29:56 -04:00
Ian Romanick	625bdd64e5	dri3: Enable GLX_MESA_query_renderer on DRI3 too This should have happened around the time of commit `4680d23`, but Keith's DRI3 patches and my GLX_MESA_query_renderer patches crossed in the mail. I don't have a working DRI3 setup, so I haven't been able to actually verify this. I'm hoping that someone can piglit this for me on DRI3... It's also unfortunate that DRI2 and DRI3 can't share more code. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: Keith Packard <keithp@keithp.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-02 22:13:58 +01:00
José Fonseca	7ebdc9e48c	util: Don't attempt to redefine INFINITY/NAN on VS 2013. These are now provided by VS. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:47 +01:00
José Fonseca	8c879ac197	mesa: VS 2013 does not provide strcasecmp. A define is necessary, like for earlier VS versions. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:47 +01:00
José Fonseca	ade79b21e9	egl: Don't attempt to redefine stdint.h types with VS 2010. Just include stdint.h. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:47 +01:00
José Fonseca	979692a52a	scons: Don't use bundled C99 headers for VS 2013. Use the ones provided by the compiler instead. NOTE: External trees should be updated to not include '#include/c99' directory directly, but rather rely on scons/gallium.py to do the right thing. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:46 +01:00
José Fonseca	0582800dd6	scons: Don't restrict MSVC_VERSION values. Saves the trouble of continuously needing to update. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:46 +01:00
José Fonseca	d69fd5d940	draw: Prevent signed/unsigned comparisons. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:46 +01:00
José Fonseca	605ef195aa	st/vega: Prevent signed/unsigned comparisons. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:46 +01:00
José Fonseca	42b9f8590d	scons: Adjust the warnings for VS. Silence insignificant warnings so significant warnings have a chance to stand out. The only abundant warning that's not silenced here is "C4018: signed/unsigned mismatch", as it could hide security issues, so it's better to actually fix the code. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:46 +01:00
José Fonseca	5bd3b91784	util/u_debug_flush: Use util_snprintf. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:46 +01:00
Emil Velikov	1c6154c9b4	targets/omx: add nouveau target Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-02 21:48:27 +01:00
Emil Velikov	be1b5feaa0	targets/omx: use GALLIUM_VIDEO_CFLAGS Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-05-02 21:48:27 +01:00
Emil Velikov	ce6c17c083	targets/pipe-loader: cleanup version-script Drop the version/name tag from the script as it was never meant to be there. Add swrast_create_screen as it is used when loading swrast. Rename the file to pipe.sym. v2: Rebase on top of the LD_NO_UNDEFINED changes. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-05-02 21:48:27 +01:00
Emil Velikov	f743670b9a	targets/opencl: hide all the exported llvm/clang mayhem... hopefully Both llvm and clang pollute the exported symbol table, as soon as we try to link with either one. Other than those two everything else looks good (clean). Cc: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-05-02 21:48:27 +01:00
Emil Velikov	7b7944ee1c	targets/egl-static: freshen up the version script Namely drop the version/name tag of the exported symbol, and rename the filename to egl.sym. v2: Rebase on top of the LD_NO_UNDEFINED changes. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-02 21:48:26 +01:00
Emil Velikov	4eaa3c9b60	targets/gbm: add version-script to limit exported symbols Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-02 21:48:26 +01:00
Emil Velikov	69d790da9f	targets/vdpau: use version script to limit the exported symbols Using export-symbols-regex is the least desirable method of restricting the exported symbols, as it completely messes up the symbol table. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-05-02 21:48:26 +01:00
Emil Velikov	53dd2e45f4	targets/omx: drop the version from the omx targets Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-02 21:48:26 +01:00
Emil Velikov	bea9e8dca0	targets/omx: use version script to limit amount of exported symbols Using export-symbols-regex is the least desirable method of restricting the exported symbols, as it completely messes up the symbol table. radeon_drm_winsys_create is not needed, avoid exporting it. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-05-02 21:48:26 +01:00
Emil Velikov	6239d42fdb	targets/dri: use a single version script to restrict exported symbols Rather than having multiple (almost) identical version scripts use a single one. Cc: Christian König <christian.koenig@amd.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-02 21:48:25 +01:00
Emil Velikov	b8f31dfc22	targets/xvmc: limit the amount of exported symbols In the presence of LLVM the final library exports every symbol from the llvm namespace. Resolve this by using a version script (w/o the version/name tag). Considering that there are only ~25 symbols, explicitly list them to minimize the chances of rogue symbols sneaking in. Drop the *winsys_create functions as they were only meant for gl-vdpau interop. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-05-02 21:48:25 +01:00
Emil Velikov	9bcb3698db	targets/osmesa: hide osmesa_create_screen The symbol is not meant to be exported, and its presence was only a side effect due to the missing visibility flags. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-02 21:48:25 +01:00
Emil Velikov	658b36ff78	targets/pipe-loader: drop driver_descriptor symbol from swrast The symbol is used for hardware only drivers. For swrast the loader uses swrast_create_screen. Add VISIBILITY_CFLAGS while we're here. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-05-02 21:48:25 +01:00
Juha-Pekka Heikkila	a50b02783b	mesa: add extra null checks in vbo_rebase_prims() v2 [idr]: Move declarations before code to prevent MSVC build breaks. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 12:00:30 -07:00
Juha-Pekka Heikkila	dc675919d3	mesa: add missing null checks in _tnl_register_fastpath() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 11:58:36 -07:00
Ian Romanick	59ad2e6696	mesa: Add _mesa_error_no_memory for logging out-of-memory messages This can be called from locations that don't have a context pointer handy. This patch also adds enough infrastructure so that the unit tests for the GLSL compiler and the stand-alone compiler will build and function. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-05-02 11:58:36 -07:00
Chia-I Wu	267e28bb62	glsl: make static constant variables "static const" This allows them to be moved to .rodata, and allow us to be sure that they will not be modified. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2014-05-02 10:50:14 -07:00
Petri Latvala	6a2d28599f	docs: update 10.2 release notes Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 10:07:05 -07:00
Petri Latvala	b4363c8ea4	i965: Enable INTEL_performance_query for Gen5+. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 10:07:04 -07:00
Petri Latvala	8cf5bdad3c	mesa: Implement INTEL_performance_query. Using the existing driver hooks made for AMD_performance_monitor, implement INTEL_performance_query functions. v2: Whitespace changes. v3: Whitespace changes, add a _mesa_warning() Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 10:07:04 -07:00
Petri Latvala	dac82ceac5	mesa: Add core support for the GL_INTEL_performance_query extension. Like AMD_performance_monitor, this extension provides an interface for applications (and OpenGL-based tools) to access GPU performance counters. Since the exact performance counters available vary between vendors and hardware generations, the extension provides an API the application can use to get the names, types, and minimum/maximum values of all available counters. Applications create performance queries based on available query types, and begin/end measurement collection. Multiple queries can be measuring simultaneously. v2: Whitespace changes v3: src/mapi/glapi/gen/gl_API.xml: Also expose the functions to GLES2. v4: Whitespace changes, static_dispatch="false" for all functions, fix dispatch_sanity test for GLES2 functions Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 10:07:04 -07:00
Petri Latvala	6ccb98e88c	mesa: Add INTEL_performance_query enums to tests/enum_strings.cpp Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 10:07:04 -07:00
Petri Latvala	927c3c9704	Regenerate gl_mangle.h. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 10:07:04 -07:00
Ilia Mirkin	cf6c9dbc33	docs: update ARB_buffer_storage for nouveau	2014-05-02 12:16:25 -04:00
Ilia Mirkin	3df4d692f3	nouveau: add ARB_buffer_storage support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-02 12:16:25 -04:00
Ilia Mirkin	b0d02db7e0	nouveau: remove cb_dirty, it's never used Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-02 12:01:35 -04:00
Ilia Mirkin	1baf77dbe8	nvc0: treat non-linear 2DRect textures the same as 2D This fixes textureGather(2DRect) piglit tests, and does not appear to have any adverse effects. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-02 12:01:35 -04:00
Ilia Mirkin	cd064c6a25	mesa/st: enable carry/borrow lowering pass This handles the last of the ARB_gs5 instructions currently present in mesa. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-02 12:01:35 -04:00
Ilia Mirkin	31b92aa2fc	glsl: add lowering passes for carry/borrow Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-02 12:01:35 -04:00
Ian Romanick	f64bfb2e39	mesa: Eliminate gl_shader_program::InternalSeparateShader This was a work-around to allow linking a program with only a fragment shader in a GLES context. Now that we have GL_EXT_separate_shader_objects in GLES contexts, we can just use that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:20:11 -07:00
Ian Romanick	7d9adef340	mesa: Enable GL_EXT_separate_shader_objects for OpenGL ES Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:20:10 -07:00
Ian Romanick	507b875cf5	glsl: Sort the list of extensions ARB, OES, then everything else. If there's ever a KHR shading language extension, it should go between ARB and OES. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:20:10 -07:00
Ian Romanick	fb615feafb	mesa: Remove support for desktop OpenGL GL_EXT_separate_shader_objects I don't know of any applications that actually use it. Now that Mesa supports GL_ARB_separate_shader_objects in all drivers, this extension is just cruft. The entrypoints for the extension remain in the XML. This is done so that a new libGL will continue to provide dispatch support for old drivers that try to expose this extension. Future patches will add OpenGL ES GL_EXT_separate_shader_objects, but that's a different thing. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:20:10 -07:00
Ian Romanick	e608449d3e	mesa/sso: Enable GL_ARB_separate_shader_objects by default Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:20:08 -07:00
Ian Romanick	0939d3d097	sso: Add display list support for ARB_separate_shader_objects new functions With this patch, the piglit arb_separate_shader_object-dlist test passes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:19:40 -07:00
Ian Romanick	7ff937e579	linker: Modify cross_validate_outputs_to_inputs to match using explicit locations This will be used for GL_ARB_separate_shader_objects. That extension not only allows separable shaders to rendezvous by location, but it also allows traditionally linked shaders to rendezvous by location. The spec says: 36. How does the behavior of input/output interface matching differ between separable programs and non-separable programs? RESOLVED: The rules for matching individual variables or block members between stages are identical for separable and non-separable programs, with one exception -- matching variables of different type with the same location, as discussed in issue 34, applies only to separable programs. However, the ability to enforce matching requirements differs between program types. In non-separable programs, both sides of an interface are contained in the same linked program. In this case, if the linker detects a mismatch, it will generate a link error. v2: Make sure consumer_inputs_with_locations is initialized when consumer is NULL. Noticed by Chia-I. v3: Rebase on removal of ir_variable::user_location. v4: Replace a (stale) FINISHME with some good explanation comments from Eric. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:19:40 -07:00
Ian Romanick	d030a3404c	linker: Sort shader I/O variables into a canonical order v2: Rebase on removal of ir_variable::user_location. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:19:40 -07:00
Ian Romanick	c557eb7722	linker: Allow geometry shader without vertex shader for separable programs Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:19:40 -07:00
Ian Romanick	1ff5a2b1ba	linker: Assign varying locations for separable programs Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 07:19:40 -07:00
Ian Romanick	7d73c3e99e	linker: Allow consumer stage or producer stage to be NULL When linking a separable program that contains only a fragment shader, the producer will be NULL. Similar cases will exist with geometry shaders and, eventually, tessellation shaders. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:19:40 -07:00
Ian Romanick	fe37cb0ac6	linker: Refactor code that gets an input matching an output Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 07:19:40 -07:00
Ian Romanick	5699220cd5	glsl: Exit when the shader IR contains an interface block instance While writing the link_varyings::single_interface_input test, I discovered that populate_consumer_input_sets assumes that all shader interface blocks have been lowered to discrete variables. Since there is a pass that does this, it is a reasonable assumption. It was, however, non-obvious. Make the code fail when it encounters such a thing, and add a test to verify that behavior. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 07:19:40 -07:00
Ian Romanick	ba7195d126	glsl/tests: Add first simple tests of populate_consumer_input_sets Four initial tests: * Create an IR list with a single input variable and verify that variable is the only thing in the hash tables. * Same as the previous test, but use a built-in variable (gl_ClipDistance) with an explicit location set. * Create an IR list with a single input variable from an interface block and verify that variable is the only thing in the hash tables. * Create an IR list with a single input variable and a single input variable from an interface block. Verify that each is the only thing in the proper hash tables. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 07:19:39 -07:00
Ian Romanick	8f5852bd2b	linker: Refactor code that builds hash tables of varyings during linking I want to make some changes to this code, but first I want to make some unit tests for it... so that I can capture the pre- and post-invariants. Pulling the code out into its own function in a non-anonymous namespace enables that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 07:19:39 -07:00
Ian Romanick	ca21cffebd	meta: Fix saving the program pipeline state This code was broken in some odd ways before. Too much state was being saved, it was being restored in the wrong order, and in the wrong way. The biggest problem was that the pipeline object was restored before restoring the programs attached to the default pipeline. Fixes a regression in the glean texgen test. v3: Fairly significant re-write. I think it's much cleaner now, and it avoids a bug with some meta ops that use shaders (reported by Chia-I). v4: Check Pipeline.Current against NULL instead of Pipeline.Default. Suggested by Chia-I. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chia-I Wu <olv@lunarg.com>	2014-05-02 07:17:34 -07:00
Ian Romanick	4a868a984d	mesa/sso: Refactor new function _mesa_bind_pipeline Pull most of the guts out of _mesa_BindPipeline into a new utility function that can be used elsewhere (e.g., meta). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:16:55 -07:00
Ian Romanick	5998fd536a	linker: Make lower_packed_varyings work with explicit locations Don't do anything with variables that have explicitly assigned locations. This is also how built-in varyings are handled. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:16:54 -07:00
Ian Romanick	7016afe25d	glsl: Remove varying "base" parameters In February 2013 Paul unified the values used for shader stage outputs and shader stage inputs. See commits 8a076c5f0^..eed6baf76. Since that time, the location_base parameters are always VARYING_SLOT_VAR0. Instead of passing that around, just hard code it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:16:54 -07:00
Ian Romanick	03488cd3b9	glsl: Constify parameter to a couple varying_matches methods Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 07:16:54 -07:00
Tom Stellard	e05cebafd8	clover: Add a stub implementation of clCreateImage() v3 Now that we are using the OpenCL 1.2 headers, applications expect all the OpenCL 1.2 functions to be implemented. This fixes linking errors with the piglit CL tests. v2: - Use c++ features - Fix error code handling v3: - Move <iostream> into api/util.hpp - Fix indentation Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-05-02 06:48:17 -07:00
Chris Forbes	11f92fd9f9	docs: Add missing ARB_gpu_shader5 subfeature to GL3.txt Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-02 17:09:13 +12:00
Fredrik Höglund	e6ff557d15	docs: Mark ARB_multi_bind as done ...and update relnotes. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 03:00:42 +02:00
Fredrik Höglund	68f3b31a0f	mesa: Enable ARB_multi_bind Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 03:00:42 +02:00
Fredrik Höglund	2a25570456	mesa: Implement glBindImageTextures Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 03:00:41 +02:00
Fredrik Höglund	63995b902a	mesa: Implement glBindVertexBuffers v2: Use the user provided offset and stride when the buffer ID is zero. Reviewed-by: Brian Paul <brianp@vmware.com> (v1) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2)	2014-05-02 03:00:41 +02:00
Fredrik Höglund	f0c36cf4fa	mesa: Implement glBindBuffersRange Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 03:00:41 +02:00
Fredrik Höglund	533cfa03ac	mesa: Implement glBindBuffersBase Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 03:00:41 +02:00
Fredrik Höglund	835abfaba4	mesa: Add _mesa_set_transform_feedback_binding() Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 03:00:41 +02:00
Fredrik Höglund	f65a0c19a5	mesa: Refactor set_ubo_binding() Make set_ubo_binding() just update the binding, and move the code that does validation, flushes the vertices etc. into a new bind_uniform_buffer() function. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 03:00:41 +02:00
Fredrik Höglund	28d7335810	mesa: Add helper functions for looking up multiple buffers v2: Document the difference between _mesa_lookup_bufferobj() and _mesa_multi_bind_lookup_bufferobj(). v3: Don't create the buffer objects when they don't exist. Reviewed-by: Brian Paul <brianp@vmware.com> (v2) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v2)	2014-05-02 02:53:26 +02:00
Fredrik Höglund	19f7eeb6fb	mesa: Refactor set_atomic_buffer_binding() Make set_atomic_buffer_binding() just update the binding, and move the code that does validation, flushes the vertices etc. into a new bind_atomic_buffer() function. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:26 +02:00
Fredrik Höglund	4f30c0ba80	mesa: Implement glBindTextures Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Fredrik Höglund	659d94b256	mesa: Add a texUnit parameter to dd_function_table::BindTexture This is for glBindTextures(), since it doesn't change the active texture unit. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Fredrik Höglund	b8ee235e72	mesa: Add helper functions for looking up multiple textures Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Fredrik Höglund	b16e2ada4c	mesa: Implement glBindSamplers Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Fredrik Höglund	6655e70f99	glapi: Add infrastructure for ARB_multi_bind Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Fredrik Höglund	82291f64e3	mesa: Add functions for doing unlocked hash table lookups This patch adds functions for locking/unlocking the mutex, along with _mesa_HashLookupLocked() and _mesa_HashInsertLocked() that do lookups and insertions without locking the mutex. These functions will be used by the ARB_multi_bind entry points to avoid locking/unlocking the mutex for each binding point. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Fredrik Höglund	30af8ce3f8	mesa: Optimize unbind_texobj_from_texunits() The texture can only be bound to the index that corresponds to its target, so there is no need to loop over all possible indices for every unit and check whether the texture is bound to each one. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Fredrik Höglund	4bd8272088	mesa: Add a _BoundTextures field in gl_texture_unit This will be used by glBindTextures() when unbinding textures, to avoid having to loop over all the targets. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Fredrik Höglund	6bf8ac846a	mesa: Store the target index in gl_texture_object This will be used by glBindTextures() so we don't have to look it up for each texture. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Eric Anholt	d55e5a323b	i965: Fix the file comment for intel_image.h Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:28 -07:00
Eric Anholt	5566747296	i965: Rename intel_regions.h to something more appropriate now. We had the EGLimage structure lying around in intel_regions.h, but now it's the only thing left in the file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:27 -07:00
Eric Anholt	e7f65655cb	i965: Delete the intel_regions.c code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:27 -07:00
Eric Anholt	3278f96a52	i965: Drop region usage from DRI2 winsys-allocated buffers. v2: Fix bad pointer on unreference (caught by Chad) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-05-01 15:12:27 -07:00
Eric Anholt	835f90692f	i965: Drop a funny assert about mt pitch. I slipped this in in the region->pitch change from pixels to bytes, but I don't see any reason for it any more -- the libdrm code doesn't appear to divide pitch by a cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:27 -07:00
Eric Anholt	b49982de6a	i965: Fix intel_bufferobj_buffer range for blit drawpixels. If the stride wasn't width*cpp, we wouldn't track how much of the src is busy, and allow a subdata into the end to proceed unsynchronized. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:27 -07:00
Eric Anholt	e16c5c9063	i965: Drop use of intel_region from miptrees. Note: region->width/height used to reflect the total_width/height padding of separate stencil, though mt->total_width didn't. region->width/height was being used in EGL images, where the padded value would have been the wrong one, so I converted them to use rb->Width/Height. v2: Drop debug printf that slipped in (caught by Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:27 -07:00
Eric Anholt	e3a9ca4563	i965: Replace the region in DRIimage with just a BO pointer and stride. Regions aren't refcounted safely for multithreaded applications, and they're not terribly useful wrappers of a BO, so I'm trying to remove them. Even the stride I added here could probably be reduced to use of an existing field in the __DRIimageRec, but I want this to be as mechanical of a change as possible. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:27 -07:00
Eric Anholt	8435b60a35	i965: Make intel_set_texture_region just take a BO and pitch. I want to do this to get the region removed from DRI images. However, it does mean that we won't share the intel_region between the rb and the texture for texture_from_pixmap. I think that's fine. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:27 -07:00
Eric Anholt	c0bf5a7eff	i965: Stop making a pointless region for DRI2 to just throw it away. I noticed that we were doing this while changing the DRI3 path to not use regions, which involved changing the signature of intel_update_winsys_renderbuffer_miptree() this way. v2: Replace my comment with Chad's version. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> (v1) Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:26 -07:00
Eric Anholt	3a7a20752f	i965: Drop the global GEM name from regions. Once a buffer has been named, drm_intel_bo_flink() is just a getter. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:26 -07:00
Eric Anholt	76932c0ded	i965: Drop the tiling argument to intel_miptree_create_for_bo. The drm function to get the tiling is just a getter storing the two pointers, so we don't need to go out of our way to avoid it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:26 -07:00
Eric Anholt	522fb01275	i965: Drop pointless cast of texObj to intelObj. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:26 -07:00
Eric Anholt	3033f80af5	i965: Move intel_region_get_aligned_offset() to be a miptree function. All the consumers are doing it on a miptree. v2: fix a silly duplicated dereference (review by Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> (v1) Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v1)	2014-05-01 15:12:26 -07:00
Eric Anholt	9791eb4280	i965: Move intel_region_get_tile_masks() to be a miptree function. All the consumers are doing it on a miptree. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:26 -07:00
Eric Anholt	ea2cac01e8	i965: Fix another broken offset-aligned-to-tile test. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:26 -07:00
Eric Anholt	65e025f99c	i965: Fix offset-aligned-to-tile test in dma_buf import. v1 of the patch got pushed, instead of the v2 that I had reviewed. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:26 -07:00
Eric Anholt	6db640da22	i965: Reuse intel_miptree_get_tile_offsets(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:26 -07:00
Brian Paul	5ec1adeb10	mesa: move declarations before code in texstore.c To fix MSVC build. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-05-01 16:01:06 -06:00
Ville Syrjälä	eb502c31a0	i965: Fix format of private renderbuffers intel_alloc_renderbuffer_storage() will clobber rb->Format which was already set up by intel_create_renderbuffer(). This causes the driver to potentially create the depth buffer in the wrong format. In practice this makes the depth buffer Z24 even if the visual has depthBits==16. The incorrect depth buffer format doesn't seem to cause any actual problems in i965, but it seems like we should fix it anyway. I see Z16 has been more or less deprecated in the driver except for the depthBits==16 case. But if we want to use Z24 even in that case (not sure it's really legal?) it would look better if the code made that decision explicitly rather than relying on the format to get magically overwritten by the renderbuffer code. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2014-05-01 23:56:34 +03:00
Ville Syrjälä	c1d4d49993	i915: Don't advertise Z formats in TextureFormatSupported on gen2 Gen2 doesn't support texturing from Z formats, so state as much. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2014-05-01 23:56:25 +03:00
Ville Syrjälä	d3edc31810	i915: Fix format of private renderbuffers intel_alloc_renderbuffer_storage() will clobber rb->Format which was already set up by intel_create_renderbuffer(). This causes the driver to potentially create the depth buffer in the wrong format. Long time ago things worked by accident because _mesa_choose_tex_format() checked for ARB_depth_texture and thus returned MESA_FORMAT_NONE on gen2 hardware. Somehow that ended up working when depthBits==16 because the driver would then pick DEPTH_FRMT_16_FIXED. Not sure how, but things also seemed to work with depthBits==24. Things started to go more sideways at: commit `6ae473221a` Author: Eric Anholt <eric@anholt.net> Date: Mon Apr 22 16:04:25 2013 -0700 intel: Fold the one last function intel_tex_format.c into the caller. since that caused intel_miptree_create_layout() to divide by zero when encountering MESA_FORMAT_NONE (bw==0). So after this commit things were broken enough that many applications wouldn't even run. Things got a bit better at: commit `c245efe7e8` Author: Eric Anholt <eric@anholt.net> Date: Thu Mar 21 09:50:45 2013 -0700 mesa: Remove extension checking from ChooseTexFormat. since now _mesa_choose_tex_format() would return MESA_FORMAT_X8_Z24 for GL_DEPTH_COMPONENT due to i915 erroneously claiming that MESA_FORMAT_X8_S24 (and others) are supported texture formats even on gen2 hardware. So now the div-by-zero was gone, but now the driver would pick DEPTH_FRMT_24_FIXED_8_OTHER even when depthBits==16 which caused rendering problems. If we prevent rb->Format from getting clobbered for the depth buffer things work much better. This makes the spinning title text visible again in chromium-bsu at 16bpp, for example. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2014-05-01 23:56:09 +03:00
Anuj Phogat	c1743707a1	mesa: Allow FLOAT_32_UNSIGNED_INT_24_8_REV in get_tex_depth_stencil() Fixes a crash in Khronos OpenGL CTS packed_pixels tests. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:40 -07:00
Anuj Phogat	29b8e894d1	mesa: Add support to unpack depth-stencil texture in to FLOAT_32_UNSIGNED_INT_24_8_REV V2: Follow the new naming convention for unpack functions. Use double precision for converting Z24 to a float. V3: Unpack stencil value to most significant byte. Use 'struct z32f_x24s8' type. V4: Unpack stencil value to least significant byte. Add a comment to clarify stencil packing. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:40 -07:00
Anuj Phogat	7a8045d2f7	mesa: Add new helper function _mesa_unpack_depth_stencil_row() This patch makes non-functional changes in the code. New helper function added here will make it easier to support more data types in the following patches. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:40 -07:00
Anuj Phogat	ef924f0de9	mesa: Remove redundant if checks in _mesa_texstore_xx_xx() functions This patch contains non-functional changes. Assertion checks made earlier in the functions make the if checks redundant. So, remove the if checks and unindent the code in if block. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:40 -07:00
Anuj Phogat	1a8f9ba9b3	mesa: Allow srcFormat=GL_DEPTH_STENCIL in _mesa_texstore_xx_xx() functions _mesa_texstore_z24_s8() and _mesa_texstore_z32f_x24s8() are capable of handling GL_DEPTH_STENCIL format. So, allow it in both the functions. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:40 -07:00
Anuj Phogat	aeb9d4495d	mesa: Add missing types in _mesa_texstore_xx_xx() functions Depth-stencil texture targets are allowed to use source data of type GL_UNSIGNED_INT_24_8_EXT and GL_FLOAT_32_UNSIGNED_INT_24_8_REV. Fixes a few crashes in Khronos OpenGL CTS packed_pixels tests. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:40 -07:00
Anuj Phogat	d714b20eb4	i965: Fix crash in do_blit_readpixels() Fixes a crash in Khronos CTS packed_pixels tests. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:40 -07:00
Anuj Phogat	5388fc157e	mesa: Add error condition for format=STENCIL_INDEX in glGetTexImage() From OpenGL 4.0 spec, page 306: "Calling GetTexImage with a format of STENCIL_INDEX causes the error INVALID_ENUM." Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Anuj Phogat	340658e44f	mesa: Add entry for extension ARB_texture_stencil8 V2: Alphabetize the new entry Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Anuj Phogat	9bcb0a8532	glsl: Apply the link error conditions to GL_ARB_fragment_coord_conventions Link error conditions added in previous patch are equally applicable to GL_ARB_fragment_coord_conventions implementation. Extension's spec says: "If gl_FragCoord is redeclared in any fragment shader in a program, it must be redeclared in all the fragment shaders in that program that have a static use of gl_FragCoord. All redeclarations of gl_FragCoord in all fragment shaders in a single program must have the same set of qualifiers." Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Anuj Phogat	35f11e85cb	glsl: Link error if fs defines conflicting qualifiers for gl_FragCoord GLSL 1.50 spec says: "If gl_FragCoord is redeclared in any fragment shader in a program, it must be redeclared in all the fragment shaders in that program that have a static use gl_FragCoord. All redeclarations of gl_FragCoord in all fragment shaders in a single program must have the same set of qualifiers." This patch causes the shader link to fail if we have multiple fragment shaders with conflicting layout qualifiers for gl_FragCoord. V2: Restructure the code and add conditions to correctly handle the following case: fragment shader 1: layout(origin_upper_left) in vec4 gl_FragCoord; void main() { foo(); gl_FragColor = gl_FragData; } fragment shader 2: layout(pixel_center_integer) in vec4 gl_FragCoord; void foo() { } V3: Allow linking in the following case: fragment shader 1: void main() { foo(); gl_FragColor = gl_FragCoord; } fragment shader 2: in vec4 gl_FragCoord; void foo() { ... } Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Anuj Phogat	a751adf071	glsl: Compile error if fs uses gl_FragCoord before first redeclaration Section 4.3.8.1, page 39 of GLSL 1.50 spec says: "Within any shader, the first redeclarations of gl_FragCoord must appear before any use of gl_FragCoord." GLSL compiler should generate an error in following case: vec4 p = gl_FragCoord; layout(origin_upper_left) in vec4 gl_FragCoord; void main() { } Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Anuj Phogat	581e4acb0d	glsl: Compile error if fs defines conflicting qualifiers for gl_FragCoord GLSL 1.50 spec says: "If gl_FragCoord is redeclared in any fragment shader in a program, it must be redeclared in all the fragment shaders in that program that have a static use gl_FragCoord. All redeclarations of gl_FragCoord in all fragment shaders in a single program must have the same set of qualifiers." This patch makes the glsl compiler to generate an error if we have a fragment shader defined with conflicting layout qualifier declarations for gl_FragCoord. For example: layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord; layout(pixel_center_integer) in vec4 gl_FragCoord; void main() { } V2: Some code refactoring for better readability. Add compiler error conditions for redeclarations like: layout(origin_upper_left) in vec4 gl_FragCoord; layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord; and in vec4 gl_FragCoord; layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord; V3: Simplify function is_conflicting_fragcoord_redeclaration() V4: Check for null pointer before doing strcmp(var->name, "gl_FragCoord"). Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Anuj Phogat	49c71050de	mesa: Use location VERT_ATTRIB_GENERIC0 for vertex attribute 0 In OpenGL 3.1 attribute 0 becomes non-magic, just like in OpenGL ES 2.0. Earlier versions of OpenGL used attribute 0 exclusively for vertex position. V2: Add a utility function _mesa_attr_zero_aliases_vertex() in varray.h Fixes 4 Khronos OpenGL CTS failures: glGetVertexAttrib depth24_basic depth24_precision rgb8_rgba8_rgb Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Anuj Phogat	dc75479b7a	mesa: Fix querying location of nth element of an array variable This patch makes changes to the behavior of glGetAttribLocation(), glGetFragDataLocation() and glGetFragDataIndex() functions. Code changes handle a case described in following example: shader program: layout(location = 1)in vec4[4] a; void main() { } Currently, glGetAttribLocation("a") returns 1. glGetAttribLocation("a[i]"), where i = {0, 1, 2, 3}, returns -1. But the expected locations for array elements are: 1, 2, 3 and 4 respectively. This clarification came up with the addition of ARB_program_interface_query to OpenGL 4.3. From Page 326 (page 347 of the PDF) of OpenGL 4.3 spec: "Otherwise, the command is equivalent to GetProgramResourceLocation(program, PROGRAM_INPUT, name);" And, From Page 101 (page 122 of the PDF) of OpenGL 4.3 spec: "A string provided to GetProgramResourceLocation or GetProgramResourceLocationIndex is considered to match an active variable if • the string exactly matches the name of the active variable; • if the string identifies the base name of an active array, where the string would exactly match the name of the variable if the suffix "[0]" were appended to the string; or • if the string identifies an active element of the array, where the string ends with the concatenation of the "[" character, an integer (with no "+" sign, extra leading zeroes, or whitespace) identifying an array element, and the "]" character, the integer is less than the number of active elements of the array variable, and where the string would exactly match the enumerated name of the array if the decimal integer were replaced with zero." V2: Simplify get_matching_index() function. Add relevant text from OpenGL spec in commit message. Fixes failures in Khronos OpenGL CTS tests: explicit_attrib_location_room draw_instanced_max_vertex_attribs Proprietary linux drivers of NVIDIA (331.49) matches the behavior expected by OpenGL 4.3 spec. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
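The name-matching rule quoted in dc75479b7a can be sketched in a few lines of C. This is an illustration of the quoted spec text only, assuming a single active array variable whose elements occupy consecutive locations; `resource_location` and its parameters are hypothetical names and are not the `get_matching_index()` function the commit mentions.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper (not Mesa's implementation): resolve `query` against
 * an active array variable `name` with `array_size` elements whose first
 * element sits at `base_location`. Returns the location, or -1 if the
 * string does not match under the rules quoted above. */
int resource_location(const char *name, unsigned array_size,
                      int base_location, const char *query)
{
    size_t name_len = strlen(name);
    const char *p;
    unsigned long index = 0;

    if (strncmp(query, name, name_len) != 0)
        return -1;

    p = query + name_len;
    if (*p == '\0')
        return base_location;      /* base name behaves like name[0] */
    if (*p != '[')
        return -1;
    p++;

    /* The index must start with a digit: no '+' sign and no whitespace. */
    if (!isdigit((unsigned char)*p))
        return -1;
    /* No extra leading zeroes: "0" is fine, "01" is not. */
    if (*p == '0' && isdigit((unsigned char)p[1]))
        return -1;

    while (isdigit((unsigned char)*p)) {
        index = index * 10 + (unsigned long)(*p - '0');
        if (index >= array_size)   /* also stops runaway values early */
            return -1;
        p++;
    }

    if (p[0] != ']' || p[1] != '\0')
        return -1;

    return base_location + (int)index;
}
```

For the shader in the commit message (`layout(location = 1) in vec4[4] a;`), this sketch maps `"a"` and `"a[0]"` to 1 and `"a[3]"` to 4, and rejects `"a[4]"`, `"a[01]"` and `"a[+1]"`.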
Anuj Phogat	8c61b6a99b	glsl: Allow overlapping locations for vertex input attributes Currently overlapping locations of input variables are not allowed for all the shader types in OpenGL and OpenGL ES. From OpenGL ES 3.0 spec, page 56: "Binding more than one attribute name to the same location is referred to as aliasing, and is not permitted in OpenGL ES Shading Language 3.00 vertex shaders. LinkProgram will fail when this condition exists. However, aliasing is possible in OpenGL ES Shading Language 1.00 vertex shaders." Taking in to account what different versions of OpenGL and OpenGL ES specs say about aliasing: - It is allowed only on vertex shader input attributes in OpenGL (2.0 and above) and OpenGL ES 2.0. - It is explicitly disallowed in OpenGL ES 3.0. Fixes Khronos CTS failing test: explicit_attrib_location_vertex_input_aliased.test See more details about this at below mentioned khronos bug. V2: Fix the case where location exceeds the maximum allowed attribute location. V3: Simplify the condition added in V2. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Bugzilla: Khronos #9609 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Roland Scheidegger	a773fdc64d	glx/drisw: fix memory leak when destroying screen. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-05-01 16:13:38 +02:00
Roland Scheidegger	64d6460a56	gallivm: fix 2 leaks in disassembly code don't leak the MCSubtargetInfo (not really big, was already fixed with llvm master) and TargetMachine (big). While this is only used for debugging the leak is large enough to get you into trouble in some cases. Tested with llvm 3.1 and master. Before (llvm 3.1), GALLIVM_DEBUG=asm glxgears: ==14152== LEAK SUMMARY: ==14152== definitely lost: 105,228 bytes in 20 blocks ==14152== indirectly lost: 347,252 bytes in 261 blocks ==14152== possibly lost: 866,625 bytes in 1,453 blocks ==14152== still reachable: 7,344,677 bytes in 6,494 blocks ==14152== suppressed: 0 bytes in 0 blocks After: ==13799== LEAK SUMMARY: ==13799== definitely lost: 3,108 bytes in 6 blocks ==13799== indirectly lost: 0 bytes in 0 blocks ==13799== possibly lost: 804,143 bytes in 1,429 blocks ==13799== still reachable: 7,314,267 bytes in 6,473 blocks ==13799== suppressed: 0 bytes in 0 blocks Reviewed-by: Brian Paul <brianp@vmware.com>	2014-05-01 16:13:38 +02:00
José Fonseca	6d911a5944	mesa: Move declaration to top of block. To fix MSVC build. Trivial.	2014-05-01 10:00:10 +01:00
José Fonseca	b0de67ad2d	osmesa: Fix typo in _MaxEnabledTexImageUnit.	2014-05-01 09:55:20 +01:00
Kenneth Graunke	85ce2242cb	i965/vec4: Port untyped atomic message support to Broadwell. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:24:12 -07:00
Kenneth Graunke	45367d2d09	i965/vec4: Port untyped surface reads support to Broadwell. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:24:10 -07:00
Kenneth Graunke	e9e89d5756	i965/fs: Port untyped atomic message support to Broadwell. v2: Fix SIMD mode comment (caught by Eric Anholt). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:24:08 -07:00
Kenneth Graunke	54a48984b3	i965/fs: Port untyped surface read support to Broadwell. v2: Drop unused num_components variable; fix SIMD Mode comment (caught by Eric Anholt). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:24:06 -07:00
Kenneth Graunke	f1cd9fee53	i965/fs: Set fs_inst::header_present for untyped atomics/surface reads. The brw_eu_emit.c code manually forces the header present bit when used in align1 (scalar) mode. So, this has no effect currently. However, it is nice to have fs_inst::header_present reflect reality. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:24:04 -07:00
Kenneth Graunke	4d9c27df45	i965: Disassemble atomic operations and other DP:DC1 stuff on Broadwell. This is similar to what Eric did for Gen7 a little while ago; it also has support for untyped surface reads. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:24:02 -07:00
Kenneth Graunke	3b3c46656e	i965: Implement the create_raw_surface() hook on Broadwell. Otherwise we crash when setting up atomic buffer objects. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:23:59 -07:00
Kenneth Graunke	69fd055166	i965: Drop mark_surface_used from gen8 generators. Francisco made brw_mark_surface_used a freestanding function in commit `a32817f3c2`. We should use it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:23:57 -07:00
Kenneth Graunke	b10785f9a9	i965/fs: Add support for fs_inst::force_writemask_all on Broadwell. This must not have existed when I wrote the original code. The atomic operation header setup code uses this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:23:44 -07:00
Kenneth Graunke	ac30e1adb4	i965: Actually emit PIPELINE_SELECT and 3DSTATE_VF_STATISTICS. For platforms using hardware contexts (currently Gen6+), we failed to emit PIPELINE_SELECT and 3DSTATE_VF_STATISTICS, instead emitting MI_NOOP for both. During one of the context initialization reordering patches, we accidentally moved brw_init_state before we set brw->CMD_PIPELINE_SELECT and brw->CMD_VF_STATISTICS. So, when brw_init_state uploaded initial GPU state (brw_init_state -> brw_upload_initial_gpu_state -> brw_upload_invariant_state), these would be 0 (MI_NOOP). Storing the commands in the context is not worthwhile. We have many generation checks in our state upload code, and for platforms with hardware contexts, this only gets called once per GL context anyway. The cost is negligible, and it's easy to botch context creation ordering. This may fix hangs on Gen6+ when using the media pipeline. Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2014-05-01 00:12:22 -07:00
Kenneth Graunke	0380ec467d	i965: Don't enable reset notification support on Gen4-5. arekm reported that using Chrome with GPU acceleration enabled on GM45 triggered the hw_ctx != NULL assertion in brw_get_graphics_reset_status. We definitely do not want to advertise reset notification support on Gen4-5 systems, since it needs hardware contexts, and we never even request a hardware context on those systems. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75723 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 23:08:22 -07:00
Carl Worth	4546b70e08	doc: Add pointer to the Mesa Stable Queue page. Since this is now updated daily and looks to be useful.	2014-04-30 16:27:03 -07:00
Eric Anholt	862986ade3	i965: Fix state flag comments on color_buffer_write_enabled() calls. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:21 -07:00
Eric Anholt	e739558c9d	i965: Drop bogus state flag comment. This was introduced with the comment and code below it, though the code only touches prog_data (CACHE_NEW_WM_PROG). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:21 -07:00
Eric Anholt	60c5f9716c	i965: Track the number of samples in the drawbuffer. This keeps us from having to emit the nonpipelined state packet on every FBO binding. -4.42003% +/- 1.09961% effect on cairo-perf-trace runtime on glamor (n=110). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:21 -07:00
Eric Anholt	973345fc23	mesa: Track maximum CurrentTexUnit to reduce glDeleteTextures() overhead. No more walking 96*6 pointers looking to see if they're the current texture, when we only use the first 2 out of 96 units. -6.26002% +/- 1.87817% effect on cairo runtime on no-fbo-cache glamor (n=36). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:21 -07:00
Eric Anholt	6a97deb88a	mesa: Rewrite shader-based texture image state updates. Instead of walking 6 shader stages for each of the 96 combined texture image units, now we just walk the samplers used in each shader stage. With cairo-perf-trace on Xephyr with glamor, I'm seeing a -6.50518% +/- 2.55601% effect on runtime (n=22) since the "drop _EnabledUnits" change. No significant performance difference on an apitrace of minecraft (n=442). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	a580b500ed	mesa: Split the shader texture update logic from fixed function. I want to avoid walking the entire long array texture image units, but the obvious way to do so means walking program samplers, and thus hitting the units in a random order. This change replaces the previous behavior of only setting up the fallback texture for a fragment shader with setting up the fallback texture for any shader that's missing a complete texture of the right target in its unit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	e5e50fae6a	mesa: Finish removing the _ReallyEnabled field. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	741f5d58e6	radeon: Drop the remaining driver usage of _ReallyEnabled. This is kind of ugly, but I think it's worth it to finish off the last consumers of _ReallyEnabled. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	2f8749af20	swrast: Drop remaining use of _ReallyEnabled. The _MaxEnabledTexImageUnit check assures us that Unit[0].Current != NULL. This is the last consumer of _ReallyEnabled outside of the radeons. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	8061f90a64	gallium: Drop use of _ReallyEnabled. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	cef82a64bd	mesa: Drop _ReallyEnabled usage from ff_fragment_shader. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	07b94c99a7	i915: Drop use of _ReallyEnabled. We can just look at _Current's target. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	ff9c3e8e5a	mesa: Replace use of _ReallyEnabled as a boolean with use of _Current. I'm probably not the only person that has tried to kill _ReallyEnabled. This does the mechanical part of the work, and cleans _ReallyEnabled from i965. I think that using _Current makes texture management clearer: You can't have multiple targets in use in the same texture image unit at the same time, because there's just that one pointer. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	62d46332d8	mesa: Ensure that (unit->_Current != 0) == (unit->_ReallyEnabled != 0). I'm going to try to delete _ReallyEnabled, which is this weird bitfield with either 0 or 1 bits set with just the reference to _Current. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	6bac47c05a	mesa: Drop dead last_ReallyEnabled fields from drivers. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	c703658b39	mesa: Drop _EnabledUnits. The field wasn't really valid, since we've got more than 32 units now. It turns out it was mostly just used for checking != 0, or checking for fixed function coordinates, though. v2: Fix mis-conversion in xm_line.c (caught by Ken). Reviewed-by: Matt Turner <mattst88@gmail.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:17 -07:00
Eric Anholt	3dfe56c53b	swrast: Just use _EnabledCoordUnits for figuring out which texcoords to build. _EnabledUnits is all of the first 32 image units that are used by fixed function or programs, while _EnabledCoordUnits is just which fixed function fragment shader texcoords need to be generated. This is a theoretical bugfix in the case of a vertex shader texturing from large texture image unit number (we'd end up flagging something other than a VARYING_SLOT_TEXn as needing to be generated), but it's actually just motivated by trying to kill _EnabledUnits. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:21:59 -07:00
Eric Anholt	1ad443ecdd	i915: Redo texture unit walking on i830. We now know what the max unit is in the context state. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:21:59 -07:00
Matt Turner	9565392031	i965/vec4: Remove 'mul_arg' from try_emit_mad(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 11:41:29 -07:00
Matt Turner	1e50bc9ee1	i965/fs: Remove 'mul_arg' from try_emit_mad(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 11:41:29 -07:00
Brian Paul	475f5ff64d	mesa: change invalid texture swizzle error to GL_INVALID_ENUM The original GL_EXT_texture_swizzle extension said GL_INVALID_OPERATION was to be generated when an invalid swizzle was passed to glTexParameter(). But in OpenGL 3.3 and later, the error should be GL_INVALID_ENUM. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-30 10:09:44 -06:00
Andreas Hartmetz	1c6aa6599e	translate_sse: Use the correct buffer index in this fast path. It is possible that there are multiple input buffers but only one is relevant for translation. Then there will be only a single translation group, which might need to source data from a buffer index != 0. Fixes wrong vertex shader inputs as observed while debugging with an application and driver combination that requires translation of a vertex attribute in a non-trivial set of attributes and input buffers. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-29 20:35:10 -04:00
Tom Stellard	ca848e8bee	clover: Query drivers for max clock frequency Igor Gnatenko: v2: PIPE_COMPUTE_CAP_MAX_CLOCK_FREQUENCY instead of PIPE_COMPUTE_MAX_CLOCK_FREQUENCY Bruno Jiménez: v3: Drivers report clock in MHz Signed-off-by: Igor Gnatenko <i.gnatenko.brain@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-04-29 15:28:17 -07:00
Tom Stellard	0a41054b7f	radeon/compute: Implement PIPE_COMPUTE_CAP_MAX_CLOCK_FREQUENCY Igor Gnatenko: v2: in define RADEON_INFO_MAX_SCLK use 0x1a instead of 0x19 (upstream changes) Bruno Jiménez: v3: Convert the frequency to MHz from kHz after getting it in 'do_winsys_init' Signed-off-by: Igor Gnatenko <i.gnatenko.brain@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-04-29 15:25:50 -07:00
Tom Stellard	5fe1a0ebad	gallium: Add PIPE_COMPUTE_CAP_MAX_CLOCK_FREQUENCY Bruno Jiménez: v2: Updated the docs v3: Remove trailing comma Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-04-29 15:24:53 -07:00
Kenneth Graunke	979a015bc1	i965: Fix a few base addresses on Broadwell. We intended to set these 64-bit addresses to 0, and set the enable bit. But, I accidentally placed the DWord with the high bits first, when it should have been second. This generally worked out, by luck - presumably General State Base Address is initially zero, and ends up remaining that way in our contexts since we bungled the "modify enable" bit. v2: Fix MOCS shift on GSBA. It should be 4, and I had 2. (Caught by Ben Widawsky.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2014-04-29 14:01:06 -07:00
EdB	7fb05f9298	clover: Stub implementation of CL 1.2 sub-devices. The implementation is basically a NOP but it conforms with OpenCL 1.2. [ Francisco Jerez: Initialize property return buffer for CL_DEVICE_PARTITION_PROPERTIES, CL_DEVICE_PARTITION_TYPE, CL_DEVICE_PARTITION_AFFINITY_DOMAIN, and make the latter a scalar rather than a vector. Some clean-up and code style fixes. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-04-29 16:14:50 +02:00
EdB	5827781d25	clover: Add clEnqueue{Marker, Barrier}WithWaitList. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-04-29 13:12:38 +02:00
Jan Vesely	7b11c97d31	clover: Align kernel argument sizes to nearest power of 2 v2: use a new variable for aligned size add comment make both vars const only use the aligned value in argument constructors fix comment typo Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-04-29 13:09:21 +02:00
Francisco Jerez	df985cc8f6	clover: Avoid warnings from references to deprecated CL 1.1 APIs. Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-04-29 13:01:37 +02:00
Francisco Jerez	beadd6b0cc	clover: Update OpenCL headers to version 1.2 from Khronos. The C++ headers are not updated because they rely on CL 1.2 APIs that we do not implement yet when the core CL 1.2 headers are present. Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-04-29 13:01:10 +02:00
Ilia Mirkin	f782d6e792	nvc0/ir: offset appears to come before the Z ref Fixes textureGatherOffset when used with a shadow sampler. Also verified against blob compiler with textureLodOffset manually (no piglit tests for texture[Lod]Offset + shadow samplers). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-28 20:32:36 -04:00
Brian Paul	50034c0171	mesa: remove unused #pragma export on/off lines PRAGMA_EXPORT_SUPPORTED is never defined. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77749 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-28 17:16:42 -06:00
Ilia Mirkin	f3aa999383	nv50/ir: change texture offsets to ValueRefs, allow nonconst This allows us to have non-constant offsets for textureGatherOffset and textureGatherOffsets. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-28 19:09:18 -04:00
Ilia Mirkin	46364a53ef	nvc0/ir: do constant folding of extbf/insbf Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-28 19:05:16 -04:00
Ilia Mirkin	1c85177419	nvc0/ir: add support for MUL_HI tgsi opcodes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-28 19:05:16 -04:00
Ilia Mirkin	b4b20d42f6	nvc0/ir: add support for new bitfield manipulation opcodes This adds support for: IBFE, UBFE, BFI, LSB, IMSB, UMSB, BREV, POPC Which are all required for ARB_gs5 support. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-28 19:05:16 -04:00
Ilia Mirkin	1db993f2fe	tgsi: add tgsi_exec support for new bit manipulation opcodes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-28 19:05:11 -04:00
Ilia Mirkin	ab4927f3e0	gallium/util: add helpers for bitfield manipulation Add bitwise reversing and signed MSB helpers for software implementation of the new TGSI opcodes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-28 19:05:07 -04:00
Ilia Mirkin	3e73bf2724	mesa/st: implement new bit manipulation opcodes Also pipe through [IU]MUL_HI, MAD, and lower ldexp. This provides coverage of all new ARB_gpu_shader5 functions except uaddCarry, usubBorrow and interpolateAt*. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-28 19:05:04 -04:00
Ilia Mirkin	a52eaba787	gallium: add new opcodes for ARB_gs5 bit manipulation support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-28 19:04:46 -04:00
Emil Velikov	b125c92aa9	glx/drisw: explicitly assign struct components for glx_*_vtable ... to improve readability of code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:13:39 +01:00
Emil Velikov	a2454bdfbd	glx/dri3: explicitly assign struct components for glx_*_vtable ... to improve readability of code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:13:39 +01:00
Emil Velikov	55d82adec6	glx/dri2: explicitly assign struct components for glx_*_vtable ... to improve readability of code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:13:38 +01:00
Emil Velikov	76ae25d7e8	glx/dri: explicitly assign struct components for glx_*_vtable ... to improve readability of code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:13:38 +01:00
Emil Velikov	2f519e4635	glx/indirect: explicitly assign struct components for glx_*_vtable ... to improve readability of code. Set indirect_screen_vtable as a static const. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:13:38 +01:00
Emil Velikov	31a3b58cb7	glx/apple: explicitly assign struct components for glx_*_vtable ... to improve readability of code. Set applegl_screen_vtable as a static const. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:13:38 +01:00
Emil Velikov	5f280d0c44	egl_dri: rework dri extension handling Use designated initialisers, and store the extensions pointers as const. The loader extensions __DRIdri2LoaderExtension and __DRIswrastLoaderExtension are setup by the platform backends so they should not be constified. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:13:38 +01:00
Emil Velikov	5457caa58c	gbm: cleanup __DRI*extension handling Use designated initialisers, store all extension pointers as const and use a const __DRIextensions array over assigning each element individually. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-28 19:13:38 +01:00
Emil Velikov	c812557a0e	dri_util: cleanup dri extension handling Explicitly set the version that is implemented, as that may differ from the one defined in dri_interface.h. The remaining __DRI*Extensions are treated as constants, so go ahead and declare them as such. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-28 19:13:38 +01:00
Emil Velikov	51e3569573	glx/tests: explicitly set __DRI2rendererQueryExtension members While we're here use the typecast'ed name and constify. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-28 19:13:38 +01:00
Emil Velikov	ecfe986120	glx/dri3: rework __DRIextension handling Use a const array with the extensions, rather than assigning each one to a fixed size array at runtime. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-28 19:13:37 +01:00
Emil Velikov	4be3874c97	glx/dri2: rework __DRIextension handling Make sure that the DRI*Extensions report the version of the interface implemented over the listed in the headers. While both are currently the same, this may change in the future. v2: Keep loader extensions handling as is. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:13:18 +01:00
Emil Velikov	98e2a8e2f9	st/dri: cleanup dri extension handling Explicitly set the version that is implemented, as that may differ from the one defined in dri_interface.h. Use designated initialisers and constify wherever possible. Note: __DRIimageExtension should not be made const as it's modified at runtime. This patch should have no side effects on compilers that do not support designated initialisers, as the existing code in dri/common already uses them. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:11:28 +01:00
Emil Velikov	748b35a69f	dri/radeon: use a const __DRIextension array Rather than keeping a separate and unused copy of the screen extensions within the radeon screen, use a constant array that can be used directly with __DRIscreen. [Kristian Høgsberg] The copy in the radeon screen isn't unused, that's where the array is built and stored, the dri screen just points to that. The pattern here was used for cases where the extensions exported by a dri driver could vary at runtime, for example depending on chipset. In this case, it's known at compile time, so it makes sense to use a static const array instead. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:11:27 +01:00
Emil Velikov	38f20f79da	drivers/dri: cleanup dri extension instantiation Uniformly use the typecasted extension name, constify extension instances and use designated initialisers. Set the implemented version of the extension, over the one defined in dri_interface.h. Patch covers the following extensions: __DRItexBufferExtension __DRIimageExtension __DRIrobustnessExtension __DRI2rendererQueryExtension __DRIdri2LoaderExtension Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:11:27 +01:00
Emil Velikov	9b42fd1772	dri_interface: Update __DRItexBufferExtensionRec to version 3 With commit e59fa4c46c8("dri2: release texture image.") we updated the extension without bumping the version number. The patch itself added an interface required to enable texture_from_pixmap on certain platforms. The new code was effectively never built, as it depended on __DRI_TEX_BUFFER_VERSION >= 3, which never came to be in upstream mesa. This commit bumps the version number, drops the __DRI_TEX_BUFFER_VERSION checks and resolves all the build conflicts. Additionally it adds a version check in egl and dri3, as they require version 2 of the extension, which does not have the releaseTexBuffer hook. Cc: Juan Zhao <juan.j.zhao@intel.com> Cc: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:11:27 +01:00
Jon TURNEY	ec8ebff342	Check for dladdr(), rather than assuming we have it if we have RTLD_DEFAULT Unfortunately, Cygwin defines RTLD_DEFAULT (for glibc compatibility), but can't provide dladdr(), so add a check for dladdr() Since I don't think scons is ever used to build for Cygwin, just set HAVE_DLADDR in SConscript, assuming that if we have RTLD_DEFAULT, we have dladdr(). Cc: Jonathan Gray <jsg@jsg.id.au> Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-28 19:11:02 +01:00
Richard Sandiford	6c8f547f66	util: Fix cross-compiles between endiannesses The old python code used sys.is_big_endian to select between little-endian and big-endian formats, which meant that the build and host endiannesses needed to be the same. This patch instead generates both big- and little- endian layouts, using PIPE_ARCH_BIG_ENDIAN to select between them. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Signed-off-by: José Fonseca <jfonseca@vmware.com>	2014-04-28 13:16:27 +01:00
Richard Sandiford	6944796cbe	util: Split out channel-parsing Python code Splits out the code that parses the channel list, so that we can have different lists for little and big endian. There is no change to the generated u_format_table.c. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Signed-off-by: José Fonseca <jfonseca@vmware.com>	2014-04-28 13:16:25 +01:00
Richard Sandiford	1a3746212d	util: Split out channel-printing Python code Rather than iterate over format.channels and format.swizzles directly, use Python subfunctions that take the channel and swizzle lists as arguments. This allows the channel and swizzle lists to depend on endianness. There is no change to the generated u_format_table.c. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Signed-off-by: José Fonseca <jfonseca@vmware.com>	2014-04-28 13:16:24 +01:00
Richard Sandiford	0ee3ac938a	util: Turn inv_swizzle into a global function With the big-endian changes, there can be two swizzle orders for each format. This patch turns Format.inv_swizzle() into a global function that takes the swizzle list as a parameter. There is no change to the generated u_format_table.c. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Signed-off-by: José Fonseca <jfonseca@vmware.com>	2014-04-28 13:16:22 +01:00
Richard Sandiford	227d7a6a3c	util: Add more query methods to u_format_parse.Format The main aim is to reduce the number of places that access channels[0], swizzles[0] and swizzles[1] directly. There is no change to the generated u_format_table.c. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Signed-off-by: José Fonseca <jfonseca@vmware.com>	2014-04-28 13:16:20 +01:00
Michel Dänzer	136c437cea	st/mesa: Fix NULL pointer dereference for incomplete framebuffers This can happen with glamor, which uses EGL_KHR_surfaceless_context and only explicitly binds GL_READ_FRAMEBUFFER for glReadPixels. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-28 12:12:03 +09:00
Chris Forbes	151a20dcd4	glsl: fix spelling of derived Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-04-27 21:37:23 +12:00
Ilia Mirkin	e88644c1f2	docs: mark off nv50/nvc0 for ARB_sample_shading, update relnotes relnotes weren't updated this whole time, so I went through all the GL3.txt changes and picked out the nouveau ones since 10.1. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-27 00:16:29 -04:00
Chia-I Wu	7b2dd89041	mesa: overhaul debug namespace support _mesa_HashTable is not well-suited for us: it locks a mutex unnecessarily and it does not accept 0 as the key (and has branches to handle 1 specially). What we really need is a sparse array. Whether it should be implemented as a hash table, a list, or a bsearch()-able array requires investigation of the use models. We choose to implement it as a list for now, assuming it is common to have a short list of IDs in each (source, type) namespace. The code is simpler, and the memory footprint is lower. This also fixes several corner cases such as allowing messages to have different states at different severities. v2: use GLbitfield for State/DefaultState, and add a comment Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:21 +08:00
Chia-I Wu	70e4337014	mesa: delay copying of debug groups Do not copy the debug group until it is about to be written. One likely scenario of using glPushDebugGroup/glPopDebugGroup is to enclose a sequence of GL commands and give them a human-readable description. There is no message control change in this scenario, and thus no need to copy. This also reduces the initial size of gl_debug_state from 306KB to 7KB. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:21 +08:00
Chia-I Wu	a30c4c6ca0	mesa: clean up debug output namespace handling Add functions to provide these operations on a struct gl_debug_namespace: init(): initialize the namespace copy(): copy all elements from one namespace to another clear(): clear all elements (to free the memories) set(): set the value of an element set_all(): set the value of all elements get(): get the value of an element A debug namespace is like a sparse array. The length of the array is huge, 2^sizeof(GLuint), but most of the elements assume the same value specified by set_all(). Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:21 +08:00
Chia-I Wu	44a1374793	mesa: clean up debug groups Add struct gl_debug_group to hold all namespaces of a debug group. Replace the 3-dimensional array, Namespaces, in struct gl_debug_state by a 1-dimensional array of type struct gl_debug_group. Turn the 4-dimensional array, Defaults, in struct gl_debug_state to a 1-dimensional array in struct gl_debug_namespace. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:21 +08:00
Chia-I Wu	e412305f9f	mesa: clean up debug message log Remove NextMsgLength, and move members of struct gl_debug_state that belong to the message log to a new struct, gl_debug_log. Rename gl_debug_msg to gl_debug_message. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:21 +08:00
Chia-I Wu	cf61ea3029	mesa: use accessors for struct gl_debug_state When GL_DEBUG_OUTPUT_SYNCHRONOUS is GL_TRUE, drivers are allowed to log debug messages from other threads. That requires gl_debug_state to be protected by a mutex, even when it is a context state. While we do not spawn threads in Mesa yet, this commit makes it easier to do when we want to. Since the definition of struct gl_debug_state is no longer needed by the rest of the driver, move it to main/errors.c. This should make it even harder to use the struct incorrectly. v2: add comments for the accessors Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	94e45c98e1	mesa: eliminate debug output message_insert Add validate_length, and call it together with log_msg directly instead of message_insert. No functional change. v2: make sure length is non-negative (i.e., known) before calling validate_length, noted by Timothy Arceri Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	188d22d9b7	mesa: eliminate debug output should_log In both call sites, it could be easily replaced by direct debug_is_message_enabled calls. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	c9dfb6b76c	mesa: eliminate debug output control_app_messages Merge control_app_messages with the only caller. Eliminate set_message_state and control_messages too as they are unused. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	274913c42c	mesa: eliminate debug output get_msg Merge get_msg with the only caller. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	04a8baad37	mesa: refactor _mesa_PopDebugGroup and _mesa_free_errors_data Replace free_errors_data by debug_clear_group. Add debug_pop_group and debug_destroy for use in _mesa_PopDebugGroup and _mesa_free_errors_data respectively. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	f1d00dce43	mesa: refactor _mesa_PushDebugGroup Move group copying to debug_push_group. Save the group message before pushing instead of after, since we will need it after popping. No functional change otherwise. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	de0e0ae4b6	mesa: refactor debug output control_messages Move most of the code to debug_set_message_enable_all. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	7e9451dc46	mesa: refactor debug output get_msg Move message fetching to debug_fetch_message and message deletion to debug_delete_messages. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	e9d1b5c8af	mesa: refactor debug out log_msg Move message logging to debug_log_message. Replace store_message_details by debug_message_store. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	880183fee8	mesa: refactor debug output set_message_state Move message state update to debug_set_message_enable. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	7554d27de4	mesa: refactor debug output should_log Move the message filtering logic to debug_is_message_enabled. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	672b209225	mesa: refactor _mesa_get_debug_state Move gl_debug_state allocation to a new function, debug_create. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Ilia Mirkin	9339f8ac1b	nvc0/ir: fetch shadow value from proper place for TG4 cube array Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-26 12:01:13 -04:00
Ilia Mirkin	b86d78b4c1	nvc0/ir: set gatherComp for non-shadow targets Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-26 12:01:13 -04:00
Ilia Mirkin	24e68c9024	nvc0/ir: set instance count based on the GS_INVOCATIONS property Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-26 12:01:13 -04:00
Ilia Mirkin	802fe8d9af	nvc0/ir: add support for INVOCATIONID system value Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-26 12:01:13 -04:00
Ilia Mirkin	b3a2398ade	nvc0/ir: add support for SAMPLEMASK sysval Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-26 11:57:18 -04:00
Ilia Mirkin	c3d2bda53e	mesa/st: translate gl_InvocationID to INVOCATIONID semantic Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-26 11:57:15 -04:00
Ilia Mirkin	389379e81d	mesa/st: translate gl_SampleMaskIn to SAMPLEMASK semantic Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-26 11:57:12 -04:00
Ilia Mirkin	4be146b108	gallium: add GS_INVOCATIONS property Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-26 11:57:09 -04:00
Ilia Mirkin	76db20fc67	gallium: add INVOCATIONID semantic Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-26 11:56:39 -04:00
Ilia Mirkin	af38ef907c	nvc0: add support for PIPE_CAP_SAMPLE_SHADING Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-26 11:53:34 -04:00
Ilia Mirkin	f715a0a39a	nv50: add support for PIPE_CAP_SAMPLE_SHADING Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-26 11:53:24 -04:00
Ilia Mirkin	c5d822dad9	mesa/st: add support for ARB_sample_shading Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-26 11:52:52 -04:00
Ilia Mirkin	88d8d88d8c	gallium: add basic support for ARB_sample_shading Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-26 11:52:01 -04:00
Enrico Horn	3a2885fb26	mapi: OpenVG symbol exports. Fixes another mistake in `144bbb7b78`. Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77502	2014-04-25 19:34:38 -07:00
Matt Turner	18993f7892	glsl: Use properly typed arguments for bitfieldInsert. bitfieldInsert takes scalar integers for its last two arguments. Since bitfieldInsert is lowered on i965 to two instructions that have more flexible arguments, I didn't notice when I wrote this. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-25 19:24:39 -07:00
Eric Anholt	07730e9463	i965: Don't bother flushing the batch if it doesn't ref our mt to map. -1.1372% +/- 0.858033% effect on cairo runtime on glamor (n=175). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-25 18:19:55 -07:00
Ander Conselvan de Oliveira	17860309f1	egl: Protect use of gbm_dri with ifdef HAVE_DRM_PLATFORM Otherwise it fails to compile if the drm egl platform is disabled. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-25 21:17:54 +01:00
Neil Roberts	63d4661ab2	wayland: Fix the logic in disabling the prime capability It looks like this bit of code is trying to disable the prime capability if the driver doesn't support createImageFromFds. However the logic looks a bit broken and what it would actually do is disable all other capabilities apart from prime. This patch fixes it to actually disable prime. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-25 21:17:05 +01:00
Ander Conselvan de Oliveira	49964fa28b	gbm: Set errno on errors This should give the caller some information of what caused the error. For the gbm_bo_import() case, for instance, it is possible to know if the import is not supported or the error was caused by an invalid parameter. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-25 21:16:45 +01:00
Ander Conselvan de Oliveira	aa91fe1c09	gbm/dri: Fix out-of-memory error path in dri_device_create() Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-25 21:16:00 +01:00
Emil Velikov	c0953cf06e	gallium/tests: conditionally include sw/dri winsys In all fairness we allow the gallium tests to be built with --disable-dri which will result in the appropriate winsys not being built, thus the build will fail. ./configure --disable-dri --with-gallium-drivers=svga --enable-gallium-tests Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-25 21:09:26 +01:00
Emil Velikov	6c44d43bae	automake: cleanup pipe-loader handling when using sw/xlib winsys Rather than defining our own set of variables, use NEED_WINSYS_XLIB and based on it include the sw/xlib winsys. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-25 21:09:17 +01:00
Emil Velikov	5c6a1445d5	pipe-loader: conditionally build and use pipe_loader_sw_probe_dri The function relies on the sw/dri winsys which is built only when --enable-dri is set. Fixes build issues with the following config ./configure --disable-dri --with-gallium-drivers=svga --enable-xa Issue can be reproduced with any hw gallium driver + st that uses the pipe-loader. Cc: Brian Paul <brianp@vmware.com> Reported-by: Brian Paul <brianp@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-25 21:09:09 +01:00
Roland Scheidegger	a7a03d84fc	llvmpipe: fix clearing of individual color buffers in a fb GL (3.0) allows you to clear individual color buffers in a fb. In fact for fbs containing both int and float/normalized color buffers this is required (because the clearing values are otherwise undefined if applied to all buffers). The gallium interface was changed a while ago, but llvmpipe ignored it (hence doing such individual clears always resulted in clearing all buffers, plus some assorted asserts due to the mixed fbs). So change the clear command to indicate the buffer to be cleared. Also, because indicating the buffer to be cleared would have made lp_rast_arg_cmd larger which is unacceptable (we're trying to shrink it some day) allocate the clear value in the scene and just pass a pointer. There's several advantages and disadvantages here: + clearing individual buffers works (we could also actually bin such clears now if they'd come through clear_render_target() if the surface is in the current fb, though we didn't do this before for the single rb case and still don't try). + since there's one clear per rb, we do the format conversion in setup rather than per bin. Aside from the (drop in the ocean...) performance advantage this means that clearing to very small values (that is, denormal when converted to the format) should work for small float (fp16 etc.) formats, as the util code couldn't handle it correctly before (because cpu denorms are disabled when executing the bin commands, screwing up the magic conversion and flushing the values to 0, though this was not verified). - there's some overhead for traditional old-style clear-all MRT cases, since there's one rast clear command per rb instead of one for all rbs. This fixes https://bugs.freedesktop.org/show_bug.cgi?id=76976. v2: get rid of the ugly manual memcpy stuff and just use union util_color. This is 32 bytes instead of 16 but as the allocation is per scene we can live with those additional 16 bytes (and the additional 128 bytes in the setup context), which makes the code much more obvious. Suggested by Brian. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-25 19:29:30 +02:00
Roland Scheidegger	fa4082320a	gallium/util: use ui[4] instead of ui in union util_color util_color often merely represents a collection of bytes, however it is inconvenient if those bytes can only be accessed as floats/doubles for int formats exceeding 32bits. (Note that rgba8 formats use one uint, not 4 bytes, hence the byte and short members were left as is.)	2014-04-25 19:29:30 +02:00
Roland Scheidegger	2f65f61bea	llvmpipe: (trivial) use correct LP_MIN_VECTOR_ALIGN define for alignment. Currently it's the same value. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-25 19:29:30 +02:00
Marek Olšák	3a3b1bf60e	r600g: fix hang on RV740 by using DX_RASTERIZATION_KILL instead of SX_MISC Changing SX_MISC hangs RV740. While we're at it, let's use DX_RASTERIZATION_KILL on all R700 and later chipsets. Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-25 01:33:13 +02:00
Marek Olšák	3d0c4f3b01	r600g: fix for an MSAA hang on RV770 Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-25 01:33:12 +02:00
Marek Olšák	ecc8a37ec5	r600g: fix for broken CULL_FRONT behavior on R6xx Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-25 01:33:12 +02:00
Marek Olšák	ef162cf13d	r600g: fix for HTILE on R6xx Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-25 01:33:12 +02:00
Marek Olšák	0967970768	r600g: fix buffer copying on R600-R700 This fixes broken rendering in DOTA 2. Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-25 01:33:12 +02:00
Marek Olšák	042e40f67b	r600g: fix flushing on RV670, RS780, RS880 again Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-25 01:33:12 +02:00
Marek Olšák	20a9b784da	r600g: fix MSAA resolve on R6xx when the destination is 1D-tiled Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-25 01:33:12 +02:00
Marek Olšák	6dd045ef40	r600g: disable async DMA on R700 Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org	2014-04-25 01:33:12 +02:00
Marek Olšák	e5741f1e91	r600g: fix edge flags and layered rendering on R600-R700 We forgot to set these bits. Cc: 10.1 mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-25 01:33:12 +02:00
Marek Olšák	8a1dfba73e	st/mesa: remove trailing NULL colorbuffers Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-25 01:33:12 +02:00
Marek Olšák	e522c455e4	r300g: don't crash when getting NULL colorbuffers Cc: mesa-stable@lists.freedesktop.org	2014-04-25 01:33:12 +02:00
Marek Olšák	ba4f6a5fc9	r300g: fix runtime warning after winsys cleanup Broken by: `b2238b3452` winsys/radeon: remove cs_write_reloc, add simpler cs_get_reloc	2014-04-25 01:33:12 +02:00
Marek Olšák	7920adb45c	radeonsi: implement GL_ARB_vertex_type_10f_11f_11f_rev Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-04-25 01:33:12 +02:00
José Fonseca	f438a82492	st/xlib: Do minimal version checking in glXCreateContextAttribsARB. The current version checking is wrongly refusing to create 3.3 contexts; unsupported versions are checked elsewhere; and the DRI path doesn't do this sort of checking either. This enables piglit glsl 3.30 tests to run without skipping. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-24 20:26:23 +01:00
José Fonseca	7380ce9bf6	llvmpipe: Advertise GLSL 3.30. According to Roland all TGSI support is there in theory. In practice there are a few piglit failures and crashes, as this hadn't been tested before. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-24 20:26:23 +01:00
José Fonseca	5f493eed69	st/xlib: Honour request of 3.1 contexts through core profile where available. The GLX_ARB_create_context_profile spec says: "If version 3.1 is requested, the context returned may implement any of the following versions: * Version 3.1. The GL_ARB_compatibility extension may or may not be implemented, as determined by the implementation. * The core profile of version 3.2 or greater." Mesa does not support GL_ARB_compatibility, and there are no plans to ever support it, therefore the only chance to honour a 3.1 context is through core profile, i.e., the 2nd alternative from the spec. This change does that. And with it piglit tests that require 3.1 contexts no longer skip. Assuming there is no objection with this change, src/glx/dri_common.c and src/gallium/state_trackers/wgl/stw_context.c should also be updated accordingly, given they have the same logic. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-24 20:26:23 +01:00
Zack Rusin	1c73e919a4	draw/llvm: reduce memory usage Let's make the draw_get_option_use_llvm function available unconditionally and use it to avoid useless allocations when LLVM paths are active. The TGSI machine is never used when we're using LLVM. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-04-24 13:59:24 -04:00
Brian Paul	552a8e44a9	docs: fix typo in 10.1.1 release notes URL	2014-04-24 08:37:23 -06:00
Brian Paul	0a92c88a51	swrast: move texture_slices() calls out of loops Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-04-24 08:16:01 -06:00
Brian Paul	1a7fa8b2eb	swrast: move null pointer check earlier in _swrast_map_teximage() There's no reason to compute texel size, stride, etc. if there's no image data to map. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-04-24 08:16:01 -06:00
Brian Paul	5e81e6e268	swrast: remove _mesa_ prefix from static function And add a const qualifier. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-04-24 08:16:01 -06:00
Brian Paul	7cc2e2e99d	swrast: allocate swrast_texture_image::ImageSlices array if needed Fixes a segmentation fault in conform divzero.c test. This happens when glTexImage(level, width=0, height=0) is called. We don't allocate texture memory in that case so the ImageSlices array was never allocated. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-04-24 08:16:01 -06:00
nick	15c92464df	swrast: Fix vertex color in _swsetup_Translate() Straightforward fix to properly load dest->color with color data, as opposed to position data as previously implemented. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27499 Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-24 08:16:00 -06:00
José Fonseca	1527a545a4	gallivm: Fix wrong operator in lp_exec_default. Courtesy of MSVC static code analyser. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-24 14:49:53 +01:00
José Fonseca	878877d3c4	mesa/st: Handle empty frame-buffers without asserting. Fixes assertion failures with radeonsi. Tested-by: Marek Olšák <maraeo@gmail.com>	2014-04-24 14:48:37 +01:00
José Fonseca	fd92346c53	mesa/st: Fix pipe_framebuffer_state::height for PIPE_TEXTURE_1D_ARRAY. This prevents buffer overflow w/ llvmpipe when running piglit bin/gl-3.2-layered-rendering-clear-color-all-types 1d_array single_level -fbo -auto v2: Compute the framebuffer size as the minimum size, as pointed out by Brian; compacted code; ran piglit quick test list (with no regressions.) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-23 19:12:23 +01:00
José Fonseca	7a8667f2b3	util/u_debug: Pass correct size to strncat. Courtesy of Clang static analyzer. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-23 19:12:23 +01:00
Rob Clark	05b3cea77b	freedreno/a3xx: fix TOTALATTRTOVS In cases where varying fetches are optimized away (just pass-through in vertex shader, but unused in fragment shader) we need to calculate the correct TOTALATTRTOVS based on the actual number of varyings fetched, otherwise lockup. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-04-23 07:32:16 -04:00
Kenneth Graunke	34a68345e2	i965: Make Broadwell HiZ path arrange for TC flushes. HiZ operations make the depth/render caches out of sync with the sampler caches. We need to arrange for a TC flush to happen before the target buffer is used by the sampler. Calling brw_render_cache_set_add_bo makes that happen. On previous generations, brw_blorp_exec took care of flushing the texture cache by calling intel_batchbuffer_emit_mi_flush after doing any rendering. If we were to use the normal drawing path, then brw_postdraw_set_buffers_need_resolve would handle this. On Broadwell, we don't use BLORP, and we don't emit a rectangle primitive via the normal drawing path. The 3DSTATE_WM_HZ_OP and PIPE_CONTROL implicitly make drawing happen. So, none of our existing code makes this flush happen - we need to do it directly. Fixes 11 Piglit copyteximage subtests. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77223 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77226 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-22 10:57:11 -07:00
Matt Turner	fe49949392	i965: Use uint16_t for control/src index tables. No need to use 32-bits to store 15 and 12. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-04-22 09:12:31 -07:00
Matt Turner	f02f489295	i965/disasm: Fix s/xoo/xor/ typo. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-22 09:12:31 -07:00
Matt Turner	06501b3cf0	i965/disasm: Remove tables with obvious mappings. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-22 09:12:31 -07:00
Ilia Mirkin	5ce3f2fe72	mesa/st: enable EXT_shader_integer_mix when NativeIntegers is on Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-22 11:27:34 -04:00
Christian König	7eda318ffe	st/omx/enc: implement frame reordering and B-frames Signed-off-by: Christian König <christian.koenig@amd.com>	2014-04-22 16:42:08 +02:00
Leo Liu	b03be6908e	st/omx/enc: replace omx buffer with texture buffer Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-22 15:13:08 +02:00
Michel Dänzer	360038fa50	radeonsi: Fix calculation of number of banks for SI The way cik_num_banks() was calculating the index only makes sense for the CIK specific macrotile mode array. For SI, we need to use the tile mode index directly. This happened to work most of the time because most of the SI tiling modes use the same number of banks. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-22 12:07:07 +09:00
Chris Forbes	0dfa6e7cf5	glsl: Only allow `invariant` on shader in/out between stages. Previously this was special-cased for VS and FS; it never got updated when geometry shaders came along. Generalize using is_varying_var() so this won't be broken again with tessellation. Note that there are two copies of the logic for `invariant`: It can be present as part of a new declaration, and also as a redeclaration of an existing variable or block member. Fixes the four new piglits: spec/glsl-1.50/compiler/invariant-qualifier-*.geom Note for stable: This won't quite pick cleanly due to whitespace and state->target -> state->stage renames. Should be straightforward adjustments though. Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-22 09:07:05 +12:00
Brian Paul	0a0075666c	svga: move draw debug code into separate function Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-04-21 14:54:28 -06:00
Brian Paul	e959274081	mesa: move declaration before code To fix MSVC build.	2014-04-21 13:24:26 -06:00
Anuj Phogat	f8ae2a56c6	mesa: Fix error code generation in glReadPixels() Section 4.3.1, page 220, of OpenGL 3.3 specification explains the error conditions for glReadPixels(): "If the format is DEPTH_STENCIL, then values are taken from both the depth buffer and the stencil buffer. If there is no depth buffer or if there is no stencil buffer, then the error INVALID_OPERATION occurs. If the type parameter is not UNSIGNED_INT_24_8 or FLOAT_32_UNSIGNED_INT_24_8_REV, then the error INVALID_ENUM occurs." Fixes failing Khronos CTS test packed_depth_stencil_error.test V2: Avoid code duplication Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-21 11:20:50 -07:00
Anuj Phogat	bd1880dfe8	mesa: Add an error condition in glGetFramebufferAttachmentParameteriv() From the OpenGL 4.4 spec page 275: "If pname is FRAMEBUFFER_ATTACHMENT_COMPONENT_TYPE, param will contain the format of components of the specified attachment, one of FLOAT, INT, UNSIGNED_INT, SIGNED_NORMALIZED, or UNSIGNED_NORMALIZED for floating-point, signed integer, unsigned integer, signed normalized fixed-point, or unsigned normalized fixed-point components respectively. If no data storage or texture image has been specified for the attachment, param will contain NONE. This query cannot be performed for a combined depth+stencil attachment, since it does not have a single format." Fixes Khronos CTS test: packed_depth_stencil_parameters.test Khronos Bug# 9170 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-21 11:20:50 -07:00
Brian Paul	7cb3bbf2cd	libgl-gdi: silence unused variable warning when not using LLVM	2014-04-21 09:50:53 -06:00
Brian Paul	1f043cd95a	docs: import 10.0.5 release notes and update links	2014-04-21 09:03:32 -06:00
Brian Paul	3fd9943a65	docs: import 10.1.1 release notes, update links	2014-04-21 09:03:32 -06:00
Benjamin Bellec	9b3b9c613f	mesa: fix GetStringi error message with correct function name Signed-off-by: Benjamin Bellec <b.bellec@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: <mesa-stable@lists.freedesktop.org>	2014-04-21 08:44:20 -06:00
Brian Paul	27496af67f	st/mesa: fix invalid pointer use in st_texture_get_sampler_view() The '*used' pointer was pointing into the stObj->sampler_views array. If 'free' was null, we'd realloc that array, thus making the 'used' pointer invalid. This soon led to memory errors. Just change the pointer to be 'used' so it points directly at the pipe_sampler_view. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-04-21 08:30:46 -06:00
Chris Forbes	9fec560e63	glsl: Fix typo Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-04-21 16:02:02 +12:00
Chris Forbes	d63026f62a	i965: Use ctx->Texture._MaxEnabledTexImageUnit for upper bound Avoid looping over 32/48/96 (!!) tex image units every draw, most of which we don't care about. Improves performance on everyone's favorite not-a-benchmark by 2.9% on Haswell. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-21 10:13:07 +12:00
Chris Forbes	c4a98e76d7	mesa: Track max enabled tex image unit This gives us a better bound for some hot loops in the drivers than MAX_COMBINED_TEXTURE_IMAGE_UNITS, which is ridiculously large on modern hardware, and only getting worse as more shader stages are added. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-21 10:12:00 +12:00
Ilia Mirkin	ba6dcb3c2b	nouveau/codegen: add missing values for OP_TXLQ into the target arrays Also rework things so that if someone were to add an opcode without adjusting the values in these arrays, there will be a compilation error. This fixes a few quadop-related piglit regressions since commit `d5faf8e786`. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-19 13:23:32 -04:00
Ilia Mirkin	47c19a5819	nvc0: change logic for centering of eng2d blit when downsampling We want to center the sample. The old code may have been correct given the limited values of ms_x/y, but the new logic should be more intuitive. Note that ms_x can only be 1/2 and ms_y can only be 0/1. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-19 13:23:32 -04:00
Ilia Mirkin	6d5c3c8260	nv50: use 2d blit when src/dst have same number of samples The 2D engine should be usable in more cases, but this fixes MS blits between textures with the same MS settings. Otherwise a single sample is selected to be the target texel value. This allows other tests to work that render to a RB and then blit that to a texture for input into a shader that uses sampler2DMS to verify it. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-19 13:23:32 -04:00
Ilia Mirkin	2d2e60bdee	gallium/docs: fix PIPE_CAP_ENDIANNESS delimiter, remove trailing spaces Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-19 13:23:32 -04:00
Petri Latvala	b45f65e760	mesa: update glext.h to version 20140313 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-18 14:30:57 -07:00
Kenneth Graunke	a1273a07ed	i965/fs: Implement fs_inst::force_sechalf support on Broadwell. Back when I originally wrote this code, force_sechalf was only used for Gen4 code, so I didn't bother hooking it up. However, it's used more generally these days. In particular, we use it for computing gl_SamplePosition. Fixes Piglit's spec/ARB_sample_shading/builtin-gl-sample-position tests. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77222 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-18 11:57:33 -07:00
Chris Forbes	92840aabf7	glsl: Allow explicit binding on atomics again As of `943b2d52bf`, layout(binding) on an atomic would fail the assertion here. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-18 10:35:05 -07:00
Alex Deucher	7489f3eeda	radeonsi: fix num banks selection on SI for dma setup (v2) The number of banks varies based on the tile mode index just like CIK. Bug: https://bugs.freedesktop.org/show_bug.cgi?id=77533 v2: fix ordering for nbanks calculation for consistency Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-18 13:24:12 -04:00
Matt Turner	f770123f58	i965/fs: Reduce restrictions on interference in register coalescing. We previously only allowed coalescing registers that interfere (i.e., whose live ranges overlap) if the destination register's live range was entirely inside the source's live range. This is unnecessary -- we only need to check for interfering writes in the intersection of their live ranges. total instructions in shared programs: 1639470 -> 1638453 (-0.06%) instructions in affected programs: 84751 -> 83734 (-1.20%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-18 09:16:19 -07:00
Matt Turner	55de1c035c	i965/fs: Give up in interference check if we see a WHILE. Rather than any old control flow. Muchnick's algorithm just checks for interfering writes between the MOV and the end of the program. Handling this when you have backward branches is hard, so don't, but there's no reason to bail if you see forward branches. instructions in affected programs: 4270 -> 4248 (-0.52%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-18 09:16:19 -07:00
Matt Turner	5ff1e446d4	i965/fs: Simplify interference scan in register coalescing. We were starting at the beginning of the instruction list, rather than with the MOV instruction itself. This allows us to coalesce after control flow. Excluding the shaders from an unreleased title, the shader-db results: total instructions in shared programs: 1603791 -> 1594215 (-0.60%) instructions in affected programs: 678772 -> 669196 (-1.41%) GAINED: 5 LOST: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-18 09:16:19 -07:00
Matt Turner	04a4e43eb2	i965/fs: Unindent can_coalesce_vars(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-18 09:16:19 -07:00
Matt Turner	a975b2f55c	i965/fs: Recognize nop-MOV instructions early. And avoid rewriting other instructions unnecessarily. Removes a few self-moves we weren't able to handle because they were components of a large VGRF. instructions in affected programs: 830 -> 826 (-0.48%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-18 09:16:19 -07:00
Matt Turner	ef6127ff69	i965/fs: Only sweep NOPs if register coalescing made progress. Otherwise there's nothing to do. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-18 09:16:19 -07:00
Marek Olšák	352e06ddea	r600g,radeonsi: don't skip the context flush if a fence should be returned Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77589	2014-04-18 13:33:57 +02:00
Brian Paul	744d2a225d	svga: fix comment for emit_adjusted_vertex_attribs()	2014-04-17 16:15:37 -06:00
Brian Paul	cb34575e19	svga: compute need_swvfetch in svga_create_vertex_elements_state() This saves us doing it at state validation time. Reviewed-by: Matthew McClure <mcclurem@vmware.com>	2014-04-17 11:31:15 -07:00
Brian Paul	851645a3e7	svga: add VS code to set attribute W component to 1 There's a few 3-component vertex attribute formats that have no equivalent SVGA3D_DECLTYPE_x format. Previously, we had to use the swtnl code to handle them. This patch lets us use hwtnl for more vertex attribute types by fetching 3-component attributes as 4-component attributes and explicitly setting the W component to 1. This lets us handle PIPE_FORMAT_R16G16B16_SNORM/UNORM and PIPE_FORMAT_R8G8B8_UNORM vertex attribs without using the swtnl path. Fixes piglit normal3b3s GL_SHORT test. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-04-17 11:29:33 -07:00
Brian Paul	615a356ee3	svga: implement support for signed byte vertex attributes There's no SVGA3D_DECLTYPE that directly corresponds to PIPE_FORMAT_R8G8B8_SNORM. Previously, we used the swtnl fallback path to handle this but that's slow and causes invariance issues. Now we fetch the attribute as SVGA3D_DECLTYPE_UBYTE4N and insert some extra VS instructions to remap the attributes from the range [0,1] to the range [-1,1]. Fixes Sauerbraten sw fallback. Fixes piglit normal3b3s-invariance test. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-04-17 11:29:33 -07:00
Brian Paul	52faafa174	svga: move translated vertex declaration types into svga_velems_state Now only translate the formats once in svga_create_vertex_elements_state(). And rename the array and use the proper SVGA3dDeclType type. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-04-17 11:29:32 -07:00
Brian Paul	0f5add1959	Revert "svga: add work-around for Sauerbraten Z fighting issue" This reverts commit `c875d6e57a`. Conflicts: src/gallium/drivers/svga/svga_context.c This work-around will no longer be needed after the next patch which properly supports signed-byte vertex attributes. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-04-17 11:29:32 -07:00
Brian Paul	7c7ab5434a	svga: use new inst_token_setp() helper function Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-04-17 11:29:32 -07:00
Brian Paul	8e131576ee	svga: use new inst_token_predicated() helper function Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-04-17 11:29:32 -07:00
Kenneth Graunke	71846a943f	i965: Retype pre-Gen6 varying pull load destination to UW. This sets up the proper execution mask for sends in SIMD16 mode. Fixes Piglit's glsl-fs-normalmatrix, glsl-fs-uniform-array-2, glsl-fs-uniform-array-6, and glsl-fs-uniform-array-7 on Ironlake, which regressed when I enabled SIMD16 pull parameter support in commit `b207e88b25`. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-17 10:54:00 -07:00
Anuj Phogat	ee10e893cb	mesa: Fix error condition for multisample proxy texture targets Fixes failures in Khronos OpenGL CTS test proxy_textures_invalid_samples Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-17 10:26:39 -07:00
Anuj Phogat	1d350b9e22	i965: Add glBlitFramebuffer to commands affected by conditional rendering Fixes failures in Khronos OpenGL CTS test conditional_render_test9 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-17 10:26:39 -07:00
Anuj Phogat	8ed42ddd7d	swrast: Add glBlitFramebuffer to commands affected by conditional rendering Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-17 10:26:05 -07:00
Anuj Phogat	48fc2703e5	i965: Fix component mask and varying_to_slot mapping for gl_ViewportIndex gl_ViewportIndex doesn't get its own varying slot. It is stored in VARYING_SLOT_PSIZ.z. This patch fixes the issue for both gen7 and gen8 because gen7_upload_3dstate_so_decl_list() is shared between them. Fixes failures in OpenGL Khronos CTS test transform_feedback_builtins. Makes new piglit test glsl-1.50-transform-feedback-builtins pass for 'gl_ViewportIndex'. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-17 10:08:28 -07:00
Anuj Phogat	7928b9c249	i965: Fix component mask and varying_to_slot mapping for gl_Layer gl_Layer doesn't get its own varying slot. It is stored in VARYING_SLOT_PSIZ.y. This patch fixes the issue for both gen7 and gen8 because gen7_upload_3dstate_so_decl_list() is shared between them. Fixes failures in OpenGL Khronos CTS test transform_feedback_builtins. Makes new piglit test glsl-1.50-transform-feedback-builtins pass for 'gl_Layer'. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-17 10:08:28 -07:00
Anuj Phogat	969b461c2b	i965: Put an assertion to check valid varying_to_slot[varying] Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-17 10:08:28 -07:00
Darren Powell	bc86690f13	radeonsi: Added Diag Handler to receive LLVM Error messages Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-04-17 19:37:58 -04:00
Marek Olšák	9f9ab8ec0d	winsys/radeon: remove some unused code Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-04-17 13:54:19 +02:00
Marek Olšák	8b966bcaf2	winsys/radeon: remove is_handle_added array Use index -1 if a buffer is not added. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-04-17 13:54:19 +02:00
Marek Olšák	b0fca0a378	winsys/radeon: remove local variable reloc from radeon_get_reloc Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-04-17 13:54:18 +02:00
Marek Olšák	3384a41aa9	winsys/radeon: remove parameter reloc from radeon_get_reloc Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-04-17 13:54:18 +02:00
José Fonseca	75e487538d	util: Add __declspec(noreturn) to _debug_assert_fail(). Mostly for consistency; as MSVC's static source code analysis doesn't seem to rely on assertions, but instead on different kind of source annotations( http://msdn.microsoft.com/en-us/library/hh916383.aspx ). Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-17 09:56:49 +01:00
José Fonseca	a2b89c4ae1	auxiliary/os,auxiliary/util: Fix the `‘noreturn’ function does return` warning. Now that _debug_assert_fail() has the noreturn attribute, it is better that execution truly never returns. Not just for the sake of silencing the warning, but because the code at the return IP address may be invalid or lead to inconsistent results. This removes support for the GALLIUM_ABORT_ON_ASSERT debugging environment variable, but between the usefulness of GALLIUM_ABORT_ON_ASSERT and better static code analysis I think better static code analysis wins. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-17 09:56:48 +01:00
José Fonseca	97fa9cd220	scons: Enable building through Clang Static Analyzer. Same intent as commit `a45a50a482`, but this time the C compiler is detected via C-preprocessor macros, similar to how autotools do it, as that seems to be the most reliable method. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-17 09:56:48 +01:00
Maarten Lankhorst	74f19445cc	gallium glsl: Fix crash with piglit fs-deref-literal-array-of-structs.shader_test This allows the following shader code to work without a weird crash: struct Foo { int value[1]; }; int actual_value = Foo[2](Foo(int[1](100)), Foo(int[1](200)))[i].value[0]; Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2014-04-17 10:34:10 +02:00
Maarten Lankhorst	49d26a277d	nouveau/vdec: small fixes to h264 handling nouveau_vp3_inter_sizes requires slice_count as argument just as the other places that call it from h264 code do. Hopefully fixes something. Fix the status_vp code to allow status == 0 too, when processing hasn't started yet. Set h264->second_field correctly.	2014-04-17 10:30:39 +02:00
Thomas Hellstrom	09cd376353	st/xa: Cache render target surface Otherwise it will trick the gallium driver into thinking that the render target has actually changed (due to different pipe_surface pointing to same underlying pipe_resource). This is really bad for tiling GPUs like adreno. This also appears to fix a rendering error with Motif on vmwgfx. Why that is remains under investigation. Based on an idea by Rob Clark. Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-04-17 09:56:28 +02:00
Rob Clark	a45ae814d1	st/xa: scissor to help tilers Keep track of the maximal bounds of all the operations and set scissor accordingly. For tiling GPUs this can be a big win by reducing the memory bandwidth spent moving pixels from system memory to tile buffer and back. You could imagine being more sophisticated and splitting up disjoint operations. But this simplistic approach is good enough for the common cases. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2014-04-17 09:42:06 +02:00
Rob Clark	3c52013273	st/xa: remove unneeded args Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2014-04-17 09:40:42 +02:00
Iago Toral Quiroga	cda5e0c25e	glsl: Small optimization for constant conditionals Once the relevant branch has been identified do not iterate over the instructions in the branch, do a linked list insertion instead to avoid the loop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-16 23:39:57 -07:00
Iago Toral Quiroga	4472ab9e6d	glsl: Fix incorrect indentation. Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-16 23:22:24 -07:00
Chris Forbes	d1b6f67110	meta: Clip src/dest rects in BlitFramebuffer, using the scissor Fixes piglit's fbo-blit-stretch test on drivers which use the meta path. (i965: should fix Broadwell, but also fixes Sandybridge/Ivybridge/Haswell since this test falls off the blorp path now due to format conversion) V2: Use scissor instead of just mangling the rects, to avoid texcoord rounding problems. (Thanks Marek) V3: Rebase on Eric's CTSI meta changes; re-add _mesa_update_state in the CTSI path so that _mesa_clip_blit sees the correct bounds. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77414 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-04-17 18:11:24 +12:00
Samuel Iglesias Gonsalvez	9927180714	mesa: fix check for dummy renderbuffer in _mesa_FramebufferRenderbufferEXT() According to the spec: <renderbuffertarget> must be RENDERBUFFER and <renderbuffer> should be set to the name of the renderbuffer object to be attached to the framebuffer. <renderbuffer> must be either zero or the name of an existing renderbuffer object of type <renderbuffertarget>, otherwise an INVALID_OPERATION error is generated. This patch changes the previous returned GL_INVALID_VALUE to GL_INVALID_OPERATION. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76894 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>	2014-04-16 23:00:40 -07:00
Matt Turner	42a26cb5e4	i965: Don't make instructions with a null dest a barrier to scheduling. Now that we properly track accumulator dependencies, the scheduler is able to schedule instructions between the mach and mov in the common integer multiplication pattern: mul acc0, x, y mach null, x, y mov dest, acc0 Since a null destination implies no dependency on the destination, we can also safely schedule instructions (that don't write the accumulator) between the mul and mach. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-16 22:46:45 -07:00
Juha-Pekka Heikkila	a6860100b8	i965/fs: Change fs_visitor::emit_lrp to use MAC for gen<6 This allows us to emit ADD/MUL/MAC instead of MUL/ADD/MUL/ADD, saving one instruction and two temporary registers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-04-16 22:46:45 -07:00
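The instruction saving claimed for emit_lrp can be checked with a small model. This is only a sketch of the arithmetic, not the driver's emitter: the function names are hypothetical, and which product is routed through the accumulator is an assumption made for illustration.

```python
# Model of lrp(x, y, a) = x * (1 - a) + y * a in two instruction sequences.
# Each assignment stands for one hardware instruction.

def lrp_mul_add(x, y, a):
    """MUL/ADD/MUL/ADD: four instructions, three temporaries."""
    y_times_a = y * a                        # MUL
    one_minus_a = -a + 1.0                   # ADD
    x_times_one_minus_a = x * one_minus_a    # MUL
    return x_times_one_minus_a + y_times_a   # ADD

def lrp_mac(x, y, a):
    """ADD/MUL/MAC: three instructions, one temporary plus the accumulator."""
    one_minus_a = -a + 1.0                   # ADD
    acc = y * a                              # MUL that also writes the accumulator (assumed)
    return x * one_minus_a + acc             # MAC: multiply, then add the accumulator

result = lrp_mac(2.0, 10.0, 0.25)
```

Both sequences compute the same value; the MAC form drops one instruction and the two temporaries that held the separate products.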
Juha-Pekka Heikkila	da0c3b02e7	i965/fs: Add support for the MAC instruction. This allows us to generate the MAC (multiply-accumulate) instruction, which can be used to implement some expressions in fewer instructions than doing a series of MUL and ADDs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-04-16 22:46:45 -07:00
Juha-Pekka Heikkila	2dfbbeca50	i965/vec4: Change vec4_visitor::emit_lrp to use MAC for gen<6 This allows us to emit ADD/MUL/MAC instead of MUL/ADD/MUL/ADD, saving one instruction and two temporary registers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-04-16 22:46:45 -07:00
Juha-Pekka Heikkila	0974706671	i965/vec4: Add support for the MAC instruction. This allows us to generate the MAC (multiply-accumulate) instruction, which can be used to implement some expressions in fewer instructions than doing a series of MUL and ADDs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-04-16 22:46:45 -07:00
Juha-Pekka Heikkila	306ed81b93	i965: Add writes_accumulator flag Our hardware has an "accumulator" register, which can be used to store intermediate results across multiple instructions. Many instructions can implicitly write a value to the accumulator in addition to their normal destination register. This is enabled by the "AccWrEn" flag. This patch introduces a new flag, inst->writes_accumulator, which allows us to express the AccWrEn notion in the IR. It also creates an ALU2_ACC macro to easily define emitters for instructions that implicitly write the accumulator. Previously, we only supported implicit accumulator writes from the ADDC, SUBB, and MACH instructions. We always enabled them on those instructions, and left them disabled for other instructions. To take advantage of the MAC (multiply-accumulate) instruction, we need to be able to set AccWrEn on other types of instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-04-16 22:46:45 -07:00
Juha-Pekka Heikkila	30c35d1dcb	i965: Add is_accumulator() function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-04-16 22:46:45 -07:00
Matt Turner	6541f1b4d0	i965: Add reads_accumulator_implicitly() function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-16 22:46:44 -07:00
Anuj Phogat	cb6566f9df	mesa: Add error condition for integer formats in glGetTexImage() OpenGL 4.0 spec, page 306 suggests an INVALID_OPERATION in glGetTexImage if : "format is one of the integer formats in table 3.3 and the internal format of the texture image is not integer, or format is not one of the integer formats in table 3.3 and the internal format is integer." V2: Use helper function _mesa_is_format_integer() Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-16 18:37:06 -07:00
Anuj Phogat	3135668254	mesa: Add helper function _mesa_is_format_integer() This function will be used in the following patch. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-16 18:37:06 -07:00
Anuj Phogat	fdd8bebc22	mesa: Fix glGetVertexAttribi(GL_VERTEX_ATTRIB_ARRAY_SIZE) mesa currently returns 4 when GL_VERTEX_ATTRIB_ARRAY_SIZE is queried for a vertex array initially set up with size=GL_BGRA. This patch makes changes to return size=GL_BGRA as required by the spec. Fixes Khronos OpenGL CTS test: vertex_array_bgra_basic.test V2: Use array->Format instead of adding a new variable Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: <mesa-stable@lists.freedesktop.org>	2014-04-16 18:37:06 -07:00
Anuj Phogat	80b4a36fed	glsl: Fix copy-paste error in linker_warning() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-16 18:37:06 -07:00
Michel Dänzer	7286739b9b	r600g: Disable LLVM by default at runtime for graphics For graphics, the LLVM compiler backend currently has many shortcomings compared to the non-LLVM one. E.g. it can't handle geometry shaders yet, but that's just the tip of the iceberg. So building Mesa with --enable-r600-llvm-compiler is currently not recommended for anyone who doesn't want to work on fixing those issues. However, for protection of users who end up enabling it anyway for some reason, let's disable the LLVM backend at runtime by default. It can be enabled with the environment variable R600_DEBUG=llvm. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-04-17 10:15:59 +09:00
Roland Scheidegger	f23d1160c2	gallivm: fix compilation with llvm 3.5 r206241+ Just adjust to the ever-changing API, pass in MCContext when creating the MCDisassembler. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-04-16 19:57:47 +02:00
José Fonseca	e3c58cdfd9	Revert "scons: Enable building through Clang Static Analyzer." This reverts commit `a45a50a482`. Unfortunately gcc dumps argv[0] as the first word of --version, so it is unreliable for detecting gcc. In particular `cc --version` and `i686-w64-mingw32-gcc --version` give wrong results. A better solution needs to be found -- most likely using C-preprocessing like autotools does. Revert for now.	2014-04-16 13:18:06 +01:00
Marek Olšák	11459436d9	r600g,radeonsi: share some of gfx flush code Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:52 +02:00
Marek Olšák	adfadeadd8	r600g,radeonsi: share r600_flush_from_st Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:52 +02:00
Marek Olšák	586011486d	r600g: merge r600_flush with r600_context_flush Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
Marek Olšák	d4edc60767	radeonsi: merge si_flush with si_context_flush This also removes si_flush_gfx_ring. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
Marek Olšák	70cf6639c3	gallium/radeon: create and return a fence in the flush function All flush functions get a fence parameter. cs_create_fence is removed. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
Marek Olšák	3e9d2cbca2	r600g: remove redundant r600_flush_dma_from_winsys Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
Marek Olšák	dd72c327e9	winsys/radeon: fold cs_set_flush_callback into cs_create Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
Marek Olšák	c6033a6cb8	radeonsi: cleanup redundant computation of flush flags and rename a function Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
Marek Olšák	fc151b08be	r600g: remove redundant r600_flush_from_winsys Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
Marek Olšák	b2238b3452	winsys/radeon: remove cs_write_reloc, add simpler cs_get_reloc The only difference is that it doesn't write to the CS and only returns the index. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
Marek Olšák	927213f33d	winsys/radeon: consolidate hash table lookup I should have done this long ago. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
José Fonseca	d3c0e236f2	scons: Add an analyze option. For Clang static code analyzer, the scan-build script will produce more comprehensive output. Nevertheless you can invoke it as `CC=clang CXX=clang++ scons analyze=1`. For MSVC this is the best way to use its static code analysis. Simply invoke as `scons analyze=1`. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-16 11:44:21 +01:00
José Fonseca	f81305c0cb	util/u_debug: Add noreturn attribute to _debug_assert_fail(). As recommended by http://clang-analyzer.llvm.org/annotations.html#attr_noreturn Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-16 11:44:17 +01:00
José Fonseca	a45a50a482	scons: Enable building through Clang Static Analyzer. By accurately detecting gcc/clang through --version option instead of executable name. Clang Static Analyzer reports many issues, most false positives, but it found at least one real and subtle use-after-free issue in st_texture_get_sampler_view(): http://people.freedesktop.org/~jrfonseca/scan-build-2014-04-14-1/report-869047.html#EndPath Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-16 11:44:06 +01:00
Iago Toral Quiroga	6d0e30c6a3	glsl: Properly handle blocks that define the same field name. Currently we can have name space collisions between blocks that define the same fields. For example: in block { vec4 Color; } In[]; out block { vec4 Color; } Out; These two blocks will assign the same interface name (block.Color) to the Color field in flatten_named_interface_blocks_declarations.cpp, leading to havoc. This was badly breaking the gl-320-primitive-shading test from ogl-samples. The patch includes the block instance name in the flattened name, producing names like block.In.Color and block.Out.Color, so the two fields no longer clash. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76394 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 22:18:43 -07:00
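The naming collision in the interface block commit is easy to reproduce in miniature. The helper below is hypothetical and only mirrors the naming scheme quoted in the message; it is not the code in flatten_named_interface_blocks_declarations.cpp.

```python
# Hypothetical helper: builds the flattened name for a field of a named
# interface block, with or without the block's instance name.

def flattened_name(block_name, field, instance_name=None):
    parts = [block_name]
    if instance_name is not None:
        parts.append(instance_name)
    parts.append(field)
    return ".".join(parts)

# Old scheme: the "In" and "Out" instances of "block" both map to the same name.
old_in = flattened_name("block", "Color")
old_out = flattened_name("block", "Color")

# New scheme: the instance name keeps the two fields apart.
new_in = flattened_name("block", "Color", "In")
new_out = flattened_name("block", "Color", "Out")
```

With the instance name included, the input and output fields get distinct keys, which is the property the fix relies on.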
Michel Dänzer	6ac5a5e383	r600g/radeonsi: Map transfer staging texture unsynchronized when possible The transfer staging texture is always freshly allocated, so for write-only transfers we don't need to explicitly wait for the BO to become idle. Squeezes a few hundred MB/s more out of x11perf -shmput500 with glamor. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-16 12:11:27 +09:00
Matt Turner	9fed627234	Revert "i965/fs: Only sweep NOPs if register coalescing made progress." This reverts commit `f092e8951c`. Didn't mean to push this...	2014-04-15 17:27:55 -07:00
Matt Turner	f092e8951c	i965/fs: Only sweep NOPs if register coalescing made progress. Otherwise there's nothing to do.	2014-04-15 16:28:04 -07:00
Eric Anholt	7ae870211d	i965: Fix buffer overruns in MSAA MCS buffer clearing. This manifested as rendering failures or sometimes GPU hangs in compositors when they accidentally got MSAA visuals due to a bug in the X Server. Today we decided that the problem in compositors was equivalent to a corruption bug we'd noticed recently in resizing MSAA-visual glxgears, and debugging got a lot easier. When we allocate our MCS MT, libdrm takes the size we request, aligns it to Y tile size (blowing it up from 300x300=90000 bytes to 384*320=122880 bytes, 30 pages), then puts it into a power-of-two-sized BO (131072 bytes, 32 pages). Because it's Y tiled, we attach a 384-byte-stride fence to it. When we memset by the BO size in Mesa, between bytes 122880 and 131072 the data gets stored to the first 20 or so scanlines of each of the 3 tiled pages in that row, even though only 2 of those pages were allocated by libdrm. In the glxgears case, the missing 3rd page happened to consistently be the static VBO that got mapped right after the first MCS allocation, so corruption only appeared once window resize made us throw out the old MCS and then allocate the same BO to back the new MCS. Instead, just memset the amount of data we actually asked libdrm to allocate for, which will be smaller (more efficient) and not overrun. Thanks go to Kenneth for doing most of the hard debugging to eliminate a lot of the search space for the bug. Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77207 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 14:34:47 -07:00
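The size arithmetic in the MCS overrun message can be reproduced with a short sketch. The Y-tile geometry used here (128 bytes wide by 32 rows, one 4096-byte page per tile) and the one-byte-per-pixel MCS layout are assumptions chosen to match the figures quoted in the message; the helper names are hypothetical and this is not libdrm's allocator.

```python
PAGE_SIZE = 4096
TILE_WIDTH_BYTES = 128   # assumed Y-tile width
TILE_HEIGHT_ROWS = 32    # assumed Y-tile height

def align(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def next_power_of_two(n):
    size = 1
    while size < n:
        size *= 2
    return size

def mcs_sizes(width, height):
    """Return (requested, tile_aligned, bo_size) in bytes."""
    requested = width * height  # one byte per pixel (assumed)
    tile_aligned = align(width, TILE_WIDTH_BYTES) * align(height, TILE_HEIGHT_ROWS)
    bo_size = next_power_of_two(tile_aligned)
    return requested, tile_aligned, bo_size

requested, tile_aligned, bo_size = mcs_sizes(300, 300)
overrun = bo_size - tile_aligned  # bytes a BO-sized memset writes past the tiled surface
```

For a 300x300 request the gap between the tile-aligned size and the power-of-two BO size is two pages, which is the range a memset of the full BO size spills into.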
Eric Anholt	e5b86cb64b	meta: Add support for MSAA resolves from 2D_MS_ARRAY textures. We don't have any piglit tests for this currently. v2: Use vec3s for the texcoords so it has some hope of working. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 14:34:45 -07:00
Eric Anholt	234db60954	meta: Add an accelerated glCopyTexSubImage using glBlitFramebuffer. You'll note from the previous commits that there's something of a loop here: You call CTSI, which calls BlitFB, then if things go wrong that falls back to CTSI. As a result, meta CTSI reaches over into blitfb to tell it "no, don't try that fallback". v2: Drop the _mesa_update_state(), which was only necessary due to use of _mesa_clip_blit() in _mesa_meta_BlitFramebuffer() in another patch series. v3: Drop an _EXT suffix I copy-and-pasted. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v2) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 14:34:22 -07:00
Eric Anholt	70961c032f	meta: Add support for CUBE_MAP_ARRAY to generatemipmap. I added support to bind_fbo_image in the process of building meta CopyTexSubImage, and found that it broke generatemipmap because previously we would just throw a GL error there and then end up with an incomplete FBO and fallback. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 14:34:22 -07:00
Eric Anholt	bb3f983d10	meta: Infer bind_fbo_image parameters from an incoming image. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 14:34:22 -07:00
Eric Anholt	cd808ac848	meta: Move bind_fbo_image() code back to meta.c, to reuse it elsewhere. I need to do the same code again for CopyTexSubImage(). v2: Drop incorrect, not-terribly-useful comment (review by Ken) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 14:34:22 -07:00
Eric Anholt	4cc42805e7	meta: Refactor the BlitFramebuffer depth CopyTexImage fallback. This avoids a ReadPixels() if there's accelerated CopyTexImage present. It now requires GLSL as opposed to just fragment programs, but we don't have any drivers that do ARB_fp but not GLSL. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 14:34:22 -07:00
Eric Anholt	b702233f53	meta: Refactor the BlitFramebuffer color CopyTexImage fallback. There shouldn't be anything special about copying out a subset of the src rb to a temp before texturing from it, so just do it when we're figuring out our src texture binding. This drops Anuj's change to copy an extra border of 1 pixel around the src area. I can't see how that change could be valid, and presumably if there's some filtering problem at edges we just need to set the right wrap mode. v2: Don't fall back to swrast on non-2D/RECT/2D_MS textures when we can still CopyTexSubImage. Fixes a segfault regression on i965 with gl-3.2-layered-rendering-blit. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Tested-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-04-15 14:34:06 -07:00
Eric Anholt	4e43299633	meta: Drop blit src size fallback. I think we can assert that renderbuffer size is <= maximum 2D texture size. Our source coordinates should have already been clipped to the src renderbuffer size, but haven't actually (so we could potentially have trouble if there's scaling, and we're in the CopyTexImage path that tries to use src size). However, this texture size dependency was blocking the next refactors, so I'm not sure if we want to go ahead with this series before we get the clipping sorted out or not. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 12:27:37 -07:00
Mike Stroyan	602510395a	i965: Avoid dependency hints on math opcodes Putting NoDDClr and NoDDChk dependency control on instruction sequences that include math opcodes can cause corruption of channels. Treat math opcodes like send opcodes and suppress dependency hinting. Signed-off-by: Mike Stroyan <mike@LunarG.com> Tested-by: Tony Bertapelli <anthony.p.bertapelli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-15 10:31:46 -07:00
Matt Turner	ad48a9a319	i965: Expand INTEL_DEBUG to uint64_t. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 10:29:00 -07:00
Matt Turner	58db339599	dri: Expand driParseDebugString return value to uint64_t. Users will downcast if they don't have >32 debug flags. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 10:28:57 -07:00
Matt Turner	73400d8f70	i965/fs: Remove dead_code_eliminate_local(). Subsumed by the new dead_code_eliminate() function. No shader-db changes. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-15 09:25:47 -07:00
Matt Turner	18d12336b9	i965/fs: Clear variable from live-set if it's completely overwritten. One program affected: instructions in affected programs: 246 -> 244 (-0.81%) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-15 09:25:44 -07:00
Matt Turner	f34f39330b	i965/fs: Reimplement dead_code_elimination(). total instructions in shared programs: 1653399 -> 1651790 (-0.10%) instructions in affected programs: 92157 -> 90548 (-1.75%) GAINED: 2 LOST: 2 Also significantly reduces the number of optimization loop iterations: total loop iterations in shared programs: 39724 -> 31651 (-20.32%) loop iterations in affected programs: 21617 -> 13544 (-37.35%) Including some great pathological cases, like 29 -> 3 in Strike Suit Zero and 24 -> 3 in Dota2. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-15 09:25:11 -07:00
Matt Turner	596737ee91	i965/vec4: Let DCE eliminate dead writes in other basic blocks. We previously stopped searching for unread writes after encountering control flow, but we can instead just search backwards until we hit control flow. instructions in affected programs: 22854 -> 22194 (-2.89%)	2014-04-15 09:24:09 -07:00
Matt Turner	4dcfb92417	i965/gs: Add dummy source to prepare_channel_masks instruction. The generator uses its destination as a source implicitly, which breaks some assumptions in dead code elimination. Giving the instruction a source allows us to reason about it better.	2014-04-15 09:24:09 -07:00
Matt Turner	d877c643be	glsl: Use M_PI_* macros. Notice our multiple values for M_PI_2, which rounded ...32 up to ...4 and ...5.	2014-04-15 09:24:09 -07:00
Kenneth Graunke	4f20b7d3dd	i965: Disable Z16 in all APIs. We originally thought that GL 3.0 required GL_DEPTH_COMPONENT16 to map exactly to Z16. However, we misread the specification, thanks in part to LaTeX reordering the tables in the PDF. Page 180 of the GL 3.0 specification (glspec30.20080923.pdf) says: "[...] memory allocation per texture component is assigned by the GL to match the allocations listed in tables 3.16-3.18 as closely as possible. [...] Required Texture Formats [...] In addition, implementations are required to support the following sized internal formats. Requesting one of these internal formats for any texture type will allocate exactly the internal component sizes and types shown for that format in tables 3.16-3.17:" Notably, however, GL_DEPTH_COMPONENT16 does /not/ appear in table 3.16 or table 3.17. It appears in table 3.18, where the "exact" rule doesn't apply, and it falls back to the "closely as possible" rule. The confusing part is that the ordering of the tables in the PDF is: Table 3.16 (pages 182-184) Table 3.18 (bottom of page 184 to top of 185) Table 3.17 (page 185) Presumably, people saw table 3.16, then saw the table immediately following with DEPTH_COMPONENT* formats, and assumed it was 3.17. Based on a patch by Chia-I Wu, but without the driconf option to force Z16 to be used. It's not required, and there's apparently no benefit to actually using it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chia-I Wu <olv@lunarg.com>	2014-04-15 02:15:11 -07:00
Kenneth Graunke	be000b4d19	i965: Update comments about Z16 being slow. We've learned a few things since we originally disabled Z16; this attempts to summarize the issue. I am no expert on this subject, though, so the comment may not be totally accurate. I did some benchmarking on GM45 and Ironlake, and discovered that for GLBenchmark 2.7 EgyptHD, using Z16 was 3% slower on GM45 (n=15), and 4.5% slower on Ironlake (n=95). So, we can drop the "on Ivybridge" aspect of the comment - it's always slower. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chia-I Wu <olv@lunarg.com>	2014-04-15 02:15:11 -07:00
Michel Dänzer	313104e8d5	r600g/radeonsi: Use caching buffer manager for textures as well Significantly reduces BO allocation / destruction overhead for transfers, e.g. measurable via x11perf -shm{ge,pu}t* with glamor. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-15 11:34:56 +09:00
Jordan Justen	24c773fb06	i965/gen8: add debug code to show FS disasm with jump locations Copied from similar code in gen8_vec4_generator.cpp. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-14 10:32:42 -07:00
Chia-I Wu	73a4761058	ilo: remove GPE state size estimation Use size defines from genhw.	2014-04-14 20:45:04 +08:00
Chia-I Wu	8fa8e9b1b8	ilo: remove GPE command size estimation Use size defines from genhw.	2014-04-14 20:45:04 +08:00
Chia-I Wu	bdd0546d7c	ilo: remove unused headers Remove intel_*.h. brw_*.h is still needed by the state dumper and disassembler.	2014-04-14 20:45:04 +08:00
Chia-I Wu	e55e1610e5	ilo: use only defines from genhw headers Stop including classic driver headers in genhw.h, with some formatting fixes.	2014-04-14 20:45:04 +08:00
Chia-I Wu	6c6bd796ad	ilo: scripted conversion to genhw headers Hopefully my four hundred line sed script is correct.	2014-04-14 20:45:04 +08:00
Chia-I Wu	01e3e82a56	ilo: add genhw headers All except genhw.h are generated by https://github.com/olvaffe/envytools/. intel_chipset.h is deprecated.	2014-04-14 20:45:03 +08:00
Chia-I Wu	d75a8799fd	ilo: avoid brw_wm_barycentric_interp_mode in compiler In preparation for genhw.	2014-04-14 20:45:03 +08:00
Chia-I Wu	ad39b991ce	ilo: add TOY_OPCODE_DO We used to give BRW_OPCODE_DO a special meaning, while we should have used TOY_OPCODE_DO.	2014-04-14 20:45:03 +08:00
Vinson Lee	36fb36aa36	gtest: Update to 1.7.0. This patch fixes gtest build errors on Mac OS X 10.9. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73106 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Tested-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-14 00:06:53 -07:00
Chris Forbes	936dda08ee	mesa: Consider gl_VertexID and gl_InstanceID active attribs Fixes piglit's spec/gl-3.2/get-active-attrib-returns-all-inputs. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-13 19:27:01 +12:00
Chris Forbes	ca5c8d6cd4	mesa: Extract is_active_attrib() in shaderapi The rules are about to get a bit more complex to account for gl_InstanceID and gl_VertexID, which are system values. Extracting this first avoids introducing duplication. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-13 19:26:56 +12:00
Chris Forbes	aeb03f8aea	glsl: Fix typo in interface block comment Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-04-13 17:02:11 +12:00
Simone Scanzoni	c3b701d63c	egl-static: fix build after recent radeon winsys changes Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-04-13 02:37:36 +02:00
Chris Forbes	b92e7f2da9	mesa: Fix typo in error message Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-04-13 12:38:24 +12:00
Iago Toral Quiroga	a5957f7bc5	i965: glClearBuffer() should only clear a single buffer. glClearBuffer() is currently clearing all active draw color buffers (all buffers that have not been set to GL_NONE when calling glDrawBuffers) instead of only clearing the one it receives as parameter. Although brw_clear() receives a bit mask indicating the color buffers that should be cleared, this mask is ignored when calling brw_blorp_clear_color(). This was breaking the 'fbo-drawbuffers-none glClearBuffer' piglit test. The patch provides the bit mask to brw_blorp_clear_color() so it can limit clearing to the color buffers present in the mask. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76832 Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-13 12:28:25 +12:00
Chris Forbes	26224d3e00	i965: Add comment to explain the weird-looking shadow compares. This always looks crazy when I stumble across it, until I remember what the hardware is doing. Describing it ought to short-circuit that process next time :) V2: Fix indents to 6 spaces, not 7. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-13 08:51:46 +12:00
Kenneth Graunke	857f3a68ea	glsl: Ignore loop-too-large heuristic if there's bad variable indexing. Many shaders use a pattern such as: for (int i = 0; i < NUM_LIGHTS; i++) { ...access a uniform array, or shader input/output array... } where NUM_LIGHTS is a small constant (such as 2, 4, or 8). The expectation is that the compiler will unroll those loops, turning the array access into constant indexing, which is more efficient, and which may enable array splitting and other optimizations. In many cases, our heuristic fails - either there's another tiny nested loop inside, or the estimated number of instructions is just barely beyond the threshold. So, we fail to unroll the loop, leaving the variable indexing in place. Drivers which don't support the particular flavor of variable indexing will call lower_variable_index_to_cond_assign(), which generates piles and piles of immensely inefficient code. We'd like to avoid generating that. This patch detects unsupported forms of variable-indexing in loops, where the array index is a loop induction variable. In that case, it bypasses the loop-too-large heuristic and forces unrolling. Improves performance in various microbenchmarks: Gl32PSBump8 by 47%, Gl32ShMapVsm by 80%, and Gl32ShMapPcf by 27%. No changes in shader-db. v2: Check ir->array for being an array or matrix, rather than the ir_dereference_array itself. v3: Fix and expand statistics in commit message. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-11 17:41:43 -07:00
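The unrolling decision described in the loop-too-large commit can be summarized as a small predicate. This is a sketch of the logic as the message states it, not the code in the GLSL compiler: the function, its parameters, and the `max_nodes` threshold are hypothetical.

```python
def should_unroll(iterations, body_nodes, nested_loop, unsupported_array_index,
                  max_iterations=32, max_nodes=160):
    """Decide whether to unroll a loop with a known trip count.

    unsupported_array_index: the body indexes an array with the loop
    induction variable in a way the driver cannot handle directly.
    """
    if iterations > max_iterations:
        return False
    # Size heuristic: a nested loop or a large estimated body counts as "too large".
    loop_too_large = nested_loop or iterations * body_nodes > max_nodes
    # The fix: bad variable indexing bypasses the size heuristic and forces unrolling.
    if loop_too_large and not unsupported_array_index:
        return False
    return True

forced = should_unroll(4, 100, False, True)
```

The trip-count limit still applies in this model; only the size heuristic is bypassed when the alternative would be lowering the variable index to conditional assignments.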
Kenneth Graunke	2231db5598	glsl: Rename loop_unroll_count::fail to "nested_loop." The "fail" flag is set if loop_unroll_count encounters a nested loop; calling the flag "nested_loop" is a bit clearer. The original reasoning was that count is inaccurate (too small) if there are nested loops, as we don't do any sort of analysis on the inner loop. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-11 17:41:41 -07:00
Kenneth Graunke	8268a2f347	glsl: Pass gl_shader_compiler_optimizations to unroll_loops(). Loop unrolling will need to know a few more options in the future. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-11 17:41:39 -07:00
Kenneth Graunke	da22221aa3	glsl: Drop do_common_optimization's max_unroll_iterations parameter. Now that we pass in gl_shader_compiler_options, it makes sense to just use options->MaxUnrollIterations, rather than passing a separate parameter. Half of the invocations already passed options->MaxUnrollIterations, while the other half passed in a hardcoded value of 32. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-11 17:41:37 -07:00
Kenneth Graunke	f00a6483e9	i965: Use EmitNoIndirect flags in lower_variable_index_to_cond_assign. This will prevent the two from getting out of sync again. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-11 17:41:36 -07:00
Kenneth Graunke	320e0c5205	i965: Correct EmitNoIndirect shader compiler option flags. These were out of sync with the flags used to control lower_variable_index_to_cond_assign in brw_shader.cpp. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-11 17:41:25 -07:00
Matt Turner	509b2a6523	i965/fs: Reset reg_from when we can't coalesce. Not resetting this prevented coalescing after a failed attempt if the sources for both MOVs were the same. total instructions in shared programs: 1654531 -> 1650224 (-0.26%) instructions in affected programs: 423167 -> 418860 (-1.02%) GAINED: 2 LOST: 0 Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-11 15:27:46 -07:00
Eric Anholt	7e034a8d77	i965: Fill in a bunch of gen7/hsw data cache-related disasm. This gets us disasm of atomic ops. v2: Fix fallthrough on pre-gen7. (bug caught by Ilia Mirkin). Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-11 13:38:53 -07:00
Eric Anholt	99442bc7b2	i965: Stop setting up a 1:1 "attrib" member in our vertex inputs. It's just the array index, so we can just go look at the array and see which element we are. No significant performance difference (n=140) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-11 13:38:53 -07:00
Eric Anholt	9a5d19d680	i965: Skip a bunch of IB BO refcount twiddling. Improves cairo performance on glamor by 1.64828% +/- 1.04742% (n=65). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-11 13:38:52 -07:00
Eric Anholt	3f9440cfbb	i965/gen7: Skip repeated NULL depth/stencil state emits. Improves cairo performance on glamor by 2.87752% +/- 0.966977% (n=57). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-11 13:38:52 -07:00
Chris Forbes	fe4f373eb4	docs: Fix ubo indexing description Ian points out that this being unrestricted was an oversight in the spec, and is corrected in GLSL4.40. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-04-12 08:31:05 +12:00
Brian Paul	e5f306e3ff	draw: remove unused 'start' variable in draw_stats_clipper_primitives() It was computed, but never actually used. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-11 13:54:17 -06:00
Kenneth Graunke	ae2a03b573	glsl: Try vectorizing when seeing a repeated assignment to a channel. When considering assignment expressions like: v.x += u.x; v.x += u.x; the vectorizer would incorrectly keep going, attempting to find more instructions to vectorize. It would overwrite the saved assignment to point at the second one, and increment channels a second time, resulting in try_vectorize thinking the expression was a vec2 instead of a float. Instead, if we see a repeated assignment to a channel, just try to vectorize everything we've found so far. This clears the saved state so it will start over. Fixes Piglit's repeated-channel-assignments.vert. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-11 12:39:37 -07:00
Ian Romanick	625cf8c874	glsl: Propagate explicit binding information from the AST all the way to the linker Information about the binding was not being properly communicated from the front-end compiler to the linker. As a result, the linker never knew that any UBOs had explicit bindings! Fixes the piglit test arb_shading_language_420pack-binding-layout. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: github@socker.lepus.uberspace.de [v0] Cc: "10.1" <mesa-stable@lists.freedesktop.org> Cc: github@socker.lepus.uberspace.de	2014-04-11 12:26:01 -07:00
Ian Romanick	25a6656875	linker: Set binding for all elements of UBO array Previously, a UBO like layout(binding=2) uniform U { ... } my_constants[4]; wouldn't get any bindings set. The code would try to set the binding of U, but that would fail. It should instead set the bindings for U[0], U[1], ... Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Cc: github@socker.lepus.uberspace.de	2014-04-11 12:26:01 -07:00
Ian Romanick	cc42717b50	linker: Set block bindings based on UniformBlocks rather than UniformStorage For blocks, gl_shader_program::UniformStorage isn't very useful. The names stored there are the names of the elements of the block, so finding blocks with an instance name is hard. There is also only one entry in ::UniformStorage for each element of a block array, and that is a deal breaker. Using ::UniformBlocks is what _mesa_GetUniformBlockIndex does. I contemplated sharing code between set_block_binding and _mesa_GetUniformBlockIndex, but building the stand-alone compiler and the unit tests make this hard. I plan to return to this effort shortly. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Cc: github@socker.lepus.uberspace.de	2014-04-11 12:26:01 -07:00
Ian Romanick	157391a41b	linker: Clean up "unused parameter" warnings ../../src/glsl/link_uniform_initializers.cpp:87:1: warning: unused parameter 'mem_ctx' [-Wunused-parameter] ../../src/glsl/link_uniform_initializers.cpp:87:1: warning: unused parameter 'type' [-Wunused-parameter] ../../src/glsl/link_uniform_initializers.cpp:127:1: warning: unused parameter 'mem_ctx' [-Wunused-parameter] ../../src/glsl/link_uniform_initializers.cpp:127:1: warning: unused parameter 'type' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Cc: github@socker.lepus.uberspace.de	2014-04-11 12:26:01 -07:00
Ian Romanick	943b2d52bf	linker: Fold set_uniform_binding into call site In the next patch, we'll see that using gl_shader_program::UniformStorage is not correct for uniform blocks. That means we can't use ::UniformStorage to select between the sampler path and the block path. Instead we want to just use the type of the variable. That's never passed to set_uniform_binding, and it's easier to just remove the function (especially for later patches in the series) than to add another parameter. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Cc: github@socker.lepus.uberspace.de	2014-04-11 12:26:01 -07:00
Ian Romanick	881c52f13f	linker: Various trivial clean-ups in set_sampler_binding - Remove the spurious block left from the previous commit and re-indent. - Constify elements. - Make the spec reference in the code look like other spec references in the compiler. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Cc: github@socker.lepus.uberspace.de	2014-04-11 12:26:01 -07:00
Ian Romanick	6e2f63b69e	linker: Split set_uniform_binding into separate functions for blocks and samplers The two code paths are quite different, and there are some problems in the handling of uniform blocks. Future changes will cause these paths to diverge further. Ultimately, selecting between the two functions will happen at the set_uniform_binding call site, and set_uniform_binding will be deleted. NOTE: This patch just moves code around. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Cc: github@socker.lepus.uberspace.de	2014-04-11 12:26:01 -07:00
Heinrich Janzing	c8e7568f97	softpipe: fix shadow sampling And remove nonsensical approximation of linear interpolation behavior for shadow samplers. Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2014-04-11 11:47:32 -06:00
Brian Paul	86b8843e9c	softpipe: add PIPE_CAP_MIN/MAX_TEXTURE_GATHER_OFFSET query cases To silence compiler warnings. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-11 11:47:31 -06:00
Brian Paul	f61edd509b	mesa: use _mesa_get_srgb_format_linear() in sRGB texstore functions Instead of switch statements. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-11 11:47:31 -06:00
Brian Paul	c5631b341e	swrast: use macros to initialize texfetch_funcs[] table Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-11 11:47:31 -06:00
Brian Paul	4da1efb370	swrast: fix more fetch_texel function names These were missed/typo'd in the previous patch series: s/R8G8B8A/R8G8B8A8/ s/rgba_16/RGBA_UNORM16/ s/rgba_uint/RGBA_UINT/ s/rgba_int/RGBA_SINT/ Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-11 11:47:31 -06:00
José Fonseca	9d36a8d4d2	egl-static: Fix missing radeon_surface.h includes. Fixes fatal error: radeon_surface.h: No such file or directory when libdrm is not present, or non-Linux OSes. Trivial.	2014-04-11 16:46:02 +01:00
Knut Andre Tidemann	5ac3435a47	gallium/radeon: fix missing winsys include in pipe-loader. The commit `3b0b44f7de` introduced a build error: error: dereferencing pointer to incomplete type This patch fixes this issue in all the affected files. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-04-11 19:22:17 -04:00
Christian König	68bba1801e	st/omx/enc: separate input buffer private and task structure Keep tasks as linked list, this way we can associate more than one encoding task with each buffer. Signed-off-by: Christian König <christian.koenig@amd.com>	2014-04-11 11:35:03 +02:00
Christian König	7806dbeb70	radeon/vce: implement B-frame support Signed-off-by: Slava Grigorev <slava.grigorev@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2014-04-11 11:35:03 +02:00
Christian König	a56fa0e83b	radeon/vce: add proper CPB backtrack Remember what frames we encoded at which position. Signed-off-by: Christian König <christian.koenig@amd.com>	2014-04-11 11:35:03 +02:00
Christian König	d7d41ce133	vl: add interface for H264 B-frame encoding Signed-off-by: Christian König <christian.koenig@amd.com>	2014-04-11 11:35:03 +02:00
Christian König	ee4439c562	radeon/vce: remove RVCE_NUM_CPB_EXTRA_FRAMES Doesn't seem to be needed any more. Signed-off-by: Christian König <christian.koenig@amd.com>	2014-04-11 11:35:02 +02:00
Chris Forbes	ce57c8e925	docs/relnotes: Fix consistency, add i965 to ARB_buffer_storage. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-04-11 21:20:13 +12:00
Kenneth Graunke	227049098b	i965: Fix missing _NEW_SCISSOR in Broadwell SF_CLIP_VIEWPORT state. The _Xmin/_Xmax/_Ymin/_Ymax values need to be guarded by _NEW_SCISSOR. Fixes Piglit's scissor-many, and rendering in GNOME Shell. Hopefully fixes similar issues with Unity and ChromeOS. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75879 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: James Ausmus <james.ausmus@intel.com> Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>	2014-04-10 23:38:10 -07:00
Ilia Mirkin	31640f4c38	mesa/st: set min/max texture gather offset to driver-reported value It was always getting set to -8/7 unconditionally. Use the driver-reported value instead. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-10 20:42:48 -04:00
Ilia Mirkin	c2f9ad5289	gallium: add a way to query min/max texture gather offsets Defaults to providing the same offsets as MIN/MAX_TEXEL_OFFSET. For nvc0, the offset can be -32/31. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-10 20:42:36 -04:00
Marek Olšák	8291f6d5c5	configure.ac: require libdrm_radeon 2.4.53 We need latest radeon_drm.h.	2014-04-10 21:24:50 +02:00
Marek Olšák	3b0b44f7de	winsys/radeon: fix a race condition in initialization of radeon_winsys::screen Create the screen in the winsys while the mutex is locked. This also results in a nice code cleanup! Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-10 20:50:17 +02:00
Marek Olšák	ac330d4130	winsys/radeon: fix a race condition between winsys_create and winsys_destroy This also hides the reference count from drivers. v2: update the reference count while the mutex is locked in winsys_create Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-10 20:50:17 +02:00
Marek Olšák	7c57b01564	winsys/radeon: fix a race condition between 2 calls to radeon_winsys_create This fixes random crashes of: piglit/glx-multithread-shader-compile. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-10 20:50:17 +02:00
Marek Olšák	b5ebfc33b8	winsys/radeon: remove unused radeon_info variables, move backend_map Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-10 20:50:17 +02:00
Marek Olšák	9b8449ae90	winsys/radeon: unify radeon_bo::flink and radeon_bo::name Both contained the GEM flink name. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-10 20:50:17 +02:00
Marek Olšák	34564c8753	winsys/radeon: remove definitions already present in radeon_drm.h Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-10 20:50:17 +02:00
Marek Olšák	e3e05c6db9	winsys/radeon: handle squared micro tiling from GEM_GET_TILING Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-10 20:50:17 +02:00
Marek Olšák	38858207a1	gallium/u_gen_mipmap: rewrite using pipe->blit (v2) This replaces u_gen_mipmap with an extremely simple implementation based on pipe->blit. st/mesa is also cleaned up. Pros: - less code - correct mipmap generation for NPOT 3D textures (u_blitter uses a better formula) - queries are not affected by mipmap generation if drivers disable them v2: add "first_layer", "last_layer" parameters, drop "face" v2.1: add format v2.2: document the format parameter	2014-04-10 20:50:16 +02:00
Marek Olšák	26c41398cc	st/mesa: properly implement MapTextureImage with multiple mapped slices (v2) This is needed by _mesa_generate_mipmap. This adds an array of pipe_transfers to st_texture_image. Each transfer is for mapping a single layer. v2: allocate the array of transfers on demand	2014-04-10 20:50:16 +02:00
Brian Paul	5206d4bc09	mesa: remove the MALLOC, CALLOC and FREE macros No longer used anywhere. These also caused trouble in the Gallium state tracker code where we include both core Mesa and Gallium util headers (and the macros were defined differently in each world.) Removing these macros should help avoid macro mix-ups in the future. Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-10 07:53:12 -06:00
Brian Paul	7e55050301	xlib: s/FREE/free/ Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-10 07:53:11 -06:00
Brian Paul	3b323c4d40	mesa: s/FREE/free/ in vdpau code Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-10 07:53:11 -06:00
Brian Paul	00f31bdd32	mesa: s/FREE/free/ in _mesa_free_errors_data() Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-10 07:53:10 -06:00
Brian Paul	7fbb8ba499	mesa: use malloc/free instead of MALLOC/FREE in attrib stack code We moved away from MALLOC/FREE in the rest of core Mesa a while ago. Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-10 07:53:05 -06:00
Brian Paul	f9985db0bc	st/mesa: fix sampler_view REALLOC/FREE macro mix-up We were using REALLOC() from u_memory.h but FREE() from imports.h. This mismatch caused us to trash the heap on Windows after we deleted a texture object. This fixes a regression from commit `6c59be7776`. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-04-10 07:53:05 -06:00
Chris Forbes	87502bbcd7	docs: Expand ARB_gpu_shader5 to describe status of individual features This extension is a huge grab-bag of "stuff that's in DX11". Break it apart to make it clear what still needs to be done. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-10 18:52:03 +12:00
Chris Forbes	0d653b948f	docs: Mark off ARB_texture_view and add to release notes for 10.2. V4: Don't claim Gen8 yet. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:42 +12:00
Chris Forbes	2a2f8cd9d2	i965: Enable ARB_texture_view on Gen7 V4: Don't enable this for Gen8 yet -- that still needs wired up. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:42 +12:00
Chris Forbes	ea477817d7	i965: Account for view parameters in blit CTSI path Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	01d6a2ad16	i965: Account for MinLayer/MinLevel in blorp CTSI path Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	058f353a15	i965: Account for view parameters in fast depth clears V2: - No need for layer_multiplier; multisampled depth surfaces are IMS. - Remove unused num_layers. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	540d53d9b0	i965/blorp: Account for nonzero MinLayer in layered clears. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	d581247569	i965/blorp: Use irb->layer_count in clear Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	98328e4c19	i965: Add layer_count to intel_renderbuffer This is the effective layer count, for clears etc. This differs from the depth of the miptree level when views are involved. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	0a08147fcb	i965: Pull out layer_multiplier in intel_update_renderbuffer_wrapper We're about to need this in another place. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	a76cde35d8	i965: Add `layered` parameter to intel_update_renderbuffer_wrapper We're about to need this so we can determine the layer count of the wrapper. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	85dda825fe	i965: Adjust renderbuffer wrapper to account for MinLevel/MinLayer Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	24f490fb37	i965: Enable texture upload fast path with MinLevel We'll still avoid MinLayer here since the fast path doesn't understand arrays at all, but it's straightforward to do levels. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	5de52541e5	i965: Account for MinLevel in texture upload fast path Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	ba3499ba01	i965: Adjust map/unmap code for MinLevel/MinLayer This allows core mesa's TexSubImage paths etc to work correctly with views which have nonzero MinLevel or MinLayer. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	ca1d1b2fc1	i965: Don't try to use fast upload path for nontrivial views This will eventually be relaxed, but we'll get the fallback path working first. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	c9c08867ed	i965: Adjust surface_state emission to account for view parameters V4: Comment style, remove magic shift. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	771c2ae0af	i965: Add _Format to intel_texobj. This is the actual mesa_format to use. In non-view cases this is always the same as the mt's format. V4: Comment style Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	b7f011fdc9	i965: Add driver hook for TextureView We need to wire the original texture's mt into the view. All the hard work of setting up an appropriate tree of gl_texture_image structures has already been done by core mesa. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	93fa16bdd1	i965: Ensure that texture validation is skipped for immutable textures. If we were to relayout the miptree, we'd break any views that are sharing it. (Simplified based on suggestions from Eric) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	a98b675945	i965: refactor format selection for unsupported ETC* formats We will need to call this to munge view formats. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	14c116433d	i965: refactor format munging for separate stencil We will need this for munging the view's format. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	215c9432b9	i965: Include #slices in miptree debug Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	c1b017472b	mesa: Adjust _MaxLevel computation to account for views Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	61e264f4fc	mesa: Prefer non-swizzled formats for most sized internalformats These formats can be cast to others (with different component types or sizes) via ARB_texture_view or ARB_shader_image_load_store. We want them to be laid out consistently so that we can just reinterpret the memory with a different format. In V1, this was done conditionally on a 'prefer_no_swizzle' flag which was set in TexStorage/TextureView paths, but we need the same behavior for ARB_shader_image_load_store (which also works with images created via TexImage), so we don't want it to be conditional. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	58790043bb	i965: Render R8G8B8X8 as R8G8B8A8 The sampler can handle R8G8B8X8 (and substitute 1.0 for the fourth component) but we can't use it as a render target. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	50eed4eed5	i965: Pretend we don't support BRW_SURFACEFORMAT_R16G16B16_FLOAT for textures. None of the other 3-component 16bpc formats are directly supported, so they get promoted to XRGB equivalents. Not promoting RGB16F the same way makes texture views much more fiddly -- we don't want to have to do crazy copying behind the scenes. (with my other master + my experimental ARB_texture_view support) fixes the piglit test: `spec/ARB_texture_view/view compare 48bit formats` No regressions in gpu.tests on Haswell. V4: Don't alter the formats table -- just don't match it to a mesa_format. [Kenneth] Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	66b0554fa6	i965: Enable R10G10B10A2_UNORM format This is supported by all generations, and is required for memory layout consistency for texture_view. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	932a1eeac8	i965: Enable R8G8B8A8_UNORM_SRGB format Now this is the preferred format for GL_SRGB8_ALPHA8. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	6ef7205613	swrast: Add support for fetching from MESA_FORMAT_R10G10B10A2_UNORM V4: Fix rebase conflicts with Brian's renaming of the texfetch functions. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	a421be1dcb	mesa: fix packing of float texels to GL_SHORT/GL_BYTE Previously, we would unpack the texels to floats using _TO_FLOAT_TEX, and then pack them into the desired format using FLOAT_TO_. Unfortunately, this isn't quite the inverse operation, and so some texel values would end up off-by-one. This fixes the GL_RGB8_SNORM and GL_RGB16_SNORM subcases in piglit's arb_texture_view-format-consistency-get test on i965. The similar 1-, 2- and 4-component cases already worked because they took the memcpy path rather than repacking. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Michel Dänzer	ee2bcf38a4	r600g: Don't leak bytecode on shader compile failure Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74868 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-10 14:00:43 +09:00
Emil Velikov	55f9bbd46c	build: force .so extension for the gallium dri modules While Linux uses .so as the default extension for shared libraries, that is not the case for other platforms. The loader in libGL (and others) assumes that the dri module will always have a .so extension, thus it will fail to load on the affected platforms. Spotted-by: Jon TURNEY <jon.turney@dronecode.org.uk> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-09 22:12:36 +01:00
Jon TURNEY	92d0786f88	Partially revert `bba9c28` "configure: use LIB_EXT rather than hardcoded .so" Filenames passed to dlopen() don't need to use the platform's default extension for shared libraries. Using the '.so' extension when dlopen()ing DRI drivers is hardcoded into mesa and the X server, so it should be hardcoded here in the Makefile as well. A similar fix is probably also needed for gallium DRI drivers. (Consider that if we were starting from scratch, perhaps we would use a custom extension like .dri instead) Cc: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-09 22:12:36 +01:00
Emil Velikov	56f531657c	Partially revert "st/xa: Fix advertized version number and try to avoid future discrepancies" This reverts commit `61bedc3d6b`. As the header is the one defining the API/ABI and is distributed during installation, we should be using it rather than re-defining the XA version in configure.ac. Bump the version in the header to 2.2.0, to reflect what was the original intent of commit `42158926c6`. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2014-04-09 22:12:35 +01:00
Emil Velikov	f9832f960f	glx: drop obsolete _XUnlock_Mutex in __glXInitialize error path With commit 1f1928db001 (glx: Drop _Xglobal_lock while we create and initialize glx display) we've split the big _Xglobal_lock handling in a more fine grained manner. Unfortunately we forgot to drop the unlock_mutex on the error paths, leading to undefined behaviour as the mutex is already unlocked. Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-09 22:12:35 +01:00
Rob Clark	6afd7be132	freedreno/a3xx: assert() -> debug_assert() We hit this assert with some piglit tests. Which appears to be a bug outside of freedreno. Previously we were relying on assert() being redefined to debug_assert() so that we didn't crash in release builds. Somehow that stopped working. So just use debug_assert() directly. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-04-09 16:37:04 -04:00
Brian Paul	e853ade544	svga: move LIST_INITHEAD(dirty_buffers) earlier in svga_context_create() Fixes a crash in svga_context_flush_buffers() if we use the 'draw' module for AA lines (when the device doesn't support that feature). We need to initialize this list before we setup the swtnl pieces. Found/fixed by Charmaine Lee. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-04-09 12:02:03 -06:00
Kenneth Graunke	26ae030fcc	i965: Stop advertising GL_MESA_ycbcr_texture. The "new" fragment shader backend has never supported the necessary color conversion code for this to work. We began using the new backend in Mesa 7.10 for GLSL (commit `a81d423d93`, October 2010), and for ARB_fragment_program in Mesa 9.1 (commit `97615b2d8c`, August 2012). I haven't heard any complaints, so I don't think anyone will miss this feature. I believe mplayer used it at one point, but these days defaults to other paths anyway. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <idr@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-09 08:28:25 -07:00
Rob Clark	4a92c12232	freedreno/a3xx/compiler: add CEIL fixes piglit glsl-fs-ceil Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-04-09 10:59:18 -04:00
Rob Clark	9604e31dc9	freedreno/a3xx/compiler: fix neg mov's create_mov() was fixed up to handle neg/abs properly for internal mov's, using absneg.f, but forgot to fix it for TGSI MOV's. The problem with using add.f to handle negated mov's is that we can only take a single const reg src. So: MOV TEMP[n], -CONST[m] would turn into: add.f Rdst, (neg)CONST[m], 0.0 which would not work. Anyways, just remove the extra code and always use create_mov() which DTRT. This fixes piglit vs-op-neg-int test. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-04-09 10:59:18 -04:00
Marek Olšák	4d641803e8	radeonsi: allow fast color clear and Hyper-Z with 1D-tiled surfaces on CIK This depends on my kernel fix. Hyper-Z is still disabled by default.	2014-04-09 01:45:16 +02:00
Marek Olšák	fb5cf3490e	r600g,radeonsi: add a bunch of useful queries for the HUD	2014-04-09 01:45:16 +02:00
Marek Olšák	4a5519f1e0	r600g,radeonsi: set correct initial domain for shared resources	2014-04-09 01:45:16 +02:00
Marek Olšák	5f7faff61b	gallium/radeon: fix warnings	2014-04-09 01:45:16 +02:00
Iago Toral Quiroga	1a92637c68	tnl: Merge _tnl_vbo_draw_prims() into _tnl_draw_prims(). This should help prevent situations where we render without proper index bounds. For example: https://bugs.freedesktop.org/show_bug.cgi?id=59455 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-08 15:10:10 -07:00
Topi Pohjolainen	2ffb50d77b	i965: Remove unused sampler key fields Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 13:34:59 -07:00
Brian Paul	6f059725fa	mesa: move declaration before code in etc2_unpack_rgb8() To fix MSVC build since `cb4ad13685`.	2014-04-08 14:17:40 -06:00
Kenneth Graunke	ec1baea95a	i965: Delete "fast color clear unsupported" performance warning. Applications frequently clear to colors other than 0.0 or 1.0, which prevents us from doing fast color clears. In that case, we issue this performance warning on basically every glClear call, resulting in so much spam that it's nearly impossible to see any other messages. Plus, I don't think it's useful. We aren't suggesting a better way to do what the application developers want---we're just telling them it would be faster to do something they don't want. Driver developers have no control over the clear color, so this message is totally useless to them. A better alternative to get this sort of information is to use INTEL_DEBUG=blorp, which tells you whether color clears were fast, simd16 repdata, or slow. v2: Rebase on has_color_component changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-08 13:09:46 -07:00
Rob Clark	ee839cc6ef	freedreno/a3xx: deal with optimized tex instructions Keep track of whether we actually have any sam instructions in the resulting shader, rather than using TGSI SAMP declarations. If the sam instruction is optimized out, because the result is not used, we don't want to emit texture state, etc. In fact emitting sampler state and/or setting PIXLODENABLE bit when there are no texture fetches seems to cause lockup. In theory this should never happen for a "normal" shader, unless the state tracker is wonky. But it is a very real possibility for binning pass shaders. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-04-08 16:06:49 -04:00
Courtney Goeltzenleuchter	cb4ad13685	mesa: add bounds checking to eliminate buffer overrun Decompressing ETC2 textures was causing intermittent segfault by copying resulting 4x4 texel block to the destination texture regardless of the size of the destination texture. Issue found via application crash in GLBenchmark 3.0's Manhattan test. v2: add more detail comment. Compute limit outside inner loops. v3: add bugzilla reference v4: Correct cc syntax in commit log v5: really grab the right patch Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74988 Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1, suggested v2-3]	2014-04-08 12:55:25 -07:00
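Note on `cb4ad13685`: the fix described above amounts to clamping the 4x4 block copy to the destination size, with the limits computed once outside the inner loops. The following is only a minimal sketch of that idea, not the Mesa code: `copy_block_clamped`, `EXAMPLE_MIN2`, and the one-byte-per-texel layout are assumptions made for illustration, and the caller is assumed to pass `x < dst_w` and `y < dst_h`.

```c
#include <stdint.h>

/* Hypothetical helper macro, named to avoid implying it is Mesa's. */
#define EXAMPLE_MIN2(a, b) ((a) < (b) ? (a) : (b))

/*
 * Copy a decoded 4x4 block into a dst_w x dst_h image at (x, y),
 * writing only the texels that fall inside the destination.
 * Assumes one byte per texel and x < dst_w, y < dst_h.
 */
static void
copy_block_clamped(uint8_t *dst, unsigned dst_w, unsigned dst_h,
                   unsigned x, unsigned y, const uint8_t block[4][4])
{
   /* Limits are computed once, outside the inner loops. */
   const unsigned bw = EXAMPLE_MIN2(4u, dst_w - x);
   const unsigned bh = EXAMPLE_MIN2(4u, dst_h - y);

   for (unsigned j = 0; j < bh; j++) {
      for (unsigned i = 0; i < bw; i++)
         dst[(y + j) * dst_w + (x + i)] = block[j][i];
   }
}
```

For a 5x5 destination, a block placed at (4, 4) writes a single texel instead of running past the end of the buffer.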
Leo Liu	a22d944fdb	st/omx/enc: cleanup omx/vid_enc.c cleanup by moving each step into a separate function Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-08 17:51:19 +02:00
Christian König	5f374826f8	st/omx/enc: allocate input buffer private on demand v2: move allocation to a function as first step to clean vid_enc_EncodeFrame Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com>	2014-04-08 17:51:15 +02:00
Brian Paul	9bb2ec6fd1	svga: replace sampler assertion with conditional For TEX instructions, the set of samplers and sampler views should be consistent. The XA state tracker sometimes passes an inconsistent set of samplers and sampler views. Rather than assert and die, issue a warning. v2: add debugging code to detect inconsistent state. v3: also check for null sampler in svga_state_tss.c Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2014-04-08 08:45:18 -06:00
Chia-I Wu	4ddf51db6a	i965/vec4: fix record clearing in copy propagation Given mov vgrf7, vgrf9.xyxz add vgrf9.xyz, vgrf4.xyzw, vgrf5.xyzw add vgrf10.x, vgrf6.xyzw, vgrf7.wwww the last instruction would be wrongly changed to add vgrf10.x, vgrf6.xyzw, vgrf9.zzzz during copy propagation. The issue is that when deciding if a record should be cleared, the old code checked for inst->dst.writemask & (1 << ch) instead of inst->dst.writemask & (1 << BRW_GET_SWZ(src->swizzle, ch)) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76749 Signed-off-by: Chia-I Wu <olv@lunarg.com> Cc: Jordan Justen <jljusten@gmail.com> Cc: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romainck <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "10.1" <mesa-stable@freedesktop.org>	2014-04-08 21:04:22 +08:00
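Note on `4ddf51db6a`: the difference between the old and the fixed check can be reproduced in isolation. This is a hedged sketch, not the driver code: `EXAMPLE_SWZ` and `EXAMPLE_GET_SWZ` are stand-ins that assume a 2-bit-per-channel swizzle packing, and both function names are hypothetical.

```c
#include <stdbool.h>

/* Assumed packing: 2 bits per channel, x=0, y=1, z=2, w=3. */
#define EXAMPLE_SWZ(x, y, z, w) ((x) | ((y) << 2) | ((z) << 4) | ((w) << 6))
#define EXAMPLE_GET_SWZ(swz, ch) (((swz) >> ((ch) * 2)) & 0x3)

/* Old check: tests the destination channel index of the copy record. */
static bool
record_clobbered_old(unsigned writemask, unsigned swizzle, int ch)
{
   (void) swizzle;
   return (writemask & (1u << ch)) != 0;
}

/* Fixed check: tests the source channel that the record actually reads. */
static bool
record_clobbered_fixed(unsigned writemask, unsigned swizzle, int ch)
{
   return (writemask & (1u << EXAMPLE_GET_SWZ(swizzle, ch))) != 0;
}
```

With the example from the message, `vgrf7.w` was copied from `vgrf9.z` (swizzle `xyxz`), and the following `add` writes `vgrf9.xyz` (writemask `0x7`). For channel 3 the old check reports no clobber, so the stale record survives; the fixed check reports the clobber and the record is cleared.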
Eric Anholt	57d6e7b7ee	i965/vec4: Add a test for copy propagation behavior. I thought I was seeing a bug in the code while reviewing, but it's not there. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-08 00:59:48 -07:00
Eric Anholt	6230b646a5	i965/fs: Track whether we're doing dual source in a more obvious way. I'm going to be turning dual_src_output into an array in a moment. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:48 -07:00
Eric Anholt	14b85e3a47	i965/fs: Add a couple more global special regs to special[] Nothing bad came of this because they weren't used after visitor running, but leaving them in a bad state seems like a recipe for pain later. Suggested-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:48 -07:00
Eric Anholt	4303d26f93	i965/fs: Handle arrays of special regs more cleanly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:48 -07:00
Eric Anholt	72b845e640	i965/fs: Fix dump_instructions() on uniforms. All of a vec4 uniform was being printed as "u0" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:48 -07:00
Eric Anholt	caa2605db5	i965/fs: Fix vgrf0 live interval when no interpolation was done. When you've got a simple solid-color shader that doesn't generate pixel_x/y interpolation, we were deciding that the first vgrf was both the undefined pixel_x and pixel_y, and extending its live interval to avoid the stride problem. That tricked other optimization that tries to see if a particular instruction is the last use of a variable. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:48 -07:00
Eric Anholt	cf40ebacb1	i965: Drop pointless check for variable declarations in splitting. We're walking the whole instruction stream, so we know the declaration will be found. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:48 -07:00
Eric Anholt	66b15ad9db	i965: Remove stale comment. We stopped doing variable index lowering for uniforms in `a64c1eb9b1`, 5 months after the comment was added. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:48 -07:00
Eric Anholt	8c2bfbc6b9	glsl: Move tree grafting's debug output to stderr. The rest of our compiler dumps are there, now. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:48 -07:00
Eric Anholt	e9822f77a9	glsl: Skip making a temporary for assignments when we don't need one. While we wish our optimization passes could identify all the cases where we can coalesce our variables, we miss out on a lot of opportunities. total instructions in shared programs: 1673849 -> 1673166 (-0.04%) instructions in affected programs: 299521 -> 298838 (-0.23%) GAINED: 7 LOST: 0 Note that many programs are "hurt". The notable ones are where we produce unrolling in cases we didn't before (presumably just because of the lower instruction count). But there are also some cases where pushing things right into the variables prevents copy propagation and tree grafting, since we don't split our variable usage webs apart. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:47 -07:00
Iago Toral Quiroga	dff3439fef	i915: Fix build error. is_power_of_two() is now provided by mesa so its definition must be removed from the i915 driver code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:29:59 -07:00
Kenneth Graunke	73f80c20f6	glsl: Pass ctx->Const.NativeIntegers to do_algebraic. The next patch will introduce an optimization that only works when integers are not represented as floating point values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-08 00:02:06 -07:00
Kenneth Graunke	169c645f12	glsl: Pass ctx->Const.NativeIntegers to do_common_optimization(). The next few patches will introduce an optimization that only works when integers are not represented as floating point values. v2: Re-word-wrap a line, as requested by Ian Romanick. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-08 00:02:03 -07:00
Kenneth Graunke	40d9337406	glsl: Validate that base types match for a number of binops. The IR is not supposed to support implicit type conversions; we just failed to validate it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-08 00:02:01 -07:00
Kenneth Graunke	e14b93371c	glsl: Fix lack of i2u in lower_ubo_reference. ir_binop_ubo_load takes unsigned integer operands. However, the array index used to compute these offsets may be a signed integer. (For example, see Piglit's spec/glsl-1.40/uniform_buffer/fs-bvec-array). For some reason, we were missing an ir_binop_i2u cast, and ir_validator was failing to catch that. Without this change, ir_builder's type inference code broke for me when writing a new optimization pass. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-08 00:01:58 -07:00
Kenneth Graunke	4311f9878d	i965/fs: Skip emitting MACH/MOV for small integers. The vector backend already implemented this optimization, but surprisingly, we never bothered to implement it in the scalar backend. In addition to saving two instructions, this eliminates a use of the accumulator as an explicit source, which is unsupported in SIMD16 mode on Gen7+, which could help us gain SIMD16 programs. Cuts 19.23% of the instructions in dolphin/efb2ram.shader_test. v2: Rebase on is_16bit_integer_constant -> is_uint16_constant rename. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-08 00:01:55 -07:00
Kenneth Graunke	7540be22d1	glsl: Make is_16bit_constant from i965 an ir_constant method. The i965 MUL instruction doesn't natively support 32-bit by 32-bit integer multiplication; additional instructions (MACH/MOV) are required. However, we can avoid those if we know one of the operands can be represented in 16 bits or less. The vector backend's is_16bit_constant static helper function checks for this. We want to be able to use it in the scalar backend as well, which means moving the function to a more generally-usable location. Since it isn't i965 specific, I decided to make it an ir_constant method, in case it ends up being useful to other people as well. v2: Rename from is_16bit_integer_constant to is_uint16_constant, as suggested by Ilia Mirkin. Update comments to clarify that it does apply to both int and uint types, as long as the value is non-negative and fits in 16-bits. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-08 00:01:53 -07:00
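Note on `7540be22d1`: the condition described above (the value is non-negative and fits in 16 bits, for both int and uint operands) can be sketched as a standalone predicate. This is an illustration only; `fits_in_uint16` is a hypothetical name and does not reproduce the `ir_constant` method.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True when a signed 32-bit constant is non-negative and representable
 * in 16 bits, i.e. when the extra MACH/MOV instructions described in
 * the commit message would not be needed for the multiply.
 */
static bool
fits_in_uint16(int32_t value)
{
   return value >= 0 && value <= UINT16_MAX;
}
```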
Kenneth Graunke	bd69f65f90	mesa: Move is_power_of_two() function from brw_context.h to macros.h. This makes the function available from core Mesa code, including the GLSL compiler. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-08 00:01:51 -07:00
Kenneth Graunke	6bda3a5267	i965: Fix "SIMD16 unsupported" messages via KHR_debug. Performance warnings are logged via KHR_debug in addition to when the INTEL_DEBUG=perf environment variable is set. Without this, messages in debug contexts would have "(null)" for the reason. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-08 00:01:34 -07:00
Kenneth Graunke	ee12a03805	i965: Fix missing dirty bits in the gen8_sbe_state atom. These are clearly needed---the comments in the function are even present for each one of them. I originally had two separate state atoms for 3DSTATE_SBE and 3DSTATE_SBE_SWIZ. When I combined the functions, I must have forgotten to add the atoms for 3DSTATE_SBE_SWIZ. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-07 23:37:18 -07:00
Kenneth Graunke	47682f2ca1	i965: Drop BRW_NEW_RASTERIZER_DISCARD flag from Broadwell SOL atom. Nothing actually uses this---we handle rasterizer discard in the clipper in order for statistics counters to work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-07 23:37:16 -07:00
Kenneth Graunke	f68353c57c	i965: Use the correct program when uploading Broadwell SOL state. This is the equivalent of commit `43e77215b1`. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-07 23:36:19 -07:00
Thomas Hellstrom	47f60cbb71	st/xa: Make sure unused samplers are set to NULL renderer_copy_prepare was setting the first sampler but never telling the cso code how many samplers were actually used. Fix this. Cc: "10.1" <mesa-stable@freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-07 22:34:10 -07:00
Thomas Hellstrom	e5d2c5b899	st/xa: Bind destination before setting new state Binding a new destination may cause the svga driver to emit draw calls while propagating the surface. Make sure this doesn't happen in the middle of sampler state setup where state may be inconsistent. In practice, surface propagation should never happen here and even if it did, it wouldn't be a valid reason for the svga driver to emit partially set up state, but to avoid future uncertainties, make sure this doesn't happen anyway. Found while auditing the state tracker for inconsistent sampler state / sampler view setup. Cc: "10.1" <mesa-stable@freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-04-07 22:34:10 -07:00
Eric Anholt	34f15903d6	glapi: Fix libglapi build. This line appears to have been accidentally dropped from the last commit, and the resulting libglapi was missing symbols.	2014-04-07 14:34:49 -07:00
Matt Turner	144bbb7b78	glapi/build: Add headers to distribution. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:45:26 -07:00
Matt Turner	fbca1ab780	glapi/gen: Ship more Python files Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:45:19 -07:00
Matt Turner	b0f37a6bd2	glapi/gen: Ship XML and Python files Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:43:21 -07:00
Matt Turner	f76ac9c9a6	glapi/gen: Add missing XML files to API_XML Also (re)move XML files from COMMON to API_XML. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:43:21 -07:00
Matt Turner	cdc3a6bb21	src/build: Add getopt to distribution. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:41:02 -07:00
Matt Turner	a97611313d	gbm/build: Add headers to distribution. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:41:01 -07:00
Matt Turner	3f64c3d591	egl/build: Sort egl sources alphabetically. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:41:00 -07:00
Matt Turner	5ae2f28ca7	egl/build: Remove unused -DXF86VIDMODE. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:40:58 -07:00
Matt Turner	5074117928	egl/build: Include headers and XML in distribution. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:40:57 -07:00
Matt Turner	1d4007fbd9	egl/build: Drop two unnecessary Makefiles. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:40:31 -07:00
Matt Turner	5c770ba919	i965/fs: Remove left-over 'removed' variable. I think this was used for coalescing out partly dead large virtual registers, but the patch that enabled that caused regressions and didn't make it upstream. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-07 10:29:43 -07:00
Matt Turner	99437b730f	i965/fs: Check for interference after finding all channels. It's more likely that we won't find writes to all channels than one will interfere, and calculating interference is more expensive. This change will also help prepare for coalescing load_payload instructions' operands. Also update the live intervals for all channels, and not just the last that we saw. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-07 10:29:22 -07:00
Jordan Justen	70285f607c	i965: initialize more device info fields for Cherryview The intent in `9b6b084eb7` was for urb .size and .min_vs_entries fields to use the values from the GEN8_FEATURES macro. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-07 09:52:32 -07:00
Brian Paul	d3ef6f5427	swrast: reindent s_texfetch_temp.h, remove trailing whitespace Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:27 -06:00
Brian Paul	a19d60faef	swrast: remove out of date comments in s_texfetch_tmp.h The comments were out of date and redundant (the functions are pretty much self-explanatory). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:27 -06:00
Brian Paul	56db16fb5b	swrast: rename texture fetch functions (pt. 7) sed commands: s/f_z24_s8/S8_UINT_Z24_UNORM/g s/f_s8_z24/Z24_UNORM_S8_UINT/g s/f_z16/Z_UNORM16/g s/f_z32/Z_UNORM32/g s/z32f_x24s8/Z32_FLOAT_S8X24_UINT/g s/f_ycbcr_rev/YCBCR_REV/g s/f_ycbcr/YCBCR/g s/dudv8/DUDV8/g Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:27 -06:00
Brian Paul	d41fe0aec2	swrast: rename texture fetch functions (pt. 6) sed commands: s/rgb9_e5/R9G9B9E5_FLOAT/g s/r11_g11_b10f/R11G11B10_FLOAT/g s/f_alpha_f16/A_FLOAT16/g s/f_alpha_f32/A_FLOAT32/g s/f_luminance_f16/L_FLOAT16/g s/f_luminance_f32/L_FLOAT32/g s/f_luminance_alpha_f16/LA_FLOAT16/g s/f_luminance_alpha_f32/LA_FLOAT32/g s/f_intensity_f16/I_FLOAT16/g s/f_intensity_f32/I_FLOAT32/g s/f_r_f16/R_FLOAT16/g s/f_r_f32/R_FLOAT32/g s/f_rg_f16/RG_FLOAT16/g s/f_rg_f32/RG_FLOAT32/g s/f_rgb_f16/RGB_FLOAT16/g s/f_rgb_f32/RGB_FLOAT32/g s/f_rgba_f16/RGBA_FLOAT16/g s/f_rgba_f32/RGBA_FLOAT32/g s/xbgr16161616_float/RGBX_FLOAT16/g s/xbgr32323232_float/RGBX_FLOAT32/g Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:26 -06:00
Brian Paul	9eb45114fd	swrast: rename texture fetch functions (pt. 5) sed commands: s/srgba8/A8B8G8R8_SRGB/g s/sargb8/B8G8R8A8_SRGB/g s/sabgr8/R8G8B8A8_SRGB/g s/sxbgr8/R8G8B8X8_SRGB/g s/sla8/L8A8_SRGB/g s/sl8/L_SRGB8/g s/srgb8/BGR_SRGB8/g Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:26 -06:00
Brian Paul	faa8a8e8b2	swrast: rename texture fetch functions (pt. 4) sed commands: s/signed_rg1616/R16G16_SNORM/g s/signed_rg88_rev/R8G8_SNORM/g s/signed_al88/L8A8_SNORM/g s/signed_a8/A_SNORM8/g s/signed_a16/A_SNORM16/g s/signed_l8/L_SNORM8/g s/signed_l16/L_SNORM16/g s/signed_i8/I_SNORM8/g s/signed_i16/I_SNORM16/g s/signed_r8/R_SNORM8/g s/signed_r16/R_SNORM16/g s/signed_al1616/LA_SNORM16/g s/signed_rgb_16/RGB_SNORM16/g s/signed_rgba_16/RGBA_SNORM16/g Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:26 -06:00
Brian Paul	a401362019	swrast: rename texture fetch functions (pt. 3) Rename functions to match format names. sed commands: s/f_rg1616_rev/G16R16_UNORM/g s/f_rg1616/R16G16_UNORM/g s/f_argb2101010/B10G10R10A2_UNORM/g s/f_a8/A_UNORM8/g s/f_a16/A_UNORM16/g s/f_i8/I_UNORM8/g s/f_i16/I_UNORM16/g s/f_r8/R_UNORM8/g s/f_r16/R_UNORM16/g s/f_rgb888/BGR_UNORM8/g s/f_bgr888/RGB_UNORM8/g s/f_l8/L_UNORM8/g s/f_l16/L_UNORM16/g s/xbgr16161616_unorm/RGBX_UNORM16/g Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:26 -06:00
Brian Paul	e4ebb24b35	swrast: rename texture fetch functions (pt. 2) Rename functions to match format names. sed commands: s/f_al1616_rev/A16L16_UNORM/g s/f_al1616/L16A16_UNORM/g s/f_rgb565_rev/R5G6B5_UNORM/g s/f_rgb565/B5G6R5_UNORM/g s/f_argb4444_rev/A4R4G4B4_UNORM/g s/f_argb4444/B4G4R4A4_UNORM/g s/f_rgba5551/A1B5G5R5_UNORM/g s/f_argb1555_rev/A1R5G5B5_UNORM/g s/f_al88_rev/A8L8_UNORM/g s/f_al88/L8A8_UNORM/g s/f_gr88/R8G8_UNORM/g s/f_rg88/G8R8_UNORM/g s/f_al44/L4A4_UNORM/g s/f_rgb332/B2G3R3_UNORM/g Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:25 -06:00
Brian Paul	fde3258389	swrast: rename texture fetch functions (pt. 1) Rename functions to match format names. sed commands: s/signed_rgba8888_rev/R8G8B8A8_SNORM/g s/signed_rgba8888/A8B8G8R8_SNORM/g s/f_rgba8888_rev/R8G8B8A_UNORM/g s/f_rgba8888/A8B8G8R8_UNORM/g s/f_rgbx8888_rev/R8G8B8X8_UNORM/g s/f_rgbx8888/X8B8G8R8_UNORM/g s/f_argb8888_rev/A8R8G8B8_UNORM/g s/f_argb8888/B8G8R8A8_UNORM/g s/f_xrgb8888_rev/X8R8G8B8_UNORM/g s/f_xrgb8888/B8G8R8X8_UNORM/g s/signed_rgbx8888/X8B8G8R8_SNORM/g Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:25 -06:00
Brian Paul	e0fafd1913	mesa: rename stencil/Z functions in format_unpack.c So the function names match the format names. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:24 -06:00
Ilia Mirkin	89c5b56be6	nouveau: fix firmware check on nvd7/nvd9 The kernel driver expects the class to be based on chipset generation rather than VP generation. Make sure to pass 90b1 for NVDX chipsets instead of 95b1. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77102 Fixes: `40dd777b33` Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.1 10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@ubunutu.com>	2014-04-07 08:58:15 -04:00
Thomas Hellstrom	2f6fcd65f2	winsys/svga: Fix prime surface references also for guest-backed surfaces Implement guest-backed surface sharing using prime fds. Previously only legacy surfaces could use this functionality. Also use the vmwgfx 2.6 single-ioctl prime fd reference if available. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>	2014-04-07 03:34:52 -07:00
Thomas Hellstrom	0887b499e9	winsys/svga: Update the vmwgfx_drm.h header to latest version from kernel Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>	2014-04-07 03:34:47 -07:00
Ilia Mirkin	159cec9dec	docs: mark ARB_texture_gather as done on nvc0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-07 01:06:19 -04:00
Ilia Mirkin	f6579e4b17	nvc0: add support for texture gather Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-07 01:06:19 -04:00
Ilia Mirkin	91900c6d33	docs: mark ARB_texture_query_lod as done for nv50, nvc0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-07 01:06:18 -04:00
Ilia Mirkin	423f64e83a	nvc0: enable texture query lod Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-07 01:06:18 -04:00
Ilia Mirkin	d5faf8e786	nv50: enable texture query lod Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-07 01:06:18 -04:00
Dave Airlie	4dc13e3c71	st/mesa: add support for ARB_texture_query_lod Add support for the LODQ texture instruction. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-07 01:06:18 -04:00
Dave Airlie	be5276ae7d	gallium: add support for LODQ opcodes. This opcode provide support for GL_ARB_texture_query_lod, Signed-off-by: Dave Airlie <airlied@redhat.com> [imirkin: rebase, docs update] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-07 01:06:18 -04:00
Matt Turner	5d0b3ec4ae	i965/vec4: Allow constant propagation into dot product. total instructions in shared programs: 1667088 -> 1667055 (-0.00%) instructions in affected programs: 3362 -> 3329 (-0.98%) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-05 09:52:54 -07:00
Matt Turner	34ec1a24d6	glsl: Optimize (x + y cmp 0) into (x cmp -y). Cuts a small handful of instructions in Serious Sam 3: instructions in affected programs: 4692 -> 4666 (-0.55%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-05 09:47:37 -07:00
Matt Turner	6499ecafa5	i965/fs: Split out can_coalesce_vars() function. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-04-05 09:47:37 -07:00
Matt Turner	29841fbe20	i965/fs: Split out is_coalesce_candidate() function. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-04-05 09:47:37 -07:00
Matt Turner	0fbcdec2f6	i965/fs: Split fs_visitor::register_coalesce() into its own file. The function has gotten large, and brw_fs.cpp is the largest source file in the driver. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-04-05 09:47:37 -07:00
Matt Turner	8b1ab5c93b	i965/fs: Mark appropriate fs_inst members as const. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-04-05 09:47:36 -07:00
Matt Turner	39ecfca121	i965: Mark is_tex() and friends as const. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-04-05 09:47:36 -07:00
Matt Turner	92d03f7f28	i965/fs: Don't propagate saturation modifiers if there are source modifiers. Which would lead to translating mad vgrf9:F, vgrf3:F, u0:F, vgrf6:F mov.sat vgrf7:F, -vgrf9:F into mad.sat vgrf9:F, vgrf3:F, u0:F, vgrf6:F mov vgrf7:F, -vgrf9:F Fixes some lighting effects in Dota2. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76749 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-05 09:47:36 -07:00
Matt Turner	7a7b8a02be	i965/fs: Don't propagate saturate modifiers into partial writes. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-05 09:47:36 -07:00
Matt Turner	86ae6f477d	i965/fs: Fix off-by-one in saturate propagation. ip needs to be initialized to start_ip - 1, since the first thing in the main loop is ip++. Otherwise we would incorrectly propagate the saturate from the mov to the mad: mad a, b, c, d mov.sat x, a add y, z, a Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-05 09:47:36 -07:00
Matt Turner	20dee82a75	i965/vec4: Consider sources of non-GRF-dst instructions for dead channels. Previously we'd ignore the sources of instructions with non-GRF destinations when calculating the dead channels. This would lead to us incorrectly removing the first instruction in this sequence: mov vgrf11, ... cmp.ne.f0 null, vgrf11, 1.0 mov vgrf11, ... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76616	2014-04-05 09:47:36 -07:00
Matt Turner	63d57f3b08	i965/fs: Name temporary ralloc contexts something other than mem_ctx. Or else poor programmers might mistakenly use the temporary mem_ctx, instead of the fs_visitor's mem_ctx and wonder why their code is crashing. Also remove the parenting. These contexts are local to the optimization passes they're in and are freed at the end.	2014-04-05 09:44:54 -07:00
Matt Turner	26012c1673	i965/fs: Recalculate live intervals in calculate_register_pressure(). Otherwise calling dump_instructions() after declaring a new fs_reg would segfault when calculate_register_pressure()'s loop over reg walked off the end of the virtual_grf_start[] array that calculate_live_intervals() would have reallocated for you, if it had known there was a new register.	2014-04-05 09:44:54 -07:00
Jonathan Gray	c973e440d5	egl/dri2: use drm macros to construct device name Don't hardcode /dev/dri/card0 but instead use the drm macros which allows the correct /dev/drm0 device to be opened on OpenBSD. v2: use snprintf and fallback to /dev/dri/card0 v3: check for snprintf truncation Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-05 13:36:29 +01:00
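Note on `c973e440d5`: the pattern described in v2 and v3 (build the name with `snprintf`, detect truncation, fall back to `/dev/dri/card0`) might look roughly like the sketch below. The `EXAMPLE_DIR_NAME` and `EXAMPLE_DEV_NAME` macros are local stand-ins for the drm macros the message refers to, with assumed Linux-style values, and `build_device_name` is a hypothetical helper rather than the function in the patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for the drm macros; the values here are assumptions. */
#define EXAMPLE_DIR_NAME "/dev/dri"
#define EXAMPLE_DEV_NAME "%s/card%d"

/*
 * Format the device node name for the given minor into buf.
 * Returns buf on success, or the hardcoded fallback path when
 * snprintf fails or the result would have been truncated.
 */
static const char *
build_device_name(char *buf, size_t size, int minor)
{
   int n = snprintf(buf, size, EXAMPLE_DEV_NAME, EXAMPLE_DIR_NAME, minor);

   if (n < 0 || (size_t) n >= size)
      return "/dev/dri/card0";

   return buf;
}
```

`snprintf` returns the length the full string would have had, so a return value greater than or equal to the buffer size indicates truncation.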
Jonathan Gray	81799c82e4	configure: don't require libudev for gbm or egl drm/wayland After the loader changes libudev is no longer required for gbm or the egl drm/wayland platforms. Lets these build/run on OpenBSD. v2: preserve the libudev requirement for Linux as suggested by Emil Velikov. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 13:35:25 +01:00
Jonathan Gray	0295953c5d	egl/dri2: don't require libudev to build drm/wayland platforms After the loader changes libudev is no longer required to build gbm or the egl drm/wayland platforms. Remove a libudev ifdef which allows the drm egl driver to be loaded on OpenBSD. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 13:33:48 +01:00
Jonathan Gray	11623be934	automake: don't enable -Wl,--no-undefined on OpenBSD OpenBSD does not have DT_NEEDED entries for libc by design, over concerns how the symbols would be referenced after changing the major version of the library. So avoid -no-undefined checks on OpenBSD as they will fail. v2: don't include the -no-undefined libtool option in the variable and change -Wl,--no-undefined references in Automake.inc as well. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76856 Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-05 13:30:27 +01:00
Emil Velikov	e4bd00c1c6	targets/dri: move common libraries to GALLIUM_DRI_LIB_DEPS Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 13:02:54 +01:00
Emil Velikov	fc91e7e4ae	targets/omx: use GALLIUM_COMMON_LIB_DEPS The targets do not require expat or selinux. Use GALLIUM_COMMON_LIB_DEPS which provides the core requirements for each gallium target. Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 13:02:48 +01:00
Emil Velikov	6b41043050	targets/xvmc: use GALLIUM_COMMON_LIB_DEPS The targets do not require expat or selinux. Use GALLIUM_COMMON_LIB_DEPS which provides the core requirements for each gallium target. Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 13:02:46 +01:00
Emil Velikov	432b5776f2	r600/omx: drop -lstdc++ hack The build system will use g++ to link the static library due to the dummy.cpp source(s). Thus one does not need the explicit link against stdc++. Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 13:02:30 +01:00
Emil Velikov	28a4276442	drivers/nouveau: mention dummy.cpp to use g++ linker The build system does not know that the static library is C++. Mention the cpp file to trigger generation of the proper variable and drop the hacky stdc++ linking. Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-05 13:00:32 +01:00
Emil Velikov	16372969c7	drivers/nouveau: use GALLIUM_COMMON_LIB_DEPS Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-05 13:00:14 +01:00
Emil Velikov	c8129604ef	drivers/r300: use GALLIUM_COMMON_LIB_DEPS Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76848 Tested-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 13:00:07 +01:00
Emil Velikov	ba5eba5008	automake: introduce GALLIUM_COMMON_LIB_DEPS Rather than copying the core four dependencies all over gallium, introduce the above variable to avoid all the duplication. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76848 Tested-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 13:00:02 +01:00
Emil Velikov	16c13aaeb8	automake: move GALLIUM_DRI_LIB_DEPS to Automake.inc With recent commit we started de-duplicating all of the compiler/ linker flags moving their handling inside Automake.inc. This did not take into consideration that the above variable was set at configure time, leading to issues on certain build combinations. Move the variable to where it's used/handled thus cleaning up configure.ac. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76848 Tested-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 12:59:44 +01:00
Johannes Nixdorf	476db98e03	configure.ac: fix the detection of expat with pkg-config The pkg-config module was called "EXPAT" instead of "expat" in PKG_CHECK_EXISTS. This seems to have been wrong because the wrong argument was copied from PKG_CHECK_MODULES. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 12:24:01 +01:00
Jonathan Gray	1cc742d912	megadriver_stub.c: don't use _GNU_SOURCE to gate the compat code _GNU_SOURCE is only set/required for linux\|-gnu\|gnu) and as the functionality is available on other systems check for RTLD_DEFAULT instead. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 12:21:31 +01:00
Jonathan Gray	380f05ccc3	loader: don't limit the non-udev path to only android Platforms that lack libudev (OpenBSD and possibly others) need this change in order to load the correct dri driver. Under linux we unconditionally require libudev, thus this code will never get built. v2: Add commit message (Emil Velikov) Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 12:17:28 +01:00
Jonathan Gray	727f54a76e	loader: use 0 instead of FALSE which isn't defined Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 12:16:45 +01:00
Francisco Jerez	4ccff1499c	clover: Document that the obj() helpers already take care of object validation.	2014-04-05 12:18:29 +02:00
Matt Turner	489cb0b2d1	i965: Mark SNB GT1 as a GT1. brw->gt only seems to be used on gen >= 7, so this shouldn't have any effect. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-04 15:07:41 -07:00
Marek Olšák	78f754b739	gallium/u_blitter: implement scaled blitting in the Z direction So that pipe->blit can be used for 3D mipmap generation.	2014-04-04 19:38:36 +02:00
Marek Olšák	8ab7bb4707	gallium/u_blitter: don't adjust cubemap coordinates by a small number It may cause issues with mipmap generation. I think it was used to make some piglit tests pass on r300g.	2014-04-04 19:38:36 +02:00
Leo Liu	0817182b2f	Revert "radeon: just don't map VRAM buffers at all" This reverts commit `96e8b916a7`. In the case of VCE encoding with raw YUV file, CPU load directly to VRAM is faster than combination of CPU writing to GTT and then blit to VRAM with GPU. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-04 16:21:04 +02:00
Leo Liu	de1a59b7a7	radeon/vce: cleanup cpb handling v2: fix whitespace errors, minor coding style changes Signed-off-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2014-04-04 12:35:55 +02:00
Christian König	6c59be7776	st/mesa: improve sampler view handling Keep a dynamically increasing array of all the views created for a texture instead of just the last one. v2: add comments, fix array size calculation, release only the first sampler view found Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-04 10:25:35 +02:00
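The "dynamically increasing array" mentioned in `6c59be7776` can be pictured with a minimal sketch. The names `view_list` and `view_list_append` are hypothetical and are not taken from the st/mesa code; the sketch only shows the general pattern of keeping every created view instead of just the last one.

```c
#include <stdlib.h>

/* Hypothetical container, not the st/mesa structure. */
struct view_list {
    void **views;   /* heap array of view pointers */
    unsigned num;   /* number of valid entries */
};

/* Appends one view, growing the array by one slot. Returns 0 on success. */
static int
view_list_append(struct view_list *l, void *view)
{
    void **grown = realloc(l->views, (l->num + 1) * sizeof(*grown));
    if (!grown)
        return -1;   /* the old array is still valid and unchanged */
    grown[l->num] = view;
    l->views = grown;
    l->num++;
    return 0;
}
```

Growing by one slot per append keeps the sketch short; a real implementation might grow in larger steps to reduce reallocations.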
Thomas Hellstrom	61bedc3d6b	st/xa: Fix advertized version number and try to avoid future discrepancies The xa version number had to be set in two places. In configure.ac and in xa_tracker.h. Furthermore, xa_tracker.h is an installed header so we can't use mesa internal defines. So therefore, at configure time, modify the xa_tracker.h header to use the version given by configure.ac Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2014-04-04 08:33:43 +02:00
Ian Romanick	4fa58ae5c7	glapi: Fix make check /me puts a paper bag on his head and sits in the corner. This was supposed to be included in `5a68f731`, which added glPointSizePointerOES back to the list of functions exposed by libGLESv1_CM. It looks like it was an uncommitted change in my tree when I sent the patch out. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-03 20:12:19 -07:00
Brian Paul	177c9be615	llvmpipe: remove no-op checks in sampler, sampler_view functions Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-04-03 20:05:56 -06:00
Brian Paul	61a3e9936c	softpipe: remove no-op checks in sampler, sampler_view functions Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-04-03 19:39:23 -06:00
Brian Paul	4105ad825f	svga: remove no-op checks in sampler, sampler_view functions We are checking for no-ops in the CSO module for both of these items so there's no reason to do it in the driver. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-04-03 19:39:23 -06:00
Brian Paul	5a2f8b2c48	cso: check for no sampler view changes in cso_set_sampler_views() As we do for sampler states in single_sampler_done() and many other CSO functions. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-03 19:39:23 -06:00
Timothy Arceri	ffa39ab067	docs: Add note about updating tests to dev info Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>	2014-04-04 06:48:11 +11:00
José Fonseca	c6050ce7da	st/wgl: Remove wglGalliumMESA(). These were only used by the Python state tracker, which was removed, hence they have no practical use. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-03 12:52:09 +01:00
Ian Romanick	572a25be2f	glapi: Fix scons build Put the -c in the correct place (and match Makefile.am). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76960 Tested-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: José Fonseca <jfonseca@vmware.com>	2014-04-03 12:52:09 +01:00
Adel Gadllah	d120506e15	glx: Do not advertise buffer_age on dri2 Previously GLX_EXT_buffer_age has always been advertised as supported because both client_glx_support and client_glx_only were set. So it did not matter that direct_support is only set when running dri3 and we ended up always advertising it. Fix that by not setting client_glx_only for buffer_age in known_glx_extensions. Signed-off-by: Adel Gadllah <adel.gadllah@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-04-02 21:28:26 +01:00
Brian Paul	2355a64414	cso: fix sampler view count in cso_set_sampler_views() We want to call pipe->set_sampler_views() with count being the maximum of the old number of sampler views and the new number. This makes sure we null-out any old sampler views. We already do the same thing for sampler states in single_sampler_done(). Fixes some assertions seen in the VMware driver with XA tracker. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Tested-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-02 13:58:05 -06:00
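The count rule described in `2355a64414` (pass the maximum of the old and new view counts so stale slots are nulled out) can be sketched as follows. `bound_views`, `update_views` and `MAX_VIEWS` are hypothetical names for illustration; this is not the CSO module's code, only the idea behind it.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VIEWS 8   /* hypothetical limit for this sketch */

/* Hypothetical record of what is currently bound. */
struct bound_views {
    const void *slot[MAX_VIEWS];
    unsigned count;
};

/*
 * Binds n new views and returns the number of slots that must be sent
 * to the driver: max(old count, n). Slots in [n, old count) are set to
 * NULL so that previously bound views do not linger.
 */
static unsigned
update_views(struct bound_views *b, const void *const *views, unsigned n)
{
    unsigned old = b->count;
    unsigned total = old > n ? old : n;
    unsigned i;

    assert(n <= MAX_VIEWS);

    for (i = 0; i < n; i++)
        b->slot[i] = views[i];
    for (; i < total; i++)
        b->slot[i] = NULL;

    b->count = n;
    return total;
}
```

If only `n` were passed on, a caller going from three views to one would leave slots 1 and 2 pointing at the old views.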
Ian Romanick	5a68f73102	glapi: Add static dispatch for glPointSizePointerOES The OpenGL ES 1.1 conformance tests expect this function to be statically available from libGLESv1_CM.so. The comment "required for es1.1" in the XML file should have been a clue. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76926 Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Lu Hua <huax.lu@intel.com>	2014-04-02 11:30:52 -07:00
Ian Romanick	065ca63043	Revert "Revert "glapi/es1: Don't mark core functions as static_dispatch=false"" This reverts commit `526e49290c`. The original build problem should be fixed by the previous commit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Brian Paul <brianp@vmware.com> Tested-by: Lu Hua <huax.lu@intel.com>	2014-04-02 11:30:49 -07:00
Ian Romanick	cecffa08d1	glapi: Enable ES compatibility mode Ages ago Chia-I added an ES compatibility flag to several of the various generator scripts. The intention was to bridge differences between ES and desktop in Mesa builds without ES. It doesn't appear that it has ever been used. Recent changes to static_dispatch status of several ES1 functions caused problems in desktop-only, non-shared-glapi builds. Enabling the ES compatibility mode appears to fix these build problems. This is kind of a duct tape solution to this problem. As I mentioned in the cover letter for the series that triggered the build problem, I would like to make some major changes to the generator architecture and the XML. The whole point of the proposed architecture changes is to better handle the differences between desktop GL and ES. I think duct tape is okay for now. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76869 Tested-by: Brian Paul <brianp@vmware.com> Tested-by: Lu Hua <huax.lu@intel.com> Cc: Vinson Lee <vlee@freedesktop.org> Cc: Chia-I Wu <olv@lunarg.com>	2014-04-02 11:30:45 -07:00
Ian Romanick	8e3a7c6204	glapi: Fix build break in 'make check' on non-shared-glapi builds Commit `fb78fa58` made the GL_ARB_debug_output functions aliases of the GL_KHR_debug output functions. As a result, the function names in struct _glapi_table also changed. The table in check_table.cpp used the ARB names. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Tested-by: Brian Paul <brianp@vmware.com> Tested-by: Lu Hua <huax.lu@intel.com> Cc: Vinson Lee <vlee@freedesktop.org>	2014-04-02 11:30:42 -07:00
Ian Romanick	4e18279fae	glapi: Remove support for "short string" mode C89 has a fairly short minimum-maximum string length. To support compilers limited by the C89 limits, this script had a mode where it would generate a character array instead of a giant string. These were functionally the same, but the code generated for the character array is HUGE and difficult to read. As far as I can tell, nothing in Mesa uses '-m short' any more. The generated files used to be tracked in revision control, but I think we stopped using '-m short' when we stopped tracking the generated files. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Brian Paul <brianp@vmware.com> Tested-by: Lu Hua <huax.lu@intel.com> Cc: Vinson Lee <vlee@freedesktop.org>	2014-04-02 11:30:37 -07:00
Juha-Pekka Heikkila	0f641b2d50	mesa: remove redundant running of check_symbol_table() Nested for loops that run through tables only to assert against them were also run in optimized builds. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-02 19:54:37 +03:00
Juha-Pekka Heikkila	17e7cbe078	mesa: Add missing null check in _mesa_parse_arb_program() Add missing null check in program_parse.tab.c through program_parse.y Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-02 19:54:37 +03:00
Juha-Pekka Heikkila	68a45b130e	mesa: Prevent negative indexing on noise2, noise3 and noise4 % operator could return negative value which would cause indexing before perm table. Change %256 to &0xff Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-02 19:54:37 +03:00
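The reasoning in `68a45b130e` rests on how C handles negative operands: since C99, `%` truncates toward zero, so a negative left operand gives a negative (or zero) result, while `& 0xff` on the usual two's-complement `int` always lands in 0..255. A minimal sketch, with hypothetical helper names that are not from Mesa's noise code:

```c
#include <assert.h>

/* Hypothetical helpers for illustration only. */
static int wrap_mod(int i)  { return i % 256; }   /* negative for some i < 0 */
static int wrap_mask(int i) { return i & 0xff; }  /* always in 0..255 */

/* Looks up a 256-entry table; the mask keeps the index in bounds for any int. */
static unsigned char
perm_lookup(const unsigned char perm[256], int i)
{
    int idx = wrap_mask(i);
    assert(idx >= 0 && idx < 256);
    return perm[idx];
}
```

For example, `wrap_mod(-1)` is `-1`, which would index before the table, whereas `wrap_mask(-1)` is `255`.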
Juha-Pekka Heikkila	1056c50d57	glx: add extra null check in getFBConfigs Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-02 19:54:37 +03:00
Juha-Pekka Heikkila	88976daea9	glx: remove unused __glXClientInfo() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-02 19:54:37 +03:00
Tapani Pälli	e14cc504f3	i965/vec4: do not trim dead channels on gen6 for math Do not set a writemask on Gen6 for math instructions, those are executed using align1 mode that does not support a destination mask. v2: cleanups, better comment (Matt) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76883 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-02 19:50:48 +03:00
Thomas Hellstrom	5dc206525b	winsys/svga: Replace the query mm buffer pool with a slab pool v3 This is to avoid running out of query buffer space due to winsys limitations. Instead of a fixed size per screen pool of query buffers, use a slab allocator that allocates a new slab if we run out of space in the first one. v2: Correct email addresses. v3: s/8192/VMW_QUERY_POOL_SIZE/. Improve documentation and log message. Reported-and-tested-by: Brian Paul <brianp@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-04-02 18:32:44 +02:00
Dave Airlie	76ba50a25a	mesa/soft/llvmpipe: add fake MSAA support This adds a gallium cap that allows us to fake GL3.0 by not exposing MSAA on sw rendering. It also forces the extra extensions needed for GL3.2. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-04-02 12:12:04 +10:00
Kristian Høgsberg	882b46a42e	gbm: Add gbm_bo_get_fd to gbm-symbols-check script	2014-04-01 14:08:38 -07:00
Kristian Høgsberg	a43d286ef7	gbm: Add import from fd Add a new import type that lets us create a gbm bo from a DMA-BUF file descriptor. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-01 12:27:26 -07:00
Kristian Høgsberg	f54f5891be	gbm: Add gbm_bo_get_fd() Add gbm function to get a DMA-BUF file descriptor for a gbm bo. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-01 12:27:13 -07:00
Jordan Justen	7c379ebe17	include/GLES3: add OpenGL ES 3.1 Headers From: http://www.khronos.org/registry/gles/api/GLES3/gl31.h http://www.khronos.org/registry/gles/api/GLES2/gl2ext.h http://www.khronos.org/registry/gles/api/GLES3/gl3platform.h Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 09:30:32 -07:00
Brian Paul	526e49290c	Revert "glapi/es1: Don't mark core functions as static_dispatch=false" This reverts commit `f6e290f80c`, to fix the broken build. The DRI-enabled build seems OK after reverting. The non-DRI/gallium build is still suffering from an unrelated issue in the pipe-loader code.	2014-04-01 08:42:15 -06:00
Iago Toral Quiroga	f5904b732e	mesa: Allow setting GL_TEXTURE_MAX_LEVEL to 0 with GL_TEXTURE_RECTANGLE. Currently, we raise an error when doing this which breaks a conformance test from the OpenGL samples pack. Even if this is a bit silly it is not an error. From http://www.opengl.org/wiki/Rectangle_Texture: "Rectangle textures contain exactly one image; they cannot have mipmaps. Therefore, any texture parameters that depend on LODs are irrelevant when used with rectangle textures; attempting to set these parameters to any value other than 0 will result in an error." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76496 Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 08:37:06 -06:00
Ilia Mirkin	c13ff5a763	gallium/docs: fix silent math failures due to ~ and & Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 10:17:13 -04:00
Ilia Mirkin	b4cf180695	gallium/docs: line up some of the equations Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 10:17:13 -04:00
Ilia Mirkin	05d0223da3	gallium/docs: fix incorrect/missing references Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 10:17:13 -04:00
Ilia Mirkin	45e383bfae	gallium/docs: fix use of _ in math sections Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 10:17:13 -04:00
Ilia Mirkin	2f14e5eb09	gallium/docs: add format to index Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 10:17:13 -04:00
Ilia Mirkin	4ca110a7b9	gallium/docs: fix a lot of bad formatting Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 10:17:13 -04:00
Chia-I Wu	5d76e44643	glsl: remove UBO fields from _mesa_glsl_parse_state They are not needed since `514f8c7ec7`. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-01 13:41:20 +08:00
Ilia Mirkin	010171b562	nv50: implement clear_buffer to accelerate ARB_clear_buffer_object Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-31 21:55:03 -04:00
Ilia Mirkin	f5ba1a1f7f	mesa/st: Accelerate ARB_clear_buffer_object with clear_buffer Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-31 21:21:11 -04:00
Ilia Mirkin	24b86cb304	gallium: add interface to clear buffers Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-31 21:20:02 -04:00
Ian Romanick	4c035706dc	mapi_abi: Remove ABI-check work arounds for functions that are no longer exported The previous commit stopped exporting 21 libGLESv2 and 88 libGLESv1_CM functions. This removes the work-arounds for those functions from ABI-check. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-31 14:47:25 -07:00
Ian Romanick	1a59f9a131	mapi_abi: Make ES1 and ES2 static_dispatch=false functions hidden This has been a long standing issue with the ES libraries. Functions marked in the XML with 'static_dispatch=false' were still incorrectly exported. ABI-check is supposed to detect this case, but we have to paper over failures every time a new extension is added. This change will cause a big pile of functions to disappear from libGLESv2 and libGLESv1_CM. libGLESv2 loses (20 functions): glBindVertexArrayOES glCompressedTexImage3DOES glCompressedTexSubImage3DOES glCopyTexSubImage3DOES glDeleteVertexArraysOES glDiscardFramebufferEXT glDrawBuffersNV glFlushMappedBufferRangeEXT glFramebufferTexture3DOES glGenVertexArraysOES glGetBufferPointervOES glGetProgramBinaryOES glIsVertexArrayOES glMapBufferOES glMapBufferRangeEXT glProgramBinaryOES glReadBufferNV glTexImage3DOES glTexSubImage3DOES glUnmapBufferOES libGLESv1_CM loses (88 functions): glAlphaFuncxOES glBindFramebufferOES glBindRenderbufferOES glBlendEquationOES glBlendEquationSeparateOES glBlendFuncSeparateOES glCheckFramebufferStatusOES glClearColorxOES glClearDepthfOES glClearDepthxOES glClipPlanefOES glClipPlanexOES glColor4xOES glDeleteFramebuffersOES glDeleteRenderbuffersOES glDepthRangefOES glDepthRangexOES glDiscardFramebufferEXT glDrawTexfOES glDrawTexfvOES glDrawTexiOES glDrawTexivOES glDrawTexsOES glDrawTexsvOES glDrawTexxOES glDrawTexxvOES glFlushMappedBufferRangeEXT glFogxOES glFogxvOES glFramebufferRenderbufferOES glFramebufferTexture2DOES glFrustumfOES glFrustumxOES glGenerateMipmapOES glGenFramebuffersOES glGenRenderbuffersOES glGetBufferPointervOES glGetClipPlanefOES glGetClipPlanexOES glGetFixedvOES glGetFramebufferAttachmentParameterivOES glGetLightxvOES glGetMaterialxvOES glGetRenderbufferParameterivOES glGetTexEnvxvOES glGetTexGenfvOES glGetTexGenivOES glGetTexGenxvOES glGetTexParameterxvOES glIsFramebufferOES glIsRenderbufferOES glLightModelxOES glLightModelxvOES glLightxOES glLightxvOES glLineWidthxOES 
glLoadMatrixxOES glMapBufferOES glMapBufferRangeEXT glMaterialxOES glMaterialxvOES glMultiTexCoord4xOES glMultMatrixxOES glNormal3xOES glOrthofOES glOrthoxOES glPointParameterxOES glPointParameterxvOES glPointSizePointerOES glPointSizexOES glPolygonOffsetxOES glQueryMatrixxOES glRenderbufferStorageOES glRotatexOES glSampleCoveragexOES glScalexOES glTexEnvxOES glTexEnvxvOES glTexGenfOES glTexGenfvOES glTexGeniOES glTexGenivOES glTexGenxOES glTexGenxvOES glTexParameterxOES glTexParameterxvOES glTranslatexOES glUnmapBufferOES Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Chia-I Wu <olv@lunarg.com> Cc: Paul Berry <stereotype441@gmail.com>	2014-03-31 14:47:00 -07:00
Ian Romanick	dfccd5ccd7	mapi: Hack around glGetInternalformativ not being hidden in GLES This is hella ugly. The same-named function in desktop OpenGL is hidden, but it needs to be exposed by libGLESv2 for OpenGL ES 3.0. There's no way to express in the XML that a function should be hidden in one API but exposed in another. This won't effect any change now, but it will prevent a regression in a later patch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-31 14:46:48 -07:00
Ian Romanick	f6e290f80c	glapi/es1: Don't mark core functions as static_dispatch=false Functions that are part of OpenGL ES 1.0 or 1.1 should have static dispatch functions in libGLESv1_CM. This doesn't effect any change yet, but it will prevent later regressions. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-31 14:46:39 -07:00
Ian Romanick	d457eb193c	glapi: Mark all GL_ARB_separate_shader_objects functions with static_dispatch=false This prevents the entrypoints from being (incorrectly) advertised by libGL. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-31 14:46:32 -07:00
Ian Romanick	5ccc4e7a8d	glapi: Remove some duplicate ignore="true" lines It looks like these were added accidentally by Paul in commit `1a1db174`. From the commit message and the look of the patch, I think this was just some sed-job left overs. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-31 14:45:37 -07:00
Matt Turner	3a8bd97241	i965/vec4: Don't trim writemasks of texture instructions. It was my understanding that the writemask works in SIMD4x2 mode for texturing instructions and doesn't require a message header. Some bit of this logic must be wrong, so disable it until it's understood. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76617 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-31 10:24:10 -07:00
Emil Velikov	d681b22ed7	automake: ask the linker to do garbage collection By doing GC the linker removes all the symbols that are not referenced and/or used by the final library. This results in a saving of ~100K up-to ~600K per (stripped) binary (classic vs gallium drivers). If interested one can ask the compiler to print the sections that are removed using -Wl,--print-gc-sections. v2: Check if ld supports the flag before using it. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com> (v1)	2014-03-31 14:56:14 +01:00
Emil Velikov	d187a150d4	automake: add -Wl,--no-undefined to all libraries ... apart from the dri drivers. With this final change we can build mesa without fear that the resulting libraries will have unresolved symbols. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 13:09:23 +01:00
Emil Velikov	902dc61f88	gallium/targets: add missing library dependencies Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 13:08:55 +01:00
Emil Velikov	354a5cad74	pipe-loader: reorder PIPE_LIBS Reorder -lm, -lrt, -lpthreads and -ldl to be consistent with the rest of mesa. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-03-31 13:05:36 +01:00
Emil Velikov	0177ff0039	pipe-loader: use PTHREAD_LIBS over -lpthread Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 13:02:47 +01:00
Emil Velikov	501af7a1a0	dri/i965: use CLOCK_LIBS over -lrt Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 13:01:10 +01:00
Emil Velikov	5503c227d9	automake: consistently use -no-undefined Set the flag for all but the dri targets. They have missing glapi symbols which are required for the normal operation with the X server. Jon, I fear that you'll need to carry the "no-undefined" hunk locally when building the dri drivers under cygwin. Cc: Jon TURNEY <jon.turney@dronecode.org.uk> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:59:16 +01:00
Emil Velikov	6c8d8119ca	targets/egl-static: move the common LDFLAGS into AM_LDFLAGS Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:56:25 +01:00
Emil Velikov	c323273201	targets/omx: do not link against the trace driver Unused due to the missing GALLIUM_TRACE define. Requested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-31 12:55:29 +01:00
Emil Velikov	0484b8446a	gallium/targets: explicitly include a dummy.cpp and remove all the LINK mayhem Explicitly setting the linker variable was required for old and broken build toolchains. At this point this should no longer be needed, and setting the sources lists will trigger generation of the correct LINK variables. Explicitly include dummy.cpp to use g++ to link the static library which in most cases is based upon C++ code. v2: Reword commit message. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:26:47 +01:00
Emil Velikov	2d9c33009a	gallium/targets: move LLVM_LIBS handling inside Automake.inc Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-03-31 12:26:32 +01:00
Emil Velikov	2328900f66	gallium/targets: fold LLVM_LDFLAGS inside Automake.inc Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-03-31 12:26:16 +01:00
Emil Velikov	1ea1767f72	targets/omx: use GALLIUM_OMX_LINKER_FLAGS Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-31 12:25:34 +01:00
Emil Velikov	e6f8db1e56	targets/omx: introduce GALLIUM_OMX_LIB_DEPS Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-31 12:25:04 +01:00
Emil Velikov	55bc658e4b	targets/pipe-loader: move LLVM_LIBS handling inside PIPE_LIBS This lets us have only one if HAVE_MESA_LLVM block, rather than one for each driver. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:23:59 +01:00
Emil Velikov	e36cc99880	targets/pipe-loader: include dummy.cpp irrespective of HAVE_MESA_LLVM Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:22:58 +01:00
Emil Velikov	029bc4510b	targets/pipe-loader: compact duplicating LDFLAGS Every library uses the same libtool/linker flags. Compact those into AM_LDFLAGS and append the version script to it. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:22:30 +01:00
Joakim Sindholt	e6545aaaeb	pipe-loader/swrast: add soft/llvmpipe defines Or it compiles them in, but pretends they don't exist v2: Rebase (Emil) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:22:08 +01:00
Emil Velikov	613b4d59e4	targets/xa: drop libudev references from automake build Mesa does _not_ link against libudev. Additionally the only place that deals with it is the loader, thus we can drop the CFLAGS. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:21:47 +01:00
Emil Velikov	f5466b7b93	dri/common: LIBDRM_LIBS is not a linker/libtool flag, add it to LIBADD Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:21:42 +01:00
Emil Velikov	46ae286b9d	drivers/x11: GL_LIB_DEPS is not a linker/libtool flag, add it to LIBADD Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:21:36 +01:00
Emil Velikov	e62b7d38a1	configure: autodetect video state-trackers when non swrast driver is present It makes little sense to enable the vdpau, xvmc and omx state-trackers as they do not make use of (don't work with) the software driver. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:21:30 +01:00
Emil Velikov	3dc174e85e	configure: use grep in quiet mode, rather than piping stderr/stdout to /dev/null grep -q is easier to read and consistent with the rest of configure.ac. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:20:10 +01:00
Emil Velikov	e8e1158ac3	configure: error out when building gallium-osmesa without softpipe Gallium osmesa links against the softpipe driver, thus the build will fail if it's missing. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:18:39 +01:00
Emil Velikov	4d8267ef20	Partially revert "automake: allow only shared builds" Evidently at least static OSMesa is still used as shared one causes substantial increase in the load time for some programs that use it (from seconds up-to ~30min). Rather than forcing everyone to use shared mesa, revert commit `a6efbac9fb` and default to shared build when both shared and static are disabled. v2: Whitespace cleanup, drop silly comment. Reported-by: Burlen Loring <burlen.loring@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:18:17 +01:00
Emil Velikov	23740ed031	configure: enable dri3 only for linux Currently only linux can make use of dri3, so it would make sense to enable it explicitly for the platform. Drop a duplicated libudev check while we're at it. v3: Properly handle dri3 and reword commit message. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76377 Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:11:37 +01:00
Chris Forbes	ec4b8d1697	mesa: Fix format matching checks for GL_INTENSITY* internalformats. GL_INTENSITY has never been valid as a pixel format -- to get the memcpy pack/unpack paths, the app needs to specify GL_RED as the pixel format (or GL_RED_INTEGER for the integer formats). Note: This was briefly merged before, but exposed some breakage in gallium, so was reverted. Hopefully it will stick this time. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 11:56:48 +13:00
Chris Forbes	e3cdbdb14b	st: fix st_choose_matching_format to ignore intensity _mesa_format_matches_format_and_type() returns true for GL_RED/GL_RED_INTEGER (with an appropriate type) into an intensity mesa_format. We want the `red`-based format instead, regardless of the order we find them in our walk of the mesa formats list. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-01 11:56:18 +13:00
Chris Forbes	3196c53c5d	mesa: fix texstore for MESA_FORMAT_R8G8B8A8_SRGB The case for this was in the wrong function, and this format's store func was not set in the table at all. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 11:54:56 +13:00
Rob Clark	db414c4686	freedreno/a3xx/compiler: fix RECT textures Whether or not the coords are normalized is handled in the texture state. But we otherwise need to treat RECT sample instructions as 2D. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-30 12:10:26 -04:00
Rob Clark	83808a90be	freedreno/a3xx/compiler: avoid negative register ids In some cases, we need a register to be assigned up to three components before the base. Since we can't have negative register #'s, just shift everything up. May increase register usage for trivial shaders, but I don't think we are shader limited in those cases. A proper solution is going to require a better register assignment algorithm (which is on the TODO list), this is just a hack to get us by until then. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-30 09:53:32 -04:00
Rob Clark	2346ea6347	freedreno/a3xx: missing wfi RB_FRAME_BUFFER_DIMENSION is not a banked context register, so we need to wait for the GPU to idle before updating it. But we'd rather not have unnecessary WFI's, so actually keep track if we need to emit it or not. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-30 09:50:24 -04:00
Rob Clark	ae5efaf285	freedreno/a3xx: little extra debug Catch things which should not happen in debug builds. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-30 09:40:00 -04:00
Rob Clark	92141afd0e	freedreno: handle null sampler This is something that XA triggers. In some cases it will only use SAMP[1] (composite mask) but not SAMP[0] (composite src). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-30 09:38:16 -04:00
Kenneth Graunke	9b6b084eb7	i965: Add Cherryview support. Based on a patch by Ville Syrjälä. As usual, these are placeholder values; actual values will come later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-28 17:10:09 -07:00
Ian Romanick	4047263cb1	glsl: Clean up "unused parameter" warnings ../../src/glsl/builtin_functions.cpp:72:1: warning: unused parameter 'state' [-Wunused-parameter] ../../src/glsl/ir_clone.cpp:31:1: warning: unused parameter 'ht' [-Wunused-parameter] ../../src/glsl/ir_equals.cpp:44:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/ir_equals.cpp:50:1: warning: unused parameter 'ignore' [-Wunused-parameter] ../../src/glsl/ir_equals.cpp:68:1: warning: unused parameter 'ignore' [-Wunused-parameter] ../../src/glsl/ir_print_visitor.cpp:149:6: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/ir_print_visitor.cpp:556:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/ir_print_visitor.cpp:562:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/link_uniforms.cpp:213:1: warning: unused parameter 'record_type' [-Wunused-parameter] ../../src/glsl/loop_analysis.cpp:225:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/loop_unroll.cpp:73:30: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/loop_unroll.cpp:79:30: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/loop_unroll.cpp:85:30: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/opt_copy_propagation_elements.cpp:189:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/opt_cse.cpp:402:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/opt_dead_code_local.cpp:117:30: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/opt_redundant_jumps.cpp:53:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/opt_vectorize.cpp:301:1: warning: unused parameter 'ir' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-28 10:57:58 -07:00
Ian Romanick	1b28c8d77a	mesa: Clean up "unused parameter" warnings program/ir_to_mesa.cpp:2008:1: warning: unused parameter 'ir' [-Wunused-parameter] program/ir_to_mesa.cpp:2272:1: warning: unused parameter 'ir' [-Wunused-parameter] program/ir_to_mesa.cpp:2278:1: warning: unused parameter 'ir' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-28 10:57:55 -07:00
Ian Romanick	1bdf65f743	mesa/program: Constify find_variable_storage Also clean up an old whitespace blooper. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-28 10:57:53 -07:00
Ian Romanick	22128e30f3	glsl: Move Doxygen block closing to the correct place This is the closing for the "\defgroup IR Intermediate representation nodes" all the way at the top of the file. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-28 10:57:49 -07:00
Iago Toral Quiroga	029ccd773d	i965: Make sure we always compute valid index bounds before drawing. When doing software rendering (i.e. rendering to the selection buffer) we need to make sure that we have valid index bounds before calling _tnl_draw_prims(), otherwise we can crash. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59455 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-28 08:48:14 -07:00
Chia-I Wu	e7f7574598	glsl: remove {add,get}_type_ast from glsl_symbol_table They are not needed since `0da1a2cc36`. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-28 10:59:49 +08:00
Brian Paul	e341856294	mesa: fix glMultiDrawArrays inside a display list The underlying glDrawArrays() calls weren't getting compiled into the display list. We simply need to use the current dispatch table so the CALL_DrawArrays() is routed to the display list save function. This patch also fixes glMultiModeDrawArraysIBM and glMultiModeDrawElementsIBM. Fixes the new piglit gl-1.4-dlist-multidrawarrays test. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-27 11:09:30 -06:00
Brian Paul	12b959c351	st/mesa: overhaul texture / sample swizzle code Previously we only examined the GL_DEPTH_MODE state to determine the sampler view swizzle for depth textures. Now we also consider the texture base format for color textures too. The basic idea is if we're sampling from a RGB texture we always want to get A=1, even if the actual hardware format might be RGBA. We had assumed that the texture's A values were always one since that's what Mesa's texstore code does. But if we render to the RGBA texture, the A values might not be 1. Subsequent sampling didn't return the right values. Now we examine the user-specified texture base format vs. the actual gallium format to determine the right swizzle. Fixes several fbo-blending-formats, fbo-clear-formats and fbo-tex-rgbx failures with VMware/svga driver (and possibly other drivers). No other piglit regressions with softpipe or VMware/svga. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2014-03-27 09:45:25 -06:00
Brian Paul	0151707cfc	st/mesa: simplify apply_depthmode() In preparation for following changes. I used a temporary test harness to compare the old code to the new for all possible swizzle inputs. No change in results.	2014-03-27 08:08:26 -06:00
Eric Anholt	b02bcea715	i965: Use intel_upload_space() for pull constant uploads. This also happens to fix a leak of the current GS pull constant BO on context destroy, by just not holding on to the pull const bos after the surface state is generated. No statistically significant performance difference on GLB2.7 on HSW at 1024x768 (n=40) or 320x240 (n=44), or on BYT at 320x240 (n=47). v2: Rebase on intel_upload simplification. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-26 13:14:57 -07:00
Eric Anholt	3b57988290	i965: Massively simplify the intel_upload implementation. The implementation kept a page-sized area for uploading data, and uploaded chunks from that to a 64kb-sized streamed buffer. This wasted cache footprint (and extra state tracking to do so) when we want to just write our data into the buffer immediately. Instead, build it around an interface like brw_state_batch() that just gets you a pointer to BO memory to upload your stuff immediately. Improves OpenArena on HSW by 1.62209% +/- 0.355299% (n=61) and on BYT by 1.7916% +/- 0.415743% (n=31). v2: Rebase on Mesa master, drop old prototypes. Re-do performance comparison on a kernel that doesn't punish CPU efficiency improvements. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-26 13:13:26 -07:00
Zack Rusin	b1909b260f	draw/llvm: improve debugging output a bit It's useful to know what the llvmbuildstore arguments are going to be before executing it, because it can crash. Also make sure to print out the inputs only if we're not generating a gs, because it fetches inputs differently. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 15:58:59 -04:00
Zack Rusin	a3c0fa2d22	draw/gs: reduce the size of the gs output buffer We used to overallocate the output buffer, sometimes running out of memory with applications rendering large geometries. The actual maximum number of vertices out is simply the maximum number of primitives in (number of gs invocations) multiplied by the maximum number of output vertices per gs input primitive (i.e. gs invocation). Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 15:58:32 -04:00
Brian Paul	c875d6e57a	svga: add work-around for Sauerbraten Z fighting issue Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-26 10:31:13 -06:00
Brian Paul	070951b6ba	svga: null out query's hwbuf pointer after destroying Just to be extra safe. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-26 10:31:13 -06:00
Brian Paul	8bbc84d1e5	svga: add some debug_printf() calls in the query object code To help debug failures. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-26 10:31:13 -06:00
Brian Paul	488d4c4826	st/mesa: add null pointer checking in query object functions Don't pass null query object pointers into gallium functions. This avoids segfaulting in the VMware driver (and others?) if the pipe_context::create_query() call fails and returns NULL. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:31:13 -06:00
Brian Paul	82246f7939	svga: fix a comment (sampler vs. sampler_view)	2014-03-26 10:31:13 -06:00
Brian Paul	1f4ebfaa88	mesa: fix unpack_Z32_FLOAT_X24S8() / unpack_Z32_FLOAT() mix-up And use the z32f_x24s8 helper struct in unpack_Z32_FLOAT_X24S8(). Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:31:13 -06:00
Brian Paul	c1377ed464	mesa: fix indentation, formatting, etc in fbobject.c	2014-03-26 10:31:13 -06:00
Brian Paul	f5e0d024d1	mesa: rename format_(un)pack.c functions to match format names (pt. 7) sed commands: s/z_Z24_S8\b/S8_UINT_Z24_UNORM/g s/z_S8_Z24\b/Z24_UNORM_S8_UINT/g s/z_Z16\b/Z_UNORM16/g s/z_Z32\b/Z_UNORM32/g s/z_Z32_FLOAT/Z_FLOAT32/g Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:29:12 -06:00
Brian Paul	7f37802c8a	mesa: rename format_(un)pack.c functions to match format names (pt. 6) sed commands: s/ARGB2101010_UINT\b/B10G10R10A2_UINT/g s/ABGR2101010_UINT\b/R10G10B10A2_UINT/g Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:29:12 -06:00
Brian Paul	e51c3f9523	mesa: rename format_(un)pack.c functions to match format names (pt. 5) sed commands: s/SIGNED_R_UNORM8\b/R_SNORM8/g s/SIGNED_RG88_REV\b/R8G8_SNORM/g s/SIGNED_RGBX8888\b/X8B8G8R8_SNORM/g s/SIGNED_A8B8G8R8_UNORM\b/A8B8G8R8_SNORM/g s/SIGNED_R8G8B8A8_UNORM\b/R8G8B8A8_SNORM/g s/SIGNED_R_UNORM16\b/R_SNORM16/g s/SIGNED_R16G16_UNORM\b/R16G16_SNORM/g s/SIGNED_RGB_16\b/RGB_SNORM16/g s/SIGNED_RGBA_16\b/RGBA_SNORM16/g s/SIGNED_A_UNORM8\b/A_SNORM8/g s/SIGNED_L_UNORM8\b/L_SNORM8/g s/SIGNED_L8A8_UNORM\b/L8A8_SNORM/g s/SIGNED_L_UNORM8\b/I_SNORM8/g s/SIGNED_A_UNORM16\b/A_SNORM16/g s/SIGNED_L_UNORM16\b/L_SNORM16/g s/SIGNED_L16A16_UNORM\b/LA_SNORM16/g s/SIGNED_L_UNORM16\b/I_SNORM16/g s/XBGR16161616_SNORM\b/RGBX_SNORM16/g s/SIGNED_G8R8_UNORM\b/G8R8_SNORM/g s/SIGNED_G16R16_UNORM\b/G16R16_SNORM/g s/SIGNED_I_UNORM8\b/I_SNORM8/g s/SIGNED_I_UNORM16\b/I_SNORM16/g Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:29:12 -06:00
Brian Paul	f10f5b8822	mesa: rename format_(un)pack.c functions to match format names (pt. 4) sed commands: s/SRGBA_UNORM8\b/A8B8G8R8_SRGB/g s/SABGR_UNORM8\b/R8G8B8A8_SRGB/g s/SARGB8\b/B8G8R8A8_SRGB/g s/XBGR8888_SRGB\b/R8G8B8X8_SRGB/g s/XRGB8888_SRGB\b/B8G8R8X8_SRGB/g s/SL_UNORM8\b/L_SRGB8/g s/SLA_UNORM8\b/L8A8_SRGB/g manually changed SRGB8 -> BGR_SRGB8 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:29:12 -06:00
Brian Paul	be9eee3bcf	mesa: rename format_(un)pack.c functions to match format names (pt. 3) sed commands: s/LUMINANCE_FLOAT32\b/L_FLOAT32/g s/LUMINANCE_FLOAT16\b/L_FLOAT16/g s/LUMINANCE_ALPHA_FLOAT32\b/LA_FLOAT32/g s/LUMINANCE_ALPHA_FLOAT16\b/LA_FLOAT16/g s/ALPHA_FLOAT32\b/A_FLOAT32/g s/ALPHA_FLOAT16\b/A_FLOAT16/g s/XBGR32323232_FLOAT\b/RGBX_FLOAT32/g s/RGB9_E5_FLOAT\b/R9G9B9E5_FLOAT/g s/R11_G11_B10_FLOAT\b/R11G11B10_FLOAT/g s/INTENSITY_FLOAT16\b/I_FLOAT16/g s/INTENSITY_FLOAT32\b/I_FLOAT32/g v2: removed a few redundant/no-op substitutions Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:29:12 -06:00
Brian Paul	a49f46b15a	mesa: rename format_(un)pack.c functions to match format names (pt. 2) sed commands: s/ABGR2101010\b/R10G10B10A2_UNORM/g s/XRGB2101010_UNORM\b/B10G10R10X2_UNORM/g s/XBGR16161616_UNORM\b/RGBX_UNORM16/g s/ABGR2101010\b/R10G10B10A2_UNORM/g s/I8\b/I_UNORM8/g s/I16\b/I_UNORM16/g Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:29:12 -06:00
Brian Paul	5c619ace6f	mesa: rename format_(un)pack.c functions to match format names (pt. 1) sed commands: s/RGBA8888\b/A8B8G8R8_UNORM/g s/RGBA8888_REV\b/R8G8B8A8_UNORM/g s/ARGB8888\b/B8G8R8A8_UNORM/g s/ARGB8888_REV\b/A8R8G8B8_UNORM/g s/RGBA8888\b/X8B8G8R8_UNORM/g s/RGBA8888_REV\b/R8G8B8X8_UNORM/g s/XRGB8888\b/B8G8R8X8_UNORM/g s/XRGB8888_REV\b/X8R8G8B8_UNORM/g s/RGB888\b/BGR_UNORM8/g s/BGR888\b/RGB_UNORM8/g s/RGB565\b/B5G6R5_UNORM/g s/RGB565_REV\b/R5G6B5_UNORM/g s/ARGB4444\b/B4G4R4A4_UNORM/g s/ARGB4444_REV\b/A4R4G4B4_UNORM/g s/RGBA5551\b/A1B5G5R5_UNORM/g s/ARGB1555\b/B5G5R5A1_UNORM/g s/ARGB1555_REV\b/A1R5G5B5_UNORM/g s/AL44\b/L4A4_UNORM/g s/AL88\b/L8A8_UNORM/g s/AL88_REV\b/A8L8_UNORM/g s/AL1616\b/L16A16_UNORM/g s/AL1616_REV\b/A16L16_UNORM/g s/RGB332\b/B2G3R3_UNORM/g s/A8\b/A_UNORM8/g s/A16\b/A_UNORM16/g s/L8\b/L_UNORM8/g s/L16\b/L_UNORM16/g s/L8\b/I_UNORM8/g s/L16\b/I_UNORM16/g s/R8\b/R_UNORM8/g s/GR88\b/R8G8_UNORM/g s/RG88\b/G8R8_UNORM/g s/R16\b/R_UNORM16/g s/GR1616\b/R16G16_UNORM/g s/RG1616\b/G16R16_UNORM/g s/ARGB2101010\b/B10G10R10A2_UNORM/g Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:29:12 -06:00
Zack Rusin	bbdefabfc9	llvmpipe: Fix llvmpipe_create_gs_state. Revert unintended behaviour change from commit `b995a010e6`. Tested-by: José Fonseca <jfonseca@vmware.com>	2014-03-26 16:11:28 +00:00
Christian König	aa2274c1d2	st/omx/dec: fix possible segfault at eos Signed-off-by: Christian König <christian.koenig@amd.com>	2014-03-26 16:29:20 +01:00
José Fonseca	2de70fe23f	mapi/glapi: Use ElementTree instead of libxml2. It is quite hard to meet the dependency of the libxml2 python bindings outside Linux, and in particular on MacOSX; whereas ElementTree is part of Python's standard library. ElementTree is more limited than libxml2: no DTD verification, defaults from DTD, or XInclude support, but none of these limitations is serious enough to justify using libxml2. In fact, it was easier to refactor the code to use ElementTree than to try to get libxml2 python bindings. In the process, gl_item_factory class was refactored so that there is one method for each kind of object to be created, as it simplifies things substantially. I confirmed that precisely the same output is generated for GL/GLX/GLES. v2: Remove m4/ax_python_module.m4 as suggested by Matt Turner. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-26 13:51:32 +00:00
José Fonseca	b761dfa0c3	mapi/glapi: Remove glX_doc.py. As suggested by Ian Romanick, given it's no longer used. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-26 12:32:57 +00:00
Christian König	d117ddbe31	st/mesa: fix sampler view handling with shared textures v4 Release the references to the sampler views before destroying the pipe context. v2: remove TODO and unrelated change v3: move to st_texture.[ch], rename callback, add comment v4: fix rebase mess up and add further cleanups Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>	2014-03-26 12:06:43 +01:00
Roland Scheidegger	3b421daf32	gallivm: fix no-op n:n lp_build_resize() This can get called in some circumstances if both src type and dst type have same width (seen with float32->unorm32). While this particular case was bogus anyway let's just fix that as it can work trivially (due to the way it was called it actually worked anyway apart from the assert). Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-26 01:44:23 +01:00
Kevin Rogovin	fe635d51ff	i965: For fast color clears, only check the color of live channels. When deciding if a clear color is suitable for fast clear, take into account if a color channel is active in the buffer format. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-25 15:34:28 -07:00
Kenneth Graunke	ee4484be3d	i965: Set Broadwell MOCS values everywhere it's possible. This patch introduces two pre-canned MOCS values: BDW_MOCS_WB (write-back, all caches) and BDW_MOCS_WT (write-through, all caches). We use write-through caching for render targets, and write-back for all other data. (At least on Haswell, I believe write-back LLC/eLLC didn't work for scan-out buffers, while write-through did.) No performance analysis has been done on the impact of this patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-03-25 15:14:08 -07:00
Kenneth Graunke	1afe335925	mesa: In core profile, refuse to draw unless a VAO is bound. Core profile requires a non-default VAO to be bound. Currently, calls to glVertexAttribPointer raise INVALID_OPERATION unless a VAO is bound, and we never actually get any vertex data set. Trying to draw without any vertex data can only cause problems. In i965, it causes a crash. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76400 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org	2014-03-25 15:13:49 -07:00
Ilia Mirkin	29bcc73d4d	Revert "build: llvm libs may not be in system search path, add rpath" This reverts commit `d9b983519c`. Unfortunately it seems like rpath is evaluated before LD_LIBRARY_PATH, so this breaks e.g. steam, as well as any other user of that env var, if the llvm path happens to be where other libs also reside. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76082 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-25 17:18:46 -04:00
Chris Forbes	4002daf095	Revert "mesa: Fix format matching checks for GL_INTENSITY* internalformats." This reverts commit `40d7b51953`.	2014-03-26 10:06:10 +13:00
Brian Paul	64278b36d6	mesa: move GLbitfield any_valid_stages declaration before code To fix MSVC build.	2014-03-25 13:33:10 -06:00
Ian Romanick	c4cec40883	glsl: Clean up "unused parameter" warnings ../../src/glsl/ir_constant_expression.cpp:486:1: warning: unused parameter 'variable_context' [-Wunused-parameter] ../../src/glsl/ir_constant_expression.cpp:1633:1: warning: unused parameter 'variable_context' [-Wunused-parameter] ../../src/glsl/ir_constant_expression.cpp:1752:1: warning: unused parameter 'variable_context' [-Wunused-parameter] ../../src/glsl/ir_constant_expression.cpp:1761:1: warning: unused parameter 'variable_context' [-Wunused-parameter] ../../src/glsl/ir_constant_expression.cpp:1769:1: warning: unused parameter 'variable_context' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-25 12:09:36 -07:00
Ian Romanick	f3ab987b70	glsl: Minor clean ups in constant_referenced These could probably be squashed into one of the previous commits. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-25 12:09:36 -07:00
Ian Romanick	6429d6276d	glsl: Remove ir_dereference::constant_referenced All of the functionality is implemented in a private function in the one file where it is used. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-25 12:09:36 -07:00
Ian Romanick	bb0d6db974	glsl: Fold implementation of ir_dereference_array::constant_referenced into wrapper Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-25 12:09:36 -07:00
Ian Romanick	35bf94f901	glsl: Fold implementation of ir_dereference_record::constant_referenced into wrapper Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-25 12:09:36 -07:00
Ian Romanick	b66319b006	glsl: Fold implementation of ir_dereference_variable::constant_referenced into wrapper Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-25 12:09:36 -07:00
Ian Romanick	14f0faacb6	glsl: Add wrapper function that calls ir_dereference::constant_referenced Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-25 12:09:36 -07:00
Ian Romanick	c11c7e4f01	glsl: Group all of the constant_referenced functions together Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-25 12:09:36 -07:00
Gwenole Beauchesne	3bd65dc8a1	i965: fix dma_buf import with non-zero offset. Fix eglCreateImage() from a packed dma_buf surface with a non-zero offset to pixel data. In particular, this fixes support for planar YUV surfaces when they are individually mapped on a per-plane basis, i.e. when OES_EGL_image_external is not used and the user application wants to use its own shader code for composition, or for processing on individual planes (OCL). Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 18:56:41 +01:00
Gregory Hainaut	1c29068074	mesa/sso: Implement ValidateProgramPipeline Implementation note: I don't use context for ralloc (don't know how). The check on PROGRAM_SEPARABLE flags is also done when the pipeline isn't bound. It doesn't make any sense in a DSA style API. Maybe we could replace _mesa_validate_program by _mesa_validate_program_pipeline. For example we could recreate a dummy pipeline object. However, the new function also checks the TEXTURE_IMAGE_UNIT number; I'm not sure of the impact. V2: Fix memory leak with ralloc_strdup Formatting improvement V3 (idr): * Actually fix the leak of the InfoLog. :) * Directly generate logs into gl_pipeline_object::InfoLog via ralloc_asprintf instead of using a temporary buffer. * Split out from previous uber patch. * Change spec references to include section numbers, etc. * Fix a bug in checking that a different program isn't active in a stage between two stages that have the same program. Specifically, if (pipe->CurrentVertexProgram->Name == pipe->CurrentGeometryProgram->Name && pipe->CurrentGeometryProgram->Name != pipe->CurrentVertexProgram->Name) should have been if (pipe->CurrentVertexProgram->Name == pipe->CurrentFragmentProgram->Name && pipe->CurrentGeometryProgram->Name != pipe->CurrentVertexProgram->Name) v4 (idr): Rework to use CurrentProgram array in loops. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:26 -07:00
Gregory Hainaut	95426b28ac	mesa/sso: Add _mesa_sampler_uniforms_pipeline_are_valid This is much like _mesa_sampler_uniforms_are_valid, but it operates across an entire pipeline object. This function differs from _mesa_sampler_uniforms_are_valid in that it directly creates the gl_pipeline_object::InfoLog instead of writing to some temporary buffer. This was originally included in another patch, but it was split out by Ian Romanick. v2 (idr): Fix the loop bounds. shProg isn't an array, so ARRAY_SIZE(shProg) was 1, so only the vertex program was validated. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:26 -07:00
Gregory Hainaut	aa46ad26b1	mesa/sso: Add gl_pipeline_object::InfoLog support V2 (idr): * Keep the behavior of other info logs in Mesa: an empty info log reports a GL_INFO_LOG_LENGTH of zero. * Use a NULL pointer to denote an empty info log. * Split out from previous uber patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:26 -07:00
Gregory Hainaut	658eaa3229	mesa/sso: Implement GL_PROGRAM_PIPELINE_BINDING for glGet Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:26 -07:00
Gregory Hainaut	9e9fac4714	mesa/sso: Implement _mesa_BindProgramPipeline Tests that become green in piglit: The updated ext_transform_feedback-api-errors:useprogstage_noactive useprogstage_active bind_pipeline arb_separate_shader_object-GetProgramPipelineiv arb_separate_shader_object-IsProgramPipeline For the moment I reuse Driver.UseProgram, but I guess it would be better to create a UseProgramStages function. Opinions are welcome. V2: formatting & rename V3 (idr): * Change spec references to core OpenGL versions instead of issues in the extension spec. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:25 -07:00
Gregory Hainaut	78578b7599	mesa/sso: Implement _mesa_UseProgramStages Now arb_separate_shader_object-GetProgramPipelineiv should pass. V3 (idr): * Change spec references to core OpenGL versions instead of issues in the extension spec. * Split out from previous uber patch. v4 (idr): Use _mesa_has_geometry_shaders in _mesa_UseProgramStages to detect availability of geometry shaders. v5 (idr): Whitespace cleanup, use _mesa_lookup_shader_program_err instead of open-coding it again, and update some comments at the end of _mesa_UseProgramStages. All suggested by Eric. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:25 -07:00
Gregory Hainaut	4caa9db71c	mesa/sso: Add gl_pipeline_object parameter to _mesa_use_shader_program Extend use_shader_program to support a different target. This allows the function to be reused to update the pipeline state. Note: I bypass the flush when the target isn't current. Maybe it would be better to create a new UseProgramStages driver function. This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:25 -07:00
Gregory Hainaut	de4f85f52d	meta/sso: Update meta to save and restore SSO state. Save and restore the _Shader/Pipeline binding point. Rationale: we don't want any conflict when the program is unattached. V2: formatting improvement V3 (idr): * Build fix. The original patch added calls to _mesa_use_shader_program with 4 parameters, but the fourth parameter isn't added to that function until a much later patch. Just drop that parameter for now. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:25 -07:00
Gregory Hainaut	c03477050a	mesa/sso: rename Shader to the pointer _Shader Basically a sed, except for shaderapi.c and get.c. get.c => GL_CURRENT_PROGRAM always refers to the "old" UseProgram behavior shaderapi.c => the old API still updates the Shader object directly V2: formatting improvement V3 (idr): * Rebase fixes after a block of code was moved from ir_to_mesa.cpp to shaderapi.c. * Trivial reformatting. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:25 -07:00
Gregory Hainaut	b2bddaf7a0	mesa/sso: replace Shader binding point with _Shader To avoid NULL pointer checks, a default pipeline object is installed in _Shader when no program is current. The spec says that UseProgram/UseShaderProgramEXT/ActiveProgramEXT have a higher priority than the pipeline object. When the default program is uninstalled, the pipeline is used if any was bound. Note: A careful rename needs to be done now... V2: formatting improvement V3 (idr): * Build fix. The original patch added calls to _mesa_use_shader_program with 4 parameters, but the fourth parameter isn't added to that function until a much later patch. Just drop that parameter for now. * Trivial reformatting. * Updated comment of gl_context::_Shader v4 (idr): Reformat spec quotations to look like spec quotations. Update comments describing what gl_context::_Shader can point to. Both suggested by Eric. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:25 -07:00
José Fonseca	b995a010e6	llvmpipe: Simplify vertex and geometry shaders. Eliminate lp_vertex_shader, as it added nothing over draw_vertex_shader. Simplify lp_geometry_shader, as most of the incoming state is unneeded. (We could also just use draw_geometry_shader if we were willing to peek inside the structure.) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2014-03-25 12:54:39 +00:00
José Fonseca	ee89432a47	draw: Duplicate TGSI tokens in draw_pipe_pstipple module. As done in draw_pipe_aaline and draw_pipe_aapoint modules. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>	2014-03-25 12:54:39 +00:00
Alexander von Gluck IV	7683fce878	haiku: Fix build through scons corrections and viewport fixes * Add HAVE_PTHREAD, we do have pthread support wrappers now for non-native Haiku threaded applications. * Viewport changed behavior recently breaking the build. We fix this by looking at the gl_context ViewportArray (Thanks Brian for the idea) Acked-by: Brian Paul <brianp@vmware.com>	2014-03-24 19:01:53 -05:00
Kenneth Graunke	eccad18bd8	i965: For color clears, only disable writes to components that exist. The SIMD16 replicated FB write message only works if we don't need the color calculator to mask our framebuffer writes. Previously, we bailed on it if color_mask wasn't <true, true, true, true>. However, this was needlessly strict for formats with fewer than four components - only the components that actually exist matter. WebGL Aquarium attempts to clear a BGRX texture with the ColorMask set to <true, true, true, false>. This will work perfectly fine with the replicated data message; we just bailed unnecessarily. Improves performance of WebGL Aquarium on Iris Pro (at 1920x1080) by about 50%, and Bay Trail (at 1366x768) by over 70% (using Chrome 24). v2: Use _mesa_format_has_color_component() to properly handle ALPHA formats (and generally be less fragile). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dylan Baker <baker.dylan.c@gmail.com>	2014-03-24 14:46:05 -07:00
Kenneth Graunke	630bf288de	mesa: Skip clearing color buffers when color writes are disabled. WebGL Aquarium in Chrome 24 actually hits this. v2: Move to core Mesa (wisely suggested by Ian); only consider components which actually exist. v3: Use _mesa_format_has_color_component to determine whether components actually exist, fixing alpha format handling. v4: Add a comment, as requested by Brian. No actual code changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dylan Baker <baker.dylan.c@gmail.com>	2014-03-24 14:45:03 -07:00
Kenneth Graunke	92234b1b2a	mesa: Introduce a _mesa_format_has_color_component() helper. When considering color write masks, we often want to know whether an RGBA component actually contains any meaningful data. This function provides an easy way to answer that question, and handles luminance, intensity, and alpha formats correctly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dylan Baker <baker.dylan.c@gmail.com>	2014-03-24 14:38:51 -07:00
Eric Anholt	0d99aef6c8	i965: Fix compiler warning about signed/unsigned. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-24 11:16:38 -07:00
Eric Anholt	4545ec1691	i965/gen8: Change the winsys MSAA blits from blorp to meta. This gets us equivalent code paths on BDW and pre-BDW, except for stencil (where we don't have MSAA stencil resolve code yet) Improves MSAA-forced citybench by 7.94496% +/- 2.38429% (n=16). Reduces DRI2 MSAA glxgears performance by -12.3559% +/- 1.52845% (n=9). v2: Move the new meta code to brw_meta_updownsample.c, name it brw_meta_updownsample(), add a comment about intel_rb_storage_first_mt_slice(), and rename that function and move the RB generation into it (review ideas by Ken). v3: Fix 2 src vs dst pasteos in previous change. v4: Skip this path pre-gen8 for now, until we can analyze the glxgears performance delta some more. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-24 11:15:05 -07:00
Eric Anholt	7ccb26fdec	mesa: Stop skipping the FinishRenderTexture calls for winsys FBOs. Now that BindRenderbufferTexImage() is a thing that drivers can do, winsys FBOs can have NeedsFinishRenderTexture set. v2: Keep the short-circuit for non-BindRenderbufferTexImage() drivers (review by Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-24 11:15:04 -07:00
Eric Anholt	dd4b226184	i965: Skip reallocating the private MSAA miptree, unless it's resized. Even if the singlesample_mt got reopened from DRI due to pageflipping/buffer swapping, our private miptree shouldn't need any changes. Improves performance of a little swapbuffers-loving microbenchmark with MSAA forced on, by 1.2371% +/- 0.624802% (n=102) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-24 11:15:04 -07:00
Eric Anholt	44e944c87c	i965: Simplify the no-reopening-the-winsys-buffer tests. The formatting was weird, and the tests were duplicated, and it is guaranteed that mt->region exists. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-24 11:15:04 -07:00
Eric Anholt	e07e7e9f89	i965: Don't forget to free the old singlesample_mt. Fixes a memory leak with MSAA winsys buffers since my move of singlesample_mt to the rb in `4e0924c5de` Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-24 11:15:04 -07:00
Eric Anholt	41033509f2	i965: Add an env var for forcing window system MSAA. Sometimes it would be nice to benchmark some app with MSAA versus not, but it doesn't offer the controls you want. Just provide a handy knob to force the issue. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-24 11:15:04 -07:00
Matt Turner	764e25d79d	i965/vec4: Eliminate dead writes to the flag register. For each write, search previous instructions for unread writes to the flag register and remove them. Note that this will not eliminate the last unread write. total instructions in shared programs: 788074 -> 788004 (-0.01%) instructions in affected programs: 4930 -> 4860 (-1.42%) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:26 -07:00
Matt Turner	9cd51bb0c4	i965/vec4: Eliminate writes that are never read. With an awful O(n^2) algorithm that searches previous instructions for dead writes. total instructions in shared programs: 805582 -> 788074 (-2.17%) instructions in affected programs: 144561 -> 127053 (-12.11%) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:26 -07:00
Matt Turner	1b8f143a23	i965/vec4: Factor code out of DCE into a separate function. Will be reused in the next commit. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:26 -07:00
Matt Turner	9630ba6c6e	i965/vec4: Let dead code eliminate trim dead channels. That is, modify mad dst, a, b, c to be mad dst.xyz, a, b, c if dst.w is never read. total instructions in shared programs: 811869 -> 805582 (-0.77%) instructions in affected programs: 168287 -> 162000 (-3.74%) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:26 -07:00
Matt Turner	dc0f5099fa	i965/vec4: Track live ranges per-channel, not per vgrf. Will be squashed with the next patch. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:26 -07:00
Matt Turner	89ccd11eeb	i965/vec4: Don't dead code eliminate instructions writing the flag. A future patch adds support for removing dead writes to the flag register. This patch simplifies the logic until then. total instructions in shared programs: 811813 -> 811869 (0.01%) instructions in affected programs: 3378 -> 3434 (1.66%) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:26 -07:00
Matt Turner	3a12f50f9c	i965/vec4: Preparatory clean up of dead_code_eliminate(). Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:26 -07:00
Matt Turner	10dd6eca89	i965/vec4: Add is_null() method to dst_reg. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:25 -07:00
Matt Turner	0884ce8f42	i965/vec4: Print the predicate in dump_instructions(). Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:25 -07:00
Matt Turner	a6367dfc15	i965/vec4: Rename depends_on_flags() to reads_flag(). To be consistent with the fs backend. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:25 -07:00
Matt Turner	de4692f56c	i965/vec4: Add and use vec4_instruction::writes_flag(). To be consistent with the fs backend. Also the instruction scheduler incorrectly considered SEL with a conditional modifier to read the flag register. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:25 -07:00
Matt Turner	b0d3205c2a	i965/vec4: Add missing doxygen close brace. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:25 -07:00
Chris Forbes	a419a1c565	mesa: Generate FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT earlier The ARB_framebuffer_object spec lists this case before the FRAMEBUFFER_INCOMPLETE_DRAW_BUFFER and FRAMEBUFFER_INCOMPLETE_READ_BUFFER cases. Fixes two broken cases in piglit's fbo-incomplete test, if ARB_ES2_compatibility is not advertised. (If it is, this is masked because the FRAMEBUFFER_INCOMPLETE_DRAW_BUFFER / FRAMEBUFFER_INCOMPLETE_READ_BUFFER cases are removed by that extension) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-25 06:49:25 +13:00
Chris Forbes	40d7b51953	mesa: Fix format matching checks for GL_INTENSITY* internalformats. GL_INTENSITY has never been valid as a pixel format -- to get the memcpy pack/unpack paths, the app needs to specify GL_RED as the pixel format (or GL_RED_INTEGER for the integer formats). Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-25 06:47:50 +13:00
Christian König	92e543c45d	st/mesa: recreate sampler view on context change v3 With shared glx contexts it is possible that a texture is created and used in one context and then used in another one, resulting in incorrect sampler view usage. v2: avoid template copy v3: add XXX comment Signed-off-by: Christian König <christian.koenig@amd.com> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-24 17:50:38 +01:00
Kenneth Graunke	eabfadf4af	i965: Report the type of color clear in INTEL_DEBUG=blorp. It's useful to know whether a clear is fast (MCS-based), using the SIMD16 repdata message, or slow. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-03-23 00:32:53 -07:00
Marek Olšák	011569b5b7	radeonsi: disable fast color clear for 1D-tiled surfaces on CIK This will be re-enabled once my kernel fix lands.	2014-03-22 18:44:58 +01:00
Kenneth Graunke	4c79f088c0	Revert "i965: For color clears, only disable writes to components that exist." This reverts commit `2919c3fdb4`. For formats like BGRX, looping through 0..num_components works fine. But for formats like XRGB, we'd check the color mask for X and fail to check it for B.	2014-03-21 17:03:20 -07:00
Kenneth Graunke	2919c3fdb4	i965: For color clears, only disable writes to components that exist. The SIMD16 replicated FB write message only works if we don't need the color calculator to mask our framebuffer writes. Previously, we bailed on it if color_mask wasn't <true, true, true, true>. However, this was needlessly strict for formats with fewer than four components - only the components that actually exist matter. WebGL Aquarium attempts to clear a BGRX texture with the ColorMask set to <true, true, true, false>. This will work perfectly fine with the replicated data message; we just bailed unnecessarily. Improves performance of WebGL Aquarium on Iris Pro (at 1920x1080) by about 40%, and Bay Trail (at 1366x768) by over 70% (using Chrome 24). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Tested-by: Dylan Baker <baker.dylan.c@gmail.com>	2014-03-21 15:35:08 -07:00
Kenneth Graunke	a63db538ad	i965: Print number of multisamples in INTEL_DEBUG=blorp output. This lets us distinguish MSAA resolves from other ordinary blits. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-03-21 15:34:59 -07:00
Kenneth Graunke	9834058a91	i965: Drop BLT TexSubImage Y-tiling restriction on Gen6+. Currently, we don't use this path on Sandybridge because we suspect other paths will be faster. But we potentially could. If we do, we should allow it to support Y-tiled BLTs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-03-21 15:31:45 -07:00
Chris Forbes	351e13c5ad	i965: Enable ARB_vertex_type_10f_11f_11f_rev for Gen4/5 also. Tested on ILK and CTG (with the GL3isms taken out of the piglits). Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-22 09:19:55 +13:00
Tom Stellard	8d8d0cb09e	clover: Fix typo in validate_object() Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-03-21 19:12:12 +01:00
Roland Scheidegger	9477d8c862	llvmpipe: add support for b5g6r5_srgb The conversion code for srgb was tuned for n x 4x8bit AoS -> 4 x nxfloat SoA (and vice versa), fix this to handle also 16bit 565-style srgb formats. Still not really all that generic, things like r10g10b10a2_srgb or r4g4b4a4_srgb wouldn't work (the latter trivial to fix, the former would not require more work to not crash but near certainly need some higher precision calculation) but not needed right now. The code is not fully optimized for this (could use more direct calculation instead of expanding to 8-bit range first) but should be good enough. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-21 17:23:38 +01:00
Roland Scheidegger	2aa77f2777	gallium: add b5g6r5 srgb format GL generally doesn't seem to allow srgb formats with less (or more) than 8 bit for the rgb channels, though some hw could easily do it (typically for formats with up to 10 bits for the rgb channels, at least for formats with less than 8 bits support is likely widespread even). While it may be true there aren't really any benefits for such formats, we need it for d3d, though luckily only for b5g6r5_srgb it seems. So add this format along with the util code for conversion - since that util code is heavily tuned for 8bit srgb this isn't really all that well optimized and rounding doesn't seem right but at least it should give some halfway meaningful results. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-21 17:23:38 +01:00
Ilia Mirkin	19ba573a57	nvc0/ir: move sample id to second source arg to fix sampler2DMS The nvc0 texfetch instruction expects the sample id to be in the second source (usually used for the offset) rather than as part of the texture coordinate. This fixes all the sampler2DMS/Array tests on nvc0. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-03-20 20:47:47 -04:00
Marek Olšák	e5f6b6d0fe	st/mesa: drop the lowering of quad strips to triangle strips This fallback to triangle strips is silly and should be done in drivers if they need it. This should fix the case when quad strips are used with flatshading that is enabled by the "flat" GLSL varying modifier. It also fixes primitive restart for quad strips. This fixes piglit: NV_primitive_restart/primitive-restart-draw-mode-quad_strip Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-21 00:50:53 +01:00
Marek Olšák	2706448a10	gallium/u_gen_mipmap: remove the software fallback The last changes to it are from 2008 and 2009. It doesn't support most texture formats and some texture targets. Nobody can possibly be using this. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-21 00:50:53 +01:00
Marek Olšák	db722bdcab	st/mesa: fix generating mipmaps for cube arrays Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-21 00:50:53 +01:00
Marek Olšák	91df26842f	mesa: fix software fallback for generating mipmaps for 3D textures It didn't use the driver-provided src/dstRowStride at all. This was broken for the cases when stride != width*bpp. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-21 00:50:53 +01:00
Marek Olšák	78c60d1b63	mesa: fix software fallback for generating mipmaps for cube arrays Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-21 00:50:53 +01:00
Marek Olšák	185ad78ffd	mesa: allow generating mipmaps for cube arrays Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-21 00:50:53 +01:00
Marek Olšák	55cf320ed8	mesa: fix texture border handling for cube arrays Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-21 00:50:53 +01:00
Marek Olšák	54690a5f3b	r600g: use more appropriate names for async DMA functions _dma_copy calls either _dma_copy_buffer or *_dma_copy_tile. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-20 19:03:40 +01:00
Marek Olšák	6c487ff3bd	r600g: deobfuscate async DMA code Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-20 18:56:11 +01:00
Marek Olšák	2c703ee8ad	r600g: don't flush the gfx IB explicitly before doing DMA It's flushed by calling r600_context_bo_reloc. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-20 18:41:18 +01:00
Marek Olšák	e914d0052f	winsys/radeon: only add duplicate relocations for DMA if VM isn't supported Also rewrite the comment for it to be readable and reorder the code. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-03-20 18:41:17 +01:00
Niels Ole Salscheider	71254732db	radeonsi: Implement DMA blit This code is a slightly modified version of evergreen_dma_blit (and evergreen_dma_copy as well as evergreen_dma_copy_tile). It would be nice to share some of the code in the long term. I have reused some "cik"-prefixed functions that also return the right value for SI. I am not sure if they should be renamed. v2: Marek> removed gfx.flush in si_dma_copy_tile Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-03-20 17:21:16 +01:00
Niels Ole Salscheider	acf55e7325	radeon: Move r600_need_dma_space to common code Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-03-20 17:21:16 +01:00
Richard Sandiford	f4b3430a36	llvmpipe: Tighten check for alpha-only formats The AoS version of lp_build_blend_factor was assuming that if the first channel was alpha, there were no rgb components. Fixes glean/blendFunc on System z. No piglit regressions on x86_64. The shortcut is still used in tests like spec/ARB_framebuffer_object/fbo-alpha. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>	2014-03-20 16:50:40 +01:00
Jonathan Gray	8044fd6769	nouveau: don't assume libdrm include prefix drm headers may be installed in a different directory Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-20 08:32:12 -04:00
Jonathan Gray	8fbc9d9b6f	nouveau: use DLOPEN_LIBS instead of -ldl libdl does not exist on many platforms which have dlopen in libc. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-20 08:32:12 -04:00
Brian Paul	eaf9affa5e	c11/threads: don't include assert.h if the assert macro is already defined In the gallium code, the assert() macro could come from either the system's assert.h file (via c11/threads.h) or from gallium's u_debug.h. It looks like all known assert.h files unconditionally #undef assert before defining their own version. So the assert you get depends on whether threads.h or u_debug.h was included last. In the gallium code we really want to use the assert() from u_debug.h (it behaves better on Windows). In gallium, c11/threads.h is only included after u_debug.h in the os_thread.h wrapper. So adding an #ifndef assert test in the threads*.h files avoids using the system's assert(). Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-19 17:13:31 -06:00
Ilia Mirkin	e58071355e	nouveau: there may not have been a texture if the fbo was incomplete Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>	2014-03-19 18:20:29 -04:00
Ilia Mirkin	b676df9abf	nouveau: add forgotten GL_COMPRESSED_INTENSITY to texture format list Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>	2014-03-19 18:17:40 -04:00
Ilia Mirkin	18690995a6	mesa/main: condition GL_DEPTH_STENCIL on ARB_depth_texture EXT_packed_depth_stencil is supported by all drivers, but ARB_depth_texture isn't (notably nouveau_vieux). This should avoid passing unexpected values down to ChooseTextureFormat. The EXT_packed_depth_stencil spec does not make any explicit references to requiring ARB_depth_texture in order to allow textures with that format, however if there is no dependency, ARB_depth_texture would be practically implied by the extension. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Note for 10.0 backport: This will produce a conflict, the solution is to move the surrounding if as well.	2014-03-19 18:17:40 -04:00
Ilia Mirkin	51989817e6	loader: add special logic to distinguish nouveau from nouveau_vieux There are a lot of different pci ids supported by nouveau, and more are added all the time. The relevant distinguisher between drivers is the chipset id. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-03-19 18:17:40 -04:00
Matt Turner	c049dd4396	glsl: Allow dot() on scalars, and throw out dotlike(). In all uses of dotlike() we're writing generic code that operates on 1-4 component vectors. That our IR requires ir_binop_dot expressions' operands to be 2+ component vectors is an implementation detail that's not important when implementing built-in functions with dot(), which is defined for scalar floats in GLSL. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 23:20:29 -07:00
Matt Turner	6cbc64c3cb	glsl: Optimize pow(x, 2) into x * x. Cuts two instructions out of SynMark's Gl32VSInstancing benchmark. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 23:20:29 -07:00
Matt Turner	9a9eaaa79a	glsl: Match whitespace changes from previous patch. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-18 23:20:29 -07:00
Matt Turner	7988b4804f	glsl: Expose pack/unpack built-ins for ARB_gpu_shader5. ARB_gpu_shader5 and ES 3.0 expose different subsets of ARB_shading_language_packing. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-18 23:20:29 -07:00
Eric Anholt	651b8baa82	i965: Drop some more dead code from the old CACHED_BATCH feature. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-18 14:45:09 -07:00
Eric Anholt	512c88f826	i965: Drop special case for edgeflag thanks to Marek's change to core. As of `780ce576bb`, we end up with R8_SSCALED anyway. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-18 14:45:09 -07:00
Brian Paul	f4435da940	mesa: include stdbool.h in register_allocate.h to fix build https://bugs.freedesktop.org/show_bug.cgi?id=76331	2014-03-18 13:28:17 -06:00
Ian Romanick	f74cf5f80e	i965: Enable EWA anisotropic filtering algorithm Volume 4, part 1 of the Ivybridge PRM says, "Generally, the EWA approximation algorithm results in higher image quality than the legacy algorithm." Using a classic anisotropic filtering "tunnel" demo, it appears that there is no anisotropic filtering on IVB without this bit set. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-18 10:56:38 -07:00
Kenneth Graunke	dd2e5d3999	i965: Actually initialize simd16_unsupported and no16_msg. I meant to include these fixes in v3 of commit `de7ad2c88f`, but accidentally pushed a previous version. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-18 10:50:48 -07:00
Kenneth Graunke	91f4528da6	i965/upload: Refactor open-coded ALIGN-like computations. Sadly, we can't use actual ALIGN(), since that only supports power-of-two values for the alignment parameter. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-18 10:39:04 -07:00
Kenneth Graunke	b8b4e280b4	i965: Fix indentation in brw_upload_indices(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-18 10:38:48 -07:00
Kenneth Graunke	051edcc144	i965: Consolidate code for setting brw->ib.start_vertex_offset. This was set identically in three places. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-18 10:38:44 -07:00
Kenneth Graunke	7a0fd3ca1d	i965: Allocate register sets at screen creation, not context creation. Register sets depend on the particular hardware generation, but don't depend on anything in the actual OpenGL context. Computing them is fairly expensive, and they take up a large amount of memory. Putting them in the screen allows us to compute/allocate them once for all contexts, saving both time and space. Improves the performance of a context creation/destruction microbenchmark by about 3x on my Haswell i7-4750HQ. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:35:53 -07:00
Kenneth Graunke	b3e4b769dd	i965: Allocate the screen using ralloc rather than calloc. This will allow us to use the screen as a memory context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:31:12 -07:00
Eric Anholt	41097db91b	ra: Convert another bool array to bitsets. This one saves about 2MB peak allocation in glsl-fs-algebraic-add-add-1, with no performance difference on timing short shader-db runs (n=9/10, warmup outlier removed). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-18 10:20:28 -07:00
Kenneth Graunke	da1cce2d68	ra: Use a bitset for storing which registers belong to a class. This should use 1/8 the memory. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Christoph Brill <egore911@gmail.com>	2014-03-18 10:15:24 -07:00
Kenneth Graunke	8d856c3937	ra: Create a reg_belongs_to_class() helper function. This is a little easier to read. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Christoph Brill <egore911@gmail.com>	2014-03-18 10:15:23 -07:00
Kenneth Graunke	786a647245	ra: Use bool instead of GLboolean. This isn't the GL API, so there's no reason to use GLboolean. Using bool is safer: any non-zero value is treated as "true". When converting a value to a GLboolean, all but the low byte is discarded, which means that values like 256 will be incorrectly rendered as false. Done via the following vim commands: :%s/GLboolean/bool/g :%s/GL_TRUE/true/g :%s/GL_FALSE/false/g and one line of manual whitespace tidying. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-18 10:15:18 -07:00
Kenneth Graunke	de7ad2c88f	i965: Accurately bail on SIMD16 compiles. Ideally, we'd like to never even attempt the SIMD16 compile if we could know ahead of time that it won't succeed---it's purely a waste of time. This is especially important for state-based recompiles, which happen at draw time. The fragment shader compiler has a number of checks like: if (dispatch_width == 16) fail("...some reason..."); This patch introduces a new no16() function which replaces the above pattern. In the SIMD8 compile, it sets a "SIMD16 will never work" flag. Then, brw_wm_fs_emit can check that flag, skip the SIMD16 compile, and issue a helpful performance warning if INTEL_DEBUG=perf is set. (In SIMD16 mode, no16() calls fail(), for safety's sake.) The great part is that this is not a heuristic---if the flag is set, we know with 100% certainty that the SIMD16 compile would fail. (It might fail anyway if we run out of registers, but it's always worth trying.) v2: Fix missing va_end in early-return case (caught by Ilia Mirkin). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v1] Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1] Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:11:38 -07:00
Kenneth Graunke	b207e88b25	i965/fs: Support pull parameters in SIMD16 mode. This is just a matter of reusing the pull/push constant information set up by the SIMD8 compile. This gains us 78 SIMD16 programs in shader-db. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:11:36 -07:00
Kenneth Graunke	229319e0f0	i965/fs: Use a single instance of the pull_constant_loc[] array. Now that we don't renumber uniform registers, assign_constant_locations and move_uniform_array_access_to_pull_constants use the same names. So, they can share a single copy of the pull_constant_loc[] array. This simplifies the code considerably. assign_constant_locations() doesn't need to walk through pull_params[] to rediscover reladdr demotions; it just has that information in pull_constant_loc[]. We also only need to rewrite the instruction stream once, instead of twice. Even better, we now have a single array describing the layout of all pull parameters, which we can pass to the SIMD16 program. This actually hurts a few shaders in Serious Sam 3, and one in KWin: total instructions in shared programs: 1841957 -> 1842035 (0.00%) instructions in affected programs: 1165 -> 1243 (6.70%) Comparing dump_instructions() before and after the pull constant transformations with and without this patch, it appears that there is a uniform array with variable indexing (reladdr) and constant indexing (of array element 0). Previously, we uploaded array element 0 as both a pull constant (for reladdr) /and/ a push constant. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:11:32 -07:00
Kenneth Graunke	542f2e47f2	i965/fs: Don't renumber UNIFORM registers. Previously, remove_dead_constants() would renumber the UNIFORM registers to be sequential starting from zero, and the resulting register number would be used directly as an index into the params[] array. This renumbering made it difficult to collect and save information about pull constant locations, since setup_pull_constants() and move_uniform_array_access_to_pull_constants() used different names. This patch generalizes setup_pull_constants() to decide whether each uniform register should be a pull constant, push constant, or neither (because it's unused). Then, it stores mappings from UNIFORM register numbers to params[] or pull_params[] indices in the push_constant_loc and pull_constant_loc arrays. (We already did this for pull constants.) Then, assign_curb_setup() just needs to consult the push_constant_loc array to get the real index into the params[] array. This effectively folds all the remove_dead_constants() functionality into assign_constant_locations(), while being less irritable to work with. v2: Add assert(remapped <= i), requested by Topi. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:11:29 -07:00
Kenneth Graunke	d9f339eccd	i965/fs: Split pull parameter decision making from mechanical demoting. move_uniform_array_access_to_pull_constants() and setup_pull_constants() both have two parts: 1. Decide which UNIFORM registers to demote to pull constants, and assign locations. 2. Mechanically rewrite the instruction stream to pull the uniform value into a temporary VGRF and use that, eliminating the UNIFORM file access. In order to support pull constants in SIMD16 mode, we will need to make decisions exactly once, but rewrite both instruction streams. Separating these two tasks will make this easier. This patch introduces a new helper, demote_pull_constants(), which takes care of rewriting the instruction stream, in both cases. For the moment, a single invocation of demote_pull_constants can't safely handle both reladdr and non-reladdr tasks, since the two callers still use different names for uniforms due to remove_dead_constants() remapping of things. So, we get an ugly boolean parameter saying which to do. This will go away. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:11:26 -07:00
Kenneth Graunke	2163e0fd5a	i965/fs: Record pull constant locations for all array elements. When demoting a variably indexed uniform array to pull constants, we only recorded the location for the base of the array (element 0). Recording locations for all array elements is a trivial amount of code and will make subsequent refactoring easier. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:11:24 -07:00
Kenneth Graunke	7c7627781f	i965/fs: Save push constant location information. Previously, both move_uniform_array_access_to_pull_constants() and setup_pull_constants() maintained stack-local arrays with this information. Storing this information will allow it to be used from multiple functions, allowing us to split and move code around. We'll also eventually want to pass pull constant location information to the SIMD16 compile. Saving this information will help us do that. Unfortunately, the two functions cannot share the contents of the array just yet. remove_dead_constants() renumbers all the UNIFORM registers to be contiguous starting at zero, so the two functions talk about uniforms using different names. We can't even remap them, since move_uniform_array_access_to_pull_constants() deletes UNIFORM registers that are only accessed with reladdr, so remove_dead_constants can't even see them. This situation will improve in the next few patches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:11:21 -07:00
Kenneth Graunke	de77efde91	i965/fs: Delete dead code to fail compiles with SIMD16 pull parameters. The SIMD8 compile will determine whether pull parameters are necessary. If so, it will set prog_data->nr_pull_params to a value greater than 0. brw_wm_fs_emit checks if nr_pull_params > 0 and skips the SIMD16 compile altogether. So, this code should never occur. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:11:08 -07:00
Brian Paul	63e7b51912	gallium/docs: update SLT, SGE, SFL, STR opcode docs To emphasize that the result is floating point 1.0 or 0.0, to match other opcodes like SLE and SEQ. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-18 08:03:27 -06:00
Charmaine Lee	81f342ce64	glx: Fix incorrect pdp assignment in dri2_bind_context(). pdp should be set to dpyPriv->dri2Display. Fixes blank frame failure running glretrace ClearView. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-18 08:03:27 -06:00
Maarten Lankhorst	8fe888fafd	nvc0: Handle user mapped vertex buffer for edgeflag Handle mapping edgeflag data similar to the code around it. This fixes a crash in piglit test gl-2.0-edgeflag. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2014-03-18 14:51:06 +01:00
Francisco Jerez	d70ad1a4f9	clover: Fix region size error checking in some buffer transfer commands. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-03-18 12:14:46 +01:00
Ilia Mirkin	c8309cde30	nv50/ir/gk110: add postfactor support for fmul Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:55 -04:00
Ilia Mirkin	d8e0d1e882	nv50/ir/gk110: set not modifier on first source of logic op Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:55 -04:00
Ilia Mirkin	b56e50b8af	nv50/ir/gk110: use shl/shr instead of lshf/rshf so that c[] is supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:55 -04:00
Ilia Mirkin	34bf5e27c6	nv50/ir/gk110: add 64/128-bit fetch/export support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:55 -04:00
Ilia Mirkin	3c40be2615	nv50/ir/gk110: fix handling of OP_SUB for floating point ops Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	72310869f0	nv50/ir/gk110: presin/preex2 take their source at bit 23 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	48a9ba63f5	nv50/ir/gk110: add implementations of div u32/s32 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	4bb14aca29	nv50/ir/gk110: implement quadop Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	67cb8a6996	nv50/ir/gk110: fill in mov from predicate Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	563083ef57	nv50/ir/gk110: handle derivAll flag, fix useOffsets for non-txf Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	ece734b3c1	nv50/ir/gk110: fix setting texture for txd/txf/txq Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	08505549ab	nv50/ir/gk110: add texcsaa implementation Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	c17f7247ec	nv50/ir/gk110: add pfetch support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	15b1f420d0	nv50/ir/gk110: add emit/restart implementations Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:53 -04:00
Ilia Mirkin	1b68009466	nv50/ir/gk110: add missing break in sched emit Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:53 -04:00
Ilia Mirkin	76554d2d1f	nv50/ir/gk110: implement partial txq support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:53 -04:00
Ilia Mirkin	cb3dcb1430	nv50/ir/gk110: fill out texture instruction support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:53 -04:00
Ilia Mirkin	ce75a3e8d3	nv50/ir/gk110: fix control flow opcode emission, add sat flag Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:34 -04:00
Chad Versace	468cc866b4	egl/main: Enable Linux platform extensions Enable EGL_EXT_platform_base and the Linux platform extensions layered atop it: EGL_EXT_platform_x11, EGL_EXT_platform_wayland, and EGL_MESA_platform_gbm. Tested with Piglit's EGL_EXT_platform_base tests under an X11 session. To enable running the Wayland and GBM tests, windowed Weston was running and the kernel had render nodes enabled. I regression tested my EGL_EXT_platform_base patch set with Piglit on Ivybridge under X11/EGL, standalone Weston, and GBM with rendernodes. No regressions found. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:49:06 -07:00
Chad Versace	9a40ee16d0	egl/wayland: Emit EGL_BAD_PARAMETER for eglCreatePlatformPixmapSurface From the EGL_EXT_wayland_spec, version 3: It is not valid to call eglCreatePlatformPixmapSurfaceEXT with a <dpy> that belongs to Wayland. Any such call fails and generates EGL_BAD_PARAMETER. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:23 -07:00
Chad Versace	1787f5632f	egl/gbm: Emit EGL_BAD_PARAMETER for eglCreatePlatformPixmapSurface From the EGL_MESA_platform_gbm spec, version 5: It is not valid to call eglCreatePlatformPixmapSurfaceEXT with a <dpy> that belongs to the GBM platform. Any such call fails and generates EGL_BAD_PARAMETER. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:23 -07:00
Chad Versace	6d1f83ec09	egl/main: Stop using EGLNative types internally Internally, much of the EGL code uses EGLNativeDisplayType, EGLNativeWindowType, and EGLNativePixmapType. However, the EGLNative type often does not match the variable's actual type. The concept of EGLNative types is a bad match for Linux, as explained below. And the EGL platform extensions don't use EGLNative types at all. Those extensions attempt to solve cross-platform issues by moving the EGL API away from the EGLNative types. The core of the problem is that eglplatform.h can define each EGLNative type once only, but Linux supports multiple EGL platforms. To work around the problem, Mesa's eglplatform.h contains multiple definitions of each EGLNative type, selected by feature macros. Mesa expects EGL clients to set the feature macro appropriately. But the feature macros don't work when a single codebase must be built with support for multiple EGL platforms, such as Mesa itself. When building libEGL, autotools chooses the EGLNative typedefs based on the first element of '--with-egl-platforms'. For example, '--with-egl-platforms=x11,drm,wayland' defines the following: typedef Display* EGLNativeDisplayType; typedef Window EGLNativeWindowType; typedef Pixmap EGLNativePixmapType; Clearly, this doesn't work well for Wayland and GBM. Mesa works around the problem by casting the EGLNative types to different things in different files. For sanity's sake, and to prepare for the EGL platform extensions, this patch removes from egl/main and egl/dri2 all internal use of the EGLNative types. It replaces them with 'void*' and checks each explicit cast with a static assertion. Also, the patch touches egl_gallium the minimal amount to keep it compatible with eglapi.h. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:23 -07:00
Chad Versace	cefa06cd69	egl: Add STATIC_ASSERT() macro Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:23 -07:00
Chad Versace	eef68a9094	egl/dri2: Dispatch eglCreateImageKHR by display, not driver Add dri2_egl_display_vtbl::create_image, set it for each platform, and let egl_dri2 dispatch eglCreateImageKHR to that. To remove ambiguity, rename egl_dri2.c:dri2_create_image() to dri2_create_image_from_dri(). This prepares for the EGL platform extensions. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:23 -07:00
Chad Versace	88b9e600a6	egl/dri2/x11: Don't clobber _EGLDriver::API dri2_initialize_x11_swrast() does a strange thing. For some extensions it doesn't support, it sets the corresponding functions in _EGLDriver::API to NULL. The intention here is clear, but misplaced. NULL or not, the function pointers never get called because their extensions aren't supported. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:23 -07:00
Chad Versace	eadd5e0c0a	egl/dri2: Dispatch eglCreateWaylandBufferFromImageWL by display, not driver Add dri2_egl_display_vtbl::create_wayland_buffer_from_image, set it for each platform, and let egl_dri2 dispatch eglCreateWaylandBufferFromImageWL to that. This prepares for the EGL platform extensions. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:22 -07:00
Chad Versace	f506ef6784	egl/dri2: Consolidate eglTerminate egl_dri2.c:dri2_terminate() handled terminating X11 and DRM displays. The Wayland platform implemented its own dri2_wl_terminate(), which was nearly a copy of the common one. To implement the EGL platform extensions, we either need to dispatch eglTerminate per display or define a common implementation for all platforms. This patch chooses consolidation. It removes dri2_wl_terminate() by folding it into the common dri2_terminate(). It was necessary to invert the `if (disp->PlatformDisplay == NULL)` and the switch statement because, unlike DRM and X11, Wayland's terminator performed action even when EGL didn't own the native display. In the inversion, I replaced `disp->PlatformDisplay == NULL` with `dri2_dpy->own_device` because the two expressions are synonymous, but the latter's meaning is clearer. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:22 -07:00
Chad Versace	31cd0fee31	egl/dri2/x11: Set dri2_dpy->own_device When the user calls eglGetDisplay(EGL_DEFAULT_DISPLAY), the Wayland and DRM platforms set dri2_dpy->own_device=true. This patch makes the X11 platform do the same for consistency. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:22 -07:00
Chad Versace	688a0e8e73	egl/dri2: Dispatch eglPostSubBufferNV by display, not driver Add dri2_egl_display_vtbl::post_sub_buffer, set it for each platform, and let egl_dri2 dispatch eglPostSubBufferNV to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:04 -07:00
Chad Versace	75d398ed93	egl/dri2: Dispatch eglSwapBuffersRegionNOK by display, not driver Add dri2_egl_display_vtbl::swap_buffers_region, set it for each platform, and let egl_dri2 dispatch eglSwapBuffersRegionNOK to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:04 -07:00
Chad Versace	bc2cbc0951	egl/dri2: Dispatch eglCopyBuffers by display, not driver Add dri2_egl_display_vtbl::copy_buffers, set it for each platform, and let egl_dri2 dispatch eglCopyBuffers to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:04 -07:00
Chad Versace	3fdfbd2572	egl/dri2: Dispatch API.QueryBufferAge by display, not driver Add dri2_egl_display_vtbl::query_buffer_age, set it for each platform, and let egl_dri2 dispatch API.QueryBufferAge to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:04 -07:00
Chad Versace	958dd80c40	egl/dri2: Dispatch eglDestroySurface by display, not driver Add dri2_egl_display_vtbl::destroy_surface, set it for each platform, and let egl_dri2 dispatch eglDestroySurface to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:04 -07:00
Chad Versace	bf20076baf	egl/dri2: Dispatch eglCreatePbufferSurface by display, not driver Add dri2_egl_display_vtbl::create_pbuffer_surface, set it for each platform, and let egl_dri2 dispatch eglCreatePbufferSurface to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:04 -07:00
Chad Versace	bc8b07a657	egl/dri2: Dispatch eglCreatePixmapSurface by display, not driver Add dri2_egl_display_vtbl::create_pixmap_surface, set it for each platform, and let egl_dri2 dispatch eglCreatePixmapSurface to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:04 -07:00
Chad Versace	0a0c881a13	egl/dri2: Dispatch eglCreateWindowSurface by display, not driver Add dri2_egl_display_vtbl::create_window_surface, set it for each platform, and let egl_dri2 dispatch eglCreateWindowSurface to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:03 -07:00
Chad Versace	d03948a766	egl/dri2: Dispatch eglSwapBuffersWithDamage by display, not driver Add dri2_egl_display_vtbl::swap_buffers_with_damage, set it for each platform, and let egl_dri2 dispatch eglSwapBuffersWithDamageEXT to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:03 -07:00
Chad Versace	ad173bcfdb	egl/dri2: Dispatch eglSwapBuffers by display, not driver Add dri2_egl_display_vtbl::swap_buffers, set it for each platform, and let egl_dri2 dispatch eglSwapBuffers to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:03 -07:00
Chad Versace	8b9298af0a	egl/dri2: Dispatch eglSwapInterval by display, not driver Add dri2_egl_display_vtbl::swap_interval, set it for each platform, and let egl_dri2 dispatch eglSwapInterval to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:03 -07:00
Chad Versace	a218765478	egl/wl,x11: Call dri2_swap_interval() statically Don't call it through the driver dispatch table. Just call it statically. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:03 -07:00
Chad Versace	d019cd81b5	egl/dri2: Put platform func names into proper namespaces Each of the egl_dri2 platforms (except Android) prefix their function names with "dri2", not "dri2_${platform}". This means many function names have three separate definitions in the egl_dri2 directory: one in each of platform_drm.c, platform_wayland.c, and platform_x11.c. For example, each of the three files defines dri2_create_window_surface(). The name collisions make it difficult to review patches for correctness ("Is this patch hunk calling a platform_x11 function or a global egl_dri2 function?"), complicate debugging, and confuse code navigation tools. For each function in platform_x11.c prefixed with 'dri2', this patch changes its prefix to 'dri2_x11'. Likewise for platform_drm.c and 'dri2_drm'; and platform_wayland.c and 'dri2_wl'. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:03 -07:00
Chad Versace	90502b18b2	egl/dri2: Move dri2_egl_display virtual funcs to vtbl dri2_egl_display has only one virtual function, 'authenticate'. Define dri2_egl_display::vtbl and move 'authenticate' there. This prepares for the EGL platform extensions, which will add many more virtual functions to dri2_egl_display. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:03 -07:00
Chad Versace	38848b6217	egl: Update to revision 24567 of eglext.h This pulls in EGL_EXT_platform_base, EGL_EXT_platform_wayland, EGL_EXT_platform_x11, and EGL_MESA_platform_gbm. This patch has a lot of churn because Khronos recently changed its method of generating headers. Khronos now generates its headers from XML. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:03 -07:00
Michel Dänzer	7e0396dd73	winsys/radeon: Store GPU virtual memory addresses of BOs in a hash table This allows retrieving the existing BO and incrementing its reference count, instead of creating a separate winsys representation for it, when the kernel reports that the BO was already assigned a virtual memory address. This fixes problems with XWayland using radeonsi and the xf86-video-wlglamor driver, which calls GEM flink outside of the radeon winsys code and creates BOs from the flinked names using the same DRM file descriptor. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-17 11:53:59 +09:00
Chia-I Wu	361902ec04	targets/dri-ilo: make the driver installable install-gallium-links.mk fails to create the compat link for ilo_dri.so because it looks for dri_LTLIBRARIES instead of noinst_LTLIBRARIES. Fix this by switching to dri_LTLIBRARIES (and make the driver installable). Since pci_id_driver_map.h and the DDX both tell libGL.so to look for "i965", ilo_dri.so will never be loaded even if enabled and installed. The change should not create any more confusion. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-03-16 13:26:22 +08:00
Marek Olšák	2e361160ff	mesa: mark GL_RGB9_E5 as not color-renderable The GL 4.4 spec says it's not color-renderable and not accepted by RenderBufferStorage. The EXT_texture_shared_exponent spec says it's not color-renderable but it's accepted by RenderBufferStorageEXT. This seems to be a bug in the extension spec. Let's do what GL 4.4 says. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-15 18:39:50 +01:00
Aaron Watry	ec1ada7327	radeonsi/compute: Fix memory leak Free shader buffer object for all kernels when deleting compute state. Signed-off-by: Aaron Watry <awatry@gmail.com>	2014-03-15 11:59:19 -05:00
Marek Olšák	8199d149ed	st/mesa: remove _NEW_POLYGON dependency from vertex shader We can just check the polygon mode when updating the edge flag state. Also, we can just flag ST_NEW_VERTEX_PROGRAM directly, which makes ST_NEW_EDGEFLAGS_DATA useless.	2014-03-15 17:47:36 +01:00
Marek Olšák	4e634c5240	st/mesa: implement zero-stride edge flag by culling primitives This was unimplemented.	2014-03-15 17:47:36 +01:00
Marek Olšák	3d42696d10	st/mesa: fix per-vertex edge flags and GLSL support (v2) This fixes piglit/gl-2.0-edgeflag. v2: use StrideB to recognize per-vertex edge flags Cc: mesa-stable@lists.freedesktop.org	2014-03-15 17:47:35 +01:00
Kenneth Graunke	7554539d7e	i965/fs: Invalidate live intervals when demoting uniforms to pull params. Normally, nothing uses live intervals at this point, so this isn't necessary. However, dump_instructions() calculates them and uses them to show register pressure. So, calling dump_instructions() in this area of the code would segfault due to the arrays being the wrong size. This is not a candidate for stable branches because it only serves to fix internal debugging code that you manually have to invoke by altering the source code or using gdb. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-14 13:18:46 -07:00
Kenneth Graunke	13782dcf9d	i965/fs: Print "+reladdr" on variably-indexed uniform arrays. Previously, dump_instruction() would print output such as: { 2} 3: mov vgrf1:F, u0:F { 3} 4: mov vgrf7:F, u0:F { 4} 5: mov vgrf8:F, u0:F which looked like either a scalar access or perhaps a constant-indexed access of element 0, when it was really a variable index. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-14 13:17:57 -07:00
Kenneth Graunke	01d9023a9b	i965: Fix register types in dump_instructions(), again. In commit `e57d77280e`, I fixed this for destinations in the Vec4 backend, and sources in the scalar backend. But not both types in both backends. To prevent this mess from continuing, make the reg_encoding table static, so only the disassembler can use it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-14 13:17:57 -07:00
Kenneth Graunke	4d2e79269a	i965/fs: Fix register comparisons in saturate propagation. opt_saturate_propagation_local compares scan_inst->dst.reg/reg_offset with inst->src[0].reg/reg_offset, and ensures that scan_inst->dst.file is GRF. But nothing ensured that inst->src[0].file was GRF. In the following program, this resulted in u1:F matching vgrf1:UW, and a saturate being incorrectly propagated from instruction 8 to instruction 1. { 1} 0: add vgrf0:UW, hw_reg1+8:UW, hw_reg0:V { 1} 1: add vgrf1:UW, hw_reg1+10:UW, hw_reg0:V { 1} 2: linterp vgrf6:F, hw_reg2:F, hw_reg3:F, hw_reg0:F { 2} 3: linterp vgrf27:F, hw_reg2:F, hw_reg3:F, hw_reg0+16:F { 4} 4: mov vgrf10+0.0:F, vgrf6:F { 3} 5: mov vgrf10+1.0:F, vgrf27:F { 6} 6: tex vgrf8+0.0:F, vgrf10+0.0:F { 5} 7: mov vgrf32:F, u1:F { 5} 8: mov.sat vgrf12:F, u1:F From shader-db: total instructions in shared programs: 1841932 -> 1841957 (0.00%) instructions in affected programs: 5823 -> 5848 (0.43%) I inspected two of the 25 hurt shaders, and concluded that they were both hitting this bug, and not legitimately optimized. This fixes bugs in Left 4 Dead 2 and Team Fortress 2, possibly among others. The optimization pass didn't exist in 10.0, so this is only a candidate for 10.1. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-14 13:17:57 -07:00
Eric Anholt	2dbebbd37d	glsl: Improve debug output and variable names for opt_dead_code_local. I know this code has confused others, and it confused me 3 years later, too. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-14 13:02:05 -07:00
Eric Anholt	2f879356b5	i965: Add support for GL_ARB_buffer_storage. It turns out we can allow COHERENT storage/mappings all the time, regardless of LLC vs non-LLC. It just means never using temporary mappings to avoid GPU stalls, and on non-LLC we have to use the GTT instead of CPU mappings. If we were to use CPU maps on non-LLC (which might be useful if apps end up using buffer_storage on PBO reads, to avoid WC read slowness), those would be PERSISTENT but not COHERENT, but doing that would require us driving the clflushes from userspace somehow. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-14 12:56:22 -07:00
Eric Anholt	1990da2568	i965: Always use CPU mappings for BOs on LLC platforms. It looks like there's no big difference for write-only workloads, but using a CPU map means that if they happen to read without having set the MAP_READ_BIT, they get 100x the performance for those reads. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-14 12:56:22 -07:00
Eric Anholt	bb63df0c2d	i965: Drop the system-memory temporary allocations for flush explicit. While in expected usage patterns nobody will ever hit this path, doubling our bandwidth used seems like a waste, and it cost us extra code too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-14 12:56:22 -07:00
Eric Anholt	ea93246c00	i965: Switch mapping modes for non-explicit-flush blit-temporary maps. On LLC, it should always be better to use a cached mapping than the GTT. On non-LLC, it seems pretty silly to try to optimize read performance for the INVALIDATE_RANGE_BIT case. This will make the buffer_storage logic easier. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-14 12:56:21 -07:00
Jeff Muizelaar	ff1e850eec	gallivm: optimize repeat linear npot code in the aos int path Similar to the other cases, shift some weight/coord calculations to int space. This should be slightly faster (on x86 sse it should actually save one instruction, and generally int instructions are cheaper).	2014-03-14 19:41:18 +01:00
Roland Scheidegger	9954f01497	gallivm: use correct rounding for nearest wrap mode (in the aos int path) The previous code used coords which were calculated as (int) (f_coord * tex_size * 256) >> 8. This is not only unnecessarily complex but can give the wrong texel due to rounding for negative coords (as an example, after denormalization coords from -1.0 to 0.0 should give -1, but this will give -1 for numbers from -1.0-1/256 - 0.0-1/256). Instead, just use ifloor, dropping the shift stuff. Unfortunately, this will most likely be slower - with arch rounding available it shouldn't be too bad (trades an int shift for a round but also saves an int mul (which is shared by all coords)) but otherwise it's a mess.	2014-03-14 19:41:18 +01:00
Jeff Muizelaar	88637e5764	gallivm: use correct rounding for linear wrap mode (in the aos int path) The previous method for converting coords to ints was slightly inaccurate (effectively losing 1bit from the 8bit lerp weight). This is probably especially noticeable when trying to draw a pixel-aligned texture. As an example, for a 100x100 texture after denormalization the texture coords in this case would turn up as 0.5, 1.5, 2.5, 3.5, 4.5, ... After the mul by 256, conversion to int and 128 subtraction, they end up as 0, 256, 512, 768, 1024, ... which gets us the correct coords/weights of 0/0, 1/0, 2/0, 3/0, 4/0, ... But even LSB errors (which are unavoidable) in the input coords may cause these coords/weights to be wrong, e.g. for a coord of 3.49999 we'd get a coord/weight of 2/255 instead. Fix this by using round-to-nearest int instead of FPToSi (trunc). Should be equally fast on x86 sse though other archs probably suffer a little.	2014-03-14 19:41:18 +01:00
Brian Paul	6757ec3f8e	glapi: restore _glthread_GetID() function This partially reverts patch `02cb04c68f`. This fixes an unresolved symbol error when using older builds of libGL. Tested-by: Chia-I Wu <olv@lunarg.com>	2014-03-14 12:12:07 -06:00
Niels Ole Salscheider	f9901f1ab2	radeonsi: flush the dma ring in si_flush_from_st Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-03-14 15:01:14 +01:00
Niels Ole Salscheider	087b0ff1c1	radeon: Move DMA ring creation to common code Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-03-14 15:01:14 +01:00
Emil Velikov	a9cf3aa208	mesa: return v.value_int64 when the requested type is TYPE_INT64 Fixes "Operands don't affect result" defect reported by Coverity. Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-14 13:01:47 +00:00
Emil Velikov	f064bcdfbf	nvc0: minor cleanups in stream output handling Constify the offsets parameter to silence gcc warning 'assignment from incompatible pointer type' due to function prototype mismatch. Use a boolean changed as a shorthand for target != current_target. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-14 13:00:01 +00:00
Emil Velikov	ad4a44ebfc	nouveau: honor fread return value in the nouveau_compiler There is little point of continuing if fread returns zero, as it indicates that either the file is empty or cannot be read from. Bail out if fread returns zero after closing the file. Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-14 13:00:01 +00:00
Emil Velikov	ae7d236172	nouveau: typecast the prime_fd handle when calling nouveau_bo_set_prime Core drm defines that the handle is of type int, while all drivers treat it as uint internally. Typecast the value to silence gcc warning messages and be consistent amongst all drivers. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-03-14 13:00:01 +00:00
Emil Velikov	c26b488088	nv50: add missing brackets when handling the samplers array Commit 3805a864b1d (nv50: assert before trying to out-of-bounds access samplers) introduced a series of asserts as a precaution against a previous illegal memory access. However, it failed to encapsulate the loop within nv50_sampler_state_delete, effectively failing to clear the sampler state, apart from exacerbating the illegal memory access issue. Fixes gcc warning "array subscript is above array bounds" and "Nesting level does not match indentation" and "Out-of-bounds read" defects reported by Coverity. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-14 13:00:01 +00:00
Anuj Phogat	4d0e30accd	i965: Fix build warning of unused variable Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-14 02:57:00 -07:00
Adel Gadllah	a69fabc76c	dri3: Add GLX_EXT_buffer_age support v2: Indent according to Mesa style, reuse sbc instead of making a new swap_count field, and actually get a usable back before returning the age of the back (fixing updated piglit tests). Changes by anholt. Signed-off-by: Adel Gadllah <adel.gadllah@gmail.com> Reviewed-by: Robert Bragg <robert@sixbynine.org> (v1) Reviewed-by: Adel Gadllah <adel.gadllah@gmail.com> (v2) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-13 14:19:21 -07:00
Eric Anholt	0b02d8a633	dri3: Prefer the last chosen back when finding a new one. With the buffer_age code, I need to be able to potentially call this more than once per frame, and it would be bad if a new special event showing up meant I chose a different back mid-frame. Now, once we've chosen a back for the frame, another find_back will choose it again since we know that it won't have ->busy set until swap. Note that this makes find_back return a buffer id instead of a backbuffer index. That's kind of a silly distinction anyway, since it's an identity mapping between the two (it's the front buffer that is at an offset). Reviewed-By: Adel Gadllah <adel.gadllah@gmail.com>	2014-03-13 14:19:16 -07:00
Neil Roberts	551d459af4	Add the EGL_MESA_configless_context extension This extension provides a way for an application to render to multiple surfaces with different buffer formats without having to use multiple contexts. An EGLContext can be created without an EGLConfig by passing EGL_NO_CONFIG_MESA. In that case there are no restrictions on the surfaces that can be used with the context apart from that they must be using the same EGLDisplay. _mesa_initialize_context can now take a NULL gl_config which will mark the context as ‘configless’. It will memset the visual to zero in that case. Previously the i965 and i915 drivers were explicitly creating a zeroed visual whenever 0 is passed for the EGLConfig. Mesa needs to be aware that the context is configless because it affects the initial value to use for glDrawBuffer. The first time the context is bound it will set the initial value for configless contexts depending on whether the framebuffer used is double-buffered. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-03-12 14:40:47 -07:00
Neil Roberts	4b17dff3e5	eglCreateContext: Remove the check for whether config == 0 In eglCreateContext there is a check for whether the config parameter is zero and in this case it will avoid reporting an error if the EGL_KHR_surfaceless_context extension is supported. However there is nothing in that extension which says you can create a context without a config and Mesa breaks if you try this so it is probably better to leave it reporting an error. The original check was added in `b90a3e7d8b` based on the API-specific extensions EGL_KHR_surfaceless_opengl/gles1/gles2. This was later changed to refer to EGL_KHR_surfaceless_context in `b50703aea5`. Perhaps the original extensions specified a configless context but the new one does not. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-03-12 14:40:47 -07:00
Neil Roberts	4954518125	Fix the initial value of glDrawBuffers for GLES Under GLES 3 it is not valid to pass GL_FRONT to glDrawBuffers. Instead, GL_BACK has a magic interpretation which means it will render to the front buffer on single-buffered contexts and the back buffer on double-buffered. We were incorrectly setting the initial value to GL_FRONT for single-buffered contexts. This probably doesn't really matter at the moment except that presumably it would be exposed in the API via glGetIntegerv. When we switch to configless contexts this is more important because in that case we always want to rely on the magic interpretation of GL_BACK in order to automatically switch between the front and back buffer when a new surface with a different number of buffers is bound. We also do this for GLES 1 and 2 because the internal value doesn't matter in that case and it is convenient to use the same code to have the magic interpretation of GL_BACK. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-03-12 14:40:47 -07:00
Neil Roberts	0c58c96e54	Use the magic behaviour of GL_BACK in GLES 1 and 2 as well as 3 In GLES 3 it is not possible to select rendering to the front buffer and instead selecting GL_BACK has the magic interpretation that it is either the front buffer on single-buffered configs or the back buffer on double-buffered. GLES 1 and 2 have no way of selecting the draw buffer at all. In that case we were initialising the draw buffer to either GL_FRONT or GL_BACK depending on the context's config and then leaving it at that. When we switch to having configless contexts we ideally want Mesa to automatically switch between the front and back buffer whenever a double- or single-buffered surface is bound. To make this happen we can just allow the magic behaviour from GLES 3 in GLES 1 and 2 as well. It shouldn't matter what the internal value of the draw buffer is in GLES 1 and 2 because there is no way to query it from the external API. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-03-12 14:40:46 -07:00
Ian Romanick	87c66a4ff7	glsl: Fix typo Remove extra "any" and re-word-wrap the comment. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-12 11:16:50 -07:00
Ian Romanick	6bdc1d96c3	glsl: Rewrite unrolled link_invalidate_variable_locations calls as a loop Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-12 11:16:50 -07:00
Carl Worth	7b8acb9026	docs: Import 10.0.4 release notes, add news item.	2014-03-12 10:22:22 -07:00
Mike Stroyan	6e627b49f9	mesa: Release gl_debug_state when destroying context. Commit `6e8d04a` caused a leak by allocating ctx->Debug but never freeing it. Release the memory in _mesa_free_errors_data when destroying a context. Use FREE to match CALLOC_STRUCT from _mesa_get_debug_state. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-12 09:43:05 -06:00
Niels Ole Salscheider	2c886eba78	r600g: compute memory pool size is given in dw Multiply the dw value by 4 in order to map the complete buffer. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>	2014-03-11 19:00:08 -07:00
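The unit fix in `2c886eba78` comes down to converting a dword count into a byte count before mapping. A minimal sketch, assuming a hypothetical helper name `dw_to_bytes` (the actual r600g code is not shown in this log):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a size given in dwords (32-bit words)
 * into bytes, as needed when a mapping call expects a byte length. */
static uint64_t dw_to_bytes(uint64_t size_in_dw)
{
    /* Reject inputs whose byte size would not fit in 64 bits. */
    assert(size_in_dw <= UINT64_MAX / 4);
    return size_in_dw * 4;
}
```

Passing the dword value straight through as a byte length would map only a quarter of the pool, which is the symptom the commit message describes.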
Eric Anholt	d3eb709ded	meta: Always restore the framebuffers and current renderbuffer. The few paths that were playing with framebuffers and renderbuffer were saving and restoring them. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-11 12:47:46 -07:00
Eric Anholt	feb3d8dacd	i965: Drop intel_check_front_buffer_rendering(). This was being applied in a subset of the places that intel_prepare_render() was called, to set the same flag that intel_prepare_render() was setting. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-11 12:47:44 -07:00
Eric Anholt	ec542d7457	i965: Drop broken front_buffer_reading/drawing optimization. The flag wasn't getting updated correctly when the ctx->DrawBuffer or ctx->ReadBuffer changed. It usually ended up working out because most apps only have one window system framebuffer, or if they have more than one and they have any front read/drawing, they will have called glReadBuffer()/glDrawBuffer() on it when they get started on the new buffer. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-11 12:47:41 -07:00
Eric Anholt	66073ef438	intel: When checking for updating front buffer reading, use the right fb. It's the ctx->ReadBuffer that gets read from, not the ctx->DrawBuffer. So, if you happened to have a ctx->ReadBuffer that was the winsys buffer, and it had previously been intel_prepare_render()ed but not invalidated since then, and you called glReadBuffer() to switch to front buffer instead of back buffer reading on the winsys fbo while your drawbuffer was a user FBO, you'd never get the front buffer's miptree fetched, and segfault. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-11 12:46:59 -07:00
Marek Olšák	e1a9a54464	r600g,radeonsi: attempt to fix racy multi-context apps calling BufferData Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75061 v2: minimize the window where cs_buf != new_buf	2014-03-11 19:18:02 +01:00
Marek Olšák	74d95adea0	r600g,radeonsi: fix broken buffer download Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 19:18:02 +01:00
Marek Olšák	4ca3486b19	r600g,radeonsi: use a fallback in dma_copy instead of failing v2: - allow byte-aligned DMA buffer copies on Evergreen - fix piglit/texsubimage regression - use the fallback for 3D copies (depth > 1) as well	2014-03-11 19:18:02 +01:00
Marek Olšák	de5094d102	radeonsi: small cleanup in get_param Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	e219842282	radeonsi: set correct alignment for texture buffers and constant buffers I think these are all equivalent to vertex buffer fetches which should be dword-aligned. Scalar loads are also dword-aligned. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	f549129564	r600g, radeonsi: fix primitives-generated query with disabled streamout Buffers are disabled by VGT_STRMOUT_BUFFER_CONFIG, but the query only works if VGT_STRMOUT_CONFIG.STREAMOUT_0_EN is enabled. This moves VGT_STRMOUT_CONFIG to its own state. The register is set to 1 if either streamout or the primitives-generated query is enabled. However, the primitives-emitted query is also incremented, so it's disabled by setting VGT_STRMOUT_BUFFER_SIZE to 0 when there is no buffer bound. This fixes piglit: ARB_transform_feedback2/counting with pause EXT_transform_feedback/primgen-query transform-feedback-disabled Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	958ef47a6d	r600g,radeonsi: don't add streamout.num_dw_for_end twice It's already added in need_cs_space. Also don't calculate anything if there are no buffers. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	4f1f32306a	r600g,radeonsi: fix MAX_TEXTURE_3D_LEVELS and MAX_TEXTURE_ARRAY_LAYERS limits CB_COLORi_VIEW.SLICE_MAX can be at most 2047. This fixes the maxlayers piglit test. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	8bd7a6f48c	st/dri: flush drawable textures before unreferencing This fixes piglit/fbo-sys-blit with fast clear on radeonsi. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	a38e1fd78b	radeonsi: implement fast color clear This works for both multi-sample and single-sample color buffers. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	28eb0bcf19	r600g: move fast color clear code to a common place Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	d3c1be530a	r600g,radeonsi: move CMASK register values from r600_surface to r600_texture When doing fast clear for single-sample color buffers for the first time, a CMASK buffer has to be allocated and the CMASK state in all pipe_surfaces referencing the color buffer must be updated. Updating all surfaces is kinda silly, so let's move the values to r600_texture instead. This is only for Evergreen and later. R600-R700 don't have fast clear. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	61a2fac199	radeonsi: convert the framebuffer state to atom-based This looks like r600g. The shared Cayman MSAA code is used here. The real motivation for this is that I need the ability to change values of color registers after the framebuffer state is set. The PM4 state cannot be modified easily after it's generated. With this, I can just change r600_surface::cb_color_xxx and set framebuffer.atom.dirty=true and it's done. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	946d1cfe39	r600g: move cayman MSAA setup to a common place I will use this in radeonsi. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	6a5499b9d9	radeonsi: move framebuffer-related state to a new struct si_framebuffer Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	bee2b96b02	r600g,radeonsi: set priorities for relocations	2014-03-11 18:51:19 +01:00
Marek Olšák	3edb3b86b2	r300g,uvd,vce: set priorities for relocations This updates all occurrences of cs_add_reloc.	2014-03-11 18:51:19 +01:00
Marek Olšák	db1a7f78c2	winsys/radeon: add interface for setting a priority number for each relocation The cs_add_reloc change is commented out not to break compilation. The highest priority of all cs_add_reloc calls is sent to the kernel.	2014-03-11 18:51:19 +01:00
Jonathan Gray	0d6f573f6e	glsl: Link glsl_compiler with pthreads library. Fixes the following build error on OpenBSD: ./.libs/libglsl.a(builtin_functions.o)(.text+0x973): In function `mtx_lock': ../../include/c11/threads_posix.h:195: undefined reference to `pthread_mutex_lock' ./.libs/libglsl.a(builtin_functions.o)(.text+0x9a5): In function `mtx_unlock': ../../include/c11/threads_posix.h:248: undefined reference to `pthread_mutex_unlock' Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-11 08:47:12 -06:00
Jonathan Gray	40214267ab	gallium: add endian detection for OpenBSD Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-11 08:47:12 -06:00
Emil Velikov	a6efbac9fb	automake: allow only shared builds Static and shared builds were possible in the good old days of static makefiles. Currently the build system does not distinguish nor does anything special when one requests a static build. Print a warning message for the packager that static builds are not supported and continue building shared libs. Currently only Debian and derivatives use static build, and they use it for building a Xlib powered libGL. This patch will only change the warning message they are seeing but the binaries produced will be identical. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:44 +00:00
Emil Velikov	065b6ca52b	configure: update enable-llvm-shared-libs comments - As of commit cb080a10b68(configure.ac: Don't require shared LLVM when building OpenCL) opencl does not mandate using shared llvm. - Add a warning message that building with static llvm may cause compilation problems. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-03-11 12:50:44 +00:00
Emil Velikov	e267e4318c	st/dri: build the drm backend when libdrm is present Prevent build issues on systems lacking libdrm. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:44 +00:00
Emil Velikov	f41a65397b	glx: cleanup unneeded headers - xf86dri.h is the old dri1 header, not required by dri2 nor dri3 - fold xf86drm.h inclusion inside dri2.h - dri3_glx does not have any drm specific dependencies - glapi.h is not required by the dri2 and dri3 codepaths Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-11 12:50:43 +00:00
Jon TURNEY	e5214dd8f1	glx/tests: honor enable-driglx-direct configure flag Recent commit fixed build issues in dri2_query_renderer.c by wrapping in defined(direct_rendering) && !defined(applegl) This patch targets the query_renderer tests, so that make check passes on platforms such as hurd and cygwin. v2: (Emil) - Rebase and update commit message. Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-11 12:50:43 +00:00
Emil Velikov	254aafba3e	configure: read libomxil-bellagio.pc only when it exists Currently configure.ac will print a warning when one is missing the package. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-11 12:50:43 +00:00
Emil Velikov	22c133546a	automake: create compat symlinks only for linux systems The primary users of these are linux developers, although it can be extended for BSD and others if needed. Fixes make install for Cygwin and OpenBSD at least. v2: - Wrap vdpau targets as well. v3: - Fold HAVE_COMPAT_SYMLINKS conditional within installlinks.mk Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63269 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk> (v1) Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-11 12:50:43 +00:00
Emil Velikov	bba9c28215	configure: use LIB_EXT rather than hardcoded .so Some platforms use a different library extension - dll, dylib, a. Honor that when we are creating the required links. Rename LIB_EXTENSION to LIB_EXT while we're here. With libglapi linking aside, building classic drivers on non-linux platforms should be possible now. v2: Resolve conflicts. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:43 +00:00
Emil Velikov	020bc0d0dd	automake: do not use symbols names for static glapi.la In the cases where one links against the static glapi.la there is no need to create temporary variables only to explicitly link against it. Instead use SHARED_GLAPI_LIB to explicitly indicate when one is building and linking with the shared glapi provider. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:43 +00:00
Emil Velikov	3c5599b276	configure: remove old makefile variables All the variables were used before the automake conversion and do not make sense (nor are used) currently. Replace GL_LIB_NAME with lib$(GL_LIB).$(LIB_EXTENSION) for apple-glx. The build has been broken for ages, but this will ease the recovery process as it happens. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:43 +00:00
Emil Velikov	49d7bcea82	gallium/targets: use install-gallium-targets.mk Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:42 +00:00
Emil Velikov	f3595b6748	gallium/targets: drop link generation for non DRI targets All three (xvmc and omx) do not have an alternative loading similar to the dri modules. Thus one needs to explicitly install them in order to use/test them. v2: - Keep vdpau targets, as an equivalent of LIBGL_DRIVERS_PATH is being worked on. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:42 +00:00
Emil Velikov	d8ba951ad6	targets/vdpau: use install-gallium-links.mk Drop the duplication across all vdpau targets. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-11 12:50:42 +00:00
Emil Velikov	ce24bcd394	targets/dri: use install-gallium-links.mk Drop the duplication across all dri targets. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:42 +00:00
Emil Velikov	bbae65e25c	automake: introduce install-gallium-links.mk This helper script will be used to minimise the duplication during link generation across all gallium targets. v2: - Handle vdpau_LTLIBRARIES. Requested by Christian König. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:42 +00:00
Emil Velikov	7b4ccad33d	automake: use install-lib-links.mk across all classic mesa Use the handy script and minimise the boilerplate in the makefiles. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:42 +00:00
Emil Velikov	b496ab0567	automake: make install-lib-links less chatty There is little point in echoing everything that the script does to stdout. Wrap it in AM_V_GEN so that a reasonable message is printed as an indication of its invocation. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:42 +00:00
Emil Velikov	90a4ffdea5	automake: use only the folder name if it's a subfolder of the present one v2: Resolve rebase conflicts. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:41 +00:00
Emil Velikov	b15b1fbb51	automake: silence folder creation There is little gain in printing whenever a folder is created. v2: - Use $(AM_V_at) over @ to have control in verbose builds. Suggested by Erik Faye-Lund. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:41 +00:00
Emil Velikov	c690f8dd9b	automake: use MKDIR_P when possible Use the automake predefined macro over hardcoding mkdir -p everywhere. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:41 +00:00
Vinson Lee	e6c565fcc5	radeon: Fix build. Fix build error introduced with commit `dfa25ea5cd`. CC r600_streamout.lo r600_streamout.c:108:6: error: conflicting types for 'r600_set_streamout_targets' void r600_set_streamout_targets(struct pipe_context ctx, ^ ./r600_pipe_common.h:413:6: note: previous declaration is here void r600_set_streamout_targets(struct pipe_context ctx, ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76009 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-03-10 22:54:59 -07:00
Zack Rusin	dfa25ea5cd	gallium: allow setting of the internal stream output offset D3D10 allows setting of the internal offset of a buffer, which is in general only incremented via actual stream output writes. By allowing setting of the internal offset draw_auto is capable of rendering from buffers which have not been actually streamed out to. Our interface didn't allow that. This change functionally shouldn't make any difference to OpenGL where instead of an append_bitmask you just get a real array where -1 means append (like in D3D) and 0 means do not append. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-07 12:49:33 -05:00
Brian Paul	7d5903980e	meta: use non-ARB shader/program create/delete functions The non-ARB versions take GLuint ids, not GLhandleARB. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-10 17:07:05 -06:00
Brian Paul	d96ed5c088	mesa: s/GLhandleARB/GLuint/ for glGetUniform functions The GL specs say the parameter is GLuint, not GLhandleARB. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-10 17:06:57 -06:00
Brian Paul	a19b19fb94	mesa: rename MESA_FORMAT_X8Z24_UNORM -> MESA_FORMAT_X8_UINT_Z24_UNORM To follow the example of MESA_FORMAT_Z24_UNORM_X8_UINT. Reviewed-by: Michel Dänzer <michel@daenzer.net> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-10 16:11:54 -06:00
Brian Paul	9b5fff2dd7	mesa: reorder MESA_FORMAT enums The MESA_FORMAT_x enums in formats.h weren't declared in any sort of reasonable order. Now it should be a little more logical. This also required reordering tables in formats.c and s_texfetch.c Reviewed-by: Michel Dänzer <michel@daenzer.net> Acked-by: Eric Anholt <eric@anholt.net>	2014-03-10 16:11:50 -06:00
Brian Paul	10738727ae	mesa: trim down format.h comments There's no real reason to list all the formats in the comments. Reviewed-by: Michel Dänzer <michel@daenzer.net> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-10 16:11:45 -06:00
Matt Turner	3330dec90c	i965/vec4: Don't fix-up scalar uniforms for 3 src instructions. Removes unnecessary MOV instructions in L4D2, TF2, Dota2, and many other Steam games. total instructions in shared programs: 1668126 -> 1657509 (-0.64%) instructions in affected programs: 242235 -> 231618 (-4.38%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-10 14:13:45 -07:00
Matt Turner	b823d5df0f	i965: Disassemble 3 src instructions' rep_ctrl field. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-10 14:13:45 -07:00
Matt Turner	dafcc1b7c4	i965: Disassemble 3-src operands' widths correctly. <4,1,1> isn't a real thing. We meant <4,4,1>, i.e., each component of the whole register. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-10 14:13:45 -07:00
Eric Anholt	30259856a8	i965: Move binding table update packets to binding table setup time. This keeps us from needing to reemit all the other stage state just because a surface changed. Improves unoptimized glamor x11perf -f8text by 1.10201% +/- 0.489869% (n=296). [v1] v2: - Drop binding table packets from Gen8 unit state as well. - Pass _3DSTATE_BINDING_TABLE_POINTERS_XS to brw_upload_binding_table, cutting even more code. v3: Don't forget to drop them from 3DSTATE_GS (botched refactor in v2). Signed-off-by: Eric Anholt <eric@anholt.net> [v1] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1] Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [v2, v3] Reviewed-by: Eric Anholt <eric@anholt.net> [v3]	2014-03-10 13:05:12 -07:00
Kenneth Graunke	db26253a48	i965: Reorganize the code in brw_upload_binding_tables. This makes both the empty and non-empty binding table paths exit through the bottom of the function, which gives us a place to share code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-10 13:05:12 -07:00
Maarten Lankhorst	8c136b53b7	fix vdpau interop when using -Bsymbolic-functions in ldflags Explicitly add radeon_drm_winsys_create and nouveau_drm_screen_create to the dynamic list. This will ensure vdpau interop still works even when the user links with -Bsymbolic-functions in hardened builds. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Tested-by: Rachel Greenham <rachel@strangenoises.org> Reported-by: Peter Frühberger <peter.fruehberger@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-10 17:08:19 +01:00
Chia-I Wu	952fda4d3f	ilo: do not set I915_EXEC_NO_RELOC This reverts most of commit `d80f0c34b7`. Upon a closer reading, having the presumed offsets written is not enough to set the flag. EXEC_OBJECT_NEEDS_GTT and/or EXEC_OBJECT_WRITE of the reloc entries must also be set appropriately.	2014-03-10 19:04:43 +08:00
Chia-I Wu	5ecdd7ba22	ilo: add support for PIPE_QUERY_PIPELINE_STATISTICS	2014-03-10 16:43:53 +08:00
Chia-I Wu	8fc2f0c874	ilo: add ILO_3D_PIPELINE_WRITE_STATISTICS The command writes statistics registers to the specified bo.	2014-03-10 16:43:53 +08:00
Chia-I Wu	d8b2e3c25e	ilo: add some MI commands to GPE We will need MI commands that load/store registers.	2014-03-10 16:43:53 +08:00
Chia-I Wu	0f41f9c63d	ilo: set PIPE_CONTROL_GLOBAL_GTT_WRITE automatically Set the flag automatically in gen6_emit_PIPE_CONTROL(), and set it only for GEN6.	2014-03-10 16:43:53 +08:00
Chia-I Wu	345bf92f13	ilo: print a warning when PPGTT is disabled Despite what the PRMs say, the driver appears to work fine when PPGTT is disabled. But at least print a warning in that case.	2014-03-10 16:42:42 +08:00
Chia-I Wu	747627d045	ilo: require hardware logical context support The code paths have not been tested for a while and have some known issues.	2014-03-10 16:42:42 +08:00
Chia-I Wu	72956ed374	ilo: protect the decode context with a mutex The decode context is not thread safe.	2014-03-10 16:42:42 +08:00
Chia-I Wu	d80f0c34b7	ilo: set I915_EXEC_NO_RELOC when available The winsys makes it clear that the pipe drivers should write presumed offsets. We can always set I915_EXEC_NO_RELOC when the kernel supports it.	2014-03-10 16:42:42 +08:00
Chia-I Wu	0b462d3ab1	ilo: move ring types to winsys It results in less code despite that i915_drm.h specifies the ring type as part of the execution flags.	2014-03-10 16:42:42 +08:00
Chia-I Wu	42c1ce4c03	ilo: winsys may limit the batch buffer size The maximum batch buffer size is determined at the time of drm_intel_bufmgr_gem_init(). Make sure the pipe driver does not exceed the limit.	2014-03-10 16:42:42 +08:00
Chia-I Wu	a434ac045e	ilo: PIPE_CAP_QUERY_TIMESTAMP may not be supported Reading TIMESTAMP register may fail, depending on both kernel and hardware.	2014-03-10 16:42:42 +08:00
Chia-I Wu	249b1ad984	ilo: rework winsys batch buffer functions Rename intel_winsys_check_aperture_size() to intel_winsys_can_submit_bo(), intel_bo_exec() to intel_winsys_submit_bo(), and intel_winsys_decode_commands() to intel_winsys_decode_bo(). Make a semantic change to ignore intel_context when the ring is not the render ring.	2014-03-10 16:42:42 +08:00
Chia-I Wu	3e324f99d3	ilo: replace bo alloc flags by initial domains The only alloc flag is INTEL_ALLOC_FOR_RENDER, which can as well be expressed by specifying the initial write domain. The change makes it obvious that we failed to set INTEL_ALLOC_FOR_RENDER in several places.	2014-03-10 16:42:42 +08:00
Chia-I Wu	76713ed5d6	ilo: remove intel_bo_get_size() Commit `bfa8d21759` uses it to work around a hardware limitation. But there are other ways to do it without the need for intel_bo_get_size().	2014-03-10 16:42:42 +08:00
Chia-I Wu	790c32ec75	ilo: remove intel_bo_get_virtual() Make the map functions return the pointer directly.	2014-03-10 16:42:42 +08:00
Chia-I Wu	90786613e9	ilo: rework winsys bo reloc functions Rename intel_bo_emit_reloc() to intel_bo_add_reloc(), intel_bo_clear_relocs() to intel_bo_truncate_relocs(), and intel_bo_references() to intel_bo_has_reloc(). Besides, we need intel_bo_get_offset() only to get the presumed offset after adding a reloc entry. Remove the function and make intel_bo_add_reloc() return the presumed offset. While at it, switch to gem_bo->offset64 from gem_bo->offset.	2014-03-10 16:42:42 +08:00
Chia-I Wu	76ed4f75dd	ilo: add a wrapper to cast struct intel_bo It is just drm_intel_bo, but having a wrapper makes the code cleaner.	2014-03-10 16:42:42 +08:00
Chia-I Wu	4491f0a971	ilo: fix DRM_API_HANDLE_TYPE_FD export It can be exported by drm_intel_bo_gem_export_to_prime(). The code is already in winsys, just not enabled.	2014-03-10 16:42:42 +08:00
Chia-I Wu	276348e85a	ilo: improve winsys documentation/comments Document the interface, and add comments as to why some features are enabled and why some checks are made.	2014-03-10 16:42:41 +08:00
Chia-I Wu	f2aabecbb0	ilo: remove intel_winsys_enable_reuse() It should be an (winsys) implementation detail.	2014-03-10 16:42:41 +08:00
Tapani Pälli	56b1be4399	mesa/glsl: introduce a remap table for uniform locations Patch adds a remap table for uniforms that is used to provide a mapping from application specified uniform location to actual location in the UniformStorage. Existing UniformLocationBaseScale usage is removed as table can be used to set sequential values for array uniform elements. This mapping helps to implement GL_ARB_explicit_uniform_location so that uniforms locations can be reorganized and handled in a more easy manner. v2: small fixes + rename parameters for merge and split functions (Ian) improve documentation, remove old check for location bounds (Eric) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-10 09:46:24 +02:00
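The remap table introduced in `56b1be4399` can be pictured as an array indexed by the application-visible location, where each slot points at the backing uniform storage. The sketch below is an assumption-laden illustration: `struct uniform_storage`, `struct remap_table`, `remap_add`, `remap_lookup`, and `MAX_LOCATIONS` are hypothetical names, not Mesa's. It shows how array uniforms receive sequential locations without any base/scale arithmetic.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOCATIONS 64 /* hypothetical capacity for the sketch */

struct uniform_storage {
    const char *name;
    unsigned array_elements; /* 0 for a non-array uniform */
};

struct remap_table {
    struct uniform_storage *entries[MAX_LOCATIONS];
    unsigned num_entries;
};

/* Reserve one location per array element (at least one) and return the
 * first location, or -1 if the table is full. */
static int remap_add(struct remap_table *t, struct uniform_storage *u)
{
    unsigned count = u->array_elements ? u->array_elements : 1;

    assert(t->num_entries <= MAX_LOCATIONS);
    if (count > MAX_LOCATIONS - t->num_entries)
        return -1;

    int base = (int)t->num_entries;
    for (unsigned i = 0; i < count; i++)
        t->entries[t->num_entries++] = u;
    return base;
}

/* Map a location back to its storage and the array element it names.
 * Returns NULL for an out-of-range location. */
static struct uniform_storage *
remap_lookup(const struct remap_table *t, int location, unsigned *array_index)
{
    if (location < 0 || (unsigned)location >= t->num_entries)
        return NULL;

    struct uniform_storage *u = t->entries[location];

    /* Walk back to the first slot that refers to the same storage. */
    unsigned first = (unsigned)location;
    while (first > 0 && t->entries[first - 1] == u)
        first--;

    *array_index = (unsigned)location - first;
    return u;
}
```

With this layout, supporting explicit locations would amount to placing a uniform's slots at the requested index instead of appending them, which is the flexibility the commit message points to.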
Tapani Pälli	aa0d95a08d	mesa: remove _mesa_symbol_table_iterator structure Nothing uses this structure, removal fixes Klocwork error about the possible oom condition in _mesa_symbol_table_iterator_ctor. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-10 09:45:41 +02:00
Michel Dänzer	678cf9618f	radeonsi: Use proper member name for deleting export shader PM4 state Fixes double-free with some piglit tests using geometry shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-10 12:21:50 +09:00
Marek Olšák	9c2a3934c5	r600g: document why texture offset emulation is needed	2014-03-10 00:19:59 +01:00
Ilia Mirkin	897f40f25d	Revert nvc0 part of "nv50: adjust blit_3d handling of ms output textures" The nvc0 bits don't appear to work, and I thought I had removed them from the commit. Oops. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>	2014-03-09 01:38:10 -05:00
Ilia Mirkin	253314d487	nv50: adjust blit_3d handling of ms output textures This fixes some unwanted scaling when the output is multisampled. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>	2014-03-09 01:32:06 -05:00
Ilia Mirkin	507f0230d4	nouveau: fix fence waiting logic in screen destroy nouveau_fence_wait has the expectation that an external entity is holding onto the fence being waited on, not that it is merely held onto by the current pointer. Fixes a use-after-free in nouveau_fence_wait when used on the screen's current fence. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75279 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at> Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>	2014-03-09 01:31:59 -05:00
Ilia Mirkin	5bf90cb521	nouveau: add valid range tracking to nouveau_buffer This logic is borrowed from the radeon code. The transfer logic will only get called for PIPE_BUFFER resources, so it shouldn't be necessary to worry about them becoming render targets. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2014-03-09 01:31:21 -05:00
Julien Cristau	cf1c52575d	gbm: make 'devices' array static It's only used in this one file as far as I can tell, and exporting a symbol named 'devices' from a shared library is a recipe for trouble. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-03-08 20:43:54 +00:00
Emil Velikov	330a3799d0	automake: make clean the correct git_sha1.h.tmp When building out of tree, the file ends up dangling which may result in a binary with the old git sha. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-08 20:40:56 +00:00
Christian König	6a402359fd	radeonsi: fix freeing descriptor buffers That structure member is a pointer, so the loop with the Elements macro only freed up the first entry. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-08 16:08:15 +01:00
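The bug fixed in `6a402359fd` is a general C pitfall: an element-count macro built on `sizeof` only works on true arrays. The sketch below uses a stand-in `Elements` macro and a hypothetical `struct descriptors`; neither is copied from radeonsi. Compilers may warn about the pointer case (for example via `-Wsizeof-pointer-div`), which is the point of the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for an array-length macro; valid only for real arrays. */
#define Elements(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical structure with one true array and one pointer member. */
struct descriptors {
    void *fixed[4];  /* array: Elements() yields 4 */
    void **dynamic;  /* pointer: Elements() yields sizeof(void **) / sizeof(void *) */
};

static size_t count_fixed(const struct descriptors *d)
{
    return Elements(d->fixed);
}

/* Wrong on purpose: the result does not depend on how many entries
 * 'dynamic' actually points to. */
static size_t count_dynamic_wrong(const struct descriptors *d)
{
    assert(d != NULL);
    return Elements(d->dynamic);
}
```

On common platforms, where all object pointers have the same size, `count_dynamic_wrong` evaluates to 1, so a cleanup loop bounded by it touches only the first entry. The element count for a pointer member has to be tracked separately.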
Christian König	58d2afa223	radeonsi: fix leaking the bound state on destruction v2 v2: rebased on stale pointer fixes Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-08 16:08:15 +01:00
Christian König	1fa2acba61	radeonsi: avoid stale state pointers Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-08 16:08:15 +01:00
Christian König	1a8c66023b	radeonsi: avoid stale pointers in si_delete_shader_selector Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-08 16:08:15 +01:00
Marek Olšák	c1a06da465	Revert "winsys/radeon: if there's VRAM-only usage, keep it" This reverts commit `67aef6dafa`. It caused GPU hangs. The question is why. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75900	2014-03-08 16:00:25 +01:00
Christian König	a995f564c7	radeon/vce: fix memory leak Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-08 14:43:53 +01:00
Sir Anthony	6e39a8f6ec	glcpp: Do not remove spaces to preserve locations. After preprocessing by glcpp all adjacent spaces were replaced by single one and glsl parser received column-shifted shader source. It negatively affected ast location set up and produced wrong error messages for heavily-spaced shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-08 01:38:32 -08:00
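The column shift described in the glcpp commit above can be reproduced with a small sketch. The helpers `collapse_spaces` and `column_of` are hypothetical and are not part of glcpp; they only show how collapsing runs of spaces changes the column at which a later token is reported.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy src to dst, replacing each run of spaces
 * with a single space. dst must hold at least strlen(src) + 1 bytes. */
static void collapse_spaces(const char *src, char *dst)
{
    int prev_space = 0;

    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; src++) {
        const int is_space = (*src == ' ');
        if (!(is_space && prev_space))
            *dst++ = *src;
        prev_space = is_space;
    }
    *dst = '\0';
}

/* Hypothetical helper: zero-based column of the first occurrence of c,
 * or -1 if c does not appear in the line. */
static int column_of(const char *line, char c)
{
    const char *p = strchr(line, c);
    return p != NULL ? (int)(p - line) : -1;
}
```

For the input `int    x;` (four spaces), `x` sits at column 7 in the original text but at column 4 once the spaces are collapsed, so an error location computed after collapsing no longer matches the shader source.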
Sir Anthony	da2275cd9b	glsl: Change locations from yylloc to appropriate tokens positions. Reviewed-by: Carl Worth <cworth@cworth.org>	2014-03-08 01:29:00 -08:00
Sir Anthony	5656775cf6	glsl: Add ast_node method to set location range. Reviewed-by: Carl Worth <cworth@cworth.org>	2014-03-08 01:29:00 -08:00
Sir Anthony	654ee41cd3	glsl: Make ast_node location comments more informative. Reviewed-by: Carl Worth <cworth@cworth.org>	2014-03-08 01:29:00 -08:00
Sir Anthony	433d562ac6	glsl: Extend ast location structure to handle end token position. Reviewed-by: Carl Worth <cworth@cworth.org>	2014-03-08 01:29:00 -08:00
Sir Anthony	6984aa4350	glsl: Update lexers in glsl and glcpp to handle end position of token. Reviewed-by: Carl Worth <cworth@cworth.org>	2014-03-08 01:29:00 -08:00
Vinson Lee	98fb8c95c0	scons: Add drivers/common/meta_generate_mipmap.c to src/mesa/SConscript. This patch fixes this SCons build error introduced with commit `70e7905608`. build/linux-x86_64-debug/mesa/libmesa.a(driverfuncs.os): In function `_mesa_init_driver_functions': src/mesa/drivers/common/driverfuncs.c:99: undefined reference to `_mesa_meta_GenerateMipmap' Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-03-07 23:39:29 -08:00
Kenneth Graunke	14ca611258	meta: Support GenerateMipmaps on 1DArray textures. I don't know how many people care about this case, but it's easy enough to do, so we may as well. The tricky part is that for some reason Mesa stores the number of array slices in Height, not Depth. I thought the easiest way to handle that here was to make Height = 1 (the actual height), and srcDepth = srcImage->Height. This requires some munging when calling _mesa_prepare_mipmap_level, so I created a wrapper that sorts it out for us. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:25 -08:00
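The Height/Depth swap described in the 1DArray commit above can be sketched as a small wrapper. This assumes only what the message states: the slice count of a 1D array texture is stored in the Height field. The types `tex_target`, `tex_image` and `tex_dims` and the function `logical_dims` are hypothetical and do not mirror Mesa's structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified texture description. */
enum tex_target { TEX_1D_ARRAY, TEX_2D_ARRAY, TEX_OTHER };

struct tex_image {
    unsigned Width, Height, Depth;
};

struct tex_dims {
    unsigned width, height, depth;
};

/* Return the dimensions a mipmap generator would want: for a 1D array
 * the real height is 1 and the slice count (stored in Height) becomes
 * the depth. Other targets pass through unchanged. */
static struct tex_dims logical_dims(enum tex_target target,
                                    const struct tex_image *img)
{
    struct tex_dims d;

    assert(img != NULL);
    d.width = img->Width;
    if (target == TEX_1D_ARRAY) {
        d.height = 1;
        d.depth = img->Height;
    } else {
        d.height = img->Height;
        d.depth = img->Depth;
    }
    return d;
}
```

Keeping the swap in one wrapper means the rest of the loop can treat 1D and 2D array textures the same way.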
Kenneth Graunke	158a7440c3	meta: Use srcWidth/Height/Depth rather than srcImage->Width and such. This is equivalent for now, and will differ once we add 1DArray support. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:19 -08:00
Kenneth Graunke	ec23d5197e	meta: Support GenerateMipmaps on 2DArray textures. This is largely a matter of looping over the number of slices/layers, and not minifying depth (presumably that code exists for the unfinished 3D texture support). Normally, I would have made the loop over array slices the outermost loop. I suspect that would make it trickier to support 3D textures someday, though, so I didn't. The advantage is that we would only have one BufferData call per slice, rather than one per miplevel and slice. However, a GenerateMipmaps microbenchmark indicates that either way is basically just as fast. So I'm not sure it's worth bothering. Improves performance in a GenerateMipmaps microbenchmark by nearly 5x. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:17 -08:00
Kenneth Graunke	15b2f69b9c	meta: Add a 'layer' argument to bind_fbo_image(). For array textures and 3D textures, this represents the layer to use. Just pass 0 for now. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:16 -08:00
Kenneth Graunke	be84d53d44	meta: Refactor code for binding a texture image to the FBO. Almost the exact same code appeared twice, and it needs to expand to handle additional texture targets. Refactor it to tidy up the code and avoid duplicating more work in the future. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:14 -08:00
Kenneth Graunke	45ee1b30d7	meta: Use minify() in GenerateMipmaps code. This is what the macro is for. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:13 -08:00
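As a rough illustration of what a minify-style helper computes, the sketch below halves a dimension once per mip level and clamps the result at 1. The name `minify_dim` is a hypothetical stand-in; Mesa's own macro may be defined differently.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for a minify helper: the size of a dimension
 * `levels` mip levels below `value`, never smaller than 1. */
static unsigned minify_dim(unsigned value, unsigned levels)
{
    unsigned shifted;

    /* Shifting by the full width of the type is undefined in C. */
    assert(levels < sizeof(unsigned) * CHAR_BIT);
    shifted = value >> levels;
    return shifted > 0 ? shifted : 1;
}
```

For example, a 256-texel dimension becomes 128 one level down, and a 1-texel dimension stays at 1 however many levels are requested.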
Kenneth Graunke	9afca91984	meta: Drop redundant FBO creation code in GenerateMipmaps. fallback_required() already creates the FBO in order to check whether we can render to the format. So it's guaranteed to exist. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:11 -08:00
Kenneth Graunke	1285bc87ac	meta: Replace GLboolean with bool in fallback_required(). This doesn't interact with the GL API, so we shouldn't use GL types. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:10 -08:00
Kenneth Graunke	092b7edb3f	meta: Make _mesa_meta_check_generate_mipmap_fallback static. This was only ever used in one place; there's no reason for it to be non-static. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:09 -08:00
Kenneth Graunke	70e7905608	meta: Split GenerateMipmap() into its own file. Putting the implementation of each GL function in its own file makes it much easier not to get lost. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:07 -08:00
Kenneth Graunke	3a7f3d843a	meta: De-static setup_texture_coords(). This will be used in multiple files soon. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:04 -08:00
Timothy Arceri	1308d21fbf	glapi: Add KHR_debug.xml	2014-03-08 15:45:26 +11:00
Timothy Arceri	6c3f5abc2d	mesa: add missing DebugMessageControl types Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-08 15:38:31 +11:00
Timothy Arceri	fb78fa58d2	mesa: make ARB_debug_output functions an alias of KHR_debug Also update dispatch sanity removing ARB_debug_output checks and removing KHR_debug placeholders as the checks have already been added V2: Make sure we exit case statements with conditional breaks rather than just dropping through. Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-08 15:38:31 +11:00
Timothy Arceri	0608d346aa	glapi: move KHR_debug into its own file Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-08 15:31:59 +11:00
Adel Gadllah	b972e55684	glx_pbuffer: Refactor GetDrawableAttribute Move the pdraw != NULL check out so that they don't have to be duplicated. Signed-off-by: Adel Gadllah <adel.gadllah@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 16:59:57 -08:00
Adel Gadllah	6b13cd1f7f	glx: Update glxext.h to revision 25407 Signed-off-by: Adel Gadllah <adel.gadllah@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 16:59:57 -08:00
Tom Stellard	a1b189ac90	radeon: Include radeon_elf_util.c in the list of LLVM_C_FILES v2 This fixes a build breakage caused by `6974eb9076` on build configurations where all the following are true: 1. radeonsi is not being built 2. r600g is being built 3. opencl is disabled 4. --enable-r600-llvm-compiler is not being used 5. libelf is not installed v2: - Add $(RADEON_CFLAGS) to libllvmradeon_la_CFLAGS Tested-by: Brian Paul <brianp@vmware.com>	2014-03-07 18:06:59 -05:00
Brian Paul	9b322d540a	st/mesa: only mark framebuffer as sRGB capable if Mesa supports the format Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-07 15:43:36 -07:00
Tom Stellard	6974eb9076	radeon/llvm: Factor elf parsing code out into its own function Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-07 13:31:52 -05:00
Tom Stellard	1f4a9fc84e	radeon: Rename struct radeon_llvm_binary to radeon_shader_binary v2 And move its definition into r600_pipe_common.h; This struct is just a container for shader code and has nothing to do with LLVM. v2: - Drop unrelated Makefile change Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-07 13:31:51 -05:00
Marek Olšák	d8fde8ffed	gallium: rename R4A4 and A4R4 formats to match their swizzle Like L4A4. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-07 18:07:05 +01:00
Marek Olšák	780ce576bb	mesa: fix the format of glEdgeFlagPointer Softpipe expects a float in the vertex shader, which is what glEdgeFlag generates. This fixes piglit/gl-2.0-edgeflag. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 18:07:05 +01:00
Marek Olšák	472ac0db08	radeonsi: fix blit compressed texture workaround to support 2D arrays We don't have a piglit test for this, but I think it's correct. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-07 18:07:05 +01:00
Marek Olšák	fcdf6fa86c	r600g: fix blitting the last 2 mipmap levels for Evergreen This fixes a lot of compressedteximage piglit tests. R600-R700 don't have this issue. Cc: mesa-stable@lists.freedesktop.org	2014-03-07 18:07:05 +01:00
Marek Olšák	8a08051e2a	r600g: fix texelFetchOffset GLSL functions Cc: mesa-stable@lists.freedesktop.org	2014-03-07 18:07:05 +01:00
Marek Olšák	67aef6dafa	winsys/radeon: if there's VRAM-only usage, keep it	2014-03-07 18:07:05 +01:00
Niels Ole Salscheider	f112ba03bb	radeon: Use upload manager for buffer downloads Using DMA for reads is much faster. Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-03-07 18:07:05 +01:00
Brian Paul	b46e8622f1	glapi: use 'Mesa' in error messages A user would have no idea what "_glthread_" is. This removes the last remaining instance of the _glthread_ string in Mesa. Reviewed-by: Chia-I Wu <olv@lunarg.com>	2014-03-07 09:04:01 -07:00
Brian Paul	6d2dffe8b1	st/mesa: add test_format_conversion() debug function To check that the st_mesa_format_to_pipe_format() and st_pipe_format_to_mesa_format() functions correctly convert all corresponding Mesa/Gallium formats. This found that MESA_FORMAT_YCBCR_REV was missing in st_mesa_format_to_pipe_format(). Fixed that too. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-07 07:31:29 -07:00
Brian Paul	d8f7e3d79e	st/mesa: add MESA_FORMAT_R8G8B8A8_SRGB in st_mesa_format_to_pipe_format() v2: rename patch after rebasing on top of Jose's changes. Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2014-03-07 07:31:18 -07:00
José Fonseca	b3689adf51	mesa/st: Fix PIPE_FORMAT_R8G8B8A8_SRGB -> MESA_FORMAT_ conversion. Copy'n'paste typo introduced in my `1d8e3067fd` commit. This fixes swapped RB channels I was seeing in my test machines. Trivial.	2014-03-07 13:35:24 +00:00
Kusanagi Kouichi	7233d4479e	st/vdpau: Add rotation v2 v2: add static asserts Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-07 09:20:11 +01:00
Kusanagi Kouichi	e7e207658c	vl: Add rotation v3 v2: rotate in gen_rect_verts instead v3: clear rotate in vl_compositor_clear_layers, update calc_drawn_area as well Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Signed-off-by: Christian König <christian.koenig@amd.com>	2014-03-07 09:20:11 +01:00
Christian König	53d1d879d5	st/omx/enc: fix crash on destruction Signed-off-by: Christian König <christian.koenig@amd.com>	2014-03-07 08:55:57 +01:00
Kenneth Graunke	378c6f2246	mesa: Drop unused hash_table::mem_ctx field. It's never used, and it's equivalent to ralloc_parent(ht) if you really need it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-06 20:55:34 -08:00
Michel Dänzer	9ceee5f4be	clover: Fix build against LLVM SVN r203065 or newer llvm/Linker.h was moved to llvm/Linker/Linker.h. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-03-07 11:12:12 +09:00
Brian Paul	0f0c16b238	mesa: add MESA_FORMAT_R8G8B8A8_SRGB To match PIPE_FORMAT_R8G8B8A8_SRGB. v2: fix component name copy&paste bugs Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-06 18:17:14 -07:00
Matt Turner	8d3f739383	mesa: Wrap SSE4.1 code in #ifdef __SSE4_1__. Because people insist on doing things like explicitly disabling SSE 4.1. Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Tested-by: David Heidelberger <david.heidelberger@ixit.cz> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71547	2014-03-06 15:46:54 -08:00
Eric Anholt	c10896b593	i965: Fix render-to-texture in non-FinishRenderTexture cases. We've had several problems now with FinishRenderTexture not getting called enough, and we're ready to just give up on it ever doing what we need. In particular, an upcoming Steam title had rendering bugs that could be fixed by always_flush_cache=true. Instead of hoping Mesa core can figure out when we need to flush our caches, just track what BOs we've rendered to in a set, and when we render from a BO in that set, emit a flush and clear the set. There's some overhead to keeping this set, but most of that is just hashing the pointer -- it turns out our set never even gets very large, because cache flushes are so common (even on cairo-gl). No statistically significant performance difference in cairo-gl (n=100), despite spending ~.5% CPU in these set operations. v1: (Original patch by Eric Anholt.) v2: (Changes by Ken Graunke.) - Rebase forward from May 7th 2013 -> March 4th 2014. - Drop the FinishRenderTexture hook entirely; after rebasing the patch, the hook was just an empty function. - Move the brw_render_cache_set_clear() call from intel_batchbuffer_emit_flush() to brw_emit_pipe_control_flush(). In theory, this could catch more cases where we've flushed. - Consider stencil as a possible texturing source. v3: (changes by anholt): - Move set_clear() back to emit_mi_flush() -- it means we can drop more forced flushes from the code. In the previous location, it wouldn't have been called when we wanted pre-gen6. - Move the set clear from batch init to reset -- it should be empty at the start of every batch, since the kernel handled any inter-batch flush for us. v4: Drop the debug code in set.c that I accidentally committed. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Dylan Baker <baker.dylan.c@gmail.com> [v2]	2014-03-06 11:35:17 -08:00
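The tracking scheme described in the render-to-texture commit above can be sketched as follows: remember every buffer that has been rendered to, and when one of them is about to be sampled, flush and forget the whole set. This is a minimal model only. The commit uses a hash set keyed by pointer; the fixed-size array, the `render_cache` struct and all function names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SET_CAPACITY 64  /* hypothetical limit for this sketch */

struct render_cache {
    const void *bos[CACHE_SET_CAPACITY]; /* buffers rendered to since last flush */
    unsigned count;
    unsigned flushes;                    /* number of flushes emitted so far */
};

static bool cache_contains(const struct render_cache *c, const void *bo)
{
    unsigned i;

    for (i = 0; i < c->count; i++) {
        if (c->bos[i] == bo)
            return true;
    }
    return false;
}

/* Stands in for emitting a cache flush; the set is empty afterwards. */
static void cache_flush(struct render_cache *c)
{
    c->count = 0;
    c->flushes++;
}

/* Record that `bo` was used as a render target. */
static void cache_add(struct render_cache *c, const void *bo)
{
    assert(c != NULL);
    if (cache_contains(c, bo))
        return;
    if (c->count == CACHE_SET_CAPACITY)
        cache_flush(c); /* conservative choice when this sketch runs out of room */
    c->bos[c->count++] = bo;
}

/* Call before sampling from `bo`: flush only if it was rendered to. */
static void cache_check_flush(struct render_cache *c, const void *bo)
{
    assert(c != NULL);
    if (cache_contains(c, bo))
        cache_flush(c);
}
```

Because every flush empties the set, the set stays small whenever flushes are frequent, which matches the observation in the commit message.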
Brian Paul	1e25aa4cdb	mesa: fix copy & paste bugs in pack_ubyte_SRGB8() Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-06 11:39:41 -07:00
Brian Paul	9493fc729e	mesa: fix copy & paste bugs in pack_ubyte_SARGB8() Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-06 11:16:15 -07:00
Aaron Watry	fb78152678	gallium/util: Fix memory leak Fix a leaked vertex shader in u_blitter.c Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> CC: "10.1" <mesa-stable@lists.freedesktop.org>	2014-03-06 11:38:26 -06:00
José Fonseca	1d8e3067fd	st/mesa: Add R8G8B8A8_SRGB case to st_pipe_format_to_mesa_format. With the recent SRGB changes all my automated OpenGL llvmpipe tests (piglit, conform, glretrace) start asserting with the backtrace below. I'm hoping this change will fix it. I'm not entirely sure, as this doesn't happen in my development machine (the bug probably depends on the exact X visual). Anyway, it seems the sensible thing to do here. Program terminated with signal 5, Trace/breakpoint trap. #0 _debug_assert_fail (expr=expr@entry=0x7fa324df2ed7 "0", file=file@entry=0x7fa324e3fc30 "src/mesa/state_tracker/st_format.c", line=line@entry=758, function=function@entry=0x7fa324e40160 <__func__.34798> "st_pipe_format_to_mesa_format") at src/gallium/auxiliary/util/u_debug.c:281 No locals. 
#1 0x00007fa3241d22b3 in st_pipe_format_to_mesa_format (format=format@entry=PIPE_FORMAT_R8G8B8A8_SRGB) at src/mesa/state_tracker/st_format.c:758 __func__ = "st_pipe_format_to_mesa_format" #2 0x00007fa3241c8ec5 in st_new_renderbuffer_fb (format=format@entry=PIPE_FORMAT_R8G8B8A8_SRGB, samples=0, sw=<optimised out>) at src/mesa/state_tracker/st_cb_fbo.c:295 strb = 0x19e8420 #3 0x00007fa32409d355 in st_framebuffer_add_renderbuffer (stfb=stfb@entry=0x19e7fa0, idx=<optimised out>) at src/mesa/state_tracker/st_manager.c:314 rb = <optimised out> format = PIPE_FORMAT_R8G8B8A8_SRGB sw = <optimised out> #4 0x00007fa32409e635 in st_framebuffer_create (st=0x19e7fa0, st=0x19e7fa0, stfbi=0x19e7a30) at src/mesa/state_tracker/st_manager.c:458 stfb = 0x19e7fa0 mode = {rgbMode = 1 '\001', floatMode = 0 '\000', colorIndexMode = 0 '\000', doubleBufferMode = 0, stereoMode = 0, haveAccumBuffer = 0 '\000', haveDepthBuffer = 1 '\001', haveStencilBuffer = 1 '\001', redBits = 8, greenBits = 8, blueBits = 8, alphaBits = 8, redMask = 0, greenMask = 0, blueMask = 0, alphaMask = 0, rgbBits = 32, indexBits = 0, accumRedBits = 0, accumGreenBits = 0, accumBlueBits = 0, accumAlphaBits = 0, depthBits = 24, stencilBits = 8, numAuxBuffers = 0, level = 0, visualRating = 0, transparentPixel = 0, transparentRed = 0, transparentGreen = 0, transparentBlue = 0, transparentAlpha = 0, transparentIndex = 0, sampleBuffers = 0, samples = 0, maxPbufferWidth = 0, maxPbufferHeight = 0, maxPbufferPixels = 0, optimalPbufferWidth = 0, optimalPbufferHeight = 0, swapMethod = 0, bindToTextureRgb = 0, bindToTextureRgba = 0, bindToMipmapTexture = 0, bindToTextureTargets = 0, yInverted = 0, sRGBCapable = 1} idx = <optimised out> #5 st_framebuffer_reuse_or_create (st=st@entry=0x19dfce0, fb=<optimised out>, stfbi=stfbi@entry=0x19e7a30) at src/mesa/state_tracker/st_manager.c:728 No locals. 
#6 0x00007fa32409e8cc in st_api_make_current (stapi=<optimised out>, stctxi=0x19dfce0, stdrawi=0x19e7a30, streadi=0x19e7a30) at src/mesa/state_tracker/st_manager.c:747 st = 0x19dfce0 stdraw = 0x640064 stread = 0x1300000006 ret = <optimised out> #7 0x00007fa324074a20 in XMesaMakeCurrent2 (c=c@entry=0x195bb00, drawBuffer=0x19e7e90, readBuffer=0x19e7e90) at src/gallium/state_trackers/glx/xlib/xm_api.c:1194 No locals. #8 0x00007fa3240783c8 in glXMakeContextCurrent (dpy=0x194e900, draw=8388610, read=8388610, ctx=0x195bac0) at src/gallium/state_trackers/glx/xlib/glx_api.c:1177 drawBuffer = <optimised out> readBuffer = <optimised out> xmctx = 0x195bb00 glxCtx = 0x195bac0 firsttime = 0 '\000' no_rast = 0 '\000' #9 0x00007fa32407852f in glXMakeCurrent (dpy=<optimised out>, drawable=<optimised out>, ctx=<optimised out>) at src/gallium/state_trackers/glx/xlib/glx_api.c:1211 No locals. Acked-by: Brian Paul <brianp@vmware.com>	2014-03-06 17:23:17 +00:00
Brian Paul	84094a273e	glapi: remove u_mutex wrapper code, use c99 thread mutexes directly v2: fix initializer mistake spotted by Chia-I Wu. Reviewed-by: Chia-I Wu <olv@lunarg.com>	2014-03-06 07:53:06 -07:00
Brian Paul	846a7e8630	glapi: rename u_current dispatch table functions Put "table" in the names to make things more understandable. Reviewed-by: Chia-I Wu <olv@lunarg.com>	2014-03-06 07:47:12 -07:00
Brian Paul	280e065707	glapi: replace 'user' with 'context' in u_current.[ch] code To make the functions more understandable. Reviewed-by: Chia-I Wu <olv@lunarg.com>	2014-03-06 07:47:05 -07:00
Brian Paul	ef8a19ed4f	glsl: fix compiler warnings in link_uniforms.cpp With a non-debug build, gcc has two complaints: 1. 'found' var not used. Silence with '(void) found;' 2. 'id' not initialized. It's assigned by the UniformHash->get() call, actually. But init it to zero to silence gcc. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-06 07:45:36 -07:00
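Both warnings mentioned in the commit above tend to appear when a variable is only read inside `assert()`, which compiles to nothing under `NDEBUG`. The sketch below shows the general pattern; the lookup table and the names `find_id` and `lookup_id` are hypothetical and unrelated to link_uniforms.cpp.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical name table standing in for a hash lookup. */
static const char *const known_names[] = { "color", "mvp", "tex" };

/* Write the index of `name` to *id and return true, or return false. */
static bool find_id(const char *name, unsigned *id)
{
    unsigned i;

    for (i = 0; i < sizeof(known_names) / sizeof(known_names[0]); i++) {
        if (strcmp(known_names[i], name) == 0) {
            *id = i;
            return true;
        }
    }
    return false;
}

static unsigned lookup_id(const char *name)
{
    /* Initialised so the compiler cannot report a maybe-uninitialised use. */
    unsigned id = 0;
    const bool found = find_id(name, &id);

    assert(found);  /* disappears when NDEBUG is defined */
    (void) found;   /* keeps `found` "used" in non-debug builds */
    return id;
}
```

The `(void)` cast has no runtime effect; it only tells the compiler that ignoring the value is intentional.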
Ilia Mirkin	3649800009	mesa/st: only compare the one scissor sizeof(scissor) returns the size of the full array rather than a single element. Fix it to consider just the one element. Fixes: `0705fa35` ("st/mesa: add support for GL_ARB_viewport_array (v0.2)") Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2014-03-05 22:51:58 -05:00
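The sizeof pitfall fixed in the scissor commit above is easy to reproduce: applied to an array member, `sizeof` yields the size of the whole array, so a `memcmp` meant for one element compares all of them. In this sketch the struct layout, `MAX_VIEWPORTS` and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_VIEWPORTS 16  /* hypothetical array length */

struct scissor_state {
    unsigned minx, miny, maxx, maxy;
};

struct scissor_set {
    struct scissor_state scissor[MAX_VIEWPORTS];
};

/* Too broad: sizeof(a->scissor) covers all MAX_VIEWPORTS elements. */
static bool first_equal_whole_array(const struct scissor_set *a,
                                    const struct scissor_set *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a->scissor, b->scissor, sizeof(a->scissor)) == 0;
}

/* Intended: sizeof(a->scissor[0]) covers exactly one element. */
static bool first_equal_one_element(const struct scissor_set *a,
                                    const struct scissor_set *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(&a->scissor[0], &b->scissor[0],
                  sizeof(a->scissor[0])) == 0;
}
```

When two sets differ only in element 1, the one-element comparison reports the first scissor as unchanged while the whole-array comparison reports a difference.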
Chia-I Wu	4c68c6dcff	st/mesa: make winsys fbo sRGB-capable when supported The texture formats of winsys fbo are always linear because the st manager (st/dri for example) could not know the colorspace used. But it does not mean that we cannot make the fbo sRGB-capable. By - setting rb->Visual.sRGBCapable to GL_TRUE when the pipe driver supports the format in sRGB colorspace, - giving rb an sRGB internal format, and - updating code to check rb->Format instead of strb->texture->format, we should be good. Fixed bug 75226 for at least llvmpipe and ilo, with no piglit regression. v2: do not set rb->Visual.sRGBCapable for GLES contexts to avoid surprises Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75226 Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-06 10:59:25 +08:00
Chia-I Wu	6d23ca1621	st/mesa: add mappings for MESA_FORMAT_B8G8R8X8_SRGB The format is mapped to PIPE_FORMAT_B8G8R8X8_SRGB. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-06 10:59:25 +08:00
Chia-I Wu	5a27491a76	mesa: add MESA_FORMAT_B8G8R8X8_SRGB The format is needed to represent an RGB-only winsys framebuffer that is sRGB-capable. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-06 10:59:25 +08:00
Brian Paul	48a9094b69	mesa: fix packing/unpacking for MESA_FORMAT_A4R4G4B4_UNORM Spotted by Chia-I Wu. v2: also fix unpack_ubyte_ARGB4444_REV() Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-03-05 16:06:54 -07:00
Eric Anholt	171ec9585f	i965: Fix predicated-send-based discards with MRT. We need the header setup to not be predicated on which pixels are undiscarded. I'm not sure originally if I had thought that the mask disable implied predicate disable, or if I had just misread the mask disable as predicate disable. Either way, I know I had spent more time thinking about this in the gen8 generator than the gen7 generator. Plus, it turns out that I had mis-implemented the "the GPU will use the predicate unless this header is present" comment, by skipping setting up the pixel mask when the header was present. Fixes GPU hangs in piglit glsl-fs-discard-mrt, Trine, Trine 2 and presumably MLL. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75207 Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-05 13:37:33 -08:00
Eric Anholt	9856d658ce	configure: Fix bashism. /bin/sh defaults to dash on debian. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-05 13:37:33 -08:00
Andreas Boll	c1958911f1	docs: update 10.2 release notes	2014-03-05 22:20:48 +01:00
Brian Paul	02cb04c68f	mesa: remove remaining uses of _glthread_GetID() It was really only used in the radeon driver for a debug printf. And evidently, libGL.so referenced it just to work around some sort of linker issue. This patch removes the two calls to the function and the function itself. Fixes undefined _glthread_GetID symbol in libGL reported by 'nm'. Though, the missing symbol doesn't cause any issues on my system but it does cause glxinfo to fail on one of our test systems. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-05 11:05:48 -07:00
Brian Paul	0b0114cc3b	mesa: new init_teximage_fields_ms() function to init MS texture images Before, it was kind of ugly to set the multisample fields with assignments after we called _mesa_init_teximage_fields(). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-03-05 11:05:47 -07:00
Rob Clark	4de1e5eddc	WIP: freedreno/a3xx: incorrect scissor for binning pass If scissor optimization is used (to avoid bringing scissored portions of the render target into GMEM and then back out to system memory) in combination with hw binning pass, the result would be a scissor mismatch between binning pass and rendering pass. This would cause rendering bugs in some scenarios with (for example) gnome-shell. I would have expected that simply using the correct screen-scissor during the binning pass would be enough, but seems like there is something else missing. So for now disable binning pass if scissor optimization is used.	2014-03-05 12:37:21 -05:00
Topi Pohjolainen	12d55d5f19	i965: Mark invariants in backend_visitor as constants Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-05 10:31:57 +02:00
Topi Pohjolainen	a290cd039c	i965: Merge resolving of shader program source Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-05 10:31:44 +02:00
Topi Pohjolainen	81494ec613	i965: Merge initialisation of backend_visitor Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-05 10:20:21 +02:00
Topi Pohjolainen	afed5354aa	i965/wm: Use resolved miptree consistently in surface setup Most of the logic refers to the local variable 'mt' directly but a few cases use 'intelObj->mt' instead. These are the same for now but will be different once stencil miptree gets used. v2 (Ian): fixed also indentation in surrounding lines Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-05 10:19:19 +02:00
Topi Pohjolainen	9b169a1893	i965/vec4: Mark invariant members as constants in vec4_visitor Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-05 10:13:57 +02:00
Topi Pohjolainen	8a9b4ade03	i965: Mark sources for offset getters as constants Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-05 10:13:05 +02:00
Ian Romanick	8f049dc298	docs: Import 10.1 release notes, add news item. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-05 09:32:26 +02:00
Ilia Mirkin	c74783abfa	nv50,nvc0: add 11f_11f_10f vertex support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-04 21:54:54 -05:00
Kenneth Graunke	dfa1ab0e52	i965: Implement ARB_stencil_texturing on Gen8+. On earlier hardware, we had to implement math in the shader to translate Y-tiled or untiled coordinates to W-tiled coordinates (which is what BLORP does today in order to texture from stencil buffers). On Broadwell, we can simply state that it's W-tiled in SURFACE_STATE, and adjust the pitch. This is much easier. In the surface state code, I chose to handle the "should we sample depth or stencil?" question separately from the setup for sampling from stencil. This should make it work with the BindRenderbufferTexImage hook as well, and hopefully be reusable for GL_ARB_texture_stencil8 someday. v2: Update docs/GL3.txt (caught by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-04 17:23:03 -08:00
Kenneth Graunke	23e81b93bb	mesa: Add core API support for GL_ARB_stencil_texturing (from 4.3). While the GL_ARB_stencil_texturing extension does not allow the creation of stencil textures, it does allow shaders to sample stencil values stored in packed depth/stencil textures. Specifically, applications can call glTexParameter* with a pname of GL_DEPTH_STENCIL_TEXTURE_MODE and value of either GL_DEPTH_COMPONENT or GL_STENCIL_INDEX to select which component they wish to sample. The default value is GL_DEPTH_COMPONENT (for traditional depth sampling). Shaders should use an unsigned integer sampler (presumably usampler2D) to access stencil data. Otherwise, results are undefined. Using shadow samplers with GL_STENCIL_INDEX selected also is undefined behavior. This patch creates a new gl_texture_object field, StencilSampling, to indicate that stencil should be sampled rather than depth. (I chose to use a boolean since I figured it would be more convenient for drivers.) It also introduces the [Get]TexParameter code to get and set the value, and of course the extension plumbing. v2: Also consider textures incomplete when sampling stencil with non-NEAREST min/mag filters (caught by Eric Anholt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-04 17:21:06 -08:00
Dieter Nützel	5f23a2d9c2	radeon/uvd: fix typo in documentation s/grap/grab/ Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2014-03-04 17:54:07 -05:00
Eric Anholt	b959fd9674	dri: Require libudev-dev for building DRI on Linux. The loader infrastructure for everything but DRI2 requires that udev be present, so we can figure out an appropriate driver from the fd. We don't have a portable solution yet, but presumably it will have similar lookup based on the device node. It will also be even more required for krh's udev-based hwdb support, which lets us have a loader that actually loads DRI drivers not included in the loader's source distribution. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75212 Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-04 14:32:52 -08:00
Tom Stellard	262e15fdd4	clover: Use correct LLVM version in #if for DataLayout construction Spotted by Michel Dänzer.	2014-03-04 16:22:09 -05:00
Zack Rusin	1dd84357ec	translate: fix buffer overflows Because in draw we always inject position at slot 0 whenever fragment shader would take the maximum number of inputs (32) it meant that we had PIPE_MAX_ATTRIBS + 1 slots to translate, which meant that we were crashing with fragment shaders that took the maximum number of attributes as inputs. The actual max number of attributes we need to translate thus is PIPE_MAX_ATTRIBS + 1. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Matthew McClure <mcclurem@vmware.com>	2014-03-04 15:56:04 -05:00
Zack Rusin	08f174daa4	draw/llvm: fix generation of the VS with GS present draw_current_shader_* functions return a final output when considering both the geometry shader and the vertex shader. But when code generating vertex shader we can not be using output slots from the geometry shader because, obviously, those can be completely different. This fixes a number of very non-obvious crashes. A side-effect of this bug was that sometimes the vertex shading code could save some random outputs as position/clip when the geometry shader was writing them and vertex shader had different outputs at those slots (sometimes writing garbage and sometimes something correct). Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Matthew McClure <mcclurem@vmware.com>	2014-03-04 15:37:52 -05:00
Anuj Phogat	079bff5a99	mesa: Allow GL_DEPTH_COMPONENT and GL_DEPTH_STENCIL combinations in glTexImage{123}D() From OpenGL 3.3 spec, page 141: "Textures with a base internal format of DEPTH_COMPONENT or DEPTH_STENCIL require either depth component data or depth/stencil component data. Textures with other base internal formats require RGBA component data. The error INVALID_OPERATION is generated if one of the base internal format and format is DEPTH_COMPONENT or DEPTH_STENCIL, and the other is neither of these values." Fixes Khronos OpenGL CTS test failure: proxy_textures_invalid_size Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-04 11:23:04 -08:00
Anuj Phogat	0f6f92e284	mesa: Use clear_teximage_fields() in place of _mesa_init_teximage_fields() This patch makes no functional changes to the code. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-04 11:22:58 -08:00
Anuj Phogat	063980151e	mesa: Set initial internal format of a texture to GL_RGBA From OpenGL 4.0 spec, page 398: "The initial internal format of a texel array is RGBA instead of 1. TEXTURE_COMPONENTS is deprecated; always use TEXTURE_INTERNAL_FORMAT." Fixes Khronos OpenGL CTS test failure: proxy_textures_invalid_size Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-04 11:22:39 -08:00
Vinson Lee	f2d724c686	scons: Build with C++11 with LLVM >= 3.5. Starting with llvm-3.5svn r202574, LLVM expects C++11 mode. commit f8bc17fadc8f170c1126328d203f0dab78960137 Author: Chandler Carruth <chandlerc@gmail.com> Date: Sat Mar 1 06:31:00 2014 +0000 [C++11] Turn off compiler-based detection of R-value references, relying on the fact that we now build in C++11 mode with modern compilers. This should flush out any issues. If the build bots are happy with this, I'll GC all the code for coping without R-value references. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202574 91177308-0d34-0410-b5e6-96231b3b80d8 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-03-04 10:12:20 -08:00
Brian Paul	cbacee207f	st/osmesa: check buffer size when searching for buffers Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75543 Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-03-04 08:49:15 -07:00
José Fonseca	3d7c8836a6	configure: s/--with-llvm-shared-libs/--enable-llvm-shared-libs/ `--with-llvm-shared-libs` option was recently renamed as `--enable-llvm-shared-libs`, but several error messages still mention the old option, causing confusion. Trivial.	2014-03-04 14:09:37 +00:00
José Fonseca	a61d859519	c11/threads: Don't implement thrd_current on Windows. GetCurrentThread() returns a pseudo-handle (a constant which only makes sense when used within the calling thread) and not a real handle. DuplicateHandle() will return a real handle, but it will create a new handle every time we call. Calling DuplicateHandle() here means we will leak handles, which can cause serious problems. In short, the Windows implementation of thrd_t needs a thorough makeover, and it won't be pretty. It looks like C11 committee over-simplified things: it would be much better to have separate objects for threads and thread IDs like C++11 does. For now, just comment out the thrd_current() implementation, so we get build errors if anybody tries to use it. Thanks to Brian Paul for spotting and diagnosing this problem. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-04 12:05:23 +00:00
José Fonseca	e8d85034da	mapi/u_thread: Use GetCurrentThreadId u_thread_self() expects thrd_current() to return a unique numeric ID for the current thread, but this is not feasible on Windows. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-04 12:05:23 +00:00
José Fonseca	f34d75d6f6	c11/threads: Fix nano to millisecond conversion. Per https://gist.github.com/yohhoy/2223710/#comment-710118 Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Michel Dänzer <michel@daenzer.net>	2014-03-04 12:05:23 +00:00
Marek Olšák	1337da5115	r600g: implement edge flags Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-03-04 12:26:16 +01:00
Marek Olšák	ac35ded473	r600g: port color buffer format conversion from radeonsi r600_translate_colorformat is rewritten to look like radeonsi. r600_translate_colorswap is shared with radeonsi. r600_colorformat_endian_swap is consolidated. This adds some formats which were missing. Future "plain" formats will automatically be supported. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-03-04 12:26:16 +01:00
Marek Olšák	dff3eccd15	radeonsi: move translate_colorswap to common code Also translate the Y__X swizzle. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-03-04 12:26:16 +01:00
Emil Velikov	1a568e0f2b	Revert "configure: use enable_dri_glx local variable" This reverts commit `dfe8cb48fc`. Accidentally pushed this commit, over 1bb23abe065(configure: disable shared glapi when building xlib powered glx).	2014-03-04 02:13:48 +00:00
Emil Velikov	1bb23abe06	configure: disable shared glapi when building xlib powered glx With commit 0432aa064b(configure: use shared-glapi when more than one gl* API is used) we removed "disable shared-glapi when building without dri" hunk. In the good old days of classic mesa, dri and xlib-glx were mutually exclusive thus the hunk made sense. Currently enable-dri is used as a synonym for a range of things thus it's more appropriate to handle xlib-glx explicitly. Fixes a missing symbol '_glapi_Dispatch' in a xlib powered libGL, built using the following: ./autogen.sh --enable-xlib-glx --disable-dri --with-gallium-drivers=swrast Cc: Brian Paul <brianp@vmware.com> Reported-by: Brian Paul <brianp@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-03-04 02:13:14 +00:00
Brian Paul	1e3bdb35a6	mesa: remove unneeded glthread.c file The _glthread_GetID() function is also defined in mapi_glapi.c Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:09:00 -07:00
Brian Paul	db806cacfd	mesa: remove empty glthread.h file Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:59 -07:00
Brian Paul	94dc91d7ec	mesa: remove unused glthread/TSD macros Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:59 -07:00
Brian Paul	bc76e9f28d	xlib: remove unneeded context tracking code This removes the only use of _glthread_Get/SetTSD(), etc. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:59 -07:00
Brian Paul	c00b250c80	xlib: simplify context handling Get rid of the fake_glx_context struct. Now, an XMesaContext is the same as a GLXContext. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:59 -07:00
Brian Paul	9b8e267976	xlib: remove unused realglx.[ch] files At one point in time, the xlib driver could call the real GLX functions. But that's long dead. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:59 -07:00
Brian Paul	afbc9b3537	mesa: remove unused _glthread_*MUTEX() macros Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:59 -07:00
Brian Paul	f19000550d	glsl: switch to c11 mutex functions Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:58 -07:00
Brian Paul	d129ea7fa2	mesa: switch to c11 mutex functions Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:58 -07:00
Brian Paul	2706db701d	xlib: switch to c11 mutex functions The _glthread_LOCK/UNLOCK_MUTEX() macros are just wrappers around the c11 mutex functions. Let's start getting rid of those wrappers. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:58 -07:00
Brian Paul	657436da7e	mesa: update packed format layout comments Update the comments for the packed formats to accurately reflect the layout of the bits in the pixel. For example, for the packed format MESA_FORMAT_R8G8B8A8, R is in the least significant position while A is in the most-significant position of the 32-bit word. v2: also fix MESA_FORMAT_A1B5G5R5_UNORM, per Roland.	2014-03-03 13:08:58 -07:00
Hans	837da9bdae	mesa: don't define c99 math functions for MSVC >= 1800 Signed-off-by: Brian Paul <brianp@vmware.com> Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>	2014-03-03 11:56:33 -07:00
Hans	bf25660325	util: don't define isfinite(), isnan() for MSVC >= 1800 Signed-off-by: Brian Paul <brianp@vmware.com> Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>	2014-03-03 11:56:30 -07:00
Brian Paul	aff7c5e78a	mesa: don't call ctx->Driver.ClearBufferSubData() if size==0 Fixes failed assertion when trying to map zero-length region. https://bugs.freedesktop.org/show_bug.cgi?id=75660 Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-03 10:41:42 -07:00
Brian Paul	465b2c42bc	softpipe: use 64-bit arithmetic in softpipe_resource_layout() To avoid 32-bit integer overflow for large textures. Note: we're already doing this in llvmpipe. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-03 10:41:42 -07:00
Grigori Goronzy	070036ca39	NV_vdpau_interop: fix IsSurfaceNV return type The spec incorrectly used void as return type, when it should have been GLboolean. This has now been fixed. According to Nvidia, their implementation always used GLboolean. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-03 18:37:59 +01:00
Grigori Goronzy	86c06871a2	st/vdpau: fix possible NULL dereference Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-03 18:37:35 +01:00
Christian König	bd6654aa38	st/omx: always advertise all components omx_component_library_Setup should return all entrypoints the library implements, independent of what is available on the current hardware. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74944 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2014-03-03 18:22:38 +01:00
Bruno Jiménez	79c83837c9	clover: Fix building with latest llvm Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-03-03 17:16:58 +01:00
Bruno Jiménez	089d0660c7	configure: Remove more flags from llvm-config This way, we are left with only the preprocessor flags and '-std=X' Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-03-03 17:16:52 +01:00
Fabio Pedretti	8a8dd86edc	configure.ac: consolidate dependencies version check Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-03 16:45:16 +01:00
Julien Cristau	6f0e2731e8	glx/dri2: fix build failure on HURD Patch from Debian package. Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-03 16:44:44 +01:00
Dave Airlie	15b4ff3f4e	st/dri: add support for dma-buf importer (DRIimage v8) This is just a simple implementation that stores the extra values into the DRIimage struct and just uses the fd importer. I haven't looked into what is required to import YUV or deal with the extra parameters. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-03-03 11:14:38 +10:00
Dave Airlie	3fd081d1a5	st/dri: move fourcc->format conversion to a common place Before I cut-n-paste this a 3rd time let's consolidate it. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-03-03 11:14:38 +10:00
Kenneth Graunke	c95ec27a4a	mesa: Move MESA_GLSL=dump output to stderr. i965 recently moved debug printfs to use stderr, including ones which trigger on MESA_GLSL=dump. This resulted in scrambled output. For drivers using ir_to_mesa, print_program was already using stderr, yet all the code around it was using stdout. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-02 13:37:09 -08:00
Kenneth Graunke	3f37dd913f	glsl: Fix broken LRP algebraic optimization. opt_algebraic was translating lrp(x, 0, a) into add(x, -mul(x, a)). Unfortunately, this references "x" twice, which is invalid in the IR, leading to assertion failures in the validator. Normally, cloning IR solves this. However, "x" could actually be an arbitrary expression tree, so copying it could result in huge piles of wasted computation. This is why we avoid reusing subexpressions. Instead, transform it into mul(x, add(1.0, -a)), which is equivalent but doesn't need two references to "x". Fixes a regression since `d5fa8a9562`, which isn't in any stable branches. Fixes 18 shaders in shader-db (bastion and yofrankie). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-02 13:35:03 -08:00
Rob Clark	ecb71cfa66	freedreno/a3xx/compiler: overflow in trans_endif The logic to count number of block outputs was out of sync with the actual array construction. But to simplify / make things less fragile, we can just allocate the arrays for worst case size. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Rob Clark	e0007f733d	freedreno/a3xx/compiler: fix for resolving PHI's A value may be assigned on only one side of an if/else. In this case we can simply substitute a mov.f32f32. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Rob Clark	26530716ab	freedreno/lowering: two-sided-color Add option to generate fragment shader to emulate two sided color. Additional inputs are added to shader for BCOLOR's (one corresponding to each COLOR input). CMP instructions are used to select whether to use COLOR or BCOLOR. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Rob Clark	8dd70125fc	freedreno/a3xx/compiler: add SSG Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Rob Clark	44c8f96b0d	freedreno/a3xx: fix gl_PointSize If vertex writes pointsize, there are a few extra bits we need to turn on in the cmdstream here and there. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Rob Clark	05a9bda971	freedreno: resync generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Rob Clark	cb540c21f2	freedreno/a3xx: binning-pass vertex shader variant Now that we have the infrastructure for shader variants, add support to generate an optimized shader for hw binning pass (with varyings/outputs other than position/pointsize removed). This exposes the possibility that the shader uses fewer constants than what is bound, so we have to take care to not emit consts beyond what the shader uses, lest we provoke the wrath of the HLSQ lockup! Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Rob Clark	664045752f	freedreno/a3xx: add support for frag coord/face Fixes anything that tries to use gl_FrontFacing/gl_FragCoord. Also, face support is needed to emulate two sided color. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Rob Clark	76924e3b51	freedreno/a3xx: fix for unused inputs An unused input might not have a register assigned. We don't want bogus regid to result in impossibly high max_reg.. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Chris Forbes	befbda56a2	i965: Validate (and resolve) all the bound textures. BRW_MAX_TEX_UNIT is the static limit on the number of textures we support per-stage, not in total. Core's `Unit` array is sized by MAX_COMBINED_TEXTURE_IMAGE_UNITS, which is significantly larger, and across the various shader stages, up to ctx->Const.MaxCombinedTextureImageUnits elements of it may be actually used. Fixes invisible bad behavior in piglit's max-samplers test (although this escalated to an assertion failure on HSW with texture_view, since non-immutable textures only have _Format set by validation.) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-02 21:14:56 +13:00
Chris Forbes	590920f93e	i965: Widen sampler key bitfields for 32 samplers Previously the `high` 16 samplers on Haswell+ would not get sampler workarounds applied. Don't bother widening YUV fields, since they're ignored and going away soon anyway. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-02 21:14:18 +13:00
Emil Velikov	fc25956bad	dri/i9*5: correctly calculate the amount of system memory The variable name states megabytes, while we calculate the amount in kilobytes. Correct this by dividing with the correct amount. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-01 08:49:59 -08:00
Ilia Mirkin	f19271c7bf	gallium/util: add missing u_math include This is needed for MIN2/MAX2 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-02-28 20:00:34 -05:00
Brian Paul	a12d4d0398	mesa: add unpacking code for MESA_FORMAT_Z32_FLOAT_S8X24_UINT Fixes glGetTexImage() when converting from MESA_FORMAT_Z32_FLOAT_S8X24_UINT to GL_UNSIGNED_INT_24_8. Hit by the piglit ext_packed_depth_stencil-getteximage test. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-28 17:16:37 -07:00
Siavash Eliasi	2a399d9eae	glx/apple: Fixed glx context memory leak in case of failure. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2014-02-28 15:57:15 -08:00
Siavash Eliasi	f4416323fc	gbm/dri: Fixed buffer object memory leak in case of failure. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-02-28 15:57:15 -08:00
Siavash Eliasi	0fe8d71667	r300g/tests: Added missing fclose for FILE resource. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-02-28 15:57:15 -08:00
Ian Romanick	ff2cbf9e0c	i915: Allocate the sys_buffer using _mesa_align_malloc Though it won't matter on Linux, use _mesa_align_free to release it. Since i965 doesn't have sys_buffer, I overlooked this in the GL_ARB_map_buffer_alignment work a few months ago. Fixes i915 (and presumably i830) regressions in ARB_map_buffer_range tests and the failure in arb_map_buffer_alignment-sanity_test. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74960 Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-28 15:05:39 -08:00
Ian Romanick	8ba157006f	i915: Only allow 8 vertex texture units There's no reason to have more vertex texture units than fragment texture units on this hardware. Since increasing the default maximum number of texture units from 16 to 32, this has triggered some segfault in i915 driver. There's probably some array or bitfield that isn't properly sized now. This really papers over the bug, but I don't think I'll lose any sleep over that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74071 Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-28 15:05:38 -08:00
Petri Latvala	59989a4a92	i965: Assert array index on access to vec4_visitor's arrays. v2: vec4_visitor::pack_uniform_registers(): Use correct comparison in the assert, this->uniforms is already adjusted. Compare the actual value used to index uniform_size and uniform_vector_size instead. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-28 15:05:38 -08:00
Petri Latvala	7189fce237	i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically. v2: Don't add function parameters, pass the required size in prog_data->nr_params. v3: - Use the name uniform_array_size instead of uniform_param_count. - Round up when dividing param_count by 4. - Use MAX2() instead of taking the maximum by hand. - Don't crash if prog_data passed to vec4_visitor constructor is NULL v4: Rebase for current master v5 (idr): Trivial whitespace change. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71254 Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-28 15:05:38 -08:00
Marek Chalupa	96f324e229	gbm: export gbm_device_is_format_supported Probably depending on compiler settings, the definition can be hidden, so undefined reference error can be encountered during linking. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75528 Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-28 22:57:30 +00:00
Emil Velikov	dfe8cb48fc	configure: use enable_dri_glx local variable GLX can be either dri or xlib based, while enable_dri is used in a variety of contexts. With enable_dri_glx the context is clearly visible. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-28 22:56:33 +00:00
Emil Velikov	4687b0a1a7	configure: enable the drm pipe-loader for non swrast drivers All hardware drivers including the virtual vmwgfx require the drm pipe-loader in order to be properly loaded by xa, gbm and opencl. Note this does _not_ add support for the above three it only allows the pipe driver to be loaded by the library. Eg. GBM will now properly open the pipe-i915 driver, should one be working on such hardware. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75453 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-28 22:48:38 +00:00
Emil Velikov	e283e96666	configure: error out when building xa only with swrast Building to provide acceleration using swrast does not make sense. Note: update your build script to explicitly mention svga in the gallium drivers list, if you are building the vmwgfx xa library. v2: Update error message to provide more clarity, add an example. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-28 22:47:56 +00:00
Emil Velikov	2e830bba21	configure: avoid setting variables as empty strings Recent patch converted our logic to use test -n and test -z. An empty string variable (empty_str="") returns true for both, thus making the check unreliable. Fix this by correctly setting the variable when applicable. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-28 22:34:50 +00:00
Emil Velikov	f42333b6b6	configure: avoid constantly building megadrivers 'core' The issue is caused by a thinko that an empty string will be considered of zero length by 'test'. This is not the case, thus we were building the 'core' of megadrivers even when no classic drivers were built. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-28 22:34:50 +00:00
Tom Stellard	f61e382f0a	r600g/compute: PIPE_CAP_COMPUTE should be false for pre-evergreen GPUs This prevents clover from using unsupported devices. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> CC: "10.0 10.1" <mesa-stable@lists.freedesktop.org>	2014-02-28 16:17:34 -05:00
Matt Turner	4bd7f1d044	glsl: Don't vectorize horizontal expressions. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75224	2014-02-28 10:37:52 -08:00
Matt Turner	5eff8576ba	glsl: Add is_horizontal() method to ir_expression. Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-28 10:37:46 -08:00
Matt Turner	d5fa8a9562	glsl: Optimize lrp(x, 0, a) into x - (x * a). Helps one program in shader-db: instructions in affected programs: 96 -> 92 (-4.17%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-28 10:36:12 -08:00
Matt Turner	ecc6c3d4ab	glsl: Optimize lrp(0, y, a) into y * a. Helps two programs in shader-db: instructions in affected programs: 254 -> 234 (-7.87%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-28 10:36:06 -08:00
Brian Paul	43dee0295e	mesa: do depth/stencil format conversion in glGetTexImage glGetTexImage(GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8) was just using memcpy() instead of _mesa_unpack_uint_24_8_depth_stencil_row() to convert texels from the hardware format to the GL format. Fixes issue reported by David Meng at Intel. The new piglit ext_packed_depth_stencil-getteximage test checks for this bug. Also, add some format/type assertions. We don't yet handle the GL_FLOAT_32_UNSIGNED_INT_24_8_REV type. That should be fixed in a follow-on patch. Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-28 07:02:55 -07:00
Brian Paul	84787aae95	mesa: fix depth/stencil comments in formats.h	2014-02-28 07:02:36 -07:00
Thomas Hellstrom	f5e681f3fa	winsys/svga: Avoid calling drm getparam for max surface size on older kernels This avoids the kernel driver spewing out errors about the param not being supported. Also correct the max surface size used when the kernel does not support the query. Reported-by: Brian Paul <brianp@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-28 11:11:21 +01:00
Kenneth Graunke	085f61bd4e	meta: Drop ctx->API checks. API is always API_OPENGL_COMPAT (since commit `4e4a537ad5`, "meta: Push into desktop GL mode when doing meta operations."), so most of these checks do nothing. We could instead check save->API to only bother setting/restoring relevant GL state, but I'm not sure saving a few _mesa_set_enable calls is worth the complexity. My understanding is the point of the ctx->API guards was to avoid raising GL errors. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-27 10:07:40 -08:00
Kenneth Graunke	cf719a0204	meta: Restore API at the end of _mesa_meta_end(), not the start. In _mesa_meta_begin(), we switch to API_OPENGL_COMPAT, then munge a lot of state (including some that doesn't exist in the actual API - like PolygonStipple in API_OPENGL_CORE). It seems reasonable that in _mesa_meta_end(), we should restore it, then switch back to the original API. This at least makes it symmetric. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-27 10:07:40 -08:00
Roland Scheidegger	612a1d5be1	util/u_format: don't crash in util_format_translate if we can't do translation Some formats can't be handled - in particular cannot handle ints/uints formats, which lack the pack_rgba_float/unpack_rgba_float functions. Instead of trying to call these (and crash) return an error (I'm not sure yet if we should try to translate such formats too here might not make much sense). v2: suggested by Jose, use separate checks for pack/unpack of rgba_8unorm and rgba_float functions (right now if one exists the other should as well). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-02-27 17:56:10 +01:00
Kenneth Graunke	80c1b9349c	i965: Convert VUE map generation checks to if rather than switch. There are currently only two VUE map layouts: one for Gen4-5, and one for everything else. We keep having to add new "case N+1" labels for every new hardware generation, and so far it's always been the same. This patch makes it so we only have to do work in the case where something actually changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-27 00:05:55 -08:00
Kenneth Graunke	9b1a6745f6	i965: Only emit VS state pipe control workaround on IVB and BYT. According to the BSpec's 3D workarounds page, this is unnecessary on shipping Haswell hardware, and was never necessary on Broadwell. It unfortunately doesn't say anything about Baytrail. The workaround database confirms those results for Ivybridge, Haswell, and Broadwell. Baytrail is less clear - one page says it's necessary, while the other says it isn't. For now, be conservative and leave it enabled. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-27 00:05:48 -08:00
Ilia Mirkin	51fc093421	nouveau: add a nouveau_compiler binary to compile TGSI into shader ISA This makes it easy to compare output between different cards, especially for ones that you don't have (and/or not in the current machine). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-26 23:35:48 -05:00
Ilia Mirkin	dd370f0af6	nv30: remove nv30_context use from nvfx_*prog This should pave the way to being able to use the compiler without a context. Also leads to cleaner code. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-26 23:35:47 -05:00
Ilia Mirkin	41dbc4c444	nv30: remove unused sprite flipping parameter Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-26 23:35:47 -05:00
Ilia Mirkin	fe2738f998	nv30: remove unused render_mode and hw_pointsprite_control Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-26 23:35:46 -05:00
Ilia Mirkin	8f23d08928	nv30: remove use_nv4x, it is identical to is_nv4x Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-26 23:35:45 -05:00
Ilia Mirkin	734fe2d246	docs: update nvc0 state ARB_texture_buffer_object_rgb32 has been supported for a while already.	2014-02-26 23:35:45 -05:00
Michel Daenzer	59936a49dd	radeonsi: Prevent geometry shader from emitting too many vertices	2014-02-27 10:27:55 +09:00
Anuj Phogat	b3094d9927	i965: Fix the region's pitch condition to use blitter intelEmitCopyBlit uses a signed 16-bit integer to represent buffer pitch, so it can only handle buffer pitches < 32k. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-26 13:43:00 -08:00
Brian Paul	863a1f7757	glsl: add switch case for MESA_SHADER_COMPUTE To fix warning about unhandled enum value. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-26 13:29:16 -07:00
Kenneth Graunke	fe8f3bef31	meta: Use a #define for the vector type to avoid %svec4 everywhere. By adding "#define gvec4 %svec4" to the top of our fragment shader, we can write generic code without needing to specialize it to vec4, ivec4, or uvec4 via asprintf. This also makes the INT and UNSIGNED_INT merge function code identical, so I combined those two cases. It's not a big savings, but a little bit tidier. v2: Rebase on Vinson's MSVC build fixes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-26 02:33:58 -08:00
Kenneth Graunke	f896e82301	i965: Don't try to dump shader source for fixed-function FS programs. sh->Source is NULL and this will segfault. Fixes MESA_GLSL=dump with "The Swapper". Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-26 02:31:24 -08:00
Kenneth Graunke	b18871c863	i965: Don't forget to subtract mt->first_level in minify calls. This fixes fbo-clear-formats GL_ARB_depth_texture on Ironlake, which regressed since commit `f128bcc7c2` ("i965: Drop mt->levels[].width/height.") intel_miptree_copy_slice was calling minify(.., 7) on a 2x2 texture with mt->first_level == 7. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75292 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-26 02:29:44 -08:00
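A rough sketch of why the subtraction matters, assuming the stored base size describes the miptree's first level (as the 2x2, first_level == 7 case in the commit above suggests). `toy_miptree` and `toy_level_width` are hypothetical names; `minify` here is a local stand-in written as max(1, value >> levels), which is an assumption about the helper's shape.

```c
#include <assert.h>

/* Halve a dimension `levels` times, clamping at 1.
 * `levels` must be smaller than the bit width of unsigned. */
static unsigned minify(unsigned value, unsigned levels)
{
    unsigned v = value >> levels;
    return v > 0 ? v : 1;
}

/* Hypothetical miptree: base_width is the width at first_level. */
struct toy_miptree {
    unsigned first_level;
    unsigned base_width;
};

/* Width of `level`, measured relative to the first stored level. */
static unsigned toy_level_width(const struct toy_miptree *mt, unsigned level)
{
    assert(level >= mt->first_level);
    return minify(mt->base_width, level - mt->first_level);
}
```

With first_level == 7 and a base width of 2, level 7 is 2 wide, whereas the unadjusted `minify(2, 7)` collapses to 1.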
Kenneth Graunke	ac0a8b9540	glsl: Delete LRP_TO_ARITH lowering pass flag. It's kind of a trap---calling do_common_optimization() after lower_instructions() may cause opt_algebraic() to reintroduce ir_triop_lrp expressions that were lowered, effectively defeating the point. Because of this, nobody uses it. v2: Delete more code (caught by Ian Romanick). Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Eric Anholt <eric@anholt.net>	2014-02-26 02:16:56 -08:00
Kenneth Graunke	2fdea48e21	i965: Stop lowering ir_triop_lrp. Both the vector and scalar backends now support it natively, so there's no point in lowering it. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Eric Anholt <eric@anholt.net>	2014-02-26 02:16:55 -08:00
Kenneth Graunke	56879a7ac4	i965/vec4: Handle ir_triop_lrp on Gen4-5 as well. When the vec4 backend encountered an ir_triop_lrp, it always emitted an actual LRP instruction, which only exists on Gen6+. Gen4-5 used lower_instructions() to decompose ir_triop_lrp at the IR level. Since commit `8d37e9915a` ("glsl: Optimize open-coded lrp into lrp."), we've had a bug where lower_instructions translates ir_triop_lrp into arithmetic, but opt_algebraic reassembles it back into a lrp. To avoid this ordering concern, just handle ir_triop_lrp in the backend. The FS backend already does this, so we may as well do likewise. v2: Add a comment reminding us that we could emit better assembly if we implemented the infrastructure necessary to support using MAC. (Assembly code provided by Eric Anholt). Cc: "10.1" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75253 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Eric Anholt <eric@anholt.net>	2014-02-26 02:16:53 -08:00
Kenneth Graunke	ffde483f3c	i965/vec4: Add a brw->gen >= 6 assertion in three-source emitters. Three source instructions didn't exist until Gen6. vec4_generator has assertions to catch this, but catching it in the visitor provides a nicer backtrace. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Eric Anholt <eric@anholt.net>	2014-02-26 02:16:34 -08:00
Chia-I Wu	bb9c8071ea	ilo: create u_upload_mgr last Similar to u_blitter, u_upload_mgr is now a client of the pipe context. Its creation needs to be delayed until the context has been (almost) initialized.	2014-02-26 11:33:37 +08:00
Fredrik Höglund	3616e862f2	glx: Fix the GLXFBConfig attrib sort priorities The sort priorities for GLX_SAMPLES and GLX_SAMPLE_BUFFERS are not defined in GL_ARB_multisample, but they are defined in the GLX 1.4 specification. Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-26 02:17:12 +01:00
Fredrik Höglund	f41c2f6c33	glx: Fix the default values for GLXFBConfig attributes The default values for GLX_DRAWABLE_TYPE and GLX_RENDER_TYPE are GLX_WINDOW_BIT and GLX_RGBA_BIT respectively, as specified in the GLX 1.4 specification. This fixes the glx-choosefbconfig-defaults piglit test. Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-26 02:16:42 +01:00
Tom Stellard	54df6a0491	Re-commit 'clover: Fix build with LLVM 3.5' This was accidentally reverted in `9dfd7c5f75`	2014-02-25 14:43:26 -08:00
Vinson Lee	f094866d93	mesa: Add GL_ARB_buffer_storage to dispatch_sanity.cpp. Fixes 'make check' failure introduced with commit `119ffa7307`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75503 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-25 14:00:08 -08:00
Timothy Arceri	9dfd7c5f75	Revert "Merge branch 'master' of git+ssh://git.freedesktop.org/git/mesa/mesa" This reverts commit `1b79582f32`, reversing changes made to `376a98d345`.	2014-02-26 08:46:08 +11:00
Timothy Arceri	1b79582f32	Merge branch 'master' of git+ssh://git.freedesktop.org/git/mesa/mesa ry,	2014-02-26 08:39:32 +11:00
Tom Stellard	fcd499730b	clover: Fix build with LLVM 3.5	2014-02-25 13:32:37 -08:00
Timothy Arceri	376a98d345	glsl: removed unused dimension_count variable This variable is no longer needed after the cleanup to the code prior to the first arrays of array series Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-26 08:31:25 +11:00
Ilia Mirkin	d9b983519c	build: llvm libs may not be in system search path, add rpath On my gentoo system, llvm libs are in /usr/lib64/llvm, and llvm-config --ldflags does not provide the rpath (it does, of course, provide a -L). This adds the llvm dir to the rpath. It should be harmless if the path is a system path, and should make things work when it's not. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-25 15:30:13 -05:00
Eric Anholt	42c2366de5	i965: Fix segfaults since the buffer_storage changes.	2014-02-25 12:19:15 -08:00
Ilia Mirkin	6417cabd9c	docs: update nv50 support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-25 14:42:35 -05:00
Ilia Mirkin	d1b1329c3a	nv50: enable txg where supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-25 14:42:34 -05:00
Ilia Mirkin	0e71c65db0	nv50: enable cube map array texture support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-25 14:42:34 -05:00
Brian Paul	5a3dc449a9	libgl-xlib: add -Isrc/gallium/winsys flag So that sw/xlib/xlib_sw_winsys.h can be found. Fixes a build break. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-25 12:35:07 -07:00
Brian Paul	c88a0b6af3	st/mesa: add comment to explain _min(), _maxf(), etc. functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-02-25 12:35:07 -07:00
Marek Olšák	9855477e90	r600g,radeonsi: consolidate create_surface and surface_destroy Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:26 +01:00
Marek Olšák	b9aa8ed009	radeonsi: inline util_blitter_copy_texture This will be used for changing texture properties without modifying pipe_resource like r600g, but not in this series. For now, this change allows consolidation of pipe_surface functions. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:22 +01:00
Marek Olšák	f7176d700f	radeonsi: remove useless psbox variable from resource_copy_region Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:20 +01:00
Marek Olšák	80eb377a37	radeonsi: compute depth surface registers only once Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:18 +01:00
Marek Olšák	629b019a40	radeonsi: compute color surface registers only once Same as r600g. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:17 +01:00
Marek Olšák	6b4e03216a	r600g: remove r600_resource.h Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:15 +01:00
Marek Olšák	ec266d06d0	r600g: remove r600_surface::htile_enabled v2: use one of the htile registers instead Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:12 +01:00
Marek Olšák	7fc6ece40e	r600g: use r600_surface::db_z_info db_z_info was unused. This just renames the variable to match the register name. Now, db_depth_info is unused on Evergreen. Both variables will be needed on SI though. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:10 +01:00
Marek Olšák	40b9812a76	r600g,radeonsi: share r600_surface I'm gonna use this in radeonsi. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:08 +01:00
Marek Olšák	933eaeee25	radeonsi: move PA_SU_POLY_OFFSET_DB_FMT_CNTL to framebuffer state It doesn't depend on anything else. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:05 +01:00
Marek Olšák	dca350201e	mesa: allow buffers to be mapped multiple times OpenGL allows a buffer to be mapped only once, but we also map buffers internally, e.g. in the software primitive restart fallback, for PBOs, vbo_get_minmax_index, etc. This has always been a problem, but it will be a bigger problem with persistent buffer mappings, which will prevent all Mesa functions from mapping buffers for internal purposes. This adds a driver interface to core Mesa which supports multiple buffer mappings and allows 2 mappings: one for the GL user and one for Mesa. Note that Gallium supports an unlimited number of buffer and texture mappings, so it's not really an issue for Gallium. v2: fix unmapping in xm_dd.c, remove the GL errors there v3: fix the intel driver (by Fredrik) Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:07:33 +01:00
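The two-slot mapping idea from the commit above can be sketched like this. It is a toy model under stated assumptions, not the driver interface the commit adds: `toy_buffer`, `toy_map`, `toy_unmap` and the `TOY_MAP_*` indices are hypothetical names, and the only behaviour taken from the message is that one mapping belongs to the GL user and one to Mesa's internal code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slot indices: one mapping per client of the buffer. */
enum toy_map_index { TOY_MAP_USER, TOY_MAP_INTERNAL, TOY_MAP_COUNT };

struct toy_buffer {
    unsigned char data[64];
    void *mapping[TOY_MAP_COUNT]; /* NULL when the slot is unmapped */
};

/* Map the buffer for one client; fails if that client's slot is busy. */
static void *toy_map(struct toy_buffer *buf, enum toy_map_index idx)
{
    if (buf->mapping[idx] != NULL)
        return NULL;
    buf->mapping[idx] = buf->data;
    return buf->mapping[idx];
}

/* Release one client's mapping; returns false if it was not mapped. */
static bool toy_unmap(struct toy_buffer *buf, enum toy_map_index idx)
{
    if (buf->mapping[idx] == NULL)
        return false;
    buf->mapping[idx] = NULL;
    return true;
}
```

Keeping the slots separate means an internal map cannot collide with a user mapping that is still open, which is the situation persistent mappings make common.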
Marek Olšák	86e68b0f1f	docs: update ARB_buffer_storage status Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:07:33 +01:00
Marek Olšák	04fb4bf61b	gallium/upload_mgr: remove useless variable "size" Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:07:33 +01:00
Marek Olšák	7ea3f6bce5	gallium/upload_mgr: don't unmap buffers if persistent mappings are supported Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:07:33 +01:00
Marek Olšák	db8886ed09	gallium: the other drivers don't support ARB_buffer_storage Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:07:33 +01:00
Marek Olšák	6381dd7e9d	r300g,r600g,radeonsi: add support for ARB_buffer_storage All GTT memory mappings are coherent and therefore can be persistent. Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:05:41 +01:00
Marek Olšák	dfa0b8d9b8	st/mesa: implement ARB_buffer_storage Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:05:41 +01:00
Marek Olšák	5f61f052b5	gallium: add interface for persistent and coherent buffer mappings Required for ARB_buffer_storage.	2014-02-25 16:05:41 +01:00
Marek Olšák	d26a065b74	mesa: allow buffers mapped with the persistent flag to be used by the GPU v2: also fixed InvalidateBufferData, added citations from the 4.4 spec Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:04:22 +01:00
Marek Olšák	4f78e17f6d	mesa: add error checks to glMapBufferRange, glMapBuffer for ARB_buffer_storage Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:04:22 +01:00
Marek Olšák	119ffa7307	glapi: add ARB_buffer_storage Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:04:22 +01:00
Marek Olšák	e592f11227	mesa: implement glBufferStorage, immutable buffers; add extension enable flag Reviewed-by: Fredrik Höglund <fredrik@kde.org> v2: dropped the error that DYNAMIC_STORAGE is required for MAP_WRITE_BIT, the error is removed in the latest revision of GL 4.4	2014-02-25 16:04:22 +01:00
Marek Olšák	7e548d0507	mesa: add storage flags parameter to Driver.BufferData It will be used by glBufferStorage. The parameters are chosen according to ARB_buffer_storage. Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:04:22 +01:00
Marek Olšák	aea4933287	mesa: remove unused driver hook BindBuffer Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:04:21 +01:00
Emil Velikov	882070cc81	nv50: correctly calculate the number of vertical blocks during transfer map Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-25 12:19:07 +00:00
Dave Airlie	7c3138acb9	st/mesa: add texture gather support. (v2) This adds support for GL_ARB_texture_gather, and one step of support for GL_ARB_gpu_shader5. This adds support for passing the TG4 instruction, along with non-constant texture offsets, and tracking them for the optimisation passes. This doesn't support native textureGatherOffsets hw, to do that you'd need to add a CAP and if set disable the lowering pass, and bump the MAX offsets to 4, then do the i0,j0 sampling using those. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-25 13:29:37 +10:00
Dave Airlie	2fcbec48d7	gallium: add texture gather support to gallium (v3) This adds support to gallium for a TG4 instruction, and two CAPs. The first CAP is required for GL_ARB_texture_gather. The second CAP is required to expose GL_ARB_gpu_shader5. However so far we haven't found any hardware that natively exposes the textureGatherOffsets feature from GL, so just lower it for now. If hardware appears for this we can add another CAP to allow TG4 to take 4 offsets. v2: add component selection src and a cap to say hw can do it. (st can use to help control GL_ARB_gpu_shader5/GLSL 4.00). Add docs. v3: rename to SM5, add docs. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-25 13:29:17 +10:00
Dave Airlie	122c3b9486	glsl/i965: move lower_offset_array up to GLSL compiler level. This lowering pass will be useful for gallium drivers as well, in order to support the GL TG4 oddity that is textureGatherOffsets. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-25 13:28:57 +10:00
Tom Stellard	945d87f958	clover: Pass buffer offsets to the driver in set_global_binding() v3 The offsets will be stored in the handles parameter. This makes it possible to use sub-buffers. v2: - Style fixes - Add support for constant sub-buffers - Store handles in device byte order v3: - Use endian helpers Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-02-24 12:56:27 -08:00
Tom Stellard	eac7236042	radeonsi: Use SI_BIG_ENDIAN now that it exists Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-24 12:56:27 -08:00
Tom Stellard	8f3bcedde2	r600g: Use util_cpu_to_le32() instead of bswap32() on big-endian systems Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-24 12:56:27 -08:00
Tom Stellard	195ee10673	radeonsi: Use util_cpu_to_le32() instead of bswap32() on big-endian systems Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-24 12:56:27 -08:00
Tom Stellard	9f30685fae	util: Add util_cpu_to_le* helpers Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-02-24 12:56:27 -08:00
Tom Stellard	a9f88e2ae8	util: Add util_bswap64() v3 v2: - Use __builtin_bswap64() - Remove unnecessary mask - Add util_le64_to_cpu() helper v3: - Remove unnecessary AC_SUBST Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-24 12:56:27 -08:00
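For readers unfamiliar with the endian helpers in this series, here is a portable sketch of what such helpers typically do. The `sketch_*` names are hypothetical and the bodies are illustrative only; the real helpers may rely on compiler builtins such as `__builtin_bswap64()`, as the commit above notes.

```c
#include <stdint.h>

/* Reverse the byte order of a 64-bit value in three swap steps. */
static uint64_t sketch_bswap64(uint64_t n)
{
    n = ((n & 0x00000000FFFFFFFFull) << 32) | ((n & 0xFFFFFFFF00000000ull) >> 32);
    n = ((n & 0x0000FFFF0000FFFFull) << 16) | ((n & 0xFFFF0000FFFF0000ull) >> 16);
    n = ((n & 0x00FF00FF00FF00FFull) << 8)  | ((n & 0xFF00FF00FF00FF00ull) >> 8);
    return n;
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t sketch_bswap32(uint32_t n)
{
    return (n >> 24) | ((n >> 8) & 0x0000FF00u) |
           ((n << 8) & 0x00FF0000u) | (n << 24);
}

/* Runtime probe: the first byte of the value 1 is zero on big-endian. */
static int sketch_is_big_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 0;
}

/* Convert a CPU-order value to little-endian: a no-op on LE hosts. */
static uint32_t sketch_cpu_to_le32(uint32_t n)
{
    return sketch_is_big_endian() ? sketch_bswap32(n) : n;
}
```

The point of a `cpu_to_le` style helper over a bare byte swap is that callers no longer need their own big-endian conditionals.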
Tom Stellard	f8ba0f55d3	configure.ac: Use AX_GCC_BUILTIN to check availability of __builtin_bswap32 v2 v2: - Remove unnecessary AC_SUBST Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-24 12:56:26 -08:00
Emil Velikov	73b46136b0	targets/opencl: resolve undefined symbols at link time Current automake build does not try to resolve undefined symbols thus we could end up with a broken library. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-02-24 14:59:39 +00:00
Emil Velikov	1ad9534337	gallium/targets: resolve undefined reference to pipe_loader_sw_probe_dri With the introduction of the pipe_loader_sw_probe_dri helper we require the sw/dri winsys during linking stage despite it being unused by any of the targets. This will cause a minor increase in the resulting library which will be cleaned up via linker options with upcoming patches. v2: Link with libswdri.la only when available. Reported-and-tested-by: Tom Stellard <thomas.stellard@amd.com> (v1) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-24 14:59:34 +00:00
Emil Velikov	61973ffe5b	configure: correctly report if we're building the sw/xlib winsys While looking at bug 75356, I've noticed that the presence of x11 egl platform pulls in sw/xlib as "needed" but fails to report so at the end of configure. Tested-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-24 14:57:41 +00:00
Emil Velikov	3445e8bb92	pipe-loader: wrap pipe_loader_sw_probe_xlib within HAVE_PIPE_LOADER_XLIB The above function implies using the xlib winsys, which has additional library dependencies that should not be forced. Make the software xlib pipe loader optional thus avoid all the dependency hell. A user that wishes to use the particular pipe-loader would need to set the following within configure.ac. enable_gallium_xlib_loader=yes v2: - Wrap sw/xlib/xlib_sw_winsys.h to handle compilation on systems lacking X11 headers. Spotted by Christian Prochaska. Tested-by: Tom Stellard <thomas.stellard@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75356 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-24 14:52:27 +00:00
Emil Velikov	0e7c30233f	targets/gbm: exit gracefully if pipe_loader_drm_probe_fd is not available When one builds without gallium_drm_loader, the above function will not be available, thus we'll segfault in gallium_screen_create due to memory access violation. Tested-by: Tom Stellard <thomas.stellard@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75335 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-24 14:51:45 +00:00
Kenneth Graunke	73c78c514f	i965: Don't try to use the hardware blitter for multisampled miptrees. The blitter is completely ignorant of MSAA buffer layouts, so any attempt to use BLT paths with MSAA buffers is likely to break spectacularly. In most cases, BLORP handles MSAA blits, so we never hit this bug. Until recently, it also wasn't worth fixing, since Meta couldn't handle MSAA either, so there was nothing to fall back to. But now there is. +143 piglit tests on Broadwell (which doesn't have BLORP support). Surprisingly, three also start failing. Since non-IMS MSAA buffers store samples in successive array slices, using the blitter ought to access sample 0 and ignore the rest, which is apparently good enough for a few not-very-picky Piglit tests. Presumably the meta replacement code is still broken. No Piglit changes on Ivybridge. v2: Move the early return to the top of the function (suggested by Paul). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-23 20:19:00 -08:00
Rob Clark	3f7239ca0e	freedreno/a3xx/compiler: half-precision output Using generic shaders caused a measurable fps drop, which was isolated to use of full precision (vs half precision) output. This is an attempt to regain that lost performance by using half precision solid/blit shaders (when the output format is not float32). Note: for the built-in shaders, I would not expect them to be register starved. And in fact it is the solid frag shader that seems to have the biggest impact. So I suspect you get double the pixel pipe units (or half the cycles) when the output is half precision. So there may be some gain to using half precision output for application shaders as well, even though the rest of register usage is still full precision. But for half precision to work for more complex shaders, we need to deal with some constraints, like cat2 needing same precision for its two src registers. So for now it is not enabled by default except for the built-in shaders. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-23 14:58:24 -05:00
Rob Clark	141ae71671	freedreno/a3xx: add shader variants Start putting in place infrastructure to deal with multiple shader variants. Initially we'll use this for two sided color (frag) and binning pass (vert) shaders. Possibly needed for others later (such as YUV vs RGB eglImage?). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-23 14:58:23 -05:00
Rob Clark	9bbfae6265	freedreno/a3xx/compiler: collapse nop's with repeat Easier than making more extensive use of rpt, and the more compact shaders seem to bring some bit of performance boost. (Perhaps repeat flag benefits are more than just instruction cache, possibly it saves on instruction decode as well?) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-23 14:58:23 -05:00
Rob Clark	bb255fdf06	freedreno/a3xx: drop hand-coded blit/solid shaders Instead in the common code, construct these shaders from TGSI. For now we let a2xx keep its hand-coded shaders, as its compiler isn't quite up to the job yet. All the same it is a net drop in code size and gets rid of special cases. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-23 14:58:23 -05:00
Rob Clark	1c953b7cda	freedreno/lowering: cleanup api Make things configurable, and tweak the API a bit to avoid an extra tgsi_shader_scan(). Getting closer to something generic which can be moved out of freedreno and shared by other drivers. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-23 14:58:23 -05:00
Rob Clark	67cea4b32a	freedreno/a3xx: add float 16 and 32bit formats Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-23 14:58:23 -05:00
Rob Clark	e819885b99	freedreno: resync generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-23 14:58:23 -05:00
Emil Velikov	f92fbba11b	glx/drisw: use the implemented version of __DRIswrastLoaderExtension ... over the one provided by the headers. Explicitly set extension members to improve clarity. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:17 +00:00
Emil Velikov	f6537d0608	glx/dri: use the implemented version of __DRIdamageExtension ... over the one provided by the headers. Explicitly set extension members to improve clarity. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:17 +00:00
Emil Velikov	ef342aad80	glx/dri_common: use the implemented version of __DRIsystemTimeExtension ... over the one provided by the headers. Explicitly set extension members to improve clarity. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:17 +00:00
Emil Velikov	fbbf5ec471	glx/dri: use the implemented version of __DRIgetDrawableInfoExtension ... over the one provided by the headers. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:17 +00:00
Emil Velikov	15db8c0801	dri_util: use the implemented version of __DRIimageDriverExtension ... over the one provided by the headers. Currently both versions are identical, but that is not guaranteed to be the case in the future. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:17 +00:00
Emil Velikov	e9eb3ec331	glx/dri3: set the implemented version of __DRIimageLoaderExtension ... over the one provided by the spec. Currently both versions are identical, but that is not guaranteed to be the case in the future. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:17 +00:00
Emil Velikov	4e229a6e86	gbm: explicitly set __DRIimageLoaderExtension members Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:17 +00:00
Emil Velikov	9e627ccc0d	egl/wayland: explicitly set __DRIimageLoaderExtension members Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:16 +00:00
Emil Velikov	73b35b913e	drivers/dri: explicitly set __DRI2flushExtension members Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:16 +00:00
Emil Velikov	8b45bc0ad5	gbm: explicitly set __DRIdri2LoaderExtension members Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:16 +00:00
Emil Velikov	92273962f5	glx/dri2: set the implemented version of __DRIdri2LoaderExtension ... over the version number provided by the headers. Explicitly set extension members to improve clarity. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:16 +00:00
Emil Velikov	6dffab2092	dri_interface: note introduction of __DRIdri2LoaderExtension members Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:16 +00:00
Emil Velikov	c9fff0740e	dri_interface: note introduction of various __DRItexBufferExtension members Note the member function releaseTexBuffer was added without bumping spec version, and currently no drivers implement it. v2: releaseTexBuffer was introduced by version 3 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:16 +00:00
Emil Velikov	acf2fae64e	dri_interface: Note the version introducing __DRIswrastLoaderExtensionRec::putImage2 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:16 +00:00
Emil Velikov	13e5daf2da	dri_util: explicitly set __DRIcopySubBufferExtension members Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:15 +00:00
Emil Velikov	01814734e6	dri_util: explicitly set __DRIswrastExtension members. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:15 +00:00
Kenneth Graunke	5e639a5f59	glsl: Pass stdout to _mesa_print_ir from st_glsl_to_tgsi. Fixes the Gallium build since commit `1e3bd9f9a5`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75389 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-22 22:10:11 -08:00
Eric Anholt	83daa88035	i965: Move the remaining driver debug over to stderr. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-22 19:23:21 -08:00
Eric Anholt	a76e5dce4f	i965: Move compiler debugging output to stderr. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-22 19:23:21 -08:00
Eric Anholt	1e3bd9f9a5	glsl: Add a file argument to the IR printer. While we want to be able to print to stdout for glsl_compiler, for debugging drivers we want to be able to dump to stderr because that's where other driver debug (like LIBGL_DEBUG) tends to go, and because some apps actually close stdout to shut up their own messages (such as the X Server, or NWN). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-22 19:23:21 -08:00
Eric Anholt	f28c920865	i965: Refactor debug dumping of GLSL IR. This was only going to get worse when tessellation shows up, and was causing too much extra duplication in my stderr changes coming up. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-22 19:23:21 -08:00
Eric Anholt	9ac9d133ed	intel: Remove some dead code I noticed in intel_screen.c. It was present in the initial i915tex import. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-22 19:23:20 -08:00
Eric Anholt	fdcf6c8fad	i965: Use the object label when available for INTEL_DEBUG=vs,gs,fs output. Note that this requires updated run.py in shader_db. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-22 19:23:20 -08:00
Eric Anholt	f474ced0d1	i965: Use the object label when available for shader_time output. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-22 19:23:20 -08:00
Eric Anholt	0e2c7e2f6e	meta: Set some object labels on our meta shaders. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-22 19:23:20 -08:00
Ilia Mirkin	6152ba0894	nv50: make sure to clear _all_ layers of all attachments Unfortunately there's only one RT_ARRAY_MODE setting for all attachments, so clears were previously truncated to the minimum number of layers any attachment had. Instead set the RT_ARRAY_MODE to 512 (the max number of layers) before doing the clear. This fixes gl-3.2-layered-rendering-clear-color-mismatched-layer-count. Also fix clears of individual layered rt/zeta, in case it ever happens. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at> Cc: 10.1 <mesa-stable@lists.freedesktop.org>	2014-02-22 18:42:31 -05:00
Chia-I Wu	d5cbd73d21	ilo: fix and enable fast depth clear Use tex->bo_format instead of zs->format in ilo_blitter_rectlist_clear_zs() because the latter may be combined depth/stencil format. hiz_can_clear_zs() is no-op for GEN7+, but move the GEN check so that the assertions are tested. Finally, call the fast depth clear function from ilo_clear().	2014-02-22 22:45:13 +08:00
Chia-I Wu	f57bddc7e4	ilo: add slice clear value It is needed for 3DSTATE_CLEAR_PARAMS, and can also be used to track what value the slice has been cleared to.	2014-02-22 22:45:13 +08:00
Chia-I Wu	4afb8a7fb5	ilo: better readability and doc for texture flags Improve comments for the flags, and explicitly separate their uses in slice flags and resolve flags.	2014-02-22 22:45:13 +08:00
Chia-I Wu	cb8a0d2be1	ilo: fix for stencil only rectlist ops 3DSTATE_STENCIL_BUFFER inherits some states from 3DSTATE_DEPTH_BUFFER. We need to emit both even if the surface is stencil only.	2014-02-22 22:45:13 +08:00
Chia-I Wu	409add30b3	ilo: fix a false assertion failure on GEN6 Layer offsetting is possible when it is level 0, layer 0.	2014-02-22 22:45:12 +08:00
Chia-I Wu	e7307fe708	ilo: pipe_texture::usage is not a bitfield It happens to work because PIPE_USAGE_STAGING is 0x100.	2014-02-22 22:45:12 +08:00
Chia-I Wu	f8d19a58dc	ilo: set ILO_TEXTURE_CPU_WRITE for imported textures Assume the bo has been written by another process, which will trigger a HiZ resolve.	2014-02-22 22:45:12 +08:00
Christoph Bumiller	1f4bfb8797	nv50/ir/ra: fix SpillCodeInserter::offsetSlot usage We were turning non-memory spill slots into NULL. Cc: 10.1 <mesa-stable@lists.freedesktop.org>	2014-02-22 13:17:23 +01:00
Matt Turner	7770b02693	Revert "i965/fs: Make fs_reg's type an enum for better debugging." This reverts commit `5ceadd29b0`. I rebased and apparently failed to build test. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75355	2014-02-21 23:53:36 -08:00
Kenneth Graunke	760c6777a0	i965/fs: Drop the emit(fs_inst) overload. Using this emit function implicitly creates three copies, which is pointlessly inefficient. 1. Code creates the original instruction. 2. Calling emit(fs_inst) copies it into the function. 3. It then allocates a new fs_inst and copies it into that. The second could be eliminated by changing the signature to fs_inst(const fs_inst &) but that wouldn't eliminate the third. Making callers heap allocate the instruction and call emit(fs_inst *) allows us to just use the original one, with no extra copies, and isn't much more of a burden. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:33 -08:00
Matt Turner	326fc60ee9	i965/fs: Pass fs_regs by constant reference where possible. These functions (modulo emit_lrp, necessitating the small fix-up) pass these arguments by value unmodified to other functions. No point in making an additional copy. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:33 -08:00
Matt Turner	070f20272f	i965/fs: Move setting opcode = NOP to its one useful location. All other callers of init() immediately set opcode to something else. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:33 -08:00
Matt Turner	4fbebd6e65	i965/fs: Use a bitfield for fs_inst's bool fields. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:33 -08:00
Matt Turner	d91035a8f6	i965/fs: Reorder fs_inst's fields for better packing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:33 -08:00
Matt Turner	109c211ffd	i965/fs: Reduce the sizes of some fs_inst members. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:33 -08:00
Matt Turner	0fc1a77e14	i965/fs: Reorder fs_reg for better packing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:33 -08:00
Matt Turner	5ceadd29b0	i965/fs: Make fs_reg's type an enum for better debugging. Since the enum is marked as packed, it'll still take only one byte. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:32 -08:00
Matt Turner	3f6baf5755	i965/fs: Reduce the sizes of some fs_reg members. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:32 -08:00
Matt Turner	98e2654880	i965: Mark brw_reg_type and register_file enums as PACKED. The C99 spec says the type of an enum is implementation defined (but can be char, signed int, or unsigned int). gcc appears to always give enums four bytes, even when they can fit in less. It does so because this is what other compilers seem to do [0] and therefore to maintain ABI compatibility with them. gcc has an -fshort-enums flag that tells the compiler to use only as much space as needed for an enum. Adding __attribute__((__packed__)) to an enum definition has the same behavior, but on a per-enum basis. brw_reg_type and register_file are not part of the ABI, so we can safely mark them as PACKED so that they'll take only a byte, rather than four. [0] http://gcc.gnu.org/onlinedocs/gcc/Non-bugs.html#index-fshort-enums-3868 Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:32 -08:00
Matt Turner	00c567e897	i965: Reduce predicate field of backend_instruction to uint8_t. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:32 -08:00
Vinson Lee	079773d1cb	libgl-xlib: Fix xlib_sw_winsys.h include path. This patch fixes this SCons build error introduced with commit `4f37e52f37`. Compiling src/gallium/targets/libgl-xlib/xlib.c ... src/gallium/targets/libgl-xlib/xlib.c:35:42: fatal error: state_tracker/xlib_sw_winsys.h: No such file or directory #include "state_tracker/xlib_sw_winsys.h" ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75347 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-02-21 19:56:17 -08:00
Vinson Lee	24ce678f83	mesa: Move declarations before code. This patch fixes these MSVC build errors. Compiling src\mesa\drivers\common\meta_blit.c ... meta_blit.c src\mesa\drivers\common\meta_blit.c(255) : error C2143: syntax error : missing ';' before 'type' src\mesa\drivers\common\meta_blit.c(255) : error C2143: syntax error : missing ')' before 'type' src\mesa\drivers\common\meta_blit.c(255) : error C2065: 'i' : undeclared identifier src\mesa\drivers\common\meta_blit.c(255) : warning C4552: '<' : operator has no effect; expected operator with side-effect src\mesa\drivers\common\meta_blit.c(255) : error C2059: syntax error : ')' src\mesa\drivers\common\meta_blit.c(255) : error C2143: syntax error : missing ';' before '{' src\mesa\drivers\common\meta_blit.c(258) : error C2065: 'i' : undeclared identifier src\mesa\drivers\common\meta_blit.c(263) : error C2143: syntax error : missing ';' before 'type' src\mesa\drivers\common\meta_blit.c(263) : error C2143: syntax error : missing ')' before 'type' src\mesa\drivers\common\meta_blit.c(263) : error C2065: 'step' : undeclared identifier src\mesa\drivers\common\meta_blit.c(263) : warning C4552: '<=' : operator has no effect; expected operator with side-effect src\mesa\drivers\common\meta_blit.c(263) : error C2059: syntax error : ')' src\mesa\drivers\common\meta_blit.c(263) : error C2143: syntax error : missing ';' before '{' src\mesa\drivers\common\meta_blit.c(264) : error C2143: syntax error : missing ';' before 'type' src\mesa\drivers\common\meta_blit.c(264) : error C2143: syntax error : missing ')' before 'type' src\mesa\drivers\common\meta_blit.c(264) : error C2065: 'i' : undeclared identifier src\mesa\drivers\common\meta_blit.c(264) : warning C4552: '<' : operator has no effect; expected operator with side-effect src\mesa\drivers\common\meta_blit.c(264) : error C2059: syntax error : ')' src\mesa\drivers\common\meta_blit.c(264) : error C2065: 'step' : undeclared identifier src\mesa\drivers\common\meta_blit.c(264) : error C2143: syntax error : missing ';' before '{' src\mesa\drivers\common\meta_blit.c(268) : error C2065: 'step' : undeclared identifier src\mesa\drivers\common\meta_blit.c(268) : error C2065: 'i' : undeclared identifier src\mesa\drivers\common\meta_blit.c(269) : error C2065: 'step' : undeclared identifier src\mesa\drivers\common\meta_blit.c(269) : error C2065: 'i' : undeclared identifier src\mesa\drivers\common\meta_blit.c(270) : error C2065: 'step' : undeclared identifier src\mesa\drivers\common\meta_blit.c(270) : error C2065: 'i' : undeclared identifier src\mesa\drivers\common\meta_blit.c(559) : warning C4244: 'function' : conversion from 'const GLint' to 'GLfloat', possible loss of data src\mesa\drivers\common\meta_blit.c(723) : warning C4244: 'function' : conversion from 'const GLint' to 'GLfloat', possible loss of data src\mesa\drivers\common\meta_blit.c(773) : warning C4244: 'function' : conversion from 'const GLint' to 'GLfloat', possible loss of data Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-02-21 19:40:00 -08:00
Emil Velikov	dcbf404c0d	pipe-loader: introduce pipe_loader_sw_probe_null helper function v2: Handle null_sw_create failure, add missing function return type Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> (v1)	2014-02-22 03:26:29 +00:00
Emil Velikov	969e8d15b7	pipe-loader: introduce pipe_loader_sw_probe_dri helper Will be used in the following commits. v2: Link gallium tests against the library. v3: Handle dri_create_sw_winsys failure v4: Rebase on top of the targets/xa changes Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> (v2)	2014-02-22 03:26:29 +00:00
Emil Velikov	cc3aeacab6	pipe-loader: introduce pipe_loader_sw_probe_xlib helper Will be used in the upcoming patches. v2: handle xlib_create_sw_winsys failure, drop unneeded header Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> (v1)	2014-02-22 03:26:29 +00:00
Emil Velikov	6325fdd6cf	pipe-loader: use bool type for pipe_loader_drm_probe_fd() v2: Rebase on top of the rendernode changes. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> (v1) Reviewed-by: Francisco Jerez <currojerez@riseup.net> (v1)	2014-02-22 03:26:29 +00:00
Emil Velikov	4f37e52f37	winsys/xlib: move xlib_create_sw_winsys within the winsys v2: Rebase on top of vl_winsys_xsp.c removal Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> (v1)	2014-02-22 03:26:28 +00:00
Emil Velikov	b4e8572bca	pipe-loader: handle memory allocation failure Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-02-22 03:26:28 +00:00
Emil Velikov	1fb750f7f7	pipe-loader: build pipe_loader_drm_x_auth whenever HAVE_PIPE_LOADER_XCB is defined Currently HAVE_PIPE_LOADER_XCB is defined, rather than being set to 1/0. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-02-22 03:26:28 +00:00
Emil Velikov	ed092a8e1f	pipe-loader: destroy sw_winsys on sw_release The sw pipe-loader implicitly handles winsys_create, thus it would make sense to implicitly destroy it upon releasing the loader. Currently we leak the sw_winsys when releasing the pipe-loader. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-02-22 03:26:28 +00:00
Emil Velikov	636ac989b2	vl/winsys_dri: cleanup vl_screen_create error path Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-22 03:26:27 +00:00
Emil Velikov	0c9912b266	targets/pipe-loader: link pipe-nouveau against libdrm Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-02-22 03:26:27 +00:00
Kenneth Graunke	6984a6be5c	meta: Eliminate samplers[] array in favor of using vec4_prefix. We don't need an array mapping the shader index to "sampler2DMS", "isampler2DMS", and so on. We can simply do "%ssampler2DMS" and pass in vec4_prefix, which is "", "i", or "u". This eliminates the use of C99 array initializers and should fix the MSVC build. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75344 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-21 19:18:07 -08:00
Kenneth Graunke	119aa50929	i965: Delete the fabulous target_to_target() function. gl_texture_object's Target field is never a cube face enumeration, so target_to_target is just the identity function. Aptly named, at least. I verified this by putting an assert(!"ZOMG, CUBES!") in the cube face case, and running Piglit. Nothing ever hit it. Beyond that, I inspected the code in mesa/main. This could probably also be deleted from i915, but I haven't tested there. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-21 19:17:55 -08:00
Kenneth Graunke	82f9ad8c60	i965: Fix S8 and X8 reversal in brw_depthbuffer_format refactor. In commit `09d9a8913e`, I accidentally botched the X8 and S8 cases. (I wrote this patch before realizing that X8 and S8 had been swapped in the big MESA_FORMAT rename, and apparently didn't rebase it properly after fixing that...) Fixes regressions in 13 Piglit tests on Ironlake. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75291 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-21 19:17:50 -08:00
Vinson Lee	5a0b08e9ea	mesa: Move declarations before code. This patch fixes these MSVC build errors introduced with `73b78f9c9f`. Compiling src\mesa\main\uniforms.c ... uniforms.c src\mesa\main\uniforms.c(291) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(294) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(294) : warning C4047: 'function' : 'gl_shader_program *' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(294) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 src\mesa\main\uniforms.c(306) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(309) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(309) : warning C4047: 'function' : 'gl_shader_program *' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(309) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 src\mesa\main\uniforms.c(322) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(325) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(325) : warning C4047: 'function' : 'gl_shader_program *' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(325) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 src\mesa\main\uniforms.c(345) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(348) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(348) : warning C4047: 'function' : 'gl_shader_program *' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(348) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 src\mesa\main\uniforms.c(360) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(363) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(363) : warning C4047: 'function' : 'gl_shader_program *' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(363) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 src\mesa\main\uniforms.c(376) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(379) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(379) : warning C4047: 'function' : 'gl_shader_program *' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(379) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 src\mesa\main\uniforms.c(588) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(591) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(591) : warning C4047: 'function' : 'gl_shader_program *' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(591) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 src\mesa\main\uniforms.c(603) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(606) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(606) : warning C4047: 'function' : 'gl_shader_program *' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(606) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 src\mesa\main\uniforms.c(619) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(622) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(622) : warning C4047: 'function' : 'gl_shader_program *' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(622) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-02-21 19:11:58 -08:00
Vinson Lee	aaefc85f3b	mesa/sso: Change CreateShaderProgramv return type from uint to GLuint. This patch fixes this MinGW build error. Compiling src/mapi/glapi/glapi_dispatch.c ... In file included from src/mapi/glapi/glapi_dispatch.c:41:0: build/windows-x86_64-debug/mapi/glapi/glapitable.h:930:4: error: expected specifier-qualifier-list before 'uint' uint (GLAPIENTRYP CreateShaderProgramv)(GLenum type, GLsizei count, const GLchar * const * strings); /* 886 */ ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-02-21 18:05:40 -08:00
Vinson Lee	34587e4a00	scons: Add main/pipelineobj.c to src/mesa/SConscript. This patch fixes this SCons build error. build/linux-x86_64-debug/mesa/libmesa.a(context.os): In function `init_attrib_groups': src/mesa/main/context.c:815: undefined reference to `_mesa_init_pipeline' Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-02-21 17:00:47 -08:00
Vinson Lee	897a5fa360	mesa/sso: Fix typo of 'unsigned'. Fix build error introduced with commit `f4c13a890f`. CC pixeltransfer.lo main/pipelineobj.c: In function '_mesa_delete_pipeline_object': main/pipelineobj.c:59:4: error: unknown type name 'unsinged' unsinged i; ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-02-21 16:41:04 -08:00
Gregory Hainaut	4719ad79ec	mesa/sso: Implement _mesa_GetProgramPipelineiv This was originally included in another patch, but it was split out by Ian Romanick. v2 (idr): * Trivial reformatting. * Remove GL_COMPUTE_SHADER. Compute shaders don't participate in pipeline objects anyway. Suggested by Matt Turner. v3 (idr): * Use _mesa_has_geometry_shaders. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:03 -08:00
Gregory Hainaut	c171834b49	mesa/sso: Implement _mesa_ActiveShaderProgram This was originally included in another patch, but it was split out by Ian Romanick. v2 (idr): Return early from _mesa_ActiveShaderProgram if _mesa_lookup_shader_program_err returns an error. Suggested by Jordan. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> [v2]	2014-02-21 15:41:03 -08:00
Gregory Hainaut	e9ff3b9918	mesa/sso: Implement _mesa_CreateShaderProgramv This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:03 -08:00
Gregory Hainaut	3659eade53	mesa/sso: Refactor implementation of _mesa_CreateShaderProgramEXT This will allow the guts of the implementation to be shared with _mesa_CreateShaderProgramv. This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:03 -08:00
Gregory Hainaut	8ed8592fd6	mesa/sso: Add support for GL_PROGRAM_SEPARABLE query This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	4177d39c1e	mesa/sso: Implement _mesa_IsProgramPipeline Implement IsProgramPipeline based on the VAO code. This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	0c26552662	mesa/sso: Implement _mesa_GenProgramPipelines Implement GenProgramPipelines based on the VAO code. This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	55311557fd	mesa/sso: Implement _mesa_DeleteProgramPipelines Implement DeleteProgramPipelines based on the VAO code. This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	f4c13a890f	mesa/sso: Add pipeline container/state V1: * Extend gl_shader_state as pipeline object state * Add a new container gl_pipeline_shader_state that contains binding point of the previous object * Update mesa init/free shader state due to the extension of the attribute * Add an init/free pipeline function for the context V2: * Rename gl_shader_state to gl_pipeline_object * Rename Pipeline.PipelineObj to Pipeline.Current * Formatting improvement V3 (idr): * Split out from previous uber patch. * Remove '#if 0' debug printfs. V4 (idr): * Fix some errors in comments. Suggested by Jordan. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	0f137a1d73	mesa: Add a mutex and refcounting to gl_shader_state Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	47476fa673	mesa: Make get_shader_flags publicly available Future patches will use this function outside shaderapi.c. This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	73b78f9c9f	mesa/sso: Add extension entry points for GL_ARB_separate_shader_objects Nothing is implemented yet except glProgramUniform*, which are mostly a copy/paste of the older glUniform* functions. I created a dedicated pipelineobj.[ch] file that will contain functions related to the "new" pipeline container object. V2: formatting improvement V3: * indentation fix * Update copyright * Add a comment on ProgramParameteri already present in another extension * Remove TODO, will be re-added in the correct patch V4 (idr): * Fix dispatch_sanity unit test * Make extension string available in core profiles (instead of just compatibility). * Trivial reformatting Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Ian Romanick	4d14b190bb	glsl/sso: Add parser and AST-to-HIR support for separate shader object layouts GL_ARB_separate_shader_objects adds the ability to specify location layouts for interstage inputs and outputs. In addition, this extension makes 'in' and 'out' generally available for shader inputs and outputs. This mimics the behavior of GL_ARB_explicit_attrib_location. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Ian Romanick	f3b184590f	mesa/sso: Add extension tracking for ARB_separate_shader_objects This adds the necessary bits for both the API and the GLSL compiler. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Ian Romanick	79146065f9	mesa: Refactor per-stage link check to its own function Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:01 -08:00
Emil Velikov	68bc1e2025	specs: MESA_query_renderer.spec resolve a couple of typos Cc: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-21 22:52:46 +00:00
Emil Velikov	0432aa064b	configure: use shared-glapi when more than one gl* API is used Current behaviour states that shared-glapi is useful when building with dri, which is not the case. Shared-glapi is used to dispatch the gl* functions across one or more gl APIs, which can be dri based but do not need to be. Fixes the following build: ./configure --enable-gles2 --disable-dri --enable-gallium-egl \ --with-egl-platforms=fbdev --with-gallium-drivers=swrast Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75098 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-21 22:48:50 +00:00
Emil Velikov	9eae750317	configure: use default dri drivers whenever opengl and dri are enabled Commit ee55500c22a(configure: cleanup classic dri drivers handling) cleaned up the logic handling autodetection of dri drivers, but missed the case when one can explicitly disable dri, and still request opengl. Fixes build issues for the following ./autogen.sh --disable-dri --with-gallium-drivers=swrast While we're here, explicitly clear with_dri_drivers whenever building without such drivers to prevent choking later on. v2: Simplify with_dri_drivers handling. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75126 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-21 22:47:51 +00:00
Eric Anholt	c2ebbe2728	i965: Stop throwing away our double precision for time calculations. Fixes negative times being reported in our perf debug. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:50 -08:00
Eric Anholt	f2f337c6d5	meta: Add support for integer blits. Compared to i965, the code generated doesn't use the AVG instruction. But I'm not sure that multisampled integer resolves are really that important to worry about. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Eric Anholt	b0a8d0ee40	meta: Add support for doing MSAA to MSAA blits. These are non-stretched, non-resolving blits, so it's just a matter of sampling once from our gl_SampleID and storing that to our color/depth. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Eric Anholt	eb55b01eef	meta: Save and restore a bunch of MSAA state. We're disabling GL_MULTISAMPLE, so we didn't need to worry about a lot of that state. But to do MSAA to MSAA blits, we need to start handling more state. v2: Fix pasteo caught by Kenneth. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Eric Anholt	f7f15d3c2d	meta: Try to do blending of sRGB values in linear colorspace. Blending of values would occur when doing GL_LINEAR filtering with scaling, and in an upcoming commit when doing MSAA resolves. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Eric Anholt	7d2f73e737	meta: Add support for doing multisample resolves. Note that this doesn't handle GL_EXT_multisample_scaled_blit yet. The i965 code for that extension bakes in knowledge of the sample positions (well, knowledge of the sample positions aligned to a lower-resolution grid), which we would have to do at runtime somehow for meta. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Eric Anholt	aba85d960e	i965: Fix miptree matching for multisampled, non-interleaved miptrees. We haven't been executing this code before the meta-blit case, because we've been flagging the miptree as validated at texstorage time, and never having to revalidate. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Courtney Goeltzenleuchter	941769be81	mesa: Remove unnecessary condition. Identified by Valgrind memory check. Initialized block-opaque in a different patch. This test seems unnecessary. If opaque must be true, just set to true. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>	2014-02-21 10:16:10 -08:00
Francisco Jerez	9b2fe7cf96	clover: Unabbreviate a few data accessor names for consistency. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:23 +01:00
Francisco Jerez	a0d99937a0	clover: Replace the transfer(new ...) idiom with a safer create(...) helper function. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:22 +01:00
Francisco Jerez	c4578d2277	clover: Migrate a bunch of pointers and references in the object tree to smart references. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:22 +01:00
Francisco Jerez	d82b39ce38	clover: Allow storing a range into a container of different (but compatible) element type. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:22 +01:00
Francisco Jerez	1b9fb2fd91	clover: Define an intrusive smart reference class. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:22 +01:00
Francisco Jerez	9ae0bd3829	clover: Some improvements for the intrusive pointer class. Define some additional convenience operators, clean up the implementation slightly, and rename it to 'intrusive_ptr' for reasons that will be obvious in the next commit. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:22 +01:00
Francisco Jerez	198cd136b9	clover: Fix up NULL constant pointer arguments. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:29:05 +01:00
Jordan Justen	c97763ca2d	tgsi_ureg: add property_gs_invocations Fixes a build break in state_tracker/st_program.c Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75278 Reviewed-by: Dave Airlie <airlied@redhat.com>	2014-02-20 16:41:01 -08:00
Kenneth Graunke	1336ccb7dd	i965: Enable Broadwell support. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:51:38 -08:00
Kenneth Graunke	808952a095	i965/fs: Implement FS_OPCODE_[UN]PACK_HALF_2x16_SPLIT[_XY] opcodes. I'd neglected to port these to Broadwell. Most of this code is copy and pasted from Gen7, but instead of using F32TO16/F16TO32, we just use MOV with HF register types. Fixes fs-packHalf2x16 and fs-unpackHalf2x16 tests (both the ARB extension and ES 3.0 variants). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:59 -08:00
Kenneth Graunke	850e372fc7	i965: Drop bogus F32TO16/F16TO32 instructions on Broadwell - use MOV. Broadwell removed the F32TO16 and F16TO32 instructions. However, it has actual support for HF values, so they're actually just MOV. Fixes vs-packHalf2x16 and vs-unpackHalf2x16 tests (both the ARB extension and ES 3.0 variants). v2: Emulate F32TO16's align16 zeroing bug, since Chad's front end code relies on it happening. We can probably refactor this code to be better later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:57 -08:00
Kenneth Graunke	3663bbe773	i965: Create a hardware context before initializing state module. brw_init_state() calls brw_upload_initial_gpu_state(). If hardware contexts are enabled (brw->hw_ctx != NULL), this will upload some initial invariant state for the GPU. Without hardware contexts, we rely on this state being uploaded via atoms that subscribe to the BRW_NEW_CONTEXT bit. Commit `46d3c2bf4d` accidentally moved the call to brw_init_state() before creating a hardware context. This meant brw_upload_initial_gpu_state would always early return. Except on Gen6+, we stopped uploading the initial GPU state via state atoms, so it never happened. Fixes a regression since `46d3c2bf4d`. Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:08 -08:00
Kenneth Graunke	e3823147a5	i965/fs: Implement scratch read/write support for Broadwell. To make sure that both the Gen4 and Gen7 style messages work, I initially disabled the SHADER_OPCODE_GEN7_SCRATCH_READ optimization, ran Piglit, re-enabled it, and ran Piglit again. Both worked fine. Fixes 40 Piglit tests (most of the varying-packing category). v2: Move num_regs assertion from gen8_fs_generator to gen8_set_dp_scratch_message() (suggested by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:08 -08:00
Kenneth Graunke	29a6974403	i965: Add Gen8 assembly support for DP Scratch messages. The new accessors will make it easy to do Gen7-style scratch messages. v2: Move num_regs assertion from gen8_fs_generator into gen8_set_dp_scratch_message() (suggested by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:08 -08:00
Kenneth Graunke	a5e54c91a3	i965: Store absolute thread count in max_wm_threads on Broadwell. In the past, 3DSTATE_PS took an absolute number of threads. Conversely, on Broadwell you always program 64, and it implicitly scales based on the GT-level with no special programming. So, I stored 64 in brw_device_info::max_wm_threads. However, I didn't realize that we also use max_wm_threads to compute the size of the scratch space buffer. In that case, we really need the absolute number of threads. This patch hardcodes 3DSTATE_PS to use the value it expects, and changes max_wm_threads back to a (completely fake) absolute thread count (once again copied from Haswell). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:08 -08:00
Kenneth Graunke	dca84b4b5b	i965: Use MOV, not OR for setting URB write channel enables on Gen8+. On Broadwell, g0.5 contains the "Scratch Space Pointer"; using OR puts some bits of that into "ignored" sections of our message header. While this doesn't hurt, it's also not terribly /useful/. Using MOV is sufficient to set the only interesting bits in this part of the message header. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:07 -08:00
Kenneth Graunke	e643c7d036	i965: Implement a CS stall workaround on Broadwell. According to the latest documentation, any PIPE_CONTROL with the "Command Streamer Stall" bit set must also have another bit set, with five different options: - Render Target Cache Flush - Depth Cache Flush - Stall at Pixel Scoreboard - Post-Sync Operation - Depth Stall I chose "Stall at Pixel Scoreboard" since we've used it effectively in the past, but the choice is fairly arbitrary. Implementing this in the PIPE_CONTROL emit helpers ensures that the workaround will always take effect when it ought to. Apparently, this workaround may be necessary on older hardware as well; for now I've only added it to Broadwell as it's absolutely necessary there. Subsequent patches could add it to older platforms, provided someone tests it there. v2: Only flag "Stall at Pixel Scoreboard" when none of the other bits are set (suggested by Ian Romanick). v3: Prefix the function with "gen8" (requested by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v2) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:07 -08:00
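The v2 rule in the commit above (add "Stall at Pixel Scoreboard" only when a CS stall is requested and none of the five companion bits is already set) can be sketched as follows. This is an illustrative sketch only: the flag names and bit positions are hypothetical and are not the hardware encoding or the driver's actual defines.

```c
#include <stdint.h>

/* Hypothetical flag values for illustration; NOT the real PIPE_CONTROL bits. */
enum {
    PC_CS_STALL            = 1 << 0,
    PC_RENDER_TARGET_FLUSH = 1 << 1,
    PC_DEPTH_CACHE_FLUSH   = 1 << 2,
    PC_STALL_AT_SCOREBOARD = 1 << 3,
    PC_POST_SYNC_OP        = 1 << 4,
    PC_DEPTH_STALL         = 1 << 5
};

/* Return the flags with the workaround applied: a CS stall must be
 * accompanied by at least one of the five companion bits. */
static uint32_t add_cs_stall_workaround_bits(uint32_t flags)
{
    const uint32_t companions = PC_RENDER_TARGET_FLUSH |
                                PC_DEPTH_CACHE_FLUSH |
                                PC_STALL_AT_SCOREBOARD |
                                PC_POST_SYNC_OP |
                                PC_DEPTH_STALL;

    /* Only add the extra bit when nothing else already satisfies the rule. */
    if ((flags & PC_CS_STALL) && !(flags & companions))
        flags |= PC_STALL_AT_SCOREBOARD;

    return flags;
}
```

Applying the check in one helper, as the commit message describes for the PIPE_CONTROL emit helpers, means callers do not have to remember the restriction.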
Jordan Justen	741782b594	i965: support instanced GS on gen7 v3: * Properly prevent dual object mode execution when the invocation count > 1 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:09 -08:00
Jordan Justen	008338bc4e	i965: support gl_InvocationID for gen7 v2: * Make gl_InvocationID a system value v3: * Properly shift from R0.1 into DST.4 by adding GS_OPCODE_GET_INSTANCE_ID Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:09 -08:00
Jordan Justen	d099019935	glsl: add gl_InvocationID variable for ARB_gpu_shader5 v2: * Make gl_InvocationID a system value Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:09 -08:00
Jordan Justen	22388e2208	main/shaderapi: GL_GEOMETRY_SHADER_INVOCATIONS GetProgramiv support v3: * Add check for ARB_gpu_shader5 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:09 -08:00
Jordan Justen	86d6b5546b	mesa: initialize gl_geometry_program Invocations field Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:09 -08:00
Jordan Justen	313402048f	glsl/linker: produce gl_shader_program Geom.Invocations Grab the parsed invocation count, check for consistency during linking, and finally save the result in gl_shader_program Geom.Invocations. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:08 -08:00
Jordan Justen	02dc74fbd7	glsl: parse invocations layout qualifier for ARB_gpu_shader5 _mesa_glsl_parse_state in_qualifier->invocations will store the invocations count. v3: * Use in_qualifier to allow the primitive to be specified separately from the invocations count (merge_qualifiers) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:08 -08:00
Jordan Justen	738c9c3c54	glsl: Generate error for invalid input layout declarations Fixes various piglit tests: spec/glsl-1.50/compiler/incorrect-in-layout-qualifier-*.geom Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:08 -08:00
Jordan Justen	0c558f9ee6	glsl: convert GS input primitive to use ast_type_qualifier We introduce a new merge_in_qualifier ast_type_qualifier which allows specialized handling of merging input layout qualifiers. By merging layout qualifiers into state->in_qualifier, we allow multiple input qualifiers. For example, the primitive type can be specified separately from the invocations count (ARB_gpu_shader5). state->gs_input_prim_type is moved into state->in_qualifier->prim_type state->gs_input_prim_type_specified is still processed separately so we can determine when the input primitive is specified. This is important since certain scenarios are not supported until after the primitive type has been specified in the shader code. v4: * Merge with compute shader input layout qualifiers Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:08 -08:00
Eric Anholt	5bc0b2f432	i965: Fix extra return value after winsys rb update refactor. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75172 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-20 10:15:13 -08:00
Eric Anholt	9245206cbf	i965/vs: Use samplers for UBOs in the VS like we do for non-UBO pulls. Improves performance of a dolphin emulator trace I had laying around by 3.60131% +/- 0.995887% (n=128). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-20 10:15:13 -08:00
Eric Anholt	9e3cab8881	i965/fs: Add an optimization pass to remove redundant flags movs. We generate steaming piles of these for the centroid workaround, and this quickly cleans them up. total instructions in shared programs: 1591228 -> 1590047 (-0.07%) instructions in affected programs: 26111 -> 24930 (-4.52%) GAINED: 0 LOST: 0 (Improved apps are l4d2, csgo, and dolphin) Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-20 10:15:13 -08:00
Roland Scheidegger	b2b2a2c06c	gallivm: add smallfloat to float conversion not relying on cpu denorm handling The previous code relied on cpu denorm support for converting small float formats (such as r11g11b10_float and r16_float) to floats, otherwise denorms are flushed to zero. We worked around that in llvmpipe blend code by reenabling denorms, but this did nothing for texture sampling. Now it would be possible to reenable it there too but I'm not really a fan of messing with fpu flags (and it seems we can't actually do it reliably with llvm in any case looking at some bug reports). (Not to mention if you actually have a lot of denorms in there, you can expect some order-of-magnitude slowdown with x86 cpus.) So instead use code which adjusts exponents etc. directly hence not relying on cpu denorm support for the rescaling mul. (We still need the fpu flag handling as we can't do float-to-smallfloat without using cpu denorms at least for now - I actually wanted to keep both the old and new code and use one or the other depending on where it's called from, but that didn't work out as the parameter would have to be passed through more layers than I'd like.) Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Si Chen <sichen@vmware.com>	2014-02-20 18:41:42 +01:00
Leo Liu	0206f0b3d4	st/omx/enc: add multi scaling buffers for performance improvement Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-20 13:34:16 +01:00
Christian König	754fa3a0d2	st/omx/dec/h264: fix prevFrameNumOffset handling Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-20 13:34:06 +01:00
Kenneth Graunke	57405605a8	i965: Actually claim to support MSAA on Broadwell. We need to advertise 8x, 4x, and 2x multisamples. Previously, we only claimed to support 0/1 samples. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:43:22 -08:00
Kenneth Graunke	4af8c95783	i965: Update physical width/height munging for 2x IMS MSAA. I can't find any documentation to explain what ought to be done here, so I simply guessed based on the pattern I observed in the 4x/8x cases. It appears to work, but it could be totally wrong. I was able to find the Sandybridge PRM quote from the comments in the latest documentation: Shared Functions > 3D Sampler > Multisampled Surface Behavior. However, it only mentions 4x MSAA - not even 8x. After a substantial amount more digging, I was able to find a second page (incorrectly tagged) which confirmed the formulas in our code for 8x MSAA. However, that page didn't mention 2x MSAA at all. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:43:22 -08:00
Kenneth Graunke	51145a24f7	i965: Enable smooth points when multisampling without point sprites. According to the "Point Multisample Rasterization" of the OpenGL specification (3.0 or later), smooth points are supposed to be enabled implicitly when multisampling, regardless of the GL_POINT_SMOOTH flag. However, if GL_POINT_SPRITE is enabled, you get square points no matter what. Core contexts always enable point sprites, so this effectively makes smooth points go away, even in the case of multisampling. Fixes Piglit's EXT_framebuffer_multisample/point-smooth tests. (Yes, that's right folks, we actually have Piglit tests for this.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:43:22 -08:00
Kenneth Graunke	a3d70580b5	i965: Thwack multisample enable bit in 3DSTATE_RASTER. The meaning and effects of this bit are surprisingly complicated. See Rasterization > Windower > Multisampling > Multisample ModesState. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:43:22 -08:00
Kenneth Graunke	0c5873c9b9	i965: Only use the SIMD16 program for per-sample shading on Broadwell. This restriction carries forward from earlier platforms. The code is ported straight from gen7_wm_state.c. v2: Actually do it right. v3: Add missing _NEW_MULTISAMPLE bit (caught by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:42:54 -08:00
Kenneth Graunke	61d7ea4b16	i965: Set "Position XY Offset Select" bits in 3DSTATE_PS on Broadwell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:42:16 -08:00
Kenneth Graunke	01c42b2be6	i965: Add missing sample shading bits to Gen8's 3DSTATE_PS_EXTRA. v2: Also set the "oMask Present to Render Target" bit, which is required for shaders that write oMask. Otherwise the hardware won't expect the extra data. v3: Add missing _NEW_MULTISAMPLE (caught by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:42:02 -08:00
Kenneth Graunke	77c37ed74b	i965/fs: Implement FS_OPCODE_SET_OMASK on Broadwell. I made a few changes which I think simplify the code a bit compared to the Gen7 implementation, but which are largely pointless. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:39:41 -08:00
Kenneth Graunke	5476da79f8	i965/fs: Implement FS_OPCODE_SET_SAMPLE_ID on Broadwell. Largely cut and paste from Gen7; it works the same way. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:39:41 -08:00
Kenneth Graunke	80c4edfc27	i965: Disable MCS on Broadwell for now. v2: Add a perf_debug() message to remind us to come back to this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:39:21 -08:00
Kenneth Graunke	4eba0d124d	i965: Use gen7_surface_msaa_bits in Broadwell SURFACE_STATE code. We already set the number of samples, but were missing the MSAA layout mode. Reusing gen7_surface_msaa_bits makes it easy to set both. This also lets us drop the Gen8 surface_num_multisamples function. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:35:54 -08:00
Kenneth Graunke	6eeae17c02	i965: Use ffs() for sample counting in gen7_surface_msaa_bits(). The enumerations are just log2(num_samples) shifted by 3, which we can easily compute via ffs(). This also makes it reusable for Broadwell, which has 2x MSAA. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:35:53 -08:00
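The ffs() trick described in the commit above can be sketched as follows. For a power-of-two sample count, `ffs(n) - 1` equals log2(n), and the result is then shifted into place. The function name is hypothetical; the shift of 3 is taken from the commit message, and the sketch assumes the sample count is a nonzero power of two.

```c
#include <assert.h>
#include <strings.h>   /* ffs() (POSIX) */

/* Hypothetical helper: encode a sample count as log2(num_samples) << 3. */
static unsigned msaa_sample_count_bits(unsigned num_samples)
{
    /* Only nonzero powers of two are meaningful here. */
    assert(num_samples != 0 && (num_samples & (num_samples - 1)) == 0);

    /* ffs() returns the 1-based index of the lowest set bit,
     * so for a power of two, ffs(n) - 1 == log2(n). */
    return (unsigned)(ffs((int)num_samples) - 1) << 3;
}
```

Because the encoding is computed rather than looked up, a 2x sample count needs no extra case, which is the reuse for Broadwell that the commit mentions.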
Kenneth Graunke	2ed5824a5d	i965: Simplify Broadwell's 3DSTATE_MULTISAMPLE sample count handling. These enumerations are simply log2 of the number of multisamples shifted by a bit, so we can calculate them using ffs() in a lot less code. Suggested by Eric Anholt. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:35:32 -08:00
Ian Romanick	7700c73cf4	glsl: Silence "type qualifiers ignored on function return type" warning The const in const unsigned foo(void); is meaningless. Removing it silences this warning: src/glsl/ast_to_hir.cpp:1802:56: warning: type qualifiers ignored on function return type [-Wignored-qualifiers] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:08:50 -08:00
Ian Romanick	2c85fd5a96	glsl: Only warn for macro names containing __ From page 14 (page 20 of the PDF) of the GLSL 1.10 spec: "In addition, all identifiers containing two consecutive underscores (__) are reserved as possible future keywords." The intention is that names containing __ are reserved for internal use by the implementation, and names prefixed with GL_ are reserved for use by Khronos. Names simply containing __ are dangerous to use, but should be allowed. Per the Khronos bug mentioned below, a future version of the GLSL specification will clarify this. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de> Cc: Tapani Pälli <lemody@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870 Bugzilla: Khronos #11702	2014-02-19 15:08:50 -08:00
Ian Romanick	0bd7892630	glcpp: Only warn for macro names containing __ Section 3.3 (Preprocessor) of the GLSL 1.30 spec (and later) and the GLSL ES spec (all versions) say: "All macro names containing two consecutive underscores ( __ ) are reserved for future use as predefined macro names. All macro names prefixed with "GL_" ("GL" followed by a single underscore) are also reserved." The intention is that names containing __ are reserved for internal use by the implementation, and names prefixed with GL_ are reserved for use by Khronos. Since every extension adds a name prefixed with GL_ (i.e., the name of the extension), that should be an error. Names simply containing __ are dangerous to use, but should be allowed. In similar cases, the C++ preprocessor specification says, "no diagnostic is required." Per the Khronos bug mentioned below, a future version of the GLSL specification will clarify this. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de> Cc: Tapani Pälli <lemody@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870 Bugzilla: Khronos #11702	2014-02-19 15:08:50 -08:00
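The policy described in the two commits above (names prefixed with "GL_" are an error, names merely containing "__" only draw a warning) can be sketched like this. It is a minimal illustration, not the glcpp implementation; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
enum macro_name_status {
    MACRO_NAME_OK,
    MACRO_NAME_WARN,   /* contains "__": risky but allowed */
    MACRO_NAME_ERROR   /* starts with "GL_": reserved */
};

static enum macro_name_status check_macro_name(const char *name)
{
    assert(name != NULL);

    /* The reserved prefix takes priority over the "__" check. */
    if (strncmp(name, "GL_", 3) == 0)
        return MACRO_NAME_ERROR;

    /* Two consecutive underscores anywhere in the name. */
    if (strstr(name, "__") != NULL)
        return MACRO_NAME_WARN;

    return MACRO_NAME_OK;
}
```

Checking the prefix first is a choice made for this sketch: a name such as `GL_a__b` is reported as an error rather than a warning.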
Tom Stellard	a4c734297f	configure: Use LLVM shared libraries by default Linking with LLVM static libraries is easily broken by changes to the llvm-config program or when LLVM adds, removes, or changes library components. Keeping up with these changes requires a lot of maintenance effort to keep the build working on the master and stable branches. Also, because of issues in the past with LLVM static libraries, the release manager is currently configuring with --with-llvm-shared-libs when checking the build before release. Enabling shared libraries by default would allow the release manager to run ./configure with no arguments, and be reasonably confident that the build would succeed. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-19 14:35:49 -05:00
Francisco Jerez	8928d7860a	i965/fs: Allocate the param_size array dynamically. Useful because the total number of uniform components might exceed MAX_UNIFORMS * 4 in some cases because of the image metadata we'll be passing as push constants. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 19:03:56 +01:00
Francisco Jerez	eef710fc53	i965/fs: Use a separate variable to keep track of the last uniform index seen. Like the VEC4 back-end does. It will make dynamic allocation of the param_size array easier in a future commit. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 19:03:56 +01:00
Rob Clark	9186cd39d4	freedreno: tweak ringbuffer sizes/count Since we are now consuming two ringbuffers at a time, we probably want a pool larger than 4.. but we don't need each individual ringbuffer to be so large, so offset the pool size increase by reducing rb size. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-19 12:02:57 -05:00
Rob Clark	5993723471	freedreno/a3xx/compiler: scheduling/legalize fixes It seems the write-after-read hazard that applies to texture fetch instructions, also applies to sfu instructions. Also, cat5/cat6 instructions do not have a (ss) bit, so in these cases we need to insert a dummy nop instruction with (ss) bit set. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-19 12:01:26 -05:00
Francisco Jerez	bbf8239f92	i965: Have brw_imm_vf4() take the vector components as integer values. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:56:57 +01:00
Francisco Jerez	51b00c5cb9	i965: Add helper function to find out the signedness of a register type. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:56:57 +01:00
Francisco Jerez	560f10e573	i965/vec4: Use swizzle() in the ARB_vertex_program code. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	8797ccf3fa	i965/fs: Use offset() in the ARB_fragment_program code. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	6f56d5dc60	i965/fs: Remove fs_reg::retype. There doesn't seem to be any reason for it to be a method, and it's surprising that the expression 'reg.retype(t)' doesn't retype its object but rather it creates a temporary with the new type. Use 'retype(reg, t)' instead. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	3b03273275	i965/vec4: Trivial improvements to the with_writemask() function. Add assertion that the register is not in the HW_REG or IMM file, calculate the conjunction of the old and new mask instead of replacing the old [consistent with the behavior of brw_writemask(), causes no functional changes right now], make it static inline to let the compiler do a slightly better job at optimizing things, and shorten its name. v2: Assert that the new writemask is not zero to avoid undefined hardware behaviour. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	42b226ef82	i965: Make sure that backend_reg::type and brw_reg::type are consistent for fixed regs. And define non-mutating helper functions to retype fixed and normal regs with a common interface. At some point we may want to get rid of ::fixed_hw_reg completely and have fixed regs use the normal register data members (e.g. backend_reg::reg to select a fixed GRF number, src_reg::swizzle to store the swizzle, etc.), I have the feeling that this is not the last headache we're going to get because of the multiple ways to represent the same thing and the different register interface depending on the file a register is stored in... Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	98306e727b	i965/vec4: Add non-mutating helper functions to modify src_reg::swizzle and ::negate. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	2337820d49	i965: Add non-mutating helper functions to modify the register offset. Yes, we could avoid having four copies of essentially the same code by using templates here. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	af25addcd0	i965/vec4: Fix off-by-one register class overallocation. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	a32817f3c2	i965: Unify fs_generator:: and vec4_generator::mark_surface_used as a free function. This way it can be used anywhere. I need it from the visitor. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	ae8b066da5	i965: Move up duplicated fields from stage-specific prog_data to brw_stage_prog_data. There doesn't seem to be any reason for nr_params, nr_pull_params, param, and pull_param to be duplicated in the stage-specific subclasses of brw_stage_prog_data. Moving their definition to the common base class will allow some code sharing in a future commit, the removal of brw_vec4_prog_data_compare and `brw_*_prog_data_free`, and the simplification of the stage-specific `brw_*_prog_data_compare`. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:22 +01:00
Francisco Jerez	7f00c5f1a3	i965/vec4: Add constructor of src_reg from a fixed hardware reg. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 15:10:57 +01:00
Kenneth Graunke	98e048cf32	i965: Enable fast depth clears. They work fine now, too. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-19 01:46:17 -08:00
Kenneth Graunke	7023786417	i965: Enable HiZ on Broadwell. It appears to work fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-19 01:46:17 -08:00
Kenneth Graunke	8cad1c115a	i965: Implement HiZ resolves on Broadwell. Broadwell's 3DSTATE_WM_HZ_OP packet makes this much easier. Instead of programming the whole pipeline, we simply have to emit the depth/stencil packets, a state override, and a pipe control. Then arrange for the state to be put back. This is easily done from a single function. v2: Use minify(mt->logical_{width,height}0, level) in 3DSTATE_WM_HZ_OP instead of intel_mipmap_level's width/height fields. Those were based on the physical width/height, and thus wrong for MSAA buffers. Eric also deleted those fields. v3: Use 0xFFFF as the sample mask regardless of what the user set (as this operation is unrelated); set the drawing rectangle to the miplevel being operated on, rather than the whole surface; remove unnecessary MAX2(..., 1) around mt->logical_depth0 (all suggested by Eric Anholt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-19 01:46:17 -08:00
Kenneth Graunke	82711611cf	i965: Refactor Gen8 depth packet emission. The existing code followed the vtable function signature, which is not a great fit: many of the parameters are unused, and the function still inspects global state, making it less reusable. This patch refactors the depth buffer packet emission code into a new function which takes exactly the parameters it needs, and which uses no global state. It then makes the existing vtable function call the new one. Ideally, we would remove the vtable function, and clean up that interface. But that can happen once HiZ is working. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-19 01:46:17 -08:00
Kenneth Graunke	67f073b91c	i965: Add #defines for the 3DSTATE_WM_HZ_OP packet's contents. We're going to need these to implement HiZ. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-19 01:46:17 -08:00
Kenneth Graunke	577fdf1f48	i965: Bump generation check in code to disable HiZ at LODs > 0. Broadwell's "HiZ Resolve" operation still has the restriction that the rectangle primitive must be 8x4 aligned. So I believe we still need this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-19 01:46:17 -08:00
Kenneth Graunke	a5d2eb6b98	i965: Program 3DSTATE_HIER_DEPTH_BUFFER properly on Broadwell. HiZ buffers still don't exist, but when they do, we'll set them up. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-19 01:46:16 -08:00
Kenneth Graunke	09d9a8913e	i965: Pull format conversion logic out of brw_depthbuffer_format. brw_depthbuffer_format is not very reusable at the moment, since it uses global state (ctx->DrawBuffer) to access a particular depth buffer. For HiZ on Broadwell, I need a function which simply converts the formats. However, at least one existing user of brw_depthbuffer_format really wants the existing interface. So, I've created a new function. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-19 01:46:16 -08:00
Chia-I Wu	4695f64895	egl: clarify what _eglInitResource does It is a helper called from the initializers of its subclasses.	2014-02-19 13:08:54 +08:00
Chia-I Wu	dc97e54d97	Revert "egl: Unhide functionality in _eglInitContext()" This reverts commit `1456ed85f0`. _eglInitResource can and is supposed to be called on subclass objects. Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-02-19 13:08:52 +08:00
Chia-I Wu	924490a747	Revert "egl: Unhide functionality in _eglInitSurface()" This reverts commit `498d10e230`. _eglInitResource can and is supposed to be called on subclass objects. Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-02-19 13:08:44 +08:00
Kenneth Graunke	c593ad6e46	i965: Bump MaxTexMbytes from 1GB to 1.5GB. Even with the other limits raised, TestProxyTexImage would still reject textures > 1GB in size. This is an artificial limit; nothing prevents us from having a larger texture. I stayed shy of 2GB to avoid the larger-than-aperture situation. For 3D textures, this raises the effective limit: - RGBA8: 645 -> 738 - RGBA16: 512 -> 586 - RGBA32F: 406 -> 465 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-18 18:59:24 -08:00
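The effective 3D texture limits quoted in the commit above can be reproduced with a short calculation, assuming they describe a cubic texture with a single mip level: the largest side n such that n * n * n * bytes_per_texel fits in the byte limit. The function name is hypothetical, and the bytes-per-texel values (4, 8, 16 for RGBA8, RGBA16, RGBA32F) are the usual sizes of those formats rather than something stated in the log.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: largest cube side whose single-level storage
 * fits within limit_mbytes megabytes. */
static unsigned max_cubic_3d_size(uint64_t limit_mbytes, unsigned bytes_per_texel)
{
    assert(bytes_per_texel > 0);

    const uint64_t limit_bytes = limit_mbytes * 1024 * 1024;
    unsigned n = 0;

    /* Integer search avoids floating-point rounding at exact cube roots. */
    while ((uint64_t)(n + 1) * (n + 1) * (n + 1) * bytes_per_texel <= limit_bytes)
        n++;

    return n;
}
```

With 1024 MB and 1536 MB as the old and new limits, this yields the 645 -> 738, 512 -> 586, and 406 -> 465 figures listed in the commit message.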
Kenneth Graunke	6c04423153	i965: Bump GL_MAX_CUBE_MAP_TEXTURE_SIZE to 8192. Gen4+ supports 8192x8192 cube maps. Ivybridge and later can actually support 16384, but that would place GL_MAX_CUBE_MAP_TEXTURE_SIZE above GL_MAX_TEXTURE_SIZE, which seems like a bad idea. (Unfortunately, we can't bump GL_MAX_TEXTURE_SIZE to 16384 without causing regressions due to awful W-tiled stencil buffer interactions.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-18 18:59:18 -08:00
Kenneth Graunke	06b047ebc7	i965: Bump MAX_3D_TEXTURE_SIZE to 2048. It's highly unlikely that there will be enough memory in the system to allocate enough space for this, but we should still expose the hardware limit. It's what the Intel Windows driver does, and it seems most other vendors do likewise. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-18 18:58:57 -08:00
Ian Romanick	f0fdee5095	docs: Trivial updates to MESA_query_renderer.spec Fix the version and the status before sending to Khronos for listing in the registry. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-18 15:25:04 -08:00
Sinclair Yeh	6c9d6898fd	Prevent zero sized wl_egl_window It is illegal to create or resize a window to zero (or negative) width and/or height. This patch prevents such a request from happening.	2014-02-18 14:12:11 -08:00
Anuj Phogat	03597cf802	glsl: Fix condition to generate shader link error GL_ARB_ES2_compatibility doesn't say anything about shader linking when one of the shaders (vertex or fragment shader) is absent. So, the extension shouldn't change the behavior specified in GLSL specification. Tested the behavior on proprietary Linux drivers of NVIDIA and AMD. Both of them allow linking a version 100 shader program in OpenGL context, when one of the shaders is absent. Makes the following Khronos CTS tests pass: successfulcompilevert_linkprogram.test successfulcompilefrag_linkprogram.test Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-18 11:07:09 -08:00
Anuj Phogat	6bd2472a8b	mesa: Add GL_TEXTURE_CUBE_MAP_ARRAY to legal_get_tex_level_parameter_target() Fixes failing Khronos CTS test packed_depth_stencil_init.test Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-18 11:07:09 -08:00
Eric Anholt	d92f593d87	i965/fs: Use conditional sends to do FB writes on HSW+. This drops the MOVs for header setup, which are totally mis-scheduled. total instructions in shared programs: 1590047 -> 1589331 (-0.05%) instructions in affected programs: 43729 -> 43013 (-1.64%) GAINED: 0 LOST: 0 glb27-trex: x before, + after [ASCII histogram omitted] N Min Max Median Avg Stddev; x: 49 62.33 65.41 63.49 63.53449 0.62757822; +: 50 62.28 65.4 63.7 63.6982 0.656564. No difference proven at 95.0% confidence Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 10:11:36 -08:00
Eric Anholt	4226798354	i965/fs: Drop dead comment about the old proj_attrib_mask optimization. The code was removed early last year. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 10:01:45 -08:00
Eric Anholt	f128bcc7c2	i965: Drop mt->levels[].width/height. It often confused people because it was unclear on whether it was the physical or logical, and people needed the other one as well. We can recompute it trivially using the minify() macro, clarifying which value is being used and making getting the other value obvious. v2: Fix a pasteo in intel_blit.c's dst flip. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 10:01:45 -08:00
Eric Anholt	4e0924c5de	i965: Move singlesample_mt to the renderbuffer. Since only window system renderbuffers can have a singlesample_mt, this lets us drop a bunch of sanity checking to make sure that we're just a renderbuffer-like thing. v2: Fix a badly-written comment (thanks Kenneth!), drop the now trivial helper function for set_needs_downsample. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 10:01:45 -08:00
Eric Anholt	019560c127	i965: Drop some duplicated code in DRI winsys BO updates. The only DRI2 vs DRI3 delta was just how to decide about frontbuffer-ness for doing the upsample. v2: Fix missing singlesample_mt->region->name update in the merged code, which would have broken the DRI2 don't-recreate-the-miptree optimization. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 09:56:36 -08:00
Eric Anholt	0440e677b9	i965: Simplify intel_miptree_updownsample. Pretty silly to pass in values dereferenced out of one of the arguments. v2: Get the destination size from the dst, even though the callers are always dealing with src size == dst size cases. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 09:56:34 -08:00
Eric Anholt	bbd85ad27c	i965: Don't try to use the ctx->ReadBuffer when asked to blorp miptrees. So far it's happened to be that we're only ever calling intel_miptree_blit() (up/downsampling) from the ReadBuffer, but I stumbled over a null ReadBuffer case when debugging later parts of the series. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 09:56:32 -08:00
Eric Anholt	af4f758a44	i965: Make the mt->target of multisample renderbuffers be 2D_MS. Mostly mt->target == 2D_MS just results in a few checks that we don't try to allocate multiple LODs and don't try to do slice copies with them. But with the introduction of binding renderbuffers to textures, we need more consistency. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 09:56:29 -08:00
Eric Anholt	4e4a537ad5	meta: Push into desktop GL mode when doing meta operations. This lets us simplify our shaders, and rely on GLES-prohibited functionality (like ARB_texture_multisample) when writing these driver-internal functions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 09:56:27 -08:00
Eric Anholt	b3dcce65c9	meta: Fix blit shader compile on non-glsl-130 drivers. Compare this VS to the one for the post-130 case. Fixes piglit glsl-lod-bias, and presumably tons of other code (I haven't done a full piglit run on swrast). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74911 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 09:56:06 -08:00
Rob Clark	20d14ef263	configure: fix build error with XA Fixes: xa_tracker.c: In function 'xa_tracker_create': xa_tracker.c:147:5: error: implicit declaration of function 'pipe_loader_drm_probe_fd' [-Werror=implicit-function-declaration] in some build configurations, as XA now implicitly depends on gallium_drm_loader. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-02-18 08:12:37 -05:00
Michel Dänzer	cf0172d46a	r600g,radeonsi: Consolidate logic for short-circuiting flushes Fixes radeonsi emitting command streams to the kernel even when there have been no draw calls before a flush, potentially powering up the GPU needlessly. Incidentally, this also cuts the runtime of piglit gpu.py in about half on my Kaveri system, probably because an X11 client going away no longer always results in a command stream being submitted to the kernel via glamor. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=65761 Cc: "10.1" mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-02-18 10:46:23 +09:00
Emil Velikov	adad8fb2e9	st/dri: remove #ifdef DRM_CAP_PRIME guard Required for libdrm 2.4.37 and earlier. Both scons and automake require version 2.4.38 now so that guard is no longer needed. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:08:26 +00:00
Emil Velikov	6fbd00e43a	automake: remove leftover XORG and LIBKMS variables No longer set or used since the removal of st/xorg. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:08:03 +00:00
Emil Velikov	4b3a4c799a	scons: sync package requirements xorg-server and libkms are no longer required since the removal of the xorg state-tracker. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:04:07 +00:00
Emil Velikov	5fe47969c0	configure: bump up libdrm requirement to 2.4.38 This is the first version that introduced DRM_CAP_PRIME, which is implicitly required by egl/wayland. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:04:02 +00:00
Emil Velikov	f41102b538	configure: use test -n whenever possible Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:00:30 +00:00
Emil Velikov	8015ffeea1	configure: use test -z whenever possible Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:00:23 +00:00
Emil Velikov	ee55500c22	configure: cleanup classic dri drivers handling * Make sure that only drivers that are handled by configure.ac are included in DRI_DIRS. * Change with_dri_drivers default value to auto, and enable autodetection when enable_opengl is on. v2: Move "test" to the correct location. v3: Squash DRI_DIRS handling before the switch statement. Suggested by Ilia Mirkin Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:00:19 +00:00
Emil Velikov	35f6eed742	configure: compact ppc/sparc DRI_DIRS handling Both arches have the same list of dri_dirs. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:00:13 +00:00
Emil Velikov	65e67b9bf7	configure: drop explicit DRI_DIRS assignment on some platforms/arches Both x86_64\|amd64 and *bsd already set the full range of available classic dri drivers. Drop the explicit assignment, and fall back to the generic default. Keep the explicit list for platforms/arches that do not handle the default list. Update help strings to explicitly mention "classic" for applicable DRI drivers. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:00:05 +00:00
Emil Velikov	49e93e8945	configure: cleanup switch statement Move all the cases within one switch statement and handle i9{1,6}5 and r{adeon,200} independently. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-17 23:59:25 +00:00
Kusanagi Kouichi	d23f9e3390	targets/vdpau: Don't link unused libraries libvdpau, libselinux and libexpat are not used. Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>	2014-02-17 21:14:17 +00:00
Kusanagi Kouichi	6ba4392da2	configure: Try pkg-config first for libselinux v2 (Emil) Add SELINUX_CFLAGS in the respective locations Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)	2014-02-17 21:14:16 +00:00
Kusanagi Kouichi	61f6cddef7	targets/vdpau: Always use c++ to link If built without llvm, the following error occurs with mplayer: Failed to open VDPAU backend .../libvdpau_r600.so: undefined symbol: _ZTVN10__cxxabiv117__class_type_infoE [vo/vdpau] Error when calling vdp_device_create_x11: 1 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-17 21:14:16 +00:00
Ilia Mirkin	6958fb341f	st/xvmc: fix tests so that they pass Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-16 23:21:57 -05:00
Rob Clark	8b5f894e13	pipe-loader: add pipe loader for freedreno/msm Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:36:23 -05:00
Rob Clark	24fa96163a	st/xa: missing handle type DRM_API_HANDLE_TYPE_SHARED is zero, so doesn't actually fix anything. But we shouldn't rely on SHARED handle type being zero. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:36:23 -05:00
Rob Clark	42158926c6	st/xa: use pipe-loader to get screen This lets multiple gallium drivers use XA. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:36:19 -05:00
Rob Clark	a122c75599	pipe-loader: split out "client" version Build two versions of pipe-loader, with only the client version linking in x11 client side dependencies. This will allow the XA state tracker to use pipe-loader. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:31:10 -05:00
Rob Clark	d73b2c0517	freedreno/a3xx/compiler: use (ss) for WAR hazards Seems texture sample instructions don't immediately consume their src(s). In fact, some shaders from the blob compiler seem to indicate that it does not even count the texture sample instructions when calculating the number of delay slots to fill for non-sample instructions. (Although so far it seems inconclusive as to whether this is required.) In particular, when a src register of a previous texture sample instruction is clobbered, the (ss) bit is needed to synchronize with the tex pipeline to ensure it has picked up the previous values before they are overwritten. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:17:23 -05:00
Rob Clark	e8cca57a3f	freedreno/a3xx/compiler: fix RA typo Was supposed to be a '+', otherwise we end up with a negative offset and choosing registers below the assigned range. This seems to fix the scheduling mystery "solved" by adding in extra delay slots. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:17:23 -05:00
Rob Clark	579473f8f8	freedreno/a3xx/compiler: handle kill properly (new compiler) Since 'kill' does not produce a result, the new compiler was happily optimizing them out. We need to instead track 'kill's similar to outputs. But since there is no non-predicated kill instruction (and for flattened if/else we do want them to be predicated), we need to track the topmost branch condition on the stack and use that as src arg to the kill. For a kill at the topmost level, we have to generate an immediate 1.0 to feed into the cmps.f for setting the predicate register. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:17:23 -05:00
Rob Clark	e35747b882	freedreno/a3xx/compiler: trans_cmp() sanity Thanks to figuring out 32bit float render target, and adding regdump test in fdre-a3xx, I can more easily play around with instructions to figure out range of inputs/outputs/etc. And from this I can conclude that cmps.f works more like expected and I can do something much more simple in trans_cmp() (compared to before which was more closely emulating the instruction sequence of the blob compiler). And using sel.b32 (binary 0/1) often makes more sense than sel.f32 (+/- float) or sel.u32 (+/- uint) as it can use the output directly from cmps.f without needing the 'add.s tmp0, tmp0, -1'. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:17:23 -05:00
Rob Clark	89dc282581	freedreno: fix problems if no color buf bound Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:17:23 -05:00
Eric Anholt	1020d8937e	meta: Don't try to enable FF texturing when we're using GLSL. On a core context, this would throw an error. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-14 12:09:42 -08:00
Carl Worth	a92581acf2	main: Avoid double-free of shader Label As documented, the _mesa_free_shader_program_data function: "Frees all the data that hangs off a shader program object, but not the object itself." This means that this function may be called multiple times on the same object, (and has been observed to). Meanwhile, the shProg->Label field was not being set to NULL after its free(). This led to a second call to free() of the same address on the second call to this function. Fix this by setting this field to NULL after free(), (just as with all other calls to free() in this function). Reviewed-by: Brian Paul <brianp@vmware.com> CC: mesa-stable@lists.freedesktop.org	2014-02-14 11:45:48 -08:00
Brian Paul	e4a5a9fd2f	gallium/pipebuffer: change pb_cache_manager_create() size_factor to float Requested by Marek. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 09:56:55 -07:00
Thomas Hellstrom	141e39a893	svga/winsys: Propagate surface shared information to the winsys The linux winsys needs to know whether a surface is shared. For guest-backed surfaces we need this information to avoid allocating a mob out of the mob cache for shared surfaces, but instead allocate a shared mob, that is never put in the mob cache, from the kernel. Also previously, all surfaces were given the "shareable" attribute when allocated from the kernel. This is too permissive for client-local surfaces. Now that we have the needed info, only set the "shareable" attribute if the client indicates that it needs to share the surface. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	fe6a854477	svga/winsys: implement GBS support This is a squash commit of many commits by Thomas Hellstrom. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Thomas Hellstrom	59e7c59621	gallium/util: Add flush/map debug utility code Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Thomas Hellstrom	8af358d8bc	gallium/pipebuffer: Add a cache buffer manager bypass mask In some situations, it may be desirable to bypass the cache at buffer creation but to insert the buffer in the cache at buffer destruction. One such situation is where we already have a kernel representation of a buffer that we want to use, but we also want to insert it in the cache when it's freed up. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Thomas Hellstrom	c9e9b1862b	pipebuffer, winsys: Add a size match parameter to the cached buffer manager In some situations it's important to restrict the sizes of buffers that the cached buffer manager is allowed to return. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	3d1fd6df53	svga: update texture code for GBS Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	72b0e959fc	svga: update buffer code for GBS Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	e0a6fb09bd	svga: add new helper functions for GBS buffers Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	6476bcbc50	svga: remove a couple unneeded assertions Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	f8bbd8261d	svga: adjust adjustment for point coordinates Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	d0c22a6d53	svga: track which textures are rendered to Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	c1e60a61e8	svga: add helpers for tracking rendering to textures Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	f84c830b14	svga: update shader code for GBS Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	2f1fc8db10	svga: update constant buffer code for GBS Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	31dfefc47f	svga: add svga_have_gb_objects/dma() functions Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	823fbfdca7	svga: add new GBS commands And update some existing commands. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	d993ada50c	svga: update svga_winsys interface for GBS This adds new interface functions for guest-backed surfaces and adds a mobid parameter to the surface_relocation() function. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	024711385e	svga: update dumping code with new GBS commands, etc Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	2e0c90847f	svga: split / update svga3d header files The old svga3d_reg.h file is split into separate header files and we add new items for guest-backed surfaces. Plus some minor code fixes because of renamed symbols. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:43 -07:00
Grigori Goronzy	6d1cecbfd7	st/vdpau: add support for DEINTERLACE_TEMPORAL Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-14 09:05:20 +01:00
Grigori Goronzy	af34c3fd10	vl: add motion adaptive deinterlacer Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-14 08:55:33 +01:00
Leo Liu	f87dfc35bc	st/omx/enc: fix scaling src alignment issue Signed-off-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-14 08:50:32 +01:00
Alex Deucher	01e6371149	radeon: reverse DBG_NO_HYPERZ logic Change the flag to DBG_HYPERZ and reverse the logic so setting the flag enables the feature. This disables hyperz on r600g and radeonsi by default. It can be enabled by setting the env var. There are just too many issues with certain apps, so leave it disabled for now until we sort out the issues with the problematic apps. Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=58660 https://bugs.freedesktop.org/show_bug.cgi?id=64471 https://bugs.freedesktop.org/show_bug.cgi?id=66352 https://bugs.freedesktop.org/show_bug.cgi?id=68799 https://bugs.freedesktop.org/show_bug.cgi?id=72685 https://bugs.freedesktop.org/show_bug.cgi?id=73088 https://bugs.freedesktop.org/show_bug.cgi?id=74428 https://bugs.freedesktop.org/show_bug.cgi?id=74803 https://bugs.freedesktop.org/show_bug.cgi?id=74863 https://bugs.freedesktop.org/show_bug.cgi?id=74892 https://bugzilla.kernel.org/show_bug.cgi?id=70411 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org> Acked-by: Marek Olšák <marek.olsak@amd.com>	2014-02-13 20:55:54 -05:00
Tom Stellard	3c4bd95b62	pipe-loader: Add support for render nodes v2 v2: - Add missing call to pipe_loader_drm_release() - Fix render node macros - Drop render-node configure option	2014-02-13 19:53:15 -05:00
Tom Stellard	8481d208ce	pipe-loader: Add auth_x parameter to pipe_loader_drm_probe_fd() The caller can use this boolean parameter to tell the pipe-loader to authenticate with the X server when probing a file descriptor.	2014-02-13 19:53:15 -05:00
Christian König	0320ba9988	st/omx/dec/h264: fix pic_order_cnt_type==2 Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-13 18:00:44 +01:00
Ilia Mirkin	0c8b165366	nouveau: fix chipset checks for nv1a by using the oclass instead Commit `f4ebcd133b` ("dri/nouveau: NV17_3D class is not available for NV1a chipset") fixed this partially by using the correct 3d class. However there were a lot of checks left over comparing against the chipset. Reported-and-tested-by: John F. Godfrey <jfgodfrey@gmail.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 9.2 10.0 10.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-02-13 11:06:41 -05:00
Christian König	0ef3ce4155	st/omx: initial OpenMAX H264 encoder v7 v2 (chk): fix eos handling v3 (leo): implement scaling configuration support v4 (leo): fix bitrate bug v5 (chk): add workaround for bug in Bellagio v6 (chk): fix div by 0 if framerate isn't known, use separate pipe object for scale and transfer, always flush the transfer pipe before encoding v7 (chk): make suggested changes, cleanup a bit more, only advertise encoder on supported hardware Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com>	2014-02-13 11:11:24 +01:00
Christian König	9ff0cf903d	radeon/vce: initial VCE support v8 v2 (chk): revert feedback buffer hack v3 (slava): fixed bitstream size calculation v4 (chk): always create buffers in the right domain v5 (chk): flush async v6 (chk): rework fw interface add version check v7 (leo): implement cropping support v8 (chk): add hw checks Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Slava Grigorev <slava.grigorev@amd.com>	2014-02-13 11:11:24 +01:00
Christian König	cbdd052577	radeon/winsys: add VCE support v4 v2: add fw version query v3: add README.VCE v4: avoid error msg when kernel doesn't support it Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-13 11:11:24 +01:00
Ilia Mirkin	ef9a6ded10	nv50: mark scissors/viewports dirty on context switch Commit `246ca4b001` ("nv50: implement multiple viewports/scissors, enable ARB_viewport_array") added dirty tracking to scissors/viewports. However it neglected to mark them all as dirty on a context switch. This fixes an apparent regression in webgl in chrome, but probably in any application that switches contexts. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-13 10:08:29 +01:00
Christian König	1ef7b9de06	gallium/vl: remove remaining softpipe video functions Unused and unmaintained for quite a while. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2014-02-13 09:46:54 +01:00
Ilia Mirkin	18caef953f	docs: add nv50 to the ARB_viewport_array list	2014-02-12 22:14:41 -05:00
Ilia Mirkin	246ca4b001	nv50: implement multiple viewports/scissors, enable ARB_viewport_array Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2014-02-12 21:47:36 -05:00
Ilia Mirkin	a7012eede8	mesa/st: hardcode the viewport bounds range The bound range is disconnected from the viewport dimensions. This is the relevant bit from glViewportArray: """ The location of the viewport's bottom left corner, given by (x, y) is clamped to be within the implementation-dependent viewport bounds range. The viewport bounds range [min, max] can be determined by calling glGet with argument GL_VIEWPORT_BOUNDS_RANGE. Viewport width and height are silently clamped to a range that depends on the implementation. To query this range, call glGet with argument GL_MAX_VIEWPORT_DIMS. """ Just set it to +/-16384, as that is the minimum required by ARB_viewport_array and the value that all current drivers provide. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-13 12:44:36 +10:00
Brian Paul	f0e967f212	scons: add meta_blit.c to src/mesa/SConscript	2014-02-12 17:46:11 -07:00
Eric Anholt	255bd9c0b8	meta: Add acceleration for depth glBlitFramebuffer(). Surprisingly, the GLSL shaders already wrote the sampled r value to FragDepth. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51600 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-12 16:17:11 -08:00
Eric Anholt	067c7b67e8	meta: Use BindRenderbufferTexImage() for meta glBlitFramebuffer(). This avoids a CopyTexImage() on Intel i965 hardware without blorp. v2: Move the !readAtt check up higher. v3: Rebase on idr's changes, plus readAtt check is totally gone, and also fix a typo in a comment. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2)	2014-02-12 16:17:11 -08:00
Eric Anholt	f29c25fc1d	i965: Add a driver hook for binding renderbuffers to textures. This will let us use meta's acceleration from renderbuffers without having to do a CopyTexImage first. This is like what we do for TFP, but just taking an existing renderbuffer and binding it to a texture with whatever its format was. The implementation won't work for stencil renderbuffers, and it only does non-texture renderbuffers (but then, if you're using a texture renderbuffer, you can just pull the texture object/level/slice out of the renderbuffer, anyway). v2: Don't forget to propagate NumSamples to the teximage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-12 16:17:11 -08:00
Eric Anholt	431decf16f	meta: Do a massive unindent (and rename) of blitframebuffer_texture(). This function is only handling the color case. We can just unindent as long as we're willing to do the check for the bit outside of the function. v2: Rebase on idr's changes, drop readAtt check that's always non-null anyway (it's a pointer into the statically-allocated attachments array in the renderbuffer). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2014-02-12 16:17:11 -08:00
Eric Anholt	3e4ccf499e	meta: Move glBlitFramebuffer() to a separate file. v2: Drop a bunch of unnecessary includes (by Kenneth), rebase on idr's changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2014-02-12 16:17:08 -08:00
Eric Anholt	81ddbdaaba	meta: De-static some of meta's functions. I want to split some meta.c code off to a separate file, so these functions can't be static any more. v2: Rebase on idr's changes, also expose setup_blit_shader, blit_shader_table_cleanup, setup_vertex_objects, setup_ff_tnl_for_blit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2014-02-12 16:16:03 -08:00
Eric Anholt	2c8f182c86	meta: Move the meta structures to the meta header. I'd like to split some of our code to separate files, since 4k lines and growing is pretty unreasonable for all these separate operations. v2: Rebase on idr's changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2014-02-12 15:38:58 -08:00
Eric Anholt	cd084aa297	meta: Fold the texture setup into setup_copypix_texture(). There was this funny argument passed to setup for "did alloc decide we need to allocate new texture storage?", which goes away if we don't have the caller do alloc as a separate step. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 15:38:58 -08:00
Eric Anholt	397b2c3966	meta: Drop the src == dst restriction on meta glBlitFramebuffer(). From the GL_ARB_fbo spec: If the source and destination buffers are identical, and the source and destination rectangles overlap, the result of the blit operation is undefined. As far as I know, that's the only thing that would have been of concern for this. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 15:38:58 -08:00
Eric Anholt	a4f3e2ca0e	mesa: Make TexImage error cases about internalFormat more informative. I tripped over one of these when debugging meta, and it's a lot nicer to just see the internalFormat being complained about. v2: Drop a note in the other errors path that there is one early return. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 15:38:58 -08:00
Eric Anholt	56b031d8ae	meta: Rename the "sampler" stuff to "blit shader". While these structs are generated per GLSL sampler type, they're structs of data-about-shaders (notably, the ID of a shader program), not data-about-samplers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 15:38:57 -08:00
Eric Anholt	e455c8283b	meta: Drop a now-trivial helper function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 15:38:57 -08:00
Eric Anholt	e48a6378c9	meta: Fold the glUseProgram() into the blit program generator. Everyone was just immediately calling it and doing nothing else with the shader program id. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 15:38:57 -08:00
Eric Anholt	b719aa3902	meta: Simplify the blit shader setup steps. The only thing that wants to track the glsl_sampler structure is the shader string generator. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 15:38:57 -08:00
Francisco Jerez	b424da4be0	i965/vec4: Fix confusion between SWIZZLE and BRW_SWIZZLE macros. Most of the VEC4 back-end agrees on src_reg::swizzle being one of the BRW_SWIZZLE macros defined in brw_reg.h, except in two places where we use Mesa's SWIZZLE macros. There is even a doxygen comment saying that Mesa's macros are the right ones. They are incompatible swizzle representations (3 bits vs. 2 bits per component), and the code using Mesa's works by pure luck. Fix it. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 23:39:42 +01:00
Francisco Jerez	a3a55067bd	i965/fs: Remove fs_reg::sechalf. The same effect can be achieved using ::subreg_offset. Remove the less flexible alternative and define a convenience function to keep the fs_reg interface sane. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 23:39:24 +01:00
Francisco Jerez	019bf6ed8d	i965/fs: Remove fs_reg::smear. The same effect can be achieved using a combination of ::stride and ::subreg_offset. Remove the less flexible ::smear to keep the data members of fs_reg orthogonal. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 23:07:57 +01:00
Francisco Jerez	756d37b1d6	i965/fs: Add support for specifying register horizontal strides. v2: Some improvements for copy propagation with non-contiguous register strides and mismatching types. v3: Add example of the situation that the copy propagation changes are intended to avoid. Clarify that 'fs_reg::apply_stride()' is expected to work with zero strides too. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 23:07:57 +01:00
Francisco Jerez	4c7206bafd	i965/fs: Add support for sub-register byte offsets to the FS back-end IR. It would be nice if we could have a single 'reg_offset' field expressed in bytes that would serve the purpose of both, but the semantics of 'reg_offset' are quite complex currently (it's measured in units of one, eight or sixteen dwords depending on the register file and the dispatch width) and changing it to bytes would be a very intrusive change at this stage. Add a separate 'subreg_offset' field for now. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 23:07:57 +01:00
Brian Paul	248606a5f0	glsl: rename _restrict to restrict_flag To fix MSVC compile breakage. Evidently, _restrict is an MSVC keyword, though the docs only mention __restrict (with two underscores). Note: we may want to also rename _volatile to volatile_flag to be consistent. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74900 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 13:37:09 -07:00
Brian Paul	fd0620ff6c	mesa: assorted clean-ups in detach_shader() Fix formatting, add new comments, get rid of extraneous indentation. Suggested by Ian in bug 74723. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 11:21:47 -07:00
Brian Paul	23d4ff53d4	svga: replace out-of-temps assertion with debug warning Signed-off-by: Brian Paul <brianp@vmware.com>	2014-02-12 11:21:46 -07:00
Francisco Jerez	76f95ba272	mesa: Handle binding of uniforms to image units with glUniform(). v2: Set driver-specified flag in NewDriverState when glUniform is used to bind an image unit. v3: Abbreviate argument type check. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:06 +01:00
Francisco Jerez	212122543b	glsl/linker: Propagate image uniform access qualifiers to the driver. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:06 +01:00
Francisco Jerez	c318a677dd	glsl/linker: Assign image uniform indices. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:06 +01:00
Francisco Jerez	e51158f2e7	glsl/linker: Count and check image resources. v2: Add comment about the reason why image variables take up space from the default uniform block. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:06 +01:00
Francisco Jerez	e8dbe430aa	glsl: Add image built-in function generator. Because of the combinatorial explosion of different image built-ins with different image dimensionalities and base data types, enumerating all the 242 possibilities would be annoying and a waste of .text space. Instead use a special path in the built-in builder that loops over all the known image types. v2: Generate built-ins on GLSL version 4.20 too. Rename '_has_float_data_type' to '_supports_float_data_type'. Avoid duplicating enumeration of image built-ins in create_intrinsics() and create_builtins(). v3: Use a more orthodox approach for passing image built-in generator parameters. v4: Cosmetic changes. Acked-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:06 +01:00
Francisco Jerez	87acc7c650	glsl: Add built-in constants for ARB_shader_image_load_store. v2: Add them on GLSL version 4.20 too. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	6057300ec6	glcpp: Add built-in define for ARB_shader_image_load_store. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	60c89f8bff	glsl: Add built-in types defined by ARB_shader_image_load_store. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	7af167d2be	glsl/ast: Generalize some sampler variable restrictions to all opaque types. No opaque types may be statically initialized in the shader, all opaque variables must be declared uniform or be part of an "in" function parameter declaration, no opaque types may be used as the return type of a function. v2: Add explicit check for opaque types in interface blocks. Check for opaque types in ir_dereference::is_lvalue(). Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	2158749e52	glsl/ast: Forbid declaration of image variables in structures and uniform blocks. Aggregating images inside uniform blocks is explicitly disallowed by the standard, aggregating them inside structures is not (as of GL 4.4), but there is a similar problem as with atomic counters: image uniform declarations require either a "writeonly" memory qualifier or an explicit format qualifier, which are explicitly forbidden in structure member declarations. In the resolution of Khronos bug #10903 the same wording applied to atomic counters was decided to mean that they're not allowed inside structures -- Rejecting image member declarations within structures seems the most reasonable option for now. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	6b28528d1c	glsl/ast: Make sure that image argument qualifiers match the function prototype. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	81c167ef1c	glsl/ast: Verify that function calls don't discard image format qualifiers. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	94a95e03d9	glsl/ast: Validate and apply memory qualifiers to image variables. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	910311c4a6	glsl/parser: Handle image built-in types. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	f9cf61df3b	glsl/parser: Handle image memory qualifiers. v2: Make the "map" array static const. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	fcd869ed56	glsl/parser: Handle the early_fragment_tests input layout qualifier. v2: Only allow the early_fragment_tests qualifier in fragment shaders. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	b0b26faa25	glsl/lexer: Add new tokens for ARB_shader_image_load_store. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	299e869d25	glsl/ast: Keep track of type qualifiers defined by ARB_shader_image_load_store. v2: Add comment next to the read_only and write_only qualifier flags. Change temporary copies of the type qualifier mask to use uint64_t too. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	c116541b2c	glsl: Add gl_uniform_storage fields to keep track of image uniform indices. v2: Promote anonymous struct into named struct. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	bb13691d1c	glsl: Add image memory and layout qualifiers to ir_variable. v2: Add comment next to the read_only and write_only qualifier flags. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:04 +01:00
Francisco Jerez	107d03a6d5	glsl: Add helper methods to glsl_type for dealing with images. Add predicates to query if a GLSL type is or contains an image. Rename sampler_coordinate_components() to coordinate_components(). v2: Use assert instead of unreachable. v3: No need to use a separate code-path for images in coordinate_components() after merging image and sampler fields in the glsl_type structure. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:43:37 +01:00
Francisco Jerez	8a2508ee07	glsl: Add image type to the GLSL IR. v2: Reuse the glsl_sampler_dim enum for images. Reuse the glsl_type::sampler_* fields instead of creating new ones specific to image types. Reuse the same constructor as for samplers adding a new 'base_type' argument. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:39:48 +01:00
Francisco Jerez	9e611fc72d	glsl: Add ARB_shader_image_load_store extension enables. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:39:48 +01:00
Fredrik Höglund	9afbd04d89	mesa: Preserve the NewArrays state when copying a VAO Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72895 Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-12 18:22:42 +01:00
Maarten Lankhorst	fee0686c21	nouveau: create only 1 shared screen between vdpau and opengl This fixes bug 73200 "vdpau-GL interop fails due to different screen objects" in the same way radeon does. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-12 14:57:25 +01:00
Maarten Lankhorst	572a8345bf	gallium makefiles: use a linker script for building dri drivers Only export __driDriverExtensions by default, and radeon_drm_winsys_create on radeons. Remove -Bsymbolic which should no longer be needed. As a side effect, it ought to fix a manifestation of bug 73200 on radeon. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-12 13:51:51 +01:00
Matt Turner	025d99ce3c	glsl: Do not vectorize vector array dereferences. Array dereferences must have scalar indices, so we cannot vectorize them. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reported-by: Andrew Guertin <lists@dolphinling.net> Tested-by: Andrew Guertin <lists@dolphinling.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-11 16:05:55 -08:00
Ian Romanick	4cffd3e791	meta: Enable cubemap array texture support to decompress_texture_image Fixed piglit test getteximage-targets S3TC CUBE_ARRAY on systems that don't have libtxc_dxtn installed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	daa3eea877	meta: Add cubemap array support to generic blit shader code Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	e68aa12849	meta: Get the correct info log Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	10f7c54477	meta: Expand texture coordinate from vec3 to vec4 This will be necessary to support cubemap array textures because they use all four components. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	b2ad3dbfa4	meta: Use GLSL to decompress 2D-array textures Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72582 Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	c1417aae6c	meta: Use common GLSL code for blits Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	d524654c34	meta: Improve GLSL version check We want to use the GLSL 1.30-ish path for OpenGL ES 3.0. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	4825af972a	meta: Add rectangle textures to the shader-per-sampler-type table Rectangle textures were not necessary for mipmap generation (because they cannot have mipmaps), but all of the future users of this common code will need to support rectangle textures. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	f5a477ab76	meta: Refactor shader generation code out of mipmap generation path This is quite like code we want for blits. Pull it out so that it can be shared by other paths. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	ed3bc38ee7	meta: Refactor the table of glsl_sampler structures This will allow the same table of shader-per-sampler-type to be used for paths in meta other than just mipmap generation. This is also the reason the declarations of the structures were moved towards the top of the file. v2: Code formatting change suggested by Brian. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	b514f24101	meta: Use common vertex setup code for _mesa_meta_Bitmap too Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	75227a0968	meta: Add storage to the vertex structure for R, G, B, and A Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	5e5d87ff32	meta: Use common routine to configure fixed-function TNL state Also... glOrtho(-1.0, 1.0, -1.0, 1.0, -1.0, 1.0) is the identity matrix, so drop the unnecessary call to _mesa_Ortho. v2: Rename setup_ff_TNL_for_blit() to setup_ff_tnl_for_blit(). Seems silly to capitalize one out of two to three acronyms in the name (change by anholt, acked by idr). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Kenneth Graunke	35e8de383c	i965: Fix General and Indirect Base Addresses on Broadwell. I set the "address modify enable" bit in the wrong DWord. The first DWord is the high 16 bits of the address, while the second is the low 32-bits and enable bit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 15:25:45 -08:00
Kenneth Graunke	b0e90ea09f	i965: Drop VECTOR_MASK_ENABLE in Broadwell's 3DSTATE_VS packet. We never set it on previous generations, but I had to set it in 3DSTATE_PS for correct behavior. For symmetry, I set it in 3DSTATE_VS as well, but there's no actual need to do so. Piglit works fine either way. The documentation also remarks that there should never be a need to program this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 15:25:29 -08:00
Kenneth Graunke	4dd1002518	i965/gs: Fix EndPrimitive on Broadwell. My earlier patch (i965: Reserve space for "Vertex Count" in GS outputs.) incremented Global Offset for most URB writes to make room for the new "Vertex Count" field, but failed to shift the URB writes used for writing control bits. Confusingly, Global Offset must be incremented by 2 here, rather than 1. The URB writes we use for actual data are HWord writes, which treat Global Offset as a 256-bit offset. These are OWord writes, so it's treated as a 128-bit offset instead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 15:25:03 -08:00
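The "incremented by 2 rather than 1" remark in 4dd1002518 follows from the unit sizes it names (256-bit HWord units versus 128-bit OWord units). A small sketch of that arithmetic, with hypothetical names that are not taken from the driver:

```c
#include <assert.h>

/* Unit sizes as stated in the commit message above. */
enum { OWORD_BITS = 128, HWORD_BITS = 256 };

/* Convert an offset counted in HWord units into the equivalent offset
 * counted in OWord units (hypothetical helper for illustration only). */
static unsigned hword_offset_to_oword_offset(unsigned hwords)
{
    assert(HWORD_BITS % OWORD_BITS == 0);
    return hwords * (HWORD_BITS / OWORD_BITS);
}
```

Skipping one 256-bit slot therefore corresponds to an offset of 2 when the write is addressed in 128-bit units.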
Kenneth Graunke	5ebfac8d72	i965/vec4: Support arbitrarily large sampler indices on Broadwell+. I added support for these on Haswell, but forgot to update the Broadwell code before landing it. Fixes Piglit's max-samplers test. v2: Use get_element_ud() for the destination as well as the source. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 15:24:36 -08:00
Kenneth Graunke	b371734331	i965/fs: Support arbitrarily large sampler indices on Broadwell+. I added support for these on Haswell, but forgot to update the Broadwell code before landing it. Partially fixes Piglit's max-samplers test. v2: Use get_element_ud() consistently, rather than using it for the source but using brw_vec1_grf for the destination. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 15:22:22 -08:00
Kenneth Graunke	0e21ba07f2	i965/fs: Fix Broadwell texture header setup to be uncompressed. MOV_RAW disables masking, but doesn't force the instruction to be uncompressed. That needs to be done by hand. Fixes textureGather and texture offset tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 15:21:10 -08:00
Ian Romanick	1edca151a0	mesa: GL_ARB_half_float_pixel is not optional Almost every driver already supported it. All current and future Gallium drivers always support it, and most existing classic drivers support it. This only changes radeon and nouveau. This extension only adds data types that can be passed to, for example, glTexImage2D. It does not add internal formats. Since you can already pass GL_FLOAT to glTexImage2D this shouldn't pose any additional issues with those drivers. Note that r200 and i915 already supported this extension, and they don't support floating-point textures either. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:36:43 -08:00
Ian Romanick	6d6a290181	mesa: Fix extension dependency for half-float TexBOs Half-float TexBOs should require both GL_ARB_half_float_pixel and GL_ARB_texture_float. This doesn't matter much in practice. Every driver that supports GL_ARB_texture_buffer_object already supports GL_ARB_half_float_pixel. We only expose the TexBO extension in core profiles, and those require GL_ARB_texture_float. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:36:43 -08:00
Ian Romanick	54b1082828	meta: Silence unused parameter warning in _mesa_meta_CopyTexSubImage drivers/common/meta.c: In function '_mesa_meta_CopyTexSubImage': drivers/common/meta.c:3744:52: warning: unused parameter 'rb' [-Wunused-parameter] Unfortunately, the parameter can't just be removed because it is part of the dd_function_table::CopyTexSubImage interface. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:36:43 -08:00
Ian Romanick	d156281cfe	meta: Silence unused parameter warning in setup_drawpix_texture drivers/common/meta.c: In function 'setup_drawpix_texture': drivers/common/meta.c:1572:30: warning: unused parameter 'texIntFormat' [-Wunused-parameter] setup_drawpix_texture has never used this parameter. Before the refactor commit `04f8193aa` it was used in several locations. After that commit, texIntFormat was only used in alloc_texture. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:36:43 -08:00
Ian Romanick	f34d599a5b	meta: Refactor common VAO and VBO initialization code v2: Clean up some stray binding calls Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1) Reviewed-by: Eric Anholt <eric@anholt.net> (v2)	2014-02-11 14:24:02 -08:00
Ian Romanick	beb33fc5b7	meta: Track the _mesa_meta_DrawPixels VBO just like the others All of the other meta routines have a particular pattern for creating and tracking the VAO and VBO. This one function deviated from that pattern for no apparent reason. Almost all of the code added in this patch will be removed shortly. v2: Drop glDeleteBuffers() of the old, now-uninitialized vbo variable. Fixes getteximage-formats and fbo-mipmap-copypix regression when "2" landed in the variable (change by anholt). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:23:55 -08:00
Ian Romanick	83c90c9239	meta: Expand the vertex structure for the GenerateMipmap and decompress paths Final intermediate step leading to some code sharing. Note that the new GenerateMipmap and decompress vertex structures are the same as the new vertex structure in BlitFramebuffer and the others. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:11:21 -08:00
Ian Romanick	897f975668	meta: Expand the vertex structure for the DrawPixels paths Another step leading to some code sharing. Note that the new DrawPixels vertex structure is the same as the new vertex structure in BlitFramebuffer and the others. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:11:21 -08:00
Ian Romanick	d7ac102c7b	meta: Expand the vertex structure for the Clear paths Another step leading to some code sharing. Note that the new Clear vertex structure is the same as the new BlitFramebuffer and CopyPixels vertex structure. The "sizeof(float) * 7" hack is temporary. It will magically disappear in just a couple more patches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:11:21 -08:00
Ian Romanick	545fd9bc9b	meta: Expand the vertex structure for the CopyPixels paths Another step leading to some code sharing. Note that the new CopyPixels vertex structure is the same as the new BlitFramebuffer vertex structure. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:11:21 -08:00
Ian Romanick	9b4e659e62	meta: Expand the vertex structure for the BlitFramebuffer paths This is the first of several steps leading to some code sharing. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:11:21 -08:00
Ilia Mirkin	908a711313	nv30,nvc0: only claim a single viewport It should be possible to make this be 16 on nvc0. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 22:08:01 +00:00
Emil Velikov	82cd6e6317	st/clover: use VISIBILITY_CXXFLAGS where appropriate Use the c++ visibility flags when building cpp files. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 21:36:52 +00:00
Emil Velikov	7ed32c9af9	omx: use VISIBILITY_CFLAGS to control exported symbols Initial step of cleaning the exported symbols from targets/omx - Mark omx_component_library_Setup as public v2: Keep export-symbols-regex Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> (v1)	2014-02-11 21:36:16 +00:00
Emil Velikov	eda9a66f7e	osmesa: drop obsolete AM_CXXFLAGS There are no cpp files during the build process, thus we can safely drop the unused cxxflags. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 21:32:39 +00:00
Emil Velikov	927b9e8eb8	st/vdpau: automake: export only PUBLIC symbols Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-11 21:27:45 +00:00
Emil Velikov	255b39f17a	st/vdpau: do not export VdpPresentationQueueTargetCreateX11 The function pointer is retrieved via VdpGetProcAddress just like all the other vdpau functions and should not be exported. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-11 21:25:11 +00:00
Emil Velikov	d84e0eb406	wayland-egl: automake: add symbol test Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 20:19:46 +00:00
Emil Velikov	6405563783	st/egl: automake: avoid exporting all symbols Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 20:19:01 +00:00
Emil Velikov	11926e8997	targets/egl-static: automake: don't export local symbols Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 20:16:55 +00:00
Emil Velikov	5c7f75f70a	gbm: automake: add symbol tests Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 19:00:09 +00:00
Emil Velikov	33b9c0d465	targets/gbm: automake: do not export internal symbols Add VISIBILITY_CFLAGS to automake build, so that only required symbols are exported. v2: Rebase Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 19:00:09 +00:00
Emil Velikov	10e5ffd496	gbm: do not export _gbm_mesa_get_device This symbol is internal and was never part of the API. Unused by any of the gbm backends, it makes sense to simply not export it. Cc: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 19:00:09 +00:00
Emil Velikov	d00b319f40	gbm: automake: add VISIBILITY_CFLAGS Currently the library exports every symbol imaginable, rather than the ones defined by the API. Note: This may cause issues for libraries that are linking against libgbm's internals. Cc: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 19:00:09 +00:00
Emil Velikov	631cc6105d	st/gbm: automake: do not export gbm_gallium_drm_device_create Symbol is internal and was never meant to be exported. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 19:00:09 +00:00
Emil Velikov	90ed101322	auxiliary/pipe-loader: automake: avoid exporting all symbols Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 19:00:09 +00:00
Emil Velikov	165eecf1f6	egl/dri2/android: free driver_name in dri2_initialize_android error path v2: Cleanup driver name if dri2_load_driver() fails. Spotted by Chad Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 19:00:09 +00:00
Emil Velikov	76d9f6d972	dri/nouveau: Pass the API into _mesa_initialize_context Currently we create a OPENGL_COMPAT context regardless of what was requested by the program. Correct that by retaining the program's request and passing it into _mesa_initialize_context. Based on a similar commit for radeon/r200 by Ian Romanick. Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 19:00:09 +00:00
Emil Velikov	118c36adb4	configure: cleanup libudev handling Add the explicit note about the required version during configure. Require the same version (151) of udev when building the pipe-loader. Mention the udev version requirement in GBM Requires.private. v2: Resolve a couple of silly typos. Spotted by Ilia v3: Cleanup platfrom/platform typo. Spotted by Stefan Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 18:59:59 +00:00
Emil Velikov	31f50f3149	gbm: drop unneeded dependency of libudev As of recently we dlopen the library, additionally the only code that is including the libudev.h header, is the loader. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 17:17:50 +00:00
Emil Velikov	d57dc6dc30	opencl: do not link against libudev Previously the linking was required due to dependency of udev in the pipe-loader. Now this is no longer the case, as we dlopen the library. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 17:17:50 +00:00
Emil Velikov	e19fba7cc6	gallium/tests: do not link against libudev Previously the linking was required due to dependency of udev in the pipe-loader. Now this is no longer the case, as we dlopen the library. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 17:17:50 +00:00
Emil Velikov	897e1989da	egl-static: stop linking against libudev No longer required since all the udev code is in the loader. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 17:17:50 +00:00
Emil Velikov	053e095ecb	egl_dri2: remove LIBUDEV_CFLAGS from Makefile.am None of the code within builds or (explicitly) requires udev. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 17:17:50 +00:00
Emil Velikov	6fe2ca7a08	configure: drop LIBUDEV_CFLAGS from X11_INCLUDES The cflags are explicitly included in the only Makefile that handles udev dependent code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 17:17:50 +00:00
Emil Velikov	7536d744ee	pipe-loader: drop obsolete libudev.h include All the udev code is in the loader, so there is no reason for us to include this header. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 17:17:49 +00:00
Emil Velikov	929f83376a	configure: error out when building radeonsi without gallium-llvm --enable-gallium-llvm is required by radeonsi. Currently we check only for LLVM_VERSION_INT which is 0, whenever gallium-llvm is disabled explicitly. ./configure --with-gallium-drivers=r600,radeonsi --disable-gallium-llvm v2: Correct typo in error message. Spotted by Tom Stellard Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 17:04:18 +00:00
Christian König	4ca8439dce	omx/radeonsi: fix target Another minor typo. Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-11 17:10:22 +01:00
Christian König	79aa29d45e	omx: fix some minor configure.ac issues Matt Turner noted the incorrect order, but I somehow forgot to change it before pushing upstream. The other one is a typo during rebase. Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-11 17:08:42 +01:00
Christian König	ee978aee94	vl: add H264 encoding interface Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com>	2014-02-11 13:26:13 +01:00
Kenneth Graunke	eaf3358e0a	i965: Don't call abort() on an unknown device. If we don't recognize the PCI ID, we can't reasonably load the driver. However, calling abort() is quite rude - it means the application that tried to initialize us (possibly the X server) can't continue via fallback paths. We already have a more polite mechanism - failing to create the context. So, just use that. While we're at it, improve the error message. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73024 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Lu Hua <huax.lu@intel.com>	2014-02-11 02:23:22 -08:00
Daniel Kurtz	b47d231526	glsl: Add locking to builtin_builder singleton Consider a multithreaded program with two contexts A and B, and the following scenario: 1. Context A calls initialize(), which allocates mem_ctx and starts building built-ins. 2. Context B calls initialize(), which sees mem_ctx != NULL and assumes everything is already set up. It returns. 3. Context B calls find(), which fails to find the built-in since it hasn't been created yet. 4. Context A finally finishes initializing the built-ins. This will break at step 3. Adding a lock ensures that subsequent callers of initialize() will wait until initialization is actually complete. Similarly, if any thread calls release while another thread is still initializing, or calling find(), the mem_ctx/shader would get freed from under it, leading to corruption or use-after-free crashes. Fixes sporadic failures in Piglit's glx-multithread-shader-compile. Bugzilla: https://bugs.freedesktop.org/69200 Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.1 10.0" <mesa-stable@lists.freedesktop.org>	2014-02-11 02:21:41 -08:00
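The race described in b47d231526 and the locking fix can be sketched in plain C with a pthread mutex. This is a minimal sketch of the general pattern only, not the actual builtin_builder code: the names `builtins_initialize`, `builtins_find`, and `builtins_release`, and the use of `malloc` as a stand-in for the memory context, are assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* One lock guards every access to the shared singleton state. */
static pthread_mutex_t builtins_lock = PTHREAD_MUTEX_INITIALIZER;
static void *mem_ctx = NULL;        /* stand-in for the allocation context */
static bool builtins_ready = false; /* true once the "built-ins" exist */

/* Allocation and construction happen under the lock, so a second caller
 * blocks until the first has finished instead of seeing a half-built state. */
void builtins_initialize(void)
{
    pthread_mutex_lock(&builtins_lock);
    if (mem_ctx == NULL) {
        mem_ctx = malloc(1);
        if (mem_ctx != NULL) {
            /* ... the real work of building the built-ins would go here ... */
            builtins_ready = true;
        }
    }
    pthread_mutex_unlock(&builtins_lock);
}

/* Lookups take the same lock, so they cannot overlap with release. */
bool builtins_find(void)
{
    pthread_mutex_lock(&builtins_lock);
    assert(!builtins_ready || mem_ctx != NULL);
    bool found = builtins_ready;
    pthread_mutex_unlock(&builtins_lock);
    return found;
}

/* Freeing under the lock avoids pulling memory out from under a reader. */
void builtins_release(void)
{
    pthread_mutex_lock(&builtins_lock);
    free(mem_ctx);
    mem_ctx = NULL;
    builtins_ready = false;
    pthread_mutex_unlock(&builtins_lock);
}
```

Holding a single mutex across the whole of initialization is the simplest way to close the window between steps 1 and 4 of the scenario; it serializes callers rather than trying to make the check itself lock-free.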
Kenneth Graunke	e95a4ed296	i965/fs: Simplify FS_OPCODE_SET_OMASK stride mashing a bit. In the first case, we can simply call stride(mask, 16, 8, 2) rather than creating a new register with a different stride, then immediately changing it a second time. In the second case, the stride was already what we wanted, so we can just use mask without any changes at all. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-11 02:21:35 -08:00
Kenneth Graunke	f948ad2a07	i965/fs: Simplify FS_OPCODE_SET_SAMPLE_ID stride mashing a bit. stride(brw_vec1_reg(...) ...) takes some register, changes the strides, then changes the strides again. Let's do it once. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-11 02:21:26 -08:00
Dave Airlie	08fd34c8a3	docs/GL3.txt: denote r600g support for ARB_viewport_array Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-11 14:15:18 +10:00
Dave Airlie	6d434252e2	r600g: add support for multiple viewports. tested on rv635 and barts. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-11 14:14:50 +10:00
Dave Airlie	0705fa35cd	st/mesa: add support for GL_ARB_viewport_array (v0.2) this just ties the mesa code to the pre-existing gallium interface, I'm not sure what to do with the CSO stuff yet. 0.2: fix min/max bounds Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-11 14:14:50 +10:00
Dave Airlie	c116ee6042	st/mesa: add support for viewport index semantic This adds GS output and FS input support, even though FS input support isn't supported until GLSL 4.30 from what I can see. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-11 14:06:40 +10:00
Kenneth Graunke	a21552a96b	i965: Program 2x MSAA sample positions. There are only two sensible placements for 2x MSAA samples - and one is the mirror image of the other. I chose (0.25, 0.25) and (0.75, 0.75). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-02-10 08:18:29 -08:00
Kenneth Graunke	f4bc0ac83e	i965: Store 4x MSAA sample positions in a scalar value, not an array. Storing a single value in an array is rather pointless. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-02-10 08:18:29 -08:00
Kenneth Graunke	16f7510ad3	i965: Duplicate less code in GetSamplePositions driver hook. The 4x and 8x cases contained identical code for extracting the X and Y sample offset values and converting them from U0.4 back to float. Without this refactoring, we'd have to duplicate it a third time in order to support 2x MSAA. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-02-10 08:18:28 -08:00
Ilia Mirkin	40dd777b33	nouveau/video: make sure that firmware is present when checking caps Apparently some players are ill-prepared for us claiming that a decoder exists only to have creating it fail, and express this poor preparation with crashes (e.g. flash). Check that firmware is there to increase the chances of there being a high correlation between reported capabilities and ability to create a decoder. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 10.0 10.1 <mesa-stable@lists.freedesktop.org> Tested-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-10 14:00:17 +01:00
Kenneth Graunke	a487ef87fe	mesa: Fix MESA_FORMAT_Z24_UNORM_S8_UINT vs. X8_UINT mix-up. In commit `eeed49f5f2`, Mark accidentally renamed MESA_FORMAT_S8_Z24 to MESA_FORMAT_Z24_UNORM_X8_UINT and MESA_FORMAT_X8_Z24 to MESA_FORMAT_Z24_UNORM_S8_UINT, reversing their sense. The commit message was correct, but what sed commands actually got run didn't match that. This patch swaps the two enum names, reversing them. This should undo the damage, but might break things if people have manually fixed a few instances in the meantime... Mark's commit also failed to mention renames: s/MESA_FORMAT_ARGB2101010_UINT\b/MESA_FORMAT_B10G10R10A2_UINT/g s/MESA_FORMAT_ABGR2101010\b/MESA_FORMAT_R10G10B10A2_UNORM/g but those seem okay. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-02-09 16:57:45 -08:00
Maxence Le Doré	b903be50b0	mesa: remove duplicated init of MaxViewports Already declared 5 lines before. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-09 16:45:23 -08:00
Grigori Goronzy	d34d5fddf8	gallium: add geometry shader output limits v2: adjust limits for radeonsi and llvmpipe v3: add documentation Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-02-09 23:31:38 +01:00
Siavash Eliasi	61bc014c96	mesa: Removed unnecessary check for NULL pointer when freeing memory Note that it is OK to pass NULL pointers to this function since this commit: mesa: modified _mesa_align_free() to accept NULL pointer http://cgit.freedesktop.org/mesa/mesa/commit/?id=f0cc59d68a9f5231e8e2111393a1834858820735 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-02-09 16:16:34 +01:00
Ilia Mirkin	356aff3a5c	nv30: report 8 maximum inputs nvfx_fragprog_assign_generic only allows for up to 10/8 texcoords for nv40/nv30. This fixes compilation of the varying-packing tests. Furthermore it appears that the last 2 inputs on nv4x don't seem to work in those tests, so just report 8 everywhere for now. Tested on NV42, NV44. NV4B appears to have additional problems. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 9.1 9.2 10.0 10.1 <mesa-stable@lists.freedesktop.org>	2014-02-08 19:06:51 -05:00
Christoph Bumiller	2e9ee44797	nv50/ir/ra: some register spilling fixes Cc: 10.1 <mesa-stable@lists.freedesktop.org>	2014-02-09 00:04:13 +01:00
Brian Paul	c325ec8965	mesa: update assertion in detach_shader() for geom shaders Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74723 Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>	2014-02-08 14:21:28 -07:00
Brian Paul	6e8d04ac3e	mesa: allocate gl_debug_state on demand We don't need to allocate all the state related to GL_ARB_debug_output until some aspect of that extension is actually needed. The sizeof(gl_debug_state) is huge (~285KB on 64-bit systems), not even counting the 54(!) hash tables and lists that it contains. This change reduces the size of gl_context alone from 431KB to 145KB on 64-bit systems and from 277KB to 78KB on 32-bit systems. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-08 11:27:58 -07:00
Brian Paul	31b2625cb5	mesa: trivial clean-ups in errors.c Whitespace changes, 78-column rewrapping, comment clean-ups, add some braces, etc. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-08 11:27:58 -07:00
Brian Paul	1dc209d8f2	mesa: remove _mesa_ prefix from some static functions Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-08 11:27:57 -07:00
Kenneth Graunke	dcb0330d30	i965: Label JIP and UIP in Broadwell shader disassembly. This makes it obvious which number is which. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-07 19:38:15 -08:00
Kenneth Graunke	8a7fe50067	i965: Don't disassemble UIP field for Broadwell WHILE instructions. The WHILE instruction doesn't have UIP. It only has JIP. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-07 19:38:12 -08:00
Kenneth Graunke	5230655a2e	i965: Don't print source registers for Broadwell flow control. The bits which normally contain the source register descriptions actually contain the JIP/UIP jump targets, which we already printed. Interpreting JIP/UIP as source registers results in some really creepy looking output, like IF statements with acc14.4<0,1,0>UD sources. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-07 19:37:34 -08:00
Kenneth Graunke	8e0a0e4d30	i965: Fix fast depth clear values on Broadwell. Broadwell's 3DSTATE_CLEAR_PARAMS packet expects a floating point value regardless of format. This means we need to stop converting it to UNORM. Storing the value as float would make sense, but since we already have a uint32_t field, this patch continues shoehorning it into that. In a sense, this makes mt->depth_clear_value the DWord you emit in the packet, rather than the clear value itself. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-07 19:36:14 -08:00
Christoph Bumiller	882e98e5e6	nvc0: handle TGSI_SEMANTIC_LAYER Cc: 10.1 <mesa-stable@lists.freedesktop.org>	2014-02-07 23:14:00 +01:00
Christoph Bumiller	dd2229d4c6	nvc0: create the SW object It's required for being able to use software methods now.	2014-02-07 22:53:37 +01:00
Christoph Bumiller	b7233acf78	nvc0/ir/emit: hardcode vertex output stream to 0 for now	2014-02-07 22:53:36 +01:00
Chris Forbes	0c14c5c62a	i965: Enable ARB_texture_gather for one component on Gen6. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-08 10:32:24 +13:00
Chris Forbes	31d1077dd2	i965/vec4: Emit shader w/a for Gen6 gather Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-08 10:32:23 +13:00
Chris Forbes	73b91fe05a	i965/fs: Emit shader w/a for Gen6 gather Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-08 10:32:20 +13:00
Chris Forbes	c2d51aaa11	i965: Add surface format overrides for Gen6 gather Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-08 10:32:19 +13:00
Chris Forbes	2b7bbd89e8	i965: Add Gen6 gather wa to sampler key Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-08 10:32:06 +13:00
Eric Anholt	1e12dafcac	glsl: Optimize triop_csel with all-true or all-false. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-07 12:46:48 -08:00
Eric Anholt	de796b0ef0	glsl: Optimize various cases of fma (aka MAD). Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-07 12:46:48 -08:00
Eric Anholt	44577c4857	glsl: Optimize lrp(x, x, coefficient) --> x. total instructions in shared programs: 1627754 -> 1624534 (-0.20%) instructions in affected programs: 45748 -> 42528 (-7.04%) GAINED: 3 LOST: 0 (serious sam, humus domino demo) Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-07 12:46:48 -08:00
Eric Anholt	d72956790f	glsl: Optimize pow(x, 1) -> x. total instructions in shared programs: 1627826 -> 1627754 (-0.00%) instructions in affected programs: 6640 -> 6568 (-1.08%) GAINED: 0 LOST: 0 (HoN and savage2) Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-07 12:46:48 -08:00
Eric Anholt	6d7c123d6c	glsl: Optimize log(exp(x)) and exp(log(x)) into x. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-07 12:46:47 -08:00
Eric Anholt	2c2aa35336	glsl: Optimize ~~x into x. v2: Fix pasteo of an extra abs being inserted (caught by many). Rewrite to drop the silly switch statement. Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)	2014-02-07 12:46:47 -08:00
Eric Anholt	0f6279bab2	i965: Add some informative debug when the X Server botches DRI2 GetBuffers. We've had various bug reports over the years where miptrees are missing, and when I screwed it up while adding DRI2 to the modesetting driver, I figured I should put the info necessary for debug here. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-07 12:46:47 -08:00
Eric Anholt	b5e5f34dd2	i965: Remove redundant check in blitter-based glBlitFramebuffer(). The intel_miptree_blit() code checks the format for us now, plus it handles xrgb vs argb for us. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-07 12:46:47 -08:00
Kenneth Graunke	697f401a31	i965: Fix Gen8+ disassembly of half float subregister numbers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-07 12:36:38 -08:00
Kenneth Graunke	e990234ff6	i965: Use the new brw_load_register_mem helper for draw indirect. This makes it work on Broadwell, too. v2: Drop bogus double write to 3DPRIM_BASE_VERTEX register (caught by Chris Forbes). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-02-07 12:36:38 -08:00
Kenneth Graunke	b7c435b261	i965: Implement a brw_load_register_mem helper function. This saves some boilerplate and hides the OUT_RELOC/OUT_RELOC64 distinction. Placing the function in intel_batchbuffer.c is rather arbitrary; there wasn't really an obvious place for it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-02-07 12:36:38 -08:00
Kenneth Graunke	2f97119950	i965: Fix INTEL_DEBUG=vs for fixed-function/ARB programs. Since commit `9cee3ff562`, INTEL_DEBUG=vs has caused a NULL pointer dereference for fixed-function/ARB programs. In the vec4 generators, "prog" is a gl_program, and "shader_prog" is the gl_shader_program. This is different than the FS visitor. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-07 12:36:38 -08:00
Kenneth Graunke	2062f40d81	glsl: Don't lose precision qualifiers when encountering "centroid". Mesa fails to retain the precision qualifier when parsing: #version 300 es centroid in mediump vec2 v; Consider how the parser's type_qualifier production is applied. First, the precision_qualifier rule creates a new ast_type_qualifier: <precision: mediump> Then the storage_qualifier rule creates a second one: <flags: in> and calls merge_qualifier() to fold in any previous qualifications, returning: <flags: in, precision: mediump> Finally, the auxiliary_storage_qualifier creates one for "centroid": <flags: centroid> it then does $$ = $1 and $$.flags |= $2.flags, resulting in: <flags: centroid, in> Since precision isn't stored in the flags bitfield, it is lost. We need to instead call merge_qualifier to combine all the fields. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reported-by: Kevin Rogovin <kevin.rogovin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-07 12:36:38 -08:00
Brian Paul	f47e596288	st/mesa: avoid sw fallback for getting/decompressing textures If st_GetTexImage() is to decompress the texture, avoid the fallback path even if prefer_blit_based_texture_transfer = false. For drivers that returned PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER = 0, we were always taking the fallback path for texture decompression rather than rendering a quad. The latter is a lot faster. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-02-07 09:54:43 -07:00
Erik Faye-Lund	5125165dde	gallium/tgsi: correct typo propagated from NV_vertex_program1_1 In the specification text of NV_vertex_program1_1, the upper limit of the RCC instruction is written as 1.884467e+19 in scientific notation, but as 0x5F800000 in binary. But the binary version translates to 1.84467e+19 rather than 1.884467e+19 in scientific notation. Since the lower-limit equals 2^-64 and the binary version equals 2^+64, let's assume the value in scientific notation is a typo and implement this using the value from the binary version instead. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:22:23 -07:00
Erik Faye-Lund	7a49a796a4	gallium/tgsi: use CLAMP instead of open-coded clamps Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:22:14 -07:00
Juha-Pekka Heikkila	498d10e230	egl: Unhide functionality in _eglInitSurface() _eglInitResource() was used to memset the entire _EGLSurface by writing more than the size of the pointed-to target. This works only as long as Resource is the first element in _EGLSurface; this patch removes that dependency. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:14:05 -07:00
Juha-Pekka Heikkila	1456ed85f0	egl: Unhide functionality in _eglInitContext() _eglInitResource() was used to memset the entire _EGLContext by writing more than the size of the pointed-to target. This works only as long as Resource is the first element in _EGLContext; this patch removes that dependency. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:14:05 -07:00
Juha-Pekka Heikkila	d530745169	glx: Add missing null check in __glX_send_client_info() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:14:05 -07:00
Juha-Pekka Heikkila	d3e948340b	i965: Add missing null check in fs_visitor::dead_code_eliminate_local() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:14:05 -07:00
Juha-Pekka Heikkila	e503609e6f	glx: Add some missing null checks in glx_pbuffer.c Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:14:05 -07:00
Juha-Pekka Heikkila	88cad8356e	glsl: Fix null access on file read error Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:14:04 -07:00
Juha-Pekka Heikkila	2ae1437a8e	glx: Add missing null check in __glXCloseDisplay Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:14:04 -07:00
Juha-Pekka Heikkila	d28e92ff74	glx: Add missing null checks in glxcmds.c Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:14:04 -07:00
Jordan Justen	020c43f401	main/get: support ARB_gpu_shader5 If a driver enables ARB_gpu_shader5 and sets Const.MaxVertexStreams >= 4, then piglit's arb_gpu_shader5-minmax test should now pass. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-06 16:58:33 -08:00
Jordan Justen	60914fa80d	glapi: add definitions for ARB_gpu_shader5 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-06 16:58:33 -08:00
Ilia Mirkin	0befbafb4b	nouveau/codegen: allow tex offsets on non-TXF instructions (e.g. TXL) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2014-02-06 18:50:19 -05:00
Ilia Mirkin	f76c7ad5b1	nv50: only over-allocate by a page for code The pre-fetching doesn't go too far. Tested with over-allocating by only a page, and didn't see any errors in dmesg. Saves ~512KB of VRAM. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 10.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2014-02-06 18:50:19 -05:00
Ilia Mirkin	364bdd2419	nv50: fix layerid to be the fp input number rather than vp output number In the tests they were the same so it didn't matter, but indications are that this is the correct behaviour. Also take this opportunity to (trivially) support using gl_Layer in fp. Cc: 10.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2014-02-06 18:03:24 -05:00
Ilia Mirkin	c7373b7dc7	nv50: rework primid logic Functionally identical but much simpler. Should also better integrate with future layer/viewport changes/fixes. Cc: 10.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2014-02-06 18:02:57 -05:00
Kristian Høgsberg	f658150639	glx: Pass NULL DRI drawables into the DRI driver for None GLX drawables GLX_ARB_create_context allows making a GLX context current with None drawable and readables, but this was never implemented correctly in GLX. We would create a __DRIdrawable for the None GLX drawable and pass that to the DRI driver and that would somehow work. Now it's somehow broken. The way this should have worked is that we pass a NULL DRI drawable to the DRI driver when the GLX user calls glXMakeContextCurrent() with None for drawable and readables. https://bugs.freedesktop.org/show_bug.cgi?id=74143 Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2014-02-06 14:23:42 -08:00
Christian König	db54fca9b8	st/vdpau: add flush on unmap Flush the context when we unmap a buffer, otherwise VDPAU might start rendering the next frame while we still reference that buffer. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: StrangeNoises (rachel@strangenoises.org)	2014-02-06 20:58:38 +01:00
Marek Olšák	3f98053fc9	vdpau: flush the context before exporting the surface v2 Bugzilla (bug needs XBMC changes as well): https://bugs.freedesktop.org/show_bug.cgi?id=73191 When VL uploads vertex buffers, it uses PIPE_TRANSFER_DONTBLOCK, which always flushes the context in the winsys if the buffer being mapped is busy. Since I added handling of DISCARD_RANGE, DONTBLOCK has had no effect when combined with DISCARD_RANGE and I think the context isn't flushed anywhere else, so no commands are submitted to the GPU until the IB is full, which takes a lot of frames. Using DISCARD_RANGE is not the only way to trigger this bug. The other way is to reallocate the vertex buffer before every upload. BTW, I'm not sure if this is the right place for flushing, but it does fix the bug. v2 (chk): move the flush to the right place. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: StrangeNoises (rachel@strangenoises.org)	2014-02-06 20:58:07 +01:00
Matt Turner	e2ef93cf94	glsl: Initialize ubo_binding_mask flags to zero. Missed in commit `e63bb298`. Caused sporadic test failures, like incorrect-in-layout-qualifier-repeated-prim.geom. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-06 10:36:54 -08:00
Marek Olšák	559af1df10	gallium/radeon: fix warnings	2014-02-06 17:43:29 +01:00
Marek Olšák	c32114460d	gallium: remove PIPE_USAGE_STATIC Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-06 17:37:34 +01:00
Marek Olšák	eeb5a4a50e	gallium: define the behavior of PIPE_USAGE_* flags properly STATIC will be removed in the following commit. v2: changed the definition of IMMUTABLE Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-06 17:30:00 +01:00
Marek Olšák	ed84fb3167	gallium: remove PIPE_RESOURCE_FLAG_GEN_MIPS Unused. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-06 17:30:00 +01:00
Marek Olšák	2be5bbdd97	r600g,radeonsi: set resource domains in one place (v2) v2: This doesn't change the behavior. It only moves the tiling check to r600_init_resource and removes the usage parameter. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-06 17:29:59 +01:00
Marek Olšák	c6dbcf10df	st/mesa: fix crash when a shader uses a TBO and it's not bound This binds a NULL sampler view in that case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74251 Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-06 17:29:59 +01:00
Christian König	b862cc23f2	st/omx: add workaround for bug in Bellagio Not blocking for the message thread can lead to accessing freed up memory. Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-06 16:19:39 +01:00
Christian König	15e39ca28a	st/omx: initial OpenMAX support v3 Featuring a full grown MPEG2 and H264 decoder and a couple of hundred bugs. v2 (Leo): fix an error for pic_order_cnt_type 1 v3 (Leo): implement support for field decoding Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com>	2014-02-06 16:16:34 +01:00
Christian König	c9b941ff1b	vl/rbsp: add H.264 RBSP implementation Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-06 16:16:33 +01:00
Christian König	b8b28bf94a	vl/vlc: add function to limit the vlc size Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-06 16:16:33 +01:00
Christian König	9ef42a54a7	vl/vlc: add remove bits function Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-06 16:16:33 +01:00
Christian König	fe0f9ab056	radeon: update legal notes on UVD Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-06 16:15:58 +01:00
Christian König	96e8b916a7	radeon: just don't map VRAM buffers at all Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-02-06 16:08:22 +01:00
Christian König	9b218dcdd7	radeon/video: directly create buffers in the right domain Avoid moving things around on start of stream. Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-06 15:54:14 +01:00
Christian König	7bcfb0bc8f	radeon/video: separate common video functions Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-06 15:54:13 +01:00
Axel Davy	57f94bff71	gallium/dri2: Fix dri2_dup_image dri2_dup_image was not copying the dri_format field. This was causing some bugs, for example: 1. We create a gbm_bo. 2. We get an EGLImage from the gbm_bo. 3. Bug: it is impossible to get the gbm_bo back from the EGLImage by importing it (gbm dri2 backend). Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-02-05 22:22:00 -08:00
Chris Forbes	bba1105d52	i965/vs: Fix typo in brw_compute_vue_map Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-05 22:02:23 -08:00
Kenneth Graunke	e57d77280e	i965: Fix register types in dump_instructions(). This regressed when I converted BRW_REGISTER_TYPE_* to be an abstract type that doesn't match the hardware description. dump_instruction() was using reg_encoding[] from brw_disasm.c, which no longer matches (and was incorrect for Gen8+ anyway). This patch introduces a new function to convert the abstract enum values into the letter suffix we expect. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reported-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 21:07:48 -08:00
Chad Versace	1340e24406	egl/glx: Remove egl_glx driver Mesa now has a real, feature-rich EGL implementation on X11 via xcb. Therefore I believe there is no longer a practical need for the egl_glx driver. Furthermore, egl_glx appears to be unmaintained. The most recent nontrivial commit to egl_glx was `6baa5f1` on 2011-11-25. Tested by running weston-smoke in windowed Weston on X with i965. Signed-off-by: Chad Versace <chad.versace@linux.intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2014-02-05 18:19:26 -08:00
Dave Airlie	0224bd20f3	docs: update 10.1 relnotes to note GL 3.3 on r600 and radeonsi. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-06 01:13:05 +00:00
Zack Rusin	8a3c990823	tgsi/ureg: increase the number of immediates ureg_program is allocated on the heap so we can just bump the number of immediates that it can handle. It's needed for d3d10. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-02-05 19:40:53 -05:00
Zack Rusin	efb152dd04	gallivm: make sure analysis works with large number of immediates We need to handle a lot more immediates and in order to do that we also switch from allocating this structure on the stack to allocating it on the heap. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-02-05 19:40:53 -05:00
Zack Rusin	69ee3f431f	gallivm: handle huge number of immediates We only supported up to 256 immediates, which isn't enough. We had code which was allocating immediates as an allocated array, but it was always used along a statically backed array for performance reasons. This commit adds code to skip that performance optimization and always use just the dynamically allocated immediates if the number of them is too great. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-02-05 19:40:53 -05:00
Zack Rusin	8507afc97f	gallivm: allow large numbers of temporaries The number of allowed temporaries increases with almost every iteration of an API. We used to support 128, then we started increasing it, and the newer APIs support 4096+. So if we notice that the number of temporaries is larger than our statically allocated storage would allow, we just treat them as indexable temporaries and allocate them as an array from the start. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-02-05 19:40:53 -05:00
Chris Forbes	5eeb12c0bc	i965/fs: Assume FBO rendering in precompile if MRT. If multiple color outputs are written, this shader is unlikely to be useful with a winsys framebuffer. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-06 10:58:52 +13:00
Chris Forbes	046f8d8a6f	i965/fs: Guess nr_color_regions better in precompile Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-06 10:58:37 +13:00
Chris Forbes	6c9de691c7	docs: Add relnotes for 10.2 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-02-06 10:28:36 +13:00
Chris Forbes	87e916a240	mesa: Bump version to 10.2.0-devel Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-02-06 10:15:09 +13:00
Kristian Høgsberg	44338cd826	i965: Move intel_prepare_render() above first buffer access The driver is supposed to ensure buffers before any drawing operation, but in do_blit_drawpixels() and do_blit_copypixels() we inspect the buffer format before calling intel_prepare_render(). That was covered up by the unconditional call to intel_prepare_render() in intelMakeCurrent(), but we now only do this on the initial intelMakeCurrent call for a context (to get the size for the initial viewport values). https://bugs.freedesktop.org/show_bug.cgi?id=74083 Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Tested-by: Alexander Monakov <amonakov@gmail.com>	2014-02-05 11:10:39 -08:00
Brian Paul	db98d238e2	st/mesa: add MESA_SHADER_COMPUTE case in shader_stage_to_ptarget() Silences compiler warning. Trivial.	2014-02-05 11:00:41 -07:00
Brian Paul	357faa5a36	mesa: re-wrap, fix-up comment text in formats.h Wrap to 78 columns, fix comment formatting. Trivial.	2014-02-05 10:43:21 -07:00
Paul Berry	25268b930d	i965/cs: Allow ARB_compute_shader to be enabled via env var. This will allow testing of compute shader functionality before it is completed. To enable ARB_compute_shader functionality in the i965 driver, set INTEL_COMPUTE_SHADER=1. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:14:16 -08:00
Paul Berry	3bbf93045a	i965/cs: Create the brw_compute_program struct, and the code to initialize it. v2: Fix comment. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:05:04 -08:00
Paul Berry	1fe274b3d7	glsl/cs: Prohibit mixing of compute and non-compute shaders. Fixes piglit test: spec/ARB_compute_shader/linker/mix_compute_and_non_compute Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:05:01 -08:00
Paul Berry	5a79bdab30	glsl/cs: Prohibit user-defined ins/outs in compute shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:04:58 -08:00
Paul Berry	f5c5438e1f	main/cs: Implement query for COMPUTE_WORK_GROUP_SIZE. v2: Improve error message. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:04:55 -08:00
Paul Berry	28ce604b7f	mesa/cs: Handle compute shader local size during linking. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:04:20 -08:00
Paul Berry	0fa74e848f	glsl/cs: Handle compute shader local_size_{x,y,z} declaration. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:03:44 -08:00
Paul Berry	0398b69954	mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_COUNT constant. v2: Document that the 3-element array MaxComputeWorkGroupCount is indexed by dimension. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:03:08 -08:00
Paul Berry	c85c50997f	mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_INVOCATIONS constant. Reviewed-by: Matt Turner <mattst88@gmail.com> v2: Use CONTEXT_INT rather than CONTEXT_ENUM. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:02:30 -08:00
Paul Berry	347dde82e6	mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_SIZE constant. v2: Document that the 3-element array MaxComputeWorkGroupSize is indexed by dimension. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:01:54 -08:00
Paul Berry	47d480e3e4	mesa/cs: Create the gl_compute_program struct, and the code to initialize it. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 09:01:18 -08:00
Paul Berry	9b34ae2e64	mesa/cs: Handle compute shaders in _mesa_use_program(). v2: do cs after the ordered pipeline stages for consistency. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 09:01:16 -08:00
Paul Berry	c15064c169	glsl/cs: update main.cpp to use the ".comp" extension for compute shaders. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 09:01:13 -08:00
Paul Berry	d861c2963a	glsl/cs: Populate default values for ctx->Const.Program[MESA_SHADER_COMPUTE]. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 09:01:10 -08:00
Paul Berry	c61ec8d8e3	mesa/cs: Add a MESA_SHADER_COMPUTE stage and update switch statements. This patch adds MESA_SHADER_COMPUTE to the gl_shader_stage enum. Also, where it is trivial to do so, it adds a compute shader case to switch statements that switch based on the type of shader. This avoids "unhandled switch case" compiler warnings. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 09:00:34 -08:00
Paul Berry	28e526d558	glsl/cs: Change some linker loops to use MESA_SHADER_FRAGMENT as a bound. Linker loops that iterate through all the stages in the pipeline need to use MESA_SHADER_FRAGMENT as a bound, so that we can add an additional MESA_SHADER_COMPUTE stage, without it being erroneously included in the pipeline. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 09:00:31 -08:00
Paul Berry	79134cb516	mesa/cs: Add dispatch API stubs for ARB_compute_shader. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 09:00:14 -08:00
Paul Berry	b7d05a58ae	mesa/cs: Add extension enable flags for ARB_compute_shader. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 08:59:37 -08:00
Roland Scheidegger	4a7da3bec5	gallivm: fix F2U opcode Previously, we were really doing F2I. And also move it to generic section. (Note that for llvmpipe the code generated is definitely bad, due to lack of unsigned conversions with sse. I think though what llvm does (using scalar conversions to 64bit signed either with x87 fpu (32bit) or sse (64bit) including lots of domain changes) is quite suboptimal, could do something like is_large = arg >= 2^31; half_arg = 0.5 * arg; small_c = fptoint(arg); large_c = fptoint(half_arg) << 1; res = select(is_large, large_c, small_c) which should be far fewer instructions but that's something llvm should do itself.) This fixes piglit fs/vs-float-uint-conversion.shader_test (maybe more, needs GL 3.0 version override to run.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2014-02-05 17:45:31 +01:00
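The alternative conversion sequence suggested in the gallivm F2U commit above can be sketched roughly as follows. This is only an illustration in Python, not the gallivm or LLVM code: `fptoint` and `f2u` are hypothetical helper names, the signed conversion is assumed to return 0x80000000 on overflow (as SSE truncating conversions do), and inputs are assumed to be non-negative values exactly representable as 32-bit floats.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def fptoint(x: float) -> int:
    """Emulate a signed 32-bit truncating float-to-int conversion.

    Assumption: out-of-range inputs yield 0x80000000 (INT32_MIN).
    """
    v = int(x)  # truncates toward zero
    return v if INT32_MIN <= v <= INT32_MAX else INT32_MIN


def f2u(arg: float) -> int:
    """Float-to-unsigned conversion built only from signed conversions."""
    is_large = arg >= 2.0**31
    half_arg = 0.5 * arg
    small_c = fptoint(arg)                           # valid when arg < 2^31
    large_c = (fptoint(half_arg) << 1) & 0xFFFFFFFF  # valid when arg >= 2^31
    # Both candidates are computed; a select picks one, as in SIMD code.
    return large_c if is_large else small_c
```

For values at or above 2^31 a 32-bit float has no fractional bits and is always even, so halving, converting, and shifting left by one loses nothing.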
José Fonseca	5c975966dc	tools/trace: Handle index buffer overflow gracefully. Trivial.	2014-02-05 10:58:38 +00:00
Dave Airlie	16215a9723	docs/GL3.txt: update r600 status This updates the r600 driver status to 3.3 being fully supported. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-05 10:56:58 +10:00
Dave Airlie	79ea0f4506	r600g: add support for geom shaders to r600/r700 chipsets (v2) This is my first attempt at enabling r600/r700 geometry shaders; the basic tests pass on both my rv770 and my rv635. It requires this kernel patch: http://www.spinics.net/lists/dri-devel/msg52745.html v2: address Alex comments. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:43 +10:00
Dave Airlie	ccea799ee3	r600g: enable GLSL 3.30 on evergreen GPUs This throws the switch to enable GL 3.3 and GLSL 330. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:43 +10:00
Dave Airlie	c6cfc54db0	r600g: properly propagate clip dist write value This moves the value from the GS shader to the copy shader so the registers are setup correctly. fixes tests/spec/glsl-1.50/execution/geometry/clip-distance-out-values.shader_test Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:43 +10:00
Dave Airlie	b209afb153	r600g: calculate a better value for array_size (v2) attempt to calculate a better value for array size to avoid breaking apps. v2: use 0xfff like streamout, suggested by Grigori Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:42 +10:00
Dave Airlie	ce9e939144	r600g: fix CAYMAN geometry shader support cayman has a different end of program bit, so do that properly. fixes hangs with geom shader tests on cayman. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:42 +10:00
Dave Airlie	7ec5e883f2	r600g: fix up shader out misc stuff for copy shader set the correct values so the misc out register is setup correctly for the copy shader. This also updates the state for the gs copy shader so the hw gets programmed correctly. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:42 +10:00
Dave Airlie	7863611de3	r600g: port the layered surface rendering patch from radeonsi This just makes r600 and evergreen do what the radeonsi codepaths do for layered rendering. This makes the 2d amd_vertex_shader_layer test pass on evergreen. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:41 +10:00
Dave Airlie	f89394be98	r600g: initial VS output layer support This just adds support for emitting the proper value in the VS out misc. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:41 +10:00
Dave Airlie	5191937352	r600g: setup const texture buffers for geom shaders This just enables the workarounds we have for vertex/pixel shaders for geom shaders as well. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:41 +10:00
Dave Airlie	afce47fb0b	r600g: calculate correct cut value This selects the cut value depending on the shader selected. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:41 +10:00
Dave Airlie	0d79d5da40	r600g: fix dynamic_input_array_index.shader_test This follows what fglrx does, it unpacks the input we are going to indirect into a bunch of registers and indirects inside them. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:40 +10:00
Dave Airlie	e12147e9f6	r600g: add support for indirect geom ring writes We need to be able to write to the ring using a base register for when we emit vertices in a loop, in theory the SB compiler could collapse these indirect writes to direct writes if the register value is constant and known, but that is outside my pay grade. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:40 +10:00
Dave Airlie	cda63db780	r600g: write proper output prim type Vadim's code derived it from the info.mode, but it needs to be taken from the geometry shader output primitive. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:40 +10:00
Dave Airlie	2b0be2015d	r600g: enable instance cnt register with new enough kernel The instance cnt register was missing for a few kernels, with a new enough kernel we can output it. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:39 +10:00
Dave Airlie	f4652babbd	r600g: add primitive input support for gs only enable prim id if gs uses it Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:39 +10:00
Dave Airlie	b0e842bd9f	r600g: emit streamout from dma copy shader This enables streamout with GS in the mix, from the VS dma shader. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:39 +10:00
Dave Airlie	20adc7449c	r600g/gs: fix cases where number of gs inputs != number of gs outputs this fixes a bunch of the geom shader built-in tests Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:39 +10:00
Dave Airlie	defebc0293	r600g: increase array base for exported parameters Trivial fix to Vadim's code. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:38 +10:00
Dave Airlie	d9954e402f	r600g: initialise the geom shader loop registers. As we do for vertex and pixel shaders. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:38 +10:00
Dave Airlie	461c463bb2	r600g: emit NOPs at end of shaders in more cases If the shader has no CF clauses at all, emit a NOP. If the last instruction is an ENDLOOP, add a NOP for the LOOP to go to. If the last instruction is CALL_FS, add a NOP. These fix a bunch of hangs in the geometry shader tests. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:38 +10:00
Dave Airlie	c4782a58c3	r600g: don't enable SB for geom shaders SB needs fixes for three GS instructions; it seems to raise them outside loops etc. despite my best efforts. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:37 +10:00
Dave Airlie	5758a76d04	r600g/sb: add MEM_RING support Although we don't use SB on geom shaders, the VS copy shader will use it so we might as well implement MEM_RING support in sb. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:37 +10:00
Dave Airlie	eeead9b8ed	r600g: don't fail if we can't map VS->GS ring entries This can happen in normal operation, so don't report an error on it, just continue. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:37 +10:00
Vadim Girlin	1371d65a7f	r600g: initial support for geometry shaders on evergreen (v2) This is Vadim's initial work with a few regression fixes squashed in. v2: (airlied) fix regression in glsl-max-varyings - need to use vs and ps_dirty fix regression in shader exports from rebasing. whitespace fixing. v2.1: squash fix assert Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:11 +10:00
Vadim Girlin	34ee1d0f9f	r600g: add hw register definitions for GS block setup Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:40:42 +10:00
Vadim Girlin	a144bc29b5	r600g: defer shader variant selection and depending state updates [airlied: fix dropped streamout line - fix for master] Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:40:38 +10:00
Dave Airlie	ae29a098ea	r600g/bc: add support for indexed memory writes. It looks like we need these for geom shaders in the future. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:40:33 +10:00
Vadim Girlin	552aae7e47	r600g: move barrier and end_of_program bits from output to cf struct (v2) v2: fix regression on r600 NOP instructions. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-05 10:40:23 +10:00
Dave Airlie	29a43cb0b6	r600g: split streamout emit code into a separate function For geometry shaders we need to call this code from a second place. Just move it out for now to keep future patches cleaner. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-05 10:40:17 +10:00
Marek Olšák	07075cf350	r600g,radeonsi: skip unnecessary buffer_is_busy call, add a comment	2014-02-04 20:19:16 +01:00
Marek Olšák	08f0344cf3	r600g,radeonsi: skip busy-checking for DISCARD_RANGE if it has been done already	2014-02-04 20:19:16 +01:00
Marek Olšák	796e2fba8c	r600g,radeonsi: treat DYNAMIC and STREAM usage as STAGING	2014-02-04 20:19:16 +01:00
Marek Olšák	0354b769c2	gallium: remove PIPE_CAP_MAX_COMBINED_SAMPLERS This can be derived from the shader caps. All GPUs from ATI/AMD, NVIDIA, and INTEL have separate texture slots for each shader stage.	2014-02-04 20:19:16 +01:00
Brian Paul	82c0914266	mesa: remove stray bits of GL_EXT_cull_vertex GL_EXT_cull_vertex was removed back in 2010 in commit `02984e3536` but these bits still lingered. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-04 11:53:21 -07:00
Paul Berry	7f5740899f	glsl: Fix continue statements in do-while loops. From the GLSL 4.40 spec, section 6.4 (Jumps): The continue jump is used only in loops. It skips the remainder of the body of the inner most loop of which it is inside. For while and do-while loops, this jump is to the next evaluation of the loop condition-expression from which the loop continues as previously defined. Previously, we incorrectly treated a "continue" statement as jumping to the top of a do-while loop. This patch fixes the problem by replicating the loop condition when converting the "continue" statement to IR. (We already do a similar thing in "for" loops, to ensure that "continue" causes the loop expression to be executed). Fixes piglit tests: - glsl-fs-continue-inside-do-while.shader_test - glsl-vs-continue-inside-do-while.shader_test - glsl-fs-continue-in-switch-in-do-while.shader_test - glsl-vs-continue-in-switch-in-do-while.shader_test Cc: mesa-stable@lists.freedesktop.org Acked-by: Carl Worth <cworth@cworth.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-04 09:06:09 -08:00
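The effect of the do-while fix above can be illustrated with a small model of the lowered loop. This is a sketch in Python rather than GLSL IR, and `lowered_do_while` and its `replicate_condition` flag are hypothetical names used only for the illustration. The lowered loop is unconditional with the do-while condition checked at the end of the body, so a plain `continue` skips that check unless the condition is replicated in front of it.

```python
def lowered_do_while(limit, replicate_condition):
    """Model the lowered form of:

        i = 0;
        do { i++; if (i is odd) continue; visited.append(i); } while (i < limit);
    """
    i = 0
    visited = []
    while True:                  # the lowered loop has no built-in condition
        i += 1
        if i % 2 == 1:           # source-level `continue`
            if replicate_condition and not (i < limit):
                break            # replicated copy of the loop condition
            continue             # jumps to the top of the lowered loop
        visited.append(i)
        if not (i < limit):      # the do-while condition at the end of the body
            break
    return visited, i
```

With `limit = 3`, the version without the replicated condition runs one iteration too many and visits 4, while the fixed version exits as soon as `i` reaches 3.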
Paul Berry	56790856b3	glsl: Make condition_to_hir() callable from outside ast_iteration_statement. In addition to making it public, we also need to change its first argument from an ir_loop * to an exec_list *, so that it can be used to insert the condition anywhere in the IR (rather than just in the body of the loop). This will be necessary in order to make continue statements work properly in do-while loops. Cc: mesa-stable@lists.freedesktop.org Acked-by: Carl Worth <cworth@cworth.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-04 09:06:09 -08:00
Topi Pohjolainen	933be19cdf	i965/blorp: do not use unnecessary hw-blending support This is really not needed as blorp blit programs already sample XRGB normally and get alpha channel set to 1.0 automatically by the sampler engine. This is simply copied directly to the payload of the render target write message and hence there is no need for any additional blending support from the pixel processing pipeline. The blending formula is anyway broken for color components, it multiplies the color component with itself (blend factor is the component itself). Alpha blending in turn would not fix the alpha to one independent of the source but simply used the source alpha as is instead (1.0 * src_alpha + 0.0 * dst_alpha). Quoting Eric: "If we want to actually make the no-alpha-bits-present thing work, we need to override the bits in the surface state or in the generated code. In the normal draw path, it's done for sampling by the swizzling code in brw_wm_surface_state.c, and the blending overrides is just to fix up the alpha blending stage which doesn't pay attention to that for the destination surface." If one modifies piglit test gl-3.2-layered-rendering-blit to use color component values other than zero or one, this change will kick in on IVB. No regressions on IVB. This is effectively a revert of `c0554141a9`: i965/blorp: Support overriding destination alpha to 1.0. Currently, Blorp requires the source and destination formats to be equal. However, we'd really like to be able to blit between XRGB and ARGB formats; our BLT engine paths have supported this for a long time. For ARGB -> XRGB, nothing needs to occur: the missing alpha is already interpreted as 1.0. For XRGB -> ARGB, we need to smash the alpha channel to 1.0 when writing the destination colors. This is fairly straightforward with blending. For now, this code is never used, as the source and destination formats still must be equal. The next patch will relax that restriction. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-02-04 16:39:23 +02:00
Christian König	c3c24c3acc	radeon/uvd: fix feedback buffer handling v2 Without the correct feedback buffer size UVD runs into an error on each frame, reducing the maximum FPS. v2: fixing Michels comments Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Cc: "10.1" "10.0" "9.2" <mesa-stable@lists.freedesktop.org>	2014-02-04 13:10:50 +01:00
Kenneth Graunke	adaa5a6ca6	i965: Use brw_bo_map[_gtt]() in intel_miptree_map_raw(). This moves the intel_batchbuffer_flush before the drm_intel_bo_busy call, which is a change in behavior. However, the old behavior was broken. In the future, we may want to only flush if the batchbuffer references the BO being mapped. That's certainly more typical. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-03 16:16:38 -08:00
Kenneth Graunke	e396674d5f	i965: Use brw_bo_map() in intel_texsubimage_tiled_memcpy(). This additionally measures the time stalled, while also simplifying the code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-03 16:16:35 -08:00
Kenneth Graunke	d613bafe91	i965: Create drm_intel_bo_map wrappers with performance warnings. Mapping a buffer is a common place where we could stall the CPU. In a few places, we've added special code to check whether a buffer is busy and log the stall as a performance warning. Most of these give no indication of the severity of the stall, though, since measuring the time is a small hassle. This patch introduces a new brw_bo_map() function which wraps drm_intel_bo_map, but additionally measures the time stalled and reports a performance warning. If performance debugging is not enabled, it simply maps the buffer with negligible overhead. We also add a similar wrapper for drm_intel_gem_bo_map_gtt(). This should make it easy to add performance warnings in lots of places. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-03 16:16:26 -08:00
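The wrapper pattern described in the commit above might look roughly like the sketch below. It is written in Python for brevity and is not the i965 code: `map_with_perf_warning` and all of its parameters are hypothetical, and the assumption that the fast path is also taken when the buffer is not busy is mine, not stated in the commit message.

```python
import time


def map_with_perf_warning(bo, map_fn, is_busy_fn, perf_debug, warn_fn, what):
    """Map `bo` via `map_fn`; when perf debugging is on, report time stalled.

    All parameters are placeholders for driver facilities:
    map_fn(bo) performs the mapping, is_busy_fn(bo) says whether the GPU
    still uses the buffer, and warn_fn(msg) emits a performance warning.
    """
    # Fast path: no measurement overhead when debugging is off
    # (or, by assumption, when the buffer is idle and cannot stall).
    if not perf_debug or not is_busy_fn(bo):
        return map_fn(bo)

    start = time.monotonic()
    mapping = map_fn(bo)  # may block until the GPU is done with the buffer
    stalled_ms = (time.monotonic() - start) * 1000.0
    warn_fn(f"{what}: stalled {stalled_ms:.3f} ms mapping a busy buffer")
    return mapping
```

Passing the caller's description in `what` lets every call site share one wrapper while the warning still says which operation stalled.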
Rob Clark	1b886078db	freedreno: enabling binning and opt by default Hw binning pass doesn't seem to have broken anything. And optimizing compiler fixes a lot of shaders and doesn't seem to break anything. So re-org slightly FD_MESA_DEBUG params and make both hw binning and optimizer enabled by default. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-03 18:26:53 -05:00
Rob Clark	554f1ac00c	freedreno/a3xx/compiler: new compiler The new compiler generates a dependency graph of instructions, including a few meta-instructions to handle PHI and preserve some extra information needed for register assignment, etc. The depth pass assigned a weight/depth to each node (based on sum of instruction cycles of a given node and all its dependent nodes), which is used to schedule instructions. The scheduling takes into account the minimum number of cycles/slots between dependent instructions, etc. Which was something that could not be handled properly with the original compiler (which was more of a naive TGSI translator than an actual compiler). The register assignment is currently split out as a standalone pass. I expect that it will be replaced at some point, once I figure out what to do about relative addressing (which is currently the only thing that should cause fallback to old compiler). There are a couple new debug options for FD_MESA_DEBUG env var: optmsgs - enable debug prints in optimizer optdump - dump instruction graph in .dot format, for example: http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot.png http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot At this point, thanks to proper handling of instruction scheduling, the new compiler fixes a lot of things that were broken before, and does not appear to break anything that was working before[1]. So even though it is not finished, it seems useful to merge it in its current state. [1] Not merged in this commit, because I'm not sure if it really belongs in mesa tree, but the following commit implements a simple shader emulator, which I've used to compare the output of the new compiler to the original compiler (ie. run it on all the TGSI shaders dumped out via ST_DEBUG=tgsi with various games/apps): `163b6306b1` Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-03 18:26:53 -05:00
Rob Clark	f0e2d7ab46	freedreno/a3xx/compiler: split out old compiler For the time being, keep old compiler as fallback for things that the new compiler does not support yet. Split out as its own commit to make the later new-compiler commits easier to follow. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-03 18:26:53 -05:00
Rob Clark	a418573c4d	freedreno/a3xx/compiler: prepare for new compiler Shuffle things around to prepare for new compiler. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-03 18:26:53 -05:00
Rob Clark	f08d2b1c0f	freedreno/a3xx: remove useless reg tracking in disasm-a3xx Not really used for anything anymore. So strip it out and avoid conflicting symbols with upcoming new-compiler. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-03 18:26:53 -05:00
Carl Worth	1597788d12	docs: Add release notes for 10.0.3 Which was just made.	2014-02-03 13:55:24 -08:00
Brian Paul	fc3fcd1e01	draw: fix incorrect color of flat-shaded clipped lines When we clipped a line we weren't copying the provoking vertex color to the second vertex. We also weren't checking for first vs. last provoking vertex. Fixes failures found with the new piglit line-flat-clip-color test. Cc: "10.0, 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-02-03 12:50:04 -07:00
Brian Paul	349b76a553	mesa: change GL_ALL_ATTRIB_BITS to 0xFFFFFFFF This has been wrong for many years. It was originally 0x000FFFFF and long ago there was discussion about whether GL_ALL_ATTRIB_BITS should include the then-new GL_MULTISAMPLE_BIT bit. Eventually the ARB decided that glPushAttrib(GL_ALL_ATTRIB_BITS) should save all current and future attribute groups (hence ~0). Unfortunately, Mesa's gl.h was never updated. This was just recently spotted by Eric Anholt and reported as a bug to the ARB. Ian, Jon Leech and I discussed it at the ARB meeting and decided to change Mesa's value to reflect the ARB's decision. Acked-by: Eric Anholt <eric@anholt.net>	2014-02-03 12:50:03 -07:00
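The practical difference between the old and new GL_ALL_ATTRIB_BITS values in the commit above can be checked with plain bit arithmetic. The snippet is a Python illustration only: `saves_group` is a hypothetical helper, and `GL_MULTISAMPLE_BIT = 0x20000000` is the value from the OpenGL headers.

```python
GL_MULTISAMPLE_BIT = 0x20000000   # attribute group bit from the OpenGL headers
OLD_ALL_ATTRIB_BITS = 0x000FFFFF  # value Mesa's gl.h used before this commit
NEW_ALL_ATTRIB_BITS = 0xFFFFFFFF  # value after this commit (all bits set)


def saves_group(mask: int, group_bit: int) -> bool:
    """Return True if a glPushAttrib mask selects the given attribute group."""
    return (mask & group_bit) != 0
```

With the old mask, `saves_group(OLD_ALL_ATTRIB_BITS, GL_MULTISAMPLE_BIT)` is false, so the multisample group fell outside "all" attributes. With every bit set, current and future groups are covered.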
Brian Paul	307fd76053	gallium/auxiliary/indices: replace free() with FREE() To match the CALLOC_STRUCT() call. Cc: "10.0, 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-02-03 12:49:55 -07:00
Brian Paul	97fdace6d7	svga: check shader size against max command buffer size If the shader is too large, plug in a dummy shader. This patch also reworks the existing dummy shader code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-02-03 12:40:13 -07:00
Brian Paul	4686f610b1	svga: refactor some shader code Put common code in new svga_shader.c file. Consolidate separate vertex/ fragment shader ID generation. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-02-03 12:40:13 -07:00
Zack Rusin	9bace99d77	gallivm: fix opcode and function nesting gallivm soa code supported only a single level of nesting for control flow opcodes (if, switch, loops...) but the d3d10 spec clearly states that those are nested within functions. To support nesting of conditionals inside functions we need to store the nesting data inside function contexts and keep a stack of those. Furthermore we make sure that if nesting for subroutines is deeper than 32 then we simply ignore all subsequent 'call' invocations. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-02-03 13:29:14 -05:00
Kenneth Graunke	595bcf38a6	mesa: Drop unnecessary (void) ctx from VAO code. ctx is always used, even on release builds. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:53:16 -08:00
Kenneth Graunke	4323b92479	mesa: Remove "APPLE" from some VAO error messages. Chances are, people will be using the core names these days. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:53:15 -08:00
Kenneth Graunke	cf62e59673	mesa: Update some comments relating to VAOs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:53:13 -08:00
Kenneth Graunke	e1b1f2a687	mesa: Rename ElementArrayBufferObj to IndexBufferObj. DirectX and most hardware documentation use the term "Index Buffer" to refer to a buffer containing indexes into arrays of vertex data, which allows random access to vertex data, rather than sequential access. OpenGL uses a different term for this concept: "Element Array Buffer". However, "Index Buffer" has become much more widespread. A quick Google search shows 29,300 hits for "Element Array Buffer" vs. 82,300 hits for "Index Buffer." Arguably, "Index Buffer" is clearer: an "element of an array" (or list) usually refers to an actual item stored in the array, not the index used to refer to it. The terminology is also already used in Mesa: some VBO module code for dealing with ElementArrayBufferObj names local variables "ib". Completely generated by: $ find . -type f -print0 | xargs -0 sed -i \ 's/ElementArrayBufferObj/IndexBufferObj/g' Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:53:11 -08:00
Kenneth Graunke	0354e50798	mesa: Rename _mesa_lookup_arrayobj to _mesa_lookup_vao. For consistency with the previous renames. Completely generated by: $ find . -type f -print0 | xargs -0 sed -i \ 's/_mesa_lookup_arrayobj/_mesa_lookup_vao/g' Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:53:09 -08:00
Kenneth Graunke	de47fd2668	mesa: Rename _mesa_..._array_obj functions to _mesa_..._vao. _mesa_update_vao_client_arrays() is less of a mouthful than _mesa_update_array_object_client_arrays(), and generally clearer. Generated by: $ find . -type f -print0 | xargs -0 sed -i \ 's/_mesa_\([^_]*\)_array_object/_mesa_\1_vao/g' with manual whitespace and indentation fixes applied. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:53:07 -08:00
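The rename in the commit above relies on a capture group that keeps the verb between `_mesa_` and `_array_object`. A rough Python equivalent of that substitution is shown below, assuming the sed expression was meant to capture one underscore-free word. `rename` is a hypothetical helper, and Python's `( )` plays the role of sed's `\( \)`.

```python
import re

# One underscore-free word between "_mesa_" and "_array_object" is captured
# and reinserted in front of "_vao".
PATTERN = re.compile(r"_mesa_([^_]*)_array_object")


def rename(text: str) -> str:
    """Apply the _mesa_<verb>_array_object -> _mesa_<verb>_vao rename."""
    return PATTERN.sub(r"_mesa_\1_vao", text)
```

For example, `rename("_mesa_update_array_object_client_arrays()")` produces `_mesa_update_vao_client_arrays()`, matching the name quoted in the commit message, while identifiers without `_array_object` are left alone.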
Kenneth Graunke	aac1415b66	mesa: Rename "struct gl_array_object" to gl_vertex_array_object. I considered replacing it with "gl_vao", but spelling it out seemed to fit better with Mesa's traditional style. Mesa doesn't shy away from long type names - consider gl_transform_feedback_object, gl_fragment_program_state, gl_uniform_buffer_binding, and so on. Completely generated by: $ find . -type f -print0 | xargs -0 sed -i \ 's/gl_array_object/gl_vertex_array_object/g' v2: Rerun command to resolve conflicts with Ian's meta patches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:53:05 -08:00
Kenneth Graunke	94e07c1960	mesa: Rename "arrayObj" local variables to "vao". Now that the field is named "VAO" instead of "ArrayObj", it makes sense to call the local variables "vao" instead of "arrayObj". Completely generated by: $ find . -type f -print0 | xargs -0 sed -i 's/arrayObj/vao/g' Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:53:02 -08:00
Kenneth Graunke	0dfe50f1a6	mesa: Rename ArrayObj to VAO and DefaultArrayObj to DefaultVAO. When reading through the Mesa drawing code, it's not immediately obvious to me that "ArrayObj" (gl_array_object) is the Vertex Array Object (VAO) state. The comment above the structure explains this, but readers still have to remember this and translate accordingly. Out of context, "array object" is fairly vague. Even in context, "array" has a lot of meanings: glDrawArrays, vertex data stored in user arrays, gl_client_arrays, gl_vertex_attrib_arrays, and so on. Using the term "VAO" immediately associates these fields with the OpenGL concept, clarifying the situation and aiding programmer sanity. Completely generated by: $ find . -type f -print0 | xargs -0 sed -i \ -e 's/ArrayObj;/VAO;/g' \ -e 's/->ArrayObj/->VAO/g' \ -e 's/Array\.ArrayObj/Array.VAO/g' \ -e 's/Array\.DefaultArrayObj/Array.DefaultVAO/g' v2: Rerun command to resolve conflicts with Ian's meta patches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:52:58 -08:00
Ian Romanick	81144c049b	meta: Silence several 'unused parameter' warnings Silences many GCC warnings of the form: drivers/common/meta.c: In function 'cleanup_temp_texture': drivers/common/meta.c:1208:41: warning: unused parameter 'ctx' [-Wunused-parameter] drivers/common/meta.c: In function 'setup_ff_blit_framebuffer': drivers/common/meta.c:1453:46: warning: unused parameter 'ctx' [-Wunused-parameter] drivers/common/meta.c: In function 'meta_glsl_blit_cleanup': drivers/common/meta.c:1998:43: warning: unused parameter 'ctx' [-Wunused-parameter] drivers/common/meta.c: In function 'meta_glsl_clear_cleanup': drivers/common/meta.c:2287:44: warning: unused parameter 'ctx' [-Wunused-parameter] drivers/common/meta.c: In function 'setup_ff_generate_mipmap': drivers/common/meta.c:3365:45: warning: unused parameter 'ctx' [-Wunused-parameter] drivers/common/meta.c: In function 'meta_glsl_generate_mipmap_cleanup': drivers/common/meta.c:3556:54: warning: unused parameter 'ctx' [-Wunused-parameter] There are a couple other similar warnings, but they are less trivial. I want to investigate these further before axing them. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-02 16:49:09 +01:00
Ian Romanick	2bf4db1697	meta: Don't use fixed-function to decompress array textures Array textures can't be used with fixed-function, so don't. Instead, just drop the decompress request on the floor. This is no worse than what was done previously because generating the GL error (in _mesa_set_enable) broke everything anyway. A later patch will get GL_TEXTURE_2D_ARRAY targets working. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-02 16:49:09 +01:00
Ian Romanick	eb65d4b84d	meta: Use NDC in decompress_texture_image There is no need to use pixel coordinates, and using NDC directly will simplify the GLSL paths. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-02 16:49:09 +01:00
Ian Romanick	abfa65ca81	meta: Consistently use non-Apple VAO functions For these objects, meta was already using the non-Apple function to delete the objects. Everywhere else in the file uses _mesa_GenVertexArrays and _mesa_BindVertexArrays. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>	2014-02-02 16:49:09 +01:00
Ian Romanick	070f55d893	meta: Fallback to software for GetTexImage of compressed GL_TEXTURE_CUBE_MAP_ARRAY The hardware decompression path isn't even close to being able to handle this. This converts the crash (assertion failure) in "EXT_texture_compression_s3tc/getteximage-targets S3TC CUBE_ARRAY" to a plain old failure. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>	2014-02-02 16:49:09 +01:00
Ian Romanick	fcb498302b	meta: Release resources used by _mesa_meta_DrawPixels _mesa_meta_DrawPixels creates a VAO and (potentially) two fragment programs, but none of them are ever released. Leaking piles of memory is generally frowned upon. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>	2014-02-02 16:49:08 +01:00
Ian Romanick	2d3f92e881	meta: Release resources used by decompress_texture_image decompress_texture_image creates an FBO, an RBO, a VBO, a VAO, and a sampler object, but none of them are ever released. Later patches will add program objects, exacerbating the problem. Leaking piles of memory is generally frowned upon. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>	2014-02-02 16:49:08 +01:00
Ian Romanick	a722454dac	mesa: Use common _mesa_tex_target_to_index in tex param code TEXTURE_BUFFER_INDEX has to be specially called out because it is not allowed in any of the glTexParameter or glGetTexParameter functions. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-02 16:49:08 +01:00
Ian Romanick	35e7027dab	mesa: Make target_enum_to_index available outside texobj.c The next patch will use this function in another file. v2: Rename _mesa_target_enum_to_index to _mesa_tex_target_to_index. Suggested by Brian. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-02 16:49:08 +01:00
Brian Paul	9451281aca	mesa: make several FBO functions static The four functions in question weren't called from any other file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:52:38 -07:00
Brian Paul	3abd4f4d90	mesa: move glGenerateMipmap() code into new genmipmap.c file Mipmap generation has nothing to do with FBOs. v2: update gl_genexec.py too (not api_exec.c) Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:52:37 -07:00
Brian Paul	bfcb9bb204	mesa: move glBlitFramebuffer code into new blit.c file Just for better organization. v2: update gl_genexec.py too (not api_exec.c) Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:52:37 -07:00
Brian Paul	20fedfd80a	mesa: don't signal _NEW_TEXTURE in TexSubImage() functions glTexSubImage(), glCopyTexSubImage() and glCompressedTexSubImage() only change the texel data, not other state like texture size or format. If a driver really needs to do something special it can hook into the corresponding driver functions or Map/UnmapTextureImage(). This should avoid some needless state validation effort. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-02-02 06:52:37 -07:00
Brian Paul	c55e3e6811	mesa: add some comments about mipmap generation Trivial.	2014-02-02 06:52:37 -07:00
Brian Paul	e286b63c8f	mesa: simplify comment in texstorage.c Trivial.	2014-02-02 06:52:37 -07:00
Brian Paul	8b3e383820	mesa: formatting fixes, 78-column wrappings in dd.h Trivial.	2014-02-02 06:52:37 -07:00
Brian Paul	deb9dd6e27	mesa: remove target param from ctx->Driver.TexParameter() Not really used anywhere. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:52:37 -07:00
Brian Paul	c20b48c48e	gallivm: add a few const qualifiers Trivial.	2014-02-02 06:52:36 -07:00
Brian Paul	c6d94648cf	translate: reindent translate_sse.c Trivial.	2014-02-02 06:52:36 -07:00
Brian Paul	8689076925	mesa: make _mesa_get_proxy_target() static Wasn't used in any other file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:47:32 -07:00
Brian Paul	9eaed3eb6e	mesa: remove unused _mesa_select_tex_object() function The _mesa_get_current_tex_object() function is now used everywhere that _mesa_select_tex_object() was formerly used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:47:32 -07:00
Brian Paul	d5df28381e	swrast: use _mesa_get_current_tex_object() in swrastSetTexBuffer2() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:47:32 -07:00
Brian Paul	ed72115891	st/mesa: use _mesa_get_current_tex_object() in st_context_teximage() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:47:32 -07:00
Brian Paul	f09a1261ad	mesa: use _mesa_get_current_tex_object() in GetTexLevelParameteriv() And update a related comment. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:47:32 -07:00
Brian Paul	8b4f6fada2	radeon: use _mesa_get_current_tex_object() in radeonSetTexBuffer2() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:47:32 -07:00
Brian Paul	76c33e383c	r200: use _mesa_get_current_tex_object() in r200SetTexBuffer2() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:47:32 -07:00
Paul Seidler	1cdeeef6c4	build: move ARCH_LIBS definition outside of ASM definition _mesa_streaming_load_memcpy is also needed even if assembling is disabled Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-01 15:01:06 -08:00
Eric Anholt	c849ecc19a	dri: Add a useful error message if someone's packages missed libudev deps. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-01 10:09:11 -08:00
Eric Anholt	63546b8e3d	dri: Also support the loader with libudev.so.0. As far as I know, this should be safe. If not, we have to decide whether to have variable lookup of the functions, or just drop support for .so.0 (which is a year and a half old it looks like) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74127 Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-01 10:08:36 -08:00
Rob Clark	dc00ec154b	freedreno: better manage our WFI's Updates to non-banked registers, CP_LOAD_STATE, etc, need a WFI if there is potentially pending rendering. Track this better, and add fd_wfi() calls everywhere that might potentially need CP_WAIT_FOR_IDLE. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 12:10:17 -05:00
Rob Clark	1fe9df8f29	freedreno/a3xx: add logicop Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:59:25 -05:00
Rob Clark	8d27be2633	freedreno/a3xx: handle frag z write Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:58:47 -05:00
Rob Clark	083b27a1b1	freedreno: resync generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:57:39 -05:00
Rob Clark	98c1111462	freedreno/a3xx: fix const confusion Gallium can leave const buffers bound above what is used by the current shader. Which can have a couple bad effects: 1) write beyond const space assigned, which can trigger HLSQ lockup 2) double emit of immed consts, first with bound const buffer vals followed by with actual immed vals. This seems to be a sort of undefined condition. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:57:09 -05:00
Rob Clark	5c6961efae	freedreno/a3xx/compiler: compiler cleanups Drop color/pos/psize_regid, plus a few compiler and IR cleanups. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:53:21 -05:00
Rob Clark	69eca28dd0	freedreno/compiler/a3xx: remove lowered instructions Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:52:27 -05:00
Rob Clark	0f2df4ff90	freedreno: add tgsi lowering pass Currently lowers the following instructions: DST, XPD, SCS, LRP, FRC, POW, LIT, EXP, LOG, DP4, DP3, DPH, DP2 translating these into equivalent simpler TGSI instructions. This probably should be moved to util so other drivers can use it, but just adding under freedreno for now so that I can clear out a lot of the lowering code in a3xx compiler before beginning to add new compiler. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:50:10 -05:00
Rob Clark	7524756199	freedreno/a3xx/compiler: add CLAMP Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:49:31 -05:00
Rob Clark	fafe16a8a0	freedreno/a3xx/compiler: various fixes Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:49:06 -05:00
Rob Clark	4971628bae	freedreno: ctx should hold ref to dev The ctx should hold ref to dev to avoid problems if screen is destroyed before ctx. Doesn't really fix the egl/glx issues, but at least it prevents things from getting much worse. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:47:08 -05:00
Rob Clark	303df12db8	freedreno: add prims-emitted driver query Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:45:19 -05:00
Kenneth Graunke	80bf1fbaf6	i965: Silence unused variable 'ctx' warning. Somehow I missed this before pushing the Broadwell PS state upload code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-31 21:40:27 -08:00
Kenneth Graunke	e1cdafe6f7	i965: Fix math instruction hstride assertions on Broadwell. In the final revision of my gen8_generator patch, I updated the MATH instruction's assertion from (dst.hstride == 1) to check that source and destination hstride matched. Unfortunately, I didn't test this enough, and many Piglit tests fail this test. The documentation indicates that "scalar source is also supported", which we believe means <0,1,0> access mode (hstride == 0). If hstride is non-zero, then it must match the destination register. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-31 17:50:09 -08:00
Kenneth Graunke	d8878055f5	i965: Add (disabled) Broadwell PCI IDs. This puts the PCI IDs in place so it's easy to enable support. However, it doesn't actually enable support since it's very preliminary still, and a few crucial pieces (such as BLORP) are still missing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:08 -08:00
Kenneth Graunke	3ade766684	i965: Disable 3DSTATE_WM_HZ_OP fields. Eric believes this to be wrong and unnecessary, as the command is supposed to emit an implicit rectangle primitive. However, empirically the pixel pipeline is completely unreliable without it. So for now, it stays until someone comes up with a better solution. We'll need to do better than this when we implement multisampling, HiZ, or fast clears...but for now, this will do. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:08 -08:00
Kenneth Graunke	4c4e0ed64b	i965: Update GS state for Broadwell. This is quite similar to the Gen7 code. The main changes: - 48-bit relocations - Thread count is specified as U/2-1 instead of U-1. - An extra DWord (DW9) with clip planes, URB entry output length/offsets - We need to program the "Expected Vertex Count" (VerticesIn) v2: Set the number of binding table entries so they can be prefetched (requested by Eric Anholt). v3: Add a WARN_ONCE for a missing workaround. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:08 -08:00
Kenneth Graunke	a0d4311072	i965: Update multisampling state for Broadwell. On previous platforms, 3DSTATE_MULTISAMPLE contained the number of samples, pixel location, and the positions of each sample within a pixel for each multisampling mode (4x and 8x). It was also a non-pipelined command, presumably since changing the sample positions is fairly drastic. Broadwell improves upon this by splitting the sample positions out into a separate non-pipelined state packet, 3DSTATE_SAMPLE_PATTERN. With that removed, 3DSTATE_MULTISAMPLE becomes a pipelined state packet. Broadwell also supports 2x and 16x multisampling, in addition to the 4x and 8x supported by Gen7. This patch, however, does not implement 2x and 16x. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:08 -08:00
Kenneth Graunke	9cd65e3289	i965: Update 3DSTATE_{DEPTH,STENCIL,...}_BUFFER and such for Broadwell. The amount of cut and paste from Gen7 is rather ugly, and should probably be cleaned up in the future. Even the Gen7 code is in need of some tidying though; many of the function parameters aren't used on platforms that use level/layer rather than tile offsets. Tidying both can be left to a future patch series. This at least gets things going. v2: Rebase on Paul's rename of NumLayers -> MaxNumLayers. v3: Shift QPitch by 2 when storing it in the packet. Bits 14:0 store bits 16:2 of the actual value. Fixes tests. v4: Add missing stencil buffer QPitch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:08 -08:00
Kenneth Graunke	2fce1e3c69	i965: Update BLEND_STATE for Broadwell. v2: Allow logic ops on all surface types. The UNORM restriction was lifted with Haswell and I simply hadn't noticed. Also, add missing BRW_NEW_STATE_BASE_ADDRESS dirty bit. Both caught by Eric Anholt. v3: Fix swapped per-RT DWord pairs. Eliminates bizarre hacks. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:08 -08:00
Kenneth Graunke	460e0df330	i965: Update SF_CLIP_VIEWPORT for Broadwell. It has additional fields to support clipping to the viewport even if guardband clipping is enabled. v2: Update for viewport array changes. v3: No, seriously, update for viewport array changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v1]	2014-01-31 17:50:08 -08:00
Kenneth Graunke	dcbf25969e	i965: Rework SURFACE_STATE entries for Broadwell. v2: Add missing SCS setting in gen8_emit_buffer_surface_state (caught by Eric Anholt). v3: Use stored QPitch rather than recomputing it. v4: Shift QPitch by 2 when setting it in the packet; bits 14:0 store bits 16:2 of the actual value (fixes myriads of cube and array texturing tests). Also, only enable cube face bits for cubemaps (matches Chris Forbes' commit on master). Port to use offset64. v5: s/gl_format/mesa_format/g v6: Fix DW5 of renderbuffer state, which neglected to subtract irb->mt->first_level. Use vertical_alignment() rather than hardcoding 4. Use ffs for multisample counts rather than a large switch statement (all caught/suggested by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:07 -08:00
Kenneth Graunke	990aaf87c4	i965: Update SOL state for Broadwell. Unlike on Gen7, we can directly set the offset via the state packet. We also -have- to: the kernel SOL reset code won't work anymore. v2: Fix copy and paste mistake in buffer stride setup; drop stale comment (caught by Eric Anholt). Add a perf_debug for missing MOCS setup. v3: Rebase on Paul Berry's changes to CurrentVertexProgram. v4: Fix SO Write Offset handling. We need to set bits 20 and 21 so the hardware both loads and saves the offset. There's also a restriction that 3DSTATE_SO_BUFFER can only be programmed once per buffer between primitives, so the "reset to zero" code needed reworking. Fixes most of the transform feedback Piglit tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v2]	2014-01-31 17:50:07 -08:00
Kenneth Graunke	fd91ab662d	i965: Update the code that disables unused shader stages for Broadwell. v2: Also disable 3DSTATE_WM_CHROMAKEY for safety. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v1]	2014-01-31 17:50:07 -08:00
Kenneth Graunke	3d3c351cfb	i965: Update 3DSTATE_CLIP for Broadwell. Broadwell's winding order, polygon fill, and viewport Z test fields have moved to DWord 1 of 3DSTATE_RASTER. v2: Add a perf_debug for a future optimization and improve commit message (both suggested by Eric Anholt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:07 -08:00
Kenneth Graunke	5c0d7dbcb9	i965: Rework vertex uploads for Broadwell. v2: Emit a dummy 3DSTATE_VF_SGVS packet when not needed. v3: Add WARN_ONCE and perf_debugs requested by Eric Anholt. v4: Program 3DSTATE_SGVS even in the no-elements case so gl_VertexID continues working. Fix 3DSTATE_VF_INSTANCING to not use an element index to access the buffers array. Some ARB_draw_indirect prep work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:07 -08:00
Kenneth Graunke	08a4714959	i965: Update STATE_BASE_ADDRESS for Broadwell. v2: Fix missing "change" bit on instruction state base address (caught by Haihao Xiang). v3: Add a perf_debug for missing MOCS setup, requested by Eric. v4: Fix buffer sizes. The value, specified at bit 12 and up, is actually measured in 4k pages. We need to round up to the next multiple of 4k. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v3] Reviewed-by: Matt Turner <mattst88@gmail.com> [v4]	2014-01-31 17:50:07 -08:00
Kenneth Graunke	f3c6d6f1e1	i965: Update 3DSTATE_PS, 3DSTATE_WM, and add 3DSTATE_PS_EXTRA. v2: Fix setting of GEN8_PSX_ATTRIBUTE_ENABLE after rebases. v3: Add missing binding table entry counts. Don't worry about alpha testing or alpha to coverage when setting the "Kill Pixel" bit; those are specified in 3DSTATE_PS_BLEND (caught by Eric Anholt). Drop unused _NEW_BUFFERS. Tidy comments. v4: Rebase on Paul Berry's changes to CurrentFragmentProgram. v5: Re-enable line stippling. It doesn't crash or anything. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v3]	2014-01-31 17:50:07 -08:00
Kenneth Graunke	20d9286f71	i965: Rework 3DSTATE_VS for Broadwell. v2: Remove incorrect MOCS shifts; rename urb_entry_write_offset to urb_entry_output_offset to closer match the documentation. v3: Only emit a non-zero constant buffer read length when active. v4: Add missing binding table counts (caught by Eric). v5: Rebase on Paul Berry's changes to CurrentVertexProgram. v6: Drop bogus SBE read length/offset field code. We were programming the wrong values, and our 3DSTATE_SBE code overrides any value we put here anyway with the correct one. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v4]	2014-01-31 17:50:06 -08:00
Kenneth Graunke	c96686a6cc	i965: Add the new 3DSTATE_PS_BLEND state packet. v2: Only set GEN8_PS_BLEND_HAS_WRITEABLE_RT if color buffer writes are enabled (caught by Eric Anholt). v3: Set non-blending flags (writeable RT, alpha test, alpha to coverage) for integer formats too. +14 Piglits. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v2]	2014-01-31 17:50:06 -08:00
Kenneth Graunke	17768bb7b4	i965: Replace DEPTH_STENCIL_STATE with Gen8's 3DSTATE_WM_DEPTH_STENCIL. v2: Use stencil->_WriteEnabled instead of setting GEN8_WM_DS_STENCIL_BUFFER_WRITE_ENABLE twice (suggested by Eric). v3: Mask stencil->WriteMask and stencil->ValueMask with 0xff. The field is only 8-bits, so we'd trip the new SET_FIELD assertion when core Mesa gave us a value like 0xFFFFFFFF. The Gen7 code uses structure field widths to implicitly do this truncation. Fixes Piglit tests. v4: Use uint32_t for dw1/dw2, not uint8_t. Worst. Typo. Ever. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v2]	2014-01-31 17:50:06 -08:00
Kenneth Graunke	90fff1354b	i965: Update SF, SBE, and RASTER state for Broadwell. The attribute override portion of 3DSTATE_SBE was split out into 3DSTATE_SBE_SWIZ; various bits of 3DSTATE_SF were split out into 3DSTATE_RASTER. v2: Set Force URB Read Offset bit. Eventually the URB read offset should be set in 3DSTATE_VS, but that will require some refactoring. v3: Rebase on viewport array changes. v4: Improve comments about URB read length/offset overrides. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:06 -08:00
Kenneth Graunke	4552a22f04	i965: Bump generation assertions on workaround flushes. I haven't investigated whether these are necessary on Broadwell or not, but for paranoia's sake, we may as well continue doing them for now. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-31 17:50:06 -08:00
Kenneth Graunke	2184b519cd	i965: Duplicate gen7_atoms to gen8_atoms. It's going to diverge significantly. Starting out with a copy allows future patches to change atoms one by one. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:06 -08:00
Brian Paul	f51ca46f0c	radeon: move driContextSetFlags(ctx) call after ctx var is initialized CC: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-01-31 17:09:44 -07:00
Brian Paul	2d6d69bab6	r200: move driContextSetFlags(ctx) call after ctx var is initialized Otherwise, ctx was a garbage value. CC: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-01-31 17:09:44 -07:00
Roland Scheidegger	1d53603f1f	llvmpipe: fix denorm handling for r11g11b10_float format when blending The code re-enabling denorms for small float formats did not recognize this format due to format handling hacks (mainly, the lp_type doesn't have the floating bit set). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-01-31 19:51:06 +01:00
Matt Turner	606544214e	glsl: Expand non-expr & non-swizzle scalar rvalues in vectorizing.	2014-01-31 10:21:50 -08:00
Matt Turner	3f49a8c9a5	glcpp: Reject #version after the version has been resolved. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74166 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Carl Worth <cworth@cworth.org>	2014-01-31 10:21:50 -08:00
Carl Worth	9d4a6bd6bb	glcpp: Rename the variable used to enable debugging. The -p option we now use when calling bison means that this variable will be named glcpp_parser_debug not yydebug. This was not caught when the -p option was added because this variable isn't used in the code as committed. (I prefer the declaration to remain since it allows a developer to easily find this variable name to enable debugging.)	2014-01-31 10:02:58 -08:00
Carl Worth	2dc93bd5d1	glcpp: Add "make check" test for comment-parsing bug This is the innocent-looking but killer test case to verify the bug fixed in the preceding commit. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-31 10:02:54 -08:00
Carl Worth	71978cf66f	glcpp: Don't enter lexer's NEWLINE_CATCHUP start state for single-line comments In commit `6005e9cb28` a new start state of NEWLINE_CATCHUP was added to the lexer. This start state is used whenever the lexer is emitting a NEWLINE token to emit additional NEWLINE tokens for any newline characters that were skipped by an immediately preceding multi-line comment. However, that commit erroneously entered the NEWLINE_CATCHUP state for single-line comments. This is not desired since in the case of a single-line comment, the lexer is not emitting any NEWLINE token. The result is that the lexer will remain in the NEWLINE_CATCHUP state and proceed to fail to emit a NEWLINE token for the subsequent newline character, (since the case to match \n expects only the INITIAL start state). The fix is quite simple, remove the "BEGIN NEWLINE_CATCHUP" code from the single-line comment case, (preserving it only in exactly the cases where the lexer is actually emitting a NEWLINE token). Many thanks to Petri Latvala for reporting this bug and for providing the minimal test case to exercise it. The bug showed up only with a multi-line comment which was followed immediately by a single-line comment (without any intervening newline), such as: /* */ // Kablam! Since `6005e9cb28`, and before this commit, that very innocent-looking combination of comments would yield a parse failure in the compiler. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72686 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-31 10:02:36 -08:00
Brian Paul	df21f31788	mesa: use _mesa_align_free() in _mesa_delete_buffer_object() To match _mesa_align_malloc() call in _mesa_buffer_data(). Found by Colin Harrison <colin.harrison@virgin.net> Signed-off-by: Brian Paul <brianp@vmware.com>	2014-01-31 09:52:11 -07:00
Michel Dänzer	db8b6fb2df	st/dri: Fix tests for no draw/read buffers in dri_make_current() Fixes piglit glx/GLX_ARB_create_context/current with no framebuffer. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-31 11:06:26 +09:00
Keith Packard	3fbd1b0cb5	dri3: Track current Present swap mode and adjust buffer counts This automatically adjusts the number of buffers that we want based on what swapping mode the X server is using and the current swap interval (swap mode / interval / buffers): copy / > 0 / 1; copy / 0 / 2; flip / > 0 / 2; flip / 0 / 3. Note that flip with swap interval 0 is currently limited to twice the underlying refresh rate because of how the kernel manages flipping. Moving from 3 to 4 buffers would help, but that seems ridiculous. v2: Just update num_back at the point that the values that change num_back change. This means we'll have the updated value at the point that the freeing of old going-to-be-unused backbuffers happens, which might not have been the case before (change by anholt, acked by keithp). Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-30 17:29:33 -08:00
Keith Packard	aea4757eb4	dri3, i915, i965: Add __DRI_IMAGE_FOURCC_SARGB8888 The __DRIimage createImageFromFds function takes a fourcc code, but there was no fourcc code that matched __DRI_IMAGE_FORMAT_SARGB8. This adds a define for that format, adds a translation in DRI3 from __DRI_IMAGE_FORMAT_SARGB8 to __DRI_IMAGE_FOURCC_SARGB8888 and then adds translations back to __DRI_IMAGE_FORMAT_SARGB8 in both the i915 and i965 drivers. I'll refrain from comments on whether I think having two separate sets of format defines in dri_interface.h is a good idea or not... Fixes piglit glx-tfp and glx-visuals-depth Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-30 17:29:23 -08:00
Keith Packard	f12d6d613a	dri3: Flush XCB before blocking for special events XCB doesn't flush the output buffer automatically, so we have to call xcb_flush ourselves before waiting. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-30 16:40:25 -08:00
Keith Packard	09d6c19720	dri3: Enable GLX_INTEL_swap_event Now that we're tracking SBC values correctly, and the X server has the ability to send the GLX swap events from a PresentPixmap request, enable this extension. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-30 16:40:06 -08:00
Keith Packard	1525474ead	dri3: Fix dri3_wait_for_sbc to wait for completion of requested SBC Eric figured out that glXWaitForSbcOML wanted to block until the requested SBC had been completed, which means to wait until the PresentCompleteNotify event for that SBC had been received. This replaces the simple sleep(1) loop (which was bogus) with a loop that just checks to see if we've seen the specified SBC value come back in a PresentCompleteNotify event yet. The change is a bit larger than that as I've broken out a piece of common code to wait for and process a single Present event for the target drawable. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-30 16:38:36 -08:00
Keith Packard	71d614250e	dri3: Track full 64-bit SBC numbers, instead of just 32-bits Tracking the full 64-bit SBC values makes it clearer how those values are being used, and simplifies the wait_msc code. The only trick is in re-constructing the full 64-bit value from Present's 32-bit serial number that we use to pass the SBC value from request to event. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-30 16:35:00 -08:00
Mark Mueller	34a8a0820f	mesa: Add warning to _REV pack/unpack functions with incorrect behavior Signed-off-by: Mark Mueller <MarkKMueller@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-01-31 00:51:36 +01:00
Siavash Eliasi	03065ea05c	r600g: Removed unnecessary positivity check for unsigned int variable. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-01-31 00:50:08 +01:00
Michel Dänzer	9f26ad00d7	st/dri: Allow creating OpenGL 3.3 core contexts Enables OpenGL 3.3 piglit tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-30 10:30:41 +09:00
Kristian Høgsberg	cbecd958a7	build: Share the all-local rule for linking libraries into the build dir This consolidates how we link the libraries into the build directory. It works for lib_LTLIBRARIES but not custom shared libraries like DRI drivers or gallium state trackers, which need special casing (cf. dri mega drivers, for example) Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-29 12:58:13 -08:00
Emil Velikov	7965908976	loader: do not print the pci id during normal operation Spamming the pci id is not beneficial. Make sure it's printed only when needed. v2: Change severity to _LOADER_DEBUG, rather than removing the message. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-29 19:55:02 +00:00
Emil Velikov	780dfc1fec	loader: print WARNING and FATAL messages using the default logger Lower values are used for more severe cases. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-29 19:53:53 +00:00
Emil Velikov	4c35e32594	glsl: s/_NDEBUG/NDEBUG/ The former symbol is never defined within mesa. Based on the code it seems that the original intent was to use NDEBUG. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 19:52:35 +00:00
Kristian Høgsberg	e3afbe3ad7	dir-locals.el: Set indent-tabs-mode true for makefile-mode Makefiles need hard tabs, let's not make that harder than it needs to be. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-29 11:45:49 -08:00
Courtney Goeltzenleuchter	3e894e213b	mesa: Return after ScissorArrayv or ScissorIndexed detect a parameter error Fixes piglit arb_viewport_array-scissor-ignore. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jon Ashburn <jon@lunarg.com>	2014-01-29 09:40:02 -07:00
Ian Romanick	ca385bffa6	docs: Add GL_ARB_map_buffer_alignment status to GL3.txt and release notes Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:11:40 -07:00
Siavash Eliasi	7fd6ad7adc	mesa: GL_ARB_map_buffer_alignment is not optional Every driver supports it. All current and future Gallium drivers always support it, and all existing classic drivers support it. v2: Making GL_ARB_map_buffer_alignment a desktop OpenGL extension only. v3: Squash two commits together. v4 (idr): MIN_MAP_BUFFER_ALIGNMENT queries don't have any dependencies. In previous versions of the patch it depended on EXTRA_API_GL which would prevent the query from working in core profile contexts. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:11:39 -07:00
Siavash Eliasi	b9aaa96ec3	nouveau: Use gl_constants::MinMapBufferAlignment as the alignment in nouveau_bo_new This driver does not support GL_ARB_map_buffer_range, so no special treatment is needed for unaligned offsets in the mapping. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:11:39 -07:00
Siavash Eliasi	d38867d80c	radeon / r200: Use gl_constants::MinMapBufferAlignment as the alignment in radeon_bo_open These drivers do not support GL_ARB_map_buffer_range, so no special treatment is needed for unaligned offsets in the mapping. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:11:39 -07:00
Siavash Eliasi	f772d51c25	mesa: Use _mesa_align_malloc in _mesa_buffer_data v2: Fixed memory leak. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:11:39 -07:00
Siavash Eliasi	689b20cfe0	mesa: Set gl_constants::MinMapBufferAlignment to 64 by default Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:11:39 -07:00
Siavash Eliasi	6bb27ee51c	mesa/st: Unconditionally enable ARB_map_buffer_alignment. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:11:39 -07:00
Ian Romanick	25c14f40f3	freedreno: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64 Allocations actually have page alignment, but 64 is still a reasonable value. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-01-29 09:11:39 -07:00
Siavash Eliasi	205e624048	ilo: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64 Ian manually ran the map_buffer_range* tests and the arb_map_buffer_alignment-* tests, but he did not do a full piglit run. v2 (idr): Use 64 instead of 4096 Tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2014-01-29 09:11:39 -07:00
Siavash Eliasi	75081391a4	svga: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64 v2: Fixed setting switch cases prior to PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT incorrectly. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:11:39 -07:00
Siavash Eliasi	d273fe72df	i915g: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64 v2: Fixed setting switch cases prior to PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT incorrectly.	2014-01-29 09:11:39 -07:00
Siavash Eliasi	4329e99b23	i915g: Use alignment of 64 instead of 16 for buffer allocation Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:09:41 -07:00
Siavash Eliasi	809d3a7d25	llvmpipe: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64 v2: Fixed setting switch cases prior to PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT incorrectly. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:09:41 -07:00
Siavash Eliasi	6317664de0	llvmpipe: Use alignment of 64 instead of 16 for buffer allocation v2: Changed allocation alignment of llvmpipe_displaytarget_layout. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:09:41 -07:00
Siavash Eliasi	c83b34c43b	softpipe: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64 v2: Fixed setting switch cases prior to PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT incorrectly. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:09:41 -07:00
Siavash Eliasi	e36759a81e	softpipe: Use alignment of 64 instead of 16 for buffer allocation v2: Changed allocation alignment in softpipe_displaytarget_layout. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:09:41 -07:00
Stéphane Marchesin	023a50dd9b	i915g: support more PIPE_CAPs	2014-01-28 18:56:54 -08:00
Michel Dänzer	f8e16010e5	radeonsi: Put GS ring buffer descriptors with streamout buffer descriptors And mark the constant buffers as read only for the GPU again. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:09:26 +09:00
Michel Dänzer	d7c68e2dc1	radeonsi: Enable OpenGL 3.3 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:09:14 +09:00
Michel Dänzer	db9d6af862	radeonsi: Geometry shader micro-optimizations Move parameter loads out of loops, and use the instruction offset instead of a VGPR for the vertex attribute offset when writing to the ESGS ring buffer. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:09:04 +09:00
Michel Dänzer	3b3687adcb	radeonsi: We don't support indirect addressing of geometry shader inputs Fixes piglit spec/glsl-1.50/execution/geometry/dynamic_input_array_index Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:08:54 +09:00
Michel Dänzer	b4e14931a9	radeonsi: Pass VS resource descriptors to the HW ES shader stage as well This makes sure constants and samplers work in the vertex shader even when a geometry shader is active. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:08:43 +09:00
Michel Dänzer	67e385b3b7	radeonsi: Fix streamout from geometry shader Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:08:33 +09:00
Michel Dänzer	d88a375229	radeonsi: Simplify shader PM4 state handling Just always bind the current states before drawing. Besides the simplification, as a bonus this makes sure the VS hardware shader stage always uses the GS copy shader when a geometry shader is active, fixing a number of GS related piglit tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:08:21 +09:00
Michel Dänzer	e884c560a6	radeonsi: Properly match ES outputs to GS inputs Fixes piglit vs-gs-arrays-within-blocks-pass. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:08:10 +09:00
Michel Dänzer	e1df0d45c4	radeonsi: Really dump TGSI code before any TGSI->LLVM conversion attempt While we're at it, use the local variable 'sel'. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:07:58 +09:00
Michel Dänzer	7b19c391f4	radeonsi: Also export clip distances with geometry shader Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:07:48 +09:00
Michel Dänzer	8afde9fa23	radeonsi: Take GS into account for VS state in more places Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:07:35 +09:00
Michel Dänzer	28630713b2	radeonsi: Handle adjacency primitives Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:07:23 +09:00
Michel Dänzer	d8b3d806fc	radeonsi: Handle TGSI_SEMANTIC_PRIMID Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:07:11 +09:00
Michel Dänzer	7c7d7380f1	radeonsi: Generalize counting of shader parameters Now it covers ES->GS as well as VS->PS. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:06:58 +09:00
Michel Dänzer	f07a96dad1	radeonsi: Fix handling of geometry shader output vertex ID It needs to increment at shader runtime, not at shader compile time, as the geometry shader can emit vertices in loops. LLVM automagically converts the ID back to an immediate value if its value can be determined at compile time. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:06:45 +09:00
Michel Dänzer	404b29d765	radeonsi: Initial geometry shader support Partly based on the corresponding r600g work by Vadim Girlin and Dave Airlie. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:06:28 +09:00
Michel Dänzer	51f89a03e1	radeonsi: Refactor shader input / output handling code In preparation for adding geometry shader support. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:05:58 +09:00
Matt Turner	947c828d5c	i965/fs: Add a saturation propagation optimization pass. Transforms, for example, mul vgrf3, vgrf2, vgrf1 mov.sat vgrf4, vgrf3 into mul.sat vgrf3, vgrf2, vgrf1 mov vgrf4, vgrf3 which gives register_coalescing an opportunity to remove the MOV instruction. total instructions in shared programs: 1515039 -> 1504634 (-0.69%) instructions in affected programs: 798586 -> 788181 (-1.30%) GAINED: 0 LOST: 4 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-28 17:47:41 -08:00
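The transform described in the commit above can be sketched in a few lines. This is a minimal illustration and not the i965 implementation: the `Inst` record, the `SATURABLE_OPS` set, and the "no other reader" safety rule are assumptions made for the example (the driver's own check is the `can_do_saturate()` method added by the neighbouring commit).

```python
from dataclasses import dataclass

# Hypothetical set of opcodes allowed to carry a .sat modifier; the real
# driver answers this question through can_do_saturate().
SATURABLE_OPS = {"mul", "add", "mad", "mov"}


@dataclass
class Inst:
    op: str
    dst: str
    srcs: tuple
    sat: bool = False


def propagate_saturate(insts):
    """Move .sat from a MOV onto the instruction that produced its source."""
    for i, mov in enumerate(insts):
        if mov.op != "mov" or not mov.sat:
            continue
        src = mov.srcs[0]
        # Nearest earlier instruction that writes the MOV's source register.
        def_idx = next(
            (j for j in range(i - 1, -1, -1) if insts[j].dst == src), None
        )
        if def_idx is None or insts[def_idx].op not in SATURABLE_OPS:
            continue
        # Assumed safety rule: skip if anything else reads the unsaturated value.
        if any(src in other.srcs for k, other in enumerate(insts) if k != i):
            continue
        insts[def_idx].sat = True
        mov.sat = False
    return insts


prog = [
    Inst("mul", "vgrf3", ("vgrf2", "vgrf1")),
    Inst("mov", "vgrf4", ("vgrf3",), sat=True),
]
propagate_saturate(prog)
```

After the call, the `mul` carries the saturate flag and the `mov` is a plain copy, which is the shape that a later coalescing pass can remove.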
Matt Turner	39d7ec2c9a	i965: Add can_do_saturate() method to backend_instruction. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-28 17:47:41 -08:00
Anuj Phogat	3303475558	mesa: Generate correct error code in glDrawBuffers() OpenGL 3.3 spec expects GL_INVALID_OPERATION: "For both the default framebuffer and framebuffer objects, the constants FRONT, BACK, LEFT, RIGHT, and FRONT AND BACK are not valid in the bufs array passed to DrawBuffers, and will result in the error INVALID OPERATION." But OpenGL 4.0 spec changed the error code to GL_INVALID_ENUM: "For both the default framebuffer and framebuffer objects, the constants FRONT, BACK, LEFT, RIGHT, and FRONT_AND_BACK are not valid in the bufs array passed to DrawBuffers, and will result in the error INVALID_ENUM." This patch changes the behaviour to match OpenGL 4.0 spec Fixes Khronos OpenGL CTS draw_buffers_api.test. V2: Update the comment in code. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-28 15:30:55 -08:00
Dave Airlie	faee376869	loader: fix running with --disable-egl builds I sometimes build without EGL just for speed purposes, however it no longer finds my drivers when I do due to the HAVE_LIBUDEV defines being wrong. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-28 21:51:21 +00:00
Anuj Phogat	dc2f94bc78	i965: Ignore 'centroid' interpolation qualifier in case of persample shading I missed this change in commit `f5cfb4a`. It fixes the incorrect rendering caused in Dolphin Emulator. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73915 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Markus Wick <wickmarkus@web.de> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-28 13:32:20 -08:00
Matt Turner	10dc994e09	gbm: Make libgbm.so.1 symlink. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-28 07:29:14 -08:00
Kevin Rogovin	1db9ed6495	mesa: Allow depth = 0 parameter for TexImage3D. Fixes the tests for the depth parameter for TexImage3D calls when the target type is GL_TEXTURE_2D_ARRAY or GL_TEXTURE_CUBE_MAP_ARRAY so that a depth value of 0 is accepted. Previously, the check incorrectly required the depth argument to be at least 1. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-28 07:29:14 -08:00
Tom Stellard	7b4592a489	r600g,radeonsi: Don't set resource_create in r600_common_screen_init() r600g and radeonsi have different implementations of resource_create. https://bugs.freedesktop.org/show_bug.cgi?id=74139 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-28 07:24:11 -08:00
José Fonseca	f29968b270	c11: Add missing stdlib.h include. For malloc/free. Silences gcc mingw warnings.	2014-01-28 14:35:04 +00:00
Emil Velikov	61c825e862	loader: include dlfcn.h when building with HAVE_LIBUDEV The code depending on the definitions is already wrapped in the same conditional so go ahead and wrap the include. Otherwise we'll break compilation on platforms that are missing the header. Add assert.h in there as well, as it is introduced and used in the same fashion. Cc: Eric Anholt <eric@anholt.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74122 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-01-28 14:32:03 +00:00
José Fonseca	2eddf91faf	gallivm: Workaround http://llvm.org/PR18600 We have code generation paths that carry out swizzles of AoS vectors via bitwise shifts, as these tend to generate more efficient code than straightforward byte shuffles. But when the input is a constant the additional bitwise arithmetic operations somehow don't really get constant propagated properly, eventually causing assertion failure in InstCombine pass. Therefore avoid the bug by using the trivial shuffles for constant inputs. Although the sample LLVM IR can cause a crash with any LLVM version, this was only seen in practice with LLVM 3.2. Reviewed-by: Matthew McClure <mcclurem@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-28 14:27:27 +00:00
Matt Turner	37f1903e00	glsl: Avoid combining statements from different basic blocks. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74113 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 21:15:35 -08:00
Matt Turner	8e2b8bd0e6	glsl: Set proper swizzle when a channel is missing in vectorizing. Previously, for example if the x channel was missing from a series of assignments we were attempting to vectorize, the wrong swizzle mask would be applied. a.y = b.y; a.z = b.z; a.w = b.w; would be incorrectly transformed into a.yzw = b.xyz; Fixes two transform feedback tests in the ES3 conformance suite. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73978 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73954 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-27 21:15:35 -08:00
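The swizzle bug in the commit above is easy to reproduce in miniature. The sketch below is not the GLSL vectorizer; `combine` and `combine_naive` are hypothetical helpers that only model how the source swizzle should follow the channels actually written, rather than the first N channels.

```python
CHANNELS = "xyzw"


def combine(assignments):
    """Merge scalar assignments into one (write_mask, source_swizzle) pair.

    assignments is a list of (dst_channel, src_channel) pairs that all target
    the same destination and source vectors, e.g. a.y = b.y -> ("y", "y").
    """
    by_dst = dict(assignments)
    write_mask = "".join(c for c in CHANNELS if c in by_dst)
    # Pick the source channel recorded for each destination channel written.
    swizzle = "".join(by_dst[c] for c in write_mask)
    return write_mask, swizzle


def combine_naive(assignments):
    """Models the symptom: the swizzle is filled from channel x upwards."""
    by_dst = dict(assignments)
    write_mask = "".join(c for c in CHANNELS if c in by_dst)
    return write_mask, CHANNELS[: len(write_mask)]


scalar_ops = [("y", "y"), ("z", "z"), ("w", "w")]
fixed = combine(scalar_ops)        # a.yzw = b.yzw
broken = combine_naive(scalar_ops)  # a.yzw = b.xyz
```

With the x channel missing, the naive version reads `b.xyz` while the channel-indexed version reads `b.yzw`, matching the example in the commit message.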
Matt Turner	57109d57f8	glsl: Use bitfieldInsert in ldexp() lowering. Shaves a few instructions off of lowered ldexp(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-27 21:15:35 -08:00
Matt Turner	3ea64f9093	glsl: Add constant evaluation of ir_binop_bfm. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-27 21:15:35 -08:00
Matt Turner	c59a605c70	glcpp: Resolve implicit GLSL version to 100 if the API is ES. Fixes a regression since `b2d1c579` where ES shaders without a #version declaration would fail to compile if their precision declaration was wrapped in the standard #ifdef GL_ES check. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74066 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-27 21:15:35 -08:00
Matt Turner	3e0e9e3bf9	glcpp: Check version_resolved in the proper place. The check was in the wrong place, such that if a shader incorrectly put a preprocessor token before the #version declaration, the version would be resolved twice, leading to a segmentation fault when attempting to redefine the __VERSION__ macro. #extension GL_ARB_sample_shading: require #version 130 void main() {} Also, rename glcpp_parser_resolve_version to glcpp_parser_resolve_implicit_version to avoid confusion. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-27 21:15:35 -08:00
Michel Dänzer	a818bf481a	r600g: s/r600_llvm_gpu_string/r600_get_llvm_processor_name/ Fixes build failure introduced by commit `65dc588bfd` ('r600g,radeonsi: consolidate get_compute_param'), which consolidated the former into the latter.	2014-01-28 10:12:32 +09:00
Marek Olšák	7209703432	radeonsi: cleanup includes, add missing license Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:40:13 +01:00
Marek Olšák	2942124db8	radeonsi: remove open-coded PS_PARTIAL_FLUSH event Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:40:10 +01:00
Marek Olšák	8a4d7c296f	radeonsi: move some inline functions from si_pipe.h to si_state.c And si_tex_aniso_filter is unused. v2: remove INLINE occurrences Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:40:05 +01:00
Marek Olšák	530348680a	radeonsi: remove si_resource.h Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:40:04 +01:00
Marek Olšák	6e38a3de8a	radeonsi: remove si.h Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:40:02 +01:00
Marek Olšák	27a73a1b94	radeonsi: move si_upload_const_buffer to a better place This gets rid of another file. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:59 +01:00
Marek Olšák	9f5c037ab9	radeonsi: inline si_translate_index_buffer Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:57 +01:00
Marek Olšák	0932f0ff14	radeonsi: inline si_upload_index_buffer Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:53 +01:00
Marek Olšák	ed42e95404	r600g,radeonsi: consolidate remaining obviously duplicated pipe_screen code Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:50 +01:00
Marek Olšák	65dc588bfd	r600g,radeonsi: consolidate get_compute_param v2: added fprintf to r600_get_llvm_processor_name Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:48 +01:00
Marek Olšák	d41bd71bcf	r600g,radeonsi: consolidate get_paramf and get_video_param radeonsi now reports PIPE_VIDEO_CAP_SUPPORTS_PROGRESSIVE = true if UVD support isn't available. It's what all the other drivers do. Also, some #include directives were missing in radeon_uvd.h. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:46 +01:00
Marek Olšák	a4c218f398	r600g,radeonsi: consolidate variables for CS tracing Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:42 +01:00
Marek Olšák	ba0c16f7b2	r600g,radeonsi: consolidate get_timestamp, get_driver_query_info This enables more queries for the Gallium HUD with radeonsi. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:39 +01:00
Marek Olšák	4df3f25fa2	r600g,radeonsi: consolidate get_name and get_vendor queries Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:37 +01:00
Marek Olšák	f4612105e8	radeon: place context-related functions first in r600_pipe_common.c To follow the unwritten convention of r600g and radeonsi. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:27 +01:00
Marek Olšák	a9ae7635b7	r600g,radeonsi: consolidate the contents of r600_resource.c Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:25 +01:00
Marek Olšák	8739c60796	radeonsi: advertise the pipeline statistics query Implemented by the common code. You can now visualize the statistics with the HUD, see GALLIUM_HUD=help for all available queries. For example: GALLIUM_HUD=clipper-primitives-generated Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:15 +01:00
Marek Olšák	62d55c0a2d	radeonsi: use queries from r600g Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:10 +01:00
Marek Olšák	c53b8de335	r600g: remove a no-op while loop for (;;) { } while (); I was surprised to see such a statement. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:08 +01:00
Marek Olšák	aa90f17126	r600g: convert query emission code to radeon_emit Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:03 +01:00
Marek Olšák	dc76eea22c	r600g: only emit NOP relocations for queries if VM is disabled Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:38:59 +01:00
Marek Olšák	4e5c70e066	r600g: move queries to drivers/radeon Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:38:56 +01:00
Mark Mueller	f5bd5568ab	mesa: Fix Type A _INT formats to MESA_FORMAT naming standard Replace Type A _INT formats names with _SINT to match naming spec, and update type C formats as follows: s/MESA_FORMAT_R_INT8\b/MESA_FORMAT_R_SINT8/g s/MESA_FORMAT_R_INT16\b/MESA_FORMAT_R_SINT16/g s/MESA_FORMAT_R_INT32\b/MESA_FORMAT_R_SINT32/g s/MESA_FORMAT_RG_INT8\b/MESA_FORMAT_RG_SINT8/g s/MESA_FORMAT_RG_INT16\b/MESA_FORMAT_RG_SINT16/g s/MESA_FORMAT_RG_INT32\b/MESA_FORMAT_RG_SINT32/g s/MESA_FORMAT_RGB_INT8\b/MESA_FORMAT_RGB_SINT8/g s/MESA_FORMAT_RGB_INT16\b/MESA_FORMAT_RGB_SINT16/g s/MESA_FORMAT_RGB_INT32\b/MESA_FORMAT_RGB_SINT32/g s/MESA_FORMAT_RGBA_INT8\b/MESA_FORMAT_RGBA_SINT8/g s/MESA_FORMAT_RGBA_INT16\b/MESA_FORMAT_RGBA_SINT16/g s/MESA_FORMAT_RGBA_INT32\b/MESA_FORMAT_RGBA_SINT32/g s/\bMESA_FORMAT_RED_RGTC1\b/MESA_FORMAT_R_RGTC1_UNORM/g s/\bMESA_FORMAT_SIGNED_RED_RGTC1\b/MESA_FORMAT_R_RGTC1_SNORM/g s/\bMESA_FORMAT_RG_RGTC2\b/MESA_FORMAT_RG_RGTC2_UNORM/g s/\bMESA_FORMAT_SIGNED_RG_RGTC2\b/MESA_FORMAT_RG_RGTC2_SNORM/g s/\bMESA_FORMAT_L_LATC1\b/MESA_FORMAT_L_LATC1_UNORM/g s/\bMESA_FORMAT_SIGNED_L_LATC1\b/MESA_FORMAT_L_LATC1_SNORM/g s/\bMESA_FORMAT_LA_LATC2\b/MESA_FORMAT_LA_LATC2_UNORM/g s/\bMESA_FORMAT_SIGNED_LA_LATC2\b/MESA_FORMAT_LA_LATC2_SNORM/g	2014-01-27 14:34:04 -08:00
Mark Mueller	8b47b6bc32	mesa: Fix MESA_FORMAT names containing SIGNED Update comments. Replace format names containing SIGNED with SNORM appended w/decoration per the format name spec: s/MESA_FORMAT_SIGNED_R8\b/MESA_FORMAT_R_SNORM8/g s/MESA_FORMAT_SIGNED_RG88_REV\b/MESA_FORMAT_R8G8_SNORM/g s/MESA_FORMAT_SIGNED_RGBX8888\b/MESA_FORMAT_X8B8G8R8_SNORM/g s/MESA_FORMAT_SIGNED_RGBA8888\b/MESA_FORMAT_A8B8G8R8_SNORM/g s/MESA_FORMAT_SIGNED_RGBA8888_REV\b/MESA_FORMAT_R8G8B8A8_SNORM/g s/MESA_FORMAT_SIGNED_R16\b/MESA_FORMAT_R_SNORM16/g s/MESA_FORMAT_SIGNED_GR1616\b/MESA_FORMAT_R16G16_SNORM/g s/MESA_FORMAT_SIGNED_RGB_16\b/MESA_FORMAT_RGB_SNORM16/g s/MESA_FORMAT_SIGNED_RGBA_16\b/MESA_FORMAT_RGBA_SNORM16/g s/MESA_FORMAT_SIGNED_A8\b/MESA_FORMAT_A_SNORM8/g s/MESA_FORMAT_SIGNED_I8\b/MESA_FORMAT_I_SNORM8/g s/MESA_FORMAT_SIGNED_L8\b/MESA_FORMAT_L_SNORM8/g s/MESA_FORMAT_SIGNED_A16\b/MESA_FORMAT_A_SNORM16/g s/MESA_FORMAT_SIGNED_I16\b/MESA_FORMAT_I_SNORM16/g s/MESA_FORMAT_SIGNED_L16\b/MESA_FORMAT_L_SNORM16/g s/MESA_FORMAT_SIGNED_AL88\b/MESA_FORMAT_L8A8_SNORM/g s/MESA_FORMAT_SIGNED_RG88\b/MESA_FORMAT_G8R8_SNORM/g s/MESA_FORMAT_SIGNED_RG1616\b/MESA_FORMAT_G16R16_SNORM/g	2014-01-27 14:33:29 -08:00
Mark Mueller	2e02e195fe	mesa: Fix MESA_FORMAT names with ALPH, INTENSITY, and LUMINANCE Compressed spelled out color components ALPHA, INTENSITY, and LUMINANCE to A, I, and L: s/MESA_FORMAT_ALPHA_UINT8\b/MESA_FORMAT_A_UINT8/g' s/MESA_FORMAT_ALPHA_UINT16\b/MESA_FORMAT_A_UINT16/g' s/MESA_FORMAT_ALPHA_UINT32\b/MESA_FORMAT_A_UINT32/g' s/MESA_FORMAT_ALPHA_INT32\b/MESA_FORMAT_A_SINT32/g' s/MESA_FORMAT_ALPHA_INT16\b/MESA_FORMAT_A_SINT16/g' s/MESA_FORMAT_ALPHA_INT8\b/MESA_FORMAT_A_SINT8/g' s/MESA_FORMAT_INTENSITY_UINT8\b/MESA_FORMAT_I_UINT8/g' s/MESA_FORMAT_INTENSITY_UINT16\b/MESA_FORMAT_I_UINT16/g' s/MESA_FORMAT_INTENSITY_UINT32\b/MESA_FORMAT_I_UINT32/g' s/MESA_FORMAT_INTENSITY_INT32\b/MESA_FORMAT_I_SINT32/g' s/MESA_FORMAT_INTENSITY_INT16\b/MESA_FORMAT_I_SINT16/g' s/MESA_FORMAT_INTENSITY_INT8\b/MESA_FORMAT_I_SINT8/g' s/MESA_FORMAT_LUMINANCE_UINT8\b/MESA_FORMAT_L_UINT8/g' s/MESA_FORMAT_LUMINANCE_UINT16\b/MESA_FORMAT_L_UINT16/g' s/MESA_FORMAT_LUMINANCE_UINT32\b/MESA_FORMAT_L_UINT32/g' s/MESA_FORMAT_LUMINANCE_INT32\b/MESA_FORMAT_L_SINT32/g' s/MESA_FORMAT_LUMINANCE_INT16\b/MESA_FORMAT_L_SINT16/g' s/MESA_FORMAT_LUMINANCE_INT8\b/MESA_FORMAT_L_SINT8/g' s/MESA_FORMAT_LUMINANCE_ALPHA_UINT8\b/MESA_FORMAT_LA_UINT8/g' s/MESA_FORMAT_LUMINANCE_ALPHA_UINT16\b/MESA_FORMAT_LA_UINT16/g' s/MESA_FORMAT_LUMINANCE_ALPHA_UINT32\b/MESA_FORMAT_LA_UINT32/g' s/MESA_FORMAT_LUMINANCE_ALPHA_INT32\b/MESA_FORMAT_LA_SINT32/g' s/MESA_FORMAT_LUMINANCE_ALPHA_INT16\b/MESA_FORMAT_LA_SINT16/g' s/MESA_FORMAT_LUMINANCE_ALPHA_INT8\b/MESA_FORMAT_LA_SINT8/g' s/MESA_FORMAT_ALPHA_FLOAT16\b/MESA_FORMAT_A_FLOAT16/g' s/MESA_FORMAT_ALPHA_FLOAT32\b/MESA_FORMAT_A_FLOAT32/g' s/MESA_FORMAT_INTESITY_FLOAT16\b/MESA_FORMAT_I_FLOAT16/g' s/MESA_FORMAT_INTESITY_FLOAT32\b/MESA_FORMAT_I_FLOAT32/g' s/MESA_FORMAT_INTENSITY_FLOAT16\b/MESA_FORMAT_I_FLOAT16/g' s/MESA_FORMAT_INTENSITY_FLOAT32\b/MESA_FORMAT_I_FLOAT32/g' s/MESA_FORMAT_LUMINANCE_FLOAT16\b/MESA_FORMAT_L_FLOAT16/g' s/MESA_FORMAT_LUMINANCE_FLOAT32\b/MESA_FORMAT_L_FLOAT32/g' 
s/MESA_FORMAT_LUMINANCE_ALPHA_FLOAT16\b/MESA_FORMAT_LA_FLOAT16/g' s/MESA_FORMAT_LUMINANCE_ALPHA_FLOAT32\b/MESA_FORMAT_LA_FLOAT32/g'	2014-01-27 14:32:41 -08:00
Mark Mueller	eeed49f5f2	mesa: Change many Type P MESA_FORMATs to meet naming spec Conversion of Type P formats as follows (w/related comment fixes): s/MESA_FORMAT_RGB565\b/MESA_FORMAT_B5G6R5_UNORM/g s/MESA_FORMAT_RGB565_REV\b/MESA_FORMAT_R5G6B5_UNORM/g s/MESA_FORMAT_ARGB4444\b/MESA_FORMAT_B4G4R4A4_UNORM/g s/MESA_FORMAT_ARGB4444_REV\b/MESA_FORMAT_A4R4G4B4_UNORM/g s/MESA_FORMAT_RGBA5551\b/MESA_FORMAT_A1B5G5R5_UNORM/g s/MESA_FORMAT_XBGR8888_SNORM\b/MESA_FORMAT_R8G8B8X8_SNORM/g s/MESA_FORMAT_XBGR8888_SRGB\b/MESA_FORMAT_R8G8B8X8_SRGB/g s/MESA_FORMAT_ARGB1555\b/MESA_FORMAT_B5G5R5A1_UNORM/g s/MESA_FORMAT_ARGB1555_REV\b/MESA_FORMAT_A1R5G5B5_UNORM/g s/MESA_FORMAT_AL44\b/MESA_FORMAT_L4A4_UNORM/g s/MESA_FORMAT_RGB332\b/MESA_FORMAT_B2G3R3_UNORM/g s/MESA_FORMAT_ARGB2101010\b/MESA_FORMAT_B10G10R10A2_UNORM/g s/MESA_FORMAT_Z24_S8\b/MESA_FORMAT_S8_UINT_Z24_UNORM/g s/MESA_FORMAT_S8_Z24\b/MESA_FORMAT_Z24_UNORM_S8_UINT/g s/MESA_FORMAT_X8_Z24\b/MESA_FORMAT_Z24_UNORM_X8_UINT/g s/MESA_FORMAT_Z24_X8\b/MESA_FORMAT_X8Z24_UNORM/g s/MESA_FORMAT_RGB9_E5_FLOAT\b/MESA_FORMAT_R9G9B9E5_FLOAT/g s/MESA_FORMAT_R11_G11_B10_FLOAT\b/MESA_FORMAT_R11G11B10_FLOAT/g s/MESA_FORMAT_Z32_FLOAT_X24S8\b/MESA_FORMAT_Z32_FLOAT_S8X24_UINT/g s/MESA_FORMAT_ABGR2101010_UINT\b/MESA_FORMAT_R10G10B10A2_UINT/g s/MESA_FORMAT_XRGB4444_UNORM\b/MESA_FORMAT_B4G4R4X4_UNORM/g s/MESA_FORMAT_XRGB1555_UNORM\b/MESA_FORMAT_B5G5R5X1_UNORM/g s/MESA_FORMAT_XRGB2101010_UNORM\b/MESA_FORMAT_B10G10R10X2_UNORM/g s/MESA_FORMAT_AL88\b/MESA_FORMAT_L8A8_UNORM/g s/MESA_FORMAT_AL88_REV\b/MESA_FORMAT_A8L8_UNORM/g s/MESA_FORMAT_AL1616\b/MESA_FORMAT_L16A16_UNORM/g s/MESA_FORMAT_AL1616_REV\b/MESA_FORMAT_A16L16_UNORM/g s/MESA_FORMAT_RG88\b/MESA_FORMAT_G8R8_UNORM/g s/MESA_FORMAT_GR88\b/MESA_FORMAT_R8G8_UNORM/g s/MESA_FORMAT_GR1616\b/MESA_FORMAT_R16G16_UNORM/g s/MESA_FORMAT_RG1616\b/MESA_FORMAT_G16R16_UNORM/g s/MESA_FORMAT_SRGBA8\b/MESA_FORMAT_A8B8G8R8_SRGB/g s/MESA_FORMAT_SARGB8\b/MESA_FORMAT_B8G8R8A8_SRGB/g s/MESA_FORMAT_SLA8\b/MESA_FORMAT_L8A8_SRGB/g 
Conflicts: src/mesa/drivers/dri/i965/brw_surface_formats.c src/mesa/main/format_pack.c src/mesa/main/format_unpack.c src/mesa/main/formats.c src/mesa/main/texformat.c src/mesa/main/texstore.c	2014-01-27 14:31:55 -08:00
Mark Mueller	50a01d2aca	mesa: Change many Type A MESA_FORMATs to meet naming standard Update comments. Conversion of the following Type A formats: s/MESA_FORMAT_RGB888\b/MESA_FORMAT_BGR_UNORM8/g s/MESA_FORMAT_BGR888\b/MESA_FORMAT_RGB_UNORM8/g s/MESA_FORMAT_A8\b/MESA_FORMAT_A_UNORM8/g s/MESA_FORMAT_A16\b/MESA_FORMAT_A_UNORM16/g s/MESA_FORMAT_L8\b/MESA_FORMAT_L_UNORM8/g s/MESA_FORMAT_L16\b/MESA_FORMAT_L_UNORM16/g s/MESA_FORMAT_I8\b/MESA_FORMAT_I_UNORM8/g s/MESA_FORMAT_I16\b/MESA_FORMAT_I_UNORM16/g s/MESA_FORMAT_R8\b/MESA_FORMAT_R_UNORM8/g s/MESA_FORMAT_R16\b/MESA_FORMAT_R_UNORM16/g s/MESA_FORMAT_Z16\b/MESA_FORMAT_Z_UNORM16/g s/MESA_FORMAT_Z32\b/MESA_FORMAT_Z_UNORM32/g s/MESA_FORMAT_S8\b/MESA_FORMAT_S_UINT8/g s/MESA_FORMAT_SRGB8\b/MESA_FORMAT_BGR_SRGB8/g s/MESA_FORMAT_RGBA_16\b/MESA_FORMAT_RGBA_UNORM16/g s/MESA_FORMAT_SL8\b/MESA_FORMAT_L_SRGB8/g s/MESA_FORMAT_Z32_FLOAT\b/MESA_FORMAT_Z_FLOAT32/g s/MESA_FORMAT_XBGR16161616_UNORM\b/MESA_FORMAT_RGBX_UNORM16/g s/MESA_FORMAT_XBGR16161616_SNORM\b/MESA_FORMAT_RGBX_SNORM16/g s/MESA_FORMAT_XBGR16161616_FLOAT\b/MESA_FORMAT_RGBX_FLOAT16/g s/MESA_FORMAT_XBGR16161616_UINT\b/MESA_FORMAT_RGBX_UINT16/g s/MESA_FORMAT_XBGR16161616_SINT\b/MESA_FORMAT_RGBX_SINT16/g s/MESA_FORMAT_XBGR32323232_FLOAT\b/MESA_FORMAT_RGBX_FLOAT32/g s/MESA_FORMAT_XBGR32323232_UINT\b/MESA_FORMAT_RGBX_UINT32/g s/MESA_FORMAT_XBGR32323232_SINT\b/MESA_FORMAT_RGBX_SINT32/g s/MESA_FORMAT_XBGR8888_UINT\b/MESA_FORMAT_RGBX_UINT8/g s/MESA_FORMAT_XBGR8888_SINT\b/MESA_FORMAT_RGBX_SINT8/g	2014-01-27 14:30:50 -08:00
Mark Mueller	ef145ba4de	mesa: Rename 4 color component unsigned byte MESA_FORMATs Change all 4 color component unsigned byte formats to meet spec for P Type formats: s/MESA_FORMAT_RGBA8888\b/MESA_FORMAT_A8B8G8R8_UNORM/g s/MESA_FORMAT_RGBA8888_REV\b/MESA_FORMAT_R8G8B8A8_UNORM/g s/MESA_FORMAT_ARGB8888\b/MESA_FORMAT_B8G8R8A8_UNORM/g s/MESA_FORMAT_ARGB8888_REV\b/MESA_FORMAT_A8R8G8B8_UNORM/g s/MESA_FORMAT_RGBX8888\b/MESA_FORMAT_X8B8G8R8_UNORM/g s/MESA_FORMAT_RGBX8888_REV\b/MESA_FORMAT_R8G8B8X8_UNORM/g s/MESA_FORMAT_XRGB8888\b/MESA_FORMAT_B8G8R8X8_UNORM/g s/MESA_FORMAT_XRGB8888_REV\b/MESA_FORMAT_X8R8G8B8_UNORM/g	2014-01-27 14:29:13 -08:00
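The trailing `\b` in these substitutions matters because several old names are prefixes of others (`MESA_FORMAT_RGBA8888` versus `MESA_FORMAT_RGBA8888_REV`). Since `_` counts as a word character, the boundary does not match before `_REV`, so the shorter pattern leaves the longer name alone. The sketch below replays two of the renames from the commit above with Python's `re` module, which treats `\b` the same way here; the `RENAMES` table and `rename_formats` helper are illustrative and not part of Mesa.

```python
import re

# Two of the substitutions listed in the commit above.
RENAMES = [
    (r"MESA_FORMAT_RGBA8888\b", "MESA_FORMAT_A8B8G8R8_UNORM"),
    (r"MESA_FORMAT_RGBA8888_REV\b", "MESA_FORMAT_R8G8B8A8_UNORM"),
]


def rename_formats(text):
    """Apply each rename in order, as a chain of sed expressions would."""
    for pattern, replacement in RENAMES:
        text = re.sub(pattern, replacement, text)
    return text


before = "case MESA_FORMAT_RGBA8888: case MESA_FORMAT_RGBA8888_REV:"
after = rename_formats(before)
```

Dropping the `\b` from the first pattern would rewrite the `_REV` name to `MESA_FORMAT_A8B8G8R8_UNORM_REV`, which is why every expression in the series ends with it.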
Mark Mueller	71fe943716	mesa: change gl_format to mesa_format s/\bgl_format\b/mesa_format/g. Use better name for Mesa Formats enum	2014-01-27 14:28:46 -08:00
Ian Romanick	bc0ed68275	docs: Update GL3.txt due to recent work v2: Note that Fredrik Höglund is working on GL_ARB_multi_bind, not Maxence Le Doré. Suggested by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-27 14:35:19 -07:00
Ian Romanick	6901c278ca	glcpp: Make sure GL_AMD_shader_trinary_minmax is defined The define was only available if gl_extensions::AMD_shader_trinary_minmax was set, but no driver set it. Since the extension is advertised by default, remove that field too. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: Maxence Le Doré <maxence.ledore@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-27 14:28:24 -07:00
Ian Romanick	764be9f9e8	mesa: Clean up bad code formatting left from previous commit Also s/_EXT// on enums that are now part of core. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 14:21:43 -07:00
Ian Romanick	a6729731af	mesa: GL_EXT_framebuffer_blit is not optional Every driver supports it. All current and future Gallium drivers always support it, and all existing classic drivers support it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 14:21:43 -07:00
Ian Romanick	71cc510ef6	radeon: Enable GL_EXT_framebuffer_blit The dd_function_table::BlitFramebuffer is already initialized to _mesa_meta_BlitFramebuffer, so it should just work. Tested on a Radeon 7500 (OpenGL renderer string: Mesa DRI R100 (RV200 5157) TCL DRI2). I couldn't do a full piglit run because it would tank the system with or without this patch. I just ran all the blit tests (-t blit to piglit-run.py). Only fbo-sys-sub-blit failed. All of the other tests that weren't skipped (i.e., all the multisample and sRGB tests skip) passed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 14:21:43 -07:00
Ian Romanick	bed51a4858	r200: Enable GL_EXT_framebuffer_blit The dd_function_table::BlitFramebuffer is already initialized to _mesa_meta_BlitFramebuffer, so it should just work. Tested on a FireGL 8800 (OpenGL renderer string: Mesa DRI R200 (R200 5148) TCL DRI). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 14:21:43 -07:00
Ian Romanick	33214679bb	radeon / r200: Pass the API into _mesa_initialize_context Otherwise an application that requested an OpenGL ES 1.x context would actually get a desktop OpenGL context. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 14:21:43 -07:00
Ian Romanick	af0b34783e	mesa: Validate internalFormat with target in glTexStorage paths Fixes the glTexStorage3D failure in ext_packed_depth_stencil-depth-stencil-texture and oes_packed_depth_stencil-depth-stencil-texture_gles2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 14:21:43 -07:00
Ian Romanick	421b5958eb	mesa: Refactor internalFormat / target checks to a separate function We need almost identical code in the glTexStorage path. v2: Fix typo in a comment noticed by Topi. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 14:21:42 -07:00
Ian Romanick	88db6ad7db	mesa: Generate the correct error for a depth format with a 3D texture All versions of the OpenGL spec are quite clear that GL_INVALID_OPERATION should be generated. I added a quotation from the 3.3 core profile spec. Fixes the glTexImage3D subcases of ext_packed_depth_stencil-depth-stencil-texture and oes_packed_depth_stencil-depth-stencil-texture_gles2. The same subtests of oes_packed_depth_stencil-depth-stencil-texture_gles1 fail, but they fail with a different wrong error code. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 14:21:42 -07:00
Matt Turner	3f3aafbfee	glx: Update glxext.h to revision 24777. It readds the GLXContextID typedef, but under #ifndef GLX_VERSION_1_3 and glx.h already defines GLX_VERSION_1_3. Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=11454 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-27 09:57:12 -08:00
Emil Velikov	a6031a82f9	loader: Add missing \n on message printing Cover both loader and glx/dri_glx Drop \n from the default loader logger Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-27 09:37:29 -08:00
Eric Anholt	867d7c0e10	dri: Reuse dri_message to implement our other message handlers. Reviewed-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-01-27 09:37:29 -08:00
Eric Anholt	4a8da40fc0	dri: Fix the logger error message handling. Since the loader changes, there has been a compiler warning that the prototype didn't match. It turns out that if a loader error message was ever thrown, you'd segfault because of trying to use the warning level as a format string. Reviewed-by: Keith Packard <keithp@keithp.com> Tested-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-01-27 09:37:29 -08:00
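The prototype mismatch described in `4a8da40fc0` can be illustrated with a minimal sketch. It assumes a logger callback that receives a severity level followed by a printf-style format string; `log_message`, `last_message`, and the `log_level` enum are hypothetical names, not Mesa's actual loader API. The point is only that the handler must accept the level as its first argument so that the format string is never read from the wrong slot.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical severity levels; not Mesa's actual constants. */
enum log_level { LOG_FATAL = 0, LOG_WARNING = 1, LOG_INFO = 2 };

/* Last formatted message, kept so that a caller can inspect it. */
char last_message[256];

/*
 * Handler whose prototype matches what the caller passes: the level
 * comes first, then the printf-style format string and its arguments.
 * A handler declared as (const char *fmt, ...) would instead treat the
 * level as the format string.
 */
void
log_message(int level, const char *fmt, ...)
{
   va_list args;
   size_t used;

   /* Prefix the message with the numeric level. */
   used = (size_t) snprintf(last_message, sizeof(last_message),
                            "[%d] ", level);

   va_start(args, fmt);
   vsnprintf(last_message + used, sizeof(last_message) - used, fmt, args);
   va_end(args);
}
```

Calling `log_message(LOG_WARNING, "failed to open %s", "card0")` leaves `[1] failed to open card0` in `last_message`.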
Eric Anholt	7bd95ec437	dri2: Trust our own driver name lookup over the server's. This allows Mesa to choose to rename driver .sos (or split drivers), without needing a flag day with the corresponding 2D driver. v2: Undo the loader-only-for-dri3 change. Reviewed-by: Keith Packard <keithp@keithp.com> [v1] Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> [v1]	2014-01-27 09:37:10 -08:00
Eric Anholt	be7a6976a8	dri2: Open the fd before loading the driver. I want to stop trusting the server for the driver name, and instead decide on our own based on the fd, so I needed this code motion. Reviewed-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-01-27 09:36:24 -08:00
Eric Anholt	378e7ad26f	dri3: Fix two little memory leaks. Noticed when valgrinding an unrelated bug. Reviewed-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-01-27 09:36:24 -08:00
Eric Anholt	4556c73470	loader: Use dlsym to get our udev symbols instead of explicit linking. Steam links against libudev.so.0, while we're linking against libudev.so.1. The result is that the symbol names (which are the same in the two libraries) end up conflicting, and some of the usage of .so.1 calls the .so.0 bits, which have different internal structures, and segfaults happen. By using a dlopen() with RTLD_LOCAL, we can explicitly look for the symbols we want, while they get the symbols they want. Reviewed-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2014-01-27 09:36:24 -08:00
Tom Stellard	d51dbe048a	r600g/compute: Emit DEALLOC_STATE on cayman after dispatching a compute shader. This is necessary to prevent the next SURFACE_SYNC packet from hanging the GPU. https://bugs.freedesktop.org/show_bug.cgi?id=73418 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>	2014-01-27 11:09:15 -05:00
Ilia Mirkin	3518606c14	docs: sync up nv50/nvc0 status on GL4.x extensions Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	59e334194b	docs: update GL3.txt, relnotes to reflect current nv50/nvc0 status Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	839bd3cff7	nv50, nvc0: update reported glsl version to 330 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Christoph Bumiller	3efed4cd05	mesa/st: expose ARB_texture_rgb10_a2ui if R10G10B10A2_UINT is supported v2 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-27 16:40:43 +01:00
Christoph Bumiller	c7b14ba23f	nv50: add more RGB10A2 formats	2014-01-27 16:40:43 +01:00
Christoph Bumiller	f3bd2bc7b2	st/mesa: fix GS varyings for PIPE_CAP_TGSI_TEXCOORD	2014-01-27 16:40:43 +01:00
Ilia Mirkin	dc8da4c29b	nv50: enable seamless cube maps on all hw Some of the hardware support is missing. The NVIDIA-provided driver, which claims seamless cube map support fails the relevant tests as well. As this is the last extension before we can have OpenGL 3.2, doing this allows us to expose geometry shaders without doing the additional work involved in supporting ARB_geometry_shader4. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	b9b7cfbabf	nv50: report glsl 1.50 now that gp tests pass Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	3bd40073b9	nv50: add support for texelFetch'ing MS textures, ARB_texture_multisample Creates two areas in the AUX constbuf: - Sample offsets for MS textures - Per-texture MS settings When executing a texelFetch with a MS sampler, looks up that texture's settings and adjusts the parameters given to the texfetch instruction. With this change, all the ARB_texture_multisample piglits pass, so turn on PIPE_CAP_TEXTURE_MULTISAMPLE. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	a6cf950ba2	nv50: copy nvc0's get_sample_position implementation Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	b87f5abd21	nv50: add comments about CB_AUX contents Updates a few inconsistencies as well, like the size of the buffer, location of the runout, etc. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	250e7c835e	nvc0: don't forget to also clear additional layers Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	e3247355cc	nv50: don't forget to also clear additional layers Fixes most of the tests/spec/gl-3.2/layered-rendering/* piglits. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	d98b85b507	nv50: allocate an extra code bo to avoid dmesg spam Each code BO is a heap that allocates at the end first, and so GPs are allocated at the very end of the allocated space. When executing, we see PAGE_NOT_PRESENT errors for the next page. Just over-allocate to make sure that there's something there. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	58589f6c6d	nv50: GP_REG_ALLOC_RESULT must be positive Set max_out to 1 when there are no outputs. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	006095b38a	nv50: VP_RESULT_MAP_SIZE has to be positive Make sure that we never try to use a 0-sized map. This can happen when using a gp, so add a dummy mapping when computing vp_gp_mapping in that case. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	c4adbd5a57	nv50: enable primitive id generation when it is an FP input without GP Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	70a07ac352	nv50: handle gl_Layer writes in GP Marks gl_Layer as only having one component, and makes sure to keep track of where it is and emit it in the output map, since it is not an input to the FP. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	7c624148a6	nv50: properly set the PRIMITIVE_ID enable flag when it is a gp input. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	6f3219a8f3	nv50/ir: add support for gl_PrimitiveIDIn Note that the primitive id is stored in a[0x18], while usually the geometry instructions are of the form a[$a1 + 0x4] which gets mapped to p[] space. We need to avoid the change from a[] to p[] here, so it's keyed on whether the access is indirect or not. Note that there's also a use-case for accessing e.g. a[$r1], however that's not supported for now. (Could be added by checking the register file of the indirect parameter.) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	f77069419a	nv50/ir: fix support for shader input + immediate in gp This only works for up to $a3, hopefully we won't go that high. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	45b7f1701e	nv50/ir: disallow shader input + cbuf in same instruction in gp Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	42dc414cc6	nv50/ir: disallow predicates on emit/restart ops	2014-01-27 16:40:42 +01:00
Ilia Mirkin	20929963d3	nv50: allow vert_count to be >255 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Bryan Cain	02b317a0d6	nv50: add support for geometry shaders Layer output probably doesn't work yet, but other than that everything seems to be working. Signed-off-by: Bryan Cain <bryancain3@gmail.com> [calim: fix up minor bugs, code formatting] Signed-off-by: Christoph Bumiller <e0425955@student.tuwien.ac.at> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Bryan Cain	b3f82e1a63	nv50/ir: delay calculation of indirect addresses Instead of emitting an SHL 4 to an address register on the TGSI ARL and UARL instructions, emit the shift when the loaded address is actually used. This is necessary because input vertex and attribute indices in geometry shaders on nv50 need to be shifted left by 2 instead of 4. Signed-off-by: Bryan Cain <bryancain3@gmail.com> [calim: various updates to the indirect address logic] Signed-off-by: Christoph Bumiller <e0425955@student.tuwien.ac.at> [imirkin: remove OP_MAD change that calim made, add OP_RESTART handling same as OP_EMIT for code flow analysis] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Christoph Bumiller	67250acbab	nv50/ir: fix PFETCH and add RDSV to get VSTRIDE for GPs	2014-01-27 16:40:42 +01:00
Ilia Mirkin	2689b59cab	nv50/ir: txg not available on nvaa/nvac Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	e05de038bf	nv50, nvc0: only clear out the buffers that we were asked to clear Fixes fbo-drawbuffers-none glClearBuffer piglit test. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	c75eeab609	nv50, nvc0: clear out RT on a null cbuf This is needed since commit `9baa45f78b` (st/mesa: bind NULL colorbuffers as specified by glDrawBuffers). This implementation is highly based on a larger commit by Christoph Bumiller <e0425955@student.tuwien.ac.at> in his gallium-nine branch. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	3f264e16e2	nv50: don't leak heap on tls alloc failure Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	18d97a8df7	nouveau/codegen: set dType to S32 for OP_NEG U32 It doesn't make sense to do an OP_NEG from U32 to U32. This was manifested on nv50 in glsl-fs-atan-3 which was generating a UMAD TEMP[0].x, TEMP[0].xxxx, -TEMP[5].xxxx, TEMP[0].xxxx instruction. (For some reason, nvc0 causes a different shader to be generated.) This led to a `cvt neg u32 $r1 u32 $r1`, which did not yield the desired result. This changes the final output to `cvt neg s32 $r1 u32 $r1`, which produces the desired output, and the piglit test passes. My assumption is that this is also what we want on nvc0, but could not test as there was no suitable shader that generated the problem instruction. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	45b64e52f4	util/u_vbuf: correct map offset calculation for crazy offsets When the min_index is very large (or very negative), the multiplication can overflow 32 bits and result in an incorrect map pointer modification. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	3de97ce920	translate: deal with size overflows by casting to ptrdiff_t This was discovered as a result of the draw-elements-base-vertex-neg piglit test, which passes very negative offsets in, followed up by large indices. The nouveau code correctly adjusts the pointer, but the translate code needs to do the proper inverse correction. Similarly fix up the SSE code to do a 64-bit multiply to compute the proper offset. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-27 16:40:42 +01:00
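The overflow addressed by `45b64e52f4` and `3de97ce920` can be sketched as follows. This is an illustration under stated assumptions, not the actual u_vbuf or translate code: `offset_32bit` and `offset_wide` are hypothetical helpers, and the sketch assumes a platform where `ptrdiff_t` is 64 bits wide. The first helper performs the multiply in 32 bits and wraps; the second widens both operands to `ptrdiff_t` before multiplying.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: the product is computed in 32 bits, so it wraps
 * once stride * index no longer fits. Unsigned arithmetic is used so
 * that the wrap-around is well defined in C.
 */
int32_t
offset_32bit(uint32_t stride, int32_t index)
{
   return (int32_t) (stride * (uint32_t) index);
}

/*
 * Hypothetical helper: both operands are widened to ptrdiff_t first,
 * so the full product survives (assuming a 64-bit ptrdiff_t).
 */
ptrdiff_t
offset_wide(uint32_t stride, int32_t index)
{
   return (ptrdiff_t) stride * (ptrdiff_t) index;
}
```

With a stride of 16 and an index of 0x10000000, the true offset is 2^32: the 32-bit version wraps to 0, while the widened version keeps the full value. The same holds for the negated index.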
Emil Velikov	4dd445f1cf	gallium/rtasm: handle mmap failures appropriately mmap can fail for a variety of reasons (selinux and pax to name a few), and with the current code this will result in a crash in the driver, if not worse. This has been the case since the inception of the gallium copy of rtasm. Cc: 9.1 9.2 10.0 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73473 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-01-27 13:24:51 +00:00
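A minimal sketch of the kind of check `4dd445f1cf` describes, assuming a POSIX system. `map_code_pool` is a hypothetical name, and the sketch requests a read/write mapping rather than an executable one so that it runs under restrictive policies. The only point it makes is that `mmap` reports failure with `MAP_FAILED`, not `NULL`, and the caller has to translate that before using the pointer.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std modes */
#endif

#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map an anonymous pool and return NULL on
 * failure, so callers can test the result like any other allocation.
 */
void *
map_code_pool(size_t size)
{
   void *pool = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

   /* mmap signals failure with MAP_FAILED, which is not NULL. */
   if (pool == MAP_FAILED)
      return NULL;

   return pool;
}
```

A caller would check the result for `NULL` and fall back or report an error instead of dereferencing it.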
Alexander von Gluck IV	e5e4120723	haiku: change atomic int to non-volatile * Our atomic calls changed recently and no longer want atomic int pointers to be volatile * Spellcheck	2014-01-26 18:56:05 -06:00
Kenneth Graunke	07149f0252	i965: Don't store qpitch / 4 as mt->qpitch for compressed surfaces. Broadwell requires software to specify QPitch in a bunch of packets, so we decided to store it in the miptree. However, when I did that refactoring, I missed a subtlety: the hardware expects QPitch to be "in units of rows in the uncompressed surface". This is the value we originally compute. However, for compressed surfaces, we then divided it by 4 (the block height), to obtain the physical layout. This is no longer the QPitch Broadwell expects. So, store the original undivided value in mt->qpitch, but continue to use the divided value in brw_miptree_layout_texture_array(). For non-Broadwell platforms, this should have no impact at all. Helps fix Piglit's "getteximage-targets S3TC CUBE" test on Broadwell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-25 19:20:17 -08:00
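The distinction drawn in `07149f0252` can be sketched with a little arithmetic. The struct and function below are hypothetical and much simpler than the i965 miptree code: the stored `qpitch` stays in rows of the uncompressed surface, and the physical row count is derived by dividing by the block height only where the layout needs it.

```c
/* Hypothetical, simplified stand-in for a miptree. */
struct miptree_sketch {
   unsigned qpitch;        /* rows of the uncompressed surface */
   unsigned block_height;  /* 4 for S3TC-style blocks, 1 if uncompressed */
};

/*
 * Rows between array slices in the physical layout. The division is
 * done at the point of use, so the stored qpitch keeps the units the
 * hardware expects.
 */
unsigned
physical_slice_rows(const struct miptree_sketch *mt)
{
   return mt->qpitch / mt->block_height;
}
```

For a compressed surface with a QPitch of 64 rows and a block height of 4, the stored value remains 64 while the layout code works with 16.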
Vinson Lee	a487b4d0e3	c11: Do not use pthread_mutex_timedlock on NetBSD. This patch fixes the NetBSD build. NetBSD does not have pthread_mutex_timedlock. CC glapi_dispatch.lo threads_posix.h: In function 'mtx_timedlock': threads_posix.h:216:5: error: implicit declaration of function 'pthread_mutex_timedlock' Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-01-24 18:20:42 -08:00
Kenneth Graunke	6709f0549f	glsl: Simplify built-in generator functions for min3/max3/mid3. The types of all three parameters are identical, so we don't need to specify it three times. The predicate is always identical too, so we don't need to make it a parameter, either. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-24 14:18:15 -08:00
Kenneth Graunke	44a86e2b4f	glsl: Fix chained assignments of vector channels. Simple shaders such as: void splat(vec2 v, float f) { v[0] = v[1] = f; } failed to compile with the following error: error: value of type vec2 cannot be assigned to variable of type float First, we would process v[1] = f, and transform: LHS: (expression float vector_extract (var_ref v) (constant int (1))) RHS: (var_ref f) into: LHS: (var_ref v) RHS: (expression vec2 vector_insert (var_ref v) (constant int (1)) (var_ref f)) Note that the LHS type is now vec2, not a float. This is surprising, but not the real problem. After emitting assignments, this ultimately becomes: (declare (temporary) vec2 assignment_tmp) (assign (xy) (var_ref assignment_tmp) (expression vec2 vector_insert (var_ref v) (constant int (1)) (var_ref f))) (assign (xy) (var_ref v) (var_ref assignment_tmp)) We would then return (var_ref assignment_tmp) as the rvalue, which has the wrong type---it should be float, but is instead a vec2. To fix this, we simply return (vector_extract (var_ref assignment_tmp) <the appropriate channel>) to pull out the desired float value. Fixes Piglit's chained-assignment-with-vector-constant-index.vert and chained-assignment-with-vector-dynamic-index.vert tests. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74026 Reported-by: Dan Ginsburg <dang@valvesoftware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-24 14:18:15 -08:00
Kenneth Graunke	6c158e110c	glsl: Rename "expr" to "lhs_expr" in vector_extract munging code. When processing assignments, we have both an LHS and RHS. At a glance, "lhs_expr" clearly refers to the LHS, while a generic name like "expr" is ambiguous. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-24 14:18:15 -08:00
Paul Berry	eab32bb8f1	Update .gitignore for Catalan translations build artifacts Causes git to ignore the new build artifacts introduced by commit `d5e5367e89` (driconf: Add Catalan translations).	2014-01-24 13:45:16 -08:00
Ian Romanick	c11d76c51a	mesa: Increment the list pointer while freeing instruction data Since the list pointer was never incremented when an OPCODE_PIXEL_MAP opcode was encountered, the data for the instruction would get freed over and over and over... resulting in a crash. Fixes gl-1.0-beginend-coverage. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72214 Reviewed-by: Brian Paul <brianp@vmware.com> Cc: Lu Ha <huax.lu@intel.com>	2014-01-24 13:43:10 -08:00
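The failure mode in `c11d76c51a` is a list walk that frees a node's data without advancing past the node. The sketch below shows the corrected shape under simplifying assumptions: the `list_op` enum, the `list_node` struct, and `free_list_data` are hypothetical, and every node is one element wide, whereas real display-list instructions have variable size.

```c
#include <stdlib.h>

/* Hypothetical opcodes; only one of them owns heap data. */
enum list_op { OP_PIXEL_MAP, OP_OTHER, OP_END };

struct list_node {
   enum list_op op;
   void *data;
};

/*
 * Free the data owned by each instruction and return how many blocks
 * were released. The increment at the bottom of the loop runs for
 * every opcode; skipping it for OP_PIXEL_MAP would free the same node
 * again and again.
 */
int
free_list_data(struct list_node *n)
{
   int freed = 0;

   while (n->op != OP_END) {
      if (n->op == OP_PIXEL_MAP) {
         free(n->data);
         n->data = NULL;
         freed++;
      }
      n++;   /* advance past this instruction in all cases */
   }

   return freed;
}
```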
Brian Paul	a44554870e	svga: rename "tex_usage" to "bindings", add comments Trivial.	2014-01-24 13:33:29 -07:00
Brian Paul	e2dd240e32	st/mesa: add a simple sanity check assertion in st_validate_attachment() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-24 13:33:13 -07:00
Paul Berry	43e77215b1	i965/gen7: Use the correct program when uploading transform feedback state. Transform feedback may come from either the geometry shader or the vertex shader, so we can't use ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX] to find the current post-link transform feedback information. Fortunately we can use ctx->TransformFeedback.CurrentObject->shader_program. Cc: 10.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-23 13:41:36 -08:00
Paul Berry	e190709119	mesa: Ensure that transform feedback refers to the correct program. Previous to this patch, the _mesa_{Begin,Resume}TransformFeedback functions were using ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX] to find the program that would be the source of transform feedback data. This isn't correct--if there's a geometry shader present it should be ctx->Shader.CurrentProgram[MESA_SHADER_GEOMETRY]. (These might be different if separate shader objects are in use). This patch creates a function get_xfb_source(), which figures out the correct program to use based on GL state, and updates _mesa_{Begin,Resume}TransformFeedback to call it. get_xfb_source() is written in terms of the gl_shader_stage enum, so it should not need modification when we add tessellation shaders in the future. It also creates a new driver flag, NewTransformFeedbackProg, which is flagged whenever this program changes. To reduce future confusion, this patch also rewords some comments and error message text to avoid referring to vertex shaders. Cc: 10.0 <mesa-stable@lists.freedesktop.org> v2: make the for loop in get_xfb_source() clearer. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-23 13:41:01 -08:00
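The selection rule described in `e190709119` can be sketched as a loop over shader stages, starting at the geometry stage and walking back towards the vertex stage. This is an illustration only: `shader_stage`, `struct program`, and `find_xfb_source` are hypothetical stand-ins, not the actual `gl_shader_stage` enum or `get_xfb_source()` implementation.

```c
#include <stddef.h>

/* Hypothetical stage enum, ordered as the pipeline runs. */
enum shader_stage {
   STAGE_VERTEX,
   STAGE_GEOMETRY,
   STAGE_FRAGMENT,
   STAGE_COUNT
};

/* Hypothetical stand-in for a linked shader program. */
struct program {
   const char *name;
};

/*
 * Return the program bound to the last vertex-processing stage that is
 * active: geometry if present, otherwise vertex. Written as a loop over
 * the enum so that inserting further stages before fragment only means
 * changing the starting point.
 */
const struct program *
find_xfb_source(const struct program *current[])
{
   int i;

   for (i = STAGE_GEOMETRY; i >= STAGE_VERTEX; i--) {
      if (current[i] != NULL)
         return current[i];
   }

   return NULL;
}
```

With both a vertex and a geometry program bound, the geometry program is returned; with only a vertex program bound, the vertex program is returned.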
Paul Berry	9cee3ff562	i965: Remove *_generator::shader field; use prog field instead. The "shader" field in fs_generator, vec4_generator, and gen8_generator was only used for one purpose; to figure out if we were compiling an assembly program or a GLSL shader (shader is NULL for assembly programs). And it wasn't being used properly: in vec4 shaders we were always initializing it based on prog->_LinkedShaders[MESA_SHADER_FRAGMENT], regardless of whether we were compiling a geometry shader or a vertex shader. This patch simplifies things by using the "prog" field instead; this is also NULL for assembly programs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-23 13:40:55 -08:00
Matt Turner	00c672086c	gles3: Update gl3.h to revision 24614. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-23 11:33:22 -08:00
Matt Turner	d519ebb34c	gles2: Update gl2ext.h to revision 24614. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-23 11:33:22 -08:00
Matt Turner	117d8ce27b	gles2: Update gl2.h to revision 24614. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-23 11:33:22 -08:00
Matt Turner	66ef8feb4d	glcpp: Define GL_EXT_shader_integer_mix in both GL and ES. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-23 11:33:22 -08:00
Matt Turner	73c3c7e37d	glcpp: Remove unused gl_api bits. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-23 11:33:22 -08:00
Matt Turner	b2d1c579bb	glcpp: Set extension defines after resolving the GLSL version. Instead of defining preprocessor macros in glcpp_parser_create based on the GL API, wait until the shader version has been resolved. Doing this allows us to correctly set (and not set) preprocessor macros for extensions allowed by the API but not the shader, as in the case of ARB_ES3_compatibility. The shader version has been resolved when the preprocessor encounters the first preprocessor token, since the GLSL spec says "The #version directive must occur in a shader before anything else, except for comments and white space." Specifically, if a #version token is found the version is known explicitly, and if any other preprocessor token is found then the GLSL version is implicitly 1.10. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71630 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-23 11:33:22 -08:00
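The version-resolution rule quoted in `b2d1c579bb` can be sketched in a few lines. This is a simplified illustration, not glcpp's parser: `resolve_glsl_version` is a hypothetical function, it skips only leading whitespace (comments are not handled), and it looks at nothing beyond the first token. If the first thing in the shader is a `#version` directive, the version is explicit; anything else makes it implicitly 110.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical helper: return the GLSL version implied by the start of
 * the shader source. Leading whitespace is skipped; comments are NOT
 * handled in this sketch.
 */
int
resolve_glsl_version(const char *src)
{
   int version;

   while (isspace((unsigned char) *src))
      src++;

   /* "# version N" also matches "#version N" (whitespace is optional). */
   if (sscanf(src, "# version %d", &version) == 1)
      return version;

   /* Any other first token means the version is implicitly 1.10. */
   return 110;
}
```

Extension macros would then be defined only after this value is known, which is what lets the preprocessor withhold defines for extensions the API allows but the shader version does not.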
Anuj Phogat	c907595ba7	glsl: Disable ARB_texture_rectangle in shader version 100. OpenGL with ARB_ES2_compatibility allows shaders that specify #version 100. This fixes the Khronos OpenGL test(Texture_Rectangle_Samplers_frag.test) failure. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-23 11:33:22 -08:00
Matt Turner	e0648015e9	glsl: Mark GLSL 4.40 as a known version. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-23 11:33:22 -08:00
Brian Paul	f7c118ffbf	st/mesa: fix glReadBuffer(GL_NONE) segfault Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73956 Cc: 10.0 <mesa-stable@lists.freedesktop.org> Tested-by: Ahmed Allam <ahmabdabd@hotmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-23 11:08:40 -07:00
Brian Paul	349efdbba1	svga: fix PS output register setup regression Fixes glean fragProg1 regression caused by commit `b9f68d927e` (implement TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS). This bug only appears when the fragment shader emits fragment.Z before color outputs. The bug was caused by confusion between register indexes and semantic indexes. Also added some comments to better explain register indexing. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-01-23 11:08:40 -07:00
Emil Velikov	c6b6916b9a	glx: link loader util lib only when building with dri3 Otherwise we pull libudev as a dependency and crash games/programs that ship their own version of libudev. Either way we should link the loader lib only when needed. This fixes a regression caused by commit `eac776cf77` Author: Emil Velikov <emil.l.velikov@gmail.com> Date: Sat Jan 11 02:24:43 2014 +0000 glx: use the loader util lib Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73854 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-01-23 18:04:22 +00:00
Alex Henrie	d5e5367e89	driconf: Add Catalan translations See the instructions in Makefile.am under "Adding new translations". Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-23 09:10:19 -08:00
Alex Henrie	84529a5ddb	driconf: Correct and update Spanish translations Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-23 09:10:18 -08:00
Alex Henrie	822b4315b7	driconf: Synchronize po files See the instructions in Makefile.am under "Updating existing translations". Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-23 09:10:18 -08:00
Ian Romanick	e4fcae0755	mesa: Set gl_constants::MinMapBufferAlignment Leaving it set to zero isn't really correct since every allocation has at least an alignment of 1 byte. It also caused a problem in the i965 driver after I removed the MAX(64, ...) from the alignment calculation. That's what I get for changing a patch without retesting it. :( Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73907 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Lu Hua <huax.lu@intel.com>	2014-01-23 08:50:58 -08:00
Ian Romanick	7a0f26dec9	radeon / r200: Eliminate BEGIN_BATCH_NO_AUTOSTATE Sed job: grep -lr BEGIN_BATCH_NO_AUTOSTATE src/mesa/drivers/dri/ \| while read f do cat $f \| sed 's/BEGIN_BATCH_NO_AUTOSTATE/BEGIN_BATCH/g' > x mv x $f done Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:50:58 -08:00
Ian Romanick	2d5fd20690	radeon / r200: Remove unused 'dostate' parameter This parameter hasn't been used since January 2010 (commit `29e02c7`). Fixes the following warning in both radeon and r200: radeon_common.c: In function 'r200_rcommonBeginBatch': radeon_common.c:762:14: warning: unused parameter 'dostate' [-Wunused-parameter] Note that now BEGIN_BATCH and BEGIN_BATCH_NO_AUTOSTATE are identical. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:50:58 -08:00
Ian Romanick	5b4c12972c	radeon / r200: Fix 'empty body' warning radeon_common.c: In function 'radeon_draw_buffer': radeon_common.c:237:3: warning: suggest braces around empty body in an 'if' statement [-Wempty-body] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:50:58 -08:00
Ian Romanick	b790bed21e	radeon / r200: Fix incompatible pointer type warning When parameters were removed from dd_function_table::Viewport (commit `065bd6ff`), radeon_viewport (in both radeon and r200) started generating a warning. radeon_common.c: In function 'r200_radeon_viewport': radeon_common.c:415:15: warning: assignment from incompatible pointer type [enabled by default] radeon_common.c:419:23: warning: assignment from incompatible pointer type [enabled by default] I didn't notice this initially, and it's harmless because the function is never called through the incorrectly typed pointer. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:50:58 -08:00
José Fonseca	840154dc50	draw: Save original driver functions earlier. Otherwise they will be NULL when stage destroy is invoked prematurely, (i.e, on out of memory). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-23 15:49:32 +00:00
Brian Paul	1a44180578	mesa: whitespace fixes in glformats.c Reindent _mesa_get_nongeneric_internalformat() to match other functions. Remove extraneous empty lines in _mesa_get_linear_internalformat(). Trivial.	2014-01-23 08:31:21 -07:00
Brian Paul	a15eb19676	svga: minor code movement in svga_tgsi_insn.c Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-23 08:23:01 -07:00
Brian Paul	f12954e1cb	svga: whitespace, formatting fixes in svga_state_framebuffer.c Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-23 08:23:01 -07:00
Brian Paul	56b876ecd0	svga: simplify common immediate value construction Use some new helper functions to make the code much more readable. And fix wrong value for XPD's w result. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-23 08:23:01 -07:00
Brian Paul	023020d740	svga: add comments, etc to svga_tgsi_insn.c code To make things a little easier to understand for newcomers. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-23 08:23:01 -07:00
Brian Paul	fe043ae554	svga: assorted cleanups in shader code Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-23 08:23:00 -07:00
Brian Paul	2a30379dcd	svga: rename shader_result -> variant To be more consistent with other parts of gallium. Plus, update/add various comments. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-23 08:22:58 -07:00
Brian Paul	35ddd2cc5d	mesa: rename unbind_texobj_from_imgunits() ... to unbind_texobj_from_image_units() and change a local var's type to silence an MSVC warning. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:13:14 -07:00
Brian Paul	1f2007429e	glsl: silence a couple warnings in find_active_atomic_counters() Silence uninitialized variable 'id' warning. Silence unused 'found' warning. Only seen in release builds. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:13:14 -07:00
Brian Paul	5306ee736e	mesa: initialize "is_layered" variable to silence warning Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:13:14 -07:00
Brian Paul	b98fa6fe6f	mesa: fix/add some cases in _mesa_get_linear_internalformat() In some cases we were converting generic formats to sized formats and vice versa. The point is to simply convert sRGB formats to corresponding linear formats. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:13:13 -07:00
Brian Paul	91567b83bf	mesa: add missing ETC2_SRGB cases in formats.c In the _mesa_get_format_color_encoding() and _mesa_get_srgb_format_linear() functions. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:13:13 -07:00
José Fonseca	ab6f9fccd4	radeon: More missing stdio.h includes.	2014-01-23 14:20:20 +00:00
José Fonseca	fa75cc4b89	os/os_thread: Revert pipe_barrier pre-processing logic. Whitelist platforms instead of blacklisting, as several pthread implementations are missing pthread_barrier_t, in particular MacOSX.	2014-01-23 13:44:10 +00:00
José Fonseca	cd978ce26a	c11: Fix missing pthread_mutex_timedlock declaration warnings on MacOSX.	2014-01-23 13:42:38 +00:00
José Fonseca	6b6fdb6aa9	radeon: Adding missing stdio.h include. Became apparent with the C11 thread changes. Unfortunately I didn't have all dependencies to build the driver, and only noticed this issue on build server.	2014-01-23 13:23:43 +00:00
José Fonseca	ab5dc45b2f	mapi: Prevent cast from pointer to integer of different size. On Windows64.	2014-01-23 13:21:52 +00:00
José Fonseca	799f30f385	c11: Update docs/license.html and include verbatim copy of Boost license.	2014-01-23 12:55:55 +00:00
José Fonseca	f298720cbc	egl: Use C11 thread abstractions. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-23 12:55:55 +00:00
José Fonseca	54876afcf0	mapi: Use C11 thread abstractions. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-23 12:55:55 +00:00
José Fonseca	fd33a6bcd7	gallium: Use C11 thread abstractions. Note that PIPE_ROUTINE now returns an int. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-23 12:55:55 +00:00
José Fonseca	ecaa81bd96	c11: Import threads.h emulation library. Implementation is based on https://gist.github.com/2223710 with the following modifications: - inline implementation - retain XP compatibility - add temporary hack for static mutex initializers (as they are not part of the stack but still widely used internally) - make TIME_UTC a conditional macro (some system headers already define it, so this prevents conflict) - respect HAVE_PTHREAD macro Reviewed-by: Brian Paul <brianp@vmware.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-23 12:55:55 +00:00
José Fonseca	349f0a94ae	os: Remove pipe_static_condvar. Never used. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-23 12:55:55 +00:00
Timothy Arceri	815e064fb6	docs: Mark ARB_arrays_of_arrays as started Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 23:37:37 +11:00
Timothy Arceri	b0c64d3cc6	glsl: remove remaining is_array variables Previously the reason we needed is_array was because we used array_size == NULL to represent both non-arrays and unsized arrays. Now that we use a non-NULL array_specifier to represent an unsized array, is_array is redundant. Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 23:37:37 +11:00
Timothy Arceri	61a5846099	glsl: create type name for arrays of arrays We need to insert outermost dimensions in the correct spot otherwise the dimension order will be backwards Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 23:37:36 +11:00
Timothy Arceri	3d492f19f6	glsl: Allow arrays of arrays as input to vertex shader Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 23:37:36 +11:00
Timothy Arceri	3dc932d450	glsl: only call mark_max_array if we are assigning an array This change does not help fix or prevent any bugs; it just seems reasonable to do. Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 23:37:36 +11:00
Timothy Arceri	bfb48750f0	glsl: Add ARB_arrays_of_arrays support to yacc definition and ast Adds array specifier object to hold array information Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 23:31:10 +11:00
Timothy Arceri	72288e0c7b	mesa: Add ARB_arrays_of_arrays Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 23:15:29 +11:00
Topi Pohjolainen	bda88f121b	i965/blorp: switch eu-emitter to use FS IR and fs_generator No regressions on IVB (piglit quick + unit tests). v2 (Paul): - no need to patch the unit tests anymore. Original logic was altered and unit tests updated to match the fs-generator - lrp emission moves from the blorp compiler core into the emitter here (previously there was a separate refactoring patch which is not really needed anymore as the lrp logic got refactored when the original lrp logic got fixed). - pass 'BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX' to the generator in fs_inst::target instead of hardcoding it Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:47:12 +02:00
Topi Pohjolainen	8f3e5363ad	i965/fs: add support for BRW_OPCODE_AVG in fs_generator Needed for compiling blorp blit programs. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:47:12 +02:00
Topi Pohjolainen	9927d7ae68	i965/fs: introduce blorp specific rt-write for fs_generator The compiler for blorp programs likes to emit instructions for the message construction itself meaning that the generator needs to skip any such when blorp programs are translated for the hw. In addition, the binding table control is special for blorp programs and the generator does not need to update the binding tables associated with the compiler bookkeeping (this in fact gets thrown away as the blorp compiler sets the program data in its own way). v2 (Paul): do not hardcode the binding table index but use fs_inst::target instead. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:46:57 +02:00
Topi Pohjolainen	85fc724df5	i965/fs: allow unit tests to dump the final patched assembly Unit tests comparing generated blorp programs to known good need to have the dump in designated file instead of in default standard output. The comparison also expects the jump counters of if-else-instructions to be correctly set and hence the dump needs to be taken _after_ 'patch_IF_ELSE()' is run (the default dump of the fs_generator does this before). v2 (Paul): dropped the redundant 'dump_enabled' argument Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:57 +02:00
Topi Pohjolainen	757b4cf011	i965/blorp: wrap brw_IF/ELSE/ENDIF() into eu-emitter v2 (Paul): renamed emit_if() to emit_cmp_if() Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:53 +02:00
Topi Pohjolainen	8c0030678a	i965/blorp: wrap RNDD (/brw_RNDD(&func, /emit_rndd(/) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:51 +02:00
Topi Pohjolainen	44524cb42f	i965/blorp: wrap FRC (/brw_FRC(&func, /emit_frc(/) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:49 +02:00
Topi Pohjolainen	f9d875926e	i965/blorp: wrap MUL (/brw_MUL(&func, /emit_mul(/) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:47 +02:00
Topi Pohjolainen	bbab8068d2	i965/blorp: wrap OR (/brw_OR(&func, /emit_or(/) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:44 +02:00
Topi Pohjolainen	de6ea2fe25	i965/blorp: wrap SHL (/brw_SHL(&func, /emit_shl(/) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:42 +02:00
Topi Pohjolainen	d256a5f843	i965/blorp: wrap SHR (/brw_SHR(&func, /emit_shr(/) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:39 +02:00
Topi Pohjolainen	0df1f5ce4e	i965/blorp: wrap ADD (/brw_ADD(&func, /emit_add(/) In addition, the special case requiring explicit execution size control is wrapped manually. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:37 +02:00
Topi Pohjolainen	c777e72bd8	i965/blorp: wrap AND (/brw_AND(&func, /emit_and(/) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:34 +02:00
Topi Pohjolainen	8b5fd98043	i965/blorp: wrap MOV (/brw_MOV(&func, /emit_mov(/) In addition, the two special cases requiring explicit execution size control are wrapped manually. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:30 +02:00
Topi Pohjolainen	250494f742	i965/blorp: wrap emission of if-equal-assignment Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:28 +02:00
Topi Pohjolainen	9e9617f797	i965/blorp: wrap emission of conditional assignment Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:25 +02:00
Topi Pohjolainen	8c42ade7a4	i965/blorp: move emission of sample combining into eu-emitter v2 (Paul): pass the combining opcode as an argument to emit_combine(). This keeps manual_blend_average() selfcontained documentation wise. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:16 +02:00
Topi Pohjolainen	ecf795615c	i965/blorp: move emission of rt-write into eu-emitter Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:13 +02:00
Topi Pohjolainen	aac6bace9f	i965/blorp: move emission of texture lookup into eu-emitter Resolving of the hardware message type is moved into the emitter also in preparation for switching to use fs_generator. The generator wants to translate the high level op-code into the message type and hence the emitter needs to know the original op-code. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:10 +02:00
Topi Pohjolainen	41d397f22b	i965/fs: introduce non-compressed equivalent of tex_cms v2: introduces 'SHADER_OPCODE_TXF_UMS' also for gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:04 +02:00
Topi Pohjolainen	ce527a6722	i965: rename tex_ms to tex_cms Prepares for the introduction of non-compressed multi-sampled lookup used in the blorp programs. v2: now also taking into account gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:44:58 +02:00
Topi Pohjolainen	3c44e43357	i965/blorp: move emission of pixel kill into eu-emitter The combination of four separate comparison operations and the masked "and" require special treatment when moving to FS LIR. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:44:52 +02:00
Topi Pohjolainen	f031487dcb	i965/blorp: introduce separate eu-emitter for blit compiler Prepares for presenting blorp blit programs using FS IR that allows EU-assembly generation using i965 glsl-compiler backend (fs_generator). v2: rebased on top of endif-jump counter fix (moving the added brw_set_uip_jip() into the emitter) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:44:44 +02:00
Kenneth Graunke	d8c7740dda	i965: Support 32 texture image units on Haswell+. The Intel closed source OpenGL driver recently began supporting 32 texture image units on Haswell. This makes the open source driver support 32 as well. Earlier generations don't have the message header field required to support more than 16 sampler states, so we continue to advertise 16 there. On Haswell, this causes us to advertise: - GL_MAX_TEXTURE_IMAGE_UNITS = 32 - GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS = 32 - GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS = 96 instead of the old values of 16, 16, and 48. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:58 -08:00
Kenneth Graunke	5a51a26804	i965/fs: Switch from BRW_MAX_TEX_UNIT to the actual limit. BRW_MAX_TEX_UNIT is about to grow, but only Gen7+ will be able to support the new larger value. On older platforms, we don't want to allocate the extra space - it would just be a waste. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:56 -08:00
Kenneth Graunke	50ce6f682d	mesa: Bump MAX_TEXTURE_IMAGE_UNITS to 32. This allows drivers to optionally support more than 16 texture units. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:55 -08:00
Kenneth Graunke	15fc919491	i965/vec4: Support arbitrarily large sampler state indices on Haswell+. Like the scalar backend, we add an offset to the "Sampler State Pointer" field to select a group of 16 samplers, then use the "Sampler Index" field to select within that group. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:53 -08:00
Kenneth Graunke	d58e03fe4f	i965/vec4: Refactor sampler message setup. The next patch adds an additional case where the message header is necessary. So we want to do the g0 copy if inst->header_present is set, rather than inst->texture_offset. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:51 -08:00
Kenneth Graunke	e0a5602911	i965/vec4: Don't set header_present if texel offsets are all 0. In theory, a shader might use textureOffset() but set all the texel offsets to zero. In that case, we don't actually need to set up the message header - zero is the implicit default. By moving the texture_offset setup before the header_present setup, we can easily only set header_present when there are non-zero texel offset values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:49 -08:00
Kenneth Graunke	6943ac0bd9	i965/fs: Support arbitrarily large sampler state indices on Haswell+. The message descriptor's "Sampler Index" field is only 4 bits (on all generations of hardware), so it can only represent indices 0 through 15. Haswell introduced a new field in the message header - "Sampler State Pointer". Normally, this is copied straight from g0, but we can also add a byte offset (as long as it's a multiple of 32). This patch uses a "Sampler State Pointer" offset to select a group of 16 sampler states, and then uses the "Sampler Index" field to select the state within that group. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:48 -08:00
Kenneth Graunke	d7450e52e6	i965/fs: Plumb sampler index into emit_texture_gen7. We'll need this in the next patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:46 -08:00
Kenneth Graunke	ebfe43d5ad	i965/fs: Refactor sampler message header to duplicate less code. Previously, the code to copy g0 to the message header existed in two places - one for the texture offset case, and one for any other case. By treating texture_offset as a special case of header_present, we can remove this duplication and shorten the code. Future patches which add new header fields also won't have to add additional duplication. This also clarifies a confusing construct. The old code contained: } else if (inst->header_present) { if (brw->gen >= 7) { ...explicit copy from g0 to the message header... } else { /* Set up an implied move from g0 to the MRF. */ } } This looks like it might set up an implied move on Sandybridge, which doesn't support those. However, Sandybridge only uses a message header for texture offsets, so it would never hit this code path. The new code avoids this implicit knowledge by only setting up an implied move on Gen4-5. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:42 -08:00
Kenneth Graunke	87e7326735	i965: Use get_element_ud to shorten texture header access. This is shorter, easier to read, and further from the 80 column limit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:18 -08:00
Marek Olšák	d40532f260	gallium/util: util_format_srgb should not return FORMAT_NONE for sRGB formats This fixes a serious regression introduced in `4e549ddb50`. Cc: 9.2 10.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-23 01:47:14 +01:00
Marek Olšák	d382e90614	gallium: remove PIPE_CAP_SCALED_RESOLVE If any driver doesn't support this, it can use a blit after resolving the samples. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-23 01:47:14 +01:00
Marek Olšák	a8930adbf8	radeonsi: use hardware scissors correctly Use the WINDOW and VPORT scissors for the framebuffer and scissor test, respectively. The other two scissors are disabled (they cover the max fb size). We actually have 16 VPORT scissors, which will map well to ARB_viewport_array. Also, we don't need to write SC_WINDOW_OFFSET with this commit, because it's disabled everywhere. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-01-23 01:47:14 +01:00
Marek Olšák	69c29cb147	radeonsi: handle R600_CONTEXT_PS_PARTIAL_FLUSH in si_emit_cache_flush For consistency only. This is unused by radeonsi currently. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-01-23 01:47:14 +01:00
Marek Olšák	5dfb10b2f5	r600g,radeonsi: if discarding whole buffer range, discard whole resource instead Also set the unsynchronized flag if the whole resource was discarded to avoid doing buffer-busy checks again. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-01-23 01:47:14 +01:00
Marek Olšák	ee0dc659c8	gallium/u_upload_mgr: don't expose u_upload_flush It's unused and shouldn't be used at all in my opinion. If some driver doesn't support the unsynchronized flag, u_upload_mgr should avoid the synchronization by other means, e.g. by using the DONTBLOCK flag.	2014-01-23 01:47:14 +01:00
Marek Olšák	0c20bff4b6	gallium/hud: just unmap the upload vertex buffer instead of recreating it	2014-01-23 01:47:14 +01:00
Marek Olšák	2b033f3aab	gallium/vl: use u_upload_mgr to upload vertices for vl_compositor This is the recommended way for streaming vertices. Always use this if you need to upload vertices every frame. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-01-23 01:47:14 +01:00
Kristian Høgsberg	11baad3508	intel: Fix initial MakeCurrent for single-buffer drawables Commit `05da4a7a5e` attempts to eliminate the call to intel_update_renderbuffer() in the case where we already have a drawbuffer for the drawable. Unfortunately this only checks the back left renderbuffer, which breaks in case of single buffer drawables. This means that the initial viewport will not be set in that case. Instead, we now check whether the initial viewport has not been set, in which case we call out to intel_update_renderbuffer(). https://bugs.freedesktop.org/show_bug.cgi?id=73862 Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2014-01-22 12:30:59 -08:00
Paul Berry	0da1a2cc36	glsl: Simplify aggregate type inference to prepare for ARB_arrays_of_arrays. Most of the time it is not necessary to perform type inference to compile GLSL; the type of every expression can be inferred from the contents of the expression itself (and previous type declarations). The exception is aggregate initializers: their type is determined by the LHS of the variable being assigned to. For example, in the statement: mat2 foo = { { 1, 2 }, { 3, 4 } }; the type of { 1, 2 } is only known to be vec2 (as opposed to, say, ivec2, uvec2, int[2], or a struct) because of the fact that the result is being assigned to a mat2. Previous to this patch, we handled this situation by doing some type inference during parsing: when parsing a declaration like the one above, we would call _mesa_set_aggregate_type(), which would infer the type of each aggregate initializer and store it in the corresponding ast_aggregate_initializer::constructor_type field. Since this happened at parse time, we couldn't do the type inference using glsl_type objects; we had to use ast_type_specifiers, which are much more awkward to work with. Things are about to get more complicated when we add support for ARB_arrays_of_arrays. This patch simplifies things by postponing the call to _mesa_set_aggregate_type() until ast-to-hir time, when we have access to glsl_type objects. As a side benefit, we only need to have one call to _mesa_set_aggregate_type() now, instead of six. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-22 11:08:30 -08:00
Jan Vesely	6ec210989f	clover: Don't crash on NULL global buffer objects. Specs say "If the argument is a buffer object, the arg_value pointer can be NULL or point to a NULL value in which case a NULL value will be used as the value for the argument declared as a pointer to __global or __constant memory in the kernel." So don't crash when somebody does that. v2: Insert NULL into input buffer instead of buffer handle pair Fix constant_argument too Drop r600 driver changes v3: Fix inserting NULL pointer Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-01-22 13:30:35 +01:00
Vinson Lee	6caf34b97e	meta: Move loop variable declaration outside loop. Fixes MSVC build error introduced with commit `69b258cb46`. meta.c(618) : error C2143: syntax error : missing ';' before 'type' meta.c(618) : error C2143: syntax error : missing ')' before 'type' meta.c(618) : error C2065: 'i' : undeclared identifier meta.c(618) : warning C4552: '<' : operator has no effect; expected operator with side-effect meta.c(618) : error C2059: syntax error : ')' meta.c(618) : error C2143: syntax error : missing ';' before '{' meta.c(619) : error C2065: 'i' : undeclared identifier meta.c(620) : error C2065: 'i' : undeclared identifier Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-01-21 22:59:16 -08:00
Topi Pohjolainen	8b16b0255b	i965/blorp: use BRW_COMPRESSION_2NDHALF for second half LRP No known bugs fixed but this is now in line with fs-generator. No regressions on IVB. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-22 08:13:32 +02:00
Topi Pohjolainen	89347dd61b	i965/blorp: patch jump counters also for endif No known bugs fixed but this is now in line with fs-generator. No regressions on IVB. Eric further explained that: "The endif jump, since it's forward, is just an optimization to have set right -- otherwise, the GPU will just step forward instruction by instruction until it hits something else that updates the per-channel PC." Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-22 08:13:32 +02:00
Paul Berry	1032c33cb9	mesa: Change redundant code into loops in texstate.c. This is possible now that ctx->Shader.CurrentProgram is an array. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:25:52 -08:00
Paul Berry	6ac2e1e199	mesa: Change redundant code into loops in shaderapi.c. This is possible now that ctx->Shader.CurrentProgram is an array. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:25:49 -08:00
Paul Berry	5808c44bab	mesa: Remove ad-hoc arrays of gl_shader_program. Now that we have a ctx->Shader.CurrentProgram array, we can just use it directly. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:25:47 -08:00
Paul Berry	69b258cb46	meta: Replace save_state::{Vertex,Geometry,Fragment}Shader with an array. Since ctx->Shader.Current{Vertex,Geometry,Fragment}Program is an array, this allows some meta code to be rolled up into loops. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:25:44 -08:00
Paul Berry	b4b70674ea	i965: Fix comments to refer to the new ctx->Shader.CurrentProgram array. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:25:41 -08:00
Paul Berry	1aef45578c	mesa: Fold long lines introduced by the previous patch. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:25:38 -08:00
Paul Berry	3b22146dc7	mesa: Replace ctx->Shader.Current{Vertex,Fragment,Geometry}Program with an array. These are replaced with ctx->Shader.CurrentProgram[MESA_SHADER_{VERTEX,FRAGMENT,GEOMETRY}]. In patches to follow, this will allow us to replace a lot of ad-hoc logic with a variable index into the array. With the exception of the changes to mtypes.h, this patch was generated entirely by the command: find src -type f '(' -iname '*.c' -o -iname '*.cpp' ')' \ -print0 | xargs -0 sed -i \ -e 's/\.CurrentVertexProgram/.CurrentProgram[MESA_SHADER_VERTEX]/g' \ -e 's/\.CurrentGeometryProgram/.CurrentProgram[MESA_SHADER_GEOMETRY]/g' \ -e 's/\.CurrentFragmentProgram/.CurrentProgram[MESA_SHADER_FRAGMENT]/g' Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:25:02 -08:00
Paul Berry	cd18ba1c7a	glsl/linker: Refactor in preparation for adding more shader stages. Rather than maintain separately named arrays and counts for vertex, geometry, and fragment shaders, just maintain these as arrays indexed by the gl_shader_type enum. v2: When there is neither a vertex nor a geometry shader, set prog->LastClipDistanceArraySize = 0, and clarify that the value is not used. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:24:59 -08:00
Paul Berry	4a91675b26	mesa: use _mesa_validate_shader_target() more frequently. This patch replaces code in _mesa_new_shader() and delete_shader_cb() that checks the type of a shader with calls to _mesa_validate_shader_target(). This has two advantages: it allows for a more thorough check (since _mesa_validate_shader_target() doesn't permit shader targets that aren't supported by the back-end), and it reduces the amount of code that will need to be modified when adding new shader stages. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:24:56 -08:00
Paul Berry	020919b2ae	main: Allow ctx == NULL in _mesa_validate_shader_target(). This will allow this function to be used in circumstances where there is no context available, such as when building built-in GLSL functions. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:24:54 -08:00
Paul Berry	6ab2a6148a	mesa: Make validate_shader_target() non-static. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:24:49 -08:00
Paul Berry	46d210d38f	mesa: Replace _mesa_program_index_to_target with _mesa_shader_stage_to_program. In my recent zeal to refactor Mesa's handling of the gl_shader_stage enum, I accidentally wound up with two functions that do the same thing: _mesa_program_index_to_target(), and _mesa_shader_stage_to_program(). This patch keeps _mesa_shader_stage_to_program(), since its name is more consistent with other related functions. However, it changes the signature so that it accepts an unsigned integer instead of a gl_shader_stage--this avoids awkward casts when the function is called from C++ code. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:24:43 -08:00
Dave Airlie	2212a97fe3	llvmpipe: dump geometry shaders when using LP_DEBUG=tgsi for consistency with vs and fs dumpers. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-01-22 14:08:03 +10:00
Ian Romanick	178c1bf1ad	mesa: Generate GL_INVALID_OPERATION for unsupported DSA TexStorage functions We have to make the functions available to work around a GLEW bug (see comments already in the code), but if an application calls one of these functions we should still generate GL_INVALID_OPERATION. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 15:39:54 -08:00
Ian Romanick	17594dccfd	mesa: Silence many unused parameter warnings main/texstorage.c: In function '_mesa_alloc_texture_storage': main/texstorage.c:240:53: warning: unused parameter 'width' [-Wunused-parameter] main/texstorage.c:241:37: warning: unused parameter 'height' [-Wunused-parameter] main/texstorage.c:241:53: warning: unused parameter 'depth' [-Wunused-parameter] main/texstorage.c: In function '_mesa_TextureStorage1DEXT': main/texstorage.c:464:34: warning: unused parameter 'texture' [-Wunused-parameter] main/texstorage.c:464:50: warning: unused parameter 'target' [-Wunused-parameter] main/texstorage.c:464:66: warning: unused parameter 'levels' [-Wunused-parameter] main/texstorage.c:465:34: warning: unused parameter 'internalformat' [-Wunused-parameter] main/texstorage.c:466:35: warning: unused parameter 'width' [-Wunused-parameter] main/texstorage.c: In function '_mesa_TextureStorage2DEXT': main/texstorage.c:473:34: warning: unused parameter 'texture' [-Wunused-parameter] main/texstorage.c:473:50: warning: unused parameter 'target' [-Wunused-parameter] main/texstorage.c:473:66: warning: unused parameter 'levels' [-Wunused-parameter] main/texstorage.c:474:34: warning: unused parameter 'internalformat' [-Wunused-parameter] main/texstorage.c:475:35: warning: unused parameter 'width' [-Wunused-parameter] main/texstorage.c:475:50: warning: unused parameter 'height' [-Wunused-parameter] main/texstorage.c: In function '_mesa_TextureStorage3DEXT': main/texstorage.c:483:34: warning: unused parameter 'texture' [-Wunused-parameter] main/texstorage.c:483:50: warning: unused parameter 'target' [-Wunused-parameter] main/texstorage.c:483:66: warning: unused parameter 'levels' [-Wunused-parameter] main/texstorage.c:484:34: warning: unused parameter 'internalformat' [-Wunused-parameter] main/texstorage.c:485:35: warning: unused parameter 'width' [-Wunused-parameter] main/texstorage.c:485:50: warning: unused parameter 'height' [-Wunused-parameter] main/texstorage.c:485:66: warning: unused parameter 'depth' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 15:39:54 -08:00
Anuj Phogat	f5cfb4ae21	i965: Ignore 'centroid' interpolation qualifier in case of persample shading This patch handles the use of 'centroid' qualifier with 'in' variables in a fragment shader when persample shading is enabled. Per sample shading for the whole fragment shader can be enabled by: glEnable(GL_SAMPLE_SHADING) or using {gl_SamplePosition, gl_SampleID} builtin variables in fragment shader. Explaining it below in more detail. /* Enable sample shading using OpenGL API */ glEnable(GL_SAMPLE_SHADING); glMinSampleShading(1.0); Example fragment shader: in vec4 a; centroid in vec4 b; main() { ... } Variable 'a' will be interpolated at sample location. But what interpolation should we use for variable 'b'? ARB_sample_shading recommends interpolation at sample position for all the variables. GLSL 400 (and earlier) spec says that: "When an interpolation qualifier is used, it overrides settings established through the OpenGL API." But this text got deleted in later versions of GLSL. NVIDIA's and AMD's proprietary Linux drivers (at OpenGL 4.3) interpolate at sample position. This convinces me to use a similar approach on Intel hardware. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-21 14:42:28 -08:00
Anuj Phogat	a92e5f7cf6	i965: Use sample barycentric coordinates with per sample shading Current implementation of arb_sample_shading doesn't set 'Barycentric Interpolation Mode' correctly. We use pixel barycentric coordinates for per sample shading. Instead we should select perspective sample or non-perspective sample barycentric coordinates. It also enables using sample barycentric coordinates in case of a fragment shader variable declared with 'sample' qualifier. e.g. sample in vec4 pos; A piglit test to verify the implementation has been posted on piglit mailing list for review. V2: Do not interpolate all the 'in' variables at sample position if fragment shader uses 'sample' qualifier with one of them. For example we have a fragment shader: #version 330 #extension ARB_gpu_shader5: require sample in vec4 a; in vec4 b; main() { ... } Only 'a' should be sampled at sample location, not 'b'. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-21 14:42:27 -08:00
Anuj Phogat	3313cc269b	i965: Add an option to ignore sample qualifier This will be useful in my next patch which depends on a functionality of _mesa_get_min_invocations_per_fragment() to ignore the sample qualifier (prog->IsSample) based on a flag passed to it. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-21 14:42:27 -08:00
Matt Turner	78d65476b6	mesa/x86: Remove dead read_rgba_span_x86.h. Dead since `304f7a13`.	2014-01-21 14:20:44 -08:00
Matt Turner	bf0773aeca	i965/fs: Optimize LRP with x == y into a MOV. total instructions in shared programs: 1487331 -> 1485988 (-0.09%) instructions in affected programs: 45638 -> 44295 (-2.94%) GAINED: 7 LOST: 0 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:20:44 -08:00
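The LRP simplification in bf0773aeca rests on a small identity: if lrp(x, y, a) = x*(1-a) + y*a, then equal endpoints make the blend factor irrelevant in exact arithmetic. The following is a minimal C sketch of that idea, assuming the GLSL mix() operand convention; the helper names are hypothetical and are not taken from the i965 backend.

```c
#include <stdbool.h>

/* Linear interpolation in the GLSL mix() convention: x*(1-a) + y*a. */
static float lrp(float x, float y, float a)
{
    return x * (1.0f - a) + y * a;
}

/* Hypothetical peephole: when both endpoints hold the same value, the
 * blend factor no longer matters and the result is simply x (a MOV). */
static float lrp_simplified(float x, float y, float a, bool *used_mov)
{
    if (x == y) {
        *used_mov = true;
        return x;
    }
    *used_mov = false;
    return lrp(x, y, a);
}
```

In a real compiler the comparison would be on operands (the same register), not on runtime values; the value comparison here only stands in for that check.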
Jordan Justen	8d37e9915a	glsl: Optimize open-coded lrp into lrp. total instructions in shared programs: 1498191 -> 1487051 (-0.74%) instructions in affected programs: 669388 -> 658248 (-1.66%) GAINED: 1 LOST: 0 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:20:44 -08:00
Matt Turner	13100ac142	i965: Enable AOS optimizations for the geometry shader. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-21 14:20:44 -08:00
Matt Turner	4bd6e0d7c6	glsl: Vectorize multiple scalar assignments Reduces vertex shader instruction counts in DOTA2 by 6.42%, L4D2 by 4.61%, and CS:GO by 5.71%. total instructions in shared programs: 1500153 -> 1498191 (-0.13%) instructions in affected programs: 59919 -> 57957 (-3.27%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-21 14:20:44 -08:00
Matt Turner	5e82d8a9da	glsl: Add parameter to .equals() to ignore an IR type. Only implemented for ir_swizzles currently, but perhaps will be useful for other IR types in the future. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-21 14:20:44 -08:00
Matt Turner	ebf91993c1	mesa: rename PreferDP4 to OptimizeForAOS. This flag was really just a proxy for determining whether the backend was vector (AOS) or scalar (SOA). It will be used to apply a future optimization only for vector backends. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-21 14:20:44 -08:00
Matt Turner	413622fbef	i965/fs: Print the maximum register pressure. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:20:44 -08:00
Kenneth Graunke	391eaa59bd	i965/fs: Show register pressure in dump_instructions() output. Dumping the number of live registers at each IP allows us to see register pressure and identify any local maxima. This should aid in debugging passes designed to reduce register pressure, as well as optimizations that suddenly trigger spilling. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:20:44 -08:00
Kenneth Graunke	3b74f4b233	i965: Compute the number of live registers at each IP. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-21 14:20:44 -08:00
Matt Turner	0ea600ef1a	i965/fs: Call opt_peephole_sel later in the optimization loop. Calling it after value numbering (added in the next commit) prevents some instruction count regressions. total instructions in shared programs: 1524387 -> 1523905 (-0.03%) instructions in affected programs: 13112 -> 12630 (-3.68%) GAINED: 0 LOST: 3 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Matt Turner	ede6c341f6	i965/fs: Calculate interference better in register_coalesce. Previously we simply considered two registers whose live ranges overlapped to interfere. Cases such as set A ------ ... \| mov B, A -- \| ... \| B \| A use B -- \| ... \| use A ------ would be considered to interfere, even though B is an unmodified copy of A whose live range fit wholly inside that of A. If no writes to A or B occur between the mov B, A and the use of B then we can safely coalesce them. Instead of removing MOV instructions, we make them NOPs and remove them at once after the main pass is finished in order to avoid recomputing live intervals (which are needed to perform the previous step). total instructions in shared programs: 1543768 -> 1513077 (-1.99%) instructions in affected programs: 951563 -> 920872 (-3.23%) GAINED: 46 LOST: 22 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
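The coalescing rule described in ede6c341f6 (a copy B of A can be merged into A when neither register is written between the mov and the last use of B) can be sketched for straight-line code as below. This is a simplified illustration only: the instruction struct and function name are hypothetical, control flow is ignored, and the actual pass works on live intervals.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical three-address instruction; -1 marks an unused operand. */
struct sketch_inst {
    int dst;
    int src0;
    int src1;
};

/* insts[mov_ip] is "mov B, A" (dst = B, src0 = A). The copy may be
 * coalesced if neither A nor B is written after the mov, up to and
 * including the last instruction that reads B. */
static bool can_coalesce(const struct sketch_inst *insts, size_t count,
                         size_t mov_ip)
{
    const int b = insts[mov_ip].dst;
    const int a = insts[mov_ip].src0;
    size_t last_use = mov_ip;

    /* Find the last read of B after the mov. */
    for (size_t i = mov_ip + 1; i < count; i++) {
        if (insts[i].src0 == b || insts[i].src1 == b)
            last_use = i;
    }

    /* Any write to A or B inside that window blocks coalescing. */
    for (size_t i = mov_ip + 1; i <= last_use; i++) {
        if (insts[i].dst == a || insts[i].dst == b)
            return false;
    }
    return true;
}
```

The check is deliberately conservative: an instruction that reads B and writes A in the same slot is rejected here, even though a more precise analysis could accept it.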
Matt Turner	4a7d0c550e	i965/fs: Support coalescing registers of size > 1. total instructions in shared programs: 1550048 -> 1549880 (-0.01%) instructions in affected programs: 1896 -> 1728 (-8.86%) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Matt Turner	78fa6172e1	i965/fs: Assert that var < num_vars. Helped to track down a problem in a version of the next commit. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Matt Turner	9bb4d71fd2	i965/fs: Add a comment explaining how register coalescing works. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Matt Turner	2dfb067139	i965/fs: Add and use MAX_SAMPLER_MESSAGE_SIZE definition. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Matt Turner	81d52419cf	mesa: Add STRINGIFY macro. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Matt Turner	80b949f16b	i965/fs: Fix the example about overwriting uniforms in SIMD16. mov takes only a single source argument. Example instruction inexplicably changed from add to mov in commit `f10f5e49`. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Matt Turner	71bc11a375	i965: Print reg_offset for vgrf of size > 1 in dump_instruction(). Previously we wouldn't print the +0 for the first part of a VGRF of size greater than 1. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Grigori Goronzy	955c93dc08	glsl: Match unnamed record types across stages. Unnamed record types are assigned to separate types per stage, e.g. if uniform struct { ... } a; is defined in both vertex and fragment shader, two separate types will result with different names. When linking the shader, this results in a type conflict. However, there is no reason why this should not be allowed according to GLSL specifications. Compare and match record types when linking shader stages to avoid this conflict. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-21 14:01:09 -08:00
Grigori Goronzy	41c9bf884f	glsl: Extract function for record comparisons. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-21 14:01:09 -08:00
Brian Paul	6d8cf5181a	docs: remove some ancient README.* files None of this info is relevant anymore. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-21 10:53:51 -08:00
Brian Paul	b9f68d927e	svga: implement TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS Fixes several colorbuffer tests, including piglit "fbo-drawbuffers-none" for "gl_FragColor" and "glDrawPixels" cases. v2: rework patch to only avoid creating extra shader variants when TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS is not specified. Per Jose. Use a write_color0_to_n_cbufs key field to replicate color0 to N color buffers only when N > 0 and WRITES_ALL_CBUFS is set. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-21 10:53:51 -08:00
Brian Paul	384fd64ab1	svga: rename color output variables Just to be a bit more readable. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-21 10:53:51 -08:00
Brian Paul	f6bc7d6586	svga: fix clearing for null color buffers Fixes piglit "fbo-drawbuffers-none glClear" test. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-21 10:53:51 -08:00
Brian Paul	ff59b3d9ee	mesa: add missing TYPE_DOUBLEN_2 cases in get.c The new TYPE_DOUBLEN_2 type was added in `0e60d850` but the code to return values of that type wasn't completed. Fixes conform's default state test. glGetFloatv(GL_DEPTH_RANGE) wasn't returning anything. v2: remove stray 'break' statements. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-01-21 10:53:12 -08:00
Paul Berry	51000c2ff8	i965: Modify some error messages to refer to "vec4" instead of "vs". These messages are in code that is shared between the VS and GS back-ends, so use the terminology "vec4" to avoid confusion. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-21 09:05:33 -08:00
Paul Berry	a4d68e9ee9	i965: Add GS support to INTEL_DEBUG=shader_time. Previously, time spent in geometry shaders would be counted as part of the vertex shader time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-21 09:05:12 -08:00
Roland Scheidegger	e23e4f67be	draw: fix points with negative w coords for d3d style point clipping Even with depth clipping disabled, vertices which have negative w coords must be discarded. And since we don't have a proper guardband implementation yet (relying on driver to handle all values except infs/nans in rasterization for such points) we need to kill them off manually (as they can end up with coordinates inside viewport otherwise). v2: use 0.0f instead of 0 (spotted by Brian). Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 17:49:02 +01:00
Kenneth Graunke	ad04e396fa	i965: Reserve space for "Vertex Count" in GS outputs. v2: Also increment ir->offset in the GS visitor, rather than at the final assembly generation stage (requested by Paul). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-21 00:20:14 -08:00
Kenneth Graunke	94c0a11b19	i965: Update blitter code for 48-bit addresses. v2: Rebase on Eric's SET_FIELD changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v1]	2014-01-20 16:21:52 -08:00
Kenneth Graunke	23827756f3	i965: Update PIPE_CONTROL packet lengths for Broadwell. On Broadwell, PIPE_CONTROL needs an extra DWord to accommodate the 48-bit addressing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-20 15:38:24 -08:00
Kenneth Graunke	f7e76e00b6	i965: Re-combine the Gen4-5 and Gen6+ write_depth_count functions. Now that we have a helper function that handles the PIPE_CONTROL variations between the various platforms, these are basically the same. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-20 15:38:23 -08:00
Kenneth Graunke	f5dd608db2	i965: Create a helper function for emitting PIPE_CONTROL writes. There are a lot of places that use PIPE_CONTROL to write a value to a buffer (either an immediate write, TIMESTAMP, or PS_DEPTH_COUNT). Creating a single function to do this seems convenient. As part of this refactor, we now set the PPGTT/GTT selection bit correctly on Gen7+. Previously, we set bit 2 of DW2 on all platforms. This is correct for Sandybridge, but actually part of the address on Ivybridge and later! Broadwell will also increase the length of these packets by 1; with the refactoring, we should have to adjust that in substantially fewer places, giving us confidence that we've hit them all. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-20 15:38:23 -08:00
Kenneth Graunke	35458a99c0	i965: Use full-length PIPE_CONTROL packets for workaround writes. I believe that PIPE_CONTROL uses the length field to decide whether to do 32-bit or 64-bit writes. A length of 4 would do a 32-bit write, while a length of 5 would do a 64-bit write. (I haven't verified this, though.) For workaround writes, we don't care what value gets written, or how much data. We're only writing something because hardware bugs mandate that we do so. So using a 64-bit write should be fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-20 15:38:23 -08:00
Kenneth Graunke	4b9e5c985c	i965: Emit full-length PIPE_CONTROLs for (non-write) flushes. The PIPE_CONTROL packet actually has 5 DWords on Gen6+: 1. Header 2. Flags 3. Address 4. Immediate Data: Lower DWord 5. Immediate Data: Upper DWord We just never emitted the last one. While it appears to work, it's probably safer to emit the entire thing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-20 15:38:23 -08:00
Kenneth Graunke	9420b577dd	i965: Create a helper function for emitting PIPE_CONTROL flushes. These days, we need to emit PIPE_CONTROL flushes all over the place. Being able to do that via a single function call seems convenient. Broadwell will also increase the length of these packets by 1; with the refactoring, we should have to do this in substantially fewer places. v2: Add back forgotten intel_emit_post_sync_nonzero_flush (caught by Eric Anholt). Drop unlikely() from BLT_RING check. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-20 15:38:16 -08:00
Kenneth Graunke	ded5674689	i965: Fix MI_STORE_REGISTER_MEM for Broadwell. It now takes a 48-bit address. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-20 15:12:23 -08:00
Kenneth Graunke	f11c1feaf7	i965: Introduce an OUT_RELOC64 macro. Broadwell uses 48-bit addresses. The first DWord is the low 32 bits, and the second DWord is the high 16 bits. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-20 15:12:23 -08:00
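The DWord layout described in f11c1feaf7 (low 32 bits first, then the high 16 bits) can be illustrated with a small C sketch. The helper below is hypothetical and only shows the address split; it is not the OUT_RELOC64 macro itself and performs no relocation bookkeeping.

```c
#include <stdint.h>

/* Hypothetical helper: split a 48-bit GPU address into two DWords,
 * low 32 bits first, then the high 16 bits. */
static void split_address48(uint64_t address, uint32_t dwords[2])
{
    dwords[0] = (uint32_t)(address & 0xffffffffu);
    dwords[1] = (uint32_t)((address >> 32) & 0xffffu);
}
```

For example, the address 0x123456789abc would be emitted as 0x56789abc followed by 0x1234.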
Kenneth Graunke	67ebcb4711	i965: Use the new drm_intel_bo offset64 field. libdrm 2.4.52 introduces a new 'uint64_t offset64' field, intended to replace the old 'unsigned long offset' field. To preserve ABI, libdrm continues to store the presumed offset in both locations. On Broadwell, a 64-bit kernel may place BOs at "high" (> 4G) addresses. However, with a 32-bit userspace, the 'unsigned long offset' field will only be 32-bit, which is not large enough to hold this value. We need to use a proper uint64_t (like the kernel does). Technically, a lot of this code doesn't affect Broadwell, so we could leave it using the old field. But it makes sense to just switch to the new, properly typed field. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-20 15:12:23 -08:00
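The truncation problem that motivates the offset64 field in 67ebcb4711 can be reproduced in isolation. The C sketch below models a 32-bit 'unsigned long' with uint32_t; the function names are hypothetical and no libdrm types are used.

```c
#include <stdint.h>

/* Models storing a presumed offset in a 32-bit 'unsigned long', as on a
 * 32-bit userspace: the upper bits are silently dropped. */
static uint32_t store_in_32bit_field(uint64_t presumed_offset)
{
    return (uint32_t)presumed_offset;
}

/* Returns nonzero when the 32-bit field cannot represent the offset. */
static int offset_is_truncated(uint64_t presumed_offset)
{
    return (uint64_t)store_in_32bit_field(presumed_offset) != presumed_offset;
}
```

An offset just above 4G, such as 0x100001000, would be stored as 0x1000 in the narrow field, which is why a uint64_t is needed.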
Kenneth Graunke	77425ef91a	build: Require libdrm 2.4.52 for Intel. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 15:12:23 -08:00
Kenneth Graunke	5f4eed3575	i965: Delete intel_batchbuffer_emit_reloc_fenced. Nothing in i965 uses it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-20 15:12:12 -08:00
Ian Romanick	4cd8011907	i915: Silence warning: unused parameter warning in intel_bufferobj_buffer intel_buffer_objects.c: In function 'old_intel_bufferobj_buffer': intel_buffer_objects.c:471:17: warning: unused parameter 'flag' [-Wunused-parameter] The parameter hasn't been used since the i915 and i965 drivers had their breakup. i965 got the flags, and i915 got to cry itself to sleep. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:40:46 -08:00
Ian Romanick	8468f437e8	i915: Ensure that intel_bufferobj_map_range meets alignment guarantees Not actually tested, but the changes are identical to the i965 changes that are tested. v2: Remove MAX2(64, ...). Suggested by Ken (in the i965 version of this patch). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: Siavash Eliasi <siavashserver@gmail.com>	2014-01-20 11:40:41 -08:00
Ian Romanick	1ec663ab19	i965: Ensure that intel_bufferobj_map_range meets alignment guarantees No piglit regressions on IVB. With minor tweaks to the arb_map_buffer_alignment-map-invalidate-range test (disable the extension check, set alignment to 64 instead of querying), the i965 driver would fail the test without this patch (as predicted by Eric). With this patch, it passes. v2: Remove MAX2(64, ...). Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: Siavash Eliasi <siavashserver@gmail.com>	2014-01-20 11:40:34 -08:00
Ian Romanick	c2352a88ed	docs: Note that GL_ARB_viewport_array is done on i965 At least for GEN7+, anyway. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:05 -08:00
Courtney Goeltzenleuchter	7837f425e7	i965: Enable ARB_viewport_array v2 (idr): Only enable the extension on GEN7+ w/core profile because it requires geometry shaders. v3 (idr): Add some casting to fix setting of ViewportBounds.Min. Negating an unsigned value, then casting to float doesn't do what you might think it does. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:05 -08:00
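The v3 note in 7837f425e7 about negating an unsigned value before casting to float refers to a general C pitfall, sketched below. The function names and the sample value are illustrative assumptions, not code from the patch.

```c
/* Negate first, then convert: the negation wraps around in unsigned
 * arithmetic, so the result is a large positive float. */
static float negate_then_cast(unsigned int v)
{
    return (float)(-v);
}

/* Convert first, then negate: this gives the intended negative value. */
static float cast_then_negate(unsigned int v)
{
    return -(float)v;
}
```

With v = 16384, cast_then_negate() returns -16384.0f, while negate_then_cast() returns a positive value close to UINT_MAX.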
Ian Romanick	d3ee8ba346	i965: Consider all viewports before enabling guardband clipping Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:05 -08:00
Ian Romanick	bdff9a6e47	i965: Consider only the scissor rectangle for viewport 0 for clears noop_scissor (correctly) only examines the scissor rectangle for viewport 0. Therefore, it should only be called when that scissor rectangle is enabled. v2: Remove spurious change to radeon code. Noticed by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:01 -08:00
Ian Romanick	2c27f1d47a	i965: Set all the supported scissor rectangles for GEN7 Currently MaxViewports is still 1, so this won't effect any change. v2: Minor code reformatting suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:01 -08:00
Ian Romanick	a2b946cb35	mesa: Refactor bounding-box calculation out of _mesa_update_draw_buffer_bounds Drivers that currently use _Xmin and friends to set their scissor rectangle will need to use this code directly once they are updated for GL_ARB_viewport_array. v2: Use different bit-test idiom and fix mixed tabs and spaces. Both were suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:01 -08:00
Ian Romanick	d989c4b134	i965: Set all the supported viewports for GEN7 Currently MaxViewports is still 1, so this won't effect any change. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:01 -08:00
Ian Romanick	fceb8b55c0	i965: Emit writes to viewport index This variable is handled in a fashion identical to gl_Layer. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:01 -08:00
Ian Romanick	37f65b0751	i965: Set the maximum VPIndex At various stages the hardware clamps the gl_ViewportIndex to these values. Setting them to zero effectively makes gl_ViewportIndex be ignored. This is actually useful in blorp (so that we don't have to modify all of the viewport / scissor state). v2: Use INTEL_MASK to create GEN6_CLIP_MAX_VP_INDEX_MASK. Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:01 -08:00
Courtney Goeltzenleuchter	9ef16befd0	mesa: Add ARB_viewport_array plumbing Define API connections to extension entry points added in previous commits. Update entry points to use floating point arguments as required by the extension. Add get tokens for ARB_viewport_array state. v2: Include review feedback. v3 (idr): Fix 'make check'. Add missing Get infrastructure (some was culled from other patches). Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Courtney Goeltzenleuchter	c2eefb06aa	glsl: Add gl_ViewportIndex built-in variable v2 (idr): Fix copy-and-paste bug... s/LAYER/VIEWPORT/ Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Ian Romanick	5439964270	glsl: Add extension infrastructure for ARB_viewport_array Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Ian Romanick	3815264d7d	mesa: Add varying slot for viewport index Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Courtney Goeltzenleuchter	86231c4ab3	mesa: Add new viewport and depth-range entry points for GL_ARB_viewport_array v2 (idr): Use set_viewport_no_notify / set_depth_range_no_notify (and manually notify the driver) instead of calling _mesa_set_viewporti / _mesa_set_depthrangei. Refactor bodies of _mesa_ViewportIndexed and _mesa_ViewportIndexedv into a shared function. Remove spurious CLAMP calls in _mesa_DepthRangeArrayv and _mesa_DepthRangeIndexed. v3 (idr): Add some missing return-statements after calls to _mesa_error. v4 (idr): Only perform the ViewportBounds.Min / ViewportBounds.Max clamping in set_viewport_no_notify if GL_ARB_viewport_array is enabled. Otherwise the driver may not have set ViewportBounds, and the clamping will do bad things. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Courtney Goeltzenleuchter	0a7baa68a8	mesa: Add new scissor entry points for GL_ARB_viewport_array v2 (idr): Use set_scissor_no_notify (and manually notify the driver) instead of calling _mesa_set_scissori. Refactor bodies of _mesa_ScissorIndexed and _mesa_ScissorIndexedv into a shared function. Perform parameter validation in the same order in all three functions. Pull MaxViewports comparison fix (in _mesa_ScissorArrayv) from the next patch to this patch. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Courtney Goeltzenleuchter	917db0bc3d	mesa: Add custom get function for SCISSOR_TEST to _mesa_IsEnabledi Now that the scissor enable state is a bitfield, we need a custom function to extract the correct value from gl_context. Modeled Scissor.EnableFlags after Color.BlendEnabled. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
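Storing the scissor enable state as a bitfield, as 917db0bc3d describes, means an indexed query has to extract one bit per viewport. The C sketch below shows that pattern under stated assumptions: the struct, the function names, and the limit of 16 viewports are hypothetical stand-ins, not Mesa's gl_context definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_VIEWPORTS 16 /* assumed limit for this sketch */

/* Hypothetical stand-in for the scissor state: one enable bit per viewport. */
struct sketch_scissor_state {
    uint32_t enable_flags;
};

/* Indexed query: report whether the scissor test is enabled for 'index'. */
static bool sketch_scissor_enabled(const struct sketch_scissor_state *s,
                                   unsigned index)
{
    if (index >= SKETCH_MAX_VIEWPORTS)
        return false;
    return ((s->enable_flags >> index) & 1u) != 0;
}

/* Indexed update: set or clear the enable bit for 'index'. */
static void sketch_set_scissor(struct sketch_scissor_state *s,
                               unsigned index, bool enable)
{
    if (index >= SKETCH_MAX_VIEWPORTS)
        return;
    if (enable)
        s->enable_flags |= (1u << index);
    else
        s->enable_flags &= ~(1u << index);
}
```

A driver that treats the field as a single boolean would effectively look only at bit 0, which matches the behavior described for drivers without ARB_viewport_array support.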
Courtney Goeltzenleuchter	6d9c0011a0	mesa: Add new get entrypoints for ARB_viewport_array v2 (idr): Fix several "comparison between signed and unsigned integer expressions" warnings. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Ian Romanick	a4bc73f7ba	mesa: Change parameter to _mesa_set_viewport to float This matches the expectations of GL_ARB_viewport_array and the storage type where the values will land. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Ian Romanick	91ad851876	meta: Restore all scissor state Previously the restore code would enable all scissor rectangles if any scissor rectangles were enabled on entry to meta. When there is only one scissor rectangle, this is fine. As soon as a driver supports multiple viewports, this will be a problem. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:59 -08:00
Ian Romanick	6d3b1dc150	mesa: Set all scissor rects In _mesa_Scissor, make sure that ctx->Driver.Scissor is only called once instead of once per scissor rectangle. v2: Use MAX_VIEWPORTS instead of ctx->Const.MaxViewports because the driver may not set ctx->Const.MaxViewports yet. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:59 -08:00
Ian Romanick	454cec4299	mesa: Set all viewports from _mesa_Viewport and _mesa_DepthRange In _mesa_Viewport and _mesa_DepthRange, make sure that ctx->Driver.Viewport is only called once instead of once per viewport or depth range. v2: Make _mesa_DepthRange actually set all of the depth ranges (instead of just index 0). Noticed by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:59 -08:00
Ian Romanick	562f353434	mesa: Restore all the viewports in _mesa_PopAttrib Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:59 -08:00
Ian Romanick	c65db3ebed	mesa: Restore all the scissor rectangles in _mesa_PopAttrib Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:59 -08:00
Ian Romanick	9de863603d	mesa: Initialize all the viewports v2: Use MAX_VIEWPORTS instead of ctx->Const.MaxViewports because the driver may not set ctx->Const.MaxViewports yet. v3: Handle all viewport entries in update_viewport_matrix and _mesa_copy_context too. This was previously in an earlier patch. Having the code in the earlier patch could cause _mesa_copy_context to access a matrix that hadn't been constructed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v2]	2014-01-20 11:31:59 -08:00
Ian Romanick	f6d7cd4a11	mesa: Add an index parameter to _mesa_set_scissor Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:59 -08:00
Ian Romanick	5232a7ded0	mesa: Refactor scissor rectangle setting even more Create an internal function that just writes data into the scissor rectangle. In future patches this will see more use because we only want to call dd_function_table::Scissor once after setting all of the scissor rectangles instead of once per scissor rectangle. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:59 -08:00
Ian Romanick	799265aadc	mesa: Refactor viewport setting even more Create an internal function that just writes data into the viewport. In future patches this will see more use because we only want to call dd_function_table::Viewport once after setting all of the viewport instead of once per viewport. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:59 -08:00
Ian Romanick	42f916e150	mesa: Refactor depth range setting even more Create an internal function that just writes data into the depth range. In future patches this will see more use because we only want to call dd_function_table::DepthRange once after setting all of the depth ranges instead of once per depth range. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:58 -08:00
Ian Romanick	3eb135d1c7	mesa: Add an index parameter to _mesa_set_viewport Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:58 -08:00
Courtney Goeltzenleuchter	cbb271a488	mesa: Convert gl_context::Viewport to gl_context::ViewportArray Only element 0 of the array is used anywhere at this time, so there should be no changes. v4: Split out from a single megapatch. Suggested by Ken. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:56 -08:00
Courtney Goeltzenleuchter	5b84226c31	mesa: Convert gl_viewport_attrib::X, ::Y, ::Width, and ::Height to float v4: Split out from a single megapatch. Suggested by Ken. Also make meta's save_state::ViewportX, ::ViewportY, ::ViewportW, and ::ViewportH to match gl_viewport_attrib. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:53 -08:00
Courtney Goeltzenleuchter	d4dc359875	mesa: Convert gl_viewport_attrib::Near and ::Far to double v4: Split out from a single megapatch. Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:50 -08:00
Courtney Goeltzenleuchter	0e60d85029	mesa: Allow glGet of values that are 2 doubles This will be used when the viewport near and far plane are stored as doubles instead of as floats. v4 (idr): Split out from a single megapatch. Suggested by Ken. Also drop value_double_4. It's never used anywhere in the patch series. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:47 -08:00
Ian Romanick	83bd850cc7	mesa: Move parameter validation from _mesa_set_viewport to _mesa_Viewport Internal callers should do the right thing. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:29:42 -08:00
Courtney Goeltzenleuchter	a9c73fb778	mesa: Update gl_scissor_attrib to support ARB_viewport_array Update Mesa and drivers to access updated gl_scissor_attrib. Now have an enable bitfield and array of gl_scissor_rects. Drivers have been updated to the new scissor enable state attribute (gl_context.scissor.EnableFlags) but still treat it as a single boolean which is okay as mesa will only use bit 0 when communicating with a driver that does not support ARB_viewport_array. v2 (idr): Rebase fixes. v3 (idr): Small code formatting fix suggested by Ken. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:29:42 -08:00
Ian Romanick	1f59e963b4	mesa: Add new constants related to GL_ARB_viewport_array These limits will be queryable by GL_MAX_VIEWPORTS, GL_VIEWPORT_SUBPIXEL_BITS, and GL_VIEWPORT_BOUNDS_RANGE. Drivers that actually implement the extension must set values for these constants that comply with the minimum-maximums from the spec. Most of these changes were part of other patches. They were separated out because it makes reordering of later patches easier. Also, MaxViewports wasn't set by that patch, and I completely overlooked it in review. It's now obvious that it's set. :) v2 (idr): Split these changes out from the original patches. Keep MaxViewportWidth and MaxViewportHeight as GLuint. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:29:41 -08:00
Courtney Goeltzenleuchter	b39bfa4f49	mesa: Add extension tracking bit for ARB_viewport_array v2 (idr): Split these changes out from the original patch. Only advertise GL_ARB_viewport_array in a core profile because it requires geometry shaders. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:29:41 -08:00
Brian Paul	d6b6ab51d4	draw: use some cast wrappers in draw_pt_fetch_shade_pipeline*.c Trivial.	2014-01-20 11:01:48 -08:00
Brian Paul	807cbb9023	draw: whitespace and formatting fixes in draw_pt_fetch_shade_pipeline*.c Trivial.	2014-01-20 11:00:32 -08:00
Brian Paul	ad814d04ca	draw: fix incorrect vertex size computation in LLVM drawing code We were calling draw_total_vs_outputs() too early. The call to draw_pt_emit_prepare() could result in the vertex size changing. So call draw_total_vs_outputs() after draw_pt_emit_prepare(). This fix would seem to be needed for the non-LLVM code as well, but it's not obvious. Instead, I added an assertion there to try to catch this problem if it were to occur there. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72926 Cc: 10.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-20 10:57:20 -08:00
Brian Paul	3a4255148b	docs: note reduced display list memory usage in 10.1 relnotes	2014-01-20 10:52:11 -08:00
Roland Scheidegger	8c0368abb9	draw: clean up d3d style point clipping Instead of skipping x/y clipping completely if there's point_tri_clip points use guard band clipping. This should be easier (previously we could not disable generating the x/y bits in the clip mask for llvm path, hence requiring custom clip path), and it also allows us to enable this for tris-as-points more easily too (this would require custom tri clip filtering too otherwise). Moreover, some unexpected things could have happened if there's a NaN or just a huge number in some tri-turned-point, as the driver's rasterizer would need to deal with it and that might well lead to undefined behavior in typical rasterizers (which need to convert these numbers to fixed point). Using a guardband should hence be more robust, while "usually" guaranteeing the same results. (Only "usually" because unlike hw guardbands draw guardband is always just twice the vp size, hence small vp but large points could still lead to different results.) Unfortunately because the clipmask generated is completely unaffected by guard band clipping, we still need a custom clip stage for points (but not for tris, as the actual clipping there takes guard band into account). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-01-20 17:45:53 +01:00
Brian Paul	799abb271a	swrast: check for null/-1 when mapping renderbuffers Fixes fbo-drawbuffers-none crash (but test still fails). https://bugs.freedesktop.org/show_bug.cgi?id=73757 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-20 08:18:21 -08:00
Brian Paul	3ede8dd5f1	softpipe: fix crash when accessing null colorbuffer Fixes piglit fbo-missing-attachment-blit test. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73755 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-20 08:18:21 -08:00
Brian Paul	33ae0c24d0	st/vdpau: s/surface/resource/ to fix compiler warning Reviewed-by: Christian König <christian.koenig@amd.com>	2014-01-20 07:54:42 -08:00
José Fonseca	a1e528a0f0	i915,r200,radeon,vega: Change vendor from "VMware, Inc." to "Mesa Project". These are components which were originally developed by Tungsten Graphics, which was in turn acquired by VMware, but are de facto now being maintained by third-party contributors of the Mesa open-source community. This matches what's reported by swrast driver and a few other components. Suggested by Ian Romanick.	2014-01-20 14:15:27 +00:00
José Fonseca	f0c2662b12	logger: Remove unused variable. Silences gcc "unused variable ‘buf’" warning. Trivial.	2014-01-20 13:58:11 +00:00
José Fonseca	d43260b59e	logger: s/\<log\>/log_/ Currently the MSVC build is broken because of conflicting definitions of 'log' function. I didn't investigate thoroughly, but I suspect it is conflicting with standard math.h's log. log_ is admittedly not a great name, but it is better than a broken build. A better one can be used in a follow-on build.	2014-01-20 13:57:12 +00:00
Topi Pohjolainen	9ab553cf52	i965/blorp: reduce the scope of the explicit compression control Highlighting these special cases makes it clearer to switch to the fs-generator as the wider scoped compression control settings used in the current implementation can be simply dropped. No regressions on IVB (piglit quick + unit tests). v2 (Ian): typo in a comment Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-20 09:42:36 +02:00
Topi Pohjolainen	d0f63b3757	i965/blorp: remove dependency to compression control state Effectively only the mask control bit gets altered for the single addition in question and hence there is no real need to use a fresh state control level for it -- that is more useful when multiple instructions share the same mask and compression settings. This is a preparation step for removing the explicit compression control modifiers in the blit compiler. After this patch there are no nested state control levels making the constant nature of the compression settings more apparent. No regressions on IVB (piglit quick + unit tests). v2 (Matt, Ian): use temporary variable instead of assigning directly on the same line with a function call. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-20 09:42:27 +02:00
Kristian Høgsberg	05da4a7a5e	i965: Only update renderbuffers on initial intelMakeCurrent We call intel_prepare_render() in intelMakeCurrent() to make sure we have renderbuffers before calling _mesa_make_current(). The only reason we do this is so that we can have valid defaults for width and height. If we already have buffers for the drawable we're making current, we don't need this step. In itself, this is a small optimization, but it also avoids a round trip that could block on the display server in an unexpected place. https://bugs.freedesktop.org/show_bug.cgi?id=72540 https://bugs.freedesktop.org/show_bug.cgi?id=72612 Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-19 20:48:19 -08:00
Ilia Mirkin	f5788e042a	st/vdpau: check surface params before creating surfaces Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-01-19 20:02:10 -05:00
Ilia Mirkin	813ce219c8	st/vdpau: fix bogus error handling in output/bitmap creation Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-01-19 20:02:10 -05:00
Ilia Mirkin	00e4314f6d	st/vdpau: don't return a device if the screen doesn't support NPOT NV3x cards don't support NPOT textures. Technically this restriction could be worked around, but since it also doesn't expose any video decoding hw, just turn it off entirely. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 10.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-01-19 20:01:48 -05:00
Armin K	ad3c99e22a	pipe-loader: Fix build pipe_loader_drm.c: In function 'pipe_loader_drm_probe_fd': pipe_loader_drm.c:120:4: error: implicit declaration of function 'loader_get_pci_id_for_fd' [-Werror=implicit-function-declaration] Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-01-19 15:20:58 +00:00
Emil Velikov	26d380da69	loader: ifdef libdrm specific code and include Mesa provides the flexibility of building without the need to have libdrm present on the system. The situation has regressed with the recent commit `8c2e7fd846` Author: Emil Velikov <emil.l.velikov@gmail.com> Date: Fri Jan 10 23:36:16 2014 +0000 loader: introduce the loader util lib By isolating libdrm code by #ifndef __NOT_HAVE_DRM_H we can have libdrm-less builds across all build systems. This patch converts Android's _EGL_NO_DRM to __NOT_HAVE_DRM_H to provide consistency with the other cases within mesa, allows compilation of libloader on libdrm-less scons and conditionally links against libdrm if present under automake. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73776 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73777 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-19 15:17:00 +00:00
Kenneth Graunke	a33d1339d5	i965: Double the push constant space multipliers on Broadwell too. Broadwell has 2Kb push constant size increments like Haswell GT3. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-18 21:58:13 -08:00
Kenneth Graunke	4c6a1d380a	i965: Update invariant state for Broadwell. The only difference is that STATE_SIP takes a 48-bit address, so we need to output two zeroes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-18 21:57:59 -08:00
Kenneth Graunke	37e9b5e305	i965: Use the Sandybridge VUE format on Broadwell as well. It hasn't changed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-18 21:56:23 -08:00
Kenneth Graunke	11f6882e1d	i965: Create a new fragment shader backend for Broadwell. This replaces the old fs_generator backend. v2: Port to the C-based representation of assembly instructions. Fix texturing after the texture-grf merge. v3: Add high quality derivative support. Fix SET_SIMD4X2_OFFSET. v4: Pass brw_context to gen8_instruction functions as required. v5: Fixes for MRT, as well as zero render targets (alpha test only). v6: Replace n-wide with SIMDn in comments and messages; port over Topi's blorp-generator changes; add missing TXF_MCS opcode, fix missing high quality derivatives for DDX; fix typo (all caught by Eric). Simplify ADDC/SUBB handling; drop "Used only on Gen6+" comment (caught by Matt). Emit SIMD16 versions of three source instructions (caught by both Eric and Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-18 21:56:08 -08:00
Kenneth Graunke	9eb568d753	i965: Create a new vec4 backend for Broadwell. This replaces the old vec4_generator backend. v2: Port to use the C-based instruction representation. Also, remove Geometry Shader offset hacks - the visitor will handle those instead of this code. v3: Texturing fixes (including adding textureGather support). v4: Pass brw_context to gen8_instruction functions as required. v5: Add SHADER_OPCODE_TXF_MCS support; port DUAL_INSTANCED gs fixes (caught by Eric). Simplify ADDC/SUBB handling; add comments to gen8_set_dp_message calls (suggested by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-18 21:56:02 -08:00
Kenneth Graunke	f8035ba036	i965: Add a new infrastructure for generating Broadwell shader assembly. This replaces the brw_eu_emit.c layer for Broadwell. It will be used by both the vector and scalar shader backends. v2: Port to use the C-based instruction representation. v3: Fix destination register type for CMP. v4: Pass brw to gen8_instruction functions (required by rebase). v5: Remove bogus assertion on math instructions (caught by Piglit). v6: Remove more restrictions on math instructions (caught by Eric). Make ADDC and SUBB helpers set accumulator writes, like MAC and MACH (caught by Matt). v7: Don't implicitly force ALU3 operations to SIMD8 (we've been able to do SIMD16 versions since Haswell, but didn't when I originally wrote this code). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-18 21:55:54 -08:00
Kenneth Graunke	8ea4b16eea	i965: Implement a disassembler for Broadwell's new instruction encoding. Heavily based on Keith Packard's existing brw_disasm.c code. I've tried to go through most of the pieces (like SFIDs) and update the lists to include features added in recent generations. v2: Port to use the C-based instruction emitters. This allows us to use C99 array initializers, which tidies up some of the code. v3: Improve decoding of render target write messages. v4: Update for BRW_REGISTER_TYPE becoming an abstraction. v5: Rebase on Chris Forbes' SFID message defines. v6: Fix disassembly of UV immediates; remove silly casts. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-18 21:55:45 -08:00
Kenneth Graunke	0923dad90a	i965: Add a new representation for Broadwell shader instructions. Broadwell significantly changes the EU instruction encoding. Many of the fields got moved to different bit positions; some even got split in two. With so many changes, it was infeasible to continue using struct brw_instruction. We needed a new representation. This new approach is a bit different: rather than a struct, I created a class that has four DWords, and helper functions that read/write various bits. This has several advantages: 1. We can create several different names for the same bits. For example, conditional modifiers, SFID for SEND instructions, and the MATH instruction's function opcode are all stored in bits 27:24. In each situation, we can use the appropriate setter function: set_sfid(), set_math_function(), or set_cond_modifier(). This is much easier to follow. 2. Since the fields are expressed using the original 128-bit numbers, the code to create the getter/setter functions follows the table in the documentation very closely. To aid in debugging, I've enabled -fkeep-inline-functions when building gen8_instruction.c. Otherwise, these functions cannot be called by gdb, making it insanely difficult to print out anything. Kenneth Graunke wrote most of this code. Damien Lespiau ported it to C99. Xiang Haihao added media fields. Zhao Yakui added indirect addressing support. Eric Anholt added an assertion to make sure that values fit in the allotted number of bits. v2: Update for brw_reg_type_to_hw_type(), which necessitates passing brw_context pointers around everywhere. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com> Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Matt Turner <mattst88@gmail.com>	2014-01-18 21:55:37 -08:00
Kenneth Graunke	f4cf231cac	i965: Add SFID #defines for media stuff. While we probably won't ever use these, having them makes it easy to share disassembler code between intel-gpu-tools and Mesa. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-18 21:55:31 -08:00
Kenneth Graunke	9e7da0c716	i965: Add #defines for new Broadwell math functions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-18 21:55:25 -08:00
Chris Forbes	45607b5c5f	i965: add struct and SFID for pixel interpolator messages Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-18 21:55:17 -08:00
Chris Forbes	566e0ddfd0	i965/Gen7: Only emit cube face enables for cubes. This is not observed to actually fix anything, but the PRM says this field must be zero for other surface types. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-19 11:22:34 +13:00
Chris Forbes	b0042f2c23	i965: Improve dumping of Gen7 SURFACE_STATE Previously this was missing many interesting fields. Having them decoded makes debugging views much easier. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-19 11:22:32 +13:00
Chris Forbes	9b5eda8544	i965: Add masks for more SURFACE_STATE fields Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-19 11:22:00 +13:00
Emil Velikov	66fd5057d3	nv50: drop obsolete check from error path At 'out_err' the nv50_context has been calloc-ated. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:45 +00:00
Emil Velikov	e1e30f6dfb	nv50: assert before trying to out-of-bounds access framebuffer.cbufs Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:41 +00:00
Emil Velikov	3805a864b1	nv50: assert before trying to out-of-bounds access samplers Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:37 +00:00
Emil Velikov	6a53b81086	nv50: assert before trying to out-of-bounds access textures Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:34 +00:00
Emil Velikov	19069803be	nv50: pass vtxbuf index as unsigned The index passed to the function is already unsigned, and internally we treat it as unsigned. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:28 +00:00
Emil Velikov	1773611c52	nv50: assert before trying to out-of-bounds access vtxbuf Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:24 +00:00
Emil Velikov	741e935a72	nv50: typecast the result of ffs() to unsigned Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:20 +00:00
Emil Velikov	5e130f2371	nv50: assert before trying to out-of-bounds access constbuf Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:15 +00:00
Emil Velikov	12e744abbb	nv50: access only the available amount of constbuf The constbuf array is defined as a number of NV50_MAX_PIPE_CONSTBUFS per shader stage. Currently the nv50 driver handles only 3 shader stages, thus we wreak havoc when accessing array-out-of-bounds. Cc: 9.1 9.2 10.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:09 +00:00
Emil Velikov	d606ca37eb	nv50: access only the available amount of textures The textures array is defined as a number of PIPE_MAX_SAMPLERS per shader stage. Currently the nv50 driver handles only 3 shader stages, thus we wreak havoc when accessing array-out-of-bounds. Fixes a segfault in piglit/bin/arb_texture_buffer_object-data-sync -fbo -auto Cc: 9.1 9.2 10.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:16:16 +00:00
Rob Clark	bf70c238a7	loader: fallback to drmGetVersion() for non-pci devices Use the kernel driver name as returned by drmGetVersion() for non-pci (platform) devices. Signed-off-by: Rob Clark <robclark@freedesktop.org> v2 (Emil): Rebased and tweaked commit message. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-18 18:52:07 +00:00
Emil Velikov	26458420d8	pipe-loader: add support for non-pci (platform) devices Culled out of the "loader: refactor duplicated code into loader util lib" patch by Rob Clark. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-01-18 18:52:07 +00:00
Emil Velikov	3d3ae75c86	pci_ids: do not include loader.h As per original approach by Rob, each user of the loader lib should include loader.h and the pci_id_driver_map.h header will be used exclusively by the loader. Add back the include guard __IS_LOADER and remove no longer needed include folder in the scons build. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-18 18:51:54 +00:00
Emil Velikov	8d4357b5ba	egl_dri2: use loader util lib Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-18 18:47:49 +00:00
Emil Velikov	a0a1c60fb0	pipe-loader: use loader util lib Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-01-18 18:47:49 +00:00
Emil Velikov	0e78c35234	st/egl: use loader util lib Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-01-18 18:47:48 +00:00
Emil Velikov	a980024224	egl-static: use loader util lib v2 * Drop the no longer used _EGL_NO_DRM from Android.mk. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-18 18:47:48 +00:00
Emil Velikov	fae0dfa59b	gbm: use the loader util lib Additionally this commit removes the following exported functions _gbm_udev_device_new_from_fd() _gbm_fd_get_device_name() _gbm_log() All three were erroneously marked as exported since their inception. Neither of them has ever been a part of the API thus there should be no users of them. Cc: Chad Versace <chad.versace@linux.intel.com> Cc: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-18 18:47:48 +00:00
Emil Velikov	eac776cf77	glx: use the loader util lib v2 * Set logger to ErrorMessageF. Spotted by Kristian Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-18 18:47:48 +00:00
Emil Velikov	8c2e7fd846	loader: introduce the loader util lib All the various window system integration layers duplicate roughly the same code for figuring out device and driver name, pci-id's, etc. Which is sad. So extract it out into a loader util lib. v2 (Emil) * Separate the introduction of libloader from the code de-duplication. * Strip out non-pci devices support. * Add scons + Android build system support. * Add VISIBILITY_CFLAGS to avoid exporting the loader funcs. v3 (Emil) * PIPE_OS_ANDROID is undefined at this scope, use ANDROID * Make sure we define _EGL_NO_DRM when building only swrast Signed-off-by: Rob Clark <robclark@freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-18 18:47:27 +00:00
Kenneth Graunke	1c5e2965a0	i965: Remove CACHED_BATCH support altogether. Using an unoptimized variant of glamor spending 50% of its CPU time in brw_draw_prims() (and hitting the cache very frequently): N Min Max Median Avg Stddev x 200 29200 40500 34900 34750 958.43256 + 200 31000 40300 34700 34622 916.35941 No difference proven at 95.0% confidence Similarly, no difference on GLB2.7: N Min Max Median Avg Stddev x 63 64.1 71.36 70.69 70.113175 1.6782026 + 63 63.6 71.18 70.75 70.223651 1.6044186 No difference proven at 95.0% confidence v2: Rebase on master (by anholt) v3: Add a missing BEGIN_BATCH(3) to aa_line_parameters -- CACHED_BATCH didn't have the asserts about batchbuffer usage that ADVANCE_BATCH does, so we started assertion failing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-17 13:21:11 -08:00
Eric Anholt	746e3e3b3a	i965: Replace 8-wide and 16-wide with SIMD8 and SIMD16. Those are the terms used in the docs, and I think "n-wide" was something I just happened to say. Note that shader-db needs updating for the INTEL_DEBUG=fs parsing. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-17 12:58:43 -08:00
Eric Anholt	26a3bf5c72	i965: Stop doing our optimization on a copy of the GLSL IR. The original intent was that we'd keep a driver-private copy, and there would be the normal copy for swrast to make use of without the tuning (or anything more invasive we might do) specific to i965. Only, we don't generate swrast code any more, because swrast can't render current shaders anyway. Thus, our private copy is rather a waste, and we can just do our backend-specific operations on the linked shader. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-17 12:58:37 -08:00
José Fonseca	8771285054	s/Tungsten Graphics/VMware/ Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the old copyright name is creating unnecessary confusion, hence this change. This was the sed script I used: $ cat tg2vmw.sed # Run as: # # git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed # # Rename copyrights s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g /Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./ s/TUNGSTEN GRAPHICS/VMWARE/g # Rename emails s/alanh@tungstengraphics.com/alanh@vmware.com/ s/jens@tungstengraphics.com/jowen@vmware.com/g s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/ s/jrfonseca\?@tungstengraphics.com/jfonseca@vmware.com/g s/keithw\?@tungstengraphics.com/keithw@vmware.com/g s/michel@tungstengraphics.com/daenzer@vmware.com/g s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/ s/zack@tungstengraphics.com/zackr@vmware.com/ # Remove dead links s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g # C string src/gallium/state_trackers/vega/api_misc.c s/"Tungsten Graphics, Inc"/"VMware, Inc"/ Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-17 20:00:32 +00:00
José Fonseca	27307a73e5	trace: Re-license trace.xsl under MIT license. I was the sole author, as Tungsten Graphics employee, which was since then acquired by VMware Inc. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-17 20:00:32 +00:00
Brian Paul	3618ac4f20	svga: fix crash when clearing null color buffer Fixes regression since `9baa45f78b` but some of the piglit fbo-drawbuffers-none tests still don't pass. v2: use the right pointer type for 'h' Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-17 08:52:37 -08:00
Brian Paul	d6fa71fbb0	llvmpipe: handle NULL color buffer pointers Fixes regression from `9baa45f78b` v2: incorporate a few small changes suggested by Roland. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-17 08:52:11 -08:00
Brian Paul	7b4ceec0b7	softpipe: handle NULL color buffer pointers Fixes regression from `9baa45f78b` Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-17 08:52:11 -08:00
Roland Scheidegger	3b64714da4	llvmpipe: fix large point rasterization with point_quad_rasterization The whole round-pointsize-to-int stuff must only be done with GL legacy rules (no point_quad_rasterization) or all the wrong edges are lit up. This was previously in a private branch (d3d pointsprite test complains loudly otherwise) and got lost in a merge. However, it should certainly apply to GL point sprite rasterization as well. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-01-17 17:01:01 +01:00
Roland Scheidegger	4b9bcf31f4	gallium: add bits for clipping points as tris (d3d-style) OpenGL does whole-point clipping, that is a large point is either fully clipped or fully unclipped (the latter means it may extend beyond the viewport as long as the center is inside the viewport). d3d9 (d3d10 has no large points) however requires points to be clipped after they are expanded to a rectangle. (Note some IHVs are known to ignore GL rules at least with some hw/drivers.) Hence add a rasterizer bit indicating which way points should be clipped (some drivers probably will always ignore this), and add the draw interaction this requires. Drivers wanting to support this and using draw must support large points on their own as draw doesn't implement vp clipping on the expanded points (it potentially could but the complexity doesn't seem warranted), and the driver needs to do viewport scissoring on such points. Conflicts: src/gallium/drivers/llvmpipe/lp_context.c src/gallium/drivers/llvmpipe/lp_state_derived.c Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-01-17 17:01:01 +01:00
Ilia Mirkin	739dc95e67	mesa: fix GL_COLOR_SUM enum for drivers without ARB_vertex_program Commit `c13970808` (mesa: GL_EXT_secondary_color is not optional) changed CHECK_EXTENSION2(EXT_secondary_color, ARB_vertex_program, cap) to CHECK_EXTENSION(ARB_vertex_program, cap). However, CHECK_EXTENSION2 checks that either extension is available, not both. Remove the extension check entirely since the intent was for it to always be enabled. v2: Fix glGet*(GL_COLOR_SUM) too. Suggested by Ian. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: 9.2 10.0 <mesa-stable@lists.freedesktop.org>	2014-01-16 16:42:33 -08:00
Zack Rusin	93b953d139	llvmpipe: do constant buffer bounds checking in shaders It's possible to bind a constant buffer that is smaller than what the shader actually uses/requires. This could cause nasty crashes. This patch adds the architecture to pass the maximum allowable constant buffer index to the jit to let it make sure that the constant buffer indices are always within bounds. The behavior follows the d3d10 spec, which says the overflow should always return all zeros, and overflow is only defined as access beyond the size of the currently bound buffer. Accesses beyond the declared shader constant register size are not considered an overflow and are expected to return garbage, but consistent garbage (we follow the behavior which some wlk tests expect, which is to return the actual values from the bound buffer). Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-16 16:33:57 -05:00
Ilia Mirkin	dd687fb8d0	nv50, nvc0: initialize ctx->sample_mask to ~0 Commit `95bf222603` (cso_context: Fix cso_context::sample_mask initial value.) fixed the cso sample mask to be initialized to ~0. The cso code is also careful not to needlessly call set_sample_mask, so we ended up with the ctx->sample_mask never being set. This broke a number of EXT_framebuffer_multisample piglit tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-16 19:26:05 +01:00
Aaron Watry	188383591d	mesa/main: Free ctx->DrawIndirectBuffer during teardown ctx->DrawIndirectBuffer wasn't being free'd in _mesa_free_buffer_objects With this patch, "valgrind --leak-check=full glxgears" on evergreen (CEDAR) now shows: LEAK SUMMARY: definitely lost: 0 bytes in 0 blocks indirectly lost: 0 bytes in 0 blocks possibly lost: 0 bytes in 0 blocks still reachable: 70,228 bytes in 651 blocks suppressed: 0 bytes in 0 blocks Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-16 10:10:04 -06:00
Aaron Watry	ce3528896b	st/dri: prevent leak of dri option default values v2: Change comment style CC: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-16 10:10:04 -06:00
Aaron Watry	5ac3229f76	radeon: Move gfx/dma cs cleanup to r600_common_context_cleanup The radeonsi code was not cleaning up either of these items leading to leaked memory. v2: Move cleanup to r600_common_context_cleanup instead of duplicating the logic for SI CC: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-16 10:10:04 -06:00
Ian Romanick	a05c596a00	mesa: Eliminate parameters to dd_function_table::Scissor The i830 and i915 drivers used them, but they didn't really need to. They will just be annoying in future patches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-15 10:02:48 -08:00
Ian Romanick	6dbab6b2bb	mesa: Eliminate parameters to dd_function_table::DepthRange No driver uses them. They will just be annoying in future patches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-15 10:02:48 -08:00
Ian Romanick	065bd6ffc2	mesa: Eliminate parameters to dd_function_table::Viewport No driver uses them. They will just be annoying in future patches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-15 10:02:48 -08:00
Ian Romanick	fbc0c9a553	radeon: Remove dead code A future patch will rename some of the fields of gl_viewport_attrib, and I don't want to update dead code that I can't test. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Dave Airlie <airlied@redhat.com>	2014-01-15 10:02:47 -08:00
Ian Romanick	4fcdb75268	i915: Remove spurious calls to DepthRange For both i830 and i915, the driver DepthRange function just calls intelCalcViewport. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Eric Anholt <eric@anholt.net>	2014-01-15 10:02:47 -08:00
Ian Romanick	0a75909b3f	mesa: Add COMPRESSED_RGBA_S3TC_DXT1_EXT to COMPRESSED_TEXTURE_FORMATS for GLES The ES and desktop GL specs diverge here. Yay! In desktop OpenGL, the driver can perform online compression of uncompressed texture data. GL_NUM_COMPRESSED_TEXTURE_FORMATS and GL_COMPRESSED_TEXTURE_FORMATS give the application a list of formats that it could ask the driver to compress with some expectation of quality. The GL_ARB_texture_compression spec calls this "suitable for general-purpose usage." As noted above, this means GL_COMPRESSED_RGBA_S3TC_DXT1_EXT is not included in the list. In OpenGL ES, the driver never performs compression. GL_NUM_COMPRESSED_TEXTURE_FORMATS and GL_COMPRESSED_TEXTURE_FORMATS give the application a list of formats that the driver can receive from the application. It is the complete list of formats. The GL_EXT_texture_compression_s3tc spec says: "New State for OpenGL ES 2.0.25 and 3.0.2 Specifications The queries for NUM_COMPRESSED_TEXTURE_FORMATS and COMPRESSED_TEXTURE_FORMATS include COMPRESSED_RGB_S3TC_DXT1_EXT, COMPRESSED_RGBA_S3TC_DXT1_EXT, COMPRESSED_RGBA_S3TC_DXT3_EXT, and COMPRESSED_RGBA_S3TC_DXT5_EXT." Note that the addition is only to the OpenGL ES specification! Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> See-also: http://lists.freedesktop.org/archives/mesa-dev/2013-October/047439.html Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2014-01-15 10:02:47 -08:00
Brian Paul	bf27d02390	scons: add new shaderimage.c file to the build	2014-01-15 09:17:04 -07:00
Francisco Jerez	bd62666224	clover: Fix clover::keys and ::values to deal with r-value references properly. Returning a reference is incorrect if the specified pair was a temporary -- Instead of that, use decltype() to deduce the correct return type qualifiers. Fixes a crash in clCreateProgramWithBinary(). Reported-and-tested-by: "Dorrington, Albert" <albert.dorrington@lmco.com>	2014-01-15 16:48:37 +01:00
Francisco Jerez	5662602ba0	clover: Don't try to build programs created from a binary again. According to the spec it's allowed to call clBuildProgram() on a program created from a user-specified binary. We don't need to do anything to build the program in that case. Reported-and-tested-by: "Dorrington, Albert" <albert.dorrington@lmco.com>	2014-01-15 16:48:05 +01:00
Francisco Jerez	5195f1d9c6	clover: Add missing fields to the clover::module serialization code. Tested-by: "Dorrington, Albert" <albert.dorrington@lmco.com>	2014-01-15 16:46:12 +01:00
Francisco Jerez	efcc84f425	clover: Store map result into a temporary vector in clCreateProgramWithBinary. This avoids the inefficient multiple evaluation of the map result in the code below. It should cause no functional changes. Tested-by: "Dorrington, Albert" <albert.dorrington@lmco.com>	2014-01-15 16:45:05 +01:00
Francisco Jerez	83db4a30b8	docs: Mark ARB_shader_image_load_store as work in progress. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:08 +01:00
Francisco Jerez	647344bf3e	mesa: Validate image units when the texture state changes. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:08 +01:00
Francisco Jerez	ace31f4bc0	mesa: Unbind deleted textures from the shader image units. From ARB_shader_image_load_store: If a texture object bound to one or more image units is deleted by DeleteTextures, it is detached from each such image unit, as though BindImageTexture were called with <unit> identifying the image unit and <texture> set to zero. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:08 +01:00
Francisco Jerez	902f9df36b	mesa: Add image parameter queries for ARB_shader_image_load_store. v2: Fix off-by-one error in index parameter bound checking. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:08 +01:00
Francisco Jerez	eb0de7c432	mesa: Add ARB_shader_image_load_store to the extension table. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:08 +01:00
Francisco Jerez	a167e354e7	glapi: Update dispatch XML files for ARB_shader_image_load_store. And uncomment the relevant lines of the dispatch sanity test. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:08 +01:00
Francisco Jerez	bcc49e17ff	mesa: Implement the GL entry points defined by ARB_shader_image_load_store. v2: Name image format classes consistently, fix array and 3D teximage selection with layered = GL_FALSE, make sure that the user-specified layer is less than the number of texture layers, add some asserts. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:07 +01:00
Francisco Jerez	7510c10209	mesa: Add MESA_FORMAT_SIGNED_RG88 and _RG1616. Including pack/unpack and texstore code. ARB_shader_image_load_store requires support for the GL_RG8_SNORM and GL_RG16_SNORM formats, which map to MESA_FORMAT_SIGNED_GR88 and MESA_FORMAT_SIGNED_GR1616 on little-endian hosts, and MESA_FORMAT_SIGNED_RG88 and MESA_FORMAT_SIGNED_RG1616 respectively on big-endian hosts -- only the former were already present, add support for the latter. Acked-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:07 +01:00
Francisco Jerez	87942749a3	mesa: Add MESA_FORMAT_ABGR2101010. Including pack/unpack and texstore code. This texture format is a requirement for ARB_shader_image_load_store. Acked-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:07 +01:00
Francisco Jerez	16070716bc	mesa: Add driver interface for ARB_shader_image_load_store. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:07 +01:00
Francisco Jerez	7a98741ef2	mesa: Add state data structures required for ARB_shader_image_load_store. v2: Increase MAX_IMAGE_UNITS to what i965 wants and add a separate MAX_IMAGE_UNIFORMS define, clarify a couple of comments. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:07 +01:00
Francisco Jerez	d9b0b4e960	mesa: Define helper function to get the number of texture layers. And to check if it can have layers at all. This will be used by the implementation of ARB_shader_image_load_store. v2: Fix constness of texobj argument, use assert and return reasonable default rather than calling unreachable() in default switch case. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-15 16:42:07 +01:00
Emil Velikov	bfcf78c110	st/mesa: use signed temporary variable to store _ColorDrawBufferIndexes The temporary variable used to store _ColorDrawBufferIndexes must be signed (GLint), otherwise the following conditional will be incorrectly evaluated, leading to crashes in the driver/mesa or to reads/writes at arbitrary memory locations. The bug dates back to 2009. Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-01-15 14:33:28 +00:00
Emil Velikov	3515a648a9	automake: include the git sha in the opengl version string for oot builds Acked-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-01-15 14:32:24 +00:00
Emil Velikov	10368e1446	mesa: use signed temporary variable to store _ColorDrawBufferIndexes _ColorDrawBufferIndexes is defined as GLint* and using a GLuint* will result in the first part of the conditional always evaluating to true. Unintentionally introduced by the following commit, this will result in a driver segfault if one is using an old version of the piglit test bin/clearbuffer-mixed-format -auto -fbo commit `03d848ea10` Author: Marek Olšák <marek.olsak@amd.com> Date: Wed Dec 4 00:27:20 2013 +0100 mesa: fix interpretation of glClearBuffer(drawbuffer) The corresponding piglit tests supported this incorrect behavior instead of pointing at it. Cc: Marek Olšák <marek.olsak@amd.com> Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-01-15 14:31:04 +00:00
Ilia Mirkin	716b512dcf	nouveau: add framebuffer validation callback Fixes assertions when trying to attach textures to fbs with formats not supported by the render engines. See https://bugs.freedesktop.org/show_bug.cgi?id=73459 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-01-15 12:12:00 +01:00
Francisco Jerez	e457aca7fa	clover: Use cl_ulong in the maximum allocation size calculation to avoid overflow.	2014-01-14 22:10:24 +01:00
Kenneth Graunke	8c4a9f631d	i965: Emit 3DSTATE_VF on Broadwell too. It's not just for Haswell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-14 00:59:03 -08:00
Kenneth Graunke	eadabec4cd	i965: Disable workaround flush for push constants on Broadwell. If it wasn't necessary for Haswell, it's likely not to be necessary for Broadwell either. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-14 00:59:03 -08:00
Kenneth Graunke	8618407d15	i965: Enable native ETC texture support on Broadwell. Broadwell, like Baytrail, has native ETC texture support. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-14 00:58:17 -08:00
Chia-I Wu	fa772aa92b	ilo: handle NULL renderbuffers correctly Renderbuffers may be NULL since `9baa45f78b`.	2014-01-14 16:27:57 +08:00
Chia-I Wu	7fdab3b201	ilo: disable HiZ for misaligned levels We need to disable HiZ for non-8x4 aligned levels, except for level 0, layer 0. For the very first layer we can adjust Width and Height fields of 3DSTATE_DEPTH_BUFFER to make it aligned. Specifically, add ILO_TEXTURE_HIZ and set the flag only for properly aligned levels. ilo_texture_can_enable_hiz() is updated to check for the flag. In tex_layout_validate(), align the depth bo to 8x4 so that we can adjust Width/Height of 3DSTATE_DEPTH_BUFFER without introducing out-of-bound access. Finally in rectlist blitter, add the ability to adjust 3DSTATE_DEPTH_BUFFER.	2014-01-14 15:43:20 +08:00
Chia-I Wu	18645d1533	ilo: use a helper to determine if HiZ is enabled Add ilo_texture_can_enable_hiz and replace all checks for tex->hiz.bo by calls to ilo_texture_can_enable_hiz().	2014-01-14 15:43:20 +08:00
Chia-I Wu	1427c3f79f	ilo: decide on hiz first in texture allocation Add tex_layout_init_hiz() before tex_layout_init_format() to decide whether HiZ should be enabled. On GEN6, because of layer offsetting, HiZ is enabled only when the texture is non-mipmapped and non-array. PIPE_USAGE_STAGING is also taken as a hint to disable HiZ.	2014-01-14 15:43:20 +08:00
Chia-I Wu	194a61cd39	ilo: emit gen7_wa_pipe_control_wm_max_threads_stall on Haswell Rename the workaround, as it is for 3DSTATE_PS instead of 3DSTATE_WM, and emit it on Haswell too. This does not fix any app, but an assertion failure.	2014-01-14 15:43:19 +08:00
Chia-I Wu	c6605c51de	ilo: use HALIGN_4 on GEN7 for depth buffers The comment was no longer true since `6642381e75`.	2014-01-14 15:42:53 +08:00
Chia-I Wu	e90e3e39c2	ilo: OOM for HiZ is fatal on GEN6 On GEN6, HiZ and Separate Stencil Buffer must be enabled at the same time.	2014-01-14 15:19:41 +08:00
Chia-I Wu	5b1c516080	ilo: fix a HiZ bo leakage Dereference the HiZ bo when the texture is destroyed.	2014-01-14 15:19:41 +08:00
Chia-I Wu	af57378e59	ilo: simplify ilo_texture_set_slice_flags() Call ilo_texture_get_slice() for the last slice so that we can get rid of the duplicated assert().	2014-01-14 15:19:41 +08:00
Vinson Lee	8f9b70fa3c	egl-static: Fix build error. Fix build regression introduced with commit `786af2f963`. egl_pipe.c:46:38: fatal error: radeonsi/radeonsi_public.h: No such file or directory #include "radeonsi/radeonsi_public.h" ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73578 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-01-13 15:54:26 -08:00
Andreas Hartmetz	aa7ae4fd6e	radeonsi: Rename the commonly occurring rscreen variable. The "r" stands for R600. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:14 +01:00
Andreas Hartmetz	8662e66bf2	radeonsi: Rename the commonly occurring rctx/r600 variables. The "r" stands for R600. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:14 +01:00
Andreas Hartmetz	44d27ce2b2	radeonsi: Rename r600_trace_emit->si_trace_emit. I had previously considered that unsafe. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	0b57fc15e1	radeonsi: Rename R600->SI in some remaining defines. I had previously considered that unsafe. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	1b79764f49	radeonsi: Rename radeonsi->si remaining identifiers in si_uvd.c. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	b902298615	radeonsi: Rename r600->si remaining identifiers in si_state_draw.c. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	3a4b87511e	radeonsi: Rename r600->si remaining identifiers in si_resource.c. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	5d068f734c	radeonsi: Rename r600->si remaining identifiers in si_query.c. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	eb0ddb6d5b	radeonsi: Rename r600->si remaining identifiers in si_pipe.c. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	238427625f	radeonsi: Rename r600->si remaining identifier in si_hw_context.c. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	3160aa4877	radeonsi: Rename radeonsi->si remaining identifiers in si_compute.c. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	7b7eb4dd1f	radeonsi: Rename r600->si remaining identifiers in si_blit.c. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	45578def71	radeonsi: Rename r600->si for functions in si_pipe.h. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	280c360c02	radeonsi: Rename r600->si for functions in si.h. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	f2a21ed8b9	radeonsi: Rename r600->si for functions in si_resource.h. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	a88f46bc9b	radeonsi: Rename r600->si for structs in si_resource.h. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	3e81883a42	radeonsi: Rename r600->si for structs in si.h. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	238aeabce0	radeonsi: Rename r600->si for structs in si_pipe.h. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	786af2f963	radeonsi: Apply si_* file naming scheme. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Michał Górny	5ea2376334	Use AC_PATH_TOOL instead of AC_PATH_PROG for llvm-config. This should help with cross-compiling and multilib when $CHOST-specific llvm-config is expected rather than build host default one. It will help us a bit in Gentoo where we've started using i686-pc-linux-gnu-llvm-config for 32-bit multilib LLVM. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Michał Górny <mgorny@gentoo.org> Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=73100 CC: "10.0" <mesa-stable@lists.freedesktop.org>	2014-01-13 14:37:55 -08:00
Tom Stellard	6a19bb56e0	configure: Disable xvmc by default The xvmc unit tests are failing on r300g and r600g. Reviewed-by: Vinson Lee <vlee@freedesktop.org>	2014-01-13 14:37:55 -08:00
Kenneth Graunke	277dbf08b0	glsl: Remove exec_list iterators now that nothing uses them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:49:47 -08:00
Kenneth Graunke	826d9fb8c0	glsl: Replace iterators in ir_reader.cpp with ad-hoc list walking. These can't use foreach_list since they want to skip over the first few list elements. Just doing the ad-hoc list walking isn't too bad. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:49:45 -08:00
Kenneth Graunke	48d0faaa43	glsl: Use a new foreach_two_lists macro for walking two lists at once. When handling function calls, we often want to walk through the list of formal parameters and list of actual parameters at the same time. (Both are guaranteed to be the same length.) Previously, we used a pattern of: exec_list_iterator 1st_iter = <1st list>.iterator(); foreach_iter(exec_list_iterator, 2nd_iter, <2nd list>) { ... 1st_iter.next(); } This was awkward, since you had to manually iterate through one of the two lists. This patch introduces a foreach_two_lists macro which safely walks through two lists at the same time, so you can simply do: foreach_two_lists(1st_node, <1st list>, 2nd_node, <2nd list>) { ... } v2: Rename macro from foreach_list2 to foreach_two_lists, as suggested by Ian Romanick. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:49:42 -08:00
Kenneth Graunke	02ff2a2758	glsl: Statically cast parameter exec_node to ir_variable. Formal function parameters are always ir_variable objects, not an arbitrary ir_instruction. So there's no need to dynamically cast here. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:38:19 -08:00
Kenneth Graunke	8050584096	glsl: Cast ir_call parameters to ir_rvalue, not ir_instruction. A function call's parameters are always rvalues. ir_rvalue may not always be a subclass of ir_instruction in the future, so we should use the right one. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:38:19 -08:00
Kenneth Graunke	2e113dfab8	glsl: Replace foreach_iter and iter.remove() with foreach_list_safe. foreach_list_safe allows you to safely remove the current node. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:38:19 -08:00
Kenneth Graunke	838a6871bb	glsl: Convert piles of foreach_iter to foreach_list_safe. In these cases, we edit the list (or at least might be), so we use the foreach_list_safe variant. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:38:19 -08:00
Kenneth Graunke	5f7e778fa1	glsl: Convert piles of foreach_iter to the newer foreach_list macro. foreach_iter and exec_list_iterators have been deprecated for some time now; we just hadn't ever bothered to convert code to the newer foreach_list and foreach_list_safe macros. In these cases, we aren't editing the list, so we can use foreach_list rather than foreach_list_safe. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:38:19 -08:00
Paul Berry	fb6d9798a0	i965: Ensure that all necessary state is re-emitted if we run out of aperture. Prior to this patch, if we ran out of aperture space during brw_try_draw_prims(), we would rewind the batch buffer pointer (potentially throwing away some state that may have been emitted by brw_upload_state()), flush the batch, and then try again. However, we wouldn't reset the dirty bits to the state they had before the call to brw_upload_state(). As a result, when we tried again, there was a danger that we wouldn't re-emit all the necessary state. (Note: prior to the introduction of hardware contexts, this wasn't a problem because flushing the batch forced all state to be re-emitted). This patch fixes the problem by leaving the dirty bits set at the end of brw_upload_state(); we only clear them after we have determined that we don't need to rewind the batch buffer. Cc: 10.0 9.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-13 09:44:39 -08:00
Marek Olšák	df918b5b90	r600g: fix glClearBuffer by handling PIPE_CLEAR_COLORi flags correctly also restructure the code	2014-01-13 15:48:08 +01:00
Marek Olšák	6e98a17551	r600g: handle NULL colorbuffers correctly on R600-R700	2014-01-13 15:48:08 +01:00
Marek Olšák	07032d4068	r600g: handle NULL colorbuffers correctly on Evergreen	2014-01-13 15:48:08 +01:00
Marek Olšák	a86de9a72f	radeonsi: handle NULL colorbuffers correctly Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-01-13 15:48:08 +01:00
Marek Olšák	9677cfab32	gallium/util: easy fixes for NULL colorbuffers Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-13 15:48:08 +01:00
Marek Olšák	9baa45f78b	st/mesa: bind NULL colorbuffers as specified by glDrawBuffers An example why it is required: Let's say there's a fragment shader writing to gl_FragData[0..1]. The user calls: glDrawBuffers(2, {GL_NONE, GL_COLOR_ATTACHMENT0}); That means gl_FragData[0] is unused and gl_FragData[1] is written to GL_COLOR_ATTACHMENT0. st/mesa was skipping the GL_NONE draw buffer, therefore gl_FragData[0] was written to GL_COLOR_ATTACHMENT0, which was wrong. This commit fixes it, but drivers must also be fixed not to crash when binding NULL colorbuffers. There is also a new set of piglit tests for this. The MSAA state also had to be fixed not to crash when reading fb->cbufs[0]. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-13 15:48:07 +01:00
Marek Olšák	9bf9578c1b	mesa: handle GL_NONE draw buffers correctly in glClear Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-13 15:48:07 +01:00
Marek Olšák	4e549ddb50	st/mesa: use sRGB formats for MSAA resolving if destination is sRGB Copied from the i965 driver, including the big comment. Cc: 9.2 10.0 <mesa-stable@lists.freedesktop.org>	2014-01-13 15:48:07 +01:00
Marek Olšák	355686a69f	st/mesa: check depth and stencil writemask before clearing	2014-01-13 15:25:31 +01:00
Marek Olšák	9ea3f88c0a	st/mesa: always prefer pipe->clear over clear_with_quad (v2) v2: clear depth and stencil together	2014-01-13 15:25:31 +01:00
Martin Andersson	c156d24525	st/egl: Flush resources before presentation Fixes wayland regression on r600g due to fast clear introduced by commit `edbbfac6`. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-01-13 15:25:31 +01:00
Tapani Pälli	99abb87c63	dri: set yInverted default to GL_TRUE yInverted is used by EGL_NOK_texture_from_pixmap to indicate that window system rendering is y-inverted compared to OpenGL texture representation. This extension is only known to be used with X11 window system where sane default is GL_TRUE. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73371 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-13 08:00:37 +02:00
Tapani Pälli	f8c5b8a17d	egl_dri2: call dri2_add_configs_for_visuals after extensions set dri2_add_config makes decisions based on NOK_texture_from_pixmap so it needs to be enabled before calling dri2_add_configs_for_visuals. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-13 07:59:56 +02:00
Ian Romanick	2dc35a619c	mesa: Set the correct error in _mesa_BeginConditionalRender Piglit was recently changed to expect the correct error code (piglit commit 271b998), so it started failing on Mesa. This corrects that failure and adds some spec quotations to justify the errors set. The code was rearranged a little bit to match the order listed in the spec. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-10 17:19:48 -08:00
Kenneth Graunke	db1dc21a75	i965: Delete duplicate write_timestamp function. brw_queryobj.c needs a version of write_timestamp that works on all generations for the QueryCounter() driver hook. So there's no point in duplicating it in gen6_queryobj.c. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-10 15:35:01 -08:00
Paul Berry	532b1fecd9	i965: Fix clears of layered framebuffers with mismatched layer counts. Previously, Mesa enforced the following rule (from ARB_geometry_shader4's list of criteria for framebuffer completeness): * If any framebuffer attachment is layered, all attachments must have the same layer count. For three-dimensional textures, the layer count is the depth of the attached volume. For cube map textures, the layer count is always six. For one- and two-dimensional array textures, the layer count is simply the number of layers in the array texture. { FRAMEBUFFER_INCOMPLETE_LAYER_COUNT_ARB } However, when ARB_geometry_shader4 was adopted into GL 3.2, this rule was dropped; GL 3.2 permits different attachments to have different layer counts. This patch brings Mesa in line with GL 3.2. In order to ensure that layered clears properly clear all layers, we now have to keep track of the maximum number of layers in a layered framebuffer. Fixes the following piglit tests in spec/!OpenGL 3.2/layered-rendering: - clear-color-all-types 1d_array mipmapped - clear-color-all-types 1d_array single_level - clear-color-mismatched-layer-count - framebuffer-layer-count-mismatch Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-10 05:58:49 -08:00
Paul Berry	28af1dc217	main: check texture target when validating layered framebuffers. From section 4.4.4 (Framebuffer Completeness) of the GL 3.2 spec: If any framebuffer attachment is layered, all populated attachments must be layered. Additionally, all populated color attachments must be from textures of the same target. We weren't checking that the attachments were from textures of the same target. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-10 05:58:46 -08:00
Chad Versace	90368875e7	i965/gen6/blorp: Remove redundant HiZ workaround Commit `1a92881` added extra flushes to fix a HiZ hang in WebGL Google Maps. With the extra flushes emitted by the previous two patches, the flushes added by `1a92881` are redundant. Tested with the same criteria as in `1a92881`: by zooming in and out continuously for 2 hours on Sandybridge Chrome OS (codename Stumpy) without a hang. CC: Kenneth Graunke <kenneth@whitecape.org> CC: Stéphane Marchesin <marcheu@chromium.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-09 15:02:45 -08:00
Chad Versace	6a5c86f486	i965/gen6/blorp: Set need_workaround_flush at top of blorp Unconditionally set brw->need_workaround_flush at the top of gen6 blorp state emission. The art of emitting workaround flushes on Sandybridge is mysterious and not fully understood. Ken and I believe that intel_emit_post_sync_nonzero_flush() may be required when switching from regular drawing to blorp. This is an extra safety measure to prevent undiscovered difficult-to-diagnose gpu hangs. I verified that on ChromeOS, pre-patch, need_workaround_flush was not set at the top of blorp, as Paul expected. To verify, I inserted the following debug code at the top of gen6_blorp_exec(), restarted the ui, and inspected the logs in /var/log/ui. The abort gets triggered so early that the browser never appears on the display. static void gen6_blorp_exec(...) { if (!brw->need_workaround_flush) { fprintf(stderr, "chadv: %s:%d\n", __FILE__, __LINE__); abort(); } ... } CC: Kenneth Graunke <kenneth@whitecape.org> CC: Stéphane Marchesin <marcheu@chromium.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-09 15:02:39 -08:00
Chad Versace	5e0cd58de4	i965/gen6/blorp: Set need_workaround_flush immediately after primitive This patch makes the workaround code in gen6 blorp follow the pattern established in the regular draw path. It shouldn't result in any behavioral change. On gen6, there are two places where we emit 3D_CMD_PRIM: brw_emit_prim() and gen6_blorp_emit_primitive(). brw_emit_prim() sets need_workaround_flush immediately after emitting the primitive, but blorp does not. Blorp sets need_workaround_flush at the bottom of brw_blorp_exec(). This patch moves the need_workaround_flush from brw_blorp_exec() to gen6_blorp_emit_primitive(). There is no need to set need_workaround_flush in gen7_blorp_emit_primitive() because the workaround applies only to gen6. Reviewed-by: Paul Berry <stereotype441@gmail.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-09 15:02:36 -08:00
Carl Worth	3587fbc586	docs: Import 10.0.2 release notes, add news item.	2014-01-09 12:05:53 -08:00
Brian Paul	513a324b88	mesa: add missing SNORM formats in _mesa_base_fbo_format() We weren't handling the LUMINANCE_SNORM, LUMINANCE_ALPHA_SNORM and INTENSITY_SNORM cases. Note that adding these cases here does not require a driver to support rendering to these surface types. If the driver can't do it we'll report an incomplete framebuffer. NVIDIA doesn't support GL_EXT_texture_snorm but their driver accepts these formats in glRenderBufferStorage(). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-01-09 11:35:52 -07:00
Brian Paul	689ec8dfb2	mesa: remove dead geom shader code I doubt the swrast-based drivers will ever support GS. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-09 11:35:52 -07:00
Brian Paul	c47207d517	docs: minor updates to VMware SVGA3D driver page Signed-off-by: Brian Paul <brianp@vmware.com>	2014-01-09 11:35:50 -07:00
Brian Paul	d046fd731a	mesa: check bits per channel for GL_RGBA_SIGNED_COMPONENTS_EXT query If a channel has zero bits it's not signed. v2: also check for luminance and intensity format bits. Bruce Merry's proposed piglit test hits the luminance case. Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=73096 Cc: 10.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-09 11:35:50 -07:00
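The per-channel rule in the commit above ("if a channel has zero bits it's not signed") can be sketched as below. This is a minimal illustration, not Mesa's code: `struct format_info` and `signed_components()` are hypothetical names, and the luminance/intensity handling mentioned in v2 is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of a color buffer format (not Mesa's own type). */
struct format_info {
    bool is_signed;   /* the format as a whole stores signed values */
    int  bits[4];     /* bit counts for R, G, B, A; 0 means the channel is absent */
};

/* Fill out[4] with per-channel signedness.
 * A channel with zero bits is never reported as signed. */
static void signed_components(const struct format_info *f, bool out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = f->is_signed && f->bits[i] > 0;
}
```

For a signed two-channel format this reports the first two channels as signed and the remaining two as unsigned.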
Brian Paul	0fc8d7c66e	mesa: check for MESA_FORMAT_RGB9_E5_FLOAT in _mesa_is_format_signed() This packed floating point format only stores positive values. Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=73096 Cc: 10.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-09 11:35:50 -07:00
Brian Paul	d81d263eeb	st/mesa: fix breakage from gl_constant::Program[] change	2014-01-09 11:35:13 -07:00
Paul Berry	8668eaaa00	mesa: Use functions to convert gl_shader_stage to PROGRAM enum or pipe target. Suggested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Improve assert message.	2014-01-09 09:31:27 -08:00
Paul Berry	e654216ac7	main: Change init_program_limits() to use gl_shader_stage. This allows the caller to execute it in a loop rather than hand-rolling a separate call for each stage. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-09 09:31:23 -08:00
Paul Berry	bce8bc0b25	glsl: Index into ctx->Const.Program[] rather than using ad-hoc code. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-09 09:31:19 -08:00
Paul Berry	b539385789	mesa: Index into ctx->Const.Program[] rather than using ad-hoc code. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-09 09:31:16 -08:00
Paul Berry	84732a982c	mesa: replace ctx->Const.{Vertex,Fragment,Geometry}Program with an array. These are replaced with ctx->Const.Program[MESA_SHADER_{VERTEX,FRAGMENT,GEOMETRY}]. In patches to follow, this will allow us to replace a lot of ad-hoc logic with a variable index into the array. With the exception of the changes to mtypes.h, this patch was generated entirely by the command: find src -type f '(' -iname '*.c' -o -iname '*.cpp' -o -iname '*.py' -o -iname '*.y' ')' -print0 | xargs -0 sed -i -e 's/Const\.VertexProgram/Const.Program[MESA_SHADER_VERTEX]/g' -e 's/Const\.GeometryProgram/Const.Program[MESA_SHADER_GEOMETRY]/g' -e 's/Const\.FragmentProgram/Const.Program[MESA_SHADER_FRAGMENT]/g' Suggested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-09 09:31:01 -08:00
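To illustrate why a stage-indexed array helps, here is a small sketch of per-stage limits initialized in a loop instead of by one hand-written call per stage. All names (`enum shader_stage`, `struct constants`, `init_program_limits()`) and the limit values are stand-ins chosen for this example; the real definitions live in Mesa's mtypes.h and differ in detail.

```c
#include <assert.h>

/* Illustrative stand-ins for the stage enum and the constants struct. */
enum shader_stage {
    STAGE_VERTEX = 0,
    STAGE_GEOMETRY = 1,
    STAGE_FRAGMENT = 2,
    STAGE_COUNT = 3
};

struct program_constants {
    unsigned max_instructions;
};

struct constants {
    /* One entry per stage, replacing three separately named members. */
    struct program_constants program[STAGE_COUNT];
};

/* Initialize the limits of a single stage (values are made up). */
static void init_program_limits(struct constants *c, enum shader_stage stage)
{
    c->program[stage].max_instructions =
        (stage == STAGE_FRAGMENT) ? 1024 : 2048;
}

/* With an array, every stage is handled by the same loop body. */
static void init_all_limits(struct constants *c)
{
    for (int s = 0; s < STAGE_COUNT; s++)
        init_program_limits(c, (enum shader_stage)s);
}
```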
José Fonseca	9b96be595b	llvmpipe: Honour pipe_rasterizer::point_quad_rasterization. Commit `eda21d2a30` fixed the rasterization of points for Direct3D but ended up breaking the rasterization of OpenGL non-sprite points, in particular conform's pntrast.c test. The only way to get both working is to properly honour pipe_rasterizer::point_quad_rasterization, and follow the weird OpenGL rule when it is false. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-09 12:35:11 +00:00
Eric Anholt	f46563fe1c	i965: Don't do the temporary-and-blit-copy for INVALIDATE_RANGE maps. We definitely want to fall through to the unsynchronized map case, instead of wasting bandwidth on a copy. Prevents a -43.2407% +/- 1.06113% (n=49) performance regression on aa10perf when teaching glamor to provide the GL_INVALIDATE_RANGE_BIT information. This is a performance fix, which I usually wouldn't cherry-pick to stable. But this really was just a bug in the code, its presence would discourage developers from giving us the best information they can, and I think we've got fairly high confidence in the unsynchronized map path already. Cc: 10.0 9.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-09 15:39:20 +08:00
Eric Anholt	e186b927b8	i965: Fix handling of MESA_pack_invert in blit (PBO) readpixels. Fixes piglit GL_MESA_pack_invert/readpixels and GPU hangs with glamor and cairo-gl. Cc: 10.0 9.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-09 15:30:33 +08:00
Eric Anholt	a4b222ac13	i965: Fix incorrect bounds tracking for blit readpixels's GPU access. While incorrect, it probably wouldn't affect anyone ever: You'd have to do an appropriately-formatted readpixels into a PBO, then overwrite the tail end of the updated area of the PBO with glBufferSubData(), and you wouldn't get appropriate synchronization. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-09 15:30:32 +08:00
Eric Anholt	66524daf17	i965: Use SET_FIELD to safety check our x/y offsets in blits. The earlier assert made sure that our math didn't exceed our bounds, but this makes sure that we don't overflow from the high bits of X into the low bits of Y. We've already put checks in intel_miptree_blit(), but I've wanted to expand the type in our prototype from short to uint32_t, and we could get in trouble with intel_emit_linear_blit() if we did. v2: Add Ken's comment about the funny language extension used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v1)	2014-01-09 15:30:11 +08:00
Eric Anholt	5d2e86924e	i965: Add an assert for when SET_FIELD's value exceeds the field size. This was one of the things we always wanted to do to this, to make it more useful than just (value << FIELD_MASK). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-09 15:23:27 +08:00
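The idea of a field setter that asserts on overflow can be sketched as follows. This is an assumption-laden illustration rather than the driver's macro: `set_field()` is a plain function instead of a macro, and the `EXAMPLE_Y_*` field layout is invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 14-bit field occupying bits 16..29 of a command dword. */
#define EXAMPLE_Y_SHIFT 16
#define EXAMPLE_Y_MASK  0x3fff0000u

/* Shift a value into its field and assert that no bits were lost,
 * either by spilling outside the mask or off the top of the dword. */
static inline uint32_t set_field(uint32_t value, unsigned shift, uint32_t mask)
{
    uint32_t field = (value << shift) & mask;
    assert((field >> shift) == value);
    return field;
}
```

A call such as `set_field(5, EXAMPLE_Y_SHIFT, EXAMPLE_Y_MASK)` yields `0x00050000`, while a value wider than 14 bits trips the assert in debug builds instead of silently corrupting a neighbouring field.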
Eric Anholt	98cdb2ceed	i965: Add a safety check for emitting blits. With all of the flipping and pitch twiddling and miptree layout involved in our blits, there are lots of ways for us to scribble outside of a buffer. Put in a check that we're not about to do so. This catches a bug that glamor was running into. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-09 15:23:23 +08:00
Eric Anholt	bdc5241af4	i965: Don't call the blitter on addresses it can't handle. Noticed by tex3d-maxsize on my next commit to check that our addresses don't overflow. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-09 15:23:00 +08:00
Thomas Sondergaard	e8ff08edd8	mesa: Namespace qualify fma to override ambiguity with fma from math.h MSVC 2013 version of math.h includes an fma() function. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 17:33:07 -07:00
Thomas Sondergaard	8fcddd325c	mesa: Work around internal compiler error This small rearrangement avoids MSVC 2013 ICE. Also, this should be a better memory access order. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-08 17:33:06 -07:00
Thomas Sondergaard	067ad6e53e	mesa: Fix compile error with MSVC 2013 This fixes the following compile error: src\glsl\ir_constant_expression.cpp(1405) : error C2666: 'copysign' : 3 overloads have similar conversions Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 17:33:06 -07:00
Thomas Sondergaard	20e65c92c7	mesa: Preliminary support for MSVC_VERSION=12.0 Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 17:33:06 -07:00
Rob Clark	646c16af6e	freedreno: add basic query support Add for now some simple/basic query support (ie. things not actually requiring the GPU). Might change around a bit when I actually add GPU queries, but for now this enables some useful performance info in the GALLIUM_HUD. For example: GALLIUM_HUD=fps+batches+batches-sysmem+batches-gmem+restores,draw-calls The driver specific queries are: + draw-calls + batches - number of batches per second, sum of batches-sysmem plus batches-gmem + batches-gmem - render a set of tiles in GMEM, for each tile (optionally) system mem -> gmem (restore), plus N draws, plus gmem -> system mem (resolve) per second + batches-sysmem - N draws to system memory (GMEM bypass) per second + restores - number of GMEM batches that required restore per second Ideally for GMEM rendering, you want batches-gmem to equal fps. If the app is doing something that triggers multiple passes (ie. requires extra round trip gmem <-> system memory) then the # of batches per second will go up relative to fps. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-01-08 16:30:18 -05:00
Rob Clark	725d736f6a	freedreno/a3xx: use cs patch instead of WFI+RMW Since we now have the cmdstream patch mechanism needed for hw binning, might as well also use it for RB_RENDER_CONTROL updates. This avoids the need to use RMW (and associated WFI) to update RB_RENDER_CONTROL. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-01-08 16:30:18 -05:00
Rob Clark	c0766528ba	freedreno/a3xx: support for hw binning pass The binning pass sorts vertices into which bins/tiles they apply to. The visibility information generated during the binning pass can be used to speed up the rendering pass by filtering out vertices which do not apply to the current tile. See: https://github.com/freedreno/freedreno/wiki/Adreno-tiling#optimized-approach This brings a significant fps boost. A rough assortment of tests (supertuxkart, etracer, tremulous, glmark2 'build' test, etc) seems to yield a ~35-45% fps improvement. For now, to be conservative, the binning pass is not enabled yet by default. To enable it use: FD_MESA_DEBUG=binning So far I haven't found anything that breaks with binning enabled, but I'd like a bit more testing before I enable it as default. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-01-08 16:30:18 -05:00
Rob Clark	bfb44c24bc	freedreno: be more clever about gmem usage Only need to leave room for depth/stencil if it is actually used, etc. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-01-08 16:30:18 -05:00
Rob Clark	42c5e2a2ed	freedreno: resync generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-01-08 16:30:18 -05:00
Chris Forbes	9e99735f30	i965: fold offset into coord for textureOffset(gsampler2DRect) The hardware is broken with nonzero texel offsets and unnormalized coordinates; instead of doing correct offsetting, we get garbage. This just extends the existing workaround for ir_txf and ir_tg4+gsampler2DRect to also consider ir_tex+gsampler2DRect. Fixes broken rendering in 'tesseract' when 'mesa_texrectoffset_bug' is not enabled; also fixes the new piglit test 'tests/spec/glsl-1.30/execution/fs-textureOffset-Rect'. Has been broken ~forever; suggesting including this in only 10.0 because the lowering pass doesn't exist in 9.2 or earlier so would require quite a different patch. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Lee Salzman <lsalzman@gmail.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2014-01-09 10:09:01 +13:00
Paul Berry	31ec2f8338	mesa: Remove _mesa_progshader_enum_to_string(), which is no longer used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:32:14 -08:00
Paul Berry	acfc58a7e5	glsl: Make more use of gl_shader_stage enum in ir_set_program_inouts.cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:32:01 -08:00
Paul Berry	2adb9fea77	glsl: Make more use of gl_shader_stage enum in lower_clip_distance.cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:31:58 -08:00
Paul Berry	80ee24823f	glsl: Make more use of gl_shader_stage enum in link_varyings.cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Also rename "shaderType" param of is_varying_var() to "stage". Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:31:55 -08:00
Paul Berry	9110078209	glsl: Change _mesa_glsl_parse_state ctor to use gl_shader_stage enum. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Also rename "target" param to "stage". Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:31:49 -08:00
Paul Berry	e3b86f07da	mesa: Use gl_shader::Stage instead of gl_shader::Type where possible. This reduces confusion since gl_shader::Type is sometimes GL_SHADER_PROGRAM_MESA but is more frequently GL_SHADER_{VERTEX,GEOMETRY,FRAGMENT}. It also has the advantage that when switching on gl_shader::Stage, the compiler will alert if one of the possible enum types is unhandled. Finally, many functions in src/glsl (especially those dealing with linking) already use gl_shader_stage to represent pipeline stages; using gl_shader::Stage in those functions avoids the need for a conversion. Note: in the process I changed _mesa_write_shader_to_file() so that if it encounters an unexpected shader stage, it will use a file suffix of "????" rather than "geom". Reviewed-by: Brian Paul <brianp@vmware.com> v2: Split from patch "mesa: Store gl_shader_stage enum in gl_shader objects." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-08 07:31:45 -08:00
Paul Berry	65511e5f22	mesa: Store gl_shader_stage enum in gl_shader objects. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-08 07:31:28 -08:00
Paul Berry	1722f5e73e	mesa: Move declaration of gl_shader_stage earlier in mtypes.h. Also move the related #define MESA_SHADER_STAGES. This will allow gl_shader_stage to be used in struct gl_shader. Reviewed-by: Brian Paul <brianp@vmware.com> v2: Split from patch "mesa: Store gl_shader_stage enum in gl_shader objects." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-08 07:30:54 -08:00
Paul Berry	72a995d307	glsl: make _mesa_shader_stage_to_string() available to non-C++ code. Reviewed-by: Brian Paul <brianp@vmware.com> v2: Split from patch "mesa: Store gl_shader_stage enum in gl_shader objects." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-08 07:30:48 -08:00
Paul Berry	665b8d7b6d	mesa: Clean up nomenclature for pipeline stages. Previously, we had an enum called gl_shader_type which represented pipeline stages in the order they occur in the pipeline (i.e. MESA_SHADER_VERTEX=0, MESA_SHADER_GEOMETRY=1, etc), and several inconsistently named functions for converting between it and other representations: - _mesa_shader_type_to_string: gl_shader_type -> string - _mesa_shader_type_to_index: GLenum (GL_*_SHADER) -> gl_shader_type - _mesa_program_target_to_index: GLenum (GL_*_PROGRAM) -> gl_shader_type - _mesa_shader_enum_to_string: GLenum (GL_*_{SHADER,PROGRAM}) -> string This patch tries to clean things up so that we use more consistent terminology: the enum is now called gl_shader_stage (to emphasize that it is in the order of pipeline stages), and the conversion functions are: - _mesa_shader_stage_to_string: gl_shader_stage -> string - _mesa_shader_enum_to_shader_stage: GLenum (GL_*_SHADER) -> gl_shader_stage - _mesa_program_enum_to_shader_stage: GLenum (GL_*_PROGRAM) -> gl_shader_stage - _mesa_progshader_enum_to_string: GLenum (GL_*_{SHADER,PROGRAM}) -> string In addition, MESA_SHADER_TYPES has been renamed to MESA_SHADER_STAGES, for consistency with the new name for the enum. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Also rename the "target" field of _mesa_glsl_parse_state and the "target" parameter of _mesa_shader_stage_to_string to "stage". Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:30:30 -08:00
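The naming scheme described above (a pipeline-ordered enum plus consistently named conversions) might look roughly like this. The sketch uses its own names (`enum shader_stage`, `shader_enum_to_shader_stage()`, `shader_stage_to_string()`) and locally defined `EX_GL_*` constants so that it compiles without GL headers; it is not a copy of Mesa's functions.

```c
#include <assert.h>
#include <string.h>

/* Shader object enums, defined locally with their standard GL values. */
#define EX_GL_FRAGMENT_SHADER 0x8B30
#define EX_GL_VERTEX_SHADER   0x8B31
#define EX_GL_GEOMETRY_SHADER 0x8DD9

/* Stages listed in the order they occur in the pipeline. */
enum shader_stage {
    STAGE_VERTEX = 0,
    STAGE_GEOMETRY = 1,
    STAGE_FRAGMENT = 2
};

/* GLenum (GL_*_SHADER) -> stage. Unknown values map to the vertex
 * stage here purely to keep the sketch short. */
static enum shader_stage shader_enum_to_shader_stage(unsigned gl_enum)
{
    switch (gl_enum) {
    case EX_GL_GEOMETRY_SHADER: return STAGE_GEOMETRY;
    case EX_GL_FRAGMENT_SHADER: return STAGE_FRAGMENT;
    case EX_GL_VERTEX_SHADER:
    default:                    return STAGE_VERTEX;
    }
}

/* stage -> string. Switching on the enum lets the compiler warn
 * when a newly added stage is not handled. */
static const char *shader_stage_to_string(enum shader_stage stage)
{
    switch (stage) {
    case STAGE_VERTEX:   return "vertex";
    case STAGE_GEOMETRY: return "geometry";
    case STAGE_FRAGMENT: return "fragment";
    }
    return "unknown";
}
```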
José Fonseca	eda21d2a30	llvmpipe: Fix the bottom_edge_rule adjustment for points. The adjustment needs to be applied to the y coordinates and not the x coordinates, just like the equivalent code for lines and triangles in lp_setup_line.c and lp_setup_tri.c. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2014-01-08 12:18:17 +00:00
José Fonseca	37de6b0682	llvmpipe: Respect bottom_edge_rule when computing the rasterization bounding boxes. This was inadvertently forgotten when replacing gl_rasterization_rules with lower_left_origin and half_pixel_center (commit `2737abb44e`). This makes a difference when lower_left_origin != half_pixel_center, e.g, D3D10. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2014-01-08 12:18:17 +00:00
Chia-I Wu	76edf44f9e	ilo: enable HiZ The support is still early. Fast depth buffer clear is not enabled yet. HiZ can be forced off with ILO_DEBUG=nohiz.	2014-01-08 18:11:36 +08:00
Chia-I Wu	e7b4219e22	ilo: resolve Z/HiZ correctly When the depth buffer is to be read, perform a Depth Buffer Resolve if it has been rendered. When the depth buffer is to be rendered, perform a HiZ Buffer Resolve when the depth buffer is modified externally.	2014-01-08 18:11:35 +08:00
Chia-I Wu	77e3db464f	ilo: add flags to texture slices The flags are used to mark who (CPU, BLT, or RENDER) has accessed the resource and how (READ or WRITE).	2014-01-08 18:11:35 +08:00
Chia-I Wu	846f70a6ef	ilo: rename and add an accessor for texture slices Rename ilo_texture::slice_offsets to ilo_texture::slices and add an accessor, ilo_texture_get_slice().	2014-01-08 18:11:35 +08:00
Chia-I Wu	127fbc086b	ilo: add HiZ op support to the pipelines Add blitter functions to perform Depth Buffer Clear, Depth Buffer Resolve, and Hierarchical Depth Buffer Resolve. Those functions set ilo_blitter up and pass it to the pipelines to emit the commands.	2014-01-08 18:11:35 +08:00
Chia-I Wu	546416d495	ilo: add support for HiZ allocation Add tex_create_hiz() to create HiZ bo. It is not really called yet.	2014-01-08 18:11:35 +08:00
Chia-I Wu	e372819589	ilo: refactor separate stencil allocation Move separate stencil allocation code to tex_create_separate_stencil to keep tex_create sane.	2014-01-08 18:11:35 +08:00
Chia-I Wu	82676f5d34	ilo: assorted GPE fixes for HiZ Allow HiZ op to be specified in 3DSTATE_WM. Pass depth format directly in gen7_emit_3DSTATE_SF. Use tex->hiz.bo to determine if HiZ exists. Fix 3DSTATE_SF for the case when there is no ilo_rasterizer_state. Fix 3DSTATE_PS for the case when there is no ilo_shader_state.	2014-01-08 18:11:35 +08:00
Chia-I Wu	6642381e75	ilo: no layer offsetting on GEN7+ Even though the Ivy Bridge PRM lists some restrictions that require layer offsetting as the Sandy Bridge PRM does, it seems they are actually lifted.	2014-01-08 18:11:34 +08:00
Chia-I Wu	011fde4bf2	ilo: offset to layers only when necessary GEN6 has several requirements regarding the LOD/Depth/Width/Height of the render targets and the depth buffer. We used to offset to the layers in question unconditionally to meet the requirements. With this commit, offsetting is done only when the requirements are not met.	2014-01-08 18:11:34 +08:00
Chia-I Wu	0a2a221d01	ilo: allow ilo_zs_surface to skip layer offsetting Make offset to layer optional in ilo_gpe_init_zs_surface.	2014-01-08 18:11:34 +08:00
Chia-I Wu	8d9f5d57e2	ilo: allow ilo_view_surface to skip layer offsetting Make offset to layer optional in ilo_gpe_init_view_surface_for_texture. render_cache_rw is always the same as is_rt and is replaced.	2014-01-08 18:11:34 +08:00
Tapani Pälli	0978a6966a	i965/fs: do SEL optimization only when src type for MOV matches Fixes a bug where the then branch operates on ivec4 while the else branch uses vec4. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72379 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-08 07:06:45 +02:00
Kenneth Graunke	847bc36a38	glsl: Optimize pow(2, x) --> exp2(x). On Haswell, POW takes 24 cycles, while EXP2 only takes 14. Plus, using POW requires putting 2.0 in a register, while EXP2 doesn't. I believe that EXP2 will be faster than POW on basically all GPUs, so it makes sense to optimize it. Looking at the savage2 subset of shader-db: total instructions in shared programs: 113225 -> 113179 (-0.04%) instructions in affected programs: 2139 -> 2093 (-2.15%) instances of 'math pow': 795 -> 749 (-6.14%) instances of 'math exp': 389 -> 435 (11.8%) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-07 12:54:57 -08:00
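The rewrite in the commit above is an algebraic simplification on the compiler's intermediate representation. A minimal sketch on a toy expression tree, assuming a hypothetical `struct expr` and `optimize_pow2()` rather than the GLSL IR classes, could be:

```c
#include <assert.h>
#include <stddef.h>

/* Toy expression node; the real GLSL IR is considerably richer. */
enum op { OP_CONST, OP_VAR, OP_POW, OP_EXP2 };

struct expr {
    enum op op;
    float value;               /* only meaningful for OP_CONST */
    struct expr *operands[2];  /* OP_POW uses both, OP_EXP2 uses [0] */
};

/* Rewrite pow(2, x) into exp2(x) in place; leave other nodes untouched. */
static void optimize_pow2(struct expr *e)
{
    if (e->op == OP_POW &&
        e->operands[0]->op == OP_CONST &&
        e->operands[0]->value == 2.0f) {
        e->op = OP_EXP2;
        e->operands[0] = e->operands[1];  /* the exponent becomes the sole operand */
        e->operands[1] = NULL;
    }
}
```

The constant 2.0 no longer appears in the rewritten node, which matches the commit's point that EXP2 avoids putting 2.0 in a register.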
Kenneth Graunke	5e3fd6a9db	glsl: Refactor is_zero/one/negative_one into an is_value() method. This patch creates a new generic is_value() method, which checks if an ir_constant has a particular value. (For vectors, it must have the single value repeated across all components.) It then rewrites the is_zero/is_one/is_negative_one methods to use this generic helper. All three were basically identical except for the value they checked for. The other difference is that is_negative_one rejects boolean types. The new is_value function maintains this behavior, only allowing boolean types when checking for 0 or 1. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-07 12:54:57 -08:00
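A generic "is this constant equal to a given value in every component" helper, with the three specific checks built on top of it, can be sketched like this. `struct constant` and the function names are simplified stand-ins for illustration; the boolean rule follows the commit text (booleans are only accepted when checking for 0 or 1).

```c
#include <assert.h>
#include <stdbool.h>

enum base_type { TYPE_FLOAT, TYPE_INT, TYPE_BOOL };

/* Simplified constant: up to four components of one base type. */
struct constant {
    enum base_type type;
    int components;
    union { float f[4]; int i[4]; bool b[4]; } v;
};

/* True if every component equals the given value (f for floats,
 * i for ints and bools). Booleans never match values other than 0 or 1. */
static bool is_value(const struct constant *c, float f, int i)
{
    if (c->type == TYPE_BOOL && i != 0 && i != 1)
        return false;

    for (int n = 0; n < c->components; n++) {
        switch (c->type) {
        case TYPE_FLOAT: if (c->v.f[n] != f) return false; break;
        case TYPE_INT:   if (c->v.i[n] != i) return false; break;
        case TYPE_BOOL:  if (c->v.b[n] != (i != 0)) return false; break;
        }
    }
    return true;
}

/* The three former special cases become one-line wrappers. */
static bool is_zero(const struct constant *c)         { return is_value(c, 0.0f, 0); }
static bool is_one(const struct constant *c)          { return is_value(c, 1.0f, 1); }
static bool is_negative_one(const struct constant *c) { return is_value(c, -1.0f, -1); }
```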
Kenneth Graunke	d6c1d66d3a	glsl: Optimize pow(1.0, X) --> 1.0. Surprisingly, this helps one vertex shader in 3DMMES. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-07 12:54:57 -08:00
Kenneth Graunke	05fbb021a6	mesa: Use get_local_param_pointer in glProgramLocalParameters4fvEXT(). Using the get_local_param_pointer helper ensures that the LocalParams arrays have actually been allocated before attempting to use them. glProgramLocalParameters4fvEXT needs to do a bit of extra checking, but it can be simplified since the helper has already validated the target. Fixes crashes in programs that use Cg (for example, Awesomenauts, Rocketbirds: Hardboiled Chicken, and Tiny and Big: Grandpa's Leftovers) since commit `e5885c119d` (mesa: Dynamically allocate the storage for program local parameters.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73136 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Laurent Carlier <lordheavym@gmail.com>	2014-01-07 12:50:23 -08:00
José Fonseca	2d368b982a	llvmpipe: Basic implementation of pipe_context::set_sample_mask. We don't support MSAA (ie, number of samples is always one) therefore sample_mask boils down to a synonym of the rasterizer_discard flag. Also, this change makes setup actually use the value received in lp_setup_set_rasterizer_discard instead of reaching out to llvmpipe upper layers to re-fetch it. Based on Si Chen's draft. With this patch `wgf11multisample Coverage passes 100%` on the UMD D3D10 state tracker. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Si Chen <sichen@vmware.com>	2014-01-07 16:04:42 +00:00
José Fonseca	95bf222603	cso_context: Fix cso_context::sample_mask initial value. The initial value of cso_context::sample_mask_saved is irrelevant as it will be overwritten with cso_context::sample_mask in cso_save_sample_mask. Therefore it is cso_context::sample_mask that needs to be properly initialized. This fixes regressions in blits and mipmap generation after adding support for sample_mask to llvmpipe. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-07 16:04:42 +00:00
Si Chen	72c6d0e506	llvmpipe: Implement alpha_to_coverage for non-MSAA framebuffers. Implement Alpha to Coverage by discarding a fragment when its alpha component is less than 0.5. This is a joint work of Jose and Si. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-07 16:04:42 +00:00
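With a single sample per pixel, coverage is all-or-nothing, so the rule in the commit above reduces to a threshold test. A minimal sketch (the function name is hypothetical, and llvmpipe itself generates this test in JIT-compiled shader code rather than in plain C):

```c
#include <assert.h>
#include <stdbool.h>

/* Non-MSAA alpha-to-coverage: the single sample is covered only when
 * alpha reaches 0.5; below that the fragment is discarded. */
static bool fragment_survives_alpha_to_coverage(float alpha)
{
    return alpha >= 0.5f;
}
```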
Andreas Fänger	2a0fb946e1	swrast: fix delayed texel buffer allocation regression for OpenMP Commit `9119269ca1` moved the texel buffer allocation to _swrast_texture_span(), however, when compiled with OpenMP support this code already runs multi-threaded so a critical section is required to prevent multiple allocations and rendering errors. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-07 08:03:49 -07:00
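The race described above arises when several OpenMP threads reach a lazy allocation at once. A sketch of guarding such an allocation with a critical section follows; `texel_buffer` and `get_texel_buffer()` are hypothetical names, and the `_OPENMP` guard only keeps the pragma out of builds without OpenMP.

```c
#include <assert.h>
#include <stdlib.h>

/* Buffer shared by all threads, allocated on first use. */
static float *texel_buffer = NULL;

/* Return the shared buffer, allocating it exactly once even when
 * called concurrently from several OpenMP threads. */
static float *get_texel_buffer(size_t count)
{
    float *buf;

#ifdef _OPENMP
#pragma omp critical (texel_buffer_alloc)
#endif
    {
        /* Only one thread at a time runs the check and the allocation,
         * so two threads cannot both see NULL and allocate twice. */
        if (texel_buffer == NULL)
            texel_buffer = calloc(count, sizeof(float));
        buf = texel_buffer;
    }
    return buf;
}
```

Without the critical section, two threads could both observe a NULL pointer, allocate separately, and leave one thread rendering into a buffer that is about to be overwritten.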
Dave Airlie	aa4e2243a2	gallium/draw: remove double semicolon code cleanup. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-01-07 18:52:46 +10:00
Brian Paul	8d1400fe12	glsl: rename min(), max() functions to fix MSVC build Evidently, there's some other definition of "min" and "max" that causes MSVC to choke on these function names. Renaming to min2() and max2() fixes things. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 16:57:49 -07:00
Kenneth Graunke	f6b10544cd	i965: Remove unused PIPE_CONTROL defines. Both brw_defines.h and intel_reg.h defined PIPE_CONTROL fields, which had similar names, but couldn't be used in the same way. (One had built-in shifts, and the other didn't...) Delete the unused set to preserve sanity. (Eric wrote an almost identical patch back in August, so I believe he approves.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 15:45:42 -08:00
Vinson Lee	f8432832a7	mesa: Remove GLXContextID typedef from glxext.h. This patch fixes this build error with gcc <= 4.5 and clang <= 3.1. CC clientattrib.lo In file included from ../../include/GL/glx.h:333:0, from glxclient.h:45, from clientattrib.c:32: ../../include/GL/glxext.h:275:13: error: redefinition of typedef 'GLXContextID' ../../include/GL/glx.h:171:13: note: previous declaration of 'GLXContextID' was here Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70591 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:57:23 -08:00
Maxence Le Doré	a44ca3595e	docs/relnotes/10.1.html: report AMD_shader_trinary_minmax support Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:11 -08:00
Maxence Le Doré	1a9e8c23eb	mesa: enable AMD_shader_trinary_minmax Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:10 -08:00
Maxence Le Doré	eb5dc75601	glsl: implement mid3 built-in function Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:09 -08:00
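The three built-ins of AMD_shader_trinary_minmax can each be expressed with two-operand min and max. The scalar sketch below shows one such formulation; it is an illustration in C, not the GLSL builder code from these commits, and `min2`/`max2` are local helpers.

```c
#include <assert.h>

static float min2(float a, float b) { return a < b ? a : b; }
static float max2(float a, float b) { return a > b ? a : b; }

/* Smallest of three values. */
static float min3(float a, float b, float c)
{
    return min2(min2(a, b), c);
}

/* Largest of three values. */
static float max3(float a, float b, float c)
{
    return max2(max2(a, b), c);
}

/* Median of three values: clamp c to the range [min(a,b), max(a,b)]. */
static float mid3(float a, float b, float c)
{
    return max2(min2(a, b), min2(max2(a, b), c));
}
```

For any ordering of the inputs 1, 2 and 3, `mid3` returns 2, `min3` returns 1 and `max3` returns 3.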
Maxence Le Doré	73c7451587	glsl: implement max3 built-in function Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:08 -08:00
Maxence Le Doré	ce46e14729	glsl: Implement min3 built-in function Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:08 -08:00
Maxence Le Doré	61c450fc81	glsl: add min() and max() functions to builder.cpp Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:07 -08:00
Maxence Le Doré	cf70d2a7c0	glsl: add a shader_trinary_minmax predicate Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:06 -08:00
Maxence Le Doré	ff50493bb3	glsl: Add extension tracking for AMD_shader_trinary_minmax Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:02 -08:00
Alexander von Gluck IV	61ef697afc	haiku libGL: Move from gallium target to src/hgl * The Haiku renderers need to link to libGL to function properly in all usage contexts. As mesa drivers build before gallium targets, we couldn't properly link the mesa swrast driver to the gallium libGL target for Haiku. * This is likely better as it mimics how glx is laid out ensuring the Haiku libGL is better understood. * All renderers properly link in libGL now. Acked-by: Brian Paul <brianp@vmware.com>	2014-01-06 15:50:21 -06:00
Alexander von Gluck IV	b236314a11	haiku: Fix missing HaikuGL header paths Acked-by: Brian Paul <brianp@vmware.com>	2014-01-06 15:50:15 -06:00
Brian Paul	3486f6f31b	mesa: implement missing glGet(GL_RGBA_SIGNED_COMPONENTS_EXT) query This is part of the GL_EXT_packed_float extension. Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=73096 Cc: 10.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-06 13:37:00 -07:00
Eric Anholt	7db56ddee0	i965: Warning fix Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 10:54:22 -08:00
Kenneth Graunke	242ca9acb4	i965: Delete unused INTEL_WRITE_{PART,FULL} and INTEL_READ #defines. These are just software flag values (not hardware specific values), and aren't used anywhere. Delete them to avoid confusion. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 10:52:43 -08:00
Marek Olšák	346b6abab9	radeonsi: calculate NUM_BANKS for DB correctly on CIK NUM_BANKS is not constant on CIK. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-01-06 18:40:42 +01:00
Marek Olšák	bf3c361113	radeonsi: set correct pipe config for Hawaii in DB Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-01-06 18:40:42 +01:00
Marek Olšák	2748b7da7e	radeonsi: disable HTILE for 1D-tiled depth-stencil buffers Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-01-06 18:40:41 +01:00
Juha-Pekka Heikkila	d41f5396f3	glx: check memory allocations in __glXInitVertexArrayState() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-06 10:23:26 -07:00
Juha-Pekka Heikkila	0c04cca0e1	glx: Add missing null check in __glXNewIndirectAPI() Add extra null check in auto generated indirect_init.c via src/mapi/glapi/gen/glX_proto_send.py Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-06 10:23:12 -07:00
Nathan Kidd	0691b37732	docs: fix misspellings Fixed what I noticed; no warranty for exhaustiveness. Signed-off-by: Nathan Kidd <nkidd@opentext.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-06 09:55:38 -07:00
Chris Forbes	a61ae2aa01	i965: set size of txf_mcs payload vgrf properly Previously we left the size of this vgrf as 1, which caused register allocation to be subtly broken. If we were lucky we would explode in the post-alloc instruction scheduler; if we were unlucky we'd just stomp on someone else and get broken rendering. Fixes crash when running `tesseract` with the following settings: msaa 4 glineardepth 0 Also fixes the piglit test: arb_sample_shading-builtin-gl-sample-id Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: Anuj Phogat <anuj.phogat@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72859 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-04 20:24:29 +13:00
Erik Faye-Lund	eb212c5a30	glcpp: error on multiple #else/#elif directives The preprocessor currently accepts multiple else/elif-groups per if-section. The GLSL-preprocessor is defined by the C++ specification, which defines the following parse-rule: if-section: if-group elif-groups(opt) else-group(opt) endif-line This clearly only allows a single else-group, that has to come after any elif-groups. So let's modify the code to follow the specification. Add test to prevent regressions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Carl Worth <cworth@cworth.org> Cc: 10.0 <mesa-stable@lists.freedesktop.org>	2014-01-02 14:22:58 -08:00
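The grammar rule quoted in commit `eb212c5a30` (at most one else-group, placed after any elif-groups) can be illustrated with a small checker. This is a minimal sketch in plain C, not the glcpp parser itself: the `directive` enum, `MAX_DEPTH`, and `if_sections_valid()` are hypothetical names invented for illustration, and the input is assumed to be an already-tokenized sequence of conditional directives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical directive kinds; real glcpp uses bison tokens instead. */
enum directive { DIR_IF, DIR_ELIF, DIR_ELSE, DIR_ENDIF };

#define MAX_DEPTH 64 /* arbitrary nesting limit for this sketch */

/* Returns true if every if-section has the shape
 *   if-group elif-groups(opt) else-group(opt) endif-line
 * i.e. at most one #else, and no #elif after the #else. */
bool if_sections_valid(const enum directive *dirs, size_t n)
{
    bool seen_else[MAX_DEPTH]; /* one flag per open if-section */
    size_t depth = 0;

    for (size_t i = 0; i < n; i++) {
        switch (dirs[i]) {
        case DIR_IF:
            if (depth == MAX_DEPTH)
                return false;
            seen_else[depth++] = false;
            break;
        case DIR_ELIF:
            /* #elif outside an if-section, or after #else, is an error. */
            if (depth == 0 || seen_else[depth - 1])
                return false;
            break;
        case DIR_ELSE:
            /* A second #else in the same if-section is an error. */
            if (depth == 0 || seen_else[depth - 1])
                return false;
            seen_else[depth - 1] = true;
            break;
        case DIR_ENDIF:
            if (depth == 0)
                return false;
            depth--;
            break;
        }
    }
    return depth == 0; /* every #if must be closed */
}
```

Keeping one flag per nesting level means an inner if-section's `#else` does not affect the outer one.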
Carl Worth	6005e9cb28	glcpp: Replace multi-line comment with a space (even as part of macro definition) The preprocessor has always replaced multi-line comments with a single space character, (as required by the specification), but as of commit `bd55ba568b` the lexer also emitted a NEWLINE token for each newline within the comment, (in order to preserve line numbers). The emitting of NEWLINE tokens within the comment broke the rule of "replace a multi-line comment with a single space" as could be exposed by code like the following: #define FOO a/* */b FOO Prior to commit `bd55ba568b`, this code defined the macro FOO as "a b" as desired. Since that commit, this code instead defines FOO as "a" and leaves a stray "b" in the output. In this commit, we fix this by not emitting the NEWLINE tokens while lexing the comment, but instead merely counting them in the commented_newlines variable. Then, when the lexer next encounters a non-commented newline it switches to a NEWLINE_CATCHUP state to emit as many NEWLINE tokens as necessary (so that subsequent parsing stages still generate correct line numbers). Of course, it would have been more clear if we could have written a loop to emit all the newlines, but flex conventions prevent that, (we must use "return" for each token we emit). It similarly would have been clear to have a new rule restricted to the <NEWLINE_CATCHUP> state with an action much like the body of this if condition. The problem with that is that this rule must not consume any characters. It might be possible to write a rule that matches a single lookahead of any character, but then we would also need an additional rule to ensure for the <EOF> case where there are no additional characters available for the lookahead to match. 
Given those considerations, and given that the SKIP-state manipulation already involves a code block at the top of the lexer function, before any rules, it seems best to me to go with the implementation here which adds a similar pre-rule code block for the NEWLINE_CATCHUP. Finally, this commit also changes the expected output of a few, existing glcpp tests. The change here is that the space character resulting from the multi-line comment is now emitted before the newlines corresponding to that comment. (Previously, the newlines were emitted first, and the space character afterward.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72686 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-02 14:15:51 -08:00
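The deferred-newline idea from commit `6005e9cb28` can be sketched outside of flex as a plain string pass. This is only an illustration under assumptions: `replace_comments()` is a hypothetical helper, it ignores string literals and `//` comments, and the real fix uses a NEWLINE_CATCHUP lexer start state rather than a loop. The counter name `commented_newlines` is borrowed from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Replaces each multi-line comment with one space and defers the newlines
 * found inside it until the next newline outside a comment, so the line
 * count is preserved. `out` must hold at least strlen(in) + 1 bytes; the
 * output is never longer than the input. Returns the output length. */
size_t replace_comments(const char *in, char *out)
{
    size_t o = 0;
    size_t commented_newlines = 0;

    while (*in) {
        if (in[0] == '/' && in[1] == '*') {
            in += 2;
            /* Count, but do not emit, newlines inside the comment. */
            while (*in && !(in[0] == '*' && in[1] == '/')) {
                if (*in == '\n')
                    commented_newlines++;
                in++;
            }
            if (*in)
                in += 2; /* skip the closing delimiter */
            out[o++] = ' ';
        } else if (*in == '\n') {
            out[o++] = *in++;
            /* Catch up: emit the newlines swallowed by earlier comments. */
            while (commented_newlines > 0) {
                out[o++] = '\n';
                commented_newlines--;
            }
        } else {
            out[o++] = *in++;
        }
    }
    out[o] = '\0';
    return o;
}
```

With the input from the commit message (`#define FOO a/*`, newline, `*/b`), the first output line becomes `#define FOO a b`, followed by one extra empty line that keeps later line numbers unchanged. The space comes before the catch-up newlines, matching the ordering change the commit describes for the existing tests.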
Carl Worth	61cea49014	glcpp: Add a more descriptive comment for the SKIP state manipulation Two things make this code confusing: 1. The uncharacteristic manipulation of lexer start state outside of flex rules. 2. The confusing semantics of the skip_stack (including the "lexing_if" override and the SKIP_NO_SKIP state). This new comment is intended to bring a bit more clarity for any readers. There is no intended behavioral change to the code here. The actual code changes include better indentation to avoid an excessively-long line, and using the more descriptive INITIAL rather than 0. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-02 14:15:24 -08:00
Courtney Goeltzenleuchter	5a51c1b01a	i965: Enhance intel_texsubimage_tiled_memcpy() to support all levels Support all levels of a supported texture format. Using RGBA 8888 source, mipmap. Results are listed as internal-format: Before (MB/sec), mipmap (MB/sec). 1024x1024: GL_RGBA 627.15, 615.90; GL_RGB 456.35, 611.53. 512x512: GL_RGBA 597.00, 619.95; GL_RGB 440.62, 611.28. 256x256: GL_RGBA 487.80, 587.42; GL_RGB 376.63, 585.00. Benchmark has been sent to mesa-dev list: teximage_enh Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-12-30 14:57:49 -08:00
Courtney Goeltzenleuchter	85784fd832	i965: Add XRGB to intel_texsubimage_tiled_memcpy() MESA_FORMAT_XRGB8888 is equivalent to MESA_FORMAT_ARGB8888 in terms of storage on the device, so okay to use this optimized copy routine. This series builds on work from Frank Henigman to optimize the process of uploading a texture to the GPU. This series adds support for MESA_XRGB_8888 and full miptrees, which were found to be common activities in the Smokin' Guns game. The issue was found while profiling the app but that part is not benchmarked. Smokin-Guns uses mipmap textures with an internal format of GL_RGB (MESA_XRGB_8888 in the driver). These changes need a performance tool to run against to show how they improve execution performance for specific texture formats. Using this benchmark I've measured the following improvement on my Ivybridge Intel(R) Xeon(R) CPU E3-1225 V2 @ 3.20GHz. Results are listed as internal-format: Before (MB/sec), XRGB (MB/sec). 1024x1024 texture size: GL_RGBA 628.15, 627.15; GL_RGB 265.95, 456.35. 512x512 texture size: GL_RGBA 600.23, 597.00; GL_RGB 255.50, 440.62. 256x256 texture size: GL_RGBA 489.08, 487.80; GL_RGB 229.03, 376.63. Benchmark has been sent to mesa-dev list: teximage Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-12-30 14:57:48 -08:00
Paul Berry	77c74c647b	glsl: Fix gl_type of usamplerCube built-in type. I'm not aware of any piglit tests that this fixes, but the old code was obviously wrong. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-30 11:21:39 -08:00
Paul Berry	7e0b4b5e9b	mesa: Add an assertion to _mesa_program_index_to_target(). Only a Mesa bug could cause this function to be called with an out-of-range index, so raise an assertion if that ever happens. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-30 11:21:33 -08:00
Paul Berry	99e822fa18	mesa: Improve static error checking of arrays sized by MESA_SHADER_TYPES. This patch replaces the following pattern: foo bar[MESA_SHADER_TYPES] = { ... }; With: foo bar[] = { ... }; STATIC_ASSERT(Elements(bar) == MESA_SHADER_TYPES); This way, when a new shader type is added in a future version of Mesa, we will get a compile error to remind us that the array needs to be updated. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-30 11:21:27 -08:00
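The pattern described in commit `99e822fa18` can be sketched as follows. With an explicit size such as `bar[MESA_SHADER_TYPES]`, C silently zero-fills any missing initializers, so a forgotten entry compiles cleanly. Leaving the size out and asserting on the element count turns that omission into a build failure. The names below (`shader_type`, `ELEMENTS`, `shader_names`, `shader_name`) are hypothetical stand-ins, and C11 `_Static_assert` is used in place of Mesa's `STATIC_ASSERT` macro.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for Mesa's shader type enum. */
enum shader_type {
    SHADER_VERTEX,
    SHADER_GEOMETRY,
    SHADER_FRAGMENT,
    SHADER_TYPES /* count of shader types; must stay last */
};

/* Hypothetical stand-in for Mesa's Elements() macro. */
#define ELEMENTS(a) (sizeof(a) / sizeof((a)[0]))

/* No explicit size: the compiler counts the initializers. */
static const char *const shader_names[] = {
    "vertex",
    "geometry",
    "fragment",
};

/* Compilation fails here if a shader type is added without a name. */
_Static_assert(ELEMENTS(shader_names) == SHADER_TYPES,
               "shader_names needs one entry per shader type");

const char *shader_name(enum shader_type t)
{
    assert((size_t)t < ELEMENTS(shader_names));
    return shader_names[t];
}
```

Adding a fourth enumerator before `SHADER_TYPES` without extending `shader_names` makes the static assertion fail at compile time, which is the reminder the commit message is after.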
Paul Berry	b30e25f297	glsl: Remove extraneous shader_type argument from analyze_clip_usage(). This argument was carrying the name of the shader target (as a string). We can get this just as easily by calling _mesa_shader_enum_to_string(). Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-30 11:21:24 -08:00
Paul Berry	d343e3d98c	glsl: Get rid of hardcoded arrays of shader target names. We already have a function for converting a shader type index to a string: _mesa_shader_type_to_string(). Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-30 11:21:21 -08:00
Paul Berry	89c35c59a4	main: Remove unused function _mesa_shader_index_to_type(). Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-30 11:21:14 -08:00
Paul Berry	26707abe56	Rename overloads of _mesa_glsl_shader_target_name(). Previously, _mesa_glsl_shader_target_name() had an overload for GLenum and an overload for the gl_shader_type enum, each of which behaved differently. However, since GLenum is a synonym for unsigned int, and unsigned ints are often used in place of gl_shader_type (e.g. in loop indices), there was a big risk of calling the wrong overload by mistake. This patch gives the two overloads different names so that it's always clear which one we mean to call. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-30 11:21:08 -08:00
Kenneth Graunke	f425d56ba4	Revert "mesa: Remove GLXContextID typedef from glx.h." This reverts commit `136a12ac98`. According to belak51 on IRC, this commit broke Allegro, which would no longer compile. Applications apparently expect the GLXContextID typedef to exist in glx.h; removing it breaks them. A bit of searching around the internet revealed other complaints since upgrading to Mesa 10. Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-29 23:23:33 -08:00
Kenneth Graunke	da031f83f7	i965: Remove unused depth_mode parameter from translate_tex_format(). According to git blame, this hasn't been used in over two years: commit `d2235b0f46` Author: Eric Anholt <eric@anholt.net> Date: Thu Nov 17 17:01:58 2011 -0800 i965: Always handle GL_DEPTH_TEXTURE_MODE through the shader. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-29 23:18:24 -08:00
Topi Pohjolainen	597a7ccc72	i965/blorp: unit test compiling integer typed texture fetches Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:45 +02:00
Topi Pohjolainen	1c76b53482	i965/blorp: unit test compiling simple gen6 zero-src sampled Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:38 +02:00
Topi Pohjolainen	118c093d56	i965/blorp: unit test compiling gen6 msaa-8 cms alpha blend Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:34 +02:00
Topi Pohjolainen	b03319ddb1	i965/blorp: unit test compiling bilinear filtered Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:31 +02:00
Topi Pohjolainen	b928e345e4	i965/blorp: unit test compiling simple zero-src sampled Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:27 +02:00
Topi Pohjolainen	001b92c112	i965/blorp: unit test compiling unaligned msaa-8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:23 +02:00
Topi Pohjolainen	0f89ebacbb	i965/blorp: unit test compiling msaa-8 cms alpha blend Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:19 +02:00
Topi Pohjolainen	90dcf31631	i965/blorp: unit test compiling msaa-4 ums to cms Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:15 +02:00
Topi Pohjolainen	11d2986a53	i965/blorp: unit test compiling msaa-8 cms to cms Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:11 +02:00
Topi Pohjolainen	28d2c969e7	i965/blorp: unit test compiling msaa-8 ums to cms Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:07 +02:00
Topi Pohjolainen	812f1e94c0	i965/blorp: unit test compiling blend and scaled Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:03 +02:00
Topi Pohjolainen	a7757bf518	i965/blorp: allow unit tests to compile and dump assembly Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:58:59 +02:00
Topi Pohjolainen	1cb22f0da2	i965: dump the disassembly to the given file instead of ignoring the argument and always dumping to standard output. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:58:52 +02:00
Topi Pohjolainen	1958a9bbdf	i965/fs: allow fs-generator use without gl_fragment_program Prepares the generator to accept hand-crafted blorp programs. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:58:46 +02:00
Topi Pohjolainen	ca53704f4b	i965/fs: generate fs programs also without any 8-width instructions Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:58:36 +02:00
Rob Clark	8ab47b4353	freedreno/a3xx: fix blend state corruption issue Using RMW on banked context registers is not safe. The value read could be the wrong one. So if there has been a DRAW_IDX launched, the RMW must be preceded by a WAIT_FOR_IDLE to ensure the read part of RMW sees the correct value. To avoid unnecessary WFI's, keep track if there is a need for WFI, and only emit one if needed. Furthermore, keep track if we even need to update the register in the first place. And to cut down on the amount of RMW to avoid excessive WFI's, at the tiling/GMEM level we can always overwrite RB_RENDER_CONTROL, as the state at beginning of draw/clear cmds (which we IB to) is always undefined. In the draw/clear commands, we always still use RMW (with WFI if needed), but only if the register value actually changes. (At points where the current value cannot be known, the saved value is reset to ~0, which includes bits outside of RBRC_DRAW_STATE, so there never is chance for confusion.) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-12-26 12:13:42 -05:00
Rob Clark	be01d7a905	freedreno: prepare for hw binning Actually assign VSC_PIPE's properly, which will be needed for tiling. And introduce fd_tile for per-tile state (including the assignment of tile to VSC_PIPE). This gives us the proper pipe setup that we'll need for hw binning pass, and also cleans things up a bit by not having to pass so many parameters around. And will also make it easier to introduce different tiling patterns (since we may no longer render tiles in a simple left-to-right top-to-bottom pattern). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-12-26 12:06:29 -05:00
Rob Clark	64fe067066	freedreno: resync generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-12-26 12:06:29 -05:00
Alex Deucher	e2d53fac1c	r600g: fix SUMO2 pci id 0x9649 is sumo2, not sumo. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-24 15:22:31 -05:00
Vinson Lee	35a3414302	scons: Add system library linker flags on LLVM 3.5. llvm-3.5svn r197664 split out the linker flags from ldflags to system-libs. Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-12-23 11:33:29 -08:00
Aaron Watry	3ddabe0d52	r600/pipe: Stop leaking context->start_compute_cs_cmd.buf on EG/CM Found while tracking down memory leaks in VDPAU playback Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-23 07:24:50 -06:00
Aaron Watry	20446d0e53	st/vdpau: Destroy context when initialization fails Prevents a potential memory leak found when tracking down something else. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-23 07:24:50 -06:00
Aaron Watry	767b0f82c3	radeon/llvm: Free target data at end of optimization Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-23 07:24:50 -06:00
Aaron Watry	0bd858d7ff	r600/compute: Use the correct FREE macro when deleting compute state Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-23 07:24:50 -06:00
Aaron Watry	e19717d075	r600/compute: Free compiled kernels when deleting compute state v2: Remove unnecessary null pointer check CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-23 07:24:50 -06:00
Aaron Watry	8c9a9205d9	radeon/compute: Stop leaking LLVMContexts in radeon_llvm_parse_bitcode Previously we were creating a new LLVMContext every time that we called radeon_llvm_parse_bitcode, which caused us to leak the context every time that we compiled a CL program. Sadly, we can't dispose of the LLVMContext at the point that it was being created because evergreen_launch_grid (and possibly the SI equivalent) was assuming that the context used to compile the kernels was still available. Now, we'll create a new LLVMContext when creating EG/SI compute state, store it there, and pass it to all of the places that need it. The LLVM Context gets destroyed when we delete the EG/SI compute state. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-23 07:24:50 -06:00
Aaron Watry	a7653c19a3	pipe_loader/sw: close dev->lib when initialization fails Prevents a memory leak. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-23 07:24:50 -06:00
Aaron Watry	862f55c29c	clover: Remove unused variable Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-23 07:24:50 -06:00
Jonathan Liu	7990ab58fa	llvmpipe: use pipe_sampler_view_release() to avoid segfault This fixes another case of faulting when freeing a pipe_sampler_view that belongs to a previously destroyed context. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jonathan Liu <net147@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-22 07:07:56 -07:00
Jonathan Liu	670be71bd8	st/mesa: use pipe_sampler_view_release() This fixes a crash where old_view->context was already freed in the pipe_sampler_view_reference function contained in src/gallium/auxiliary/utils/u_inlines.h. As a result, the sampler_view_destroy function pointer contained 0xfeeefeee indicating freed heap memory. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jonathan Liu <net147@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-22 07:07:07 -07:00
Henri Verbeet	b094b3b9f4	i915: Add support for gl_FragData[0] reads. Similar to `556a47a262`, without this reading from gl_FragData[0] would cause a software fallback. Bugzilla: https://bugs.winehq.org/show_bug.cgi?id=33964 Signed-off-by: Henri Verbeet <hverbeet@gmail.com> Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-22 11:55:39 +01:00
Andreas Hartmetz	2efe7927d3	radeonsi: Use htile_buffer for depth only when there is no stencil. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-12-22 01:41:03 +01:00
Niels Ole Salscheider	900ac63ee8	winsys/radeon: remove superfluous distinction of cases Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-12-22 01:41:02 +01:00
Mark Mueller	852db050b9	mesa: inline r200 radeon texture format macros to facilitate search and replace Signed-off-by: Mark Mueller <MarkKMueller@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-12-21 15:27:29 +01:00
Lauri Kasanen	fcefdc9a59	mesa: Fix build to properly check for supported compiler flags Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72708 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Lauri Kasanen <cand@gmx.com>	2013-12-20 17:00:57 -08:00
Ian Romanick	79f268978d	mesa: It is not possible to have GLSL < 1.20 This hasn't been possible for a long time. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-20 16:43:08 -08:00
Ian Romanick	4949322462	mesa: Clean up bad code formatting left from previous commit Also s/_EXT// on enums that are now part of core. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-20 16:43:08 -08:00
Ian Romanick	a92b9e60ab	mesa: GL_EXT_packed_depth_stencil is not optional Every driver supports it. All current and future Gallium drivers always support it, and all existing classic drivers support it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-20 16:43:08 -08:00
Ian Romanick	b66edff435	radeon: Sort list of enabled extensions Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-20 16:43:08 -08:00
Ian Romanick	1bf436e014	r200: Sort list of enabled extensions Note that ARB_occlusion_query was previously enabled twice. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-20 16:43:08 -08:00
Lauri Kasanen	fe2079c4c0	glx: Simplify __glxGetMscRate, it only needs the screen, not a drawable Useful in its own right, but also needed for adaptive vsync. No regressions in the piglit glx-oml-sync-control-getmscrate test. Signed-off-by: Lauri Kasanen <cand@gmx.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-20 16:43:08 -08:00
Keith Packard	6b51113981	dri3: Rename DRI3_MAX_BACK to DRI3_NUM_BACK It is the maximum number of back buffers, but the name is confusing and is easily read as the maximum back buffer index. Change to DRI3_NUM_BACK to make the intended usage a bit clearer. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-20 16:31:09 -08:00
Keith Packard	547bcc4b57	i965: Set fast color clear mcs_state on newly allocated image miptrees Just copying code from the dri2 path to set up the fast color clear state. This also removes a couple of bogus intel_region_reference calls. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-20 16:19:52 -08:00
Keith Packard	c426fb08cf	i965: Correct check for re-bound buffer in intel_update_image_buffer The buffer-object is the persistent thing passed through the loader, so when updating an image buffer, check to see if it is already bound to the provided bo. The region, on the other hand, is allocated separately for the miptree, and so will never be the same as that passed back from the loader. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-20 16:18:37 -08:00
Keith Packard	ca2012a912	dri3: Clean up struct dri3_drawable Move the depth field up with width and height. Remove unused previous_time and frames fields. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-20 16:18:11 -08:00
Keith Packard	95b04850d0	dri3: Free resources when drawable is destroyed. Always nice to clean up after ourselves. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-20 16:17:59 -08:00
Keith Packard	568a27588d	dri3: Switch to libxshmfence version 1.1 libxshmfence v1.0 foolishly used 'int32_t *' for the fence type, which works when the fence is a linux futex. However, version 1.1 changes the exported datatype to 'struct xshmfence *'. Require libxshmfence version 1.1 and switch the API around. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-20 16:17:54 -08:00
Kenneth Graunke	9f330481c3	i965: Use RED for depth texture formats rather than INTENSITY. While looking through the documentation, I found this in the Sandybridge PRM (Volume 4, Part 1, Page 140): "Use of sample_c with SURFTYPE_CUBE surfaces is undefined with the following surface formats: I24X8_UNORM, L24X8_UNORM, A24X8_UNORM, I32_FLOAT, L32_FLOAT, A32_FLOAT." I haven't observed this to be true, but it suggests that we may want to use other formats. We already perform DEPTH_TEXTURE_MODE swizzling in the shaders, and don't rely on the surface format to splat things appropriately. So using RED should work just as well as INTENSITY. A few notes about the formats: - R24_UNORM_X8_TYPELESS has the exact same properties as I24X8_UNORM. - R16_UNORM and R32_FLOAT are additionally supported as a render target, while the old I16_UNORM/I32_FLOAT formats are not. - R32_FLOAT_X8X24_TYPELESS is not supported as a render target, while the old format (R32G32_FLOAT) was. However, it shares the same properties as the formats we use for Z24, so it should suffice. This makes translate_tex_format and brw_blorp_surface_info::set a bit more similar. No Piglit changes on Sandybridge or Ivybridge. No oglconform changes on Sandybridge. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-20 16:14:35 -08:00
Chad Versace	1a928816a1	i965/gen6: Fix HiZ hang in WebGL Google Maps Emitting flushes before depth and hiz resolves at the top of blorp's state emission fixes the hang. Marchesin and I found the fix experimentally, as opposed to adhering to a documented hardware workaround. A more minimal fix likely exists, but this gets the job done. Fixes HiZ hangs in the new WebGL Google maps on Sandybridge Chrome OS. Tested by zooming in and out continuously for 2 hours. This patch is based on `8bc07bb701` CC: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70740 Signed-off-by: Stéphane Marchesin <marcheu@chromium.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-20 15:20:30 -08:00
Kenneth Graunke	b97fa1e75b	i965: Store QPitch in intel_mipmap_tree. Broadwell allows us to specify an arbitrary value for QPitch, rather than baking a specific formula into the hardware and requiring software to lay things out to match. The only restriction is that the software provided QPitch needs to be large enough so successive array slices do not overlap. In order to support this flexibility, software needs to specify QPitch in a bunch of packets. Storing QPitch makes that easy, and allows us to adjust it in a single place should we wish to change it in the future. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-20 12:41:54 -08:00
Kenneth Graunke	1e8e17ccd7	i965: Add support for Broadwell's new register types. Broadwell introduces support for Q, UQ, and HF types. It also extends DF support to allow immediate values. Irritatingly, although HF and DF both support immediates, they're represented by a different value depending on the register file. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-20 12:34:43 -08:00
Kenneth Graunke	15b9aa22d7	i965: Add BRW_REGISTER_TYPE_DF. Ivybridge, Baytrail, and Haswell support double float register types, but do not support them as immediate values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-20 12:34:41 -08:00
Kenneth Graunke	54e91e7420	i965: Abstract BRW_REGISTER_TYPE_* into an enum with unique values. On released hardware, values 4-6 are overloaded. For normal registers, they mean UB/B/DF. But for immediates, they mean UV/VF/V. Previously, we just created #defines for each name, reusing the same value. This meant we could directly splat the brw_reg::type field into the assembly encoding, which was fairly nice, and worked well. Unfortunately, Broadwell makes this infeasible: the HF and DF types are represented as different numeric values depending on whether the source register is an immediate or not. To preserve sanity, I decided to simply convert BRW_REGISTER_TYPE_* to an abstract enum that has a unique value for each register type, and write translation functions. One nice benefit is that we can add assertions about register files and generations. I've chosen not to convert brw_reg::type to the enum, since converting it caused a lot of trouble due to C++ enum rules (even though it's defined in an extern "C" block...). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-20 12:34:39 -08:00
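The approach in commit `54e91e7420` (an abstract enum with unique values plus a translation function) can be illustrated with a small sketch. Everything here is hypothetical: the `reg_type` enum and `reg_type_to_hw()` are invented names, only the six overloaded types are covered, and the pairing of encodings 4, 5, 6 with UB/B/DF and UV/VF/V follows the order in which the commit message lists them, which is an assumption about the exact assignment. Broadwell's differing HF/DF encodings are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical abstract register types: one unique value per type, so the
 * same number never means two different things. */
enum reg_type {
    REG_TYPE_UB,
    REG_TYPE_B,
    REG_TYPE_DF,
    REG_TYPE_UV,
    REG_TYPE_VF,
    REG_TYPE_V
};

/* Translates an abstract type to the overloaded hardware encoding.
 * Returns -1 when the type is not valid for the given register file
 * (assumption: pre-Broadwell rules, where DF cannot be an immediate). */
int reg_type_to_hw(enum reg_type type, bool is_immediate)
{
    switch (type) {
    case REG_TYPE_UB: return is_immediate ? -1 : 4;
    case REG_TYPE_B:  return is_immediate ? -1 : 5;
    case REG_TYPE_DF: return is_immediate ? -1 : 6;
    case REG_TYPE_UV: return is_immediate ? 4 : -1;
    case REG_TYPE_VF: return is_immediate ? 5 : -1;
    case REG_TYPE_V:  return is_immediate ? 6 : -1;
    }
    return -1;
}
```

Funnelling every encoding through one function gives a single place to reject invalid combinations, which corresponds to the assertions about register files and generations that the commit message mentions as a benefit.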
Kenneth Graunke	13454fc3de	i965: Decode three-source register types directly. Three-source instructions use a different encoding for register types (and have a much more limited set to choose from). Previously, we translated those into BRW_REGISTER_TYPE_* values, then reused the existing reg_encoding mapping. Doing it directly is more straightforward and actually less code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-20 12:34:38 -08:00
Kenneth Graunke	4e95a09937	i965: Disassemble UV types, not UB types. UB types have never been supported as immediates. On Gen4-5, register encoding 4 is "Reserved." On Gen6+, it means UV. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-20 12:34:36 -08:00
Kenneth Graunke	d10242c5f7	i965: Add missing BRW_REGISTER_TYPE_UV. Sandybridge added support for packed unsigned vectors. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-20 12:34:15 -08:00
Kenneth Graunke	51c9cfc296	i965: Fix 3DSTATE_PUSH_CONSTANT_ALLOC_PS packet creation. When adding geometry shader support, we accidentally reversed the size and offset parameters. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-20 12:25:43 -08:00
Kenneth Graunke	0d0edf8e4c	i965: Use {point_sprite,flat}_enable variable names instead of dw. Calling the local variables flat_enable and point_sprite_enable is clearer than dw16 and such. It also matches the names used in calculate_attr_overrides, which computes them. v2: Add /* dw16 */ and /* dw10 */ comments, requested by Jordan. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-12-20 12:25:33 -08:00
Kenneth Graunke	23fc845f81	i965: Zero out {point_sprite,flat}_enables in calculate_attr_overrides. calculate_attr_overrides is responsible for computing the point sprite and flat-shading enable bitfields. It does so by OR'ing in a bunch of bits. However, it relied on the caller to set the initial value to zero. This is pretty fragile - if the caller neglects to zero out those variables, then the enable bitfields end up full of garbage, which shows up as random things being flat-shaded. This patch moves the zero-initialization into calculate_attr_overrides, so that the computation is completely in one place. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-12-20 12:25:33 -08:00
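The fragility described in the commit above can be illustrated with a small sketch. The function `sketch_compute_enables` and its parameters are hypothetical stand-ins, not the real `calculate_attr_overrides` signature. The point is only that the callee clears its outputs before OR'ing bits in.

```c
#include <stdint.h>

/* Hypothetical stand-in for an enable-bitfield computation.  The outputs
 * are zeroed here, so the caller does not have to initialize them. */
void
sketch_compute_enables(const int *flat_attrs, int n_flat,
                       const int *sprite_attrs, int n_sprite,
                       uint32_t *flat_enables,
                       uint32_t *point_sprite_enables)
{
   *flat_enables = 0;
   *point_sprite_enables = 0;

   for (int i = 0; i < n_flat; i++)
      *flat_enables |= UINT32_C(1) << flat_attrs[i];

   for (int i = 0; i < n_sprite; i++)
      *point_sprite_enables |= UINT32_C(1) << sprite_attrs[i];
}
```

With this shape, a caller that passes uninitialized variables still gets well-defined bitfields back.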
Kenneth Graunke	da872ddcc6	i965: Delete bogus BRW_REGISTER_TYPE_HF define. git blame ascribes this to the initial commit of the driver. No released hardware has ever supported half float, according to the documentation for SrcType in the ISA reference. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-20 12:25:33 -08:00
Kevin Rogovin	3b1195f8a6	Report that no function found if signature lookup is empty If no function signature is found for a function name, report that the function is not found instead of printing an empty list of candidates. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-20 09:03:54 -08:00
Kevin Rogovin	23d294bb60	Use line number information from entire function expression This patch changes the error reporting behavior for incorrect function invocation (triggered by match_function_by_name() unable to find a matching function call) from using the line number information associated to the function name term to using the line number information of the entire function expression. Fixes bug #72264. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72264 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-20 09:03:54 -08:00
Michel Dänzer	d580905000	radeonsi: Only scan pixel shaders for TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS It's not relevant for other shader types. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-12-20 18:51:09 +09:00
Aaron Watry	8252847b7b	r600g: Fix spelling error Trivial change, testing commit access	2013-12-19 14:30:51 -06:00
Quanxian Wang	1413a09f34	egl: break instead of looping after driver is found Stop searching for a driver after success. Signed-off-by: Quanxian Wang <quanxian.wang@intel.com> Reviewed-By: Gong, Zhigang <zhigang.gong@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-19 12:44:11 -07:00
Juha-Pekka Heikkila	22bf0f3eb4	mesa: Assert variable coming from get_variable() in get_current_attrib Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-19 08:26:17 -07:00
Juha-Pekka Heikkila	a7d8607d9e	mesa: Add asserts into emit_fog_instructions Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-19 08:25:58 -07:00
Juha-Pekka Heikkila	cd6aaf2920	glx: Fix two identical null check errors in driSet/GetInterval Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-19 08:25:45 -07:00
Dave Airlie	149140e922	st_glsl_to_tgsi: add support for prim id fragment shader input For GLSL 1.50 we can get frag shaders with primitive id as an input, add support to the translator for this. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-12-18 22:46:29 +00:00
Juha-Pekka Heikkila	28b552bf6b	mesa: add asserts in load_texunit_bumpmap In load_texunit_bumpmap tc_array is asserted, so let's also assert rot_mat_0 and rot_mat_1, which come from the same path. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:40:29 -07:00
Juha-Pekka Heikkila	c02f6c26d3	glx: add missing null check in dri2_bind_tex_image Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:40:19 -07:00
Brian Paul	a9bf5999d1	mesa: minor simplification in _mesa_es3_error_check_format_and_type() The type_valid local was set to true and never changed.	2013-12-18 09:06:52 -07:00
Juha-Pekka Heikkila	ca3df5eeda	glx: Add missing null check in dri2CreateDrawable Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:52 -07:00
Juha-Pekka Heikkila	56c5ba8f92	mesa: Verify memory allocations success in _mesa_PushAttrib Check for malloc() returning null to fix Klocwork warnings. Minor clean-ups by BrianP. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:52 -07:00
Juha-Pekka Heikkila	2a83e4182c	mesa: Verify memory allocations success in _mesa_PushClientAttrib Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:52 -07:00
Juha-Pekka Heikkila	d08ac826c5	mesa: Change save_attrib_data() to return boolean Change save_attrib_data() to return true/false depending on success. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:52 -07:00
Brian Paul	aa4001b607	mesa: add API/extension checks for 3-component texture buffer formats The GL_RGB32F, GL_RGB32UI and GL_RGB32I texture buffer formats are only supposed to be allowed if the GL_ARB_texture_buffer_object_rgb32 extension is supported. Note that the texture buffer extensions require a core profile. This patch adds those checks. Fixes the soon-to-be-added arb_clear_buffer_object-negative-bad-internalformat piglit test.	2013-12-18 09:06:52 -07:00
Brian Paul	eaaa9695b2	mesa: 78-column wrapping in extensions.c	2013-12-18 09:06:52 -07:00
Pi Tabred	4bf3afdde9	mesa: Cleanup mesa/main/bufferobj.h Column wrapping and space between lines. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:52 -07:00
Pi Tabred	3b0f5fc084	Modify release notes to include ARB_clear_buffer_object extension Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:52 -07:00
Pi Tabred	78216fb485	Add ARB_clear_buffer_object to list of supported extensions Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:51 -07:00
Brian Paul	787dadbeea	st/mesa: plug in default buffer object driver functions In particular, this plugs in the new ClearBufferSubData() fallback driver function.	2013-12-18 09:06:51 -07:00
Pi Tabred	5f7bc0c759	mesa: Implement functions for clear_buffer_object extensions Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:51 -07:00
Pi Tabred	7d94653052	mesa: Modify get_buffer() to allow for a variable error code Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:51 -07:00
Pi Tabred	84c4ea571d	mesa: Add bufferobj_range_mapped function Add function to test if the buffer is already mapped and if so, if the mapped range overlaps the given range. Modify the _mesa_InvalidateBufferSubData function to use the new function. Enable buffer_object_subdata_range_good() to use bufferobj_range_mapped Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:51 -07:00
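A rough sketch of the kind of check the commit above describes: is the buffer mapped, and if so, does the mapped range overlap a given range? The struct and field names are hypothetical, and the half-open interval convention is an assumption of this sketch, not necessarily what `bufferobj_range_mapped` does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer object: map_pointer is NULL when not mapped. */
struct sketch_buffer {
   void *map_pointer;
   ptrdiff_t map_offset;
   ptrdiff_t map_length;
};

/* Returns true if the buffer is mapped and the mapped range
 * [map_offset, map_offset + map_length) overlaps [offset, offset + size). */
bool
sketch_range_mapped(const struct sketch_buffer *buf,
                    ptrdiff_t offset, ptrdiff_t size)
{
   if (buf->map_pointer == NULL)
      return false;

   /* Two half-open ranges overlap iff each starts before the other ends. */
   return offset < buf->map_offset + buf->map_length &&
          buf->map_offset < offset + size;
}
```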
Pi Tabred	72d872ad82	mesa: get_texbuffer_format(): differentiate between core and compat context alpha, luminance and intensity formats are illegal in a core context. Add a check to return MESA_FORMAT_NONE if one of those is requested within a core context. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:51 -07:00
Pi Tabred	1ec2d0a9a8	mesa: Modify format validation to check for extension not context version Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:51 -07:00
Pi Tabred	d5e6fe4d29	mesa: Make validate_texbuffer_format function available externally - change storage class from static to extern - rename validate_texbuffer_format to _mesa_validate_texbuffer_format Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:51 -07:00
Pi Tabred	1f7c3e541f	mesa: Add infrastructure for GL_ARB_clear_buffer_object - add xml file for extension - add reference in gl_API.xml - add pointer to device driver function table (dd.h) - update dispatch_sanity.cpp Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:51 -07:00
Jan Vesely	56647c5d8f	clover: Append buffers that use CL_MEM_USE_HOST_PTR. Specs say it's legal for implementations to use internal copies, and the write synchronization seems to work. Fixes clCreateBuffer (together with previous patches) and buffer-flags piglits. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-by: Francisco Jerez <currojerez@riseup.net>	2013-12-18 16:21:59 +01:00
Jan Vesely	21f82188ce	clover: Add parameter checks to clCreateBuffer. v2: Use fewer if statements and functional tricks instead of single-use method, suggested by Francisco Jerez. Squash two small patches into one. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-12-18 16:18:15 +01:00
Markus Trippelsdorf	78fcc31d4a	configure.ac: remove -fcolor-diagnostics from LLVM flags When LLVM is built with Clang, "llvm-config --cxxflags" contains the -fcolor-diagnostics flag. It is not recognized by gcc and the build fails. Fix by removing the flag. Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Brian Paul <brianp@vmware.com>	2013-12-18 07:12:13 -07:00
Thomas Hellstrom	00cf048b12	st/dri: Check for kernel support before enabling fd sharing v2 The dri2 state tracker is checking for driver support before enabling dri2ImageExtension version 7. This commit adds a check that also the kernel driver supports fd sharing through prime. Note that this adds a libdrm dependency on dri2.c. v2: Removed unnecessary clamping of bool expression Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>	2013-12-18 09:11:24 +01:00
Marek Olšák	37c24e6d86	radeonsi: set CB_DISABLE if the color mask is 0 Also needed for the DB in-place decompression according to hw docs. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-12-18 01:20:11 +01:00
Marek Olšák	3352ff97c2	radeonsi: add the htile buffer to the CS ioctl buffer list This may fix the GPU crashes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-12-18 01:20:11 +01:00
Paul Berry	7963fde37b	glsl: Replace _mesa_glsl_parser_targets enum with gl_shader_type. These enums were redundant. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-17 12:31:36 -08:00
Paul Berry	abab438543	main: Move MESA_SHADER_TYPES outside of gl_shader_type enum. This will avoid spurious compiler warnings in the patch that follows. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-17 12:31:36 -08:00
Paul Berry	d9b55244fd	glsl: Don't return bad values from _mesa_shader_type_to_index. This will avoid compiler warnings in the patch that follows. There should be no user-visible effect because the change only affects the behaviour when an invalid enum is passed to _mesa_shader_type_to_index(), and that can only happen if there is a bug elsewhere in Mesa. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-17 12:31:35 -08:00
Brian Paul	188630dc13	swrast: silence driContextSetFlags() parameter type warning	2013-12-17 09:47:47 -08:00
Brian Paul	d79058d1c6	st/dri: fix compiler warning for driCopySubBufferExtension	2013-12-17 09:47:47 -08:00
Marek Olšák	2b404a6504	radeonsi: improve HiZ precision for less and lequal depth functions r600g needs this too. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-17 15:41:46 +01:00
Marek Olšák	1a63f278f2	radeonsi: make DB_RENDER_OVERRIDE an invariant register All this cruft was ported from r600g and isn't needed on SI and later according to hw docs. If we implemented HiS, we would set it to 0. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-17 15:41:46 +01:00
Marek Olšák	249cb511c5	radeonsi: flush HTILE when appropriate Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-17 15:41:46 +01:00
Thomas Hellstrom	3e2b0f801d	st/xa: Add new map flags Replicate some of the gallium pipe transfer functionality. Also bump minor to signal availability of this feature. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2013-12-17 09:01:29 +01:00
Alexander von Gluck IV	56d920a5c1	Haiku: Add in public GL kit headers * These make up the base of what C++ GL Haiku applications use for 3D rendering. * Not placed in includes/GL to prevent Haiku headers from getting installed on non-Haiku systems. Acked-by: Brian Paul <brianp@vmware.com>	2013-12-16 18:18:12 -06:00
Rob Clark	f9cfe5ce82	freedreno: dummy-draw workaround for a320 Fixes gpu lockups in supertuxkart. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-12-14 12:35:07 -05:00
Marek Olšák	b56c7f4df1	r600g: expose 32-bit integer vertex formats This advertises GL_ARB_texture_buffer_object_rgb32.	2013-12-14 17:42:08 +01:00
Marek Olšák	2eb321b992	radeonsi: move invariant regs to si_init_config Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-14 17:42:08 +01:00
Marek Olšák	696229523d	r600g: use shader-based MSAA resolving when hw-based one cannot be used This fixes some MSAA integer tests.	2013-12-14 17:42:08 +01:00
Marek Olšák	9ebb9a3c8e	radeonsi: use shader-based MSAA resolving when hw-based one cannot be used This fixes MSAA resolving for 32-bit integer colorbuffers, which isn't implemented by the hardware. It also fixes VM protection faults when resolving MSAA 2D array textures. This may be a CB bug, because shader-based resolving works fine. It may also be faster for upside-down and scaled blits. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-14 17:42:08 +01:00
Marek Olšák	5a609fbcb5	gallium/u_blitter: implement shader-based MSAA resolve with bilinear filtering For scaled resolve. The filter is only good for magnification. If somebody has an idea how to implement a good filter for minification, I'm all ears. I'd have to use derivatives probably. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-14 17:42:08 +01:00
Marek Olšák	fc21098a95	gallium/u_blitter: implement shader-based MSAA resolve We need this for integer formats and upside-down blits, which Radeons don't support for MSAA resolving. It can be used by calling util_blitter_blit. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-14 17:42:08 +01:00
Marek Olšák	f0ed082bab	gallium/u_blitter: remove useless parameters from some functions Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-14 17:42:08 +01:00
Marek Olšák	072c5d0573	st/dri: resolve sRGB buffers in linear colorspace Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-14 17:42:08 +01:00
Roland Scheidegger	27d47bd42f	gallivm: fix pointer type for stmxcsr/ldmxcsr The argument is a i8 pointer not a i32 pointer (even though the value actually stored/loaded IS i32). Older llvm versions didn't care but 3.2 and newer do leading to crashes. Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-12-14 17:11:03 +01:00
Roland Scheidegger	7c027666da	llvmpipe: get rid of barycentric calculation of a0 Didn't really work as well as hoped (in particular it was not generally more accurate), will solve this differently. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-12-14 17:11:03 +01:00
Roland Scheidegger	bfcf1ba1c4	llvmpipe: (trivial) get rid of triangle subdivision code This code was always problematic, and with 64bit rasterization we no longer need it at all. Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-14 17:11:03 +01:00
Kenneth Graunke	35f0aafaa4	i965: Treat Haswell as 75 in the surface format table. Much like we do for G45. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-13 21:14:19 -08:00
Chris Forbes	8bb666cee3	mesa: fix texture view use of _mesa_get_tex_image() The target parameter to _mesa_get_tex_image() is a target enum, not an index. When we're setting up faces for a cubemap, it should be CUBE_MAP_POSITIVE_X .. CUBE_MAP_NEGATIVE_Z; for all other targets it should be the same as the texobj's target. Fixes broken cubemaps [had only +X face but claimed to have all] produced by glTextureView, which then caused various crashes in the driver when we tried to use them. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-14 16:32:41 +13:00
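The target-versus-index distinction in the commit above can be sketched as follows. The helper `sketch_face_target` is hypothetical. The three constants carry the standard OpenGL enum values for `GL_TEXTURE_2D`, `GL_TEXTURE_CUBE_MAP` and `GL_TEXTURE_CUBE_MAP_POSITIVE_X`, redefined locally so the sketch needs no GL headers.

```c
/* Local copies of the standard GL enum values. */
#define SKETCH_TEXTURE_2D                   0x0DE1
#define SKETCH_TEXTURE_CUBE_MAP             0x8513
#define SKETCH_TEXTURE_CUBE_MAP_POSITIVE_X  0x8515

/* Hypothetical helper: pick the target enum to pass along for a given
 * face index.  Cube maps use POSITIVE_X + face (faces 0..5); every other
 * target is passed through unchanged. */
unsigned
sketch_face_target(unsigned texobj_target, unsigned face)
{
   if (texobj_target == SKETCH_TEXTURE_CUBE_MAP)
      return SKETCH_TEXTURE_CUBE_MAP_POSITIVE_X + face;

   return texobj_target;
}
```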
Chris Forbes	544869377d	i965/fs: add support for gl_SampleMaskIn[] v2: - add assert so we don't run into trouble on Gen6. - adjust for Tapani's rearrangement of ir_variable Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-14 16:28:11 +13:00
Chris Forbes	1d71f38924	glsl: add gl_SampleMaskIn[] builtin Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-14 16:24:22 +13:00
Chris Forbes	c1e1dd2298	mesa: add SYSTEM_VALUE_SAMPLE_MASK_IN Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-14 16:24:21 +13:00
Brian Paul	7d91390359	mesa: document _mesa_texstore() return value	2013-12-13 17:02:43 -07:00
Brian Paul	19fa540219	st/mesa: only set up sampler compare mode for depth textures The GL_ARB_shadow spec says the shadow compare mode should have no effect when sampling a color texture. As it was, it was up to drivers to check for that (softpipe, llvmpipe, svga and probably the rest don't do that). Note: it looks like DX10 allows shadow compare with some non-depth formats, so this case really should be handled in the state tracker. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-12-13 16:06:07 -07:00
Brian Paul	31b0e7d024	st/mesa: add const qualifiers in sampler validation code Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-12-13 16:06:06 -07:00
Brian Paul	9f9860b004	st/mesa: add const qualifier to st_translate_color() Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-12-13 16:06:06 -07:00
Brian Paul	eff11b5a4a	st/mesa: simplify integer texture check Just use the gl_texture_object::_IsInteger field instead of computing it from scratch. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-12-13 16:06:06 -07:00
Brian Paul	b5cc710473	mesa: update glext.h to version 20131212 Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-13 16:04:23 -07:00
Brian Paul	d6a8421f3b	svga: don't emit extraneous fs shadow code Depending on the depth texture format, we may or may not have to emit explicit fs code to do the shadow comparison. Before, we were emitting it more often than needed. v2: check the actual texture format rather than the screen->depth.z16 field. The screen->depth.z16, x8z24, s8z24 fields may not all be set to a consistent set of depth formats. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-12-13 12:01:28 -08:00
Brian Paul	e735dfd35b	mesa: s/uint/GLuint/ to fix MSVC error	2013-12-13 12:51:10 -07:00
Courtney Goeltzenleuchter	375f660e27	mesa: Update TexStorage to support ARB_texture_view Call TextureView helper function to set TextureView state appropriately for the TexStorage calls. Misc updates from review feedback. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:54 -07:00
Courtney Goeltzenleuchter	1db4cb841b	mesa: add texture_view helper function for TexStorage Add helper function to set texture_view state from TexStorage calls. Include review feedback. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:54 -07:00
Courtney Goeltzenleuchter	f07ca59839	mesa: Fill out ARB_texture_view entry points Add Mesa TextureView logic. Incorporate feedback on ARB_texture_view: - Add S3TC VIEW_CLASSes to compatibility table - Use existing _mesa_get_tex_image - Clean up error strings - Use bool instead of GLboolean for internal functions - Split compound level & layer test into individual tests - eliminate helper macro for VIEW_CLASS table - do not call driver if ptr null. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:54 -07:00
Courtney Goeltzenleuchter	bb5947de99	mesa: consolidate multiple next_mipmap_level_size Refactor to make next_mipmap_level_size defined in mipmap.c a _mesa_ helper function that can then be used by texture_view Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:54 -07:00
Courtney Goeltzenleuchter	320ec1deac	mesa: Add driver entry point for ARB_texture_view Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:54 -07:00
Courtney Goeltzenleuchter	f1563e6392	mesa: ARB_texture_view get parameters Add support for ARB_texture_view get parameters: GL_TEXTURE_VIEW_MIN_LEVEL GL_TEXTURE_VIEW_NUM_LEVELS GL_TEXTURE_VIEW_MIN_LAYER GL_TEXTURE_VIEW_NUM_LAYERS Incorporate feedback regarding when to allow query of GL_TEXTURE_IMMUTABLE_LEVELS. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:54 -07:00
Courtney Goeltzenleuchter	668f3614ca	mesa: update texture object for ARB_texture_view Add state needed by glTextureView to the gl_texture_object. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:53 -07:00
Courtney Goeltzenleuchter	2e8493af51	mesa: Tracking for ARB_texture_view extension Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:53 -07:00
Courtney Goeltzenleuchter	d77d2af20a	mesa: Add API definitions for ARB_texture_view Stub in glTextureView API call to go with the glTextureView API xml definition. Includes dispatch test for glTextureView Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:53 -07:00
Anuj Phogat	7a73c6acb0	mesa: Fix error code generation in glBeginConditionalRender() This patch changes the error condition to satisfy the statement below from the OpenGL 4.3 core specification: "An INVALID_OPERATION error is generated if id is the name of a query object with a target other than SAMPLES_PASSED, ANY_SAMPLES_PASSED, or ANY_SAMPLES_PASSED_CONSERVATIVE, or if id is the name of a query currently in progress." Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 11:13:25 -08:00
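The error condition quoted in the commit above can be written as a small predicate. This is only a sketch: the enum, struct and function names are hypothetical and do not correspond to Mesa's query object types.

```c
#include <stdbool.h>

/* Hypothetical query targets; TIME_ELAPSED stands in for any target that
 * is not an occlusion query. */
enum sketch_query_target {
   SKETCH_SAMPLES_PASSED,
   SKETCH_ANY_SAMPLES_PASSED,
   SKETCH_ANY_SAMPLES_PASSED_CONSERVATIVE,
   SKETCH_TIME_ELAPSED,
};

struct sketch_query {
   enum sketch_query_target target;
   bool active;   /* true while the query is in progress */
};

/* Returns true when beginning conditional rendering on this query should
 * raise INVALID_OPERATION according to the quoted rule. */
bool
sketch_conditional_render_invalid(const struct sketch_query *q)
{
   const bool is_occlusion =
      q->target == SKETCH_SAMPLES_PASSED ||
      q->target == SKETCH_ANY_SAMPLES_PASSED ||
      q->target == SKETCH_ANY_SAMPLES_PASSED_CONSERVATIVE;

   return !is_occlusion || q->active;
}
```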
Carl Worth	93e399f641	Makefile: Add bin/test-driver to EXTRA_FILES I'm not sure why this change is necessary. When I've built previous tar files (such as 9.2.4) with the "make tarballs" target, they include the bin/test-driver file. But at my first attempt to build the tar files for the 10.0.1 release this file was not being included and the build failed. (cherry picked from commit `d573899b93`) [The cherry pick is because I originally applied this on the 10.0 branch while working on the 10.0.1 release. But if we don't have this on master as well, this issue will trip us up again the next time we make a new major-release branch off of master.]	2013-12-13 11:12:23 -08:00
Kristian Høgsberg	38366c0c6e	dri_util: Don't assume __DRIcontext->driverPrivate is a gl_context The driverPrivate pointer is opaque to the driver and we can't assume it's a struct gl_context in dri_util.c. Instead provide a helper function to set the struct gl_context flags from the incoming DRI context flags. v2 (idr): Modify the other classic drivers to also use driContextSetFlags. I ran all the piglit GLX_ARB_create_context tests with i965 and classic swrast without regressions. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1] Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> [v1 on Gallium nouveau] Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-13 08:19:50 -08:00
Carl Worth	d6c8365795	docs: Update note regarding nominating patches for the stable branch. This brings the documentation up to date with the current practice of using the CC syntax for patch nomination.	2013-12-12 23:10:53 -08:00
Carl Worth	16c2919972	docs: Fix typo Simply replacing Extentions with the correct Extensions.	2013-12-12 23:02:54 -08:00
Carl Worth	66d9cbfe6d	docs: Import 9.2.5 release notes, add news item.	2013-12-12 22:58:40 -08:00
Carl Worth	79c60999dc	docs: Import 10.0.1 release notes, add news item.	2013-12-12 22:21:08 -08:00
Dave Airlie	ba00f2f6f5	swrast* (gallium, classic): add MESA_copy_sub_buffer support (v3) This patch adds MESA_copy_sub_buffer support to the dri sw loader and then to gallium state tracker, llvmpipe, softpipe and other bits. It reuses the dri1 driver extension interface, and it updates the swrast loader interface for a new putimage which can take a stride. I've tested this with gnome-shell with a cogl hacked to reenable sub copies for llvmpipe and the one piglit test. I could probably split this patch up as well. v2: pass a pipe_box, to reduce the entrypoints, as per Jose's review, add to p_screen doc comments. v3: finish off winsys interfaces, add swrast classic support as well. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com> swrast: add support for copy_sub_buffer	2013-12-13 14:37:01 +10:00
Brian Paul	40070e72d4	util: fix compile breakage D'oh!	2013-12-12 11:11:32 -07:00
Brian Paul	ba67d72c64	util: move variable declaration out of for-loop To fix MSVC build.	2013-12-12 11:09:02 -07:00
Marek Olšák	be909274aa	gallium/util: implement new color clear API in u_blitter	2013-12-12 18:48:04 +01:00
Marek Olšák	f09de87735	st/mesa: set correct PIPE_CLEAR_COLORn flags This also fixes the clear_with_quad function for glClearBuffer.	2013-12-12 18:48:04 +01:00
Marek Olšák	164dc6216a	gallium: allow choosing which colorbuffers to clear Required for glClearBuffer, which only clears one colorbuffer attachment. Example: If the first colorbuffer is float and the second one is int: pipe->clear(pipe, PIPE_CLEAR_COLOR0, float_clear_color, ...); pipe->clear(pipe, PIPE_CLEAR_COLOR1, int_clear_color, ...); This doesn't need any driver changes yet, because all drivers just use: if (flags & PIPE_CLEAR_COLOR) .. The drivers which support GL 3.0 will have to implement it properly though.	2013-12-12 18:48:04 +01:00
Marek Olšák	0612005aa6	st/mesa: fix glClear with multiple colorbuffers and different formats Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>	2013-12-12 18:48:04 +01:00
Marek Olšák	03d848ea10	mesa: fix interpretation of glClearBuffer(drawbuffer) The corresponding piglit tests supported this incorrect behavior instead of pointing at it. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>	2013-12-12 18:48:04 +01:00
Marek Olšák	0ad57bef96	docs/GL3: better documentation of GL 3.0	2013-12-12 18:48:04 +01:00
Marek Olšák	e4ef639a57	r600g,radeonsi: fix initialized buffer range tracking for DMA, add comments The DMA functions modify dst_offset and size and util_range_add gets wrong values. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:48:04 +01:00
Marek Olšák	7fa8fb7382	radeonsi: fix binding the dummy pixel shader This fixes valgrind errors in glxinfo. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:48:04 +01:00
Marek Olšák	0eb528abf2	radeonsi: fix FS_COLOR0_WRITES_ALL_CBUFS with mixed colorbuffer formats The 16bpc packing must be done separately for each render target. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:48:04 +01:00
Marek Olšák	cd86f773a7	radeonsi: use the colorbuffer count from the shader key As a result, the initialization of write_all must be done before the compilation. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:48:04 +01:00
Marek Olšák	e9fc552837	radeonsi: remove unused variable in si_pipe_shader_ps Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:48:04 +01:00
Andreas Hartmetz	8ee7370c9b	radeonsi: Write htile state to hardware.	2013-12-12 18:34:11 +01:00
Andreas Hartmetz	a32aa2617d	radeon: Allocate htile buffer for SI in r600_texture.	2013-12-12 18:34:11 +01:00
Andreas Hartmetz	ca5812b45c	radeon: rearrange r600_texture and related code a bit. This should make the differences and similarities between color and depth buffer handling more clear.	2013-12-12 18:34:11 +01:00
Marek Olšák	91aca8c662	r600g,radeonsi: consolidate buffer code, add handling of DISCARD_RANGE for SI This adds 2 optimizations for radeonsi: - handling of DISCARD_RANGE - mapping an uninitialized buffer range is automatically UNSYNCHRONIZED Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:34:11 +01:00
Marek Olšák	12806449fa	r600g,radeonsi: add common interface for buffer invalidation This will be used by common code in the next commit. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:34:11 +01:00
Marek Olšák	e1374d86fe	r600g,radeonsi: consolidate some debug flags Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:34:11 +01:00
Marek Olšák	43ea10eb1d	r600g: refactor out code for buffer invalidation Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:34:11 +01:00
Marek Olšák	bba39d8804	r600g,radeonsi: share flags has_cp_dma and has_streamout Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:34:11 +01:00
Marek Olšák	32fd445daa	radeonsi: handle PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE which can come from glBufferData and glMapBufferRange. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:34:11 +01:00
Marek Olšák	cc2c100274	radeonsi: implement accelerated buffer copying Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:34:11 +01:00
Marek Olšák	171e4842ec	r600g: use common interfaces in buffer_transfer_unmap i.e. dma_copy and resource_copy_region. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:34:11 +01:00
Marek Olšák	0aea43db93	radeon: move some functions to r600_buffer_common.c Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christoph Brill <egore911@gmail.com> v2: Renamed r600_buffer.c to r600_buffer_common.c. The stupid build system doesn't allow 2 files of the same name in different directories.	2013-12-12 18:34:05 +01:00
Marek Olšák	0b37737cc3	winsys/radeon: set/get the scanout flag with the tiling ioctls If we assume that all buffers allocated by the DDX are scanout, a new flag that says "this is not scanout" has to be added to support the non-scanout buffers and maintain backward compatibility. This fixes bad rendering on Wayland. The flag is defined as: #define RADEON_TILING_R600_NO_SCANOUT RADEON_TILING_SWAP_16BIT AFAIK, RADEON_TILING_SWAP_16BIT is not used on SI. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 17:26:41 +01:00
Tapani Pälli	a6345f1559	glsl: modify ir_clone to use memcpy Patch copies the whole data structure at once instead of assigning individual variables. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-12 17:28:13 +02:00
Tapani Pälli	447bb9029f	glsl: move variables in to ir_variable::data, part II This patch moves the following bitfields and variables to the data structure: explicit_location, explicit_index, explicit_binding, has_initializer, is_unmatched_generic_inout, location_frac, from_named_ifc_block_nonarray, from_named_ifc_block_array, depth_layout, location, index, binding, max_array_access, atomic Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-12 17:28:11 +02:00
Tapani Pälli	33ee2c67c0	glsl: move variables in to ir_variable::data, part I This patch moves the following bitfields in to the data structure: used, assigned, how_declared, mode, interpolation, origin_upper_left, pixel_center_integer Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-12 17:28:08 +02:00
Tapani Pälli	c1d3080ee8	glsl: introduce data section to ir_variable Data section helps serialization and cloning of an ir_variable. This patch includes the helper bits used for read only ir_variables. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-12 17:28:06 +02:00
Tapani Pälli	cbe7431cdb	mesa: fix a typo in glDetachShader error message Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-12 07:50:06 +02:00
Brian Paul	ccd6bf8272	svga: expose HW smooth/stipple/wide lines Newer virtual HW versions support smooth/stipple/wide lines. Use that instead of 'draw' fallbacks when possible. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-12-11 17:19:44 -08:00
Juha-Pekka Heikkila	84b1716b5e	glx: Add missing null check in DRI2WireToEvent Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-11 18:18:43 -07:00
Matthew McClure	e84a1ab3c4	llvmpipe: add plumbing for ARB_depth_clamp With this patch llvmpipe will adhere to the ARB_depth_clamp enabled state when clamping the fragment's zw value. To support this, the variant key now includes the depth_clamp state. key->depth_clamp is derived from pipe_rasterizer_state's (depth_clip == 0), thus depth clamp is only enabled when depth clip is disabled. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-12-11 18:24:21 +00:00
Vadim Girlin	00faf82832	r600g/sb: fix stack size computation on evergreen On evergreen we have to reserve 1 stack element in some additional cases besides the ones mentioned in the docs, but stack size computation was recently reimplemented exactly as described in the docs by the patch that added workarounds for stack issues on EG/CM, resulting in regressions with some apps (Serious Sam 3). This patch fixes it by restoring previous behavior. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=72369 Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org> Tested-by: Andre Heider <a.heider@gmail.com>	2013-12-11 04:08:32 +04:00
Zack Rusin	7a50d38a2b	llvmpipe: add a very useful (disabled) debugging output Disabled by default, but it's very useful when needed. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-12-10 16:41:11 -05:00
Zack Rusin	48b07fb4fc	draw: fix vbuf caching of vertices with inject front face Caching in the vbuf module meant that once a vertex has been emitted it was cached, but it's possible for a vertex at the same location to be emitted again, but this time with a different front-face semantic. Caching was causing the first version of the vertex to be emitted, which resulted in the renderer getting incorrect front-face attributes. By resetting the vertex_id (which is used for caching) we make sure that once front-face info has been injected the vertex will end up getting emitted. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-12-10 16:40:54 -05:00
Zack Rusin	155139059b	llvmpipe: fix blending with half-float formats The fact that we flush denorms to zero breaks our half-float conversion and blending. This patch enables denorms for blending. It's a little tricky due to the llvm bug that makes it incorrectly reorder the mxcsr intrinsics: http://llvm.org/bugs/show_bug.cgi?id=6393 Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-12-10 16:39:48 -05:00
Thomas Hellstrom	1e71493afa	svga/winsys: Implement surface sharing using prime fd handles This needs a prime-aware vmwgfx kernel module to work properly. (With additions by Christopher James Halse Rogers <raof@ubuntu.com>) Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:46:51 +01:00
Christopher James Halse Rogers	db687011e0	gallium/radeon: Implement hooks for DRI Image 7 (v2) v2: Fix transliteration of lseek arguments Ignore busy return from RADEON_GEM_BUSY ioctl; we're only after the domain Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:46:45 +01:00
Christopher James Halse Rogers	bff6c5d2b5	radeon: Rename bo_handles hashtable to match its actual contents. It's a map of GEM name->bo, so identify it as such Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:46:41 +01:00
Christopher James Halse Rogers	7d2c1df99e	ilo: Support DRI Image 7 Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:46:29 +01:00
Maarten Lankhorst	3e680de1eb	nouveau: Support DRI Image 7 extension Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:46:17 +01:00
Christopher James Halse Rogers	df3b20b2cf	gallium/dri: Support DRI Image extension version 7 v2: Fix up queryImage return for ATTRIB_FD Use driver_descriptor.configuration to determine whether the driver supports DMA-BUF import/export. v3: Really, truly, fix up queryImage return for ATTRIB_FD Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:46:13 +01:00
Christopher James Halse Rogers	6b5e15360a	gallium/dri2: Set winsys_handle type to KMS for stride query. Otherwise the default is TYPE_SHARED, which will flink the bo. This seems rather unnecessary for a simple stride query. Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:46:09 +01:00
Christopher James Halse Rogers	d5a3a2d2fb	gallium/winsys/drm: Prepare for passing prime fds in winsys_handle Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:46:05 +01:00
Christopher James Halse Rogers	343133167f	gallium/dri: Support DRI Image extension version 6 v2: Pick out the correct gl_context pointer v3: Don't leak pipe_resources on error path Set img->dri_format correctly Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:45:59 +01:00
Ilia Mirkin	bad8871e52	nv50: report 15 max inputs for fragment programs First off, nv50_program only has 16 in/out varyings. However reporting 16 makes 'm' become 68 in nv50_fp_linkage_validate with the varying-packing-simple piglit test. (Subverting the assert makes it compile but fail.) With this patch, varying-packing-simple passes. See: https://bugs.freedesktop.org/show_bug.cgi?id=69155 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>	2013-12-10 08:45:59 +01:00
Maarten Lankhorst	5576ad11ed	nouveau: Fix compiler warning regression cfg is now unused, remove it. Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-10 08:43:41 +01:00
Dave Airlie	0b16042377	swrast: fix readback regression since inversion fix This readback from the frontbuffer with swrast was broken, that bug just made it more obviously broken, this fixes it by inverting the sub image gets. Also fixes a few other piglits. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=72327 Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=72325 (for 9.2 the patches this depends on were asked to be backported separately in an email). Cc: "9.2" "10.0" mesa-stable@lists.fedoraproject.org Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-12-10 13:33:40 +10:00
Jordan Justen	4859d492b2	dri megadriver_stub: add compatibility for older DRI loaders To help the transition period when DRI loaders are being updated to support the newer __driDriverExtensions_foo mechanism, we populate __driDriverExtensions with the extensions returned by __driDriverExtensions_foo during a library constructor function. We find the driver foo's name by using the dladdr function which gives the path of the dynamic library's name that was being loaded. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Keith Packard <keithp@keithp.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-09 16:33:45 -08:00
Kristian Høgsberg	4ed055b4a6	egl/wayland: Return -1 from get_back_bo to indicate error A return value of -1 indicates failure to allocate the back buffer and means we don't segfault on the way out.	2013-12-09 16:14:33 -08:00
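The name lookup described in the megadriver_stub commit above can be sketched in C. This is a minimal illustration, not the code from the commit: the function name `extensions_symbol_from_path` is hypothetical, the `_dri.so` file suffix is an assumption about how driver libraries are named, and the path is passed in as a plain string where the real stub would take it from `dladdr`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: given the path of a loaded driver library, build the
 * name of its per-driver extensions symbol.
 * Returns 0 on success, -1 if the path does not look like a DRI driver
 * or the output buffer is too small. */
int extensions_symbol_from_path(const char *path, char *out, size_t out_size)
{
    static const char suffix[] = "_dri.so";               /* assumed naming */
    static const char prefix[] = "__driDriverExtensions_";
    const char *base;
    const char *end;
    size_t base_len;
    size_t name_len;

    assert(path && out);

    /* Strip the directory part, if any. */
    base = strrchr(path, '/');
    base = base ? base + 1 : path;

    /* The file name must be "<driver>_dri.so" with a non-empty driver name. */
    base_len = strlen(base);
    if (base_len <= strlen(suffix))
        return -1;
    end = base + base_len - strlen(suffix);
    if (strcmp(end, suffix) != 0)
        return -1;

    name_len = (size_t)(end - base);
    if (strlen(prefix) + name_len + 1 > out_size)
        return -1;

    /* Concatenate the prefix and the driver name. */
    snprintf(out, out_size, "%s%.*s", prefix, (int)name_len, base);
    return 0;
}
```

With a path such as `/usr/lib/dri/i965_dri.so` the helper produces `__driDriverExtensions_i965`, which a loader-side stub could then resolve with `dlsym`.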
Neil Roberts	0b7058c46a	egl_dri2: Remove the unused swap_interval member of dri2_egl_surface The _EGLSurface struct which is embedded into dri2_egl_surface also contains a swap interval member so the other member is redundant. Nothing was using it as far as I can tell.	2013-12-09 16:14:32 -08:00
Kenneth Graunke	19190c2b8c	i965: Replace OUT_RELOC_FENCED with OUT_RELOC. On Gen4+, OUT_RELOC_FENCED is equivalent to OUT_RELOC; libdrm silently ignores the fenced flag: /* We never use HW fences for rendering on 965+ */ if (bufmgr_gem->gen >= 4) need_fence = false; Thanks to Eric for noticing this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-09 13:52:18 -08:00
Paul Berry	088494aa03	glsl/loops: Get rid of lower_bounded_loops and ir_loop::normative_bound. Now that loop_controls no longer creates normatively bound loops, there is no need for ir_loop::normative_bound or the lower_bounded_loops pass. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:55:09 -08:00
Paul Berry	7ea3baa64d	glsl/loops: Stop creating normatively bound loops in loop_controls. Previously, when loop_controls analyzed a loop and found that it had a fixed bound (known at compile time), it would remove all of the loop terminators and instead set the loop's normative_bound field to force the loop to execute the correct number of times. This made loop unrolling easy, but it had a serious disadvantage. Since most GPUs don't have a native mechanism for executing a loop a fixed number of times, in order to implement the normative bound, the back-ends would have to synthesize a new loop induction variable. As a result, many loops wound up having two induction variables instead of one. This caused extra register pressure and unnecessary instructions. This patch modifies loop_controls so that it doesn't set the loop's normative_bound anymore. Instead it leaves one of the terminators in the loop (the limiting terminator), so the back-end doesn't have to go to any extra work to ensure the loop terminates at the right time. This complicates loop unrolling slightly: when deciding whether a loop can be unrolled, we have to account for the presence of the limiting terminator. And when we do unroll the loop, we have to remove the limiting terminator first. For an example of how this results in more efficient back end code, consider the loop: for (int i = 0; i < 100; i++) { total += i; } Previous to this patch, on i965, this loop would compile down to this (vec4) native code: mov(8) g4<1>.xD 0D mov(8) g8<1>.xD 0D loop: cmp.ge.f0(8) null g8<4;4,1>.xD 100D (+f0) if(8) break(8) endif(8) add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD add(8) g8<1>.xD g8<4;4,1>.xD 1D add(8) g4<1>.xD g4<4;4,1>.xD 1D while(8) loop (notice that both g8 and g4 are loop induction variables; one is used to terminate the loop, and the other is used to accumulate the total). 
After this patch, the same loop compiles to: mov(8) g4<1>.xD 0D loop: cmp.ge.f0(8) null g4<4;4,1>.xD 100D (+f0) if(8) break(8) endif(8) add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD add(8) g4<1>.xD g4<4;4,1>.xD 1D while(8) loop Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:55:06 -08:00
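The difference between the two code shapes in the commit above can be approximated in plain C. This is only a sketch: the function names and `bound_counter` are hypothetical, with `bound_counter` standing in for the synthesized induction variable (g8 in the first listing) and `i` for the original one (g4).

```c
#include <assert.h>

/* Before the patch: the terminator was removed, so the back-end had to
 * synthesize a second counter to enforce the normative bound. */
int sum_with_normative_bound(int normative_bound)
{
    int total = 0;
    int i = 0;              /* original induction variable (g4) */
    int bound_counter = 0;  /* synthesized by the back-end (g8) */

    assert(normative_bound >= 0);
    for (;;) {
        if (bound_counter >= normative_bound)
            break;
        total += i;
        bound_counter++;    /* second increment per iteration */
        i++;
    }
    return total;
}

/* After the patch: the limiting terminator stays in the loop, so the
 * original induction variable also terminates it. */
int sum_with_limiting_terminator(int limit)
{
    int total = 0;
    int i = 0;

    for (;;) {
        if (i >= limit)     /* the limiting terminator */
            break;
        total += i;
        i++;
    }
    return total;
}
```

Both functions compute the same sum; the second needs one fewer live variable and one fewer add per iteration, which mirrors the saving visible in the native code.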
Paul Berry	4d844cfa56	glsl/loops: Get rid of loop_variable_state::max_iterations. This value is now redundant with loop_variable_state::limiting_terminator->iterations and ir_loop::normative_bound. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:55:03 -08:00
Paul Berry	e734c9f677	glsl/loops: Simplify loop unrolling logic by breaking into functions. The old logic of loop_unroll_visitor::visit_leave(ir_loop *) was: heuristics to skip unrolling in various circumstances; if (loop contains more than one jump) return; else if (loop contains one jump) { if (the jump is an unconditional "break" at the end of the loop) { remove the break and set iteration count to 1; fall through to simple loop unrolling code; } else { for (each "if" statement in the loop body) see if the jump is a "break" at the end of one of its forks; if (the "break" wasn't found) return; splice the remainder of the loop into the other fork of the "if"; remove the "break"; complex loop unrolling code; return; } } simple loop unrolling code; return; These tasks have been moved to their own functions: - splice the remainder of the loop into the other fork of the "if" - simple loop unrolling code - complex loop unrolling code And the logic has been flattened to: heuristics to skip unrolling in various circumstances; if (loop contains more than one jump) return; if (loop contains no jumps) { simple loop unroll; return; } if (the jump is an unconditional "break" at the end of the loop) { remove the break; simple loop unroll with iteration count of 1; return; } for (each "if" statement in the loop body) { if (the jump is a "break" at the end of one of its forks) { splice the remainder of the loop into the other fork of the "if"; remove the "break"; complex loop unroll; return; } } This will make it easier to modify the loop unrolling algorithm in a future patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:59 -08:00
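The flattened control flow described in the commit above can be sketched as a small decision function in C. The struct, enum, and function names below are hypothetical and only model the outcome of each branch; the real visitor works on GLSL IR and performs the unrolling itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what the visitor has learned about one loop. */
struct loop_summary {
    bool skip_by_heuristics;      /* heuristics say: do not unroll */
    int  jump_count;              /* number of jumps in the loop body */
    bool jump_is_trailing_break;  /* unconditional break at end of loop */
    bool break_ends_if_fork;      /* break at the end of one fork of an if */
};

enum unroll_action {
    UNROLL_NONE,         /* leave the loop alone */
    UNROLL_SIMPLE,       /* simple unroll, full iteration count */
    UNROLL_SIMPLE_ONCE,  /* remove the break, simple unroll with count 1 */
    UNROLL_COMPLEX       /* splice the loop remainder, remove break, complex unroll */
};

/* Each case returns early, following the flattened order of checks. */
enum unroll_action choose_unroll_action(const struct loop_summary *loop)
{
    assert(loop);

    if (loop->skip_by_heuristics)
        return UNROLL_NONE;
    if (loop->jump_count > 1)
        return UNROLL_NONE;
    if (loop->jump_count == 0)
        return UNROLL_SIMPLE;
    if (loop->jump_is_trailing_break)
        return UNROLL_SIMPLE_ONCE;
    if (loop->break_ends_if_fork)
        return UNROLL_COMPLEX;
    return UNROLL_NONE;  /* the single jump matched no supported pattern */
}
```

Because no branch falls through into another, a later patch can change one case without touching the rest.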
Paul Berry	ffc29120c4	glsl/loops: Move some analysis from loop_controls to loop_analysis. Previously, the sole responsibility of loop_analysis was to find all the variables referenced in the loop that are either loop constant or induction variables, and find all of the simple if statements that might terminate the loop. The remainder of the analysis necessary to determine how many times a loop executed was performed by loop_controls. This patch makes loop_analysis also responsible for determining the number of iterations after which each loop terminator will terminate the loop, and for figuring out which terminator will terminate the loop first (I'm calling this the "limiting terminator"). This will allow loop unrolling to make use of information that was previously only visible from loop_controls, namely the identity of the limiting terminator. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:56 -08:00
Paul Berry	4bbf6d1d2b	glsl/loops: Allocate loop_terminator using new(mem_ctx) syntax. Patches to follow will introduce code into the loop_terminator constructor. Allocating loop_terminator using new(mem_ctx) syntax will ensure that the constructor runs. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:53 -08:00
Paul Berry	714e1b331e	glsl/loops: Remove unnecessary list walk from loop_control_visitor. When loop_control_visitor::visit_leave(ir_loop *) is analyzing a loop terminator that acts on a certain ir_variable, it doesn't need to walk the list of induction variables to find the loop_variable entry corresponding to the variable. It can just look it up in the loop_variable_state hashtable and verify that the loop_variable entry represents an induction variable. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:49 -08:00
Paul Berry	115fd75ab0	glsl/loops: Remove unused fields iv_scale and biv from loop_variable class. These fields were part of some planned optimizations that never materialized. Remove them for now to simplify things; if we ever get round to adding the optimizations that would require them, we can always re-introduce them. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:46 -08:00
Paul Berry	e00b93a1f7	glsl/loops: replace loop controls with a normative bound. This patch replaces the ir_loop fields "from", "to", "increment", "counter", and "cmp" with a single integer ("normative_bound") that serves the same purpose. I've used the name "normative_bound" to emphasize the fact that the back-end is required to emit code to prevent the loop from running more than normative_bound times. (By contrast, an "informative" bound would be a bound that is informational only). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:33 -08:00
Paul Berry	2c17f97fe6	glsl/loops: consolidate bounded loop handling into a lowering pass. Previously, all of the back-ends (ir_to_mesa, st_glsl_to_tgsi, and the i965 fs and vec4 visitors) had nearly identical logic for handling bounded loops. This replaces the duplicate logic with an equivalent lowering pass that is used by all the back-ends. Note: on i965, there is a slight increase in instruction count. For example, a loop like this: for (int i = 0; i < 100; i++) { total += i; } would previously compile down to this (vec4) native code: mov(8) g4<1>.xD 0D mov(8) g8<1>.xD 0D loop: cmp.ge.f0(8) null g8<4;4,1>.xD 100D (+f0) break(8) add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD add(8) g8<1>.xD g8<4;4,1>.xD 1D add(8) g4<1>.xD g4<4;4,1>.xD 1D while(8) loop After this patch, the "(+f0) break(8)" turns into: (+f0) if(8) break(8) endif(8) because the back-end isn't smart enough to recognize that "if (condition) break;" can be done using a conditional break instruction. However, it should be relatively easy for a future peephole optimization to properly optimize this. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:26 -08:00
Paul Berry	97d8b77054	glsl: In loop analysis, handle unconditional second assignment. Previously, loop analysis would set this->conditional_or_nested_assignment based on the most recently visited assignment to the variable. As a result, if a variable was assigned to more than once in a loop, the flag might be set incorrectly. For example, in a loop like this: int x; for (int i = 0; i < 3; i++) { if (i == 0) x = 10; ... x = 20; ... } loop analysis would have incorrectly concluded that all assignments to x were unconditional. In practice this was a benign bug, because conditional_or_nested_assignment is only used to disqualify variables from being considered as loop induction variables or loop constant variables, and having multiple assignments also disqualifies a variable from being considered as either of those things. Still, we should get the analysis correct to avoid future confusion. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:23 -08:00
Paul Berry	cb38a0dc0a	glsl: Fix handling of function calls inside nested loops. Previously, when visiting an ir_call, loop analysis would only mark the innermost enclosing loop as containing a call. As a result, when encountering a loop like this: for (i = 0; i < 3; i++) { for (int j = 0; j < 3; j++) { foo(); } } it would incorrectly conclude that the outer loop ran three times. (This is not certain; if foo() modifies i, then the outer loop might run more or fewer times). Fixes piglit test "vs-call-in-nested-loop.shader_test". Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:20 -08:00
Paul Berry	877db5a792	glsl: Fix loop analysis of nested loops. Previously, when visiting a variable dereference, loop analysis would only consider its effect on the innermost enclosing loop. As a result, when encountering a loop like this: for (int i = 0; i < 3; i++) { for (int j = 0; j < 3; j++) { ... i = 2; } } it would incorrectly conclude that the outer loop ran three times. Fixes piglit test "vs-inner-loop-modifies-outer-loop-var.shader_test". Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:16 -08:00
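The nested-loop case from the commit above is easy to reproduce in plain C, where the same semantics apply. The function name below is hypothetical and the elided parts of the loop body are left out.

```c
#include <assert.h>

/* Counts how often the outer loop body runs when the inner loop
 * overwrites the outer induction variable. */
int count_outer_iterations(void)
{
    int outer_iterations = 0;

    for (int i = 0; i < 3; i++) {
        outer_iterations++;
        for (int j = 0; j < 3; j++) {
            i = 2;  /* the inner loop modifies the outer loop's variable */
        }
    }

    /* A trip count of 3 is only an upper bound here, not the actual count. */
    assert(outer_iterations <= 3);
    return outer_iterations;
}
```

After the first pass the inner loop has left `i` at 2, the increment makes it 3, and the outer loop exits. An analysis that ignores the inner assignment would report three iterations for a loop that runs once.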
Paul Berry	2e060551bd	glsl: Extract functions from loop_analysis::visit(ir_dereference_variable *). This function is about to get more complex. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:13 -08:00
Paul Berry	69c44d65c8	i965/gen7+: Implement fast color clears for MSAA buffers. Fast color clears of MSAA buffers work just like fast color clears with non-MSAA buffers, except that the alignment and scaledown requirements are different. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-12-09 10:54:10 -08:00
Paul Berry	0ac622accf	i965/blorp: Refactor code for computing fast clear align/scaledown factors. This will make it easier to add fast color clear support to MSAA buffers, since they have different alignment and scaling requirements. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-09 10:54:07 -08:00
Paul Berry	da08ee8e3b	i965/blorp: allow multisample blorp clears Previously, we didn't do multisample blorp clears because we couldn't figure out how to get them to work. The reason for this was because we weren't setting the brw_blorp_params num_samples field consistently with dst.num_samples. Now that those two fields have been collapsed down into one, we can do multisample blorp clears. However, we need to do a few other pieces of bookkeeping to make them work correctly in all circumstances: - Since blorp clears may now operate on multisampled window system framebuffers, they need to call intel_renderbuffer_set_needs_downsample() to ensure that a downsample happens before buffer swap (or glReadPixels()). - When clearing a layered multisample buffer attachment using UMS or CMS layout, we need to advance layer by multiples of num_samples (since each logical layer is associated with num_samples physical layers). Note: we still don't do multisample fast color clears; more work needs to be done to enable those. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-09 10:54:03 -08:00
Paul Berry	73e8bd9f5c	i965/blorp: Get rid of redundant num_samples blorp param. Previously, brw_blorp_params contained two fields for determining sample count: num_samples (which determined the multisample configuration of the rendering pipeline) and dst.num_samples (which determined the multisample configuration of the render target surface). This was redundant, since both fields had to be set to the same value to avoid rendering errors. This patch eliminates num_samples to avoid future confusion. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-09 10:54:00 -08:00
Paul Berry	25195b0041	i965/gen7+: Disentangle MSAA layout from fast clear state. This patch renames the enum that's used to keep track of fast clear state from "mcs_state" to "fast_clear_state", and it removes the enum value INTEL_MCS_STATE_MSAA (which previously meant, "this is an MSAA buffer, so we're not keeping track of fast clear state"). The only real purpose that enum value was serving was to prevent us from trying to do fast clear resolves on MSAA buffers, and it's just as easy to prevent that by checking the buffer's msaa_layout. This paves the way for implementing fast clears of MSAA buffers. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-09 10:51:10 -08:00
Paul Berry	f416a15096	i965: Don't try to use HW blitter for glCopyPixels() when multisampled. The hardware blitter doesn't understand multisampled layouts, so there's no way this could possibly succeed. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-09 10:51:07 -08:00
Paul Berry	b5fe413b4d	i965: Document conventions for counting layers in 2D multisample buffers. The "layer" parameters used in blorp, and the intel_renderbuffer::mt_layer field, represent a physical layer rather than a logical layer. This is important for 2D multisample arrays on Gen7+ because the UMS and CMS multisample layouts use N physical layers to represent each logical layer, where N is the number of samples. Also add an assertion to blorp to help catch bugs if we fail to follow these conventions. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-09 10:51:03 -08:00
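The layer-counting convention documented in the commit above amounts to a small index computation. The sketch below uses hypothetical enum and function names (not the identifiers from the i965 driver) and assumes that the first physical layer of logical layer L is L times the sample count for the UMS and CMS layouts.

```c
#include <assert.h>

/* Hypothetical stand-ins for the driver's MSAA layout enum. */
enum msaa_layout {
    LAYOUT_NONE,  /* single-sampled */
    LAYOUT_IMS,   /* interleaved: samples share one physical layer */
    LAYOUT_UMS,   /* uncompressed: one physical layer per sample */
    LAYOUT_CMS    /* compressed: one physical layer per sample */
};

/* Map a logical array layer to the first physical layer that backs it. */
unsigned first_physical_layer(enum msaa_layout layout,
                              unsigned logical_layer,
                              unsigned num_samples)
{
    assert(num_samples >= 1);

    if (layout == LAYOUT_UMS || layout == LAYOUT_CMS)
        return logical_layer * num_samples;  /* N physical layers per logical layer */
    return logical_layer;                    /* physical and logical layers coincide */
}
```

For a 4x UMS array, logical layer 2 starts at physical layer 8, which is why code that steps through logical layers has to advance in multiples of the sample count.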
Paul Berry	3a2925bfa9	i965/blorp: Improve fast color clear comment. Clarify the fact that we only optimize full buffer clears using fast color clear, and why. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-09 10:51:00 -08:00
Tom Stellard	9a5ce0c4c9	r300/compiler/tests: Fix line length check in test parser Reviewed-by: Alex Deucher <alexander.deucher@amd.com> CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-09 09:40:15 -05:00
Tom Stellard	1896431f79	r300/compiler/tests: Fix segfault Reviewed-by: Alex Deucher <alexander.deucher@amd.com> CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-09 09:40:15 -05:00
Ilia Mirkin	2cd2b9705e	nouveau/video: update a few more h264 picparm field names Based on comments by Benjamin Morris <bmorris@nvidia.com> in http://lists.freedesktop.org/archives/nouveau/2013-December/015328.html This adds setting of is_long_term, and updates a few field names we were unclear about. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-09 15:11:50 +01:00
Ilia Mirkin	78525dae8a	nouveau/video: update h264 picparm field names based on usage Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-09 15:11:42 +01:00
Ilia Mirkin	e01ba9d6b0	nv50: enable h264 and mpeg4 for nv98+ (vp3, vp4.0) Create the ref_bo without any storage type flags set for now. The issue probably arises from our use of the additional buffer space at the end of the ref_bo. It should probably be split up in the future. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Martin Peres <martin.peres@labri.fr> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-09 15:11:20 +01:00
Ilia Mirkin	e796fa22d4	nvc0: make sure nvd7 gets NVC8_3D_CLASS as well Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-12-09 15:10:37 +01:00
Ilia Mirkin	1386cb9488	nv50: TXF already has integer arguments, don't try to convert from f32 Fixes the texelFetch piglit tests Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-12-09 15:10:37 +01:00
Matthew McClure	0319ea9ff6	llvmpipe: clamp fragment shader depth write to the current viewport depth range. With this patch, generate_fs_loop will clamp any fragment shader depth writes to the viewport's min and max depth values. Viewport selection is determined by the geometry shader output for the viewport array index. If no index is specified, then the default viewport index is zero. Semantics for this path can be found in draw_clamp_viewport_idx and lp_clamp_viewport_idx. lp_jit_viewport was created to store viewport information visible to JIT code, and is validated when the LP_NEW_VIEWPORT dirty flag is set. lp_rast_shader_inputs is responsible for passing the viewport_index through the rasterizer stage to fragment stage (via lp_jit_thread_data). Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-12-09 12:57:02 +00:00
Neil Roberts	992a2dbba8	wayland: Add support for eglSwapInterval The Wayland EGL platform now respects the eglSwapInterval value. The value is clamped to either 0 or 1 because it is difficult (and probably not useful) to sync to more than 1 redraw. The main change is that if the swap interval is 0 then Mesa won't install a frame callback so that eglSwapBuffers can be executed as often as necessary. Instead it will do a sync request after the swap buffers. It will block for sync complete event in get_back_bo instead of the frame callback. The compositor is likely to send a release event while processing the new buffer attach and this makes sure we will receive that before deciding whether to allocate a new buffer. If there are no buffers available then instead of returning with an error, get_back_bo will now poll the compositor by repeatedly sending sync requests every 10ms. This is a last resort and in theory this shouldn't happen because there should be no reason for the compositor to hold on to more than three buffers. That means whenever we attach the fourth buffer we should always get an immediate release event which should come in with the notification for the first sync request that we are throttled to. When the compositor is directly scanning out from the application's buffer it may end up holding on to three buffers. These are the one that it is currently scanning out from, one that has been given to DRM as the next buffer to flip to, and one that has been attached and will be given to DRM as soon as the previous flip completes. When we attach a fourth buffer to the compositor it should replace that third buffer so we should get a release event immediately after that. This patch therefore also changes the number of buffer slots to 4 so that we can accommodate that situation. If DRM eventually gets a way to cancel a pending page flip then the compositors can be changed to only need to hold on to two buffers and this value can be put back to 3. 
This also moves the vblank configuration defines from platform_x11.c to the common egl_dri2.h header so they can be shared by both platforms.	2013-12-07 22:36:02 -08:00
Neil Roberts	25cc889004	wayland: Block for the frame callback in get_back_bo not dri2_swap_buffers Consider a typical game-style main loop which might be like this: while (1) { draw_something(); eglSwapBuffers(); } In this case the game is relying on eglSwapBuffers to throttle to a sensible frame rate. Previously this game would end up using three buffers even though it should only need two. This is because Mesa decides whether to allocate a new buffer in get_back_bo which would be before it has tried to read any events from the compositor so it wouldn't have seen any buffer release events yet. This patch just moves the block for the frame callback to get_back_bo. Typically the compositor will send a release event immediately after one of the attaches so if we block for the frame callback here then we can be sure to have completed at least one roundtrip and received that release event after attaching the previous buffer before deciding whether to allocate a new one. dri2_swap_buffers always calls get_back_bo so even if the client doesn't render anything we will still be sure to block to the frame callback. The code to create the new frame callback has been moved to after this call so that we can be sure to have cleared the previous frame callback before requesting a new one.	2013-12-07 22:36:02 -08:00
Vinson Lee	965cde9232	glapi: Do not include dlfcn.h on Windows. This patch fixes this MinGW build error. CC glapi_gentable.lo glapi_gentable.c:47:19: fatal error: dlfcn.h: No such file or directory Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-07 14:31:01 -08:00
Vincent Lejeune	797894036d	r600/llvm: Allow arbitrary amount of temps in tgsi to llvm	2013-12-07 18:39:10 +01:00
Rob Clark	a1d808638d	freedreno/a3xx: add adreno 330 support Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-12-07 09:37:24 -05:00
Rob Clark	d36ae204d5	freedreno/a3xx/compiler: add ROUND Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-12-07 08:45:27 -05:00
Chris Forbes	88dc246630	mesa: Require per-sample shading if the `sample` qualifier is used. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-12-07 17:15:05 +13:00
Chris Forbes	2625a34bfc	glsl: Populate gl_fragment_program::IsSample bitfield Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-12-07 17:15:03 +13:00
Chris Forbes	6429cc05ca	mesa: add IsSample bitfield to gl_fragment_program Drivers will need to look at this to decide if they need to do per-sample fragment shader dispatch. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-12-07 17:15:01 +13:00
Chris Forbes	5d326fa963	glsl: Put `sample`-qualified varyings in their own packing classes Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-12-07 17:14:59 +13:00
Chris Forbes	51c5fc85e1	glsl: Add ir support for `sample` qualifier; adjust compiler and linker Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-12-07 17:14:58 +13:00
Chris Forbes	51aa15aca2	glsl: Add frontend support for `sample` auxiliary storage qualifier Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-12-07 17:14:39 +13:00
Chris Forbes	a1ca580240	i965: Don't flag gather quirks for Gen8+ My understanding is that Broadwell retains the same SCS mechanism that Haswell has, so even if the underlying issue with this format is not fixed, the w/a will be applied in SCS rather than needing shader code. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-07 16:17:27 +13:00
Chris Forbes	83b83fb984	i965/Gen7: Allow CMS layout for multisample textures Now that all the pieces are in place, this should provide a nice performance boost for apps using multisample textures. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-07 16:10:04 +13:00
Chris Forbes	3122c2421a	i965/vs: Sample from MCS surface when required Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-07 16:10:02 +13:00
Chris Forbes	7810162053	i965/fs: Sample from MCS surface when required Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-07 16:09:49 +13:00
Chris Forbes	7629c489c8	i965: Add shader opcode for sampling MCS surface Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-07 16:09:32 +13:00
Chris Forbes	27359b8079	i965/Gen7: Include bitfield in the sampler key for CMS layout We need to emit extra shader code in this case to sample the MCS surface first; we can't just blindly do this all the time since IVB will sometimes try to access the MCS surface even if disabled. V3: Use actual MSAA layout from the texture's mt, rather than computing what would have been used based on the format. This is simpler and less fragile - there's at least one case where we might want to have a texture's MSAA layout change based on what the app does (CMS SINT falling back to UMS if the app ever attempts to render to it with a channel disabled.) This also obsoletes V2's 1/10 -- compute_msaa_layout can now remain an implementation detail of the miptree code. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-07 16:09:12 +13:00
Chris Forbes	b1604841c2	i965/Gen7: Move decision to allocate MCS surface into intel_mipmap_create This gives us correct behavior for both renderbuffers (which previously worked) and multisample textures (which would never get an MCS surface allocated, even if CMS layout was selected) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-07 16:08:55 +13:00
Chris Forbes	6ca9a6f4d7	i965/Gen7: emit mcs info for multisample textures Previously this was only done for render targets. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-07 16:08:52 +13:00
Chris Forbes	dfa952da97	i965/wm: Set copy of sample mask in 3DSTATE_PS correctly for Haswell The bspec says: "SW must program the sample mask value in this field so that it matches with 3DSTATE_SAMPLE_MASK" I haven't observed this to actually fix anything, but stumbled across it while adding the rest of the support for CMS layout for multisample textures. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-07 16:08:47 +13:00
Chris Forbes	8064b0f2c4	i965: refactor sample mask calculation Haswell needs a copy of the sample mask in 3DSTATE_PS; this makes that convenient. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-07 16:07:53 +13:00
Ian Romanick	758658850b	glsl: Don't emit empty declaration warning for a struct specifier The intention is that things like int; will generate a warning. However, we were also accidentally emitting the same warning for things like struct Foo { int x; }; Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68838 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Aras Pranckevicius <aras@unity3d.com> Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>	2013-12-06 08:06:54 -08:00
Thomas Hellstrom	453651e521	st/xa: Bump major version number to 2 For some reason this was left out when the version was changed... Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2013-12-06 06:18:03 -08:00
Ben Skeggs	92ceb327ba	nvc0: fixup gk110 and up not being listed in various switch statements Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2013-12-06 11:28:45 +10:00
Kenneth Graunke	26f3ff8a91	i965: Replace non-standard INLINE macro with "inline". These are identical: main/compiler.h defines INLINE to "inline". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-05 13:59:18 -08:00
Kenneth Graunke	11d9af7c0a	i965: Don't use GL types in files shared with intel-gpu-tools. sed -i -e 's/GLuint/unsigned/g' -e 's/GLint/int/g' \ -e 's/GLfloat/float/g' -e 's/GLubyte/uint8_t/g' \ -e 's/GLshort/int16_t/g' \ brw_eu* brw_disasm.c brw_structs.h Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-05 13:59:18 -08:00
Kenneth Graunke	a7bdd4cba8	i965: Drop trailing whitespace from the rest of the driver. Performed via: $ for file in ; do sed -i 's/ //g'; done Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-05 13:59:18 -08:00
Kenneth Graunke	d542c45c75	i965: Drop trailing whitespace from files shared with intel-gpu-tools. Performed via s/ *$//g. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-05 13:59:18 -08:00
José Fonseca	3be333ed30	tools/trace: More tweaks to state dumping. - Ignore buffer format (it is totally arbitrary) - Initialize state. - Handle begin/end_query statements.	2013-12-05 13:35:06 +00:00
José Fonseca	9648b76dc4	trace: Reorder dumping of pipe_rasterizer_state. Such that it matches the pipe_rasterizer_state declaration, making it easier to double-check that all state is being actually dumped. Trivial.	2013-12-05 13:35:06 +00:00
José Fonseca	10450cbbe6	trace: Dump pipe_sampler_state::seamless_cube_map. Trivial.	2013-12-05 13:35:06 +00:00
Michel Dänzer	7435d9f77c	radeonsi: Remove some stale XXX / FIXME comments Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-12-05 13:50:07 +09:00
Matt Turner	cbb49cb2f7	i965: Emit better code for ir_unop_sign. total instructions in shared programs: 1550449 -> 1550048 (-0.03%) instructions in affected programs: 15207 -> 14806 (-2.64%) Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-12-04 20:05:44 -08:00
Matt Turner	d30b2ed5f8	i965/fs: New peephole optimization to flatten IF/BREAK/ENDIF. total instructions in shared programs: 1550713 -> 1550449 (-0.02%) instructions in affected programs: 7931 -> 7667 (-3.33%) Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-04 20:05:44 -08:00
Matt Turner	9658b04fc4	i965/fs: Emit a MOV instead of a SEL if the sources are the same. One program affected. instructions in affected programs: 436 -> 428 (-1.83%) Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-04 20:05:44 -08:00
Matt Turner	4532cac06a	i965/fs: Extend SEL peephole to handle only matching MOVs. Before this patch, the following code would not be optimized even though the first two instructions were common to the then and else blocks: (+f0) IF MOV dst0 ... MOV dst1 ... MOV dst2 ... ELSE MOV dst0 ... MOV dst1 ... MOV dst3 ... ENDIF This commit extends the peephole to handle this case. No shader-db changes. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:44 -08:00
Matt Turner	13de9f03f1	i965/fs: New peephole optimization to generate SEL. fs_visitor::try_replace_with_sel optimizes only if statements whose "then" and "else" bodies contain a single MOV instruction. It also could not handle constant arguments, since they cause an extra MOV immediate to be generated (since we haven't run constant propagation, there are more than the single MOV). This peephole fixes both of these and operates as a normal optimization pass. fs_visitor::try_replace_with_sel is still arguably necessary, since it runs before pull constant loads are lowered. total instructions in shared programs: 1559129 -> 1545833 (-0.85%) instructions in affected programs: 167120 -> 153824 (-7.96%) GAINED: 13 LOST: 6 Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-04 20:05:44 -08:00
Matt Turner	fa227e7cbc	i965/fs: Add SEL() convenience function. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-04 20:05:43 -08:00
Matt Turner	4b0ef4bf38	glsl: Use fabs() on floating point values. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-04 20:05:43 -08:00
Matt Turner	8814806c97	i965: Print conditional mod in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:43 -08:00
Matt Turner	b9af66528e	i965: Externalize conditional_modifier for use in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:43 -08:00
Matt Turner	637dda1c30	i965: Print argument types in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:43 -08:00
Matt Turner	21e92e74c8	i965: Externalize reg_encoding for use in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:43 -08:00
Matt Turner	729fe77e3b	i965/vec4: Don't print swizzles for immediate values. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:43 -08:00
Matt Turner	2b8e0a73fb	i965/vec4: Print negate and absolute value for src args. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:43 -08:00
Matt Turner	a85f1b7adf	i965/vec4: Add support for printing HW_REGs in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:43 -08:00
Matt Turner	942151af30	i965/fs: Print ARF registers properly in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:43 -08:00
Matt Turner	0e4053234d	i965: Don't print extra (null) arguments in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	d79e711718	glsl: Remove silly OR(..., 0x0) from ldexp() lowering. I translated copysign(0.0f, x) a little too literally. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	b1eb2ad8d1	i965: Allow commuting the operands of ADDC for const propagation. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	04d83396ee	i965/fs: Rename register_coalesce_2() -> register_coalesce(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	9a6b14f674	i965/fs: Remove now useless register_coalesce() pass. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	1520ae48b8	i965/fs: Let register_coalesce_2() eliminate self-moves. This is the last thing that register_coalesce() still handled. total instructions in shared programs: 1561060 -> 1560908 (-0.01%) instructions in affected programs: 15758 -> 15606 (-0.96%) Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	8786f381ec	i965: Allow constant propagation into ASR and BFI1. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	ba84800275	i965/cfg: Document cur_* variables. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	7642c3c6ff	i965/cfg: Remove ip & cur from brw_cfg. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	d2fcdd0973	i965/cfg: Clean up cfg_t constructors. parent_mem_ctx was unused since `db47074a`, so remove the two wrappers around create() and make create() the constructor. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	c6450fa963	i965/cfg: Throw out confusing make_list method. make_list is just a one-line wrapper and was confusingly called by NULL objects. E.g., cur_if == NULL; cur_if->make_list(mem_ctx). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	f3bce19f6c	i965/cfg: Include only needed headers. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	f4b50a1466	i965/cfg: Remove unnecessary endif_stack. Unnecessary since last commit. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:41 -08:00
Matt Turner	2eb9bbfb68	i965/cfg: Rework to make IF & ELSE blocks flow into ENDIF. Previously we made the basic block following an ENDIF instruction a successor of the basic blocks ending with IF and ELSE. The PRM says that IF and ELSE instructions jump to the ENDIF, rather than over it. This should be immaterial to dataflow analysis, except for if, break, endif sequences: START B1 <-B0 <-B9 0x00000100: cmp.g.f0(8) null g15<8,8,1>F g4<0,1,0>F 0x00000110: (+f0) if(8) 0 0 null 0x00000000UD END B1 ->B2 ->B4 START B2 <-B1 break 0x00000120: break(8) 0 0 null 0D END B2 ->B10 START B3 0x00000130: endif(8) 2 null 0x00000002UD END B3 ->B4 The ENDIF block would have no parents, so dataflow analysis would generate incorrect results, preventing copy propagation from eliminating some instructions. This patch changes the CFG to make ENDIF start rather than end basic blocks, so that it can be the jump target of the IF and ELSE instructions. It helps three programs (including two fs8/fs16 pairs). total instructions in shared programs: 1561126 -> 1561060 (-0.00%) instructions in affected programs: 837 -> 771 (-7.89%) More importantly, it allows copy propagation to handle more cases. Disabling the register_coalesce() pass before this patch hurts 58 programs, while afterward it only hurts 11 programs. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:41 -08:00
Matt Turner	ed85c0f409	i965/cfg: Keep pointers to IF/ELSE/ENDIF instructions in the cfg. Useful for finding the associated control flow instructions, given a block ending in one. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:41 -08:00
Matt Turner	51194932d3	i965/cfg: Add code to dump blocks and cfg. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:41 -08:00
Ian Romanick	fa1923ac3a	mesa: Remove GL_MESA_texture_array cruft from gl.h glext.h has had all the necessary bits for years. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 17:22:43 -08:00
Ian Romanick	2a3d1e2e06	mesa: Remove support for GL_MESA_texture_array This extension enabled the use of texture array with fixed-function and assembly fragment shaders. No applications are known to use this extension. NOTE: This patch regresses GL_TEXTURE_1D_ARRAY and GL_TEXTURE_2D_ARRAY cases of the copyteximage piglit test. The test is incorrectly using texture arrays with fixed function while only requiring the GL_EXT_texture_array extension. A fix for the test has been posted to the piglit mailing list. http://lists.freedesktop.org/archives/piglit/2013-November/008639.html Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 17:22:42 -08:00
Ian Romanick	538a7f2a80	mesa: Use a single enable for GL_EXT_texture_array and GL_MESA_texture_array Every driver that enables one also enables the other. The difference between the two is MESA adds support for fixed-function and assembly fragment shaders, but EXT only adds support for GLSL. The MESA extension was created back when Mesa did not support GLSL. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 17:22:42 -08:00
Ian Romanick	e0587fb9d0	mesa: Minor clean-up of target_enum_to_index Constify the gl_context parameter, and remove suffixes from enums that have non-suffix versions. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 17:22:42 -08:00
Ian Romanick	b092af40a5	mesa: Silence GCC warning in count_tex_size main/texobj.c: In function 'count_tex_size': main/texobj.c:886:23: warning: unused parameter 'key' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 17:22:42 -08:00
Ian Romanick	6c84fc2dbf	mesa: Silence GCC warning in _mesa_test_texobj_completeness main/texobj.c: In function '_mesa_test_texobj_completeness': main/texobj.c:553:34: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] main/texobj.c:553:193: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] main/texobj.c:553:254: warning: signed and unsigned type in conditional expression [-Wsign-compare] main/texobj.c:553:148: warning: signed and unsigned type in conditional expression [-Wsign-compare] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 17:22:42 -08:00
Ian Romanick	7144b76872	mesa: Add missing API check for GL_TEXTURE_3D There are no 3D textures in OpenGL ES 1.x. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 17:22:42 -08:00
Ian Romanick	01bbebce4d	mesa: Add missing checks for GL_TEXTURE_CUBE_MAP_ARRAY That enum requires GL_ARB_texture_cube_map_array, and it is only available on desktop GL. It looks like this has been an un-noticed issue since GL_ARB_texture_cube_map_array support was added in commit `e0e7e295`. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 17:22:42 -08:00
Neil Roberts	5cddb1ce3c	wayland: Add an extension to create wl_buffers from EGLImages This adds an extension called EGL_WL_create_wayland_buffer_from_image which adds the following single function: struct wl_buffer * eglCreateWaylandBufferFromImageWL(EGLDisplay dpy, EGLImageKHR image); The function creates a wl_buffer which shares its contents with the given EGLImage. The expected use case for this is in a nested Wayland compositor which is using subsurfaces to present buffers from its clients. Using this extension it can attach the client buffers directly to the subsurface without having to blit the contents into an intermediate buffer. The compositing can then be done in the parent compositor. The extension is only implemented in the Wayland EGL platform because of course it wouldn't make sense anywhere else.	2013-12-04 17:04:57 -08:00
Kristian Høgsberg	bce64c6c83	egl/wayland: Damage INT32_MAX x INT32_MAX region for eglSwapBuffers If we're not using EGL_EXT_swap_buffers_with_damage, we have to damage the full extent. EGL operates on buffer coordinates, but wl_surface.damage takes surface coordinates. EGL doesn't know the buffer transformation (rotated or scaled) and can't post accurate damage in surface coordinates. The damage event however is clipped to the surface extents so we can just damage the maximum rectangle. In case of EGL_EXT_swap_buffers_with_damage, the application knows the buffer transform and is expected to pass in rectangles in surface space. https://bugs.freedesktop.org/show_bug.cgi?id=70250 Cc: "10.0" mesa-stable@lists.freedesktop.org	2013-12-04 16:13:42 -08:00
Axel Davy	afcce46fd5	Enable throttling in SwapBuffers flush_with_flags, when available, allows the driver to throttle. Using this suppresses input lag issues that can be observed in heavy rendering situations on non-intel cards. Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.0" mesa-stable@lists.freedesktop.org	2013-12-04 15:58:29 -08:00
Kristian Høgsberg	33eb5eabee	egl/wayland: Send commit after flushing the driver context This typically won't make a difference, since we only send the requests at wl_display_flush() time. There might be a small race with another thread calling wl_display_flush() after our commit request, but before we flush the DRI driver. Moving the commit below the DRI driver flush call looks more natural and eliminates the small race. Cc: "10.0" mesa-stable@lists.freedesktop.org	2013-12-04 15:48:28 -08:00
Axel Davy	402bf6e8d0	egl/wayland: Flush the wl_display at the end of SwapBuffers We would like the compositor to receive the committed buffer as soon as possible, so it has the time to treat it, and release old ones. We shouldn't rely on the client to flush the queue for us. Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.0" mesa-stable@lists.freedesktop.org	2013-12-04 15:48:28 -08:00
Brian Paul	50205e11c6	mesa: reduce memory used for short display lists Display lists allocate memory in chunks of 256 tokens (1KB) at a time. If an app creates many short display lists or uses glXUseXFont() this can waste quite a bit of memory. This patch uses realloc() to trim short lists and reduce the memory used. Also, null/zero-out some list construction fields in _mesa_EndList(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-04 15:40:32 -07:00
Brian Paul	314ccf6901	mesa: update/remove display list comments Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-04 09:46:07 -07:00
Brian Paul	483dc973c4	mesa: remove gl_dlist_node::next pointer to reduce dlist memory use Now, sizeof(gl_dlist_node)==4 even on 64-bit systems. This can halve the memory used by some display lists on 64-bit systems. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-04 09:46:07 -07:00
Brian Paul	b6468b4597	mesa: begin reducing memory used by display lists This is a first step in reducing memory used by display lists on 64-bit systems. On 64-bit systems, the gl_dlist_node union type is 8 bytes because of the 'data' and 'next' fields. This causes every display list node/token to occupy 8 bytes instead of 4 as originally designed. This basically doubles the memory used by some display lists on 64-bit systems. The fix is to remove the 64-bit 'data' and 'next' pointer fields from the union and instead store them as a pair of 32-bit values. Easily done with a few helper functions. The next patch will take care of the 'next' field. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-04 09:46:07 -07:00
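Note on b6468b4597 and 483dc973c4: a minimal sketch of the idea of keeping every display list node at 4 bytes by storing a pointer across consecutive 32-bit slots. The names `example_node`, `EXAMPLE_POINTER_NODES`, `example_store_pointer` and `example_load_pointer` are hypothetical and are not the helpers Mesa actually added.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical node type with no pointer members, so it stays 32 bits wide
 * even on 64-bit systems. */
typedef union {
   uint32_t opcode;
   int32_t  i;
   uint32_t ui;
   float    f;
} example_node;

/* Number of nodes needed to hold one pointer: 2 on 64-bit, 1 on 32-bit. */
#define EXAMPLE_POINTER_NODES \
   ((sizeof(void *) + sizeof(example_node) - 1) / sizeof(example_node))

/* Copy the pointer's bytes into EXAMPLE_POINTER_NODES consecutive nodes. */
static void example_store_pointer(example_node *dest, void *ptr)
{
   assert(dest != NULL);
   memcpy(dest, &ptr, sizeof(ptr));
}

/* Reassemble the pointer from the nodes written by example_store_pointer(). */
static void *example_load_pointer(const example_node *src)
{
   void *ptr;
   assert(src != NULL);
   memcpy(&ptr, src, sizeof(ptr));
   return ptr;
}
```

Using `memcpy` rather than a cast avoids alignment and aliasing problems when the nodes are only 4-byte aligned.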
Ilia Mirkin	06359e368b	nouveau: Add lots of comments to the buffer transfer logic Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-12-04 16:38:50 +01:00
Ilia Mirkin	0e5bf85651	nv50: wait on the buf's fence before sticking it into pushbuf This resolves some rendering issues in source games. See https://bugs.freedesktop.org/show_bug.cgi?id=64323 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>	2013-12-04 16:38:50 +01:00
Ilia Mirkin	ce6dd69697	nouveau: avoid leaking fences while waiting This fixes a memory leak in some situations. Also avoids emitting an extra fence if the kick handler does the call to nouveau_fence_next itself. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>	2013-12-04 16:38:50 +01:00
Ilia Mirkin	f50a45452a	nv50: fix a small leak on context destroy Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-12-04 16:38:50 +01:00
Brian Paul	657466a3f6	docs: put MD5 sums in 9.2.4 relnotes file Signed-off-by: Brian Paul <brianp@vmware.com>	2013-12-04 07:47:13 -07:00
Brian Paul	2732d0d21d	docs: use --disable-dri3 for VMware guest driver build For the time being at least. Suggested by Adrian Rangel. Signed-off-by: Brian Paul <brianp@vmware.com>	2013-12-04 07:41:29 -07:00
Siavash Eliasi	f0cc59d68a	mesa: modified _mesa_align_free() to accept NULL pointer So that it acts like ordinary free(). This lets us remove a bunch of if statements where the function is called. v2: - Avoid a compile error on MSVC and possible warnings on other compilers. - Added a comment noting that passing a NULL pointer is safe. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 07:31:27 -07:00
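Note on f0cc59d68a: a sketch of an aligned allocator whose free routine tolerates NULL, in the spirit of the change above. This is not the Mesa implementation; `example_align_malloc` and `example_align_free` are hypothetical, and stashing the original `malloc()` pointer just below the aligned block is an assumption chosen for the illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate 'bytes' bytes aligned to 'alignment' (a power of two that is at
 * least sizeof(void *)). The pointer returned by malloc() is stored in the
 * slot immediately before the aligned block. */
static void *example_align_malloc(size_t bytes, size_t alignment)
{
   assert(alignment >= sizeof(void *));
   assert((alignment & (alignment - 1)) == 0);

   void *raw = malloc(bytes + alignment + sizeof(void *));
   if (!raw)
      return NULL;

   uintptr_t start = (uintptr_t)raw + sizeof(void *);
   uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);

   ((void **)aligned)[-1] = raw;
   return (void *)aligned;
}

/* Like free(): passing NULL is safe, so callers need no "if (ptr)" guard. */
static void example_align_free(void *ptr)
{
   if (ptr)
      free(((void **)ptr)[-1]);
}
```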
Ilia Mirkin	267679be84	mesa: don't leak performance monitors on context destroy Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-04 06:20:36 -08:00
Ilia Mirkin	c45cf6199f	nv50: Fix GPU_READING/WRITING bit removal Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> CC: "9.1, 9.2, 10.0" <mesa-stable@lists.freedesktop.org>	2013-12-04 14:24:30 +01:00
Michel Dänzer	79e6512629	pipe-loader: Fix llvmpipe.la path Fixes make[3]: *** No rule to make target `.../src/gallium/drivers/softpipe/libllvmpipe.la', needed by `pipe_swrast.la'. Stop.	2013-12-04 11:56:10 +09:00
Kenneth Graunke	26b7b50afe	i965: Fix BRW_BATCH_STRUCT to specify RENDER_RING, not UNKNOWN_RING. I missed this in the boolean -> enum conversion. C cheerfully casts false -> 0 -> UNKNOWN_RING. On Gen4-5, this causes the render ring prelude hook to get called in the middle of the batch, which is crazy. BRW_BATCH_STRUCT is not used on Gen6+. Fixes regressions since `395a32717d` ("i965: Introduce an UNKNOWN_RING state."). Fixes "fips -v glxgears" on Ironlake. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-03 16:24:58 -08:00
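Note on 26b7b50afe: the pitfall described above, where a leftover boolean argument silently becomes the first enumerator, can be reproduced in a few lines of C. The enum and function below are hypothetical stand-ins; the only property borrowed from the commit message is that the "unknown" value is 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ring enum; the "unknown" state is the zero value. */
enum example_ring {
   EXAMPLE_UNKNOWN_RING,   /* 0 */
   EXAMPLE_RENDER_RING,
   EXAMPLE_BLT_RING
};

/* After a bool -> enum conversion of the parameter, an old call site that
 * still passes 'false' compiles without complaint in C. */
static enum example_ring example_select_ring(enum example_ring ring)
{
   return ring;
}
```

Calling `example_select_ring(false)` yields `EXAMPLE_UNKNOWN_RING`, because C converts `false` to 0 and 0 to the first enumerator; the call site has to name the intended ring explicitly.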
Kenneth Graunke	e03994bf47	Revert "i965: Move brw_emit_query_begin() to the render ring prelude." This reverts commit `a4bf7f6b6e`. It breaks occlusion queries on Gen4-5. Doing this right will likely require larger changes, which should be done at a future date. Some Piglit tests still passed due to other bugs; fixing those revealed this problem. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-03 16:24:53 -08:00
Kenneth Graunke	da07e1b683	i965: Fix OACONTROL assertion failures on Ironlake. I guarded half of the callers to start/stop_oa_counters with generation checks, but missed the other half (which were added later). OACONTROL doesn't exist on Ironlake, so we better not write it. Also, there's no need---Ironlake's performance counters are always running. This patch moves the generation checks into start/stop_oa_counters, rather than requiring the caller to do them. Fixes assertion failures in Piglit's AMD_performance_monitor/measure. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-03 16:24:49 -08:00
Emil Velikov	4c11099453	gallium/radeon: use PRIu64 macro for printing uint64_t Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-12-03 21:44:26 +00:00
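Note on 4c11099453: a short illustration of the `PRIu64` macro from `<inttypes.h>`, which expands to the correct `printf` length modifier for `uint64_t` on each platform. The helper name `example_format_size` is hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit value portably; "%lu" or "%llu" would each be wrong on
 * some platforms, while PRIu64 always matches uint64_t. */
static int example_format_size(char *buf, size_t len, uint64_t size)
{
   return snprintf(buf, len, "size=%" PRIu64, size);
}
```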
Emil Velikov	f60737a525	pipe-loader: build llvmpipe on top of softpipe One can select if they want to fallback to softpipe. Current approach makes this not possible, whereas other targets (dri-swrast) handle this appropriately. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-12-03 21:44:26 +00:00
Emil Velikov	bc2627a98a	mesa: resolve typo DTXn/DXTn Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-12-03 21:44:26 +00:00
Emil Velikov	507c2356e3	automake: include only one copy VERSION in tarball The VERSION file is tracked by git (git ls-files), thus adding it to EXTRA_FILES will result in a duplicate copy within the final tarball. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72230 Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reported-by: Patrick Steinhardt <ps@pks.im> Tested-by: Patrick Steinhardt <ps@pks.im> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-12-03 21:44:26 +00:00
Juha-Pekka Heikkila	03ef57950a	glx: Add missing null check in glx/dri2_glx.c Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-03 14:35:41 -07:00
Juha-Pekka Heikkila	b8875cb7c8	glx: Check malloc return value before accessing memory in glx/clientattrib.c Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-03 14:35:41 -07:00
Chad Versace	998018d7be	i965: Add extra-alignment for non-msrt fast color clear for all hw (v2) The BSpec states that the alignment for the non-msrt clear rectangle must be doubled; the BSpec does not restrict the workaround to specific hardware. Commit `9a1a67b` applied the workaround to Haswell GT3. Commit `8b659ce` expanded the workaround to all Haswell variants. This commit expands it to all hardware. No Piglit regressions on Ivybridge 0x0166. No fixes either. I know no Ivybridge nor Baytrail bug related to this workaround. However, the BSpec says the extra alignment is required, so let's do it. v2: Apply to all hardware, not just gen7. CC: "9.2, 10.0" <mesa-stable@lists.freedesktop.org> CC: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-12-03 13:19:54 -08:00
Marek Olšák	40e2856123	configure.ac: require libdrm_radeon 2.4.50	2013-12-03 20:07:35 +01:00
Marek Olšák	e47af58bb4	st/mesa: implement layered framebuffer clear for the clear_with_quad fallback Same approach as in u_blitter.	2013-12-03 19:39:13 +01:00
Marek Olšák	6b919b1b2d	gallium/util: implement layered framebuffer clear in u_blitter All bound layers (from first_layer to last_layer) should be cleared. This uses a vertex shader which outputs gl_Layer = gl_InstanceID, so each instance goes to a different layer. By rendering a quad and setting the instance count to the number of layers, it will trivially clear all layers. This requires AMD_vertex_shader_layer (or PIPE_CAP_TGSI_VS_LAYER), which only radeonsi supports at the moment. r600 could do this too. Standard DX11 hardware will have to use a geometry shader though, which has higher overhead.	2013-12-03 19:39:13 +01:00
Marek Olšák	1a02bb71dd	gallium: add support for AMD_vertex_shader_layer	2013-12-03 19:39:13 +01:00
Marek Olšák	d52791a708	radeonsi: add driver support for layered rendering and AMD_vertex_shader_layer Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-03 19:39:13 +01:00
Marek Olšák	053606ddae	radeonsi: implement OpenGL edge flags Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-03 19:39:13 +01:00
Marek Olšák	d8d67d2e1f	st/mesa: add support for layered framebuffers and consolidate code This is a subset of geometry shaders. It's all about setting first_layer and last_layer correctly. Also some code between st_render_texture and update_framebuffer_state is consolidated. It doesn't use rtt_level and derives the level from dimensions instead as the code in st_atom_framebuffer.c did.	2013-12-03 19:39:13 +01:00
Marek Olšák	0b3b901cff	mesa: expose AMD_vertex_shader_layer in the core profile only It needs glFramebufferTexture, which isn't available in the compatibility profile. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-03 19:39:13 +01:00
Tapani Pälli	a057b837dd	egl: add HAVE_LIBDRM define, fix EGL X11 platform Commit `a594cec` broke EGL X11 backend by adding dependency between X11 and DRM backends requiring HAVE_EGL_PLATFORM_DRM defined for X11. This patch fixes the issue by adding additional define for libdrm detection independent of which backend is being compiled. Tested by compiling Mesa with '--with-egl-platforms=x11' and running es2gears_x11 + glbenchmark2.7 successfully. v2: return true for dri2_auth if running without libdrm (Samuel) v3: check libdrm when building EGL drm platform + AM_CFLAGS fix (Emil) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72062 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: mesa-stable@lists.freedesktop.org	2013-12-03 09:21:24 -08:00
Andreas Heider	ad3937fd4e	freedreno: Add a few texture formats	2013-12-02 17:37:03 -05:00
Kenneth Graunke	decf070258	i965: Skip the register write check on Broadwell. MI_STORE_REGISTER_MEM has to take a 48-bit address, so the existing code doesn't work. But supposedly Broadwell has a register whitelist and just works out of the box anyway, so there's no need to check. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-02 13:26:03 -08:00
Kenneth Graunke	8ed9f69b36	i965: Fix texture border color on Broadwell. The Gen7 sampler state code still works. Increasing the alignment to 64 bytes makes bit 5 zero, which is good because it's now reserved. Since we don't use the new filter bits, we can leave those as zero too, which means we don't need to update the code to update the pointer. (We probably should anyway, for clarity, but alas, another day.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-02 13:25:52 -08:00
Kenneth Graunke	bc9d3a0254	i965: Don't use MACH for integer multiplies on Gen8+. The documentation is really hard to follow, but apparently a 32-bit x 32-bit multiply just works without the MACH macro. The macro apparently is only necessary to get the full 64-bit value. Fixes Piglit tests [vf]s-op-mult-int-int.shader_test. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-02 13:25:32 -08:00
Kenneth Graunke	5720832f23	i965: Fix texture swizzling on Broadwell. Like Haswell, we do this in SURFACE_STATE rather than shader workarounds. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-02 13:25:23 -08:00
Kenneth Graunke	1110ba4c08	i965: Set vertical alignment unit to 4 on Broadwell. Broadwell doesn't support a surface vertical alignment of 2. It only supports VALIGN_4, VALIGN_8, or VALIGN_16. I chose 4 since it's the least wasteful. v2: Replace my comment with a better one from Eric. Move Broadwell checks earlier so it's more obvious that "return 2" won't be hit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-02 13:25:11 -08:00
Kenneth Graunke	93658054c0	i965/vs: Always store pull constant offsets in GRFs on Gen8. We need to SEND from a GRF, and we can only obtain those prior to register allocation. This allows us to do pull constant loads without the MRF hack. v2: Reword comments (suggested by Paul). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-02 13:19:10 -08:00
Kenneth Graunke	dd159f25e4	i965/vs: Don't copy propagate into SEND-from-GRF messages. SEND can't deal with swizzles, source modifiers, and so on. This should avoid problems with VS pull constant loads on Broadwell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-02 13:10:12 -08:00
Francisco Jerez	ce34158680	clover: Fix missing minus sign in 'iterator_adaptor::operator-='. The method is currently unused, this probably doesn't fix anything at this point.	2013-12-02 11:55:02 -08:00
Chad Versace	8b659cef3a	i965/hsw: Apply non-msrt fast color clear w/a to all HSW GTs Pre-patch, the workaround was applied to only HSW GT3. However, the workaround also fixes render corruption on the HSW GT1 Chromebook, codenamed Falco. Also, update the BSpec quote that discusses the workaround to reflect the latest BSpec. The BSpec states that the workaround is required for Ivybridge and Baytrail as well as Haswell. But, we apply the workaround to only Haswell because (a) we suspect that is the only hardware where it is actually required and (b) we haven't yet validated the workaround for the other hardware. CC: "9.2, 10.0" <mesa-stable@lists.freedesktop.org> CC: Anuj Phogat <anuj.phogat@gmail.com> OTC-Tracker: CHRMOS-812 Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-12-02 10:53:33 -08:00
Kenneth Graunke	5b331f6fcb	glsl: Simplify the built-in function linking code. Previously, we stored an array of up to 16 additional shaders to link, as well as a count of how many each shader actually needed. Since the built-in functions rewrite, all the built-ins are stored in a single shader. So all we need is a boolean indicating whether a shader needs to link against built-ins or not. During linking, we can avoid creating the temporary array if none of the shaders being linked need built-ins. Otherwise, it's simply a copy of the array that has one additional element. This is much simpler. This patch saves approximately 128 bytes of memory per gl_shader object. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-01 15:33:04 -08:00
Kenneth Graunke	1b557b1606	glsl: Create an accessor for the built-in function shader. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-01 15:33:02 -08:00
Kenneth Graunke	5af97b43c9	glsl: Drop crazy looping from no_matching_function_error(). Since the built-in functions rewrite, num_builtins_to_link is always either 0 or 1, so we don't need the crazy loop starting at -1 with a special case. All we need to do is print the prototypes from the current shader, and the single built-in function shader (if present). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-01 15:33:00 -08:00
Kenneth Graunke	e04a97ff23	glsl: Merge "candidates are: " message to the previous line. Previously, when we hit a "no matching function" error, it looked like: 0:0(0): error: no matching function for call to `cos()' 0:0(0): error: candidates are: float cos(float) 0:0(0): error: vec2 cos(vec2) 0:0(0): error: vec3 cos(vec3) 0:0(0): error: vec4 cos(vec4) Now it looks like: 0:0(0): error: no matching function for call to `cos()'; candidates are: 0:0(0): error: float cos(float) 0:0(0): error: vec2 cos(vec2) 0:0(0): error: vec3 cos(vec3) 0:0(0): error: vec4 cos(vec4) This is not really any worse and removes the need for the prefix variable. It will also help with the next commit's refactoring. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-01 15:32:59 -08:00
Kenneth Graunke	e5e191a6b1	glsl: Drop unused call_ir parameter from generate_call(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-01 15:32:57 -08:00
Kenneth Graunke	c5adc1c8b5	glsl: Remove useless iteration through function parameters. There's no need to loop through the "parameters" list and remove every element; move_nodes_to(&parameters) already throws away all elements of the destination list. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-01 15:32:55 -08:00
Jon TURNEY	61e0f11170	Fix 'make check' in src/mapi/glapi/tests when builddir != srcdir make[5]: Entering directory `/jhbuild/build/mesa/mesa/src/mapi/glapi/tests' CXX check_table.o /jhbuild/checkout/mesa/mesa/src/mapi/glapi/tests/check_table.cpp:29:30: fatal error: glapi/glapitable.h: No such file or directory We should look for the generated file glapi/glapitable.h in builddir, not srcdir Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2013-12-01 12:30:25 +00:00
Ian Romanick	862044c7f7	docs: Import 10.0 release notes, add news item Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-30 23:42:51 -08:00
Paul Berry	c4cf487315	i965/gen6: Fix multisample resolve blits for luminance/intensity 32F formats. On gen6, multisample resolve blits use the SAMPLE message to blend together the 4 samples for each texel. For some reason, SAMPLE doesn't blend together the proper samples when the source format is L32_FLOAT or I32_FLOAT, resulting in blocky artifacts. To work around this problem, sample from the source surface using R32_FLOAT. This shouldn't affect rendering correctness, because when doing these resolve blits, the destination format is R32_FLOAT, so the channel replication done by L32_FLOAT and I32_FLOAT is unnecessary. Fixes piglit tests on Sandy Bridge: - spec/ARB_texture_float/multisample-formats 2 GL_ARB_texture_float - spec/ARB_texture_float/multisample-formats 4 GL_ARB_texture_float No piglit regressions on Sandy Bridge. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70601 Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-29 21:46:31 -08:00
Paul Berry	26498e0f0c	glsl: Remove unused field loop_variable_state::loop. This field was neither initialized nor used. It was just dead memory. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-29 21:46:28 -08:00
Paul Berry	af9af2965b	glsl: Improve documentation of ir_loop counter/control fields. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-29 21:46:23 -08:00
Paul Berry	a810db7b84	glsl: In ir_validate, check that ir_loop::counter always refers to a new var. The compiler back-ends (i965's fs_visitor and brw_visitor, ir_to_mesa_visitor, and glsl_to_tgsi_visitor) have been assuming this for some time. Thanks to the preceding patch, the compiler front-end no longer breaks this assumption. This patch adds code to validate the assumption so that if we have future bugs, we'll be able to catch them earlier. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-29 21:46:20 -08:00
Paul Berry	d6eb4321d0	glsl: Fix inconsistent assumptions about ir_loop::counter. The compiler back-ends (i965's fs_visitor and brw_visitor, ir_to_mesa_visitor, and glsl_to_tgsi_visitor) assume that when ir_loop::counter is non-null, it points to a fresh ir_variable that should be used as the loop counter (as opposed to an ir_variable that exists elsewhere in the instruction stream). However, previous to this patch: (1) loop_control_visitor did not create a new variable for ir_loop::counter; instead it re-used the existing ir_variable. This caused the loop counter to be double-incremented (once explicitly by the body of the loop, and once implicitly by ir_loop::increment). (2) ir_clone did not clone ir_loop::counter properly, resulting in the cloned ir_loop pointing to the source ir_loop's counter. (3) ir_hierarchical_visitor did not visit ir_loop::counter, resulting in the ir_variable being missed by reparenting. Additionally, most optimization passes (e.g. loop unrolling) assume that the variable mentioned by ir_loop::counter is not accessed in the body of the loop (an assumption which (1) violates). The combination of these factors caused a perfect storm in which the code worked properly nearly all of the time: for loops that got unrolled, (1) would introduce a double-increment, but loop unrolling would fail to notice it (since it assumes that ir_loop::counter is not accessed in the body of the loop), so it would unroll the loop the correct number of times. For loops that didn't get unrolled, (1) would introduce a double-increment, but then later when the IR was cloned for linking, (2) would prevent the loop counter from being cloned properly, so it would look to further analysis stages like an independent variable (and hence the double-increment would stop occurring). At the end of linking, (3) would prevent the loop counter from being reparented, so it would still belong to the shader object rather than the linked program object. Provided that the client program didn't delete the shader object, the memory would never get reclaimed, and so the shader would function properly. However, for loops that didn't get unrolled, if the client program did delete the shader object, and the memory belonging to the loop counter got re-used, this could cause a use-after-free bug, leading to a crash. This patch fixes loop_control_visitor, ir_clone, and ir_hierarchical_visitor to treat ir_loop::counter the same way the back-ends treat it: as a freshly allocated ir_variable that needs to be visited and cloned independently of other ir_variables. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72026 Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-29 21:46:17 -08:00
Paul Berry	9d2951ea0a	glsl: Teach ir_variable_refcount about ir_loop::counter variables. If an ir_loop has a non-null "counter" field, the variable referred to by this field is implicitly read and written by the loop. We need to account for this in ir_variable_refcount, otherwise there is a danger we will try to dead-code-eliminate the loop counter variable. Note: at the moment the dead code elimination bug doesn't occur due to a bug in ir_hierarchical_visitor: it doesn't visit the "counter" field, so dead code elimination doesn't treat it as a candidate for elimination. But the patch to follow will fix that bug, so we need to fix ir_variable_refcount first in order to avoid breaking dead code elimination. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-29 21:46:13 -08:00
Brian Paul	1fb106527f	mesa: fix mem leak of glPixelMap data in display list And simplify save_PixelMapfv() by using the memdup() function. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-29 06:41:14 -07:00
Brian Paul	90d85aa16c	mesa: added memory-related comment in save_error() Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-29 06:41:14 -07:00
Brian Paul	95d6ed22b3	mesa: fix flags assignment in save_WaitSync() The flags value is a bitfield so use the union's 'bf' field, not 'e' (enum) field. There's no actual change in behavior here since both fields of the union are the same size. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-29 06:41:14 -07:00
Brian Paul	efe7257ea7	mesa: remove old colortable, histogram, etc. code from dlist.c Trying to compile any of these functions into a display list now just generates a GL_INVALID_OPERATION error. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-29 06:41:13 -07:00
Brian Paul	90891091cd	mesa: have old convolution functions generate GL_INVALID_OPERATION Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-29 06:41:13 -07:00
Brian Paul	214399a3bc	mesa: have old glColorTable functions generate GL_INVALID_OPERATION As is done for the old histogram functions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-29 06:41:12 -07:00
José Fonseca	fb5f5b8188	trace: Dump PIPE_QUERY_* enums. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-28 12:19:42 +00:00
José Fonseca	eb040bd54a	trace: Dump query results faithfully. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-28 12:19:30 +00:00
Carl Worth	eeaa7a05a1	docs: Import 9.2.4 release notes, add news item.	2013-11-28 00:02:52 -08:00
Roland Scheidegger	ca39f4eee2	gallium/cso: fix sampler / sampler_view counts Now that it is possible to query drivers for the max sampler view it should be safe to increase this without crashing. Not entirely convinced this really works correctly though if state trackers using non-linked sampler / sampler_views use this. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-28 04:02:41 +01:00
Roland Scheidegger	2983c039df	gallium: new shader cap bit for the amount of sampler views Ever since introducing separate sampler and sampler view max this was really missing. Every driver but llvmpipe reports the same number as number of samplers for now, so nothing should break. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-28 04:02:18 +01:00
Roland Scheidegger	e4d8084cbd	gallium/drivers: support more sampler views than samplers for more drivers This adds support for this to more drivers, in particular for all the "special" ones useful for debugging. HW drivers are left alone, some should be able to support it if they want but they may not be interested at this point. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-28 04:01:54 +01:00
Ian Romanick	53a65e547c	i965: Properly reject __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS when __DRI2_ROBUSTNESS is not enabled Only allow __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS in brwCreateContext if intelInitScreen2 also enabled __DRI2_ROBUSTNESS (thereby enabling GLX_ARB_create_context). This fixes a regression in the piglit test "glx/GLX_ARB_create_context/invalid flag" v2: Remove commented debug code. Noticed by Jordan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reported-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-27 15:09:01 -08:00
Matt Turner	0822b2dfbd	Revert "drop old INTEL_DEBUG names for `perf` (fall) and `fs` (wm)" This reverts commit `195994fe4c`. It wasn't sent to the list, Ken didn't review it, and it breaks shader-db.	2013-11-27 13:38:42 -08:00
Vinson Lee	9bf41f09ab	glsl: Link glcpp with math library. This patch fixes this build error with Oracle Solaris Studio. libtool: link: /opt/solarisstudio12.3/bin/cc -g -o glcpp/glcpp glcpp.o prog_hash_table.o ./.libs/libglcpp.a Undefined first referenced symbol in file sqrt prog_hash_table.o Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-27 10:37:37 -08:00
Kenneth Graunke	c4815f6cd6	i965: Always reserve binding table space for at least one render target. In brw_update_renderbuffer_surfaces(), if there are no color draw buffers, we always set up a null render target at surface index 0 so we have something to use with the FB write marking the end of thread. However, when we recently began computing surface indexes dynamically, we failed to reserve space for it. This meant that the first texture would be assigned surface index 0, and our closing FB write would clobber the texture. Fixes Piglit's EXT_packed_depth_stencil/fbo-blit-d24s8 test on Gen4-5, which regressed as of commit `4e5306453d` ("i965/fs: Dynamically set up the WM binding table offsets.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70605 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Tested-by: lu hua <huax.lu@intel.com> Cc: "10.0" mesa-stable@lists.freedesktop.org	2013-11-27 10:28:43 -08:00
Francisco Jerez	6b2b4cc885	glsl: Initialize _mesa_glsl_parse_state::atomic_counter_offsets before using it. Cc: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-26 19:34:24 -08:00
Francisco Jerez	4f64dabb5f	i965/fs: Fix misleading comment. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-26 19:34:02 -08:00
Francisco Jerez	32f69ad86c	i965: Bump number of supported atomic counter buffers. Now that we have dynamic binding tables there's no good reason anymore to expose so few atomic counter buffers. Increase it to 16. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-26 19:34:02 -08:00
Paul Berry	d7fa9eb003	glsl/linker: Validate IR just before reparenting. If reparent_ir() is called on invalid IR, then there's a danger that it will fail to reparent all of the necessary nodes. For example, if the IR contains an ir_dereference_variable which refers to an ir_variable that's not in the tree, that ir_variable won't get reparented, resulting in subtle use-after-free bugs once the non-reparented nodes are freed. (This is exactly what happened in the bug fixed by the previous commit). This patch makes this kind of bug far easier to track down, by transforming it from a use-after-free bug into an explicit IR validation error. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-26 13:22:24 -08:00
Paul Berry	9dfcb05fa6	glsl: Fix lowering of direct assignment in lower_clip_distance. In commit `065da16` (glsl: Convert lower_clip_distance_visitor to be an ir_rvalue_visitor), we failed to notice that since lower_clip_distance_visitor overrides visit_leave(ir_assignment *), ir_rvalue_visitor::visit_leave(ir_assignment *) wasn't getting called. As a result, clip distance dereferences appearing directly on the right hand side of an assignment (not in a subexpression) weren't getting properly lowered. This caused an ir_dereference_variable node to be left in the IR that referred to the old gl_ClipDistance variable. However, since the lowering pass replaces gl_ClipDistance with gl_ClipDistanceMESA, this turned into a dangling pointer when the IR got reparented. Prior to the introduction of geometry shaders, this bug was unlikely to arise, because (a) reading from gl_ClipDistance[i] in the fragment shader was rare, and (b) when it happened, it was likely that it would either appear in a subexpression, or be hoisted into a subexpression by tree grafting. However, in a geometry shader, we're likely to see a statement like this, which would trigger the bug: gl_ClipDistance[i] = gl_in[j].gl_ClipDistance[i]; This patch causes lower_clip_distance_visitor::visit_leave(ir_assignment *) to call the base class visitor, so that the right hand side of the assignment is properly lowered. Fixes piglit test: - spec/glsl-1.50/execution/geometry/clip-distance-itemized-copy Cc: Ian Romanick <idr@freedesktop.org> Cc: "9.2" <mesa-stable@lists.freedesktop.org> Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-26 13:22:24 -08:00
Paul Berry	37bdde1087	i965/gs: Set GS prog_data to NULL if there is no GS program. The previous commit fixes a bug wherein we would incorrectly refer to stale geometry shader prog_data when no geometry shader was active. This patch reduces the likelihood of that sort of bug occurring in the future by setting prog_data to NULL whenever there is no GS program. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-26 13:22:23 -08:00
Paul Berry	2714ca81b9	i965/gs: Properly skip GS binding table upload when no GS active. Previously, in brw_gs_upload_binding_table(), we checked whether brw->gs.prog_data was NULL in order to determine whether a geometry shader was active. This didn't work: brw->gs.prog_data starts off as NULL, but it is set to non-NULL when a geometry shader program is built, and then never set to NULL again. As a result, if we called brw_gs_upload_binding_table() while there was no geometry shader active, but a geometry shader had previously been active, it would refer to a stale (and possibly freed) prog_data structure. This patch fixes the problem by modifying brw_gs_upload_binding_table() to use the proper technique to determine whether a geometry shader is active: by checking whether brw->geometry_program is NULL. This fixes the crash reported in comment 2 of bug 71870 (the incorrect rendering remains, however). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-26 13:21:56 -08:00
Ian Romanick	73e9aa9e3f	dri: Allow __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS in driCreateContextAttribs Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reported-by: Zhenyu Wang <zhenyuw@linux.intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-26 13:13:38 -08:00
Ian Romanick	9b1c68638d	i965: Only enable __DRI2_ROBUSTNESS if kernel support is available Rather than always advertising the extension but failing to create a context with reset notification, just don't advertise it. I don't know why it didn't occur to me to do it this way in the first place. NOTE: Kristian requested that I provide a follow-up for master that dynamically generates the list of DRI extensions instead of selecting between two hardcoded lists. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-26 13:10:52 -08:00
Ian Romanick	0ae8439906	Revert "i965: Make the driver compile until a proper libdrm can be released." libdrm 2.4.48 has been released. This reverts commit `bd4596efac`. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-26 13:10:52 -08:00
Ian Romanick	cb728bb028	i965: Bump libdrm requirement drm_intel_get_reset_stats is only available in libdrm-2.4.48, and libdrm-2.4.49 contains an important bug fix in that function. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-26 13:10:52 -08:00
Chad Versace	97851145bc	egl: Kill macro _EGL_DECLARE_MUTEX Replace all occurrences of the macro with its expansion. It seems that the macro intended to provide cross-platform static mutex initialization. However, it had the same definition in all pre-processor paths: #define _EGL_DECLARE_MUTEX(m) _EGLMutex m = _EGL_MUTEX_INITIALIZER Therefore this abstraction obscured rather than helped. Signed-off-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-26 12:50:30 -08:00
Chad Versace	3c58d4c700	egl: Enable EGL_EXT_client_extensions Insert two fields into _egl_global to hold the client extensions and statically initialize them: ClientExtensions // a struct of bools ClientExtensionString Post-patch, Mesa supports exactly one client extension, EGL_EXT_client_extensions. Signed-off-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-26 12:50:29 -08:00
Tom Stellard	ddc77c5092	radeon/compute: Unconditionally inline all functions v2 We need to do this until function calls are supported. v2: - Fix loop conditional https://bugs.freedesktop.org/show_bug.cgi?id=64225 CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-25 20:42:49 -08:00
Kenneth Graunke	ad542a10c5	i965: Use __attribute__((flatten)) on fast tiled teximage code. The fast tiled texture upload code does not compile with GCC 4.8's -Og optimization flag. memcpy() has the always_inline attribute set. This poses a problem, since {x,y}tile_copy_faster calls it indirectly via {x,y}tile_copy, and {x,y}tile_copy normally aren't inlined at -Og. Using __attribute__((flatten)) tells GCC to inline every function call inside the function, which I believe was the author's intent. Fix suggested by Alexander Monakov. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Cc: mesa-stable@lists.freedesktop.org	2013-11-25 19:13:23 -08:00
Zack Rusin	0510ec67e2	llvmpipe: support 8bit subpixel precision 8 bit precision is required by d3d10 but unfortunately requires 64 bit rasterizer. This commit implements 64 bit rasterization with full support for 8bit subpixel precision. It's a combination of all individual commits from the llvmpipe-rast-64 branch. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-25 13:05:03 -05:00
Maarten Lankhorst	5455c818b5	gbm/dri: hide extension loader symbols They should not be exposed. Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-25 13:13:47 +01:00
Chris Forbes	e6a0eca45e	i965: Enable ARB_draw_indirect (and ARB_multi_draw_indirect) on Gen7+ .. and mark them off on the extensions list as done. V2: Enable only if pipelined register writes work. V3: Also update relnotes Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:36 +13:00
Chris Forbes	093965f9e3	vbo: map indirect buffer and extract params if doing sw primitive restart V2: Check for mapping failure (thanks Brian) V3: - Change error on mapping failure to OUT_OF_MEMORY (Brian) - Unconst; remove casting away of const. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:36 +13:00
Chris Forbes	3953766e57	mesa: pass indirect buffer to sw primitive restart Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:36 +13:00
Chris Forbes	803fcc3298	i965: pass indirect buffer to primitive restart check Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
Chris Forbes	02f9757ab5	i965: implement indirect drawing for Gen7 Just prior to emitting the 3DPRIMITIVE command, we load each of the indirect registers. The values loaded are either from offsets into the current indirect BO, or constant zero if the parameter is not used for this draw. Enabling use of the indirect registers is done by turning on a bit in the first dword of the 3DPRIMITIVE command itself. V3: - Deduplicate the common part of both indexed and nonindexed indirect setup. - Just refer to the indirect bo out of the context directly. V4: - Fix bo reference to specify the range we care about. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
Chris Forbes	1a00317169	i965: Add new defines for indirect draws - MMIO registers for draw parameters - New bit in 3DPRIMITIVE command to enable indirection Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
Chris Forbes	5a798e73b5	vbo: Flesh out implementation of indirect draws Based on part of Patch 2 of Christoph Bumiller's ARB_draw_indirect series. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-25 22:01:35 +13:00
Chris Forbes	aadbb0f275	mesa: add indirect_offset, is_indirect to _mesa_prim V3: Add missing cases V4: Add indirect_offset here too Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
Chris Forbes	36046ae278	mesa: Add validation helpers for new indirect draws Based on part of Patch 2 of Christoph Bumiller's ARB_draw_indirect series. V3: - Disallow primcount==0 for DrawMulti*Indirect. The spec is unclear on this, but it's silly. We might go back on this later if it turns out to be a problem. - Make it clear that the caller has dealt with stride==0 V4: - Allow primcount==0 again. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
Chris Forbes	a95236cfc1	mesa: Add binding point for indirect buffer Based on part of Patch 2 of Christoph Bumiller's ARB_draw_indirect series. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
Chris Forbes	56e98fe2fe	mesa: Add extension scaffolding for ARB_draw_indirect We will reuse the same extension flag for ARB_multi_draw_indirect since it can always be supported by looping. Based on part of Patch 2 of Christoph Bumiller's ARB_draw_indirect series. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
Chris Forbes	5127318ae8	glapi: add plumbing for GL_ARB_draw_indirect and GL_ARB_multi_draw_indirect Based on part of Patch 2 of Christoph Bumiller's ARB_draw_indirect series. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
Christoph Bumiller	80ac616fca	mesa: add indirect drawing buffer parameter to draw functions Split from patch implementing ARB_draw_indirect. v2: Const-qualify the struct gl_buffer_object *indirect argument. v3: Fix up some more draw calls for new argument. v4: Fix up rebase conflicts in i965. v5: Undo const-qualification Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
José Fonseca	eb0892b4b1	docs/llvmpipe: Add one other good reference.	2013-11-25 08:28:23 +00:00
Chris Forbes	90d185544c	docs: describe the INTEL_* envvars that do exist V2: drop description of `fall` and `wm`, which have been removed by the previous patch; describe `stats`. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-25 21:18:33 +13:00
Chris Forbes	195994fe4c	drop old INTEL_DEBUG names for `perf` (fall) and `fs` (wm) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-25 21:18:33 +13:00
Chris Forbes	452721c1fa	i965: remove unused DEBUG_IOCTL Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-25 21:18:33 +13:00
Chris Forbes	e0c98fa401	radeon: change last instance of DEBUG_IOCTL to use RADEON_IOCTL DEBUG_IOCTL comes from i965, and is about to be removed. Both defines have the same value (4). Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-11-25 21:18:33 +13:00
Chris Forbes	26eb6ad831	docs: drop INTEL_* envvars which no longer exist These were removed back in 2012. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-25 21:18:33 +13:00
Chris Forbes	f6159afa19	docs: bump supported shading language version Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-25 21:18:33 +13:00
Dave Airlie	72cae2a599	st/mesa: respect higher GLSL levels. (v2) Limit the max glsl version level to what the state tracker supports. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-11-25 13:03:02 +10:00
Timothy Arceri	3c9f0096c7	glsl: Improve error message when attempting assignment to unsized array V2: Return after error to avoid cascading error messages and removed redundant "to" from error message Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-23 15:52:27 -08:00
Jordan Justen	bd00c66500	intel: enable GL_AMD_vertex_shader_layer extension for gen7+ Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-23 10:49:56 -08:00
Marek Olšák	751e8697f2	radeonsi: implement MSAA for CIK There are also some changes to the printfs. Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2013-11-23 01:54:58 +01:00
Marek Olšák	7b136de79a	radeonsi: enable 2D tiling on CIK libdrm does the DRM version check and decides if 2D tiling is used. Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2013-11-23 01:54:58 +01:00
Marek Olšák	a3969aa125	mesa: initialize gl_renderbuffer::Depth in core Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-23 01:54:57 +01:00
Eric Anholt	46cf80fb36	i965/fs: Make the first pre-allocation heuristic be the post heuristic. I recently made us try two different things that tried to reduce register pressure so that we would be more likely to allocate successfully. But now that we have the logic for trying two, we can make the first thing we try be the normal, not-prioritizing-register-pressure heuristic. This means one less scheduling pass in the common case of that heuristic not producing spills, plus the best schedule we know how to produce, if that one happens to succeed. This is important, because our register allocation produces a lot of possibly avoidable dependencies for the post-register-allocation schedule, despite ra_set_allocate_round_robin(). GLB2.7: 1.04127% +/- 0.732461% fps improvement (n=31) nexuiz: No difference (n=5) lightsmark: 0.838512% +/- 0.300147% fps improvement (n=86) minecraft apitrace: No difference (n=15) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-22 16:36:27 -08:00
Eric Anholt	09db4940ee	mesa: Remove the ralloc canary on release builds. The canary is basically just to give a better debugging message when you ralloc_free() something that wasn't rallocated. Reduces maximum memory usage of apitrace replay of the dota2 demo by 60MB on my 64-bit system (so half that on a real 32-bit dota2 environment). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-22 16:36:27 -08:00
Eric Anholt	5891f98145	i965: Fix streamed state dumping/annotation after the blorp-flush change. I think I was thinking of the batch command packet cache when I pasted this in, but this counter is only used for dumping out streamed state for INTEL_DEBUG=batch and for putting annotations in our aub files. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-22 16:36:27 -08:00
Chad Versace	315b06ff62	i965: Let driconf clamp_max_samples affect context version Commit `2f89662` added the driconf option 'clamp_max_samples'. In that commit, the option did not alter the context version. The neglect to alter the context version is a fatal issue for some apps. For example, consider running Chromium with clamp_max_samples=0. Pre-patch, Mesa creates a GL 3.0 context but clamps GL_MAX_SAMPLES to 0. This violates the GL 3.0 spec, which requires GL_MAX_SAMPLES >= 4. The spec violation causes WebGL context creation to fail in many scenarios because Chromium correctly assumes that a GL 3.0 context supports at least 4 samples. Since the driconf option was introduced largely for Chromium, the issue really needs fixing. This patch fixes calculation of the context version to respect the post-clamped value of GL_MAX_SAMPLES. This in turn fixes WebGL on Chromium when clamp_max_samples=0. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-22 15:27:03 -08:00
Chad Versace	95ebabbc5f	i965: Share code between intel_quantize_num_samples and clamp_max_samples clamp_max_samples() and intel_quantize_num_samples() each maintained their own list of which MSAA modes the hardware supports. This patch removes the duplication by making intel_quantize_num_samples() use the same list as clamp_max_samples(), the list maintained in brw_supported_msaa_modes(). By removing the duplication, we prevent the scenario where someone updates one list but forgets to update the other. Move function `brw_context.c:static brw_supported_msaa_modes()` to `intel_screen.c:(non-static) intel_supported_msaa_modes()` and patch intel_quantize_num_samples() to use the list returned by that function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-22 14:56:15 -08:00
Chad Versace	8d1a8d65b5	i965: Terminate brw_supported_msaa_modes() list with -1, not 0 This simplifies the loop logic in a subsequent patch that refactors intel_quantize_num_samples() to use brw_supported_msaa_modes(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-22 14:56:02 -08:00
Brian Paul	aad2511c6d	st/mesa: simplify writemask for emitting fog result Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-11-22 09:01:13 -07:00
Brian Paul	73b19be32d	mesa: fix indentation in ffvertex_prog.c Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-22 08:52:09 -07:00
José Fonseca	69049555af	tgsi: Prevent emission of instructions with empty writemask. These degenerate instructions can often be emitted by state trackers when the semantics of instructions don't match precisely. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-22 15:03:36 +00:00
José Fonseca	4ade77f625	tgsi: Rework calls to ureg_emit_insn(). Mere syntactical change. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-22 15:03:36 +00:00
José Fonseca	68b696e595	docs: Add a section with recommended reading for llvmpipe development. Several of the links were contributed by Keith Whitwell and Roland Scheidegger. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-22 15:03:36 +00:00
Roland Scheidegger	f69d2c857d	llvmpipe: (trivial) disable new accurate origin calculation It looks like there's some bugs in it...	2013-11-22 11:29:00 +00:00
Vinson Lee	bb354c6c27	meta: Move declaration before code. Fixes MSVC build. meta.c(2411) : error C2143: syntax error : missing ';' before 'type' meta.c(2411) : error C2143: syntax error : missing ')' before 'type' meta.c(2411) : error C2065: 'layer' : undeclared identifier meta.c(2411) : error C2059: syntax error : ')' meta.c(2411) : error C2143: syntax error : missing ';' before '{' meta.c(2413) : error C2065: 'layer' : undeclared identifier meta.c(2415) : error C2065: 'layer' : undeclared identifier Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-11-21 20:29:38 -08:00
Paul Berry	ec79c05cbf	mesa: Implement GL_FRAMEBUFFER_ATTACHMENT_LAYERED query. From section 6.1.18 (Renderbuffer Object Queries) of the GL 3.2 spec, under the heading "If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is TEXTURE, then": If pname is FRAMEBUFFER_ATTACHMENT_LAYERED, then params will contain TRUE if an entire level of a three-dimensional texture, cube map texture, or one- or two-dimensional array texture is attached. Otherwise, params will contain FALSE. Fixes piglit tests: - spec/!OpenGL 3.2/layered-rendering/framebuffer-layered-attachments - spec/!OpenGL 3.2/layered-rendering/framebuffertexture-defaults Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> v2: Don't include "EXT" in the error message, since this query only makes sense in context versions that have adopted glGetFramebufferAttachmentParameteriv(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-21 18:16:47 -08:00
Paul Berry	af1471dc04	mesa: Fix texture target validation for glFramebufferTexture() Previously we were using the code path for validating glFramebufferTextureLayer(). But glFramebufferTexture() allows additional texture types. Fixes piglit tests: - spec/!OpenGL 3.2/layered-rendering/gl-layer-cube-map - spec/!OpenGL 3.2/layered-rendering/framebuffertexture Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> v2: Clarify comment above framebuffer_texture(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-21 18:16:44 -08:00
Paul Berry	0831523350	i965: Fix fast clear of depth buffers. From section 4.4.7 (Layered Framebuffers) of the GL 3.2 spec: When the Clear or ClearBuffer* commands are used to clear a layered framebuffer attachment, all layers of the attachment are cleared. This patch fixes the fast depth clear path. Fixes piglit test "spec/!OpenGL 3.2/layered-rendering/clear-depth". Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-21 18:16:42 -08:00
Paul Berry	c1019670ea	i965: Fix blorp clear of layered framebuffers. From section 4.4.7 (Layered Framebuffers) of the GL 3.2 spec: When the Clear or ClearBuffer* commands are used to clear a layered framebuffer attachment, all layers of the attachment are cleared. This patch fixes the blorp clear path for color buffers. Fixes piglit test "spec/!OpenGL 3.2/layered-rendering/clear-color". Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-21 18:16:39 -08:00
Paul Berry	1ec5365429	i965: refactor blorp clear code in preparation for layered clears. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-21 18:16:36 -08:00
Paul Berry	068a073c1d	meta: fix meta clear of layered framebuffers From section 4.4.7 (Layered Framebuffers) of the GL 3.2 spec: When the Clear or ClearBuffer* commands are used to clear a layered framebuffer attachment, all layers of the attachment are cleared. This patch fixes meta clears to properly clear all layers of a layered framebuffer attachment. We accomplish this by adding a geometry shader to the meta clear program which sets gl_Layer to a uniform value. When clearing a layered framebuffer, we execute in a loop, setting the uniform to point to each layer in turn. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-21 18:16:34 -08:00
Paul Berry	95140740ad	mesa: Track number of layers in layered framebuffers. In order to properly clear layered framebuffers, we need to know how many layers they have. The easiest way to do this is to record it in the gl_framebuffer struct when we check framebuffer completeness. This patch replaces the gl_framebuffer::Layered boolean with a gl_framebuffer::NumLayers integer, which is 0 if the framebuffer is not layered, and equal to the number of layers otherwise. v2: Remove gl_framebuffer::Layered and make gl_framebuffer::NumLayers always have a defined value. Fix factor of 6 error in the number of layers in a cube map array. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 18:16:25 -08:00
Ben Skeggs	085ad4821e	nvc0: inform kernel about buffers that screen_create touches Prevents a GPU page fault if somehow the uniform bo gets evicted before the screen_create pushbuf has been submitted. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2013-11-22 11:34:43 +10:00
Tom Stellard	1bdb99330a	radeonsi/compute: Fix LDS size calculation We need to include the number of LDS bytes allocated by the state tracker. CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-21 16:14:58 -08:00
Tom Stellard	7a30cd7085	r600g/compute: Add a work-around for flushing issues on Cayman Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> https://bugs.freedesktop.org/show_bug.cgi?id=69321 CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-21 15:55:16 -08:00
Paul Berry	544e3129c5	glsl: Fix interstage uniform interface block link error detection. Previously, we checked for interstage uniform interface block link errors in validate_interstage_interface_blocks(), which is only called on pairs of adjacent shader stages. Therefore, we failed to detect uniform interface block mismatches between non-adjacent shader stages. Before the introduction of geometry shaders, this wasn't a problem, because the only supported shader stages were vertex and fragment shaders, therefore they were always adjacent. However, now that we allow a program to contain vertex, geometry, and fragment shaders, that is no longer the case. Fixes piglit test "skip-stage-uniform-block-array-size-mismatch". Cc: "10.0" <mesa-stable@lists.freedesktop.org> v2: Rename validate_interstage_interface_blocks() to validate_interstage_inout_blocks() to reflect the fact that it no longer validates uniform blocks. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> v3: Make validate_interstage_inout_blocks() skip uniform blocks. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-21 15:05:09 -08:00
Paul Berry	0f4cacbb53	glsl: Fix cross-version linking between VS and GS. Previously, when attempting to link a vertex shader and a geometry shader that use different GLSL versions, we would sometimes generate a link error due to the implicit declaration of gl_PerVertex being different between the two GLSL versions. This patch fixes that problem by only requiring interface block definitions to match when they are explicitly declared. Fixes piglit test "shaders/version-mixing vs-gs". Cc: "10.0" <mesa-stable@lists.freedesktop.org> v2: In the interface_block_definition constructor, move the assignment to explicitly_declared after the existing if block. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-21 15:05:06 -08:00
Paul Berry	2bbcf19aca	glsl: Prohibit illegal mixing of redeclarations inside/outside gl_PerVertex. From section 7.1 (Built-In Language Variables) of the GLSL 4.10 spec: Also, if a built-in interface block is redeclared, no member of the built-in declaration can be redeclared outside the block redeclaration. We have been regarding this text as a clarification to the behaviour established for gl_PerVertex by GLSL 1.50, so we apply it regardless of GLSL version. This patch enforces the rule by adding an enum to ir_variable to track how the variable was declared: implicitly, normally, or in an interface block. Fixes piglit tests: - gs-redeclares-pervertex-out-after-global-redeclaration.geom - vs-redeclares-pervertex-out-after-global-redeclaration.vert - gs-redeclares-pervertex-out-after-other-global-redeclaration.geom - vs-redeclares-pervertex-out-after-other-global-redeclaration.vert - gs-redeclares-pervertex-out-before-global-redeclaration - vs-redeclares-pervertex-out-before-global-redeclaration Cc: "10.0" <mesa-stable@lists.freedesktop.org> v2: Don't set "how_declared" redundantly in builtin_variables.cpp. Properly clone "how_declared". Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-21 15:04:59 -08:00
Kenneth Graunke	7a70f033b5	i965: Enable the AMD_performance_monitor extension on Gen5+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	2af1aedeca	i965: Take "bookend" OA snapshots at the start/end of each batch. Unfortunately, our hardware only has one set of aggregating performance counters shared between all 3D programs, and their values are not saved or restored by hardware contexts. Also, at least on Sandybridge and Ivybridge, the counters lose their values if the GPU goes to sleep. To work around both of these problems, we have to snapshot the performance counters at the beginning and end of each batch, similar to how we handle query objects on platforms that don't support hardware contexts. I call these "bookend" snapshots. Since there can be multiple performance monitors active at a time, we store the bookend snapshots in a global BO, shared by all monitors. For monitors that span multiple batches, acquiring results involves adding up three segments: BeginPerfMonitor --> End of Batch 1 ("head") Start of Batch 2 --> End of Batch 2 ... ("middle") Start of Batch N-1 --> End of Batch N-1 Start of Batch N --> EndPerfMonitor ("tail") Monitors that refer to bookend BO snapshots are considered "unresolved". We delay resolving them (and adding up deltas to obtain the results) as long as possible to avoid blocking on mapping monitor->oa_bo. We can also run out of space in the bookend BO, at which point we have to resolve all unresolved monitors. Then we can throw away the snapshots and begin writing at the beginning of the buffer. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	1172974ddd	i965: Reserve batchbuffer space for a closing MI_REPORT_PERF_COUNT. In order to use the Observability Architecture effectively, we'll need to take snapshots of the OA counters via MI_REPORT_PERF_COUNT at the start and end of each batch. Experimentation reveals that we need to flush before and after each MI_REPORT_PERF_COUNT to get working values. For simplicity, I chose to use intel_batchbuffer_emit_mi_flush(), which unfortunately expands to triple pipe controls on Sandybridge. We may want to start computing per-generation reserved batch space to avoid the insanity of Sandybridge's PIPE_CONTROL cost. That said, much of this cost existed before I rewrote the query object support to use hardware contexts, so it's at least not entirely new. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	fedc14a050	i965: Add some plumbing for gathering OA results. Currently, this only considers the monitor start and end snapshots. This is woefully insufficient, but allows me to add a bunch of the infrastructure now and flesh it out later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	c289c70ce1	i965: Start and stop OA counters as necessary. We need to start OA at the beginning of each batch where monitors are active. OACONTROL isn't part of the hardware context, so to avoid leaving counters enabled for other applications, we turn them off at the end of the batch too. We also need to start them at BeginPerfMonitor time (unless they've already been started). We stop them when the monitor last ends as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	834c9575b2	i965: Add functions to start and stop the OA counters. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	367c7c2d7c	i965: Add #defines for the OACONTROL register and fields. We'll need to write this register to start/stop performance counters. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	901cae07ff	i965: Take OA counter snapshots at Begin/EndPerfMonitor time. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	093ecbfe3b	i965: Add a function to emit the MI_REPORT_PERF_COUNT packet. MI_REPORT_PERF_COUNT writes a snapshot of the Observability Architecture counters to a buffer. Exactly how it works varies between generations: Ironlake requires two packets, Sandybridge has to use GGTT, and Ivybridge and later use PPGTT. v2: Assert that we didn't use more space than we reserved (suggested by Eric Anholt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	b05b1eff1c	i965: Track the number of monitors that need OA counters. Using the OA counters requires some per-batch work. When starting and ending a batch, it's useful to know whether any monitors are actually interested in OA data. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	7329f8dd10	i965: Enumerate Observability Architecture counters on Gen5+. In addition to listing the counter names, we include several "remap" tables. Confusingly, counters are documented with names like "A23", are written to some buffer offset other than 23, and exposed by core Mesa under a counter ID that is different still. The first is inevitable; MI_REPORT_PERF_COUNT writes certain counters to fixed locations in the buffer. The latter could be avoided, but core Mesa uses the "Counters" array index as the ID for a counter. We could do remapping there, but it would just complicate the core Mesa code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	9f41585eb5	i965: Expose pipeline statistics registers via performance monitors. This is fairly simple: - At BeginPerfMonitor time, take an opening snapshot. - At EndPerfMonitor time, take a closing snapshot. - The first time the application asks for results, subtract the two and store that value. Then free the BO containing the snapshots. - On subsequent requests for the results, just return the saved value. - On reset, throw away the results. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	91950d1aea	i965: Enumerate the pipeline statistics register counters on Gen6+. For now, we only support these on Gen6+, since that's what currently uses hardware contexts. When we add Ironlake hardware context support, we can add pipeline statistics register support for that as well. In theory, we could support pipeline statistics counters even without hardware contexts, but it would be annoyingly painful. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	569adb40d7	i965: Initialize performance monitor Groups/NumGroups. Since we don't support any counters, there are zero groups. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Kenneth Graunke	7bf3cd4315	i965: Add macros for creating performance monitor counters and groups. The Observability Architecture counters are 32-bit unsigned values, and the Pipeline Statistics Register counters are 64-bit unsigned values. These convenience macros make it easy to create those types of counters. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Kenneth Graunke	63b8ce612f	i965: Periodically dump the list of monitors if INTEL_DEBUG=perfmon. It's useful to see the state of all outstanding monitors; the start of a new batch seems like a reasonable time to print them out. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Kenneth Graunke	379a246fc1	i965: Add basic driver hooks and plumbing for AMD_performance_monitor. These stub functions will be filled out in later patches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Kenneth Graunke	b64eb100b0	i965: Add INTEL_DEBUG=perfmon support. This will enable debugging printfs for the AMD_performance_monitor code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Kenneth Graunke	a4bf7f6b6e	i965: Move brw_emit_query_begin() to the render ring prelude. Without hardware contexts, the pipeline statistics registers are free-running and include data from every 3D application running. In order to find out the contributions of one particular context, we need to take a snapshot at the start and end of each batch. Previously, we emitted the PIPE_CONTROL necessary to capture PS_DEPTH_COUNT when drawing primitives. Special tracking ensured it happened only on the first draw of the batch, rather than on every draw. Moving this to brw_new_batch increases symmetry, since the final snapshot has always been in brw_finish_batch, which is just a few lines below. It should be basically equivalent. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Kenneth Graunke	bb9d2eab89	i965: Introduce a "render ring prelude" hook. The new intel_batchbuffer_emit_render_ring_prelude() hook will be called when switching from BLT or UNKNOWN_RING to RENDER_RING. This provides a place to emit state that should go at the start of each render ring batch, with minimal overhead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Kenneth Graunke	395a32717d	i965: Introduce an UNKNOWN_RING state. When we first create a batch buffer, it's empty. We don't actually know what ring it will be targeted at until the first BEGIN_BATCH or BEGIN_BATCH_BLT macro. Previously, one could determine the state of the batch by checking brw->batch.ring (blit vs. render) and brw->batch.used != 0 (known vs. unknown). This should be functionally equivalent, but the tri-state enum is a bit clearer. v2: Catch three explicit require_space callers (thanks to Carl and Eric). v3: Split the boolean -> enum change from the UNKNOWN_RING change. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Kenneth Graunke	6bc40f9af5	i965: Convert brw->batch.is_blit to a BLT_RING/RENDER_RING enum. Passing BLT_RING or RENDER_RING to batchbuffer functions is a lot more obvious than passing true or false. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Roland Scheidegger	28d7b4147d	llvmpipe: calculate more accurate interpolation value at origin Some rounding errors could crop up when calculating a0. Use a more accurate method (barycentric interpolation essentially) to fix this, though to fix the REAL problem (which is that our interpolation will give very bad results with small triangles far away from the origin when they have steep gradients) this does absolutely nothing (actually makes it worse). (To fix the real problem, either would need to use a vertex corner (or some other point inside the tri) as starting point value instead of fb origin and pass that down to interpolation, or mimic what hw does, use barycentric interpolation (using the coordinates extracted from the rasterizer edge functions) - maybe another time.) Some (silly) tests though really want a high accuracy at fb origin and don't care much about anything else (Just. Don't. Ask.). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-21 20:39:19 +00:00
Brian Paul	9d1c71e34d	svga: remove special-case code for texkil w component Not actually needed. Fixes piglit ARB_fragment_program/kil-swizzle test. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-11-21 09:08:17 -07:00
José Fonseca	2d5f21ba65	gallium: Make TGSI_SEMANTIC_FOG register four-component wide. D3D9 Shader Model 2 restricted the fog register to one component, http://msdn.microsoft.com/en-us/library/windows/desktop/bb172945.aspx , but that restriction no longer exists in Shader Model 3, and several WHCK tests enforce that. So this change: - lifts the single-component restriction TGSI_SEMANTIC_FOG from Gallium interface - updates the Mesa state tracker to enforce output fog has (f, 0, 0, 1) - draw module was updated to leave TGSI_SEMANTIC_FOG output registers alone Several gallium drivers that are going out of their way to clear TGSI_SEMANTIC_FOG components could be simplified in the future. Thanks to Si Chen and Michal Krol for identifying the problem. Testing done: piglit fogcoord-*.vpfp tests Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-21 14:00:05 +00:00
José Fonseca	edd9efc2fb	tgsi_exec: Fix mask calculation for emit_kill_if. Same as Si Chen's commit `e7a5905d8a` for tgsi_exec module. Not actually tested, because softpipe is failing the test that caught this bug due to unrelated issues. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-21 13:56:10 +00:00
José Fonseca	bba8f10598	mesa: Use IROUND instead of roundf. roundf is not available on MSVC.	2013-11-21 13:56:00 +00:00
Tapani Pälli	7e61b44dcd	mesa: enable GL_TEXTURE_LOD_BIAS set/get Earlier comments suggest this was removed from GL core spec but it is still there. Enabling makes 'texture_lod_bias_getter' Khronos conformance tests pass, also removes some errors from Metro Last Light game which is using this API. v2: leave NOTE comment (Ian) Cc: "9.0 9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com>	2013-11-21 12:49:18 +02:00
Christian König	ecb37a6e77	winsys/radeon: cleanup virtual memory nonsense The alignment of a virtual memory area must always be at least 4096 bytes. It only worked because size was aligned to 4096 outside of the function. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-11-21 10:24:20 +01:00
Courtney Goeltzenleuchter	f56f875b8b	mesa: Update MESA_INFO to eliminate error If a user sets MESA_INFO and the OpenGL application uses a 3.0 or later context, then the MESA_INFO debug output will have an error when it queries for extensions using the deprecated enum GL_EXTENSIONS. Passing the context argument allows the code to return the extension list directly regardless of profile. Commit title updated as recommended by Kenneth Graunke. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-21 00:26:20 -08:00
Kenneth Graunke	36c3faf4bf	i965: Disable BLORP on Broadwell for now. BLORP is essential. However, porting it to Gen8 is a huge amount of work. Disabling it for now allows us to proceed with basic hardware enablement. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 00:26:11 -08:00
Kenneth Graunke	01ae16a0e7	i965: Disable HiZ on Broadwell for now. HiZ is difficult to implement, and while it's essential for performance, we don't need it right away for purposes of hardware enabling. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 00:26:11 -08:00
Kenneth Graunke	232140a47a	i965: Claim OpenGL 3.3 support on Broadwell. Bugs aside, basically everything ought to work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 00:26:11 -08:00
Kenneth Graunke	b61ff94032	i965: Add device info structs for Broadwell. As always, the chipset limits here are placeholders, rather than the actual values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 00:26:11 -08:00
Vinson Lee	b7c0b61782	glsl: Use more portable bash invocation construct. Fixes 'make check' on distros where bash is not at /bin/bash. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-20 22:39:59 -08:00
Vinson Lee	7f56780915	gallivm: Ignore unknown file type in non-debug builds. Fixes "Uninitialized pointer read" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-20 22:35:36 -08:00
Dave Airlie	b01a3a9b72	glx: don't fail out when no configs if we have visuals GLX 1.2 servers with no SGIX_fbconfigs exist (some citrix thing), and we fail glxinfo completely in those cases. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-11-21 10:50:48 +10:00
Dave Airlie	a43b49dfb1	mesa/swrast: fix inverted front buffer rendering with old-school swrast I've no idea when this broke, but we have some people who wanted it fixed, so here's my attempt. Reproducer: run readpix with swrast and hit f, or run trivial tri -sb; things are upside down. After this patch they aren't. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62142 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66213 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-11-21 10:50:17 +10:00
Eric Anholt	81ff29e30c	mesa: Fix setup of LocalParams array. i965 passed piglit, but swrast and gallium both segfaulted without this. i965 happened to work because it never ran _mesa_load_state_parameters() on the new program before the test called glProgramLocalParameter(), which was allocating a LocalParams array for the fallback path. v2: Since v1 threw away old localparams data, leaked old LocalParams memory, only fixed fragment programs, and I was dubious of my previous invariants already (nothing but program_parse.y will generate LocalParams, and only that one path of program_parse.y will), just late-allocate localparams at the other point of dereferencing them. This adds overhead to _mesa_load_state_parameter, which is uncomfortable, but I'm pretty sure that giant switch statement is super slow already. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71734 Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2013-11-20 16:12:46 -08:00
Matt Turner	5fe49d99f2	i965/test: Use unreachable() to silence warning.	2013-11-20 15:04:53 -08:00
Matt Turner	1f9092958d	i965: Link -ldl after libmesa.la DLOPEN_LIBS is part of DRI_LIB_DEPS. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71512 Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-20 15:04:53 -08:00
Matt Turner	a97cd0f4d7	i965: Add a pass to remove dead control flow. Removes IF/ENDIF and IF/ELSE/ENDIF with no intervening instructions. total instructions in shared programs: 1360393 -> 1360387 (-0.00%) instructions in affected programs: 157 -> 151 (-3.82%) (no change in vertex shaders) Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-20 15:04:53 -08:00
Matt Turner	b63d6aae55	i965: Make invalidate_live_intervals() a virtual method of backend_visitor. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-20 15:04:53 -08:00
Matt Turner	1c263f8f4f	i965/vec4: Add invalidate_live_intervals method. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-20 15:04:53 -08:00
Matt Turner	c4464c9eea	i965/fs: Don't emit SIMD16 BFI instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-20 15:04:52 -08:00
Matt Turner	9bbedf6146	i965/fs: Emit compressed 3-source instructions on Haswell. For commit `4df56177` Paul discovered that the hardware restriction that Align16 instructions cannot be compressed was lifted on Haswell. This has prevented us from emitting compressed three-source instructions. For added confirmation, the bspec lists a work around called WaBreakSimd16TernaryInstructionsIntoSimd8 that hasn't been applicable since very early Haswell silicon. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-20 15:04:52 -08:00
Matt Turner	82bfb45e24	i965: Fix disassembled names of BFI1 and BFI2 instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-20 15:04:52 -08:00
Matt Turner	9793fc1335	i965/fs: Use source's original type in register_coalesce(). Previously, register_coalesce() would modify mov vgrf1:f vgrf2:f cmp null vgrf3:d vgrf1:d to be cmp null vgrf3:d vgrf2:f and incorrectly use vgrf2's type in the instruction that the mov was coalesced into. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-20 15:04:52 -08:00
José Fonseca	060159820c	u_gen_mipmap: Use untampered cubemap texture coords when generating mipmaps. It's not necessary to scale down cubemap texture coords when generating mipmaps: we are doing a 2x minification, therefore it's guaranteed that the texture coords will always be at least 1 texel away from the edges. Scaling down can actually be harmful, as it may cause artefacts when generating mipmaps with nearest filtering. Sample points will lie exactly in the middle of each 2x2 texel block, so the scaling factor was causing different texels to be taken on each quadrant of the cube face. This is apparent with a 1x1 checkerboard pattern in the base mipmap level: instead of the next mipmap level receiving a constant color throughout the face, it will have different colors for each quadrant of the face. The behaviour for blits is left untouched for now, but the cubemap texture coord scaling hack should be reconsidered eventually. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-20 07:12:59 +00:00
Brian Paul	15d8e05e1e	st/mesa: fix GL_FEEDBACK mode inverted Y coordinate bug We need to check the drawbuffer's orientation before inverting Y coordinates. Fixes piglit feedback tests when running with the -fbo option. Cc: "9.2" "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-11-19 13:21:35 -07:00
Si Chen	e7a5905d8a	gallivm: Fix mask calculation for emit_kill_if. The exec_mask must be taken in consideration, just like emit_kill above. The tgsi_exec module has the same bug and should be fixed in a future change. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-11-19 19:16:18 +00:00
Paul Berry	81b998ca48	i965/gen7: Disallow Y tiling of renderable surfaces with valign of 2. Gen7 does not allow render targets to have a vertical alignment of 2. So, when creating a surface, if its format is renderable, and its vertical alignment is 2, force it to use X tiling. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-19 09:48:51 -08:00
Paul Berry	6b40dd17cf	i965/gen7: Prefer vertical alignment of 4 when possible. Gen6+ allows for color buffers to use a vertical alignment of either 4 or 2. Previously we defaulted to 2. This may have caused problems on Gen7 because Y-tiled render targets are not allowed to use a vertical alignment of 2. This patch changes the vertical alignment to 4 on Gen7, except for the few formats where a vertical alignment of 2 is required. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-19 09:48:48 -08:00
Paul Berry	60b1a118e1	i965/vec4: Fix broken IR annotation in debug output. Commit `70953b5` (i965: Initialize all member variables of vec4_instruction on construction) inadvertently added a line to the vec4_instruction constructor setting this->ir to NULL, wiping out the previously set value. As a result, ever since then, the output of INTEL_DEBUG=vs and INTEL_DEBUG=gs has been missing IR annotations. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-19 09:40:57 -08:00
Brian Paul	92c3d5acf7	svga: improve check for 3D compressed textures This is basically a respin of f1dfcf4bce35e6796f873d9a00103b280da81e4c per Jose's suggestion. Just set the SVGA3dSurfaceFormatCaps flags for 3D and cube textures when checking the texture format capabilities. This will filter out unsupported combinations like 3D+DXT. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-19 09:03:41 -07:00
Jon TURNEY	5ab59e5332	glx/tests: Provide __glXGetCurrentContext() stub when needed Refine `8c533022`. Provide a stub __glXGetCurrentContext() function when $(DEFINES) are such that it is not a macro. Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2013-11-19 15:28:22 +00:00
Brian Paul	21ae5135dd	svga: we don't support 3D compressed textures Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2013-11-18 16:34:02 -07:00
Brian Paul	7eab897d4d	st/mesa: pass correct pipe_texture_target to st_choose_format() We were always passing PIPE_TEXTURE_2D, but not all formats are supported for all types of textures. In particular, the driver may not support texture compression for all types of textures. Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2013-11-18 16:34:02 -07:00
Tom Stellard	1b9511d7ce	r600g/compute: Fix handling of global buffers in r600_resource_copy_region() Global buffers do not have an associate cs_buf handle, so we can't copy them using r600_copy_buffer() https://bugs.freedesktop.org/show_bug.cgi?id=64226 Reviewed-by: Marek Olšák <marek.olsak@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-18 12:28:13 -08:00
Tom Stellard	17930a66aa	gallium: Pass version scripts to linker using --version-script= This fixes build failures with the gold linker. CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-18 12:19:04 -08:00
Tom Stellard	a84dd2398f	clover: Optionally return context's devices from clGetProgramInfo() The spec allows clGetProgramInfo() to return information about either the devices associated with the program or the devices associated with the context. If there are no devices associated with the program, then we return devices associated with the context. https://bugs.freedesktop.org/show_bug.cgi?id=52171 Reviewed-by: Francisco Jerez <currojerez@riseup.net> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-18 11:54:28 -08:00
Paul Berry	7dfb4b2d00	i965/gen7: Emit workaround flush when changing GS enable state. v2: Don't go to extra work to avoid extraneous flushes. (Previous experiments in the kernel have suggested that flushing the pipeline when it is already empty is extremely cheap). Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-18 10:09:11 -08:00
Brian Paul	d222202193	osmesa: add missing comma	2013-11-18 09:14:48 -07:00
Brian Paul	cadec45c3d	osmesa: add support for postprocess filters Add new OSMesaPostprocess() function to allow using the gallium postprocessing filters. This only works for OSMesa with gallium drivers, not the legacy swrast OSMesa. Bump OSMESA_MAJOR/MINOR_VERSION numbers to 10.0 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-11-18 08:56:35 -07:00
Brian Paul	7cf40c1cb3	postprocess: document the pp_init() function. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-11-18 08:56:34 -07:00
Brian Paul	b7e5678fe5	postprocess: move #defines to filters.h They're not needed in postprocess.h Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-11-18 08:56:34 -07:00
Brian Paul	c27d8cc0c9	postprocess: refactor header files, etc Move private data structures and function prototypes out of the public postprocess.h header file. Create a pp_private.h for the shared, private data structures, functions. Remove pp_program.h header. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-11-18 08:56:34 -07:00
Brian Paul	de2fd7dd0b	postprocess: rename program to pp_program To match the pp_ namespace convention. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-11-18 08:56:34 -07:00
Brian Paul	401f2d6ea8	postprocess: simplify pp_free() code Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-11-18 08:56:34 -07:00
Emil Velikov	d33d260b90	docs: indicate GLX_MESA_query_renderer's completion Cc: "10.0" <mesa-stable@lists.freedesktop.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-18 15:38:37 +00:00
Emil Velikov	b8a1115132	docs: update nv50, nvc0 current status Acked-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-18 15:38:29 +00:00
Joerg Mayer	f9868926ee	docs: restructure GL3.txt - Indent items under a GL version to allow context diffs to do their work. - Move complete drivers into the GL version line - this should make the stuff a little bit easier to read. v2: keep the fd.o link (Emil Velikov) Acked-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Joerg Mayer <jmayer@loplof.de> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-18 15:38:16 +00:00
Emil Velikov	ca9794658e	docs: add a note about removed state tracker/targets The X.Org state tracker is gone, as well as the xvmc/vdpau r300 and softpipe targets. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-18 15:37:39 +00:00
Emil Velikov	0faaed2112	targets/xvmc: export only necessary symbols Export only XvMC* symbols for the xvmc targets. Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-18 15:35:21 +00:00
Emil Velikov	5896100a38	drivers/radeon: remove unused CXXFLAGS, LLVM_CPP_FILES The above two variables are unused as of commit `024fe6852a` (Author: Tom Stellard <thomas.stellard@amd.com>, Date: Tue Apr 2 10:42:50 2013 -0700, "radeon/llvm: Use LLVM C API for compiling LLVM IR to ISA v2"), which removed the only cpp file from drivers/radeon but neglected to remove the CXXFLAGS. The subsequent commit reintroduced an empty LLVM_CPP_FILES. Let's clean up and remove both. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-18 15:35:21 +00:00
José Fonseca	1e67ee8c9a	mesa/main: Move declaration to beginning of scope. Should fix MSVC build. Trivial.	2013-11-18 14:43:31 +00:00
Courtney Goeltzenleuchter	2cfbf84dad	mesa: Add API debug logging to TexStorage Give glTexStorage* equivalent debug logging to glTexImage*. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-17 19:57:17 -08:00
Tapani Pälli	53f89a436f	glsl: cleanup, remove duplicate assignment Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-17 18:51:37 -08:00
Kenneth Graunke	d12e0e8972	mesa: Handle !m->Ended for performance monitor result availability. If a performance monitor has never ended, then no result can be available. Core Mesa can easily handle this, saving drivers a tiny bit of complexity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-17 18:51:07 -08:00
Kenneth Graunke	bde5e4a1e6	mesa: Track whether a performance monitor has ever ended. If a monitor has ended, it means a result should eventually become available, pending some flushing. This is distinct from !m->Active; if a monitor has not been started, then m->Active == false and m->Ended == false. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-17 18:51:07 -08:00
Kenneth Graunke	a6712f5109	mesa: Also initialize gl_performance_monitor::Active. The i965 implementation uses calloc, so I missed this. It's best to simply initialize it to avoid requiring a zeroing allocator, though. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-17 18:51:06 -08:00
Kenneth Graunke	145138fb3c	mesa: Store the performance monitor object's name. Being able to print monitor->Name is really useful for debugging. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-17 18:51:06 -08:00
Chris Forbes	45a56ce399	mesa: bump version to 10.1 (devel) Now that branch 10.0 is created, bump the minor version in master. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-17 20:31:49 +13:00
Chris Forbes	61143b87c1	i965: Fix broken asserts These would never fire. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-17 18:56:57 +13:00
Chris Forbes	0741997ff0	st/vega: Fix broken assert This would never fire. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-17 18:56:55 +13:00
Chris Forbes	6f7c693a85	r600/sb: Fix broken assert This would never fire. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-17 18:56:40 +13:00
Vadim Girlin	4cb04aa0df	r600g/sb: work around hw issues with stack on eg/cm v2: make it actually work, improve condition Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68503 Cc: "10.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-11-17 01:36:28 +04:00
Kenneth Graunke	04856ceb5c	i965: Make swizzle_to_scs non-static. We'll need this for Broadwell code as well. Normally, when we make things public, we add the "brw" prefix. I'm not crazy about that in this case, since it deals with prog_instruction.h's SWIZZLE_XYZW values, rather than the BRW_SWIZZLE_XYZW enums. However, I can't think of a better name, and at least the comments and code make it clear. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-11-16 09:12:58 -08:00
Kenneth Graunke	717241bf4a	i965: Move enum brw_urb_write_flags from brw_eu.h to brw_defines.h. Broadwell code should not include brw_eu.h (since it is for Gen4-7 assembly encoding), but needs the URB write flags enum. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-11-16 09:12:58 -08:00
Kenneth Graunke	ec8cc65926	i965/fs: Remove force_sechalf stack Only Gen4 color write setup uses the force_sechalf flag, and it only sets it on a single instruction. It also already has to get a pointer to the instruction and manually set the saturate flag, so we may as well just set force_sechalf the same way and avoid the complexity of a stack. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-11-16 09:12:57 -08:00
Emil Velikov	02fdb5cb51	targets/dri: move linker flags out of configure into Automake.inc Previous assumption was that the same set of flags can be reused for both classic and gallium drivers. With megadriver work done the classic drivers ended up using their own (single) instance of the flags. Move these into Automake.inc and rename to indicate that those are gallium specific. Additionally silence an automake/autoconf warning "XXX is not a standard libtool library name", due to the parsing issues of the module tag. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:04 +00:00
Emil Velikov	5b8c2c8f00	targets/dri: compact compiler flags into Automake.inc Greatly reduce duplication and provide a sane minimum of CFLAGS for all DRI targets. Note: This commit adds VISIBILITY_CFLAGS to the following: * freedreno * i915 * ilo * nouveau * vmwgfx Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:04 +00:00
Emil Velikov	38e0b7eeaa	targets/xvmc: do not link against libtrace.la In order to use the trace driver, one needs to define GALLIUM_TRACE. Neither one of the two targets was defining it, thus we're safe to remove libtrace.la. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:04 +00:00
Emil Velikov	dfcdece7c5	targets/xvmc: consolidate lib deps into Automake.inc Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:04 +00:00
Emil Velikov	bfda1460b1	targets/xvmc: move linker flags to Automake.inc Minimise duplication and sources of error (eg nouveau was missing shared and no-undefined) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:03 +00:00
Emil Velikov	5d7d120af1	targets/xvmc: drop duplicated compiler flags Automake.inc already has GALLIUM_VIDEO_CFLAGS, which provide the essential compiler flags needed. Note: this commit adds VISIBILITY_CFLAGS to nouveau. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:03 +00:00
Emil Velikov	f7ac1d5989	gallium/winsys: compact compiler flags into Automake.inc Clean up the duplicated flags and consolidate into a single variable. Note: this patch adds VISIBILITY_CFLAGS to the following targets * freedreno/drm * i915/{drm,sw} * nouveau/drm * sw/fbdev * sw/null * sw/wayland * sw/wrapper * sw/xlib Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:03 +00:00
Emil Velikov	096b988360	targets/vdpau: drop unused libraries from linker In order for one to use trace, noop, rbug and/or galahad, they must set the corresponding GALLIUM_* CFLAG. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:03 +00:00
Emil Velikov	3f920a91f3	targets/vdpau: consolidate lib deps into Automake.inc Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:03 +00:00
Emil Velikov	5f0df8ab22	targets/vdpau: move linker flags to Automake.inc Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:02 +00:00
Emil Velikov	23588a9c04	targets/vdpau: compact compiler flags into Automake.inc Store the compiler flags into a variable, in order to minimise flags duplication (amongst vdpau and xvmc). Note: this commit adds VISIBILITY_CFLAGS to the nouveau target Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:02 +00:00
Emil Velikov	7dac1b470a	gallium/drivers: compact compiler flags into Automake.inc * minimise flags duplication * distinguish between VISIBILITY C and CXX flags * set only required flags - C and/or CXX v2: add LLVM_CFLAGS back to AM_CFLAGS (add missing backslash) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:29:28 +00:00
Emil Velikov	ad501a535a	targets/radeonsi: move drm_target.c to a common folder ... and symlink to each target. Make automake's subdir-objects work for radeonsi. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:02:52 +00:00
Emil Velikov	23cdf8de32	targets/r600: move drm_target.c to common folder ... and symlink for each target. Make automake's subdir-objects work for r600. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:02:52 +00:00
Emil Velikov	a9a3029541	targets/r300: move drm_target.c to common folder ... and symlink for each target. Make automake's subdir-objects work for r300. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:02:52 +00:00
Emil Velikov	589e0b2305	gallium/drivers: enable automake subdir-objects Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:02:51 +00:00
Emil Velikov	d5e79a9d2b	r300: move the final sources list to Makefile.sources Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:02:47 +00:00
Emil Velikov	2c1bb79213	r300: add symlink to ralloc.c and register_allocate.c Make automake's subdir-objects work. Update includes. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:02:15 +00:00
Emil Velikov	b3c60ff5d0	st/xvmc: enable automake subdir-objects Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:02:15 +00:00
Emil Velikov	01d35eb372	dri/common: move source file lists to Makefile.sources * Allow the lists to be shared among build systems. * Update automake and Android build systems. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-16 14:02:15 +00:00
Emil Velikov	b51b3fc537	gtest: enable subdir-objects to prevent automake warnings Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:01:27 +00:00
Emil Velikov	b5773ee043	gbm: enable subdir-objects to prevent automake warnings Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:00:16 +00:00
Emil Velikov	0b57da0211	scons: move SConscript from gallium/targets/ to mesa/drivers/dri/common/ Store scons side by side with the other build systems. v2: cleanup after a failed rebase Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:00:16 +00:00
Johannes Obermayr	595bd01eb1	freedreno: compact a2xx and a3xx makefiles into parent ones Nearly everything within the three Makefile.am's is identical. Let's simplify things a little. v2: Rebase and rewrite the commit message (Emil Velikov) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:00:16 +00:00
Emil Velikov	c5062726f1	scons: drop obsolete enabled_apis variable The variable was forgotten during the FEATURE_* removal. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:00:15 +00:00
Emil Velikov	1aeafcb7c5	Android: remove unused MESA_ENABLED_APIS variable The variable was forgotten during the FEATURE_* removal. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:00:15 +00:00
Emil Velikov	9560d34fcf	st/egl: use _FILE over _SOURCES names for filelists Silence automake warnings about missing program/library whenever the _SOURCES suffix is used for temporary variable names. warning: variable 'gdi_SOURCES' is defined but no program or library has 'gdi' as canonical name (possible typo) Acked-by: Matt Turner <mattst88@gmail.com> Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Reported-by: Johannes Obermayr <johannesobermayr@gmx.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70581 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 13:53:31 +00:00
Matt Turner	e133c0103d	i965: Assert that IF with cmod is Gen6 only. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-15 23:31:42 -08:00
Vinson Lee	b570c4229f	i965: Add missing break in SHADER_OPCODE_GEN7_SCRATCH_READ case. Fixes "Missing break in switch" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 18:29:34 -08:00
Eric Anholt	e5885c119d	mesa: Dynamically allocate the storage for program local parameters. The array was 64kb per struct gl_program, plus we statically stored a copy of one on disk for _mesa_DummyProgram. Given that most struct gl_programs we generate are for GLSL shaders that don't have local parameters, this was a waste. Since you can store and fetch parameters beyond what the program actually uses, we do have to do a late allocation if necessary at GetProgramLocalParameter time. Reduces peak memory usage in the dota2 trace I made by 76MB (4.5%) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-15 11:35:01 -08:00
Eric Anholt	bb1f096975	mesa: Remove PROGRAM_ENV_PARAM enum. This has been replaced with referring to env parameters using PROGRAM_STATE_VAR and _mesa_load_state_parameters. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-15 11:34:59 -08:00
Eric Anholt	33b0455211	mesa: Remove PROGRAM_LOCAL_PARAM enum. This has been replaced with referring to local parameters using PROGRAM_STATE_VAR and _mesa_load_state_parameters. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-15 11:34:57 -08:00
Eric Anholt	fddc17ab36	mesa: Update a comment about valid values of a field. Notably, ENV and LOCAL aren't used any more (replaced by STATE_VAR), but apparently CONSTANT is. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-15 11:34:49 -08:00
Eric Anholt	aa6d7bc6d6	glsl: Apply the transformation "1/rsq(x) == sqrt(x)" in opt_algebraic. The comment was stale, because the lowering in question wasn't happening in lower_instructions.cpp. Presumably if the lowering ever moves there, we can plumb the lowering mask through to opt_algebraic. total instructions in shared programs: 1618696 -> 1616810 (-0.12%) instructions in affected programs: 243018 -> 241132 (-0.78%) GAINED: 0 LOST: 0 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-15 11:33:07 -08:00
Eric Anholt	477f8cd08b	glsl: Apply the transformation "(a ^^ a) -> false" in opt_algebraic. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-15 11:33:07 -08:00
Eric Anholt	58a98d32e4	glsl: Apply the transformation "(a && a) -> a" in opt_algebraic. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-15 11:33:07 -08:00
Eric Anholt	ee27048262	glsl: Apply the transformation "(a || a) -> a" in opt_algebraic. total instructions in shared programs: 1732385 -> 1732373 (-0.00%) instructions in affected programs: 416 -> 404 (-2.88%) GAINED: 0 LOST: 0 (That's 4 already-short fragment shaders in dota2) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-15 11:33:07 -08:00
Eric Anholt	8957c6b887	glsl: Move the CSE equality functions to the ir class. I want to reuse them in opt_algebraic. v2: Merge in Chris Forbes's break fix. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-15 11:33:07 -08:00
Matt Turner	fc51e7ac58	clover: Remove dead file from Makefile.sources. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-15 11:10:32 -08:00
Kenneth Graunke	4ec982ad01	i965: Rework brw_new_batch to actually start a new batch. Previously, brw_new_batch was called just after execbuf, but before intel_batchbuffer_reset. Essentially, it prepared for the creation of a new batch, that wasn't yet available, and which it didn't create. This was a bit awkward. This patch makes brw_new_batch call intel_batchbuffer_reset as the very first operation. This means that brw_new_batch actually creates a new batchbuffer, and thus has it available. It brings the creation of the new batchbuffer and BRW_NEW_BATCH flagging together into one place. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-15 10:24:07 -08:00
Kenneth Graunke	720d935fff	i965: Move cache_used_by_gpu flag setting to brw_finish_batch. It really makes more sense here. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-15 10:24:07 -08:00
Ian Romanick	96a3527a63	i915: Actually enable __DRI2rendererQueryExtensionRec More rebase fail. This code was written long before i915 and i965 were split, so most of the code in i9[16]5/intel_screen.c only needed to exist in one place. It looks like I fixed n-1 of those places after rebasing on the split. I only found this from the defined-but-not-used warning for intelRendererQueryExtension. I noticed this while fixing the other, related warnings. (Note: During review, we decided to not pick this back to 10.0.) Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-11-15 10:10:29 -08:00
Aaron Watry	2be85e2492	radeon/llvm: Free elf_buffer after use Prevents a memory leak. v2: Remove null check CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:53:31 -08:00
Aaron Watry	01f3622c74	r600/llvm: Free binary.code/binary.config in r600_llvm_compile radeon_llvm_compile allocates memory for binary.code, binary.config, or neither depending on what's being done. We need to make sure to free that memory after it's no longer needed. v2: Don't bother checking for null before FREE() CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:53:31 -08:00
Aaron Watry	dd73b99420	r600/llvm: initialize radeon_llvm_binary use memset to initialize to 0's... otherwise code_size and config_size could be uninitialized when read later in this method. It's also hard to do NULL checks on uninitialized pointers. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> v2: Fix indentation CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:53:31 -08:00
Brian Paul	2bc1680665	svga: remove unused vars in svga_hwtnl_simple_draw_range_elements() And simplify the code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-15 10:27:01 -07:00
Brian Paul	1a36dfb21e	svga: print warning for unsupported indirect dest reg indexing For DX9-level shaders, there's only limited support for indirect indexing of registers (with the loop counter register, not the general address register.) Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-15 10:23:49 -07:00
Brian Paul	3969330b47	svga: mark dest image as defined in svga_surface_copy() After we blit/copy to a dest texture image we need to mark it as being defined. This fixes broken mipmap generation for quite a few texture formats. Mipgen involves making texture views and svga_texture_view_surface() skips texture images that are undefined. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-15 10:23:48 -07:00
Brian Paul	79984b9928	svga: do primitive trimming in translate_indices() The index translation code expects the number of indexes to be consistent with the primitive type (ex: a multiple of 3 for PIPE_PRIM_TRIANGLES). If it's not, we can write out of bounds in the destination buffer. Fixes failed assertions in the pipebuffer debug code found with Piglit primitive-restart-draw-mode test. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-11-15 10:23:48 -07:00
Brian Paul	491d6397fc	indices: add comments, assertions in u_indices.c file Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-15 10:23:48 -07:00
Brian Paul	2253fed4a0	mesa: remove duplicated prototypes in varray.h	2013-11-15 10:23:48 -07:00
Aaron Watry	598f61ba28	gallium/pipe_loader: un-reference udev resources when we're done with them. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:16:49 -08:00
Aaron Watry	4c6ac9e614	radeonsi/compute: Dispose of LLVM module after compiling kernels v2: Fix indentation Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:16:49 -08:00
Aaron Watry	35dad4a1e2	radeonsi/compute: Free program and program.kernels on shutdown v2: Fix indentation Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:16:49 -08:00
Aaron Watry	d41b10f811	radeon/llvm: Free created llvm memory buffer v2: Fix indentation Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:16:49 -08:00
Aaron Watry	a2b93da84b	radeon/llvm: Free libelf resources v2: Fix indentation Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:16:49 -08:00
Aaron Watry	df482fe02f	radeon/llvm: fix spelling error Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:16:49 -08:00
Tom Stellard	17af4dd52b	clover: Support multiple devices in clCreateContextFromType() v2 v2: - Use clGetDeviceIDs to query devices. Reviewed-by: Francisco Jerez <currojerez@riseup.net> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:16:48 -08:00
Paul Berry	f38ac41ed4	glsl: Rework interface block linking. Previously, when doing intrastage and interstage interface block linking, we only checked the interface type; this prevented us from catching some link errors. We now check the following additional constraints: - For intrastage linking, the presence/absence of interface names must match. - For shader ins/outs, the interface names themselves must match when doing intrastage linking (note: it's not clear from the spec whether this is necessary, but Mesa's implementation currently relies on it). - Array vs. nonarray must be consistent, taking into account the special rules for vertex-geometry linkage. - Array sizes must be consistent (exception: during intrastage linking, an unsized array matches a sized array). Note: validate_interstage_interface_blocks currently handles both uniforms and in/out variables. As a result, if all three shader types are present (VS, GS, and FS), and a uniform interface block is mentioned in the VS and FS but not the GS, it won't be validated. I plan to address this in later patches. Fixes the following piglit tests in spec/glsl-1.50/linker: - interface-blocks-vs-fs-array-size-mismatch - interface-vs-array-to-fs-unnamed - interface-vs-unnamed-to-fs-array - intrastage-interface-unnamed-array v2: Simplify logic in intrastage_match() for handling array sizes. Make extra_array_level const. Use an unnamed temporary interface_block_definition in validate_interstage_interface_blocks()'s first call to definitions->store(). Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-15 08:56:28 -08:00
Paul Berry	b4c3b833ec	i965: Fix vertical alignment for multisampled buffers. From the Sandy Bridge PRM, Vol 1 Part 1 7.18.3.4 (Alignment Unit Size): j [vertical alignment] = 4 for any render target surface is multisampled (4x) From the Ivy Bridge PRM, Vol 4 Part 1 2.12.2.1 (SURFACE_STATE for most messages), under the "Surface Vertical Alignment" heading: This field is intended to be set to VALIGN_4 if the surface was rendered as a depth buffer, for a multisampled (4x) render target, or for a multisampled (8x) render target, since these surfaces support only alignment of 4. Back in 2012 when we added multisampling support to the i965 driver, we forgot to update the logic for computing the vertical alignment, so we were often using a vertical alignment of 2 for multisampled buffers, leading to subtle rendering errors. Note that the specs also require a vertical alignment of 4 for all Y-tiled render target surfaces; I plan to address that in a separate patch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53077 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-15 08:54:15 -08:00
Paul Berry	46e9f78efc	main: Fix MaxUniformComponents for geometry shaders. For both vertex and fragment shaders we default MaxUniformComponents to 4 * MAX_UNIFORMS. It makes sense to do this for geometry shaders too; if back-ends have different limits they can override them as necessary. Fixes piglit test: spec/glsl-1.50/built-in constants/gl_MaxGeometryUniformComponents Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-15 08:47:41 -08:00
José Fonseca	420ccf7b8f	tools/trace: Several bugfixes/improvements to dump_state.py - Don't crash with user memory pointers. - Support old bind__sampler_ methods. Useful when comparing dumps from old branches. - Misc.	2013-11-15 15:42:02 +00:00
José Fonseca	c5a05a6aef	trace: Dump user_buffer members.	2013-11-15 15:32:33 +00:00
Fredrik Höglund	ff353c218a	mesa: Fix derived vertex state not being updated in glCallList() AEcontext::NewState is not always set when the vertex array state is changed. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71492 Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-15 15:23:23 +00:00
Alex Deucher	469b42ee21	radeonsi: add Hawaii pci ids Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-11-15 08:51:20 -05:00
Alex Deucher	f5778f152b	radeonsi: add support for Hawaii asics (v2) Update additional register fields. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-11-15 08:51:09 -05:00
Vinson Lee	78fc159d68	i965: Initialize schedule_node::delay. Fixes "Uninitialized scalar field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-14 22:36:26 -08:00
Alexander von Gluck IV	f7ce1d772d	haiku/swrast: Inherit gl_config, fix flush * Inherit gl_context so we always have access to it * Thanks curro for the idea. * Last Haiku candidate for 10.0.0 Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-11-14 12:33:03 -06:00
Roland Scheidegger	473cb3fe4a	llvmpipe: (trivial) fix more fallout from the setup cleanup. Oops... Should have done some more testing.	2013-11-14 15:49:42 +00:00
Roland Scheidegger	5190c16a04	llvmpipe: (trivial) fix misplaced bld context assignment. Should fix polygon offset crashes...	2013-11-14 14:44:15 +00:00
José Fonseca	a29e40a423	gallivm: Compile flag to debug TGSI execution through printfs. It is similar to tgsi_exec.c's DEBUG_EXECUTION compile flag. I had prototyped this for a while while debugging an issue, but finally cleaned this up and added a few more bells and whistles. v2: Use '$' as marker; better output. Thanks to Brian, Zack and Roland reviews. Here is a sample output. CONST[0].x = 0.00625000009 0.00625000009 0.00625000009 0.00625000009 CONST[0].y = -0.00714285718 -0.00714285718 -0.00714285718 -0.00714285718 CONST[0].z = -1 -1 -1 -1 CONST[0].w = 1 1 1 1 IN[0].x = 143.5 175.5 175.5 143.5 IN[0].y = 123.5 123.5 155.5 155.5 IN[0].z = 0 0 0 0 IN[0].w = 1 1 1 1 $ 1: RCP TEMP[0].w, IN[0].wwww TEMP[0].w = 1 1 1 1 $ 2: MAD TEMP[0].xy, IN[0], CONST[0], CONST[0].zwzw TEMP[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976 TEMP[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316 $ 3: MUL OUT[0].xy, TEMP[0], TEMP[0].wwww OUT[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976 OUT[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316 $ 4: MUL OUT[0].z, IN[0].zzzz, TEMP[0].wwww OUT[0].z = 0 0 0 0 $ 5: MOV OUT[0].w, TEMP[0] OUT[0].w = 1 1 1 1 $ 6: END OUT[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976 OUT[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316 OUT[0].z = 0 0 0 0 OUT[0].w = 1 1 1 1	2013-11-14 14:04:28 +00:00
Roland Scheidegger	673d5391a2	softpipe: (trivial) fix debug code The debug printfs wouldn't actually compile when enabled, so kill them off and insert some new one in another place, and make sure it keeps compiling by enclosing it in a if-0 clause.	2013-11-14 12:24:55 +00:00
Roland Scheidegger	2dd693412a	llvmpipe: clean up state setup code a bit In particular get rid of home-grown vector helpers which didn't add much. And while here fix formatting a bit. No functional change. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-14 12:24:55 +00:00
Roland Scheidegger	754319490f	gallivm,llvmpipe: fix float->srgb conversion to handle NaNs d3d10 requires us to convert NaNs to zero for any float->int conversion. We don't really do that but mostly seems to work. In particular I suspect the very common float->unorm8 path only really passes because it relies on sse2 pack intrinsics which just happen to work by luck for NaNs (float->int conversion in hw gives integer indeterminate value, which just happens to be -0x80000000 hence gets converted to zero in the end after pack intrinsics). However, float->srgb didn't get so lucky, because we need to clamp before blending and clamping resulted in NaN behavior being undefined (and actually got converted to 1.0 by clamping with sse2). Fix this by using a zero/one clamp with defined nan behavior as we can handle the NaN for free this way. I suspect there's more bugs lurking in this area (e.g. converting floats to snorm) as we don't really use defined NaN behavior everywhere but this seems to be good enough. While here respecify nan behavior modes a bit, in particular the return_second mode didn't really do what we wanted. From the caller's perspective, we really wanted to say we need the non-nan result, but we already know the second arg isn't a NaN. So we use this now instead, which means that cpu architectures which actually implement min/max by always returning non-nan (that is adhering to ieee754-2008 rules) don't need to bend over backwards for nothing. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-14 12:24:55 +00:00
Ian Romanick	a15a19f0d1	dri: Change value param to unsigned This silences some compiler warnings in i915 and i965. See also `75982a5`. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-13 14:49:27 -08:00
Ian Romanick	cb6182bdfa	i965: Use drm_intel_get_aperture_sizes instead of hard-coded 2GiB Systems with little physical memory installed will report less than 2GiB, and some systems may (hypothetically?) have a larger address space for the GPU. My IVB still reports 1534. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-13 14:49:27 -08:00
Ian Romanick	9fe108db09	i915: Use drm_intel_get_aperture_sizes instead of drmAgpSize Send the zombie back to the grave before it infects the townsfolk. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-13 14:49:26 -08:00
Alexander Monakov	279e8d2641	i965: implement blit path for PBO glDrawPixels This patch implements accelerated path for glDrawPixels from a PBO in i965. The code follows what intel_pixel_read, intel_pixel_copy, intel_pixel_bitmap and intel_tex_image are doing. Piglit quick.tests show no regressions. In my testing on IVB, performance improvement is huge (about 30x, didn't measure exactly) since generic path goes via _mesa_unpack_color_span_float, memcpy, extract_float_rgba. Signed-off-by: Alexander Monakov <amonakov@ispras.ru> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-13 12:20:59 -08:00
Brian Paul	19c2f40649	docs: fill in md5 checksums for 9.2.3 release	2013-11-13 10:06:23 -07:00
Brian Paul	c093cd3984	docs: fix 9.2.2 -> 9.2.3 typos	2013-11-13 10:03:35 -07:00
Alexander von Gluck IV	df91144a6d	haiku: add swrast driver * This is pretty small and upkeep should be minimal. * Currently fully working. * Candidate for 10.0.0 branch Acked-by: Brian Paul <brianp@vmware.com>	2013-11-13 10:41:10 -06:00
Carl Worth	9976a176ae	docs: Import 9.2.3 release notes, add news item.	2013-11-13 07:32:47 -08:00
Kristian Høgsberg	e048953145	dri: Remove redundant createNewContext function from __DRIimageDriverExtension createContextAttribs is a superset of what createNewContext provides. Also remove the function typedef, since createNewContext is deprecated and no longer used in multiple interfaces. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-12 16:08:17 -08:00
Kristian Høgsberg	68bb26bead	wayland: Use __DRIimage based getBuffers implementation when available This lets us allocate color buffers as __DRIimages and pass them into the driver instead of having to create a __DRIbuffer with the flink that requires. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-12 16:08:17 -08:00
Kristian Høgsberg	04e3ef00db	gbm: Add support for __DRIimage based getBuffers when available This lets us allocate color buffers as __DRIimages and pass them into the driver instead of having to create a __DRIbuffer with the flink that requires. With this patch, we can now run gbm on render-nodes. A render-node is a drm device that doesn't support modesetting and all the legacy DRI ioctls. flink is also not supported, but now that gbm doesn't need flink, we can run piglit on head-less gbm or head-less GPGPU. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-12 16:01:40 -08:00
Ander Conselvan de Oliveira	5ba6be2617	dri/i915, dri/i965: Fix support for planar images Planar images have format __DRI_IMAGE_FORMAT_NONE, but the patch that moved the conversion from dri_format to the mesa format made it impossible to allocate an image with that format. Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-12 15:57:39 -08:00
Eric Anholt	e9daead784	i965/fs: Try a different pre-scheduling heuristic if the first spills. Since LIFO fails on some shaders in one particular way, and non-LIFO systematically fails in another way on different kinds of shaders, try them both, and pick whichever one successfully register allocates first. Slightly prefer non-LIFO in case we produce extra dependencies in register allocation, since it should start out with fewer stalls than LIFO. This is madness, but I haven't come up with another way to get unigine tropics to not spill while keeping other programs from not spilling and retaining the non-unigine performance wins from texture-grf. total instructions in shared programs: 1626728 -> 1626288 (-0.03%) instructions in affected programs: 1015 -> 575 (-43.35%) GAINED: 50 LOST: 0 Improves Unigine Tropics performance by 14.5257% +/- 0.241838% (n=38) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70445 Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-12 15:06:28 -08:00
Eric Anholt	fbd8303a94	i965/fs: Do instruction pre-scheduling just before register allocation. Long ago, the HW_REG usage in assign_curb/urb_setup() were scheduling barriers, so we had to run scheduler before them in order for it to be able to do basically anything. Now that that's fixed, we can delay the scheduling until we go to allocate (which will make the next change less scary). Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-12 15:06:21 -08:00
Eric Anholt	f72a0d99fe	i965/fs: Ignore actual latency pre-reg-alloc. We care about depth-until-program-end, as a proxy for "make sure I schedule those early instructions that open up the other things that can make progress while keeping register pressure low", not actual latency (since we're relying on the post-register-alloc scheduling to actually schedule for the hardware). total instructions in shared programs: 1609931 -> 1609931 (0.00%) instructions in affected programs: 0 -> 0 GAINED: 55 LOST: 43 Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-12 15:06:00 -08:00
Eric Anholt	7c90947a0b	i965/fs: Fix message setup for SIMD8 spills. In the SIMD16 spilling changes, I replaced a "1" in the spill path with "mlen", but obviously it wasn't mlen before because spills have the g0 header along with the payload. The interface I was trying to use was asking for how many physical regs we're writing, so we're looking for "1" or "2". I'm guessing this actually passed piglit because the high 8 bits of the execution mask in SIMD8 mode are all 0s. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-12 15:05:07 -08:00
Eric Anholt	bc0e3bb4d0	i965/fs: Prefer things we know reduce reg pressure when pre-scheduling. Previously, the best thing we had was to schedule the things unblocked by the last chosen instruction, on the hope that it would be consuming two values at the end of their live intervals while only producing one new value. But that's just a guess, and we can do counting of usage of registers to know when an instruction would (almost surely) reduce register pressure. The only failure mode I know of in this new dominant heuristic is that inside of a loop when scheduling the iterator (for example), choosing the last use of the iterator doesn't actually reduce the live interval of the iterator. But it doesn't seem to matter in shader-db: total instructions in shared programs: 1618700 -> 1618700 (0.00%) instructions in affected programs: 0 -> 0 GAINED: 13 LOST: 0 Note: The new functions are made virtual because I expect we'll soon lift the pre-regalloc scheduling heuristic over to the vec4 backend. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-12 15:04:32 -08:00
Eric Anholt	9b3e1592c2	i965: Fix undefined value usage in ABO setup. Fixes a compiler warning. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-12 15:04:28 -08:00
Eric Anholt	8bd45a7e7e	i965: Add a warning if something ever hits a bug I noticed. We'd have to map the VBO and rewrite things to a lower stride to fix it. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-12 15:04:25 -08:00
Ben Skeggs	c944bde5be	nvc0: release 3d bufctx after drawing Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2013-11-13 08:09:29 +10:00
Francisco Jerez	99d447cc5d	clover: Fix the const variant of adaptor_range::end to deal with mismatching range sizes. Fixes infinite loop in find_grid_optimal_factor() in cases where the user specifies a grid size with less dimensions than the device supports. Reported-by: Tom Stellard <thomas.stellard@amd.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-12 11:52:47 -08:00
Roland Scheidegger	50f19e3a66	draw,llvmpipe: use exponent manipulation instead of exp2 for polygon offset Since we explicitly require an integer input we should avoid using exp2 math (even if we were using optimized versions), which turns the exp2 into a int sub (plus some casts). v2: fix bogus uint (needs to be int) math spotted by Matthew, fix comments Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-12 19:08:58 +00:00
Cyril Brulebois	2d77e4f922	gallium: fix build on GNU/Hurd due to missing PIPE_OS_HURD detection Thanks to Pino Toscano. Patch from Debian package. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-12 11:57:21 -07:00
Petr Sebor	f2b844f59d	meta: enable vertex attributes in the context of the newly created array object Otherwise, the function would enable generic vertex attributes 0 and 1 of the array object it does not own. This was causing crashes in Euro Truck Simulator 2, since the incorrectly enabled generic attribute 0 in the foreign context got precedence before vertex position attribute at later time, leading to NULL pointer dereference. Cc: "9.2" <mesa-stable@lists.freedesktop.org> Cc: "10.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Petr Sebor <petr@scssoft.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-12 11:56:30 -07:00
Brian Paul	76317355bd	mesa: 80-column wrapping, remove trailing whitespace in arrayobj.c	2013-11-12 11:05:25 -07:00
Brian Paul	c8f3722129	mesa: add comment for struct gl_vertex_buffer_binding	2013-11-12 11:05:25 -07:00
Brian Paul	ce193d4f01	mesa: call update_array_format() after error checking We try to do all error checking before changing any GL state. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Jordan Justen <jordan.l.justen@intel.com>	2013-11-12 11:05:19 -07:00
Brian Paul	5f22f3207e	mesa: use _mesa_is_bufferobj() helper in _mesa_vertex_attrib_address() And use a regular if statement to slightly improve readability. Jordan Justen <jordan.l.justen@intel.com>	2013-11-12 11:05:14 -07:00
Brian Paul	e032abcb27	mesa: add const qualifiers to vertex array helper functions Jordan Justen <jordan.l.justen@intel.com>	2013-11-12 11:05:04 -07:00
Ilia Mirkin	08122e151a	nouveau/video: mark bitstream-level acceleration as unsupported Adding a vl_mpeg-based helper didn't seem to work, as it produced data that the card couldn't handle. (And I didn't investigate further.) This makes the decoding functionality only accessible via XvMC and avoids crashes when attempting to use VDPAU. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-12 10:11:41 +01:00
Ilia Mirkin	e8d5d3409c	nouveau/video: don't try on nv3x It doesn't work, I don't know why, but no point in hanging people's displays until it gets figured out. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-12 10:10:54 +01:00
Tom Stellard	594fa4a208	egl-static: Only export necessary symbols v3 This fixes a crash in glamor when mesa links against static LLVM. v2: - Inline LINKER_SCRIPT variable v3: Kai Wasserbäch - Fix out-of-tree builds Tested-by: Kai Wasserbäch <kai@dev.carbon-project.or>	2013-11-11 17:21:35 -05:00
Tom Stellard	cb080a10b6	configure.ac: Don't require shared LLVM when building OpenCL This works now that pipe_*.so is no longer exporting LLVM symbols. Tested-by: Kai Wasserbäch <kai@dev.carbon-project.or>	2013-11-11 17:21:35 -05:00
Tom Stellard	6d6c749215	pipe-loader: Only export necessary symbols v3 This makes it possible to use clover with statically linked LLVM. v2: - Inline LINKER_SCRIPT variable v3: Kai Wasserbäch - Fix out-of-tree builds Tested-by: Kai Wasserbäch <kai@dev.carbon-project.or>	2013-11-11 17:21:34 -05:00
Tom Stellard	a859131003	radeonsi/compute: Add Sea Islands support	2013-11-11 17:21:34 -05:00
Vincent Lejeune	88c8f19729	r600/llvm: Store inputs in function arguments	2013-11-11 23:14:42 +01:00
Rico Schüller	23afe71f44	tests: Fix make check for out of tree builds. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Rico Schüller <kgbricola@web.de>	2013-11-11 14:06:17 -08:00
Anuj Phogat	348b91b7dc	i965: Move #define's inside function as local variables X_f, Y_f, Xp_f, Yp_f variables are used just inside translate_dst_to_src(). So, they can be defined just as local variables. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-11 13:35:37 -08:00
Vinson Lee	227872571a	i915, i965: Fix memory leak in intel_miptree_create_for_bo. Fixes "Resource leak" defects reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-11 13:11:07 -08:00
Brian Paul	ab2da985b6	osmesa: assorted code clean-ups	2013-11-11 08:17:46 -07:00
Brian Paul	a66a008b17	osmesa: fix broken triangle/line drawing when using float color buffer Doesn't seem to help with bug 71363 but it fixed a failure I found in my testing. Cc: "9.2" <mesa-stable@lists.freedesktop.org> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-11 08:17:24 -07:00
Brian Paul	34ce1a8502	svga: improve loops over color buffers Only loop over the actual number of color buffers supported, not PIPE_MAX_COLOR_BUFS. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-11-11 08:12:18 -07:00
Brian Paul	2182d2db28	svga: document magic number of 8 render targets per batch Grab the comments from commit message `b84b7f19df` to explain what the code is doing.	2013-11-11 08:12:18 -07:00
Brian Paul	dc21b36daf	util: set all unused cbufs to NULL in util_copy_framebuffer_state() This helps fix an issue in the svga driver, and is just safer all-around. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-11-11 08:12:18 -07:00
Brian Paul	944eebbdb4	glx: declare glx_screen struct to silence warning	2013-11-11 08:12:05 -07:00
Brian Paul	75982a5df4	glx: change query_renderer_integer() value param to unsigned When this function was added, the returned value was signed in some places, unsigned in others. v2: also add unsigned in the unit test, per Ian. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-11 08:10:12 -07:00
José Fonseca	6c6f4aa6fd	glx: Fix scons build. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-11 07:30:07 +00:00
Samuel Thibault	a594cec7e3	EGL: fix build without libdrm This fixes building EGL without libdrm support. Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>	2013-11-10 22:11:42 +01:00
Chris Forbes	5442c0eae3	i965: convert brw_lower_offset_array_visitor to ir_rvalue_visitor Previously, we would bogusly replace the entire statement containing the ir_texture node with an ir_dereference_variable. Correct this to just replace the ir_texture node itself as intended. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-10 16:57:07 +13:00
Chris Forbes	d257350949	glsl: fix missing breaks in equals(ir_texture,..) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-10 10:20:02 +13:00
Eric Anholt	bd4596efac	i965: Make the driver compile until a proper libdrm can be released. No depending on unreleased code.	2013-11-09 13:00:53 -08:00
Armin K	f0f202e6b7	glx: conditionally build dri3 and present loader (v3) This patch makes it possible to disable DRI3 if desired. Tested with: ./configure --disable-dri3 --with-dri-drivers=i965 \ --with-gallium-drivers= --disable-vdpau --disable-egl \ --disable-gbm --disable-xvmc Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71397 Cc: 10.0 <mesa-stable@lists.freedesktop.org>	2013-11-09 09:12:46 -08:00
Matt Turner	68349e5219	i965/fs: Don't perform CSE on inst HW_REG dests (unless it's null) Commit `b16b3c87` began performing CSE on CMP instructions with null destinations. I relaxed the restrictions a bit too much, thereby allowing CSE to be performed on instructions with, for instance, an explicit accumulator destination. This broke the arb_gpu_shader5/fs-imulExtended shader tests because they emit MUL instructions with the accumulator as the destination. CSE would instead cause the MUL to write to a GRF, which is lower precision than the accumulator. Reviewed-by: Eric Anholt <eric@anholt.net> Cc: 10.0 <mesa-stable@lists.freedesktop.org>	2013-11-09 09:10:24 -08:00
Chad Versace	b7dfb8528f	i965: Remove some tiny dead code from intel_miptree_map_movntdqa Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-08 14:34:41 -08:00
Brian Paul	f41c01c688	swrast: add missing notify_reset parameter to dri_create_context() Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-08 08:57:03 -07:00
Christian König	754eb6a67d	vl: use a separate context for shader based decode v2 This makes VDPAU thread safe again. v2: fix some memory leaks reported by Aaron Watry. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-11-08 14:50:27 +01:00
José Fonseca	cb3c57df3a	scons: Add dri2_query_renderer.c to sources.	2013-11-08 12:22:22 +00:00
José Fonseca	caf1d96862	st/dri: Fix dri_create_context declaration prototype.	2013-11-08 12:20:00 +00:00
Keith Packard	035cce83f7	dri3: Fix pixmap buf_id computation Looks like some kind of rebase damage to me... Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-07 19:08:09 -08:00
Eric Anholt	4b5d0d10f1	glx: Add a more informative debug message in a DRI3 error path.	2013-11-07 19:08:09 -08:00
Keith Packard	2d94601582	Add DRI3+Present loader Uses the __DRIimage loader interfaces. v2: Fix _XIOErrors when DRI3 isn't present (change by anholt). Apparently XCB just terminates your connection if you don't check for extensions before using them, instead of returning an error like you'd expect. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-07 19:08:09 -08:00
Keith Packard	442442026e	dri: add __DRIimageLoaderExtension and __DRIimageDriverExtension These provide an interface between the driver and the loader to allocate color buffers through the DRIimage extension interface rather than through a loader-specific extension (as is used by DRI2, for instance). The driver uses the loader 'getBuffers' interface to allocate color buffers. The loader uses the createNewScreen2, createNewDrawable, createNewContext, getAPIMask and createContextAttribs APIS (mostly shared with DRI2). This interface will work with the DRI3 loader, and should also work with GBM and other loaders so that drivers need not be customized for each new loader interface, as long as they provide this image interface. v2: Fix build of i915 and i965 together (by anholt) Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-07 19:08:09 -08:00
Keith Packard	1f085ba18f	dri/i915,dri/i965: Use driGLFormatToImageFormat and driImageFormatToGLFormat Remove private versions of these functions Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2013-11-07 19:08:09 -08:00
Keith Packard	b7818b8c36	dri/common: Add functions mapping MESA_FORMAT_* <-> __DRI_IMAGE_FORMAT_* The __DRI_IMAGE_FORMAT codes are used by the image extension, drivers need to be able to translate between them. Instead of duplicating this translation in each driver, create a shared version. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2013-11-07 19:08:09 -08:00
Keith Packard	aba6b84ce5	Define __DRI_IMAGE_FORMAT_SARGB8 This format will be used by the i965 driver Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-07 19:08:09 -08:00
Keith Packard	bf6591e948	dri/intel: Add explicit size parameter to intel_region_alloc_for_fd Instead of assuming that the size will be height * pitch, have the caller pass in the size explicitly. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2013-11-07 19:08:09 -08:00
Keith Packard	888533dcd6	dri/intel: Split out DRI2 buffer update code to separate function Make an easy place to splice in a DRI3 version of this function Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-07 19:08:09 -08:00
Keith Packard	f66a6c5fe7	drivers/dri/common: A few dri2 functions are not actually DRI2 specific This just renames them so that they can be used with the DRI3 extension without causing too much confusion. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-07 19:08:09 -08:00
Roland Scheidegger	ea1f7d2894	gallivm: deduplicate some indirect register address code There's only one minor functional change, for immediates the pixel offsets are no longer added since the values are all the same for all elements in any case (it might be better if those weren't stored as soa vectors in the first place maybe). Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-11-08 03:38:32 +01:00
Ian Romanick	8c5330226f	glx/tests: Add unit tests for the DRI2 part of GLX_MESA_query_renderer After adding $(DEFINES) to AM_CPPFLAGS, the __glXGetCurrentContext wrapper function is no longer needed and causes compile errors. Using the correct defines causes it to be a macro! Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-07 18:12:33 -08:00
Ian Romanick	0cce553867	glx/tests: Add unit tests for the GLX part of GLX_MESA_query_renderer These tests primarily ensure that the functions added by this extension don't abuse other interfaces (e.g., glx_screen::query_renderer_integer) when provided bad data. These tests helped me find a couple small bugs in the initial implementation. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-07 18:12:33 -08:00
Ian Romanick	d4cc186937	glx/tests: Add GetGLXScreenConfigs_called flag Tests for the GLX_MESA_query_context extension will use this flag. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-07 18:12:33 -08:00
Ian Romanick	ee6c9fcbca	docs: Import extension spec for GLX_MESA_query_renderer The enumerated values are currently allocated from Intel's range. v2: Fix a typo. Update the list of functions to which the new enums can be passed. The "Current" versions were previously missing. Both things noticed by Marek. v3: Fix typo in return type of glXQueryRendererIntegerMESA in the spec body (noticed by Ken). Fix typo in issue #14 referencing itself instead of issue #13 (noticed by Dave). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Dave Airlie <airlied@redhat.com>	2013-11-07 18:12:33 -08:00
Ian Romanick	4680d237c5	glx/dri2: Add DRI2 support for GLX_MESA_query_renderer The new functions for this extension were added to a separate file (dri2_query_renderer.c) to facilitate unit testing. I tried putting them in dri2_glx.c, and it resulted in an unending chain of dependencies. It was the proverbial thread hanging from a sweater. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 18:12:33 -08:00
Ian Romanick	419684091c	glx/dri2: Pull some internal structures out to a separate header file These structures will be accessed by internal functions that will be added in a file separate from dri2_glx.c. The new code will be added to a new file to facilitate unit testing. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 18:12:32 -08:00
Ian Romanick	4944588cfd	glx/tests: Silence warnings after adding fields to glx_screen_vtable Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-07 18:12:32 -08:00
Ian Romanick	6c28c037c4	glx: Add functions and GLX plumbing for GLX_MESA_query_renderer Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 18:12:32 -08:00
Ian Romanick	38a1d8b14c	glx: Add GLX_MESA_query_renderer Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 18:12:32 -08:00
Ian Romanick	b3ffc5b6f4	glx: Add extension tracking GLX_MESA_query_renderer Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 18:12:32 -08:00
Ian Romanick	1e4ce08f38	i965: Wire up initial support for DRI_RENDERER_QUERY extension v2: Use sysconf instead of sysinfo for improved portability. Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 18:12:27 -08:00
Ian Romanick	2fe6fbd19f	i915: Wire up initial support for DRI_RENDERER_QUERY extension v2: Use sysconf instead of sysinfo for improved portability. Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 18:08:15 -08:00
Ian Romanick	9dbc14abcf	dri: Add function to implement queries common to all Mesa drivers v2: Add assertions that the version string has the expected format. This will catch build errors (or changes to the version string format) in debug build without exposing release builds to buffer over-runs. Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 18:08:15 -08:00
Ian Romanick	83ffe47be0	i965: Refactor the renderer string creation out of intelGetString This will soon be used in intel_screen.c from a function that doesn't have a gl_context. v2: Delete local variables that are now unused. This matches v1 of the changes to the i915 driver. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 18:08:15 -08:00
Ian Romanick	339f36fc5e	i915: Refactor the renderer string creation out of intelGetString This will soon be used in intel_screen.c from a function that doesn't have a gl_context. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 18:08:15 -08:00
Ian Romanick	18291251ec	i965: Refactor the vendor string out of intelGetString This will soon be used in intel_screen.c from a function that doesn't have a gl_context. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 18:08:15 -08:00
Ian Romanick	135b7e7260	i915: Refactor the vendor string out of intelGetString This will soon be used in intel_screen.c from a function that doesn't have a gl_context. v2: Remove spurious break after return. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 18:08:15 -08:00
Ian Romanick	64bb1e857a	dri: Add interface definition for DRI_RENDERER_QUERY extension This will be used to let apps query hardware and driver limits before creating a GL context. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 18:08:15 -08:00
Ian Romanick	1f712bdd38	i965: Enable DRI_Robustness extension Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 17:40:25 -08:00
Ian Romanick	e8dac9632d	i965: Propagate the GPU reset notification strategy down into the driver If the application requests reset notification, connect up the reset status query method and set gl_context::ResetStrategy. v2: Update based on kernel interface / libdrm changes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 17:40:25 -08:00
Ian Romanick	8f2c93ff75	i965: Add function to query the GPU reset status for a context v2: Update based on kernel interface / libdrm changes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 17:40:25 -08:00
Ian Romanick	15c3bac3d0	i965: Handle __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS flag Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 17:40:25 -08:00
Ian Romanick	7b140d1bda	mesa/dri: Move context flag validation down into the drivers Soon some drivers will support a different set of flags than other drivers. If some flags have to be filtered in the driver, we might as well filter all of them in the driver. The changes in nouveau use tabs because nouveau seems to have its own indentation rules. v2: Fix some rebase failures noticed by Ken (returning the wrong types, etc.). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 17:40:05 -08:00
Ian Romanick	17c94de33b	mesa/dri: Add basic plumbing for GLX_ARB_robustness reset notification strategy No drivers advertise the DRI2 extension yet, so no driver should ever see a value other than false for notify_reset. The changes in nouveau use tabs because nouveau seems to have its own indentation rules. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 17:31:16 -08:00
Ian Romanick	916bc4491a	mesa: Implement proper tracking logic for glGetGraphicsResetStatusARB Drivers still have to implement dd_function_table::GetGraphicsResetStatus. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-07 16:41:38 -08:00
Ian Romanick	a6eb04c3d8	mesa: Add gl_shared_state::ShareGroupReset and gl_context::ShareGroupReset These will be used to determine whether to signal a GPU reset after another context in the share group has observed a reset. v2: Change ShareGroupReset from GLboolean to bool. Suggested by Brian. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-07 16:41:38 -08:00
Ian Romanick	2fdc0ee19f	mesa: Add dd_function_table::GetGraphicsResetStatus This allows drivers to determine whether a GPU reset has occurred. It should return non-zero status if a reset was observed by the specified context. Another mechanism will be used to observe resets occurring in other contexts in the share group. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-07 16:41:38 -08:00
Ian Romanick	114d360dfa	mesa: Remove gl_context::ResetStatus This isn't going to be used in the actual implementation of glGetGraphicsResetStatus. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-07 16:41:38 -08:00
Matt Turner	69b425efae	st/xorg: Delete. Acked-by: Lucas Stach <l.stach@pengutronix.de>	2013-11-07 16:14:25 -08:00
Matt Turner	48f4f59dc6	xorg-nouveau: Delete.	2013-11-07 16:14:25 -08:00
Matt Turner	11ff1725cc	xorg-i915: Delete. Acked-by: Jakob Bornecrantz <wallbraker@gmail.com> Acked-by: Stéphane Marchesin <stephane.marchesin@gmail.com>	2013-11-07 16:14:25 -08:00
Ian Romanick	cf0da87917	docs: Mark off ARB_shader_atomic_counters for i965 ...and update relnotes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-07 16:02:03 -08:00
Francisco Jerez	597634556e	i965/gen7: Expose ARB_shader_atomic_counters. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-07 15:56:57 -08:00
Francisco Jerez	5c114939b4	glsl: Linker support for ARB_shader_atomic_counters. v2: Add comments on the purpose of the auxiliary data structures. Check for atomic counter overlaps. Use the contains_atomic() convenience method. Add static assert with the number of expected shader stages. v3: Don't resize atomic arrays. v4: Add comment on the reason why we don't resize atomic counter arrays. Use 'strcmp(...) == 0' instead of '!strcmp(...)'. v5 (idr): Don't use STL in the linker. Signed-off-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-07 15:56:57 -08:00
Francisco Jerez	e63bb29853	glsl: Implement parser support for atomic counters. v2: Mark atomic counters as read-only variables. Move offset overlap code to the linker. Use the contains_atomic() convenience method. v3: Use pointer to integer instead of non-const reference. Add comment so we remember to add a spec quotation from the next GLSL release once the issue of atomic counter aggregation within structures is clarified. v4 (idr): Don't use std::map because it's overkill. Add an assertion that ctx->Const.MaxAtomicBufferBindings <= MAX_COMBINED_ATOMIC_BUFFERS. Signed-off-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-07 15:56:57 -08:00
Kenneth Graunke	30f61c471d	Revert "i965: Add support for GL_AMD_performance_monitor on Ironlake." This reverts most of commit `0f2da77307`. (I chose to leave the additions to brw_defines.h.) My previous Ironlake implementation was somewhat broken: counter data was global, rather than per-context. This meant that performance monitors captured data from your compositor, 2D driver, and other 3D programs. Originally, I believed that Sandybridge and later had an easy way to avoid this problem (setting per-context flags in OACONTROL), while Ironlake did not. So I'd intended to leave it as a known limitation of performance monitoring support on Ironlake. However, this turned out not to be true. Unfortunately, our hardware only has one set of aggregating performance counters shared between all 3D programs, and their values are not saved or restored by hardware contexts. Also, at least on Sandybridge and Ivybridge, the counters lose their values if the GPU goes to sleep. To work around both of these problems, we have to snapshot the performance counters at the beginning and end of each batch, similar to how we handle query objects on platforms that don't support hardware contexts. For occlusion queries, this batch bookending approach is fairly simple: only one occlusion query can be active at a time, and the result is a single integer. Performance monitors are more complex: an arbitrary number of monitors can be active at a time, each monitoring some subset of our ~30 observability counters. Individual monitors can be started and stopped at any point during the batch. Tracking where each monitor started/ended relative to batch flushes ends up being a pain. And you can run out of space in the buffer. Properly supporting this required some serious rearchitecting of the code. Rather than writing patches to try and morph a broken system into a working one (which operates quite differently), I decided it would be simplest to revert the old code and start fresh. 
Parts will look familiar, but other parts are new. I also decided it would be best to include Sandybridge and Ivybridge support from the start, since the newer platforms have added complexity that I wanted to make sure worked. They're also what most people care about these days. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-07 15:52:02 -08:00
Kenneth Graunke	1bd6233169	glsl: Enable dFdx, dFdy, and fwidth by default in GLSL ES 3.00. Previously, we only exposed them in desktop GL or with: #extension GL_OES_standard_derivatives : enable GLSL ES 3.00 includes these without an extension, so we need to expose them by default. Note that the above #extension line results in an error on desktop GL, so we don't need to worry about this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-07 15:52:02 -08:00
Fredrik Höglund	c9ac891fa4	docs: Mark off ARB_vertex_type_10f_11f_11f_rev for r600g ...and update relnotes. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-11-07 23:55:46 +01:00
Fredrik Höglund	e420fb887f	r600g: Add support for PIPE_FORMAT_R11G11B10_FLOAT vertex elements Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-11-07 23:51:44 +01:00
Fredrik Höglund	bfc28e4aff	st/mesa: Add support for ARB_vertex_type_10f_11f_11f_rev Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-11-07 23:51:24 +01:00
Brian Paul	fe9284a7bf	mesa: fix return statements in varray.c Return false, not GL_FALSE. Add missing return value. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71359	2013-11-07 15:23:36 -07:00
Brian Paul	6592a6d065	svga: always return 4 for PIPE_MAX_COLOR_BUFS Even if the query returns 8, only 4 really work. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-11-07 15:21:40 -07:00
Brian Paul	055dbd5c3e	svga: return true for the PIPE_CAP_SM3 query This just tells the state tracker to turn on the GL_ARB_shader_texture_lod extension. This simply allows the GLSL compiler to emit TXL and TXD instructions for both vertex and fragment shaders. We already support these opcodes in the svga driver. Though, the shadow2DGrad() Piglit tests are failing. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-11-07 15:21:40 -07:00
Matt Turner	6b990a7474	i965: Add an implementation of intel_miptree_map using streaming loads. Improves performance of RoboHornet's 2D Canvas toDataURL benchmark [http://www.robohornet.org/#e=canvastodataurl] by approximately 5x on Baytrail on ChromiumOS. Elapsed time drops by -81.4861% +/- 1.22619% (n=3 s=14.9105, confidence=95%). Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-07 13:18:03 -08:00
Matt Turner	6f2e81ce4c	mesa: Add a streaming load memcpy implementation. Uses SSE 4.1's MOVNTDQA instruction (streaming load) to read from uncached memory without polluting the cache. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-07 13:18:03 -08:00
Chris Forbes	d41084a63d	docs: Mark off some more things. These have been supported on i965/Gen7+ for a while, and are listed in the 10.0 release notes. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2013-11-08 09:57:29 +13:00
Anuj Phogat	735a777842	i965: Fix 'SIMD16 only' dispatch of fragment shader in case of sample shading This patch makes changes to correctly set up the Dispatch GRF Start Register in case of 'SIMD16 only' FS dispatch. This fixes an issue of incorrect rendering on dolphin emulator with GL_SAMPLE_SHADING enabled. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-07 12:20:33 -08:00
Chris Forbes	4871e7b91f	docs: update relnotes	2013-11-08 09:10:06 +13:00
Chris Forbes	2973f38f1c	docs: Mark off ARB_vertex_type_10f_11f_11f_rev. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-08 09:10:06 +13:00
Chris Forbes	5e61c746d5	i965: Enable ARB_vertex_type_10f_11f_11f_rev on Gen6+. This theoretically works on earlier hardware as well, but the extension requires at least GL3.0. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-08 09:10:06 +13:00
Chris Forbes	7a95bb0a80	i965: add support for UNSIGNED_INT_10F_11F_11F_REV vertex attribs Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-08 09:10:06 +13:00
Chris Forbes	48b6d70bef	vbo: add 10_11_11 support to vbo_attrib_tmp Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-08 09:10:06 +13:00
Chris Forbes	fa14f8afa0	mesa: Add support to _mesa_bytes_per_vertex_attrib for 10_11_11 format. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-08 09:10:06 +13:00
Chris Forbes	1f092a9594	mesa: add varray support for UNSIGNED_INT_10F_11F_11F_REV type V2: fix interaction with VertexAttribFormat, since that landed after this was originally written Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-08 09:09:43 +13:00
Chris Forbes	aba355b463	mesa: Add extension scaffolding for ARB_vertex_type_10f_11f_11f_rev Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-08 09:00:47 +13:00
Matthew McClure	f9e2c24326	draw,llvmpipe,util: add depth bias calculation for arb_depth_buffer_float With this patch, the llvmpipe and draw modules will calculate the depth bias according to floating point depth buffer semantics described in the arb_depth_buffer_float specification, when the driver has a z buffer bound with a format type of UTIL_FORMAT_TYPE_FLOAT. By default, the driver will use the existing UNORM calculation for depth bias. A new function, draw_set_zs_format, was added to calculate the Minimum Resolvable Depth value and floating point depth sense for the draw module. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-07 18:32:54 +00:00
Eric Anholt	185b5a54c9	i965: Avoid flushing the batch for every blorp op. This brings over the batch-wrap-prevention and aperture space checking code from the normal brw_draw.c path, so that we don't need to flush the batch every time. There's a risk here if the intel_emit_post_sync_nonzero_flush() call isn't high enough up in the state emit sequences -- before, we implicitly had one at the batch flush before any state was emitted, so Mesa's workaround emits didn't really matter. Since the SNB fixes by Ken, I didn't see any regressions after 3 piglit runs. Improves cairo-gl performance by 13.7733% +/- 1.74876% (n=30/32) Improves minecraft apitrace performance by 1.03183% +/- 0.482297% (n=90). Reduces low-resolution GLB 2.7 performance by 1.17553% +/- 0.432263% (n=88) Reduces Lightsmark performance by 3.70246% +/- 0.322432% (n=126) No statistically significant performance difference on unigine tropics (n=10) No statistically significant performance difference on openarena (n=755) The two apps that are hurt happen to include stalls on busy buffer objects, so I think this is an effect of missing out on an opportune flush. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-07 10:20:33 -08:00
Matt Turner	fd03dd6ddd	build: Build gen_matypes and matypes.h from src/mesa. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-07 10:00:25 -08:00
Matt Turner	d8abd6710e	build: Change HAVE_X86_ASM to mean x86 or x86-64 asm. I want a conditional that says generally "we have x86 assembly" in the next patch. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-07 10:00:25 -08:00
Matt Turner	957c7570ea	configure.ac: Test $asm_arch directly. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-07 10:00:25 -08:00
Fredrik Höglund	23e69ad6ec	docs: Mark ARB_vertex_attrib_binding as done, update relnotes Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-07 16:21:43 +01:00
Fredrik Höglund	d2ac5d9a13	mesa: Enable ARB_vertex_attrib_binding Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-07 16:20:45 +01:00
Fredrik Höglund	193e8b4b93	mesa: Optimize rebinding the same VBO Check if the new buffer object has the same name as the current buffer object before looking it up. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-07 16:20:45 +01:00
Fredrik Höglund	965900e830	mesa: Handle zero-stride arrays in _mesa_update_array_max_element() Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-07 16:20:45 +01:00
Fredrik Höglund	fb370f89db	mesa: Add Get* support for ARB_vertex_attrib_binding Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-07 16:20:45 +01:00
Fredrik Höglund	59b01ca252	mesa: Add ARB_vertex_attrib_binding update_array() and update_array_format() are changed to update the new attrib and binding states, and the client arrays become derived state. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-07 16:20:45 +01:00
Fredrik Höglund	bb2d02c7b5	glapi: Add infrastructure for ARB_vertex_attrib_binding Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-07 16:20:45 +01:00
Fredrik Höglund	ccb6286707	mesa: Make handle_bind_buffer_gen() non-static ...and rename it to _mesa_bind_buffer_gen(). This is so the function can be called from _mesa_BindVertexBuffer(). This patch also adds a caller parameter so we can report the right entry point in error messages. Based on a patch by Eric Anholt. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-07 16:20:45 +01:00
Fredrik Höglund	12cbe995ed	mesa: Rename gl_array_object::VertexAttrib to _VertexAttrib This will become derived state as part of the ARB_vertex_attrib_binding support. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-07 16:20:45 +01:00
Fredrik Höglund	d5543213f2	mesa: Split out the format code from update_array() Split out the code for updating the array format into a new function called update_array_format(). This function will be called by both update_array() and the new glVertexAttrib*Format() entry points in ARB_vertex_attrib_binding. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-07 16:20:44 +01:00
Fredrik Höglund	6a650fa787	mesa: Restore gl_array_object::NewArray This will be used by the ARB_vertex_attrib_binding implementation. This reverts commit `db38e9a0e1`. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-07 16:20:44 +01:00
Kenneth Graunke	c6a3fb69c6	i965: Use has_surface_tile_offset in depth/stencil alignment workaround. Currently, has_surface_tile_offset is equivalent to gen == 4 && !is_g4x. We already use it for related checks in brw_wm_surface_state.c, so it makes sense to use it here too. It's simpler and more future-proof. Broadwell also lacks surface tile offsets. With this patch, I won't need to update any generation checking; I can simply not set the flag. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-07 00:17:53 -08:00
Fabio Pedretti	110009302b	gallium: fix build on GNU/kFreeBSD Patch from Debian package Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-11-06 22:08:26 +01:00
Fabio Pedretti	4f4da81dc8	configure.ac: fix build on GNU/kFreeBSD Based on existing patch from Debian package. Debian bug: http://bugs.debian.org/524690 Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-11-06 22:08:26 +01:00
Fabio Pedretti	9d805c96eb	mesa: add arm64 support Patch from Ubuntu package Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-11-06 22:08:26 +01:00
Fabio Pedretti	da7daade92	r600/compute: silence unused var warning Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-11-06 22:07:58 +01:00
Paul Berry	2fd785ac49	i965/gen6: Don't allow SIMD16 dispatch in 4x PERPIXEL mode with computed depth. Hardware docs say we can only use SIMD8 dispatch in this condition. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-11-06 11:58:42 -08:00
Matt Turner	4e84f394e9	configure.ac: Drop no-out-of-tree notice. We do support out of tree builds now. Tested-by: Colin Walters <walters@verbum.org>	2013-11-06 11:26:19 -08:00
Matt Turner	5ca3926442	mesa: Build program as part of libmesa.	2013-11-06 11:26:19 -08:00
Matt Turner	b0bfb7c41e	mesa: Clean up use of top_srcdir/top_builddir.	2013-11-06 11:26:19 -08:00
Matt Turner	8bc126cd37	i965: Use unreachable() to silence a compiler warning. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-11-06 11:26:18 -08:00
Matt Turner	3a5223c24c	mesa: Add unreachable() macro. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-11-06 11:26:18 -08:00
Roland Scheidegger	b35ea09349	gallivm: fix indirect addressing of inputs We weren't adding the soa offsets when constructing the indices for the gather functions. That meant that we were always returning the data in the first element. (Copied straight from the same fix for temps.) While here fix up a couple of broken comments in the fetch functions, plus don't name a straight float type float4 which is just confusing. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-11-06 18:20:54 +01:00
Vincent Lejeune	08556073d1	r600/llvm: Fix isampleBuffer on preEG	2013-11-06 17:36:22 +01:00
Vincent Lejeune	1184f8fd34	r600/llvm: Fix texbuf for pre EG gen	2013-11-06 17:36:22 +01:00
Brian Paul	36f1c6e3db	mesa: for GLSL_DUMP_ON_ERROR, also dump the info log Since it's helpful to know why the shader did not compile. Also, call fflush() for Windows. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-11-06 09:04:16 -07:00
Grigori Goronzy	5580ff818e	st/vdpau: resolve delayed rendering for GL interop v2 Otherwise OutputSurface interop has funny results sometimes. This fixes interop with the mpv media player. v2 (chk): add proper locking Signed-off-by: Christian König <christian.koenig@amd.com>	2013-11-06 08:45:57 +01:00
Chris Forbes	3785fe2715	docs: Mark off ARB_sample_shading; minor tidyup. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2013-11-06 19:36:27 +13:00
Chris Forbes	f7e15fcf56	i965/fs: Gen4-5: Implement alpha test in shader for MRT V2: Add comment explaining what emit_alpha_test() is for; fix spurious temp and bogus whitespace. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-06 19:29:52 +13:00
Chris Forbes	ca82ba90dd	i965/fs: Gen4-5: Setup discard masks for MRT alpha test The same setup is required here as when the user-provided shader explicitly uses KIL or discard. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-06 19:29:49 +13:00
Chris Forbes	1080fc610e	i965: Gen4-5: Include alpha func/ref in program key V2: Better explanation of the rationale for doing this. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-06 19:29:46 +13:00
Chris Forbes	dbcd633040	i965: Gen4-5: Don't enable hardware alpha test with MRT We have to do this in the shader instead, since these gens lack an independent RT0 alpha value in their render target write messages. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-06 19:29:36 +13:00
Kenneth Graunke	39ebb72e52	i965: Combine {brw,gen7}_update_texture_buffer_surface() functions. Now that brw_update_texture_buffer_surface() uses the virtual emit_buffer_surface_state() function, it works for Gen7+ too. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-05 17:59:53 -08:00
Kenneth Graunke	7a974a645e	i965: Unvirtualize brw_create_constant_surface; delete Gen7+ variant. Now that brw_create_constant_surface uses a virtual function internally, it doesn't need to be virtual itself. We can delete the Gen7+ variant and simplify things. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-05 17:59:51 -08:00
Kenneth Graunke	ee23dd139a	i965: Use the new emit_buffer_surface_state() vtable entry. This will allow us to combine the Gen4-6 and Gen7 variants of these functions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-05 17:59:50 -08:00
Kenneth Graunke	ba836e02a3	i965: Virtualize emit_buffer_surface_state(). This entails adding "mocs" and "rw" parameters to the Gen4-5 version. I made it actually pay attention to the rw flag (even though it is always false), but mocs is always ignored. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-05 17:59:39 -08:00
Courtney Goeltzenleuchter	e3854fe194	i965: Fix compiler warning. fix: intel_screen.c:1320:4: warning: initialization from incompatible pointer type [enabled by default] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-05 17:59:38 -08:00
Eric Anholt	ff337bc800	i965: Tell the unit states how many binding table entries we have. Before the series with `3c9dc2d31b` to dynamically assign our binding table indices, we didn't really track our binding table count per shader, so we never filled in these fields. Affects cairo-gl trace runtime by -2.47953% +/- 1.07281% (n=20) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-05 15:39:45 -08:00
Eric Anholt	3f319eef76	i965: Fix context initialization after `2f89662717` You can't return stack-initialized values and expect anything good to happen. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-05 15:39:44 -08:00
Roland Scheidegger	5ae31d7e1d	gallivm: optimize lp_build_minify for sse SSE can't handle true vector shifts (with variable shift count), so llvm is turning them into a mess of extracts, scalar shifts and inserts. It is however possible to emulate them in lp_build_minify with float muls, which should be way faster (saves over 20 instructions per 8-wide lp_build_minify). This wouldn't work for "generic" 32bit shifts though since we've got only 24bits of mantissa (actually for left shifts it would work by using sse41 int mul instead of float mul but not for right shifts). Note that this has very limited scope for now, since this is only used with per-pixel lod (otherwise we're avoiding the non-constant shift count by doing per-quad shifts manually), and only 1d textures even then (though the latter should change). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-05 23:32:24 +01:00
Ian Romanick	7df7e730fb	nouveau: Use _NEW_SCISSOR instead of hooking through dd_function_table This will enable removing the dd_function_table::Scissor hook in the near future. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-11-05 07:50:19 -08:00
Ian Romanick	3f30425424	nouveau: Use _NEW_VIEWPORT instead of hooking through dd_function_table This will enable removing the dd_function_table::DepthRange hook in the near future. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-11-05 07:50:19 -08:00
Ian Romanick	3a5b84cece	radeon / r200: Don't pass unused parameters to radeon_viewport The x, y, width, and height parameters aren't used by radeon_viewport, so don't pass them. This should make future changes to the dd_function_table::Viewport interface a little easier. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jljusten@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Cc: Courtney Goeltzenleuchter <courtney@lunarg.com>	2013-11-05 07:50:12 -08:00
Ian Romanick	619a9bee7d	i915: Bring sanity to the Viewport function The i830 and the i915 driver have the same dd_function_table::Viewport function... it just has two names and lives in two places. Using a single implementation allows cleaning up the saved_viewport nonsense too. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jljusten@gmail.com> Cc: Courtney Goeltzenleuchter <courtney@lunarg.com>	2013-11-05 07:50:04 -08:00
Ian Romanick	abd962f1d5	i965: Eliminate the saved_viewport wrapper The i965 driver never installed a dd_function_table::Viewport function, so this wrapper never actually did anything. No piglit regressions on IVB on DRI2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jljusten@gmail.com> Cc: Courtney Goeltzenleuchter <courtney@lunarg.com>	2013-11-05 07:49:54 -08:00
Alexander von Gluck IV	1c7605685d	mesa: Remove last BEOS checks * Goodbye BeOS, we hardly knew thee * As BeOS was gcc2 only, there was little chance of this being useful. * Doesn't affect Haiku in any meaningful way Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-05 09:37:58 -06:00
José Fonseca	c883ee4498	util/u_format: take normalized flag in consideration in util_format_is_rgba8_variant Just happened to notice it was missing while looking at it.	2013-11-05 14:05:41 +00:00
Paul Berry	86cdff5635	glsl: Don't generate misleading debug names when packing gs inputs. Previously, when packing geometry shader input varyings like this: in float foo[3]; in float bar[3]; lower_packed_varyings would declare a packed varying like this: (declare (shader_in flat) (array ivec4 3) packed:foo[0],bar[0]) That's confusing, since the packed varying actually stores all three values of foo and all three values of bar. This patch causes it to generate the more sensible declaration: (declare (shader_in flat) (array ivec4 3) packed:foo,bar) Note that there should be no functional change for users of geometry shaders, since the packed name is only used for generating debug output. But this should reduce confusion when using INTEL_DEBUG=gs. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-04 19:04:29 -08:00
Vinson Lee	749cb89097	gallivm: Remove llvm::DisablePrettyStackTrace for LLVM >= 3.4. LLVM 3.4 r193971 removed llvm::DisablePrettyStackTrace and made the pretty stack trace opt-in rather than opt-out. The default value of DisablePrettyStackTrace has changed to true in LLVM 3.4 and newer. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60929 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-04 18:22:04 -08:00
Alexander von Gluck IV	e759f1c111	target/haiku-softpipe: Fix viewport issues * Call mesa viewport call on window resize * Add initial postprocessing code * Pass hgl_context to private statetracker as it is more useful than GalliumContext * Use Lock and Unlock functions to standardize GalliumContext locking * Create texture resources in texture validation Acked-by: Brian Paul <brianp@vmware.com>	2013-11-05 01:17:55 +00:00
Brian Paul	faaf568cfb	mesa: remove __alpha__ && CCPML check Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-04 18:09:57 -07:00
Brian Paul	2671b576b2	mesa: remove OPENSTEP stuff Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-04 18:09:57 -07:00
Brian Paul	32577fc0ad	mesa: remove macintosh preprocessor stuff IIRC, this is MacOS 9.x stuff. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-04 18:09:57 -07:00
Brian Paul	5a5d2d2db8	mesa: remove __QUICKDRAW__ tests Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-04 18:09:57 -07:00
Brian Paul	9bdc94b94d	mesa: remove WGLAPI macro WGLAPI was defined in glheader.h but wasn't used anywhere. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-04 18:09:57 -07:00
Kenneth Graunke	7b4b94a956	i965: Expose brw_reg_from_fs_reg() to other files. This will be useful for Broadwell code as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-04 16:51:22 -08:00
Kenneth Graunke	10cb91d7fb	i965: Combine gen6_clip_state.c and gen7_clip_state.c. The changes between Gen6-7 are minimal, and can easily be solved with an extra generation check. This cuts a lot of duplicated code. It also helps prevent even more duplication for Broadwell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-04 16:44:42 -08:00
Francisco Jerez	67b8f4c569	dri/nouveau: Fix nouveau_init_screen2 breakage. Fix incorrect init ordering in nouveau_init_screen2 caused by `083f66fdd6`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71172	2013-11-04 12:17:37 -08:00
Francisco Jerez	35fe7ed7d3	i965/gen7: Add instruction latency estimates for untyped atomics and reads. The latency information has been obtained empirically from measurements taken on Haswell and Ivy Bridge. Acked-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-04 12:12:38 -08:00
Francisco Jerez	ba885c30c7	i965/gen7: Handle atomic instructions from the VEC4 back-end. This can deal with all the 15 32-bit untyped atomic operations the hardware supports, but only INC and PREDEC are going to be exposed through the API for now. v2: Represent atomics as GLSL intrinsics. Add support for variably indexed atomic counter arrays. v3: Add comment on why we don't need to assign uniform storage for atomic counters. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-04 12:12:38 -08:00
Francisco Jerez	764f40d92e	i965/gen7: Handle atomic instructions from the FS back-end. This can deal with all the 15 32-bit untyped atomic operations the hardware supports, but only INC and PREDEC are going to be exposed through the API for now. v2: Represent atomics as GLSL intrinsics. Add support for variably indexed atomic counter arrays. Fix interaction with fragment discard. v3: Add comment on why we don't need to assign uniform storage for atomic counters. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-04 12:12:37 -08:00
Francisco Jerez	34fe051e21	i965: Add a 'has_side_effects' back-end instruction predicate. This patch fixes the three dead code elimination passes and the VEC4/FS instruction scheduling passes so they leave instructions with side effects alone. At some point it might be interesting to have the instruction scheduler calculate the exact memory dependencies between atomic ops, but they're rare enough that it seems unlikely that it will make any practical difference. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-04 12:12:37 -08:00
Francisco Jerez	bf045bf9b4	clover: Calculate optimal work group size when it's not specified by the user. Inspired by a patch sent to the mailing list by Tom Stellard, but using a different algorithm to calculate the optimal block size that has been found to be considerably more effective. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-11-04 12:12:37 -08:00
Francisco Jerez	67a3037444	clover: Constify some command_queue arguments.	2013-11-04 12:12:37 -08:00
Francisco Jerez	6e9206bdcc	clover: Workaround compiler bug present in GCC 4.7.0-4.7.2. Variadic template aliases make these versions of GCC very confused, write down the full type spec instead.	2013-11-04 12:12:37 -08:00
Emil Velikov	0a2bdbb76f	st/xorg: handle updates to DamageUnregister API xserver 1.14.99.2 simplified the DamageUnregister API, by dropping the drawable argument. Follow xf86-video-intel and xf86-video-vmware approach and handle the new API by checking XORG_VERSION_CURRENT. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71110 Reported-by: Michał Górny <mgorny@gentoo.org> Reported-by: Vinson Lee <vlee@freedesktop.org> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-04 19:49:26 +00:00
Brian Paul	4e0ed59959	mesa: remove Watcom C support Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-04 12:23:09 -07:00
Brian Paul	2a1f74e7d9	mesa: remove Centerline C support from gl.h Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-04 12:23:09 -07:00
Brian Paul	61ec037c61	mesa: remove BUILD_FOR_SNAP bits Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-04 12:23:09 -07:00
Brian Paul	5d5d63d63c	mesa: remove SciTech stuff from gl.h Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-04 12:23:09 -07:00
Marek Olšák	6463b94973	r600g: properly unbind a DSA state being deleted in r600_delete_dsa_state Tested-by: Christian König <christian.koenig@amd.com>	2013-11-04 19:07:57 +01:00
Marek Olšák	f0733479f0	docs/GL3: document radeonsi support, minor cleanup Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-04 19:07:57 +01:00
Marek Olšák	a767f57a7d	radeonsi: implement ARB_vertex_type_2_10_10_10_rev	2013-11-04 19:07:57 +01:00
Marek Olšák	6a250877ea	r600g,radeonsi: properly expose texture buffer formats This exposes GL_ARB_texture_buffer_object_rgb32.	2013-11-04 19:07:57 +01:00
Marek Olšák	dbeedbb7ab	radeonsi: implement texture buffer objects GLSL 1.40 is done.	2013-11-04 19:07:57 +01:00
Marek Olšák	164de0d2a5	radeonsi: report our border color behavior	2013-11-04 19:07:57 +01:00
Marek Olšák	4569bf9199	radeonsi: bind a dummy constant buffer in place of NULL buffers	2013-11-04 19:07:57 +01:00
Marek Olšák	2fd4200123	radeonsi: implement uniform buffer objects	2013-11-04 19:07:57 +01:00
Marek Olšák	d0cf73a408	tgsi/scan: set maximum index for each constant buffer	2013-11-04 19:07:57 +01:00
Marek Olšák	e5f0080d91	radeonsi: try to fix IA_MULTI_VGT_PARAM programming This doesn't make any difference on Bonaire, but it might help on Hawaii.	2013-11-04 19:07:57 +01:00
Marek Olšák	5e43819475	winsys/radeon: use type-3 NOPs for CS padding on CIK The type-2 NOPs are said to be unstable. It doesn't make a difference here.	2013-11-04 19:07:56 +01:00
Aaron Watry	1b2c6cd205	clover: fix build with LLVM 3.4 dso_list was added as an argument for createInternalizePass in 3.4, and then it was removed again in the same llvm version. Tested-by: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-11-04 08:51:57 -08:00
Brian Paul	9fc41e2eea	draw: move type construction out of loop We can create clip_ptr_type once instead of n times inside the loop. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-04 07:12:14 -07:00
Chad Versace	2f89662717	i965: Add driconf option clamp_max_samples The new option clamps GL_MAX_SAMPLES to a hardware-supported MSAA mode. If negative, then no clamping occurs. v2: (for Paul) - Add option to i965 only, not to all DRI drivers. - Do not rely on int->uint cast to convert negative values to large positive values. Explicitly check for clamp_max_samples < 0. v3: (for Ken) - Don't allow clamp_max_samples to alter context version. - Use clearer for-loop and correct comment. - Rename variables. v4: (for Ken) - Merge identical if-branches. Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-03 15:55:18 -08:00
Vinson Lee	68f1b274b0	i965: Fix logic_op check. Fixes "Macro compares unsigned to 0" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-03 14:45:59 -08:00
Vinson Lee	9943b6612b	i915: Fix logic_op check. Fixes "Macro compares unsigned to 0" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-03 14:45:56 -08:00
Vinson Lee	14ddc83346	i965: Initialize vec4_visitor member variables. Fixes "Uninitialized pointer field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-03 14:36:27 -08:00
Marek Olšák	fa8b1514d3	gallium/targets: remove vdpau-softpipe Reviewed-by: Christian König <christian.koenig@amd.com>	2013-11-02 23:34:01 +01:00
Marek Olšák	7c2531847f	gallium/targets: remove xvmc-softpipe Reviewed-by: Christian König <christian.koenig@amd.com>	2013-11-02 23:34:01 +01:00
Marek Olšák	0e17c12fa7	gallium/targets: remove r300/vdpau Reviewed-by: Christian König <christian.koenig@amd.com>	2013-11-02 23:34:01 +01:00
Marek Olšák	5f7233c8ea	gallium/targets: remove r300/xvmc Reviewed-by: Christian König <christian.koenig@amd.com>	2013-11-02 23:34:00 +01:00
Marek Olšák	be331e82d1	gallium/targets: remove radeonsi/xorg Reviewed-by: Christian König <christian.koenig@amd.com>	2013-11-02 23:34:00 +01:00
Marek Olšák	da82d7b6ba	gallium/targets: remove r600/xorg Reviewed-by: Christian König <christian.koenig@amd.com>	2013-11-02 23:34:00 +01:00
Rob Clark	f407ea1f1c	freedreno/a3xx/texture: min/max lod Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-11-01 20:22:40 -04:00
Rob Clark	2d10e22f8b	freedreno/a3xx: update envytools headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-11-01 20:22:28 -04:00
Rob Clark	f16b084bb9	freedreno/a3xx: fix VS out / FS in linking Actually link VS out / FS in based on semantic info, keeping in mind that position/pointsize can also be an input to the FS. This fixes a few fragment shaders which were using gl_Position. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-11-01 20:20:47 -04:00
Rob Clark	83318d6511	freedreno/a3xx: allow num_samplers != num_textures Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-11-01 20:20:29 -04:00
Rob Clark	a53fe2221c	freedreno/a3xx/compiler: highp frag shader Fixes use of full-precision in fragment shader (i.e. don't clobber r0.x since that can be used by future bary instructions for varying fetch). And makes use of full-precision the default in fragment shader (but can be overridden via FD_MESA_DEBUG=fraghalf). Seems like half precision is often not enough for texture coordinates. The blob compiler is clever enough to keep texture coords in full precision registers while using half precision for everything else. But we aren't quite that clever yet, so better to default to full precision. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-11-01 20:19:42 -04:00
Rob Clark	310fd5839c	freedreno/a3xx/compiler: relative addressing fixes. Handle some relative addressing constraints: cannot handle const or relative in cat5 and src2 of cat3. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-11-01 20:18:44 -04:00
Rob Clark	4ddd4e83c7	freedreno: we do actually support sqrt Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-11-01 20:17:56 -04:00
Anuj Phogat	625a631383	i965: Enable ARB_sample_shading on intel hardware >= gen6 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ken Graunke <kenneth@whitecape.org>	2013-11-01 16:01:49 -07:00
Anuj Phogat	e7393260be	i965/gen7: Enable the features required for GL_ARB_sample_shading - Enable GEN7_WM_MSDISPMODE_PERSAMPLE, GEN7_WM_POSOFFSET_SAMPLE, GEN7_WM_OMASK_TO_RENDER_TARGET as per the extension's specification. - Only enable one of GEN7_WM_8_DISPATCH_ENABLE or GEN7_WM_16_DISPATCH_ENABLE when GEN7_WM_MSDISPMODE_PERSAMPLE is enabled. Refer to IVB PRM Vol. 2, Part 1, Page 288 for details. V2: - Use shared function _mesa_get_min_invocations_per_fragment(). - Use brw_wm_prog_data variables: uses_pos_offset, uses_omask. V3: - Enable simd16 dispatch with per sample shading. - Make changes to give preference to 'simd16 only' mode over 'simd8 only' mode in case of non 1x per sample shading. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-01 16:01:49 -07:00
Anuj Phogat	8d7a934d09	i965/gen6: Enable the features required for GL_ARB_sample_shading - Enable GEN6_WM_MSDISPMODE_PERSAMPLE, GEN6_WM_POSOFFSET_SAMPLE, GEN6_WM_OMASK_TO_RENDER_TARGET as per the extension's specification. - Only enable one of GEN6_WM_8_DISPATCH_ENABLE or GEN6_WM_16_DISPATCH_ENABLE when GEN6_WM_MSDISPMODE_PERSAMPLE is enabled. Refer to SNB PRM Vol. 2, Part 1, Page 279 for details. V2: - Use shared function _mesa_get_min_invocations_per_fragment(). - Use brw_wm_prog_data variables: uses_pos_offset, uses_omask. V3: - Enable simd16 dispatch with per sample shading. - Make changes to give preference to 'simd16 only' mode over 'simd8 only' mode in case of non 1x per sample shading. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-01 16:01:48 -07:00
Anuj Phogat	e26bdf56a4	i965: Add FS backend for builtin gl_SampleMask[] V2: - Update comments - Add a special backend instructions to compute sample_mask. - Add a new variable uses_omask in brw_wm_prog_data. V3: - Make changes to support simd16 mode. - Delete redundant AND instruction and handle the register stride in FS backend instruction. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-01 16:01:48 -07:00
Anuj Phogat	e12bbb503f	i965: Add FS backend for builtin gl_SampleID V2: - Update comments - Add compute_sample_id variables in brw_wm_prog_key - Add a special backend instruction to compute sample_id. V3: - Make changes to support simd16 mode. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-01 16:01:48 -07:00
Anuj Phogat	65d0452bbc	i965: Add FS backend for builtin gl_SamplePosition V2: - Update comments. - Add compute_pos_offset variable in brw_wm_prog_key. - Add variable uses_pos_offset in brw_wm_prog_data. V3: - Make changes to support simd16 mode. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-01 16:01:48 -07:00
Anuj Phogat	81f5fb352a	i965: Don't do vector splitting for ir_var_system_value This is required while adding builtin system value vec{2, 3, 4} variables. For example: (declare (sys) vec2 gl_SamplePosition) Without this patch, the above GLSL IR is split into: (declare (temporary) float gl_SamplePosition_x) (declare (temporary) float gl_SamplePosition_y) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-01 16:01:48 -07:00
Anuj Phogat	627b2692e9	mesa: Add a helper function _mesa_get_min_invocations_per_fragment() This function is used to test if we need to do per sample shading or per fragment shading. V2: Use MAX2() to make sure the function returns a number >= 1. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-01 16:01:48 -07:00
Anuj Phogat	e849511c78	glsl: Add new builtins required by GL_ARB_sample_shading New builtins added by GL_ARB_sample_shading: in vec2 gl_SamplePosition in int gl_SampleID in int gl_NumSamples out int gl_SampleMask[] V2: - Use SWIZZLE_XXXX for STATE_NUM_SAMPLES. - Use "result.samplemask" in arb_output_attrib_string. - Add comment to explain the size of gl_SampleMask[] array. - Make gl_SampleID and gl_SamplePosition system values. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-01 16:01:48 -07:00
Anuj Phogat	0d69e8c813	mesa: Pass number of samples as a program state variable Number of samples will be required in fragment shader program by new GLSL builtin uniform "gl_NumSamples". V2: Use "state.numsamples" in place of "state.num.samples" Use _NEW_BUFFERS flag in place of _NEW_MULTISAMPLE Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <idr@freedesktop.org> Reviewed-by: Ken Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-01 16:01:47 -07:00
Anuj Phogat	77b440e42d	mesa: Add new functions and enums required by GL_ARB_sample_shading New functions added by GL_ARB_sample_shading: glMinSampleShadingARB() New enums: GL_SAMPLE_SHADING_ARB GL_MIN_SAMPLE_SHADING_VALUE_ARB V2: Update comments. Create new GL4x.xml. Remove redundant code in get.c. Update the API_XML list in Makefile.am. Add extra_gl40_ARB_sample_shading predicate to get.c. V3: Fix make check failure. Add checks for desktop GL. Use GLfloat in place of GLclampf in glMinSampleShading(). Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ken Graunke <kenneth@whitecape.org>	2013-11-01 16:01:47 -07:00
Anuj Phogat	e919e5ee4e	mesa: Add infrastructure for GL_ARB_sample_shading This patch implements the common support code required for the GL_ARB_sample_shading extension. V2: Move GL_ARB_sample_shading to ARB extension list. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <idr@freedesktop.org> Reviewed-by: Ken Graunke <kenneth@whitecape.org>	2013-11-01 16:01:47 -07:00
Matt Turner	3c28b2c09f	i965/fs: Optimize saturating SEL.G(E) with imm val <= 0.0f. Only one program's instruction count is changed, but a shader in Tropics is also affected. instructions in affected programs: 326 -> 320 (-1.84%) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-01 15:21:07 -07:00
Matt Turner	ca675b73d3	i965/fs: Optimize saturating SEL.L(E) with imm val >= 1.0. total instructions in shared programs: 1409124 -> 1406971 (-0.15%) instructions in affected programs: 158376 -> 156223 (-1.36%) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-01 15:21:07 -07:00
Matt Turner	a8f76d829b	i965/fs: Optimize OR with identical sources into a MOV. Helps a lot of Steam games. total instructions in shared programs: 1409360 -> 1409124 (-0.02%) instructions in affected programs: 20842 -> 20606 (-1.13%) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-01 15:21:07 -07:00
Eric Anholt	fd05ede0d0	glsl: Add a CSE pass. This only operates on constant/uniform values for now, because otherwise I'd have to deal with killing my available CSE entries when assignments happen, and getting even this working in the tree ir was painful enough. As is, it has the following effect in shader-db: total instructions in shared programs: 1524077 -> 1521964 (-0.14%) instructions in affected programs: 50629 -> 48516 (-4.17%) GAINED: 0 LOST: 0 And, for tropics, that accounts for most of the effect, the FPS improvement is 11.67% +/- 0.72% (n=3). v2: Use read_only field of the variable, manually check the lod_info union members, use get_num_operands(), rename cse_operands_visitor to is_cse_candidate_visitor, move all is-a-candidate logic to that function, and call it before checking for CSE on a given rvalue, more comments, use private keyword. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-01 10:25:33 -07:00
Eric Anholt	3641b97bdc	i965/vec4: Don't overwrite op[1] when doing a UBO load. Prior to the GLSL CSE pass, all of our testing happened to have a freshly computed temporary in op[1], from the multiply by 16 to get a byte offset. As of CSE you'll get var_refs of a reused value when you've got multiple loads from the same offset. Make a proper temporary for computing our temporary value, to avoid shifting the value farther and farther down. Avoids a regression in gs-float-array-variable-index Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-01 10:25:33 -07:00
Brian Paul	2197967cd4	st/mesa: fix _mesa_init_transform_feedback_object() argument Need to pass a pointer of the base type, not the st type. Fixes a compiler warning.	2013-11-01 08:43:25 -06:00
Kenneth Graunke	723f047a3b	i965: Fix brw_store_register_mem64 to stay within a single batch. Previously, the write of each 32-bit half might land in separate batch buffers, which is insane. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-10-31 12:11:52 -07:00
Kenneth Graunke	5eb0835b91	docs: List transform_feedback{2,3,instanced} for i965 in release notes.	2013-10-31 11:11:01 -07:00
Kenneth Graunke	0eeaf11edf	i965: Enable the ARB_transform_feedback_instanced extension on Gen7+. This depends on ARB_transform_feedback2, so I've predicated it on the ability to do register writes. It also depends on ARB_transform_feedback3, which is the only reason we couldn't expose it previously. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-31 11:04:37 -07:00
Kenneth Graunke	c4ec0ad8a9	i965: Enable the ARB_transform_feedback3 extension on Gen7+. This extension is written a bit strangely. Although it introduces the concept of multiple transform feedback streams, it doesn't actually provide more than a single stream. The ARB_gpu_shader5 extension is what introduces the ability to write to streams other than stream #0 and increases the required number of streams. Since we don't yet support ARB_gpu_shader5, we can safely enable ARB_transform_feedback3 even though we only support a single stream. This does provide some useful functionality: applications can now use more than one interleaved transform feedback buffer. v2: Only expose the extension if ARB_transform_feedback2 is also available, to avoid confusing applications (suggested by Ian). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-31 11:04:37 -07:00
Kenneth Graunke	066fb237e6	i965: Add support for gl_SkipComponents[1234]. ARB_transform_feedback3 allows applications to insert blank space between interleaved varyings by adding fake 1, 2, 3, or 4-component varyings named gl_SkipComponents[1234]. Mesa's core data structures don't explicitly track these, instead simply tracking the buffer offset for each real varying. If there is padding due to gl_SkipComponents, these will not be contiguous. Our hardware takes the specification quite literally. Instead of specifying offsets for each varying, it assumes they're all contiguous and requires you to program fake varyings for each "hole". This patch adds support for emitting SO_DECL structures for these holes. Although we've lost the information about exactly how the application specified their padding (i.e. gl_SkipComponents2, gl_SkipComponents2 vs. a single gl_SkipComponents4), it shouldn't matter. We just need to emit the right amount of space. This patch emits the minimal number of hole SO_DECL structures. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-31 11:04:37 -07:00
Kenneth Graunke	7232e8bea7	i965: Explicitly maintain a count of SO_DECL structures emitted. Currently, we emit one SO_DECL structure per output, so we use the index in the Outputs[] array as the index into the so_decl[] array as well. In order to support the fake "gl_SkipComponents[1234]" varyings from ARB_transform_feedback3, we'll need to emit SO_DECLs to fill in the holes between successive outputs. This means we'll likely emit more SO_DECLs than there are outputs, so we need to count it explicitly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-31 11:04:37 -07:00
Kenneth Graunke	e095434e52	i965: Create a temporary for transform feedback output components. This is a bit shorter. v2: Mark the temporary const (requested by Ian). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-31 11:04:37 -07:00
Kenneth Graunke	129da5b1c8	i965: Enable ARB_transform_feedback2 on Gen7+ if register writes work. With Linux 3.12, register writes work on Ivybridge and Baytrail, but not Haswell. That will be fixed in a future kernel revision, at which point this extension will automatically be enabled. v2: Use I915_GEM_DOMAIN_INSTRUCTION for the register read, and also correctly set the writeable flag when mapping (caught by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-31 11:04:37 -07:00
Kenneth Graunke	46d3c2bf4d	i965: Initialize batchbuffer and state modules before extensions. We only want to enable ARB_transform_feedback2 if we can write to registers from batchbuffers. In order to test that, we need to be able to submit batches. And for batches to work, we need to program the initial pipeline state (like PIPELINE_SELECT), which is done from brw_state_init(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-31 11:04:37 -07:00
Kenneth Graunke	82a5ee6be4	i965: Implement glDrawTransformFeedback(). Implementing the GetTransformFeedbackVertexCount() driver hook allows the VBO module to call us with the right number of vertices. The hardware doesn't directly count the number of vertices written by SOL, so we instead use the SO_NUM_PRIMS_WRITTEN(n) counters and multiply by the number of vertices per primitive. Unfortunately, counting the number of primitives generated is tricky: a program might pause a transform feedback operation, start a second one with a different object, then switch back and resume. Both transform feedback operations share the SO_NUM_PRIMS_WRITTEN counters. To work around this, we save the counter values at Begin, Pause, Resume, and End. This "bookends" each section where transform feedback is active for the current object. Adding up differences of pairs gives us the number of primitives generated. (This is similar to what we do for occlusion queries on platforms without hardware contexts.) v2: Fix missing parenthesis in assertion (caught by Eric Anholt). v3: Reuse prim_count_bo rather than freeing it and immediately allocating a new one (suggested by Topi Pohjolainen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-31 11:04:37 -07:00
Kenneth Graunke	b2ff11618f	i965: Mark brw_draw_prims tfb_vertcount parameter as unused. Renaming it makes it obvious that it isn't used, and the assertion verifies that the VBO module never passes us such an object. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-31 11:04:37 -07:00
Kenneth Graunke	ded34f65ad	mesa: Add a new GetTransformFeedbackVertexCount() driver hook. DrawTransformFeedback() needs to obtain the number of vertices written to a particular stream during the last Begin/EndTransformFeedback block. The new driver hook returns exactly that information. Gallium drivers already implement this by passing the transform feedback object to the drawing function, counting the number of vertices written on the GPU, and using draw indirect. This is efficient, but doesn't always work: If vertex data comes from user arrays, then the VBO module needs to know how many vertices to upload, so we need to synchronously count. Gallium drivers are currently broken in this case. It also doesn't work if primitive restart is done in software. For normal drawing, vbo_draw_arrays() performs software primitive restart, splitting the draw call in two. vbo_draw_transform_feedback() currently doesn't because it has no idea how many vertices need to be drawn. The new driver hook gives it that information, allowing us to reuse the existing vbo_draw_arrays() code to do everything right. On Intel hardware (at least Ivybridge), using the draw indirect approach is difficult since the hardware counts primitives, rather than vertices, which requires doing some simple math. So we always use this hook. Gallium drivers will likely want to use this hook in some cases, but want to use the existing draw indirect approach where possible. Hence, I've added a flag to allow drivers to opt-in to this call. v2: Make it possible to implement this hook but only use this path when necessary (suggested by Marek). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-10-31 11:04:37 -07:00
Kenneth Graunke	684958d1e7	i965: Implement Pause/ResumeTransformfeedback driver hooks on Gen7+. The ARB_transform_feedback2 extension introduces the ability to pause and resume transform feedback sessions. Although only one can be active at a time, it's possible to switch between multiple transform feedback objects while paused. In order to facilitate this, we need to save/restore the SO_WRITE_OFFSET registers so that after resuming, the GPU continues writing where it left off. This functionality also exists in ES 3.0, but somehow we completely forgot to implement it. v2: Reduce alignment from 4096 to 64 (it seemed excessive). v3: Use I915_GEM_DOMAIN_INSTRUCTION instead of RENDER, for consistency with other writes. It shouldn't matter on IVB+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-31 11:04:37 -07:00
Kenneth Graunke	0d7033c394	i965: Create a new brw_transform_feedback_object subclass. This adds the basic driver hooks to allocate/free the brw variant. It doesn't contain any additional information yet, but it will soon. v2: Use the new _mesa_init_transform_feedback_object helper function (requested by Eric and Ian). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-31 11:04:37 -07:00
Kenneth Graunke	be6227d29d	st/mesa: Use the new _mesa_init_transform_feedback_object() helper. This picks up a missing obj->EverBound = GL_FALSE line, and will catch any new fields that get added in the future. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-31 11:04:37 -07:00
Kenneth Graunke	f02ee3044f	mesa: Separate transform feedback object initialization from allocation. Both Gallium and i965 subclass gl_transform_feedback_object, which requires implementing a custom NewTransformFeedback hook. Creating a helper function to initialize the fields avoids code duplication and divergence. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-31 11:04:36 -07:00
Brian Paul	0e2f0baa43	vbo: fix MSVC double->float conversion warnings	2013-10-31 08:21:58 -06:00
Brian Paul	83f276ab05	swrast: fix MSVC double->float conversion warnings	2013-10-31 08:21:58 -06:00
Brian Paul	717621acff	mesa: fix some MSVC signed/unsigned compiler warnings	2013-10-31 08:21:58 -06:00
Brian Paul	010f8762e8	meta: fix assorted MSVC int/float conversion warnings	2013-10-31 08:21:58 -06:00
Brian Paul	e4d4ec9ddf	glsl: fix MSVC int->bool conversion warning	2013-10-31 08:21:58 -06:00
Brian Paul	3c11bc6a5a	st/draw: silence Mingw warning in pointer_to_offset() Fixes "warning: cast from pointer to integer of different size" for 64-bit builds.	2013-10-31 08:21:58 -06:00
Matt Turner	b16b3c8703	i965/fs: Perform CSE on CMP(N) instructions. Optimizes cmp.ge.f0(8) null g45<8,8,1>F 0F (+f0) sel(8) g50<1>F g40<8,8,1>F g10<8,8,1>F cmp.ge.f0(8) null g45<8,8,1>F 0F (+f0) sel(8) g51<1>F g41<8,8,1>F g11<8,8,1>F cmp.ge.f0(8) null g45<8,8,1>F 0F (+f0) sel(8) g52<1>F g42<8,8,1>F g12<8,8,1>F cmp.ge.f0(8) null g45<8,8,1>F 0F (+f0) sel(8) g53<1>F g43<8,8,1>F g13<8,8,1>F into cmp.ge.f0(8) null g45<8,8,1>F 0F (+f0) sel(8) g50<1>F g40<8,8,1>F g10<8,8,1>F (+f0) sel(8) g51<1>F g41<8,8,1>F g11<8,8,1>F (+f0) sel(8) g52<1>F g42<8,8,1>F g12<8,8,1>F (+f0) sel(8) g53<1>F g43<8,8,1>F g13<8,8,1>F total instructions in shared programs: 1644938 -> 1638181 (-0.41%) instructions in affected programs: 574955 -> 568198 (-1.18%) Two more 16-wide programs (in L4D2). Some large (-9%) decreases in instruction count in some of Valve's Source Engine games. No regressions. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 19:49:27 -07:00
Matt Turner	219b43c612	i965/fs: Don't emit null MOVs in CSE. We'd like to CSE some instructions, like CMP, that often have null destinations. Instead of replacing them with MOVs to null, just don't emit the MOV. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 19:49:27 -07:00
Matt Turner	a93d54eb68	i965/fs: Use reads_flag and writes_flag methods in the scheduler. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 19:49:27 -07:00
Matt Turner	20d0297ff2	i965/fs: Add reads_flag() and writes_flag() to fs_inst. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 19:49:27 -07:00
Matt Turner	f768f998e0	i965/fs: Add is_null() method to fs_reg. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 19:49:27 -07:00
Eric Anholt	8dfc9f038e	i965/fs: Use the gen7 scratch read opcode when possible. This avoids a lot of message setup we had to do otherwise. Improves GLB2.7 performance with register spilling force enabled by 1.6442% +/- 0.553218% (n=4). v2: Use BRW_PREDICATE_NONE, improve a comment (by Paul). Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 17:51:19 -07:00
Eric Anholt	6032261682	i965: Merge together opcodes for SHADER_OPCODE_GEN4_SCRATCH_READ/WRITE I'm going to be introducing gen7 variants, and the previous naming was going to get confusing. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 17:51:17 -07:00
Eric Anholt	32182bb004	i965/fs: Fix register unspills from a reg_offset. We were clearing the reg_offset before trying to use it. Oops. Fixes glsl-fs-texture2drect with the reg spilling debug enabled. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 17:51:15 -07:00
Eric Anholt	0e20051f54	i965/fs: Fix register spilling for 16-wide. Things blew up when I enabled the debug register spill code without disabling 16-wide, so I decided to just fix 16-wide spilling. We still don't generate 16-wide when register spilling happens as part of allocation (since we expect it to be slower), but now we can experiment with allowing it in some cases in the future. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 17:51:10 -07:00
Eric Anholt	537f183fe6	i965/fs: Exit the compile if spilling would overwrite in-use MRFs. I believe this will never happen in SIMD8 mode, but it could for SIMD16 when we fix it. v2: Fix off-by-one in my register counting comment (caught by Paul). Reviewed-by: Paul Berry <stereotype441@gmail.com> (v1)	2013-10-30 17:51:02 -07:00
Eric Anholt	44ec2f1751	i965/fs: Fix broken register spilling debug code. Now that reg spilling generates new vgrfs, we were looping forever if you ever turned it on. Instead, move the debug code into the register allocator right near where we'd be doing spilling anyway, which should more accurately reflect how register spilling occurs in the wild. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 17:50:59 -07:00
Eric Anholt	b3f6690406	i965/fs: Split "find what MRFs were used" to a helper function. I'm going to need to reuse this for fixing register spilling on SIMD16. Note that BRW_MAX_MRF is 16, which is the same as BRW_MAX_GRF - GEN7_MRF_HACK_START. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 17:50:56 -07:00
Eric Anholt	32ac5634d6	i965/fs: Update an ancient, wrong comment about reg_offset. This hasn't been true since SIMD16 mode was added. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 17:50:51 -07:00
Kai Wasserbäch	bbb77fc2f1	radeonsi: Allow longer intrinsic names Fixes a boat load of Piglit tests for me, which crashed like fdo#70913 before. Thanks to Michel Dänzer for the tip. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70913 Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-30 16:40:06 -07:00
Tom Stellard	193594a1b8	clover: Don't install headers when using the icd The ICD loader should be responsible for installing headers. Reviewed and Tested-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-10-30 16:40:06 -07:00
Tom Stellard	6f3465f340	radeon/llvm: Specify the DataLayout when running optimizations Without DataLayout, a lot of optimization passes aren't run and the ones that are don't work as well.	2013-10-30 16:40:06 -07:00
Eric Anholt	20dbeadd83	i965/fs: Prefer more-critical instructions of the same age in LIFO scheduling. When faced with a million instructions that all became candidates at the same time (none of which individually reduce register pressure), the ones on the critical path are more likely to be the ones that will free up some candidates soon. shader-db: total instructions in shared programs: 1681070 -> 1681070 (0.00%) instructions in affected programs: 0 -> 0 GAINED: 40 LOST: 74 Fixes indistinguishable-from-hanging behavior in GLES3conform's uniform_buffer_object_max_uniform_block_size test, regressed by `c3c9a8c857`. Given that `93bd627d5a` was unlocked by that commit, the net effect on 16-wide program count is still quite positive, and I think this should give us more stable scheduling (less dependency on original instruction emit order). v2: Comment suggestions by Paul Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70943 Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 15:46:54 -07:00
Eric Anholt	017361dd37	i965: Compute the node's delay time for scheduling. This is a step in doing scheduling as described in Muchnick (p538). A difference is that our latency function is only specific to one instruction (it doesn't describe, for example, the different latency between WAR of a send's arguments and RAW of a send's destination), but that's changeable later. We also don't separately compute the postorder traversal of the graph, since we can use the setting of the delay field as the "visited" flag. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 15:46:48 -07:00
Emil Velikov	9eb3de1ce7	automake: handle expat version pre 2.1 Commit `aec20d66d9` (automake: properly handle non-default expat installation) assumed that up-to-date distributions use a recent version of expat that handles security vulnerabilities CVE-2012-1147 and CVE-2012-1148. Seems like this is not always the case and they prefer to backport only the fix, rather than use the updated library. This commit adds a default case -lexpat whenever expat is not found, while properly handling expat.pc if present. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71022 Reported-By: Bryce Harrington <b.harrington@samsung.com> Reported-By: Vinson Lee <vlee@freedesktop.org> Tested-by: Bryce Harrington <b.harrington@samsung.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-30 22:05:42 +00:00
Ian Romanick	5cb80f0314	glsl: Move layout(location) checks to AST-to-HIR conversion This will simplify the addition of layout(location) qualifiers for separate shader objects. This was validated with new piglit tests arb_explicit_attrib_location/1.30/compiler/not-enabled-01.vert and arb_explicit_attrib_location/1.30/compiler/not-enabled-02.vert. v2: Refactor error checking to check_explicit_attrib_location_allowed and eliminate the gotos. Suggested by Paul. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 13:49:30 -07:00
Ian Romanick	9d6294f5a2	glsl: Slightly restructure error generation in validate_explicit_location Use mode_string to get the name of the variable mode. Slightly change the control flow. Both of these changes make it easier to support separate shader object location layouts. The format of the message changed because mode_string can return a string like "shader output". This would result in an awkward message like "vertex shader shader output..." Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 13:49:30 -07:00
Ian Romanick	f8c579dc0f	glsl: Make mode_string function globally available I made this a function (instead of a method of ir_variable) because it made the change set smaller, and I expect that there will be an overload that takes an ir_var_mode enum. Having both functions used the same way seemed better. v2: Add missing case for ir_var_system_value. v3: Change the ir_var_mode_count case to just break. Move the assertion and the return outside the switch statement. In the unlikely event that var->mode is an invalid value other than ir_var_mode_count, the assertion will still fire, and in release builds we won't wind up returning a garbage pointer. Suggested by Paul. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 13:49:30 -07:00
Ian Romanick	2cb760d994	glsl: Eliminate the global check in validate_explicit_location Since the separation of ir_var_function_in and ir_var_shader_in (similar for out), this check is no longer necessary. Previously, global_scope was the only way to tell which was which. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 13:49:29 -07:00
Ian Romanick	8f00a77fbc	glsl: Extract explicit location code from apply_type_qualifier_to_variable Future patches will add some extra code to this path, and some of that code will want to exit from the explicit location code early. v2: Change a geometry shader "break" to a "return" so that we don't try to apply a bogus geometry shader location qualifier (which could cause cascading errors). Suggested by Paul. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-30 13:49:29 -07:00
Gregory Hainaut	0059d1948e	mesa: Drop unused return value from use_shader_program The return value has been unused since commit `d348b0c`. This was originally included in another patch, but it was split out by Ian Romanick. v2: Drop unnecessary final return. Suggested by Paul. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Cc: Eric Anholt <eric@anholt.net>	2013-10-30 13:49:29 -07:00
Fabio Pedretti	103824dc24	wayland: silence unused var warning Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-10-30 12:50:09 -07:00
Johannes Obermayr	5e162566db	ilo: Fix out-of-tree build. [olv: use $(srcdir) instead of $(top_srcdir)]	2013-10-30 21:17:10 +08:00
José Fonseca	26a8f76ba1	scons: Add missing dependencies to src/mapi/glapi/gen/*.xml Incremental builds were failing because some generated source files were missing dependencies on src/mapi/glapi/gen/*.xml. Hopefully this change will be the end of these incremental build failures.	2013-10-30 12:21:54 +00:00
Marek Olšák	e929e27737	glsl: fix crash introduced by the previous commit	2013-10-30 00:14:35 +01:00
Marek Olšák	7e414b5864	glsl: break the gl_FragData array into separate gl_FragData[i] variables This avoids a defect in lower_output_reads. The problem is lower_output_reads treats the gl_FragData array as a single variable. It first redirects all output writes to a temporary variable (array) and then writes the whole temporary variable to the output, generating assignments to all elements of gl_FragData. BTW this pass can be modified to lower all arrays, not just inputs and outputs. The question is whether it is worth it. Reviewed-by: Paul Berry <stereotype441@gmail.com> v2: addressed Paul Berry's comments	2013-10-29 23:50:01 +01:00
Emil Velikov	aec20d66d9	automake: properly handle non-default expat installation Use PKG_CHECK_MODULES over requesting the user to set up the option at configure time. Drop unused EXPAT_INCLUDE and update all targets. NOTE: This commit removes the --with-expat configure option. One should ensure that the expat they wish to use has an expat.pc file accessible by pkg-config. v2: * Add note about the removal of --with-expat (per Tom Stellard) * Drop EXPAT_CFLAGS for targets that do not build DRI_COMMON (spotted by Matt Turner) v3: * Rebase on top of megadrivers (drop EXPAT_CFLAGS from swrast) Acked-by: Matt Turner <mattst88@gmail.com> (v2) Reviewed-by: Tom Stellard <thomas.stellard@amd.com> (v2) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Conflicts: configure.ac src/mesa/drivers/dri/common/Makefile.am	2013-10-29 21:14:41 +00:00
Emil Velikov	0828ad4e63	configure: use PKG_CONFIG variable over hardcoded pkg-config Already available and used in other places of configure.ac. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-29 21:04:38 +00:00
Emil Velikov	2a87647c6a	targets/xorg-nouveau: drop usage of dri1 function DRICreatePCIBusID The function should have never used it in the first place as it was a left over from the DRI1 days of the nouveau ddx. While we're around check if KMS is supported before opening the nouveau device, and add support for Fermi & Kepler cards. Compile tested only due to the lack of a Fermi/Kepler card. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-29 21:04:38 +00:00
Emil Velikov	c9e6e6382f	gallium/targets/xorg: drop set but unused variable entity The function xf86GetEntityInfo() retrieves the entity rather than doing any changes. Remove this no-op code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-29 21:04:38 +00:00
Emil Velikov	ba3efd6b42	st/xorg: drop set but unused variables dxo, dyo Commit `a9f8baf00b` removed the first and only use of the variables but forgot to remove them. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-29 21:04:38 +00:00
Emil Velikov	2b7ffde8bd	st/xorg: add sanity checks after malloc Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-29 21:04:37 +00:00
Emil Velikov	5c398e243c	st/xorg: remove unnecessary headers v2: Remove xf86PciInfo.h, all drivers provide their own PCI ID list Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-29 21:04:37 +00:00
Rob Clark	2bc1fc2fb6	freedreno: emulated unsupported primitive types Use u_primconvert to convert unsupported primitives into supported primitive plus index buffer. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-10-29 16:49:43 -04:00
Rob Clark	b881917088	gallium/auxiliary/indices: add u_primconvert A convenient front end to indices generate/translate code, for emulating primitives which are not supported natively by the driver. This handles saving/restoring index buffer state, etc. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-29 16:49:43 -04:00
Rob Clark	28f3f8d413	gallium/auxiliary/indices: add start param Add 'start' parameter to generator/translator. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-29 16:49:43 -04:00
Rob Clark	5127436a4a	freedreno: update generated headers pull in some fixes to draw-initiator/prim-type. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-10-29 16:49:43 -04:00
Eric Anholt	774b787d6b	i965/fs: Drop our dead push constants before overflowing to pull constants. The idea of the original order was that you'd dead code eliminate accesses to push constants. But I've never seen a case of that (nor has shader-db), while we frequently see sparse accesses of large constant arrays that would overflow into pull constants. Cuts pull constant use on csgo, serious sam, planeshift, and the cave: total instructions in shared programs: 1695103 -> 1688795 (-0.37%) instructions in affected programs: 92024 -> 85716 (-6.85%) GAINED: 339 LOST: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-29 13:43:01 -07:00
Alexander von Gluck IV	9a9fb94ca9	haiku-softpipe: Minor cleanup and color space fixes * Use more consistent data sources * Fix improper color space assignments * Remove unnecessary comments and code * Drop unnecessary round_up function (this was leftover from moving winsys code out of renderer) Acked-by: Brian Paul <brianp@vmware.com>	2013-10-29 15:27:43 -05:00
Alexander von Gluck IV	439dd0e20a	winsys: Correct Haiku winsys display target code * Instead of assuming the displaytarget is the same stride / colorspace as the destination, let's actually check the source bitmap. * Fixes random stride issues in rendering Acked-by: Brian Paul <brianp@vmware.com>	2013-10-29 15:27:40 -05:00
Francisco Jerez	b8f89fc5cb	clover: Use context device list for error checking in clGetProgramBuildInfo. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=70891. Reported-by: Bruno Jiménez <brunojimen@gmail.com>	2013-10-29 12:40:56 -07:00
Francisco Jerez	e515dcbf96	i965: Simplify the shader time code by using atomic counter helpers. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-29 12:40:56 -07:00
Francisco Jerez	d58bd75263	i965: Add brw_reg constructors taking a dynamically determined vector width. The MRF variant is going to be used extensively by the atomic counter intrinsics to assemble untyped atomic and surface read messages easily. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-29 12:40:56 -07:00
Francisco Jerez	5e621cb9fe	i965/gen7: Implement code generation for untyped surface read instructions.	2013-10-29 12:40:56 -07:00
Francisco Jerez	cfaaa9bbb7	i965/gen7: Implement code generation for untyped atomic instructions. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-29 12:40:56 -07:00
Francisco Jerez	5809512b17	i965: Implement ABO surface state emission. The maximum number of atomic buffer objects is somewhat arbitrary, we can change it in the future easily if it turns out it's not enough... v2: Add comments with the relevant mesa dirty bits. Fix usage of BRW_NEW_UNIFORM_BUFFER in the GS ABO state atom. v3: Update binding table layout diagrams. v4: Resolve conflicts with the recent dynamic surface index assignment changes. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-29 12:40:56 -07:00
Francisco Jerez	c4e730e218	i965: Define vtbl method that initializes an untyped R/W surface. And add Gen7 implementation. v2: Fix off by one error in buffer size calculation. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-29 12:40:55 -07:00
Francisco Jerez	7a54db9ce5	glsl: Fix the function inlining pass to deal with general opaque arguments. Almost a trivial change, it boils down to renaming a few identifiers so their names still make sense for opaque types other than sampler. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-29 12:40:55 -07:00
Francisco Jerez	bbded5b5fe	glsl: Add built-in functions and constants required for ARB_shader_atomic_counters. v2: Represent atomics as GLSL intrinsics. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-29 12:40:55 -07:00
Francisco Jerez	9562922376	glsl: Basic support for built-in intrinsics. Fix the linker to deal with intrinsic functions which are undefined all the way down to the driver back-end, and introduce intrinsic definition helpers in the built-in generator. We still need to figure out what kind of interface we want for drivers to communicate to the GLSL front-end which of the supported intrinsics should use a default GLSL implementation and which should use a hardware-specific override. As there's no default GLSL implementation for atomic ops, this seems like something we can worry about later on. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> v2: Define local helper function to generate ir_call nodes in the builtin generator.	2013-10-29 12:40:55 -07:00
Francisco Jerez	cc744a0947	glsl: Add type predicate to check whether a type contains any opaque types. And use it to forbid comparisons of opaque operands. According to the GL 4.2 specification: > Except for array indexing, structure member selection, and > parentheses, opaque variables are not allowed to be operands in > expressions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-29 12:40:55 -07:00
Francisco Jerez	26db3b933f	glsl: Add new atomic_uint built-in GLSL type. v2: Fix GLSL version in which the type became available. Add contains_atomic() convenience method. Split off atomic counter comparison error checking to a separate patch that will handle all opaque types. Include new ir_variable fields for atomic types. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-29 12:40:55 -07:00
Francisco Jerez	0bed1ab73b	glsl: Add extension enables for ARB_shader_atomic_counters. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-29 12:40:55 -07:00
Francisco Jerez	1c7dcfed7c	mesa: Add support for ARB_shader_atomic_counters. This patch implements the common support code required for the ARB_shader_atomic_counters extension. It defines the necessary data structures for tracking atomic counter buffer objects (from now on "ABOs") associated with some specific context or shader program, it implements support for binding buffers to an ABO binding point and querying the existing atomic counters and buffers declared by GLSL shaders. v2: Fix extension checks. Drop unused MAX_ATOMIC_BUFFERS constant. Acked-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-29 12:40:55 -07:00
Francisco Jerez	e3fd31dc41	glapi: Add support for ARB_shader_atomic_counters. Add XML file for the dispatch code generator, update the dispatch_sanity test and add stub definition for the new entry point. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-29 12:40:55 -07:00
Francisco Jerez	db47074ac0	i965: Handle deallocation of some private ralloc contexts explicitly. These ralloc contexts belong to a specific object and are being deallocated manually from the class destructor. Now that we've hooked up destructors to ralloc there's no reason for them to be children of any other context, and doing so might lead to double frees under some circumstances. The class destructor has all the responsibility of freeing class memory resources now.	2013-10-29 12:40:55 -07:00
Francisco Jerez	d18477deea	ralloc: Hook up C++ destructors to ralloc when necessary. This patch makes sure that class destructors are called as they should be when a C++ object allocated by ralloc is released. Based on a previous patch by Kenneth Graunke, but it doesn't exhibit the ~0.8% performance regression in shader compilation times because we now use the HAS_TRIVIAL_DESTRUCTOR() macro to detect the typical case where the indirect function call can be avoided because the object's destructor doesn't need to do anything. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-29 12:40:55 -07:00
Francisco Jerez	98ab905af0	mesa: Define introspection macro to determine whether a type is trivially destructible. Only implemented on GCC and Clang for now. Other compilers use a dummy implementation that always returns false, which should be a safe [but slightly inefficient] assumption in all cases. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-29 12:40:55 -07:00
Paul Berry	be63803b0c	glsl: Generalize MSVC fix for strcasecmp(). This will let us use strcasecmp() from anywhere inside Mesa without having to worry about the fact that it doesn't exist in MSVC. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-29 11:10:56 -07:00
Roland Scheidegger	e4195acab5	llvmpipe: fix bogus layer clamping in setup The layer coming from GS needs to be clamped (not sure if that's actually the correct error behavior but we need something) as the number can be higher than the amount of layers in the fb. However, this code was using the layer calculation from the scene, and this was actually calculated in lp_scene_begin_rasterization() hence too late (so setup was using the value from the _previous_ scene or just zero if it was the first scene). Since the value is used in both rasterization and setup, move calculation up to lp_scene_begin_binning() though it's a bit more inconvenient to calculate there. (Theoretically could move _all_ code which was in lp_scene_begin_rasterization() to there, because ever since we got rid of swizzled render/depth buffers our "map" functions preparing the fb data for render don't actually change the data in there at all, but it feels like it would be a hack.) v2: improve comments Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-10-29 17:54:03 +01:00
Matthew McClure	be0b67a143	util,llvmpipe: correctly set the minimum representable depth value Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-10-29 15:53:48 +00:00
Brian Paul	d0eaf6752d	st/mesa: move out of memory check in st_draw_vbo() Before we were only checking the st->vertex_array_out_of_memory flag after updating array state. But if there's two consecutive glDrawArrays calls and the first one is skipped because of OOM, the second one should be skipped too. Cc: 9.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-10-29 08:09:34 -06:00
Brian Paul	ea9fe9ebdb	svga: reindent drawing code	2013-10-29 08:09:34 -06:00
Eric Anholt	415d6dc5bd	i965/vec4: Reduce working set size of live variables computation. Orbital Explorer was generating a 4000 instruction geometry shader, which was taking 275 trips through dead code elimination and register coalescing, each of which updated live variables to get its work done, and invalidated those live variables afterwards. By using bitfields instead of bools (reducing the working set size by a factor of 8) in live variables analysis, it drops from 88% of the profile to 57%, and reduces overall runtime from I-got-bored-and-killed-it (Paul says 3+ minutes) to 10.5 seconds. Compare to `f179f419d1` on the FS side. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-29 00:27:35 -07:00
Vadim Girlin	8bd4476010	r600g/sb: fix value::is_fixed() This prevents unnecessary (and wrong) register allocation in the scheduler for preloaded values in fixed registers. Fixes interpolation-mixed.shader_test on rv770 (and probably on all other pre-evergreen chips). Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com> Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-10-29 05:49:21 +04:00
Eric Anholt	08bf52712e	glsl: Drop no-op shifts involving 0. I noticed this in a shader in Unigine Heaven that was spilling. While it doesn't really reduce register pressure, it shaves a few instructions anyway (7955 -> 7882). v2: Fix turning "0 >> x" into "x" instead of "0" (caught by Erik Faye-Lund). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-28 14:07:31 -07:00
Eric Anholt	3a0fdf2ab6	glsl: Use ir_builder more in opt_algebraic. While ir_builder is slightly less efficient, we're only increasing the work when there's actual optimization being done, and it's way more readable code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-28 14:07:31 -07:00
Eric Anholt	27bcb5063f	glsl: Move common code out of opt_algebraic's handle_expression(). Matt and I had each screwed up these common required patterns recently, in ways that wouldn't have been noticed for a long time if not for code review. Just enforce it in the caller so that we don't rely on code review catching these bugs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-28 14:07:31 -07:00
Carl Worth	29996e2199	Remove error when calling glGenQueries/glDeleteQueries while a query is active There is nothing in the OpenGL specification which prevents the user from calling glGenQueries to generate a new query object while another object is active. Neither is there anything in the Mesa implementation which prevents this. So remove the INVALID_OPERATION errors in this case. Similarly, it is explicitly allowed by the OpenGL specification to delete an active query, so remove the assertion for that case, replacing it with the necessary state updates to end the query (clear the bindpt pointer and call into the driver's EndQuery hook). CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2013-10-28 12:56:49 -07:00
Kenneth Graunke	5563dfabc8	i965: Also emit HiZ and Stencil packets when disabling depth on Gen6. The normal drawing path does this, and it's necessary on Ivybridge, so let's try it on Sandybridge too. It's not explicitly documented as necessary, but might help with hangs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Xinkai Chen <yeled.nova@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-10-28 11:29:36 -07:00
Kenneth Graunke	29e5d5db51	i965: Also emit HIER_DEPTH and STENCIL packets when disabling depth. From the documentation: "[DevIVB] 3DSTATE_DEPTH_BUFFER must always be programmed along with the other Depth/Stencil state commands(i.e. 3DSTATE_CLEAR_PARAMS, 3DSTATE_STENCIL_BUFFER, or 3DSTATE_HIER_DEPTH_BUFFER)." We normally do this, but BLORP was failing to do so in the case where it disables depth. Not observed to fix anything yet. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Xinkai Chen <yeled.nova@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-10-28 11:29:33 -07:00
Kenneth Graunke	65b1f642ac	i965: Move post-sync non-zero flush for 3DSTATE_MULTISAMPLE. For some reason, we put the flush in the caller, rather than just before emitting the packet. This is more than a cosmetic problem: BLORP calls gen6_emit_3dstate_multisample() directly, and so it missed the flush. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Xinkai Chen <yeled.nova@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-10-28 11:29:32 -07:00
Kenneth Graunke	10a918e52c	i965: Also guard 3DSTATE_DRAWING_RECTANGLE with a flush in blorp. Non-pipelined commands need this flush. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Xinkai Chen <yeled.nova@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-10-28 11:29:31 -07:00
Kenneth Graunke	3aef1fefb4	i965: Emit post-sync non-zero flush before 3DSTATE_DRAWING_RECTANGLE. This is another non-pipelined command that needs a flush on Sandybridge. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Xinkai Chen <yeled.nova@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-10-28 11:29:29 -07:00
Kenneth Graunke	436e815a25	i965: Emit post-sync non-zero flush before 3DSTATE_GS_SVB_INDEX. From the comments above intel_emit_post_sync_nonzero_flush: "[DevSNB-C+{W/A}] Before any depth stall flush (including those produced by non-pipelined state commands), software needs to first send a PIPE_CONTROL with no bits set except Post-Sync Operation != 0." This suggests that every non-pipelined (0x79xx) command needs a post-sync non-zero flush before it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Xinkai Chen <yeled.nova@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-10-28 11:29:27 -07:00
Daniel Vetter	32a3f5f6d7	i965: CS writes/reads should use I915_GEM_INSTRUCTION Otherwise the gen6 w/a in the kernel won't kick in and the write will land nowhere. Inspired by a patch Ken pointed me at which had the same issue (but isn't yet merged and also for a gen7+ feature). An audit of the entire driver didn't reveal any other case than the one in the write_reg helper used by the gen6 queryobj code. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Xinkai Chen <yeled.nova@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-10-28 11:29:15 -07:00
Anuj Phogat	f278d49c4b	i965: Do not set bilinear_filter flag in case of multisample blits Setting bilinear_filter flag in case of multisample blits with GL_LINEAR filter causes incorrect behavior in translate_dst_to_src() function. This broke Modern Warfare (1, 2 and 3) on SNB, IVB and HSW. Tested on SNB and IVB, no Piglit regressions. Trace file of the game (taken with apitrace) works fine with this patch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69078 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reported-by: Armin K <krejzi@email.com> Tested-by: Armin K <krejzi@email.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-28 09:33:01 -07:00
Rico Schüller	14f02cdee8	mesa: Remove trailing whitespace in texparam.c Signed-off-by: Rico Schüller <kgbricola@web.de> Signed-off-by: Brian Paul <brianp@vmware.com>	2013-10-28 08:43:40 -06:00
Brian Paul	0ce3bfbd40	mesa: use void in _mesa_VDPAUFiniNV() as in the header file	2013-10-28 08:37:39 -06:00
Timothy Arceri	b59c5926cb	glsl: Add check for unsized arrays to glsl types The main purpose of this patch is to increase readability of the array code by introducing is_unsized_array() to glsl_types. Some redundant is_array() checks are also removed, along with a small number of other related cleanups. The introduction of is_unsized_array() should also make the ARB_arrays_of_arrays code simpler and more readable when it arrives. V2: Also replace code that checks for unsized arrays directly with the length variable Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> v3 (Paul Berry <stereotype441@gmail.com>): clean up formatting. Separate whitespace cleanups to their own patch. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-28 06:06:04 -07:00
Timothy Arceri	5cd7eb9f07	glsl: whitespace cleanups. Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> v2 (Paul Berry <stereotype441@gmail.com>): Separate from "glsl: Add check for unsized arrays to glsl types". Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-28 06:06:04 -07:00
Timothy Arceri	e14abf566b	glsl: Fix comment Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-28 06:05:51 -07:00
Christian König	925ffa8c4a	vl/h264: split fields into SPS/PPS Add a lot of missing fields as well. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-10-28 11:08:12 +01:00
Christian König	6f2410c9aa	radeon/uvd: fix H264 chroma format handling Signed-off-by: Christian König <christian.koenig@amd.com>	2013-10-28 11:06:37 +01:00
Christian König	cc49baeedc	vl: add 400 chroma format as well Signed-off-by: Christian König <christian.koenig@amd.com>	2013-10-28 11:06:18 +01:00
Chia-I Wu	d2fdc0d634	ilo: minor cleanups for recent interface changes Kill ilo_bind_sampler_states2 and ilo_set_sampler_views2. Map PIPE_FORMAT_R10G10B10A2_UINT to BRW_SURFACEFORMAT_R10G10B10A2_UINT.	2013-10-28 11:40:41 +08:00
Timothy Arceri	d1d3b1e361	glsl: Move error message inside validation check reducing duplicate message handling v2 (Paul Berry <stereotype441@gmail.com>): Fix precedence error in call to _mesa_glsl_error(). Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-27 10:23:52 -07:00
Paul Berry	e79e6c5911	i965: Make fs gl_PrimitiveID input work even when there's no gs. When a geometry shader is present, the fragment shader gl_PrimitiveID input acts like an ordinary varying, receiving data from the gs gl_PrimitiveID output. When there's no geometry shader, we have to ask the fixed function SF hardware to provide the primitive ID to the fragment shader instead. Previously, the SF setup code would handle this situation by recognizing that the FS gl_PrimitiveID input didn't match to any VS output; since normally an FS input with no corresponding VS output leads to undefined data, the SF setup code used to just arbitrarily assign it to receive data from attribute 0. This patch changes the SF setup code so that instead of arbitrarily using attribute 0, it assigns the unmatched FS input to receive gl_PrimitiveID. In the case where the FS input really is gl_PrimitiveID, this produces the intended result. In all other cases, no harm is done since GL specifies that the behaviour is undefined. Fixes piglit test primitive-id-no-gs. v2: If an attribute is already being overridden with point coordinates, don't try to also override it with gl_PrimitiveID. This is necessary to avoid regressing piglit tests such as shaders/glsl-fs-pointcoord. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-27 10:23:39 -07:00
Vinson Lee	7f76368305	mesa: Add GL_NV_vdpau_interop functions to dispatch_sanity.cpp. Fixes 'make check' failures introduced with commit `80964226e9`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70900 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-10-26 23:13:51 -07:00
Brian Paul	bc23944091	mesa: add vdpau.c and st_vdpau.c to src/mesa/SConscript Fixes SCons build.	2013-10-26 07:24:17 -06:00
Christian König	80964226e9	implement NV_vdpau_interop v7 v2: Actually implement interop between the gallium state tracker and the VDPAU backend. v3: Make it also available in non legacy contexts, fix video buffer sharing. v4: deny interop if we don't have the same screen object v5: rebased on upstream changes v6: implemented VDPAUGetSurfaceivNV, improved error handling, unregister all surfaces in VDPAUFiniNV v7: squash merge with Marek's changes Signed-off-by: Christian König <christian.koenig@amd.com>	2013-10-26 12:13:36 +02:00
Christian König	3d3a0b9b67	winsys/radeon: make radeon_drm_winsys_create public Otherwise OpenGL/VDPAU interop won't work as expected. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-10-26 12:13:36 +02:00
Chris Forbes	598ca510b8	i965: Remove ir_txf coord+offset special case in visitors Just let it be handled by the lowering pass. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-26 22:56:27 +13:00
Chris Forbes	06de9f8ff1	i965: Generalize coord+offset lowering pass for ir_txf ir_txf expects an ivec* coordinate, and may be larger than ivec2; shuffle things around so that this will work. V2: Fix style nits, use ir_builder Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-26 22:56:25 +13:00
Chris Forbes	72b5e9c42a	i965: Add lowering pass to fold offset into unnormalized coords It turns out that nonzero offsets with gsampler2DRect don't work -- they just return garbage. Work around this by folding the offset into the coord. Done as an IR pass rather than yet another hack in the visitors because it's clear what's going on this way. Can possibly reuse this to replace the existing txf coord+offset hacks. V2: Use ir_builder Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-26 22:56:09 +13:00
Chris Forbes	a936000db6	i965: Add lowering pass for splitting textureGatherOffsets Rewrites textureGatherOffsets(s, p, offsets) into gvec4( textureGatherOffset(s, p, offsets[0]).w, textureGatherOffset(s, p, offsets[1]).w, textureGatherOffset(s, p, offsets[2]).w, textureGatherOffset(s, p, offsets[3]).w ) V2: Use ir_builder to be slightly clearer. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-26 22:28:26 +13:00
Chris Forbes	4c1eae5395	i965: Add asserts to ensure that ir_tg4 offset arrays are lowered We don't have a message that does 4 independent offsets; a lowering pass needs to lower it to 4 normal gather4s before reaching this point. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-26 22:28:05 +13:00
Chris Forbes	de8948a0b6	glsl: add signatures for textureGatherOffsets() Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-26 22:28:03 +13:00
Chris Forbes	a9de744a26	glsl: add support for texture functions with offset arrays This is needed for textureGatherOffsets() Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-26 22:27:37 +13:00
Chris Forbes	3c98d77460	i965/fs: Add support for shadow comparators with gather4 Note that gather4_po_c's parameters are too long for SIMD16. It might be worth emitting 2xSIMD8 messages in this case at some point. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-26 22:16:32 +13:00
Chris Forbes	32f898a71c	i965/vs: Add support for shadow comparators with gather4 gather4_c's argument layout is straightforward -- refz just goes on the end. gather4_po_c's layout differs, however -- the array index is replaced with refz. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-26 22:16:28 +13:00
Chris Forbes	070c841111	i965: Add Gen7 gather4_c and gather4_po_c message types Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-26 22:16:27 +13:00
Chris Forbes	43e3ae112f	glsl: Add new textureGather[Offset]() overloads for shadow samplers Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-26 22:16:24 +13:00
Chris Forbes	af1dfd99b7	glsl: Add support for separate reference Z for shadow samplers ARB_gpu_shader5's textureGather*() functions which take shadow samplers have a separate `refz` parameter rather than adding it to the coordinate. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-26 22:16:19 +13:00
Chris Forbes	fb08769bb6	i965/vs: add support for gather4 with nonconstant offsets Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2013-10-26 22:10:02 +13:00
Chris Forbes	938d909894	i965/fs: add support for gather4 with nonconstant offsets V3: fixup crazy check for whether we need to emit the coordinate after custom handling. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-26 22:08:51 +13:00
Chris Forbes	bdcacaed9c	i965: relax brw_texture_offset assert Some texturing ops are about to have nonconstant offset support; the offset in the header in these cases should be zero. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-26 21:54:15 +13:00
Chris Forbes	6bb2cf2107	i965: Add SHADER_OPCODE_TG4_OFFSET for gather with nonconstant offsets. The generator code ends up clearer this way than if we had to sniff via the message length. Implemented via the gather4_po message in hardware, which is present in Gen7 and later. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-26 21:54:15 +13:00
Chris Forbes	cd8505bfb8	i965: add missing tg4 case in brw_instruction_name Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-26 21:54:15 +13:00
Chris Forbes	4fa123deac	glsl: relax const offset requirement for textureGatherOffset Prior to ARB_gpu_shader5 / GLSL 4.0, the offset is required to be a constant expression. With that extension, it is relaxed to be an arbitrary expression. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-26 21:54:15 +13:00
Chris Forbes	00235402a0	glsl: Add ARB_gpu_shader5 textureGatherOffset signatures - gsampler2DRect - optional `comp` parameter Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-26 21:54:15 +13:00
Kenneth Graunke	d07d38e696	i965: Weaken the flushing in gen7_end_transform_feedback(). Since `062317d667` (i965: Go back to using the kernel SOL reset feature.) we've been flushing the batch on BeginTransformFeedback(). So it's not necessary to do it on EndTransformFeedback(). A PIPE_CONTROL will work. This makes gen7_end_transform_feedback() exactly the same as the gen6 variant. However, they'll diverge again shortly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-25 22:25:38 -07:00
Eric Anholt	93bd627d5a	i965/fs: Stop trying to hack around MRF dep chains on gen7+ LIFO scheduling. This was a hack to avoid choosing to schedule all texturing before consumption of any texture results due to the way dependency chains worked out in the presence of MRFs. On gen7, we don't have MRFs, so the problem doesn't apply, and this was just badly constraining our scheduling. total instructions in shared programs: 1615306 -> 1612534 (-0.17%) instructions in affected programs: 9958 -> 7186 (-27.84%) GAINED: 259 LOST: 9 Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-25 16:45:30 -07:00
Eric Anholt	c3c9a8c857	i965: Try not to reverse-schedule things when doing LIFO scheduling. The LIFO plan was simple: Take the most recently made available instructions, and pick those first. But because of the order we were pushing things onto our list of available-to-schedule instructions, it meant that when a set of instructions was made available at the same time (for example, everything at the start of the program that didn't depend on other instructions) we'd schedule them in reverse order. If you had 10 texture calls in a row in your program, each with independent argument setup, we'd set up the last texture call's args and execute it first, even though we wouldn't be able to consume its results until we'd finished the other 9 texture calls (assuming consumption of texture results happens near each texture call, and combines it with another texture result, which is normal for a convolution shader). To fix this, walk the list for doing LIFO in the order that instructions were originally generated in the program, but choose to push newly-made-available instructions to the other end of the list instead. total instructions in shared programs: 1587242 -> 1586290 (-0.06%) instructions in affected programs: 7801 -> 6849 (-12.20%) GAINED: 76 LOST: 67 Thanks to Chia-I Wu for pointing out the bug in my first version of the patch that made it a huge loss. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-25 16:45:30 -07:00
Ilia Mirkin	a7ce1fef27	mesa/st: disable ARB_framebuffer_object when no driver support. When PIPE_CAP_MIXED_FRAMEBUFFER_SIZES is not provided, parts of ARB_framebuffer_object can't be supported, such as on NV30. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-10-26 01:36:07 +02:00
Ilia Mirkin	12d39b4fa8	gallium: add PIPE_CAP_MIXED_FRAMEBUFFER_SIZES This CAP will determine whether ARB_framebuffer_object can be enabled. The nv30 driver does not allow mixing swizzled and linear zsbuf/cbuf textures. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-10-26 01:36:07 +02:00
Adam Jackson	1090eb5755	glx: Fix return value from indirect_bind_context _XReply returns 1 on success, but indirect_bind_context returns 0 on success. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70486 Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2013-10-25 16:49:28 -04:00
Matt Turner	64c081e8b7	glsl: Optimize (not A) and (not B) into not (A or B). No shader-db changes, but seems like a good idea. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-25 10:35:18 -07:00
Matt Turner	65a600f58a	glsl: Optimize (not A) or (not B) into not (A and B). A few Serious Sam 3 shaders affected: instructions in affected programs: 4384 -> 4344 (-0.91%) Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-25 10:35:13 -07:00
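Both rewrites in the two commits above are De Morgan's laws, and each replaces three logic operations with two. A quick exhaustive check over boolean inputs, written as a standalone sketch rather than anything from the GLSL optimizer (the function name is made up for illustration):

```python
from itertools import product

def demorgan_rewrites_are_equivalent():
    """Check both rewrites for every combination of boolean inputs."""
    for a, b in product([False, True], repeat=2):
        # (not A) and (not B)  ->  not (A or B)
        if ((not a) and (not b)) != (not (a or b)):
            return False
        # (not A) or (not B)   ->  not (A and B)
        if ((not a) or (not b)) != (not (a and b)):
            return False
    return True
```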
Matt Turner	e52959e961	i965/fs: Match commutative expressions with reversed arguments. total instructions in shared programs: 1645011 -> 1644938 (-0.00%) instructions in affected programs: 17543 -> 17470 (-0.42%) Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-25 10:34:02 -07:00
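The idea of matching commutative expressions with reversed arguments can be sketched as follows. This is an illustration only: the tuple representation, the `operands_match` name and the `COMMUTATIVE_OPS` set are assumptions and do not reflect the i965 code.

```python
# Assumed set of commutative opcodes, for illustration only.
COMMUTATIVE_OPS = {"add", "mul", "and", "or", "xor"}

def operands_match(a, b):
    """Return True if two (opcode, src0, src1) expressions compute the same value."""
    op_a, x_a, y_a = a
    op_b, x_b, y_b = b
    if op_a != op_b:
        return False
    if (x_a, y_a) == (x_b, y_b):
        return True
    # For commutative opcodes, also accept the sources in swapped order.
    return op_a in COMMUTATIVE_OPS and (x_a, y_a) == (y_b, x_b)
```

With this check, `("add", "r1", "r2")` and `("add", "r2", "r1")` are treated as the same expression, while a swapped `"sub"` is not.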
Matt Turner	503fe278b0	i965: s/Muchnik/Muchnick/. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-25 10:34:02 -07:00
Marek Olšák	9807556e86	r600g,radeonsi: use fences provided by the winsys	2013-10-25 11:55:55 +02:00
Marek Olšák	6067a30838	winsys/radeon: add the implementation of fences from r300g	2013-10-25 11:55:55 +02:00
Marek Olšák	48784f3591	radeonsi: add the vertex shader position output if it's missing This fixes a lockup in piglit/spec/glsl-1.40/execution/tf-no-position. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-10-25 11:55:55 +02:00
Marek Olšák	94715130e6	radeonsi: respect semantic indices for COLOR[i] fragment shader outputs Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-10-25 11:55:55 +02:00
Paul Berry	e8f6f244bb	glsl: When disabling gl_PerVertex variables, check that mode matches. In commit `1b4a737` (glsl: Support redeclaration of VS and GS gl_PerVertex output), I added code to ensure that when an unnamed gl_PerVertex interface block is redeclared, any ir_variables that weren't included in the redeclaration are removed from the IR (and the symbol table). This ensures that only those variables that were explicitly redeclared may be used. However, when I wrote this code, I neglected to match the variable mode when finding variables to remove. This meant that redeclaring a built-in output block might cause the built-in input gl_in to be accidentally removed. Fixes piglit test gs-redeclares-pervertex-out-only. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 22:01:30 -07:00
Paul Berry	719bf30165	glsl: Remove unused gl_PerVertex interface blocks. The GLSL 4.10 rules for redeclaration of built-in interface blocks (which we've chosen to regard as clarifications of GLSL 1.50) only require gl_PerVertex blocks to match in shaders that actually use those blocks. The easiest way to implement this is to detect situations where a compiled shader doesn't refer to any elements of gl_PerVertex, and remove all the associated ir_variables from the shader at the end of ast-to-ir conversion. Fixes piglit tests linker/interstage-{pervertex,pervertex-in,pervertex-out}-redeclaration-unneeded. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 22:01:27 -07:00
Paul Berry	37d97668ae	glsl: Call check_builtin_array_max_size when redeclaring gl_in. Normally when a built-in array (such as gl_ClipDistance) is redeclared, we call get_variable_being_redeclared() to do the redeclaration, and it in turn calls check_builtin_array_max_size() to make sure that the redeclared array size isn't too large. However when a built-in array is redeclared as part of redeclaring gl_in, we don't call get_variable_being_redeclared() (since the individual built-ins aren't each represented by their own ir_variable anymore). So we need to add an explicit call to check_builtin_array_max_size() to make sure the new array size isn't too large. Note: at the moment this is redundant with a test that's done at link time, so there's no change to piglit results. But the patch that follows will prevent link errors from being reported if gl_PerVertex isn't used, so in order to prevent that patch from causing regressions, we need to add the compile check now. Besides, it's nicer to report this error at compile time anyhow. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 22:01:24 -07:00
Paul Berry	156b31c5be	mesa: Fix geometry shader program queries. The queries GEOMETRY_VERTICES_OUT, GEOMETRY_INPUT_TYPE, and GEOMETRY_OUTPUT_TYPE (defined by GL 3.2) differ from the corresponding queries in ARB_geometry_shader4 in the following ways: - They use different enum values - They can only be queried; they cannot be set. - Attempting to query them yields INVALID_OPERATION if the program is not linked, or lacks a geometry shader. This patch switches us over from the ARB_geometry_shader4 behaviour to the GL 3.2 behaviour. Fixes piglit test query-gs-prim-types. v2: Improve comment above has_core_gs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 22:01:22 -07:00
Paul Berry	a49830b8f5	glsl: Account for interface block lowering in program_resource_visitor. When program_resource_visitor visits variables that were created by lower_named_interface_blocks, it needs to do extra work to un-do the effects of lower_named_interface_blocks and construct the proper API names. Fixes piglit test spec/glsl-1.50/execution/interface-blocks-api-access-members. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 22:01:19 -07:00
Paul Berry	4b97c581b4	glsl: mark variables produced by lower_named_interface_blocks. These variables will need to be treated specially by program_resource_visitor, so that they can be addressed through the API using their interface block name (and array index, for interface block arrays). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 22:01:14 -07:00
Paul Berry	99512dc40d	glsl: Keep track of centroid/interpolation mode for interface block members. Fixes piglit tests: - interface-block-interpolation-{array,named,unnamed} - glsl-1.50-interface-block-centroid {array,named,unnamed} Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 22:01:10 -07:00
Paul Berry	e17d671d9f	glsl: Pass variable mode into ast_process_structure_or_interface_block(). Later patches will use this information to do proper error checking of interpolation qualifiers that appear inside of interface blocks. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 22:01:07 -07:00
Paul Berry	81a5067966	glsl: Extract interpretation of interpolation to its own function. In future patches, we will need this in order to interpret interpolation qualifiers that appear inside interface blocks. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 22:01:04 -07:00
Paul Berry	f65feb5335	glsl: Pull interpolation_string() out of ir_variable. Future patches will need to call this function when there isn't an ir_variable present to refer to. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 22:00:59 -07:00
Paul Berry	1e3e72e305	i965: Reduce gl_MaxGeometryInputComponents to 64. Although in principle there is no hardware limitation that prevents gl_MaxGeometryInputComponents from being set to 128 on Gen7, we have the following limitations in the vec4 compiler back end: - Registers assigned to geometry shader inputs can't be spilled or later re-used for any other purpose. - The last 16 registers are set aside for the "MRF hack", meaning they can only be used to send messages, and not for general purpose computation. - Up to 32 registers may be reserved for push constants, even if there is sufficient register pressure to make this impractical. A shader using 128 geometry input components, and having an input type of triangles_adjacency, would use up: - 1 register for r0 (which holds URB handles and various pieces of control information). - 1 register for gl_PrimitiveID. - 102 registers for geometry shader inputs (17 registers per input vertex, assuming DUAL_INSTANCED dispatch mode and allowing for one register of overhead for gl_Position and gl_PointSize, which are present in the URB map even if they are not used). - Up to 32 registers for push constants. - 16 registers for the "MRF hack". That's a total of 152 registers, which is well over the 128 registers the hardware supports. Fortunately, the GLSL 1.50 spec allows us to reduce gl_MaxGeometryInputComponents to 64. Doing that frees up 48 registers, bringing the total down to 104 registers, leaving 24 registers available to do computation. Fixes piglit test spec/glsl-1.50/execution/geometry/max-input-components. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 22:00:57 -07:00
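The register accounting in this commit message can be checked with a short calculation. This is a sketch, not driver code: `gs_register_total` and its parameters are hypothetical. It assumes four components per vec4 input and, following the interleaved DUAL_INSTANCED layout described in commit 8bb15813e3, two vec4 inputs per payload register.

```python
def gs_register_total(max_input_components, verts_per_prim=6,
                      push_const_regs=32, mrf_regs=16):
    """Worst-case register use for a triangles_adjacency geometry shader."""
    vec4s_per_vertex = max_input_components // 4
    # Two vec4 inputs share one register (assumed interleaved layout),
    # plus one register of overhead for gl_Position and gl_PointSize.
    regs_per_vertex = vec4s_per_vertex // 2 + 1
    input_regs = regs_per_vertex * verts_per_prim
    # r0 + gl_PrimitiveID + inputs + push constants + "MRF hack" registers.
    return 1 + 1 + input_regs + push_const_regs + mrf_regs

HW_REGISTERS = 128
total_128 = gs_register_total(128)  # 17 registers per vertex, 102 for inputs
total_64 = gs_register_total(64)    # 9 registers per vertex, 54 for inputs
free_for_compute = HW_REGISTERS - total_64
```

Under these assumptions the totals come out to 152 and 104 registers, leaving 24 for computation, which agrees with the figures quoted in the message.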
Paul Berry	3c2feb1969	i965/gs: If a DUAL_OBJECT gs would spill, fall back to DUAL_INSTANCED. This is similar to what we do for 16-wide vs 8-wide fragment shaders. First we try compiling the geometry shader in DUAL_OBJECT mode. If we can't do that without spilling, we fall back on DUAL_INSTANCED mode, which should require less spilling (since it uses an interleaved layout of payload registers). In an ideal world we'd fall back to SINGLE mode, which would allow us to interleave general-purpose registers too (resulting in even less likelihood of spilling). But at the moment, the vec4 generator and visitor classes don't have the infrastructure to interleave general purpose registers, so DUAL_INSTANCED is the best we can do. As a side benefit this paves the way for implementing instanced geometry shaders (which are incompatible with DUAL_OBJECT mode). Since most geometry shaders used in piglit testing are small, DUAL_INSTANCED mode won't get exercised very much in a normal piglit run. To force DUAL_INSTANCED mode to be used for all geometry shaders, set INTEL_DEBUG=nodualobj. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-24 22:00:53 -07:00
Paul Berry	03ac2c7223	i965/gs: Fix up gl_PointSize input swizzling for DUAL_INSTANCED gs. Geometry shaders that run in "DUAL_INSTANCED" mode store their inputs in vec4's. This means that when compiling gl_PointSize input swizzling (a MOV instruction which uses a geometry shader input as both source and destination), we need to do two things: - Set force_writemask_all to ensure that the MOV happens regardless of which channels are enabled. - Set the source register region to <4;4,1> (instead of <0;4,1>) to satisfy register region restrictions. v2: move the source register region fixup to the top of vec4_generator::generate_vec4_instruction(), so that it applies to all instructions rather than just MOV. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-24 22:00:50 -07:00
Paul Berry	a05589ea0b	i965/gs: Add the ability to compile a DUAL_INSTANCED geometry shader. Not yet enabled. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-24 22:00:46 -07:00
Paul Berry	34cba13ef8	i965/vec4: Add the ability to suppress register spilling. In future patches, this will allow us to first try compiling a geometry shader in DUAL_OBJECT mode (which is more efficient but uses more registers) and then if spilling is required, fall back on DUAL_INSTANCED mode. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-24 22:00:43 -07:00
Paul Berry	89647cffb3	i965/vec4: if register allocation fails, don't try to schedule. Otherwise the scheduler would be invoked with prog_data->total_grf == 0, causing havoc. In a future patch, this will allow us to try compiling a geometry shader in DUAL_OBJECT mode with spilling disabled, and then fall back to DUAL_INSTANCED mode if that failed. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-24 22:00:40 -07:00
Paul Berry	8bb15813e3	i965/vec4: Add the ability for attributes to be interleaved. When geometry shaders are operated in "single" or "dual instanced" mode, a single set of geometry shader inputs is interleaved into the thread payload (with each payload register containing a pair of inputs) in order to save register space. This patch modifies vec4_visitor::lower_attributes_to_hw_regs so that it can handle the interleaved format. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-24 22:00:37 -07:00
Paul Berry	3da2c5123d	i965/gs: Set force_writemask_all when setting up g0. All geometry shaders begin with this instruction: mov(1) g0.2<1>:ud 0x0:ud { align1 } which sets up GRF0 properly for scratch reads and writes. Since this instruction has a SIMD size of 1, it will only have an effect if the first channel is enabled. In practice, the hardware seems to always dispatch geometry shaders with the first channel enabled, but I can't find anything in the docs to guarantee that. So to be on the safe side, set force_writemask_all on the instruction, which guarantees that it will have the desired effect regardless of which channels are enabled. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-24 22:00:37 -07:00
Paul Berry	172aec281d	glsl: set explicit_location correctly in lower_named_interface_blocks. When lower_named_interface_blocks lowers a built-in interface block member to an ir_variable, it needs to set explicit_location in the ir_variable. Otherwise the linker gets confused and treats the variable as a generic varying. Fixes the following piglit tests, which were regressed by commit `63974c0` (glsl: Simplify the interface to link_invalidate_variable_locations): - clip-distance-bulk-copy - clip-distance-in-bulk-read - clip-distance-in-explicitly-sized - clip-distance-in-param - clip-distance-in-values - core-inputs - gs-redeclares-both-pervertex-blocks - gs-redeclares-pervertex-in-only - redeclare-pervertex-subset-vs-to-gs - unsized-in-named-interface-block-gs - unsized-in-named-interface-block-multiple - unsized-in-unnamed-interface-block-gs - unsized-in-unnamed-interface-block-multiple Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70820 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 22:00:32 -07:00
Paul Berry	85db1326a2	i965/gs: Precompile geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 22:00:28 -07:00
Paul Berry	e0f34301b2	i965/vec4: Extract function to set up vec4 prog key for precompiling. This will allow us to re-use it for precompiling geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 22:00:25 -07:00
Paul Berry	068df64ba6	i965/vec4: Remove uses_clip_distance from program key. This should never have been in the program key in the first place, since it's determined by the shader source, not by GL state. Change the code to just refer to gl_program::UsesClipDistanceOut directly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 22:00:22 -07:00
Paul Berry	11634e491b	glsl: Move UsesClipDistance from gl_{vertex,geometry}_program into gl_program. This will make it easier for back-ends to share code between geometry shader and vertex shader compilation. Also, it is renamed to "UsesClipDistanceOut" to clarify that (a) in geometry shaders, it refers to the gl_ClipDistance output rather than the gl_ClipDistance input, and (b) it is irrelevant in fragment shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 22:00:13 -07:00
Paul Berry	44b7ebe52d	glsl/gs: Fix transform feedback of gl_ClipDistance. Since gl_ClipDistance is lowered from an array of floats to an array of vec4's during compilation, transform feedback has special logic to keep track of the pre-lowered array size so that attempting to perform transform feedback on gl_ClipDistance produces a result with the correct size. Previously, this special logic always consulted the vertex shader's size for gl_ClipDistance. This patch fixes it so that it uses the geometry shader's size for gl_ClipDistance when a geometry shader is in use. Fixes piglit test spec/glsl-1.50/transform-feedback-type-and-size. v2: Change the type of LastClipDistanceArraySize to "unsigned", and clarify the comment above it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 21:59:39 -07:00
Paul Berry	fe36154ff3	i965: Fix gl_MaxCombinedTextureImageUnits. We've always overridden ctx->Const.{Vertex,Fragment}Program.MaxTextureImageUnits to reflect the number of texture image units supported by the hardware (rather than using the default values assigned by Mesa core) so it seems sensible to do that for GeometryProgram.MaxTextureImageUnits too. We set it to 0 if geometry shaders aren't supported. Once that is done, we can just unconditionally add GeometryProgram.MaxTextureImageUnits to MaxCombinedTextureImageUnits. Fixes piglit test "spec/glsl-1.50/built-in constants/gl_MaxCombinedTextureImageUnits". Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-24 21:14:26 -07:00
Rob Clark	a453242fda	freedreno/a3xx/compiler: relative addressing Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-10-24 20:21:08 -04:00
Rob Clark	4317c4e6e0	freedreno/a3xx: fix const/rel/const-rel encoding The encoding of constant, relative, and relative-const src registers is a bit more complex than originally thought, which gives an extra bit to encode const reg # at expense of taking a bit from relative offset. In most cases a3xx seems to actually use a scheme whereby it can encode an extra bit for const register. You have three possible encodings in thirteen bits: register: (11 bits for N.c) 00........... rN.c relative: (10 bits for N) 010.......... r<a0.x + N> 011.......... c<a0.x + N> const: (12 bits for N.c) 1............ cN.c Which means we can deal w/ more consts than previously thought. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-10-24 20:21:08 -04:00
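The three encodings listed in this commit message can be told apart by their leading bits. The decoder below is a sketch built only from the bit patterns quoted above; `decode_src` and the returned tags are hypothetical names, and the payload is returned as a raw unsigned value because the message does not say how N or the component are packed within it.

```python
def decode_src(field):
    """Classify a 13-bit a3xx source-register field by its prefix bits."""
    if not 0 <= field < (1 << 13):
        raise ValueError("expected a 13-bit value")
    if field >> 12 == 1:
        # 1............  const cN.c, 12 payload bits
        return ("const", field & 0xFFF)
    if field >> 11 == 0:
        # 00...........  register rN.c, 11 payload bits
        return ("reg", field & 0x7FF)
    # 01x..........  relative, 10 payload bits; bit 10 selects r or c
    kind = "rel_const" if (field >> 10) & 1 else "rel_reg"
    return (kind, field & 0x3FF)
```

For example, `decode_src((0b011 << 10) | 7)` is classified as a relative const access with payload 7.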
Rob Clark	bfd30935c9	freedreno/a3xx: add blend state Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-10-24 20:21:08 -04:00
Rob Clark	0a1e4361e8	freedreno/resource: fail more gracefully Fail more gracefully when buffer allocation/import fails. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-10-24 20:21:08 -04:00
Roland Scheidegger	2b2fc03beb	gallivm: implement fully accurate corner filtering for seamless cube maps d3d10 requires that cube corners are filtered with accurate weights (that is, the weight of the non-existing corner texel should be evenly distributed to the other 3 texels). OpenGL does not require this (but recommends it). This requires us to use different filtering code, since we need per-texel weights which our 2d lerp doesn't (and can't) do. And of course the (now per element) weights need to be adjusted too for it to work. Invoke the new filtering code whenever there's an edge to keep things simpler, as it will work for edges too not just corners but of course it's only needed with corners. More ugly code for not much gain but at least a hacked up cubemap demo shows very nice corners now... Not sure yet if and how this should be configurable... v2: incorporate feedback from Jose, only use special corner filtering code when there's a corner not when there's only an edge (as corner filtering code is slower, though a perf difference was only measurable when always forcing edge code). Plus some minor style fixes. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-10-25 01:29:14 +02:00
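The weight redistribution described in this commit can be illustrated with scalar arithmetic. This is a minimal sketch, not the gallivm implementation; `corner_weights`, the texel ordering and the `missing` index are assumptions made for the example.

```python
def corner_weights(s, t, missing):
    """Per-texel bilinear weights with one non-existing corner texel.

    s, t are the fractional filter coordinates in [0, 1]; texels are
    ordered (0,0), (1,0), (0,1), (1,1); missing is the index of the
    texel that does not exist at a cube corner.
    """
    weights = [(1 - s) * (1 - t), s * (1 - t), (1 - s) * t, s * t]
    # Spread the missing texel's weight evenly over the other three.
    share = weights[missing] / 3.0
    return [0.0 if i == missing else w + share
            for i, w in enumerate(weights)]
```

The adjusted weights still sum to 1, which a plain 2D lerp with a single pair of fractions cannot express once the texels need individual weights.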
Eric Anholt	dde9260fdc	mesa: Remove dricore from the build. No driver uses it any more, and it's been replaced by megadrivers. v2: Remove always-on conditional for NEED_LIBPROGRAM (review by Emil) Reviewed-by: Matt Turner <mattst88@gmail.com> (v1) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-24 14:13:09 -07:00
Eric Anholt	bdcee13ca3	swrast: Build the driver into the shared mesa_dri_drivers.so. v2: drop dridir now that it's unused. v3: Fix linking after rebase when building just swrast from classic but a drm-using gallium driver. v4: Consistently put spaces around += in the updated Makefile.am block. v5: Set a global driverAPI variable so loaders don't have to update to createNewScreen2() (though they may want to for thread safety). Reviewed-by: Matt Turner <mattst88@gmail.com> (v3) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-24 14:13:09 -07:00
Eric Anholt	86d50c2f15	radeon: Build the driver into the shared mesa_dri_drivers.so. This required some reordering of headers to ensure that the symbol name redefines happened before any prototypes. v2: drop dridir now that it's unused. v3: Consistently put spaces around += in the updated Makefile.am blocks. v4: Set a global driverAPI variable so loaders don't have to update to createNewScreen2() (though they may want to for thread safety). Reviewed-by: Matt Turner <mattst88@gmail.com> (v2) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-24 14:13:09 -07:00
Eric Anholt	6665b71b22	i915: Build the driver into the shared mesa_dri_drivers.so. i915 has symbols for formerly-shared code that conflict with i965, so we define them away using gen-symbol-redefs.py. Options considered: - This option. Downsides: The symbols in profiling and debugging don't match the source. The symbol list may change in the future and we won't notice without manually running the tool again. - Use objcopy --localize-hidden to automatically demote our symbols to locals. This didn't work on i965 due to c++ weak symbols (which can't be localized), but could work on i915. We could do it on i915 only, but it does produce libtool warnings at link time due to libtool not knowing if the resulting .o file is safe to link (stupid libtool). Plus you end up with different symbols of the same name, which is confusing for debugging too. On the other hand, no future symbol conflicts long term. - Write our own libelf tool that handles c++ weak symbols like we want and apply it to all drivers. All the downsides of above, but applies uniformly across drivers. - Edit the files to just rename all the i915 or i965 symbols that conflict. There are on the order of 100 that have a prefix we used to share, so it would take a bit of typing. Fewest downsides, but still can have conflicts long term. Ultimately, this is the least invasive change at the moment, and we can see if the "more symbol conflicts appear later" thing is a real concern or not. Note that the ability to compile a version of i915 without INTEL_DEBUG env support is dropped. It's too useful. v2: drop dridir now that it's unused. v3: Consistently put spaces around += in the updated Makefile.am block. v4: Set a global driverAPI variable so loaders don't have to update to createNewScreen2() (though they may want to for thread safety). Reviewed-by: Matt Turner <mattst88@gmail.com> (v2) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-24 14:13:09 -07:00
Eric Anholt	ba10d79cca	dri: Add a tool for generating #defines to namespace driver global symbols. Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-24 14:13:09 -07:00
Eric Anholt	ead86e378f	nouveau: Build the driver into the shared mesa_dri_drivers.so. v2: drop dridir now that it's unused. v3: Consistently put spaces around += in the updated Makefile.am block. v4: Set a global driverAPI variable so loaders don't have to update to createNewScreen2() (though they may want to for thread safety). v5: Fix missed public symbol in nouveau. (caught by Emil) Reviewed-by: Matt Turner <mattst88@gmail.com> (v2) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-24 14:13:08 -07:00
Eric Anholt	1925a9aebd	i965: Build the driver into a shared mesa_dri_drivers.so . Previously, we've split things such that mesa core is in libdricore, exposing the whole Mesa core interface in the global namespace, and the i965_dri.so code all links against that. Along with polluting application namespace terribly, it requires extra PLT indirections and prevents LTO. Instead, we can build all of the driver contents into the same .so with just a few symbols exposed to be referenced from the actual driver .so file, allowing LTO and reducing our exposed symbol count massively. FPS improvement on GLB2.7 with INTEL_NO_HW=1: 2.61061% +/- 1.16957% (n=50) (without LTO, just the PLT reductions from this commit) Note that the X Server requires commit 7ecfab47eb221dbb996ea6c033348b8eceaeb893 to successfully load this driver! v2: Set a global driverAPI variable so loaders don't have to update to createNewScreen2() (though they may want to for thread safety). v3: Drop AM_CPPFLAGS addition (Emil pointed out I'd missed some cflags that would be necessary, though only if we actually relied on them). v4: Fix install with DESTDIR set. Reviewed-by: Matt Turner <mattst88@gmail.com> (v1) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> (v2)	2013-10-24 14:12:58 -07:00
Eric Anholt	4e54751624	dri: Implement a DRI vtable extension to replace the global driDriverAPI. As we move to megadrivers, we are unable to build multiple drivers with the same public global symbol per driver (Think an X Server with an intel and a nouveau driver, and the X Server implementing indirect for both -- we have to actually talk to the right driver). By slipping the driDriverAPI vtable into the driver's extension list, we can replace the usage of the global symbol with usage of the loader-dlsym()ed driver information. v2: Pull in the hunk to avoid crashing on null driver_extensions. Thanks, Emil! Reviewed-by: Matt Turner <mattst88@gmail.com> (v1) Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-24 14:04:20 -07:00
Eric Anholt	f93533d118	dri: Pass in the dlsym()ed driver extension to screen creation. This will allow a megadrivers build to reference the actual driver being loaded from the shared dri_util screen creation code. v2: Fix indentation, fallback case in EGL (review by Emil). Reviewed-by: Matt Turner <mattst88@gmail.com> (v1) Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v1) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-24 14:04:20 -07:00
Eric Anholt	67caf36489	gbm: Add support for the new __driDriverGetExtensions interface. v2: Fix uninitialized variable use in the old-ABI case. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v1) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-24 14:04:20 -07:00
Eric Anholt	a64bb7553a	egl: Add an optional function call for getting the DRI driver interface. v2: Fix asprintf error checking. Reviewed-by: Matt Turner <mattst88@gmail.com> (v1) Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-24 14:04:20 -07:00
Eric Anholt	fcb57a8210	glx: Add an optional function call for getting the DRI driver interface. The previous interface relied on a static struct, which meant that the driver didn't get a chance to edit the struct before the struct got used. For megadrivers, I want struct specific to the driver being loaded. v2: Fix the prototype in the docs (caught by Marek). Since the driver name was in the function, we didn't need to also pass it in. v3: Fix asprintf error checking (caught by Matt's gcc). Reviewed-by: Matt Turner <mattst88@gmail.com> (v1) Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-24 14:04:20 -07:00
Eric Anholt	6868923702	dri: Move driver config options to dri driver extensions. This way they aren't all sitting in the global namespace (with the same name per driver). Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-24 14:04:20 -07:00
Eric Anholt	cf5d8fc310	dri: Allow config options to be passed to the loader through extensions. Turns out we already have this nice mechanism for providing optional things from the driver to the loader, and I was going to have to rename the public global symbol to avoid conflicts when doing megadrivers. While the former __driConfigOptions is technically loader interface, this is the only loader that made use of that symbol. Continue paying attention to it if we can't find the new option, to retain compatibility with old drivers. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-24 14:04:20 -07:00
Eric Anholt	80806c98ef	glx: Move the driver extension-loading to a helper function. I'm planning on doing driver extension parsing from 3 places, and making the extension loading step a bit longer. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-24 14:04:20 -07:00
Francisco Jerez	7463abd37d	clover: Query maximum kernel block size from the device instead of the kernel object. Based on a similar fix from Aaron Watry. It seems unlikely that we will ever need a kernel-specific setting for this, and the Gallium API doesn't support it. Remove kernel::max_block_size() altogether.	2013-10-24 13:33:41 -07:00
Brian Paul	b8d7a97fad	glsl: silence unused 'var' variable warning Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-24 10:45:47 -06:00
Brian Paul	8d7b913e4e	svga: remove user-space vertex/index buffer code The gallium vbuf module, which we've been using for some time now, takes care of uploading user-space vertex/index data into real buffers. The upload code in the svga driver was unused. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-10-24 10:45:47 -06:00
Chad Versace	2f6a315085	i965: Print more debuginfo in intel_texsubimage_memcpy() Print info about packing, format, type, and tiling. This will help debug future issues with this fastpath. Reviewed-by: Frank Henigman <fjhenigman@google.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-10-24 09:25:45 -07:00
Chad Versace	c4205590e7	i965: Fix glTexImage when packing alignment != cpp Fixes texture corruption of Weston clients on cairo-glesv2 backend. Commit 49ed599 introduced the bug. Corruption occurred when glTexSubImage called intel_texsubimage_tiled_memcpy() with: x,y=10,9 w,h=7,7 format=GL_ALPHA(0x1906) type=GL_UNSIGNED_BYTE(0x1401) gl_format=MESA_FORMAT_A8(0x18) packing.alignment=4 The function miscalculated the source image's stride as w*cpp=7 without taking into account the packing alignment. The actual stride was 8. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70435 Reported-by: U. Artie Eoff <ullysses.a.eoff@intel.com> Tested-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Frank Henigman <fjhenigman@google.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-10-24 09:25:24 -07:00
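The stride mismatch in this report comes down to rounding each source row up to the packing alignment. The sketch below is not the Mesa fix; `src_row_stride` is a hypothetical helper, and it assumes a tightly specified image with no custom row length, where each row is padded to a multiple of the alignment in bytes.

```python
def src_row_stride(width, cpp, alignment):
    """Bytes from the start of one source row to the start of the next."""
    row_bytes = width * cpp
    # Round up to the next multiple of the packing alignment.
    return (row_bytes + alignment - 1) // alignment * alignment

naive_stride = 7 * 1                      # w*cpp, as the buggy path computed
actual_stride = src_row_stride(7, 1, 4)   # padded to the 4-byte alignment
```

With the values from the report (w=7, cpp=1, alignment=4), the padded stride is 8 bytes rather than 7, so reading rows 7 bytes apart drifts by one byte per row.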
Rob Clark	a6e45b6a17	freedreno: fix compile error Small typo introduced in `a3ed98f`. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-10-23 18:38:05 -06:00
Paul Berry	4df56177ed	i965/fs: Only unroll high-accuracy dFdy() from SIMD16 to SIMD8 on gen4 and IVB. In commit `800610f` (i965/fs: Improve accuracy of dFdy() to match dFdx()) I unrolled the high-accuracy dFdy() computation from a single SIMD16 instruction to two SIMD8 instructions because of text I found in the i965 (gen4) PRM saying that instruction compression could not be used in align16 mode. I couldn't find similar text in later hardware docs, and I observed problems trying to use instruction compression on align16 mode on Ivy Bridge, so I assumed that the restriction still applied and the associated documentation had simply been lost. After consultation with the hardware engineers, it turns out this is not the case. In point of fact, the restriction was dropped in gen5, re-introduced in Ivy Bridge, and dropped again in Haswell. The reason I didn't notice this is that in the Ivy Bridge documentation, the restriction was in a different section, and described using different language. Now that we know that the restriction only applies to Gen4 and Ivy Bridge, we can limit the unrolling to those platforms. Tested on gen5, gen6, and gen7 (both Ivy Bridge and Haswell). Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-23 16:51:15 -07:00
Paul Berry	8e15207c9d	glsl/gs: Prevent illegal input/output primitive types. From the GLSL 1.50 spec, section 4.3.8.1 (Input Layout Qualifiers): The layout qualifier identifiers for geometry shader inputs are layout-qualifier-id points lines lines_adjacency triangles triangles_adjacency And from section 4.3.8.2 (Output Layout Qualifiers) The layout qualifier identifiers for geometry shader outputs are layout-qualifier-id points line_strip triangle_strip max_vertices = integer-constant We were erroneously allowing line_strip and triangle_strip to be used as input qualifiers, and we were allowing lines, lines_adjacency, triangles, and triangles_adjacency to be used as output qualifiers. Fixes piglit tests "glsl-1.50-gs-{input,output}-layout-qualifiers *". Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-23 16:51:05 -07:00
Eric Anholt	867d0cc1fe	i965: Add perf debug hint when the app makes us do index buffer scanning. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-23 15:33:46 -07:00
Eric Anholt	c298f5ff56	i965: Try to avoid stalls on the GPU when doing glBufferSubData(). On DOTA2, framerate on dota2-de1.dem in windowed mode on my laptop improves by 7.69854% +/- 0.909163% (n=3). In a microbenchmark hitting this code path (wall time of piglit vbo-subdata-many), runtime decreases from 0.8 to 0.05 seconds. v2: Use out of range start/end instead of separate bool for the active flag (suggestion by Jordan), fix double-upload in the stalling path. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-23 15:33:19 -07:00
Eric Anholt	3b58e0ed64	i965: Be sure to reset brw->vb.buffers[] when trying to redo vertex setup. The brw_prepare_vertices that sets up buffers[] depends on these parameters, so don't let brw_prepare_vertices() skip it. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-23 15:33:16 -07:00
Eric Anholt	a5e2e7f9a4	i965: Add support for GL_ARB_texture_buffer_range. Supporting this extension turns out to simplify our code a bit over not supporting this extension, once the glBufferSubData() synchronization code lands. v2: Use 16 byte alignment like we do for uniform buffers, due to unaligned access penalties. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)	2013-10-23 15:33:10 -07:00
Eric Anholt	b37f7e0160	i965: Add a note about the late-allocation in intel_bufferobj_buffer(). This was mostly for the i915 system-memory VBO code, which we don't have any more, but since that existed we've ended up producing dependencies on it being there. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-23 15:33:06 -07:00
Eric Anholt	060a49a896	i965: Drop intel_bufferobj_source(). Since src_offset was always 0, it wasn't doing anything for us beyond intel_bufferobj_buffer(). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-23 15:33:03 -07:00
Eric Anholt	c0a9436d19	i965: Fix texture buffer rendering after a whole buffer replacement. If glBufferData(), glBufferSubData(0, obj->Size), or similar happens, we get a new drm_intel_bo for the buffer object, and thus need to re-upload texture buffer state so we point at the new data. Fixes the new piglit GL_ARB_texture_buffer_object/data-sync Cc: "9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-23 15:31:44 -07:00
David Heidelberger	2901e2efcd	clover: fix build after `a3ed98f7aa`	2013-10-23 13:13:36 -07:00
Brian Paul	c1345720c8	nv50: clamp PIPE_SHADER_CAP_MAX_TEXTURE_SAMPLERS to PIPE_MAX_SAMPLERS Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70212 Tested-by: Aaron Watry <awatry@gmail.com>	2013-10-23 13:43:18 -06:00
Brian Paul	ef98e2ee61	radeonsi: remove unused si_set_cs_sampler_view() Fixes build breakage. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70804 Tested-by: Vinson Lee <vlee@freedesktop.org>	2013-10-23 13:42:51 -06:00
Brian Paul	a3ed98f7aa	gallium: new, unified pipe_context::set_sampler_views() function The new function replaces four old functions: set_fragment/vertex/ geometry/compute_sampler_views(). Note: at this time, it's expected that the 'start' parameter will always be zero. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-23 10:15:38 -06:00
Brian Paul	b11fc226e6	svga: remove unneeded include of u_double_list.h	2013-10-23 10:15:38 -06:00
Kenneth Graunke	30bb170479	i965: Expose write_reg() as brw_store_register_mem64(). Writing a 64-bit register value to memory is sufficiently complicated that it makes sense to reuse this function rather than duplicating it. Exposing it outside of gen6_queryobj.c means it needs a more descriptive function name. It could probably be moved to brw_util.c or somewhere else, but this works too. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-23 01:06:26 -07:00
Kenneth Graunke	d5db3ece0a	i965: Move flushing out of write_reg and into the callers. The current callers just want to write a single register, so combining the register read with a pipeline flush made sense. However, in the future we'll want to do multiple register reads back to back, and we'll only want to flush once. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-23 01:06:26 -07:00
Ian Romanick	63974c0f5b	glsl: Simplify the interface to link_invalidate_variable_locations The unit tests added in the previous commits prove some things about the state of some internal data structures. The most important of these is that all built-in input and output variables have explicit_location set. This means that link_invalidate_variable_locations doesn't need to know the range of non-generic shader inputs or outputs. It can simply reset location state depending on whether explicit_location is set. There are two additional assumptions that were already implicit in the code that comments now document. - ir_variable::is_unmatched_generic_inout is only used by the linker when connecting outputs from one shader stage to inputs of another shader stage. - Any varying that has explicit_location set must be a built-in. This will be true until GL_ARB_separate_shader_objects is supported. As a result, the input_base and output_base parameters to link_invalidate_variable_locations are no longer necessary, and the code for resetting locations and setting is_unmatched_generic_inout can be simplified. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-22 15:23:30 -07:00
Ian Romanick	1eee0a9f01	glsl/tests: Unit test vertex shader in / out with link_invalidate_variable_locations Validates: - ir_variable::explicit_location should not be modified. - If ir_variable::explicit_location is not set, ir_variable::location, ir_variable::location_frac, and ir_variable::is_unmatched_generic_inout must be reset to 0. - If ir_variable::explicit_location is set, ir_variable::location should not be modified. ir_variable::location_frac, and ir_variable::is_unmatched_generic_inout must be reset to 0. Previous unit tests have shown that all non-generic inputs / outputs have explicit_location set. v2: Split the link_invalidate_variable_locations interface change out to a separate patch. Remove the vertex_in_builtin_without_explicit and vertex_out_builtin_without_explicit tests. There was a lot of good discussion about this on the mailing list to which I refer the interested reader. Both changes suggested by Paul. http://lists.freedesktop.org/archives/mesa-dev/2013-October/046652.html Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-22 15:23:30 -07:00
Ian Romanick	cf8b14ce6d	glsl: Modify interface to link_invalidate_variable_locations This will make it easier to unit test this function in successive patches. Also, correct the prototype in linker.h. It was... wrong. v2: Split the interface change from adding the unit tests. Suggested by Paul. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-22 15:23:30 -07:00
Ian Romanick	af229c94e3	glsl/tests: Verify geometry shader built-ins generated by _mesa_glsl_initialize_variables Checks that the variables generated meet certain criteria. - Geometry shader inputs have an explicit location. - Geometry shader outputs have an explicit location. - Fragment shader-only varying locations are not used. - Geometry shader uniforms and system values don't have an explicit location. - Geometry shader constants don't have an explicit location and are read-only. - No other kinds of geometry variables exist. It does not verify that any specific variables exist. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-22 15:23:30 -07:00
Ian Romanick	f094a0f825	glsl/tests: Verify fragment shader built-ins generated by _mesa_glsl_initialize_variables Checks that the variables generated meet certain criteria. - Fragment shader inputs have an explicit location. - Fragment shader outputs have an explicit location. - Vertex / geometry shader-only varying locations are not used. - Fragment shader uniforms and system values don't have an explicit location. - Fragment shader constants don't have an explicit location and are read-only. - No other kinds of fragment variables exist. It does not verify that any specific variables exist. v2: Use _mesa_varying_slot_in_fs in fragment_builtin.inputs_have_explicit_location. Suggested by Paul. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-22 15:23:30 -07:00
Ian Romanick	d05202900b	glsl/tests: Verify vertex shader built-ins generated by _mesa_glsl_initialize_variables Checks that the variables generated meet certain criteria. - Vertex shader inputs have an explicit location. - Vertex shader outputs have an explicit location. - Fragment shader-only varying locations are not used. - Vertex shader uniforms and system values don't have an explicit location. - Vertex shader constants don't have an explicit location and are read-only. - No other kinds of vertex variables exist. It does not verify that any specific variables exist. v2: Fix memory management mistakes in common_builtin::string_starts_with_prefix. Clean up error message reporting in common_builtin::no_invalid_variable_modes. Both suggested by Paul. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-22 15:23:30 -07:00
Ian Romanick	78b70ceae1	glsl: When constructing a variable with an interface type, set interface_type Ever since the addition of interface blocks with instance names, we have had an implicit invariant: var->type->is_interface() == (var->type == var->interface_type) The odd use of == here is intentional because !var->type->is_interface() implies var->type != var->interface_type. Further, if var->type->is_array() is true, we have a related implicit invariant: var->type->fields.array->is_interface() == (var->type->fields.array == var->interface_type) However, the ir_variable constructor doesn't maintain either invariant. That seems kind of silly... and I tripped over it while writing some other code. This patch makes the constructor do the right thing, and it introduces some tests to verify that behavior. v2: Add general-ir-test to .gitignore. Update the description of the ir_variable invariant for arrays in the commit message. Both suggested by Paul. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-22 15:23:30 -07:00
Ian Romanick	09ceed7587	mesa/tests: Add simple, dumb test for _mesa_program_state_string After some discussions about the correct way to update _mesa_program_state_string, I decided to make a unit test for the function. It turns out that the function didn't work quite the way I thought. The unit test proves that the code was already correct. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Anuj Phogat <anuj.phogat@gmail.com>	2013-10-22 15:23:30 -07:00
Ander Conselvan de Oliveira	98b359bd1b	wayland: Don't leak wl_drm global when unbinding display	2013-10-22 14:57:03 -07:00
Scott Graham	dafa97fed9	mesa: fixes for MSVC 2013 Cc: "9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-22 08:39:40 -06:00
Brian Paul	65ee044a97	st/mesa: minor whitespace, comment changes in st_draw.c	2013-10-22 08:20:45 -06:00
Brian Paul	f166fbae36	st/dri: minor formatting clean-ups in dri_context.c	2013-10-22 08:20:45 -06:00
Brian Paul	f0d4636d9c	mesa: fix a couple issues with U_FIXED, I_FIXED macros Silence a bunch of MSVC type conversion warnings. Changed return type of S_FIXED to int32_t (signed). The result is the same. It just seems more intuitive that a signed conversion function should return a signed value. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-10-22 08:20:45 -06:00
Brian Paul	6767c56e6d	mesa: remove GL_MESA_program_debug bits from gl.h The code for this was removed from Mesa some time ago. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-22 08:20:45 -06:00
Brian Paul	971c74309e	mesa: remove remnants of GL_MESA_shader_debug This extension never saw any real use so remove it. v2: also update tests/num_strings.cpp for 'make check' Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-22 08:20:45 -06:00
Kenneth Graunke	43b05b8fac	i965: Only emit interpolation setup if there are actual FS inputs. Dead code elimination would get rid of the extra instructions, but skipping this saves iterations through the optimization loop. From shader-db: N Min Max Median Avg Stddev x 14672 3 16 3 3.1334515 0.59904168 + 14672 1 16 3 2.8955153 0.77732963 Difference at 95.0% confidence -0.237936 +/- 0.0158798 -7.59342% +/- 0.506783% (Student's t, pooled s = 0.693935) Embarrassingly, the classic shadow mapping shader: void main() { } used to require three iterations through the optimization loop. With this patch, it only requires one (which makes no progress). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-21 23:31:15 -07:00
Chris Forbes	c4de86fd26	i965/fs: Fix accidental type conversion in header setup Previously one side could be UD while the other was float. V2: Prefer float; apparently IVB can dispatch float ops faster. (Thanks Eric) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-22 18:56:14 +13:00
Chris Forbes	b38af01ccf	i965/fs: Fix handling of sampler messages with header but zero offset Gather unconditionally uses a header, but in some cases the texture_offset value will be zero. V2: Don't introduce a bogus conversion. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-22 18:56:14 +13:00
Matt Turner	f1e605f1ad	glsl: Optimize -(-expr) into expr. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-21 22:53:36 -07:00
Matt Turner	963df4d37d	glsl: Optimize abs(-expr) and abs(abs(expr)) into abs(expr). Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-21 22:53:36 -07:00
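The two algebraic rewrites in f1e605f1ad and 963df4d37d can be illustrated on a toy expression tree. This is a sketch under stated assumptions, not the GLSL compiler's IR: `struct expr`, `enum op` and `simplify` are hypothetical, only unary negate and abs are modelled, and every non-leaf node is assumed to have a non-NULL operand.

```c
#include <assert.h>
#include <stddef.h>

enum op { OP_VAR, OP_NEG, OP_ABS };

struct expr {
   enum op op;
   struct expr *operand;   /* NULL for OP_VAR leaves */
};

/* Simplify bottom-up. Nodes are rewritten in place, nothing is allocated,
 * and the returned pointer is the root of the simplified tree. */
static struct expr *
simplify(struct expr *e)
{
   if (e == NULL || e->op == OP_VAR)
      return e;

   assert(e->operand != NULL);
   e->operand = simplify(e->operand);

   /* -(-x) -> x */
   if (e->op == OP_NEG && e->operand->op == OP_NEG)
      return e->operand->operand;

   if (e->op == OP_ABS) {
      /* abs(-x) -> abs(x) */
      if (e->operand->op == OP_NEG)
         e->operand = e->operand->operand;

      /* abs(abs(x)) -> abs(x) */
      if (e->operand->op == OP_ABS)
         return e->operand;
   }

   return e;
}
```

Simplifying the operand first means the outer rules only ever see an already reduced child, so a chain such as `abs(-abs(x))` collapses in a single pass.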
Matt Turner	5b3aec412e	glsl: Use saved values instead of recomputing them. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-21 22:53:36 -07:00
Matt Turner	6aeb7514c3	docs: Mark GLSL 1.50, 3.30, and geometry shaders done for i965. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-21 22:53:36 -07:00
Rico Schüller	aab03f75f3	docs: Update docs for ARB_texture_mirror_clamp_to_edge. Signed-off-by: Rico Schüller <kgbricola@web.de> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-21 21:02:51 -07:00
Kenneth Graunke	2d3282188e	i965: Implement ARB_texture_mirror_clamp_to_edge. This passes Piglit's texwrap tests. v2: Remove _EXT suffix. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Rico Schüller <kgbricola@web.de>	2013-10-21 21:02:51 -07:00
Kenneth Graunke	cc2f87891b	i965: Drop unused simple_list.h includes. These don't appear to be necessary. Everything compiles just fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-21 21:02:51 -07:00
Kristian Høgsberg	1a2a30ba20	gbm-dri: Support importing RGB565 buffers	2013-10-21 20:56:17 -07:00
Paul Berry	672fab0b1b	glsl/linker: Allow mixing of desktop GLSL versions. Previously, Mesa followed the linkage rules outlined in the GLSL 1.20-1.40 specs, which (collectively) said that GLSL versions 1.10 and 1.20 could be linked together, but no other versions could be linked. In GLSL 4.30, the linkage rules were relaxed so that any two desktop GLSL versions can be linked together. This change was made because it reflected the behaviour of nearly all existing implementations (see Khronos bug 8463). Mesa was one of the few (perhaps the only) exceptions to prohibit cross-linking of some GLSL versions. Since the GLSL linkage rules were deliberately relaxed in order to match the behaviour of existing implementations, it seems appropriate to relax the rules in Mesa too (even though Mesa doesn't support GLSL 4.30 yet). Note that linking ES and desktop shaders is still prohibited, as is linking ES shaders having different GLSL versions. Fixes piglit tests "shaders/version-mixing {interstage,intrastage}". Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-21 17:27:41 -07:00
Francisco Jerez	e26ed75066	clover: Improve region and pitch argument handling in memory transfer APIs. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:04 -07:00
Francisco Jerez	adefa84d66	clover: Add a pixel_size() method to the image class. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:04 -07:00
Francisco Jerez	6230f77232	clover: Implement support for the ICD extension. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:03 -07:00
Francisco Jerez	9a5afd0dbd	clover: Make sure hidden is the default symbol visibility. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:03 -07:00
Tom Stellard	07567c17f1	clover: Prepare the build system for ICD support. Signed-off-by: Francisco Jerez <currojerez@riseup.net>	2013-10-21 10:47:03 -07:00
Francisco Jerez	9e0b7f76f9	clover: Fix memory leak when initializing a device object fails. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:03 -07:00
Francisco Jerez	1d741e3ac0	clover: Tidy up resource::mapping. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:03 -07:00
Francisco Jerez	6db102597a	clover: Simplify command_queue::flush(). Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:03 -07:00
Francisco Jerez	7a9bbff7d6	clover: Clean up the kernel and program object interface. [ Tom Stellard: Make sure to bind global arguments before retrieving handles. ] Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:03 -07:00
Francisco Jerez	10284b1d2d	clover: Clean up the interface of the context object slightly. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:03 -07:00
Francisco Jerez	5226eacf8d	clover: Delete copy constructors and assignment operators in all non-copiable objects. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:03 -07:00
Francisco Jerez	369419f761	clover: Define a few convenience equality operators. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:03 -07:00
Francisco Jerez	c6e7a0d0d3	clover: Simplify the platform object by using util/range. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:03 -07:00
Francisco Jerez	e5fc61fa3f	clover: Add property list helpers with a syntax consistent with other API objects. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:03 -07:00
Francisco Jerez	04d0ab9f64	clover: Switch samplers to the new model. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:03 -07:00
Francisco Jerez	d6f7afc3ed	clover: Switch memory objects to the new model. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:03 -07:00
Francisco Jerez	35307f540f	clover: Switch kernel and program objects to the new model. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:03 -07:00
Francisco Jerez	9968d9daf2	clover: Switch command queues to the new model. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:03 -07:00
Francisco Jerez	257781f243	clover: Switch event objects to the new model. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:02 -07:00
Francisco Jerez	9d06fb8fa8	clover: Switch context objects to the new model. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:02 -07:00
Francisco Jerez	c9e009b74d	clover: Switch device objects to the new model. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:02 -07:00
Francisco Jerez	49a49e0742	clover: Switch platform objects to the new model. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:02 -07:00
Francisco Jerez	bff60c894a	clover: Define helper classes for the new object model. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:02 -07:00
Francisco Jerez	d8b4994281	clover: Clean up property query functions by using a new property_buffer helper class. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:02 -07:00
Francisco Jerez	7d61769e44	clover: Switch to the new utility code. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:02 -07:00
Francisco Jerez	099d281b38	clover: Name include guards consistently. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:02 -07:00
Francisco Jerez	8e14b82fd2	clover: Replace a bunch of double underscores with single underscores. Identifiers with double underscores are reserved, and using them has undefined behavior according to the C++ spec. It's unlikely to make any difference, but... Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:02 -07:00
Francisco Jerez	ebfdce079b	clover: Clean up the event profiling code. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:02 -07:00
Francisco Jerez	e93efa0d50	clover: Import new utility library. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:02 -07:00
Francisco Jerez	7baad4b996	clover: Require GCC 4.7 or higher to build. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-21 10:47:02 -07:00
Tom Stellard	4f49c97afe	clover: Use std::numeric_limits<std::size_t>::max() instead of SIZE_MAX This prevents a build failure on some systems. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-10-21 10:47:02 -07:00
Roland Scheidegger	ac81b6f2be	llvmpipe: enable seamless cube filtering Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-21 15:42:04 +02:00
Roland Scheidegger	3bdd1074e1	gallivm: implement seamless cube filtering For seamless cube filtering it is necessary to determine new faces and new coords per sample. The logic for this is _seriously_ complex (what needs to happen is very "asymmetric" wrt face, x/y under/overflow), further complicated by the fact that if the 4 samples are in a corner (meaning we only have actually 3 samples, and all 3 are on different faces) then falling off the edge is happening _both_ on x and y axis simultaneously. There was a noticeable performance hit in mesa's cubemap demo when seamless filtering was forced on (just below 10 percent or so in a debug build, when disabling all filtering hacks, otherwise it would probably be a bit more) and when always doing the logic, hence use a branch which only does it if any of the pixels in a quad (or in two quads) actually hit this. With that there was no measurable performance hit in the cubemap demo (neither in a debug nor release build), but this will vary (cubemap demo very rarely hits edges). Might also be different on other cpus, as this forces SoA sampling path which potentially can be quite a bit slower. Note that as for corners, this code gets all the 3 samples which actually exist right, and the 4th texel will simply be the same as one of the others, meaning that filter weights will be a bit wrong. This however should be enough for full OpenGL (but not d3d10) compliance. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-21 15:42:04 +02:00
Christian König	21a57f9040	winsys/radeon: cleanup CS offloading Using atomic function for ncs is superfluous since it is protected by a mutex anyway. Also lock the mutex only once while retrieving the next CS for submission. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-10-21 10:20:18 +02:00
Rico Schüller	14429295e1	radeon: Enable ARB_texture_mirror_clamp_to_edge. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Rico Schüller <kgbricola@web.de>	2013-10-20 20:12:39 -07:00
Rico Schüller	5da618c20e	r200: Enable ARB_texture_mirror_clamp_to_edge. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Rico Schüller <kgbricola@web.de>	2013-10-20 20:12:39 -07:00
Rico Schüller	e487948bef	gallium: Enable ARB_texture_mirror_clamp_to_edge. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Rico Schüller <kgbricola@web.de>	2013-10-20 20:12:39 -07:00
Rico Schüller	a59ae25d81	swrast: Enable ARB_texture_mirror_clamp_to_edge. v2: fix commit message Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Rico Schüller <kgbricola@web.de>	2013-10-20 20:12:39 -07:00
Rico Schüller	1bbd3bb98a	mesa: Add infrastructure for GL_ARB_texture_mirror_clamp_to_edge. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Rico Schüller <kgbricola@web.de>	2013-10-20 20:12:08 -07:00
Alexander von Gluck IV	50370e483b	scons: Fix Haiku missing library * The softpipe add-on needs libtranslation due to the use of BTranslatorRoster Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-20 19:20:59 -05:00
Alexandre Demers	24fd074ce7	docs: Updating forgotten GL feature completion for r600	2013-10-21 01:35:08 +02:00
David Heidelberger	c948aab96c	r300g/compiler: Fix unsigned comparison with less than zero rc_find_free_temporary_list() returns signed integer (in case of lack of free temporary registers returns -1), so new_index in radeon_rename_regs() should be signed. https://bugs.freedesktop.org/show_bug.cgi?id=54867 Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-10-21 01:31:51 +02:00
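The class of bug fixed in c948aab96c can be shown in isolation. A minimal sketch, assuming a hypothetical `find_free_index` that reports failure as -1 in the same way the commit message describes for rc_find_free_temporary_list(); none of these names come from the r300 compiler:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical allocator: returns the first free index, or -1 if none. */
static int
find_free_index(const bool *used, int count)
{
   for (int i = 0; i < count; i++) {
      if (!used[i])
         return i;
   }
   return -1;
}

/* Storing the result in an unsigned variable converts -1 to UINT_MAX,
 * so a later "index < 0" check on that variable can never be true. */
static unsigned
index_stored_unsigned(const bool *used, int count)
{
   return (unsigned) find_free_index(used, count);
}

/* Keeping the result signed preserves the failure value. */
static bool
allocation_failed(const bool *used, int count)
{
   int index = find_free_index(used, count);
   return index < 0;
}
```

The signed variant is the shape of the fix the commit describes: the variable receiving the result has to be signed for the failure check to work.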
Vinson Lee	c325aa5d80	r600g/sb: Initialize shader::dce_flags. Fixes "Uninitialized scalar field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-10-20 00:38:40 -07:00
Kenneth Graunke	00b5d8aeae	i965: Mark G45 as having surface tile offset support. Fixes a regression since `02b632d8e8`. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-19 18:43:09 -07:00
Vinson Lee	37cd9ac6df	glsl: Initialize per_vertex_accumulator::fields. Fixes "Uninitialized pointer field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-18 18:29:18 -07:00
Vinson Lee	136a12ac98	mesa: Remove GLXContextID typedef from glx.h. Fixes this build error. CC clientattrib.lo In file included from ../../include/GL/glx.h:333, from glxclient.h:45, from clientattrib.c:32: ../../include/GL/glxext.h:275: error: redefinition of typedef ‘GLXContextID’ ../../include/GL/glx.h:171: note: previous declaration of ‘GLXContextID’ was here Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70591 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-10-18 18:08:31 -07:00
Carl Worth	bf7b425083	docs: Import 9.2.2 release notes, add news item.	2013-10-18 17:19:31 -07:00
Kenneth Graunke	653cc008a8	docs: Note that we support OpenGL 3.3 in the release notes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-18 15:24:18 -07:00
Kenneth Graunke	567445e2b9	i965: Enable OpenGL 3.3 and GLSL 3.30. Everything necessary for these appears to be implemented. We'll want to add more tests to guard against bugs, but it should be functionally complete. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-18 15:24:18 -07:00
Jon TURNEY	cedfd79be2	translate_sse: Fix generated code argument handling for msabi on x86_64 translate_sse.c contains code for msabi on x86_64, but it appears to be untested. Currently arguments 1 and 2 passed to the generated code are moved as 32-bit quantities into the registers used by sysvabi, irrespective of the architecture. Since these may be pointers, they must be moved as 64-bit quantities to avoid truncation. Commit `f4dd099171` disabled translate_sse.c on MinGW x86_64, I don't know if it was due to this issue, or a different one... Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-18 14:17:15 +01:00
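The truncation described in cedfd79be2 can be modelled in plain C. This sketch only imitates the effect of a 32-bit versus a 64-bit register move on a 64-bit value; `move_as_32bit` and `move_as_64bit` are hypothetical helpers and do not correspond to anything in translate_sse.c.

```c
#include <stdint.h>

/* Models a 32-bit register move: only the low 32 bits survive. */
static uint64_t
move_as_32bit(uint64_t value)
{
   return (uint32_t) value;
}

/* Models a 64-bit register move: the value is preserved. */
static uint64_t
move_as_64bit(uint64_t value)
{
   return value;
}
```

For an address such as 0x00007f12345678a0, the 32-bit move leaves 0x345678a0, which no longer points at the original object; this is why pointer arguments have to be moved as 64-bit quantities on x86_64.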
Jon TURNEY	72a0f832ec	rtasm: Cygwin uses the msabi calling convention on x86_64 Cygwin also uses the msabi calling convention on x86_64, not the sysvabi calling convention Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-18 14:16:56 +01:00
Jon TURNEY	87e84acbfd	rtasm: The heap is NX on 64-bit Cygwin, so use the rtasm_exec_malloc() implementation which uses mmap() The heap is NX on 64-bit Cygwin, so use the rtasm_exec_malloc() implementation which uses mmap() to allocate an anonymous page with execute permission, rather than the one which just uses malloc(). Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-18 14:16:27 +01:00
Alexander von Gluck IV	9aad1ba70f	scons: Simplified fix of llvm cxxflags for rtti * Based on ideas of Jose Fonseca * A rework of `ce8eadb6e8` Tested-by: Vinson Lee <vlee@freedesktop.org>	2013-10-17 20:33:05 -05:00
Paul Berry	b08195faec	glsl: Fix MSVC build (missing strcasecmp()) MSVC doesn't have a strcasecmp() function; it uses _stricmp() instead. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-10-17 18:11:22 -07:00
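A common way to handle the difference noted in b08195faec is a small compatibility macro. The sketch below is one possible shape, not the change the commit made: `portable_strcasecmp` and `same_keyword` are hypothetical names, and the `_MSC_VER` check is assumed to be an adequate way to detect MSVC.

```c
#include <string.h>

#ifdef _MSC_VER
/* MSVC spells the case-insensitive comparison _stricmp(). */
#define portable_strcasecmp _stricmp
#else
#include <strings.h>   /* POSIX declares strcasecmp() here */
#define portable_strcasecmp strcasecmp
#endif

/* Case-insensitive equality test built on the macro above. */
static int
same_keyword(const char *a, const char *b)
{
   return portable_strcasecmp(a, b) == 0;
}
```

Callers then use `same_keyword("TRIANGLES", "triangles")` (or the macro directly) without caring which C library provides the comparison.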
Kenneth Graunke	b3360d23ac	i965: Fold brwInitVtbl() into brwCreateContext(). With most of the virtual functions gone, brwInitVtbl() is now tiny. Merging it into the caller allows us to delete the entire file. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-17 14:27:03 -07:00
Kenneth Graunke	f8fef8ee92	i965: Merge brw_destroy_context() into intelDestroyContext(). Now that i915 and i965 have been split, the separation between intelDestroyContext and brw_destroy_context is kind of arbitrary. This patch replaces the only brw->vtbl.destroy() call with the body of brw_destroy_context (the only implementation of that virtual function). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-17 14:27:03 -07:00
Kenneth Graunke	7601ba649f	i965: Replace dri_bo_release with drm_intel_bo_unreference. dri_bo_release is a helper function that calls drm_intel_bo_unreference but then also sets the pointer to NULL. This is unnecessary, since brw_destroy_context is called from intelDestroyContext, which also frees brw completely. If you're still trying to access them, you've got bigger problems. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-17 14:27:03 -07:00
Kenneth Graunke	5f76bc37ab	i965: Unindent the body of intelDestroyContext. Having almost the entire body of the function indented one level for a check that should never happen seems silly. Just early return. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-17 14:27:03 -07:00
Kenneth Graunke	80a9c42e9e	i965: Un-virtualize brw_new_batch(). Since the i915/i965 split, there's only one implementation of this virtual function. We may as well just call it directly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-17 14:27:03 -07:00
Kenneth Graunke	6613f346ac	i965: Un-virtualize brw_finish_batch(). Since the i915/i965 split, there's only one implementation of this virtual function. We may as well just call it directly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-17 14:27:03 -07:00
Paul Berry	e2d1eaa32a	glsl: In update_max_array_access, fix interface instance check. In commit `f878d20` (glsl: Update ir_variable::max_ifc_array_access properly), I accidentally used the wrong kind of check to determine whether the variable being accessed was an interface instance (I used var->get_interface_type() != NULL when I should have used var->is_interface_instance()). As a result, if an unnamed interface block contained a struct which contained an array, update_max_array_access() would mistakenly interpret the struct as a named interface block and try to dereference a null var->max_ifc_array_access. This patch corrects the check, fixing the null dereference. Fixes piglit test interface-block-struct-nesting. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70368 Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-17 11:51:06 -07:00
Paul Berry	79e835a712	glsl: Treat layout-qualifier-id's as case-insensitive in desktop GLSL. In desktop GLSL, location qualifiers are case-insensitive. In GLSL ES, they are case-sensitive. This patch handles the difference by using a new function to match layout qualifiers, match_layout_qualifier(), which calls either strcmp() or strcasecmp() as appropriate. Fixes piglit tests: - layout-not-case-sensitive-in.geom - layout-not-case-sensitive-max-vert.geom - layout-not-case-sensitive-out.geom - layout-not-case-sensitive.frag Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-17 11:51:01 -07:00
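The case-sensitivity rule described in `79e835a712` can be sketched as follows. This is only an illustration, not the code from the patch: the function name, the `es_shader` parameter, and the return convention are assumptions made for the example.

```c
#include <stdbool.h>
#include <string.h>   /* strcmp */
#include <strings.h>  /* strcasecmp (POSIX) */

/* Hypothetical stand-in for the helper named in the commit message.
 * Returns 0 on a match, following the strcmp() convention. */
static int
match_layout_qualifier_sketch(const char *s1, const char *s2, bool es_shader)
{
   /* GLSL ES compares layout qualifier names case-sensitively;
    * desktop GLSL ignores case. */
   if (es_shader)
      return strcmp(s1, s2);
   else
      return strcasecmp(s1, s2);
}
```

As `b08195faec` notes, MSVC has no strcasecmp(), so a portable version would need to call _stricmp() there instead.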
Brian Paul	a36f7e651e	mesa: remove PFNGLBLENDCOLORPROC, PFNGLBLENDEQUATIONPROC typedefs in gl.h Fixes error about duplicated typedefs (also in glext.h) reported on NetBSD 6.1 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70546 Tested-by: Vinson Lee <vlee@freedesktop.org>	2013-10-17 12:10:39 -06:00
Brian Paul	282bb87366	st/mesa: add a few comments in st_create_context_priv()	2013-10-17 09:28:17 -06:00
Dave Airlie	530afc82a1	st/mesa: handle layer and primitive id output and point size input This fixes a number of piglit crashes when running on a hacked up llvmpipe. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-17 08:35:42 +01:00
Dave Airlie	038a9aab33	st/mesa: add geometry shader ubo support This just adds the missing bits so the ubo tests don't crash. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-17 08:35:42 +01:00
Fabian Bieler	20cad7fd6f	mesa/st: Allow geometry shaders without gl_Position export. From the ARB_geometry_shader4 spec (section Geometry Shader outputs): "The built-in special variable gl_Position is intended to hold the homogeneous vertex position. Writing gl_Position is optional." Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-17 08:35:42 +01:00
Bryan Cain	9bfa475684	st/mesa, glsl_to_tgsi: add support for geometry shaders v2 (Bryan Cain <bryancain3@gmail.com>): fix 2D array indexing order. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-17 08:35:42 +01:00
Bryan Cain	6b0df34ae5	mesa/st: Add VARYING_SLOT_TEX[1-7] to st_translate_geometry_program(). v2 (Paul Berry <stereotype441@gmail.com>): Split out to separate patch (previously this was part of "glsl: add builtins for geometry shaders.") Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-17 08:35:42 +01:00
Kristian Høgsberg	4ef1c8fb4c	Revert "i965: Create ARGB2101010 DRI configs" Exposing 10-bit color configs confuses too many applications that try to use the chooser to pick an 8 bit config. The chooser considers an fbconfig with more bits a better match and will thus give a 10 bit config when an application asks for a config with GLX_RED_SIZE 1 or 8. One key example is glxinfo, which does this, and then doesn't specify that it needs a config where GLX_DRAWABLE_TYPE has the GLX_WINDOW_BIT set. This way it ends up with a 10 bit config that it can't use to create a GLX window and fails to log extensions. This reverts commit `f354bcc177`. https://bugs.freedesktop.org/show_bug.cgi?id=70557	2013-10-16 22:22:45 -07:00
Vadim Girlin	62c8149472	r600g/sb: fix issue with DCE between GVN and GCM (v2) We can't perform DCE using the liveness pass between GVN and GCM because it relies on the correct schedule, but GVN doesn't care about preserving correctness - it's rescheduled later by GCM. This patch makes dce_cleanup pass perform simple DCE between GVN and GCM instead of relying on liveness pass. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=70088 Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-10-17 07:57:49 +04:00
Matt Turner	38fe3bd5f2	glapi: Add missing XML files to Makefile dependencies. Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-10-16 20:49:43 -07:00
Matt Turner	a360ca7476	glsl: Optimize mul(a, -1) into neg(a). Two extra instructions in some heroesofnewerth shaders, but a win for everything else. total instructions in shared programs: 1531352 -> 1530815 (-0.04%) instructions in affected programs: 121898 -> 121361 (-0.44%) Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-16 20:49:43 -07:00
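The rewrite described in `a360ca7476` is a standard algebraic simplification. A minimal sketch on a toy expression node follows. The `struct expr` type and the function names are invented for this illustration and are not the GLSL IR classes Mesa uses.

```c
#include <stdbool.h>
#include <stddef.h>

enum expr_op { OP_CONST, OP_VAR, OP_MUL, OP_NEG };

/* Toy expression node (hypothetical, for illustration only). */
struct expr {
   enum expr_op op;
   float value;               /* only meaningful for OP_CONST */
   struct expr *operands[2];  /* unused slots are NULL */
};

static bool
is_const(const struct expr *e, float v)
{
   return e != NULL && e->op == OP_CONST && e->value == v;
}

/* Rewrites mul(a, -1) or mul(-1, a) into neg(a) in place.
 * Returns true if the node was changed. */
static bool
fold_mul_by_neg_one(struct expr *e)
{
   struct expr *keep;

   if (e->op != OP_MUL)
      return false;

   if (is_const(e->operands[1], -1.0f))
      keep = e->operands[0];
   else if (is_const(e->operands[0], -1.0f))
      keep = e->operands[1];
   else
      return false;

   e->op = OP_NEG;
   e->operands[0] = keep;
   e->operands[1] = NULL;
   return true;
}
```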
Matt Turner	197f3a33fb	i965/fs: Handle printing HW_REGS in dump_instruction(). Scheduling debugging now prints: Instructions before scheduling (reg_alloc 1) 0: linterp vgrf20, hw_reg2, hw_reg3, hw_reg4, 1: linterp vgrf21, hw_reg2, hw_reg3, hw_reg4+16, Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-16 20:49:43 -07:00
Matt Turner	7d0519c082	i965: Print instructions' children during scheduling debugging. Useful for tracking down problems in dependency calculations. Scheduling debugging now prints: clock 2, scheduled: linterp vgrf5, hw_reg2, hw_reg3, hw_reg0, child 0, 53 parents: fb_write (null), (null), (null), (null), child 1, 2 parents: tex vgrf4, vgrf5, (null), (null), child 2, 52 parents: placeholder_halt (null), (null), (null), (null), clock 4, scheduled: linterp vgrf5+1, hw_reg2, hw_reg3, hw_reg0+16, child 0, 52 parents: fb_write (null), (null), (null), (null), child 1, 1 parents: tex vgrf4, vgrf5, (null), (null), now available child 2, 51 parents: placeholder_halt (null), (null), (null), (null), Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-16 20:49:43 -07:00
José Fonseca	40ddd8b659	Revert "scons: Fix build when rtti is disabled" This reverts commit `94d05bf87a` as it has a few problems: - it breaks windows builds because env[LLVM_CXXFLAGS] is never set there - it is merging not only rtti, but the whole cxxflags (defines etc) which has proven to be a source of troubles (breaks debugging etc.)	2013-10-16 15:05:51 -07:00
Tom Stellard	9da4021626	radeonsi: Use 'SI' as the LLVM processor for CIK on LLVM <= 3.3 LLVM 3.3 does not know about CIK processors, and the code paths for SI and CIK are the same. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-10-16 12:55:30 -04:00
Tom Stellard	13ac38b4ef	r600g/compute Improve debugging output	2013-10-16 09:39:31 -07:00
Tom Stellard	de1de88dfc	clover: Link libclc before running any optimizations This is required in order for clang to correctly handle the OpenCL C barrier() builtin which has the following restrictions according to the OpenCL 1.1 Specification: If barrier is inside a conditional statement, then all work-items must enter the conditional if any work-item enters the conditional statement and executes the barrier. If barrier is inside a loop, all work-items must execute the barrier for each iteration of the loop before any are allowed to continue execution beyond the barrier. By linking before optimizations, we can replace calls to barrier() with calls to a target specific intrinsic which has the noduplicate attribute. This attribute prevents clang from performing optimizations which could violate the above rules. This attribute must be applied to the call instruction that invokes the function, so it is not enough to add this attribute to the barrier() declaration. As a bonus this will probably speed up compile times since we will no longer need to run link-time optimizations.	2013-10-16 09:39:15 -07:00
Brian Paul	2273b04c61	mesa: change glTexImage[23]DMultisample() internalformat to GLenum To match glext.h and the GL_ARB_texture_multisample extension. However, the GL 4.0 spec and man page say it's GLint. An OpenGL spec bug will be filed.	2013-10-16 08:43:23 -06:00
Brian Paul	4f08cdefda	svga: minor fix-ups in svga_get_shader_param() Fix debug error message. Add switch case for PIPE_SHADER_COMPUTE. Trivial.	2013-10-16 08:26:45 -06:00
Brian Paul	e96c55ff49	cso: fix incorrect sampler view count in cso_restore_sampler_views() During the recent bind_sampler_states() interface change in gallium we changed the CSO single_sampler_done() function so that if we were decreasing the number of sampler states bound in the driver, we'd null-out the "extra/old" sampler states to unbind them. See commit `1e2fbf265`. However, we didn't make the corresponding fix for sampler views. This caused an assertion to fail in the svga driver which checked that the number of sampler views matched the number of sampler states. This patch fixes cso_restore_sampler_views() so that it nulls-out the extra/old sampler views if the number of new views is less than the number of current/old views. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-10-16 08:13:47 -06:00
Brian Paul	0d1011638b	mesa: update glxext.h to version 20131008 The diff is huge but the actual changes are few: * Whitespace changes * Items are reordered * extern qualifiers dropped	2013-10-16 08:13:46 -06:00
Brian Paul	4d9e61c046	mesa: update glext.h to version 20131008 Only two notable changes in this revision: * GLvoid has been replaced by void. * Added the GL_NV_blend_equation_advanced extension.	2013-10-16 08:13:45 -06:00
Brian Paul	3c074e4d4d	vbo: access VBO memory more efficiently when building display lists Use GL_MAP_INVALIDATE_RANGE, UNSYNCHRONIZED and FLUSH_EXPLICIT flags when mapping VBOs during display list compilation. This mirrors what we do for immediate-mode VBO building in vbo_exec_vtx_map(). This improves performance for applications which interleave display list compilation with execution. For example: glNewList(A); glBegin/End prims; glEndList(); glCallList(A); glNewList(B); glBegin/End prims; glEndList(); glCallList(B); Mesa's vbo module tries to combine the vertex data from lists A and B into the same VBO when there's room. Before, when we mapped the VBO for building list B, we did so with GL_MAP_WRITE_BIT only. Even though we were writing to an unused part of the buffer, the map would stall until the preceding drawing call finished. Use the extra map flags and FlushMappedBufferRange() to avoid the stall. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-10-16 08:13:45 -06:00
Brian Paul	fa9c702164	mesa: consolidate cube width=height error checking Instead of checking width==height in four places, just do it in _mesa_legal_texture_dimensions() where we do the other width, height, depth checks. Similarly, move the check that cube map array depth is a multiple of 6. This change also fixes some missing cube dimension checks for the glTexStorage[23]D() functions. Remove width==height assertion in _mesa_get_tex_max_num_levels() since that's called before the other size checks for glTexStorage. Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-10-16 08:13:45 -06:00
Kristian Høgsberg	6e444a72c1	gbm: Add support for gbm bos and surfaces using GBM_FORMAT_ARGB2101010 We can now add GBM support for the 10 bit/channel formats which lets us create a gbm surface that we can use with KMS for display hardware that support the format. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2013-10-15 22:07:52 -07:00
Kristian Høgsberg	3160ec353e	dri: Add __DRIimage support for the ARGB2101010 format We add support for the ARGB2101010 color format to the DRI image extension, which allows DRI loaders to create a __DRIimage with this color format. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2013-10-15 22:07:52 -07:00
Kristian Høgsberg	f354bcc177	i965: Create ARGB2101010 DRI configs This commit enables ARGB2101010 system framebuffers (that is, DRI drawables) for the i965 drivers. This is done by generating DRI configs that advertise this color format as well as teaching intelCreateBuffer to pick the right color format when it sees such a DRI config. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2013-10-15 22:07:52 -07:00
Kristian Høgsberg	afda76cc0d	dri/common: Add support for creating ARGB2101010 configs This extends the common dri driver infrastructure with the ability to create __DRIconfigs for 10 bits/channel + 2 bit alpha formats. This still has to be supported and requested by a driver, so this doesn't enable anything yet. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2013-10-15 22:07:52 -07:00
Kristian Høgsberg	df479cffcc	egl_dri2: Set NativeVisualID to the matching GBM config for the gbm platform The EGLConfig doesn't have the rgba masks, only the rgba sizes. To make sure a config is usable with a given GBM/KMS format, we need a way to make sure the formats really match.	2013-10-15 22:07:52 -07:00
Kristian Høgsberg	44e584a73a	egl_dri2: Remove depth argument from dri2_add_config() All callers now use the more correct rgba mask mechanism for filtering out matching DRI configs. Even if depth and buffer size match, the color component layout can be different, or in the case of ARGB8888 and ARGB2101010 the color components can even be different sizes. Since anything that the depth check would reject is also rejected by the rgba mask comparison, the depth parameter is redundant and not specific enough. We should probably have removed it when the rgba masks argument was introduced, but better late than never. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2013-10-15 22:07:52 -07:00
Kristian Høgsberg	e3d0a0eac7	egl_dri2: Match X11 visuals using rgba masks instead of depth Matching on visual depth to buffer size makes 8 bpc RGBA look similar to 10 bit RGB with 2 bit alpha - both have buffer size 32. Instead, build the rgba masks from the visual data and use that for finding matching DRI configs. We need to keep the special case that allows us to match 24 bit visuals to DRI configs with buffer size 32. We do that by creating an alpha mask of "all the non-rgb bits" for 24 bit visuals and matching a second time with that. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2013-10-15 22:06:46 -07:00
Singh, Satyeshwar	e2620c1a74	i965: Add support for RGB565 __DRIimage Add information for RGB565 to the table of image formats so that we can create a __DRIimage for that format. This in turn enables RGB565 wayland clients. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2013-10-15 21:30:49 -07:00
Singh, Satyeshwar	2efc97d513	egl-wayland: Add support for RGB565 pixel format for Wayland clients With this patch Wayland clients can now ask EGL for RGB 565 format buffers and attach them to a Wayland compositor. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2013-10-15 21:26:56 -07:00
Alexander von Gluck IV	94d05bf87a	scons: Fix build when rtti is disabled * The rtti fix actually dug up a bug in the scons build scripts. * Autotools took the LLVM cpp and cxx flags, while scons only took the cpp flags. * This grabs the cxx flags and applies them where needed. We may want to make the same change for the llvm cpp flags in scons. * The only linux platform I can find with LLVM no-rtti is Ubuntu. * Fixes bug #70471 Tested-by: Vinson Lee <vlee@freedesktop.org>	2013-10-15 22:12:18 -05:00
José Fonseca	85d7f6779f	llvmpipe: Advertise PIPE_CAP_DEPTH_CLIP_DISABLE. Actually implemented by draw module. Tested piglit ARB_depth_clamp tests, which pass 100%. Trivial.	2013-10-15 18:22:57 -07:00
José Fonseca	3b3591cd15	draw: make vs_slot signed. Otherwise (vs_slot < 0) will never be true. Trivial.	2013-10-15 18:22:57 -07:00
Emil Velikov	b1e7cd037e	configure.ac: drop obsolete variable HAVE_COMMON_DRI The original intent of the variable was to prevent adding libdrm dependency for non drm drivers (swrast). This is already handled with __NOT_HAVE_DRM_H, and with the recent merge of the dri_util and drisw_util code this variable has started causing build issues. Eg. the following will fail $ ./autogen.sh --with-dri-drivers=swrast --with-gallium-drivers= $ make Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-10-15 21:54:20 +02:00
Emil Velikov	cd3fa176a8	swrast: add correct include for out-of-tree builds The xmlpool/options.h file was not accessible when building out-of-tree leading to failure. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70378 Reported-by: Fabio Pedretti <fabio.ped@libero.it> Tested-by: Fabio Pedretti <fabio.ped@libero.it> Tested-by: Andre Heider <a.heider@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-10-15 21:50:09 +02:00
Bryan Cain	467e3aa3de	mesa: fix transform feedback when a geometry shader is active. When a geometry shader is active, the transform feedback primitive type ("mode") needs to be validated against the geometry shader output primitive type, not the primitive type passed to the glDraw*() function. Fixes the following piglit tests: - glsl-1.50-geometry-primitive-types GL_LINES - glsl-1.50-geometry-primitive-types GL_LINES_ADJACENCY - glsl-1.50-geometry-primitive-types GL_LINE_STRIP - glsl-1.50-geometry-primitive-types GL_LINE_STRIP_ADJACENCY - glsl-1.50-geometry-primitive-types GL_TRIANGLES - glsl-1.50-geometry-primitive-types GL_TRIANGLES_ADJACENCY - glsl-1.50-geometry-primitive-types GL_TRIANGLE_FAN Exposes previously hidden failures in the following piglit tests: - glsl-1.50-geometry-primitive-id-restart GL_LINES other - glsl-1.50-geometry-primitive-id-restart GL_LINES_ADJACENCY other - glsl-1.50-geometry-primitive-id-restart GL_LINE_LOOP ffs - glsl-1.50-geometry-primitive-id-restart GL_LINE_LOOP other - glsl-1.50-geometry-primitive-id-restart GL_LINE_STRIP other - glsl-1.50-geometry-primitive-id-restart GL_LINE_STRIP_ADJACENCY other - glsl-1.50-geometry-primitive-id-restart GL_TRIANGLES other - glsl-1.50-geometry-primitive-id-restart GL_TRIANGLES_ADJACENCY other - glsl-1.50-geometry-primitive-id-restart GL_TRIANGLE_FAN ffs - glsl-1.50-geometry-primitive-id-restart GL_TRIANGLE_FAN other - glsl-1.50-geometry-primitive-id-restart GL_TRIANGLE_STRIP other - glsl-1.50-geometry-primitive-id-restart GL_TRIANGLE_STRIP_ADJACENCY other (These failures were previously hidden due to a flaw in the test: it doesn't check for GL errors. I'll fix the test shortly). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-15 11:40:43 -07:00
Paul Berry	afccf3d8e7	i965/gs: Set the REORDER bit in 3DSTATE_GS. Ivy Bridge's "reorder enable" bit gives us a binary choice for the order in which vertices from triangle strips are delivered to the geometry shader. Neither choice follows the OpenGL spec, but setting the bit is better, because it gets triangle orientation correct. Haswell replaces the "reorder enable" bit with a new "reorder mode" bit (which occupies the same location in the command packet). This bit gives us a different binary choice, which affects both triangle strips and triangle strips with adjacency. Setting the bit ("reorder trailing") gives the proper order according to the OpenGL spec. So in either case we want to set the bit. On Ivy Bridge, fixes piglit test "triangle-strip-orientation". On Haswell, fixes piglit tests "glsl-1.50-geometry-primitive-types {GL_TRIANGLE_STRIP,GL_TRIANGLE_STRIP_ADJACENCY}" and "glsl-1.50-geometry-tri-strip-ordering-with-prim-restart *". v2: Rename the bit to "REORDER_TRAILING" for consistency with Haswell docs. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-15 11:40:32 -07:00
Paul Berry	caf9cef7ee	i965/fs: Remove bogus field prog_data->dispatch_width. Despite the name, this field wasn't being set to the dispatch width at all; it was always 8. The only place it was used was that the constant buffer read length was aligned to it, and as far as I can tell from the docs, there is no need to align this value to the dispatch width; aligning it to a multiple of 8 is sufficient. So I've just replaced it with a hardcoded 8. v2: In gen6_wm_state, use brw->wm.base.push_const_size for consistency with VS and GS state upload. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-15 11:34:30 -07:00
Paul Berry	2910a82eb4	glsl: Add new GLSL 1.50 constants. This patch populates the following built-in GLSL 1.50 variables based on constants stored in ctx->Const: - gl_MaxVertexOutputComponents - gl_MaxGeometryInputComponents - gl_MaxGeometryOutputComponents - gl_MaxFragmentInputComponents - gl_MaxGeometryTextureImageUnits - gl_MaxGeometryOutputVertices - gl_MaxGeometryTotalOutputComponents - gl_MaxGeometryUniformComponents - gl_MaxGeometryVaryingComponents On i965/gen7, fixes all Piglit tests in "spec/glsl-1.50/built-in constants/*" except for gl_MaxCombinedTextureImageUnits and gl_MaxGeometryUniformComponents. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-15 11:34:30 -07:00
Eric Anholt	705a90e304	i965: Move the common binding table offset code to brw_shader.cpp. Now that both vec4 and fs are dynamically assigning offsets, a lot of the code is the same. v2: Avoid passing around the next offset through the class. (Review by Paul) Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-15 10:18:50 -07:00
Eric Anholt	d395485e1d	i965/vec4: Dynamically assign the VS/GS binding table offsets. Note that the dropped comment in brw_context.h is mostly (better written) in brw_binding_table.c as well. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-15 10:18:48 -07:00
Eric Anholt	4e5306453d	i965/fs: Dynamically set up the WM binding table offsets. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-15 10:18:45 -07:00
Eric Anholt	3c9dc2d31b	i965: Make a brw_stage_prog_data for storing the SURF_INDEX information. It would be nice to be able to pack our binding table so that programs that use 1 render target don't upload an extra BRW_MAX_DRAW_BUFFERS - 1 binding table entries. To do that, we need the compiled program to have information on where its surfaces go. v2: Rename size to size_bytes to be more explicit. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-15 10:18:42 -07:00
Eric Anholt	5463b5bbbd	i965: Always have the struct gl_program * in the backend visitor. vec4 already had it, so put it in the FS, too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-15 10:18:40 -07:00
Eric Anholt	2788798388	i965: Drop a couple of unused defines. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-15 10:18:37 -07:00
Eric Anholt	fbc088ee49	i965: Remove dead arguments from prog_data_compare. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-15 10:18:32 -07:00
Alexander von Gluck IV	ce8eadb6e8	build: remove forced -fno-rtti * As discussed on the mailing list, forced no-rtti breaks C++ public API's such as the Haiku C++ libGL.so * -fno-rtti can be still set however instead of blindly forcing -fno-rtti, we can rely on the llvm-config --cppflags output. If the system llvm is built without rtti (default), the no-rtti flag will be present in llvm-config --cppflags (which we pick up on) If llvm is built with rtti (REQUIRES_RTTI=1), then -fno-rtti is removed from llvm-config --cppflags. * We could selectively add / remove rtti from various components, however mixing rtti and non-rtti code is tricky and could introduce missing symbols. * This needs impact tested. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-10-14 23:00:55 -05:00
Matt Turner	7a2e9f9778	configure.ac: Don't check for awk, grep, nm. Not used since `d53901c6`.	2013-10-14 11:13:09 -07:00
Matt Turner	9ae1f0bad6	configure.ac: Don't check for cross compiling. Dead since `c845140a`.	2013-10-14 11:13:09 -07:00
Matt Turner	a5ec01fb1b	i965: Don't copy prop source mods into instructions that can't take them.	2013-10-14 11:13:09 -07:00
Constantin Baranov	53904c64da	mesa: Add missing switch break in invalidate_framebuffer_storage() Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70411 Cc: "9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-14 09:06:07 -06:00
Grigori Goronzy	e6c2afa9ce	st/vdpau: add format conversions for GetBitsYCbCr Add simple plain C routines for NV12<->YV12 and YUYV<->UYVY conversions. The NV12->YV12 conversion is commonly used, for instance by VLC. Reviewed-by: Christian König <christian.koenig@amd.com>	2013-10-13 20:09:38 +02:00
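As an illustration of the kind of plain C routine `e6c2afa9ce` refers to, here is a minimal NV12 to YV12 sketch. It is not the code from the patch. The function name is hypothetical, and it assumes tightly packed planes (no row padding) and even width and height. NV12 stores the Y plane followed by interleaved U/V samples; YV12 stores the Y plane, then the V plane, then the U plane.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: converts a tightly packed NV12 image to YV12.
 * Assumes width and height are even and rows carry no padding. */
static void
nv12_to_yv12_sketch(const uint8_t *nv12, uint8_t *yv12,
                    size_t width, size_t height)
{
   size_t luma_size = width * height;
   size_t chroma_size = (width / 2) * (height / 2);
   const uint8_t *uv = nv12 + luma_size;     /* interleaved U0 V0 U1 V1 ... */
   uint8_t *v_plane = yv12 + luma_size;      /* YV12: V plane comes first */
   uint8_t *u_plane = v_plane + chroma_size; /* then the U plane */

   /* The luma plane is identical in both layouts. */
   memcpy(yv12, nv12, luma_size);

   /* De-interleave the chroma samples into separate planes. */
   for (size_t i = 0; i < chroma_size; i++) {
      u_plane[i] = uv[2 * i];
      v_plane[i] = uv[2 * i + 1];
   }
}
```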
Grigori Goronzy	f250fd59c4	radeon: use staging for mapping linear textures Textures that likely reside in VRAM, are mapped for reading and don't require direct mapping should be staged into GTT, to avoid bad performance. This fixes readback performance of VDPAU surfaces. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-10-13 20:09:34 +02:00
Grigori Goronzy	270fab5164	radeon/uvd: use PIPE_BIND_LINEAR for video surfaces This new bind flag forces linear storage, but does not have other side effects like R600_RESOURCE_FLAG_TRANSFER. Reviewed-by: Christian König <christian.koenig@amd.com>	2013-10-13 20:09:02 +02:00
Vincent Lejeune	6e51c2a941	radeonsi: Allow Sinking pass to move preloaded const/res/sampl This fixes a crash in Unigine Heaven 3.0, and probably in some other apps.	2013-10-13 20:03:42 +02:00
Vadim Girlin	453ea2d309	radeonsi: pass alpha_ref value to PS in the user sgpr Currently it's hardcoded in the shader, so every change requires compilation of the shader variant, killing the performance in Serious Sam 3 and probably other apps. This patch passes alpha_ref in the user sgpr and removes it from the shader key. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-10-13 20:03:35 +04:00
Vadim Girlin	10ddeb910b	r600g: fix tgsi_op2_s with trans-only instructions This fixes the issue when dst and src is the same reg and operation on one channel overwrites the source for other channels, e.g.: UMUL TEMP[2].xyz, TEMP[0].xyzz, TEMP[2].xxxx In this example the result of the operation on channel x is written in TEMP[2].x and then used as a second source operand for channels y and z instead of original value in TEMP[2].x. This patch stores the results in temp reg and moves them to dst after performing operation on all channels. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=70327 Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-10-13 20:03:35 +04:00
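The aliasing hazard fixed by `10ddeb910b` can be reproduced outside the driver. The sketch below models `UMUL TEMP[2].xyz, TEMP[0].xyzz, TEMP[2].xxxx` on plain arrays. The function names are invented for the example, and this is not the r600g code, only the same idea of staging results in a temporary before writing the destination.

```c
#include <stdint.h>

enum { CH_X, CH_Y, CH_Z, CH_W };

/* Writes each channel straight to dst. If dst aliases b, the write to
 * dst[CH_X] clobbers the b.x value that channels y and z still need. */
static void
umul_xyz_direct(uint32_t dst[4], const uint32_t a[4], const uint32_t b[4])
{
   for (int ch = CH_X; ch <= CH_Z; ch++)
      dst[ch] = a[ch] * b[CH_X];
}

/* Computes every channel into a temporary first, then moves the results
 * to dst, so all channels read the original b.x. */
static void
umul_xyz_staged(uint32_t dst[4], const uint32_t a[4], const uint32_t b[4])
{
   uint32_t tmp[3];

   for (int ch = CH_X; ch <= CH_Z; ch++)
      tmp[ch] = a[ch] * b[CH_X];
   for (int ch = CH_X; ch <= CH_Z; ch++)
      dst[ch] = tmp[ch];
}
```

With TEMP[0] = {2, 3, 4, 5} and TEMP[2].x = 10, the direct version yields 20, 60, 80 in x, y, z, while the staged version yields 20, 30, 40.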
Kenneth Graunke	8958741e5a	i965: Merge intel_context.h into brw_context.h. v2: Keep the random 32-bit only version of memcpy, since Ian says I can't delete it without data proving it isn't useful. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:44 -07:00
Kenneth Graunke	3dda3ebec9	i965: Delete our copy of likely/unlikely macros. brw_context.h includes imports.h which includes compiler.h which already defines these. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:44 -07:00
Kenneth Graunke	67601da24c	mesa: Move U_FIXED/S_FIXED macros from i965 to macros.h. These make it easy to convert a floating point value to a fixed point number. The second parameter is the number of bits used for the fractional part of the number. It looks like core Mesa has similar functions already, but none that allows an arbitrary number of fractional bits. The more generic version is probably useful to everyone. r600g apparently has an identical copy of the S_FIXED macro, but doesn't include this file. I'm not sure what to do about that, so I'm just going to leave it for now. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:44 -07:00
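For readers unfamiliar with the conversion that `67601da24c` describes, a minimal sketch follows. It uses inline functions with hypothetical names rather than the actual macros, and it assumes the conversion is a multiply by 2^frac_bits followed by a truncating cast.

```c
#include <stdint.h>

/* Hypothetical function forms of a float to fixed point conversion.
 * frac_bits is the number of bits used for the fractional part. */
static inline int32_t
s_fixed_sketch(float value, unsigned frac_bits)
{
   /* Scale by 2^frac_bits; the cast truncates toward zero. */
   return (int32_t)(value * (float)(1u << frac_bits));
}

static inline uint32_t
u_fixed_sketch(float value, unsigned frac_bits)
{
   /* Only meaningful for non-negative inputs. */
   return (uint32_t)(value * (float)(1u << frac_bits));
}
```

For example, 1.5 with 8 fractional bits becomes 384 (0x180), and 0.25 with 4 fractional bits becomes 4.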
Kenneth Graunke	1a82081db6	mesa: Move ROUND_DOWN_TO() macro from i915/i965 to macros.h. This seems generally useful, so it may as well live in core Mesa. In fact, the comment for ALIGN() in macros.h actually says to "see also" ROUND_DOWN_TO, which...was in a driver somewhere. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:44 -07:00
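A rounding helper of the kind `1a82081db6` moves can be sketched as below, next to its round-up counterpart. The function names are hypothetical, and both assume the alignment is a power of two.

```c
#include <stdint.h>

/* Rounds value down to a multiple of alignment.
 * Assumes alignment is a power of two. */
static inline uint32_t
round_down_to_sketch(uint32_t value, uint32_t alignment)
{
   return value & ~(alignment - 1);
}

/* Rounds value up to a multiple of alignment (same assumption). */
static inline uint32_t
align_up_sketch(uint32_t value, uint32_t alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}
```

With an alignment of 8, the value 13 rounds down to 8 and up to 16, while 16 is left unchanged by both.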
Kenneth Graunke	50c9f04c5f	i965: Move need_workaround_flush = true to intel_batchbuffer_init. intel_batchbuffer_init() sets up initial batchbuffer state; it seems like a reasonable place to initialize this flag. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:44 -07:00
Kenneth Graunke	ddc8decdb2	i965: Move DriverFlag initialization to brw_init_state(). Configuring which dirty flags we want sounds like a job for brw_init_state(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:44 -07:00
Kenneth Graunke	ba0cc79ab9	i965: Merge intelInitContext into brwCreateContext. The split here was completely arbitrary. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:44 -07:00
Kenneth Graunke	90d52d2c76	i965: Move viewport driver hook setup to brw_init_driver_functions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:44 -07:00
Kenneth Graunke	f118fc26e1	i965: Make brwInitFunctions take brw_context rather than intel_screen. It actually just wants generation checking, and brw->gen is the usual way of doing that. In the future, we'll also want to check brw->hw_ctx, which isn't available from the screen. While we're changing the function signature, convert from camel case to our usual naming conventions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:44 -07:00
Kenneth Graunke	9848a42287	i965: Merge intelInitFunctions() and brwInitFunctions(). They do exactly the same thing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:44 -07:00
Kenneth Graunke	0138fd4610	i965: Merge intel_context.c into brw_context.c. There's no point in having two files for context functions. This patch moves the code from intel_context.c into brw_context.c unmodified (other than whitespace fixes). Right now, this looks silly; future patches will merge functions and tidy things up. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-13 00:10:44 -07:00
Kenneth Graunke	8d315b2583	i965: Move memset of TextureFormatSupported to brw_init_surface_formats. brw_init_surface_formats already sets entries in TextureFormatsSupported to true; it may as well take care of initializing it to false too. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:44 -07:00
Kenneth Graunke	fc5b865cec	i965: Remove has_aa_line_parameters. This flag is only used in one place, and is only set on one platform. Just check for original Gen4 in the relevant function. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:44 -07:00
Kenneth Graunke	220c1e5610	i965: Move state setup from brwCreateContext to brw_init_state(). This seems like a better place for it, and helps clean up brwCreateContext (which is full of a lot of random stuff). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-13 00:10:44 -07:00
Kenneth Graunke	d31b928b93	i965: Remove the brw_context::emit_state_always flag. This was always set to false, and is only used for debugging. To enable it, simply change the if (0) block and recompile. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:44 -07:00
Kenneth Graunke	02b632d8e8	i965: Move hardware feature flags to brw_device_info. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:44 -07:00
Kenneth Graunke	ea890c031d	i965: Move device quirks to brw_device_info. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:43 -07:00
Kenneth Graunke	d76f6c7ae4	i965: Move hardware limits to brw_device_info. Since each kind of device has its own brw_device_info structure, we can simply store the URB and thread limits there. This eliminates all the large if-ladders, and simplifies the context initialization code quite a bit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-13 00:10:43 -07:00
Kenneth Graunke	afe05e7193	i965: Replace some intel_screen fields with brw_device_info references. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:43 -07:00
Kenneth Graunke	9d490c172b	i965: Delete the INTEL_SEPARATE_STENCIL override. This option was useful during initial development, but it's been ages since I've heard of anyone using it. Plus, Gen7+ mandates separate stencil, so it was really only useful on Sandybridge anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:43 -07:00
Kenneth Graunke	6e9f427ed8	i965: Add a new brw_device_info structure. The idea is that struct brw_device_info should store statically-known information about hardware features. Using the new family name in the PCI ID table, we can easily grab the right structure. This is basically the equivalent of intel_device_info in the kernel. This patch also makes the new structure available from intel_screen, but nothing uses it. Right now, it looks very redundant with existing fields, but that will change. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:43 -07:00
Kenneth Graunke	4a29b9a066	i965: Add the family name to the PCI ID table. I removed this a while ago, since we never used it, but I'm finally resurrecting the idea in the next commits. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:43 -07:00
Kenneth Graunke	8d4ecbccd6	i965: Remove #define name from PCI ID table. Nothing uses the #define name, and it's not terribly useful - the numerical ID serves the same purpose. The only thing we could really do with it is generate slightly prettier preprocessed code. But who looks at that? Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:43 -07:00
Kenneth Graunke	90511faedd	i965: Pull most driconf option handling into a centralized function. Using a helper function clarifies the context initialization code. I would've liked to completely centralize it, but moving the optionCache code from intelInitExtensions into here would've required setting flags in the context, which seems like a waste. v2: Rebase for the introduction of disable_derivative_optimization. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:43 -07:00
Kenneth Graunke	0fb525b87c	i965: Move a bunch of code from intelInitContext to brwCreateContext. Now that intelInitContext isn't shared between i915 and i965, the split is fairly arbitrary. This patch moves a bunch of the basic context creation and generation checking code up to the top-level function (and slightly earlier). More will follow. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:43 -07:00
Kenneth Graunke	a25caad9e4	i965: Update the comment about viewport hacks. It wasn't clear that this was necessary for EGL, or why. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:43 -07:00
Kenneth Graunke	832bcc3613	i965: Pull out INTEL_DEBUG handling into new intel_debug.[ch] files. Now that there isn't an intel_context structure, the split between brw_context.[ch] and intel_context.[ch] is rather awkward and arbitrary. Removing intel_context.[ch] seems desirable, but not everything really belongs in brw_context.[ch], either. Moving INTEL_DEBUG handling into separate intel_debug.[ch] files should make them relatively easy to find. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:43 -07:00
Kenneth Graunke	3f7b4e5d04	i965: Rename brwCreateContext's error parameter to dri_ctx_error. "error" is a very generic name. dri_ctx_error is the name used in intelInitContext(), which is more specific. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-13 00:10:43 -07:00
Eric Anholt	95bd8a332d	dri: Move i965-specific context flag logic to dri common. Nobody else yet can do a forward context anyway, but others should be able to do debug contexts, and those would have just had no effect currently.	2013-10-13 00:10:43 -07:00
Stephane Marchesin	5ceeeb360e	i915g: Fix assert Now that we support start, assert on start + num < max samplers Reported by xexaxo	2013-10-12 11:40:54 -07:00
Paul Berry	975c6ce605	mesa: Bump version to 10.0.0. Mesa now supports OpenGL 3.2 and GLSL 1.50, so bump the Mesa major version from 9 to 10 to reflect this. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-12 08:58:18 -07:00
Paul Berry	200f9a0576	mesa: Remove warning that geometry shader support is experimental. Geometry shader support is now working well, and adequately piglit tested. There are just a few piglit failures left to fix. So there's no need for an "experimental" warning anymore. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-12 08:58:02 -07:00
Paul Berry	b6d6ea396c	i965: Turn on GLSL 1.50 and GL 3.2 support for i965 gen7. Geometry shaders were the last thing we needed to finish before turning on GLSL 1.50 and GL 3.2 support. They are now working well, with just a few piglit failures left to fix. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-12 08:57:45 -07:00
Jay Cornwall	d7d539a1cb	radeon/llvm: show LLVM disassembly when available With code dump enabled LLVM may generate disassembly during compilation. Show this disassembly when available and prefer it to SI bytecode dump. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Jay Cornwall <jay@jcornwall.me>	2013-10-12 00:03:58 -04:00
Roland Scheidegger	7681beedd1	softpipe: fix seamless cube filtering Fix coord wrapping (and face selection too) in case of edges. Unfortunately, the coord wrapping is way more complicated than what the code did, as it depends on the face and the direction where the texel falls off the face (the logic needed to get this right in fact seems utterly ridiculous). Also fix a bug in (y direction under/overflow) face selection. And get rid of complicated cube corner handling. Just like edge case, the coord wrapping was wrong and it seems very difficult to fix. I'm near certain it can't always work anyway (though ordinary seamless filtering on edge has actually a similar problem but not as severe) because we don't have per-pixel face, hence could have multiple corner texels which would make it very difficult to average the remaining texels correctly. Hence simply pick a texel which would only have fallen off one edge but not both instead, which is not quite accurate but actually I think should be enough to meet OpenGL (but not d3d10) requirements. v2: small fixes suggested by Brian, add some comments. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-12 04:05:57 +02:00
Roland Scheidegger	75f1fea14f	llvmpipe: increase fs shader variant instruction cache limit by factor 4 The previous limit of 128*1024 was reported on IRC to cause frequent recompiles in some apps due to shader variant thrashing, leading to noticeable lags. Note that the LP_MAX_SHADER_VARIANTS limit (1024) was more or less impossible to reach, since even simple fragment shaders without texturing (glxgears) used more than twice 128 instructions, hence the instruction limit would have always been reached first (excluding things like trivial shaders not writing color). Even with the new limit it is VERY likely the instruction limit is hit first. Should help with such lags due to recompiles (though other shader types have their own limits, LP_MAX_SETUP_VARIANTS and DRAW_MAX_SHADER_VARIANTS, in particular the latter seems a bit small (128)). Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-12 04:05:57 +02:00
Vinson Lee	a9a78640d9	mesa: Do not use newlocale on NetBSD. Fixes this build error. CC imports.lo ../../src/mesa/main/imports.c: In function '_mesa_strtof': ../../src/mesa/main/imports.c:570:20: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'loc' ../../src/mesa/main/imports.c:570:20: error: 'loc' undeclared (first use in this function) ../../src/mesa/main/imports.c:570:20: note: each undeclared identifier is reported only once for each function it appears in ../../src/mesa/main/imports.c:572:7: error: implicit declaration of function 'newlocale' ../../src/mesa/main/imports.c:572:23: error: 'LC_CTYPE_MASK' undeclared (first use in this function) ../../src/mesa/main/imports.c:574:4: error: implicit declaration of function 'strtof_l' ../../src/mesa/main/imports.c:580:1: warning: control reaches end of non-void function Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-10-11 17:04:54 -07:00
Brian Paul	1737189f0a	svga: s/0/FALSE/	2013-10-11 17:07:44 -06:00
Brian Paul	6f1b5052ec	mesa: add comment to clarify ctx->Driver.MapBufferRange() return value	2013-10-11 17:07:44 -06:00
Brian Paul	3710b65823	st/mesa: whitespace fixes in st_cb_bufferobjects.c	2013-10-11 17:07:44 -06:00
Brian Paul	ffe529352b	vbo: assorted minor clean-ups Use GL_TRUE/FALSE instead of 1/0. Remove extraneous parentheses. Remove trailing whitespace.	2013-10-11 17:07:44 -06:00
Brian Paul	2a429f9d9c	glsl: fix signed/unsigned comparison warning	2013-10-11 17:07:44 -06:00
Kristian Høgsberg	1d34927061	wayland: Only pass wl_drm instance to gbm when using gbm platform	2013-10-11 15:30:09 -07:00
Kristian Høgsberg	360a141f24	wayland: Don't rely on static variable for identifying wl_drm buffers Now that libEGL has been fixed to not leak all kinds of symbols, gbm links to its own copy of the libwayland-drm.a helper library. That means we can't rely on comparing the addresses of a static vtable symbol in that library to determine if a wl_buffer is a wl_drm_buffer. Instead, we move the vtable into the wl_drm struct and use that for comparing. https://bugs.freedesktop.org/show_bug.cgi?id=69437 Cc: 9.2 <mesa-stable@lists.freedesktop.org>	2013-10-11 15:14:35 -07:00
Vinson Lee	fe6974382b	glapi: Do not use backtrace on NetBSD. execinfo.h is not available on NetBSD. Fixes this build error. CC glapi_gentable.lo glapi_gentable.c:44:22: fatal error: execinfo.h: No such file or directory Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-10-11 14:48:45 -07:00
Ian Romanick	59f18340c3	glsl: Remove extraneous .dir-locals.el This was overriding the top-level .dir-locals.el causing some settings (like forcing spaces instead of tabs!) to be lost. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-11 10:43:37 -07:00
Grigori Goronzy	3de7e11f58	r600g: fix crash in set_framebuffer_state We should be able to safely set the framebuffer state without a fragment shader bound. bind_ps_state will take care of updating the necessary state bits later. v2: check in update_db_shader_control	2013-10-11 17:33:18 +02:00
Topi Pohjolainen	396c69bf5d	mesa: Allow external textures to use fallback (0, 0, 0, 1) Fixes GL2ExtensionTests/egl_image_external/TestSimpleUnassociated.test which is part of gles2/3 conformance suite. Here image external textures are switched to be treated the same as 2D textures. These can be associated with the fallback texture providing fixed sample values of (0, 0, 0, 1). The OES_EGL_image_external spec says: "Sampling an external texture which is not associated with any EGLImage sibling will return a sample value of (0,0,0,1)." "External textures cannot be used with TexImage2D, TexSubImage2D, CompressedTexImage2D, CompressedTexSubImage2D, CopyTexImage2D, or CopyTexSubImage2D, and an INVALID_ENUM error will be generated if this is attempted." And quoting Chad: "That's enforced in _mesa_TexImage() by calling legal_teximage_target(), and enforced in _mesa_TexSubImage() by calling legal_texsubmimage_target(). Each of the legal_tex*image_target() functions reject external textures. Therefore, allowing GL_TEXTURE_EXTERNAL_OES in store_texsubimage() won't violate the above spec quote. I think it's safe to allow GL_TEXTURE_EXTERNAL_OES in store_texsubimage(), as long as the texture has only a single plane. Luckily, that's the only type of external textures that Mesa currently supports." CC: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2013-10-11 09:59:01 +03:00
Chad Versace	9cb8f7a126	doxygen: Add i965 to list of modules in html header Signed-off-by: Chad Versace <chad@chad-versace.us>	2013-10-10 22:20:39 -07:00
Frank Henigman	49ed5991ee	i965: extend fast texture upload Extend the fast texture upload from BGRA X-tiled to include RGBA, Alpha/Luminance, and Y-tiled. Speed improvements, measured with mesa demos teximage program, on 256 x 256 texture, in MB/s, on a Sandy Bridge (Ivy is comparable): before after increase BGRA/X-tiled 3266 4524 1.39x BGRA/Y-tiled 1739 3971 2.28x RGBA/X-tiled 474 4694 9.90x RGBA/Y-tiled 477 3368 7.06x L/X-tiled 1268 1516 1.20x L/Y-tiled 1439 1581 1.10x v2: Cosmetic changes only: reformat and reword comments, make doxygen-friendly, rename variables, use existing macros, add an assert. Signed-off-by: Frank Henigman <fjhenigman@google.com> Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>	2013-10-10 18:16:41 -07:00
Alexander von Gluck IV	0fda1cb498	haiku: Fix llvmpipe and clean up softpipe tracing * Fix LLVM library and defines * Only enable tracing when scons build=debug Acked-by: Brian Paul <brianp@vmware.com>	2013-10-10 19:28:23 -05:00
Alexander von Gluck IV	69508950da	haiku: Remove common directory search path * /boot/common no longer exists in Haiku as of a few days ago (and this is undefined) Acked-by: Brian Paul <brianp@vmware.com>	2013-10-10 19:28:23 -05:00
Eric Anholt	8821e9d108	dri: Reference the global driver vtable once at screen init. This is part of the prep for megadrivers, which won't allow using a single global symbol due to the fact that there will be multiple drivers built into the same dri.so file. For that, we'll need screen init to take a reference to the driver to set up this vtable. v2: Fix two missed references to driDriverAPI. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2013-10-10 16:34:30 -07:00
Eric Anholt	ee8983becc	i965: Clean up error handling for context creation. The intel_screen.c used to be a dispatch to one of 3 driver functions, but was down to 1, so it was kind of a waste. In addition, it was trying to free all of the data that might have been partially freed in the kernel 3.6 check (which comes after intelInitContext, and thus might have had driverPrivate set and result in intelDestroyContext() doing work on the freed data). By moving the driverPrivate setup earlier, we can use intelDestroyContext() consistently and avoid such problems in the future. v2: Adjust the prototype of brwCreateContext to use the proper enum (fixing a compiler warning in some builds) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2013-10-10 16:34:30 -07:00
Eric Anholt	18a8f31070	intel: Remove silly check for !bufmgr. If bufmgr didn't get created, then screen creation failed, and we never should have got here in the first place. This was added by Chris Wilson in 2010 with no explanation for why it would be needed. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-10 16:34:30 -07:00
Eric Anholt	083f66fdd6	dri: Move API version validation into dri/common. i965, i915, radeon, r200, swrast, and nouveau were mostly trying to do the same logic, except where they failed to. Notably, swrast had code that appeared to try to enable GLES1/2 but forgot to set api_mask (thus preventing any gles context from being created), and the non-intel drivers didn't support MESA_GL_VERSION_OVERRIDE. nouveau still relies on _mesa_compute_version(), because I don't know what its limits actually are, and gallium drivers don't declare limits up front at all. I think I've heard talk about doing so, though. v2: Compat max version should be 30 (noted by Ken) Drop r100's custom max version check, too (noted by Emil Velikov) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-10 16:34:30 -07:00
Eric Anholt	d81632fb1e	dri: Merge drisw_util.c into dri_util.c The only important difference was not calling drmGetVersion, and making the swrast extension vtable. That doesn't justify duplicating the other 330 lines of code. v2: fix the scons build (code by Emil Velikov) v3: fix scons build with swrast-only (code by Emil Velikov) v4: Drop the new define I added, when we already have __NOT_HAVE_DRM_H. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-10 16:34:30 -07:00
Eric Anholt	683f6daa97	dri: Add an explanatory comment for an important driver entrypoint. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-10 16:34:30 -07:00
Eric Anholt	7f3a131b6e	dri: Remove dead comment. The code it was referencing was removed in 2010. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-10 16:34:30 -07:00
Eric Anholt	36fbe66d3a	i965/fs: Convert gen7 to using GRFs for texture messages. Looking at Lightsmark's shaders, the way we used MRFs (or in gen7's case, GRFs) was bad in a couple of ways. One was that it prevented compute-to-MRF for the common case of a texcoord that gets used exactly once, but where the texcoord setup all gets emitted before the texture calls (such as when it's a bare fragment shader input, which gets interpolated before processing main()). Another was that it introduced a bunch of dependencies that constrained scheduling, and forced waits for texture operations to be done before they are required. For example, we can now move the compute-to-MRF interpolation for the second texture send down after the first send. The downside is that this generally prevents remove_duplicate_mrf_writes() from doing anything, whereas previously it avoided work for the case of sampling from the same texcoord twice. However, I suspect that most of the win that originally justified that code was in avoiding the WAR stall on the first send, which this patch also avoids, rather than the small cost of the extra instruction. We see instruction count regressions in shaders in unigine, yofrankie, savage2, hon, and gstreamer. Improves GLB2.7 performance by 0.633628% +/- 0.491809% (n=121/125, avg of ~66fps, outliers below 61 dropped). Improves openarena performance by 1.01092% +/- 0.66897% (n=425). No significant difference on Lightsmark (n=44). v2: Squash in the fix for register unspilling for send-from-GRF, fixing a segfault in lightsmark. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Matt Turner <mattst88@gmail.com>	2013-10-10 15:54:16 -07:00
Eric Anholt	ee21c8b1e6	i965/fs: Allocate more register classes on gen7. For texturing from GRFs, we now have payloads of arbitrary sizes up to the message length limit. v2 (Kenneth Graunke): Rebase on intel_context -> brw_context change. v3: Add some comment text. v4: Change some magic 16s to BRW_MAX_MRF (noted by Ken). Leave the 11, which is the magic "max sampler message length". BRW_MAX_MRF sizing on the little int arrays is retained because I could see us needing to extend in the future if we move to GRFs for FB writes (those go to at least 12 long in a quick scan of the specs) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2) Acked-by: Matt Turner <mattst88@gmail.com>	2013-10-10 15:54:16 -07:00
Eric Anholt	b6af650a09	i965/fs: Use per-channel interference for register_coalesce_2(). This will let us coalesce into texture-from-GRF arguments, which would otherwise be prevented due to the live interval for the whole vgrf extending across all the MOVs setting up the channels of the message. v2 (Kenneth Graunke): Rebase for renames. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-10 15:54:16 -07:00
Eric Anholt	3093085847	i965/fs: Use the new per-channel live ranges for dead code elimination. v2 (Kenneth Graunke): Rebase on s/live_variables/live_intervals/g. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-10 15:54:16 -07:00
Eric Anholt	b4d676d710	i965/fs: Keep a copy of the live variables class around. Now optimization passes will be able to look at the per-channel ranges. v2: Rebase on various optimization pass changes. v3 (Kenneth Graunke): Rename live_variables to live_intervals; split introduction of invalidate_live_intervals() into a separate patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-10 15:54:15 -07:00
Kenneth Graunke	3ea84beb16	i965/fs: Invalidate live intervals when compacting; don't fix them. When compacting the list of VGRFs, we patch up the live interval ranges (which are indexed by VGRF number). Unfortunately, once we make per-component data available, this will become too complicated to maintain. Instead, simply invalidate them. This was pulled out of a patch by Eric Anholt. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-10 15:54:15 -07:00
Kenneth Graunke	939b0f2c2f	i965/fs: Remove start/end aliases in compute_live_intervals(). In compute_live_intervals(), start and end are shorter names for the virtual_grf_start and virtual_grf_end class members. Now that the fs_live_intervals class has arrays named start and end which are indexed by var, rather than VGRF, reusing the name is confusing. Plus, most of the code has been factored out, so using the long names isn't as inconvenient. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-10 15:54:15 -07:00
Eric Anholt	398656d97e	i965/fs: Track live variable ranges on a per-channel level. This is the information we'll actually use to replace the virtual_grf_start[]/end[] arrays. No change in shader-db. v2 (Kenneth Graunke): Rebase; minor comment updates. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-10 15:54:15 -07:00
Eric Anholt	097bf101c3	i965/fs: Factor def[]/use[] setup out to a separate function. These blocks are about to grow some more code, and the indentation was getting out of hand. v2 (Kenneth Graunke): Rebase, minor typo fixes and style changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-10 15:54:15 -07:00
Kenneth Graunke	4b821a97b5	i965/fs: Create a helper function for invalidating live intervals. For now, this simply sets live_intervals_valid = false, but in the future it will do something more sophisticated. Based on a patch by Eric Anholt. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-10 15:54:15 -07:00
Eric Anholt	45ffaeccaf	i965/fs: Do live variables dataflow analysis on a per-channel level. This significantly improves our handling of VGRFs of size > 1. Previously, we only marked VGRFs as def'd if the whole register was written by a single instruction. Large VGRFs which were written piecemeal would not be considered def'd at all, even if they were ultimately completely written. Without being def'd, these were then marked "live in" to the basic block, often extending the range to preceding blocks and sometimes even the start of the program. The new per-component tracking gives more accurate live intervals, which makes register coalescing more effective. In the future, this should help with texturing from GRFs on Gen7+. A sampler message might be represented by a 2-register VGRF which holds the texture coordinates. If those are incoming varyings, they'll be produced by two PLN instructions, which are piecemeal writes. No reduction in shader-db instruction counts. However, code which prints the live interval ranges does show that some VGRFs now have smaller (and more correct) live intervals. v2: Rebase on current send-from-GRF code requiring adding extra use[]s. v3: Rebase on live intervals fix to include defs in the end of the interval. v4 (Kenneth Graunke): Rebase; split off a few preparatory patches; add lots of comments; minor style changes; rewrite commit message. v5 (Eric Anholt): whitespace nit. Written-by: Eric Anholt <eric@anholt.net> [v1-3] Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [v4] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> (v4)	2013-10-10 15:54:14 -07:00
Kenneth Graunke	5af8388110	i965/fs: Rename num_vars to num_vgrfs in live interval analysis. num_vars was shorthand for the number of virtual GRFs. num_vgrfs is a bit clearer. Plus, the next patch will introduce "vars" which are distinct from vgrfs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-10 15:54:14 -07:00
Kenneth Graunke	701e9af15f	i965/fs: Short-circuit a loop in live variable analysis. This has no functional effect, but should make subsequent changes a little simpler. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-10 15:54:14 -07:00
Paul Berry	8cb9cce040	glsl: Don't allow gl_PerVertex to be redeclared after it's been used. Fixes piglit tests: - spec/glsl-1.50/compiler/gs-redeclares-pervertex-in-after-other-usage.geom - spec/glsl-1.50/compiler/gs-redeclares-pervertex-out-after-other-usage.geom - spec/glsl-1.50/compiler/gs-redeclares-pervertex-out-after-usage.geom - spec/glsl-1.50/compiler/vs-redeclares-pervertex-out-after-other-usage.vert - spec/glsl-1.50/compiler/vs-redeclares-pervertex-out-after-usage.vert Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:27:40 -07:00
Paul Berry	84b9fa83a0	glsl: Support redeclaration of GS gl_PerVertex input. Fixes piglit test spec/glsl-1.50/execution/redeclare-pervertex-subset-vs-to-gs. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:27:38 -07:00
Paul Berry	fc2330b0be	glsl: Catch redeclaration of interface block instance names at compile time. From section 4.1.9 (Arrays) of the GLSL 4.40 spec (as of revision 7): However, unless noted otherwise, blocks cannot be redeclared; an unsized array in a user-declared block cannot be sized through redeclaration. The only place where the spec notes that interface blocks can be redeclared is to allow for redeclaration of built-in interface blocks such as gl_PerVertex. Therefore, user-defined interface blocks can never be redeclared. This is a clarification of previous intent (see Khronos bug 10659). We were already preventing interface block redeclaration using the same block name at compile time, but we weren't preventing interface block redeclaration using the same instance name (and different block names) at compile time. And we weren't preventing an instance name from conflicting with a previously-declared ordinary variable. In practice the problem would be caught at link time, but only because of a coincidence: since ast_interface_block::hir() wasn't doing any checking to see if the instance name already existed in the shader, it was creating a second ir_variable in the shader having the same name but a different type. Coincidentally, when the linker checked for intrastage consistency of global variable declarations, it treated the two declarations from the same shader as a conflict, so it reported a link error. But it seems dangerous to rely on that linker behaviour to catch illegal redeclarations that really ought to be detected at compile time. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:27:35 -07:00
Paul Berry	1b4a7378e9	glsl: Support redeclaration of VS and GS gl_PerVertex output. Fixes piglit tests: - spec/glsl-1.50/execution/redeclare-pervertex-out-subset-gs - spec/glsl-1.50/execution/redeclare-pervertex-subset-vs Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:27:33 -07:00
Paul Berry	79f515251a	glsl: Error check redeclarations of gl_PerVertex. This patch verifies that: - The gl_PerVertex input interface block may only be redeclared in a geometry shader, and that it may only be redeclared as gl_in[]. - The gl_PerVertex output interface block may only be redeclared in a vertex or geometry shader, and that it may only be redeclared as a non-array without an interface name. - gl_PerVertex may not be redeclared as any other type of interface block (i.e. as a uniform interface block). As a side-effect, the code now keeps track of what the previous declaration of gl_PerVertex was--this will be needed in future patches. Fixes piglit tests: - spec/glsl-1.50/compiler/gs-redeclares-pervertex-in-with-incorrect-name.geom - spec/glsl-1.50/compiler/gs-redeclares-pervertex-out-as-array.geom - spec/glsl-1.50/compiler/gs-redeclares-pervertex-out-with-instance-name.geom Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:27:31 -07:00
Paul Berry	3c83c96dcd	glsl: Make it possible to disable a variable in the symbol table. In later patches, we'll use this in order to implement the required behaviour that after the gl_PerVertex interface block has been redeclared, only members of the redeclared interface block may be used. v2: Update the function name and comment to clarify that we aren't actually removing the variable from the symbol table, just disabling it. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:27:27 -07:00
Paul Berry	24b9bba19b	glsl: Add an ir_variable::reinit_interface_type() function. This will be used by future patches to change an ir_variable's interface type when the gl_PerVertex built-in interface block is redeclared. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:27:22 -07:00
Paul Berry	3699ff4dd1	glsl: Generalize processing of variable redeclarations. This patch modifies the get_variable_being_redeclared() function so that it no longer relies on the ast_declaration for the variable being redeclared. In future patches, this will allow get_variable_being_redeclared() to be used for processing redeclarations of the built-in gl_PerVertex interface block. v2: Also make get_variable_being_redeclared() static. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:27:20 -07:00
Paul Berry	78b072b2bc	glsl: Don't allow invalid identifiers as struct names. Fixes piglit test spec/glsl-1.10/compiler/struct/struct-name-uses-gl-prefix.vert. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:27:17 -07:00
Paul Berry	9fb6f59552	glsl: Don't allow invalid identifiers as interface block instance names. Note: we need to make an exception for the gl_PerVertex interface block, since in geometry shaders it is allowed to be redeclared with the instance name gl_in. Future patches will make redeclaration of gl_PerVertex work properly. Fixes piglit test spec/glsl-1.50/compiler/interface-block-instance-name-uses-gl-prefix.vert. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:27:15 -07:00
Paul Berry	9b5b0320b6	glsl: Don't allow invalid identifier names in struct/interface fields. Note: we need to make an exception for the gl_PerVertex interface block, since built-in variables are allowed to be redeclared inside it. Future patches will make redeclaration of gl_PerVertex work properly. Fixes piglit tests: - spec/glsl-1.50/compiler/interface-block-array-elem-uses-gl-prefix.vert - spec/glsl-1.50/compiler/named-interface-block-elem-uses-gl-prefix.vert - spec/glsl-1.50/compiler/unnamed-interface-block-elem-uses-gl-prefix.vert Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:27:12 -07:00
Paul Berry	f2dd3a04ce	glsl: Don't allow invalid identifiers as interface block names. Note: we need to make an exception for the gl_PerVertex interface block, since this is allowed to be redeclared. Future patches will make redeclaration of gl_PerVertex work properly. Fixes piglit test spec/glsl-1.50/compiler/interface-block-name-uses-gl-prefix.vert. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:27:10 -07:00
Paul Berry	9bb60a155f	glsl: Don't allow unnamed interface blocks to redeclare variables. Note: some limited amount of redeclaration is actually allowed, provided the shader is redeclaring the built-in gl_PerVertex interface block. Support for this will be added in future patches. Fixes piglit tests spec/glsl-1.50/compiler/unnamed-interface-block-elem-conflicts-with-prev-{block-elem,global}.vert. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:27:08 -07:00
Paul Berry	1838df97a2	glsl: Refactor code to check that identifier names are valid. GLSL reserves identifiers beginning with "gl_" or containing "__", but we haven't been consistent about enforcing this rule. This patch makes a new function to check whether identifier names are valid. In the process it closes a loophole where we would previously allow function argument names to contain "__". v2: Rename check_valid_identifier() -> validate_identifier(). Add curly braces in validate_identifier(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:27:05 -07:00
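The reservation rule quoted in the commit above (identifiers beginning with "gl_" or containing "__") is simple enough to sketch. The Python below is only an illustration of that rule; the function name is hypothetical and it is not the validate_identifier() code from the patch.

```python
def is_reserved_identifier(name):
    """Return True if `name` falls under the GLSL reservation rule
    described above: it begins with "gl_" or contains "__"."""
    return name.startswith("gl_") or "__" in name
```

Under this rule `gl_Foo` and `my__var` are reserved, while `color` and `_gl_x` are not, since the "gl_" test applies only to the start of the name.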
Paul Berry	6a157f2e33	glsl: Account for location field when comparing interface blocks. In commit e2660770731b018411fbe1620cacddaf8dff5287 (glsl: Keep track of location for interface block fields), I neglected to update glsl_type::record_key_compare to account for the fact that interface types now contain location information. As a result, interface types that differ only by their location information would not be properly distinguished. At the moment this is not a problem, because the only interface block in which location information != -1 is gl_PerVertex, and gl_PerVertex is always created in the same way. However, in the patches that follow, we'll be adding new ways to create gl_PerVertex (by redeclaring it), so we'll need location information to be handled properly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:27:03 -07:00
Paul Berry	5a234d92af	glsl: Construct gl_PerVertex interfaces for GS and VS outputs. Although these interfaces can't be accessed directly by GLSL (since they don't have an instance name), they will be necessary in order to allow redeclarations of gl_PerVertex. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:27:00 -07:00
Paul Berry	fb41f2c531	glsl: Refactor code for creating gl_PerVertex interface block. Currently, we create just a single gl_PerVertex interface block for geometry shader inputs. In later patches, we'll also need to create an interface block for geometry and vertex shader outputs. Moving the code into its own class will make reuse easier. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:26:58 -07:00
Paul Berry	d2e66b953e	glsl: Fix block name of built-in gl_PerVertex interface block. Previously, we erroneously used the name "gl_in" for both the block name and the instance name. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:26:56 -07:00
Paul Berry	192d05f277	glsl: Construct gl_in with a location of -1. We use a location of -1 for variables which don't have their own assigned locations--this includes ir_variables which represent named interface blocks. Technically the location assigned to gl_in doesn't matter, since gl_in is only accessed via its members (which have their own locations). But it's nice to be consistent. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-10 14:26:53 -07:00
Christian König	8bc7673ef8	radeon/winsys: fix handling in radeon_drm_cs_flush v2 Calling radeon_drm_cs_flush from multiple threads might cause deadlocks, fix this by immediately signaling the semaphore after waiting for it. This is a candidate for the stable branch(es). Partially fixes: https://bugs.freedesktop.org/show_bug.cgi?id=70123 v2: some fixes on commit message Signed-off-by: Christian König <christian.koenig@amd.com>	2013-10-10 11:50:38 +02:00
José Fonseca	a922d3413f	util: Fix MinGW build. _GNU_SOURCE appears to not be used reliably. Use _MSC_VER instead so that MSVC alone is affected.	2013-10-09 21:17:53 -07:00
José Fonseca	1aef0ef277	llvmpipe: We don't use the draw pipeline for offset_point/line. Unless the polygon fill mode is different from PIPE_POLYGON_MODE_FILL, so checking the polygon mode is sufficient. Testing done: no regression in polygon-mode-offset Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-10-09 21:09:07 -07:00
Roland Scheidegger	9b3dbaf396	gallivm: kill old per-quad face selection code Not used since ages, and it wouldn't work at all with explicit derivatives now (not that it did before as it ignored them but now the code would just use the derivs pre-projected which would be quite random numbers). v2: also get rid of 3 helper functions no longer used. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-10-10 04:32:57 +02:00
Roland Scheidegger	47d0613eb7	gallivm: handle explicit derivatives for cubemaps They need some special handling. Quite complicated. Additionally, use the same code for implicit derivatives too if no_rho_approx and no_quad_lod is set, because it seems while generally it should be ok to use per quad lod for implicit derivatives there's at least some test which insists that in case of cubemaps the shared lod value MUST come from a pixel inside the primitive (due to the derivatives becoming different if a different larger major axis is chosen). v2: based on Brian's feedback, clean up code a bit. And use sign bit of major axis instead of pre-select s/t/r sign for coord mirroring (which should be the same in the end, saves 2 ands). Also fix two bugs with select/mirror of derivatives, the minor axes need to use major axis sign as well (instead of major derivative axis sign), and don't mistakenly use absolute values of major derivative and inverse major values. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-10-10 04:32:57 +02:00
Roland Scheidegger	ce1d8634aa	gallivm: ignore rho approximation for cube maps There's two reasons for this: 1) even when ignoring rho approximation for cube maps, the result is still not correct, but it's better as the max error at edges is now sqrt(2) instead of 2 (which was a full mip level), same as it is for ordinary 2d maps when doing rho approximations (so the error actually goes from factor 2 at edges and sqrt(2) completely inside a face to sqrt(2) at edges and 0 inside a face). 2) I want to repurpose rho_no_approx for cubemaps for fully correct cubemap derivatives (so don't need yet another debug var). Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-10 04:32:57 +02:00
Paul Berry	15e05b999b	glsl: Modify array_sizing_visitor to handle unnamed interface blocks. We were already setting the array size of unsized arrays that appeared inside unnamed interface blocks, but we weren't updating ir_variable::interface_type to reflect the new array size, causing bogus link errors. This patch causes array_sizing_visitor to keep track of all the unnamed interface types it sees, and the ir_variables corresponding to each one. After the visitor runs, a new function, fixup_unnamed_interface_types(), adjusts each unnamed interface type to correctly correspond with the array sizes in the ir_variables. Fixes piglit tests: - spec/glsl-1.50/execution/unsized-in-unnamed-interface-block-gs - spec/glsl-1.50/execution/unsized-in-unnamed-interface-block-multiple Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-09 16:49:48 -07:00
Paul Berry	45e46b2e37	glsl: Update call_link_visitor to update max_ifc_array_access. When multiple shaders of the same type access an interface block containing an unsized array, we need to set the array size based on the maximum array element accessed across all the shaders. This is similar to what we already do with unsized arrays occurring outside of interface blocks. Note: one corner case is not yet addressed by these patches: the case where one compilation unit defines an interface block containing unsized arrays and another compilation unit defines the same interface block containing sized arrays. Fixes piglit test: - spec/glsl-1.50/execution/unsized-in-named-interface-block-multiple Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-09 16:49:46 -07:00
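The sizing rule described above, taking the maximum element accessed across all shaders of the same type, can be sketched as follows. This is an illustrative Python sketch, not linker code; the function name is hypothetical, and it assumes each shader reports the highest index it accesses and that at least one shader indexes the array.

```python
def implied_array_size(max_access_per_shader):
    """Given the highest array index accessed in each shader of one stage,
    return the size the unsized array must get so every access is in range.

    Assumption of this sketch: the list is non-empty and holds
    non-negative indices.
    """
    if not max_access_per_shader:
        raise ValueError("need at least one shader's max access")
    # The largest index used anywhere determines the size: index n needs n + 1 elements.
    return max(max_access_per_shader) + 1
```

If three vertex shaders access elements up to 2, 5 and 0 respectively, the array is sized to 6 elements.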
Paul Berry	e226669eea	glsl/linker: Modify array_sizing_visitor to handle named interface blocks. Unsized arrays appearing inside named interface blocks now get a proper size assigned by the array_sizing_visitor. Fixes piglit tests: - spec/glsl-1.50/execution/unsized-in-named-interface-block - spec/glsl-1.50/execution/unsized-in-named-interface-block-gs - spec/glsl-1.50/linker/unsized-in-named-interface-block - spec/glsl-1.50/linker/unsized-in-named-interface-block-gs - spec/glsl-1.50/linker/unsized-in-unnamed-interface-block-gs (*) (*) is fixed by dumb luck--support for unsized arrays in unnamed interface blocks will come in a later patch. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-09 16:49:41 -07:00
Paul Berry	f878d2060c	glsl: Update ir_variable::max_ifc_array_access properly. This patch modifies update_max_array_access() so that it updates ir_variable::max_ifc_array_access to reflect the shader's use of arrays appearing within interface blocks. v2: Use an ordinary function in ast_array_index.cpp rather than a virtual function in ir_rvalue. Avoid dereferencing NULL when handling accesses to ordinary structs. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-09 16:49:38 -07:00
Paul Berry	ca8a5ce919	glsl: Sanity check max_ifc_array_access in ir_validate::visit(ir_variable *). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-09 16:49:36 -07:00
Paul Berry	3f4292a6e3	glsl: Add an ir_variable::max_ifc_array_access field. For interface blocks that contain arrays, this field will contain the maximum element of each contained array that is accessed by the shader. This is a first step toward supporting unsized arrays in interface blocks. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-09 16:49:31 -07:00
Paul Berry	22d3ef2df1	glsl: Make accessor functions for ir_variable::interface_type. In a future patch, this will allow us to enforce invariants when the interface type is updated. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-09 16:49:26 -07:00
Paul Berry	6f19e552af	glsl: Move update of max_array_access into a separate function. Currently, when converting an access to an array element from ast to IR, we need to see if the array is an ir_dereference_variable, and if so update the variable's max_array_access. When we add support for unsized arrays in interface blocks, we'll also need to account for cases where the array is an ir_dereference_record and the record is an interface block. To make this easier, move the update into its own function. v2: Use an ordinary function in ast_array_index.cpp rather than a virtual function in ir_rvalue. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-09 16:49:23 -07:00
Paul Berry	2f2f39c389	glsl: Add parser support for unsized arrays in interface blocks. Although it's not explicitly stated in the GLSL 1.50 spec, unsized arrays are allowed in interface blocks. Section 1.2.3 (Changes from revision 5 of version 1.5) of the GLSL 1.50 spec says: * Completed full update to grammar section. Tested spec examples against it: ... * add unsized arrays for block members And section 7.1 (Vertex and Geometry Shader Special Variables) includes an unsized array in the built-in gl_PerVertex interface block: out gl_PerVertex { vec4 gl_Position; float gl_PointSize; float gl_ClipDistance[]; }; Furthermore, GLSL 4.30 contains an example of an unsized array occurring inside an interface block. From section 4.3.9 (Interface Blocks): uniform Transform { // API uses "Transform[2]" to refer to instance 2 mat4 ModelViewMatrix; mat4 ModelViewProjectionMatrix; vec4 a[]; // array will get implicitly sized float Deformation; } transforms[4]; This patch adds the parser rule to support unsized arrays inside interface blocks. Later patches in the series will add the appropriate semantics to handle them. Fixes piglit tests: - spec/glsl-1.50/execution/unsized-in-unnamed-interface-block - spec/glsl-1.50/linker/unsized-in-unnamed-interface-block Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-09 16:49:21 -07:00
Paul Berry	8cf35c3d2f	glsl: Rename the fourth argument to get_interface_instance. Interface declarations have two names associated with them: the block name and the instance name. It's the block name that needs to be passed to get_interface_instance(). This patch renames the argument so that there's no confusion. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-09 16:49:16 -07:00
Kenneth Graunke	b330125790	i965/blorp: Allow format conversions for CopyTexSubImage. BLORP performs blits by drawing a rectangle with a shader that samples from the source texture, and writes color data to the destination. The sampler always returns 32-bit RGBA float data, regardless of the source format's component ordering or data type. Likewise, the render target write message takes 32-bit RGBA float data, and converts it appropriately. So the bulk of the work is already taken care of for us. This greatly accelerates a lot of CopyTexSubImage calls, and makes Legends of Aethereus playable on Ivybridge. At the default settings, LOA continually blits between SRGBA8888 (the window format) and RGBA16_FLOAT. Since neither BLORP nor our BLT paths supported this, it fell back to meta, spending 33% of the CPU in floorf() converting between floats and half-floats. v2: Use != instead of ^ (suggested by Ian). Note that only CopyTexSubImage is affected by this patch (caught by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-09 16:36:50 -07:00
Kenneth Graunke	72aade48fe	i965/blorp: Rework sRGB override behavior. The previous code for sRGB overrides assumes that the source and destination formats are equal, other than the color space. This won't be feasible when we add support for format conversions. Here are a few cases, and how the old code handled them: 1. RGB8 -> SRGB8, MSAA ==> SRGB8 -> SRGB8 2. RGB8 -> SRGB8, single ==> RGB8 -> RGB8 3. SRGB8 -> RGB8, MSAA ==> RGB8 -> RGB8 4. SRGB8 -> RGB8, single ==> SRGB8 -> SRGB8 Apparently, preserving the behavior of #1 is important. When doing a multisample to single-sample resolve, blending the samples together in an sRGB correct fashion results in a noticeably higher quality image. It also is necessary to pass Piglit's EXT_framebuffer_multisample accuracy color tests. Paul, Eric, Anuj, and I talked about this, and aren't sure that it matters in the other cases. This patch preserves the behavior of #1, but otherwise reverts to doing everything in linear space, changing the behavior of case #4. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-09 16:36:50 -07:00
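To see why an sRGB-correct multisample resolve gives a different (and visibly better) result than averaging the stored values directly, here is a small numeric sketch in Python. It uses the standard sRGB transfer functions on normalized [0, 1] values; the function names are hypothetical and nothing here reflects how BLORP or the hardware actually performs the resolve.

```python
def srgb_to_linear(c):
    """Decode one sRGB-encoded channel value in [0, 1] to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c):
    """Encode one linear-light channel value in [0, 1] as sRGB."""
    if c <= 0.0031308:
        return c * 12.92
    return 1.055 * c ** (1 / 2.4) - 0.055


def resolve_naive(samples):
    """Average the encoded sample values directly (not sRGB-correct)."""
    return sum(samples) / len(samples)


def resolve_srgb_correct(samples):
    """Decode to linear, average, then re-encode (sRGB-correct)."""
    linear = [srgb_to_linear(s) for s in samples]
    return linear_to_srgb(sum(linear) / len(linear))
```

For a pixel on an edge covered half by black (0.0) and half by white (1.0) samples, `resolve_naive` gives 0.5, whereas `resolve_srgb_correct` gives roughly 0.735, so the naive average makes antialiased edges look too dark.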
Kenneth Graunke	0589eaecde	i965/blorp: Explain why Z24 can't use a sensible format. We could conceivably use BRW_SURFACEFORMAT_R24_UNORM_X8_TYPELESS for Z24 source images, allowing conversions from Z24 to either Z16 or Z32F. Unfortunately, we can't use it for destination images since it isn't supported as a render target. Using different formats for sources or destinations would be painful, so for now, punt. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-09 16:36:50 -07:00
Kenneth Graunke	590d71791a	i965/blorp: Use R32_FLOAT for Z32F surfaces. Currently, all that matters is that we copy the correct number of bits, so any format that has 32-bits of data will work fine. Once BLORP begins handling format conversions, the sampler will need to correctly interpret the data. We don't need a depth format, but we do need the right number of components and data type (FLOAT). For Z32F, this means using R32_FLOAT. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-09 16:36:49 -07:00
Kenneth Graunke	4dc25b7615	i965/blorp: Use R16_UNORM for Z16 surfaces. Currently, all that matters is that we copy the correct number of bits, so any format that has 16-bits of data will work fine. Once BLORP begins handling format conversions, the sampler will need to correctly interpret the data. We don't need a depth format, but we do need the right number of components and data type (UNORM). For Z16, this means using R16_UNORM. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-09 16:36:49 -07:00
Kenneth Graunke	6f7c41dd1d	i965/blorp: Add support for non-render-target formats. Once blorp gains the ability to do format conversions, it's conceivable that the source format may be texturable but not supported as a render target. This would break Paul's code, which assumes that it can use the render_target_format array even for the source format. There are three ways to convert MESA_FORMAT enums to BRW_SURFACEFORMAT enums: 1. brw_format_for_mesa_format() This translates the Mesa format to the most equivalent BRW format. 2. brw->render_target_format[] This is used for renderbuffers, and handles the subset of formats that are renderable. However, it's not always equivalent, since it overrides a few non-renderable formats. For example, it converts B8G8R8X8_UNORM to B8G8R8A8_UNORM so it can be rendered to. 3. translate_tex_format() This is used for textures. It wraps brw_format_for_mesa_format(), but overrides depth textures, and one sRGB case on Gen4. BLORP has a fourth function, which uses brw->render_target_format[] and overrides depth formats (differently than translate_tex_format). This patch makes the BLORP function use brw_format_for_mesa_format() for textures/source data, since not everything will be a render target. It continues using brw->render_target_format[] for render targets, since it needs the format overrides that array provides. We don't use translate_tex_format() since the additional overrides are not useful or simply redundant. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-09 16:36:49 -07:00
Kenneth Graunke	4b2e819e10	i965/blorp: Add an is_render_target parameter to surface_info::set. This allows us to determine whether we're setting up a format for the source (as a texture) or destination (as a render target). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-09 16:36:49 -07:00
José Fonseca	dbc1f3677c	util/u_math: Fix C++ include of u_math.h on MSVC. GNU C++ compiler declares the C99 lrint, etc. when _GNU_SOURCE is defined, but MSVC does not. Trivial.	2013-10-10 00:31:53 +01:00
Zack Rusin	edde6c77bd	llvmpipe: abstract the code to set number of subpixel bits As we're moving towards expanding the number of subpixel bits and the width of the variables used in the computations we need to make this code a bit more centralized. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-09 18:30:31 -04:00
Zack Rusin	87fe4a33d3	llvmpipe: implement 64 bit mul opcodes in llvmpipe Both the imul_hi and umul_hi are working with this patch. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-09 18:30:27 -04:00
Zack Rusin	6905698fc2	gallium: Add support for 32x32 muls with 64 bit results The code introduces two new 32bit integer multiplication opcodes which can be used to produce correct 64 bit results. GLSL, OpenCL and D3D10+ require them. We use two separate opcodes, because they match the behavior of GLSL and OpenCL, are a lot easier to add than a single opcode with multiple destinations and because there's not much (any) difference wrt code-generation. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-09 18:30:20 -04:00
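As an illustration of what a "high half" multiply computes, the Python sketch below models a signed and an unsigned 32x32 multiply that return the upper 32 bits of the 64-bit product. The helper names mirror the imul_hi/umul_hi opcodes mentioned in this series but are otherwise hypothetical; this models the arithmetic only, not the gallium or llvmpipe implementation.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Reinterpret the low 32 bits of x as a two's-complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mul_lo(a, b):
    """Low 32 bits of the product (identical for signed and unsigned inputs)."""
    return (a * b) & MASK32


def umul_hi(a, b):
    """High 32 bits of the unsigned 32x32 -> 64 bit product."""
    return ((a & MASK32) * (b & MASK32)) >> 32


def imul_hi(a, b):
    """High 32 bits of the signed 32x32 -> 64 bit product,
    returned as an unsigned 32-bit pattern."""
    product = to_signed32(a) * to_signed32(b)
    # Python's >> on negative ints is an arithmetic shift, which is what we want.
    return (product >> 32) & MASK32
```

Combining `umul_hi(a, b) << 32` with `mul_lo(a, b)` reconstructs the full 64-bit unsigned product, which is how a pair of 32-bit opcodes can stand in for a single 64-bit multiply.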
Zack Rusin	c01c6a95b4	gallivm: support printing of 64 bit integers only 8 and 32 bit integers were supported before. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-10-09 18:29:05 -04:00
Eric Anholt	58bab95c95	i965/blorp: Fix the register types on blorp's push constants. The UD values were getting set up as floats. This happened to work out because they were used as the second argument where the first was a dword, and gen6+ doesn't do source conversions. But it did trigger fulsim warnings, and it meant if you used the push constant as the first operand you would have been disappointed. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-09 11:43:46 -07:00
Eric Anholt	8da15d7544	i965: Fix 3D texture layout by more literally copying from the spec. Fixes 3 texelFetch tests in piglit all.tests on ivb, and cubemap npot on gm45. v2: Don't forget the gen4 DL=6 cubemap behavior. Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v1)	2013-10-09 11:28:19 -07:00
Eric Anholt	bfe6e5dda5	mesa: Fix compiler warnings when ALIGN's alignment is "1 << value". We hadn't run into order of operation warnings before, apparently, since addition is so low on the order. Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-09 11:28:19 -07:00
Eric Anholt	791550aa8e	i965: Don't forget the cube map padding on gen5+. We had a fixup for gen4's 3d-layout cubemaps (which, iirc, we'd experimentally found to be necessary!), but while the spec still requires it on gen5, we'd been missing it in the array-layout cubemaps. Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-09 11:28:19 -07:00
Gaetan Nadon	e6fb744141	egl/main: remove undefined X11_LIBS automake variable The EGL library has some references to x11 but it gets the link flags from the XCB_DRI2_LIBS if and only if HAVE_EGL_PLATFORM_X11 is true. The X11_LIBS variable was probably coming from a PKG_CHECK_MODULES (x11) earlier in history. If it is possible to have HAVE_EGL_DRIVER_GLX without HAVE_EGL_PLATFORM_X11 then the link flags for libX11 should be passed. However, it won't come from X11_LIBS which is undefined. Reported-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Gaetan Nadon <memsize@videotron.ca>	2013-10-09 10:36:01 -04:00
Gaetan Nadon	bc93c3798a	gallium/state_trackers/glx: X11/Xlib.h: No such file or directory The compiler cannot find the Xlib.h in the installed system headers. All supplied include directives point to inside the mesa module. The X11_CFLAGS variable is undefined (not defined in config.status). It appears the intent was to use X11_INCLUDES defined in configure.ac. The Xlib.h file is not installed on my workstation. It is supplied in the libx11-dev package. This allows an X developer control over which version of this file is used for X development. Use to test: --enable-gallium-egl --enable-xlib-glx --disable-dri Acked-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gaetan Nadon <memsize@videotron.ca>	2013-10-09 10:28:12 -04:00
Gaetan Nadon	54b028ba89	gallium/targets/libgl-xlib: X11/Xlib.h: No such file or directory The compiler cannot find the Xlib.h in the installed system headers. All supplied include directives point to inside the mesa module. The X11_CFLAGS variable is undefined (not defined in config.status). It appears the intent was to use X11_INCLUDES defined in configure.ac. The Xlib.h file is not installed on my workstation. It is supplied in the libx11-dev package. This allows an X developer control over which version of this file is used for X development. Acked-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gaetan Nadon <memsize@videotron.ca>	2013-10-09 10:24:35 -04:00
Gaetan Nadon	d901d7e08e	gallium/state_trackers/egl: use X11_INCLUDES rather than X11_CFLAGS The X11_CFLAGS variable is undefined (not defined in config.status). It appears the intent was to use X11_INCLUDES defined in configure.ac. It is used for building the code in the x11 subdir. The build does not fail on this one as LIBDRM_CFLAGS happens to have the includedir value as the one for X11. It will not always be the case. The option --enable-gallium-egl is required during configuration. Acked-by: Brian Paul <brianp@vmware.com> Signed-off-by: Gaetan Nadon <memsize@videotron.ca>	2013-10-09 10:23:00 -04:00
Grigori Goronzy	bd19e25703	st/vdpau: really block until surface is idle pipe_screen::fence_finish with zero timeout returns quickly and doesn't wait at all. Fix that, and also delete the fence afterwards, so that QuerySurfaceStatus returns the right state later. Addresses: https://trac.videolan.org/vlc/ticket/9281 https://bugs.freedesktop.org/show_bug.cgi?id=68792 Reviewed-by: Christian König <christian.koenig@amd.com>	2013-10-09 13:02:40 +02:00
Grigori Goronzy	48563bd45c	st/vdpau: add new formats to OutputSurface rendering OutputSurfaces have simple YCbCr rendering functionality built in, but so far only 4:2:0 subsampling worked correctly. This fixes 4:2:2 and 4:4:4 formats. Reviewed-by: Christian König <christian.koenig@amd.com>	2013-10-09 13:02:40 +02:00
Grigori Goronzy	1a5bac2149	st/vdpau: fix GenerateCSCMatrix with NULL procamp As per API specification, it is legal to supply a NULL procamp. In this case, a CSC matrix according to the colorspace should be generated, but no further adjustments are made. Addresses: https://trac.videolan.org/vlc/ticket/9281 https://bugs.freedesktop.org/show_bug.cgi?id=68792 Reviewed-by: Christian König <christian.koenig@amd.com>	2013-10-09 13:02:40 +02:00
Grigori Goronzy	5b4e2db12d	radeon/uvd: disable VC-1 simple/main profile It doesn't work (decodes to garbage) with most videos on UVD 3.0. Worse yet, it often results in random memory corruption or GPU hangs. Rumor has it only the newest UVD hardware could do it anyway. Reviewed-by: Christian König <christian.koenig@amd.com>	2013-10-09 13:02:40 +02:00
Grigori Goronzy	5403dd4b68	radeon/uvd: try to fix VC-1 decoding The DPB size calculations seem to be off; there is various random corruption happening, even with advanced profile. Always assuming a minimum number of references appears to fix it, similarly to H.264. This might overallocate the DPB. Also clean up the SPS/PPS field setup so that it matches VC-1 specifications better. With these changes, all advanced profile VC-1 files I could get my hands on work fine. Reviewed-by: Christian König <christian.koenig@amd.com>	2013-10-09 13:02:40 +02:00
Grigori Goronzy	0bb05484bf	radeon/uvd: fix video format reporting UVD can only support NV12 in the case of hardware decoding, but we can still use all other formats for software decoding. Use the UNKNOWN profile to signal that we're not interested in hardware decoding. v2: use profile instead of entrypoint Reviewed-by: Christian König <christian.koenig@amd.com>	2013-10-09 13:02:40 +02:00
Marek Olšák	c207fa6c18	gallium/dri targets: use DRI_DRIVER_LDFLAGS which contains -Wl,-Bsymbolic. If I understand it correctly, it prevents symbols from clashing if multiple drivers are loaded at the same time. Tested-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-09 12:04:38 +02:00
Marek Olšák	6b7c039dc2	radeonsi: fix occlusion queries for CIK Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-10-09 11:44:48 +02:00
Marek Olšák	ec922ef987	radeonsi: draw register fixes for CIK This doesn't fix any known issue. I'm just following the docs. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-10-09 11:44:48 +02:00
Chia-I Wu	a26e17a365	i965: keep SecHalf flag after register coalescing Copy sechalf to the new register, otherwise we would read wrong HW registers. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-09 14:49:11 +08:00
Chia-I Wu	3db52b6e36	i965: allow SIMD8 sampler messages in SIMD16 mode When the instruction to send the sampler message is forced uncompressed or sechalf, send SIMD8 one even in SIMD16 mode. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-09 14:49:11 +08:00
Chia-I Wu	44f0777f17	i965: make BRW_COMPRESSION_2NDHALF valid for brw_SAMPLE SIMD8 sampler messages are allowed in SIMD16 mode, and they could not work without BRW_COMPRESSION_2NDHALF. Later PRMs (gen5 and later) do not explicitly state whether BRW_COMPRESSION_2NDHALF is allowed, but they do have examples using send with SecHalf. It should be safe to assume SecHalf is valid. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-09 14:49:11 +08:00
Vinson Lee	1176a3aac6	i965: Initialize brw_blorp_const_color_program::prog_data. Fixes "Uninitialized scalar field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-08 22:10:46 -07:00
Eric Anholt	8c197d4aae	i965: Fix a compiler warning about conservative depth enums. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-10-08 14:34:35 -07:00
Paul Berry	d14fcd7db7	i965/gs: Fixup gl_PointSize on entry to geometry shaders. gl_PointSize is stored in the w component of VARYING_SLOT_PSIZ, but the geometry shader infrastructure assumes that it should look for all geometry shader inputs of type float in the x component. So when compiling a geometry shader that uses a gl_PointSize input, fix it up during the shader prolog by moving the w component to the x component. This is similar to how we emit fixups and workarounds for vertex shader attributes. Fixes piglit test spec/glsl-1.50/execution/geometry/core-inputs. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-08 12:44:24 -07:00
Bryan Cain	8f758b0b92	glsl/gs: handle gl_ClipDistance geometry input in lower_clip_distance. This corresponds to the lowering of gl_ClipDistance to gl_ClipDistanceMESA for vertex and geometry shader outputs. Since this lowering pass occurs after lower_named_interface_blocks, it deals with 2D arrays (gl_ClipDistance[vertex][clip_plane]) rather than 1D arrays in an interface block (gl_in[vertex].gl_ClipDistance[clip_plane]). v2 (Paul Berry <stereotype441@gmail.com>): Fix indexing order for gl_ClipDistance input lowering. Properly lower bulk assignment of gl_ClipDistance inputs. Rework for GLSL 1.50 style geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> v3 (Paul Berry <stereotype441@gmail.com>): Add comments and assertions to clarify that the 2D version of clip distance is only used for geometry shader inputs. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-08 12:44:21 -07:00
Paul Berry	c09adcb21b	glsl/gs: add gl_in support to builtin_variables.cpp. Previously, builtin_variables.cpp was written assuming that we supported ARB_geometry_shader4 style geometry shader inputs, meaning that each built-in varying input to a geometry was supplied via an array variable whose name ended in "In", e.g. gl_PositionIn or gl_PointSizeIn. However, in GLSL 1.50 style geometry shaders, things work differently--built-in inputs are supplied to geometry shaders via a built-in interface block called gl_in, which contains all the built-in inputs using their usual names (e.g. the gl_Position input is supplied to the geometry shader as gl_in[i].gl_Position). This patch adds the necessary logic to builtin_variables.cpp to create the gl_in interface block and populate it accordingly for geometry shader inputs. The old ARB_geometry_shader4 style varyings are removed, though they can easily be added back in the future if we decide to support ARB_geometry_shader4. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-08 12:44:19 -07:00
Paul Berry	378ff1dbac	glsl: Keep track of location for interface block fields. This patch adds a "location" element to struct glsl_struct_field, so that we can keep track of the gl_varying_slot associated with each built-in geometry shader input. In lower_named_interface_blocks, we use this value to populate the "location" field in the ir_variable that stores each geometry shader input. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-10-08 12:44:01 -07:00
Adam Jackson	e166a58c43	glx: Generate fewer errors in MakeContextCurrent For a few reasons. 1: In the (current) common case, these conditionals are never true. All we're doing by checking them is slowing down MakeCurrent. The server does these checks already anyway. 2: GLX >= 3.0 contexts may legally be made current without a bound framebuffer. This does not fix piglit/glx-create-context-current-no-framebuffer, but is a prerequisite for fixing it. Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2013-10-08 13:24:20 -04:00
Adam Jackson	d101204c23	glx: Propagate failures from SendMakeCurrentRequest where possible Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2013-10-08 13:24:20 -04:00
Adam Jackson	68412d5006	glx: Hide xGLXMakeCurrentReply inside SendMakeCurrentRequest Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2013-10-08 13:24:20 -04:00
Marek Olšák	15a201c610	st/dri: don't export any private symbols Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-08 16:23:52 +02:00
Marek Olšák	085e5adede	gallium/swrast: don't export any private symbols Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-08 16:23:52 +02:00
Marek Olšák	c787a9767c	gallium/radeon: don't export any private symbols Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-08 16:23:52 +02:00
Marek Olšák	790c8a2405	configure.ac: report an error if LLVM shared libs are disabled and CL is enabled Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-08 16:23:52 +02:00
Marek Olšák	e9c9d28203	st/mesa: improve format selection for GLES Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>	2013-10-08 16:23:04 +02:00
Stéphane Marchesin	20bf508a42	i915g: Rename sampler to fragment_sampler Otherwise it is fairly confusing.	2013-10-07 20:53:55 -07:00
Stéphane Marchesin	8c6594074e	i915g: Fix the sampler bind function The new sampler bind sends us NULL samplers, so we need to count the number of valid samplers ourselves. This fixes ~500 piglit regressions from the sampler rework. While we're at it, let's also support start.	2013-10-07 20:51:53 -07:00
Chad Versace	6cd1da8377	gen7: Use logical, not physical, dims in 3DSTATE_DEPTH_BUFFER (v2) In 3DSTATE_DEPTH_BUFFER, we set Width and Height to the miptree slice's physical dimensions. (Logical and physical dimensions may differ for multisample surfaces). However, in SURFACE_STATE, we always set Width and Height to the slice's logical dimensions. We should do the same for 3DSTATE_DEPTH_BUFFER, because the hw docs say so. No Piglit regressions (-x glx -x glean) on Ivybridge with Wayland. v2: No Piglit regressions, for real this time. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-10-07 11:55:24 -07:00
Chad Versace	ccad802ed5	doxygen: Generate Doxygen for i965 Now, one can do the following to generate and read the i965 Doxygen: cd $MESA_TOP/doxygen make firefox i965/index.html Reviewed-by: Frank Henigman <fjhenigman@google.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-10-07 11:55:16 -07:00
Matt Turner	b645913ff6	i965: Remove the "ARF" register file. The registers in the architecture register file don't share much in common, so there's no point in grouping them together. Use the HW_REG class instead. The vec4 backend already does this. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-07 11:38:52 -07:00
Matt Turner	e7dc88026a	i965: Fixup for don't dead-code eliminate instructions that write to the accumulator. Accidentally pushed an old version of the patch. v2: Set destination register using brw_null_reg(). Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-07 11:38:15 -07:00
Matt Turner	c4e6569fc8	i965: Generate code for ir_binop_imul_high. v2: Make accumulator's type match the type of the operation. Noticed by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-07 10:43:19 -07:00
Matt Turner	85154241d6	i965: Use the multiplication result's type for the accumulator. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-07 10:43:19 -07:00
Matt Turner	6ff8f06308	i965/fs: Disable CSE on instructions writing to HW_REG. CSE would otherwise combine the two mul(8) emitted by [iu]mulExtended: mul(8) acc0 x y mach(8) null x y mov(8) lsb acc0 ... mul(8) acc0 x y mach(8) msb x y Into: mul(8) temp x y mov(8) acc0 temp mach(8) null x y mov(8) lsb acc0 ... mov(8) acc0 temp mach(8) msb x y But mul(8) into the accumulator produces more than 32-bits of precision, which is required and lost if multiplying into a general register and moving to the accumulator. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-07 10:43:19 -07:00
Matt Turner	06e41a02a3	glsl: Implement [iu]mulExtended() built-ins for ARB_gpu_shader5. These built-ins have two "out" parameters, which makes implementing them efficiently with our current compiler infrastructure difficult. Instead, implement them in terms of the existing ir_binop_mul IR (to return the low 32-bits) and a new ir_binop_mul64 which returns the high 32-bits. v2: Rename mul64 -> imul_high as suggested by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-07 10:43:19 -07:00
Matt Turner	69909c866b	i965: Add Gen assertion checks for newer instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-07 10:43:19 -07:00
Matt Turner	92dc16c3e2	i965: Don't dead-code eliminate instructions that write to the accumulator. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-07 10:41:17 -07:00
Matt Turner	014cce3dc4	i965: Generate code for ir_binop_carry and ir_binop_borrow. Using the ADDC and SUBB instructions on Gen7. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-07 10:41:17 -07:00
Matt Turner	4ec37317c5	i965: Add UD null register helpers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-07 10:41:16 -07:00
Matt Turner	6f9428eb68	glsl: Implement usubBorrow() built-in for ARB_gpu_shader5. i965 implements this with a single (multiple destination) instruction, SUBB. Emitting SUBB directly from usubBorrow() would be ideal, but our optimization passes don't know how to cope with expressions with side-effects. Radeon has an SUBB_UINT instruction that only generates the borrow bit. I've chosen to go this route and implement usubBorrow() by doing the subtraction and the borrow operations separately. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-07 10:41:16 -07:00
Matt Turner	6c125973f3	glsl: Implement uaddCarry() built-in for ARB_gpu_shader5. i965 implements this with a single (multiple destination) instruction, ADDC. Emitting ADDC directly from uaddCarry() would be ideal, but our optimization passes don't know how to cope with expressions with side-effects. Radeon has an ADDC_UINT instruction that only generates the carry bit. I've chosen to go this route and implement uaddCarry() by doing the addition and the carry operations separately. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-07 10:41:16 -07:00
Matt Turner	499d7a7f6e	glsl: Add ir_binop_carry and ir_binop_borrow. Calculates the carry out of the addition of two values and the borrow from subtraction respectively. Will be used in uaddCarry() and usubBorrow() built-in implementations. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-07 10:41:16 -07:00
Ian Romanick	ae514416b2	glsl_compiler: Enable any extension that any Mesa driver enables The only GLSL extension that is not enabled is AMD_vertex_shader_layer. I think the standalone-compiler could enable this (as shading language support is complete), but no driver enables it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-07 09:59:23 -07:00
Ian Romanick	136568ea18	glsl_compiler: Sort extensions by name Makes it a little easier to see which ones are missing. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-07 09:59:23 -07:00
Ian Romanick	587cd971c8	glsl_compiler: Always log the compiler diagnostics Not just when there's an error. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-07 09:59:23 -07:00
Ian Romanick	3646d65f6a	glsl_compiler: Set max GLSL version on the command line Infer whether or not to use ES based on the GLSL version (100 or 300 are for ES). This replaces the --glsl-es command line option. Set various compiler limits based on the minimums required for the specified GLSL version. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-07 09:59:23 -07:00
Ian Romanick	257db619c6	glsl_compiler: Use no_argument instead of 0 in getopt_long options The choices aren't just 0 and 1, so using the enum names is much more clear. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-07 09:59:23 -07:00
Ian Romanick	75e9bd13c4	glsl_compiler: Re-enable building glsl_compiler This allows application developers to use Mesa's compiler as a standalone validator for their shaders. This is mostly a revert of commit `569f0e4`. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-07 09:59:23 -07:00
Ian Romanick	5d6b0e7f1b	glsl: Remove glsl_parser_state MaxVaryingFloats field Pull the data directly from the context like the other varying related limits. The parser state shadow copies were added back when the parser state didn't have a pointer to the context. There's no reason to do it nowadays. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-07 09:59:23 -07:00
Ian Romanick	7db50171be	glsl: Set gl_MaxVertexOutputs from VertexProgram.MaxOutputComponents etc gl_MaxVertexOutputVectors => ctx->Const.VertexProgram.MaxOutputComponents gl_MaxFragmentInputVectors => ctx->Const.FragmentProgram.MaxInputComponents v2: Add types so that the code compiles. Pointed out by Brian. v3: Leave gl_MaxVaryingFloats et al. as-is. Suggested by Paul. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> [v2] Reviewed-by: Marek Olšák <marek.olsak@amd.com> [v2] Reviewed-by: Paul Berry <stereotype441@gmail.com> [v2]	2013-10-07 09:59:23 -07:00
Ian Romanick	42305fb502	glsl: Count shader inputs and outputs separately Starting with OpenGL 3.2 input limits and output limits for stages may not match. This means they need to be accounted separately. No piglit regressions. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-10-07 09:59:23 -07:00
Emilio Pozuelo Monfort	d4b5bc62af	glapi: add output info to GetProgramiv's params Signed-off-by: Emilio Pozuelo Monfort <emilio.pozuelo@collabora.co.uk> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-07 09:06:33 -07:00
Laurent Carlier	72465fcf57	clover: fix building with llvm-3.4 since rev191922 http://llvm.org/viewvc/llvm-project?view=revision&revision=191922	2013-10-07 08:41:02 -07:00
Brian Paul	e58dd465f0	st/mesa: silence warning about unhandled ir_query_levels in switch	2013-10-07 09:08:16 -06:00
Christian König	289d928c8e	radeon/vdpau: only export necessary symbols Export only the absolutely necessary symbols in radeon vdpau targets. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-10-07 11:16:53 +02:00
Christian König	731f5471fb	radeon/uvd: optimize message handling a bit No need to keep a copy of the message in system memory anymore, since it should now be in GART memory on newer chips. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-10-07 11:16:53 +02:00
Kenneth Graunke	cfbfb50cb8	docs: Mark a few more things as "in progress" in GL3.txt.	2013-10-06 13:58:53 -07:00
Ilia Mirkin	7178d6ac59	dri/nouveau: add AllocTextureImageBuffer implementation This fixes issues where get_rt_format would see a 0 format because the nouveau_surface had not been properly initialized. Fixes crash on supertuxkart startup (which still fails due to out-of-vram issues). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Francisco Jerez <currojerez@riseup.net>	2013-10-06 12:59:18 -07:00
Francisco Jerez	b3c04362b4	glsl: Fix usage of the wrong union member in program_resource_visitor::recursion. In the array-of-struct case, recursion() takes the row_major flag for each iteration from 't->fields.structure[i]', but 't' is not a record type. Inherit the array declaration row_major flag instead. This mistake was found by running piglit on valgrind. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69449 Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-06 12:55:14 -07:00
Marek Olšák	373f8670d1	Revert "r600g: only flush the caches that need to be flushed during CP DMA operations" This reverts commit `7948ed1250`. It caused graphical corruption. I've got no idea why. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70042 https://bugs.freedesktop.org/show_bug.cgi?id=68451 Conflicts: src/gallium/drivers/r600/evergreen_hw_context.c src/gallium/drivers/r600/r600_hw_context.c src/gallium/drivers/r600/r600_pipe.h	2013-10-06 03:13:48 +02:00
Chris Forbes	2656c6118b	i965/ivb: Flag RG32F quirk for texture gather regardless of swizzles As of ARB_gpu_shader5, textureGather doesn't always read the post-swizzle RED channel -- so we can't just look at the red swizzle state. Theoretically we could only flag the quirk if some green swizzle is in use, but that's probably more trouble than it's worth. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-06 11:25:14 +13:00
Chris Forbes	e8ec2e0344	i965/vs: Add support for textureGather(.., comp) - For HSW: Select the channel based on the component selected (swizzle is done in HW) - For IVB: Select the channel based on the swizzle state for the component selected. Only apply the RG32F w/a if we actually want green -- we're about to flag it regardless of swizzle state. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-06 11:25:11 +13:00
Chris Forbes	09c6fd450d	i965/fs: Add support for textureGather(.., comp) - For HSW: Select the channel based on the component selected (swizzle is done in HW) - For IVB: Select the channel based on the swizzle state for the component selected. Only apply the RG32F w/a if we actually want green -- we're about to flag it regardless of swizzle state. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-06 11:25:03 +13:00
Chris Forbes	7335bc7526	glsl: add ARB_gpu_shader5's additional textureGather signatures - gsampler2DRect support - optional `comp` parameter Future patches will add shadow sampler support and textureGatherOffsets(). Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-06 11:13:17 +13:00
Chris Forbes	88ee9bc9d1	glsl: Add support for specifying the component in textureGather ARB_gpu_shader5 introduces new variants of textureGather* which have an explicit component selector, rather than relying purely on the sampler's swizzle state. This patch adds the GLSL plumbing for the extra parameter. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-06 11:12:29 +13:00
Chris Forbes	f93a63bfcc	docs: mark ARB_conservative_depth done on i965 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2013-10-06 11:05:37 +13:00
Chris Forbes	7ec4668696	i965: Enable ARB_conservative_depth for Gen7+. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-06 11:05:35 +13:00
Chris Forbes	4697955c5b	i965/wm: Program correct conservative depth modes Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-06 11:05:10 +13:00
Brian Paul	64b1a1d459	docs: rephrase 9.2.1, 9.1.7 news item Both are bug-fix releases, not new development releases.	2013-10-05 14:25:25 -06:00
Brian Paul	21315bfb71	docs: add the MD5 sums for the 9.2.1 and 9.1.7 releases	2013-10-05 14:20:37 -06:00
Timothy Arceri	c70e2471dc	docs: Mark off KHR_debug, update relnotes Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-05 11:41:05 -07:00
Chris Forbes	84e1a396ec	i965/vs: add missing break between ir_query_levels and ir_tg4 cases Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2013-10-05 23:18:45 +13:00
Chris Forbes	2beb60c4e7	docs: Mark off ARB_texture_query_levels, update relnotes Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2013-10-05 19:16:33 +13:00
Chris Forbes	317e172677	i965: enable ARB_texture_query_levels on Gen6+ Theoretically would work on Gen5 as well but requires GLSL 1.30, which is not (yet) enabled by default there. V2: Enable for Gen5 conditionally on GLSL version. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-05 19:16:33 +13:00
Chris Forbes	4be21a07ea	i965/vs: implement ir_query_levels Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-05 19:16:33 +13:00
Chris Forbes	fa6440acdb	i965/fs: implement ir_query_levels Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-05 19:16:33 +13:00
Chris Forbes	7480ae3cb8	i965: ignore all texturing opcodes without a coordinate, for cubemap normalize Previously we special-cased textureSize() but this is the more correct condition. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-05 19:16:33 +13:00
Chris Forbes	7a4754d7d9	glsl: add plumbing for GL_ARB_texture_query_levels Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-05 19:16:32 +13:00
Chris Forbes	6ce4e7672e	mesa: add plumbing for GL_ARB_texture_query_levels Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-05 19:16:32 +13:00
Carl Worth	30e6501820	docs: Add release notes for 9.1.7 release Including a news item.	2013-10-04 21:58:51 -07:00
Carl Worth	058fa59d6b	docs: Add release notes and NEWS item for 9.2.1 release Better late than never, right?	2013-10-04 21:58:51 -07:00
Alexander von Gluck IV	765baec8f7	haiku: Ensure correct libraries are referenced.	2013-10-04 18:20:09 -05:00
Alexander von Gluck IV	a4144af400	haiku: Clean up code, use target-helpers * Thanks for the help xexaxo!	2013-10-04 18:20:09 -05:00
Alexander von Gluck IV	4d15ef5121	haiku: Drop haiku-softpipe.c; fix extern C * It isn't needed any longer as we're moving in the code that called it. * The winsys code is C, so make sure we include the header in the extern C	2013-10-04 18:20:09 -05:00
Alexander von Gluck IV	bc2fb19773	haiku: Correct Haiku softpipe library * Use LoadableModule vs SharedLibrary	2013-10-04 18:20:09 -05:00
Alexander von Gluck IV	8730236d1a	haiku: Add first Haiku renderer (softpipe) * This shared library gets parsed by the system as a system "add-on"	2013-10-04 18:20:09 -05:00
Alexander von Gluck IV	c9f1217e1f	haiku: Build Haiku's libGL from within Mesa * This in essence means that Mesa would be taking control of Haiku's OpenGL kit. * This works by dispatching renderers from the OpenGL add-ons directory	2013-10-04 18:20:09 -05:00
Vinson Lee	1349766612	glsl: Define isnormal for Oracle Solaris Studio. This patch fixes this Oracle Solaris Studio build error. "../../src/glsl/ir_constant_expression.cpp", line 1398: Error: The function "isnormal" must have a prototype. Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-10-04 15:37:33 -07:00
Grigori Goronzy	8419c5c3ce	r600g: texture offsets for non-TXF instructions All texture instructions can use offsets, not just TXF. Offsets into the literals array were wrong, too. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-10-04 22:44:47 +02:00
Marek Olšák	c04b8d1dab	r600g: remove an assertion causing a crash at context cleanup Compute samplers are advertised, but not implemented. I think that's intentional.	2013-10-04 20:01:51 +02:00
Marek Olšák	eda1f2aa12	r300g: remove unused function r300_lacks_vertex_textures	2013-10-04 20:01:48 +02:00
Ian Romanick	0667e2c969	mesa: Don't return any data for GL_SHADER_BINARY_FORMATS We return 0 for GL_NUM_SHADER_BINARY_FORMATS, so GL_SHADER_BINARY_FORMATS should not write any data to the application buffer. Fixes piglit test 'arb_get_program_binary-overrun shader'. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-10-04 10:08:45 -07:00
Brian Paul	a50c5f8d24	svga: fix incorrect memcpy src in svga_buffer_upload_piecewise() As we march over the source buffer we're uploading in pieces, we need to memcpy from the current offset, not the start of the buffer. Fixes graphical corruption when drawing very large vertex buffers. Cc: "9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matthew McClure <mcclurem@vmware.com>	2013-10-04 10:25:37 -06:00
Matthew McClure	d164d50a85	util: when packing depth values, round to nearest. This patch adds the lrint, lrintf, llrint, and llrintf rounding utility functions. When packing unorm depth values, we will round to nearest. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-10-04 10:55:51 +01:00
Tom Stellard	b280516e11	radeonsi/compute: Fix segfault caused by recent refactoring Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-10-03 17:29:54 -07:00
Brian Paul	b181be6266	radeonsi: Fix build Reviewed-by: Tom Stellard <thomas.stellard@amd.com> https://bugs.freedesktop.org/show_bug.cgi?id=70106	2013-10-03 17:29:42 -07:00
Emil Velikov	757ec72b23	configure: set HAVE_COMMON_DRI when building only swrast With commit `cb1febb07`, I have incorrectly removed HAVE_COMMON_DRI assuming that swrast does not need to build the translations for driconf options, as effectively swrast/drisw does not use them. With the incoming unification work of dri and drisw, it makes sense just to revert the offending hunk. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70057 Reported-by: Vinson Lee <vlee@freedesktop.org> Tested-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-10-03 16:52:38 -07:00
Brian Paul	99a471c67b	radeonsi/compute: fix bind_compute_sampler_states() breakage Remove the assignment and the no-op function.	2013-10-03 17:32:40 -06:00
Paul Berry	800610f9eb	i965/fs: Improve accuracy of dFdy() to match dFdx(). Previously, we computed dFdy() using the following instruction: add(8) dst<1>F src<4,4,0>F -src.2<4,4,0>F { align1 1Q } That had the disadvantage that it computed the same value for all 4 pixels of a 2x2 subspan, which meant that it was less accurate than dFdx(). This patch changes it to the following instruction when c->key.high_quality_derivatives is set: add(8) dst<1>F src<4,4,1>.xyxyF -src<4,4,1>.zwzwF { align16 1Q } This gives it comparable accuracy to dFdx(). Unfortunately, align16 instructions can't be compressed, so in SIMD16 shaders, instead of emitting this instruction: add(16) dst<1>F src<4,4,1>.xyxyF -src<4,4,1>.zwzwF { align16 1H } We need to unroll to two instructions: add(8) dst<1>F src<4,4,1>.xyxyF -src<4,4,1>.zwzwF { align16 1Q } add(8) (dst+1)<1>F (src+1)<4,4,1>.xyxyF -(src+1)<4,4,1>.zwzwF { align16 2Q } Fixes piglit test spec/glsl-1.10/execution/fs-dfdy-accuracy. Acked-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-10-03 13:49:15 -07:00
Brian Paul	9267565ee4	gallium/tests: fix SHADER typo	2013-10-03 14:24:55 -06:00
Emil Velikov	13895abd86	gallium-egl: use standard variable types over EGLBoolean/EGLint The interface/prototype in native_wayland_bufmgr.h uses boolean/int, as well as the rest of the file. Convert to improve consistency and to prevent gcc compiler warnings due to type mismatch. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-03 14:05:29 -06:00
Brian Paul	379deaf5c6	gallium: remove old bind_*_sampler_states() functions The new bind_sampler_states() function takes a shader argument to specify the shader stage.	2013-10-03 14:05:29 -06:00
Brian Paul	55e81b06e7	gallium/docs: update bind_sampler_states() documentation	2013-10-03 14:05:28 -06:00
Brian Paul	1e2fbf2657	cso: make sure all sampler states are set/cleared	2013-10-03 14:05:28 -06:00
Brian Paul	7d7a9714d2	freedreno: use new bind_sampler_states() function	2013-10-03 14:05:28 -06:00
Brian Paul	88b17a15f3	svga: don't hook in old bind_fragment_sampler_states() functions	2013-10-03 14:05:28 -06:00
Brian Paul	27c054edf0	radeon: don't use old bind_vertex/fragment_sampler_states() hooks	2013-10-03 14:05:28 -06:00
Brian Paul	1e8d3eb08d	i915g: remove old bind_vertex/fragment_sampler_states() hooks	2013-10-03 14:05:28 -06:00
Brian Paul	edd9af675c	noop: remove old bind_*_sampler_states() functions	2013-10-03 14:05:28 -06:00
Brian Paul	f233ee0cd6	galahad: remove old bind_*_sampler_states() functions	2013-10-03 14:05:28 -06:00
Brian Paul	d0520d5bf6	vl: remove old bind_fragment_sampler_states() calls	2013-10-03 14:05:28 -06:00
Brian Paul	3925e521d6	util: remove old bind_fragment_sampler_states() calls from blitter code	2013-10-03 14:05:28 -06:00
Brian Paul	9fa6722a68	draw: remove use of old bind_fragment_sampler_states()	2013-10-03 14:05:28 -06:00
Brian Paul	7478236da9	nouveau: remove old bind_*_sampler_states() functions	2013-10-03 14:05:28 -06:00
Brian Paul	1446600d1a	cso: remove use of old bind_*_sampler_states() functions	2013-10-03 14:05:28 -06:00
Brian Paul	bcf7508a7d	rbug: remove old bind_*_sampler_states() functions	2013-10-03 14:05:28 -06:00
Brian Paul	22480c5b5b	identity: remove old bind_*_sampler_states() functions	2013-10-03 14:05:28 -06:00
Brian Paul	dd4816e3fd	trace: remove old bind_*_sampler_states() functions	2013-10-03 14:05:28 -06:00
Brian Paul	5807105ad7	ilo: don't hook up old bind_*_sampler_states() functions	2013-10-03 14:05:28 -06:00
Brian Paul	2d0effaa10	llvmpipe: remove old bind_*_sampler_states() functions	2013-10-03 14:05:27 -06:00
Brian Paul	6e640545ac	softpipe: remove old bind_*_sampler_states() functions	2013-10-03 14:05:27 -06:00
Brian Paul	93e6694f2c	clover: remove bind_compute_sampler_states() calls	2013-10-03 14:05:27 -06:00
Brian Paul	a5350a9f3e	gallium/tests: use pipe_context::bind_sampler_states()	2013-10-03 14:05:27 -06:00
Brian Paul	bc367ab54d	gallium/tools: update dump_state.py to use bind_sampler_states()	2013-10-03 14:05:27 -06:00
Brian Paul	3f0627c2ad	nouveau: implement pipe_context::bind_sampler_states()	2013-10-03 14:05:27 -06:00
Brian Paul	550f9ee64c	softpipe: implement pipe_context::bind_sampler_states()	2013-10-03 14:05:26 -06:00
Brian Paul	8280b29d7c	radeon: implement pipe_context::bind_sampler_states()	2013-10-03 14:05:26 -06:00
Brian Paul	0de99d52b7	svga: implement pipe_context::bind_sampler_states()	2013-10-03 14:05:26 -06:00
Brian Paul	6ef9fc791e	trace: implement pipe_context::bind_sampler_states()	2013-10-03 14:05:26 -06:00
Brian Paul	e64112b1f9	rbug: implement pipe_context::bind_sampler_states()	2013-10-03 14:05:26 -06:00
Brian Paul	bd1514849b	noop: implement pipe_context::bind_sampler_states()	2013-10-03 14:05:26 -06:00
Brian Paul	c772338488	llvmpipe: implement pipe_context::bind_sampler_states()	2013-10-03 14:05:26 -06:00
Brian Paul	41a9be70e4	ilo: implement pipe_context::bind_sampler_states()	2013-10-03 14:05:26 -06:00
Brian Paul	9564ec8317	identity: implement pipe_context::bind_sampler_states()	2013-10-03 14:05:26 -06:00
Brian Paul	aec11d48cf	i915g: implement pipe_context::bind_sampler_states()	2013-10-03 14:05:26 -06:00
Brian Paul	e5d000c3f1	galahad: implement pipe_context::bind_sampler_states()	2013-10-03 14:05:26 -06:00
Brian Paul	4bdf7d3842	clover: use pipe_context::bind_sampler_states() if non-null	2013-10-03 14:05:26 -06:00
Brian Paul	96b9c09495	vl: use pipe_context::bind_sampler_states() if non-null	2013-10-03 14:05:26 -06:00
Brian Paul	bbc1fd8c80	util: use pipe_context::bind_sampler_states() if non-null	2013-10-03 14:05:26 -06:00
Brian Paul	27d500a844	draw: use pipe_context::bind_sampler_states() if non-null	2013-10-03 14:05:26 -06:00
Brian Paul	5cba8725a4	cso: use pipe_context::bind_sampler_states() if non-null	2013-10-03 14:05:26 -06:00
Brian Paul	755d788fe2	gallium: add pipe_context::bind_sampler_states() The bind_vertex/geometry/fragment/compute_sampler_states() functions will be replaced by a single function.	2013-10-03 14:05:26 -06:00
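The consolidation described in this commit can be illustrated with a minimal sketch. The `ShaderStage` enum, `Context` struct, and `MAX_SAMPLERS` limit below are hypothetical stand-ins, not Mesa's actual definitions; the point is only that the shader stage becomes an argument instead of being encoded in the function name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stage enum and context; not Mesa's actual definitions.
enum ShaderStage { STAGE_VERTEX, STAGE_GEOMETRY, STAGE_FRAGMENT, STAGE_COMPUTE, STAGE_COUNT };

constexpr std::size_t MAX_SAMPLERS = 16;  // assumed limit, for illustration only

struct Context {
    // One sampler table per stage, indexed by the stage argument.
    std::array<std::array<void *, MAX_SAMPLERS>, STAGE_COUNT> samplers{};
};

// Single entry point: the stage is a parameter rather than part of the name.
// Binds `count` samplers starting at slot `start`; a null `states` pointer
// clears those slots instead.
inline void bind_sampler_states(Context &ctx, ShaderStage stage,
                                std::size_t start, std::size_t count,
                                void *const *states)
{
    assert(stage < STAGE_COUNT);
    assert(start + count <= MAX_SAMPLERS);
    for (std::size_t i = 0; i < count; ++i)
        ctx.samplers[stage][start + i] = states ? states[i] : nullptr;
}
```

With one function per context, wrapper layers such as tracing or debugging drivers only need to forward a single hook rather than one per stage.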
Brian Paul	9b99451da2	r300g: rename r300_bind_sampler_states to r300_bind_fragment_sampler_states	2013-10-03 14:05:26 -06:00
Brian Paul	c368479e38	draw: rename bind_sampler_states variables Put 'fragment' in the names. In preparation for upcoming function renaming.	2013-10-03 14:05:25 -06:00
Marek Olšák	c7d91a6f13	r600g: fix initialization of non_disp_tiling flag This fixes a regression caused by `e64633e8c3`	2013-10-03 18:30:49 +02:00
Marek Olšák	b893bbf438	r600g,radeonsi: create aux_context last This fixes a regression caused by `68f6dec32e`.	2013-10-03 18:30:49 +02:00
Marek Olšák	52bfe8e0f6	r300g/swtcl: don't call draw_prepare_shader_outputs	2013-10-03 18:30:49 +02:00
Brian Paul	bde5b626c2	st/mesa: silence warning about unhandled enum in switch statement	2013-10-03 09:14:03 -06:00
Chris Forbes	d133592619	mesa: fix make check for ARB_texture_gather Clean up inconsistency in enum decoration: - Use the undecorated enums where possible. - MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB remains decorated, since it has no undecorated equivalent in GL4. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70054 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-03 21:38:48 +13:00
Chris Forbes	61519f15ac	docs: Mark off ARB_texture_gather	2013-10-03 07:58:12 +13:00
Chris Forbes	88f196ab6e	i965/hsw: Apply gather4 RG32F w/a using SCS instead of shader. The new surface channel select bits allow us to avoid having to recompile the shader for this workaround. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-03 07:56:40 +13:00
Chris Forbes	7df985ad47	i965: Enable ARB_texture_gather on Gen7 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-03 07:56:37 +13:00
Chris Forbes	dd4c2a516c	i965: use gather slots in the binding table for gather4. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-03 07:56:34 +13:00
Chris Forbes	c08f2083ee	i965: Emit a second set of SURFACE_STATE for gather4 from textures. This allows us to use a different surface format for gather4, which is required for R32G32_FLOAT to work on Gen7. V4: - Only emit alternate surface state for shaders which will actually use it. - Pass a simple 'for_gather' flag rather than a function pointer. The callee can decide what w/a to apply. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-03 07:56:29 +13:00
Chris Forbes	5901d48b41	i965: make room in the binding table for a full alternate set of surface_states Worst-case is that every texunit uses a format that needs overriding. V4: Place the gather slots last, so shaders which don't use gather don't get penalized by having a huge binding table. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-03 07:56:26 +13:00
Chris Forbes	855b2a8f4a	i965: Add BRW_SURFACEFORMAT_R32G32_FLOAT_LD, required for IVB gather4 w/a gather4 GREEN channel against a surface with format R32G32_FLOAT doesn't work correctly on IVB. w/a from bspec: - use R32G32_FLOAT_LD = 0x97 instead, for gather4 only. - select BLUE channel to read GREEN Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-03 07:56:23 +13:00
Chris Forbes	cfa3c8a0d3	i965: w/a for gather4 green RG32F V4: Only flag quirks if there are any uses of gather in the shader, to avoid spurious recompiles just because someone happened to use RG32F. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-03 07:56:20 +13:00
Chris Forbes	36e25ccd29	glsl: flag shaders which use gather4 at all Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-03 07:56:02 +13:00
Chris Forbes	4ed3930f97	i965/vs: Add support for ir_tg4 Pretty much the same as the FS case. Channel select goes in the header. V2: Less mangling. V3: Avoid sampling at all, for degenerate swizzles. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-03 07:55:59 +13:00
Chris Forbes	942a4ec18f	i965/fs: Add support for ir_tg4 Lowers ir_tg4 (from textureGather and textureGatherOffset builtins) to SHADER_OPCODE_TG4. The usual post-sampling swizzle workaround can't work for ir_tg4, so avoid doing that: * For R/G/B/A swizzles use the hardware channel select (lives in the same dword in the header as the texel offset), and then don't do anything afterward in the shader. * For 0/1 swizzles blast the appropriate constant over all the output channels instead of sampling. V2: Avoid duplicating header enabling block V3: Avoid sampling at all, for degenerate swizzles. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-03 07:55:56 +13:00
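The swizzle handling described in this commit can be sketched roughly as follows. The `Swizzle` enum and `GatherPlan` struct are hypothetical names for illustration and do not come from the i965 driver; the sketch only shows the decision between using the hardware channel select and writing a constant without sampling.

```cpp
#include <cassert>

// Hypothetical swizzle values for the gathered component.
enum Swizzle { SWZ_R, SWZ_G, SWZ_B, SWZ_A, SWZ_ZERO, SWZ_ONE };

struct GatherPlan {
    bool  sample;          // false: skip the sampler message entirely
    int   channel_select;  // hardware channel (0..3) when sampling, else -1
    float constant;        // value written to all outputs when not sampling
};

// Decide how to lower a gather for a given swizzle:
//  - R/G/B/A map directly onto the hardware channel select, with no
//    post-sampling fixup in the shader;
//  - 0/1 never touch the sampler, the constant is written to every output.
inline GatherPlan plan_gather(Swizzle swz)
{
    switch (swz) {
    case SWZ_ZERO: return { false, -1, 0.0f };
    case SWZ_ONE:  return { false, -1, 1.0f };
    default:       return { true, static_cast<int>(swz), 0.0f };
    }
}
```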
Chris Forbes	fb455500bf	i965: add SHADER_OPCODE_TG4 Adds the Gen7 message IDs, a new SHADER_OPCODE_TG4 pseudo-op, and low-level support for emitting it via generate_tex(). V3: Updated for changes in master. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-03 07:55:55 +13:00
Maxence Le Dore	18002d9eda	glsl: add texture gather changes V2 [Chris Forbes]: - Add new pattern, fixup parameter reading. V3: Rebase onto new builtins machinery Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-03 07:55:54 +13:00
Maxence Le Dore	d3575622b7	mesa: add texture gather changes Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-03 07:55:51 +13:00
Chris Forbes	0d7fc10bcd	i965: fix bogus swizzle in brw_cubemap_normalize When used with a cube array in VS, failed assertion in ir_validate: Assignment count of LHS write mask channels enabled not matching RHS vector size (3 LHS, 4 RHS). To fix this, swizzle the RHS correctly for the writemask. This showed up in the ARB_texture_gather tests, which exercise cube arrays in the VS. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-03 07:54:53 +13:00
Vincent Lejeune	4e4c32ba11	r600/llvm: Adds support for MSAA	2013-10-02 17:30:21 +02:00
Vincent Lejeune	8edbd7609b	r600g/llvm: Undef z and w component of 2D TXP inst	2013-10-02 17:30:14 +02:00
Vincent Lejeune	9f183eb7de	r600g/llvm: fix txq for texture buffer	2013-10-02 17:30:07 +02:00
Chia-I Wu	848c0e72f3	i965: compute DDX in a subspan based only on top row Consider only the top-left and top-right pixels to approximate DDX in a 2x2 subspan, unless the application requests a more accurate approximation via GL_FRAGMENT_SHADER_DERIVATIVE_HINT or this optimization is disabled from the new driconf option disable_derivative_optimization. This results in a less accurate approximation. However, it improves the performance of Xonotic with Ultra settings by 24.3879% +/- 0.832202% (at 95.0% confidence) on Haswell. No noticeable image quality difference observed. The improvement comes from faster sample_d. It seems, on Haswell, some optimizations are introduced to allow faster sample_d when all pixels in a subspan have the same derivative. I considered SAMPLE_STATE too, which allows one to control the quality of sample_d on Haswell. But it gave much worse image quality without giving better performance compared to this change. No piglit quick.tests regression on Haswell (tested with v1). v2: better guess for precompile program key Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-10-02 15:26:40 +08:00
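The difference between the two DDX approximations can be shown with a small CPU-side sketch. This is not driver code: the `Subspan` layout and function names are assumptions made for illustration, modelling a 2x2 subspan as four floats.

```cpp
#include <array>
#include <cassert>

// A 2x2 subspan of values, laid out as
// { top-left, top-right, bottom-left, bottom-right } (assumed layout).
using Subspan = std::array<float, 4>;

// Per-row derivative: each row uses its own left/right pair.
inline Subspan ddx_per_row(const Subspan &v)
{
    const float top = v[1] - v[0];
    const float bottom = v[3] - v[2];
    return { top, top, bottom, bottom };
}

// Top-row-only approximation: all four pixels share the top row's
// derivative, so every pixel in the subspan sees the same value.
inline Subspan ddx_top_row(const Subspan &v)
{
    const float top = v[1] - v[0];
    return { top, top, top, top };
}
```

The two functions agree on the top row and differ on the bottom row whenever the horizontal slope changes between rows, which is the accuracy being traded away.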
Chris Forbes	72edba1659	i965/blorp: Use passed in framebuffer rather than ctx->DrawBuffer We have the destination framebuffer object passed in; there's no need to go digging around in the context. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-10-02 18:31:24 +13:00
Francisco Jerez	ef8cc3e51f	ralloc: Remove the rzalloc-based new/delete operator definition macro. Using it encourages the (IMHO worrying) practice of leaving member variables uninitialized in constructor definitions. This macro shouldn't be necessary anymore after the last patch series fixing all its users to initialize all member variables from the class constructor. Remove it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-01 17:39:45 -07:00
Francisco Jerez	fcbbecb9bc	st/mesa: Switch glsl_to_tgsi_instruction to the non-zeroing allocator. All member variables of glsl_to_tgsi_instruction are already being initialized from its implicitly defined constructor, it's not necessary to use rzalloc to allocate its memory. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-01 17:30:52 -07:00
Francisco Jerez	03d46344df	mesa/program: Switch ir_to_mesa_instruction to the non-zeroing allocator. All member variables of ir_to_mesa_instruction are already being initialized from its implicitly defined constructor, it's not necessary to use rzalloc to allocate its memory. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-01 17:30:52 -07:00
Francisco Jerez	23e8673afb	i965: Switch vec4_live_variables to the non-zeroing allocator. All member variables of vec4_live_variables are already being initialized from its constructor, it's not necessary to use rzalloc to allocate its memory, and doing so makes it more likely that we will start relying on the allocator to zero out all memory if the class is ever extended with new member variables. That's bad because it ties objects to some specific allocation scheme, and gives unpredictable results when an object is created with a different allocator -- Stack allocation, array allocation, or aggregation inside a different object are some of the useful possibilities that come to my mind. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-01 17:30:52 -07:00
Francisco Jerez	c307d27c5e	i965: Switch fs_live_variables to the non-zeroing allocator. All member variables of fs_live_variables are already being initialized from its constructor, it's not necessary to use rzalloc to allocate its memory, and doing so makes it more likely that we will start relying on the allocator to zero out all memory if the class is ever extended with new member variables. That's bad because it ties objects to some specific allocation scheme, and gives unpredictable results when an object is created with a different allocator -- Stack allocation, array allocation, or aggregation inside a different object are some of the useful possibilities that come to my mind. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-01 17:30:52 -07:00
Francisco Jerez	ced327ec64	i965: Switch fs_inst to the non-zeroing allocator. All member variables of fs_inst are already being initialized from its constructor, it's not necessary to use rzalloc to allocate its memory, and doing so makes it more likely that we will start relying on the allocator to zero out all memory if the class is ever extended with new member variables. That's bad because it ties objects to some specific allocation scheme, and gives unpredictable results when an object is created with a different allocator -- Stack allocation, array allocation, or aggregation inside a different object are some of the useful possibilities that come to my mind. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-01 17:30:51 -07:00
Francisco Jerez	a5d843ebdf	i965: Switch ip_record to the non-zeroing allocator. All member variables of ip_record are already being initialized from its constructor, it's not necessary to use rzalloc to allocate its memory, and doing so makes it more likely that we will start relying on the allocator to zero out all memory if the class is ever extended with new member variables. That's bad because it ties objects to some specific allocation scheme, and gives unpredictable results when an object is created with a different allocator -- Stack allocation, array allocation, or aggregation inside a different object are some of the useful possibilities that come to my mind. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-01 17:30:51 -07:00
Francisco Jerez	ddd694293a	i965: Initialize all member variables of cfg_t on construction. The cfg_t object relies on the memory allocator zeroing out its contents before it's initialized, which is quite an unusual practice in the C++ world because it ties objects to some specific allocation scheme, and gives unpredictable results when an object is created with a different allocator -- Stack allocation, array allocation, or aggregation inside a different object are some of the useful possibilities that come to my mind. Initialize all fields from the constructor and stop using the zeroing allocator. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-01 17:30:51 -07:00
Francisco Jerez	fde23b61a9	i965: Initialize all member variables of bblock_t on construction. The bblock_t object relies on the memory allocator zeroing out its contents before it's initialized, which is quite an unusual practice in the C++ world because it ties objects to some specific allocation scheme, and gives unpredictable results when an object is created with a different allocator -- Stack allocation, array allocation, or aggregation inside a different object are some of the useful possibilities that come to my mind. Initialize all fields from the constructor and stop using the zeroing allocator. v2: Use zero initialization for numeric types instead of default construction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-01 17:30:51 -07:00
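The reasoning in this series of commits can be demonstrated with a small sketch. `Block` is a hypothetical stand-in, not the real bblock_t; it initializes every member in its constructor, so it behaves the same whether it lives on the stack or in memory that was never zeroed.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical stand-in for a basic-block class; not the real bblock_t.
struct Block {
    // Every member is initialized here, so no allocator has to zero memory.
    Block() : start_ip(0), end_ip(0), parent(nullptr) {}
    int start_ip;
    int end_ip;
    Block *parent;
};

// Construct a Block in memory deliberately filled with 0xFF bytes and
// check that the constructor alone leaves it in a well-defined state.
inline bool constructed_clean_in_dirty_memory()
{
    void *mem = std::malloc(sizeof(Block));
    if (!mem)
        return false;
    std::memset(mem, 0xFF, sizeof(Block));
    Block *b = new (mem) Block();  // placement new: no zeroing by an allocator
    const bool ok = b->start_ip == 0 && b->end_ip == 0 && b->parent == nullptr;
    b->~Block();
    std::free(mem);
    return ok;
}
```

A class that instead relied on a zeroing allocator would leave its members holding the 0xFF pattern in this test, which is the unpredictable behaviour the commit messages warn about.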
Francisco Jerez	58d772cb41	glsl: Switch ast_type_qualifier to the non-zeroing allocator. All member variables of ast_type_qualifier are already being initialized from its implicitly defined constructor, it's not necessary to use rzalloc to allocate its memory. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-01 17:30:51 -07:00
Francisco Jerez	8bd1c69f3b	glsl: Switch ast_node to the non-zeroing allocator. All member variables of ast_node are already being initialized from its constructor, but some of its derived classes were leaving members uninitialized -- Fix them. Using rzalloc makes it more likely that we will start relying on the allocator to zero out all memory if the class is ever extended with new member variables. That's bad because it ties objects to some specific allocation scheme, and gives unpredictable results when an object is created with a different allocator -- Stack allocation, array allocation, or aggregation inside a different object are some of the useful possibilities that come to my mind. v2: Use NULL initialization instead of default construction for pointers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-01 17:30:51 -07:00
Francisco Jerez	70953b5fea	i965: Initialize all member variables of vec4_instruction on construction. The vec4_instruction object relies on the memory allocator zeroing out its contents before it's initialized, which is quite an unusual practice in the C++ world because it ties objects to some specific allocation scheme, and gives unpredictable results when an object is created with a different allocator -- Stack allocation, array allocation, or aggregation inside a different object are some of the useful possibilities that come to my mind. Initialize all fields from the constructor and stop using the zeroing allocator. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-01 17:30:51 -07:00
Francisco Jerez	43bf36b080	glsl: Initialize all member variables of _mesa_glsl_parse_state on construction. The _mesa_glsl_parse_state object relies on the memory allocator zeroing out its contents before it's initialized, which is quite an unusual practice in the C++ world because it ties objects to some specific allocation scheme, and gives unpredictable results when an object is created with a different allocator -- Stack allocation, array allocation, or aggregation inside a different object are some of the useful possibilities that come to my mind. Initialize all fields from the constructor and stop using the zeroing allocator. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-01 17:30:51 -07:00
Francisco Jerez	0e72db9f97	mesa: Fix misplaced includes of "main/uniforms.h". Several C++ source files include "main/uniforms.h" from an extern "C" block, which is both unnecessary, because "uniforms.h" already checks for a C++ compiler and sets the right linkage, and incorrect, because the header file includes other C++ headers ("glsl_types.h" and "ir_uniform.h") that are supposed to get C++ linkage. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-10-01 17:30:51 -07:00
Grigori Goronzy	6349b3235c	st/egl: flush resources before presentation Fixes regression on r600g due to fast clear introduced by commit `edbbfac6`. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-10-01 21:42:02 +02:00
Paul Berry	d99b5b2d82	i965/gs: Fix incorrect numbering of DWORDs in 3DSTATE_GS In commit `247f90c77e` (i965/gs: Set control data header size/format appropriately for EndPrimitive()), I incorrectly numbered the DWORDs in the 3DSTATE_GS command starting from 1 instead of starting from 0. This caused the control data format to be programmed into the wrong DWORD, resulting in corruption in some geometry shaders that used an output type of points. This patch numbers the DWORDs starting from 0, as we do for all other commands, which causes the control data format to be programmed into the correct DWORD. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-10-01 11:06:17 -07:00
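The class of off-by-one described here can be sketched with a toy command packet. The 4-DWORD packet and the field position `FORMAT_DWORD` are invented for illustration and do not reflect the real 3DSTATE_GS layout.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical 4-DWORD command; the field position is illustrative only.
constexpr int FORMAT_DWORD = 2;  // zero-based index, as a spec would number it

using Packet = std::array<std::uint32_t, 4>;

// Correct: DWORDs are counted from 0, matching the (imaginary) spec.
inline Packet emit_zero_based(std::uint32_t format)
{
    Packet p{};
    p[FORMAT_DWORD] = format;
    return p;
}

// Buggy: if the emitted DWORDs are counted from 1, the one labelled "2"
// is really index 1, so the field lands one DWORD too early.
inline Packet emit_one_based(std::uint32_t format)
{
    Packet p{};
    p[FORMAT_DWORD - 1] = format;
    return p;
}
```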
Brian Paul	6659131be3	mesa: check for bufSize > 0 in _mesa_GetSynciv() The spec doesn't say GL_INVALID_VALUE should be raised for bufSize <= 0. In any case, memcpy(len < 0) will lead to a crash, so don't allow it. CC: "9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-01 10:10:01 -06:00
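A guard of the kind this commit describes might look like the sketch below. `copy_values` is a hypothetical helper, not the Mesa function; what the real code does when the check fails is not stated in the commit message, so this sketch simply skips the copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy at most bufSize ints from src to dst.
// Returns the number of ints copied. A non-positive bufSize never reaches
// memcpy, where a negative length converted to size_t would become huge.
inline int copy_values(int *dst, int bufSize, const int *src, int count)
{
    if (bufSize <= 0 || count <= 0 || !dst || !src)
        return 0;
    const int n = count < bufSize ? count : bufSize;
    std::memcpy(dst, src, static_cast<std::size_t>(n) * sizeof(int));
    return n;
}
```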
Brian Paul	755602df12	mesa: minor fix-ups for _mesa_validate_sync() Return bool instead of int. Const-qualify the syncObj. Add some comments. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-10-01 10:10:01 -06:00
Brian Paul	79a03068cd	mesa: add missing error checks in _mesa_GetObject[Ptr]Label() Error checking bufSize isn't mentioned in the spec, but it is in the man pages. However, I believe the man page is incorrect. Typically, GL functions that take GLsizei parameters check that they're positive or non-negative. Negative values don't make sense here. A spec bug has been filed with Khronos/ARB. v2: check for negative values, not <= 0.	2013-10-01 10:10:01 -06:00
Brian Paul	69daf335a0	mesa: use caller string in error message in get_label_pointer() Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2013-10-01 10:10:00 -06:00
Brian Paul	ecd155a428	mesa: asst. clean-ups in copy_label() This incorporates Vinson's change to check for a null src pointer as detected by coverity. Also, rename the function params to be src/dst, const-qualify src, and use GL types to match the calling functions. And add some more comments. Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2013-10-01 10:10:00 -06:00
Alex Deucher	d2eb281fb2	st/xorg: Include u_surface.h for u_copy_rect Fixes build errors. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-10-01 11:49:08 -04:00
Emil Velikov	9c446afb18	winsys/freedreno/drm: drop obsolete .gitignore Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:52 -07:00
Emil Velikov	16661a9d84	winsys/freedreno/drm: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:52 -07:00
Emil Velikov	5d7690991a	winsys/nouveau/drm: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:52 -07:00
Emil Velikov	0d36f5c3be	winsys/i915/sw: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:52 -07:00
Emil Velikov	56dfbbd24a	st/xvmc: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2013-10-01 07:29:52 -07:00
Emil Velikov	10bd3a3f71	st/xorg: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:52 -07:00
Emil Velikov	556207e579	st/xa: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:52 -07:00
Emil Velikov	f7df719b39	st/wgl: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:51 -07:00
Emil Velikov	9f03c763e9	st/vega: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:51 -07:00
Emil Velikov	bfbbc7c8c8	st/vdpau: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2013-10-01 07:29:51 -07:00
Emil Velikov	c0024c4548	st/osmesa: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:51 -07:00
Emil Velikov	921fdf1429	st/glx: consolidate C sources list into Makefile.sources Move glx/{,xlib/}Makefile.am to preserve file list Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:51 -07:00
Emil Velikov	760c1a6e66	st/gbm: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:51 -07:00
Emil Velikov	4e9028b638	st/egl: consolidate C sources lists into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:51 -07:00
Emil Velikov	edd11ece38	st/dri/sw: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:51 -07:00
Emil Velikov	f9ddeac213	st/dri: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:50 -07:00
Emil Velikov	d8afbc6177	st/clover: consolidate CPP sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:50 -07:00
Emil Velikov	1918c37008	galahad: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:50 -07:00
Emil Velikov	38d80c01d0	noop: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:50 -07:00
Emil Velikov	d7c66ff59e	identity: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:50 -07:00
Emil Velikov	959ed5c163	freedreno: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:50 -07:00
Emil Velikov	b91a9cdeaa	trace: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:50 -07:00
Emil Velikov	e369126709	llvmpipe: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:49 -07:00
Emil Velikov	2234e187c6	rbug: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:49 -07:00
Emil Velikov	9bc5ced1c7	softpipe: consolidate C sources list into Makefile.sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:49 -07:00
Emil Velikov	6ea73bb395	r600: use NEED_RADEON_LLVM over R600_NEED_RADEON_GALLIUM libllvmradeon.la is available whenever NEED_RADEON_LLVM is set, using R600_NEED_RADEON_GALLIUM is rather ambiguous and unnecessary. Drop it in favour of NEED_RADEON_LLVM. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:49 -07:00
Emil Velikov	4334666b47	gallium/radeon: drop unused variable LIBGALLIUM_LIBS Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:49 -07:00
Emil Velikov	e11ff60e28	mesa/drivers: drop HAVE__DRI from individual makefiles The mesa/drivers/dri/Makefile.am already guards the individual targets/subdirs with HAVE__DRI before including them. Thus making the additional check within each Makefile.am unnecessary. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-10-01 07:29:49 -07:00
Johannes Obermayr	cb1febb074	gallium/targets: Make use of prebuilt libdricommon.la. libdricommon.la is available whenever a non swrast driver is built. All the classic dri drivers make use of the prebuilt library but all of the gallium ones rebuild it explicitly. While we're here gallium/{llvm,soft}pipe does not require HAVE_COMMON_DRI thus do not set it during configure. v2: [Emil] Add commit message and drop HAVE_COMMON_DRI from configure.ac v3: [Emil] Rebase and resolve targets/r*/dri conflicts Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-10-01 07:29:49 -07:00
Vinson Lee	eb0a57acaa	i915: Fix memory leak in do_blit_readpixels. Fixes "Resource leak" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-09-30 22:08:48 -07:00
Vinson Lee	76df7edacf	llvmpipe: Remove unnecessary null check of shader. shader has already been dereferenced earlier so cannot be null here. Fixes "Dereference before null check" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-30 22:00:54 -07:00
Vinson Lee	ac82495d6d	util/u_format: Assert that format block size is at least 1 byte. The block size for all formats is currently at least 1 byte. Add an assertion for this. This should silence several Coverity "Division or modulo by zero" defects. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-30 21:53:04 -07:00
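The assertion pattern described here can be sketched as follows. `FormatBlock` and both helpers are hypothetical and only loosely modelled on a format description; the idea is that the invariant is stated once, before any division that depends on it.

```cpp
#include <cassert>

// Hypothetical format block description: dimensions in pixels, size in bits.
struct FormatBlock {
    unsigned width;
    unsigned height;
    unsigned bits;
};

// Block size in bytes. The assertion documents the invariant that every
// format occupies at least one byte, so callers may divide by the result.
inline unsigned block_size_bytes(const FormatBlock &b)
{
    const unsigned bytes = b.bits / 8;
    assert(bytes > 0);
    return bytes;
}

// Number of whole blocks that fit in a buffer of total_bytes bytes.
inline unsigned blocks_in(unsigned total_bytes, const FormatBlock &b)
{
    return total_bytes / block_size_bytes(b);
}
```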
Vinson Lee	505a6de7fc	draw: Add a null check for draw. There is an earlier null check for draw so draw could be null here as well. Fixes "Dereference after null check" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-30 21:46:42 -07:00
Vinson Lee	9b388c66fc	st/vdpau: Include u_surface.h for u_copy_rect. Fix build errors. CC surface.lo surface.c: In function 'vlVdpVideoSurfaceGetBitsYCbCr': surface.c:247:10: error: implicit declaration of function 'util_copy_rect' [-Werror=implicit-function-declaration] CC output.lo output.c: In function 'vlVdpOutputSurfaceGetBitsNative': output.c:216:4: error: implicit declaration of function 'util_copy_rect' [-Werror=implicit-function-declaration] Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-09-30 20:49:38 -07:00
Vinson Lee	05474ac9c4	st/vdpau: Include u_format.h for util_format_description. Fix build error. CC device.lo device.c: In function 'vlVdpDefaultSamplerViewTemplate': device.c:251:4: error: implicit declaration of function 'util_format_description' [-Werror=implicit-function-declaration] device.c:251:9: warning: assignment makes pointer from integer without a cast [enabled by default] device.c:252:12: error: dereferencing pointer to incomplete type device.c:252:28: error: 'UTIL_FORMAT_SWIZZLE_0' undeclared (first use in this function) device.c:252:28: note: each undeclared identifier is reported only once for each function it appears in device.c:254:12: error: dereferencing pointer to incomplete type device.c:256:12: error: dereferencing pointer to incomplete type device.c:258:12: error: dereferencing pointer to incomplete type Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-09-30 20:38:06 -07:00
Vinson Lee	14442c46fb	st/xvmc: Include u_surface.h for u_copy_rect. This patch fixes the build error introduced with commit `81bb98e928`. CC subpicture.lo subpicture.c: In function 'upload_sampler': subpicture.c:181:4: error: implicit declaration of function 'util_copy_rect' [-Werror=implicit-function-declaration] subpicture.c: In function 'XvMCClearSubpicture': subpicture.c:304:21: error: storage size of 'uc' isn't known subpicture.c:328:4: error: implicit declaration of function 'util_fill_rect' [-Werror=implicit-function-declaration] subpicture.c:304:21: warning: unused variable 'uc' [-Wunused-variable] Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-09-30 20:15:53 -07:00
Brian Paul	9f6e76a91e	st/egl: include u_format.h for util_format_get_blocksize()	2013-09-30 19:02:27 -06:00
Brian Paul	1d05caf9f2	svga: fix pixel center integer The svga/d3d9 convention is that pixel centers are at integer coordinates. Fixes piglit glsl-arb-fragment-coord-conventions test. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-09-30 18:50:37 -06:00
Brian Paul	360610c89e	svga: return 0 for PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER Using the map/unmap path for glTexImage is a little bit faster than blitting. Also, this fixes about 50 assorted piglit failures that seem to be related to the blit version of glReadPixels. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-09-30 18:50:37 -06:00
Brian Paul	395fac25a6	svga: we don't support TGSI_OPCODE_CONT So return PIPE_SHADER_CAP_TGSI_CONT_SUPPORTED = 0.	2013-09-30 18:50:37 -06:00
Brian Paul	81bb98e928	gallium: include u_surface.h instead of u_rect.h u_rect.h was including u_surface.h just to avoid touching a bunch of other source files after some functions were moved from u_rect.h to u_surface.h. This patch cleans up that hack. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-09-30 18:50:37 -06:00
Eric Anholt	48b9720272	i965: Reenable glBitmap() after the sRGB winsys enabling. The format of the window system framebuffer changed from ARGB8888 to SARGB8, but we're still supposed to render to it the same as ARGB8888 unless the user flipped the GL_FRAMEBUFFER_SRGB switch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> NOTE: This is a candidate for stable branches.	2013-09-30 16:49:43 -07:00
Ian Romanick	3e1fdf3899	mesa: Remove all traces of GL_OES_matrix_get I believe this extension was enabled by accident. As far as I can tell, there has never been any code in Mesa to actually support it. Not only that, this extension is only useful in the common-lite profile, and Mesa does the common profile. This "fixes" the piglit test oes_matrix_get-api. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-30 16:40:00 -07:00
Carl Worth	9baf35de5c	Use -Bsymbolic when linking libEGL.so For some reason that I don't yet fully understand, Glaze does not work with libEGL unless libEGL is linked with -Bsymbolic.[*] Beyond that specific reason, all of the reasons for which libGL.so is linked with -Bsymbolic, (see the commit history), should also apply here. [*] The specific behavior I am seeing is that when Glaze calls dlopen for libEGL.so, ifunc resolvers within Glaze for EGL functions are called before the dlopen returns. These resolvers cannot succeed, as they need the return value from dlopen in order to find the functions to resolve to. I don't know what's causing these resolvers to be called, but I have verified that linking libEGL with -Bsymbolic causes this problematic behavior to stop. CC: "9.1 and 9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-09-30 15:49:16 -07:00
Paul Berry	4c4934636c	i965/blorp: retype destination register for texture SEND instruction to UW. From the bspec documentation of the SEND instruction: "destination region cannot cross the 256-bit register boundary." To avoid violating this restriction when executing SIMD16 texturing operations (such as those used by blorp), we need to ensure that the destination of the SEND instruction doesn't exceed 256 bits in size. An easy way to do this is to set the type of the destination register to UW (unsigned word), since 16 unsigned words can fit inside a 256-bit register. Fortunately, this has no effect on the sampling operation, since the sampler always infers the destination data type from the sampler message rather than from the type of the instruction operand. Previously, we did this for texturing operations issued by the vec4 and fs back-ends, but not for blorp. This patch makes blorp use the same trick. I haven't observed any behavioural difference on actual hardware due to this patch, but it avoids a warning from the simulator so it seems like the right thing to do. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Chad Versace <chad.versace@linux.intel.com>	2013-09-30 15:16:44 -07:00
Eric Anholt	1c7f75e45e	i965: Add a real native TexStorage path. We originally had a path that just did the loop and called ctx->Driver.AllocTextureImageBuffer(), which I moved into Mesa core. But we can do better, avoiding incorrect miptree size guesses and later texture validations by just directly allocating the miptree and setting it to all the images. v2: drop debug printf. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-09-30 14:35:42 -07:00
Eric Anholt	aff7f335c1	i965: Add missing license to intel_tex_validate.c. I've rewritten a lot of this file. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-09-30 14:35:42 -07:00
Eric Anholt	8037c0b69c	i965: Always allocate validated miptrees from level 0. No change in copies during a piglit run, but it's one less first_level != 0 in our codebase. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-09-30 14:35:42 -07:00
Eric Anholt	16060c5adc	i965: Don't relayout a texture just for baselevel changes. As long as the baselevel, maxlevel still sit inside the range we had previously validated, there's no need to reallocate the texture. I also hope this makes our texture validation logic much more obvious. It's taken me enough tries to write this change, that's for sure. Reduces miptree copy count on a piglit run by 1.3%, though the change in amount of data moved is much smaller. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-09-30 14:35:42 -07:00
Eric Anholt	97bdb4c039	i965: Don't allocate a 1-level texture when GL_GENERATE_MIPMAP is set. Given that a teximage that calls us with this flag set will immediately proceed to allocate the other levels, we can probably just go ahead and allocate those levels now. Reduces miptree copies in piglit by about .05%. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-09-30 14:35:42 -07:00
Eric Anholt	6ca9b532d8	i965: Stop allocating miptrees with first_level != 0. If the caller shows up with GL_BASE_LEVEL != 0, it doesn't mean that the texture will over the course of its lifetime have that nonzero baselevel, it means that the caller is filling the texture from the bottom up for some reason (one could imagine demand-loading detailed texture layers at runtime, for example). If we allocate from just the current baselevel, it means when they come along with the next level up, we'll have to allocate a new miptree and copy all of our bits out of the first miptree. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-09-30 14:35:42 -07:00
Eric Anholt	3b9a2dc938	i965: Drop a special case for guessing small miptree levels. Let's say you started allocating your 2D texture with level 2 of a tree as a 1x1 image. The driver doesn't know if this means that level 0 is 4x4 or 4x1 or 1x4, so we would just allocate a single 1x1 and let it get copied in to the real location at texture validate time later. Since this is just a temporary allocation that will get copied, the extra space allocation of just taking the normal path, which will happen to produce a 4x1 level 0, 2x1 level 1, and 1x1 level 2, is the right way to go, to reduce complexity in the normal case. No change in miptree copies over the course of a piglit run. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-09-30 14:35:42 -07:00
Eric Anholt	7de88ac380	i965: Totally switch around how we handle nonzero baselevel-first_level. This has no effect currently, because intel_finalize_mipmap_tree() always makes mt->first_level == tObj->BaseLevel. The change I made before to handle it (`b1080cfbdb`) got very close to working, but after fixing some unrelated bugs in the series, it still left tex-miplevel-selection producing errors when testing textureLod(). The problem is that for explicit LODs, the sampler's LOD clamping is ignored, and only the surface's MIP clamping is respected. So we need to use surface mip clamping, which applies on top of the sampler's mip clamping, so the sampler change gets backed out. Now actually tested with a non-regressing series producing a non-zero computed baselevel. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-09-30 14:35:42 -07:00
Eric Anholt	9c116d5eac	i965: Always look up from the object's mt when setting up texturing state. We know that the object's mt is equal to the firstimage's mt because it's gone through intel_finalize_mipmap_tree(). Saves a lookup of firstimage on pre-gen7. v2: Merge in the warning fix that appeared later in the series (noted by Chad) Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-09-30 14:35:42 -07:00
Vinson Lee	114ae47475	r600g/sb: Move variable dereference after null check. Fixes "Dereference before null check" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-09-30 10:27:52 -07:00
Brian Paul	0d441aac3d	st/mesa: fix comment typo	2013-09-30 09:06:52 -06:00
Marek Olšák	7b25f52a95	r600g,radeonsi: workaround for late shared screen initialization Accidentally broken by the consolidation.	2013-09-30 13:01:13 +02:00
Laurent Carlier	868791f0ba	r600g: Fix build failure introduced with r600_texture.c consolidation It seems that case with opencl enabled was forgotten Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-09-29 22:01:04 +02:00
Marek Olšák	4e9aa6711f	radeon: make texture logging more useful This has been very useful for tracking down bugs in libdrm. The *_PRINT_TEXDEPTH environment variables were probably never used, so I removed them.	2013-09-29 15:18:10 +02:00
Marek Olšák	e64633e8c3	r600g,radeonsi: share r600_texture.c The function r600_choose_tiling is new and needs a review. The only change in functionality is that it enables 2D tiling for compressed textures on SI. It was probably accidentally turned off. v2: don't make scanout buffers linear	2013-09-29 15:18:10 +02:00
Marek Olšák	4069d39465	r600g: remove compute_global_transfer_* calls from texture_transfer_map/unmap Textures can never have target==PIPE_BUFFER.	2013-09-29 15:18:10 +02:00
Marek Olšák	ef6680d3ee	r600g: move the low-level buffer functions for multiple rings to drivers/radeon Also slightly optimize r600_buffer_map_sync_with_rings.	2013-09-29 15:18:09 +02:00
Marek Olšák	1bb77f81db	r600g,radeonsi: consolidate tiling_info initialization and the util_format_s3tc_init calls too.	2013-09-29 15:18:09 +02:00
Marek Olšák	09fc5d6e26	radeonsi: implement clear_buffer using CP DMA, initialize CMASK with it More work needs to be done for this to be entirely shared with r600g. I'm just trying to share r600_texture.c now. The reason I put the implementation to si_descriptors.c is that the emit function had already been there.	2013-09-29 15:18:09 +02:00
Marek Olšák	68f6dec32e	r600g: move aux_context and r600_screen_clear_buffer to drivers/radeon This will be used in the next commit.	2013-09-29 15:18:09 +02:00
Marek Olšák	0cb9de1dd0	radeonsi: move debug options to R600_DEBUG	2013-09-29 15:18:09 +02:00
Marek Olšák	ba650ccf91	r600g: move some debug options to drivers/radeon	2013-09-29 15:18:09 +02:00
Marek Olšák	2814202ef4	r600g,radeonsi: share the async dma interface r600_texture.c is one step closer to r600g.	2013-09-29 15:18:09 +02:00
Marek Olšák	e916267285	radeonsi: move radeonsi-specific functions out of r600_texture.c	2013-09-29 15:18:08 +02:00
Marek Olšák	31169400a0	r600g,radeonsi: remove unused code	2013-09-29 15:18:08 +02:00
Marek Olšák	6f21009cb3	r600g: move r600g-specific functions out of r600_texture.c	2013-09-29 15:18:08 +02:00
Marek Olšák	bfea9c498d	r600g,radeonsi: consolidate r600_texture structures	2013-09-29 15:18:08 +02:00
Marek Olšák	4ea2e5a4e7	r600g: get rid of r600_texture::is_rat It's always 0.	2013-09-29 15:18:08 +02:00
Marek Olšák	ba29324dba	r600g: get rid of r600_texture::array_mode	2013-09-29 15:18:08 +02:00
Marek Olšák	39801d4ba7	r600g,radeonsi: consolidate transfer, cmask, and fmask structures	2013-09-29 15:18:08 +02:00
Marek Olšák	a62cd6949c	radeon drivers: handle PIPE_CAP_MAX_VIEWPORTS	2013-09-29 15:18:07 +02:00
Marek Olšák	900b1863c8	radeon/llvm: fix TGSI_OPCODE_UCMP This doesn't fix any known issue (I haven't run piglit with this yet), but the code was obviously completely wrong. It looks like copy-pasted from CMP. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-09-29 14:49:23 +02:00
Marek Olšák	2bda5f3298	st/mesa: fix GLSL mix(.., .., bvecN) v2: use CMP on drivers without native integer support	2013-09-29 14:42:42 +02:00
Tom Stellard	a64d3dd135	configure.ac: Add a more informative warning when libclc.pc is not found v2 v2: - Don't display an error message when the user doesn't ask for libclc. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-09-27 20:20:35 -07:00
Vinson Lee	b2d5757831	mesa: Include stdint.h in mtypes.h for uint32_t symbol. This patch fixes the MSVC build error introduced with commit `b2e327e08f`. api_arrayelt.c src\mesa\main/mtypes.h(1809) : error C2061: syntax error : identifier 'uint32_t' src\mesa\main/mtypes.h(1810) : error C2059: syntax error : '}' src\mesa\main/mtypes.h(1825) : error C2079: 'Minimum' uses undefined union 'gl_perf_monitor_counter_value' src\mesa\main/mtypes.h(1828) : error C2079: 'Maximum' uses undefined union 'gl_perf_monitor_counter_value' Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-09-26 20:48:47 -07:00
Kenneth Graunke	aac75f877d	i965/fs: Don't double-accept operands of logical and/or/xor operations. If the argument to emit_bool_to_cond_code() is an ir_expression, we loop over the operands, calling accept() on each of them, which generates assembly code to compute that subexpression. We then emit one or two final instructions that perform the top-level operation on those operands. If it's not an expression (say, a boolean-valued variable), we simply call accept() on the whole value. In commit `80ecb8f1` (i965/fs: Avoid generating extra AND instructions on bool logic ops), Eric made logic operations jump out of the expression path to the non-expression path. Unfortunately, this meant that we would first accept() the two operands, skip generating any code that used them, then accept() the whole expression, generating code for the operands a second time. Dead code elimination would always remove the first set of redundant operand assembly, since nothing actually used them. But we shouldn't generate it in the first place. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-09-26 16:55:18 -07:00
Kenneth Graunke	e5c49bc25b	i965: Add #define for MI_REPORT_PERF_COUNT on Gen6+. This appears in Volume 1 Part 1 of the Sandybridge PRM on page 48. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-26 16:55:18 -07:00
Kenneth Graunke	0f2da77307	i965: Add support for GL_AMD_performance_monitor on Ironlake. Ironlake's counters are always enabled; userspace can simply send a MI_REPORT_PERF_COUNT packet to take a snapshot of them. This makes it easy to implement. The counters are documented in the source code for the intel-gpu-tools intel_perf_counters utility. v2: Adjust for core data structure changes. Add a table mapping buffer object offsets to exposed counters (which changes each generation). Finally, add report ID assertions to sanity check the BO layout (thanks to Carl Worth). v3: Update for core BeginPerfMonitor hook changes (requested by Brian). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-09-26 16:55:18 -07:00
Kenneth Graunke	b2e327e08f	mesa: Add core support for the GL_AMD_performance_monitor extension. This provides an interface for applications (and OpenGL-based tools) to access GPU performance counters. Since the exact performance counters available vary between vendors and hardware generations, the extension provides an API the application can use to get the names, types, and minimum/maximum values of all available counters. Counters are also organized into groups. Applications create "performance monitor" objects, select the counters they want to track, and Begin/End monitoring, much like OpenGL's query API. Multiple monitors can be in flight simultaneously. v2: Pass ctx to all driver hooks (suggested by Christoph), and attempt to fix overallocation of bitsets (caught by Christoph). Incomplete. v3: Significantly rework core data structures. Store counters in groups rather than in a global list. Use their array index in the group's counter list as the ID rather than trying to store a globally unique counter ID. Use bitsets for active counters within a group, and also track which groups are active so that's easy to query. v4: Remove _mesa_ prefix on static functions; detect out of memory conditions in new_performance_monitor(); make BeginPerfMonitor hook return a boolean rather than setting m->Active or raising an error. Switch to GLuint/unsigned for NumGroups, NumCounters, and MaxActiveCounters (which also means switching a bunch of temporary variable types). All suggested by Brian Paul. Also, remove commented out code at the bottom of the block. Finally, fix the dispatch sanity test (noticed by Ian Romanick). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> [v3] Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-09-26 16:55:18 -07:00
Kenneth Graunke	f91475d4ab	glsl: Create and use a has_uniform_buffer_objects() helper. This is better than overriding the extension enable based on the language version; it's robust against shaders that do: #version 140 #extension GL_ARB_uniform_buffer_object : disable Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-09-26 16:55:18 -07:00
Kenneth Graunke	e4af55c78f	glsl: Create and use a has_explicit_attrib_location() helper. Explicit attribute locations are supported with GLSL 3.30, GLSL ES 3.00, or "#extension GL_ARB_explicit_attrib_location: enable". Using a helper function makes it easy to check for this. This enables support in GLSL 3.30, which was previously missing. Previously, we overrode the extension enable flag for ES 3.00. This is not robust against a shader such as: #version 330 #extension GL_ARB_explicit_attrib_location : disable Disabling extensions should not remove core language functionality. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-09-26 16:55:18 -07:00
Kenneth Graunke	e9b410b54d	mesa: Remove 'invalidate_state' parameter to _mesa_dirty_texobj(). Every caller passed true. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-09-26 16:55:18 -07:00
Eric Anholt	1c904466aa	mesa: Remove some remaining FEATURE_* detritus. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-26 16:29:39 -07:00
Chris Forbes	fe2528c0b6	i965: Fix cube array coordinate normalization Hardware requires the magnitude of the largest component to not exceed 1; brw_cubemap_normalize ensures that this is the case. Unfortunately, we would previously multiply the array index for cube arrays by the normalization factor. The incorrect array index would then cause the sampler to attempt to access either the wrong cube, or memory outside the cube surface entirely, resulting in garbage rendering or in the worst case, hangs. Alter the normalization pass to only multiply the .xyz components. Fixes broken rendering in the arb_texture_cube_map_array-cubemap piglit, which was recently adjusted to provoke this behavior. V2: Fix indent. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "9.2" mesa-stable@lists.freedesktop.org Reviewed-by: Eric Anholt <eric@anholt.net>	2013-09-26 18:24:22 +12:00
Zack Rusin	d83ef680e2	draw/clip: don't emit so many empty triangles Compress empty triangles (don't emit more than one in a row) and never emit empty triangles if we already generated a triangle covering a non-null area. We can't skip all null-triangles because c_primitives expects ones that were generated from vertices exactly at the clipping-plane, to be emitted. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-09-25 19:42:22 -04:00
Zack Rusin	60c448faea	llvmpipe: count c_primitives before discarding null prims We need to count the clipper primitives before the rasterizer discards one it considers to be null. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-09-25 19:41:02 -04:00
Zack Rusin	1291e833e7	llvmpipe: we need to subdivide if fb is bigger in either direction We need to subdivide triangles if either of the dimensions is larger than the max edge length, not when both of them are larger. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-09-25 19:38:21 -04:00
Marek Olšák	028b26e2ef	radeon/llvm: fix shadow cube texturing for GL3.0 The fix is at the end (TGSI_TEXTURE_SHADOWCUBE handling), but I also restructured the code for it to be more readable. Fixes spec/!OpenGL 3.0/sampler-cube-shadow. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-25 20:45:23 +02:00
Marek Olšák	57f38e9f92	radeonsi: fix blitting the last 2 mipmap levels of compressed textures This fixes compressedteximage piglit tests. +10 piglits Evergreen and Cayman have the same issue. R600 and R700 don't. Cc: "9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-25 20:45:22 +02:00
Marek Olšák	296adb6de9	radeonsi: add missing colorbuffer formats (rework format translation) This fixes some piglits, e.g: spec/!OpenGL 3.0/required-renderbuffer-attachment-formats. This can be ported to r600g. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-25 20:45:22 +02:00
Marek Olšák	f9ea435ebc	radeonsi: bypass alpha-test for integer colorbuffers Fixes spec/EXT_texture_integer/fbo-blending. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-25 20:45:22 +02:00
Marek Olšák	f7d004b9ad	r600g: fix texture buffer object cache flushing Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-09-25 20:45:22 +02:00
Marek Olšák	6317a3fb31	r600g: fix constant buffer cache flushing Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-09-25 20:45:22 +02:00
Christian König	4871128e58	radeon/winsys: keep screen pointer in winsys v2 Only create one screen for each winsys instance. This helps with buffer sharing and interop handling. v2: rebased and some minor cleanup Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-09-25 19:41:31 +02:00
Christian König	f6e2aa0e12	build/radeonsi: group all targets in common subdir Allows us to share more code between different targets. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2013-09-25 19:41:27 +02:00
Christian König	015853b568	build/r600: group all targets in common subdir Allows us to share more code between different targets. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2013-09-25 19:41:23 +02:00
Christian König	533e9a04b4	build/r300: group build target in common subdir Allows us to share more code between different targets. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Marek Olšák <marek.olsak@amd.com>	2013-09-25 19:41:03 +02:00
Christian König	1c57d9a6c6	radeon/uvd: try to place msg/fb buffer into GART This is only supported on NI+, but the kernel takes care of those limitations. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-09-25 10:59:03 +02:00
Christian König	f9f14201c1	radeon/uvd: move alignment to winsys Similar to GFX and DMA. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-09-25 10:58:58 +02:00
Christian König	5f6ae61e69	st/vdpau: use a separate lock per decoder Signed-off-by: Christian König <christian.koenig@amd.com>	2013-09-25 10:58:58 +02:00
Christian König	34b5a4e0d8	st/vdpau: use new vlc function to search for VC-1 start codes Signed-off-by: Christian König <christian.koenig@amd.com>	2013-09-25 10:58:58 +02:00
Christian König	eb1cb253b7	vl/mpeg12: use new vlc function to search for start codes Signed-off-by: Christian König <christian.koenig@amd.com>	2013-09-25 10:58:58 +02:00
Christian König	e3ecea9ddf	vl/vlc: add fast forward search for byte value Commonly used to find start codes and has far less overhead than searching manually. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-09-25 10:58:58 +02:00
Vinson Lee	59157d1c96	glsl: Initialize ir_lower_jumps_visitor member variables. Fixes "Uninitialized scalar field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-24 22:54:25 -07:00
Vinson Lee	94e3ecae2d	glsl: Initialize lower_vector_visitor::dont_lower_swz. Fixes "Uninitialized scalar field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-24 22:51:23 -07:00
Vinson Lee	74b02b8e3f	glsl: Initialize assignment_generator member variables. Fixes "Uninitialized pointer field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-24 22:16:39 -07:00
Vinson Lee	6128c226b4	glsl: Remove unused pointer value. Silences "Unused pointer value" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-09-24 22:10:36 -07:00
Zack Rusin	71ecc2cf71	Revert "llvmpipe: increase number of subpixel bits to eight" This reverts commit `755c11dc5e`. We agreed that this is a band-aid that's not very useful and the proper solution is to rewrite the rasterization algo so that it operates on 64 bit values. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-09-24 15:10:02 -04:00
Dylan Noblesmith	49f8fc64de	mesa: remove handcounted magic number Also make it a compile-time error with STATIC_ASSERT. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-24 11:29:17 -07:00
Dylan Noblesmith	ea3847b12e	mesa: remove outdated comment No such argument exists since this commit: commit `92f3fca0ea` Author: Ian Romanick <ian.d.romanick@intel.com> AuthorDate: Sun Aug 21 17:23:58 2011 -0700 Commit: Ian Romanick <ian.d.romanick@intel.com> CommitDate: Tue Aug 23 14:52:09 2011 -0700 mesa: Remove target parameter from dd_function_table::BufferSubData Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-24 11:27:12 -07:00
Dylan Noblesmith	2f5d41ce79	mesa: remove stale comment This line stopped making sense in the great sed replace of commit `f9995b3075` Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-24 11:27:03 -07:00
Zack Rusin	e5ec5aef2b	llvmpipe: align the array used for subdivided vertices When subdividing a triangle we're using a temporary array to store the new coordinates for the subdivided triangles. Unfortunately the array used for that was not aligned properly, causing random crashes in the llvm jit code which was trying to load vectors from it. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-09-23 18:10:51 -04:00
Vinson Lee	f036d55515	glapi: Move declaration before code. This patch fixes the MSVC build error introduced by commit `673129e0b9`. enums.c mesa\main\enums.c(3776) : error C2143: syntax error : missing ';' before 'type' mesa\main\enums.c(3781) : error C2065: 'elt' : undeclared identifier mesa\main\enums.c(3781) : warning C4047: '!=' : 'int' differs in levels of indirection from 'void *' mesa\main\enums.c(3782) : error C2065: 'elt' : undeclared identifier mesa\main\enums.c(3782) : error C2223: left of '->offset' must point to struct/union mesa\main\enums.c(3782) : warning C4033: '_mesa_lookup_enum_by_nr' must return a value Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-09-23 14:14:32 -07:00
Eric Anholt	11e494a572	mesa: Use -Bsymbolic in the linker to locally resolve Mesa-internal symbols. Normally, LD_PRELOAD will take precedence over your own symbols, which you want for things like malloc() in libc. But we don't have any local symbols we would want overridden (like hash_table_insert(), for example!), so tell the linker to resolve them internally. This also avoids calls through the PLT. Saves almost 100k on libdricore's size, and gets us a bunch of the performance back that we had with non-dricore. Reviewed-by: Ian Romanick <ian.d.romanick@.intel.com>	2013-09-23 12:45:22 -07:00
Eric Anholt	10ef949424	glsl: Hide many classes local to individual .cpp files in anon namespaces. This gives the compiler the chance to inline and not export class symbols even in the absence of LTO. Saves about 60kb on disk. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@.intel.com>	2013-09-23 12:45:22 -07:00
Eric Anholt	07572621bc	mesa: Drop an extra copy-and-pasted copy in the program clone function. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@.intel.com>	2013-09-23 12:45:22 -07:00
Eric Anholt	669b88eb12	mesa: Convert some runtime asserts to static asserts. Noticed while grepping through the code for something else. v2: Don't convert really-runtime asserts to static asserts. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@.intel.com>	2013-09-23 12:45:22 -07:00
Eric Anholt	673129e0b9	mesa: Shrink the size of the enum string lookup struct. Since it's only used for debug information, we can misalign the struct and save the disk space. Another 19k on a 64-bit build. v2: Make a compiler.h macro to only use the attribute if we know we can. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@.intel.com>	2013-09-23 12:45:22 -07:00
Eric Anholt	c0378b6400	mesa: Remove the extra enum strings and extra lookup table. Now that there's no name -> enum direction, we can drop the extra strings, and merge the offsets table and the reduced_enums table. Between the previous commit and this one, Mesa core drops by 30k. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@.intel.com>	2013-09-23 12:45:22 -07:00
Eric Anholt	3b29a6ec91	mesa: Remove _mesa_lookup_enum_by_name(). It's been unused for a long time. I stopped digging through git history as of 2009. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@.intel.com>	2013-09-23 12:45:22 -07:00
Zack Rusin	755c11dc5e	llvmpipe: increase number of subpixel bits to eight Unfortunately d3d10 requires a lot higher precision (e.g. wgf11clipping tests for it). The smallest number of precision bits with which it passes is 8. That means that we need to decrease the maximum length of an edge that we can handle without subdivision by 4 bits. Abstracted the code a bit to make it easier to change once we switch to 64bit rasterization. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-09-23 14:53:07 -04:00
Vinson Lee	6d29db715b	glsl: Define isnormal and copysign for MSVC to fix build. This patch fixes these MSVC build errors. ir_constant_expression.cpp src\glsl\ir_constant_expression.cpp(564) : warning C4244: '=' : conversion from 'int' to 'float', possible loss of data src\glsl\ir_constant_expression.cpp(1384) : error C3861: 'isnormal': identifier not found src\glsl\ir_constant_expression.cpp(1385) : error C3861: 'copysign': identifier not found Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69541 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Matt Turner <mattst88@gmail.com>	2013-09-22 16:11:36 -07:00
Johannes Obermayr	6016dabfa2	Suppress clang's warnings about unused CFLAGS and CXXFLAGS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-09-22 13:10:43 -07:00
Christian König	8bbcc43ad9	radeon/uvd: async flush the UVD cs No need to block for the CS thread here. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-09-22 10:33:20 +02:00
Christian König	01a0dbcb96	winsys/radeon: share winsys between different fd's Share the winsys between different fd's if they point to the same device. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-09-22 10:33:20 +02:00
Christian König	0653c66ef4	winsys/radeon: remove cs_queue_empty Waiting for an empty queue is nonsense and can lead to deadlocks if we have multiple waiters or another thread that continuously sends down new commands. Just post the cs to the queue and immediately wait for it to finish. This is a candidate for the stable branch. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-09-22 10:33:20 +02:00
Christian König	f7ccb84aa1	winsys/radeon: fix killing the CS thread Kill the thread only after we checked that it's not used any more, not before. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-09-22 10:33:20 +02:00
Eric Anholt	938956ad52	i965/gen4: Fix fragment program rectangle texture shadow compares. The rescale_texcoord(), if it does something, will return just the GLSL-sized coordinate, leaving out the 3rd and 4th components where we were storing our projected shadow compare and the texture projector. Deref the shadow compare before using the shared rescale-the-coordinate code to fix the problem. Fixes piglit tex-shadow2drect.shader_test and txp-shadow2drect.shader_test Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69525 NOTE: This is a candidate for stable branches. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-21 16:48:58 -07:00
Abdiel Janulgue	1266f01dc7	i965/gen7.5: Fix missing Shader Channel Select entries on Haswell Probably non-intentional, but the SURFACE_STATE setup refactoring for buffer surfaces had missed the scs bits when creating constant surface states. Fixes broken GLB 2.5 on Haswell where the knight's textures are missing Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-21 12:53:13 -07:00
Kenneth Graunke	4f1ebb8ddd	i965, mesa: Use the new DECLARE_R[Z]ALLOC_CXX_OPERATORS macros. These classes declared a placement new operator, but didn't declare a delete operator. Switching to the macro gives them a delete operator, which probably is a good idea anyway. This also eliminates a lot of boilerplate. v2: Properly use RZALLOC in Mesa IR/TGSI translators. Caught by Eric and Chad. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-09-21 09:17:21 -07:00
Kenneth Graunke	81a3759bb5	glsl: Use the new DECLARE_R[Z]ALLOC_CXX_OPERATORS in a bunch of places. This eliminates a lot of boilerplate and should be 100% equivalent. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-09-21 09:17:06 -07:00
Kenneth Graunke	bfbad9d1a8	ralloc: Introduce new macros for defining C++ new/delete operators. Most of our C++ classes define placement new and delete operators so we can do convenient allocation via: thing *foo = new(mem_ctx) thing(...) Currently, this is done via a lot of boilerplate. By adding simple macros to ralloc, we can condense this to a single line, making it trivial to add this feature to a new class. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-09-21 09:16:02 -07:00
Grigori Goronzy	edbbfac6cf	r600g: fast color clears for single-sample buffers Allocate a CMASK on demand and use it to fast clear single-sample colorbuffers. Both FBOs and window system colorbuffers are fast cleared. Expand as needed when colorbuffers are mapped or displayed on screen. v2: cosmetics, move transfer expansion into dma_blit Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-09-20 20:35:55 +02:00
Grigori Goronzy	56d9a397aa	r600g: add support for separately allocated CMASKs v2: check for NULL cbufs Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-09-20 20:35:55 +02:00
Marek Olšák	419cd5f2a2	gallium: add flush_resource context function r600g needs explicit flushing before DRI2 buffers are presented on the screen. v2: add (stub) implementations for all drivers, fix frontbuffer flushing v3: fix galahad Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-09-20 20:35:55 +02:00
Marek Olšák	d2bd63433a	radeonsi: simplify and fix MSAA texture sampling for array textures Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-20 20:35:55 +02:00
Marek Olšák	defedc0f61	radeonsi: fix textureOffset and texelFetchOffset GLSL functions Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-20 20:35:55 +02:00
José Fonseca	1569b3e536	llvmpipe: Fix rendering to PIPE_FORMAT_R10G10B10A2_UNORM. We must take rounding in consideration when re-scaling to narrow normalized channels, such as 2-bit normalized alpha. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-09-20 17:34:57 +01:00
José Fonseca	2ab4e1d1e6	draw: Ensure draw_pt_middle_end::bind_parameters is never NULL. Prevents calling NULL pointer with softpipe in certain cases. Trivial.	2013-09-20 17:34:57 +01:00
José Fonseca	75c394f567	tools/trace: Simple script to compare two traces. Based on the earlier apitrace tracediff.sh script.	2013-09-20 17:34:57 +01:00
Ian Romanick	1cc3b90d47	mesa: Silence GCC warning 'comparison between signed and unsigned integer expressions' Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-19 17:15:09 -05:00
Ian Romanick	7db6b5aa91	mesa: Fix broken call to print_table_stats The function takes a parameter, but none was given. Also, in the non-GET_DEBUG case, silence the unused parameter warning. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-19 17:15:09 -05:00
Ian Romanick	b4cf56cdf8	glsl: Set VertexProgram.MaxOutputComponents and FragmentProgram.MaxInputComponents in standalone scaffolding Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-19 17:14:49 -05:00
Ian Romanick	be8963a18f	mesa: Allow several ARB_geometry_shader4 queries in OpenGL 3.2 GL_MAX_GEOMETRY_TEXTURE_IMAGE_UNITS, GL_MAX_GEOMETRY_OUTPUT_VERTICES, GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS, and GL_MAX_GEOMETRY_UNIFORM_COMPONENTS all have the same enum value and meaning as their _ARB counterparts. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-19 16:29:44 -05:00
Ian Romanick	df371e2b1b	mesa: Expose MAX_GEOMETRY_{INPUT,OUTPUT}_COMPONENTS on OpenGL 3.2 The comment '# GL 3.0 / GLES3' was incorrect. The MAX_VERTEX_OUTPUT_COMPONENTS and MAX_FRAGMENT_INPUT_COMPONENTS queries were added in OpenGL 3.2 (with geometry shaders) and OpenGL ES 3.0. This just fixes that comment. v2: Add the GEOMETRY queries in the existing '# GL 3.2' section since they have nothing to do with GLES3. Suggested by Paul. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-19 16:29:44 -05:00
Ian Romanick	965d9e649d	mesa: Get GL_MAX_FRAGMENT_INPUT_COMPONENTS from FragmentProgram.MaxInputComponents In OpenGL ES 3.0 the minimum-maximum for GL_MAX_VERTEX_OUTPUT_VECTORS is 16, but the minimum-maximum for GL_MAX_FRAGMENT_INPUT_VECTORS is 15. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-19 16:29:44 -05:00
Ian Romanick	d1ade4eaf1	mesa: Get GL_MAX_VERTEX_OUTPUT_COMPONENTS from VertexProgram.MaxOutputComponents In OpenGL ES 3.0 the minimum-maximum for GL_MAX_VERTEX_OUTPUT_VECTORS is 16, but the minimum-maximum for GL_MAX_FRAGMENT_INPUT_VECTORS is 15. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-19 16:29:44 -05:00
Ian Romanick	67a2d31735	i915: Set VertexProgram.MaxOutputComponents and FragmentProgram.MaxInputComponents This was the only remaining place in Mesa that sets MaxVaryings without also setting these values. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-19 16:29:44 -05:00
Ian Romanick	e1f8c58590	i965: Set *Program.Max{Input,Output}Components Now that MaxVaryings is > 16, VertexProgram.MaxOutputComponents, GeometryProgram.MaxInputComponents, GeometryProgram.MaxOutputComponents, and FragmentProgram.MaxInputComponents also need to be set. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: Paul Berry <stereotype441@gmail.com>	2013-09-19 16:29:44 -05:00
Ian Romanick	d358c6b700	mesa: Set default values for Max{Input,Output}Components in init_program_limits Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-19 16:29:44 -05:00
Ian Romanick	052c9ae1f3	mesa: Remove gl_constants::MaxVaryingComponents There are no longer any users. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Cc: Zack Rusin <zackr@vmware.com>	2013-09-19 16:29:44 -05:00
Ian Romanick	d91249df1a	mesa: Use correct data for MAX_{VERTEX,GEOMETRY}_VARYING_COMPONENTS_ARB queries Previously gl_constants::MaxVaryingComponents was used. Now gl_constants::VertexProgram::MaxOutputs and gl_constants::GeometryProgram::MaxOutputs are used. This means that st_extensions.c had to be updated to set these fields instead of MaxVaryingComponents. It was previously the only place that set MaxVaryingComponents. I believe that the structure is allocated by calloc, so the value should be initialized to zero in non-Gallium drivers before and after my change. Right now nobody enables GL_ARB_geometry_shader4, so it's pretty much dead code anyway. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Cc: Zack Rusin <zackr@vmware.com>	2013-09-19 16:29:44 -05:00
Ian Romanick	a384238c3d	mesa: Track per-stage shader input and output limits independently In OpenGL 3.2 these are independently queryable. In addition, the spec has different minimum-maximums for various values. GL_MAX_VERTEX_OUTPUT_COMPONENTS is 64, but GL_MAX_GEOMETRY_OUTPUT_COMPONENTS (and GL_MAX_FRAGMENT_INPUT_COMPONENTS) is 128. In OpenGL ES 3.0 these are also independently queryable. The spec has different minimum-maximums for various values. GL_MAX_VERTEX_OUTPUT_VECTORS is 16, but GL_MAX_FRAGMENT_INPUT_VECTORS is 15. None of these values are used yet. I have just added space to the structures. Future patches will add users and eventually remove some old fields. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Cc: Zack Rusin <zackr@vmware.com>	2013-09-19 16:29:43 -05:00
Ian Romanick	d38765f3c8	mesa: Support GL_MAX_VERTEX_OUTPUT_COMPONENTS query with ES3 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>	2013-09-19 16:29:43 -05:00
Kenneth Graunke	b6b549ccfc	i965: Refactor Gen4-6 SURFACE_STATE setup for buffer surfaces. This was an embarrassingly large amount of copy and pasted code, and it wasn't particularly simple code either. By factoring it out into a helper function, we consolidate the complexity. v2: Properly NULL-check bo. Caught by Eric Anholt. v3: Do the subtraction by 1 in gen7_emit_buffer_surface_state, rather than making callers do it. This makes the buffer_size parameter the actual size of the buffer. Suggested by Paul Berry. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-09-19 10:52:58 -07:00
Kenneth Graunke	e114cbff96	i965: Refactor Gen7+ SURFACE_STATE setup for buffer surfaces. This was an embarrassingly large amount of copy and pasted code, and it wasn't particularly simple code either. By factoring it out into a helper function, we consolidate the complexity. v2: Properly NULL-check bo. Caught by Eric Anholt. v3: Do the subtraction by 1 in gen7_emit_buffer_surface_state, rather than making callers do it. This makes the buffer_size parameter the actual size of the buffer. Suggested by Paul Berry. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-09-19 10:52:58 -07:00
Kenneth Graunke	35a54ad02f	i965: Fix off by one errors in texture buffer size calculations. The value that's split into width/height/depth needs to be the size of the buffer minus one. This makes it consistent with the constant buffer and shader time SURFACE_STATE setup code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-09-19 10:52:58 -07:00
Kenneth Graunke	34b11334d4	i965: Fix writemask != 0 assertions on Sandybridge. This fixes myriads of regressions since commit `169f9c030c` ("i965: Add an assertion that writemask != NULL for non-ARFs."). On Sandybridge, our control flow handling (such as brw_IF) does: brw_set_dest(p, insn, brw_imm_w(0)); insn->bits1.branch_gen6.jump_count = 0; This results in an IMM destination with zero for the writemask. IMM destinations are rather bizarre, but the code has been working for ages, so I'm loath to change it. Fixes glxgears on Sandybridge. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-09-19 10:52:58 -07:00
Kenneth Graunke	d2d90d66d8	glsl: Delete builtin_builder::shader when destroying built-ins. I would use _mesa_delete_shader, but it's declared static, and we don't really need any of the stuff in it anyway. This fixes a memory leak caught by Valgrind. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-09-19 10:52:58 -07:00
Kenneth Graunke	9f64bb2312	i965: Fix brw_gs_prog_data_compare to actually check field members. &a and &b are the address of the local stack variables, not the actual structures. Instead of comparing the fields of a and b, we compared ...some stack memory. Not a candidate for stable since GS code doesn't exist in 9.2. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-09-19 10:52:57 -07:00
Kenneth Graunke	4e4b079916	i965: Fix brw_vs_prog_data_compare to actually check field members. &a and &b are the address of the local stack variables, not the actual structures. Instead of comparing the fields of a and b, we compared ...some stack memory. Caught by Valgrind on Piglit's glsl-lod-bias test (among many others). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68233 Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Cc: mesa-stable@lists.freedesktop.org	2013-09-19 10:52:57 -07:00
Kenneth Graunke	feaad189b4	i965: Move binding table code to a new file, brw_binding_tables.c. The code to upload the binding tables for each stage was scattered across brw_{vs,gs,wm}_surface_state.c and brw_misc_state.c, which also contain a lot of code to populate individual SURFACE_STATE structures. This patch brings all the binding table upload code together, and splits it out from the code which fills in SURFACE_STATE entries. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-09-19 10:52:57 -07:00
Kenneth Graunke	113a75ff2d	i965: Use brw_upload_binding_table() for the pixel shader as well. This is not quite the same: brw_upload_binding_table() also has code to early-return if there are no entries, while the existing code did not. The PS binding table is unlikely to be empty since it will have at least one color buffer. If it ever is empty, early returning seems wise. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-09-19 10:52:57 -07:00
Kenneth Graunke	72340839ca	i965: Generalize brw_vec4_upload_binding_table() beyond vec4 stages. Instead of passing in a brw_vec4_prog_data structure, we can simply pass the one field it needs: the number of entries in the binding table. We also need to pass in the shader time surface index rather than hardcoding SURF_INDEX_VEC4_SHADER_TIME. Since the resulting function is stage-agnostic, this patch removes "vec4_" from the name. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-09-19 10:52:57 -07:00
Kenneth Graunke	254891b3fc	i965: Convert loop to memcpy in brw_vec4_upload_binding_table(). This is probably more efficient. At any rate, it's less code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-09-19 10:52:57 -07:00
Kenneth Graunke	0532b200f3	i965: Update comments in brw_vec4_upload_binding_table(). The first comment was a bit stale; there are more kinds of surfaces than textures and pull constants. The second was a leftover "to do" comment for something I already did. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-09-19 10:52:57 -07:00
Gaetan Nadon	79930c6027	winsys/sw/xlib: fix compile error in xlib_sw_winsys.c. xlib_sw_winsys.h:5:22: fatal error: X11/Xlib.h: No such file or directory The compiler cannot find the Xlib.h in the installed system headers. All supplied include directives point to inside the mesa module. The X11_CFLAGS variable is undefined (not defined in config.status). It appears the intent was to use X11_INCLUDES defined in configure.ac. The Xlib.h file is not installed on my workstation. It is supplied in the libx11-dev package. This allows an X developer control over which version of this file is used for X development. Signed-off-by: Gaetan Nadon <memsize@videotron.ca> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-19 10:49:57 -07:00
Gaetan Nadon	092f2e8336	glx: fix compile error in egl_glx.c. egl_glx.c:40:22: fatal error: X11/Xlib.h: No such file or directory The compiler cannot find the Xlib.h in the installed system headers. All supplied include directives point to inside the mesa module. The X11_CFLAGS variable is undefined (not defined in config.status). It appears the intent was to use X11_INCLUDES defined in configure.ac. The Xlib.h file is not installed on my workstation. It is supplied in the libx11-dev package. This allows an X developer control over which version of this file is used for X development. Signed-off-by: Gaetan Nadon <memsize@videotron.ca> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-19 10:49:47 -07:00
Rob Clark	7dab097a51	freedreno/a3xx: fix typo mixup w/ mipfilter Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-19 11:47:40 -04:00
Rob Clark	575a6e7ec5	freedreno: fix glReadPixels duh, we still need to flush if there are pending draws and it isn't an unsynchronized case. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-19 11:45:01 -04:00
Roland Scheidegger	532dc8939f	gallivm: adjust wrap mode to CLAMP_TO_EDGE always for cube maps. Technically without seamless filtering enabled GL allows any wrap mode, which made sense when supporting true borders (can get seamless effect with border and CLAMP_TO_BORDER), but gallium doesn't support borders and d3d9 requires wrap modes to be ignored and it's a pain to fix up the sampler state (as it makes it texture dependent). It is difficult to imagine a situation where an app really wants another behavior so just cheat here. (It looks like some graphics hw (intel) actually requires this too hence it should be safe.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-09-19 17:14:36 +02:00
Adrian Negreanu	602d368446	android: Remove builtin_compiler The first part was done in: commit `c845140a20` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Tue Sep 3 21:22:17 2013 -0700 Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-09-18 09:35:55 -07:00
José Fonseca	e150c0da71	util/u_blit: Implement util_blit_pixels via pipe_context::blit. This removes a lot of code, but not everything, as util_blit_pixels_tex is still useful when one needs to override pipe_sampler_view::swizzle_?. Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-09-18 11:25:02 +01:00
José Fonseca	d8c7e13886	util/u_blit: Support blits from cubemaps. By calling util_map_texcoords2d_onto_cubemap. A new parameter for util_blit_pixels_tex is necessary, as pipe_sampler_view::first_layer is always supposed to point to the first face when sampling from cubemaps. Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-09-18 11:24:59 +01:00
José Fonseca	fb1d992da4	vega: Use pipe_context::blit instead of util_blit_pixels_tex. Only compile-tested but it seems straightforward. Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-09-18 11:23:28 +01:00
Kenneth Graunke	ec44d56a5b	i965: Rename brw_{fs,vec4}_emit.cpp to brw_{fs,vec4}_generator.cpp. The previous names were really confusing to talk about: - brw_fs_visitor() contained methods named emit_whatever(). - brw_fs_generator() contained methods named generate_whatever(), but lived in brw_fs_emit.cpp. So when someone said "the emit layer", or "emit code", we weren't sure whether they meant the visitor's emit() functions or the generator in brw_fs_emit.cpp. By renaming these files, the method names, class names, and file names all match, which is much less confusing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Eric Anholt <eric@anholt.net>	2013-09-18 00:08:31 -07:00
Matt Turner	a3b51a22f7	glsl: Correctly validate fma()'s types. lrp() can take a scalar as a third argument, and fma() cannot. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-17 17:02:06 -07:00
Matt Turner	d56bbd0441	glsl: Add frexp signatures and implementation. I initially implemented frexp() as an IR opcode with a lowering pass, but since it returns a value and has an out-parameter, it would break assumptions our optimization passes make about ir_expressions being pure (i.e., having no side effects). For example, if opt_tree_grafting encounters this code: uniform float u; void main() { int exp; float f = frexp(u, out exp); float g = float(exp)/256.0; float h = float(exp) + 1.0; gl_FragColor = vec4(f, g, h, g + h); } it may try to optimize it to this: uniform float u; void main() { int exp; float g = float(exp)/256.0; float h = float(exp) + 1.0; gl_FragColor = vec4(frexp(u, out exp), g, h, g + h); } Some hardware has an instruction which performs frexp(), but we would need some other compiler infrastructure to be able to generate it, such as an intrinsics system that would allow backends to emit specific code for particular bits of IR. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-17 17:01:58 -07:00
Matt Turner	c43d6060b1	i965: Lower ldexp. v2: Drop frexp lowering. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-17 16:59:26 -07:00
Matt Turner	d0b8ea60b7	glsl: Add ldexp_to_arith lowering pass. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-17 16:59:23 -07:00
Matt Turner	5561251b58	glsl: Allow vectors to be created from ir_constant(). Note the parameter name change in the int version of ir_constant, to avoid the conflict with the loop iterator. v2: Make analogous change to builtin_builder::imm(). Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-17 16:59:14 -07:00
Matt Turner	b2ab840130	glsl: Add support for ldexp. v2: Drop frexp. Rebase on builtins rewrite. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-17 16:59:05 -07:00
Paul Berry	4b0488ef4e	i965: Add some missing bits to {mesa,brw,cache}_bits[]. These data structures are used for debug output, so it wasn't hurting anything that there were missing bits. But it's good to keep things up to date. This patch also adds static asserts so that the {brw,cache}_bits[] arrays are the proper size, so that we don't forget to add to them in the future. Unfortunately there's no convenient way to assert that mesa_bits[] is the proper size. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-17 15:18:18 -07:00
Paul Berry	3374dabce7	i965/gs: Implement basic gl_PrimitiveIDIn functionality. If the geometry shader refers to the built-in variable gl_PrimitiveIDIn, we need to set a bit in 3DSTATE_GS to tell the hardware to dispatch primitive ID to r1, and we need to leave room for it when allocating registers. Note: this feature doesn't yet work properly when software primitive restart is in use (the primitive ID counter will incorrectly reset with each primitive restart, since software primitive restart works by performing multiple draw calls). I plan to address that in a future patch series. Fixes piglit test "spec/glsl-1.50/execution/geometry/primitive-id-in". Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-17 15:18:14 -07:00
Paul Berry	f67fa8f3c8	i965/gs: New gs primitive types are supported by HW primitive restart. When we previously implemented primitive restart, we didn't add cases to brw_primitive_restart.c's can_cut_index_handle_prims() for the primitive types that are introduced with geometry shaders. It turns out that all of the new primitive types are supported by hardware primitive restart. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-17 15:18:11 -07:00
Paul Berry	9791af90e3	i965/gs: Add new primitive types. As part of its support for geometry shaders, GL 3.2 introduces four new primitive types: GL_LINES_ADJACENCY, GL_LINE_STRIP_ADJACENCY, GL_TRIANGLES_ADJACENCY, and GL_TRIANGLE_STRIP_ADJACENCY. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-17 15:18:07 -07:00
Roland Scheidegger	93b5f71179	gallivm: some bits of seamless cube filtering implementation Simply adjust wrap mode to clamp_to_edge. This is all that's needed for a correct implementation for nearest filtering, and it's way better than using repeat wrap for instance for linear filtering (though obviously this doesn't actually do seamless filtering). v2: fix s/t wrap not r/s... Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-09-18 00:00:37 +02:00
Kenneth Graunke	b8244b0056	i965: Remove MIPLAYOUT_BELOW from Gen4-6 constant buffer surface state. Specifying a miptree layout makes no sense for constant buffers. This has no functional change since BRW_SURFACE_MIPMAPLAYOUT_BELOW is just a #define for 0. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-17 13:17:07 -07:00
Kristian Høgsberg	a1b6e69e45	egl: Also add EGL_TEXTURE_FORMAT as a valid eglQueryWaylandBufferWL attribute Now that we have a table of accepted eglQueryWaylandBufferWL() attributes, we should also list EGL_TEXTURE_FORMAT.	2013-09-16 22:22:49 -07:00
Stanislav Vorobiov	1281a90532	egl: add EGL_WAYLAND_Y_INVERTED_WL attribute This enables querying of wl_buffer's orientation	2013-09-16 22:20:27 -07:00
Kenneth Graunke	9ad6dda21e	i965: Use gen7_upload_constant_state for 3DSTATE_CONSTANT_PS as well. Now we use gen7_upload_constant_state() for all three shader stages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-16 18:25:14 -07:00
Kenneth Graunke	e776c18afb	i965: Set brw_stage_state::push_const_size for PS constants. This paves the way for using gen7_upload_constant_state for PS data. The formula is copied from gen7_wm_state.c. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-16 18:25:11 -07:00
Kenneth Graunke	d385edf4c3	i965: Introduce a prog_data temporary in gen6_upload_wm_push_constants. This saves a bit of typing and shortens a few lines. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-16 18:25:07 -07:00
Paul Berry	24765c58bd	i965/gen6+: Support 128 varying components. GL 3.2 requires us to support 128 varying components for geometry shader outputs and fragment shader inputs, and 64 varying components otherwise. But there's no hardware limitation that restricts us to 64 varying components, and core Mesa doesn't currently allow different stages to have different maximum values, so just go ahead and enable 128 varying components for all stages. This gets us better test coverage anyway. Even though we are only working on GL 3.2 support for gen7 right now, gen6 also supports 128 varying components, so go ahead and switch it on there too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:58 -07:00
Paul Berry	f5d38c58ee	i965/ff_gs: Generate URB writes using a loop. Previously we only ever did 1 URB write, since the maximum number of varyings we support is small enough to fit in 1 URB write (when using BRW_URB_SWIZZLE_NONE, which is what the pre-Gen7 GS always uses). But we're about to increase the number of varying components we support from 64 to 128. With 128 varyings, the most URB writes we'll have to do is 2, but it's just as easy to write a general-purpose loop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:55 -07:00
Paul Berry	57b8cff33c	i965/gen6: Fix assertions on VS/GS URB size. The "{VS,GS} URB Entry Allocation Size" fields of 3DSTATE_URB allow values in the range 0-4, but they are U8-1 fields, so the range of possible allocation sizes is 1-5. We were erroneously prohibiting a size of 5. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:52 -07:00
Paul Berry	784044c206	i965/vec4: Generate URB writes using a loop. Previously we only ever did 1 or 2 URB writes, since the maximum number of varyings we support is small enough to fit in 2 URB writes. But GL 3.2 requires the geometry shader to support 128 output varying components, and this could require up to 3 URB writes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:49 -07:00
Paul Berry	875972029e	i965/fs: When >64 input components, order them to match prev pipeline stage. Since the SF/SBE stage is only capable of performing arbitrary reorderings of 16 varying slots, we can't arrange the fragment shader inputs in an arbitrary order if there are more than 16 input varying slots in use. We need to make sure that slots 16-31 match the corresponding outputs of the previous pipeline stage. The easiest way to accomplish this is to just make all varying slots match up with the previous pipeline stage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:46 -07:00
Paul Berry	a4546ec114	i965/fs: Simplify computation of key.input_slots_valid during precompile. The for loop was rather silly. In addition to checking brw->gen < 6 on each loop iteration, it took pains to exclude bits from fp->Base.InputsRead that don't correspond to fragment shader inputs. But those bits would never have been set in the first place, since the only bits that are ever set in fp->Base.InputsRead are fragment shader inputs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:43 -07:00
Paul Berry	8a36f4382b	i965/gs: Stop storing an input VUE map in the GS program key. Now that the vertex shader output VUE map is determined solely by a 64-bit bitfield, we don't have to store it in its entirety in the geometry shader program key; instead, we can just store the bitfield, and let the geometry shader infer the VUE map at compile time. This dramatically reduces the size of the geometry shader program key, which we want to keep small since it gets recomputed whenever the active program changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:40 -07:00
Paul Berry	d1ad447f01	i965/gen6+: Remove VUE map dependency on userclip_active. Previously, on Gen6+, we laid out the vertex (or geometry) shader VUE map differently depending on whether user clipping was active. If it was active, we put the clip distances in slots 2 and 3 (where the clipper expects them); if it was inactive, we assigned them in the order of the gl_varying_slot enum. This made for unnecessary recompiles, since turning clipping on/off for a shader that used gl_ClipDistance might rearrange the varyings. It also required extra bookkeeping, since it required the user clipping flag to be provided to brw_compute_vue_map() as a parameter. With this patch, we always put clip distances in slots 2 and 3 if they are written to. do_vs_prog() and do_gs_prog() are responsible for ensuring that clip distances are written to when user clipping is enabled (as do_vs_prog() previously did for gen4-5). This makes the only input to brw_compute_vue_map() a bitfield of which varyings the shader writes to, a fact that we'll take advantage of in forthcoming patches. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:36 -07:00
Paul Berry	3a83b20dcc	i965/fs: Stop wasting input attribute space on gl_FragCoord and gl_FrontFacing. Previously, if a fragment shader accessed gl_FragCoord or gl_FrontFacing, we would assign them their own slots in the fragment shader input attribute array, using up space that could be made available to real varyings. This was not strictly necessary (since these values are not true varyings, and are instead computed from other data available in the FS payload). But we had to do it anyway because the SF/SBE setup code assumed that every 1 bit in the gl_program::InputsRead bitfield corresponded to a genuine varying variable. Now that the SF/SBE code consults brw_wm_prog_data and only sets up the attributes that the fragment shader actually needs, we don't have to do this anymore. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:32 -07:00
Paul Berry	0af1252ae4	i965/sf: Consult brw_wm_prog_data when setting up SF/SBE state. Previously, the SF/SBE setup code delivered varying inputs to the FS in the order in which they appear in the gl_program::InputsRead bitfield, since that's what the FS expects. When we add support for more than 64 varying components, this will no longer always be the case, because the Gen6+ SF/SBE stage is only capable of performing arbitrary reorderings of 16 varying slots. So, when there are more than 16 vec4's worth of varying inputs, the FS will have to adjust the order of its input varyings to partially match the order of outputs from the geometry or vertex shader. To allow extra flexibility in the ordering of FS varyings, this patch causes the SF/SBE to deliver varying inputs to the FS in exactly the order that the FS requests, by consulting brw_wm_prog_data::urb_setup and brw_wm_prog_data::num_varying_inputs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:29 -07:00
Paul Berry	af84bbd2ca	i965/sf: Consolidate common code for setting up gen6-7 attribute overrides. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:25 -07:00
Paul Berry	d5b4095356	i965/sf: Use BRW_SF_URB_ENTRY_READ_OFFSET rather than hardcoded values. We always program the SF unit to start reading the vertex URB entry at offset 1. In upcoming patches, we'll be adding FS code that relies on this. So consistently use the constant BRW_SF_URB_ENTRY_READ_OFFSET rather than hardcoding a 1. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:21 -07:00
Paul Berry	8c2b9bd1df	i965/fs: Consult brw_wm_prog_data::num_varying_inputs when setting up WM state. Previously, we assumed that the number of varying inputs consumed by the fragment shader was equal to the number of bits set in gl_program::InputsRead. However, we'll soon be making two changes that will cause that not to be true: - We'll stop wasting varying input space for gl_FragCoord and gl_FrontFacing, which aren't varyings. - For fragment shaders that have more than 16 varying inputs, we'll adjust the layout of the inputs to account for the fact that the SF/SBE pipeline stage can't reorder inputs beyond the first 16; if there are GS outputs that the FS doesn't use (or vice versa) this may cause the number of FS varying inputs to change. So, instead of trying to guess the number of FS inputs from gl_program::InputsRead, simply read it from brw_wm_prog_data::num_varying_inputs, which is guaranteed to be correct since it's populated by fs_visitor::calculate_urb_setup(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:18 -07:00
Paul Berry	8c69eaba1a	i965/fs: Change brw_wm_prog_data::urb_read_length to num_varying_inputs. On gen4-5, the FS stage reads varying inputs from URB entries that were output by the SF thread, where each register stores the interpolation setup for two components of a vec4, therefore the FS urb_read_length is twice the number of FS input varyings. On gen6+, varying inputs are directly deposited in the FS payload by the SF/SBE fixed function logic, so urb_read_length is irrelevant. However, in future patches, it will be nice to be able to consult brw_wm_prog_data to determine how many varying inputs the FS expects (rather than inferring it from gl_program::InputsRead). So instead of storing urb_read_length, we simply store num_varying_inputs in brw_wm_prog_data. On gen4-5, we multiply this by 2 to recover the URB read length. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:14 -07:00
Paul Berry	58f01bd17d	i965/fs: Expose "urb_setup" as part of brw_wm_prog_data. At the moment, for Gen6+, the FS assumes that all varying inputs are delivered to it in the order in which they appear in the gl_program::InputsRead bitfield, and the SF/SBE setup code ensures that they are delivered in this order. When we add support for more than 64 varying components, this will no longer always be possible, because the Gen6+ SF/SBE stage is only capable of performing arbitrary reorderings of 16 varying slots. To allow extra flexibility in the ordering of FS varyings, this patch causes the FS to advertise exactly what ordering it expects. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 12:53:05 -07:00
Chia-I Wu	4a6939edae	ilo: make ilo_bind_sampler_states return void So that it can be hooked up pipe_context::bind_sampler_states that is currently living on another branch.	2013-09-17 00:20:50 +08:00
Kenneth Graunke	120d100627	glsl/tests: Update .gitignore for new unit test. I rarely run 'git status', so I failed to notice this was missing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-16 08:26:09 -07:00
Kenneth Graunke	1da3ff1b1c	glsl/tests: Add a test for properties of sampler types. For each sampler type, this tests that: - The base type is GLSL_TYPE_SAMPLER. - The dimensionality is set correctly. - The returned data type is correct. - The sampler_array and sampler_shadow flags are set correctly. - sampler_coordinate_components() returns the correct value. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2013-09-15 21:48:20 -07:00
Dave Airlie	2f508f244e	st/mesa: don't dereference stObj->pt if NULL It seems a user app can get us into this state. I trigger the failure running fbo-maxsize inside virgl: it fails to create the backing storage for the texture object, but then segfaults here when it should fail the completeness test. Cc: "9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-09-16 08:33:02 +10:00
Dave Airlie	bbe3d6dc29	nouveau: fix regression since float comparison instructions (v2) Fix the return type and allow src and dst types for comparison to be separate; this at least fixes the two test cases I've written. v2: drop the u32->s32 change Acked-by: Christoph Bumiller <christoph.bumiller@speed.at> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-09-16 08:32:42 +10:00
Rico Schüller	6f52295129	vdpau/decode: Check max width and max height. Reviewed-by: Christian König <christian.koenig@amd.com>	2013-09-15 16:18:08 +02:00
Rob Clark	ffa3244534	freedreno: PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE When the old contents do not need to be preserved, it is faster to create a new backing bo rather than stall. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	d7be322410	freedreno/a3xx: fix VFD_INDEX_MAX overflow max_index may be 0xffffffff. The hardware does not need 1 + max_index (although it does not hurt unless max_index wraps around to zero). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	c756a3ef70	freedreno: add debug option to disable GMEM bypass Useful for debugging. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	cdec879e38	freedreno/a3xx: handle front_ccw Used by supertuxkart. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	cda75253f7	freedreno/a3xx: stencil fixes For mem->gmem we don't sample depth/stencil as its native type. So we need to set up the swizzle state for the sampler based on the format used for sampling. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	65ae4392ce	freedreno/a3xx: alpha-test Needed by some games, like etuxracer and supertuxkart which use alpha test rather than blending, to handle texture transparency. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	dbf041e61f	freedreno/a3xx/compiler: implement SUB Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	1a42d4ee34	freedreno/a3xx: use INDIRECT state load for shaders With a debug option to force DIRECT (mainly to make it easier for capturing cmdstream dumps). Using INDIRECT for large shaders at least makes a noticeable reduction in CPU load, which helps for CPU limited games. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	6e9c386d16	freedreno: avoid stalling at ringbuffer wraparound Because of how the tiling works, we can't really flush at arbitrary points very easily. So wraparound is handled by resetting to top of ringbuffer. Previously this would stall until current rendering is complete. Instead cycle through multiple ringbuffers to avoid a stall. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	ca505303a7	freedreno: emit markers to scratch registers Emit markers by writing to scratch registers in order to "triangulate" gpu lockup position from post-mortem register dump. By comparing register values in post-mortem dump to command-stream, it is possible to narrow down which DRAW_INDX caused the lockup. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	1e6d290f21	freedreno: split out WFI helper Mostly just to give an easy debug/instrumentation point. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	74052347f3	freedreno: fd_draw helper Have a single helper that all draws come through, mainly for a convenient debug and instrumentation point. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	4712904ddc	freedreno/a3xx: fix gpu lockup in some piglit tests The varying-out config comes from the inputs of the frag shader (so that we aren't exporting unneeded varyings). The varyings-count should come from the frag shader as well, to avoid a discrepancy in configuration and resulting gpu lockup. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	64c134cedb	freedreno/a3xx/compiler: add LIT Needed by glxgears and etuxracer ;-) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Rob Clark	cb9e07aa84	freedreno: multi-slice resources (cubemap, mipmap, etc) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-09-14 13:31:58 -04:00
Paul Berry	71ffac691b	glsl/builtins: Fix {texture1D,texture2D,shadow1D}ArrayLod availability. These functions are defined in EXT_texture_array, which makes no mention of what shader types they should be allowed in. At the time EXT_texture_array was introduced, functions ending in "Lod" were available only in vertex shaders, however this restriction was lifted in later spec versions and extensions. We already have the function lod_exists_in_stage() for figuring out whether functions ending in "Lod" should be available, so just re-use that. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-13 14:59:06 -07:00
Kenneth Graunke	4b3c0a797f	i965: Use brw_stage_state for WM data as well. This gets the VS, GS, and PS all using the same data structure. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-13 14:26:52 -07:00
Kenneth Graunke	e6e5f88848	i965: Increase the size of brw_stage_state::surf_offset. Since BRW_MAX_WM_SURFACES is greater than BRW_MAX_VEC4_SURFACES, the existing array isn't large enough to be used by the WM. Increasing it will make it possible to share them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-13 14:26:50 -07:00
Kenneth Graunke	3a835b699a	i965: Add comments to the new brw_stage_state structure's fields. These are largely based on the similar fields in brw->wm. v2: Add a better comment than "Scratch buffer". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-13 14:26:31 -07:00
Ian Romanick	ea373f03e8	mesa: Rename MESA_shader_integer_mix to EXT_shader_integer_mix Everyone at the Khronos meeting was as surprised that GLSL didn't already support this as we were. Several vendors said they'd ship it, but there didn't seem to be enough interest to put in the effort to make it ARB or KHR. v2: Fix a couple typos and rename the spec file to EXT_shader_integer_mix.spec. Suggested by Roland. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-09-13 09:56:36 -05:00
Marek Olšák	f4e35f897e	radeonsi: fix and enable transform feedback for CIK The CP_STRMOUT_CNTL register was moved again. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-09-13 01:08:04 +02:00
Marek Olšák	f317ce5c5d	radeonsi: fix gl_InstanceID with non-zero start_instance start_instance doesn't affect gl_InstanceID. There's no piglit test, but it's kinda obvious the code was wrong. Reviewed-by: Christian König <christian.koenig@amd.com>	2013-09-13 01:08:03 +02:00
Marek Olšák	9c75d2f65b	gallium: comment that INSTANCEID doesn't include start_instance Reviewed-by: Christian König <christian.koenig@amd.com>	2013-09-13 01:08:03 +02:00
Marek Olšák	122a880b78	radeonsi: enable streamout AKA transform feedback for SI Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-13 01:07:56 +02:00
Marek Olšák	8d03d923b6	radeonsi: implement streamout shader support The shader is responsible for writing to streamout buffers using the TBUFFER_STORE_FORMAT_* instructions. The locations of some input SGPRs and VGPRs are assigned dynamically, because the input SGPRs controlling streamout are not declared if they are not needed, decreasing the indices of all following inputs. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-13 01:04:44 +02:00
Marek Olšák	9d16e70b3f	radeonsi: implement glDrawTransformFeedback functionality Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-13 01:04:44 +02:00
Marek Olšák	6cf29c7dab	radeonsi: fix streamout queries Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-13 01:04:44 +02:00
Marek Olšák	91ede46222	radeonsi: implement streamout flush properly Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-13 01:04:44 +02:00
Marek Olšák	2993ccab38	radeonsi: bind streamout buffers to VGT and the vertex shader Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-13 01:04:44 +02:00
Marek Olšák	e4c5d3ee27	radeonsi: handle rasterizer_discard and set GS_OUT_PRIM_TYPE Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-13 01:04:44 +02:00
Marek Olšák	9eb3b9dc2b	radeonsi: initialize the first CS like any other So that the "init" state is always emitted first and not later in draw_vbo. This fixes streamout where the "init" state, which disables streamout, was emitted in draw_vbo after streamout was enabled. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-13 01:04:44 +02:00
Marek Olšák	2b0a54d6ec	radeonsi: integrate shared streamout state Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-13 01:04:44 +02:00
Marek Olšák	4ea35023c5	radeon: don't emit streamout state if there are no streamout buffers This could happen if set_stream_output_targets is called twice in a row without a draw call in between. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-13 01:04:44 +02:00
Marek Olšák	60416cb173	radeon: don't emit VGT_STRMOUT_BUFFER_BASE on SI The register doesn't exist on SI. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-09-13 01:04:44 +02:00
Kenneth Graunke	2b71b3d466	mesa: Disallow relinking if a program is used by an active XFB object. Paused transform feedback objects may refer to a program other than the current program. If any active objects refer to a program, LinkProgram must reject the request to relink. The code to detect this is ugly since _mesa_HashWalk is awkward to use, but unfortunately we can't use hash_table_foreach since there's no way to get at the underlying struct hash_table (and even then, we'd need to handle locking somehow). Fixes the last subcase of Piglit's new ARB_transform_feedback2 api-errors test. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-09-12 10:19:10 -07:00
Kenneth Graunke	9cc74c93f8	mesa: Reject ResumeTransformFeedback if the wrong program is bound. This is actually a pretty important error condition: otherwise, you could set up transform feedback with one program, and resume it with a program that generates a completely different set of outputs. Fixes a subcase of Piglit's new ARB_transform_feedback2 api-errors test. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-09-12 10:19:09 -07:00
Kenneth Graunke	c732f68cf4	mesa: Track the vertex program active at BeginTransformFeedback() time. The next few patches will use this for API error checking. All of the drivers appear to CALLOC_STRUCT transform feedback objects, so this should be properly NULL initialized on creation. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-09-12 10:19:07 -07:00
Kenneth Graunke	a7d616da69	mesa: Disallow TransformFeedbackVaryings when active. Fixes a subcase of Piglit's new ARB_transform_feedback2 api-errors test. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-09-12 10:18:59 -07:00
Christian König	2487324591	radeon/uvd: move more logic into the common files Move the code back into the common UVD files since we now have base structures for R600 and radeonsi. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-09-12 15:16:30 +02:00
Christian König	56be937d42	radeon/uvd: use more sane defaults for bitstream buffer size Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-09-12 15:16:06 +02:00
Andreas Boll	32637f56a5	os: First check for __GLIBC__ and then for PIPE_OS_BSD Fixes FTBFS on kfreebsd-* Debian GNU/kFreeBSD doesn't provide getprogname() since it uses stdlib.h from glibc. Instead it provides program_invocation_short_name from glibc. You can find the same order in src/mesa/drivers/dri/common/xmlconfig.c Cc: "9.2" <mesa-stable@lists.freedesktop.org> Tested-by: Julien Cristau <jcristau@debian.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-12 12:35:34 +02:00
José Fonseca	315f8f17d0	llvmpipe: Remove the special path for TGSI_OPCODE_EXP. It was wrong for EXP.y, as we clamped the source before computing the fractional part, and this opcode should be rarely used, so it's not worth the hassle.	2013-09-12 11:24:24 +01:00
José Fonseca	e75211df0f	trace: Several enhancements to dump_state.py - Handle more calls - Handle more state - Try to normalize the output a bit, to eliminate spurious differences	2013-09-12 11:24:24 +01:00
José Fonseca	9641f1037c	trace: Support bigger TGSI shaders. Trivial.	2013-09-12 11:24:24 +01:00
Kenneth Graunke	c59659ca08	glsl: Use sampler_coordinate_components instead of passing it by hand. We used to pass the number of components actually used for the coordinate (rather than padding, shadow comparators, and projectors) by hand, specifying it on every _texture() call. The new helper function can just compute this, eliminating a lot of potential mistakes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-11 22:48:32 -07:00
Kenneth Graunke	694be9115d	glsl: Add a new glsl_type::sampler_coordinate_components() function. This computes the number of components necessary to address a sampler based on its dimensionality. It will be useful for texturing built-ins. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-11 22:48:32 -07:00
Johannes Obermayr	5eb7ff1175	Move nv30, nv50 and nvc0 to nouveau. It is planned to ship openSUSE 13.1 with -shared libs. nouveau.la, nv30.la, nv50.la and nvc0.la are currently LIBADDs in all nouveau related targets. This change makes it possible to easily build one shared libnouveau.so which is then LIBADDed. Also dlopen will be faster for one library instead of three and build time on -jX will be reduced. Whitespace fixes were requested by 'git am'. Signed-off-by: Johannes Obermayr <johannesobermayr@gmx.de> Acked-by: Christoph Bumiller <christoph.bumiller@speed.at> Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2013-09-11 21:47:07 +02:00
Paul Berry	ebcdaa7bbc	i965/gs: implement EndPrimitive() functionality in the visitor. According to GLSL, the shader may call EndPrimitive() at any point during its execution, causing the line or triangle strip currently being output to be terminated and a new strip to be begun. This is implemented in gen7 hardware by using one control data bit per vertex, to indicate whether EndPrimitive() was called after that vertex was emitted. In order to make this work without sacrificing too much efficiency, we accumulate 32 control data bits at a time in a GRF. When we have accumulated 32 bits (or when the shader terminates), we output them to the appropriate DWORD in the control data header and reset the accumulator to 0. We have to take special care to make sure that EndPrimitive() calls that occur prior to the first vertex have no effect. Since geometry shaders that output a large number of vertices are likely to be rare, an optimization kicks in if max_vertices <= 32. In this case, we know that we can wait until the end of shader execution before any control data bits need to be output. I've tried to write the code in such a way that in the future, we can easily adapt it to output stream ID bits (which are two bits/vertex instead of one). Fixes piglit tests "spec/glsl-1.50/glsl-1.50-geometry-end-primitive *". Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-11 11:17:54 -07:00
Paul Berry	564a900a45	i965/vec4: Add the ability to emit opcodes with just a dst register. This is needed for GS_OPCODE_PREPARE_CHANNEL_MASKS. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-11 11:17:50 -07:00
Paul Berry	6ced0fa57f	i965/gs: Add opcodes needed for EndPrimitive(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-11 11:17:41 -07:00
Paul Berry	a74af8148d	i965/gen7: Add the ability to send URB_WRITE_OWORD messages. Previously, brw_urb_WRITE() would always generate a URB_WRITE_HWORD message, since we always wanted to write data to the URB in pairs of varying slots or larger (an HWORD is 32 bytes, which is 2 varying slots). In order to support geometry shader EndPrimitive functionality, we'll need the ability to write to just a single OWORD (16 byte) slot, since we'll only be outputting 32 of the control data bits at a time. So this patch adds a flag that will cause brw_urb_WRITE to generate a URB_WRITE_OWORD message. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-11 11:17:31 -07:00
Paul Berry	bf5419e389	i965/gen7: Allow URB_WRITE channel masks to be used. Previously, brw_urb_WRITE() would unconditionally override the channel masks in the URB_WRITE message to 0xff (indicating that all channels should be written to the URB). In order to support geometry shader EndPrimitive functionality, we'll need the ability to set the channel masks programmatically, so that we can output just 32 of the control data bits at a time. So this patch adds a flag that will prevent brw_urb_WRITE() from overriding them. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-11 11:17:24 -07:00
Paul Berry	247f90c77e	i965/gs: Set control data header size/format appropriately for EndPrimitive(). The gen7 geometry shader uses a "control data header" at the beginning of the output URB entry to store either (a) flag bits (1 bit/vertex) indicating whether EndPrimitive() was called after each vertex, or (b) stream ID bits (2 bits/vertex) indicating which stream each vertex should be sent to (when multiple transform feedback streams are in use). Fortunately, OpenGL only requires separate streams to be supported when the output type is points, and EndPrimitive() only has an effect when the output type is line_strip or triangle_strip, so it's not a problem that these two uses of the control data header are mutually exclusive. This patch modifies do_vec4_gs_prog() to determine the correct hardware settings for configuring the control data header, and modifies upload_gs_state() to propagate these settings to the hardware. In addition, it modifies do_vec4_gs_prog() to ensure that the output URB entry is large enough to contain both the output vertices and the control data header. Finally, it modifies vec4_gs_visitor so that it accounts for the size of the control data header when computing the offset within the URB where output vertex data should be stored. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> v2: Fixed incorrect handling of IVB/HSW differences. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-11 11:17:14 -07:00
Paul Berry	1a33e0233a	glsl: During linking, record whether a GS uses EndPrimitive(). This information will be useful in the i965 back end, since we can save some compilation effort if we know from the outset that the shader never calls EndPrimitive(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-11 11:16:35 -07:00
Paul Berry	79d9c6b7ff	i965/gs: Add a state atom to set up geometry shader state. v2: Do not attempt to share the code that uploads 3DSTATE_BINDING_TABLE_POINTERS_GS, 3DSTATE_SAMPLER_STATE_POINTERS_GS, or 3DSTATE_GS with VS. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> v3: Add _NEW_TRANSFORM to gen7_gs_state. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-11 11:16:25 -07:00
Paul Berry	ec5c924290	i965/gen7: Extract a function for setting up a shader stage's constants. This will allow us to reuse some code when setting up the geometry shader stage. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-11 11:16:19 -07:00
Torsten Duwe	3bc642cbf6	wayland-egl.pc requires wayland-client.pc. Mesa provides the wayland-egl libs and the pkgconfig file, but the headers originate from the wayland package. Ensure everything matches, by requiring application builds to look at the wayland headers as well. Signed-off-by: Torsten Duwe <duwe@suse.de> Signed-off-by: Johannes Obermayr <johannesobermayr@gmx.de>	2013-09-11 10:51:02 -07:00
Johannes Obermayr	87ebbe1270	st/gbm: Add $(WAYLAND_CFLAGS) for HAVE_EGL_PLATFORM_WAYLAND.	2013-09-11 10:50:34 -07:00
Maarten Lankhorst	b217d48364	st/dri: do not create a new context for msaa copy Commit `b77316ad75` st/dri: always copy new DRI front and back buffers to corresponding MSAA buffers introduced creating a pipe_context for every call to validate, which is not required because the callers have a context anyway. Only exception is egl_g3d_create_pbuffer_from_client_buffer, can someone test if it still works with NULL passed as context for validate? From examining the code I believe it does, but I didn't thoroughly test it. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Cc: 9.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-09-11 09:03:44 +02:00
Kenneth Graunke	169f9c030c	i965: Add an assertion that writemask != NULL for non-ARFs. We've observed GPU hangs on Ivybridge from the following instruction: mov(8) g115<1>.F 0D { align16 WE_normal NoDDChk 1Q }; There should be no reason to ever set the writemask on a destination register to zero, except for perhaps the ARF NULL register. This patch adds an assertion to enforce this for non-ARF registers. Excluding ARFs is conservative yet should still catch the majority of mistakes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-09-10 17:52:59 -07:00
Kenneth Graunke	4e5eb8ba25	i965/vec4: Only zero out unused message components when there are any. Otherwise, coordinates with four components would result in a MOV with a destination writemask that has no channels enabled: mov(8) g115<1>.F 0D { align16 WE_normal NoDDChk 1Q }; At best, this is stupid: we emit code that shouldn't do anything. Worse, it apparently causes GPU hangs (observable with Chris's textureGather test on CubeArrays.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <idr@freedesktop.org> Cc: mesa-stable@lists.freedesktop.org	2013-09-10 17:52:56 -07:00
Kenneth Graunke	17eb1df7b8	i965/vec4: Simplify the computation of coord_mask and zero_mask. We can easily compute these without loops, resulting in simpler and shorter code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Suggested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-09-10 17:52:36 -07:00
Matt Turner	66be7b4c27	docs: Clean up autoconf.html. Remove long dead options and clarify some things. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69148 Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-10 16:59:35 -07:00
Henri Verbeet	bd77f51758	mesa: Properly set the fog scale (gl_Fog.scale) to +INF when fog start and end are equal. This was originally introduced by commit `ba47aabc98`, but unfortunately the commit message doesn't go into much detail about why +INF would be a problem here. A similar issue exists for STATE_FOG_PARAMS_OPTIMIZED, but allowing infinity there would potentially introduce NaNs where they shouldn't exist, depending on the values of fog end and the fog coord. Since STATE_FOG_PARAMS_OPTIMIZED is only used for fixed function (including ARB_fragment_program with fog option), and the calculation there probably isn't very stable to begin with when fog start and end are close together, it seems best to just leave it alone. This fixes piglit glsl-fs-fogscale, and a couple of Wine D3D tests. No piglit regressions on Cayman. Signed-off-by: Henri Verbeet <hverbeet@gmail.com> Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-10 22:25:16 +02:00
Vinson Lee	09e385ee3b	mesa: Use correct enum conversion function. Fixes "Mixing enum types" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-10 10:56:38 -07:00
Vinson Lee	fd66a85f6b	mesa: Ensure gl_sync_object is fully initialized. `278372b47e` added the uninitialized pointer field gl_sync_object:Label. A free of this pointer, added in commit `6d8dd59cf5`, resulted in a crash. This patch fixes piglit ARB_sync regressions with swrast introduced by `6d8dd59cf5`. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-10 10:54:26 -07:00
Vinson Lee	49f2ba2cb0	radeonsi: Add parentheses around '|' operands. Fixes GCC parentheses warning. r600_texture.c: In function 'si_texture_create': r600_texture.c:518:20: warning: suggest parentheses around arithmetic in operand of '|' [-Wparentheses] !(templ->bind & PIPE_BIND_CURSOR | PIPE_BIND_LINEAR)) { ^ Fixes "Wrong operator used" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-09-10 10:44:09 -07:00
Vinson Lee	d93e23ba25	util: Fix unmatched parenthesis. Fixes MSVC build error introduced with commit `923d346714`. src\gallium\auxiliary\util\u_cpu_detect.c(286) : fatal error C1012: unmatched parenthesis : missing '(' Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-09-10 10:33:47 -07:00
Brian Paul	923d346714	util: don't use _fxsave() with MSVC 2010 or older And update _MSC_VER comments in p_config.h Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-09-10 11:01:37 -06:00
Vinson Lee	787ac4207e	glsl: Add missing va_end in builtin_builder::add_function. Fixes "Missing varargs init or cleanup" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-10 09:52:03 -07:00
Vinson Lee	118cdd1d3f	glsl: Initialize builtin_builder member variables. Fixes "Uninitialized pointer field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-10 09:49:02 -07:00
Brian Paul	395b941086	glsl: fix variadic macro for MSVC MSVC doesn't accept the rest... syntax. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-09 17:52:44 -06:00
Brian Paul	1ddb56d160	glsl: remove struct keyword from ir_variable declarations To silence MSVC warnings. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-09 17:52:44 -06:00
Kenneth Graunke	0bb3cd8090	Revert "i965/vec4: Only zero out unused message components when there are any." This reverts commit `6c3db2167c`, which I accidentally pushed along with other code. A better version of the fix will be committed later.	2013-09-09 15:33:16 -07:00
Matt Turner	89f5f675ad	i965: Allow immediates to be folded into logical and shift instructions. These instructions will be used with immediate arguments in the upcoming ldexp lowering pass and frexp implementation. v2: Add vec4 support as well. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 15:01:08 -07:00
Matt Turner	d83221c2d3	i965: Enable MESA_shader_integer_mix. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-09-09 15:01:08 -07:00
Matt Turner	56fff7063d	glsl: Implement MESA_shader_integer_mix extension. Because why doesn't GLSL allow you to do this already? Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-09-09 15:01:08 -07:00
Matt Turner	fd183fa02c	glsl: Use conditional-select in mix(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-09-09 15:01:08 -07:00
Matt Turner	8477262958	i965: Add support for ir_triop_csel. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-09-09 15:01:08 -07:00
Matt Turner	7aaa38728f	glsl: Add conditional-select IR. It's a ?: that operates per-component on vectors. Will be used in upcoming lowering pass for ldexp and the implementation of frexp. csel(selector, a, b): per-component result = selector ? a : b Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-09-09 15:01:08 -07:00
Kenneth Graunke	60850b7b9f	glsl: Rename ir_function_signature::builtin_info to builtin_avail. builtin_info was originally going to be a structure containing a bunch of information, but after various rewrites, it turned into a boolean availability predicate. builtin_avail is a better name than builtin_info, since it doesn't store any information other than availability. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 14:54:46 -07:00
Kenneth Graunke	260965b7a7	build: Delete cross-compiling macros. Now that builtin_compiler is gone, nothing uses these. Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 14:42:33 -07:00
Kenneth Graunke	b973b44a4d	glsl: Add missing type inference for ir_binop_bfm. Matt noticed that this was missing. Nothing uses this currently. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 14:42:33 -07:00
Kenneth Graunke	722eff674b	glsl: Delete old built-in function generation code. None of this is used anymore. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 14:42:33 -07:00
Kenneth Graunke	c845140a20	glsl: Remove builtin_compiler from the build system. We don't actually use anything from builtin_function.cpp, so we don't need to generate it anymore. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 14:42:33 -07:00
Kenneth Graunke	76d2f73643	glsl: Switch to the new built-in function module. All built-ins are now handled by the new code; the old system is dead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 14:42:33 -07:00
Kenneth Graunke	7ddc312c1b	glsl: Write a new built-in function module. This creates a new replacement for the existing built-in function code. The new module lives in builtin_functions.cpp (not builtin_function.cpp) and exists in parallel with the existing system. It isn't used yet. The new built-in function code takes a significantly different approach: Instead of implementing built-ins via printed IR, build time scripts, and run time parsing, we now implement them directly in C++, using ir_builder. This translates to faster load times, and a much less complex build system. It also takes a different approach to built-in availability: each signature now stores a boolean predicate, which makes it easy to construct arbitrary expressions based on _mesa_glsl_parse_state's fields. This is much more flexible than the old system, and also easier to use. Built-ins are also now stored in a single gl_shader object, rather than being spread out across a number of shaders that need to be linked. When searching for a matching prototype, we simply consult the availability predicate. This also simplifies the code. v2: Incorporate Matt Turner's feedback: use the new fma() function rather than expr(). Don't expose textureQueryLOD() in GLSL 4.00 (since it was renamed to textureQueryLod()). Also correct some #undefs. v3: Incorporate Paul Berry's feedback: rename legacy to compatibility; add comments to explain a few things; fix uvec availability; include shaderobj.h instead of repeating the _mesa_new_shader prototype. v4: Fix lack of TEX_PROJECT on textureProjGrad[Offset] (caught by oglc). Add an out_var convenience function (more feedback by Matt Turner). v5: Rework availability predicates for Lod functions. They were broken. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Enthusiastically-acked-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 14:42:18 -07:00
Kenneth Graunke	8d90328eb3	glsl: Add optional parameters to the ir_factory constructor. Each ir_factory needs an instruction list and memory context in order to be useful. Rather than creating an object and manually assigning these, we can just use optional parameters in the constructor. This makes it possible to create a ready-to-use factory in one line: ir_factory body(&sig->body, mem_ctx); Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:52:22 -07:00
Kenneth Graunke	666df56551	glsl: Add IR builder shortcuts for a bunch of random opcodes. Adding new convenience emitters makes it easier to generate IR involving these opcodes. bitfield_insert is particularly useful, since there is no expr() for quadops. v2: Add fma() and rename lrp() operands to x/y/a to match the GLSL specification (suggested by Matt Turner). Fix whitespace issues. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:52:22 -07:00
Kenneth Graunke	1a6c0efa11	glsl: Expose IR builder support for arbitrary swizzling. IR builder already offers a lot of swizzling functions, such as swizzle_xxxx, swizzle_z, or swizzle_for_size. The swizzle_xxxx style is convenient if you statically know which components you want. swizzle_for_size is great if you want to select the first few components. However, if you want to select components based on, say, a loop counter, none of those are sufficient. IR builder actually already had support for arbitrary swizzling, but didn't expose it. This patch exposes that API. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:52:22 -07:00
Kenneth Graunke	202238824b	glsl: Add a new ir_builder::dotlike() function. dotlike() uses ir_binop_mul for scalars, and ir_binop_dot for vectors. When generating built-in functions, we often want to use regular multiply for scalar signatures, and dot() for vector signatures. ir_binop_dot only works on vectors, so we have to switch opcodes, even if the code is otherwise identical. dotlike() makes this easy. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:52:22 -07:00
Kenneth Graunke	d716b3376c	glsl: Add IR builder support for generating return statements. We use "ret" as the function name since "return" is a C++ keyword, and "ir_return" is already a class name. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:52:22 -07:00
Kenneth Graunke	f72a8498e7	glsl: Add IR builder support for conditional assignments. This adds two new signatures: assign(lhs, rhs, condition, writemask); assign(lhs, rhs, condition); All the other existing APIs still exist. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:52:22 -07:00
Kenneth Graunke	eff2ca1ac3	glsl: Add IR builder support for triops. Now that we have the ir_expression constructor that does type inference, this is trivial to do. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:52:22 -07:00
Kenneth Graunke	7f0f60cd84	glsl: Add an ir_expression triop constructor with type inference. We already have ir_expression constructors for unary and binary operations, which automatically infer the type based on the opcode and operand types. These are convenient and also required for ir_builder support. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:52:22 -07:00
Kenneth Graunke	183f7a3e6f	glsl: Add missing type inference support for ARB_gpu_shader5 unops. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:52:21 -07:00
Kenneth Graunke	33faaf0b4a	glsl: Initialize lod_info in the ir_texture constructor. This isn't strictly necessary, since creators of ir_texture objects should set LOD when relevant. However, it's nice to have a NULL pointer in case they forget. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:52:21 -07:00
Kenneth Graunke	1b3a482a96	glsl: Skip unavailable built-ins when printing out similar candidates. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:52:21 -07:00
Kenneth Graunke	1ffcef04ce	glsl: Skip unavailable built-ins when matching signatures. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:52:21 -07:00
Kenneth Graunke	3e820e3aef	glsl: Pass _mesa_glsl_parse_state into matching_signature and such. During compilation, we'll use this to determine built-in availability. The plan is to have a single shader containing every built-in in every version of the language, but filter out the ones that aren't actually available to the shader being compiled. At link time, we don't actually need this filtering capability: we've already imported prototypes for every built-in that the shader actually calls, and they're flagged as is_builtin(). The linker doesn't import any additional prototypes, so it won't pull in any unavailable built-ins. When resolving prototypes to function definitions, the linker ensures the values of is_builtin() match, which means that a shader can't trick the linker into importing the body of an unavailable built-in by defining a suspiciously similar prototype. In other words, during linking, we can just pass in NULL. It will work out fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:52:21 -07:00
Kenneth Graunke	0823a87a75	glsl: Add a method to tell whether a built-in is available. We can simply call the stored predicate function. If state is NULL, just report that the function is available. v2: Add a comment (requested by Paul Berry). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:52:16 -07:00
Kenneth Graunke	d403a10573	glsl: Mark _mesa_glsl_parse_state::is_version() as const. This promises the method won't modify the contents of the object. This allows us to call it even with a const pointer to the state. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:46:51 -07:00
Kenneth Graunke	4b0bac0dce	glsl: Convert ir_function_signature::is_builtin to a method. A signature is a built-in if and only if builtin_info != NULL, so we don't actually need a separate flag bit. Making a boolean-valued method allows existing code to ask the same question while not worrying about the internal representation. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:46:51 -07:00
Kenneth Graunke	ca321d07fd	glsl: Store a predicate for whether a built-in signature is available. For the upcoming built-in function rewrite, we'll need to be able to answer "Is this built-in function signature available?". This is actually a somewhat complex question, since it depends on the language version, GLSL vs. GLSL ES, enabled extensions, and the current shader stage. Storing such a set of constraints in a structure would be painful, so instead we store a function pointer. When creating a signature, we simply point to a predicate that inspects _mesa_glsl_parse_state and answers whether the signature is available in the current shader. Unfortunately, IR reader doesn't actually know when built-in functions are available, so this patch makes it lie and say that they're always present. This allows us to hook up the new functionality; it just won't be useful until real data is populated. In the meantime, the existing profile mechanism ensures built-ins are available in the right places. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-09 11:46:50 -07:00
Kenneth Graunke	6c3db2167c	i965/vec4: Only zero out unused message components when there are any. Otherwise, coordinates with four components would result in a MOV with a destination writemask that has no channels enabled: mov(8) g115<1>.F 0D { align16 WE_normal NoDDChk 1Q }; At best, this is stupid: we emit code that shouldn't do anything. Worse, it apparently causes GPU hangs (observable with Chris's textureGather test on CubeArrays.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Chris Forbes <chrisf@ijw.co.nz> Cc: mesa-stable@lists.freedesktop.org	2013-09-09 11:26:53 -07:00
Paul Berry	2924b5f73b	vbo: Implement new gs prim types in vbo_count_tessellated_primitives. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-09-09 09:34:46 -07:00
Ian Romanick	2937d704dc	i965: Enable AMD_seamless_cubemap_per_texture The change is very small. Do seamless filtering if either the context enable is set or the sampler enable is set. The AMD_seamless_cubemap_per_texture says: "If TEXTURE_CUBE_MAP_SEAMLESS_ARB is emabled (sic) globally or the value of the texture's TEXTURE_CUBE_MAP_SEAMLESS_ARB parameter is TRUE, seamless cube map sampling is enabled..." Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-08 07:54:12 -07:00
Ian Romanick	4a19503516	mesa: Always use seamless cubemap filtering in GLES3 Appendix F.2 of the OpenGL ES 3.0.0 spec says: "OpenGL ES 3.0 requires that all cube map filtering be seamless. OpenGL ES 2.0 specified that a single cube map face be selected and used for filtering." Setting the field only in the context will work fine with sampler objects (and drivers that support AMD_seamless_cubemap_per_texture) because seamless filtering is used if either the context or the sampler enable it: "If TEXTURE_CUBE_MAP_SEAMLESS_ARB is emabled (sic) globally or the value of the texture's TEXTURE_CUBE_MAP_SEAMLESS_ARB parameter is TRUE, seamless cube map sampling is enabled..." Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reported-by: Maxence Le Dore <maxence.ledore@gmail.com> Thanked-by: Maxence Le Dore <maxence.ledore@gmail.com>	2013-09-08 07:54:12 -07:00
Ian Romanick	e334ff43c4	mesa: Don't allow glSamplerParameteriv(GL_TEXTURE_CUBE_MAP_SEAMLESS) in ES There is no GL_TEXTURE_CUBE_MAP_SEAMLESS in any version of OpenGL ES or in any extension that applies to OpenGL ES. The same error check already occurs for glTexParameteri. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Cc: Maxence Le Dore <maxence.ledore@gmail.com>	2013-09-08 07:54:12 -07:00
Ian Romanick	7efe55cb2d	docs: initial 9.3 release notes file Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-09-08 07:54:11 -07:00
Chia-I Wu	e67f99bd29	ilo: preliminary GEN 7.5 support This is based on grepping for brw->is_haswell in i965 to see how GEN 7.5 differs from GEN 7. Slightly tested with Xonotic and some Mesa demos.	2013-09-08 01:22:52 +08:00
Alex Deucher	18805b16c8	radeonsi: add berlin pci ids Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-09-06 19:27:23 -04:00
Alex Deucher	9bc47dbe50	r600g: remove DMA padding This is now handled in the winsys. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-09-06 19:10:27 -04:00
Alex Deucher	a81beee37e	radeon/winsys: pad IBs to a multiple of 8 DWs This aligns the gfx, compute, and dma IBs to 8 DW boundaries. This aligns the IB to the fetch size of the CP for optimal performance. Additionally, r6xx hardware requires at least 4 DW alignment to avoid a hw bug. This also aligns the DMA IBs to 8 DW which is required for the DMA engine. This alignment is already handled in the gallium driver, but that patch can be removed now that it's done in the winsys. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: "9.2" <mesa-stable@lists.freedesktop.org> CC: "9.1" <mesa-stable@lists.freedesktop.org>	2013-09-06 19:08:35 -04:00
Axel Davy	e8f9195e5f	gallium, intel: Implements new __DRI_IMAGE_USE_LINEAR and PIPE_BIND_LINEAR flags to enforce no tiling. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2013-09-06 15:02:34 -07:00
Vinson Lee	0a0f543082	mesa: Ensure gl_query_object is fully initialized. `278372b47e` added the uninitialized pointer field gl_query_object:Label. A free of this pointer resulted in a crash. This patch fixes piglit regressions with swrast introduced by `6d8dd59cf5`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69047 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-06 14:51:51 -07:00
Zack Rusin	e9f1f6ab42	gallivm: support indirect registers on both dimensions We support indirect addressing only on the vertex index, but some shaders also use indirect addressing on attributes. This patch adds support for indirect addressing on both dimensions inside gs arrays. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-09-06 15:05:27 -04:00
Stéphane Marchesin	f9b37f7183	i915g: Document fall-through switch Fixes warning reported by Coverity.	2013-09-06 11:05:25 -07:00
Stéphane Marchesin	519a2cf950	i915g: Handle i915->batch == NULL correctly in flush Fixes warning reported by Coverity.	2013-09-06 11:05:24 -07:00
Stéphane Marchesin	9e14895884	i915g: Remove useless comparison Fixes "Macro compares unsigned to 0" defect reported by Coverity.	2013-09-06 11:05:24 -07:00
Stéphane Marchesin	7125af2957	i915g: Fix initial array index Fixes "Out-of-bounds read" defect reported by Coverity.	2013-09-06 11:05:24 -07:00
Brian Paul	ac8448dd97	mesa: add GL_KHR_debug functions to dispatch_sanity.cpp Fixes 'make check' failures. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-09-06 07:53:41 -06:00
Timothy Arceri	238201158f	docs: Add some notes on submitting patches Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-06 07:52:18 -06:00
Tom Stellard	505fad04f1	r600g/compute: Fix bug in compute memory pool When adding a new buffer to the beginning of the memory pool, we were accidentally deleting the buffer that was first in the buffer list. This was caused by a bug in the memory pool's linked list implementation.	2013-09-05 17:18:00 -07:00
Tom Stellard	f0435ebb07	r600g/compute: Don't flush the cs in pipe_context::launch_grid() This is the state tracker's responsibility. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-09-05 17:17:43 -07:00
Matt Turner	16cedf3a25	i965: Remove never used DPA2 opcode. DPA2 is listed in the "Defeatured Instructions" section of the 965 PRM, Volume 4: "The following instructions are removed from Gen4 implementation mainly due to implementation cost/schedule reasons. They are candidates for future generations." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-05 14:55:27 -07:00
Matt Turner	4a6100054c	i965: Remove never used RSR and RSL opcodes. RSR and RSL are listed in the "Defeatured Instructions" section of the 965 PRM, Volume 4: "The following instructions are removed from Gen4 implementation mainly due to implementation cost/schedule reasons. They are candidates for future generations." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-05 14:55:19 -07:00
Dominik Behr	0f6fce1585	glsl: propagate max_array_access through function calls. Fixes a bug where, if a uniform array is passed to a function, the accesses to the array are not propagated, so later all but the first vector of the uniform array are removed in parcel_out_uniform_storage, resulting in broken shaders and out-of-bounds access to arrays in brw::vec4_visitor::pack_uniform_registers. Cc: mesa-stable@lists.freedesktop.org Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Dominik Behr <dbehr@chromium.org>	2013-09-05 14:36:11 -07:00
Ilia Mirkin	85f7df81a9	nv30: fix inconsistent setting of push->user_priv It's set to &nv30->bufctx everywhere else. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-09-05 20:46:56 +02:00
Paul Berry	588ec545ac	i965/gen7.5: Fix lower bound on number of VS URB entries. Haswell GT2 and GT3 require the number of vertex shader URB entries to be at least 64, not 32. At the moment, we always meet this requirement automatically, because in the absence of a geometry shader, we assign all available URB space to the vertex shader. But when we turn on support for geometry shaders, this lower limit will become important. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-09-05 09:52:47 -07:00
Paul Berry	ae79e3332e	i965/vs: Move vs-specific code out of brw_vec4_visitor.cpp. This patch creates a new file brw_vec4_vs_visitor.cpp, to contain code that is specific to the vertex shader. Now the organization of vertex shader and geometry shader visitor code is symmetric: vs-specific code is in brw_vec4_vs_visitor.cpp, gs-specific code is in brw_vec4_gs_visitor.cpp, and code shared between vs and gs is in brw_vec4_visitor.cpp. Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-05 09:52:42 -07:00
Paul Berry	e241e7c979	i965/vec4: Make with_writemask() non-static. This will allow it to be shared between brw_vec4_visitor.cpp and brw_vec4_vs_visitor.cpp (which will be created in the next patch). Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-05 09:52:38 -07:00
Paul Berry	8f9a339c10	i965/vs: Move vs-specific code out of brw_vec4.h. Now brw_vec4.h contains only code that is shared between the vertex and geometry shaders. Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-05 09:52:33 -07:00
Paul Berry	9dfa8ae662	i965/gs: Don't assign gl_Layer its own slot in the VUE map. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-09-05 09:52:20 -07:00
Stéphane Marchesin	8709e2b6c5	i915g: Implement writemask fixup The fixup code emulates non-BGRA render targets by adding an extra instruction at the end of fragment shaders to swizzle the output. To do this, we also swizzle the blend function. However an oversight until now was that the writemask wasn't getting swizzled. This patch fixes that which fixes a bunch of piglit tests.	2013-09-04 19:48:18 -07:00
Stéphane Marchesin	b1461acf15	i915g: Stop calling draw_prepare_shader_outputs It's not useful on i915g since we don't support primid. Fixes piglit point tests on i915g.	2013-09-04 19:48:18 -07:00
Rico Schüller	8b302e1635	glx: Initialize OpenGL version to 1.0 The old code in dri2_glx suffered from a typographical error that caused the default version to be 2.1 instead of 1.2 (minimum required by the Linux OpenGL ABI). drisw_glx had a similar error resulting in a default version of 0.1. Some driver/card combinations (r200/RV280, i915/915G) don't support OpenGL 2.1. In some corner cases these create an indirect context instead of a direct context when calling glXCreateContextAttribsARB(). This happens because of a bad default value. To avoid this, just use the default value specified by the GLX_ARB_create_context specification: "The default values for GLX_CONTEXT_MAJOR_VERSION_ARB and GLX_CONTEXT_MINOR_VERSION_ARB are 1 and 0 respectively. In this case, implementations will typically return the most recent version of OpenGL they support which is backwards compatible with OpenGL 1.0 (e.g. 3.0, 3.1 + GL_ARB_compatibility, or 3.2 compatibility profile)" Refactor all the default value setting to dri2_convert_glx_attribs, and make sure the correct defaults are set in that one place. Signed-off-by: Rico Schüller <kgbricola@web.de> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla http://bugs.winehq.org/show_bug.cgi?id=34238 Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>	2013-09-04 16:07:21 -07:00
Stéphane Marchesin	4e861ac4a1	i915g: Add more optimizations This patch adds liveness analysis to i915g and a couple optimizations which benefit from it. One interesting optimization turns (fake) indirect texture accesses into direct texture accesses (the i915 supports a maximum of 4 indirect texture accesses). Among other things this fixes a bunch of piglit tests.	2013-09-04 12:11:02 -07:00
Ian Romanick	a974b915b6	glsl: Remove unused prog parameter from tfeedback_decl::init It looks like commit `53febac` removed the last user of that parameter. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-04 08:13:11 -07:00
Ian Romanick	0851aa7365	glsl: Validate qualifiers on VS color outputs with FS color inputs The vertex shader color outputs (gl_FrontColor, gl_BackColor, gl_FrontSecondaryColor, and gl_BackSecondaryColor) don't have the same names as the matching fragment shader color inputs (gl_Color and gl_SecondaryColor). As a result, the qualifiers on them were not being properly cross validated. Full spec compliance required ir_variable::used and ir_variable::assigned be set properly. Without the preceding patch, which fixes the ::clone method to copy them, this will not be the case. Fixes all of the previously failing piglit spec/glsl-1.30/linker/interpolation-qualifiers tests. v2: Update callers of cross_validate_types_and_qualifiers and cross_validate_front_and_back_color. The function signature changed in v2 of a previous patch. Suggested by Paul. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47755	2013-09-04 08:11:45 -07:00
Ian Romanick	ceceaf53ce	glsl: Copy ir_variable::assigned and ir_variable::used fields in ::clone method Nothing currently relies on this, but one of the next patches will. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-04 08:10:01 -07:00
Ian Romanick	c0e4a4adb7	glsl: Refactor a bunch of the code out of cross_validate_outputs_to_inputs The new function, cross_validate_types_and_qualifiers, will have multiple callers from this file in future commits. v2: Don't pass the names of the producer / consumer stages to cross_validate_types_and_qualifiers. Instead, pass the types and get the names only in the error paths. Suggested by Paul. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-04 08:08:15 -07:00
Ian Romanick	87252bf97b	glsl: Reallow precision qualifiers on structure members Changes to the grammar for GL_ARB_shading_language_420pack (commit `6eec502`) moved precision qualifiers out of the type_specifier production chain. This caused declarations such as: struct S { lowp float f; }; to generate parse errors. Section 4.1.8 (Structures) of both the GLSL ES 1.00 spec and GLSL 1.30 specs says: "Member declarators may contain precision qualifiers, but may not contain any other qualifiers." So, it sure seems like we shouldn't generate a parse error. :) Instead of type_specifier, use fully_specified_type in struct members. However, fully_specified_type allows a lot of other qualifiers that are not allowed on structure members, so explicitly disallow them. Note, this makes struct_declaration look an awful lot like member_declaration (used for interface blocks). We may want to (somehow) unify these rules to reduce code duplication at some point. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68753 Reported-by: Aras Pranckevicius <aras@unity3d.com> Cc: Aras Pranckevicius <aras@unity3d.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-09-04 08:02:23 -07:00
Timothy Arceri	51a279254f	mesa: Setup remaining infrastructure and enable KHR_debug Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-04 07:47:49 -06:00
Timothy Arceri	9405be4add	glapi: Setup autogeneration infrastructure for KHR_debug Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-04 07:47:49 -06:00
Timothy Arceri	6964fa7ea3	mesa: Remap debug type and severity Remap any type or severity exclusive to KHR_debug to something suitable for ARB_debug_output Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-04 07:47:49 -06:00
Timothy Arceri	b5c4795f38	mesa: Implement GL_DEBUG_OUTPUT Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-04 07:47:49 -06:00
Timothy Arceri	a7f5eb8ebb	mesa: Update build scripts to build object labels Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-04 07:47:49 -06:00
Timothy Arceri	262b5ff667	mesa: Implement KHR_debug ObjectLabel functions V3: make sure to add null terminator when setting label, generate error when the client specifies an explicit length that exceeds MAX_LABEL_LENGTH, set label pointer to NULL when freed, and output correct length in MAX_LABEL_LENGTH error message. V2: fixed indentation of comment Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-04 07:47:49 -06:00
Timothy Arceri	21b5bf712b	mesa: make _mesa_validate_sync() non-static Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-04 07:47:49 -06:00
Timothy Arceri	6d8dd59cf5	mesa: free object labels when deleting Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-04 07:47:48 -06:00
Timothy Arceri	278372b47e	mesa: add debug Label field to several data structures Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-04 07:47:48 -06:00
Timothy Arceri	6faf7052a2	mesa: make _mesa_lookup_list() non-static Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-04 07:47:48 -06:00
Timothy Arceri	97f9f11ec4	mesa: make _mesa_lookup_arrayobj() non-static Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-04 07:47:48 -06:00
Timothy Arceri	797b9dc3ff	mesa: Implement glPushDebugGroup and glPopDebugGroup V4: fixes _mesa_error() compiler warnings (BrianP). V3: removed C++ style comment V2: fixed spelling typo in comment Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-04 07:47:48 -06:00
Timothy Arceri	60f435319c	mesa: Add a clone function to mesa hash V2: const qualify table parameter Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-04 07:47:48 -06:00
Timothy Arceri	f5badf4671	mesa: Share common code between ARB_debug_output and KHR_debug functions Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-04 07:47:48 -06:00
Timothy Arceri	77d38fd3fb	mesa: Add some constants and state variables for KHR_debug functions Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-09-04 07:47:48 -06:00
Kenneth Graunke	644fbbd3eb	mesa: Rename gl_context::swtnl_im to vbo_context; use proper type. The main GL context's swtnl_im field is the VBO module's vbo_context structure. Using the name "swtnl" in the name is confusing since some drivers use hardware texturing and lighting, but still rely on the VBO module for drawing. v2: Forward declare the type and use that instead of void * (suggested by Eric Anholt). v3: Remove unnecessary cast (pointed out by Topi Pohjolainen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-03 11:30:15 -07:00
Kenneth Graunke	6e143af66d	i965: Rename "prim" parameter to "prims" where it's an array. Some drawing functions take a single _mesa_prim object, while others take an array of primitives. Both kinds of functions used a parameter called "prim" (the singular form), which was confusing. Using the plural form, "prims," clearly communicates that the parameter is an array of primitives. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-03 11:29:33 -07:00
Kenneth Graunke	9f7d5870a3	i965: Actually check every primitive for cut index support. can_cut_index_handle_prims() was passed an array of _mesa_prim objects and a count, and ran a loop for that many iterations. However, it treated the array like a pointer, repeatedly checking the first element. This patch makes it actually check every primitive. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-03 11:29:09 -07:00
Michel Dänzer	6b5c802c30	radeonsi: Don't save/restore FMASK sampler view states for u_blitter Fixes assertion failures in 24 piglit tests with MESA_GL_VERSION_OVERRIDE=3.0, 12 of which are now passing. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-09-02 17:25:27 +02:00
Michel Dänzer	9933b85e12	radeonsi: Expose pure integer vertex formats Fixes 20 piglit tests with MESA_GL_VERSION_OVERRIDE=3.0. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-09-02 17:25:27 +02:00
Maarten Lankhorst	ad4dc77231	nvc0: restore viewport after blit Based on calim's original fix in the nine branch. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>	2013-09-02 17:09:21 +02:00
Christian König	3e81b8eedd	radeon/uvd: save the aligned width & height Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=68845 Signed-off-by: Christian König <christian.koenig@amd.com>	2013-09-02 15:42:13 +02:00
Chia-I Wu	da33347131	glx: make the interval of LIBGL_SHOW_FPS adjustable LIBGL_SHOW_FPS=1 makes GLX print FPS every second while other values do nothing. Extend it so that LIBGL_SHOW_FPS=N will print the FPS every N seconds. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-09-02 11:42:58 +08:00
Kenneth Graunke	b8211ab3ed	i965: Use the proper element of the prim array in brw_try_draw_prims. The VBO module actually calls us with an array of _mesa_prim objects. For example, it may break up a DrawArrays() call into multiple primitives when primitive restart is enabled. Previously, we treated prim like a pointer, always accessing element 0. This worked because all of the primitive objects in a single draw call have the same value for num_instances and basevertex. However, accessing an array as a pointer and using the wrong object's fields is misleading. For stylistic reasons alone, we should use the right object. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-01 18:54:39 -07:00
Kenneth Graunke	976d1d6665	i965: Combine brw_emit_prim and gen7_emit_prim. These functions have almost identical code; the only difference is that a few of the bits moved around. Adding a few trivial conditionals allows the same function to work on all generations, and the resulting code is still quite readable. v2: Comment that the workaround flush is only necessary on SNB (requested by Paul Berry). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-01 18:54:37 -07:00
Kenneth Graunke	a3335417e3	i965: Remove unused ATTRIB_BIT_DWORDS define. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-09-01 18:53:55 -07:00
Christoph Bumiller	7fe159ba74	nvc0: delete compute object on screen destruction Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-09-01 20:57:15 +02:00
Joakim Sindholt	2a7762bdb6	nvc0: fix blitctx memory leak Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>	2013-09-01 20:56:23 +02:00
Christoph Bumiller	1048d89907	nvc0: don't use bufctx in nvc0_cb_push Too many calls into libdrm when a single one is enough.	2013-09-01 20:53:11 +02:00
Christoph Bumiller	528a48ee8d	nvc0: clear the flushed flag	2013-09-01 20:52:27 +02:00
Christoph Bumiller	5399206056	nvc0/ir: add f32 long immediate cannot saturate Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-09-01 20:51:56 +02:00
Tiziano Bacocco	7086636358	nvc0/ir: fix use after free in texture barrier insertion pass Fixes crash with Amnesia: The Dark Descent. Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>	2013-09-01 20:51:39 +02:00
Ilia Mirkin	3282697621	nv30: find first unused texcoord rather than bailing if first is used This fixes shaders produced by supertuxkart. Cc: "9.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-09-01 20:38:21 +02:00
Emil Velikov	dc10251d08	nouveau: initialise the nouveau_transfer maps Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-09-01 20:38:07 +02:00
Chris Forbes	f35dea05b1	i965/fs: Gen4: Zero out extra coordinates when using shadow compare Fixes broken rendering if these MRFs contained anything other than zero. NOTE: This is a candidate for stable branches. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-09-01 19:50:59 +12:00
Paul Berry	4cc692e355	i965/gs: Implement support for geometry shader samplers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-31 17:13:10 -07:00
Paul Berry	89563489ff	i965/gs: add geometry shader support to brw_texture_surfaces. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-31 17:13:07 -07:00
Paul Berry	08d8ff0965	i965/gs: generalize brw_texture_surfaces in preparation for gs. There is a slight functionality change. Previously we would compute a common value for num_samplers for all stages, and populate that many entries in each stage's surf_offset table regardless of how many samplers each stage used. Now we only populate the number of entries in the surf_offset table corresponding to the number of samplers actually used by the stage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-31 17:13:04 -07:00
Paul Berry	5a8033f142	i965: Modify signature to update_texture_surface functions. Previously these functions would accept a pointer to the binding table and an index indicating which entry in the binding table should be updated. Now they merely take a pointer to the binding table entry to be updated. This will make it easier to generalize brw_texture_surfaces to support geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-31 17:12:53 -07:00
Paul Berry	f560ce4a38	i965/vs: generalize gen6_vs_push_constants in preparation for GS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-31 17:12:43 -07:00
Paul Berry	4ec2604422	i965/gs: make the state atom for compiling Gen7 geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Use "unsigned" rather than "GLuint".	2013-08-31 17:12:33 -07:00
Paul Berry	130f0f78be	i965/gs: Implement support for geometry shader surfaces. This patch implements pull constant upload, binding table upload, and surface setup for geometry shaders, by re-using vertex shader code that was generalized in previous patches. Based on work by Eric Anholt <eric@anholt.net>. v2: Update dirty bits for brw_gs_ubo_surfaces to account for commit `77d8fbc` (mesa: add & use a new driver flag for UBO updates instead of _NEW_BUFFER_OBJECT). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-31 17:12:21 -07:00
Paul Berry	f986222754	i965/vs: generalize brw_vs_binding_table in preparation for GS. v2: Use GLbitfield instead of GLbitfield64 in brw_vec4_upload_binding_table. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-31 17:12:15 -07:00
Paul Berry	1b19f2c576	i965: generalize brw_vs_pull_constants in preparation for GS. v2: Use GLbitfield instead of GLbitfield64 in brw_upload_vec4_pull_constants. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-31 17:12:09 -07:00
Paul Berry	555f9cf46d	i965: Make sure constants re-sent after constant buffer reallocation. The hardware requires that after constant buffers for a stage are allocated using a 3DSTATE_PUSH_CONSTANT_ALLOC_{VS,HS,DS,GS,PS} command, and prior to execution of a 3DPRIMITIVE, the corresponding stage's constant buffers must be reprogrammed using a 3DSTATE_CONSTANT_{VS,HS,DS,GS,PS} command. Previously we didn't need to worry about this, because we only programmed 3DSTATE_PUSH_CONSTANT_ALLOC_{VS,HS,DS,GS,PS} once on startup (or, previous to that, whenever BRW_NEW_CONTEXT was flagged). But now that we reallocate the constant buffers whenever geometry shaders are switched on and off, we need to make sure the constant buffers are reprogrammed. We do this by adding a new bit, BRW_NEW_PUSH_CONSTANT_ALLOCATION, to brw->state.dirty.brw. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-31 17:11:59 -07:00
Paul Berry	27eecefc67	i965/gs: Allocate push constant space for use by GS. Previously, we would always use the same push constant allocation regardless of what shader programs were being run: the available push constant space was split into 2 equal size partitions, one for the vertex shader, and one for the fragment shader. Now that we are adding geometry shader support, we need to do something smarter. This patch adjusts things so that when a geometry shader is in use, we split the available push constant space into 3 nearly-equal size partitions instead of 2. Since the push constant allocation is now affected by GL state, it can no longer be set up by brw_upload_initial_gpu_state(); instead it must be set up by a state atom. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-31 17:11:49 -07:00
Paul Berry	df62421382	i965/gen7: Emit CS stall after 3DSTATE_PUSH_CONSTANT_ALLOC_PS. This is required by the internal hardware docs and the PRM. Probably the reason we were getting away with not doing it was because we only emitted 3DSTATE_PUSH_CONSTANT_ALLOC_PS during startup. However that's going to change with the introduction of geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-31 17:11:46 -07:00
Paul Berry	fffba41c68	i965/gs: Allocate URB space for use by GS. Previously, we gave all of the URB space (other than the small amount that is used for push constants) to the vertex shader. However, when a geometry shader is active, we need to divide it up between the vertex and geometry shaders. The size of the URB entries for the vertex and geometry shaders can vary dramatically from one shader to the next. So it doesn't make sense to simply split the available space in two. In particular: - On Ivy Bridge GT1, this would not leave enough space for the worst case geometry shader, which requires 64k of URB space. - Due to hardware-imposed limits on the maximum number of URB entries, sometimes a given shader stage will only be capable of using a small amount of URB space. When this happens, it may make sense to allocate substantially less than half of the available space to that stage. Our algorithm for dividing space between the two stages is to first compute (a) the minimum amount of URB space that each stage needs in order to function properly, and (b) the amount of additional URB space that each stage "wants" (i.e. that it would be capable of making use of). If the total amount of space available is not enough to satisfy needs + wants, then each stage's "wants" amount is scaled back by the same factor in order to fit. When only a vertex shader is active, this algorithm produces equivalent results to the old algorithm (if the vertex shader stage can make use of all the available URB space, we assign all the space to it; if it can't, we let it use as much as it can). In the future, when we need to support tessellation control and tessellation evaluation pipeline stages, it should be straightforward to expand this algorithm to cover them. v2: Use "unsigned" rather than "GLuint". Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-31 17:11:35 -07:00
Paul Berry	53f6e79633	i965: Make CACHE_NEW_GS_PROG. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-31 17:11:25 -07:00
Paul Berry	a702f6325c	i965/gs: Create brw_context::gs structure to track GS program state. v2: Change name from "vec4_gs" to simply "gs". Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-31 17:11:15 -07:00
Paul Berry	ec94e3c3d0	i965: Move data from brw->vs into a base class if gs will also need it. This paves the way for sharing the code that will set up the vertex and geometry shader pipeline state. v2: Rename the base class to brw_stage_state. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-31 17:11:05 -07:00
Paul Berry	cdf03b6928	i965/gs: Update defines related to GS surface organization. Defines that previously referred to VS now refer to VEC4, since they will be shared by the user-programmable vertex shader and geometry shader stages. Defines that previously referred to the Gen6 geometry shader stage (which is only used for transform feedback) are now renamed to explicitly refer to Gen6, to avoid confusion with the Gen7 user-programmable geometry shader stage. Based on work by Eric Anholt <eric@anholt.net>. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-31 17:10:54 -07:00
Paul Berry	b3a4d5c785	i965: Move vec4 register allocation data structures to brw->vec4. This will avoid confusion when we add geometry shaders, since these data structures will be shared by vertex and geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-31 17:10:44 -07:00
Paul Berry	56a2e57bdb	i965: Rename user-defined gs structs from vec4_gs to gs. Now that the name "gs" is no longer used to refer to the legacy fixed function geometry shaders, we can use it to refer to user-defined geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-31 17:10:34 -07:00
Paul Berry	32e16e2337	i965: rename legacy gs structs and functions to ff_gs. "ff" is for "fixed function". This frees up the name "gs" to refer to user-defined geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-31 17:10:15 -07:00
Marek Olšák	a77ee8b548	radeonsi: simplify and improve flushing This mimics r600g. The R600_CONTEXT_xxx flags are added to rctx->b.flags and si_emit_cache_flush emits the packets. That's it. The shared radeon code tells us when the streamout cache should be flushed, so we have to check the flags anyway. There is a new atom "cache_flush", because caches must be flushed after resource descriptors are changed in memory. Functional changes: * Write caches are flushed at the end of CS and read caches are flushed at its beginning. * Sampler view states are removed from si_state, they only held the flush flags. * Every time a shader is changed, the I cache is flushed. Is this needed? Due to a hw bug, this also flushes the K cache. * The WRITE_DATA packet is changed to use TC, which fixes a rendering issue in openarena. I'm not sure how TC interacts with CP DMA, but for now it seems to work better than any other solution I tried. (BTW CIK allows us to use TC for CP DMA.) * Flush the K cache instead of the texture cache when updating resource descriptors (due to a hw bug, this also flushes the I cache). I think the K cache flush is correct here, but I'm not sure if the texture cache should be flushed too (probably not considering we use TC for WRITE_DATA, but we don't use TC for CP DMA). * The number of resource contexts is decreased to 16. With all of these cache changes, 4 doesn't work, but 8 works, which suggests I'm actually doing the right thing here and the pipeline isn't drained during flushes. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-08-31 01:34:30 +02:00
Marek Olšák	aa5c40f97c	radeonsi: convert constant buffers to si_descriptors There is a new "class" si_buffer_resources, which should be good enough for implementing any kind of buffer bindings (constant buffers, vertex buffers, streamout buffers, shader storage buffers, etc.) I don't even keep a copy of pipe_constant_buffer - we don't need it. The main motivation behind this is to have a well-tested infrastructure for setting up streamout buffers. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-08-31 01:34:30 +02:00
Marek Olšák	a81c3e00fe	radeonsi: use r600_common_context, r600_common_screen, r600_resource Also r600_hw_context_priv.h and si_state_streamout.c are removed, because they are no longer needed. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-08-31 01:34:30 +02:00
Marek Olšák	d5b23dfc1c	r600g: move streamout state to drivers/radeon This streamout state code will be used by radeonsi. There are new structures r600_common_context and r600_common_screen. What is inherited by what is shown here: pipe_context -> r600_common_context -> r600_context pipe_screen -> r600_common_screen -> r600_screen The common structures reside in drivers/radeon. Currently they only contain enough functionality to be able to handle streamout. Eventually I'd like the whole pipe_screen implementation to be shared and some of the context stuff too. This is quite big, but most changes are because of the new structures and the fact r600_write_value is replaced by radeon_emit. Thanks to Tom Stellard for fixing the build for r600g/compute. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-08-31 01:34:30 +02:00
Marek Olšák	13a1a8b877	radeonsi: cleanup initialization of SGPR shader parameters Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-08-31 01:34:29 +02:00
Marek Olšák	d698f19cba	r600g,radeonsi: remove unused variables Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-08-31 01:34:29 +02:00
Marek Olšák	89a665eb5f	draw: fix segfaults with aaline and aapoint stages disabled There are drivers not using these optional stages. Broken by `a3ae5dc7dd`. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-31 01:34:29 +02:00
Kenneth Graunke	a35b320250	i965/fs: Detect GRF sources in split_virtual_grfs send-from-GRF code. It is incorrect to assume that src[0] of a SEND-from-GRF opcode is the GRF. For example, FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD uses src[1] for the GRF. To be safe, loop over all the source registers and mark any GRFs. We probably won't ever have more than one, but it's simpler to just check all three rather than attempting to bail early. Not observed to fix anything yet, but likely to. Parallels the bug fix in the previous commit, which actually does fix known failures. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: mesa-stable@lists.freedesktop.org	2013-08-30 15:49:31 -07:00
Kenneth Graunke	4e3d1712a2	i965/vs: Detect GRF sources in split_virtual_grfs send-from-GRF code. It is incorrect to assume that src[0] of a SEND-from-GRF opcode is the GRF. VS_OPCODE_PULL_CONSTANT_LOAD_GEN7 uses an IMM as src[0], and stores the GRF as src[1]. To be safe, loop over all the source registers and mark any GRFs. We probably won't ever have more than one, but it's simpler to just check all three rather than attempting to bail early. Fixes assertion failures in Unigine Sanctuary since we started making register allocation rely on split_virtual_grfs working. (The register classes were actually sufficient, we were just interpreting an IMM as a virtual GRF number.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68637 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: mesa-stable@lists.freedesktop.org	2013-08-30 15:49:31 -07:00
Niels Ole Salscheider	217d2f7359	radeonsi: Do not suspend timer queries Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-08-30 23:30:00 +02:00
Roland Scheidegger	431e60625b	draw: fix PIPE_MAX_SAMPLER/PIPE_MAX_SHADER_SAMPLER_VIEWS issues pstipple/aaline stages used PIPE_MAX_SAMPLER instead of PIPE_MAX_SHADER_SAMPLER_VIEWS when dealing with sampler views. Now these stages can't actually handle sampler_unit != texture_unit anyway (they cannot work with d3d10 shaders at all due to using tex not sample opcodes as "mixed mode" shaders are impossible) but this leads to crashes if a driver just installs these stages and then more than PIPE_MAX_SAMPLER views are set even if the stages aren't even used. Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-30 23:20:04 +02:00
Roland Scheidegger	f37edb5e20	gallivm: handle unbound textures in texture sampling / texture queries Turns out we don't need to do much extra work for detecting this case, since we are guaranteed to get an empty static texture state in this case, hence just rely on format being 0 and return all zero then. Previously needed dummy textures (would just have crashed on format being 0 otherwise) which cannot return the correct result for size queries and when sampling textures with wrap modes using border. As a bonus should hugely increase performance when sampling unbound textures - too bad it isn't a useful feature :-). Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-30 23:20:03 +02:00
Roland Scheidegger	bb7dc1b2f6	softpipe: handle NULL sampler views for texture sampling / queries Instead of crashing just return all zero. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-30 23:20:03 +02:00
Roland Scheidegger	81ab3e57bc	softpipe: check if so_target is NULL before accessing it No idea if this is working right but copied straight from llvmpipe. (Not only does this check the so_target but also use buffer->data instead of buffer for the mapping.) Just trying to get rid of a segfault testing something else... Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-30 23:20:03 +02:00
Roland Scheidegger	289faa7e23	gallivm: (trivial) don't pass sampler_unit variable down to filtering funcs The only reason this was needed was because the fetch texel function had to get the (dynamic) border color, but this is now done much earlier. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-30 23:20:03 +02:00
Roland Scheidegger	61add3cc3c	gallivm: don't use AoS path if min/mag filter are different with multiple lods Instead of enhancing the AoS path so it can deal with it, just use SoA. Fixing AoS path wouldn't be all that difficult (use all the same logic as SoA) but considered not worth it for now. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-30 23:20:03 +02:00
Eric Anholt	bdf3f50e9a	mesa: Don't choose S3TC for generic compression if we can't compress. If the app is asking us to do GL_COMPRESSED_RGBA, then the app obviously doesn't have pre-compressed data to hand us. So don't choose a storage format that we won't actually be able to compress and store. Fixes black screen in warzone2100 when libtxc_dxtn is not present. Also 66 piglit tests. NOTE: This is a candidate for the 9.2 branch. Reported-by: Paul Wise <pabs@debian.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-30 11:49:09 -07:00
Eric Anholt	b188467fdf	mesa: Rip out more extension checking from texformat.c. You should only be flagging the formats as supported if you support them anyway. NOTE: This is a candidate for the 9.2 branch. (required for next commit) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-30 11:49:07 -07:00
Eric Anholt	b1080cfbdb	i965: Switch gen4-6 to using the sampler's base level for GL BASE_LEVEL. Thanks to Ken for trawling through my neglected public branches and finding the bug in this change (inside a megacommit) that made me abandon this work. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-30 11:30:45 -07:00
Eric Anholt	f217791ee2	i965/gen7: Use the base_level field of the sampler to handle GL's BASE_LEVEL. This avoids the need to get the inter- and intra-tile offset and adjust our miptree info based on them. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-30 11:30:45 -07:00
Eric Anholt	2e2445fa7e	i965: Add missing state reset at the end of blorp. These are things that happen to be occurring because of the batch flush at the start of the blorp op (which exists to prevent batch space or aperture space overflow), but the intention was for this sequence of state resets at the end of blorp to be everything necessary for the next draw call. Found when debugging the next commit, by comparing brw_new_batch() and intel_batchbuffer_reset() to brw_blorp_exec(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-30 11:30:44 -07:00
Eric Anholt	85aff83f3e	i965: Drop extra flush when calling intel_miptree_map_raw(). The code that got replaced with map_raw didn't do the flush, but now map_raw() is responsible for it and we don't have to worry about it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-30 11:30:44 -07:00
Eric Anholt	535fbf286c	i965: Make a slight distinction in perf debug for BOs versus miptrees. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-30 11:30:44 -07:00
Eric Anholt	7801a8cc89	intel: Reuse intel_glFlush(). v2 (Kenneth Graunke): Rebase on latest master. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-30 11:30:44 -07:00
Eric Anholt	313f2bc32b	intel: Add support for the new flush_with_flags extension. This gives us more information about why we're flushing that we can use for handling our throttling. v2 (Kenneth Graunke): Rebase on latest master, add missing FLUSH_VERTICES and FLUSH_CURRENT, which fixes a regression in Glean's polygonOffset test. v3 (anholt): Drop FLUSH_CURRENT -- FLUSH_VERTICES is what we need, which is "get any queued prims out of VBO and into the driver", not "update ctx->Current so we can read it with the CPU." Also drop batch->used check, which intel_batchbuffer_flush() does anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-30 11:30:44 -07:00
Eric Anholt	bbdc83bca9	intel: Add a batch flush between front-buffer downsample and X protocol. This was already happening because blorp happens to flush at the end of every call, but we have been talking about removing that at some point, and this would surely get overlooked. v2 (Kenneth Graunke): Rebase on latest master. Note that we did remove the other flush, and this change actually did get overlooked! Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-30 11:30:44 -07:00
Eric Anholt	6404fcb266	i965: Directly call intel_batchbuffer_flush() after i915 split. intel_flush() now did nothing except call through (and intel_batchbuffer_flush() does the no-op check, too!) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-30 11:30:44 -07:00
Eric Anholt	09e2df5961	i965/vs: Fix regression on pre-gen6 with no VS uniforms in use. `df06745c5a` made it so that we didn't allocate extra uniform space for unused clip planes, which also incidentally made us not allocate any space at all, which we were relying on for this no-uniforms case. Instead of putting the knowledge of this special HW exception into the thing that normally preallocates prog_data for us, just allocate it here. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68766 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-30 11:29:50 -07:00
Vadim Girlin	f7217b99f2	r600g: enable SB backend by default Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2013-08-30 15:51:11 +04:00
Vadim Girlin	29ff2e907d	r600g: fix color exports when we have no CBs We need to export at least one color if the shader writes it, even when nr_cbufs==0. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-08-30 15:51:11 +04:00
Vinson Lee	74be77a99e	nvc0/ir: Initialize NVC0LegalizePostRA member variables. Fixes "Uninitialized pointer field" defects reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-08-29 20:42:24 -07:00
Roland Scheidegger	a479f34025	gallivm: support per-pixel min/mag filter in SoA path Since we can have per-pixel lod we should also honor the filter per-pixel (in fact we didn't honor it per quad either in the multiple quad case). Do this by running the linear path and simply beating the weights into shape (the sample with the higher weight is the one which should have been chosen with nearest filtering hence adjust filter weight to 1.0/0.0 based on that). If all pixels use nearest filter (either min or mag) then still run just a nearest filter as this is way cheaper (probably around 4 times faster for 2d, more for 3d case) and it should be relatively rare that pixels really need different filtering. OTOH if all pixels would require linear don't do anything special since the linear path with filter adjustments shouldn't really be all that much more expensive than ordinary linear, and we think it's rare that min/mag filters are configured differently so there doesn't seem much value in trying to optimize this further. This does not yet fix the AoS path (though currently AoS is only used for single quads hence it could be considered less broken, just never honoring per-pixel filter decision but doing it per quad). v2: simplify code a bit (unify min linear and min nearest cases) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-30 02:16:45 +02:00
Roland Scheidegger	81cfcdbd87	gallivm: don't calculate square root of rho if we use accurate rho method While a sqrt here and there shouldn't hurt much (depending on the cpu) it is possible to completely omit it since rho is only used for calculating lod and there log2(x) == 0.5*log2(x^2). Depending on the exact path taken for calculating lod this means we get a simple mul instead of sqrt (in case of nearest mip filter in fact we don't need to replace the sqrt with something else at all), only in some not very useful path this doesn't work (combined brilinear calculation of int level and fractional lod, accurate rho calc but brilinear filtering seems odd). Apart from being faster as an added bonus this should increase our crappy fractional accuracy of lod, since fast_log2 is only good for ~3bits and this should increase accuracy by one bit (though not used if dimension is just one as we'd need an extra mul there as we never had the squared rho in the first place). v2: use separate ilog2_sqrt function if we have squared rho. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-30 02:16:45 +02:00
Roland Scheidegger	10e40ad11d	gallivm: refactor num_lods handling This is just preparation for per-pixel (or per-quad in case of multiple quads) min/mag filter since some assumptions about number of miplevels being equal to number of lods no longer holds true. This change does not change behavior yet (though theoretically when forcing per-element path it might be slower with different min/mag filter since the code will respect this setting even when there's no mip maps now in this case, so some lod calcs will be done per-element just ultimately still the same filter used for all pixels). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-30 02:16:45 +02:00
Vinson Lee	4a6d2f3dd7	radeonsi: Early return if no depth or stencil on release builds. Fixes "Missing break in switch" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-08-29 15:49:12 -07:00
Rob Clark	de10d383d0	freedreno: pipe loader for either kgsl or msm The downstream android kernel driver is "kgsl", the upstream drm/kms driver is called "msm". Since libdrm_freedreno handles the differences between the two, we need to load the same thing for either device. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-29 17:35:05 -04:00
Rob Clark	e95b7d89b9	freedreno: updates for msm drm/kms driver There were some small API tweaks in libdrm_freedreno to enable support for msm drm/kms driver. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-29 17:35:05 -04:00
Rob Clark	0267f264cc	freedreno/a3xx/compiler: handle sync flags better We need to set the flag on all the .xyzw components that are written by the instruction, not just on .x. Otherwise a later use of rN.y (for example) will not trigger the appropriate sync bit to be set. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-29 17:35:04 -04:00
Rob Clark	4a2b5b2384	freedreno/a3xx/compiler: better const handling Seems like most/all instructions have some restrictions about const src registers. It seems like the 2 src (cat2) instructions can take at most one const, and the 3 src (cat3) instructions can take at most one const in the first 2 arguments. And so on. Handle this properly now. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-29 17:35:04 -04:00
Anuj Phogat	9c0b7be964	glsl: Allow precision qualifiers for sampler types GLSL 1.30 doesn't allow precision qualifiers on sampler types, but in GLSL ES, sampler types are also allowed. This seems like an oversight (since the intention of including these in GLSL 1.30 is to allow compatibility with ES shaders). Currently, Mesa allows "default" precision qualifiers to be set for sampler types in GLSL (commit `d5948f2`). This patch makes it follow GLSL ES rules and also allow declaring sampler variables with a precision qualifier in GLSL 1.30 (and later). e.g. uniform lowp sampler2D sampler; This fixes a shader compilation error in Khronos OpenGL conformance test "depth_texture_mipmap". V2: Update comments. Signed-off-by: Ian Romanick <idr@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <idr@lists.freedesktop.org> Cc: <mesa-stable@lists.freedesktop.org>	2013-08-29 12:10:57 -07:00
Matt Turner	1ecfdba98a	glsl: Add heuristics to print floating-point numbers better. v2: Fix *.expected files to match. Reviewed-by: Paul Berry <strereotype441@gmail.com>	2013-08-29 12:07:28 -07:00
Jonathan Gray	57cf5946ce	radeonsi: Make sure libdrm_radeon headers are picked up from the right place And remove libdrm/ from a winsys include statement. Signed-off-by: Jonathan Gray <jsg@jsg.id.au>	2013-08-29 15:37:44 +02:00
Brian Paul	4e7f1346ae	draw: fix point/line/triangle determination in draw_need_pipeline() The previous point/line/triangle() functions didn't handle GS primitives. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-08-29 07:29:31 -06:00
Christian König	aebd065a64	radeon/uvd: fix MPEG2/4 ref frame index limit Otherwise the first few frames have an incorrect reference index. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-08-29 08:51:12 +02:00
Vinson Lee	57684d52e9	nouveau: Copy m4x4 and m8x8 separately. Silences Coverity "Out-of-bounds access" defect. Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-08-28 23:23:49 -07:00
Kenneth Graunke	df06745c5a	i965: Allocate just enough space for user clip planes in uniform arrays. Previously, we allocated space in brw_vs_prog_data's params and pull_params arrays for MAX_CLIP_PLANES vec4s---even when it wasn't necessary. On a 64-bit architecture, this used 0.5 kB of space (8 clip planes * 4 floats per plane * 8 bytes per float pointer * 2 arrays of pointers = 512 bytes). Since this cost was per-vertex shader, it added up. Conveniently, we already store the number of clip plane constants in the program key. By using that, we can allocate the exact amount of space needed. For the common case where user clipping is disabled, this means 0 bytes. While we're here, mention exactly what code requires this extra space, since it wasn't obvious. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-28 14:12:48 -07:00
Chad Versace	72b3c6c96f	i965: Silence unused variable warning in release build Use `(void) success;` to silence this warning: i965/brw_vs.c:481:12: warning: unused variable 'success' [-Wunused-variable] bool success = do_vs_prog(brw, ctx->Shader.CurrentVertexProgram, Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-28 10:42:51 -07:00
Brian Paul	031c3393a1	docs: minor fixes for 9.2 release notes Fix incorrect </li> tag, fix language. (cherry picked from commit `2377205bcb`)	2013-08-27 18:59:05 -06:00
Ian Romanick	e496583975	docs: Add news item for 9.2 release Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-27 16:38:57 -07:00
Ian Romanick	9f2608bc46	docs: Import 9.2 release notes Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-27 16:38:57 -07:00
Fabian Bieler	cd18269705	mesa/main: Check for 0 size draws after validation. When validating draw parameters move check for 0 draw count last (drawing with count 0 is not an error), so that other parameters (e.g.: the primitive type) are validated and the correct errors (if applicable) are generated. From the OpenGL 3.3 spec page 33 (page 48 of the PDF): "[Regarding DrawArraysOneInstance, in terms of which other draw operations are defined:] If count is negative, an INVALID_VALUE error is generated." This patch also changes the behavior of MultiDrawElements to perform the draw operation if some primitive's index counts are zero. Signed-off-by: Fabian Bieler <fabianbieler@fastmail.fm> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-27 15:11:52 -07:00
Matt Turner	ac74de3710	glsl: Add built-ins from ARB_shader_bit_encoding to ARB_gpu_shader5. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-27 15:06:16 -07:00
Matt Turner	4929be0b5f	i965/vs: Add support for translating ir_triop_fma into MAD. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-27 15:03:30 -07:00
Matt Turner	530842127e	i965/fs: Add support for translating ir_triop_fma into MAD. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-27 15:03:30 -07:00
Matt Turner	e817b94a2c	i965/fs: Assert that ir_expressions are usable by 3-src instructions. MAD will be generated directly from ir_triop_fma, so this assertion checks that all ir_expressions are usable. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-27 15:03:30 -07:00
Matt Turner	d55c543c36	glsl: Add support for new fma built-in in ARB_gpu_shader5. v2: Add constant folding support. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-27 15:03:30 -07:00
Matt Turner	6829c18609	glsl: Add new fma built-in IR and prototype from ARB_gpu_shader5. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-27 15:03:30 -07:00
Marek Olšák	adb93e3bda	r300g: enable MSAA on r300-r400, be careful about using color compression MSAA was tested by one user on RS690 and it works for him with color compression (CMASK) disabled. Our theory is that his chipset lacks CMASK RAM. Since we don't have hardware documentation about which chipsets actually have CMASK RAM, I had to take a guess based on the presence of HiZ. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-08-27 23:18:54 +02:00
Fabio Pedretti	aa3905423e	configure.ac: Bump Wayland requirement to 1.2.0 Since `8d29b52` wayland 1.2.0 is required. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-27 08:40:40 -07:00
Roland Scheidegger	bd3909f265	draw: clean up setting stream out information a bit In particular no one is interested in the vertex count, so drop that, and also drop the duplicated num_primitives_generated / so.primitives_storage_needed variables in drivers. I am unable for now to figure out if primitives_storage_needed in SO stats (used for d3d10) should increase if SO is disabled, though the equivalent num_primitives_generated used for OpenGL definitely should increase. In any case we were only counting when SO is active both in softpipe and llvmpipe anyway so don't pretend there's an independent num_primitives_generated counter which would count always. (This means the PIPE_QUERY_PRIMITIVES_GENERATED count will still be wrong just as before, should eventually fix this by doing either separate counting for this query or adjust the code so it always counts this even if SO is inactive depending on what's correct for d3d10.) Reviewed-by: Brian Paul <brianp@vmware.com>	2013-08-27 16:59:39 +02:00
Roland Scheidegger	aff2ecf09a	llvmpipe: support nested/overlapping queries for all query types There's just no way resetting the counters is working with nested/overlapping queries. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-08-27 16:59:01 +02:00
Roland Scheidegger	4900e625bd	softpipe: support nested/overlapping queries for all query types There's just no way resetting the counters is working with nested/overlapping queries. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-08-27 16:58:20 +02:00
Matt Turner	d8ac987f6a	glsl: Disallow uniform block layout qualifiers on non-uniform block vars. Cc: 9.2 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68460 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-26 23:19:14 -07:00
Kristian Lehmann	cec7b5c5bc	Fixed and/or order mistake, resulting in compiling llvmpipe without llvm installed Cc: 9.2 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68544 Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-26 22:13:45 -07:00
Ian Romanick	d127a0343d	i915: Optimize SEQ and SNE when two operands are uniforms SEQ and SNE are not native i915 instructions, so they each generate at least 3 instructions. If both operands are uniforms or constants, we get 5 instructions like: U[1] = MOV CONST[1] U[0].xyz = SGE CONST[0].xxxx, U[1] U[1] = MOV CONST[1].-x-y-z-w R[0].xyz = SGE CONST[0].-x-x-x-x, U[1] R[0].xyz = MUL R[0], U[0] This code is stupid. Instead of having the individual calls to i915_emit_arith generate the moves to utemps, do it in the caller. This results in code like: U[1] = MOV CONST[1] U[0].xyz = SGE CONST[0].xxxx, U[1] R[0].xyz = SGE CONST[0].-x-x-x-x, U[1].-x-y-z-w R[0].xyz = MUL R[0], U[0] This allows fs-temp-array-mat2-index-col-wr and fs-temp-array-mat2-index-row-wr to fit in hardware limits (instead of falling back to software rasterization). NOTE: Without pending patches to the piglit tests, these tests will now fail. This is an unrelated, pre-existing issue. v2: Copy most of the body of the commit message into comments in the code. Suggested by Eric. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-08-26 22:11:26 -07:00
Tom Stellard	f3e86d4a68	clover: Don't use PIPE_TRANSFER_UNSYNCHRONIZED for blocking copies CC: "9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-08-26 18:27:03 -07:00
Niels Ole Salscheider	ef6ed7220a	st/clover: Add event to deps even if it has been triggered The command is submitted once the event has been triggered, but it might not have completed yet. Therefore, we have to add it to deps in order to wait on it. Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-08-26 18:25:17 -07:00
Niels Ole Salscheider	4a3505d548	st/clover: Profiling support Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Acked-by: Francisco Jerez <currojerez@riseup.net>	2013-08-26 18:25:17 -07:00
Dave Airlie	4763a032a0	tgsi_build: fix order of arguments for ind register build This was broken when arrayid was added. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-08-27 10:41:27 +10:00
Dave Airlie	81204d0e9c	tgsi: finish declaration parsing for arrays. I previously fixed this partly in `9e8400f4c9`, however I didn't go far enough in testing it, now when I parse a TGSI shader with arrays in it my iterator can see the ArrayID set to the proper value. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-08-27 10:41:09 +10:00
Brian Paul	92cbfded6a	svga: replace 0 with PIPE_OK in a few places	2013-08-26 15:49:16 -06:00
Brian Paul	5e7ac28ebf	swrast: init i0, i1 values to silence warnings Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-26 12:52:06 -06:00
Brian Paul	ef47ab520d	mesa: init dst values in COPY_CLEAN_4V_TYPE_AS_FLOAT() to silence gcc 4.8.1 warnings. And improve the ASSERT(0) call. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-26 12:52:06 -06:00
Brian Paul	f91f6ef739	glsl: init limit=0 to silence uninitialized var warning Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-26 12:52:06 -06:00
Kenneth Graunke	d65e3c082a	i965/vs: Allocate register set once at context creation. Now that we use a fixed set of register classes, we can set up the register set and conflict graphs once, at context creation, rather than on every VS compile. This is obviously less expensive, and also what we already do in the FS backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-26 11:21:10 -07:00
Kenneth Graunke	a149f744d9	i965/vs: Move base_reg_count computation to brw_alloc_reg_set(). We're soon going to be calling brw_alloc_reg_set() from outside of the visitor, where we don't have the precomputed "max_grf" variable handy. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-26 11:21:10 -07:00
Kenneth Graunke	7aaaa8bc8f	i965/vs: Expose the payload registers to the register allocator. For now, nothing else can get allocated over them. That may change at some point in the future. This also means that base_reg_count can be computed without knowing the number of registers used for the payload, which is required if we want to allocate the register set once at context creation time. See commit `551e1cd44f`, which implemented virtually identical code in the FS backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-26 11:21:10 -07:00
Kenneth Graunke	528d70d0b5	i965/vs: Use a fixed set of register classes. Arrays, structures, and matrices use large VGRFs of arbitrary sizes. However, split_virtual_grfs() breaks those down into VGRFs of size 1. For reference, commit `5d90b98879` is the analogous change to the FS backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-26 11:21:10 -07:00
Paul Berry	cfe39ea14e	i965: Allow C++ type safety in the use of enum brw_urb_write_flags. (From a suggestion by Francisco Jerez) If an enum represents a bitfield of flags, e.g.: enum E { A = 1, B = 2, C = 4, D = 8, }; then C++ normally prohibits statements like this: enum E x = A | B; because A and B are implicitly converted to ints before OR-ing them, and an int can't be stored in an enum without a type cast. C, on the other hand, allows an int to be implicitly converted to an enum without casting. In the past we've dealt with this situation by storing flag bitfields as ints. This avoids ugly casting at the expense of some type safety that C++ would normally have offered (e.g. we get no warning if we accidentally use the wrong enum type). However, we can get the best of both worlds if we override the | operator. The ugly casting is confined to the operator overload, and we still get the benefit of C++ making sure we don't use the wrong enum type. v2: Remove unnecessary comment and unnecessary use of "enum" keyword. Use static_cast. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-08-26 10:15:51 -07:00
Paul Berry	612226c43b	i965: Remove redundant (and uninitialized) field vec4_generator::ctx. We never noticed that this field was uninitialized because it is only used in an error path that reports internal Mesa errors. But it's silly to have it around anyway because &brw->ctx is equivalent. Should fix Coverity defect CID 1063351: Uninitialized pointer field (UNINIT_CTOR) /src/mesa/drivers/dri/i965/brw_vec4_emit.cpp: 148 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-26 08:55:39 -07:00
Paul Berry	4bf91ca791	i965: Don't try to fall back when creating unrecognized program targets. If brwNewProgram is asked to create a program for an unrecognized target, don't bother falling back on _mesa_new_program(). That just hides bugs. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> v2: Use assert() rather than _mesa_problem(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-26 08:55:39 -07:00
Michel Dänzer	46fd81e586	radeonsi: Also set the depth component mask bit for stencil-only exports The stencil values come out wrong without this for some reason. 50 more little piglits. Cc: mesa-stable@lists.freedesktop.org	2013-08-26 15:47:50 +02:00
Kenneth Graunke	7fa18774bd	glsl: Add built-in function prototypes for GLSL 3.30 330.frag is a direct copy of 150.frag. 330.glsl is 150.glsl combined with ARB_shader_bit_encoding.glsl. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-25 20:32:39 -07:00
Kenneth Graunke	8f00409d23	glsl: Bump standalone compiler versions to 3.30. These are necessary in order to compile the built-in functions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-25 20:32:39 -07:00
Kenneth Graunke	7950315583	mesa: Set query->EverBound in glQueryCounter(). glIsQuery is supposed to return false for names returned by glGenQueries until their first use. BeginQuery is a use, but QueryCounter is also a use. From the ARB_timer_query spec: "A timer query object is created with the command void QueryCounter(uint id, enum target); [...] If <id> is an unused query object name, the name is marked as used [...]" Fixes Piglit's spec/ARB_timer_query/query-lifetime. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Cc: mesa-stable@lists.freedesktop.org	2013-08-25 20:29:59 -07:00
Henri Verbeet	b5ddaf9975	r600g: Implement the new float comparison instructions for Cayman as well. I assume this should have been part of commit `7727fbb7c5`. This (obviously) fixes a lot of tests. Signed-off-by: Henri Verbeet <hverbeet@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-08-25 13:00:02 +02:00
Ilia Mirkin	bac6efe8e3	nv30: add forgotten PIPE_CAP_CUBE_MAP_ARRAY cap to list Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-25 10:47:28 +02:00
Ilia Mirkin	293fa4e559	nouveau/video: avoid overwriting base codec init with template Commit `53e20b8b` introduced the use of a template to initialize some common fields. Move this copying of fields to before the common vp3 fields are initialized. Reported-by: Martin Peres <martin.peres@labri.fr> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christian König <christian.koenig@amd.com>	2013-08-25 10:14:30 +02:00
Rob Clark	56ea2c4816	freedreno/a3xx: don't leak so much Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-24 13:58:01 -04:00
Rob Clark	9b9038496c	freedreno/a3xx/compiler: fix SGT/SLT/etc The cmps.f.* instruction doesn't actually seem to give a float 1.0 or 0.0 output. It either needs a cov.u16f16 or add.s + sel.f16. This makes SGT/SLT/etc more similar to CMP, so handle them in trans_cmp(). This fixes a bunch of piglit tests. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-24 13:23:32 -04:00
Rob Clark	572d4646f7	freedreno/a3xx/compiler: bit of re-arrange/cleanup It seems there are a number of cases where instructions have limitations on reading srcs from the const register file, so make get_unconst() a bit easier to use. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-24 13:23:32 -04:00
Rob Clark	d63bbac3a5	freedreno/a3xx/compiler: make compiler errors more useful We probably should get rid of assert() entirely, but at this stage it is more useful for things to crash where we can catch it in a debugger. With compile_error() we have a single place to set an error flag (to bail out and return an error on the next instruction) so that will be a small change later when enough of the compiler bugs are sorted. But re-arrange/cleanup the error/assert stuff so we at least get a dump of the TGSI that triggered it. So we see some useful output in piglit logs. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-24 13:23:32 -04:00
Rob Clark	4c91930a25	freedreno: fix segfault when no color buffer bound Don't crash when no color buffer is bound. Something caught when starting to run piglit, fixes a handful of piglit tests. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-24 13:23:32 -04:00
Rob Clark	7eeab24344	freedreno/a3xx/compiler: cat4 cannot use const reg as src Category 4 instructions (rsq, rcp, sqrt, etc) seem to be unable to take a const register as src. In these cases we need to move the src to a temporary gpr first. This is the second case of such a restriction, where the instruction encoding appears to support a const src, but in fact the hw appears to ignore that bit. So split things out into a helper that can be re-used for any instructions which have this limitation. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-24 13:23:32 -04:00
Rob Clark	2effac5a67	freedreno/a3xx/compiler: use max_reg rather than file_count Our current (rather naive) register assignment is based on mapping different register files (INPUT, OUTPUT, TEMP, CONST, etc) based on the max register index of the preceding file. But in some cases, the lowest used register in a file might not be zero. In which case file_count[file] != file_max[file] + 1. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-24 13:23:32 -04:00
Rob Clark	aee1ed708a	freedreno/a3xx/compiler: handle saturate on dst Sometimes things other than color dst need saturating, like if there is a 'clamp(foo, 0.0, 1.0)'. So for saturated dst add the extra instructions to fix up dst. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-24 13:23:32 -04:00
Rob Clark	8b250bb8aa	freedreno/a3xx/compiler: fix CMP The 1st src to add.s needs (r) flag (repeat), otherwise it will end up: add.s dst.xyzw, tmp.xxxx -1 instead of: add.s dst.xyzw, tmp.xyzw, -1 Also, if we are using a temporary dst to avoid clobbering one of the src registers, we actually need to use that as the dst for the sel instruction. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-24 13:23:32 -04:00
Rob Clark	528bee59fe	freedreno/a3xx: some texture fixes Stop hard coding bits that indicate texture type (2d/3d/cube/etc). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-24 13:21:59 -04:00
Rob Clark	fd59f3ea98	freedreno: update register headers resync w/ rnndb database Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-24 13:12:26 -04:00
Rob Clark	c2babfccb5	freedreno: add debug option to disable scissor optimization Useful for testing and debugging. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-24 13:11:50 -04:00
Rob Clark	ae1a3f1736	freedreno/a3xx: fix viewport on gmem->mem resolve Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-24 13:04:29 -04:00
Rob Clark	fbef4e795f	freedreno/a3xx: fix color inversion on mem->gmem restore Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-08-24 13:04:29 -04:00
Niels Ole Salscheider	288a252523	radeonsi: Handle additional PIPE_COMPUTE_CAP_* This patch adds support for: PIPE_COMPUTE_CAP_MAX_INPUT_SIZE PIPE_COMPUTE_CAP_MAX_LOCAL_SIZE Return the values reported by the closed source driver for now. Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-08-23 17:00:01 -07:00
Niels Ole Salscheider	04349541cd	radeonsi: copy r600_get_timestamp Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-08-23 16:59:55 -07:00
Niels Ole Salscheider	db6f4165f4	radeonsi: Implement PIPE_QUERY_TIMESTAMP Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-08-23 16:59:44 -07:00
Roland Scheidegger	ad9b5b9ae9	gallivm: fix min/mag switchover point for nearest/none mip filter Previously, the min/mag switchover point when using nearest/none mip filter was effectively -0.5 which can't be right. Looks like new OpenGL thinks it's ok if it's always 0.0 (older versions required 0.5 in some cases), let's hope everybody else thinks that's fine too. Refactor this slightly and get the per-quad/per-pixel min/mag decision values further down to sampling, though still only the first component is used yet. While here also fix code trying to skip lod bias application etc. when mipfilter is none, as this is still needed for determining min/mag filter. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-23 23:46:28 +02:00
Jon Severinsson	b47bde0079	gallium/osmesa: Link, not copy, the shared library to the LIB_DIR. Just like all other mesa libraries... CC: "9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 12:58:48 -07:00
Jon Severinsson	aeb9c9e4b0	gallium/osmesa: Always link with the c++ linker. Just like all other gallium targets... CC: "9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 12:58:45 -07:00
Jon Severinsson	c811190430	gallium/osmesa: Make and install an osmesa.pc. As of "2f142d59 build: Add --enable-gallium-osmesa flag." the pkgconfig file from classic osmesa is no longer installed when building gallium osmesa, so copy it to gallium osmesa and install the copy instead. CC: "9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 12:58:30 -07:00
Paul Berry	60ddb96f7e	i965/gs: Add a data structure for tracking VS output VUE map. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:03:47 -07:00
Paul Berry	06918f84c2	i965/vec4: Make a function for setting up vec4 program key clip info. This functionality will need to be reused by geometry shaders. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:03:43 -07:00
Paul Berry	5b5d10bcd3	i965: Make prim_to_hw_prim accessible outside brw_draw.c. We will need access to this array in order to configure the geometry shader. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:03:38 -07:00
Paul Berry	16512ba70d	i965/gs: add GS visitors. This patch introduces the vec4_gs_visitor class, which translates geometry shaders from GLSL IR to back-end opcodes. This class is derived from vec4_visitor (which is also the base class for vec4_vs_visitor), so as a result most of the back end code is shared. The only parts that differ are: - Geometry shaders use a different input payload organization, since the inputs need to match up with the outputs of the previous pipeline stage (vec4_gs_visitor::setup_payload() and vec4_gs_visitor::setup_varying_inputs()). - Geometry shader input array dereferences need a special stride computation, since all geometry shader inputs are interleaved into one giant array (vec4_gs_visitor::compute_array_stride()). - There are no geometry shader system values (vec4_gs_visitor::make_reg_for_system_value()). - At the beginning of a geometry shader, extra data in R0 needs to be zeroed out, and a vertex counter needs to be initialized (vec4_gs_visitor::emit_prolog()). - When EmitVertex() appears in the shader, the current contents of output variables need to be emitted to the URB, and the vertex counter needs to be incremented (vec4_gs_visitor::visit(ir_emit_vertex )). - When generating a URB_WRITE message to output vertex data, the current state of the vertex counter needs to be used to store a write offset in the message header (vec4_gs_visitor::emit_urb_write_header()). - The URB_WRITE message that outputs vertex data needs to be sent using GS_OPCODE_URB_WRITE, since VS_OPCODE_URB_WRITE would overwrite the offsets in the message header (vec4_gs_visitor::emit_urb_write_opcode()). - At the end of a geometry shader, the final vertex count needs to be delivered using a URB WRITE message (vec4_gs_visitor::emit_thread_end()). - EndPrimitive() functionality is not implemented yet (vec4_gs_visitor::visit(ir_end_primitive )). - There is no support for assembly shaders (vec4_gs_visitor::emit_program_code()). v2: Make num_input_vertices const. Refer to registers as rN rather than gN, for consistency with the PRM. Fix misspelling. Improve comment in the ir_emit_vertex visitor explaining why we emit vertices inside a conditional. Enclose the conditional code in the ir_emit_vertex visitor between curly braces. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:03:34 -07:00
Paul Berry	35bdd552d5	i965/gs: Add GS_OPCODE_SET_DWORD_2_IMMED. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:03:31 -07:00
Paul Berry	7417eddea9	i965/gs: Add GS_OPCODE_SET_VERTEX_COUNT. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:03:27 -07:00
Paul Berry	ce722fd65d	i965/gs: Add GS_OPCODE_SET_WRITE_OFFSET. v2: Added a comment to vec4_generator::generate_gs_set_write_offset(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:03:23 -07:00
Paul Berry	4416cb7992	i965/gs: Add GS_OPCODE_THREAD_END. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:03:19 -07:00
Paul Berry	96eb2f3536	i965/gs: Add GS_OPCODE_URB_WRITE. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:03:15 -07:00
Paul Berry	eaa63cbbc2	i965/gs: Add a flag allowing URB write messages to use a per-slot offset. This will be used by geometry shaders to implement the EmitVertex() function, since it requires writing data to a dynamically-determined offset within the geometry shader's URB entry. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:03:12 -07:00
Paul Berry	a9e8c10bd7	i965: Combine 4 boolean args of brw_urb_WRITE into a flags bitfield. The arguments to brw_urb_WRITE() were getting pretty unwieldy, and we have to add more flags to support geometry shaders anyhow. Also plumb these flags through brw_clip_emit_vue(), brw_set_urb_message(), and the vec4_instruction class. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:03:08 -07:00
Paul Berry	591fc0861c	i965/gs: Add a case to brwNewProgram() for geometry shaders. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:03:05 -07:00
Paul Berry	ebbb8c0c76	i965/gs: Create structs for use by GS program compilation. v2: Make id "unsigned" rather than "GLuint". Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:03:01 -07:00
Paul Berry	3167dca3d4	i965/gs: Add a case to brwBindProgram() for geometry shaders. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:02:58 -07:00
Paul Berry	158dcdc0e2	i965/gs: Add brw->geometry_program. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:02:54 -07:00
Paul Berry	7f57101ad5	i965/vec4: Virtualize setup_payload instead of setup_attributes. When I initially generalized the vec4_visitor class in preparation for geometry shaders, I assumed that the setup_attributes() function would need to be different between vertex and geometry shaders, but its caller, setup_payload(), could be shared. So I made setup_attributes() a virtual function. It turns out this isn't true; setup_payload() needs to be different too, since the geometry shader payload sometimes includes an extra register (primitive ID) that has to come before uniforms. So setup_payload() needs to be the virtual function instead of setup_attributes(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:02:51 -07:00
Paul Berry	626495d269	i965/vec4: Allow for dispatch_grf_start_reg to vary. Both 3DSTATE_VS and 3DSTATE_GS have a dispatch_grf_start_reg control, which determines the register where the hardware delivers data sourced from the URB (push constants followed by per-vertex input data). For vertex shaders, we always set dispatch_grf_start_reg to 1, since R1 is always the first register available for push constants in vertex shaders. For geometry shaders, we'll need the flexibility to set dispatch_grf_start_reg to different values depending on the behaviour of the geometry shader; if it accesses gl_PrimitiveIDIn, we'll need to set it to 2 to allow the primitive ID to be delivered to the thread in R1. This patch eliminates the assumption that dispatch_grf_start_reg is always 1. In vec4_visitor, we record the regnum that was passed to vec4_visitor::setup_uniforms() in prog_data for later use. In vec4_generator, we consult this value when converting an abstract UNIFORM register to a concrete hardware register. And in the code that emits 3DSTATE_VS, we set dispatch_grf_start_reg based on the value recorded in prog_data. This will allow us to set dispatch_grf_start_reg to the appropriate value when compiling geometry shaders. Vertex shaders will continue to always use a dispatch_grf_start_reg of 1. v2: Make dispatch_grf_start_reg "unsigned" rather than "GLuint". Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:02:47 -07:00
Paul Berry	72168f5f00	i965/vec4: Move vec4 data structures and functions to brw_vec4.{cpp,h}. This patch moves the following things into brw_vec4.{cpp,h}: - struct brw_vec4_compile - struct brw_vec4_prog_key - brw_vec4_prog_data_compare() - brw_vec4_prog_data_free() This will allow us to avoid having to include brw_vs.h in geometry-shader-specific files. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:02:44 -07:00
Paul Berry	e556286802	i965: Make brw_{shader,vec4}.h safe to include from C. The patch that follows will move the definition of struct brw_vec4_prog_key from brw_vs.h to brw_vec4.h, making it necessary for brw_vs.h to include brw_vec4.h (because brw_vs.h defines struct brw_vs_prog_key, which contains brw_vec4_prog_key as a member). Since brw_vs.h is included from C source files, that means that brw_vec4.h will need to be safe to include from C. Same for brw_shader.h, since it is included by brw_vec4.h. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:02:40 -07:00
Paul Berry	5fb13d871e	i965: Stop including brw_vs.h from brw_vec4.h. This is backwards from what we are going to want in the long term, which is: - brw_vec4.h declares general-purpose vec4 infrastructure needed by both VS and GS - brw_vs.h includes brw_vec4.h and adds VS-specific parts. - brw_gs.h includes brw_vec4.h and adds GS-specific parts. Note that at the moment brw_vec4.h contains a fair amount of VS-specific declarations--I plan to address that in a later patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:02:37 -07:00
Paul Berry	52bac6e4ff	i965: Initialize all elements of ctx->ShaderCompilerOptions. Otherwise any GS that requires lowering (e.g. one that uses gl_ClipDistance as an input or output) will fail to work. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:02:34 -07:00
Paul Berry	61a5bd8336	i965: Make brw_{program,vs}.h safe to include from C++. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:02:31 -07:00
Paul Berry	ad65825098	mesa/program: Make prog_instruction.h and program.h safe to include from C++. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:02:25 -07:00
Paul Berry	44e07de3ac	glsl: Refactor handling of gl_ClipDistance/gl_ClipVertex linkage rules for GS. This patch extracts the following logic from validate_vertex_shader_executable(): (a) Generate an error if the shader writes to both gl_ClipDistance and gl_ClipVertex. (b) Record whether the shader writes to gl_ClipDistance in gl_shader_program for use by the back-end. (c) Record the size of gl_ClipDistance in gl_shader_program for use by transform feedback logic. And moves it into a function that is shared between vertex and geometry shaders. Strictly speaking we only need to have shared logic for (b) and (c) right now (since (a) only matters in compatibility contexts, and we're only implementing geometry shaders in core contexts right now). But the three are closely related enough that it seems sensible to keep them together. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-23 11:02:15 -07:00
Timothy Arceri	f0072e3c6b	mesa: Fix assertion error with glDebugMessageControl enums were being converted twice resulting in incorrect values. The extra conversion has been removed and the redundant assert is removed also. Cc: 9.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-08-23 08:15:19 -06:00
Kenneth Graunke	a27180d0d8	mesa: Specify a better GL_MAX_SERVER_WAIT_TIMEOUT limit. The previous value of (GLuint64) ~0 has some problems: GL_MAX_SERVER_WAIT_TIMEOUT is supposed to be a GLuint64 value, but has to be queried via GetInteger64v(), which returns a GLint64. This means that some applications are likely to treat it as a signed integer, where ~0 means -1. Negative values are nonsensical and problematic. When interpreted correctly, ~0 translates to about 0.58 million years, which seems rather excessive. This patch changes it to 0x1fff7fffffff, which is about 1.11 years. This is still plenty long, and is the same as both an int64 and uint64. Applications that accidentally store it in a 32-bit int/unsigned also get a non-negative value, which is again the same as both int and unsigned. This value was suggested by Ian Romanick. v2: Add the ULL suffix on the constant (suggested by Ian). Fixes Piglit's spec/!OpenGL 3.2/get-integer-64v. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2013-08-22 23:08:20 -07:00
Kenneth Graunke	62411681da	meta: Set correct viewport and projection in decompress_texture_image. _mesa_meta_begin() sets up an orthographic projection and initializes the viewport based on the current drawbuffer's width and height. This is likely the window size, since it occurs before the meta operation binds any temporary buffers. decompress_texture_image needs the viewport to be the size of the image it's trying to draw. Otherwise, it may only draw part of the image. v2: Actually set the projection properly too. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68250 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Mak Nazecic-Andrlon <owlberteinstein@gmail.com>	2013-08-22 20:28:53 -07:00
Chad Versace	ce8639a766	i965: Fix misapplication of gles3 srgb workaround Fixes inconsistent failure of gles2conform/GL2Tests/glUniform/glUniform.test under gnome-shell. What follows is a description of the bug and its fix. When intel_update_renderbuffers() allocates a miptree for a winsys renderbuffer, it propagates the renderbuffer's format to become also the miptree's format. If the winsys color buffer format is SARGB, then, in the first call to eglMakeCurrent, intel_gles3_srgb_workaround() changes the renderbuffer's format to ARGB. That is, it changes the format from sRGB to non-sRGB. However, it changes the renderbuffer's format after intel_update_renderbuffers() has allocated the renderbuffer's miptree. Therefore, when eglMakeCurrent returns, the miptree format (SARGB) differs from the renderbuffer format (ARGB). If the X server reallocates the color buffer, intel_update_renderbuffers() will create a new miptree for the renderbuffer. The new miptree's format (ARGB) will differ from old miptree's format (SARGB). This mismatch between old and new miptrees causes bugs. Fix the bug by moving intel_gles3_srgb_workaround() to occur before intel_update_renderbuffers(). CC: "9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67934 Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-22 10:54:36 -07:00
Roland Scheidegger	bd0b6c5180	gallivm: do per-element lod for lod bias and explicit derivs too Except for explicit derivs with cube maps which are very bogus anyway. Just like explicit lod this is only used if no_quad_lod is set in GALLIVM_DEBUG env var. Minification is terrible on cpus which don't support true vector shifts (but should work correctly). Cannot do the min/mag filter decision (if they are different) per pixel though, only selecting different mip levels works. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-22 19:05:52 +02:00
Roland Scheidegger	33694a1800	gallivm: (trivial) fix int/uint border color clamping Just a copy & paste error. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=68409. Note that the test passing before probably simply means it doesn't verify clamping of the border color itself as required by the OpenGL spec. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-22 19:05:52 +02:00
Roland Scheidegger	6ff9008544	gallivm: (trivial) fix linear aos sampling of 3d compressed formats block size depth is always 1 even for compressed formats (unless someone invents true 3d compressed formats at least which we can't represent). Nearest (and soa) path had it right. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-22 19:05:52 +02:00
Michel Dänzer	237cb074cb	radeonsi: Fix y/z/w component values of TGSI_SEMANTIC_FOG pixel shader inputs They are defined as constant 0.0/0.0/1.0. Three more little piglits. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-08-22 16:12:17 +02:00
José Fonseca	fb62388d6a	gallium: Support PIPE_FORMAT_R10G10B10A2_UINT. Same as PIPE_FORMAT_B10G10R10A2_UINT but without the swizzling. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-08-22 12:14:15 +01:00
José Fonseca	c5f2cd6e41	trace: Handle null tokens. Used for example on stream out without geometry shader.	2013-08-22 12:14:15 +01:00
Chia-I Wu	b6037e734e	ilo: do not need last shader stage for 3DSTATE_SBE We have set up 3DSTATE_SBE (or 3DSTATE_SF on GEN6) in ilo_shader_select_kernel_routing(). There is no need to pass the last shader stage to the GPE function.	2013-08-22 15:18:29 +08:00
Chia-I Wu	627d7ca763	ilo: fix a potential issue with STATE_SIP Command length is ORed to the wrong place. Since the ORed value is zero, there is no real change.	2013-08-22 15:18:29 +08:00
Chia-I Wu	475d7ecce2	ilo: add GEN check to 3DSTATE_CLIP Assert that gen6_emit_3DSTATE_CLIP is for GEN 6 and 7.	2013-08-22 15:18:29 +08:00
Matt Turner	2f142d596f	build: Add --enable-gallium-osmesa flag. The Gallium implementation is apparently not ready for regular consumption, so as much as I hate adding more build-time options, here's another. Acked-by: Brian Paul <brianp@vmware.com>	2013-08-21 23:07:10 -07:00
Ian Romanick	dded321f92	glsl: Give a warning, not an error, for UBO qualifiers on non-matrices. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59648 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-21 23:06:59 -07:00
Matt Turner	921ef55a72	glsl: Remove ubo_qualifiers_allowed variable. No longer used. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-08-21 22:47:02 -07:00
Matt Turner	77373e020e	glsl: Drop duplicate error messages. This same message is printed in the validate_matrix_layout_for_type function. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-08-21 22:47:02 -07:00
Matt Turner	1a45db9705	glsl: Rename ubo_qualifiers_valid to ubo_qualifiers_allowed. The variable means that UBO qualifiers are allowed in a particular context (e.g., not allowed in a struct field declaration), rather than a particular set of UBO qualifiers are valid. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-08-21 22:47:02 -07:00
Kenneth Graunke	9d08756ac7	i965/fs: Add code to print out global copy propagation sets. This was invaluable when debugging the global copy propagation algorithm. We may as well commit it in case someone needs to print out the sets in the future. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-21 21:05:50 -07:00
Armin K	63ac68bae3	osmesa: Symlink shared library to LIB_DIR Cc: 9.2 <mesa-stable@lists.freedesktop.org> Tested-by: Brian Paul <brianp at vmware.com> Reviewed-by: Brian Paul <brianp at vmware.com>	2013-08-21 17:55:32 -06:00
Brian Paul	e4217396b7	svga: minor clean-ups in emit_hw_vs_vdecl()	2013-08-21 17:55:06 -06:00
Roland Scheidegger	e6013e4bee	gallivm: unify sin and cos implementation The (complicated!) math is all identical, there's just minimal differences how sign bit is calculated plus there's an additional subtraction for the argument going into the polynomial for cos. The logic stays 100% the same (with a small exception, sign bit calculation for sin is minimally simplified, applying sign mask after xoring the arguments instead of applying it to each argument). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-21 22:05:53 +02:00
Roland Scheidegger	275d2efeed	gallivm: add comment for bogus min/mag filter selection with nearest mip filter Detected this hunting some other bug, not sure if it really needs fixing but it is definitely wrong. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-21 22:05:52 +02:00
Roland Scheidegger	21d8fa2759	gallivm: fix rho calculation for 1d case Was using wrong (undefined) vector element (the elements are at 0/2 position, not 0/1). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-21 22:05:52 +02:00
Ville Syrjälä	e6893b99ad	i965/gen7: Set MOCS L3 cacheability for IVB/BYT (v2) IVB/BYT also has the same L3 cacheability control in MOCS as HSW, so let's make use of it. pts/xonotic and pts/reaction @ 1920x1080 gain ~4% on my IVB GT2. Most other things show less gains/no regressions, except furmark which loses some 10 points. I didn't have a BYT at hand for testing. v2: Don't check (brw->gen == 7) in gen7 functions. (chadv) Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-21 10:14:04 -07:00
Ville Syrjälä	22161983c3	i965/hsw: Populate MOCS for STATE_BASE_ADDRESS (v2) Just spotted these unpopulated MOCS fields when comparing the code against BSpec. Set the MOCS to the same as everywhere else in Haswell: L3-cacheable. v2: Annotate state packet fields (chadv). Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-21 10:14:04 -07:00
Maarten Lankhorst	10aa3677cc	glapi/gen: build temporary files in the build directory Writing to the source directory can cause multiple parallel builds from the same source to fail. Create the temporary files in the build directory. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-21 18:34:59 +02:00
Ian Romanick	f53b634807	mesa: Never advertise _S3TC compressed formats The NVIDIA driver doesn't expose them, and piglit's arb_texture_compression-invalid-formats expects them to not be there. This, with the previous commit, fixes piglit arb_texture_compression-invalid-formats. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-21 07:48:31 -07:00
Ian Romanick	40550c8ced	mesa: Only advertise GL_ETC1_RGB8_OES in ES contexts There is no extension for this format in desktop GL, so an application can't give the format back to glCompressedTexImage2D. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-21 07:46:51 -07:00
Ian Romanick	cabd45773b	glsl: Track existence of default float precision in GLSL ES fragment shaders This is required by the spec, and it's a bit tricky because the default precision is scoped. As a result, I'm slightly abusing the symbol table. Fixes piglit no-default-float-precision.frag tests and the piglit default-precision-nested-scope-0[1234].frag tests that are currently on the piglit mailing list for review. On IRC I got confirmation from cwabbot that ARM (Mali T6xx and T400) enforces this requirement and from kusma that NVIDIA (Tegra2) enforces this requirement. We should be safe from regressing shipping applications. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-21 07:44:26 -07:00
Ian Romanick	73e2d69792	glsl: Merge precision qualifiers too We never noticed this before because we previously didn't enforce GLSL ES fragment shader requirements that precision be defined. There may also have been some interaction here with the addition of GL_ARB_shading_language_420pack, but it doesn't appear to me that it added any new bugs (just perhaps uncovered some old ones). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-21 07:43:48 -07:00
Ian Romanick	b15b62c54c	glsl: Pass type to is_valid_default_precision_type instead of name This is used by the next patch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-21 07:43:48 -07:00
Rico Schüller	00fcdc81ff	vdpau/decode: Fix comment. Reviewed-by: Christian König <christian.koenig@amd.com>	2013-08-21 11:25:36 +02:00
Rico Schüller	d8d90ecf30	vl/query: Only support VDP_CHROMA_TYPE_420 for 12 bit formats. Reviewed-by: Christian König <christian.koenig@amd.com>	2013-08-21 11:25:10 +02:00
Roland Scheidegger	4b45b61fef	util: add avx2 and xop detection to cpu detection code Going to need this soon (not going to bother with avx2 intrinsics at this time but don't want to do workarounds for true vector shifts if llvm itself can use them just fine and won't need the gazillion instruction emulation). Not really tested other than my cpu returns 0 for these features... (I have no idea if llvm actually would emit avx2/xop instructions neither...) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-20 23:00:24 +02:00
Roland Scheidegger	9299128bf2	gallivm: fix bogus aos path detection Need to check the wrap mode of the actually used coords not a fixed 2. While checking more than necessary would only potentially disable aos and not cause any harm I'm pretty sure for 3d textures it could have caused assertion failures (if s,t coords have simple filter and r not). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-20 23:00:24 +02:00
Roland Scheidegger	fe92d7fab4	gallivm: do clamping of border color correctly for all formats Turns out it is actually very complicated to figure out what a format really is wrt range, as using channel information for determining unorm/snorm etc. doesn't work for a bunch of cases - namely compressed, subsampled, other. Also while here add clamping for uint/sint as well - d3d10 doesn't actually need this (can only use ld with these formats hence no border) and we could do this outside the shader for GL easily (due to the fixed texture/sampler relation) do it here too just so I can forget about it. v2: move border color clamping out of fetch texel. Also change it to clamp the whole border vector at once (and use vectorized load of border color), which saves a couple of instructions - needs some different handling of mixed signed/unsigned formats so skip the per channel stuff and just derive this from first channel except for special formats. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-20 23:00:24 +02:00
Roland Scheidegger	ac1a2714c7	gallivm: implement better control of per-quad/per-element/scalar lod There's a new debug value used to disable per-quad lod optimizations in fragment shader (ignored for vs/gs as the results are just too wrong typically). Also trying to detect if a supplied lod value is really a scalar (if it's coming from immediate or constant file) in which case sampler code can use this to stay on per-quad-lod path (in fact for explicit lod could simplify even further and use same lod for both quads in the avx case but this is not implemented yet). Still need to actually implement per-element lod bias (and derivatives), and need to handle per-element lod in size queries. v2: fix comments, prettify. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-20 23:00:24 +02:00
Brian Paul	d427278a2d	mesa: use ARRAY_SIZE() macro instead of magic number Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-20 13:14:25 -06:00
Ross Burton	76feef0823	build: fix out-of-tree builds in gallium/auxiliary The rules were writing files to e.g. util/u_indices_gen.py, but in an out-of-tree build this directory doesn't exist in the build directory. So, create the directories just in case. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Ross Burton <ross.burton@intel.com>	2013-08-20 10:35:14 -07:00
Michel Dänzer	be301f707e	radeonsi: Always pre-load separate VGPRs for centroid vs. center interpolation The LLVM R600 backend currently always uses separate VGPRs for these. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68162 (Centroid interpolation is identical to center interpolation without multisampling, so the shader hardware was only pre-loading one set of interpolation coefficients, and the pixel shader code was using uninitialized values as the centroid interpolation coefficients) Cc: mesa-stable@lists.freedesktop.org Tested-by: Laurent Carlier <lordheavym@gmail.com>	2013-08-20 18:50:28 +02:00
Michel Dänzer	5edcb682c9	radeonsi: Fix SPI_BARYC_CNTL register initialization The centroid / center interpolation related bits have different meanings as of SI. Fixes 7 centroid interpolation related piglit tests.	2013-08-20 18:50:10 +02:00
Maarten Lankhorst	86751cbddf	gallium/osmesa: add same checks to OSMesaMakeCurrent as the other osmesa Fixes an OpenGL crash in Wine. Cc: "9.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-08-20 12:36:17 +02:00
Maarten Lankhorst	603160d4c0	gallium/osmesa: link against static libglapi library too to get the gl exports This should fix missing symbols in an osmesa built against shared glapi. All OpenGL exports that are defined in the static glapi were missing, so link against both to fix this. I could swear I've done this before, maybe there was a glitch in the matrix. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47824 Cc: "9.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-08-20 10:44:53 +02:00
Kenneth Graunke	a4ff1fd388	i965: Shorten sampler loops in precompile key setup. Now that we have the number of samplers available, we don't need to iterate over all 16. This should be particularly helpful for vertex shaders. v2: Use the correct shader program (caught by Paul Berry). This needs to initialize the exact same set of sampler swizzles as the actual key setup, or else we end up doing recompiles due to some being XYZW and others being 0. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-20 01:09:52 -07:00
Chia-I Wu	ce87c51e9a	ilo: add ILO_DEBUG=flush When specified, ilo will print a line similar to "cp flushed for render with 949+888 DWords (22.4%) because of frame end" for every ilo_cp_flush() call.	2013-08-20 13:54:39 +08:00
Chia-I Wu	216a576e11	ilo: add ILO_DEBUG=draw It can print out pipe_draw_info and the dirty bits set, useful for debugging.	2013-08-20 13:54:38 +08:00
Vinson Lee	ff3cb378ad	r600g/sb: Move memsets of member structs to within constructor bodies. Silences "Uninitialized pointer field" defects reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-08-19 17:37:08 -07:00
Ian Romanick	574e4843e9	glsl: Use alignment of container record for its first field The first field of a record in a UBO has the alignment of the record itself. Fixes piglit vs-struct-pad, fs-struct-pad, and (with the patch posted to the piglit list that extends the test) layout-std140. NOTE: The bit of strangeness with the version of visit_field without the record_type pointer is because that method is pure virtual in the base class. The original implementation of the class did this to ensure derived classes remembered to implement that flavor. Now they can implement either flavor but not both. I don't know a C++ way to enforce that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68195 Cc: "9.2 9.1" mesa-stable@lists.freedesktop.org	2013-08-19 16:39:04 -07:00
Ian Romanick	5ac884fd9f	glsl: Add new overload of program_resource_visitor::visit_field method The outer-most record is passed into the visit_field method for the first field. In other words, in the following structure: struct S1 { vec4 v; float f; }; struct S { S1 s1; S1 s2; }; uniform Ubo { S s; }; s.s1.v would get record_type = S (because s1.v is the first non-record field in S), and s.s2.v would get record_type = S1. s.s1.f and s.s2.f would get record_type = NULL because they aren't the first field of anything. This new overload isn't used yet, but the next patch will add several uses. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Cc: "9.2 9.1" mesa-stable@lists.freedesktop.org	2013-08-19 16:39:04 -07:00
Ian Romanick	d9bb8b7b56	glsl: Disallow embedded structure definitions Continue to allow them in GLSL 1.10 because the spec allows it. Generate an error in all other versions because the specs specifically disallow it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-19 16:39:04 -07:00
Ian Romanick	5fb1dd51f3	meta: Add default precision qualifier to all fragment shaders Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-19 16:39:04 -07:00
Ian Romanick	5ac247a73e	glsl: Add default precision qualifiers for ES builtins Once the compiler properly checks for default precision qualifiers, these shaders will cease to compile. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-19 16:39:04 -07:00
Ian Romanick	0b5fb6d417	glsl: Remove extra "types" from error message Send it straight to the Department of Redundancy Department. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-19 16:39:04 -07:00
Kenneth Graunke	e197f53730	i965: Make the VS binding table as small as possible. For some reason, we didn't use this information even though the VS backend has computed it (albeit poorly) for ages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 13:17:00 -07:00
Kenneth Graunke	7e9559c9ba	i965/vs: Rework binding table size calculation. Unlike the FS, the VS backend already computed the binding table size. However, it did so poorly: after compilation, it looked to see if any pull constants/textures/UBOs were in use, and set num_surfaces to the maximum surface index for that category. If the VS only used a single texture or UBO, this overcounted by quite a bit. The shader time surface was also noted at state upload time (during drawing), not at compile time, which is inefficient. I believe it also had an off by one error. This patch computes it accurately, while also simplifying the code. It also renames num_surfaces to binding_table_size, since num_surfaces wasn't actually the number of surfaces used. For example, a VS that used one UBO and no other surfaces would have set num_surfaces to SURF_INDEX_VS_UBO(1) == 18, rather than 1. A bit of a misnomer there. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 13:17:00 -07:00
Kenneth Graunke	c642bd3dcc	i965/vs: Plumb brw_vec4_prog_data into vec4_generator(). This will be useful for the next commit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 13:17:00 -07:00
Kenneth Graunke	60689c05d1	i965/fs: Make the FS binding table as small as possible. Computing the minimum size was easy, and done at compile-time for no extra overhead here. Making the binding table smaller wastes less batch space. Adding the CACHE_NEW_WM_PROG dirty bit isn't strictly necessary, since other atoms depend on it and flag BRW_NEW_SURFACES. However, it's best to add it for clarity and safety. It shouldn't add any new overhead. v2: Use binding_table_size, rather than max_surface_index. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-19 13:17:00 -07:00
Kenneth Graunke	6d89bc803d	i965/fs: Track the binding table size in brw_wm_prog_data. By tracking the maximum surface index used by the shader, we know just how small we can make the binding table. Since it depends entirely on the shader program, we can just compute it once at compile time, rather than at binding table emit time (which happens during drawing). v2: Store binding_table_size, rather than max_surface_index, for consistency with the VS (which needs to be able to represent 0 surfaces). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-19 13:17:00 -07:00
Kenneth Graunke	7c717690b5	i965: Use SURF_INDEX_DRAW() for drawbuffer binding table indices. SURF_INDEX_DRAW() has been the identity function since the dawn of time, and both the shader code and binding table upload code relied on that, simply using X rather than SURF_INDEX_DRAW(X). Even if that continues to be true, using the macro clarifies the code. The comment about draw buffers needing to be first in order for headerless render target writes to work turned out to be wrong; with this change, SURF_INDEX_DRAW can be changed to arbitrary indices and everything continues working. The confusion was over the "Render Target Index" field in the FB write message header. If it were a binding table index, then RT 0 would have to be at index 0 for headerless FB writes to work. However, it's actually an index into the blend state table, so there's no problem. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Paul Berry <stereotype441@gmail.com>	2013-08-19 13:17:00 -07:00
Kenneth Graunke	c5fe7d063c	i965: Shorten sampler loops in key setup. Now that we have the number of samplers available, we don't need to iterate over all 16. This should be particularly helpful for vertex shaders. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 13:17:00 -07:00
Kenneth Graunke	d0401d09ce	i965: Make sampler counts available for the entire drawing operation. Previously, we computed sampler counts when generating the SAMPLER_STATE table. By computing it earlier, we should be able to shorten a bunch of loops. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 13:17:00 -07:00
Kenneth Graunke	c6e572275b	i965: Split the brw_samplers atom into separate FS/VS stages. This allows us to avoid uploading the VS sampler state table if only the fragment program changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 13:17:00 -07:00
Kenneth Graunke	7e01af662a	i965: Upload separate VS and FS sampler state tables. Now, each shader stage has a sampler state table that only refers to the samplers actually used by that program. This should make the VS table nonexistent or very small. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 13:16:59 -07:00
Kenneth Graunke	2b7f876a6a	i965: Make upload_sampler_state_table a virtual function. This allows us to coalesce the brw_samplers and gen7_samplers atoms. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 13:16:59 -07:00
Kenneth Graunke	decc708c7c	i965: Upload separate per-stage sampler state tables. Also upload separate sampler default/texture border color entries. At the moment, this is completely idiotic: both tables contain exactly the same contents, so we're simply wasting batch space and CPU time. However, soon we'll only upload data for textures actually /used/ in a particular stage, which will usually make the VS table empty and very likely eliminate all redundancy. This is just a stepping stone. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 13:16:59 -07:00
Kenneth Graunke	9525bcf5f7	i965: Un-hardcode border color table from update_sampler_state(). Like the previous patch, this simply pushes direct access to brw->wm up one level in the call chain. Rather than passing the whole array, we just pass a pointer to the correct spot in the array, similar to what we do for the actual sampler state structure. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 13:16:59 -07:00
Kenneth Graunke	ed4459b10b	i965: Un-hardcode border color table from upload_default_color. When we begin uploading separate sampler state tables for VS and FS, we won't be able to use &brw->wm.sdc_offset[ss_index]. By passing it in as a parameter, we push the problem up to the caller. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 13:16:59 -07:00
Kenneth Graunke	f5a690cb68	i965: Split sampler count variable to be per-stage. Currently, we only have a single sampler state table shared among all stages, so we just copy wm.sampler_count into vs.sampler_count. In the future, each shader stage will have its own SAMPLER_STATE table, at which point we'll need these separate sampler counts. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 13:16:59 -07:00
Kenneth Graunke	44960ef918	i965/fs: Re-enable global copy propagation. I believe the data flow analysis actually works now, and it should be safe to re-enable global copy propagation. It even does things now. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 11:29:24 -07:00
Kenneth Graunke	72f2249c11	i965/fs: Fix computation of livein. Since the initial value for livein is an overestimation (0xffffffff), it's extremely likely that it will shrink, which means we can't simply OR in new bits - we need to fully recompute it based on the current liveout values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 11:29:24 -07:00
Kenneth Graunke	70b02a7fac	i965/fs: Fully recompute liveout at each step. Since we start with an overestimation of livein (0xffffffff), successive steps can actually take away values. This means we can't simply OR in new liveout values; we need to recompute it from scratch at each iteration of the fixed-point algorithm. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 11:29:24 -07:00
Kenneth Graunke	d20b472d0a	i965/fs: Skip the initial block when updating livein/liveout. The starting block always has livein = 0 and liveout = copy. Since we start with real data, not estimates, there's no need to refine it with the fixed point algorithm. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 11:29:24 -07:00
Kenneth Graunke	731145c579	i965/fs: Drop unnecessary and incorrect liveout initialization. The previous commit properly initialized liveout. This previous (and incorrect) initialization is no longer necessary. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 11:29:24 -07:00
Kenneth Graunke	1d40c784f2	i965/fs: Properly initialize the livein/liveout sets. Previously, livein was initialized to 0 for all blocks. According to the textbook, it should be the universal set (~0) for all blocks except the one representing the start of the program (which should be 0). liveout also needs to be initialized to COPY for the initial block. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 11:29:24 -07:00
Kenneth Graunke	f06826cece	i965/fs: Use the COPY set in the calculation for liveout. According to page 360 of the textbook, the proper formula for liveout is: CPout(i) = COPY(i) union (CPin(i) - KILL(i)) Previously, we omitted COPY. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 11:29:24 -07:00
Kenneth Graunke	a291c59bba	i965/fs: Simplify liveout calculation. Excluding the existing liveout bits is a deviation from the textbook algorithm. The reason for doing so was to determine if the value changed, which means the fixed-point algorithm needs to run for another iteration. The simpler way to do that is to save the value from step (N-1) and compare it to the new value at step N. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 11:29:24 -07:00
Kenneth Graunke	597efd2b67	i965/fs: Create the COPY() set for use in copy propagation dataflow. This is the "COPY" set from Muchnick's textbook, which is necessary to do the dataflow algorithm correctly. v2: Simplify initialization based on Paul Berry's observation that out_acp contains exactly what needs to be in the COPY set. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 11:29:24 -07:00
Kenneth Graunke	669d4d7f77	i965/fs: Rename setup_kills() to setup_initial_values(). Although this function currently only initializes the KILL set, it will soon initialize other data flow sets as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 11:29:24 -07:00
Kenneth Graunke	2ef81372dc	i965/fs: Separate the updating of liveout/livein. To compute the actual liveout/livein data flow values, we start with some initial values and apply a fixed-point algorithm until they settle. Previously, we iterated through all blocks, updating both liveout and livein together in one pass. This is awkward, since computing livein for a block requires knowing liveout for all parent blocks. Not all of those parent blocks may have been processed yet. This patch separates the two. First, we update liveout for all blocks. At iteration N of the fixed-point algorithm, this uses livein values from iteration N-1. Secondly, we update livein for all blocks. At step N, this uses the liveout information we just computed (in step N). This ensures each computation has a consistent picture of the data, rather than seeing a random mix of data from steps N-1 and N depending on the order of the blocks in the CFG data structure. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 11:29:24 -07:00
Kenneth Graunke	7d86042dee	i965/fs: Rename "cont" to "progress" in dataflow algorithm. This variable indicates that the fixed-point algorithm made changes to the data at this step, so it needs to run for another iteration. "progress" seems a nicer name for that. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 11:29:23 -07:00
Kenneth Graunke	0225dea6c4	i965/fs: Switch to a do-while loop in copy propagation dataflow. The fixed-point algorithm needs to run at least once, so a do-while loop is more natural. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 11:29:23 -07:00
Kenneth Graunke	3c68662bb1	i965/fs: Skip global copy propagation step. The dataflow analysis used for global copy propagation is severely broken, and I believe it doesn't actually do anything. Fixing it will require a lot of changes, each of which might break things. Once all the fixes land, we can re-enable this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-19 11:29:23 -07:00
Emil Velikov	b9d1173f2c	vl/buffers: consistent use on VL_MAX_SURFACES Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-08-19 18:32:08 +02:00
Emil Velikov	e7c17eb819	st/vdpau: drop unnecessary variable prof Any decent compiler will do this for us, although doing this will make grepping through the code a lot easier. v2: In both mixer and query interface v3: rebase Reviewed-by: Christian König <christian.koenig@amd.com> [v1] Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-08-19 18:32:08 +02:00
Emil Velikov	1d260360d8	vl/idct: cleanup all idct buffers Code should loop through and cleanup the three (VL_NUM_COMPONENTS) idct buffers, rather than doing the first one three times. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-08-19 18:32:08 +02:00
Emil Velikov	5354d2e76a	vl/buffer: add sanity check after CALLOC_STRUCT Check if we have successfully allocated memory. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-08-19 18:32:08 +02:00
Emil Velikov	eab9bad1ac	st/xvmc: exit gracefully if we fail to create video buffer Free any allocated memory and return BadAlloc if create_video_buffer() has failed to create a buffer. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-08-19 18:32:07 +02:00
Emil Velikov	5e91c15290	st/vdpau: don't try to create video buffer when the format is FORMAT_NONE Not seen in the wild yet, but seems like a reasonable thing to do. [suggested by Christian] Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2013-08-19 18:32:03 +02:00
Andy Furniss	3448b66dac	vdpau/vl 422 chroma width/height mix up I was looking into some minor 422 issues/discrepancies I noticed long ago using vdpau on my rv790. I noticed that there is code that is halving height rather than width - 422 is full height AFAIK. Making the changes below doesn't actually make any noticeable difference to what I was looking into. Maybe there are more, but here are three I've found so far. Reviewed-by: Christian König <christian.koenig@amd.com>	2013-08-19 18:31:26 +02:00
Vinson Lee	b1d05eeb1f	radeonsi: Ensure fmask_format is initialized in release builds. Fixes "Uninitialized scalar variable" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-08-19 09:19:19 -07:00
Paul Berry	c6b6c93643	i965: STATIC_ASSERT that there aren't too many BRW_NEW_* flags. We are getting close to the maximum number of BRW_NEW_* bits that can be stored in brw->state.dirty.brw without overflowing 32 bits, and geometry shaders are going to add more. Add a STATIC_ASSERT so that we will be alerted when we need to switch to 64 bits. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-19 08:28:17 -07:00
Christian König	5ddd840f5a	vl: add entrypoint to is_video_format_supported Signed-off-by: Christian König <christian.koenig@amd.com>	2013-08-19 10:21:15 +02:00
Christian König	a15cbabb8b	vl: add entrypoint to get_video_param Signed-off-by: Christian König <christian.koenig@amd.com>	2013-08-19 10:21:15 +02:00
Christian König	f2f7064e56	vl: rename pipe_video_decoder to pipe_video_codec Signed-off-by: Christian König <christian.koenig@amd.com>	2013-08-19 10:21:15 +02:00
Christian König	8e423ab984	vl: rename enum pipe_video_codec to pipe_video_format Signed-off-by: Christian König <christian.koenig@amd.com>	2013-08-19 10:21:15 +02:00
Christian König	53e20b8b41	vl: use a template for create_video_decoder Signed-off-by: Christian König <christian.koenig@amd.com>	2013-08-19 10:21:14 +02:00
Marek Olšák	d13003f544	glsl: don't eliminate texcoords that can be set by GL_COORD_REPLACE Tested by examining generated TGSI shaders from piglit/glsl-routing. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Henri Verbeet <hverbeet@gmail.com> Tested-by: Henri Verbeet <hverbeet@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-18 12:27:08 +02:00
Ilia Mirkin	a8346a2f52	nv50: allow non-nv12 buffers to be created, just pass them through to vl Since we expose non-NV12 formats as supported when there is no decoder profile selected, make sure that those formats are actually allowed to be allocated. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-17 17:58:36 +02:00
Eric Anholt	bef423bee6	dri: Choose a decent global driNConfigOptions. Previously, we were asserting that each driver specified an NConfigOptions exactly equal to the number of options they supplied, leading to frequent bugs when people would forget to adjust the value when adjusting driver options. Instead, just overallocate the table by a bit and leave sanity checking to the assert in findOption(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-17 11:43:19 +02:00
Kenneth Graunke	703a2f4219	i965: Improve comments for driver hooks in intel_buffer_object.c. Consistently using a "The ___ driver hook." line at the top of each function's comment block makes it easy to see at a glance what function is being implemented. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 19:00:49 -07:00
Kenneth Graunke	96a0fe7e4d	i965: Split intel_upload code out into a separate file. This upload code performs batched uploads via a BO. By moving it out to a separate file, intel_buffer_objects.c only provides the core buffer object functionality. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 19:00:49 -07:00
Kenneth Graunke	76c2533470	i965: Move GL_APPLE_object_purgeable functionality into a new file. GL_APPLE_object_purgeable creates a mechanism for marking OpenGL objects as "purgeable" so they can be thrown away when system resources become scarce. It specifically applies to buffer objects, textures, and renderbuffers. The intel_buffer_objects.c file provides core functionality for GL buffer objects, such as MapBufferRange and CopyBufferSubData. Having texture and renderbuffer functionality in that file is a bit strange. The 2010 copyright on the new file is because Chris Wilson first added this code in January 2010 (commit `755915fa`). v2: Actually remember to call the new dd table setup function. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 19:00:49 -07:00
Marek Olšák	aafb0f9e06	radeonsi: fix feature support reporting broken by `21d9a1b5ef`	2013-08-17 02:49:00 +02:00
Niels Ole Salscheider	5394ee8f30	clover: Fix linkage of libOpenCL Clover needs the option component of llvm. Reviewed-by: Tom Stellard <tom@stellard.net> Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>	2013-08-16 16:52:31 -07:00
Marek Olšák	21d9a1b5ef	radeonsi: require LLVM 3.4 for MSAA	2013-08-17 01:48:25 +02:00
Marek Olšák	87b88f1dae	radeonsi: don't make scanout resources linear except for cursors The surface allocator understands the scanout flag just fine. This seems to improve performance for Ubuntu Unity on top of st/xorg and it fixes the cursor. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-17 01:48:25 +02:00
Marek Olšák	89ca4a00f5	radeonsi: remove useless code from tex_fetch_args The array slice has already been added to "address". Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-17 01:48:25 +02:00
Marek Olšák	5550554f1e	radeonsi: disable unbound colorbuffers Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-17 01:48:25 +02:00
Marek Olšák	356c041167	radeonsi: port texture improvements from r600g This started as an attempt to add support for MSAA texture transfers and MSAA depth-stencil decompression for the DB->CB copy path. It has gotten a bit out of control, but it's for the greater good. Some changes do not make much sense, they are there just to make it look like the other driver. With a few cosmetic modifications, r600_texture.c can be shared with a symlink. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-17 01:48:25 +02:00
Marek Olšák	4855acd461	radeonsi: implement texture fetching for compressed MSAA textures (v2) v2: use resource slots 16..31 for FMASK textures Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-17 01:48:25 +02:00
Marek Olšák	f671dfa8aa	radeonsi: add FMASK texture binding slots and resource setup (v2) v2: bind FMASK textures to shader resource slots 16..31 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-17 01:48:25 +02:00
Marek Olšák	3c3feb38f4	radeonsi: implement FMASK decompression for MSAA texturing Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-17 01:48:25 +02:00
Marek Olšák	8c04f25360	radeonsi: scanout buffers cannot be a destination of MSAA resolve Resolving to scanout buffers just doesn't work. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-17 01:48:25 +02:00
Marek Olšák	2a4b2e2305	radeonsi: implement MSAA colorbuffer compression for rendering Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-17 01:48:25 +02:00
Marek Olšák	2f1c449415	radeonsi: implement uncompressed MSAA texturing This is glBlitFramebuffer support for MSAA surfaces as required by GL 3.0 and texturing as required by GL 3.2 and GL_ARB_texture_multisample. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-17 01:48:25 +02:00
Marek Olšák	f083f79751	radeonsi: disable alpha-to-coverage for integer colorbuffers Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-17 01:48:25 +02:00
Marek Olšák	6d4755a4d7	radeonsi: implement GL_SAMPLE_ALPHA_TO_ONE Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-17 01:48:25 +02:00
Marek Olšák	07955d4f2b	radeonsi: implement uncompressed MSAA rendering and color resolving This is basic MSAA support which should work with most apps. Some features are missing, those will be implemented by other commits. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-17 01:48:25 +02:00
Marek Olšák	c8e70e64ac	radeonsi: add flexible shader descriptor management and use it for sampler views It moves all sampler view descriptors to a buffer. It supports partial resource updates and it can also unbind resources (required for FMASK texturing). The buffer contains all sampler view descriptors for one shader stage, represented as an array. On top of that, there are N arrays in the buffer, which are used to emulate context registers as implemented by the previous ASICs (each array is a context). This uses the RCU synchronization approach to avoid read-after-write hazards as discussed in the thread: "radeonsi: add FMASK texture binding slots and resource setup" CP DMA is used to clear the descriptors at context initialization and to copy the descriptors from one context to the next. v2: - use PKT3_DMA_DATA on CIK (I'll test CIK later) - turn the bool CP DMA parameters into self-explanatory flags - add a nice simple API for packet emission to radeon_winsys.h - use 256 contexts, 128 causes texture corruption in openarena	2013-08-17 01:48:25 +02:00
Tom Stellard	764502b481	radeonsi/compute: Let the state tracker do all the flushing It shouldn't be necessary to call radeon_winsys::cs_flush() from radeonsi_launch_grid(), because the state tracker is responsible for flushing the pipeline at the appropriate time. The current behavior is also wrong, because radeonsi_launch_grid() submits packets to the compute ring, but when the state tracker calls pipe->flush() everything is submitted to the graphics ring. This has the potential to create a race condition. The downside of removing this flush is that the compute dispatch packets will be sent to the graphics ring rather than the compute ring. In the future we will need to come up with a way to detect 'compute' command streams and submit them to the appropriate ring. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-08-17 01:48:25 +02:00
Kenneth Graunke	e29931aa74	i965: Dump more information about batch buffer usage. Previously, INTEL_DEBUG=bat would dump messages like: intel_mipmap_tree.c:1643: Batchbuffer flush with 456b used This only reported the space used for command packets, and didn't report any information on the space used for indirect state. Now it dumps: intel_context.c:366: Batchbuffer flush with 6128b (pkt) + 4288b (state) = 10416b (31.8%) This conveniently shows the breakdown of space used for packets vs. state, as well as the percentage of batchbuffer space. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 15:54:24 -07:00
Kenneth Graunke	2a9492f321	i965: Add Gen7 depth stall flushes before disabling depth in BLORP. We emit these before configuring depth in the normal path, or actually using the depth buffer in BLORP - we just failed to emit them when disabling depth altogether. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-16 15:03:55 -07:00
Kenneth Graunke	8fba8d4ee7	i965: Add Gen6 depth stall flushes before disabling depth in BLORP. We emit these before configuring depth in the normal path, or actually using the depth buffer in BLORP - we just failed to emit them when disabling depth altogether. On Sandybridge, this also requires the post_sync_nonzero flush. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-16 15:03:38 -07:00
Matt Turner	9c48ae751a	i965: Don't copy propagate bitcasts with source modifiers. Previously, copy propagation would cause bitcast_f2u(abs(float)) to be performed in a single step, but the application of source modifiers (abs, neg) happens after type conversion, leading to incorrect results. That is, for bitcast_f2u(abs(float)) we would in fact generate code to do abs(bitcast_f2u(float)). For example, whereas bitcast_f2u(abs(float)) might result in a register argument such as (abs)g2.2<0,1,0>UD v2: Set interfered = true and break in register_coalesce instead of returning false. Reviewed-by: Paul Berry <stereoytpe441@gmail.com>	2013-08-16 13:11:07 -07:00
Matt Turner	0ae9ca12a8	i965: Emit MOVs for neg/abs. Necessary to avoid combining a bitcast and a modifier into a single operation. Otherwise if safe, the MOV should be removed by copy-propagation or register coalescing. With this and the next patch, there are only four changes in shader-db: all a single extra instruction. The code does something like mov a.w, -b.x and copy propagation doesn't work because it only handles no-op swizzles. Seems acceptable, given the known limitation of our copy propagation. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Paul Berry <stereoytpe441@gmail.com>	2013-08-16 13:11:07 -07:00
Anuj Phogat	079bdba05f	i965/blorp: Add support for single sample scaled blit with bilinear filter Currently single sample scaled blits with GL_LINEAR filter fall back to the meta path. This patch removes that limitation in the BLORP engine and implements single sample scaled blit with bilinear filter. No piglit, gles3 regressions are observed with this patch on Ivybridge. V2: Use "sample" message to utilize the linear filtering functionality built in to hardware. V3: Define a bool variable (bilinear_filter) to handle the conditions for GL_LINEAR blits. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 09:46:15 -07:00
Anuj Phogat	aff371b634	i965/blorp: Define a function to clamp texture coordinates New function clamp_tex_coords() clamps the texture coordinates to texture boundaries. This function will also be utilized later for the BLORP implementation of single-sample scaled blit with bilinear filter. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 09:46:15 -07:00
Anuj Phogat	6066fb1721	i965/blorp: Use more appropriate variable names When we talk about both multi-sample and single-sample scaled blits, rect_grid_{x1, y1} are more appropriate variable names as compared to sample_grid_{x1, y1}. There are no functional changes in this patch. It just prepares for the BLORP implementation of single-sample scaled blit with bilinear filter. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 09:46:15 -07:00
Anuj Phogat	d944a6144f	meta: Fix blitting a framebuffer with renderbuffer attachment This patch fixes a case of framebuffer blitting with renderbuffer as color attachment and GL_LINEAR filter. Meta implementation of glBlitFramebuffer() converts the source color buffer to a texture and uses it to do the scaled blitting into the destination buffer. Using the exact source rectangle to create the texture does incorrect linear filtering along the edges. This patch extends the texture edges by one pixel in the x and y directions. This ensures correct linear filtering. It fixes the failing piglit fbo-attachments-blit-scaled-linear test. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> CC: "9.2" <mesa-stable@lists.freedesktop.org> CC: "9.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 09:46:15 -07:00
Ilia Mirkin	a2061eea0f	nv50: add vp3/vp4 support for mpeg2/vc1 h264/mpeg4 remain disabled for pre-nvc0, there's some minor bug/difference which causes the decoding to hang after some frames. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-08-16 09:48:47 +02:00
Ilia Mirkin	b3f6f127f2	nv50: separate video logic from noalloc The upcoming vp3 logic will want the video layout, but allocated by the miptree. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-08-16 09:48:26 +02:00
Ilia Mirkin	c1a6f59b20	nv30: remove no-longer-used formats from table Commit `14ee790df7` removed the formats from the vtxfmt_table but forgot to also update the info_table. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>	2013-08-16 09:48:09 +02:00
Fredrik Höglund	0e7a61a29f	mesa: Update the BGRA vertex array error handling The error code was changed from INVALID_VALUE to INVALID_OPERATION in OpenGL 3.3. We should also generate an error when size is BGRA and normalized is FALSE. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-15 21:38:13 -07:00
Kenneth Graunke	90129da82c	i965/fs: Fix Sandybridge regressions from SEL optimization. Sandybridge is the only platform that supports an IF instruction with an embedded comparison. In this case, we need to emit a CMP to go along with the SEL. Fixes regressions in Piglit's glsl-fs-atan-3, fs-unpackHalf2x16, fs-faceforward-float-float-float, isinf-and-isnan fs_basic, and isinf-and-isnan fs_fbo. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68086 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: lu hua <huax.lu@intel.com>	2013-08-15 15:33:00 -07:00
Kenneth Graunke	c189840b21	i965: Force X-tiling for 128 bpp formats on Sandybridge. 128 bpp formats are not allowed to be Y-tiled on any architectures except Gen7. +11 Piglits on Sandybridge (mostly regression fixes since the switch to Y-tiling). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63867 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64261 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-15 15:18:48 -07:00
Ian Romanick	41eef83cc0	mesa/vbo: Fix handling of attribute 0 in non-compatibility contexts It is only in OpenGL compatibility-style contexts where generic attribute 0 and GL_VERTEX_ARRAY have a bizarre, aliasing relationship. Moreover, it is only in OpenGL compatibility-style contexts and OpenGL ES 1.x where one of these attributes provokes the vertex. In all other APIs each implicit call to glArrayElement provokes a vertex regardless of which attributes are enabled. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Robert Bragg <robert@sixbynine.org> Cc: "9.0 9.1 9.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55503 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66292 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67548	2013-08-15 14:59:37 -07:00
Zack Rusin	7115bc3940	draw: handle nan clipdistance If clipdistance for one of the vertices is nan (or inf) then the entire primitive should be discarded. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-08-15 16:26:32 -04:00
Vinson Lee	035bf21983	i915,i965: Fix memory leak in try_pbo_upload (v2) Fixes "Resource leak" defect reported by Coverity. Tested on Haswell, no Piglit regressions. v2: Apply to i965, not just i915. (chadv) CC: "9.2, 9.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-15 10:37:22 -07:00
Roland Scheidegger	6ca18e06ae	gallivm: revert accidentally committed hunk That magic wasn't meant to be committed; need to work on some proper fix.	2013-08-15 19:26:39 +02:00
Roland Scheidegger	5626a84a00	gallivm: do per-sample depth comparison instead of doing it post-filter Doing the comparisons pre-filter is highly recommended by OpenGL (and d3d9) and definitely required by d3d10. This actually doesn't do it pre-filter but more "in-filter" as otherwise need to push the comparisons even further down into fetch code and this also trivially allows using a somewhat cheaper lerp. Doing it pre-filter would actually have some performance advantage for UNORM formats (because the comparisons should be done in texture format, we'd only need to convert the shadow ref coord to texture format once, but in turn would save converting the per-sample texture values to floats) but this gets a bit messy as this has implications for border color handling as well (which needs to be done prior to depth comparisons, hence would also need to convert border color to texture format too or use some other tricks like doing separate border color / shadow ref comparison and simply using that result directly when doing border replacement). Should make no difference for nearest filtering, and performance for linear filtering should be mostly the same too (essentially have one more comparison instruction per sample, and replace the sub/mul/add lerp with a sub/and/and/add special "lerp" which all in all shouldn't be much of a difference). v2: get rid of old code completely Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-15 18:42:20 +02:00
Michel Dänzer	3b2f3f90ac	radeonsi: Pixel shaders pre-load one more SGPR Acked-by: Marek Olšák <maraeo@gmail.com>	2013-08-15 17:55:00 +02:00
Michel Dänzer	f0753a3cd4	radeonsi: TGSI_SEMANTIC_CLIPVERTEX doesn't use any parameters	2013-08-15 17:54:40 +02:00
Michel Dänzer	2f98dc223f	radeonsi: Don't export unused clip distance vectors from vertex shader E.g. the Source engine seems to always write to gl_ClipVertex, but normally doesn't enable any GL_CLIP_DISTANCEn states. This change removes some irrelevant parts from the generated vertex shader code in such cases. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-08-15 17:53:50 +02:00
Michel Dänzer	b00269aa58	radeonsi: Don't leave gaps between position exports from vertex shader If the vertex shader exports clip distances but not point size, use position exports 1/2 instead of 2/3 for the clip distances. Fixes geometry corruption in that case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66974 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-08-15 17:42:26 +02:00
Roland Scheidegger	abdd32dcd5	llvmpipe: fix stencil bug if we have both stencil and depth tests This is a very well hidden bug found by accident (only the fixed glean tstencil2 test so far seems to hit it). We must use new mask with combined s_pass values and orig_mask values for zpass/zfail stencil ops, otherwise both the sfail op and one of zpass/zfail op are applied (probably not hit in most tests because some of the ops tend to be KEEP usually). Note: this is a candidate for the 9.2 branch. Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-15 17:30:07 +02:00
Roland Scheidegger	7ae9cc71f0	st/mesa: use new float comparison opcodes if native integers are supported Should get rid of some float-to-int conversions (with negation). No piglit regressions (with llvmpipe). v2: fix bogus formatting spotted by Brian. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-08-15 17:30:07 +02:00
Ilia Mirkin	4ea191fb2d	nvc0: move video param and format support functions to nouveau Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-08-15 15:19:48 +02:00
Ilia Mirkin	9255019a53	nvc0: move firmware loading functions to nouveau Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-08-15 15:19:48 +02:00
Ilia Mirkin	9d8c076803	nvc0: move some of the simpler decoder functions into nouveau Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-08-15 15:19:48 +02:00
Ilia Mirkin	73f4499a02	nvc0: move vp param filling logic into nouveau Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-08-15 15:19:48 +02:00
Ilia Mirkin	e1cd987bb6	nvc0: move bsp param-filling logic into nouveau Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-08-15 15:19:48 +02:00
Ilia Mirkin	d6a82a7747	nvc0: move nvc0_decoder into nouveau, rename to nouveau_vp3_decoder Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-08-15 15:19:47 +02:00
Ilia Mirkin	86e5c3c97b	nvc0: standardize on using #if for NVC0_DEBUG_FENCE Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-08-15 15:19:47 +02:00
Ilia Mirkin	b57875bbb3	nvc0: refactor video buffer management logic into nouveau_vp3 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-08-15 15:19:47 +02:00
Ilia Mirkin	940f7cec77	nv50: allow forcing PMPEG use, for ease of testing This also allows people who don't want to install the binary blobs required for VP2 to still get MPEG decoding. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-08-15 15:15:23 +02:00
Ilia Mirkin	ee3ca3614e	nv30: hook up PMPEG support via nouveau_video, enables XvMC to work Force the format to be the reasonable format that doesn't require an inverse z-scan. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-08-15 15:15:12 +02:00
Ilia Mirkin	6010c683d0	nouveau: set buffer format of video buffer Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-08-15 15:15:04 +02:00
Ilia Mirkin	8975f83402	nouveau: fix number of surfaces in video buffer, use defines Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-08-15 15:15:02 +02:00
Ilia Mirkin	14ee790df7	nv30: U8_USCALED only works for size 4 See https://bugs.freedesktop.org/show_bug.cgi?id=61635 for a sample program. Changing it to use a vec4 makes it work. Remove the unsupported formats. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org>	2013-08-15 15:14:25 +02:00
Chris Forbes	4f739646b0	i965: allow 8 user clip planes on CTG+ There's no need to use a clip flag for NEGW on these gens, so no reason we can't just enable 8 planes. V2: - Bump (and document!) MAX_VERTS in the clip code. - Fix clip flag masks in the clip unit state and in the shader prolog - Move this to the end of the series for less breakage. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 07:24:56 +12:00
Chris Forbes	ee0b8e0f06	i965: get rid of clip plane compaction Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 07:24:56 +12:00
Chris Forbes	cf52f6435e	i965/clip: Support clip distances for line clipping This does the same thing as we do for triangle clipping -- select the appropriate source (either dot(hpos,fixed plane) or a clipdistance slot). Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 07:24:56 +12:00
Chris Forbes	2a8a85e1ad	i965/clip: remove spurious clipvertex param Nothing in the clipper uses gl_ClipVertex any more, so we don't care where it is. V2: Don't bother fishing out the clipvertex offset either. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 07:24:56 +12:00
Chris Forbes	45540921ec	i965/clip: Use clip distances for all user clipping V2: Adjust explanation of load_clip_distance() Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 07:24:55 +12:00
Chris Forbes	bf9ede92c2	i965/clip: push dp4 into load_clip_distance Soon the dp4 is only going to be used for fixed clip planes. V2: Remove old inaccurate comment about the behavior of this function; add a better explanation above. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 07:24:55 +12:00
Chris Forbes	265336e75a	i965/clip: Track offset into the vertex for clipdistance Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 07:24:55 +12:00
Chris Forbes	3b738f5f85	i965/Gen4-5: Set clip flags from clip distances V2: - Use the new VS_OPCODE_UNPACK_FLAGS_SIMD4X2 to correctly split the flags for the two vertices being processed together. - Don't apply bogus masking of clip flags. The set of plane enables aren't included in the shader key, and we wouldn't want the recompiles anyway. V3: - Tidy up spurious instructions, name temps properly. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> [V2] Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 07:24:55 +12:00
Chris Forbes	a9be50f776	i965: add new VS_OPCODE_UNPACK_FLAGS_SIMD4X2 Splits the bottom 8 bits of f0.0 for further wrangling in a SIMD4x2 program. The 4 bits corresponding to the channels in each program flow are copied to the LSBs of dst.x visible to each flow. This is useful for working with clipping flags in the VS. V3: - Fixup immediate types - Teach scheduler about the hidden dep on flags Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> V2: Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 07:24:38 +12:00
Chris Forbes	9e2c1e28a1	i965/vs: add vec4_instruction::depends_on_flags We're about to have an instruction that depends on the flags but isn't predicated. This lays the groundwork. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2013-08-16 07:21:43 +12:00
Chris Forbes	c5e2d0454b	i965/clip: Enable interpolation of clip distances Previously we had disabled interpolation of the clip distances as a special case, since they were unused. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 07:21:42 +12:00
Chris Forbes	972e2f11c0	i965/vs: Do legacy clip lowering earlier We need to produce clip flags for the vertex header on Gen4/5, so clip plane lowering has to be done before we try to emit the flags/psiz attribute. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 07:21:37 +12:00
Chris Forbes	9e07a68cad	i965/Gen4-5: ensure VUE slots for clipdistance are valid if user clipping is enabled. V2: We don't particularly care where they fall in the VUE map, as long as they are allocated somewhere, and occupy two contiguous slots. Don't fiddle with the SF layout at all -- there's no need. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-16 07:20:47 +12:00
Chia-I Wu	a453eb6f86	ilo: fix fragment shaders that use PCB on GEN7+ Missed this commit when preparing PCB changes for upstreaming.	2013-08-15 11:35:46 +08:00
Vinson Lee	ae645b83fc	nouveau: Fix variable name. Fixes build error introduced with commit `d1ba1055d9`. CC nouveau_video.lo nouveau_video.c: In function 'nouveau_screen_get_video_param': nouveau_video.c:866:33: error: 'screen' undeclared (first use in this function) nouveau_video.c:866:33: note: each undeclared identifier is reported only once for each function it appear Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-08-14 17:35:31 -07:00
Matt Turner	57a6bcd56b	glsl: Add i2b() and b2i() to ir_builder. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-14 17:15:06 -07:00
Matt Turner	1cf76c72da	glsl: Add nequal() to ir_builder. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-14 17:15:06 -07:00
Matt Turner	16be6298c0	glsl: Add abs() to ir_builder. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-14 17:15:06 -07:00
Matt Turner	6bfb1a8344	glsl: Add bitcast_i2f() to ir_builder. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-14 17:15:06 -07:00
Marek Olšák	3d1b01662b	radeonsi: unduplicate code in create_context Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-15 02:03:03 +02:00
Marek Olšák	e801b78aa0	radeonsi: initialize the radeon_surface structure this fixes valgrind warnings Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-15 02:03:03 +02:00
Marek Olšák	731c6aa52d	radeonsi: correct sampler function names Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-15 02:03:03 +02:00
Marek Olšák	0469171159	radeonsi: rename r600_texture::dirty_db_mask to dirty_level_mask Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-15 02:03:03 +02:00
Marek Olšák	363b2805f7	radeonsi: rename r600_resource_texture to r600_texture Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-15 02:03:02 +02:00
Marek Olšák	128819d394	tgsi: add info about MSAA samplers to tgsi_shader_info Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-15 02:03:02 +02:00
Marek Olšák	0ee4bae70d	tgsi: fix the location of sample index The sample index is always in W. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-08-15 02:03:02 +02:00
Roland Scheidegger	7727fbb7c5	r600/radeonsi: implement new float comparison instructions Also use ordered comparisons for old cmp instructions. Tested-by: Michel Dänzer <michel@daenzer.net> Reviewed-by: Tom Stellard <tom@stellard.net>	2013-08-15 00:40:14 +02:00
Roland Scheidegger	72874d2352	nv50: implement new float comparison instructions untested. Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2013-08-15 00:40:14 +02:00
Roland Scheidegger	e858921d52	ilo: implement new float comparison instructions untested. Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2013-08-15 00:40:14 +02:00
Roland Scheidegger	e58c2310b8	gallivm: already pass coords in the right place in the sampler interface This makes things a bit nicer, and more importantly it fixes an issue where a "downgraded" array texture (due to view reduced to 1 layer and addressed with (non-array) samplec instruction) would use the wrong coord as shadow reference value. (This could also be fixed by passing target through the sampler interface much the same way as is done for size queries, might do this eventually anyway.) And if we'd ever want to support (shadow) cube map arrays, we'd need 5 coords in any case. v2: fix bugs (texel fetch using wrong layer coord for 1d, shadow tex using wrong shadow coord for 2d...). Plus need to project the shadow coord, and just for fun keep projecting the layer coord too. Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-15 00:40:14 +02:00
Roland Scheidegger	d4b43cedb6	gallivm: change coordinate handling throughout functions Instead of passing s,t,r coordinates pass a coord array - the reason is that I need to pass more coords (in particular for shadow "coord", future will also need another one for cube map arrays) so just pass them as an array. Also, to simplify things, use fixed location for the shadow reference value I want to get rid of the silly "where is the right coord value" game. Keep old-style however for aos sampling (which is not going to need shadow coord, though for cube map arrays it still would need fixing). (Next patch will pass those through using the new arrangement directly from sampler interface.) v2: fix up soa split path (unreachable currently but still...) Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-15 00:40:14 +02:00
Roland Scheidegger	c6c55ad3e9	gallivm: fix border color with normalized texture formats We need to put border color into texture format color space which essentially means clamping for non-float, normalized formats (not entirely sure if we're also meant to quantize the float but it's probably ok not to do it thankfully). For OpenGL we could do this easily outside generated code due to the 1:1 sampler/texture correspondence but not for d3d10 which is terrible (as we recalculate a constant over and over again per shader invocation). Fortunately border color should be rare enough that we don't care THAT much. Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-15 00:40:14 +02:00
Zack Rusin	27cedd8aec	llvmpipe: fix pipeline statistics with a null ps If the fragment shader is null then pixel shader invocations have to be equal to zero. And if we're running a null ps then clipper invocations and primitives should be equal to zero but only if both stencil and depth testing are disabled. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-08-14 18:23:36 -04:00
Zack Rusin	a3ae5dc7dd	draw: make sure that the stages setup outputs Calling the prepare outputs cleans up the slot assignments for outputs, unfortunately aapoint and aaline didn't have code to reset their slots after the initial setup, this was messing up our slot assignments. The unfilled stage was just missing the initial assignment of the face slot. This fixes all of the reported piglit failures. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-08-14 18:23:35 -04:00
Paul Berry	98d2498404	glsl: Fix incorrect pattern matching in ir_set_program_inouts In commit `8fc41df` (glsl: Modify ir_set_program_inouts to handle geometry shaders), when attempting to pattern match the "foo" part of expressions such as: foo[i][j] foo[i] I incorrectly called as_dereference_variable() on the subexpression foo[i] instead of foo. As a result, the pattern never matched, so ir_set_program_inouts would fall back on marking the entire variable as used, rather than just the portion indexed by the array. This didn't result in incorrect behaviour, but it could have resulted in inefficiency by causing the back-end to allocate resources for unused parts of an input or output array. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-14 10:53:47 -07:00
Rico Schüller	d1ba1055d9	vl: Add support for max level query v2 This patch adds the level query support to the video decoders and uses some more reasonable defaults. v2: (ck) add commit message Reviewed-by: Christian König <christian.koenig@amd.com>	2013-08-14 13:20:01 +02:00
Ian Romanick	830f4df993	glsl: Emit better warnings for things that look like default precision statements Previously we would emit a warning for empty declarations like float; We would also emit the same warning for things like highp float; However, this second case is most likely the application trying to set the default precision. This makes the compiler generate a stronger warning with some suggestion of a fix. It really seems like this should be an error. I'll bet that 100% of the time someone writes 'highp float;' they actually meant 'precision highp float;'. Alas, both AMD and NVIDIA accept this syntax, and the spec doesn't explicitly forbid it. This makes piglit's precision-05.vert generate the following warnings: 0:12(11): warning: empty declaration with precision qualifier, to set the default precision, use `precision lowp float;' 0:13(12): warning: empty declaration with precision qualifier, to set the default precision, use `precision mediump int;' v2: Add { } around a one-line if body and fix a comment. Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-13 20:47:20 -07:00
Paul Berry	825f9ff5d3	glsl/ast: Don't perform GS input array checks on non-inputs. Previously, we were accidentally calling handle_geometry_shader_input_decl() on non-input interface block declarations, resulting in bogus error checking. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-13 20:02:55 -07:00
Paul Berry	91c8fea924	glsl/ast: Fix assertion failure when GS input declared as non-array. Previously, if a geometry shader input was declared as a non-array, we would flag the proper compiler error, but then before we got a chance to report it to the client, handle_geometry_shader_input_decl() would assertion fail. With this patch, handle_geometry_shader_input_decl() ignores non-arrays. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-13 20:02:54 -07:00
Paul Berry	336351e971	glsl/ast: Check that geometry shader interface block inputs are arrays. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-13 20:02:54 -07:00
Paul Berry	3b837e637e	i965/gen7+: Fix build error introduced by renaming upload_3dstate_so_decl_list. Commit `9f9ccf707c` renamed upload_3dstate_so_decl_list to gen7_upload_3dstate_so_decl_list but forgot to update the caller.	2013-08-13 19:36:27 -07:00
Jon Severinsson	9298f537a7	radeon/llvm: Add missing "%s" format string to fprintf. This fixes a compilation warning with -Wformat-security. CC: "9.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-08-13 19:18:14 -07:00
Chad Versace	11b8f8e7e4	i965: Move arrays brw_multisample_positions* to new header Move the arrays to the new header brw_multisample_state.h, which will be shared with Broadwell code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-13 18:04:20 -07:00
Chad Versace	7eecda29c8	i965: Refactor names of sample_positions_8/4x arrays Place each array in the brw namespace by renaming it: sample_positions_4x -> brw_multisample_positions_4x sample_positions_8x -> brw_multisample_positions_8x This prepares for moving the arrays to a header shared by gen6 and gen8. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-13 18:03:59 -07:00
Kenneth Graunke	9f9ccf707c	i965/gen7+: Mark upload_3dstate_so_decl_list as non-static (v2) We will reuse this for Broadwell. v2: Prefix function name with 'gen7'. (chadv) Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-13 18:03:57 -07:00
Kenneth Graunke	f4e5c235de	i965: Mark a few brw_draw_upload.c functions as non-static We will reuse these for Broadwell. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-13 18:02:13 -07:00
Ian Romanick	1b35e33af4	glsl: Require function return type arrays be explicitly sized Fixes piglit array-function-return-unsized.vert. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-13 17:53:33 -07:00
Ian Romanick	42624b1c81	glsl: Move and refine test for unsized arrays in GLSL ES GLSL ES does not allow unsized arrays, and GLSL ES 1.00 does not allow array initializers. However, GLSL ES 3.00 allows array initializers, and the initializer can explicitly size the array. The specification even includes some examples of this: float x[] = float[2] (1.0, 2.0); // declares an array of size 2 float y[] = float[] (1.0, 2.0, 3.0); // declares an array of size 3 float a[5]; float b[] = a; Move the unsized array check to after the initializer has been processed. If the array is still unsized, generate the error. This should have no effect in GLSL ES 1.00 because, as previously mentioned, array initializers are not allowed. Fixes piglit "glsl-es-3.00 compiler array-sized-by-initializer.vert". Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>	2013-08-13 17:53:33 -07:00
Ian Romanick	d5aee174b8	glx: Generate GLXBadDrawable when drawable is zero Fixes piglit glx-query-drawable-GLXBadDrawable. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-13 17:53:33 -07:00
Ian Romanick	ef83bd2b95	mesa: Use _mesa_detach_renderbuffer when deleting a texture The functional change is that now invalidate_framebuffer is called if the texture is actually detached from one of the currently bound FBOs. Previously this was only done for renderbuffers. The remaining changes make the texture delete path look more similar to the renderbuffer delete path. This includes adding relevant spec quotations to justify the behavior. Fixes piglit fbo-incomplete "delete texture of bound FBO" test. v2: Move 'fb->Attachment[i].Texture == att' check from previous patch to this patch... where it was intended to be in the first place. Noticed by Chad. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-13 17:53:33 -07:00
Ian Romanick	438cc6bc49	mesa: Make detach_renderbuffer available outside fbobject.c Also add a return value indicating whether any work was done. This will be used by the next patch. v2: Move 'fb->Attachment[i].Texture == att' check to the next patch... where it was intended to be in the first place. Noticed by Chad. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-13 17:53:33 -07:00
Ian Romanick	341fb93c16	meta: Don't call _mesa_Ortho with width or height of 0 Fixes failures in oglconform fbo mipmap.manual.color, mipmap.manual.colorAndDepth, mipmap.automatic, and mipmap.manualIterateTexTargets subtests. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-13 17:53:33 -07:00
Vadim Girlin	17bb96b03d	r600g/sb: use MULADD workaround on R7xx for MULADD_IEEE Looks like the same issue that was seen with MULADD in trans slot on R7xx also affects MULADD_IEEE (maybe all OP3 instructions and MULADD is just a most frequently used?). So the workaround is to not allow affected instructions to be placed into the trans slot. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=67927 Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-14 01:03:18 +04:00
Roland Scheidegger	6991f86945	gallivm: implement new float comparison instructions returning integer masks FSEQ/FSGE/FSLT/FSNE work just the same as SEQ/SGE/SLT/SNE except skip the select. And just for consistency use the same appropriate ordered/unordered comparisons for the old opcodes as well. Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-13 19:09:17 +02:00
Roland Scheidegger	0930082ffd	tgsi: implement new float comparison instructions returning integer masks Also while here add a bunch of other forgotten (integer) instructions to tgsi_util_get_inst_usage_mask() (which isn't used for much except optimizing away unused input components), though it may still be incomplete. Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-13 19:09:17 +02:00
Roland Scheidegger	e7a5bf7a34	gallium: add new float comparison instructions returning integer masks Newer graphic languages don't want messy float mask results but instead true "boolean" mask results for float comparisons. Otherwise just need to convert the floats back to integers. Need to keep the old opcodes however due to both legacy (gl and d3d9) needing them and because older hw can't really deal with integers. These new FSEQ/FSGE/FSLT/FSNE opcodes are part of integer API and hence must be supported if a driver claims to support glsl 1.30 (or PIPE_SHADER_CAP_INTEGERS). Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-13 19:09:17 +02:00
Chia-I Wu	3b6cee1634	ilo: enable dumping of WM PCB It was disabled because it wasn't supported.	2013-08-13 16:28:24 +08:00
Chia-I Wu	0f8a86682f	ilo: no binding table change when constants are pushed When constants can be pushed, and nothing else requires new SURFACE_STATEs, there is no need to emit BINDING_TABLE_STATE.	2013-08-13 16:26:03 +08:00
Chia-I Wu	c6e1e0157b	ilo: support push constant model in shaders Source constants from URB constant data when the constant data can fit in the PCB.	2013-08-13 16:04:35 +08:00
Chia-I Wu	5e30ffbda6	ilo: support copying constant buffer 0 to PCB Add ILO_KERNEL_PCB_CBUF0_SIZE so that a kernel can specify how many bytes of constant buffer 0 need to be copied to PCB.	2013-08-13 15:52:41 +08:00
Chia-I Wu	5df62dce34	ilo: make constant buffer 0 upload optional Add ILO_KERNEL_SKIP_CBUF0_UPLOAD so that we can skip constant buffer 0 upload when the kernel does not need it.	2013-08-13 15:52:37 +08:00
Chia-I Wu	8b5b5fe394	Revert "ilo: initialize constant buffer SURFACE_STATE early" This reverts commit `a9b800aa81`. With push constant support, the constructed SURFACE_STATE is unused and wasted. The change only slows things down.	2013-08-13 15:24:58 +08:00
Armin K	f423eba46e	gbm: Link to libwayland-drm if Wayland EGL platform is enabled We were relying on libEGL to pull in libwayland-client symbols, but commit `2c2e64edab` cleaned up the symbol leak. https://bugs.freedesktop.org/show_bug.cgi?id=67962	2013-08-12 15:16:22 -07:00
Roland Scheidegger	cd2f26090a	gallivm: fix exec_mask interaction with geometry shader after end of main Because we must maintain an exec_mask even if there's currently nothing on the mask stack, we can still have an exec_mask at the end of the program. Effectively, this mask should be set back to default when returning from main. Without relying on END/RET opcode (I think it's valid to have neither) it is actually difficult to do this, as there doesn't seem any reasonable place to do it, so instead let's just say the exec_mask is invalid outside main (which it really is effectively). The problem is that geometry shader called end_primitive outside the shader (in the epilogue), and as a result used a bogus mask, leading to bugs if we had to set the (somewhat misnamed) ret_in_main bit anywhere. So just avoid the mask combining function when called from outside the shader. Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-12 23:33:00 +02:00
Roland Scheidegger	dfa7b72563	draw: simplify prim mask construction The code was quite weird, the second comparison was in fact a complete no-op and we can also do the comparison with the vector directly instead of scalar, which should not only be faster but also makes it way more obvious what that mask is actually going to look like. Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-12 23:33:00 +02:00
Roland Scheidegger	7147094ff2	gallivm: simplify geometry shader mask handling a bit Instead of reducing masks to 0/1 simply use the mask directly as -1. Also use some signed comparison instead of unsigned (as far as I understand these values have to be (very) small and signed means llvm doesn't have to apply additional logic to do the unsigned comparisons the cpu can't do). Saves a couple of instructions in some test geometry shader here. v2: that was a bit too much optimization, don't skip combining the masks... Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-12 23:33:00 +02:00
Roland Scheidegger	84fce45321	draw: (trivial) dump tgsi for geometry shaders with GALLIVM_DEBUG_TGSI And dump the variant key too (same as vs does). Just so I can stop wondering why I see the tgsi dump for fs and vs but not gs...	2013-08-12 23:33:00 +02:00
Roland Scheidegger	8c5283dc17	gallivm: (trivial) fix typo in argument declaration of lp_build_size_query_soa Was meant to match the name used elsewhere, spotted by Anthony.	2013-08-12 23:33:00 +02:00
Kenneth Graunke	4d95efd146	i965/fs: Add dump_instruction() support for ARF destinations. CMP instructions use BRW_ARF_NULL as a destination. Prior to this patch, dump_instruction() decoded the destination as "???". Now it decodes BRW_ARF_NULL as "(null)" and other ARFs numerically. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-12 13:13:06 -07:00
Kenneth Graunke	ee7bfab068	i965/fs: Remove extraneous newline in dump_instruction() for CMP. This resulted in printouts like: 246: cmp.cmod.f0.0 ???, vgrf152, 0.000000f, (null), With this patch, CMP is properly printed on one line. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-12 13:13:04 -07:00
Kenneth Graunke	80e1c2f35f	i965/fs: Optimize IF/MOV/ELSE/MOV/ENDIF to SEL when possible. Many GLSL shaders contain code of the form: x = condition ? foo : bar The compiler emits an ir_if tree for this, since each subexpression might be a complex tree that could have side-effects and short-circuit logic operations. However, the common case is to simply pick one of two constants or variable's values---which is exactly what SEL is for. Replacing IF/ELSE with SEL also simplifies the control flow graph, making optimization passes which work on basic blocks more effective. The shader-db statistics: total instructions in shared programs: 1655247 -> 1503234 (-9.18%) instructions in affected programs: 949188 -> 797175 (-16.02%) 2,970 shaders were helped, none hurt. Gained 181 SIMD16 programs. This helps Valve's Source Engine games (max -41.33%), The Cave (max -33.33%), Serious Sam 3 (max -18.64%), Yo Frankie! (max -30.19%), Zen Bound (max -22.22%), GStreamer (max -6.12%), and GLBenchmark 2.7 (max -1.94%). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-12 13:13:01 -07:00
Kenneth Graunke	2c32c3985c	i965/fs: Consider predicated SEL instructions as whole variable writes. The instruction (+f0.0) SEL dst, src0, src1 will write either src0 or src1 to dst, depending on the predicate. Unlike most predicated instructions, it always writes to dst. fs_inst::is_partial_write() is supposed to return true if the whole register is guaranteed to be written. The !inst->predicated check makes sense for most instructions, which might not write the whole register, but SEL is a special case. This caused live interval analysis to ignore the destination of predicated SEL instructions when computing "def" information. Requires the previous commit to avoid regressions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-12 13:12:59 -07:00
Kenneth Graunke	d21f542aa1	i965/fs: Explicitly disallow CSE on predicated instructions. The existing inst->is_partial_write() already disallows predicated instructions, so this has no functional change. However, it's worth doing explicitly since the CSE pass does not consider the flag register. This means it could blindly factor out operations that use the same sources, but which have different condition codes set. This prevents a regression in the next commit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-12 13:12:57 -07:00
Kenneth Graunke	53d8cff63b	i965/fs: Log a performance warning if skipping 16-wide due to pulls. Usually, the driver creates both 8-wide and 16-wide variants of every fragment shader. When 16-wide compilation fails, it logs a performance warning explaining why only an 8-wide program exists. However, when there are pull parameters, the driver won't even bother trying the 16-wide compile (since it would fail). In this case, it failed to emit a performance warning, leaving no explanation for the missing 16-wide program. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-12 13:12:47 -07:00
Chia-I Wu	a9b800aa81	ilo: initialize constant buffer SURFACE_STATE early Fix ilo_gpe_init_view_surface_for_buffer to allow buffer to be NULL, and add ilo_gpe_set_view_surface_bo to set it later. This allows us to set up SURFACE_STATE early for constant buffers backed by user buffers.	2013-08-12 11:49:51 +08:00
Chia-I Wu	b2f79a3823	ilo: 3DSTATE_INDEX_BUFFER may be wrongly skipped In finalize_index_buffer(), when the current index buffer was destroyed due to u_upload_data(), it may happen that the new index buffer is at the same address as the old one. Comparing the pointers to the two buffers could fail to work, and 3DSTATE_INDEX_BUFFER would be incorrectly skipped. Holding a reference to the current index buffer before calling u_upload_data() should fix the problem.	2013-08-10 13:01:41 +08:00
Chris Forbes	637e6a0aa8	i965: add missing BRW_NEW_INTERPOLATION_MAP to state dump Makes this flag appear in the output for INTEL_DEBUG=state Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-10 20:29:12 +12:00
Chris Forbes	e114b13dae	i965: Add a new debug mode for the VUE map INTEL_DEBUG=vue now emits a listing of each slot in the VUE map, and the corresponding interpolation mode. V2: Fix whitespace issues. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-10 20:28:45 +12:00
Ian Romanick	5894898148	glsl: Don't allow const on out or inout function parameters Fixes piglit tests const-inout-parameter.frag and const-out-parameter.frag. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-09 13:51:18 -07:00
Roland Scheidegger	894d4903e7	gallivm: set non-existing values really to zero in size queries for d3d10 My previous attempt at doing so double-failed miserably (minification of zero still gives one, and even if it would not the value was never written anyway). While here also rename the confusingly named int_vec bld as we have int vecs of different sizes, and rename need_nr_mips (as this also changes out-of-bounds behavior) to is_sviewinfo too. Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-09 20:49:19 +02:00
Roland Scheidegger	b0f74250e1	gallivm: use texture target from shader instead of static state for size query d3d10 has no notion of distinct array resources neither at the resource nor sampler view level. However, shader dcl of resources certainly has, and d3d10 expects resinfo to return the values according to that - in particular a resource might have been a 1d texture with some array layers, then the sampler view might have only used 1 layer so it can be accessed both as 1d or 1d array texture (I think - the former definitely works). resinfo of a resource declared as array needs to return number of array layers but non-array resource needs to return 0 (and not 1). Hence fix this by passing the target from the shader decl to emit_size_query and use that (in case of OpenGL the target will come from the instruction itself). Could probably do the same for actual sampling, though it may not matter there (as the bogus components will essentially get clamped away), possibly could wreak havoc though if it REALLY doesn't match (which is of course an error but still). Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-09 20:49:18 +02:00
Roland Scheidegger	38ad404f76	gallivm: honor d3d10's wishes of out-of-bounds behavior for texture size query Specifically, must return 0 for non-existent mip levels (and non-existent textures which is an unsolved problem) for everything but total mip count. Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-09 20:49:18 +02:00
Paul Berry	417dc8081b	glsl: Enable ARB_fragment_coord_conventions functionality in GLSL 1.50. GLSL 1.50 incorporates the functionality of the ARB_fragment_coord_conventions extension, so we need to make this functionality available even if the extension isn't enabled. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-09 10:35:06 -07:00
Paul Berry	13fedf2883	main: Fix deprecation of glLineWidth() From section E.1 (Profiles and Deprecated Features of OpenGL 3.0) of the OpenGL 3.0 spec: "LineWidth is not deprecated, but values greater than 1.0 will generate an INVALID VALUE error" From context it is clear that values greater than 1.0 should only generate an INVALID VALUE error in a forward-compatible context. The code was correctly quoting this spec text, but it was disallowing all line widths in forward-compatible contexts, instead of just widths greater than 1.0. This patch introduces the correct check, so that setting a line width of 1.0 or less is permitted. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-09 10:34:05 -07:00
Roland Scheidegger	836098f6b2	util: (trivial) fix asm input/output list for fxsave Otherwise gcc might do very unsafe optimizations, spotted by Uros Bizjak. Hopefully this time it's finally right?	2013-08-09 17:30:13 +02:00
Alex Deucher	c88783047e	r600g: disable GPUVM by default Cayman and trinity systems still seem to suffer from stability problems with GPUVM. This also fixes compute on these asics. It can still be enabled for testing by setting env var RADEON_VA=true. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=65958 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: "9.2" <mesa-stable@lists.freedesktop.org> CC: "9.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Christian König <christian.koenig@amd.com>	2013-08-09 10:51:25 -04:00
Zack Rusin	e8d8974f80	softpipe: fix the regressions softpipe has a really weird handling of the draw attrs, let's just not inject outputs in its data. Trivial.	2013-08-08 20:54:50 -04:00
Zack Rusin	662a4d4a12	draw: rewrite primitive assembler We can't be injecting the primitive id's in the pipeline because by that time the primitives have already been decomposed. To properly number the primitives we need to handle the adjacency primitives by hand. This patch moves the prim id injection into the original primitive assembler and completely removes the useless pipeline stage. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-08-08 20:54:25 -04:00
Zack Rusin	1d425c4c6d	draw: reset the vertex id when injecting new primitive id Without resetting the vertex id, with primitives where the same vertex is used with different primitives (e.g. tri/lines strips) our vbuf module won't re-emit those vertices with the changed primitive id. So let's reset the vertex id whenever injecting new primitive id to make sure that the vertex data is correctly emitted. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-08-08 20:54:03 -04:00
Zack Rusin	57cd326778	draw: cleanup the extra attribs Before inserting new front face and prim id outputs cleanup the old extra outputs, otherwise our cache will use previous output slots which will break as soon as outputs of the current shader don't match the last. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-08-08 20:53:40 -04:00
Dieter Nützel	8f40fa0e7f	util: (trivial) fix more compile errors in u_cpu_detect (gcc/x86 this time). Oops. Should fix https://bugs.freedesktop.org/show_bug.cgi?id=67921	2013-08-09 01:25:54 +02:00
Chad Versace	2c2e64edab	egl: Do not export private symbols libEGL was incorrectly exporting all symbols, public and private. This patch adds -fvisibility=hidden to libEGL's linker flags to ensure that only symbols annotated with __attribute__((visibility("default"))) get exported. Sanity-checked with libEGL's builtin DRI2 driver and the i965 DRI driver by running Piglit on X/EGL and by running weston-gears on Weston as an X client. Sanity-checked with libEGL's Gallium driver (which is not built-in) and the swrast Gallium driver by running es2gears_x11. Kristian reviewed the symbol diff in `nm libEGL.so`. CC: "9.2" <mesa-stable@lists.freedesktop.org> CC: Ian Romanick <idr@freedesktop.org> Acked-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-08 15:17:51 -07:00
Kenneth Graunke	fb3d62fe3d	i965: Remember to call intel_prepare_render() before blitting. Otherwise, blits to the window system buffer may cause crashes, since dst_irb->mt may be NULL. This code is lifted straight out of brw_blorp_framebuffer()'s try_blorp_blit() helper. Fixes crashes in Piglit's fbo-sys-blit on systems without BLORP. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65919 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <idr@freedesktop.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Cc: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-08 12:12:47 -07:00
Roland Scheidegger	43076a55c2	util: (trivial) fix compile error with MSVC on x86	2013-08-08 19:08:57 +02:00
Roland Scheidegger	6ce54a81b2	gallivm: honor d3d10 floating point rules for shadow comparisons d3d10 specifies ordered comparisons for everything but not_equal which is unordered (http://msdn.microsoft.com/en-us/library/windows/desktop/cc308050.aspx). OpenGL probably doesn't care. Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-08 18:55:58 +02:00
Roland Scheidegger	aa84f1ad55	softpipe: don't clamp reference value for shadow comparison for float formats Clamping is only done for fixed-point formats as part of conversion to texture format. Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-08 18:55:57 +02:00
Roland Scheidegger	e1590b9690	gallivm: don't clamp reference value for shadow comparison for float formats This is wrong both for OpenGL and d3d. (In fact clamping is a side effect of converting to depth format, so this should really do quantization too at least in d3d10 for the comparisons to be truly correct.) Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-08 18:55:57 +02:00
Roland Scheidegger	eac57bc223	gallivm: propagate scalar_lod to emit_size_query too Clearly the returned values need to be per-element if the lod is per element. Does not actually change behavior yet. Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-08 18:55:57 +02:00
Roland Scheidegger	c8572a9457	gallium: clarify SVIEWINFO opcode This opcode is quite problematic in tgsi, while it tries to mirror d3d10 resinfo it can't really do what's stated there due to missing the crazy return type modifiers. Hence specify this is ignored along with the swizzle. (Other options would be to have multiple opcodes or specify the ret type modifier maybe in dst_reg as there's padding bits left there but it is the only instruction allowing this.) Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-08 18:55:57 +02:00
Roland Scheidegger	ce0e66af0a	gallivm: fix out-of-bounds behavior for fetch/ld For d3d10 and ARB_robust_buffer_access_behavior, we are required to return 0 for out-of-bounds coordinates (for which we can just enable the code that was already there but disabled). Additionally, also need to return 0 for out-of-bounds mip level and out-of-bounds layer. This changes the logic so instead of clamping the level/layer, an out-of-bound mask is computed instead in this case (actual clamping then can be omitted just like with coordinates, since we set the fetch offset to zero if that happens anyway). Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-08 18:55:57 +02:00
Roland Scheidegger	883987503f	util: try much harder to set DAZ flag While so far this only causes some harmless test failures, there's lots more cpus with DAZ. All 64bit capable ones can do it (particularly relevant for AMD cpus as they supported sse3 very very late) but if really necessary we can check support for that for real with some more magic. (In fact just about ANY cpu with sse2 can support DAZ, I believe the only exception are first gen P4 (Willamette) and from those only early steppings which can't do it it's almost like intel forgot to add it... - a real pity though docs say you can't just try to set it as they will throw a GPF.) While this was meant to address https://bugs.freedesktop.org/show_bug.cgi?id=67672 it does not fix it. Most likely the tests need fixing as I don't think there's any guarantee about denorm handling in the reference math library functions if the flags aren't set to standard values. Nevertheless enabling DAZ on all cpus which can do it should be the right thing to do. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-08 18:55:57 +02:00
Roland Scheidegger	e3b5e2db1b	util: implement table-based + linear interpolation linear-to-srgb conversion Should be much faster, seems to work in softpipe. While here (also it's now disabled) fix up the pow factor - the former value is what is in GL core it is however not actually accurate to fp32 standard (as it is 1.0/2.4), and if someone would do all the accurate math there's no reason to waste 8 mantissa bits or so... v2: use real table generating function instead of just printing the values (might take a bit longer as it does calculations on some 3+ million floats but much more descriptive obviously). Also fix up another inaccurate pow factor (this time in the python code) - wondering where the couple one bit errors came from :-(. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-08-08 18:55:57 +02:00
Roland Scheidegger	2d9fea95e8	gallivm: fix comment wrt srgb accuracy. I think it's actually not good enough now...	2013-08-08 18:55:57 +02:00
Chia-I Wu	f9a4288bd2	ilo: get rid of GPE tables completely Move the estimate functions out of the tables and kill the tables.	2013-08-08 13:46:01 +08:00
Chia-I Wu	19204081ce	ilo: clean up GPE header inclusions Besides the regular cleanups, this reduces the number of source files that need to be recompiled when GPE functions are changed.	2013-08-08 13:41:10 +08:00
Chia-I Wu	e292b9362a	ilo: initialize alpha test state in ilo_gpe_init_dsa This could speed up BLEND_STATE and COLOR_CALC_STATE emission a bit.	2013-08-08 13:30:34 +08:00
Chia-I Wu	02496cd2b6	ilo: fold gen6_translate_index_size into the caller There is only one caller so fold it.	2013-08-08 13:10:36 +08:00
Chia-I Wu	1c19d0bb81	ilo: fold gen6_translate_depth_format into the caller There is only one caller so fold it.	2013-08-08 13:02:17 +08:00
Courtney Goeltzenleuchter	c2c5366ff2	ilo: Call GPE emit functions directly. Eliminate pipeline and GPE function vectors and have the pipeline functions call the GPE emit functions directly.	2013-08-08 11:39:21 +08:00
Courtney Goeltzenleuchter	4bc9daf923	ilo: move emit functions so that they can be inlined.	2013-08-08 11:39:21 +08:00
Tom Stellard	d0c13fba17	r300g/compiler/tests: Pass the required LDFLAGS when building the test program CC: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-07 17:28:19 -07:00
Tom Stellard	d691ba4d94	r300g/compiler/tests: Fix segfault CC: "9.2" <mesa-stable@lists.freedesktop.org>	2013-08-07 17:27:23 -07:00
Kristian Høgsberg	5575fdaccf	gallium-egl: Commit the rest of the native_wayland_drm_bufmgr_helper v2 patch I missed Ander's v2 on the list which fixed non-wayland compilation: http://lists.freedesktop.org/archives/mesa-dev/2013-July/042062.html Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2013-08-07 11:23:47 -07:00
Ander Conselvan de Oliveira	8d29b5271a	egl: Update to Wayland 1.2 server API Since Wayland 1.2, struct wl_buffer and a few functions are deprecated. References to wl_buffer are replaced with wl_resource and some getter functions and calls to deprecated functions are replaced with the proper new API. The latter changes are related to resource versioning. Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>	2013-08-07 10:37:58 -07:00
Ander Conselvan de Oliveira	602351dd58	gallium-egl: Don't add a listener for wl_drm twice in wayland platform A listener is added just after the interface is bound, in registry_handle_global(). Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>	2013-08-07 10:37:58 -07:00
Ander Conselvan de Oliveira	331a8fa41d	gallium-egl: Simplify native_wayland_drm_bufmgr_helper interface The helper provides a series of functions to ease the implementation of the WL_bind_wayland_display extension on different platforms. But even with the helpers there was still a bit of duplicated code between platforms, with the drm authentication being the only part that differs. This patch changes the bufmgr interface to provide a self contained object with a create function that takes a drm authentication callback as an argument. That way all the helper functions are made static and the "_helper" suffix was removed from the source file name. This change also removes the mix of Wayland client and server code in the wayland drm platform source file. All the uses of libwayland-server are now contained in native_wayland_drm_bufmgr.c. Changes to the drm platform are only compile tested. Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>	2013-08-07 10:37:58 -07:00
Chia-I Wu	79b868fea1	ilo: speed up 3DSTATE_VERTEX_BUFFERS emission a bit Ignore vbuffer_mask which does not gain us anything.	2013-08-07 23:13:50 +08:00
Chia-I Wu	7ce3cbaacf	ilo: skip state emission when reducing sampler count When the number of sampler states bound is reduced, we are good to keep referencing the old SAMPLER_STATE array and skip emitting a new one.	2013-08-07 23:13:44 +08:00
Chia-I Wu	2811dba1d0	ilo: simplify setting of shader samplers and views Remove the special path that unbinds all samplers/views not in the range. Just make another call to unbind them.	2013-08-07 18:10:32 +08:00
Chia-I Wu	186dab5b8f	ilo: correctly check for stencil ref change I intended to do a memcmp(), not a memcpy()...	2013-08-07 18:00:46 +08:00
Zack Rusin	12522041d6	draw: fix slot detection Nowadays -1 for slots means that the semantic is not present, so we need to store it in a signed variable, otherwise <0 comparisons are pointless. Fixes http://bugzilla.eng.vmware.com/show_bug.cgi?id=67811 (at least with softpipe, edgeflags don't work with llvmpipe) Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-08-06 20:23:57 -04:00
Laurent Carlier	2572e3b4a1	gallivm: Fix build - Remove TargetOptions.RealignStack for llvm>=3.4 Since llvm -3.4svn r187618, TargetOptions doesn't provide RealignStack, so only enable it with llvm<3.4 This option must now be specified using function attributes, see LLVM commit r187618 Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-08-06 15:31:48 -07:00
Kenneth Graunke	0f7a15a247	i965: Add #defines for the MI_LOAD_REGISTER_MEM command. This command reads a value from memory and writes it to a register (the opposite of MI_STORE_REGISTER_MEM). It's only available on Gen7+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-06 14:41:37 -07:00
Kenneth Graunke	c047ad000b	i965: Initialize the intel_context::bufmgr pointer earlier. This prevents a crash in a future patch. _mesa_initialize_context() creates a default transform feedback object by calling the NewTransformFeedbackObject() driver hook. Eventually, we'll want to subclass that and allocate a buffer object. This means passing brw->bufmgr to drm_intel_alloc_bo(), and crashing if it isn't initialized yet. The buffer manager is actually already initialized; we just hadn't copied the pointer from intel_screen to intel_context quite early enough. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-06 14:41:37 -07:00
Kenneth Graunke	263ebe1a71	i965: Tidy preprocessor macros for SO_PRIM_STORAGE_NEEDED registers. Gen7+ supports four transform feedback streams. Using a function-like macro makes it easy to access them by stream number or loop over them. "GEN7_" prefixes are more common than "_IVB" suffixes, so use that. Gen6 only supports a single stream, so the single #define should be fine. However, SO_NUM_PRIM_STORAGE_NEEDED was a poor name. For one, the word "NUM" doesn't appear in the actual name of the register. It's also confusingly generic, as it doesn't exist on Gen7+. Add a "GEN6_" prefix for clarity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-06 14:41:37 -07:00
Kenneth Graunke	8c27f13cd9	i965: Tidy preprocessor macros for SO_NUM_PRIMS_WRITTEN registers. Gen7+ supports four transform feedback streams. Using a function-like macro makes it easy to access them by stream number or loop over them. "GEN7_" prefixes are more common than "_IVB" suffixes, so we use that. Gen6 only supports a single stream, so the single #define should be fine. However, SO_NUM_PRIMS_WRITTEN was confusingly generic, as it doesn't exist on Gen7+. Add a "GEN6_" prefix for clarity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-06 14:41:37 -07:00
Christoph Bumiller	2daf974cfe	nvc0: don't access array out of bounds on unexpected sample count	2013-08-06 22:29:33 +02:00
Emil Velikov	07c8f7a6f8	nv50: handle pure integer vertex attributes And as a side effect fix a crash in the following piglit test: general/attribs GL3 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Cc: "9.2 and 9.1" mesa-stable@lists.freedesktop.org	2013-08-06 22:25:26 +02:00
Samuel Pitoiset	31caddb8d9	nvc0: implement MP performance counters for nvc0:nvd9	2013-08-06 22:24:30 +02:00
Samuel Pitoiset	9dcd7888e6	nvc0: implement compute support for nvc0 Tested on nvc0, nvc1, nvcf and nvd9.	2013-08-06 22:22:49 +02:00
Samuel Pitoiset	981b589101	nvc0: add more MP counters for nve4	2013-08-06 22:22:34 +02:00
Ian Romanick	2f9fe2d80a	mesa: Generate a renderbuffer wrapper even if the texture has no image This prevents a segfault in check_begin_texture_render when an FBO is rebound while in this state. This fixes the piglit test fbo-incomplete-invalid-texture. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org	2013-08-06 12:18:50 -07:00
Ian Romanick	25281fef0f	mesa: Validate the layer selection of an array texture too Previously only the slice of a 3D texture was validated in the FBO completeness check. This fixes the failure in the 'invalid layer of an array texture' subtest of piglit's fbo-incomplete test. v2: 1D_ARRAY textures have Depth == 1. Instead, compare against Height. v3: Handle CUBE_MAP_ARRAY textures too. Noticed by Marek. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org	2013-08-06 12:18:46 -07:00
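The layer bounds check that commits 25281fef0f and 41485fea7c describe can be sketched roughly as follows. This is a minimal illustration, not Mesa's code: the function names and the string target names are hypothetical, and it assumes that every target other than 1D_ARRAY keeps its layer or slice count in Depth.

```python
def layer_limit(target, height, depth):
    """Return the number of selectable layers/slices (hypothetical helper)."""
    if target == "1D_ARRAY":
        # 1D array textures have Depth == 1; the layer count lives in Height.
        return height
    # Assumption: 3D, 2D_ARRAY and CUBE_MAP_ARRAY keep the count in Depth.
    return depth


def zoffset_is_valid(target, height, depth, zoffset):
    """True if zoffset selects an existing layer or slice."""
    # Strict '<' avoids the off-by-one of accepting zoffset == limit.
    return 0 <= zoffset < layer_limit(target, height, depth)
```

With this shape, an attachment that fails the check would be reported as incomplete before any driver hook is called.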
Ian Romanick	41485fea7c	mesa: Don't call driver RenderTexture for invalid zoffset This fixes the segfault in the 'invalid slice of 3D texture' and 'invalid layer of an array texture' subtests of piglit's fbo-incomplete test. The 'invalid layer of an array texture' subtest still fails. v2: Fix off-by-one comparison error noticed by Chris Forbes. Also, 1D_ARRAY textures have Depth == 1. Instead, compare against Height. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1] Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org	2013-08-06 12:18:42 -07:00
Ian Romanick	fb49713f8e	mesa: Don't call driver RenderTexture for really broken textures This fixes the segfault in the '0x0 texture' subtest of piglit's fbo-incomplete test. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org	2013-08-06 12:18:39 -07:00
Ian Romanick	0c3dbd689b	mesa: Remove stray debug printfs in attachment completeness code Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "9.1 9.2" mesa-stable@lists.freedesktop.org	2013-08-06 12:18:29 -07:00
Ian Romanick	4a9522a5a0	mesa: Treat glBindFramebuffer and glBindFramebufferEXT more correctly Allow user-generated names for glBindFramebufferEXT on desktop GL. Disallow its use altogether for core profiles. Names bound with glBindFramebuffer in desktop OpenGL are still (incorrectly) shared across the share group instead of being per-context. This gets us a bit closer to being strictly conformant. v2: Disallow glBindFramebufferEXT in 3.1 by not installing it in the dispatch table. Suggested by Jordan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1] Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> [v1] Cc: mesa-stable@lists.freedesktop.org	2013-08-06 10:46:05 -07:00
Ian Romanick	97965e87fc	mesa: Treat glBindRenderbuffer and glBindRenderbufferEXT correctly Allow user-generated names for glBindRenderbufferEXT on desktop GL. Disallow its use altogether for core profiles. v2: Disallow glBindRenderbufferEXT in 3.1 by not installing it in the dispatch table. Suggested by Jordan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1] Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> [v1] Cc: mesa-stable@lists.freedesktop.org	2013-08-06 10:46:05 -07:00
Michel Dänzer	46b6f79fea	radeonsi: Number of SGPRs retrieved from LLVM already includes VCC Fixes spurious 'Assertion `num_sgprs <= 104' failed.' with shaders using all 104 SGPRs. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Christian König <christian.koenig@amd.com>	2013-08-06 12:50:01 +02:00
Kenneth Graunke	59f22148b3	i965: Don't allocate curbe buffers on Gen6+. These are only used on Gen4-5. Why waste the 8kB of space? Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-08-06 00:21:10 -07:00
Vinson Lee	b57c1e4b86	llvmpipe: Do not need to free anything if there is no geometry shader. If gs is null, then freeing state->shader.tokens would result in a null dereference. Fixes "Dereference after null check" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-08-05 21:54:20 -07:00
Vinson Lee	60b567ee59	nvc0: Initialize ptr for unexpected sample_count on release builds. Fixes "Uninitialized pointer read" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-08-05 21:53:39 -07:00
Vinson Lee	8e850f2feb	draw: Change slot from unsigned to int. unfilled_stage::face_slot is of type int. Fixes "Unsigned compared against 0" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-08-05 17:40:19 -07:00
Vinson Lee	8294d969e1	postprocess: Check ppq is null before calling pp_free_bos. pp_free_bos dereferences ppq without a null check. Fixes "Dereference before null check" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-08-05 17:27:38 -07:00
Zack Rusin	a9cb914f49	draw: add back separate input assembler the issue is that stream output is run before the pipeline, which means that unless we decompose the primitives before the so then things crash. we could convert the entire stream output code into a pipeline stage but it will take a bit, so for now fix the crashes by simply re-adding the old input assembler which is run before the SO. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-08-03 02:57:40 -04:00
Zack Rusin	c9c211fae1	draw: implement proper primitive assembler as a pipeline stage we used to have a face primitive assembler that we ran after if the gs was missing but we had adjacency primitives in the pipeline, let's convert it to a pipeline stage, which allows us to use it to inject outputs (primitive id) into the vertices. it's also a lot cleaner because the decomposition is already handled for us. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-08-03 00:38:58 -04:00
Zack Rusin	8a94d15fba	draw: fix front face injection Inject front face only if the fragment shader uses it and propagate through all channels because otherwise we'll need to figure out the exact swizzle that the fs expects and it's just simpler to make sure all the components within the front face register are correctly set. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-08-03 00:36:39 -04:00
Brian Paul	4c9f12d69c	tgsi: remove unneeded File == TGSI_FILE_INPUT test We're already in an "if (File == TGSI_FILE_INPUT)" block at that point.	2013-08-05 10:25:08 -06:00
Brian Paul	3e4b5c6c9c	tgsi: clean up tgsi_scan_shader() function Replace "fulldecl->Semantic.Name/Index" with semName/semIndex. Simplify if/else logic for TGSI_FILE_OUTPUT code. Remove old comment. Fix indentation. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-05 10:11:33 -06:00
Zack Rusin	95829e2029	llvmpipe: fix frontface behavior again Let's make sure the frontface is 1 for front and -1 for back. Discussed with Roland and Jose. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-08-02 22:21:29 -04:00
Vinson Lee	0794f638ee	r600g/sb: Dump correct value for CND. Fixes "Copy-paste error" reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-08-04 13:49:17 -07:00
Jordan Justen	83486d3148	intel_fbo: remove unused intel_renderbuffer hiz functions We are now using functions that operate on the renderbuffer attachment to handle layered rendering. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-04 11:52:38 -07:00
Jordan Justen	7b36137642	i965 clear/draw: set renderbuffer attachment as needing depth resolve Previously we would mark a renderbuffer as needing a depth resolve. But, to support layered rendering, we need to look at the attachment instead, since the attachment knows if layered rendering is being used. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-04 11:52:38 -07:00
Jordan Justen	d44be9ed2f	i965: add intel_renderbuffer_att_set_needs_depth_resolve This function is needed to support layered rendering. With layered rendering, the attachment stores the state of whether layered rendering is being used. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-04 11:52:38 -07:00
Jordan Justen	814a040504	i965: add intel_miptree_set_all_slices_need_depth_resolve This function marks all slices of a renderbuffer at a particular level as needing a depth resolve. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-04 11:52:38 -07:00
Jordan Justen	b05b81743c	i965 gen7: don't set FORCE_ZERO_RTAINDEX for layered rendering When layered rendering is being used, we should not set FORCE_ZERO_RTAINDEX in the clip state to allow render target array values other than zero to be used. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-04 11:52:38 -07:00
Jordan Justen	20799c11eb	hsw hiz: Remove x/y offset restriction for hiz This restriction was related to programming the offset fields of the depth buffer packet. We now set these offsets to 0, so this restriction should no longer be required. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-04 11:52:37 -07:00
Jordan Justen	bf25ee2840	gen7 depth surface: program 3DSTATE_DEPTH_BUFFER to top of surface Previously we would always find the 2D sub-surface of interest, and then program the surface to this location. Now we always program the 3DSTATE_DEPTH_BUFFER at the start of the surface. To select the lod/slice, we utilize the lod & minimum array element fields. As part of this change, we must revert `1f112ccf`: Revert "i965/gen7: Align all depth miplevels to 8 in the X direction." We also must disable brw_workaround_depthstencil_alignment for gen >= 7. Now the hardware will handle alignment when rendering to additional slices/LODs. v2: * Merge with recent MOCS changes Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-04 11:52:37 -07:00
Jordan Justen	f3c886be1f	gen7 fbo: make unmatched depth/stencil configs return unsupported For gen >= 7, we will use the lod/minimum-array-element fields to support layered rendering. This means that we must restrict the depth & stencil attachments to match in various more restrictive ways. (Now the width, height, depth, LOD and layer must match) The reason width, height, and depth must match is that the hardware has a single set of width, height, and depth settings (in 3DSTATE_DEPTH_BUFFER) that affect both the depth and stencil buffers. Since these controls determine the miptree layout, they need to be set correctly in order for lod and minimum-array-element to work properly. So the only way rendering can work is if the width, height, and depth match. In the future, if this restriction proves to be a problem (say because some crucial client application relies on rendering to different levels/layers of stencil and depth buffers), then we can always work around the restriction by copying depth and/or stencil data to a temporary buffer prior to rendering (much in the same way that brw_workaround_depthstencil_alignment() does today for gen < 7), but hopefully that won't be necessary. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-04 11:52:37 -07:00
Jordan Justen	65290a20f9	hsw hiz: Add new size restrictions for miplevels > 0 When performing hiz ops, we must ensure that the region sizes have an 8 aligned width and 4 aligned height. We can tweak the size for blorp hiz operations at LOD 0, but for the others we can't. Therefore, we disable hiz for these miplevels if they don't meet the size alignment requirements. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-04 11:52:37 -07:00
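The size rule in 65290a20f9 can be illustrated with a small sketch. The helper names are hypothetical, and the sketch assumes the usual GL minification rule (each miplevel is the base size shifted right by the level, clamped to 1); the driver's actual check may differ in detail.

```python
def minify(size, level):
    """Size of a miplevel under the usual GL rule (assumed here)."""
    return max(1, size >> level)


def hiz_usable(base_width, base_height, level):
    """True if HiZ can stay enabled for this miplevel (hypothetical helper)."""
    if level == 0:
        # The commit notes that the size can be tweaked for blorp HiZ
        # operations at LOD 0, so the base level is always accepted.
        return True
    width = minify(base_width, level)
    height = minify(base_height, level)
    # Other levels need an 8-aligned width and a 4-aligned height.
    return width % 8 == 0 and height % 4 == 0
```

For a 64x32 depth buffer this keeps HiZ for levels 0 through 3 (down to 8x4) and drops it from level 4 (4x2) onward.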
Jordan Justen	e3a49e1ad3	gen7 blorp depth: calculate base surface width/height This will be used in 3DSTATE_DEPTH_BUFFER in a later patch. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-04 11:52:37 -07:00
Jordan Justen	a23cfb8648	gen7 depth surface: calculate minimum array element being rendered In layered rendering this will be 0. Otherwise it will be the selected slice. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-04 11:52:37 -07:00
Jordan Justen	08ef1dde1b	gen7 depth surface: calculate LOD being rendered to This will be used in 3DSTATE_DEPTH_BUFFER in a later patch. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-04 11:52:37 -07:00
Jordan Justen	bc1acaa426	gen7 depth surface: calculate depth (array size) for depth surface This will be used in 3DSTATE_DEPTH_BUFFER in a later patch. Note: Cube maps are treated as 2D arrays with 6 times as many array elements as the cube map array would have. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-04 11:52:37 -07:00
Jordan Justen	171e633294	gen7 depth surface: calculate more specific surface type This will be used in 3DSTATE_DEPTH_BUFFER in a later patch. Note: Cube maps are treated as 2D arrays with 6 times as many array elements as the cube map array would have. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-04 11:52:37 -07:00
Jordan Justen	0e6be2e67b	i965: init global state first in brw_workaround_depthstencil_alignment In a future pass this will allow us to exit-early from this routine to disable it for gen >= 7. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-04 11:52:37 -07:00
Ilia Mirkin	8edb79f1ef	nv50: fix some h264 interlaced decoding on vp2 Some videos specify mb_adaptive_frame_field_flag instead of field_pic_flag. This implies that the pic height needs to be halved, and this field needs to be passed to the VP engine. Cc: "9.2" mesa-stable@lists.freedesktop.org Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-08-03 12:52:04 +02:00
Zack Rusin	bff0d87668	llvmpipe: don't interpolate front face or prim id The loop was iterating over all the fs inputs and setting them to perspective interpolation, then after the loop we were creating extra output slots with the correct interpolation. Instead of injecting bogus extra outputs, just set the interpolation on front face and prim id correctly when doing the initial scan of fs inputs. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-08-02 20:12:53 -04:00
Zack Rusin	8e77e5e543	draw: make sure clipping works with injected outputs clipping would drop the extra outputs because it always used the number of standard vertex shader outputs, without geometry shader or extra outputs. The commit makes sure that clipping with geometry shaders which have more outputs than the current vertex shader and with extra outputs correctly propagates the entire vertex. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-02 20:11:18 -04:00
Zack Rusin	d6b3a193d4	draw: inject frontface info into wireframe outputs Draw module can decompose primitives into wireframe models, which is a fancy word for 'lines', unfortunately that decomposition means that we weren't able to preserve the original front-face info which could be derived from the original primitives (lines don't have a 'face'). To fix it allow draw module to inject a fake face semantic into outputs from which the backends can figure out the original frontfacing info of the primitives. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-02 20:11:18 -04:00
Zack Rusin	05487ef88d	draw: stop crashing with extra shader outputs Draw sometimes injects extra shader outputs (aa points, lines or front face), unfortunately most of the pipeline and llvm code didn't handle them at all. It only worked if the number of inputs happened to be greater than or equal to the number of shader outputs plus the extra injected outputs. In particular when running the pipeline which depends on the vertex_id in the vertex_header things were completely broken. The patch adjusts the code to correctly use the total number of shader outputs (the standard ones plus the injected ones) to make it all stop crashing and work. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-08-02 20:11:18 -04:00
Zack Rusin	2e46a1dcb3	draw: use the vertex size Instead of using the magical 4 use the above computed vertex size. Doesn't change the behavior, just makes the code a bit cleaner. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-02 20:11:18 -04:00
Zack Rusin	da1a74f673	draw/llvm: add some extra debugging output when dumping shader outputs it's nice to have the integer values of the outputs, in particular because some values are integers. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-02 20:11:18 -04:00
Zack Rusin	36096af026	tgsi: detect prim id and front face usage in fs Adding code to detect the usage of prim id and front face semantics in fragment shaders. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-02 20:11:18 -04:00
Zack Rusin	2da1daaa4e	tgsi: add ucmp to the list of opcodes we forgot to add ucmp to the list of opcodes, so it was never generated for ureg. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-02 19:08:39 -04:00
Zack Rusin	2d15f4746b	llvmpipe: make the front-face behavior match the gallium spec The spec says that front-face is true if the value is >0 and false if it's <0. To make sure that we follow the spec, let's just subtract 0.5 from our value (llvmpipe did 1 for frontface and 0 otherwise), which will get us a positive num for frontface and negative for backface. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-02 15:50:16 -04:00
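The arithmetic in 2d15f4746b is small enough to show directly. The function names below are hypothetical; the sketch only mirrors the mapping the commit message states (1 for front, 0 for back, then subtract 0.5).

```python
def front_face_value(is_front):
    """Map the old 1/0 encoding to a signed value by subtracting 0.5."""
    raw = 1.0 if is_front else 0.0  # the previous llvmpipe encoding
    return raw - 0.5                # +0.5 for front, -0.5 for back


def is_front_facing(value):
    """Interpret a face value the way the commit describes the spec."""
    return value > 0
```

The later commit 95829e2029 moves to 1 for front and -1 for back, which satisfies the same sign test.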
Matt Turner	4f83956347	Makefile.am: Remove api_exec_es* from EXTRA_FILES. These files were removed in commits `a0102154` and `a8ab7e33`. Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-08-02 09:51:57 -07:00
Matt Turner	5854883312	mesa: Use MIN3 instead of two MIN2s.	2013-08-02 09:51:57 -07:00
Matt Turner	01bdad3173	mesa: Update comments to match newer specs. Old GL 1.x specs used 'b' but newer specs use 'p'. The line immediately above the second hunk also uses 'p'.	2013-08-02 09:51:57 -07:00
Kenneth Graunke	9375c16e72	i965: Initialize the maximum number of GS threads on Haswell. We'll need proper values for max_gs_threads when we eventually support geometry shaders. Also, we initialize it for every other platform. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-02 08:24:23 -07:00
Kenneth Graunke	a1ddbd1d7c	glsl: Disallow interpolation qualifiers on non-input/output variables. Commit `2548092ad8` switched the sense of interpolation qualifier checks in order to permit them on geometry shader in/out variables. In doing so, it accidentally allowed interpolation qualifiers to be applied to ordinary variables and function parameters. Fixes a regression in Piglit's local-smooth-01.frag. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-02 08:24:23 -07:00
Kenneth Graunke	7d2423a09e	glsl: Fix NULL pointer dereferences when linking fails. Commit `7cfefe6965` introduced a check for whether linked->Type equals GL_GEOMETRY_SHADER. However, linked may be NULL due to an earlier error condition. Since the entire function after the error path is (or should be) guarded by linked != NULL checks, we may as well just return early and remove the checks. Fixes crashes in 9 Piglit tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-08-02 08:24:23 -07:00
Andreas Boll	9d569fed8d	docs: Document UVD (2.2 and 3.0) video decoding support in mesa 9.2 Cc: "9.2" mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-08-02 17:14:08 +02:00
Andreas Boll	ec4a6a94b1	docs: Document that i965 Gen6+ requires Kernel 3.6 or later Cc: "9.2" mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-02 17:13:40 +02:00
Timothy Arceri	37f9e0e84f	docs: Update some out of date sourcetree information Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com> Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-08-02 16:22:03 +02:00
Christoph Bumiller	957a2014f9	r600g: honour semantic index in fragment color exports Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-08-02 13:32:49 +02:00
Andreas Boll	38903db439	docs: Add md5sums to 9.1.5 release notes	2013-08-02 09:58:34 +02:00
Andreas Boll	7eaaf62434	docs: Fix a typo in the 9.1.6 release notes	2013-08-02 09:47:43 +02:00
Topi Pohjolainen	f5947c2bc7	i965: enable image external sampling for imported dma-buffers Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-02 08:56:03 +03:00
Topi Pohjolainen	20de7f9f22	egl/dri2: support for creating images out of dma buffers v2: - upon success close the given file descriptors v3: - use specific entry for dma buffers instead of the basic for primes, and enable the extension based on the availability of the hook v4 (Chad): - use ARRAY_SIZE - improve the comment about the number of file descriptors - in case of invalid format report EGL_BAD_ATTRIBUTE instead of EGL_BAD_MATCH - take into account specific error set by the driver. v5: - fix error handling v6 (Chad): - fix invalid plane count checking v7 (Chad): - fix indentation and reset loop counter before checking for excess attributes Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-02 08:56:03 +03:00
Topi Pohjolainen	3a52cd351a	intel: restrict dma-buf-import images to external sampling only Memory originating outside mesa stack is meant to be for reading only. In addition, the restrictions imposed by the image external extension should apply. For example, users shouldn't be allowed to generate mip-trees based on these images. v2 (Chad): document using full extension names, fix the comment style itself and emit description of error Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-02 08:56:03 +03:00
Topi Pohjolainen	0de013b619	egl: definitions for EXT_image_dma_buf_import As specified in: http://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_image_dma_buf_import.txt Checking for the valid fourcc values is left for drivers avoiding dependency to drm header files here. v2: enforce EGL_NO_CONTEXT v3: declare the extension as EGL (not GLES) v4: do not update eglext.h manually but rely on update from Khronos instead v5: (Eric) report invalid context as EGL_BAD_PARAMETER instead of as EGL_BAD_CONTEXT v6: (Chad) fix the checking for valid hints. Before all values were rejected. v7: (Chad) comment style change from /** * Multi- * line into /* Multi- * line Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-02 08:56:03 +03:00
Topi Pohjolainen	674dedc87a	dri: propagate extra dma_buf import attributes to the drivers v2: do not break ABI, but instead introduce new entry point for dma buffers and bump up the dri-interface version to eight v3 (Chad): allow the hook to specify an error originating from the driver. For now only unsupported format is considered. I thought about rejecting the hints also as they are addressing only YUV sampling which is not supported at the moment but then thought against it as the spec is not saying one way or the other. v4 (Eric, Chad): restrict to rgb formatted only v5: rebased on top of i915/i965 split v6 (Chad): document using full extension name Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-08-02 08:56:03 +03:00
Topi Pohjolainen	ee844b6660	intel: set dri image dimensions even when creating out of primes Otherwise 'intel_set_texture_image_region()' won't have enough details to work with. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-02 08:56:03 +03:00
Topi Pohjolainen	904587ac3a	intel: refactor planar format lookup v2 (Eric): refactor both occurrences, not just one v3 (Chad): replace 0 by NULL Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-02 08:56:03 +03:00
Topi Pohjolainen	55162e2164	intel: do not create renderbuffers out of planar images v2 (Chad): emit 'GL_INVALID_OPERATION' and description of error Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-02 08:56:03 +03:00
Topi Pohjolainen	e8568a0803	intel: allow packed prime buffers to be treated normally v2: - fix earlier rebase error breaking bisect (loaderPriv -> loaderPrivate) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-02 08:56:02 +03:00
Paul Berry	34c55b5925	main: Warn that geometry shader support is experimental. Geometry shader support in the Mesa front end is still fairly preliminary. Many features are untested, and the following things are known not to work: - The gl_in interface block - The gl_ClipDistance input - Transform feedback of geometry shader outputs - Constants that are new in GLSL 1.50 (e.g. gl_MaxGeometryInputComponents) This isn't a problem, since no back-end drivers currently enable geometry shaders. However, to make sure no one gets the wrong impression, emit a nasty warning to let the user know that geometry shader support isn't complete. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-01 20:24:49 -07:00
Paul Berry	7cfefe6965	glsl: Implement rules for geometry shader input sizes. Section 4.3.8.1 (Input Layout Qualifiers) of the GLSL 1.50 spec contains some tricky rules for how the sizes of geometry shader input arrays are related to the input layout specification. In essence, those rules boil down to the following: - If an input array declaration does not specify a size, and it follows an input layout declaration, it is sized according to the input layout. - If an input layout declaration follows an input array declaration that didn't specify a size, the input array declaration is given a size at the time the input layout declaration appears. - All input layout declarations and input array sizes must ultimately match. Inconsistencies are reported as soon as they are detected, at compile time if the inconsistency is within one compilation unit, otherwise at link time. - At least one compilation unit must contain an input layout declaration. (Note: the geom_array_resize_visitor class was contributed by Bryan Cain <bryancain3@gmail.com>.) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:24:39 -07:00
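The sizing rules summarized in 7cfefe6965 can be sketched for a single input array as below. This is an illustration only: the names are hypothetical, it ignores declaration ordering and the compile-time versus link-time split, and it is unrelated to the geom_array_resize_visitor class mentioned in the commit. The vertex counts per input primitive type are the ones GLSL 1.50 defines.

```python
# Vertices per input primitive, as defined by GLSL 1.50.
VERTICES_PER_INPUT_PRIMITIVE = {
    "points": 1,
    "lines": 2,
    "lines_adjacency": 4,
    "triangles": 3,
    "triangles_adjacency": 6,
}


class LayoutError(Exception):
    """Raised when layout and array sizes are inconsistent (hypothetical)."""


def resolve_input_array_size(layout, declared_size=None):
    """Return the final size of a geometry shader input array."""
    if layout is None:
        # At least one compilation unit must declare an input layout.
        raise LayoutError("no input layout declaration")
    expected = VERTICES_PER_INPUT_PRIMITIVE[layout]
    if declared_size is None:
        # Unsized arrays take their size from the input layout.
        return expected
    if declared_size != expected:
        # Explicit sizes must ultimately match the layout.
        raise LayoutError(
            f"array size {declared_size} does not match layout {layout!r}"
        )
    return declared_size
```

For example, an unsized input array under `layout(triangles) in;` resolves to 3 elements, while an explicit size of 2 under the same layout is an error.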
Paul Berry	20ae8e0c91	glsl: Allow geometry shader input instance arrays to be unsized. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-01 20:24:32 -07:00
Paul Berry	c1f1d8522c	glsl: Permit non-ubo input interface arrays to use non-const indexing. From the GLSL ES 3.00 spec: "All indexes used to index a uniform block array must be constant integral expressions." Similar text exists in GLSL specs since 1.50. When we implemented this, the only type of interface block supported by Mesa was uniform blocks, so we required all indexes used to index any interface block to be constant integral expressions. Now that we are adding interface block support for GLSL 1.50, we need a more specific check. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:24:27 -07:00
Eric Anholt	6065a87bce	glsl: Cross-validate GS layout qualifiers while intrastage linking. This gets piglit's geometry-basic test running. TODO: Still need to validate that the GS layout qualifiers don't get used in places they shouldn't (like an interface block, or a particular shader input or output) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:24:23 -07:00
Eric Anholt	010a6a8fd3	glsl: Export the compiler's GS layout qualifiers to the gl_shader. Next step is to validate them at link time. v2 (Paul Berry <stereotype441@gmail.com>): Don't attempt to export the layout qualifiers in the event of a compile error, since some of them are set up by ast_to_hir(), and ast_to_hir() isn't guaranteed to have run in the event of a compile error. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v3 (Paul Berry <stereotype441@gmail.com>): Use PRIM_UNKNOWN to represent "not set in this shader". Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-01 20:23:43 -07:00
Eric Anholt	624b7bac76	glsl: Parse the GLSL 1.50 GS layout qualifiers. Limited semantic checking (compatibility between declarations, checking that they're in the right shader target, etc.) is done. v2: Remove stray debug printfs. v3 (Paul Berry <stereotype441@gmail.com>): Process input layout qualifiers at ast_to_hir time rather than at parse time, since certain error conditions depend on the relative ordering between input layout qualifiers, declarations, and calls to .length(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:23:33 -07:00
Eric Anholt	f2e14238a7	glsl: Make sure that we don't put too many bitfields in ast_type_qualifier. We do some tests of qualifiers using a union containing an int and the struct full of bitfields, so make sure the bitfields don't spill outside the int. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:23:28 -07:00
Paul Berry	e62ca57199	main: Fix delete_shader_cb() for geometry shaders Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:23:25 -07:00
Fabian Bieler	bd85ba08bc	glsl/linker: Fail to link geometry shader without vertex shader. From section 2.15 (Geometry Shaders) of the OpenGL 3.2 spec: A program object that includes a geometry shader must also include a vertex shader; otherwise a link error will occur. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:23:21 -07:00
Fabian Bieler	8cdbe8394e	mesa: Validate the drawing primitive against the geometry shader input primitive type. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:23:19 -07:00
Fabian Bieler	39ca58192b	mesa/shaderapi: Allow 0 GEOMETRY_VERTICES_OUT. ARB_geometry_shader4 spec Errors: "The error INVALID_VALUE is generated by ProgramParameteriARB if <pname> is GEOMETRY_VERTICES_OUT_ARB and <value> is negative." Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:23:16 -07:00
Paul Berry	72219acf6b	glsl: Properly pack GS output varyings In geometry shaders, outputs are consumed at the time of a call to EmitVertex() (as opposed to all other shader types, where outputs are consumed when the shader exits). Therefore, when packing geometry shader output varyings using lower_packed_varyings, we need to do the packing at the time of the EmitVertex() call. This patch accomplishes that by adding a new visitor class, lower_packed_varyings_gs_splicer, which is responsible for splicing the varying packing code into place wherever EmitVertex() is found. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-01 20:23:12 -07:00
Paul Berry	f2ecc84826	glsl: Modify varying packing to use a temporary exec_list. This patch modifies lower_packed_varyings to store the packing code it generates in a temporary exec_list, and then splice that list into the shader's main() function when it's done. This paves the way for supporting geometry shader outputs, where we'll have to splice a clone of the packing code before every call to EmitVertex(). As a side benefit, varying packing code is now emitted in the same order for inputs and outputs; this should make debug output a little easier to read. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-01 20:23:08 -07:00
Paul Berry	3b0cf7027d	glsl/linker: Properly pack GS input varyings. Since geometry shader inputs are arrays (where the array index indicates which vertex is being examined), varying packing needs to treat them differently. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-01 20:22:59 -07:00
Paul Berry	40d469f9ac	glsl/linker: Properly error check VS-GS linkage. From section 4.3.4 (Inputs) of the GLSL 1.50 spec: Geometry shader input variables get the per-vertex values written out by vertex shader output variables of the same names. Since a geometry shader operates on a set of vertices, each input varying variable (or input block, see interface blocks below) needs to be declared as an array. Therefore, the element type of each geometry shader input array should match the type of the corresponding vertex shader output. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:22:55 -07:00
Paul Berry	05234e707b	glsl: Require geometry shader inputs to be arrays. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:22:48 -07:00
Paul Berry	fc5fa56c86	mesa: Copy linked program data for GS. The documentation for gl_shader_program.Geom and gl_geometry_program says that the former is copied to the latter at link time, but this wasn't happening. This patch causes _mesa_ir_link_shader() to perform the copy, and updates comment accordingly. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:22:07 -07:00
Paul Berry	13022c9c5f	mesa: Refactor copying of linked program data. This patch creates a single function to copy the UsesClipDistance flag from gl_shader_program.Vert to gl_vertex_program. Previously this logic was duplicated in the i965-specific function brw_link_shader() and the core mesa function _mesa_ir_link_shader(). This logic will have to be expanded to support geometry shaders, and I don't want to have to update it in two separate places. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:21:26 -07:00
Bryan Cain	2548092ad8	glsl: support compilation of geometry shaders This commit adds all of the parsing and semantics for GLSL 150 style geometry shaders. v2 (Paul Berry <stereotype441@gmail.com>): Add a few missing calls to get_pipeline_stage(). Fix some signed/unsigned comparison warnings. Fix handling of NULL consumer in assign_varying_locations(). v3 (Bryan Cain <bryancain3@gmail.com>): fix indexing order of 2D arrays. Also, allow interpolation qualifiers in geometry shaders. v4 (Paul Berry <stereotype441@gmail.com>): Eliminate get_pipeline_stage()--it is no longer needed thanks to `030ca23` (mesa: renumber shader indices according to their placement in pipeline). Remove 2D stuff. Move vertices_per_prim() to ir.h, so that it will be accessible from outside the linker. Remove inject_num_vertices_visitor. Rework for GLSL 1.50. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> v5 (Paul Berry <stereotype441@gmail.com>): Split out do_set_program_inouts() argument refactoring to a separate patch. Move geom_array_resizing_visitor to later in the series. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:20:45 -07:00
Paul Berry	844bd71736	glsl/linker: Make separate allocations to track vertex and fragment shaders. There's no reason to be clever about this. By making separate allocations for vertex and fragment shaders, we'll allow geometry shaders to be added without introducing any complication. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:20:41 -07:00
Bryan Cain	ff52377183	glsl: add builtins for geometry shaders. v2 (Paul Berry <stereotype441@gmail.com>): Account for rework of builtin_variables.cpp. Use INTERP_QUALIFIER_FLAT for gl_PrimitiveID so that it will obey provoking vertex conventions. Convert to GLSL 1.50 style geometry shaders. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> v3 (Paul Berry <stereotype441@gmail.com>): Be less obscure about setting interpolation field of gl_Primitive variables. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:20:36 -07:00
Bryan Cain	ae6eba3e32	glsl: add ir_emit_vertex and ir_end_primitive instruction types These correspond to the EmitVertex and EndPrimitive functions in GLSL. v2 (Paul Berry <stereotype441@gmail.com>): Add stub implementations of new pure visitor functions to i965's vec4_visitor and fs_visitor classes. v3 (Paul Berry <stereotype441@gmail.com>): Rename classes to be more consistent with the names used in the GL spec. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:20:16 -07:00
Bryan Cain	c6be77ee6f	mesa: account for geometry shader texture fetches in update_texture_state Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:20:14 -07:00
Paul Berry	b272a01879	main: Allow for the possibility of GL 3.2 without ARB_geometry_shader4. Previously, we assumed that the only way Mesa would expose geometry shader support was via the ARB_geometry_shader4 extension. But this extension has some extra complications over GL 3.2 (interactions with compatibility-only features, and link-time initialization of the constant gl_VerticesIn). So we want to allow for the possibility of supporting GL 3.2 (with GLSL 1.50 style geometry shaders) even if ctx->Extensions.ARB_geometry_shader4 is false. This patch adds a new function, _mesa_has_geometry_shaders(), which returns true if either ARB_geometry_shader4 is supported or the GL version is at least 3.2 desktop. Since compute_version() only enables GL 3.2 functionality when GLSL 1.50 support is present, a sufficient way for a back-end to advertise geometry shader support is to set ctx->Const.GLSLVersion >= 150. v2: Remove unnecessary ctx->Const.GeometryShaders150 constant. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:19:57 -07:00
Paul Berry	56dcc46f0e	main: Fix geometry shader error messages (missing right paren) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:19:55 -07:00
Paul Berry	37270715ff	glsl: Add EXT_texture_array support for geometry shaders. We can't just use a ".glsl" file since the Lod variants are only available in vertex and geometry shaders, while the bias variants are only available in the fragment shader. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-01 20:19:51 -07:00
Paul Berry	6a2baf3a06	glsl/linker: Make update_array_sizes apply to just uniforms. Commit `586b4b5` (glsl: Also update implicit sizes of varyings at link time) extended update_array_sizes() to apply to both uniforms and shader ins/outs. However, doing so creates problems for geometry shaders, because update_array_sizes() assumes that variables with matching names in different parts of the pipeline should have the same sizes. With the addition of geometry shaders, this is no longer true (e.g. both vertex and geometry shaders have a gl_ClipDistance output variable, but there's no reason these variables should have the same sizes). The original reason for commit `586b4b5` (avoid problems with gl_TexCoord being 0 length) has since been addressed by commit `6f53921` (linker: Ensure that unsized arrays have a size after linking). So go ahead and switch update_array_sizes() back to only acting on uniforms. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:19:47 -07:00
Paul Berry	8fc41df549	glsl: Modify ir_set_program_inouts to handle geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-08-01 20:19:43 -07:00
Paul Berry	cea946e39d	glsl: In ir_set_program_inouts, handle indexing outside array/matrix bounds. According to GLSL, indexing into an array or matrix with an out-of-range constant results in a compile error. However, indexing with an out-of-range value that isn't constant merely results in undefined results. Since optimization passes (e.g. loop unrolling) can convert non-constant array indices into constant array indices, it's possible that ir_set_program_inouts will encounter a constant array index that is out of range; if this happens, just mark the whole array as used. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:19:39 -07:00
Paul Berry	1c789d8087	glsl: Fall back gracefully if ir_set_program_inouts sees unexpected indexing. The code in ir_set_program_inouts that marks just a portion of a variable as used (rather than the whole variable) only works on a few kinds of indexing operations: - Indexing into matrices - Indexing into arrays of matrices, vectors, or scalars. Fortunately these are the only kinds of indexing operations that we expect to see; everything else is either handled by a previously-executed lowering pass or prohibited by GLSL. However, that could conceivably change in the future (the GLSL rules might change, or we might modify the lowering passes). To avoid mysterious bugs in the future, let's have ir_set_program_inouts report an assertion failure if it ever encounters an unexpected kind of indexing operation (and in release builds, fall back to just marking the whole variable as used). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:19:35 -07:00
Paul Berry	d5a333a06f	glsl: Extract marking functions from ir_set_program_inouts. This patch extracts the functions mark_whole_variable() and try_mark_partial_variable() from the ir_set_program_inouts visitor functions. This will make the code easier to follow when we add geometry shader support. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:19:31 -07:00
Paul Berry	0b0dc03a31	glsl: Use count_attribute_slots() in ir_set_program_inouts. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:19:26 -07:00
Paul Berry	7d95d2b4c9	glsl: Expand count_attribute_slots() to cover structs. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:19:22 -07:00
Paul Berry	0026ad4994	Move count_attribute_slots() out of the linker and into glsl_type. Our previous justification for leaving this function out of glsl_type was that it implemented counting rules that were specific to GLSL 1.50. However, these counting rules also describe the number of varying slots that Mesa will assign to a varying in the absence of varying packing. That's useful to be able to compute from outside of the linker code (a future patch will use it from ir_set_program_inouts.cpp). So go ahead and move it to glsl_type. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:19:02 -07:00
Paul Berry	906eff09e3	glsl: Change do_set_program_inouts' is_fragment_shader arg to shader_type. This will allow us to add geometry shader support without having to add another boolean argument. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:18:42 -07:00
Roland Scheidegger	e7ed70a52e	gallivm: obey clarified shift behavior llvm shifts are undefined for shift counts exceeding (or matching) bit width, so need to apply a mask for the tgsi shift instructions. v2: only use mask for the tgsi shift instructions, not for the build shift helpers. None of the internal callers need this behavior, and while llvm can optimize away the masking for constants there are legitimate cases where it might not be able to do so even if we know that shift count must be smaller than type width (currently all such callers do not use the build shift helpers). Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-02 03:49:57 +02:00
Roland Scheidegger	7a72bef47e	tgsi: obey clarified shift behavior C shifts are undefined for shift counts exceeding (or matching) bit width, so we need to apply a mask (on x86 it would usually work anyway, since integer-domain shifts mask the shift count, unless an auto-vectorizer comes along at last, as SIMD-domain shifts do not mask the shift count). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-02 03:49:57 +02:00
Roland Scheidegger	606132b4de	gallium: clarify shift behavior with shift count >= 32 Previously, nothing was said about what happens with shift counts exceeding the bit width of the values to shift. In theory 3 behaviors are possible: 1) undefined (classic C definition) 2) just shift out all bits (so result is zero, or -1 potentially for ashr) 3) mask the shift count to bit width - 1 APIs either require 3) or are ok with 1). In particular, GLSL (as well as a couple uninteresting legacy GL extensions) is happy with undefined, whereas both OpenCL and d3d10 require 3). Consequently, most hw also implements 3). So, for simplicity we just specify that 3) is required rather than saying undefined and then needing state trackers to work around it. Also while here specify shift count as a vector, not scalar. As far as I can tell this was a doc bug, neither state trackers nor drivers used scalar shift count. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-08-02 03:49:57 +02:00
Carl Worth	7f2f63409a	docs: Add md5sums to 9.1.6 release notes	2013-08-01 15:45:04 -07:00
Carl Worth	964b89e42a	docs: Import 9.1.6 release notes, add news item.	2013-08-01 15:12:25 -07:00
Kenneth Graunke	fcb4ab6db1	i965: Delete the BATCH_LOCALS macro. This hasn't done anything in a long time, and it's only used in a couple places...which means we couldn't use it without doing a bunch of work anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-08-01 10:38:20 -07:00
Corey Richardson	abdbd02e59	Correct clamping of TEXTURE_{MAX, BASE}_LEVEL Previously, if TEXTURE_IMMUTABLE_FORMAT was TRUE, the levels were allowed to be set like usual, but ARB_texture_storage states: > if TEXTURE_IMMUTABLE_FORMAT is TRUE, then level_base is clamped to the range > [0, <levels> - 1] and level_max is then clamped to the range [level_base, > <levels> - 1], where <levels> is the parameter passed the call to > TexStorage* for the texture object Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Corey Richardson <corey@octayn.net>	2013-08-01 10:23:39 -07:00
Corey Richardson	986ae4306c	De-tab and align comments in gl_texture_object Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Corey Richardson <corey@octayn.net>	2013-08-01 10:23:39 -07:00
Chris Forbes	3eef7fec67	i965 Gen4/5: clip: Don't mangle flat varyings This patch ensures that integers will pass through unscathed. Doing (useless) computations on them is risky, especially when their bit patterns correspond to values like inf or nan. [V1-2]: Signed-off-by: Olivier Galibert <galibert at pobox.com> Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:59:03 +12:00
Chris Forbes	3f6fb5e1dd	i965 Gen4/5: clip: Add support for noperspective varyings Adds support for interpolating noperspective varyings linearly in screen space when clipping. Based on Olivier Galibert's patch from last year: http://lists.freedesktop.org/archives/mesa-dev/2012-July/024341.html At this point all -fixed and -vertex interpolation tests work. V5: Add brw_clip_compile.has_noperspective_shading rather than another key flag. V6: Real bools. [V1-2]: Signed-off-by: Olivier Galibert <galibert at pobox.com> Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:58:59 +12:00
Chris Forbes	f0feb32eaf	i965 Gen4/5: clip: correctly handle flat varyings Previously we only gave special treatment to the builtin color varyings. This patch adds support for arbitrary flat-shaded varyings, which is required for GLSL 1.30. Based on Olivier Galibert's patch from last year: http://lists.freedesktop.org/archives/mesa-dev/2012-July/024340.html V5: Move key.do_flat_shading to brw_clip_compile.has_flat_shading V6: Real bools. [V1-2]: Signed-off-by: Olivier Galibert <galibert at pobox.com> Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:58:56 +12:00
Chris Forbes	21922cb70d	i965 Gen4/5: Generalize SF interpolation setup for GLSL1.3 Previously the SF only handled the builtin color varying specially. This patch generalizes that support to cover user-defined varyings, driven by the interpolation mode array set up alongside the VUE map. Based on the following patches from Olivier Galibert: - http://lists.freedesktop.org/archives/mesa-dev/2012-July/024335.html - http://lists.freedesktop.org/archives/mesa-dev/2012-July/024339.html With this patch, all the GLSL 1.3 interpolation tests that do not clip (spec/glsl-1.30/execution/interpolation/*-none.shader_test) pass. V5: Move key.do_flat_shading to brw_sf_compile.has_flat_shading; drop vestigial hunks. V6: Real bools. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:58:52 +12:00
Chris Forbes	3b5fe704e1	i965: Add helper functions for interpolation map V6: real bools Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:58:49 +12:00
Chris Forbes	9f51499d28	i965 Gen4/5: Introduce 'interpolation map' alongside the VUE map The interpolation map (in brw->interpolation_mode) is a new auxiliary structure alongside the post-GS VUE map, which describes the interpolation modes for each VUE slot, for use by the clip and SF stages. This patch introduces a new state atom to compute the interpolation map, and adjusts the program keys for the clip and SF stages, but it is not actually used yet. [V1-2]: Signed-off-by: Olivier Galibert <galibert at pobox.com> V3: Updated for vue_map changes, intel -> brw merge, etc. (Chris Forbes) V4: Compute interpolation map as a new state atom rather than tacking it on the front of the clip setup V5: Rework commit message, make interpolation_mode_map a struct. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-08-01 20:58:19 +12:00
Carl Worth	c6f3036179	get-pick-list: Allow for non-whitespace between "CC:" and "mesa-stable" We recently proposed a new syntax for stable-patch nominations such as: CC: "9.2 and 9.1" <mesa-stable@lists.freedesktop.org> and this has already appeared in the wild. So we extend the regular expression to pick this up as well.	2013-07-31 15:49:48 -07:00
Samuel Pitoiset	ef6d5ee9f3	nvc0: properly align NVE4_COMPUTE_MP_TEMP_SIZE MP_TEMP_SIZE must be aligned to 0x8000, while TEMP_SIZE on NVE4_3D must be aligned to 0x20000, so perform both alignments to be sure we allocate enough space (actually the bo will most likely use 128 KiB pages and not aligning to that would be a waste anyway). Cc: "9.2" mesa-stable@lists.freedesktop.org	2013-07-31 21:40:38 +02:00
Laurent Carlier	5ffa28df4e	mesa/program: remove useless YYID This fixes the build with Bison 3.0. Also works with Bison 2.7.1. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-31 11:57:32 -07:00
Kenneth Graunke	6d2a9220b8	mesa/program: Switch from the deprecated YYLEX_PARAM to %lex-param. YYLEX_PARAM is no longer supported as of Bison 3.0. Instead, the Bison developers recommend using %lex-param. %lex-param takes a type and variable name, similar to %parse-param, so you can't pass an arbitrary expression like state->scanner. But Flex insists on passing the actual scanner object, not an arbitrary object like state. To solve this, the parser defines a wrapper lex() function which accepts "state," and calls Flex's lex() function with state->scanner. Fixes the build with Bison 3.0. Also works with Bison 2.7.1. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67354 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Laurent Carlier <lordheavym@gmail.com> Cc: "9.2" mesa-stable@lists.freedesktop.org	2013-07-31 11:52:13 -07:00
Kenneth Graunke	de917b4c4c	mesa/program: Change the program parser's namespace. Bison 3.0 removes the YYLEX_PARAM macro. In preparation for handling this using %lex-param, the parser needs a wrapper function for the actual Flex lex() function. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67354 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Laurent Carlier <lordheavym@gmail.com> Cc: "9.2" mesa-stable@lists.freedesktop.org	2013-07-31 11:52:06 -07:00
Kenneth Graunke	f043381334	glsl: Switch from the deprecated YYLEX_PARAM to %lex-param. YYLEX_PARAM is no longer supported as of Bison 3.0. Instead, the Bison developers recommend using %lex-param. %lex-param takes a type and variable name, similar to %parse-param, so you can't pass an arbitrary expression like state->scanner. But Flex insists on passing the actual scanner object, not an arbitrary object like state. To solve this, the parser defines a wrapper lex() function which accepts "state," and calls Flex's lex() function with state->scanner. Fixes the build with Bison 3.0. Also works with Bison 2.7.1. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67354 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Laurent Carlier <lordheavym@gmail.com> Cc: "9.2" mesa-stable@lists.freedesktop.org	2013-07-31 11:51:57 -07:00
Kenneth Graunke	eb7c8c7fb6	glsl: Change the lexer's namespace. Bison 3.0 removes the YYLEX_PARAM macro. In preparation for handling this using %lex-param, the parser needs a wrapper function for the actual Flex lex() function. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67354 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Laurent Carlier <lordheavym@gmail.com> Cc: "9.2" mesa-stable@lists.freedesktop.org	2013-07-31 11:49:30 -07:00
Eric Anholt	eed0a80137	egl: Restore "bogus" DRI2 invalidate event code. I had removed it in commit `1e7776ca2b` because it was obviously wrong -- why do we care whether the server is a version that emits events, if we're not watching for the server's events, anyway? And why would you only invalidate on a server that emits invalidate events, when the comment said to emit invalidates if the server doesn't? Only, I missed that we otherwise don't flag that our buffers might have changed at swap time at all, so the driver was only checking for new buffers when triggered by the Viewport hack. Of course you don't expect Viewport to be called after a swap. So, this is effectively a revert of the previous commit, except that I dropped the check for only emitting invalidates on a new server -- we always need to invalidate if we're doing a SwapBuffers. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63435 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "9.1 and 9.2" <mesa-stable@lists.freedesktop.org>	2013-07-31 10:43:35 -07:00
Roland Scheidegger	b1ed7202df	gallivm: use nearest rounding for float->unorm24 conversion Previously we were using truncation, which gives the correct result only for numbers in [0.5-1.0] range (because there are no mantissa bits to do any rounding there). This is frequently hit (and probably only used there) when converting fragment depth to depth format (d24s8 etc.) or otherwise dealing with depth format. v2: as spotted by Jose, get rid of extra type (src_type is already unsigned). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-31 17:09:02 +02:00
Mikko Juola	8624a514c2	mesa: fix multisampling proxy textures not being queryable The code that checks if some texture target is valid for glGetTexLevelParameter() was not programmed to check for multisampling proxy textures. This made it impossible(?) to use the proxy textures for their intended purpose as glGetTexLevelParameter() would just fail on you. Reviewed-by: Brian Paul <brianp@vmware.com> Cc: mesa-stable@lists.freedesktop.org	2013-07-31 07:27:01 -06:00
Mikko Juola	e404105e7d	mesa: fix proxy textures becoming immutable and unusable glTexStorage*() functions make textures immutable. This carries on to proxy textures. Error checking in texture storage functions prevents proxy textures from working after first time because internally, they became immutable. This commit makes the error checking ignore the immutability flag when working with proxy textures. Reviewed-by: Brian Paul <brianp@vmware.com> Cc: mesa-stable@lists.freedesktop.org	2013-07-31 07:26:55 -06:00
Mikko Juola	3f3f66fd94	mesa: fix proxy textures not working with default texture binding When working with the glTexStorage() functions, the error checking checks that a non-default (i.e., non-zero) texture is currently bound. However, this check made glTexStorage() functions fail with proxy textures when the default texture is bound. Proxy textures do not care about the current texture bindings so for them this check should not be done. Reviewed-by: Brian Paul <brianp@vmware.com> Cc: mesa-stable@lists.freedesktop.org	2013-07-31 07:26:50 -06:00
Mikko Juola	de7e3741eb	mesa: fix number of mipmaps calculation for proxy textures The function _mesa_get_tex_max_num_levels() is supposed to calculate the number of mipmap levels but it was not written to handle proxy textures, at best returning a maximum of 1 mipmap level. Because of this, at least glTexStorage*() calls would incorrectly fail when used with proxy textures with more than one mipmap level. Reviewed-by: Brian Paul <brianp@vmware.com> Cc: mesa-stable@lists.freedesktop.org	2013-07-31 07:26:43 -06:00
Brian Paul	e5f32a0b3a	mesa: improve free() cleanup in generate_mipmap_compressed() Free all our temporary buffers in one place at the end of the function. Fixes memory leak detected by Coverity. Note: This is a candidate for the 9.x branches Cc: mesa-stable@lists.freedesktop.org Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-07-31 06:53:48 -06:00
Brian Paul	fdbd6a5033	gallium/util: reformat, comment util_get_offset() Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-31 06:53:48 -06:00
Brian Paul	30f1770cb1	gallium/util: comments, var renaming in u_inlines.h The variable 'usage' was being used for two different things. Sometimes for PIPE_USAGE_x and other times for PIPE_TRANSFER_x. This renames usage to access when we're talking about PIPE_TRANSFER_x flags. Plus, add a bunch of comments to remind us what's going on. Also, use unsigned for PIPE_TRANSFER_x bitmask to be consistent with other places. And add a missing const qualifier. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-31 06:53:48 -06:00
Brian Paul	365f38f3df	softpipe: use new softpipe_resource_data() accessor We should probably be using map()/unmap() when accessing resource data, but this is a little better. v2: assert that the resource is not a display target, per Jose. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-07-31 06:53:48 -06:00
Brian Paul	99c42d11a2	softpipe: don't ignore pipe_constant_buffer::buffer_offset This was never a problem since the Mesa state tracker always gives us a user-space constant buffer with buffer_offset=0. But if another state tracker ever gave us a "HW" constant buffer with non-zero buffer_offset we'd mis-render. Also, use the correct buffer size. And move an assertion to the top of the function. Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-07-31 06:53:48 -06:00
Brian Paul	089ef37eab	gallium/docs: clarify definition of PIPE_CAP_USER_CONSTANT_BUFFERS, etc The cap means _can_ accept user-space constant buffers; it doesn't mean _only_ accepts user-space constant buffers. v2: also update the PIPE_CAP_USER_VERTEX_BUFFERS and PIPE_CAP_USER_INDEX_BUFFERS descriptions as well. Per Jose. Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-07-31 06:53:48 -06:00
Chris Forbes	cace82b0cd	i965/vs: Put lod parameter in the correct place for Gen4 This was never visible before due to the bogus sampler state pointer. Fixes remaining vertex texturing breakage on Gen4. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2013-07-31 21:33:18 +12:00
Chris Forbes	97676032c2	i965/vs: set up sampler state pointer for Gen4/5. Fixes broken filter and lod selection for vertex texturing. (txs/txf only worked properly because they ignore the sampler state completely) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2013-07-31 21:33:18 +12:00
Marek Olšák	7568a89500	st/dri: add a new driconf option disable_shader_bit_encoding for Unigine Now Unigine Heaven 3.0 finally works with r600g. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-07-30 23:31:30 +02:00
Marek Olšák	369c829152	st/mesa: fix opcode translation for ARB_shader_bit_encoding functions We treat the opcodes as MOVs, but we should at least change the type of the expression, which later affects which TGSI opcode is chosen. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-07-30 23:31:30 +02:00
Marek Olšák	0f6a7cb00c	mesa,glsl,st/dri: add a new driconf option force_glsl_version for Unigine See documentation in mtypes.h. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-30 23:31:28 +02:00
Marek Olšák	ab78939344	mesa: add MESA_GLSL debug flag to dump shaders on compile error Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-30 23:31:26 +02:00
Marek Olšák	7f2f804c75	driconf: enable app-specific workarounds for all drivers They were only enabled for i965. Note that drirc must be installed in /etc. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-30 23:31:24 +02:00
Marek Olšák	bc4f0b6bac	st/dri: remove driOptionCache from dri_context in favor of dri_screen There is no reason to have this duplicated. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-07-30 23:31:24 +02:00
Marek Olšák	dda936e057	st/dri: move enabling postprocessing to dri_screen The driconf options are global. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-07-30 23:31:24 +02:00
Marek Olšák	772070527f	st/dri: remove more unused driconf options vblank_mode is read by dri_util.c and falls under the "dri2" driver name, which is not connected to the actual Mesa/Gallium driver in any way. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-07-30 23:31:24 +02:00
Marek Olšák	83dbe61ea4	st/dri: implement the driconf option force_s3tc_enable properly Reviewed-by: Brian Paul <brianp@vmware.com>	2013-07-30 23:31:24 +02:00
Marek Olšák	f27f3a4b15	driconf: remove the unused option allow_large_textures Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-30 23:31:23 +02:00
Marek Olšák	2acc27cc6d	st/dri: support the driconf option disable_blend_func_extended This is needed for Unigine. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-07-30 23:31:23 +02:00
Marek Olšák	71e0b5d688	st/osmesa: initialize disable_glsl_line_continuations Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-07-30 23:31:22 +02:00
Marek Olšák	4c89ec1f69	gallium/postprocessing: convert blits to pipe->blit PP saves current states to cso_context and then util_blit_pixels does the same. cso_context doesn't like that and the original state is not correctly restored. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2013-07-30 23:31:22 +02:00
Marek Olšák	c84e8d039e	gallium/postprocessing: fix shader parsing tokens was converted to a pointer, which made the Elements macro return 1. Broken by `e87fc11cac`. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-07-30 23:31:22 +02:00
Marek Olšák	c40f8d087a	docs/GL3: clarify core vs compatibility extension support Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-30 23:31:21 +02:00
Marek Olšák	7db83d8d4b	mesa: default texture buffer format should be R8 in the core profile Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> v2: Since we don't expose the extension in the compatibility profile, the "if (API == CORE) .. else .." statement is removed.	2013-07-30 22:36:21 +02:00
Marek Olšák	a6b1a7c0d2	mesa: default DEPTH_TEXTURE_MODE should be RED in the core profile Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-30 22:36:21 +02:00
Marek Olšák	63569dbeb0	st/mesa: expose EXT_framebuffer_multisample_blit_scaled if MSAA is supported Surprisingly all drivers supporting MSAA can already do this (r300g and r600g for sure) and I think Christoph wanted to have this feature for his Nouveau drivers anyway.	2013-07-30 22:36:21 +02:00
Marek Olšák	1302c66896	st/mesa: fix sRGB renderbuffers without EXT_framebuffer_sRGB support https://bugs.freedesktop.org/show_bug.cgi?id=59322 Cc: mesa-stable@lists.freedesktop.org	2013-07-30 22:36:20 +02:00
Marek Olšák	4dfe1a0df5	Revert "r300g: Give CLIP_DISABLE another try" This reverts commit `e866bd1ade`. https://bugs.freedesktop.org/show_bug.cgi?id=57875 Cc: mesa-stable@lists.freedesktop.org	2013-07-30 22:36:20 +02:00
Carl Worth	122d8d2f5a	get-pick-list.sh: Include commits mentioning "CC: mesa-stable..." in pick list We recently adopted a new convention that patches can be nominated for the stable branch by including a line in the commit message as follows: CC: mesa-stable@lists.freedesktop.org This is a convenient syntax as "git send-email" will notice this line and automatically copy the resulting patch email to the mesa-stable mailing list. Here we extend the regular expression in the get-pick-list.sh script to also notice this pattern (as well as the traditional "NOTE: This patch is a candidate..." form).	2013-07-30 12:36:37 -07:00
Paul Berry	1299694ed5	glsl: Remove redundant writes to prog->LinkStatus The linker_error() function sets prog->LinkStatus to false. There's no reason for the caller of linker_error() to also do so. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-30 10:10:27 -07:00
Paul Berry	5fe6b90c87	glsl: Improve error message for interstage interface block mismatch. We're now emitting this error from a point where we have easy access to the name of the block that failed to match, so go ahead and include that in the error message, as we do for intrastage interface block mismatches. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-30 10:10:27 -07:00
Paul Berry	b95d237fe6	glsl: Use a consistent technique for tracking link success/failure. This patch changes link_shaders() so that it sets prog->LinkStatus to true when it starts, and then relies on linker_error() to set it to false if a link failure occurs. Previously, link_shaders() would set prog->LinkStatus to true halfway through its execution; as a result, linker functions that executed during the first half of link_shaders() would have to do their own success/failure tracking; if they didn't, then calling linker_error() would add an error message to the log, but not cause the link to fail. Since it wasn't always obvious from looking at a linker function whether it was called before or after link_shaders() set prog->LinkStatus to true, this carried a high risk of bugs. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-30 10:10:26 -07:00
Paul Berry	659ec1c958	glsl: Add error message for intrastage interface block mismatch. Previously we failed to link (which is correct), but we did not output an error message, which could have been confusing for users. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-30 10:10:26 -07:00
Paul Berry	4682b9b7bf	glsl: Remove bogus check on return value of link_uniform_blocks(). A comment in link_intrastage_shaders(), and an if-test that followed it, seemed to indicate that link_uniform_blocks() would return a negative value in the event of an error. But this is not the case--all error checking has already been performed by validate_intrastage_interface_blocks(), and link_uniform_blocks() can only return unsigned values. So get rid of the if-test and change the return type of link_intrastage_shaders() to clarify that it can only return unsigned values. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-30 10:10:25 -07:00
Jonathan Charest	4f8048bb5a	r600g/compute: Added missing address space checking of kernel parameters To have non-static buffers in local memory, it is necessary to pass them as arguments to the kernel. For r600, the correct lds size must be set to the SQ_LDS_ALLOC register. The correct size is the clover size plus the size reported by the compiler. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-07-30 07:09:16 -07:00
Jonathan Charest	d9576598c7	clover: Added missing address space checking of kernel parameters v2 Here is an updated patch with no line wrapping and respecting 80-column limit (for my changes). v2: Tom Stellard - Create global arguments for constant buffers so we don't break r600g. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-07-30 07:09:15 -07:00
Kenneth Graunke	07cdf426c1	mesa: Remove broken assertion about enabled texture targets. For GLSL programs, enabledTargets can have more than one bit set. For example, a shader that uses sampler2D and samplerCube uniforms will have both TEXTURE_2D_BIT and TEXTURE_CUBE_BIT set. The code that sets _ReallyEnabled already handles this, selecting the "highest priority" texture target. We should simply use that. Fixes new Piglit test incomplete-textures-of-multiple-types. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62698 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-07-29 22:35:37 -07:00
Emil Velikov	488b3ed6f4	build: unify mesa version by using a VERSION file Rather than having to keep track of all the build systems and their respective definition of the mesa version, use a single top file VERSION. Every build system is responsible for reading/parsing the file and using it v2: * remove useless bulletpoint from the documentation, suggested by Matt * "Android is Linux. Use '/' instead of '\'", spotted by Chad V * use cleaner code to get the version in scons, suggested by Chad V v3: * ensure leading and trailing whitespace characters are stripped while parsing * android: handle GNU shell commands appropriately Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-07-29 13:39:29 -07:00
Kenneth Graunke	efb566dff2	i965: Don't create a swrast context on ES2+. We already skip this for API_OPENGL_CORE; ES2+ is very similar. The primary user of the swrast context is GL_SELECT and GL_FEEDBACK, which have never existed in ES. This saves approximately 18MB of memory in GLBenchmark 2.7 Egypt (ES2). No regressions in es3conform on Ivybridge. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2013-07-29 13:26:27 -07:00
Kenneth Graunke	6aba035f6b	glsl: Remove shader stage checking for extension handling. Certain extensions only add functionality to particular shader stages. (For example, ARB_draw_instanced only adds variables to the vertex shader stage.) Previously, we only allowed such extensions to be enabled in the shader stages where they're useful. However, I've never found any text which mandates that behavior; in my opinion, you should be able to turn on extensions in any shader stage, even if they have no effect. Fixes Piglit tests glslparsertest/glsl2/draw_buffers-05.vert and ARB_draw_instanced/preprocessor/feature-macro-enabled.frag. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29185 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-29 10:51:25 -07:00
Matt Turner	0ed02d435e	mesa: Expose OES_surfaceless_context. EGL_KHR_surfaceless_context extension allows contexts to be made current without a default winsys fbo. This extension specifies what ES 1.1 and 2.0 should do (the ES 3.0 spec already does). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-29 10:35:16 -07:00
Matt Turner	8dd15e6021	mesa: Return GL_FRAMEBUFFER_UNDEFINED if the winsys fbo is incomplete. Specified by ARB_framebuffer_object, GL 3.0, and ES 3.0. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-29 10:35:01 -07:00
Matt Turner	b2d3f25aa2	gles3: Update gl3.h to 2013-02-12. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-07-29 10:35:00 -07:00
Matt Turner	00a945f61e	gles2: Update gl2ext.h to revision 22161. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-07-29 10:34:58 -07:00
Matt Turner	efa8a6e72f	gles2: Update gl2.h to revision 20555. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-07-29 10:34:47 -07:00
Matt Turner	32a2ab47fe	gles: Update glext.h to revision 20798. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-07-29 10:34:42 -07:00
Roland Scheidegger	e08114fed7	gallivm: (trivial) get rid of assertion in float->uint conversion code Commit `8c3d3622d9` introduced a new assertion, but since it causes lp_test_conv failures remove it again and let's hope we don't really hit bugs caused by the potentially bogus code (it is possible the assert() caught some cases which work correctly too).	2013-07-29 13:23:56 +02:00
Maarten Lankhorst	e847b5ae06	nvc0: force use of correct firmware file Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-07-28 12:06:57 +02:00
Ian Romanick	803f755ede	glsl: Less const for glsl_type convenience accessors The second 'const' says that the pointer itself is constant. This is unenforceable in C++, so GCC emits a warning (see below) for each of these functions in every file that includes glsl_types.h. It's a lot of warning spam. ../../../src/glsl/glsl_types.h:176:58: warning: type qualifiers ignored on function return type [-Wignored-qualifiers] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2013-07-27 12:13:03 -07:00
Kenneth Graunke	17856726c9	glsl: Disallow auxiliary storage qualifiers on FS outputs. This has always been an error; we just forgot to check for it. Fixes Piglit's no-aux-qual-on-fs-output.frag. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67333 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2013-07-27 10:31:40 -07:00
Kenneth Graunke	c178ec0d7e	glsl: Classify "layout" like other identifiers. When "layout" isn't being lexed as LAYOUT_TOK, we should treat it like an ordinary identifier. This means we need to classify it to determine whether we should return IDENTIFIER, TYPE_IDENTIFIER, or NEW_IDENTIFIER. Fixes the WebGL conformance test "shader-with-non-reserved-words." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64087 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2013-07-27 10:31:38 -07:00
Paul Berry	4d7899fe81	glsl: Be consistent about '\n', '.', and capitalization in errors/warnings. The majority of calls to _mesa_glsl_error(), _mesa_glsl_warning(), and _mesa_glsl_parse_state::check_version() use a message that begins with a lower case letter and ends without a period. This patch makes all messages follow that convention. Also, error/warning messages shouldn't end in '\n', since _mesa_glsl_msg() automatically adds '\n' at the end of the message. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-27 09:41:30 -07:00
Roland Scheidegger	8c3d3622d9	gallivm: fix float->SNORM conversion Just like the UNORM case we need to use round to nearest, not trunc. (There's also another problem, we're using the formula for SNORM->float which will produce a value below -1.0 for the most negative value which according to both OpenGL and d3d10 would need clamping. However, no actual failures have been observed due to that hence keep cheating on that.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-27 16:41:29 +02:00
Roland Scheidegger	d86fddc876	util: don't flush overflowing values to infinity in half-float conversion I am not able to find _any_ rounding behavior specified for OpenGL for float to half-float conversions. However, it is specified for fp11/fp10 which suggests round to next finite value but round-to-zero would also be allowed, but finite values must not be flushed to infinity in either case. Hence I believe it makes sense to do the same for half-floats too. We could probably also use round-to-zero consistently, which is in fact required by d3d10 (but it doesn't seem to matter much). Does not match the mesa core function doing the same though (which is saying it was built to match intel gpus which I don't believe for a second as it would cause failures in d3d10, moreover the PRM (for ivy bridge, not listed in older manuals) while not specifying rounding behavior clearly states finite numbers are never flushed to infinity). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-27 16:41:29 +02:00
Roland Scheidegger	47e528b740	tgsi: handle texel swizzles correctly for d3d10-style sample opcodes Same as for gallivm (though these don't quite work correctly in softpipe, so untested). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-27 16:41:29 +02:00
Roland Scheidegger	abcc40e7f0	gallivm: handle texel swizzles correctly for d3d10-style sample opcodes unlike OpenGL, the texel swizzle is embedded in the instruction, so honor that. (Technically we now execute both the sampler_view swizzle and the per-instruction swizzle but this should be quite ok.) v2: add documentation note as it's not obvious. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-27 16:41:29 +02:00
Kenneth Graunke	f2be639972	docs: Mark ARB_vertex_attrib_binding as started. Fredrik Höglund has a partial implementation in his git tree.	2013-07-26 23:47:27 -07:00
Ian Romanick	b55c1638ad	mesa: Disable GL_EXT_framebuffer_object in core profiles and OpenGL 3.1 GL_EXT_framebuffer_object differs from GL_ARB_framebuffer_object in ways that we can't and don't implement in core profiles. Exposing it is a lie, so we shouldn't do that. It's possible that some other GL_EXT_framebuffer_* extensions should be disabled, but it's not quite so clear cut. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-26 22:56:26 -07:00
Matt Turner	86ae3027a1	docs: Mark GL_ARB_shading_language_420pack as done.	2013-07-26 22:33:39 -07:00
Chris Forbes	6c0dad6128	docs: Mark off 420pack Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2013-07-27 21:29:01 +12:00
Tapani Pälli	8c211dd742	glsl: disable ARB_texture_cube_map_array_enable keywords for glsl es Patch fixes a crash with Webgl 'shader-with-non-reserved-words' conformance test by ignoring desktop extension keywords on GLSL ES. v2: fix reserved and allowed desktop glsl versions (Chris) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64087 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-26 10:05:20 -07:00
Chris Forbes	124f567f1d	i965/vs: Fix flaky texture swizzling If any component used the ZERO or ONE swizzle, its corresponding member in the `swizzle` array would never be initialized. We mostly got away with this, except when that memory happened to contain a value that clobbered another channel when combined using BRW_SWIZZLE4(). NOTE: This is a candidate for stable branches. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-27 06:34:29 +12:00
Niels Ole Salscheider	81a156d099	st/clover: Allow double precision operations Pass "cl_khr_fp64" preprocessor definition to clang Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-07-25 18:55:56 -07:00
Dave Airlie	19338157c9	gallium/vl: add prime support This fixes the dri2 opening to check if DRI_PRIME is set, and picks the correct drm device path to open, this along with a change to libvdpau allows vdpauinfo to work at least, Martin Peres tested with nouveau, and there seems to be a further issue with final displaying, it only works sometimes, but this patch is at least necessary to help debug further. Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Christian König <christian.koenig@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67283 Tested-by: Armin K. <krejzi@email.com>	2013-07-26 08:42:00 +10:00
Kenneth Graunke	0e9549e2bd	Revert "i965: Delete pre-DRI2.3 viewport hacks." This reverts commit `c9db037dc9`. Eric believes that the viewport hacks are still necessary for EGL; invalidate events aren't hooked up properly. This commit caused a regression where EFL applications wouldn't show anything other than window decorations; GLBenchmark also showed issues. The revert had conflicts due to the intel_context/brw_context merge. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66606 Cc: mesa-stable@lists.freedesktop.org	2013-07-25 15:25:43 -07:00
Kenneth Graunke	a8c8c5f8d2	mesa: Bump version to 9.3.0-devel. This should have been done when making the 9.2 branch, but was missed.	2013-07-25 13:34:53 -07:00
Kenneth Graunke	7d24d1b873	docs: Remove <em> obfuscation on public mailing list addresses. Wrapping every character of an email address in <em> looks bizarre, and makes it impossible to read the text. Apparently Brian did this in 2003 to try and obfuscate email addresses and avoid spam. Of course, mesa-*@lists.freedesktop.org are public mailing lists and trivial to find on the internet. So obfuscation buys us nothing (assuming the <em> technique even works at all, which I doubt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> LOLed-at-by: Matt Turner :)	2013-07-25 13:34:43 -07:00
Rob Clark	890e27ef25	xa: bump major version Bump major version, as the change to require explicit xa_context_flush(), the addition of the handle-type parameter to xa_surface_handle(), and change of surface to ref/unref will require a minor change in DDX.	2013-07-25 13:59:55 -04:00
Jerome Glisse	8b21a3825b	xa: move surface to ref/unref api This makes ddx life easier. Signed-off-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-07-25 13:59:55 -04:00
Jerome Glisse	d156c032c9	xa: let ddx handle flush Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-07-25 13:59:55 -04:00
Jerome Glisse	6e8c9589db	xa: export a common context flush function First step before moving flushing inside the ddx. Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-07-25 13:59:55 -04:00
Jerome Glisse	d1444225d3	xa: add handle type parameter to get handle Allow retrieving a non-shared handle. Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-07-25 13:59:55 -04:00
Rob Clark	984da46219	xa: add xa_surface_from_handle() For freedreno DDX, we have to create the scanout GEM bo in a special way (until we have our own KMS/DRM kernel driver.. and even then for phones/tablets you probably need to use the android drivers if you don't want to port the lcd panel driver support). The easiest way to handle this is let the DDX create the scanout bo, and then create the xa surface from that. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-07-25 13:59:54 -04:00
Vinson Lee	60c248c3af	gallivm: Remove NoFramePointerElimNonLeaf for LLVM >= 3.4. TargetOptions::NoFramePointerElimNonLeaf was removed in LLVM 3.4 r187093. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-07-25 09:50:07 -07:00
Paul Berry	a5eecb246d	glsl: Handle empty if statement encountered during loop analysis. The is_loop_terminator() function was asserting that the following kind of if statement could never occur: if (...) { } else { } (presumably based on the assumption that such an if statement would be eliminated by previous optimization stages). But that isn't the case--it's possible that previous optimization stages might simplify more complex code down to this empty if statement, in which case it won't be eliminated until the next time through the optimization loop. So is_loop_terminator() needs to handle it. Fortunately it's easy to handle--it's not a loop terminator because it does nothing. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64330 CC: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-25 09:37:02 -07:00
Paul Berry	b8f13fbb85	i965: Initialize inout_offset parameter to brw_search_cache(). Two callers of brw_search_cache() weren't initializing that function's inout_offset parameter: brw_blorp_const_color_params::get_wm_prog() and brw_blorp_blit_params::get_wm_prog(). That's a benign problem, since the only effect of not initializing inout_offset prior to calling brw_search_cache() is that the bit corresponding to cache_id in brw->state.dirty.cache may not be set reliably. This is ok, since the cache_id's used by brw_blorp_const_color_params::get_wm_prog() and brw_blorp_blit_params::get_wm_prog() (BRW_BLORP_CONST_COLOR_PROG and BRW_BLORP_BLIT_PROG, respectively) correspond to dirty bits that are not used. However, failing to initialize this parameter causes valgrind to complain. So let's go ahead and fix it to reduce valgrind noise. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66779 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-25 09:36:15 -07:00
Paul Berry	42a921fa92	glsl: don't rename variables in interface block arrays. The linker matches up variables in interface blocks according to their block name and variable name. When support for interface block arrays was added in commit `d6863acb`, we renamed variables appearing in interface blocks so that their name included the array size. For example, in a block like this: out foo { float bar } baz[3]; The variable "bar" would get renamed to "bar[3]". This is unnecessary, and leads to problems in supporting geometry shaders, since geometry shaders require vertex shader outputs which are non-arrays to be linked up to geometry shader inputs which are arrays. This patch makes the behaviour of interface block arrays the same as simple non-array interface blocks; in both cases, the variables contained within them are not renamed. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-25 09:34:24 -07:00
Zack Rusin	f19cb0e5f3	draw: fix vertex id computation vertex id has to be unaffected by the start index (i.e. when calling draw arrays with start_index = 5, the first vertex_id has to still be 0, not 5) and it has to be equal to the index when performing indexed rendering (in which case it has to be unaffected by the index bias). This fixes our behavior. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-25 02:02:59 -04:00
Zack Rusin	0e9ec86973	draw: cleanup and fix instance id computation The instance id system value always starts at 0, even if the specified start instance is larger than 0. Instead of implicitly setting instance id to instance id plus start instance and then having to subtract instance id when computing the buffer offsets, let's just set instance id to the proper instance id. This fixes instance id computation and cleans up buffer offset computation. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-25 02:02:36 -04:00
Vinson Lee	0ac3164708	gallivm: Remove dead code in lp_build_compare_ext. There are earlier returns for PIPE_FUNC_NEVER and PIPE_FUNC_ALWAYS. The switch value of 'func' cannot be either of those values. Fixes "Logically dead code" defects reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-07-24 23:47:34 -07:00
Brian Paul	8a9df7a370	mesa: implement mipmap generation for compressed 2D array textures We weren't looping over all the slices in the array. The updated code should also correctly handle 3D compressed textures too, whenever we have that feature. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66850 NOTE: This is a candidate for the 9.x branches Cc: mesa-stable@lists.freedesktop.org Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-07-24 15:29:30 -06:00
Brian Paul	484fa87984	meta: handle 2D texture arrays in decompress_texture_image() Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66850 NOTE: This is a candidate for the 9.x branches. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-07-24 15:29:30 -06:00
Brian Paul	2931bcb0d2	mesa: handle 2D texture arrays in get_tex_rgba_compressed() If we call glGetTexImage() for a compressed 2D texture array we need to loop over all the slices. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66850 NOTE: This is a candidate for the 9.x branches. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-07-24 15:29:29 -06:00
Christoph Bumiller	5c37039797	nv50,nvc0: s/uint16/uint32 for constant buffer offset Looks like a thinko, "Hey, constant buffers can be at most 64 KiB in size, offset can't be larger." But it can, of course. I think piglit lacks a test for UBO and BindBufferRange that tests if it actually works.	2013-07-24 20:46:38 +02:00
Roland Scheidegger	1e003b44e8	draw: always call util_cpu_detect() in draw context creation. Since disabling denorms in draw_vbo() we require the util_cpu_caps to be initialized there. Hence add another util_cpu_detect() call in draw_create_context() which should ensure this. (There is another call in draw_get_option_use_llvm() which only gets called with x86 (not x86_64) but calling it always there wouldn't help since it most likely wouldn't get called when compiling without llvm, so leave it alone there.) This fixes https://bugs.freedesktop.org/show_bug.cgi?id=66806. (Because util_cpu_caps wasn't initialized when first calling util_fpstate_get() hence it returning zero, but it would later get initialized by rtasm translate code hence when draw call returned it unmasked all exceptions by calling util_fpstate_set(). This was happening only with DRAW_USE_LLVM=0 or not compiling with llvm, otherwise the llvm init code was calling it on time too.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com> Tested-by: Vinson Lee <vlee@freedesktop.org>	2013-07-24 15:58:07 +02:00
Roland Scheidegger	bceb5f36ec	mesa: fix rgtc snorm decoding The codeword must be unsigned (otherwise will shift in 1's from above when merging low/high parts so some texels decode wrong). This also affects gallium's util/u_format_rgtc. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-07-24 15:58:00 +02:00
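The sign problem described in the rgtc fix above can be illustrated with a small sketch. It assumes a 3-bit code that starts partway into one byte and may spill into the next; `extract_code` and `to_int8` are hypothetical helper names, not functions from Mesa, and the exact layout of the real decoder is not reproduced here.

```python
def to_int8(byte):
    """Reinterpret an unsigned byte (0..255) as a signed 8-bit value."""
    return byte - 256 if byte >= 128 else byte


def extract_code(low, high, bit_offset, signed):
    """Extract a 3-bit code starting bit_offset bits into `low`.

    The code may spill into `high`. With signed=True the low byte is
    treated as a signed char, so the right shift pulls in 1 bits from
    above (Python's >> on negative ints is an arithmetic shift).
    """
    if signed:
        low = to_int8(low)
    return ((low >> bit_offset) | (high << (8 - bit_offset))) & 0x7


# Low byte 0x80 has its top bit set; the code begins at bit 6.
unsigned_code = extract_code(0x80, 0x00, 6, signed=False)
signed_code = extract_code(0x80, 0x00, 6, signed=True)
```

With these inputs the unsigned path yields code 2, while the signed path yields 6 because the shifted-in 1 bits corrupt the part of the code that should have come from the high byte.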
Andre Heider	0acf3a8407	gallium/util: Fix detection of AVX cpu caps For AVX it's not sufficient to only rely on the cpuid flags. If the CPU supports these extensions, but the OS doesn't, issuing these insns will trigger an undefined opcode exception. In addition to the AVX cpuid bit we also need to: * test cpuid for OSXSAVE support * XGETBV to check if the OS saves/restores AVX regs on context switches See "Detecting Availability and Support" at http://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions Signed-off-by: Andre Heider <a.heider@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-07-23 23:12:58 +01:00
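The three checks listed in the AVX commit above can be sketched as pure bit tests. This is only an illustration under stated assumptions: the caller supplies the ECX value from CPUID leaf 1 and a callable that returns XCR0 (standing in for XGETBV), and `avx_usable` is a hypothetical name rather than the gallium function.

```python
# Bit positions in CPUID leaf 1, register ECX.
CPUID_ECX_OSXSAVE = 1 << 27
CPUID_ECX_AVX = 1 << 28

# Bits in XCR0 that the OS sets when it saves/restores that state.
XCR0_SSE_STATE = 1 << 1
XCR0_AVX_STATE = 1 << 2


def avx_usable(cpuid1_ecx, read_xcr0):
    """Return True only if both the CPU and the OS support AVX.

    read_xcr0 is a callable so that it is consulted only after OSXSAVE
    has been confirmed, mirroring the rule that XGETBV must not be
    executed otherwise.
    """
    if not cpuid1_ecx & CPUID_ECX_AVX:
        return False
    if not cpuid1_ecx & CPUID_ECX_OSXSAVE:
        return False
    required = XCR0_SSE_STATE | XCR0_AVX_STATE
    return (read_xcr0() & required) == required
```

For example, `avx_usable(CPUID_ECX_AVX, lambda: 0x7)` is false because OSXSAVE is missing, even though the CPU flag alone claims AVX support.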
Chris Forbes	5a7bdd4b41	docs: Add items for GL4.4 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-23 19:04:43 +12:00
Francisco Jerez	df530829f7	clover: Respect kernel argument alignment restrictions. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-07-22 23:09:34 +02:00
Francisco Jerez	f64c0ca692	clover: Extend kernel arguments for differing host and device data types. Loosely based on a similar patch by Tom Stellard. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-07-22 23:09:34 +02:00
Francisco Jerez	829caf410e	clover: Byte-swap kernel arguments when host and device endianness differ. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-07-22 23:09:22 +02:00
Francisco Jerez	2265b40e37	clover: Add kernel argument fields to allow differing host/target data types. Loosely based on a similar patch by Tom Stellard. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-07-22 22:47:27 +02:00
Francisco Jerez	a3dcab43c6	clover: Pass corresponding module::argument to kernel::argument::bind(). And remove size information from most kernel::argument derived classes, it's no longer going to be necessary. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-07-22 22:45:41 +02:00
Tom Stellard	8c9d3c62f6	clover: Return correct value for CL_DEVICE_ENDIAN_LITTLE Query the driver using PIPE_CAP_ENDIANNESS rather than always returning true. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-07-22 22:45:20 +02:00
Tom Stellard	4e90bc9a12	gallium: Add PIPE_CAP_ENDIANNESS Cc: mesa-stable@lists.freedesktop.org [ Francisco Jerez: Fix "PIPE_ENDIAN_SMALL" in the documentation, define PIPE_ENDIAN_NATIVE. ]	2013-07-22 22:43:17 +02:00
Matt Turner	c09a4cbbaf	configure.ac: Use correct options names in AC_ARG_ENABLE.	2013-07-22 10:48:45 -07:00
Matt Turner	242a59d535	egl/build: Remove unused GLAPI_LIB.	2013-07-22 10:48:45 -07:00
Matt Turner	3647efa5c1	build: Remove unused EGL_PLATFORMS.	2013-07-22 10:48:45 -07:00
Matt Turner	5e4e145025	build: Add tests directories to SUBDIRS Fixes a problem with distcheck.	2013-07-22 10:48:45 -07:00
Zack Rusin	7bae56c5c2	llvmpipe: Ensure FTZ/DAZ flags are set on deferred draw flushes. Tested-by: José Fonseca <jfonseca@vmware.com>	2013-07-22 18:11:39 +01:00
José Fonseca	2a650611be	llvmpipe: Remove lp_rast_get_num_threads(). Never called. Trivial.	2013-07-22 18:08:39 +01:00
José Fonseca	190312949e	scons: Don't use -z defs ld option on Mac. Should fix fdo bug 67098.	2013-07-21 09:55:04 +01:00
Vinson Lee	cd90ebefd4	glsl: Initialize ast_function member variables. Fixes "Uninitialized pointer field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-07-21 00:23:17 -07:00
Jeremy Huddleston Sequoia	fa5ed99d8e	Apple: glFlush() is not needed with CGLFlushDrawable() <rdar://problem/14496373> Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2013-07-20 10:25:28 -07:00
José Fonseca	b844c8e039	util/u_math: Define NAN/INFINITY macros for MSVC. Untested. But should hopefully fix the build.	2013-07-20 00:31:18 +01:00
Zack Rusin	f59cb67376	llvmpipe/tests: update arith test to check for edge cases Test infs, zeros and nans with our arith functions to assure correct/defined behavior with those values. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-19 16:29:18 -04:00
Zack Rusin	f7c06785d0	gallivm: add a log function that handles edge cases Same as log2_safe, which means that it can handle infs, 0s and nans. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-19 16:29:18 -04:00
Zack Rusin	018c69ac56	gallivm: export unordered/ordered cmp to a common function Only the floating point operators change; everything else is the same, so it makes sense to share the code. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-19 16:29:18 -04:00
Zack Rusin	192c68b85a	gallivm: handle -inf, inf and nan's in sin/cos instructions sin/cos for anything not finite is nan and everything else has to be between [-1, 1]. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-19 16:29:17 -04:00
Zack Rusin	13e2cd2f2c	gallivm: add a version of log2 which handles edge cases That means that if input is: * - less than zero (to and including -inf) then NaN will be returned * - equal to zero (-denorm, -0, +0 or +denorm), then -inf will be returned * - +infinity, then +infinity will be returned * - NaN, then NaN will be returned It's a separate function because the checks are a little bit costly and in most cases are likely unnecessary. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-19 16:29:17 -04:00
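The edge-case table in the log2 commit above can be written out as a scalar reference function. This is a sketch only: it assumes single-precision denormals (anything smaller in magnitude than 2^-126) count as zero, and `log2_safe` here is an illustrative Python function, not the gallivm code, which builds vectorized LLVM IR.

```python
import math

# Smallest positive normal single-precision float; smaller magnitudes
# are denormals, which the commit message treats like zero.
FLT_MIN = 2.0 ** -126


def log2_safe(x, min_normal=FLT_MIN):
    """Scalar reference for the edge cases listed in the commit message."""
    if math.isnan(x):
        return math.nan        # NaN in, NaN out
    if abs(x) < min_normal:
        return -math.inf       # -denorm, -0, +0 or +denorm
    if x < 0:
        return math.nan        # negative values, including -inf
    if math.isinf(x):
        return math.inf        # +infinity
    return math.log2(x)
```

The zero/denormal test comes before the sign test so that a negative denormal returns -inf rather than NaN, matching the order of cases in the commit message.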
Zack Rusin	7b672c1503	gallivm: fix edge cases in exp2 exp(0) has to be exactly 1, exp(-inf) has to be 0, exp(inf) has to be inf and exp(nan) has to be nan, this fixes all of those cases. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-19 16:29:17 -04:00
Zack Rusin	ab47bbecd6	gallivm: handle nan's in min/max Both D3D10 and OpenCL say that if one of the inputs is nan then the other should be returned. To preserve that behavior the patch fixes both the sse and the non-sse paths in both functions and adds helper code for handling nans. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-19 16:29:17 -04:00
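The "return the other input" rule for NaN in min/max can be stated as a scalar sketch. The function names are hypothetical and the code does not reflect how the sse and non-sse paths are actually built in gallivm; it only shows the intended result.

```python
import math


def min_nan_other(a, b):
    """Minimum where a NaN input yields the other input."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a < b else b


def max_nan_other(a, b):
    """Maximum where a NaN input yields the other input."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a > b else b
```

If both inputs are NaN, the result is still NaN, since the "other" input is NaN as well.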
José Fonseca	719000bd7d	scons: Disallow undefined symbols in Xlib libGL.so. It's not the first time that, due to missing build dependencies or incomplete commits, we end up with a broken libGL.so that's missing symbols, causing all tests to fail catastrophically. Instead try to catch this sort of issues earlier.	2013-07-19 13:08:07 +01:00
Tomasz Lis	9f07ca11c1	mesa: Dispatch ARB_framebuffer_object and EXT_framebuffer_object differently Almost all of the functions between the ARB and the EXT share the same GLX protocol because the functionality is, essentially, identical. However, there are some differences between the extensions: - In the ARB extension, names must come from glGenBuffers. - In the ARB extension, framebuffer objects are not shared (but they are in the EXT). For these reasons, glBindFramebuffer and glBindRenderbuffer have different GLX protocol opcodes than their EXT counterparts. Currently these functions alias each other in the dispatch table. This makes it impossible to be truly spec conformant. This patch enables fixing the conformance issue by splitting glBindFramebuffer / glBindFramebufferEXT and glBindRenderbuffer / glBindRenderbufferEXT into separate dispatch table entries. Patches will be available shortly to: - Fix the conformance issue. - Stop advertising the EXT in OpenGL 3.1 (or core profiles). HOWEVER, this does represent a compatibility break between the loader (libGL or the Xserver GLX module) and the driver. Mesa drivers compiled without this change will request a single dispatch table entry for glBindFramebuffer and glBindFramebufferEXT. Since the updated loader has different entries for each, the request will fail, and the driver will die in a fire. Drivers built with the change should continue to load fine on loaders without the change. In this case, the driver will separately ask for entries for glBindFramebuffer and glBindFramebufferEXT, and the loader will tell it the same location. Since the loader in the server's GLX module is not (yet) updated, this should not be a problem. We also do not advertise the ARB extension from the server, so, again, this should not be a problem for the server. HOWEVER, this means that DRI1 drivers (remember mga_dri.so?) will no longer load with libGL built hereafter. That means this patch will need to be back ported to the 8.0 branch. v2 (idr): Added missing GLX protocol opcodes for the EXT functions and corrected the opcodes for the ARB functions. Updated GLX indirect_api unit test and dispatch sanity unit test. Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Signed-off-by: Bartosz Zawistowski <bartosz.l.zawistowski@intel.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]	2013-07-18 17:42:46 -07:00
Kenneth Graunke	adfd0123c8	st/mesa: Enable the ARB_shading_language_420pack extension for 1.30+. Any driver that supports GLSL 1.30 should be able to handle this extension, as it's entirely implemented in the GLSL compiler. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-07-18 16:57:24 -07:00
Kenneth Graunke	46d9baf3e3	i965: Enable the GL_ARB_shading_language_420pack extension on Gen6+. While all the work is in the shared GLSL compiler, this extension requires GLSL 1.30, which is currently only supported on Gen6+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-07-18 16:57:24 -07:00
Kenneth Graunke	bfcec4618a	glsl: Handle the binding qualifier for UBO variables. layout(binding = N) is equivalent to calling glUniformBlockBinding(_,N). This currently only handles the GLSL 1.40 case - no interface names, no arrays of uniform blocks. This is okay since we don't yet support GLSL 1.50, and don't expose ARB_shading_language_420pack in ES 3.0. v2: Move into the other function; use binding, not constant_value. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-07-18 16:57:24 -07:00
Kenneth Graunke	f25d94084c	glsl: Propagate UBO binding qualifier into UBO member variables. Without an instance name, there is no ir_variable representing the actual uniform block declaration. When the linker goes to set uniform initializers, it only sees the members as ir_variables; never the block. So, unfortunately, the members need to know about the binding. There has to be a better way to do this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-07-18 16:57:24 -07:00
Kenneth Graunke	34e2ccc9f0	glsl: Handle the binding qualifier for arrays of samplers. Normally, uniform array variables are initialized by array literals. That is, val->type->array_elements >= storage->array_elements. However, samplers are different. Consider a declaration such as: layout(binding = 5) uniform sampler2D[3]; The initializer value is a single integer (5), while the storage has 3 array elements. The proper behavior here is to increment one for each element; they should be initialized to 5, 6, and 7. This patch introduces new code for sampler types which handles both arrays of samplers and single samplers correctly. v2: Move into the other function; use binding, not constant_value. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-07-18 16:57:24 -07:00
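The increment-per-element rule for sampler arrays described above is simple enough to sketch directly. `sampler_array_units` is a hypothetical helper for illustration; it is not the linker function the commit adds, and a single sampler is modelled here as an array of one element.

```python
def sampler_array_units(binding, array_elements=1):
    """Texture units assigned to each element of a sampler (array).

    The binding qualifier names the unit for the first element; each
    later element takes the next unit in sequence.
    """
    return [binding + i for i in range(array_elements)]
```

For the declaration in the commit message, `sampler_array_units(5, 3)` gives `[5, 6, 7]`.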
Kenneth Graunke	67038c6ba2	glsl: Add plumbing for handling uniform binding qualifiers. Sampler uniforms and uniform blocks do not have a var->constant_value. Instead, they have an integer var->binding value. This makes extending set_uniform_initializer() somewhat problematic: it assumes that there is an ir_constant * which represents the initializer, and that it's safe to dereference that without any NULL checks. Instead, this patch creates an analogous function for binding qualifiers, and calls one or the other as appropriate. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-07-18 16:57:24 -07:00
Kenneth Graunke	0a23ec2b6e	glsl: Delete unused code for handling samplers in array-initializers. There is existing code to handle sampler uniform initializers. Prior to GLSL 4.20's "binding" keyword, sampler uniforms don't have initializers at all, so this is somewhat surprising. The existing code is broken into two cases: one where both the variable and initializer are arrays, and a second where the variable and initializer are scalars. The first case should never occur, since array-typed initializers do not exist for sampler uniforms. Even with the binding keyword, the initializer is a single integer which represents the texture unit to use for the first array element. The second is apparently used for some fixed-function code. v2: Rewrite the commit message - suggested by Paul. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-07-18 16:57:24 -07:00
Kenneth Graunke	9a9a830b44	glsl: Cross-validate explicit binding points. All compilation units need to agree on the binding point, if they specify one at all. v2: Use binding, not constant_value. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-07-18 16:57:24 -07:00
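The cross-validation rule above can be sketched as a check over the bindings seen in each compilation unit. This assumes that a unit which gives no binding (modelled as `None`) is compatible with any explicit one, which is one reading of "if they specify one at all"; `cross_validate_binding` is a hypothetical name, not the linker's.

```python
def cross_validate_binding(bindings):
    """Return the agreed binding point, or None if no unit gives one.

    `bindings` holds one entry per compilation unit: an int for an
    explicit layout(binding = N), or None when the unit gives none.
    Raises ValueError when two units specify different binding points.
    """
    explicit = {b for b in bindings if b is not None}
    if len(explicit) > 1:
        raise ValueError(
            "conflicting binding points: %s" % sorted(explicit))
    return next(iter(explicit), None)
```

Under that assumption, `[None, 3, 3]` validates to binding 3, while `[1, 2]` is rejected.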
Kenneth Graunke	d4375fc016	glsl: Propagate explicit binding information from AST to IR. Rather than creating a new "binding" field in ir_variable, we reuse constant_value since the linker code for handling uniform initializers uses that. Since UBOs and samplers can't otherwise have initializers/constant values, there shouldn't be a conflict. v2: Propagate the new binding variable around too. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-07-18 16:57:24 -07:00
Kenneth Graunke	4da1504c0f	glsl: Add ir_variable fields for explicit bindings. These are not used yet, but they exist and are copied appropriately. v2: Add an explicit "int binding" variable rather than reusing constant_value, as suggested by Paul Berry. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-07-18 16:57:24 -07:00
Kenneth Graunke	5e5e12040b	glsl: Add validation for the "binding" qualifier. The "binding" qualifier only applies to UBO blocks and samplers, along with arrays of those types. (It would also apply to images and atomic counters, but we don't support those yet.) This also validates sampler bindings against the maximum number of texture units, and UBO bindings against the number of uniform buffer binding points. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-07-18 16:57:23 -07:00
Kenneth Graunke	0418846a07	glsl: Parse the "binding" keyword and store it in ast_type_qualifier. Nothing actually uses this yet. v2: Remove >= 0 checks. They'll be handled in later validation. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-07-18 16:57:23 -07:00
Kenneth Graunke	7f6a2d6937	glsl: Have the lexer return LAYOUT_TOK if 420pack is enabled. GL_ARB_shading_language_420pack also provides layout qualifiers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-07-18 16:57:23 -07:00
Kenneth Graunke	56bcde34b2	glsl: Use has_layout() rather than a partial open coded version. The idea of this code is to disallow layout(...) sections with the deprecated "varying" or "attribute" keywords, unless a few select extensions are enabled which allow a more relaxed check. In order to detect a layout(...) section, the code checks for a number of layout qualifiers. However, it failed to check for all of them, which could lead to layout(...) not being detected when it should. By replacing this with has_layout(), we properly check for all layout qualifiers, and also guarantee that new qualifiers added in the future will not be forgotten. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-07-18 16:57:23 -07:00
Kenneth Graunke	c397ec94e9	glsl: Relax auxiliary storage ordering requirements with 420pack. These were already semi-relaxed, since the storage qualifier rule already skipped when 420pack was enabled. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-18 16:57:23 -07:00
Kenneth Graunke	b5d6c51e2b	glsl: Handle centroid qualifier ordering in C code, not the parser. The GL_ARB_shading_language_420pack extension/GLSL 4.20 split centroid off into a new category, "auxiliary storage qualifiers," and allow these to be placed anywhere in the series. So we have to stop recognizing "centroid in"/"centroid out"/"centroid varying" in the grammar and get more creative. The same approach used before works here, too. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-18 16:57:23 -07:00
Kenneth Graunke	844307a584	glsl: Allow precision qualifiers to be flexibly ordered with 420pack. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-18 16:57:23 -07:00
Kenneth Graunke	6eec502e84	glsl: Move precision handling to be part of qualifier handling. This is necessary for the parser to be able to accept precision qualifiers not immediately adjacent to the type, such as "const highp inout float foo". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-18 16:57:23 -07:00
Kenneth Graunke	308d4c7146	glsl: Change is_precision_statement to default_precision != none. Currently, we store precision in ast_type_specifier, rather than ast_type_qualifier. This works because precision is the last qualifier, and immediately adjacent to the type. Default precision statements (such as "precision highp float") are represented as ast_type_specifier objects, with a boolean to indicate that it's a default precision statement rather than an ordinary type. ast_type_specifier::precision will be moving to ast_type_qualifier soon, in order to support arbitrary qualifier ordering. However, we still need to store a "this is a precision statement" flag /and/ the default precision in ast_type_specifier. This patch changes the boolean into a new field, default_precision. If default_precision != ast_precision_none, it's a precision statement with the specified precision. Otherwise, it's an ordinary type. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-18 16:57:23 -07:00
Kenneth Graunke	7855482138	glsl: Disable ordering checks for const parameters with 420pack. This makes the compiler accept both "const in" and "in const". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-18 16:57:22 -07:00
Kenneth Graunke	293dfe5738	glsl: Handle "const" as a parameter qualifier. This will make it easy to support both "const in" and "in const", as required by GLSL 4.20/ARB_shading_language_420pack. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-18 16:57:22 -07:00
Kenneth Graunke	a4d15a3cd9	glsl: Refactor parameter qualifier handling. "Parameter direction qualifier" is a new term I invented just now; it's not part of any GLSL specification. This paves the way for handling multiple parameter qualifiers, in any order, as required by GLSL 4.20/ARB_shading_language_420pack. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-18 16:57:22 -07:00
Kenneth Graunke	83fe4f7019	glsl: Use merge_qualifier() when processing qualifier lists. Most of ast_type_qualifier is simply a bitfield (represented as a structure of unsigned:1 bits in a union with an unsigned). However, it also contains ARB_explicit_attrib_location's location/index fields. In the past, this has worked by simply returning the layout qualifier's ast_type_qualifier and merging the other bits into it. However, that's not obvious until you break it by switching $1 and $2. Using merge_qualifier() copies them appropriately, and also properly overrides layout qualifiers. It also checks for duplicate qualifiers, which renders some of the checks in the previous patch unnecessary. However, those checks provide better error messages, such as "Duplicate interpolation qualifier", rather than just "duplicate qualifier". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-18 16:57:22 -07:00
Kenneth Graunke	0cb90fcfbd	glsl: Allow duplicate layout qualifiers with 420pack. The new 4.20 rules explicitly allow multiple layout(...) sections. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-18 16:57:22 -07:00
Kenneth Graunke	89f75e7e7b	glsl: Disable ordering checks on most qualifiers for 420pack. This makes the compiler accept invariant, storage, layout, and interpolation qualifiers in any order when ARB_shading_language_420pack is enabled. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-18 16:57:22 -07:00
Kenneth Graunke	48e3bd33dc	glsl: Handle most qualifier ordering in C code rather than the grammar. The GL_ARB_shading_language_420pack extension/GLSL 4.20 allow qualifiers to be specified in (basically) any order. In order to support this, we can't hardcode the ordering restrictions in the grammar. This patch alters the grammar to accept invariant, storage, layout, and interpolation qualifiers in any order, but adds C code to enforce the ordering requirements. In the 420pack case, we should be able to simply skip the error checks. As a bonus, this also lets us generate decent error messages, rather than Bison's awful "unexpected TOKEN" errors. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-18 16:57:22 -07:00
Kenneth Graunke	1b719df14d	glsl: Add a new ast_type_qualifier::has_auxiliary_storage() method. "Auxiliary storage qualifiers" is the new term given to "centroid", "patch", and "sample" by GLSL 4.20/GL_ARB_shading_language_420pack. Even though we only support "centroid", it's useful to add this now so that all auxiliary storage qualifiers get handled in the right places once they're eventually supported. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-18 16:57:22 -07:00
Kenneth Graunke	eb30af51d6	glsl: Add a new ast_type_qualifier::has_storage() method. This makes it easy to check if any storage qualifiers are set. "centroid" is not considered a storage qualifier. In the old language rules, you can't specify "centroid" by itself; it's always "centroid in", "centroid out", or "centroid varying." So one of the other storage qualifiers will always be set; there's no need to specifically check for centroid. In the new 4.20 rules, centroid is an auxiliary storage qualifier, not a storage qualifier. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-18 16:57:22 -07:00
Kenneth Graunke	7cef2b22b8	glsl: Add a new ast_type_qualifier::has_layout() method. This makes it easy to check if any layout qualifiers are set. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-18 16:57:21 -07:00
Kenneth Graunke	7ce5c6b214	i965: Combine URB code emission into a single group. All four URB packets need to be programmed together in order for the GPU state to be valid. Putting them in separate BEGIN..ADVANCE blocks is risky: if we're nearing the end of a batch, the batch could be flushed in between two of the commands, causing the URB programming to be split into two batchbuffers. This -might- be okay with hardware contexts, but it offers no advantages over keeping them together, and has a potential for hangs. Putting them into a single BEGIN..ADVANCE block ensures they'll be kept in the same batch, which seems wise. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-18 16:57:21 -07:00
Chad Versace	30f33deccb	i965/hsw: Change L3 MOCS for depth, hiz, and stencil Change from "not cacheable" to "cacheable" in L3. Do so for the draw upload path and blorp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-18 16:18:22 -07:00
Chad Versace	2273b652bb	i965/hsw: Change L3 MOCS of 3DSTATE_CONSTANT_VS/PS Change from "not cacheable" to "cacheable" in L3. Do so for the draw upload path and blorp. In blorp, change only the PS packet, because the VS packet is disabled. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-18 16:18:22 -07:00
Chad Versace	2f346395f5	i965/hsw: Change L3 MOCS of SURFACE_STAT Change from "not cacheable" to "cacheable" in L3. Do so for the draw upload path and blorp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-18 16:18:21 -07:00
Chad Versace	a16d47465e	i965/hsw: Change L3 MOCS of 3DSTATE_VERTEX_BUFFERS Change from "not cacheable" to "cacheable" in L3. Do so for the draw upload path and blorp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-18 16:18:21 -07:00
Tomasz Lis	eb83079b35	glx: Enable floating-point fbconfig extensions Signed-off-by: Tomasz Lis <listom@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-18 16:03:42 -07:00
Ian Romanick	74cbe6e497	egl: Drop configs with unknown or invalid __DRI_ATTRIB_RENDER_TYPE Some render types, such as floating-point, aren't valid with EGL. Return NULL in those cases to drop them. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-18 16:03:42 -07:00
Tomasz Lis	c37c367d38	dri: Introduce new flags in __DRI_ATTRIB_RENDER_TYPE Mark __DRI_ATTRIB_FLOAT_MODE as deprecated, and introduce new flags to __DRI_ATTRIB_RENDER_TYPE for float modes. Both signed float (fbconfig_float) and unsigned (packed_float) are introduced. The old attribute should be set for both float modes. v2 (idr): Require that the render mode from the DRI attributes matches the render mode of the config exactly. This is the behavior of the old code. Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-18 16:03:42 -07:00
Tomasz Lis	4473af7aca	glx: Require proper drawableType in init_fbconfig_for_chooser Make sure that init_fbconfig_for_chooser sets correct value of drawableType for visual configs and fbconfigs. Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-18 16:03:42 -07:00
Tomasz Lis	2eed9ff2fb	glx: Validate the GLX_RENDER_TYPE value Correctly handle the value of renderType in GLX context. In case of the value being incorrect, context creation fails. v2 (idr): indirect_create_context is just a memory allocator, so don't validate the GLX_RENDER_TYPE there. Fixes regressions in several GLX_ARB_create_context piglit tests. Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-18 16:03:42 -07:00
Tomasz Lis	27c8aa5cfb	glx: Store the RENDER_TYPE in indirect rendering v2 (idr): Open-code the check for GLX_RENDER_TYPE. dri2_convert_glx_attribs can't be called from here because that function only exists in direct-rendering builds. Also add a stub version of indirect_create_context_attribs to tests/fake_glx_screen.cpp to prevent 'make check' regressions. Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-18 16:03:42 -07:00
Tomasz Lis	1c748dff6b	glx: Handling RENDER_TYPE in glXCreateContext and init_fbconfig_for_chooser Set the correct values of renderType in glXCreateContext and init_fbconfig_for_chooser. Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-18 16:03:42 -07:00
Tomasz Lis	b8126c7c8a	glx: Changes to visual configs initialization. Correctly handle the value of renderType and drawableType in fbconfig. Modify glXInitializeVisualConfigFromTags to read the parameter value, or detect it if it's not there. v2 (idr): If there was no GLX_RENDER_TYPE property, set the type based purely on the rgbMode as the previous code did. It is impossible for floatMode to be set at this point, so we can't have a float config. The previous code regressed a large number of piglit GLX tests because those tests don't set GLX_RENDER_TYPE in the glXChooseConfig call. Restoring the old behavior for that case fixes those regressions. Also fix handling of GLX_DONT_CARE for GLX_RENDER_TYPE. Fixes a regression in glx-dont-care-mask. Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-18 16:03:42 -07:00
Tomasz Lis	a92cd5b245	glx: Retrieve the value of RENDER_TYPE from GLX attribs array Make sure that context creation routines are provided with the value of RENDER_TYPE retrieved from GLX attribs. v2 (idr): Minor formatting changes. Change type of dri2_convert_glx_attribs render_type parameter to uint32_t to silence some GCC warnings. Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-18 16:03:42 -07:00
Tomasz Lis	36259a16fe	glx: Store the value of renderType while creating context Make sure that renderType property value is stored in GLX context while it's being created. Further patches will be provided to make the value correspond to fbconfig's renderType. v2 (idr): Move a hunk from the next patch to this patch to prevent a build break. Signed-off-by: Tomasz Lis <tomasz.lis@intel.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-18 16:03:42 -07:00
Kenneth Graunke	7791c9869b	i965: Add #defines for Memory Object Control State fields on Gen7-7.5. The L3 controls are identical on all platforms, but LLC differs: - Ivybridge has a "cache in LLC" flag - Baytrail has no LLC, but instead has a snoop bit: "data accesses in this page must be snooped in the CPU caches." - Haswell has writeback/uncached flags for LLC and eLLC (eDRAM). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-18 16:03:19 -07:00
Fabian Bieler	6368478712	glsl/linker: Use correct array length when linking inter-stage uniforms and varyings. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Fabian Bieler <fabianbieler@fastmail.fm>	2013-07-18 14:12:44 -07:00
Mike Frysinger	73c9b4b0e0	gen_matypes: fix cross-compiling with gcc The current gen_matypes logic assumes that the host compiler will produce information that is useful for the target compiler. Unfortunately, this is not the case whenever cross-compiling. When we detect that we're cross-compiling and using GCC, use the target compiler to produce assembly from the gen_matypes.c source, then process it with a shell script to create a usable header. This is similar to how the linux kernel creates its asm-offsets.c file. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Mike Frysinger <vapier@gentoo.org>	2013-07-18 13:55:48 -07:00
Andreas Oberritter	a48be954ce	ax_prog_flex.m4: change grep syntax to accept e.g. flex.real This is required in case a wrapper or symlink is used. This patch has also been sent upstream, awaiting moderation. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Andreas Oberritter <obi@saftware.de>	2013-07-18 13:54:59 -07:00
Jonathan Liu	2da0bd0526	builtin_compiler/build: Avoid using libtool if cross compiling Adds the dependencies of builtin_compiler as sources when cross compiling instead of using libtool to share compilation with src/glsl. The builtin_compiler executable is built for the host when cross compiling so it doesn't make sense to share compilation with src/glsl built for the target in this case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44618 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jonathan Liu <net147@gmail.com>	2013-07-18 13:54:20 -07:00
Kenneth Graunke	2b5b436615	i965: Add MOCS shift and mask for SURFACE_STATE entries. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-18 10:45:49 -07:00
Roland Scheidegger	4ef19f7fec	llvmpipe: clamp inputs for srgb render buffers Usually with fixed point renderbuffers clamping is done as part of conversion. However, since we blend in float format, we essentially skip all conversion steps pre-blend but since this is still a fixed point renderbuffer we must still clamp the inputs in this case. Makes no difference for piglit though. Obviously we could skip this if fragment color clamping is enabled, but a) this is deprecated in OpenGL (d3d never had it) and b) we don't support it natively so it gets baked into the shader. Also add some comment about logic ops being broken for srgb, luckily no test tries to do that as there's no easy fix... Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-07-18 19:04:20 +02:00
Roland Scheidegger	e57b98bad3	llvmpipe: fix blending with SRC_ALPHA_SATURATE with some formats without alpha We were fixing up the blend factor to ZERO, however this only works correctly with fixed point render buffers where the input values are clamped to 0/1 (because src_alpha_saturate is min(As, 1-Ad) so can be negative with unclamped inputs). Haven't seen any failure anywhere due to that with fixed point SNORM buffers (which clamp inputs to -1/1) but it should apply there as well (snorm blending is rare, even opengl 4.3 doesn't require snorm rendertargets at all, d3d10 requires them but they are not blendable). Doesn't look like piglit hits this though (some internal testing hits the float case at least). (With legacy OpenGL we could theoretically still use the fixup to zero if the fragment color clamp is enabled, but we can't detect that easily since we don't support native clamping hence it gets baked into the shader.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-07-18 19:03:35 +02:00
Marek Olšák	0d7f087483	r600g: use WAIT_3D_IDLE before using CP DMA I broke this with `7948ed1250` for r700 at least.	2013-07-18 14:27:34 +02:00
Jonathan Gray	0b405f364f	r300g: make use of gallium's os_get_process_name() Lets the code compile on non Linux systems. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Signed-off-by: Marek Olšák <maraeo@gmail.com>	2013-07-18 14:04:48 +02:00
Jean-Sébastien Pédron	148f0deb06	configure.ac: On some systems, "x86-64" is called "amd64" For instance, this is the case on FreeBSD. Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-07-17 23:10:23 -07:00
Ilia Mirkin	fbdae1ca41	nv50: H.264/MPEG2 decoding support via VP2, available on NV84-NV96, NVA0 Adds H.264 and MPEG2 codec support via VP2, using firmware from the blob. Acceleration is supported at the bitstream level for H.264 and IDCT level for MPEG2. Known issues: - H.264 interlaced doesn't render properly - H.264 shows very occasional artifacts on a small fraction of videos - MPEG2 + VDPAU shows frequent but small artifacts, which aren't there when using XvMC on the same videos Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-07-18 07:52:32 +02:00
Jonathan Gray	f96c07abf6	configure.ac: make grep tests more portable Use grep -w instead of the empty string escape sequences which are less portable. Makes the grep tests function as intended on OpenBSD. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-07-17 22:50:19 -07:00
Jonathan Gray	78fbb41fe3	configure.ac: add OpenBSD Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-07-17 21:06:46 -07:00
Vinson Lee	21f97446f4	glsl: Remove comma at end of enumerator list. Fixes this build error on OpenBSD 5.3. In file included from ../../src/mesa/main/ff_fragment_shader.cpp:53: ./../glsl/ir_optimization.h:64: error: comma at end of enumerator list Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-07-17 20:57:54 -07:00
Vinson Lee	77311dab3a	mesa: Remove commas at end of enumerator lists. Fixes these build errors on OpenBSD 5.3. In file included from ../../src/mesa/main/errors.h:47, from ../../src/mesa/main/imports.h:41, from ../../src/mesa/main/ff_fragment_shader.cpp:32: ../../src/mesa/main/mtypes.h:3286: error: comma at end of enumerator list ../../src/mesa/main/mtypes.h:3296: error: comma at end of enumerator list ../../src/mesa/main/mtypes.h:3303: error: comma at end of enumerator list ../../src/mesa/main/mtypes.h:3356: error: comma at end of enumerator list Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-07-17 20:57:53 -07:00
Carl Worth	ceaf1a74cb	docs: Import 9.1.5 release notes And add news item for the release.	2013-07-17 20:11:02 -07:00
Roland Scheidegger	7fd30a8621	gallivm: (trivial) simplify lp_build_cos/lp_build_sin a tiny bit Use "or" instead of "add" (this is a classic select sequence, which at least newer llvm versions can actually recognize (3.2+?), and the "add" might prevent that - and we really don't want an add instead of an or with avx if it isn't recognized (even without avx logic ops might be cheaper)). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-17 18:16:34 +02:00
Roland Scheidegger	f0f9fb59c3	util/u_format_s3tc: handle srgb formats correctly. Instead of just ignoring the srgb/linear conversions, simply call the corresponding conversion functions, for all of pack/unpack/fetch, both for float and unorm8 versions (though some don't make a whole lot of sense, i.e. unorm8/unorm8 srgb/linear combinations). Refactored some functions a bit so don't have to duplicate all the code (there's a slight change for packing dxt1_rgb, as there will now be always 4 components initialized and sent to the external compression function so the same code can be used for all, the quite horrid and ad-hoc interface (by now) should always have worked with that). Fixes llvmpipe/softpipe piglit texwrap GL_EXT_texture_sRGB-s3tc. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-17 18:16:27 +02:00
Vadim Girlin	07baf9cfd1	r600g/sb: improve alu packing on cayman Scheduler/register allocator in r600-sb was developed and optimized on evergreen (VLIW-5) hardware, so currently it's not optimal for VLIW-4 chips. This patch should improve performance on cayman gpus due to better alu packing, but also it tends to increase register usage, so overall positive effect on performance has to be proven by real benchmarks yet. Some results with bfgminer kernel on cayman: source bytecode: 60 gprs, 3905 alu groups, sbcl before the patch: 45 gprs, 4088 alu groups, sbcl with this patch: 55 gprs, 3474 alu groups. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-07-17 18:29:56 +04:00
Vadim Girlin	ba7fa4c4c9	r600g/sb: fix handling of new multislot instructions on cayman Ex-scalar instructions that became multislot on cayman do replicate result to all channels - handle them similar to DOT4. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-07-17 18:27:31 +04:00
Vadim Girlin	033eec4145	r600g/sb: fix debug dump code in scheduler Update the stale debug code for other changes related to debug output. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-07-17 18:27:31 +04:00
Vadim Girlin	44ebe7291c	r600g/sb: fix initial register allocation Mark values that are members of the 'same register' constraint as preallocated in ra_init pass, this will prevent incorrect reallocation in scheduler in some cases. Should fix https://bugs.freedesktop.org/show_bug.cgi?id=66713 Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-07-17 18:27:30 +04:00
Vadim Girlin	f0d881106a	r600g/sb: move chip & class name functions to sb_context Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-07-17 18:27:30 +04:00
Vadim Girlin	96efa4cdf4	r600g/sb: fix handling of PS in source bytecode on cayman Actually PS doesn't make sense for cayman and isn't even mentioned in cayman docs, but llvm backend currently uses it in bytecode and, assuming that hw seems to be mostly ok with it, this will allow sb to parse such source bytecode correctly. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-07-17 18:27:30 +04:00
Vinson Lee	81d3881367	r600g/sb: Initialize ra_checker member variables. Fixes "Uninitialized scalar field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-07-17 18:27:30 +04:00
Emil Velikov	b20e0fb520	gallium/util: use explicitly sized types for {un, }pack_rgba_{s, u}int Every function but the above four uses explicitly sized types for their src and dst arguments. Even fetch_rgba_{s,u}int follows the convention. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Marek Olšák <maraeo@gmail.com>	2013-07-17 13:01:46 +02:00
Kyle McMartin	87c3440567	llvmpipe: use MCJIT on ARM and AArch64 MCJIT is the only supported LLVM JIT on AArch64 and ARM (the regular JIT has bit-rotted badly on ARM and doesn't exist on AArch64.) Signed-off-by: Kyle McMartin <kyle@redhat.com> Signed-off-by: Dave Airlie <airlied@gmail.com>	2013-07-17 17:29:01 +10:00
Kenneth Graunke	00d32cd5b4	glsl: Fix absurd whitespace conventions in the parser. Historically, we indented grammar production rules with a single 8-space tab, but code inside of blocks used Mesa's 3-space indents. This meant when editing code, you had to use an 8-space tab for the first level of indentation, and 3-spaces after that. Unless you specifically configure your editor to understand this, it will get the indentation wrong on every single line you touch, which quickly devolves into a colossal waste of time. It's also inconsistent with every other file in the entire project. This patch removes all tabs and moves to a consistent 3-space indent. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-07-16 11:31:58 -07:00
Kenneth Graunke	4ab7fc9ec3	glsl: Fail the build if the grammar contains shift/reduce errors. When working on a parser, it's very easy to accidentally introduce new shift/reduce conflicts. Failing the build guarantees they'll be noticed and fixed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-07-16 11:31:58 -07:00
Kenneth Graunke	73620709c9	glsl: Silence the last shift/reduce conflict warning in the grammar. The single remaining shift/reduce conflict was the classic ELSE problem: 292 selection_rest_statement: statement . ELSE statement 293 \| statement . ELSE shift, and go to state 479 ELSE [reduce using rule 293 (selection_rest_statement)] $default reduce using rule 293 (selection_rest_statement) The correct behavior here is to shift, which is what happens by default. However, resolving it explicitly will make it possible to fail the build on new errors, making them much easier to detect. The classic way to solve this is to use right associativity: http://www.gnu.org/software/bison/manual/html_node/Non-Operators.html Since there is no THEN token in GLSL, we need to fake one. %right THEN creates a new terminal symbol; the %prec directive says to use the precedence of that terminal. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-07-16 11:31:58 -07:00
Vinson Lee	fa7829c36b	glsl: Initialize ast_jump_statement::opt_return_value. opt_return_value was not initialized if mode != ast_return. Fixes "Uninitialized pointer field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-16 09:03:02 -07:00
Vinson Lee	f74acb9835	glapi: Do not use backtrace on OpenBSD. execinfo.h is not available on OpenBSD. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-16 09:00:38 -07:00
Maarten Lankhorst	b20b2b6dc8	osmesa: link against static libglapi library too to get the gl exports This should fix missing symbols in an osmesa built against a shared glapi. All OpenGL exports that are defined in the static glapi were missing, so link against both to fix this. This is a candidate for the stable series. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47824 Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-07-16 10:18:40 +02:00
Chris Forbes	121ea0b38b	i965/Gen4: Zero extra coordinates for ir_tex We always emit U,V,R coordinates for this message, but the sampler gets very angry if we pass garbage in the R coordinate for at least some texture formats. Fill the remaining coordinates with zero instead. Fixes broken rendering on GM45 in Source games, and in VDrift. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65236 NOTE: This is a candidate for stable branches. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-16 19:08:41 +12:00
Kenneth Graunke	e4fdf1b008	i965: Cite the Ivybridge PRM for 3DSTATE_CLEAR_PARAMS notes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-15 19:40:53 -07:00
Kenneth Graunke	b72a298751	i965: Refer people to brw_tex_layout.c rather than the BSpec. brw_tex_layout.c sets up the align_w/h fields, and has all the appropriate spec references already. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-15 19:40:53 -07:00
Kenneth Graunke	4b704424e0	i965: Remove old BSpec reference from BLORP's 3DSTATE_WM/PS packets. The Sandybridge code had a citation for the range of the "Maximum Number of Threads" field, and the Ivybridge code just mentioned the "BSpec" in general. That's documented in the obvious place, so people can find it without a spec reference. The real value of the comment is to say "we tried zero, and it exploded, so program it to a valid number even if pixel shading is off." Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-15 19:40:52 -07:00
Kenneth Graunke	ada110716a	i965: Cite the Ivybridge PRM for 3DSTATE_URB_* programming. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-15 19:40:52 -07:00
Kenneth Graunke	90b5a03581	i965: Update workaround flush comments for Gen6 3DSTATE_VS. Unfortunately, the workaround text never made it into the Sandybridge PRM, so we still have to refer to the BSpec. It also wasn't obvious why we needed this workaround at all, since we don't currently do VS passthrough - but BLORP can turn off the VS. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-15 19:40:52 -07:00
Kenneth Graunke	3b3a440d2b	i965: Cite the Ivybridge PRM for VS PIPE_CONTROL workarounds. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-15 19:40:52 -07:00
Kenneth Graunke	9a86875c6b	i965: Cite the Sandybridge PRM for Gen7 stencil pitch requirements. Sadly, the Ivybridge PRM can't be cited, as it is missing the relevant text for some reason. However, the Sandybridge PRM has the text Chad originally quoted, and the modern BSpec has the same text. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-15 19:40:52 -07:00
Kenneth Graunke	2e928e2a3f	i965: Cite the Ivybridge PRM for multisample surface format notes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-15 19:40:52 -07:00
Kenneth Graunke	43ea434225	i965: Delete "the data cache is the sampler cache" comments on Gen7+. I cut and pasted these comments from the Gen4 code during Ivybridge enabling, and didn't understand what they meant at the time. The data cache is NOT the same as the sampler cache on Ivybridge. The sampler cache has L1 and L2 caches in addition to the L3 cache, while data port messages to the "data cache" hit L3 directly. This means that the sampler domain is technically wrong, but we stopped caring about read/write domains quite a while ago. The kernel just flushes all the caches at the end of each batchbuffer, and our render to texture code flushes the sampler caches when necessary. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-15 19:40:52 -07:00
Kenneth Graunke	3f64cfabfc	i965: Cite the 965 PRM for "the data cache is the sampler cache". Presumably, this comment exists to justify the usage of I915_GEM_DOMAIN_SAMPLER for this relocation. At one point, this was necessary to ensure that the right flushing was done to keep caches coherent. These days, the kernel just flushes everything, so I don't think it matters. Still, the comment is interesting, so leave it in place. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-15 19:40:51 -07:00
Kenneth Graunke	f254c94204	i965: Cite the Ivybridge PRM for DP message descriptor fields. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-15 19:40:51 -07:00
Kenneth Graunke	a0c8e76202	i965: Cite the Ivybridge PRM for why the fake MRF range is what it is. The exact text is in the public docs, so we should cite those. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-15 19:40:51 -07:00
Kenneth Graunke	3090d39dde	i965: Cite the Ivybridge PRM for SFID enum values. The Ivybridge PRM adds new SFIDs and lists them in a different volume than Sandybridge, so it's worth adding a reference. I also removed the BSpec reference, as the section it referred to was moved somewhere, and I couldn't find it. This leaves one Haswell SFID without a citation, but we can add one once the PRMs are out. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-15 19:40:51 -07:00
Roland Scheidegger	dc1cc928ed	llvmpipe: support sRGB framebuffers Just use the new conversion functions to do the work. The way it's plugged into the blend code is quite hacktastic but follows all the same hacks as used by packed float format already. Only support 4x8bit srgb formats (rgba/rgbx plus swizzle), 24bit formats never worked anyway in the blend code and are thus disabled, and I don't think anyone is interested in L8/L8A8. Would need even more hacks otherwise. Unless I'm missing something, this is the last feature except MSAA needed for OpenGL 3.0, and for OpenGL 3.1 as well I believe. v2: prettify a bit, use separate function for packing. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-16 01:54:51 +02:00
Marek Olšák	a882067d74	Revert "r300g: allow HiZ with a 16-bit zbuffer" This reverts commit `631c631cbf`. https://bugs.freedesktop.org/show_bug.cgi?id=66921 Cc: mesa-stable@lists.freedesktop.org	2013-07-15 23:46:01 +02:00
Marek Olšák	7969b567bd	r300g/swtcl: fix a lockup in MSAA resolve Cc: mesa-stable@lists.freedesktop.org	2013-07-15 23:45:22 +02:00
Marek Olšák	22427640b2	r300g/swtcl: fix geometry corruption by uploading indices to a buffer The splitting of a draw call into several draw commands was broken, because the split sometimes took place in the middle of a primitive. The splitting was supposed to be dealing with the case when there are more indices than the maximum size of a CS. This commit throws that code away and uses a real index buffer instead. https://bugs.freedesktop.org/show_bug.cgi?id=66558 Cc: mesa-stable@lists.freedesktop.org	2013-07-15 23:45:16 +02:00
Matt Turner	c889df3fbe	glsl: Reject C-style initializers with unknown types. _mesa_ast_set_aggregate_type walks through declarations initialized with C-style aggregate initializers and stops when it runs out of LHS declarations or RHS expressions. In the example vec4 v = {{{1, 2, 3, 4}}}; _mesa_ast_set_aggregate_type would not recurse into the subexpressions (since vec4s do not contain types that can be initialized with an aggregate initializer) to set their <constructor_type>s. Later in ::hir we would dereference the NULL pointer and segfault. If <constructor_type> is NULL in ::hir we know that the LHS and RHS were unbalanced and the code is illegal. Arrays, structs, and matrices were unaffected. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-07-15 13:02:36 -07:00
Paul Berry	7706e52b25	glsl: Rework builtin_variables.cpp to reduce code duplication. Previously, we had a separate function for setting up the built-in variables for each combination of shader stage and GLSL version (e.g. generate_110_vs_variables to generate the built-in variables for GLSL 1.10 vertex shaders). The functions called each other in ad-hoc ways, leading to unexpected inconsistencies (for example, generate_120_fs_variables was called for GLSL versions 1.20 and above, but generate_130_fs_variables was called only for GLSL version 1.30). In addition, it led to a lot of code duplication, since many varyings had to be duplicated in both the FS and VS code paths. With the advent of geometry shaders (and later, tessellation control and tessellation evaluation shaders), this code duplication was going to get a lot worse. So this patch reworks things so that instead of having a separate function for each shader type and GLSL version, we have a function for constants, one for uniforms, one for varyings, and one for the special variables that are specific to each shader type. In addition, we use a class, builtin_variable_generator, to keep track of the instruction exec_list, the GLSL parse state, commonly-used types, and a few other variables, so that we don't have to pass them around as function arguments. This makes the code a lot more compact. Where it was feasible to do so without introducing compilation errors, I've also gone ahead and introduced the variables needed for {ARB,EXT}_geometry_shader4 style geometry shaders. This patch takes care of everything except the GS variable gl_VerticesIn, the FS variable gl_PrimitiveID, and GLSL 1.50 style geometry shader inputs (using the gl_in interface block). Those remaining features will be added later. 
I've also made a slight nomenclature change: previously we used the word "deprecated" to refer to variables which are marked in GLSL 1.40 as requiring the ARB_compatibility extension, and are marked in GLSL 1.50 onward as requiring the compatibility profile. This was misleading, since not all deprecated variables require the compatibility profile (for example gl_FragData and gl_FragColor, which have been deprecated since GLSL 1.30, but do not require the compatibility profile until GLSL 4.20). We now consistently use the word "compatibility" to refer to these variables. This patch doesn't introduce any functional changes (since geometry shaders haven't been enabled yet). Reviewed-by: Matt Turner <mattst88@gmail.com> v2: Rename "typ" -> "type". Add blank line between inline functions and declarations in builtin_variable_generator class. Use the standard comment "/* FALLTHROUGH */" for compatibility with static code analysis tools. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-15 09:35:28 -07:00
Paul Berry	428e030210	glsl: Fix lower_named_interface_blocks to account for dereferences of consts. In certain rare cases (such as those involving dereference of a literal constant array of structs), flatten_named_interface_blocks_declarations's rvalue visitor may be invoked on an ir_dereference_record whose variable_referenced() method returns NULL. Check for this case to avoid a segfault. Prevents crashes in piglit tests {vs,fs}-deref-literal-array-of-structs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-07-15 07:59:52 -07:00
Paul Berry	b2265db8e7	glsl: Don't allow vertex shader input arrays until GLSL 1.50. Vertex shader inputs are not allowed to be arrays until GLSL 1.50. We were accidentally enabling them for GLSL 1.40 (although we haven't written any tests for them, so it's not clear whether they actually work). NOTE: although this is a simple bug fix, it probably isn't sensible to cherry-pick it to stable release branches, since its only effect is to cause incorrectly-written shaders to fail to compile. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-15 07:50:47 -07:00
Chris Forbes	b616d01661	i965: Gen4/5: use IEEE floating point mode for GLSL shaders. Fixes isinf(), isnan() from GLSL 1.30 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-14 19:58:25 +12:00
Chris Forbes	1ec66f2fb2	i965/vs: Gen4/5: enable front colors if back colors are written Fixes undefined results if a back color is written, but the corresponding front color is not, and only backfacing primitives are drawn. Results are still undefined if a frontfacing primitive is drawn, but that's OK. The other reasonable way to fix this would have been to just pick the one color slot that was populated, but that dilutes the value of the tests. On Gen6+, the fixed function clipper and triangle setup already take care of this. Fixes 11 piglits: spec/glsl-1.10/execution/interpolation/interpolation-none-gl_BackColor- NOTE: This is a candidate for stable branches. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-14 19:58:11 +12:00
Roland Scheidegger	796b73d1fe	gallivm: (trivial) use constant instead of exp2f() function Some lame compilers can't do exp2f(), and as far as I can tell they can't do exp2() (with doubles) either. Providing a workaround wouldn't be too bad (just replace it with pow), but since the function is only used with a constant, just use the precalculated constant.	2013-07-14 02:39:33 +02:00
Chia-I Wu	62c546bbf8	ilo: skip 3DSTATE_INDEX_BUFFER when possible When only the offset to the index buffer is changed, we can skip the 3DSTATE_INDEX_BUFFER if we always use 0 for the offset, and add (offset / index_size) to Start Vertex Location in 3DPRIMITIVE.	2013-07-14 05:59:52 +08:00
Roland Scheidegger	6bcbb0dc82	gallivm: handle srgb-to-linear and linear-to-srgb conversions srgb-to-linear is using 3rd degree polynomial for now which should be _just_ good enough. Reverse is using some rational polynomials and is quite accurate, though not hooked into llvmpipe's blend code yet and hence unused (untested). Using a table might also be an option (for srgb-to-linear especially). This does not enable any new features yet because EXT_texture_srgb was already supported via util_format fallbacks, but performance was lacking probably due to the external function call (the table used by the util_format_srgb code may not be all that much slower on its own). Some performance figures (taken from modified gloss, replaced both base and sphere texture to use GL_SRGB instead of GL_RGB, measured on 1GHz Sandy Bridge, the numbers aren't terribly accurate): normal gloss, aos, 8-wide: 47 fps normal gloss, aos, 4-wide: 48 fps normal gloss, forced to soa, 8-wide: 48 fps normal gloss, forced to soa, 4-wide: 47 fps patched gloss, old code, soa, 8-wide: 21 fps patched gloss, old code, soa, 4-wide: 24 fps patched gloss, new code, soa, 8-wide: 41 fps patched gloss, new code, soa, 4-wide: 38 fps So there's a performance hit but it seems acceptable, certainly better than using the fallback. Note the new code only works for 4x8bit srgb formats, others (L8/L8A8) will continue to use the old util_format fallback, because I can't be bothered to write code for formats no one uses anyway (as decoding is done as part of lp_build_unpack_rgba_soa which can only handle block type width of 32). Compressed srgb formats should get their own path though eventually (it is going to be expensive in any case, first decompress, then convert). No piglit regressions. v2: use lp_build_polynomial instead of ad-hoc polynomial construction, also since keeping both linear to srgb functions for now make sure both are compiled (since they share quite some code just integrate into the same function). 
v3: formatting fixes and bugfix in the complicated (disabled) linear-to-srgb path. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-13 18:42:17 +02:00
Roland Scheidegger	9b8d97e5bf	gallivm: better support for fast rsqrt We had to disable fast rsqrt before because it wasn't precise enough etc. However in situations when we know we're not going to need more precision we can still use a fast rsqrt (which can be several times faster than the quite expensive sqrt). Hence introduce a new helper which does exactly that - it is probably not useful calling it in some situations if there's no fast rsqrt available so make it queryable if it's available too. v2: use fast_rsqrt consistently instead of rsqrt_fast, fix indentation, let rsqrt use fast_rsqrt. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-13 18:42:17 +02:00
Klemens Baum	45574ab2e9	configure.ac: better detection of LLVM version Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-07-12 21:20:59 -07:00
Vinson Lee	b0c3c955ae	r600g/sb: Initialize ra_constraint::cost. Fixes "Uninitialized scalar field" reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-07-13 06:57:26 +04:00
Vinson Lee	be8d787873	glsl: Initialize ast_aggregate_initializer::constructor_type. Fixes "Uninitialized pointer field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-12 18:42:46 -07:00
Paul Berry	c6bfe62e21	glsl: Make gl_TexCoord compatibility-only gl_TexCoord was deprecated in GLSL 1.30. In GLSL 1.40 it was marked as ARB_compatibility-only, and in GLSL 1.50 and above it was marked as only appearing in the compatibility profile. It has never appeared in GLSL ES. However, Mesa erroneously included it in all desktop versions of GLSL, even versions 1.40 and 1.50 (which do not currently support the compatibility profile). This patch makes gl_TexCoord available in the compatibility profile (and GLSL versions 1.30 and prior) only. NOTE: although this is a simple bug fix, it probably isn't sensible to cherry-pick it to stable release branches, since its only effect is to cause incorrectly-written shaders to fail to compile. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-12 18:18:49 -07:00
Paul Berry	8f51d68f8c	glsl ES: Fix magnitude of gl_MaxVertexUniformVectors. Previously, we set it equal to MaxVertexUniformComponents. It should be MaxVertexUniformComponents / 4. NOTE: This is a candidate for the stable branches. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-12 18:18:48 -07:00
Marek Olšák	06b38dbab2	winsys/radeon: allow a NULL cs pointer in radeon_bo_map to fix a segfault The original idea was that cs=NULL should be allowed here, but we never used NULL until `862f69fbe1`. This fixes a segfault in CoreBreach.	2013-07-13 02:38:23 +02:00
Chia-I Wu	8d4ac98549	ilo: move a sanity check into its assert() The compiler does not know that ilo_3d_pipeline_estimate_size() is pure and can be eliminated in a release build in gen6_pipeline_end(). Move the call into the assert().	2013-07-13 07:27:28 +08:00
Chia-I Wu	bf9670270f	ilo: mark some states dirty when they are really changed The checks may seem redundant because cso_context handles them, but util_blitter does not have access to cso_context.	2013-07-13 06:43:53 +08:00
Chia-I Wu	9047598a8d	ilo: clean up ilo_blitter_pipe_begin() Document why certain states need to be saved, and fix a bug when blitting with scissor enabled.	2013-07-13 06:43:53 +08:00
Alex Deucher	e0a7565832	r600g: don't use the CB/DB CP COHER logic on r6xx There are hw bugs. Flush and inv event is sufficient. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=66837 Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-12 18:07:56 -04:00
Jonathan Liu	af16f73051	configure: Avoid use of AC_CHECK_FILE for cross compiling The AC_CHECK_FILE macro can't be used for cross compiling as it will result in "error: cannot check for file existence when cross compiling". Replace it with the AS_IF macro. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Jonathan Liu <net147@gmail.com>	2013-07-12 13:21:28 -07:00
Brian Paul	bf86e0e050	nv30: fix KILL_IF breakage Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66858	2013-07-12 10:00:18 -06:00
Zack Rusin	00cd455bd5	gallium: fixup definitions of the rsq and sqrt GLSL spec says that rsq is undefined for src<=0, but the D3D10 spec says it needs to be a NaN, so let's stop taking an absolute value of the source which completely breaks that behavior. For the gl program we can simply insert an extra abs instruction which produces the desired behavior there. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-07-11 20:19:04 -04:00
José Fonseca	a171812d27	util/u_format: Comment out half float denormal test case. So that lp_test_format doesn't fail until we decide what should be done.	2013-07-12 15:48:38 +01:00
José Fonseca	1b0d29b5da	gallivm: Eliminate redundant lp_build_select calls. lp_build_cmp already returns 0 / ~0, so the lp_build_select call is unnecessary. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-12 15:40:16 +01:00
Brian Paul	46205ab8cc	tgsi: rename the TGSI fragment kill opcodes TGSI_OPCODE_KIL and KILP had confusing names. The former was conditional kill (if any src component < 0). The latter was unconditional kill. At one time KILP was supposed to work with NV-style condition codes/predicates but we never had that in TGSI. This patch renames both opcodes: TGSI_OPCODE_KIL -> KILL_IF (kill if src.xyzw < 0) TGSI_OPCODE_KILP -> KILL (unconditional kill) Note: I didn't just transpose the opcode names to help ensure that I didn't miss updating any code anywhere. I believe I've updated all the relevant code and comments but I'm not 100% sure that some drivers had this right in the first place. For example, the radeon driver might have llvm.AMDGPU.kill and llvm.AMDGPU.kilp mixed up. Driver authors should review their code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-12 08:32:51 -06:00
Brian Paul	f501baabdb	tgsi: fix-up KILP comments KILP is really unconditional fragment kill. We've had KIL and KILP transposed forever. I'll fix that next. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-12 08:32:51 -06:00
Brian Paul	e7c3898725	tgsi: exec TGSI_OPCODE_SQRT as a scalar instruction, not vector To align with the docs and the state tracker. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-12 08:32:51 -06:00
Brian Paul	f3fad24b62	tgsi: use X component of the second operand in exec_scalar_binary() The code happened to work in the past since the (scalar) src args effectively always have a swizzle of .xxxx, .yyyy, .zzzz, or .wwww so whether you grab the X or Y component doesn't really matter. Just fixing the code to make it look right. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-12 08:32:51 -06:00
Brian Paul	cb2de08f27	mesa: update glext.h to version 20130708 This update fixes the problem with duplicated typedefs for GLclampf and GLclampd in the previous version. It also changes some parameter types for glDebugMessageCallbackARB() and glTransformFeedbackVaryingsEXT(). Note we should someday update the glapi-gen code so that it understands void pointer parameters. Currently, the Python code only understands "GLvoid " but not "void ". Luckily, the compilers don't seem to complain about mixing GLvoid and void.	2013-07-12 08:32:51 -06:00
Brian Paul	5749aea255	mesa: fix Address Sanitizer (ASan) issue in _mesa_add_parameter() If the size argument isn't a multiple of four, we would have read/ copied uninitialized memory. Fixes an issue reported by Myles C. Maxfield <myles.maxfield@gmail.com>	2013-07-12 08:32:51 -06:00
Brian Paul	9ca026e220	mesa: simplify some _mesa_IsEnabled() queries No need to test array->Enabled != 0 since the Enabled field can only be 0 or 1. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-12 08:32:50 -06:00
Brian Paul	9fc532a263	os: add os_get_process_name() function v2: explicitly test for BSD/APPLE, #warning for unexpected environments.	2013-07-12 08:32:50 -06:00
Brian Paul	3fb3e1e38c	mesa: whitespace, formatting, 80-column wrapping	2013-07-12 08:32:22 -06:00
Brian Paul	919236f3a2	softpipe: silence some MSVC warnings	2013-07-12 08:19:52 -06:00
Brian Paul	76666b9394	hud: silence some MSVC warnings	2013-07-12 08:19:52 -06:00
Brian Paul	d7a852b3a1	util: add casts to silence MSVC warnings in u_blit.c	2013-07-12 08:19:51 -06:00
Brian Paul	c45d8f2e98	tgsi: s/unsigned/int/ to silence MSVC warning	2013-07-12 08:19:50 -06:00
Brian Paul	2cfd768473	mesa: s/unsigned/int/ to fix MSVC warning in uniforms.c	2013-07-12 08:19:50 -06:00
Brian Paul	5b0fbf1b0b	mesa: s/GLuint/GLint/ to silence MSVC warning in texstore.c	2013-07-12 08:19:50 -06:00
Brian Paul	721f47227e	mesa: add casts to fix MSVC warnings in multisample.c	2013-07-12 08:19:49 -06:00
Brian Paul	528e5b9476	mesa: s/GLint/GLuint/ to fix MSVC warnings in mipmap.c	2013-07-12 08:19:49 -06:00
Brian Paul	738337356b	mesa: fix inconsistent function declaration, definitions To silence MSVC warnings that the declaration and definitions were different.	2013-07-12 08:19:49 -06:00
Brian Paul	8ba5c79d2c	mesa: add cast to silence MSVC warning	2013-07-12 08:19:49 -06:00
Christian König	1681bd7f2b	radeon/uvd: fall back to shader based decoding for MPEG2 on UVD 2.x v2 UVD 2.x doesn't support hardware decoding of MPEG2, just use shader based decoding for those chipsets. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=66450 v2: fix interlacing as well Signed-off-by: Christian König <christian.koenig@amd.com>	2013-07-12 10:52:27 +02:00
José Fonseca	649ef4da30	glsl: Avoid variable length arrays. They are a non-standard GCC extension that's not widely supported by other C/C++ compilers. Use a dynamic array instead. Trivial. Should fix the MSVC build.	2013-07-12 09:28:22 +01:00
Matt Turner	1b0d6aef03	glsl: Add support for C-style initializers. Required by GL_ARB_shading_language_420pack. Parts based on work done by Todd Previte and Ken Graunke, implementing basic support for C-style initializers of arrays. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-11 20:58:59 -07:00
Matt Turner	ae79e86d4c	glsl: Add infrastructure for aggregate initializers. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-11 20:58:59 -07:00
Matt Turner	8d45caaeba	glsl: Add an is_declaration field to ast_struct_specifier. Will be used in a later commit to differentiate between a structure type declaration and a variable declaration of a struct type. I.e., the difference between struct S { float x; }; (is_declaration = true) and S s; (is_declaration = false) Also note that is_declaration = true for struct S { float x; } s; Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-11 20:58:59 -07:00
Matt Turner	5df807b06f	glsl: Track structs' ast_type_specifiers in symbol table. Will be used in a future commit. An ast_type_specifier is stored (rather than an ast_struct_specifier) with the idea that we may have more general uses for this in the future. struct names are prefixed with '#ast.' to avoid collisions with the glsl_types in the symbol table. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-11 20:58:59 -07:00
Matt Turner	e641b5fbee	glsl: Add process_vec_mat_constructor() function. Based largely on process_array_constructor(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-11 20:58:59 -07:00
Matt Turner	af2987d5b6	glsl: Separate code into process_record_constructor(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-11 20:58:59 -07:00
Matt Turner	a760c73853	glsl: Add copy-constructor for ast_struct_specifier. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-11 20:58:59 -07:00
Matt Turner	43757135b2	glsl: Add a constructor for ast_type_specifier. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-11 20:58:58 -07:00
Matt Turner	b85f0c5121	glsl: Clean up and clarify comment explaining initializer rules. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-11 20:58:58 -07:00
Matt Turner	ce2464a8a7	glsl: Change type of is_array to bool. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-11 20:58:58 -07:00
Matt Turner	361206771c	glsl: Add a comment to note what an exec_list is a list of. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-11 20:58:58 -07:00
Matt Turner	46b74ca7bc	glsl: Fix inverted conditional in error message. The code float a[2] = float[2]( 3.4, 4.2, 5.0 ); previously generated this: error: array constructor must have at least 2 parameters when in fact it requires exactly two. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-11 20:58:58 -07:00
Matt Turner	9749d96817	glsl: Add missing return error_value(ctx) in error path. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-11 20:58:58 -07:00
Matt Turner	e117eda251	glsl: Remove unnecessary #include from ast_type.cpp. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-11 20:58:58 -07:00
Chia-I Wu	93742d9757	glsl/build: build builtin_compiler with VISIBILITY_CFLAGS libglslcore.la and libglcpp.la that are built with builtin_compiler are also linked to by drivers not using libdricore. Since there is no public symbol in them, it is better to mark all symbols hidden. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-07-12 09:42:25 +08:00
Matt Turner	08c90f651b	glsl: Add comment explaining "row_major" parsing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-11 16:22:07 -07:00
Matt Turner	14ed9018de	glsl: Mark "row_major" as not a reserved word in GLSL ES 3.0. We mark ARB_uniform_buffer_object as enabled under ES 3 since it contains that functionality, which tricked the compiler into tokenizing "row_major". Acked-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-11 16:22:07 -07:00
Matt Turner	c30948517e	glsl: Remove outdated FINISHME comment. Explicit index support was added by commit `1256a5dc`. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-11 16:22:07 -07:00
Alex Deucher	77300bacaf	radeon: bump libdrm_radeon requirement for CIK support Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-11 19:11:44 -04:00
Christoph Bumiller	9974593dfb	r600g: x/y coordinates must be divided by block dim in dma blit Note: this is a candidate for the 9.1 branch. Reviewed-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-11 19:11:44 -04:00
Chih-Wei Huang	1d9271a95c	r600g/sb: Fix Android build v2 Add the sb CXX files to the Android Makefile and also stop using some c++11 features. v2 (Vadim Girlin): use &bc[0] instead of bc.begin()	2013-07-12 01:11:04 +04:00
Vadim Girlin	758ac6f918	r600g/sb: improve math optimizations v2 This patch adds support for some math optimizations that are generally considered unsafe, that's why they are currently disabled for compute shaders. GL requirements are less strict, so they are enabled for GL shaders by default. In case of any issues with applications that rely on higher precision than guaranteed by GL, the 'sbsafemath' option in R600_DEBUG allows disabling them. v2 - always set proper src vector size for transformed instructions - check for clamp modifier in the expr_handler::fold_assoc Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-07-11 23:01:01 +04:00
Jonathan Gray	c451619dde	st/xvmc/tests: avoid non portable error.h functions Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Christian König <christian.koenig@amd.com>	2013-07-11 09:52:00 +02:00
Anuj Phogat	9a1a67b081	i965/blorp: Fix clear rectangle alignment in fast color clear From BSpec: 3D-Media-GPGPU Engine > 3D Pipeline > Pixel > Pixel Backend > MCS Buffer for Render Target(s) [DevIVB+]: [DevHSW:GT3]: Clear rectangle must be aligned to two times the number of pixels in the table shown below... Observed no piglit, gles3conform regressions with this patch. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65744	2013-07-10 18:41:16 -07:00
Chia-I Wu	ad244884fc	winsys/intel: build with VISIBILITY_CFLAGS There is no public symbol in this winsys.	2013-07-11 09:03:59 +08:00
Chia-I Wu	79bc245c01	ilo: reduce PIPE_CAP_MAX_TEXTURE_CUBE_LEVELS to 12 So that there are at most (2^22 * 6) texels, lower than the 2^26 limit.	2013-07-11 08:03:27 +08:00
Chia-I Wu	29af29b8dc	ilo: correctly initialize undefined registers in fs Initialize all 4 channels of undefined registers (that is, TEMPs that are used before being assigned) in FS.	2013-07-11 07:01:51 +08:00
Michel Dänzer	a06ee5a09e	radeonsi: Handle TGSI_OPCODE_DDX/Y using local memory 16 more little piglits. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-07-10 18:40:32 +02:00
Michel Dänzer	a6b83c0f23	radeonsi: Handle TGSI_OPCODE_TXD One more little piglit. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-07-10 12:16:38 +02:00
José Fonseca	b042aae70d	util/u_math: Use xmmintrin.h whenever possible. It seems __builtin_ia32_ldmxcsr is only available on gcc and only when -msse is used. xmmintrin.h/pmmintrin.h provide portable intrinsics, but these too are only available with gcc when -msse/-msse3 are set. scons build always sets -msse on x86 builds, but autotools doesn't seem to. We could try to get this working on gcc x86 without -msse by emitting assembly, but I believe that in this day and age we really should be building Mesa with -msse and -msse2.	2013-07-10 07:56:17 +01:00
Chia-I Wu	045bf0db52	ilo: honor surface padding requirements The PRM specifies several padding requirements that we failed to honor.	2013-07-10 12:40:22 +08:00
Zack Rusin	63386b2f66	util: treat denorm'ed floats like zero The D3D10 spec is very explicit about treatment of denorm floats and the behavior is exactly the same for them as it would be for -0 or +0. This makes our shading code match that behavior, since OpenGL doesn't care and on a few CPUs it's faster (worst case the same). Float16 conversions will likely break but we'll fix them in a follow-up commit. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-09 23:30:55 -04:00
Matt Turner	80bc14370a	mesa: Set ProfileMask properly for core profile. Fixes MESA_GL_VERSION_OVERRIDE=3.2 egl-create-context-verify-gl-flavor. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-09 14:19:22 -07:00
Kenneth Graunke	8c9a54e7bc	i965: Delete intel_context entirely. This makes brw_context inherit directly from gl_context; that was the only thing left in intel_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:35 -07:00
Kenneth Graunke	53631be4eb	i965: Move intel_context::gen and gt fields to brw_context. Most functions no longer use intel_context, so this patch additionally removes the local "intel" variables to avoid compiler warnings. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:34 -07:00
Kenneth Graunke	2e26afb37b	i965: Move intel_context::has_llc to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:33 -07:00
Kenneth Graunke	794de2f387	i965: Move intel_context::is_<platform> flags to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:31 -07:00
Kenneth Graunke	44fd490067	i965: Move must_use/has_separate_stencil fields to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:30 -07:00
Kenneth Graunke	3b80b147f6	i965: Move intel_context::has_hiz to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:29 -07:00
Kenneth Graunke	351d2add62	i965: Free brw, not intel. Things worked out in the past because both brw and intel share the same memory address (by virtue of intel being the first member of brw). However, brw is what actually gets rzalloc'd (brw_context.c:285), so freeing that seems safer and more obvious. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:28 -07:00
Kenneth Graunke	e3c2bb1eb4	i965: Shorten context base class dereference chains. ctx->DrawBuffer is much more sensible than brw->intel.ctx.DrawBuffer. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:26 -07:00
Kenneth Graunke	d5b4a3f5a3	i965: Move intel_context::has_swizzling to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:25 -07:00
Kenneth Graunke	02128c448d	i965: Move intel_context::intelScreen to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:24 -07:00
Kenneth Graunke	44a11eab9c	i965: Delete unused intel_context::driFd field. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:23 -07:00
Kenneth Graunke	e0858763bc	i965: Store brw_context as the DRI driver private, not intel_context. Right now, they're interchangeable. In the future, intel_context will either go away or change purpose. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:21 -07:00
Kenneth Graunke	a1d94cdb00	i965: Move intel_context::driContext to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:20 -07:00
Kenneth Graunke	a9d33dbbdd	i965: Move intel_context::NewGLState to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:19 -07:00
Kenneth Graunke	dd54558d31	i965: Move intel_context::upload to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:17 -07:00
Kenneth Graunke	0273e6e23e	i965: Move intel_context::max_gtt_map_object_size to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:16 -07:00
Kenneth Graunke	b15f1fc3c6	i965: Move intel_context::perf_debug to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:14 -07:00
Kenneth Graunke	7c3180a4ad	i965: Move intel_context::no_batch_wrap to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:13 -07:00
Kenneth Graunke	5314afa27a	i965: Move intel_context's framerate throttling fields to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:12 -07:00
Kenneth Graunke	ec995de6fb	i965: Move intel_context::stats_wm to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:10 -07:00
Kenneth Graunke	329779a0b4	i965: Move intel_context::batch to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:08 -07:00
Kenneth Graunke	5d8186ac1a	i965: Move intel_context::hw_ctx to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:07 -07:00
Kenneth Graunke	eeb75b41f1	i965: Move intel_context::bufmgr to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:05 -07:00
Kenneth Graunke	e33439045d	i965: Move intel_context's driconf flags to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:04 -07:00
Kenneth Graunke	fe0a8cb30d	i965: Move intel_context::reduced_primitive to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:03 -07:00
Kenneth Graunke	9147b40496	i965: Move front buffer rendering fields from intel_context to brw. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:09:01 -07:00
Kenneth Graunke	e43043c316	i965: Move intel_context::vtbl to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:08:58 -07:00
Kenneth Graunke	fbdd3891e1	i965: Move intel_context::optionCache to brw_context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:08:55 -07:00
Kenneth Graunke	ca437579b3	i965: Pass brw_context to functions rather than intel_context. This makes brw_context available in every function that used intel_context. This makes it possible to start migrating fields from intel_context to brw_context. Surprisingly, this actually removes some code, as functions that use OUT_BATCH don't need to declare "intel"; they just use "brw." Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:08:53 -07:00
Kenneth Graunke	86f2711722	i965: Remove pointless intel_context parameter from try_copy_propagate. It's already part of the visitor class. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:08:51 -07:00
Kenneth Graunke	18a223d323	i965: Add forward declarations of brw_context to a few places. These files have forward declarations for intel_context. This makes brw_context available in the same places without further #include monkeying. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:08:50 -07:00
Kenneth Graunke	a69274454b	i965: Replace #include "intel_context.h" with brw_context.h. brw_context.h includes intel_context.h, but additionally makes the brw_context structure available. Switching this allows us to start using brw_context in more places. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:08:48 -07:00
Kenneth Graunke	99ebf9d07a	i965: Move ctx->Const setup from intelInitContext to the new helper. This also requires moving _mesa_init_point() to after the ctx->Const initialization. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:08:47 -07:00
Kenneth Graunke	963d9f78a4	i965: Split code to set ctx->Const values into a helper function. brwCreateContext() has a lot of random things to do. Factoring out the part that initializes ctx->Const values and shader compiler options makes the main function a bit easier to read. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:08:45 -07:00
Kenneth Graunke	d13c120573	i915: Remove i965+ chip names. i965+ chipsets shouldn't ever hit this driver. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:08:44 -07:00
Kenneth Graunke	e4f3d5cdcf	i965: Remove i915 chip names. i915 chipsets shouldn't ever hit this driver. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:08:42 -07:00
Kenneth Graunke	2921390666	i965: Replace intel_context:needs_ff_sync with intel->gen == 5. Technically, needs_ff_sync was set on Gen5+, but it was only consulted in the clipper threads and quad/lineloop decomposition code, which are both Gen4-5 only. So in reality it only identified Ironlake. The named flag doesn't really clarify things, and seems like overkill. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-07-09 14:07:13 -07:00
Kenneth Graunke	968c57782d	i965: Add missing newline to blorp color clear perf_debug message. perf_debug() doesn't add a newline for you; without this, all the INTEL_DEBUG=perf output was jumbled together. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-09 10:10:46 -07:00
Emil Velikov	f0260f4e3d	glsl: Silence unused variable warning in the release build Resolves the following gcc warning opt_flip_matrices.cpp:84:32: warning: unused variable 'deref' v2: keep the variable, but wrap it in a ifndef NDEBUG block (suggested by Ian) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-08 19:08:42 -07:00
Emil Velikov	4df6823f21	glsl/ast: Silence uninitialized variable warnings in the release build Resolves the following gcc warnings warning: 'iface_type_name' may be used uninitialized in this function warning: 'var_mode' may be used uninitialized in this function Note: The variables are initialised to UNKNOWN and ir_var_auto Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-08 19:08:30 -07:00
Paul Berry	292368570a	i965: Add an assertion to brwProgramStringNotify. driver->ProgramStringNotify is only called for ARB programs, fixed function vertex programs, and ir_to_mesa (which isn't used by the i965 back-end). Therefore, even after geometry shaders are added, brwProgramStringNotify should only ever be called with a target of GL_VERTEX_PROGRAM_ARB or GL_FRAGMENT_PROGRAM_ARB. This patch adds an assertion to clarify that. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-08 14:18:02 -07:00
Matt Turner	ba7b60d3e4	glsl: Allow non-constant expression initializers of const-qualified vars. Required by ARB_shading_language_420pack. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-07-08 12:46:56 -07:00
Marek Olšák	1faa375573	r600g: improve the mechanism for recognizing an empty CS Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	287b2fa115	r600g: explicitly flush caches for streamout-based buffer copying & clearing It's done automatically for vertex buffers, but not for constant buffers, textures, and colorbuffers. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	7948ed1250	r600g: only flush the caches that need to be flushed during CP DMA operations This should increase performance if constant uploads are done with the CP DMA, because only the cache that needs to be flushed is flushed. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	1b40398d02	r600g: split INVAL_READ_CACHES into vertex, tex, and const cache flags also flushing any cache in evergreen_emit_cs_shader seems to be superfluous (we don't flush caches when changing the other shaders either) Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Alex Deucher	098316211c	r600g: adjust flush flags (v3) 1. flush SH with read caches 2. add flag for DB flushes 3. add flag for CB flushes v2: flush all CBs, remove redundant emit_state variable. v3: Marek: also set the new flags in r600_context_flush, the CP dma functions, and texture_barrier, and rename them Signed-off-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	862f69fbe1	r600g: don't call buffer_wait in buffer_mmap_sync_with_rings The winsys should do this, because it measures how much time we spend in buffer_map doing synchronization, which can be viewed with the gallium HUD. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	94d294137e	r600g: don't read back the MSAA depth buffer if the read flag is not set Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	141b892620	r600g: don't flush the context in texture_transfer_map the winsys does this automatically Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	ae87aae0c4	r600g: fix texture offset computation for mapped MSAA depth buffers It was wrong, because the offset shouldn't be applied to MSAA depth buffers. This small cleanup should prevent such issues in the future. This fixes a lockup in "piglit/fbo-depthstencil default_fb -samples=n". Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	a3263cca59	r600g: fix color resolve for RGBX8 and RGBX16 integer formats Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	b1a061b81e	r600g: enable fast MSAA color clear for array/3D/cube textures Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Marek Olšák	87669c3654	r600g: implement fast MSAA color clear for integer textures this also fixes the fast clear with multiple colorbuffers and each having a different format Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-07-08 20:25:18 +02:00
Christian König	085c695488	r600/uvd: fix check for UVD 2.x Signed-off-by: Christian König <christian.koenig@amd.com>	2013-07-08 19:51:20 +02:00
Chris Forbes	1415a1884c	i965: fix alpha test for MRT Include src0 alpha in the RT write message when using MRT, so it is used for the alpha test instead of the normal per-RT alpha value. Fixes broken rendering in Dota2 under Wine [FDO #62647]. No Piglit regressions on Ivybridge. V2: reuse (and simplify) existing sample_alpha_to_coverage flag in the FS key, rather than adding another redundant one. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62647 NOTE: This is a candidate for the stable branches.	2013-07-06 12:41:54 +12:00
Roland Scheidegger	9ef49cfd84	gallivm: (trivial) fix using one lod instead of per-quad lod for texel fetch The logic for choosing number of lods was bogus. (The code should ultimately handle the case of only one lod even with multiple quads but currently can't.)	2013-07-05 18:07:51 +02:00
José Fonseca	45f174ce40	gallivm: Remove bogus assert. It is perfectly valid for the swizzle to be bigger than 2. For example the texel offsets could be SAMPLE ..., IMM[0].zzz What is not correct is for chan_index to be bigger than 2. Trivial.	2013-07-05 14:35:54 +01:00
Ben Skeggs	c29c6b2b2e	nvc0: enable very initial support for nvf0 (GK110) Shaders need a lot of work still. Basic stuff generally works, so this is basically just fine for gnome-shell, OA etc at this point. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2013-07-05 14:15:04 +10:00
Roland Scheidegger	4dbca8672b	gallivm: (trivial) fix bogus assertion for per-element lod with 1d resources The assertion was always broken, but the code was unused until the per-element lod code was enabled. Fixes piglit texelFetch vs isampler1D and similar tests (only run with GL 3.0 version override).	2013-07-05 01:19:23 +02:00
Roland Scheidegger	f3bbf65929	gallivm: do per-pixel lod calculations for explicit lod d3d10 requires per-pixel lod calculations for explicit lod, lod bias and explicit derivatives, and we should probably do it for OpenGL too, at least if they are used from vertex or geometry shaders (so this doesn't apply to lod bias), where this doesn't just affect neighboring pixels. Some code was already there to handle this, so fix it up and enable it. There will no doubt be a performance hit, unfortunately; we could do better if we knew we had a real vector shift instruction (with variable shift count), but this requires AVX2 on x86 (or an AMD Bulldozer family CPU). Don't do anything for lod bias and explicit derivatives yet, though no special magic should be needed for them either. Likewise, the size query is still broken just the same. v2: Use information if lod is a (broadcast) scalar or not. The idea would be to base this on the actual value, for now just pretend it's a scalar in fs and not a scalar otherwise (so, per-pixel lod is only used in gs/vs but same code is generated for fs as before). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-07-04 19:42:04 +02:00
Zack Rusin	bbd1e60198	draw: fix overflows in the indexed rendering paths The semantics for overflow detection are a bit tricky with indexed rendering. If the base index in the elements array overflows, then the index of the first element should be used; if the index with bias overflows, then it should be treated like a normal overflow. Also, overflows need to be checked for in all paths that use either the bias or the starting index location. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-03 09:06:30 -04:00
Zack Rusin	09820902d7	draw/llvm: index overflows if it's greater than elt max The comparison, incorrectly, was greater-than-or-equal to elt max. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-07-03 09:06:24 -04:00
Kenneth Graunke	764afc48cf	i965: Move the rest of intel_tex_layout.c into brw_tex_layout.c. The texture alignment unit functions are called from brw_tex_layout.c, so it makes sense to put them there. Since the only caller of intel_get_texture_alignment_unit() is in brw_tex_layout.c, it could be made into a static function. However, this patch instead simply folds it into the caller, as it's only two lines anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:15 -07:00
Kenneth Graunke	466aa712b6	i965: Push intel_get_texture_alignment_unit call into brw_miptree_layout intel_miptree_create_layout() calls intel_get_texture_alignment_unit() and then immediately calls brw_miptree_layout(). There are no other callers. intel_get_texture_alignment_unit() populates the miptree's alignment unit fields, which are used by brw_miptree_layout() to determine where to place each miplevel. Since brw_miptree_layout() needs those to be present, it makes sense to have it initialize them as the first step. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:15 -07:00
Kenneth Graunke	c4c3c0dc94	i965: Declare for-loop counters in the loop in brw_tex_layout.c. The driver is compiled in C99 mode, so this is not a problem. It's slightly tidier. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-03 10:48:15 -07:00
Kenneth Graunke	ccf312fd12	i965: Remove use of GLuint/GLint in brw_tex_layout.c. Using GL types is silly; this isn't even remotely API-facing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-03 10:48:15 -07:00
Kenneth Graunke	ed95e396f3	i965: Tidy the brw_tex_layout.c copyright and file header comments. This uses Doxygen style for the file comments, and generally makes it more consistent with the rest of the driver. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-03 10:48:15 -07:00
Kenneth Graunke	2ea87fde31	i965: Move i945_texture_layout_2d to brw_tex_layout.c This consolidates the miptree layout logic in a single file. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-03 10:48:15 -07:00
Kenneth Graunke	1920209970	i965: Remove fallthrough for Gen4 cube map layout. Now that both 2DArray and Cube layouts are taken care of by helper functions, it's easy to just call the right function for each generation. This is a little cleaner than falling through. This also reworks the comments. Referencing "Volume 1" of the BSpec isn't very helpful, since that's only available inside Intel, and it doesn't even use volume numbers. Also, "Ironlake...finally" sounds a bit strange considering that almost all hardware uses the 2D array approach. At this point, Gen4 is the only special case. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-03 10:48:14 -07:00
Kenneth Graunke	7e4007a1b3	i965: Combine GL_TEXTURE_CUBE_MAP_ARRAY case with the other array cases. These do the exact same thing; combining them is tidier. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-03 10:48:14 -07:00
Kenneth Graunke	bc51f15b32	i965: Pull 3D texture layout code out into a helper function. A bit cleaner than having it in one giant function. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-03 10:48:14 -07:00
Kenneth Graunke	abc2bdffd6	i965: Replace maxBatchSize variable with BATCH_SZ define. maxBatchSize was only ever initialized to BATCH_SZ, and a few places used BATCH_SZ directly anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:14 -07:00
Kenneth Graunke	2c602d2adf	i965: Move annotate_aub out of the vtable. brw_annotate_aub() is the only implementation of this function, so it makes sense to just call it directly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:14 -07:00
Kenneth Graunke	f05f8793c8	i965: Move debug_batch hook out of the vtable. brw_debug_batch() is the only implementation of this function, so it makes sense to just call it directly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:14 -07:00
Kenneth Graunke	749160aab3	i965: Remove render_target_supported from the vtable. brw_render_target_supported() is the only implementation of this function, so it makes sense to just call it directly. Rather than adding an #include of brw_wm.h, this patch moves the prototype to brw_context.h. Prototypes seem to be in rather arbitrary places at the moment, and either place seems as good as the other. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:14 -07:00
Kenneth Graunke	7c5279e554	i965: Move is_hiz_depth_format out of the vtable. brw_is_hiz_depth_format() is the only implementation of this function, so it makes sense to just call it directly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:14 -07:00
Kenneth Graunke	607338f1cb	i965: Remove the invalidate_state() vtable hook. The hook was a noop. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:14 -07:00
Kenneth Graunke	251cdcf059	i965: Replace fprintfs with assertions in GLenum comparison translators. These functions translate GLenum comparison operations into the hardware enumerations. They should never be passed something other than a GL comparison operator, or something is very broken. Assertions seem more appropriate than fprintf. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:14 -07:00
Kenneth Graunke	7ee616f1bf	i965: Replace intel_state.c enums with those from brw_defines.h. Both intel_context.h and brw_defines.h have #defines for comparison functions, stencil ops, blending logic ops, and blending factors. They're exactly the same values, so it makes sense to pick one. brw_defines.h is the logical place for this kind of stuff, so this patch converts intel_state.c to use the set defined there. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:13 -07:00
Kenneth Graunke	c9db037dc9	i965: Delete pre-DRI2.3 viewport hacks. The __DRI_USE_INVALIDATE extension was added on May 11th, 2010 by commit `4258e3a2e1`. At this point, it's unlikely that anyone's using the right mix of new and old components to hit this path. Deleting it removes an untested code path and cleans up the driver a bit. Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: Keith Packard <keithp@keithp.com>	2013-07-03 10:48:13 -07:00
Kenneth Graunke	cbb37b7586	i965: Remove "There are probably better ways" comment. There are always better ways to do things. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:13 -07:00
Kenneth Graunke	7115bee993	i965: Delete brw_print_reg() function. This wasn't called from anywhere; presumably it was used to examine brw_regs when debugging shader assembly. However, it prints registers in a different notation than brw_disasm.c which everyone is used to...which means I doubt anyone will want to use it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:13 -07:00
Kenneth Graunke	bc8b62e3a0	i965: Move contents of intel_clear.h to intel_context.h. Having a header file for a single prototype seems rather excessive. Plus, the actual function is in brw_clear.c, not intel_clear.c, so there isn't even the .c/.h filename symmetry one might expect. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:13 -07:00
Kenneth Graunke	7d8e70f301	i965: Move contents of intel_extensions.h to intel_context.h. Having an entire header file for a single prototype seems a bit excessive. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:13 -07:00
Kenneth Graunke	7d119880e8	i965: Remove some dead code. A random smattering of things that just aren't used anymore. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:13 -07:00
Kenneth Graunke	d245e795cf	i965: Delete dead intel_buffer_object::range_map_size field. Nothing uses this, apparently. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:13 -07:00
Kenneth Graunke	1f6ebdd43f	i965: Remove intel_buffer_object::source. This was only used for BOs backed by system memory on i915. With that gone, there's nothing that even sets source to non-zero, so this is purely dead code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:13 -07:00
Kenneth Graunke	6e5b80ee5a	i965: Fix buffer object segfault since removal of system memory BOs. Commit `cf31a19300` removed support for BOs backed by system memory, as it was only useful for i915. However, it removed a little too much code: intel_bufferobj_buffer() used to call intel_bufferobj_alloc_buffer(), and after that commit, it didn't. This led to NULL pointer dereferences in several test cases, such as es3conform's transform_feedback_state_variables test. This commit restores the allocation, preserving the original behavior. It may not be the cleanest approach, but tidying should come later. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66432 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-03 10:48:12 -07:00
Matthew McClure	012ba47076	postprocess: move second temporary assertion into isolated configuration With this patch we will only assert that the second temporary is allocated, when there are more than two active filters. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66423 Signed-off-by: Brian Paul <brianp@vmware.com>	2013-07-03 09:19:04 -06:00
José Fonseca	9b6788eb15	glsl: Ensure snprintf is defined on MSVC builds. Should fix: src\glsl\opt_dead_builtin_varyings.cpp(244) : error C3861: 'snprintf': identifier not found ...	2013-07-03 08:26:08 +01:00
Ilia Mirkin	4bc8e3c3e4	targets/xvmc-nouveau: add in missing nv30 lib Currently libXvMCnouveau.so is missing nv30_screen_create. Add it in so that it may be dlopen'd. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-07-03 09:02:40 +02:00
Marek Olšák	30c3e8718d	mesa,glsl,gallium: remove GLSLSkipStrictMaxVaryingLimitCheck and dependencies Not needed with do_dead_builtin_varyings. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-02 17:02:14 +02:00
Marek Olšák	74edd56927	st/mesa: disable EXT_separate_shader_objects The extension disallows elimination of set-but-unused varyings. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-02 17:02:14 +02:00
Marek Olšák	b3d8b4c0b4	glsl/linker: eliminate unused and set-but-unused built-in varyings This eliminates built-in varyings such as gl_Color, gl_SecondaryColor, gl_TexCoord, and gl_FogFragCoord if they are unused by the next stage or not written at all (e.g. gl_TexCoord elements). The gl_TexCoord array is broken down into separate vec4s if needed. v2: - use a switch statement in varying_info_visitor::visit(ir_variable*) - use snprintf - disable the optimization for GLES2 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-02 17:02:14 +02:00
Marek Olšák	3c555827c3	glsl/linker: check against varying limit after unused varyings are eliminated We counted even the varyings which were later eliminated, which was suboptimal. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-02 17:02:14 +02:00
Marek Olšák	284d954912	glsl/linker: link shaders in the opposite order (from fragment to vertex) This ensures that inter-shader outputs and inputs are properly eliminated across 3 or more shader stages. The behavior is unchanged with 2 or fewer shader stages. For example, elimination of unused FS inputs causes elimination of matching GS outputs, which causes elimination of the GS inputs that were needed for evaluation of the eliminated GS outputs, which causes elimination of matching VS outputs. An unused FS input is all that's needed to trigger this chain reaction. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-02 17:02:14 +02:00
Marek Olšák	030ca230e2	mesa: renumber shader indices according to their placement in pipeline See my explanation in mtypes.h. v2: don't do this in gallium v3: also updated the comment at the gl_shader_type definition Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-02 17:02:14 +02:00
José Fonseca	84f367e69a	gallivm: Simplify intrinsic name construction. Just noticed this could be slightly shortened when fixing MSVC build. Trivial.	2013-07-02 13:12:31 +01:00
Kenneth Graunke	15ca0ca1b6	glsl/builtins: Fix ARB_texture_cube_map_array built-in availability. This patch adds texture() for isamplerCubeArray and usamplerCubeArray, which were entirely missing. It also makes texture() with a LOD bias fragment shader specific. The main GLSL specification explicitly says that texturing with LOD bias should not be allowed for vertex shaders. Affects Piglit's ARB_texture_cube_map_array/compiler/tex_bias-01.vert, which tries to use bias in a vertex shader. Currently, it expects this to pass (so this patch regresses the test), but I've sent a patch to reverse the expected behavior (so this patch would fix the updated test): http://lists.freedesktop.org/archives/piglit/2013-June/006123.html NOTE: This is a candidate for stable branches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2013-07-02 01:01:30 -07:00
José Fonseca	4c859901ce	gallivm: Fix MSVC build.	2013-07-02 06:41:32 +01:00
José Fonseca	e621ec816d	gallivm: Fix indirect immediate registers. If reg->Register.Indirect is true then the immediate is not truly a constant LLVM expression. There is no performance regression in using LLVMBuildBitCast, as it will fallback to LLVMConstBitCast internally when the argument is a constant. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-07-02 06:30:06 +01:00
Zack Rusin	70bc43acdb	gallium/tests: fix the translate test	2013-06-28 09:43:17 -04:00
Anuj Phogat	722721d718	i965: Enable ext_framebuffer_multisample_blit_scaled on intel h/w This patch enables ext_framebuffer_multisample_blit_scaled extension on intel h/w >= gen6. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-07-01 15:21:25 -07:00
Anuj Phogat	6fc3da2da0	i965/blorp: Add bilinear filtering of samples for multisample scaled blits The current implementation of ext_framebuffer_multisample_blit_scaled in i965/blorp uses nearest filtering for multisample scaled blits. Using nearest filtering produces blocky artifacts and negates the benefits of MSAA. That is the reason why the extension was not enabled on i965. This patch implements bilinear filtering of samples in the blorp engine. Images generated with this patch are free from blocky artifacts and show a big improvement in visual quality. Observed no piglit and gles3 regressions. V3: - Algorithm used for filtering assumes a rectangular grid of samples roughly corresponding to sample locations. - Test the boundary conditions on the edges of texture. V4: - Clip texcoords and use conditional MOVs. - Send texture dimensions as push constants. - Remove the optimization in case of scaled multisample blits. V5: - Move mcs_fetch() inside the 'for' loop after computing pixel coordinates. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-07-01 15:21:25 -07:00
Ian Romanick	27f2df2507	docs: Import 9.1.4 release notes, add news item. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-07-01 14:48:58 -07:00
Zack Rusin	1c2e5c223d	draw/translate: fix instancing We were incorrectly computing the buffer offset when using the instances. The buffer offset is always equal to: start_instance * stride + (instance_num / instance_divisor) * stride We were completely ignoring the start instance, quite often producing instances that were completely wrong, e.g. if start instance = 5, instance divisor = 2, then on the first iteration it should be: 5 * stride, not (5/2) * stride as we'd have currently, and if start instance = 1, instance divisor = 3, then on the first iteration it should be: 1 * stride, not 0 as we'd have. This fixes it and adjusts all the code to the changes. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-06-28 05:21:20 -04:00
Zack Rusin	df4ab7974a	draw: fix incorrect clipper invocation statistics clipper invocations are computed earlier (of course before the emission) so this code was adding bogus numbers to already computed clipper invocations. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-06-28 04:24:29 -04:00
Zack Rusin	34546d61c1	draw/gallivm: export overflow arithmetic to its own file We'll be reusing this code so let's put it in a common file and use it in the draw module. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-06-28 04:24:24 -04:00
Zack Rusin	88de009cc1	draw: check for integer overflows in instance computation Integers could easily overflow if the starting instance was large enough. Instead of letting bogus counts through, set the instance to max if it overflowed and let our regular buffer overflow computation handle it. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-06-28 04:24:20 -04:00
Zack Rusin	2f13f28120	draw: check for an integer overflow when computing stride Our buffer overflow arithmetic was susceptible to integer overflows, which caused the buffer overflow logic to break. Let's use the llvm overflow intrinsics to check for integer overflows while computing the stride/needed buffer size. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-06-28 04:24:16 -04:00
Zack Rusin	e742f7788e	draw: account for elem size when computing overflow We weren't taking into account the size of element that is to be fetched, which meant that it was possible to overflow the buffer reads if the stride was very close to the end of the buffer, e.g. stride = 3, buffer size = 4, and the element to be read = 4. This should be properly detected as an overflow. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-06-28 04:24:12 -04:00
Vinson Lee	7214fe3cc4	i965: Initialize brw_blorp_const_color_program member variables. Fixes "Uninitialized scalar field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-07-01 10:16:16 -07:00
Ross Burton	2c6186390c	eglplatform: use unsigned long instead of 32-bit ints in generic platform In the generic Unix case use the "unsigned long" type instead of 32-bit integers so that the type sizes are consistent on 64-bit machines between X11 and not-X11. Signed-off-by: Ross Burton <ross.burton@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-07-01 10:06:24 -07:00
Ross Burton	1a7275de9a	build: fix EGL build when no X11 headers are present eglplatform.h defaults to X11 on Unix unless told otherwise, so if we're doing a build without any X11 support tell it so that we don't try including headers that don't exist. Also set GL_PC_FLAGS so that the definition is in egl.pc, so that applications using EGL don't try to pull in X11 headers on systems where EGL was configured without X11 support. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64959 Signed-off-by: Ross Burton <ross.burton@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-07-01 10:06:11 -07:00
José Fonseca	acc6a141b8	tools/trace: Return dummy fence object to silence warnings.	2013-07-01 12:06:58 +01:00
José Fonseca	0fd71ac9eb	tools/trace: Don't crash if a trace has no timing information.	2013-07-01 12:05:57 +01:00
José Fonseca	fa3040c117	scons: Fix dependencies of enums.c and api_exec.c.	2013-07-01 12:04:59 +01:00
Maarten Lankhorst	bf95ca7de0	nvc0: allow frame dropping in h264 The only reason the checks existed was paranoia: when I first wrote the code I wasn't sure it was correct. Now that I am, and since the asserts triggered when XBMC was dropping frames, remove them. NOTE: This is a candidate for the 9.1 branch.	2013-07-01 08:47:49 +02:00
Tom Stellard	24fa43675f	r300g/compiler: Prevent regalloc from swizzling texture operands v2 https://bugs.freedesktop.org/show_bug.cgi?id=63520 NOTE: This is a candidate for the stable branches. Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-30 21:38:57 -07:00
Tom Stellard	e2c3640540	r300g/compiler/tests: Add an assembly parser The assembly parser can be used to load r300 assembly dumps and run them through any of the r300 compiler passes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-30 21:38:57 -07:00
Tom Stellard	ab40d8d56f	r300g: Fix make check Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-30 21:24:55 -07:00
Grigori Goronzy	30004b20c2	r600g: implement fast color clears for MSAA on evergreen+ Allows MSAA colorbuffers, which have a CMASK automatically and don't need any further special handling, to be fast cleared. Instead of clearing the buffer, set the clear color and the CMASK to the cleared state. Fast clear is used only when all bound colorbuffers fulfill certain conditions: a CMASK is required, we have to be able to create a clear color value for the format and the texture mustn't contain multiple images. Technically, it should be possible to support array textures and cubemaps if all images are attached to the framebuffer, but this does not appear to be common. v2: fix fast clear check v3: Marek: - disable fast clear with 128-bit formats, which are unsupported - set tex->dirty_level_mask in r600_clear, so that the driver knows the resource must be decompressed/expanded - return early from r600_clear if there's nothing else to do Signed-off-by: Marek Olšák <maraeo@gmail.com>	2013-07-01 03:02:43 +02:00
Marek Olšák	b1693194ee	r600g/compute: disable unused colorbuffer slots Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-07-01 03:02:43 +02:00
Marek Olšák	f83e220d36	st/mesa: handle SNORM formats in generic CopyPixels path v2: check desc->is_mixed in util_format_is_snorm	2013-06-30 22:14:37 +02:00
Matt Turner	adf8afa168	i965: NULL check depth_mt to quiet static analysis. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-06-29 15:19:08 -07:00
Roland Scheidegger	7d430bfab9	llvmpipe: fix timer query if there's no bins `b04a295a4a` removed seemingly unnecessary code in get_query. Turns out this code could in fact be reached - while timestamps are always binned, if there are no bins (which happens if fb size is 0) then the rasterization query code filling this in is still never executed. So fix this up by filling in some timestamp, but do it at EndQuery time not GetQuery time which should be more appropriate. Makes piglit arb_timer_query-timestamp-get happy again. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-29 16:58:02 +02:00
Tom Stellard	5a925cc550	clover: Don't segfault when compiling a program with no kernel	2013-06-28 15:19:06 -07:00
Eric Anholt	d7361f2943	mesa: Remove unused allow_large_textures driconf from classic drivers. This option hasn't been used since the introduction of DRI2. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:27 -07:00
Kenneth Graunke	03600660a1	i915: Remove GLES 3.0 sRGB workaround. Gen3 doesn't support GLES 3.0, so there's no need for it. Acked-by: Eric Anholt <eric@anholt.net>	2013-06-28 13:35:26 -07:00
Kenneth Graunke	dc8796506e	i965: Remove is_945. Only relevant on Gen3. Acked-by: Eric Anholt <eric@anholt.net>	2013-06-28 13:35:26 -07:00
Kenneth Graunke	a4e31956ac	i965: Delete hw_stencil flag. This was only used by i915. Acked-by: Eric Anholt <eric@anholt.net>	2013-06-28 13:35:26 -07:00
Kenneth Graunke	4299e35888	i965: Remove hw_stipple flag. This was only used by i915. Acked-by: Eric Anholt <eric@anholt.net>	2013-06-28 13:35:26 -07:00
Kenneth Graunke	1a5dca38e9	i965: Remove use_early_z option. This was only used by i915. v2: Also remove the option from the driconf list. (change by anholt) Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-28 13:35:26 -07:00
Kenneth Graunke	2cc5724db2	i965: Remove unused SUBPIXEL_* macros. Acked-by: Eric Anholt <eric@anholt.net>	2013-06-28 13:35:26 -07:00
Kenneth Graunke	2e9fe0ca12	i965: Remove redundant Gen3 PCI IDs. Acked-by: Eric Anholt <eric@anholt.net>	2013-06-28 13:35:26 -07:00
Kenneth Graunke	1811f5c43d	intel: Remove unused INTEL_MAX_FIXUP macro. v2: Remove it from i915, too (change by anholt) Acked-by: Eric Anholt <eric@anholt.net>	2013-06-28 13:35:26 -07:00
Eric Anholt	0ac0a1b02e	i965: Drop i915 register/instruction definitions. v2: Remove unused DV_PF_* macros, too. (change by Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:26 -07:00
Eric Anholt	1b67cd29a1	i965: Drop code for calling the empty brw_update_draw_buffers() hook. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:25 -07:00
Eric Anholt	7c232189c5	i965: Drop dead i915 blend state code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:25 -07:00
Eric Anholt	d58d0a3754	i965: Drop i915-specific blit clear code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:25 -07:00
Eric Anholt	cf31a19300	i965: Drop the system-memory VBO support for i915. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:25 -07:00
Eric Anholt	814440aadd	i965: Drop i915 swtnl code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:25 -07:00
Eric Anholt	bb2e312d4d	i965: Drop i915-specific vtbl entries. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:25 -07:00
Eric Anholt	a61d8f6110	i965: Drop swtnl fallback code for i915. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:25 -07:00
Eric Anholt	28e80d7136	i965: Drop i915 code from intel_screen. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:25 -07:00
Eric Anholt	4a08a86f22	i965: Drop #ifdef I915 code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:25 -07:00
Eric Anholt	6fddd375d7	i965: Drop code checking for gen <= 3. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:25 -07:00
Eric Anholt	3c231b8631	i915: Remove a duplicated set of PCI IDs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:24 -07:00
Eric Anholt	8ac1ed92aa	i915: Remove various remaining dead code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:24 -07:00
Eric Anholt	934974fba6	i915: Remove dead debug flags. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:24 -07:00
Eric Anholt	39c5fd7f13	i915: Remove state batch emit support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:24 -07:00
Eric Anholt	a40f9871a0	i915: Drop unused register #defines from the shared reg file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:24 -07:00
Eric Anholt	173666e2ed	i915: Drop 965+ GL version setup. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:24 -07:00
Eric Anholt	f6426509dc	i915: Remove gen6+ batchbuffer support. While i915 does have hardware contexts in hardware, we don't expect there to ever be SW support for it (given that support hasn't even made it back to gen5 or gen4). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:24 -07:00
Eric Anholt	c25e3c34d6	i915: Drop chipset detection code for 965+ chipsets. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:24 -07:00
Eric Anholt	014251ef42	i915: Drop context fields specific to 965+ chipsets. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:24 -07:00
Eric Anholt	d71b7301ec	i915: Drop all has_llc code. i915 never has llc. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:24 -07:00
Eric Anholt	be63c1c993	i915: Remove the remainder of the batchbuffer caching. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:23 -07:00
Eric Anholt	7f210bf535	i915: Remove miscellaneous uncalled gen4 code from formerly shared files. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:23 -07:00
Eric Anholt	6bdc5ecbba	i915: Remove most of the code under gen >= 4 checks. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:23 -07:00
Eric Anholt	18100d415e	i915: Remove fake ETC support that only existed on gen4+ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:23 -07:00
Eric Anholt	27eedca3e0	i915: Remove separate stencil code. This was formerly-shared code for supporting gen5+. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:23 -07:00
Eric Anholt	279f0bce47	i915: Remove the I915 macro from the formerly shared code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:23 -07:00
Eric Anholt	f26104eb5b	i915: Remove all the MSAA support code. This hardware doesn't have MSAA support, so this code is all a waste for it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:23 -07:00
Eric Anholt	0f31e06a2e	i915: Remove all the HiZ code from i915. v2: Remove extra struct forward declaration (change by Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:23 -07:00
Ian Romanick	927f572c27	mesa: GL_EXT_shadow_funcs is not optional with GL_ARB_shadow Every driver left in Mesa that enables one also enables the other. There's no reason to let it be optional. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:22 -07:00
Ian Romanick	41853b598c	mesa: GL_ARB_texture_storage_multisample is not optional with GL_ARB_texture_multisample In Mesa, this extension is implemented purely in software. Drivers may optionally provide optimized paths. If a driver enables GL_ARB_texture_multisample, it gets GL_ARB_texture_storage_multisample for free. NOTE: This has the side effect of enabling the extension in Gallium drivers that enable GL_ARB_texture_multisample. v2 (Ken): Still prevent multisample texture targets in TexParameter for implementations that don't support multisampling. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:22 -07:00
Ian Romanick	d5b6b7a39b	mesa: GL_ARB_texture_storage is not optional In Mesa, this extension is implemented purely in software. Drivers may optionally provide optimized paths. NOTE: This has the side effect of enabling the extension in the radeon, r200, and nouveau drivers. v2: Minor whitespace tidying (suggested by Brian). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:22 -07:00
Ian Romanick	70966570f3	mesa: GL_ARB_shading_language_100 is not optional This extension just provides some of the most basic software framework for GLSL. Without GL_ARB_vertex_shader or GL_ARB_fragment_shader, applications still cannot use GLSL. There's no value in conditionalizing support for this extension. NOTE: This has the side effect of enabling the extension in the radeon, r200, and nouveau drivers. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:22 -07:00
Ian Romanick	e6ec425d6e	mesa: GL_ARB_shader_objects is not optional This extension just provides some of the most basic software framework for GLSL. Without GL_ARB_vertex_shader or GL_ARB_fragment_shader, applications still cannot use GLSL. There's no value in conditionalizing support for this extension. NOTE: This has the side effect of enabling the extension in the radeon, r200, and nouveau drivers. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:22 -07:00
Ian Romanick	9bc24b4fc4	mesa: GL_NV_blend_square is not optional Every driver left in Mesa enables this extension all the time. There's no reason to let it be optional. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:22 -07:00
Ian Romanick	338ea2e4d1	mesa: GL_EXT_fog_coord is not optional Every driver left in Mesa enables this extension all the time. There's no reason to let it be optional. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:22 -07:00
Ian Romanick	c139708087	mesa: GL_EXT_secondary_color is not optional Every driver left in Mesa enables this extension all the time. There's no reason to let it be optional. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:22 -07:00
Ian Romanick	b5305a303b	mesa: GL_EXT_framebuffer_object is not optional Every driver left in Mesa enables this extension all the time. There's no reason to let it be optional. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:22 -07:00
Ian Romanick	f4571640b8	mesa: Remove GL_MESA_resize_buffers Commit `bab755a` made the implementation a no-op, and it was only ever enabled by software rasterizers. v2: Move the spec into docs/specs/OLD since it's now obsolete (squashed patch from Andreas Boll) Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:21 -07:00
Ian Romanick	34e8905077	mesa: Remove _mesa_{enable, disable}_extension and _mesa_extension_is_enabled They're not used anywhere. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:21 -07:00
Ian Romanick	e14b486113	mesa: Just set extension flags instead of calling _mesa_enable_extension Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:21 -07:00
Ian Romanick	b0d755f00b	mesa: Remove _mesa_enable_._._extensions functions After the preceding commits, they are not used. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:21 -07:00
Ian Romanick	45099ec175	swrast: Don't call _mesa_enable_._._extensions and _mesa_enable_sw_extensions _mesa_enable_sw_extensions enables all the extensions (and more) that the others enable. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:21 -07:00
Ian Romanick	a964397fd9	osmesa: Don't call _mesa_enable_._._extensions and _mesa_enable_sw_extensions _mesa_enable_sw_extensions enables all the extensions (and more) that the others enable. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:21 -07:00
Ian Romanick	c9edd661c4	wmesa: Don't call _mesa_enable_._._extensions and _mesa_enable_sw_extensions _mesa_enable_sw_extensions enables all the extensions (and more) that the others enable. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:21 -07:00
Ian Romanick	89cf6e6273	x11: Don't call _mesa_enable_._._extensions and _mesa_enable_sw_extensions _mesa_enable_sw_extensions enables all the extensions (and more) that the others enable. Also, don't duplicate the DXTn checks. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 13:35:21 -07:00
Ian Romanick	0b9398c74f	i965: Merge the two GEN >= 6 extension enable blocks There's no reason for these blocks to be separate. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:21 -07:00
Ian Romanick	ae66a656fd	i965: Move GEN >= 4 extensions into the "always on" list This copy of the source file is only used for GEN >= 4, so extensions that are enabled for GEN >= 4 are always enabled. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:21 -07:00
Ian Romanick	4ed976f6b5	i965: Move GEN >= 3 extensions into the "always on" list This copy of the source file is only used for GEN >= 4, so extensions that are enabled for GEN >= 3 are always enabled. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:20 -07:00
Ian Romanick	e621208e29	i915: Remove GEN >= 4 extension support This copy of the source file is only used for GEN <= 3, so remove the dead code. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-28 13:35:20 -07:00
Kenneth Graunke	745f6c692c	i965: Split surface format code into a new file (brw_surface_formats.c). brw_wm_surface_state.c has gotten rather large and unwieldy. At this point, it consists of two separate portions: 1. Surface format code This includes the giant table of surface formats and what features they support on each generation, as well as the code to translate between Mesa formats and hardware formats. This is used across all generations. 2. Binding table (SURFACE_STATE) related code. This is the code to generate SURFACE_STATE entries for renderbuffers, textures, transform feedback buffers, constant buffers, and so on, as well as the code to assemble them into binding tables. This is only used on Gen4-6; gen7_surface_state.c has Gen7+ code. Since the two are logically separate, and one is reused on every generation while the other is not, it makes a lot of sense to split them out. It should also make finding code easier. No code is changed by this patch. I simply copied the file then deleted portions of both. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-28 13:35:11 -07:00
Alex Deucher	c309e64db8	radeonsi: add kabini pci ids Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:17:27 -04:00
Alex Deucher	b6b1346691	radeonsi: add bonaire pci ids Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:17:18 -04:00
Alex Deucher	d669992e35	radeonsi: disable 2D tiling on CIK for now Causes GPU hangs. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:17:10 -04:00
Alex Deucher	1357624abc	radeonsi: add llvm processor names for CIK Requires updated llvm. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:17:00 -04:00
Alex Deucher	234d81e6b2	radeonsi: emit PA_SC_RASTER_CONFIG[_1] on cik Use the golden values for each asic. Todo: update Kabini and Kaveri. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:16:53 -04:00
Alex Deucher	9d8ad222c6	radeonsi: PA_CL_ENHANCE is privileged on CIK Needs to be and is set by the kernel. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:16:46 -04:00
Alex Deucher	72c10be3a7	radeonsi: update surface sync packet emit for CIK Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:16:35 -04:00
Alex Deucher	f2a9bd8084	radeonsi: store chip class in the pm4 struct Will be used for asic specific pm4 behavior. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:16:27 -04:00
Alex Deucher	3a47f1945f	radeonsi: properly handle DB tiling setup on CIK On CIK, DB switches back to using per-surface tiling parameters rather than the tile index used on SI. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:16:17 -04:00
Alex Deucher	8c903f5df9	radeonsi: emit additional shader pgm rsrc registers for CIK Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:16:10 -04:00
Alex Deucher	59e4fe0b75	radeonsi: emit TA_BC_BASE_ADDR_HI for border color on CIK Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:16:03 -04:00
Alex Deucher	b363a45c54	radeonsi: fix VGT_PRIMITIVE_TYPE emit for CIK Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:15:54 -04:00
Alex Deucher	ecb679a8d3	radeonsi: register updates for CIK Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:15:46 -04:00
Alex Deucher	deb2358243	radeonsi: initial PM4 changes for CIK note which packets are removed and add new ones. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:15:36 -04:00
Alex Deucher	f29f206c93	radeonsi: initial support for CIK chips Add the infrastructure to differentiate them. Just treat them like SI for now. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:15:28 -04:00
Alex Deucher	5b3f1ea933	radeonsi: rename SI chip class from TAHITI to SI Covers the entire family. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-06-28 15:15:20 -04:00
Tom Stellard	47e35eff9d	r600g: Fix build Broken since `2840bec56f` when opencl is disabled.	2013-06-28 11:11:43 -07:00
Anuj Phogat	ee723ffabb	mesa: Return ZeroVec/dummyReg instead of NULL pointer Assertions are not sufficient to check for null pointers as they don't show up in release builds. So, return ZeroVec/dummyReg instead of NULL pointer in get_{src,dst}_register_pointer(). This should calm down the warnings from static analysis tool. Note: This is a candidate for the 9.1 branch. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 10:53:43 -07:00
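The idea in the commit above (hand back a harmless placeholder instead of NULL, because assertions vanish in release builds) can be sketched roughly as follows. This is a minimal illustration, not the Mesa code: `src_register_or_zero`, `NUM_TEMPS`, `temps`, and `zero_vec` are hypothetical names, and the bounds check is an assumed stand-in for whatever conditions the real lookup tests.

```c
#include <stddef.h>

#define NUM_TEMPS 4 /* hypothetical register-file size */

static float temps[NUM_TEMPS][4];
static const float zero_vec[4] = { 0.0f, 0.0f, 0.0f, 0.0f };

/* Return a pointer to the requested temporary register, or to a shared
 * zero vector when the index is out of range. Callers never receive
 * NULL, so a missing assert() in a release build cannot turn into a
 * NULL dereference. */
static const float *
src_register_or_zero(int index)
{
   if (index < 0 || index >= NUM_TEMPS)
      return zero_vec;
   return temps[index];
}
```

The trade-off is that an out-of-range access now silently reads zeros, so a debug-build assertion is still useful for catching the underlying bug.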
Tom Stellard	bee49cb0ec	mesa: Fix build with older gcc since update of glext.h Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-28 08:49:06 -07:00
Tom Stellard	2840bec56f	r600g/compute: Accept LDS size from the LLVM backend And allocate the correct amount before dispatching the kernel. Tested-by: Aaron Watry <awatry@gmail.com>	2013-06-28 08:33:11 -07:00
Tom Stellard	2639fca1f0	r600g/compute: Move compute_shader_create() function into evergreen_compute.c Tested-by: Aaron Watry <awatry@gmail.com>	2013-06-28 08:33:11 -07:00
Brian Paul	ba4979810f	svga: pass svga_compile_key by reference instead of value Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-28 08:38:00 -06:00
Brian Paul	74e8a7d1dd	svga: use switch statement in svga_shader_type() Safer in case the PIPE_SHADER_x tokens get renumbered (as Marek wanted to do). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-28 08:37:59 -06:00
Chia-I Wu	24b05ff158	ilo: clean up states that use ilo_view_surface Use variables that are easier to remember what they are.	2013-06-28 15:01:00 +08:00
Chia-I Wu	2c9b6a2164	ilo: remove ilo_cbuf_state::count We can derive it from enabled_mask.	2013-06-28 15:01:00 +08:00
Chia-I Wu	7ea3ed81c8	ilo: clean up ilo_set_constant_buffer() Add loops that will be optimized away.	2013-06-28 15:01:00 +08:00
Chia-I Wu	11d283cde9	ilo: clean up states that take a start_slot They are similar, so clean them up to make them look similar.	2013-06-28 15:00:42 +08:00
Vinson Lee	def634979d	glsl: Initialize member variable is_ubo_var in constructor. Fixes "Uninitialized scalar field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-06-27 21:51:32 -07:00
Chia-I Wu	20c691b936	ilo: use shorter names for dirty flags The new names match those of ilo_context's members respectively, and are shorter.	2013-06-28 10:44:51 +08:00
Chia-I Wu	cabc7b44c0	ilo: track if primitive restart has changed Re-emit 3DSTATE_INDEX_BUFFER to enable/disable primitive restart.	2013-06-28 10:44:38 +08:00
Chia-I Wu	e071812e46	ilo: avoid potential dangling pointer dereference Set pipe_draw_info to NULL after draw_vbo().	2013-06-28 10:11:49 +08:00
Ian Romanick	c74a7eb9c5	mesa: Remove GL_EXT_clip_volume_hint As far as I can tell, no driver has enabled this extension since `c6499a7` back in 2007. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-27 18:14:33 -07:00
Chad Versace	6b676e6634	i965,i915: Return early if miptree allocation fails If allocation fails in intel_miptree_create_layout(), don't proceed to dereference the miptree. Return an early NULL. Fixes static analysis error reported by Klocwork. Note: This is a candidate for the 9.1 branch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-06-27 13:16:47 -07:00
Roland Scheidegger	670f829102	llvmpipe: handle offset_clamp This was just ignored (unless for some reason like unfilled polys draw was handling this). I'm not convinced of that code, putting the float for the clamp in the key isn't really a good idea. Then again the other floats for depth bias are already in there too anyway (should probably have a jit_context for the setup function), so this is just a quick fix. Also, the "minimum resolvable depth difference" used isn't really right as it should be calculated according to the z values of the current primitive and not be a constant (of course, this only makes a difference for float depth buffers), at least for d3d10, so depth biasing is still not quite right. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-27 19:06:40 +02:00
Roland Scheidegger	b04a295a4a	llvmpipe: remove never reached code for timestamp queries. timestamp queries are always binned in an active scene, therefore always have a result.	2013-06-27 19:06:40 +02:00
Roland Scheidegger	59b8689d37	llvmpipe: fix a bug in opaque optimization If there are queries active the opaque optimization resetting the bin needs to be disabled. (Not really tested since the bug was discovered by code inspection not an actual test failure.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-27 19:06:40 +02:00
Vinson Lee	f12e551810	radeonsi/compute: Fix memory leak in radeonsi_launch_grid. Fixes "Resource leak" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-06-27 10:03:33 -07:00
Tom Stellard	0e990736f3	clover: Fix build with LLVM 3.4 Reported on IRC by lordheavy	2013-06-27 10:03:33 -07:00
Bill York	191795eaf1	docs: updated instructions for Mesa on Windows Signed-off-by: Brian Paul <brianp@vmware.com>	2013-06-27 09:49:41 -06:00
Matthew McClure	e87fc11cac	postprocess: handle partial initialization failures. This patch fixes segfaults observed when enabling the post processing features. When the format is not supported, or a texture cannot be created, the code must gracefully handle failure and report the error to the calling code for proper failure handling. To accomplish this the following changes were made to the filters.h prototypes: - bool return for pp_init_func - Added pp_free_func for filter specific resource destruction Fixes segfaults from backtraces: * util_destroy_blit pp_free * u_transfer_inline_write_vtbl pp_jimenezmlaa_init_run pp_init This patch also uses tgsi_alloc_tokens to allocate temporary tokens in pp_tgsi_to_state, instead of allocating the array on the stack. This fixes the following stack corruption segfault in pp_run.c: * _int_free aaline_delete_fs_state pp_free Bug Number: 1021843 Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-27 09:44:29 -06:00
Brian Paul	482c43a946	glx: return True/False instead of GL_TRUE/GL_FALSE Just to be consistent with the functions' Bool return type. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-27 07:48:19 -06:00
Brian Paul	d171bc9d19	glx: move declarations before code Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-27 07:48:18 -06:00
Brian Paul	d43548ca37	mesa: move declarations before code Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-27 07:48:18 -06:00
José Fonseca	15085b477b	glsl: Use the C99 variadic macro syntax. MSVC does not support the old GCC syntax. See also http://gcc.gnu.org/onlinedocs/gcc/Variadic-Macros.html	2013-06-27 07:44:11 +01:00
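For readers unfamiliar with the two spellings mentioned in the commit above, here is a small sketch of the difference. The `FORMAT` macro is a hypothetical example, not taken from the glsl sources.

```c
#include <stdio.h>

/* Old GCC-only syntax with a named variadic parameter (the commit
 * message says MSVC does not support this form):
 *
 *    #define FORMAT(buf, size, args...) snprintf(buf, size, args)
 */

/* C99 syntax: the parameter list ends in "..." and the arguments are
 * referred to as __VA_ARGS__. The format string is deliberately part
 * of the variadic pack so the pack is never empty. */
#define FORMAT(buf, size, ...) snprintf(buf, size, __VA_ARGS__)
```

A call such as `FORMAT(buf, sizeof buf, "%d-%s", 7, "x")` expands to the corresponding `snprintf` call.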
José Fonseca	bcd6f3b23c	scons: Add dependencies to all .xml files. Should prevent stuck builds when only some of the included .xml files change.	2013-06-27 07:25:10 +01:00
Chia-I Wu	9f3cfe6aaf	ilo: plug a potential index buffer leak This is harmless since st_context and u_vbuf both set index buffer to NULL before destroying themselves. But we do not want to rely on that behavior.	2013-06-27 11:46:58 +08:00
Roland Scheidegger	eabe068747	softpipe: honor predication for clear_render_target and clear_depth_stencil trivial, copied from llvmpipe Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-26 23:17:53 +02:00
Roland Scheidegger	2e4da1f594	llvmpipe: add support for nested / overlapping queries OpenGL doesn't support this but d3d10 does. It is a bit of a pain as it is necessary to keep track of queries still active at the end of a scene, which is also why I cheat a bit and limit the amount of simultaneously active queries to (arbitrary) 16 (simplifies things because don't have to deal with a real list that way). I can't think of a reason why you'd really want large numbers of overlapping/nested queries so it is hopefully fine. (This only affects queries which need to be binned.) v2: don't copy remainder of array when deleting an entry simply replace the deleted entry with the last one (order doesn't matter). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-26 23:17:53 +02:00
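The v2 note in the commit above describes deleting an entry from a fixed-size array by overwriting it with the last entry rather than shifting the remainder. A minimal sketch of that pattern is below; the limit of 16 comes from the commit message, while `struct query`, `struct active_queries`, and both function names are hypothetical and not the llvmpipe definitions.

```c
#include <stdbool.h>

#define MAX_ACTIVE_QUERIES 16 /* limit named in the commit message */

struct query { int id; }; /* hypothetical stand-in type */

struct active_queries {
   struct query *entries[MAX_ACTIVE_QUERIES];
   unsigned count;
};

/* Append a query; fails when the fixed array is already full. */
static bool
active_queries_add(struct active_queries *aq, struct query *q)
{
   if (aq->count >= MAX_ACTIVE_QUERIES)
      return false;
   aq->entries[aq->count] = q;
   aq->count++;
   return true;
}

/* Remove a query by moving the last entry into its slot. Order is not
 * preserved, which is acceptable when order does not matter. */
static bool
active_queries_remove(struct active_queries *aq, const struct query *q)
{
   for (unsigned i = 0; i < aq->count; i++) {
      if (aq->entries[i] == q) {
         aq->count--;
         aq->entries[i] = aq->entries[aq->count];
         return true;
      }
   }
   return false;
}
```

Removal still needs a linear search to find the entry, but with at most 16 entries that cost is negligible, and no tail of the array has to be copied.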
Roland Scheidegger	0820342880	llvmpipe: rework query logic Previously lp_rast_begin_query commands were always inserted into each bin, and re-issued if the scene was restarted, while lp_rast_end_query commands were executed for each still active query at the end of tile rasterization. Also, the ps_invocations and vis_counter were set to zero when the respective command was encountered. This however cannot work for multiple queries of the same type (note that occlusion counter and occlusion predicate while different type were also affected). So, change the logic to always set the ps_invocations and vis_counter to zero at the start of tile rasterization, and then use "start" and "end" per-thread query values when encountering the begin/end query commands instead, which should work for multiple queries of the same type. This also means queries do not have to be reissued in a new scene, however they still need to be finished at end of tile rasterization, so a list of queries still active at the end of a scene needs to be maintained. Also while here don't bin the queries which don't do anything in rasterization. (This change does not actually handle multiple queries of the same type yet, as the list of active queries is just a simple fixed array and setup can still only have one query active per type.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-26 23:17:53 +02:00
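The reworked logic in the commit above (zero the per-thread counter once per tile, then record "start" and "end" values per query instead of resetting the counter at each begin) can be sketched as follows. All type and function names here are hypothetical, and the sketch ignores binning and threading; it only shows why the approach tolerates several queries of the same type.

```c
#include <stdint.h>

struct thread_counters { uint64_t vis_counter; };            /* per thread */
struct query_result { uint64_t start; uint64_t total; };     /* per query  */

/* The counter is zeroed at the start of tile rasterization only. */
static void
tile_begin(struct thread_counters *t)
{
   t->vis_counter = 0;
}

/* Beginning a query snapshots the counter instead of resetting it. */
static void
query_begin(const struct thread_counters *t, struct query_result *q)
{
   q->start = t->vis_counter;
}

/* Ending a query accumulates the delta since its own snapshot, so
 * other queries sharing the same counter are unaffected. */
static void
query_end(const struct thread_counters *t, struct query_result *q)
{
   q->total += t->vis_counter - q->start;
}
```

Because each query keeps its own snapshot, two overlapping queries that read the same counter each end up with the count for their own interval.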
Eric Anholt	3dbba95b72	i965: Move the remaining intel code to the i965 directory. Now that i915's forked off, they don't need to live in a shared directory. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chad Versace <chad.versace@linux.intel.com> Acked-by: Adam Jackson <ajax@redhat.com> (and I hear second hand that idr is OK with it, too)	2013-06-26 12:28:26 -07:00
Eric Anholt	733d32f376	i915: Fork the shared code from i965. Of this 15000 lines of code in intel/, we've identified 4000 lines that are trivially unnecessary for i915, and another 1000 that are pointless for i965, and expect to find more as time goes on. Split the i915 driver off, so that we can continue active development on i965 without worrying about breaking i915. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Chad Versace <chad.versace@linux.intel.com> Acked-by: Adam Jackson <ajax@redhat.com> (and I hear second hand that idr is OK with it, too)	2013-06-26 12:28:25 -07:00
Eric Anholt	43a6795a1f	i915: Remove dead symlink.	2013-06-26 12:28:25 -07:00
Eric Anholt	fc32d40534	glx: Fix another missed glMultiDrawElementsEXT const change. The build was broken for me since `b7d9478f36`.	2013-06-26 12:28:25 -07:00
Ian Romanick	c170c901d0	glsl: Move all var decls to the front of the IR list in reverse order This has the (intended!) side effect that vertex shader inputs and fragment shader outputs will appear in the IR in the same order that they appeared in the shader code. This results in the locations being assigned in the declared order. Many (arguably buggy) applications depend on this behavior, and it matches what nearly all other drivers do. Fixes the (new) piglit test attrib-assignments. NOTE: This is a candidate for stable release branches (and requires the previous commit to prevent a regression in OpenGL ES 2.0 conformance test stencil_plane_operation). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-06-26 12:27:23 -07:00
Ian Romanick	329cd6a9b1	i965: Be more careful with the interleaved user array upload optimization The checks to determine when the data can be uploaded in an interleaved fashion can be tricked by certain data layouts. For example, float data[...]; glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, 16, &data[0]); glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, 16, &data[4]); glDrawArrays(GL_POINTS, 0, 1); will hit the interleaved path with an incorrect size (16 bytes instead of 32 bytes). As a result, the data for attribute 1 never gets uploaded. The single element draw case is the only sensible case I can think of for non-interleaved-that-looks-like-interleaved data, but there may be others as well. To fix this, make sure that the end of the element in the array being checked is within the stride "window." Previously the code would check that the beginning of the element was within the window. NOTE: This is a candidate for stable branches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-26 12:27:23 -07:00
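The difference between the two window checks described in the commit above can be illustrated with a small sketch. The function names are hypothetical, and the exact comparison used by the old driver code (`<=` here) is an assumption; the commit message only says that the old check looked at the beginning of the element and the new one looks at its end.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed form of the old check: only the start of the element has to
 * lie within the stride window. */
static bool
starts_in_window(size_t offset, size_t stride)
{
   return offset <= stride;
}

/* Form of the check described in the fix: the end of the element has
 * to lie within the stride window. */
static bool
ends_in_window(size_t offset, size_t elem_size, size_t stride)
{
   return offset + elem_size <= stride;
}
```

With the layout from the commit message, attribute 1 starts 16 bytes after attribute 0, has 16 bytes of data, and uses a stride of 16. Under the assumed old check it is accepted as interleaved, while `ends_in_window(16, 16, 16)` rejects it because the element ends at byte 32.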
Brian Paul	b7d9478f36	mesa: add const qualifier to glMultiDrawElementsEXT() indices param The 20130624 version of glext.h changed this to match the glMultiDrawElements() function which already had the extra const qualifier. Fixes warnings/errors that seem to vary from one compiler to the next. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-26 13:12:01 -06:00
Brian Paul	15436adab0	mesa: remove const from glDebugMessageCallbackARB() function parameter The new 20130624 version of glext.h removed the const qualifier on the 'userParam' parameter. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-26 13:12:01 -06:00
Kenneth Graunke	dd0b99b0be	i965/vs: Combine code generation's inst->opcode switch statements. vec4_visitor::generate_code() switches on vec4_instruction::opcode and calls into the brw_eu_emit.c layer to generate code for some of them. It then has a default case which calls generate_vec4_instruction() to handle the rest...which switches on opcode and handles the rest of the cases. The split apparently is that generate_code() handles the actual hardware opcodes (BRW_OPCODE_*) while generate_vec4_instruction() handles the virtual opcodes (SHADER_OPCODE_* and VS_OPCODE_*). But this looks fairly arbitrary, and it makes more sense to combine the two switches. This patch moves the cases from generate_code() into the helper function so that generate_code() isn't as large. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-26 11:25:13 -07:00
Kenneth Graunke	55272883ac	i965: Remove broken source type assertions from brw_alu3(). Commit `526ffdfc03` attempted to generalize the source register type assertions to allow D and UD. However, the src1 and src2 assertions actually checked src0.type against D and UD due to a copy and paste bug. It also began setting the source and destination register types based on dest.type, ignoring src0/src1/src2.type completely. BFE and BFI2 may actually pass mixed D/UD types and expect them to be ignored, which is arguably a bit sloppy, but not too crazy either. This patch simply removes the source register assertions as those values aren't used anyway. It also clarifies the comment above the block that sets the register types. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-06-26 11:25:13 -07:00
Kenneth Graunke	9321f3257f	i965: Add back strict type assertions for MAD and LRP. Commit `526ffdfc03` relaxed the type assertions in brw_alu3 to allow D/UD types (required by BFE and BFI2). This lost us the strict type checking for MAD and LRP, which require all four types to be float. This patch adds a new ALU3F wrapper which checks these once again. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-06-26 11:25:12 -07:00
Kenneth Graunke	4563dfe23a	glsl: Streamline the built-in type handling code. Over the last few years, the compiler has grown to support 7 different language versions and 6 extensions that add new built-in types. With more and more features being added, some of our core code has devolved into an unmaintainable spaghetti of sorts. A few problems with the old code: 1. Built-in types are declared...where exactly? The types in builtin_types.h were organized in arrays by the language version or extension they were introduced in. It's factored out to avoid duplicates---every type only exists in one array. But that means that sampler1D is declared in 110, sampler2D is in core types, sampler3D is a unique global not in a list...and so on. 2. Spaghetti call-chains with weird parameters: generate_300ES_types calls generate_130_types which calls generate_120_types and generate_EXT_texture_array_types, which calls generate_110_types, which calls generate_100ES_types...and more Except that ES doesn't want 1D types, so we have a skip_1d parameter. add_deprecated also falls into this category. 3. Missing type accessors. Common types have convenience pointers (like glsl_type::vec4_type), but others may not be accessible at all without a symbol table (for example, sampler types). 4. Global variable declarations in a header file? #include "builtin_types.h" in two C++ files would break the build. The new code addresses these problems. All built-in types are declared together in a single table, independent of when they were introduced. The macro that declares a new built-in type also creates a convenience pointer, so every type is available and it won't get out of sync. The code to populate a symbol table with the appropriate types for a particular language version and set of extensions is now a single table-driven function. The table lists the type name and GL/ES versions when it was introduced (similar to how the lexer handles reserved words). 
A single loop adds types based on the language version. Explicit extension checks then add additional types. If they were already added based on the language version, glsl_symbol_table simply ignores the request to add them a second time, meaning we don't need to worry about duplicates and can simply list types where they belong. v2: Mark uvecs and shadow samplers as ES3 only, and 1DArrayShadow as unsupported in ES entirely. Add a touch more doxygen. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-26 11:25:12 -07:00
Kenneth Graunke	818da74af5	glsl: Don't use random pointers as an array of glsl_type objects. Using a random glsl_type convenience pointer as an array is a really bad idea, for all the reasons mentioned in the previous commit. The new glsl_type::bvec() function is simpler anyway. Prevents breakage in the next commit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-26 11:25:12 -07:00
Kenneth Graunke	4530ed4f26	glsl: Stop being clever with pointer arithmetic when fetching types. Currently, vector types are linked together closely: the glsl_type objects for float, vec2, vec3, and vec4 are all elements of the same array, in that exact order. This makes it possible to obtain vector types via pointer arithmetic on the scalar type's convenience pointer. For example, float_type + (3 - 1) = vec3. However, relying on this is extremely fragile. There's no particular reason the underlying type objects need to be stored in an array. They could be individual class members, possibly with padding between them. Then the pointer arithmetic would break, and we'd get bad pointers to non-heap allocated data, causing subtle breakage that can't be detected by valgrind. Cue insanity. Or someone could simply reorder the type variables, causing us to get the wrong type entirely. Also cue insanity. Writing this explicitly is much safer. With the new helper functions, it's a bit less code even. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-26 11:25:12 -07:00
Kenneth Graunke	d367a1cbdb	glsl: Add simple vector type accessor helpers. This patch introduces new functions to quickly grab a pointer to a vector type. For example: glsl_type::bvec(4) returns glsl_type::bvec4_type glsl_type::ivec(3) returns glsl_type::ivec3_type glsl_type::uvec(2) returns glsl_type::uvec2_type glsl_type::vec(1) returns glsl_type::float_type This is less wordy than glsl_type::get_instance(GLSL_TYPE_BOOL, 4, 1), which can help avoid extra word wrapping. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-26 11:25:12 -07:00
Brian Paul	9a14e412d6	mesa: update glext.h to version 20130624 In glapi_priv.h we always need the typedef for the GLclampx type since GL_OES_fixed_point is now defined in glext.h but the GLclampx type is not. GLclampx is not used by anything in glext.h but we need it for GL ES dispatch. This is a huge patch because the structure of the file has been changed. The following extensions are new, however: GL_AMD_interleaved_elements GL_AMD_shader_trinary_minmax GL_IBM_static_data GL_INTEL_map_texture GL_NV_compute_program5 GL_NV_deep_texture3D GL_NV_draw_texture GL_NV_shader_atomic_counters GL_NV_shader_storage_buffer_object GL_NVX_conditional_render GL_OES_byte_coordinates GL_OES_compressed_paletted_texture GL_OES_fixed_point GL_OES_query_matrix GL_OES_single_precision And these extensions were removed: GL_FfdMaskSGIX GL_INGR_palette_buffer GL_INTEL_texture_scissor GL_SGI_depth_pass_instrument GL_SGIX_fog_scale GL_SGIX_impact_pixel_texture GL_SGIX_texture_select Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-06-26 10:43:27 -06:00
Brian Paul	bc6eb8068f	st/mesa: add casts to silence MSVC warnings	2013-06-26 10:42:59 -06:00
Brian Paul	202299d16e	st/mesa: make rtt_level, face, slice unsigned to silence MSVC warnings	2013-06-26 10:42:59 -06:00
Brian Paul	2285645aa2	hud: add float casts to silence MSVC warnings	2013-06-26 10:42:59 -06:00
Brian Paul	87d5a16927	hud: include stdio.h since we use fprintf(), fscanf(), etc	2013-06-26 10:42:59 -06:00
Brian Paul	61964a9ceb	hud: add cast to silence MSVC warning	2013-06-26 10:42:59 -06:00
Brian Paul	f06e60fde4	os: add cast in os_time_sleep() to silence MSVC warning	2013-06-26 10:42:59 -06:00
Brian Paul	21f8729c3d	vega: add some casts to silence MSVC warnings	2013-06-26 10:42:59 -06:00
Brian Paul	4d452f1988	util: int/unsigned changes to silence some MSVC warnings	2013-06-26 10:42:59 -06:00
Brian Paul	bbdd7cfb8b	util: add some casts to silence some MSVC warnings	2013-06-26 10:42:59 -06:00
Brian Paul	aab8ca8fd1	util: s/int/unsigned/ to silence some MSVC warnings	2013-06-26 10:42:58 -06:00
Maarten Lankhorst	e72cc26518	nvc0: set rsvd_kick correctly This prevents trampling beyond the end of the command stream during flushes. NOTE: This is a candidate for the stable branches. Reported-by: Christoph Bumiller <christoph.bumiller@speed.at> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-06-26 16:50:08 +02:00
Maarten Lankhorst	30c2c34464	nvc0: fix push_space checks for video decoding	2013-06-26 16:18:42 +02:00
Vinson Lee	e6479b4330	ilo: Remove max_threads dead code path. max_threads cannot be greater than 28. It is either 21 or 28. Fixes "Logically dead code" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2013-06-26 21:51:07 +08:00
Jean-Sébastien Pédron	c6d52f2290	winsys/intel: fix typo in "ETIMEOUT" Should be "ETIMEDOUT". [olv: commit message slightly re-formatted] Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2013-06-26 21:51:07 +08:00
Chia-I Wu	c610b67972	ilo: use a bitmask for enabled constant buffers Looping over 4 * 13 constant buffers while in most cases only two are enabled is stupid.	2013-06-26 21:50:26 +08:00
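A minimal sketch of the bitmask idea in the entry above, assuming a state struct in which bit i marks constant buffer i as enabled. The struct and function names are hypothetical, not the ilo driver's.

```c
#include <stdint.h>

/* Hypothetical state: bit i is set when constant buffer i is enabled. */
struct cbuf_state {
   uint32_t enabled_mask;
};

/* Index of the lowest set bit; mask must be non-zero. */
int lowest_bit(uint32_t mask)
{
   int i = 0;
   while (!(mask & 1u)) {
      mask >>= 1;
      i++;
   }
   return i;
}

/* Visit only the enabled buffers instead of every slot. Stores the
 * visited indices in out[] and returns how many were stored. */
int collect_enabled(const struct cbuf_state *s, int *out, int max_out)
{
   uint32_t mask = s->enabled_mask;
   int n = 0;

   while (mask && n < max_out) {
      out[n++] = lowest_bit(mask);
      mask &= mask - 1; /* clear the lowest set bit */
   }
   return n;
}
```

The loop runs once per enabled buffer, so the common case of two enabled buffers costs two iterations regardless of how many slots exist.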
Maarten Lankhorst	9aebad618c	vl/mpeg12: handle mpeg-1 bitstreams more correctly Add support for D-frames. Add support for slices ending on a different horizontal row of macroblocks.	2013-06-26 11:40:47 +02:00
Chia-I Wu	95c21f12f3	ilo: support PIPE_CAP_USER_INDEX_BUFFERS We want to access the user buffer, if available, when primitive restart is enabled and the restart index/primitive type is not natively supported. And since we are handling index buffer uploads in the driver with this change, we can also work around misalignment of index buffer offsets.	2013-06-26 16:42:46 +08:00
Chia-I Wu	5fb5d4f0a6	ilo: make pipe_draw_info a context state Rename ilo_finalize_states() to ilo_finalize_3d_states(), and bind pipe_draw_info to the context when it is called. This saves us from having to pass pipe_draw_info around in several places.	2013-06-26 16:42:46 +08:00
Chia-I Wu	3eb6754e94	ilo: support PIPE_CAP_USER_CONSTANT_BUFFERS We need it for HUD support, and will need it for push constants in the future.	2013-06-26 16:42:45 +08:00
Eric Anholt	79385950f3	i915: Drop dead batch dumping code. Batch dumping is now handled by shared code in libdrm. Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-26 01:07:12 -07:00
Eric Anholt	57407bcaf8	intel: Drop little bits of dead code. I noticed these while building the fork-i915 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-26 01:07:12 -07:00
Eric Anholt	88514d922e	i965: Stop recomputing the miptree's size from the texture image. We've already computed what the dimensions of the miptree are, and stored it in the miptree. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-26 01:07:12 -07:00
Eric Anholt	820325b258	i965: Drop unused argument to translate_tex_format(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-26 01:07:11 -07:00
Eric Anholt	c20f973c4f	i965/gen4-5: Stop using bogus polygon_offset_scale field. The polygon offset math used for triangles by the WM is "OffsetUnits * 2 * MRD + OffsetFactor * m" where 'MRD' is the minimum resolvable difference for the depth buffer (~1/(1<<16) or ~1/(1<<24)), 'm' is the approximated slope from the GL spec, and '2' is this magic number from the original i965 code dump that we deviate from the GL spec by because "it makes glean work" (except that it doesn't, because of some hilarity with 0.5 * approximately 2.0 != 1.0. go glean!). This clipper code for unfilled polygons, on the other hand, was doing "OffsetUnits * garbage + OffsetFactor * m", where garbage was MRD in the case of 16-bit depth visual (regardless of the FBO's depth resolution), or 128 * MRD for 24-bit depth visual. This change just makes the unfilled polygons behavior match the WM's filled polygons behavior. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-26 01:07:11 -07:00
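The WM formula quoted in the entry above, written out as a small sketch. It assumes MRD is taken as exactly 1/(1 << depth_bits), where the message only gives it approximately; the function names are hypothetical and this is not driver code.

```c
/* Assumed minimum resolvable difference for a depth buffer with
 * `depth_bits` bits (16 or 24 in the commit message). */
double mrd_for_depth_bits(int depth_bits)
{
   return 1.0 / (double)(1u << depth_bits);
}

/* "OffsetUnits * 2 * MRD + OffsetFactor * m" from the commit message;
 * `m` is the approximated slope from the GL spec. */
double polygon_offset(double offset_units, double offset_factor,
                      double m, int depth_bits)
{
   return offset_units * 2.0 * mrd_for_depth_bits(depth_bits)
          + offset_factor * m;
}
```

For a 16-bit depth buffer with OffsetUnits = 1 and OffsetFactor = 0, this gives 2/65536.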
Eric Anholt	dba46831b0	i915: Use the current drawbuffer's depth for polygon offset scale. There's no reason to care about the window system visual's depth for handling polygon offset in an FBO, and it could only lead to pain. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-26 01:07:11 -07:00
Eric Anholt	c31aee99f3	intel: Add perf debug for glCopyPixels() fallback checks. The separate function for the fallback checks wasn't particularly clarifying things, so I put the improved checks in the caller. (Note that the dropped _mesa_update_state() had already happened once at the start of the caller) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-26 01:07:11 -07:00
Eric Anholt	a2ca98b211	i965: Add debug to INTEL_DEBUG=blorp describing hiz/blit/clear ops. I think we've all added instrumentation at one point or another to see what's being called in blorp. Now you can quickly get output like: Testing glCopyPixels(depth). intel_hiz_exec depth clear to mt 0x16d9160 level 0 layer 0 intel_hiz_exec depth resolve to mt 0x16d9160 level 0 layer 0 intel_hiz_exec hiz ambiguate to mt 0x16d9160 level 0 layer 0 intel_hiz_exec depth resolve to mt 0x16d9160 level 0 layer 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-26 01:07:11 -07:00
Eric Anholt	da00782ed8	ra: Fix register spilling. Commit `551c991606` tried to avoid spilling registers that were trivially colorable. But since we do optimistic coloring, the top of the stack also contains nodes that are not trivially colorable, so we need to consider them for spilling (since they are some of our best candidates). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58384 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63674 NOTE: This is a candidate for the 9.1 branch.	2013-06-26 01:07:11 -07:00
Eric Anholt	c6d74a4992	i965/fs: Dump IR when fatally not compiling due to bad register spilling. It should never happen, but it does, and at this point, you're going to _mesa_problem() and abort() (unless it's just in precompile). Give the developer something to look at.	2013-06-26 01:07:11 -07:00
Naohiro Aota	95e145aaee	xmlpool/build: Make sure to set mo properly Some shells do not set variables sequentially in a statement, i.e. "a=X b=${a}" won't set "b" to "X" but to an empty value. This patch introduces ";" to make sure "mo" is set properly before the "lang" assignment. Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=471302	2013-06-25 21:22:56 -07:00
Eric Anholt	04e03d9645	i965: Remove the rest of brw_update_draw_buffer(). The last piece of code with an effect was flagging _NEW_BUFFERS. Only, that is already flagged from everything that calls this function: Mesa GL state updates flag it before even calling down into the driver, and the calls from the DRI2 window system framebuffer update path end up flagging it as part of the ResizeBuffers() hook. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-25 19:19:22 -07:00
Eric Anholt	c39111509d	i965: Stop updating FBO state on drawbuffers change. The computed fields are updated appropriately as part of the normal draw call path due to _NEW_BUFFERS being set. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-25 19:19:22 -07:00
Eric Anholt	9d523e3372	i965: Stop recomputing drawbuffer bounds on drawbuffer change. For winsys FBOs, the bounds are appropriately updated immediately upon _mesa_resize_framebuffer(). For user FBOs, they're updated as part of the normal draw path state update due to _NEW_BUFFERS having been flagged. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-25 19:19:21 -07:00
Eric Anholt	15c47481ba	i965: Remove _NEW_DEPTH state flagging on drawbuffers change. Of the places noting a _NEW_DEPTH dependency, all were already checking for _NEW_BUFFERS if appropriate. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-25 19:19:21 -07:00
Eric Anholt	94ecf913b4	intel: Stop doing special _NEW_STENCIL state flagging on drawbuffers. 2/3 packets depending on Stencil._Enabled already checked for _NEW_BUFFERS, so just add _NEW_BUFFERS to the remaining one. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-25 19:19:21 -07:00
Eric Anholt	3faccc42ad	i965: Stop flagging viewport/scissor change on drawbuffers change. The viewport (ctx->Viewport._WindowMap) doesn't change with drawable size changes, and we update scissor (ctx->DrawBuffer->_Xmin and friends) on _NEW_BUFFERS in things like brw_sf_state.c. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-25 19:19:21 -07:00
Eric Anholt	438f85717d	i965: Stop flagging _NEW_POLYGON on drawbuffers change. Things like brw_sf.c that need to know about orientation are already recomputing on _NEW_BUFFERS. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-25 19:19:21 -07:00
Eric Anholt	b04c718ebd	radeon: Remove gratuitous custom framebuffer resize code. _mesa_resize_framebuffer(), the default value of the ResizeBuffers hook, already checks for a window system framebuffer and walks the renderbuffers calling AllocStorage(). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-25 19:19:21 -07:00
Eric Anholt	17bc8fdb1d	intel: Remove gratuitous custom framebuffer resize code. _mesa_resize_framebuffer(), the default value of the ResizeBuffers hook, already checks for a window system framebuffer and walks the renderbuffers calling AllocStorage(). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-25 19:19:21 -07:00
Eric Anholt	d7165b383d	mesa: Remove the Initialized field from framebuffers. This existed to tell the core not to call GetBufferSize, except that even if you didn't set it nothing happened because nobody had a GetBufferSize. v2: Remove two more instances of setting the field (from Brian) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-25 19:19:20 -07:00
Eric Anholt	bab755ad1b	mesa: Remove Driver.GetBufferSize and its callers. Only the GDI driver set it to non-NULL any more, and that driver has a Viewport hook that should keep it limping along as well as it ever has. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-25 19:19:20 -07:00
Vinson Lee	61bfed2d09	glsl: Fix gl_shader_program::UniformLocationBaseScale assert. commit `26d86d26f9` added gl_shader_program::UniformLocationBaseScale. According to the code comments in that commit, UniformLocationBaseScale "must be >=1". UniformLocationBaseScale is of type unsigned. Coverity reported a "Macro compares unsigned to 0" defect as well. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-06-25 18:45:01 -07:00
Brian Paul	0b994961ff	svga: allow 3D transfers in svga_texture_transfer_map() Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-25 17:54:24 -06:00
Brian Paul	808da7d8ca	svga: use new svga_define_texture_level() helper To get array bounds checking. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-25 17:54:24 -06:00
Brian Paul	2cc27c3faa	svga: fix layer/level mix-up in svga_mark_surface_dirty() Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-25 17:54:24 -06:00
Brian Paul	04e3969597	svga: use new svga_age_texture_view() helper The function does array bounds checking. Note, this exposes a bug in the svga_mark_surface_dirty() function: we're calling svga_age_texture_view() with a texture slice instead of mipmap level. This can lead to a failed assertion. That'll be fixed next. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-25 17:54:24 -06:00
Brian Paul	a4e4a413e5	svga: add array index assertion in svga_validate_sampler_view()	2013-06-25 17:54:24 -06:00
Brian Paul	82d6a52530	svga: use svga_texture() helper instead of casting	2013-06-25 17:54:23 -06:00
José Fonseca	464c6949cb	util/debug: Cleanup/improve debug_symbol_name_dbghelp. - use mgwhelp -- the successor for bfdhelp which does not have a hard dependency on BFD, and works on 64bits. - use a macro instead of hand-typing to dispatch DbgHelp functions - dump line numbers - dump module names when symbols are not available - support 64bits. - add comments Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-25 18:41:59 +01:00
José Fonseca	a26f834a39	util/debug: Make debug_backtrace_capture work for 64bit windows. Rely on Windows' CaptureStackBackTrace to do the grunt work. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-25 18:41:59 +01:00
Zack Rusin	29dacd9803	draw: allow overflows in the llvm paths Because our code couldn't handle it we were skipping rendering if we detected overflows. According to the spec we should still render but with all 0 vertices, which is what the llvm code already does. So for the llvm paths let's enable processing even if an overflow condition has been detected. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-06-25 11:57:01 -04:00
Zack Rusin	f96326b2f6	draw: avoid overflows in the llvm draw loop Before we could easily overflow if start+count>max integer. To avoid it we can just iterate over the count. This makes sure that we never crash, since most of the overflow conditions are already handled. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-06-25 11:56:41 -04:00
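A minimal sketch of why iterating over the count avoids the problem described in the entry above. It uses uint32_t so the wraparound is well defined; the function names are hypothetical and the loops only count iterations, they are not the draw module's code.

```c
#include <stdint.h>

/* Overflow-prone form: the end bound start + count wraps around when
 * it exceeds the 32-bit range, so the loop may not run at all. */
uint32_t iterations_with_end_bound(uint32_t start, uint32_t count)
{
   uint32_t end = start + count; /* may wrap */
   uint32_t n = 0;

   for (uint32_t i = start; i < end; i++)
      n++;
   return n;
}

/* Count-based form: the loop bound never depends on start + count. */
uint32_t iterations_over_count(uint32_t start, uint32_t count)
{
   uint32_t n = 0;

   for (uint32_t i = 0; i < count; i++) {
      uint32_t index = start + i; /* per-vertex index; may wrap */
      (void)index;
      n++;
   }
   return n;
}
```

With start = 0xFFFFFFFE and count = 4 the first form runs zero times, while the count-based form still runs four times and leaves out-of-range indices to be handled per vertex.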
Maarten Lankhorst	e2b02080d8	nvc0: do not set tiled mode on gart bo when fence debugging is used Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-06-25 13:34:15 +02:00
Chia-I Wu	c8240c9dea	ilo: honor render condition in blitter Make pass_render_condition() available for blitter, and check for render condition in (and only in) clear(), clear_render_target(), and clear_depth_stencil().	2013-06-25 15:38:07 +08:00
Chia-I Wu	5f4b769127	ilo: remove ilo_shader_internal.h from GEN6 pipeline Replace direct shader accesses with ilo_shader_get_kernel_param(), etc.	2013-06-25 13:51:59 +08:00
Chia-I Wu	63165df90f	ilo: remove ilo_shader_internal.h from GEN7 pipeline Replace direct shader accesses with ilo_shader_get_kernel_param(), etc.	2013-06-25 13:51:59 +08:00
Chia-I Wu	855b684141	ilo: speed up ilo_shader_select_kernel_routing() a bit Remember the order of the source attributes and avoid recomputation when it does not change.	2013-06-25 13:51:59 +08:00
Chia-I Wu	9b18df6e08	ilo: move SBE setup code to ilo_shader.c Add ilo_shader_select_kernel_routing() to construct 3DSTATE_SBE. It is called in ilo_finalize_states(), rather than in create_fs_state(), as it depends on VS/GS and rasterizer states. With this change, ilo_shader_internal.h is no longer needed for ilo_gpe_gen6.c.	2013-06-25 13:51:58 +08:00
Chia-I Wu	c4fa24ff08	ilo: use ilo_shader_state exclusively in GPE This allows us to remove ilo_shader_internal.h from ilo_gpe_gen7.c. The unfinished code in 3DSTATE_DS, 3DSTATE_HS, and INTERFACE_DESCRIPTOR_DATA are partly or entirely removed.	2013-06-25 13:18:08 +08:00
Chia-I Wu	91cf6c1e92	ilo: map SO registers at shader compile time The unmodified pipe_stream_output_info describes its outputs as if they are in TGSI_FILE_OUTPUT. Remap the register indices to where they appear in the VUE. TGSI_SEMANTIC_PSIZE needs a little care because it is at the W channel.	2013-06-25 13:18:08 +08:00
Chia-I Wu	68522bf36c	ilo: use ilo_shader_cso for FS Add ilo_gpe_init_fs_cso() to construct 3DSTATE_PS and shader part of 3DSTATE_WM once and early for fragment shaders.	2013-06-25 13:18:08 +08:00
Chia-I Wu	639a2cddc6	ilo: use ilo_rasterizer_state exclusively in GPE Replace pipe_rasterizer_state by ilo_rasterizer_state for the remaining GPE functions for consistency.	2013-06-25 13:18:07 +08:00
Chia-I Wu	54ab03523b	ilo: convert pipe_rasterizer_state to ilo_rasterizer_wm Add ilo_gpe_init_rasterizer_wm() to construct fixed-function part of 3DSTATE_WM once in create_rasterizer_state().	2013-06-25 13:17:56 +08:00
Chia-I Wu	851202c319	ilo: use ilo_shader_cso for GS Add ilo_gpe_init_gs_cso() to construct 3DSTATE_GS once and early for geometry shaders.	2013-06-25 13:17:21 +08:00
Chia-I Wu	d209da5e33	ilo: introduce ilo_shader_cso for VS When a new VS kernel is generated, a newly added function, ilo_gpe_init_vs_cso(), is called to construct 3DSTATE_VS command in ilo_shader_cso. When the command needs to be emitted later, we copy the command from the CSO instead of constructing it dynamically.	2013-06-25 12:42:04 +08:00
Chia-I Wu	5c8db569ab	ilo: add functions to query shaders Add ilo_shader_get_type() to query the type (PIPE_SHADER_x) of the shader. Add ilo_shader_get_kernel_offset() and ilo_shader_get_kernel_param() to query the cache offset and various kernel parameters of the selected kernel.	2013-06-25 12:28:54 +08:00
Chia-I Wu	96e2133e72	ilo: clean up finalize_shader_states() Add ilo_shader_select_kernel() to replace the dependency table, ilo_shader_variant_init(), and ilo_shader_state_use_variant(). With the changes, we no longer need to include ilo_shader_internal.h in ilo_state.c.	2013-06-25 12:10:34 +08:00
Chia-I Wu	f0afedeb75	ilo: use multiple entry points for shader creation Replace ilo_shader_state_create() by ilo_shader_create_vs() ilo_shader_create_gs() ilo_shader_create_fs() ilo_shader_create_cs() Rename ilo_shader_state_destroy() to ilo_shader_destroy(). The old ilo_shader_destroy() is renamed to ilo_shader_destroy_kernel().	2013-06-25 11:54:14 +08:00
Chia-I Wu	4d789c76dc	ilo: move internal shader interface to a new header Move it to ilo_shader_internal.h. The goal is to make files not part of the compiler include only ilo_shader.h eventually.	2013-06-25 11:51:26 +08:00
Brian Paul	e3cbb18321	gallium/hud: do not use free() for the free_query_data hook That confuses Gallium's memory debugging code where CALLOC/MALLOC must be matched with FREE, not free(). Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-06-24 14:23:54 -06:00
Matthew McClure	e5bf19ac1c	draw: check for out-of-memory conditions in the AA line module. To prevent segfaults in the AA line module, the code will check for a valid pointer to the aaline_stage in the draw context. Fixes segfault from backtrace: * aaline_stage_from_pipe aaline_delete_fs_state Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-24 08:36:47 -06:00
José Fonseca	06badea0da	tests/graw: Fix typo in shader-leak.c	2013-06-24 15:29:25 +01:00
José Fonseca	a3d75db022	tools/trace: Fix syntax. Cleaned/commented up the code, but forgot to actually test before committing...	2013-06-24 15:28:48 +01:00
Richard Sandiford	5a0556f061	st/dri/sw: Fix pitch calculation in drisw_update_tex_buffer swrastGetImage rounds the pitch up to 4 bytes for compatibility reasons that are explained in drisw_glx.c:bytes_per_line, so drisw_update_tex_buffer must do the same. Fixes window skew seen while running firefox over vnc on a 16-bit screen. NOTE: This is a candidate for the stable branches. [ajax: fixed typo in comment] Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>	2013-06-24 09:52:24 -04:00
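A minimal sketch of the pitch rounding described in the entry above, assuming the pitch is computed from a width in pixels and a byte size per pixel. The function name is hypothetical; it is not the code in drisw_update_tex_buffer.

```c
/* Bytes per row, rounded up to the next multiple of 4. */
unsigned pitch_rounded_to_4(unsigned width, unsigned bytes_per_pixel)
{
   unsigned row_bytes = width * bytes_per_pixel;

   return (row_bytes + 3u) & ~3u;
}
```

On a 16-bit screen an odd width gives a row size that is not a multiple of 4 (for example 5 pixels is 10 bytes, padded to 12), which is the case where an unpadded pitch would skew each successive row.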
Adam Jackson	2151d893fb	gallium: Fix llvmpipe on big-endian machines Squashed commit of the following: commit 0857a7e105bfcbc4d1431b2cc56612094c747ca3 Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:07 2013 -0400 gallivm: Fix lp_build_rgba8_to_fi32_soa for big endian Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 0d65131649a8aa140e2db228ba779d685c4333e3 Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:07 2013 -0400 gallivm: Fix big-endian machines This adds a bit-shift count to the format table, and adds the concept of vector or bitwise alignment on gathers. Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 9740bda9b7dc894b629ed38be9b51059ce90818f Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:07 2013 -0400 llvmpipe: Fix convert_to_blend_type on big-endian Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit ae037c2de0f029e4e99371c0de25560484f0d8df Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:06 2013 -0400 util: Convert color pack to packed formats This fixes them on big-endian. 
Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 5b05ac0c89ae092ea8ba5bba9f739708d7396b5c Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:06 2013 -0400 graw-xlib: Convert to packed formats Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 51396e7d098cb6ff794391cf11afe4dbf86dbea0 Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:06 2013 -0400 format: Convert to packed formats Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 417b60bc66eb450e68a92ab0e47f76e292b385e6 Author: Adam Jackson <ajax@redhat.com> Date: Tue Jun 18 12:25:06 2013 -0400 st/dri: Convert to packed formats Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 0934b2e022a5e0847d312c40734e2b44cac52fd8 Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:06 2013 -0400 st/xlib: Convert to packed formats Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit a307ea3c3716a706963acce7966b5e405ba11db9 Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:06 2013 -0400 gbm: Convert to packed formats Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 53eebdd253e1960a645ea278f31d7ef6a6cf4aeb Author: Richard Sandiford <r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:06 2013 -0400 tests: Convert to packed formats Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 2f77fe3ee524945eacd546efcac34f7799fb3124 Author: Adam Jackson <ajax@redhat.com> Date: Tue Jun 18 13:07:37 2013 -0400 gallium: Document packed formats Signed-off-by: Adam Jackson <ajax@redhat.com> commit 1f1017159ce951f922210a430de9229f91f62714 Author: Richard Sandiford 
<r.sandiford@uk.ibm.com> Date: Tue Jun 18 12:25:06 2013 -0400 gallium: Introduce 32-bit packed format names These are for interacting with buffers natively described in terms of bit shifts, like X11 visuals: uint32_t xyzw8888 = (x << 0) | (y << 8) | (z << 16) | (w << 24); Define these in terms of (endian-dependent) aliases to the array-style format names. Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com> commit 6cc7ab1ee66ed668da78c1d951dfd7782b4e786a Author: Adam Jackson <ajax@redhat.com> Date: Mon Jun 3 12:10:32 2013 -0400 gallium: Document format name conventions v2: - Fix a channel name thinko (Michel Dänzer) - Elaborate on SCALED versus INT - Add links to DirectX and FOURCC docs Signed-off-by: Adam Jackson <ajax@redhat.com> commit df4d269e7fb62051a3c029b84147465001e5776e Author: Adam Jackson <ajax@redhat.com> Date: Tue Jun 18 12:25:06 2013 -0400 gallivm: Remove all notion of byte-swapping Signed-off-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2013-06-24 09:48:56 -04:00
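The packed-format definition quoted in the message above can be illustrated with a small sketch. The function names below (`pack_xyzw8888`, `unpack_channel`, `packed_matches_xyzw_array`) are hypothetical and are not gallium identifiers; the sketch only shows why a shift-based packed value corresponds to the array-style byte order {x, y, z, w} on little-endian hosts and to the reverse order on big-endian ones.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pack four 8-bit channels by bit shift, as in the commit message:
 * x lands in the least significant byte, w in the most significant. */
static uint32_t pack_xyzw8888(uint8_t x, uint8_t y, uint8_t z, uint8_t w)
{
    return ((uint32_t)x << 0) | ((uint32_t)y << 8) |
           ((uint32_t)z << 16) | ((uint32_t)w << 24);
}

/* Extract channel 0..3 (x..w) from a packed value; independent of endianness. */
static uint8_t unpack_channel(uint32_t packed, unsigned index)
{
    assert(index < 4);
    return (uint8_t)(packed >> (8u * index));
}

/* Returns 1 when the bytes in memory read x, y, z, w in that order,
 * which is the case on little-endian hosts only. */
static int packed_matches_xyzw_array(void)
{
    const uint32_t packed = pack_xyzw8888(0x11, 0x22, 0x33, 0x44);
    uint8_t bytes[4];

    memcpy(bytes, &packed, sizeof bytes);
    return bytes[0] == 0x11 && bytes[1] == 0x22 &&
           bytes[2] == 0x33 && bytes[3] == 0x44;
}
```

The packed value itself is the same on every host; only its byte layout in memory differs, which is presumably why the packed names are described as endian-dependent aliases.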
Roland Scheidegger	d282f4ea9b	llvmpipe: fix wrong results for queries not in a scene The result isn't always 0 in this case (depends on query type), so instead of special casing this just use the ordinary path (should result in correct values thanks to initialization in query_begin/end), just skipping the fence wait. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-22 17:09:37 +02:00
Brian Paul	a415aa9489	gallium/docs: more documentation for pipe_resource::array_size It should never be zero and for cube/cube_arrays it should be a multiple of six. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-06-22 08:50:15 -06:00
Brian Paul	cba7939790	svga: minor cleanups, comments in svga_tgsi_insn.c	2013-06-22 08:49:09 -06:00
Brian Paul	b03f394508	svga: add null ptr check in svga_get_tex_sampler_view() Trivial.	2013-06-22 08:49:09 -06:00
José Fonseca	67bfdea933	tools/trace: Several tweaks/fixes to dump_state	2013-06-22 12:30:39 +01:00
José Fonseca	545d3d32d8	trace: Dump result of create_stream_output_target	2013-06-22 12:30:39 +01:00
Maarten Lankhorst	6aabd9490c	vl/mpeg12: fix mpeg-1 bytestream parsing This fixes the bytestream parsing of mpeg-1 stream, but still leaves open a number of issues with the interpretation: - IDCT mismatch control is not correct for MPEG-1. - Slices do not have to start and end on the same horizontal row of macroblocks. - picture_coding_type = 4 (D-pictures) is not handled. - full_pel_*_vector is not handled. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-06-22 09:40:15 +02:00
Rob Clark	efdc6caaf5	freedreno/a3xx/compiler: ensure min # of cycles after bary instr The results of a bary.f do not appear to be immediately available, but there is no explicit sync bit. Instead the compiler must just ensure that there are a minimum number of instructions following the bary before use of the result of the bary. We aren't clever enough for that so just throw in some nop's. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-06-21 15:37:05 -04:00
Rob Clark	d4aaa4439a	freedreno/a3xx/compiler: add TGSI_OPCODE_ABS Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-06-21 15:37:05 -04:00
Rob Clark	fe4ae1163d	freedreno/a3xx/compiler: add TGSI_OPCODE_DPH Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-06-21 15:37:05 -04:00
Rob Clark	3f965556b4	freedreno/a3xx/compiler: fix for replicating instructions If we are accumulating result into tmp.x, and need a mov to final destination, we want to move the .x component into all of the components enabled from the read dest's writemask, ie. we want: MOV dst.xyzw tmp.xxxx rather than: MOV dst.xyzw tmp.xyzw Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-06-21 15:37:05 -04:00
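The difference between the two MOV forms in the message above can be sketched in plain C. This is an illustration only, not freedreno compiler code; `mov_swizzled` and the `SWZ_*` names are hypothetical.

```c
#include <assert.h>

enum { SWZ_X = 0, SWZ_Y = 1, SWZ_Z = 2, SWZ_W = 3 };

/* dst[i] = src[swz[i]] for every component i enabled in writemask
 * (bit 0 = x, bit 1 = y, bit 2 = z, bit 3 = w). */
static void mov_swizzled(float dst[4], const float src[4],
                         const int swz[4], unsigned writemask)
{
    for (int i = 0; i < 4; i++) {
        assert(swz[i] >= SWZ_X && swz[i] <= SWZ_W);
        if (writemask & (1u << i))
            dst[i] = src[swz[i]];
    }
}
```

With the result accumulated in `tmp.x`, the `.xxxx` swizzle copies that one value into every enabled destination component, whereas `.xyzw` would copy whatever happens to be in `tmp.y`, `tmp.z` and `tmp.w`.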
Eric Anholt	0343f20e2f	mesa: Move the common _mesa_glsl_compile_shader() code to glsl/. This code had no relation to ir_to_mesa.cpp, since it was also used by intel and state_tracker, and most of it was duplicated with the standalone compiler (which has periodically drifted from the Mesa copy). v2: Split from the ir_to_mesa to shaderapi.c changes. Acked-by: Paul Berry <stereotype441@gmail.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-21 10:04:30 -07:00
Eric Anholt	10c14d16d2	mesa: Move shader compiler API code to shaderapi.c There was nothing ir_to_mesa-specific about this code, but it's not exactly part of the compiler's core turning-source-into-IR job either. v2: Split from the ir_to_mesa to glsl/ commit, avoid renaming the sh variable. Acked-by: Paul Berry <stereotype441@gmail.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-21 10:04:29 -07:00
Eric Anholt	88398a817c	mesa: Fix missing setting of shader->IsES. I noticed this while trying to merge code with the builtin compiler, which does set it. Note that this causes two regressions in piglit in default-precision-sampler.* which try to link without a vertex or fragment shader, due to being run under the desktop glslparsertest binary (using ARB_ES3_compatibility) that doesn't know about this requirement. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-06-21 10:04:29 -07:00
Eric Anholt	faf3dbad0d	mesa: Use shared code for converting shader targets to short strings. We were duplicating this code all over the place, and they all would need updating for the next set of shader targets. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-06-21 10:04:29 -07:00
Eric Anholt	426ca34b7a	glsl: Remove ir_print_visitor.h includes and usage We have ir->print() to do the old declaration of a visitor and having the IR accept the visitor (yuck!). And now you can call _mesa_print_ir() safely anywhere that you know what an ir_instruction is. A couple of missing printf("\n")s are added in error paths -- when an expression is handed to the visitor, it doesn't print '\n' (since it might be a step in printing a whole expression tree). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-06-21 10:04:29 -07:00
Eric Anholt	2b049aa53e	glsl: Make _mesa_print_ir() available from anything including ir.h. No more forgetting to #include "ir_print_visitor.h" when doing temporary debug code, or forgetting and leaving it in after removing your temporary debug code. Also, available from C code so you don't need to move the caller to C++ just to call it (see also: ir_to_mesa.cpp). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-06-21 10:04:29 -07:00
Paul Berry	d0abac22c3	glsl: Make some files safe to include from C Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-21 10:04:28 -07:00
José Fonseca	2d7e837716	tools/trace: Quick instructions/notes. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-21 14:30:20 +01:00
José Fonseca	c14f516e58	tools/trace: Do a better job at comparing multi line strings. For TGSI diffing. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-21 14:30:20 +01:00
José Fonseca	9b7d21f8f5	tools/trace: Tool to compare json state dumps. Copied verbatim from apitrace's scripts/jsondiff.py Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-21 14:30:20 +01:00
José Fonseca	cc4ad695ca	tools/trace: Tool to dump gallium state at any draw call. Based on the code from the good old python state tracker. Extremely handy to diagnose regressions in state trackers. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-21 14:30:20 +01:00
José Fonseca	a7bccb33b9	tools/trace: Defer blob hex-decoding. To speed up parsing. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-21 14:30:19 +01:00
José Fonseca	a8f7e12d92	trace: Don't dump texture transfers. Huge trace files with little value. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-21 14:30:19 +01:00
Chia-I Wu	bbd2d575e6	ilo: replace a boolean by bool bool is used internally. This is just cosmetic.	2013-06-20 11:40:20 +08:00
Chia-I Wu	8b2cba8f97	ilo: rename cache_seqno to uploaded It has been used as a bool since shader cache rework.	2013-06-20 11:36:54 +08:00
Roland Scheidegger	ffebefa114	util: (trivial) add has_popcnt field Not used yet but there's a couple of places in llvmpipe which should use this (occlusion count is currently very inefficient if there's no cpu popcnt instruction).	2013-06-19 23:47:36 +02:00
Roland Scheidegger	5c9aee111e	llvmpipe: use 64bit counter for occlusion queries Some APIs require 64bit and at least for 64bit archs the overhead should be minimal. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-19 23:47:36 +02:00
Roland Scheidegger	dc5dc4fd94	llvmpipe: handle more queries Handle PIPE_QUERY_GPU_FINISHED and PIPE_QUERY_TIMESTAMP_DISJOINT, and also fill out the ps_invocations and c_primitives from the PIPE_QUERY_PIPELINE_STATISTICS (the others in there should already be handled). Note that ps_invocations isn't pixel exact, just 16 pixel exact but I guess it's better than nothing. Doesn't really seem to work correctly but there's probably bugs elsewhere. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-19 23:47:36 +02:00
Roland Scheidegger	bf5096303f	softpipe: handle all queries, and change for the new disjoint semantics The driver can do render_condition but wasn't handling the occlusion and so_overflow predicates (though the latter might not work yet due to gs support). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-19 23:47:36 +02:00
Roland Scheidegger	cdf89d0b5c	gallium: fix PIPE_QUERY_TIMESTAMP_DISJOINT The semantics didn't really make sense, matching neither d3d9 (though the docs are all broken there) nor d3d10. So make it match d3d10 semantics, which actually gives meaning to the "disjoint" part. Drivers are fixed up in a very primitive way, I have no idea what could actually cause the counter to become unreliable so just always return FALSE for the disjoint part. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-19 23:47:35 +02:00
José Fonseca	a0a40805dd	trace: Dump pipe_rasterizer_state::clip_halfz. Trivial.	2013-06-19 18:16:16 +01:00
Brian Paul	1e16e48f88	svga: add some comments about primitive conversion And clean up the svga_translate_prim() function with better variable names. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-19 11:13:14 -06:00
Brian Paul	8b3d4efed8	indices: add some comments This is pretty complicated code with few/any comments. Here's a first stab. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-19 11:13:14 -06:00
Brian Paul	2e8c51c98f	svga: reindent svga_tgsi.c Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-19 11:13:14 -06:00
Brian Paul	0de01a47dd	svga: whitespace, comment, formatting fixes in svga_tgsi_emit.h Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-19 11:13:14 -06:00
Brian Paul	1f57349e20	svga: move some svga/tgsi functions Move some functions from the svga_tgsi_insn.h header into the svga_tgsi_insn.c file since they're only used there. Plus, add comments and fix formatting. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-19 11:13:14 -06:00
Brian Paul	3abd9285be	svga: formatting fixes in svga_tgsi_insn.c Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-19 11:13:13 -06:00
Brian Paul	9e6c29bf12	mesa: wrap comments, code to 78 columns in multisample.c Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-19 11:13:13 -06:00
Brian Paul	bdd5a0c12b	mesa: remove unused BITSET64 macros Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-19 11:13:13 -06:00
Maarten Lankhorst	f1cccd6ca0	nvc0: kill assert in ppp code It's no longer always true, and the video tiling alignment should ensure the alignment is handled correctly regardless.	2013-06-19 13:08:51 +02:00
Chia-I Wu	cf41fae96b	ilo: rework shader cache The new code makes the shader cache manage all shaders and lets it upload all of them to a caller-provided bo as a whole. Previously, we uploaded only the bound shaders. When a different set of shaders was bound, we had to allocate a new kernel bo to upload to if the current one was busy.	2013-06-19 16:46:42 +08:00
Emil Velikov	7f7b05d6b3	nv50: avoid crash on updating RASTERIZE_ENABLE state When doing blit using the 3D engine, the rasterizer cso may be NULL. Ported from nvc0 commit `8aa8b0539`. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-06-19 00:02:24 +02:00
Kristian Høgsberg	712269d674	wayland: Handle global_remove event as well We need to set up a handler for the global_remove event that gets sent out when a global gets removed. Without the handler we end up calling a NULL pointer. https://bugs.freedesktop.org/show_bug.cgi?id=65910 NOTE: This is a candidate for the stable branches. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2013-06-18 17:45:19 -04:00
Jordan Justen	adeda5afd4	gen7: fix GPU hang on WebGL texture-size test When rendering to a texture with BaseLevel set, the miptree may be laid out such that BaseLevel is in level 0 of the miptree (to avoid wasting memory on unused levels between 0 and BaseLevel-1). In that case, we have to shift our render target's level down to the appropriate level of the smaller miptree. The WebGL test in combination with a meta code relating to glGenerateMipmap also triggered a similar failure scenario. This GPU hang regression was introduced by `c754f7a8`. Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=65324 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-06-18 14:06:46 -07:00
Eric Anholt	248fddecd8	intel: Remove unused IS_POWER_OF_TWO() macro. The is_power_of_two() inline function has been used instead. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-06-18 12:08:08 -07:00
Zack Rusin	9542131b27	Revert "draw: clear the draw buffers in draw" This reverts commit `41966fdb3b`. While it's a lot cleaner it causes regressions because the draw interface is always called from the draw functions of the drivers (because the buffers need to be mapped) which means that the stream output buffers end up being cleared on every draw rather than on setting. Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-06-17 21:43:10 -04:00
Roland Scheidegger	8975dc798d	llvmpipe: fixes for conditional rendering honor render_condition for clear_render_target and clear_depth_stencil. Also add minimal support for occlusion predicate, though it can't be active at the same time as an occlusion query yet. While here also switchify some large if-else (actually just mutually exclusive if-if-if...) constructs. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-18 18:01:24 +02:00
Roland Scheidegger	793e8e3d7e	gallium: add condition parameter to render_condition For conditional rendering this makes it possible to skip rendering if either the predicate is true or false, as supported by d3d10 (in fact previously it was sort of implied skip rendering if predicate is false for occlusion predicate, and true for so_overflow predicate). There's no cap bit for this as presumably all drivers could do it trivially (but this patch does not implement it for the drivers using true hw predicates, nvxx, r600, radeonsi, no change is expected for OpenGL functionality). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-18 18:01:24 +02:00
Chia-I Wu	443dc15cf7	ilo: construct depth/stencil command in create_surface() Add ilo_gpe_init_zs_surface() to construct 3DSTATE_DEPTH_BUFFER 3DSTATE_STENCIL_BUFFER 3DSTATE_HIER_DEPTH_BUFFER at surface creation time. This allows fast state emission in draw_vbo().	2013-06-18 16:23:13 +08:00
Eric Anholt	eb20215075	intel: Allow blorp CopyTexSubImage to nonzero destination slices. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-06-17 15:43:23 -07:00
Eric Anholt	746b57ef0e	intel: Allow blit CopyTexSubImage to nonzero destination slices. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-06-17 15:43:23 -07:00
Eric Anholt	b0e3c3b852	intel: Directly implement blit glBlitFramebuffer instead of awkward reuse. This gets us support for blitting to attachment types other than textures. v2: fix up comments from review by Kenneth. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-06-17 15:43:23 -07:00
Eric Anholt	815dce9282	intel: Move XRGB->ARGB blit logic into intel_miptree_blit(). Now any caller (such as glCopyPixels()) can benefit from it, and it only changes the correct subset of the destination instead of a whole teximage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-06-17 15:43:23 -07:00
Eric Anholt	04a5e940c9	intel: Fix Y tiling support for glCopyTexSubImage's alpha override. Apparently we don't have any piglit tests for this, because it would have assertion failed in a debug build, or just rendered wrong in a non-debug build if the destination wasn't covering whole tiles. v2: Use the new macros. Reviewed-by: Paul Berry <stereotype441@gmail.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2013-06-17 15:43:23 -07:00
Eric Anholt	78c2fc5925	intel: Make batch macros for doing BCS_SWCTRL setup. We're going to add more BCS_SWCTRL setup instances soon, and you have to be careful to have the set and restore atomic with the rendering that's done, so that our state doesn't leak out to other rendering processes. v2: Rewrite the patch to have batch begin/advance macros so that magic numbers don't get sprinkled around (and so you don't mix up your do-I-need-to-reset vs what-do-I-reset-to logic, which I nearly did in the next patch when first writing it) Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-17 15:43:13 -07:00
Eric Anholt	b65b1c3148	mesa: Hide weirdness of 1D_ARRAY textures from Driver.CopyTexSubImage(). Intel had brokenness here, and I'd like to continue moving Mesa toward hiding 1D_ARRAY's ridiculousness inside of the core, like we did with MapTextureImage. Fixes copyteximage 1D_ARRAY on intel. There's still an impedance mismatch in meta when falling back to read and texsubimage, since texsubimage expects coordinates into 1D_ARRAY as (width, slice, 0) instead of (width, 0, slice). v2: Fix offset of scanline reads from the source. (Thanks Brian!), replace dd.h comment with Paul's text and replace early exit with an assert. Reviewed-by: Brian Paul <brianp@vmware.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Reviewed-by: Paul Berry <stereotype441@gmail.com> (v1)	2013-06-17 15:26:20 -07:00
Dave Airlie	9e8400f4c9	tgsi: text parser: fix parsing of array in declaration I noticed this code didn't work as advertised while doing some passing around of TGSI shaders and trying to reparse them, and things failing. This seems to fix it here for at least the small test case I hacked into a graw test. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-06-18 08:25:12 +10:00
Sven Joachim	0829b893a9	mesa: Fix ieee fp on Alpha Commit `1f82bf12ed` inadvertently broke it, checking for __IEEE_FLOAT on all Alpha machines instead of only on VMS as before. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com> Signed-off-by: Sven Joachim <svenjoac@gmx.de>	2013-06-17 10:02:56 -07:00
Richard Sandiford	c132c2978b	st/xlib: Fix XImage stride calculation Fixes window skew seen while running gnome on a 16-bit screen over vnc. NOTE: This is a candidate for stable release branches. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>	2013-06-17 12:15:13 -04:00
Richard Sandiford	876fefe2ff	st/xlib: Fix XImage bytes-per-pixel calculation Fixes a crash seen while running gnome on a 16-bit screen over vnc. NOTE: This is a candidate for stable release branches. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>	2013-06-17 12:14:32 -04:00
Jonathan Gray	ebd68dd029	gallium: replace bswap_32 calls with util_bswap32 byteswap.h and bswap_32 aren't portable, replace them with calls to gallium's util_bswap32 as suggested by Mark Kettenis. Lets these files build on OpenBSD. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-06-17 17:22:28 +02:00
Zack Rusin	7807763dd8	draw: fix a regression in computing max elt gl can use elts without setting indices, in which case our eltMax was set to 0 and always invoking the overflow condition. So by default set eltMax to maximum, it will be curbed by draw_set_indexes (if it ever comes) and if not then it will let gl's glVertexPointer/glDrawArrays work correctly. Fixes piglit's triangle-rasterization-overdraw test. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-17 11:06:39 -04:00
Zack Rusin	41966fdb3b	draw: clear the draw buffers in draw Moves clearing of the draw so target buffers to the draw module. They had to be cleared in the drivers before which was quite messy. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-17 11:06:39 -04:00
Chia-I Wu	98bc4c62a6	ilo: add pipe-based copy method to ilo_blitter It enables accelerated resource_copy_region() when blt-based method fails.	2013-06-17 18:28:58 +08:00
Chia-I Wu	ebfd7a61c0	ilo: add BLT-based blitting methods to ilo_blitter Port BLT code in ilo_blit.c to BLT-based blitting methods of ilo_blitter. Add BLT-based clears. The latter is verified with util_clear(), but it is not in use yet.	2013-06-17 16:36:53 +08:00
Chia-I Wu	b4b3a5c6dc	ilo: replace util_blitter by ilo_blitter ilo_blitter is just a wrapper for util_blitter for now. We will port BLT code to ilo_blitter shortly.	2013-06-17 14:37:10 +08:00
Kenneth Graunke	6d7abafdc8	i965: Assume flexible hardware primitive restart exists in the future. Primitive restart with an arbitrary cut index was first supported as of Haswell. It's very doubtful that they'd take that away in future hardware, so we may as well alter the check now.	2013-06-14 22:58:18 -07:00
Chris Forbes	def84d8014	i965: Shrink Gen5 VUE map layout to be the same as Gen4. The PRM suggests a larger layout, mostly to support having gl_ClipDistance[] somewhere predictable for the fixed-function clipper -- but it didn't actually arrive in Gen5. Just use the same layout for both Gen4 and Gen5. No Piglit regressions. Improves performance in CS:S Video Stress Test by ~3%. V2: - Remove now-useless function for determining the SF URB read offset - Remove now-unused BRW_VARYING_SLOT_POS_DUPLICATE Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-16 01:05:41 +12:00
Kenneth Graunke	1b77d2133c	i965: Implement 16-wide math on G45 and Ironlake. [chrisf:] Improves performance in CS:S video stress test by about 2%. No piglit regressions on Ironlake. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-06-16 00:47:50 +12:00
Matt Turner	fcaa48d9cc	glsl: Disallow return with a void argument from void functions. NOTE: This is a candidate for the stable branches. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-14 11:25:49 -07:00
Matt Turner	1a1b03e6bc	glsl: Allow implicit conversion of return values. Required by ARB_shading_language_420pack. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-14 11:25:49 -07:00
Matt Turner	876e16562b	glsl: Add gl_{Max,Min}ProgramTexelOffset built-in constants. Required by ARB_shading_language_420pack. Note that the 420pack spec incorrectly specifies their values as (Min, Max) = (-7, 8) when they should be (-8, 7) as listed in the GLSL 4.30 and ESSL 3.0 specs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-14 11:25:49 -07:00
Matt Turner	ed455cdb0b	glsl: Allow swizzles on scalars. Required by ARB_shading_language_420pack. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-14 11:25:49 -07:00
Matt Turner	a8492e8fe7	glsl: Allow .length() method on vectors and matrices. Required by ARB_shading_language_420pack. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-14 11:25:49 -07:00
Todd Previte	cf7f424e18	mesa: Add infrastructure for ARB_shading_language_420pack. v2 [mattst88] - Split infrastructure into separate patch. - Add preprocessor #define. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-14 11:25:48 -07:00
Chia-I Wu	bfa8d21759	ilo: fix for half-float vertex arrays Commit `6fe0453c33` broke half-float vertex arrays. This reverts a part of that commit, and explains why.	2013-06-15 01:00:03 +08:00
Chia-I Wu	36ffd08706	ilo: add some assertions to help debugging Assert that we do not support user vertex/index/constant buffers. Issue a warning when a sampler view is created for a resource without PIPE_BIND_SAMPLER_VIEW.	2013-06-14 16:02:31 +08:00
Chia-I Wu	0d9afaad35	ilo: silence a compiler warning The path should never be hit.	2013-06-14 15:36:30 +08:00
Vinson Lee	93534873b0	glsl: Fix null check in read_dereference. Fixes "Logically dead code" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-13 22:13:34 -07:00
Chia-I Wu	399548b17f	st/mesa: fix temp texture bindings in st_CopyPixels() The temporary texture should have either PIPE_BIND_RENDER_TARGET or PIPE_BIND_DEPTH_STENCIL set in addition to PIPE_BIND_SAMPLER_VIEW. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-06-14 08:46:04 +08:00
Zack Rusin	5507c11f85	gallium/draw: add limits to the clip and cull distances There are strict limits on those registers. Define the maximums and use them instead of magic numbers. Also allows us to add some extra sanity checks. Suggested by Brian. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-13 12:13:11 -04:00
Zack Rusin	b63eeaf7b7	draw: cleanup the distance culling code a bit We don't need the clamped variable, because we can just return early. We should also do the regular culling after the distance culling passes. All spotted by Brian. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-13 12:13:01 -04:00
Chia-I Wu	c7e9b15010	ilo: mapping a resource may make some states dirty When a resource is busy and is mapped with PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE, the underlying bo is replaced. We need to mark states affected by the resource dirty. With this change, we no longer have to emit vertex buffers and index buffer unconditionally.	2013-06-13 23:47:18 +08:00
Chia-I Wu	5f15050dc9	ilo: bump up PIPE_CAP_GLSL_FEATURE_LEVEL to 140 With UBO and TBO support, we are supposedly good to claim GLSL 1.40.	2013-06-13 23:47:18 +08:00
Chia-I Wu	4df85dbc06	ilo: initialize dirty flags in ilo_init_states() Now that we have a function to initialize states, initialize dirty flags there too.	2013-06-13 23:47:18 +08:00
Chia-I Wu	6057d7b7b5	ilo: re-emit states that involve resources Even with hardware contexts, since we do not pin resources, we have to re-emit the states so that the resources are referenced (by cp->bo) and their offsets are updated in case they are moved. This also allows us to eliminate cp flush in is_bo_busy().	2013-06-13 12:58:47 +08:00
Chia-I Wu	b65bdc61bd	ilo: fix for util_blitter_clear() changes It has been broken since `17350ea979`.	2013-06-13 12:58:47 +08:00
Manfred Ernst	bf2c074a2f	mesa: Fix bug in unclamped float to ubyte conversion. Problem: The IEEE float optimized version of UNCLAMPED_FLOAT_TO_UBYTE in macros.h computed incorrect results for inputs in the range 0x3f7f0000 (=0.99609375) to 0x3f7f7f80 (=0.99803924560546875) inclusive. 0x3f7f7f80 is the IEEE float value that results in 254.5 when multiplied by 255. With rounding mode "round to closest even integer", this is the largest float in the range 0.0-1.0 that is converted to 254 by the generic implementation of UNCLAMPED_FLOAT_TO_UBYTE. The IEEE float optimized version incorrectly defined the cut-off for mapping to 255 as 0x3f7f0000 (=255.0/256.0). The same bug was present in the function float_to_ubyte in u_math.h. Fix: The proposed fix replaces the incorrect cut-off value by 0x3f800000, which is the IEEE float representation of 1.0f. 0x3f7f7f81 (or any value in between) would also work, but 1.0f is probably cleaner. The patch does not regress piglit on llvmpipe and on i965 on sandy bridge. Tested-by Stéphane Marchesin <marcheu@chromium.org> Reviewed-by Stéphane Marchesin <marcheu@chromium.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-12 20:24:48 -07:00
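The boundary values discussed in the message above can be checked against a reference conversion. The sketch below is not the Mesa macro; `float_to_ubyte_ref` and `float_from_bits` are hypothetical helpers that clamp, scale by 255 and round half to even, which is the behaviour the message attributes to the generic implementation. It assumes IEEE 754 single-precision `float`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float (assumes IEEE 754 binary32). */
static float float_from_bits(uint32_t bits)
{
    float f;

    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Reference conversion: clamp to [0, 1], scale by 255, round half to even. */
static uint8_t float_to_ubyte_ref(float f)
{
    float scaled, frac;
    int whole;

    if (!(f > 0.0f))            /* also catches NaN */
        return 0;
    if (f >= 1.0f)
        return 255;

    scaled = f * 255.0f;        /* in (0, 255] */
    whole = (int)scaled;        /* truncation equals floor for positive values */
    assert(whole >= 0 && whole <= 255);
    frac = scaled - (float)whole;
    if (frac > 0.5f || (frac == 0.5f && (whole & 1)))
        whole++;
    return (uint8_t)whole;
}
```

Under these assumptions both ends of the range named in the message, 0x3f7f0000 and 0x3f7f7f80, convert to 254, so a fast path that returns 255 for every bit pattern at or above 0x3f7f0000 disagrees with the reference over that whole range.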
Marek Olšák	3475b22133	st/dri: if flushing a drawable, don't set reason=SWAPBUFFERS 0 means SWAPBUFFERS. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-13 03:54:14 +02:00
Marek Olšák	a713d7b1b9	st/dri: resolve the back buffer only in SwapBuffers Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-13 03:54:14 +02:00
Marek Olšák	3b525036b9	st/dri: manually swap MSAA front and back buffers in SwapBuffers Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-13 03:54:14 +02:00
Marek Olšák	b77316ad75	st/dri: always copy new DRI front and back buffers to corresponding MSAA buffers This commit fixes these piglit tests with an MSAA visual forced on: - read-front - glx-copy-sub-buffer Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-13 03:54:14 +02:00
Marek Olšák	fdf9d234e2	st/dri: refactor dri_msaa_resolve The generic blit will be used by the following commit. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-13 03:54:14 +02:00
Marek Olšák	6c6cfc02c9	st/dri: reuse depth-stencil and MSAA resources after DRI2 invalidate event Page flipping generates an invalidate event every frame, causing reallocations of all private resources (MSAA and depth-stencil). Reusing the resources may improve performance (especially under memory pressure). Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-13 03:54:14 +02:00
Marek Olšák	683b065320	st/dri: fix MSAA resolving of buffers with height > width Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-13 03:54:14 +02:00
Marek Olšák	526ebfa278	st/mesa: make generic CopyPixels path work with MSAA visuals We have to use pipe->blit, not resource_copy_region, so that the read buffer is resolved if it's multisampled. I also removed the CPU-based copying, which just did format conversion (obsoleted by the blit). Also, the layer/slice/face of the read buffer is taken into account (this was ignored). Last but not least, the format choosing is improved to take float and integer read buffers into account. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-13 03:54:14 +02:00
Marek Olšák	9ef44e6eb7	st/mesa: don't use blit_copy_pixels if an occlusion query is active CopyPixels, just as DrawPixels, should count the samples that passed depth test. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-13 03:54:13 +02:00
Marek Olšák	79e421260a	st/mesa: rework blit_copy_pixels to use pipe->blit There were 2 issues with it: - resource_copy_region doesn't allow different sample counts of both src and dst, which can occur if we blit between a window and a FBO, and the window has an MSAA colorbuffer and the FBO doesn't. (this was the main motivation for using pipe->blit) - blitting from or to a non-zero layer/slice/face was broken, because rtt_face and rtt_slice were ignored. blit_copy_pixels is now used even if the formats and orientation of framebuffers don't match. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-13 03:54:13 +02:00
Marek Olšák	4d59258856	r600g: upsample and downsample MSAA resources for transfers We did downsample (=resolve) MSAA resources to make ReadPixels work with MSAA GLX visuals, which was enough for read-only color-only transfers. This commit makes write color transfers and depth-stencil transfers work in a similar manner. It does downsampling in transfer_map and upsampling in transfer_unmap. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-13 03:54:13 +02:00
Marek Olšák	72a086b8b2	gallium/u_format: add a new helper for initializing pipe_blit_info::mask Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-13 03:54:13 +02:00
Marek Olšák	d6d4a9a2e8	gallium/u_blitter: make clearing independent of the colorbuffer format There isn't any difference between 32_FLOAT and 32_*INT in vertex fetching. Both of them don't do any format conversion. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-13 03:54:13 +02:00
Marek Olšák	17350ea979	gallium/u_blitter: make clearing independent of the number of bound colorbuffers We can use the fragment shader TGSI property WRITES_ALL_CBUFS. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-13 03:54:13 +02:00
Marek Olšák	de1c38299c	gallium/util: make WRITES_ALL_CBUFS optional in the passthrough fragment shader Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-13 03:54:13 +02:00
Marek Olšák	45595d5066	mesa: fix OES_EGL_image_external being partially allowed in the core profile Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-06-13 03:54:13 +02:00
Ian Romanick	cfa3c5ad82	glsl: Generate smaller values for uniform locations Previously we would generate uniform locations as (slot << 16) + array_index. We do this to handle applications that assume the location of a[2] will be +1 from the location of a[1]. This resulted in every uniform location being at least 0x10000. The OpenGL 4.3 spec was amended to require this behavior, but previous versions did not require locations of array (or structure) members be sequential. We've now encountered two applications that assume uniform values will be "small." As far as we can tell, these applications store the GLint returned by glGetUniformLocation in an int16_t or possibly an int8_t. THIS BEHAVIOR IS NOT GUARANTEED OR IMPLIED BY ANY VERSION OF OpenGL. Other implementations happen to have both these behaviors (sequential array elements and small values) since OpenGL 2.0, so let's just match their behavior. Fixes "3D Bowling" on Android. NOTE: This is a candidate for stable release branches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>	2013-06-12 16:30:29 -07:00
Ian Romanick	26d86d26f9	glsl: Add gl_shader_program::UniformLocationBaseScale This is used by _mesa_uniform_merge_location_offset and _mesa_uniform_split_location_offset to determine how the base and offset are packed. Previously, this value was hard coded as (1U<<16) in those functions via the shift and mask contained therein. The value is still (1U<<16), but it can be changed in the future. The next patch dynamically generates this value. NOTE: This is a candidate for stable release branches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>	2013-06-12 16:30:18 -07:00
Ian Romanick	5097f35841	glsl: Add a gl_shader_program parameter to _mesa_uniform_{merge,split}_location_offset This will be used in the next commit. NOTE: This is a candidate for stable release branches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>	2013-06-12 16:30:06 -07:00
Roland Scheidegger	4cce4efaa3	util: new util_fill_box helper Use new util_fill_box helper for util_clear_render_target. (Also fix off-by-one map error.) v2: handle non-zero z correctly in new helper Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-13 00:41:43 +02:00
Roland Scheidegger	957c040eb8	gallivm: (trivial) remove duplicated code block (including comment)	2013-06-13 00:41:43 +02:00
Paul Berry	b09a754078	i965/gen7: Enable support for fast color clears. This patch adds code to place mcs_state into INTEL_MCS_STATE_RESOLVED for miptrees that are capable of supporting fast color clears. This will have no effect on buffers that don't undergo a fast color clear; however, for buffers that do undergo a fast color clear, an MCS miptree will be allocated (at the time of the first fast clear), and will be used thereafter. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-12 11:10:07 -07:00
Paul Berry	ef9142d4a3	i965/gen7+: Disable fast color clears on shared regions. In certain circumstances the memory region underlying a miptree is shared with other miptrees, or with other code outside Mesa's control. This happens, for instance, when an extension like GL_OES_EGL_image or GLX_EXT_texture_from_pixmap is used to associate a miptree with an image existing outside of Mesa. When this happens, we need to disable fast color clears on the miptree in question, since there's no good synchronization mechanism to ensure that deferred clear writes get performed by the time the buffer is examined from the other miptree, or from outside of Mesa. Fortunately, this should not be a performance hit for most applications, since most applications that use these extensions use them for importing textures into Mesa, rather than for exporting rendered images out of Mesa. So most of the time the miptrees involved will never experience a clear. v2: Rework based on the fact that we have decided not to use an accessor function to protect access to the region. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-12 11:10:07 -07:00
Paul Berry	67cd0f9703	i965/gen7+: Resolve color buffers when necessary. Resolve color buffers that have been fast-color cleared: 1. before texturing from the buffer (brw_predraw_resolve_buffers()) 2. before using the buffer as the source in a blorp blit (brw_blorp_blit_miptrees()) 3. before mapping the buffer's miptree (intel_miptree_map_raw(), intel_texsubimage_tiled_memcpy()) 4. before accessing the buffer using the hardware blitter (intel_miptree_blit(), do_blit_bitmap()) v2: Rework based on the fact that we have decided not to use an accessor function to protect access to the region. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-12 11:10:07 -07:00
Paul Berry	e9dfcb38e9	i965/gen7+: Ensure that front/back buffers are fast-clear resolved. We already had code in intel_downsample_for_dri2_flush() for downsampling front and back buffers when multisampling was in use. This patch extends that function to perform fast color clear resolves when necessary. To account for the additional functionality, the function is renamed to simply intel_resolve_for_dri2_flush(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-12 11:10:07 -07:00
Paul Berry	418aecea7d	i965/blorp: Write blorp code to do render target resolves. This patch implements the "render target resolve" blorp operation. This will be needed when a buffer that has experienced a fast color clear is later used for a purpose other than as a render target (texturing, glReadPixels, or swapped to the screen). It resolves any remaining deferred clear operation that was not taken care of during normal rendering. Fortunately not much work is necessary; all we need to do is scale down the size of the rectangle primitive being emitted, run the fragment shader with the "Render Target Resolve Enable" bit set, and ensure that the fragment shader writes to the render target using the "replicated color" message. We already have a fragment shader that does that (the shader that we use for fast color clears), so for simplicity we re-use it. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-12 11:10:07 -07:00
Paul Berry	fac32c0bd3	i965/blorp: Expand clear class hierarchy to prepare for RT resolves. The fragment shaders that do color clears will be re-used to perform so-called "render target resolves" (the resolves associated with fast color clears). To prepare for that, this patch expands the class hierarchy for blorp params by adding brw_blorp_const_color_params (which will be used for all blorp operations where the fragment shader outputs a constant color). Some other data structures and functions were also renamed to use "const_color" nomenclature where appropriate. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-12 11:10:06 -07:00
Paul Berry	5e5d4e021f	i965/gen7+: Implement fast color clear operation in BLORP. Since we defer allocation of the MCS miptree until the time of the fast clear operation, this patch also implements creation of the MCS miptree. In addition, this patch adds the field intel_mipmap_tree::fast_clear_color_value, which holds the most recent fast color clear value, if any. We use it to set the SURFACE_STATE's clear color for render targets. v2: Flag BRW_NEW_SURFACES when allocating the MCS miptree. Generate a perf_debug message if clearing to a color that isn't compatible with fast color clear. Fix "control reaches end of non-void function" build warning. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-12 11:10:06 -07:00
Paul Berry	dd3f950115	i965/gen7+: Create helper functions for single-sample MCS buffers. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-12 10:45:42 -07:00
Paul Berry	460b7bc7a1	i965/gen7+: Set up MCS in SURFACE_STATE whenever MCS is present. On Gen7+, MCS buffers are used both for compressed multisampled color buffers and for "fast clear" of single-sampled color buffers. Previous to this patch series, we didn't support fast clear, so we only used MCS with multisampled color buffers. As a first step to implementing fast clears, this patch modifies the code that sets up SURFACE_STATE so that it configures the MCS buffer whenever it is present, regardless of whether we are multisampling or not. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-12 10:45:42 -07:00
Paul Berry	7e5cb4bc4c	i965/gen7+: Create an enum for keeping track of fast color clear state. This patch includes code to update the fast color clear state appropriately when rendering occurs. The state will also need to be updated when a fast clear or a resolve operation is performed; those state updates will be added when the fast clear and resolve operations are added. v2: Create a new function, intel_miptree_used_for_rendering() to handle updating the fast color clear state when rendering occurs. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-12 10:45:42 -07:00
Paul Berry	8f5147c199	intel: Conditionally compile mcs-related code for i965 only. This patch ifdefs out intel_mipmap_tree::mcs_mt when building the i915 (pre-Gen4) driver (MCS buffers aren't supported until Gen7, so there is no need for this field in the i915 driver). This should make it a bit easier to implement fast color clears without undue risk to i915. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-12 10:45:42 -07:00
Paul Berry	a5efdca7b7	intel: Keep region name in intel_miptree_create_for_dri2_buffer(). When processing a buffer received from the X server, intel_process_dri2_buffer() examines intel_region::name to determine whether it's received a brand new buffer, or the same buffer it received from the X server the last time it made a request. However, this didn't work properly, because in the call to intel_miptree_create_for_dri2_buffer(), we create a fresh intel_region object to represent the buffer, and this was causing us to forget the buffer's previous name. This patch fixes things by copying over the region name when creating the fresh intel_region object. At the moment, this is just a minor performance optimization. However, when fast color clears are added, it will be necessary to ensure that the fast color clear state for a buffer doesn't get discarded the next time we receive that buffer from the X server. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-12 10:45:42 -07:00
Chia-I Wu	adf324ad28	winsys/intel: make struct intel_bo alias drm_intel_bo There is really nothing in struct intel_bo, and having it alias drm_intel_bo makes the winsys impose almost zero overhead. We can eliminate the overhead completely by making the functions static inline, if needed.	2013-06-12 17:46:52 +08:00
Chia-I Wu	e7a14eea16	winsys/intel: reorganize functions Move functions around to match the order of the declarations in the header.	2013-06-12 17:46:52 +08:00
Chia-I Wu	39226705b7	ilo: update winsys interface The motivation is to kill tiling and pitch in struct intel_bo. That requires us to make tiling and pitch not queryable, and be passed around as function parameters.	2013-06-12 17:46:52 +08:00
Chia-I Wu	cdfb2163c4	ilo: get rid of function tables in winsys We are moving toward making struct intel_bo alias drm_intel_bo. As a first step, we cannot have function tables.	2013-06-12 17:46:52 +08:00
Chia-I Wu	6fe0453c33	ilo: access bo size directly buf->bo_size is readily available, no need to go via buf->bo->get_size().	2013-06-12 17:46:52 +08:00
Chia-I Wu	3f79188854	ilo: remove unnecessary tex_set_bo/buf_set_bo Merge the bodies to tex_create_bo/buf_create_bo respectively.	2013-06-12 17:46:52 +08:00
Kenneth Graunke	b00d61151d	i965: Emit the depth/stencil state pointer directly, not via atoms. See two commits ago for the rationale. This allows us to delete the whole gen7_cc_state.c file. This does move these commands before the depth stall flushes from brw_emit_depthbuffer, which may be a problem. The documentation for 3DSTATE_DEPTH_BUFFER mentions that depth stall flushes are required before changing any depth/stencil buffer state, but explicitly lists 3DSTATE_DEPTH_BUFFER, 3DSTATE_HIER_DEPTH_BUFFER, 3DSTATE_STENCIL_BUFFER, and 3DSTATE_CLEAR_PARAMS. It does not mention this particular packet (_3DSTATE_DEPTH_STENCIL_STATE_POINTERS). No observed Piglit regressions on Sandybridge or Ivybridge. Together with the last two commits, this makes a cairo-gl benchmark faster by 0.324552% +/- 0.258355% on Ivybridge. No statistically significant change on Sandybridge. (Thanks to Eric for the numbers.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-11 15:42:17 -07:00
Kenneth Graunke	8ab15bacf4	i965: Emit the CC state pointer directly rather than via atoms. See the previous commit for the rationale. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-11 15:42:17 -07:00
Kenneth Graunke	da1a896b0f	i965: Emit the BLEND_STATE pointer directly rather than via atoms. Previously, we would: 1. Emit the new indirect state. 2. Flag CACHE_NEW_BLEND_STATE. 3. Rely on later state atoms to notice CACHE_NEW_BLEND_STATE and emit a pointer to the new indirect state. This is rather cumbersome: it requires two state atoms instead of one, and there's a strict ordering dependency in the list. Plus, the code gets spread across two functions (or even files in the case of Gen7+). Gen7+ has a packet to update just the blend state pointer, so it makes a lot of sense to simply emit that right away. Gen6 has a combined packet which updates blending, the color calculator, and depth/stencil state; however, each can still be modified independently. This drops the Gen6 micro-optimization where we tried to only emit one packet that changed all three states. State updates are pretty cheap. CACHE_NEW_BLEND_STATE is no longer necessary, so drop it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-11 15:42:16 -07:00
Zack Rusin	babe35a067	draw: implement distance culling Works similarly to clip distance. If the cull distance is negative for all vertices against a specific plane then the primitive is culled. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-06-10 22:04:28 -04:00
Zack Rusin	3d08eada34	gallium: add a cull distance semantic Cull distance is analogous to clip distance. If a register is given this semantic, then the values in it are assumed to be a float32 distance to a plane. Primitives will be completely discarded if the plane distances for all of the vertices in the primitive are < 0. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-06-10 22:04:28 -04:00
Zack Rusin	0a3779d955	draw: fix clipper invocation statistics We need to figure out the number of invocations of the clipper before the emit, because in the emit we are after clipping where the number of primitives will be equal to number of clipper invocations minus the clipped primitives. So our computations were always off by the number of clipped primitives. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-06-10 22:04:28 -04:00
Zack Rusin	2b2e7bb133	draw: enable user plane clipping when clipdistance is used Draw depended on clip_plane_enable being set in the rasterizer to use clipdistance registers for clipping. That's really unfriendly because it requires that rasterizer state to have variants for every shader out there. Instead of depending on the rasterizer, let's extract the info from the available state: if a shader writes clipdistance then we need to use it and we need to clip using a number of planes equal to the number of written clipdistance components. This way clipdistances just work. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-06-10 22:04:27 -04:00
Zack Rusin	c1a50f5ed7	draw: make sure clipdistances work with geometry shaders we were always fetching the info from the vertex shader, but if geometry shader is present it should be used as the source of that info. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-06-10 22:04:27 -04:00
Kenneth Graunke	3dacb7d40b	Revert "i965: Disable unused pipeline stages once at startup on Gen7+." This reverts commit `6c966ccf07`. Apparently causes GPU hangs. Conflicts: src/mesa/drivers/dri/i965/brw_state.h src/mesa/drivers/dri/i965/brw_state_upload.c	2013-06-11 10:53:44 -07:00
Brian Paul	42adf5f0dd	swrast: add texfetch code for some XBGR formats Fixes piglit texture-packed-formats regression. We need to implement more XBGR formats here eventually, but many are UINT/SINT formats which swrast doesn't handle yet anyway (integer textures). Bugzilla https://bugs.freedesktop.org/show_bug.cgi?id=64935 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-11 08:26:38 -06:00
Brian Paul	91405e3502	mesa: add missing texture strings in tex_target_name() And add a static assert for the future.	2013-06-10 16:35:35 -06:00
Alex Deucher	761320b197	winsys/radeon: add env var to disable VM on Cayman/Trinity Set env var RADEON_VA=0 to disable VM on Cayman/Trinity. Useful for debugging. Note: this is a candidate for the 9.1 branch. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-06-10 18:02:57 -04:00
Eric Anholt	fceff14450	mesa: Add a _mesa_problem to document a piglit failure on i965. Having figured out what was going on with piglit fbo-depth copypixels GL_DEPTH_COMPONENT32F (falling all the way back to swrast on CopyPixels to a float depth buffer), I'm not inclined to fix the problem currently but it seems worth saving someone else the debug time. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-06-10 14:04:25 -07:00
Eric Anholt	9a0bd682f9	i965/vs: Avoid the MUL/MACH/MOV sequence for small integer multiplies. We do a lot of multiplies by 3 or 4 for skinning shaders, and we can avoid the sequence if we just move them into the right argument of the MUL. On pre-IVB, this means reliably putting a constant in a position where it can't be constant folded, but that's still better than MUL/MACH/MOV. Improves GLB 2.7 trex performance by 0.788648% +/- 0.23865% (n=29/30) v2: Fix test for pre-sandybridge. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)	2013-06-10 14:04:24 -07:00
Eric Anholt	d28e285d41	i965/vs: Allow copy propagation into MUL/MACH. This is a trivial port of `1d6ead3804` from the FS. No significant performance difference on trex (misplaced the data, but it was about n=20). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-06-10 14:04:24 -07:00
Eric Anholt	263a7e4cd9	i965/vs: Use the MAD instruction when possible. This is different from how we do it in the FS - we are using MAD even when some of the args are constants, because with the relatively unrestrained ability to schedule a MOV to prepare a temporary with that data, we can get lower latency for the sequence of instructions. No significant performance difference on GLB2.7 trex (n=33/34), though it doesn't have that many MADs. I noticed MAD opportunities while reading the code for the DOTA2 bug. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-06-10 14:04:24 -07:00
Richard Sandiford	1ff10f92e7	draw: Add A8R8G8B8 to draw_print_arrays Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <r.sandiford@uk.ibm.com>	2013-06-10 16:28:31 -04:00
Richard Sandiford	5876a4c71d	draw: Fix type mismatch between draw_private.h and LLVM draw_vertex_buffer declared the size field to be a size_t, but the LLVM code used an int32 instead. This caused problems on big-endian 64-bit targets, because the first 32-bit chunk of the 64-bit size_t was always 0. In one sense size_t seems like a good choice for a size, so one fix would have been to try to get the LLVM code to use the equivalent of size_t too. However, in practice, the size is taken from things like ~0 or width0, both of which are int-sized, so it seemed simpler to make the size field int-sized as well. Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>	2013-06-10 16:26:14 -04:00
Richard Sandiford	337f21bc35	util: Use sizeof(void *) rather than 0 as the fallback cache line size Without this, llvmpipe ends up giving a zero size to all uncompressed textures on non-x86 systems, since align() cannot handle a 0 alignment. Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>	2013-06-10 16:26:09 -04:00
Richard Sandiford	ba6cd796dd	llvmpipe: Use saturating add/sub for UNORM formats lp_build_add and lp_build_sub have fallback code for cases that cannot be handled by known intrinsics. For UNORM formats, this code was using modulo rather than saturating arithmetic. This fixes some rendering issues for a gnome session on System z. It also fixes various piglit tests on z, such as spec/ARB_color_buffer_float/GL_RGBA8-render. The patch deliberately doesn't tackle the more complicated SNORM case. Tested against piglit on x86_64 and System z with no regressions. Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>	2013-06-10 16:20:45 -04:00
Kenneth Graunke	a0037cecd1	intel: Reserve less batchbuffer space. Now that Gen6+ relies on hardware contexts, we don't need to record an occlusion query value at the end of each batch. That means we no longer need to reserve space for the absurd number of PIPE_CONTROLs required to do that on Sandybridge. See commit `4e087de51a`, which bumped this up to 60 bytes. This is not quite a revert, as it uses 24 bytes instead of 16, and saves the comments. As far as I can tell, the old value of 16 bytes was just wrong, so we shouldn't go back to that. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-10 10:58:51 -07:00
Kenneth Graunke	fc800f0c60	i965: Allocate push constant L3 space once at startup on Gen7+. We always allocate the maximum amount of space and never change it, so it makes sense to do it once. Programming it on startup also lets us skip re-programming it from BLORP. This removes a tiny amount of overhead from our drawing loop. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-10 10:58:47 -07:00
Kenneth Graunke	6c966ccf07	i965: Disable unused pipeline stages once at startup on Gen7+. This removes a tiny bit of code from our drawing loop. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-10 10:58:46 -07:00
Kenneth Graunke	b607d57630	i965: Don't emit PIPELINE_SELECT from BLORP. Now that we emit invariant state at startup (and never select the media pipeline), the 3D pipeline will always already be selected, even if BLORP is the first operation. So this is unnecessary. v2: Fix unused variable warning (intel_context is no longer used). Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-10 10:58:44 -07:00
Kenneth Graunke	d671eb140f	i965: Emit invariant state once at startup on Gen6+. Now that we have hardware contexts, we can safely initialize our GPU state once at startup, rather than needing a state atom with the BRW_NEW_CONTEXT flag set. This removes a tiny bit of code from our drawing loop. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-10 10:58:42 -07:00
Kenneth Graunke	33b90804ee	i965: Delete some dead state atom prototypes. These atoms don't actually exist. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-10 10:58:40 -07:00
Kenneth Graunke	233de8e8d3	i965: Change return type of check_state() to bool. The existing code already returned a boolean; this just clarifies that. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-10 10:58:38 -07:00
Kenneth Graunke	650d5de6ea	i965: Remove unused second parameter of brw_print_dirty_count(). Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-10 10:58:29 -07:00
Kenneth Graunke	ca6b520f3a	glsl: Allow the use of determinant() in GLSL 1.50. We already implemented this for ES3, so we just need to turn it on. Fixes 6 Piglit tests: spec/glsl-1.50/compiler/built-in-functions/determinant-mat[234].{vert,frag} Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-10 10:54:57 -07:00
Kenneth Graunke	603940d5bb	glcpp: Automatically #define GL_core_profile 1 on GLSL 1.50+. Page 17 of the GLSL 1.50.11 specification states: "There is a built-in macro definition for each profile the implementation supports. All implementations provide the following macro: Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-10 10:54:56 -07:00
Kenneth Graunke	e203919a4e	glsl: Parse "#version 150 core" directives. Previously we only supported "#version 150". This patch recognizes "compatibility" to give the user a more descriptive error message. Fixes Piglit's version-150-core-profile test. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-10 10:54:42 -07:00
Kenneth Graunke	f730b1f72a	glsl: Bail on parsing if the #version directive is bogus. If we didn't successfully parse the #version line, there's no point in continuing with parsing and compiling: it's already failed. Furthermore, it can actually be harmful: right after handling #version, we call _mesa_glsl_initialize_types(), which checks state->es_shader and language_version. If it isn't valid, it hits an assertion failure. Fixes Piglit's "invalid-version-es." When processing "#version 110 es", our code set state->es_shader and state->language_version = 110. It then properly determined that this was invalid and flagged an error. Since we continued anyway, we hit the assertion mentioned above. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-10 10:50:12 -07:00
Chris Forbes	a2e3b1c4e2	dlist: fix save_SamplerParameteri This was building the temporary array to pass to save_SamplerParameteriv, and then not passing it. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-06-09 14:00:40 -07:00
Vinson Lee	ce1f85133d	mesa: Prevent possible out-of-bounds read by save_SamplerParameteriv. Fixes "Out-of-bounds access" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-08 13:32:53 -07:00
Maarten Lankhorst	26e047dec8	nvc0: fix up video buffer alignment requirements Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-06-08 20:11:33 +02:00
Rob Clark	e9edbf0a68	freedreno: better scissor fix Actually respect rasterizer state. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-06-08 13:15:51 -04:00
Rob Clark	4af1dcbb7d	freedreno: gmem bypass The GPU (at least a3xx, but I think also a2xx) can render directly to memory, bypassing tiling, although it can't do this if blend, depth, and a few other features of the pipeline are enabled. This direct memory mode can be faster for some sorts of operations, such as simple blits. In particular, this significantly speeds up XA by avoiding the need to pull the entire dest pixmap into GMEM, render tiles, and write it all back out again. This should also speed up resource copy-region and blit. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-06-08 13:15:51 -04:00
Rob Clark	2855f3f7bc	freedreno: add a3xx support The adreno a3xx GPU is found in newer snapdragon devices, such as the nexus4. The a3xx is GLESv3 and OpenCL capable, although that is not enabled yet in gallium. Compared to a2xx, it introduces an entirely new unified shader ISA, and re-shuffles all or nearly all of the registers. The good news is that (for the most part) the registers are more orthogonal, not combining unrelated state in a single register. And that there is a lot more flexibility, so we don't need to patch and re-emit the shader like we did on a2xx. The shader compiler is currently quite dumb, there would be a lot of room for improvement with an optimizing pass. Despite that, with the a320 in my nexus4 it seems to be ~2-3x faster compared to the a220 in my HP touchpad. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-06-08 13:15:51 -04:00
Rob Clark	18c317b21d	freedreno: prepare for a3xx Split the parts that are specific to adreno a2xx series GPUs from the parts that will be in common with a3xx, so that a3xx support can be added more cleanly. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-06-08 13:15:51 -04:00
Roland Scheidegger	213c207b3a	gallivm: work around slow code generated for interleaving 128bit vectors We use 128bit vector interleave for untwiddling in the blend code (with 256bit vectors). llvm generates terrible code for this for some reason, so instead of generating a shuffle for 2 128bit vectors use an extract/insert shuffle instead (it only seems to matter we're not using 128bit wide vectors for the shuffle). This decreases instruction count of the blend code generated for a rgba8 render target without blending from 169 to 113 with llvm 3.1 and from 136 to 114 in llvm 3.2/3.3, and I got a ~8% (llvm 3.1) and ~5% (3.2/3.3) performance improvement in gears. (The generated code is still not terribly good as we could actually avoid the interleaving completely but llvm can't know this.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-08 17:33:51 +02:00
José Fonseca	0aca2c6b60	scons: Fix implicit python dependency discovery on Windows. Probably due to CRLF endings, the discovery of python import statements was not working on Windows builds, causing incremental builds to often fail unless one wiped out the build directory. NOTE: This is a candidate for stable branches.	2013-06-08 08:55:06 +01:00
Stéphane Marchesin	4f905d4900	st/xlib: Flush the front buffer before doing CopySubBuffer We flush pending rendering before running CopySubBuffer, which ensures that the right bits get to the screen. NOTE: This is a candidate for stable release branches. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-07 18:53:54 -07:00
Stéphane Marchesin	4e5416b0e2	st/xlib: Fix upside down coordinates for CopySubBuffer The coordinates need to be inverted between glX and gallium. NOTE: This is a candidate for stable release branches. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-07 18:53:54 -07:00
Eric Anholt	3c21a7d3c9	mesa: Report core FBO incompleteness cases through GL_ARB_debug_output. Just like we produce from inside the Intel driver, this can help provide information quickly about FBO incompatibility problems (particularly when using apitrace replay). Currently, in driver-marked incompleteness cases, you'll get both the driver message and the core message on Intel. Until the other drivers are fixed to produce output, I think this is better than not putting in a message for driver-marked incomplete. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-07 16:05:42 -07:00
Paul Berry	9e3475b39a	intel: flush fake front buffer if server is about to destroy it. Fixes piglit test "spec/!OpenGL 1.0/gl-1.0-front-invalidate-back" Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-06-07 13:42:34 -07:00
Paul Berry	447df5eaba	intel: flush fake front buffer more robustly. When a fake front buffer is in use, if we request the front buffer (using screen->dri2.loader->getBuffersWithFormat()), the X server copies the real front buffer to the fake front buffer and returns the fake front buffer. We sometimes make redundant requests for the front buffer (due to using a single counter to track invalidates for both the front and back buffers), so there's a danger of pending front buffer rendering getting overwritten when the redundant front buffer request occurs. Previous to this patch, intel_update_renderbuffers() worked around that problem by sometimes doing intel_flush() and intel_flush_front() before calling intel_query_dri2_buffers(). But it only did the workaround when the front buffer was bound for drawing; it didn't do it when the front buffer was bound for reading. This patch moves the workaround code to intel_query_dri2_buffers(), so that it happens in exactly the circumstances where it is needed. This should fix some of the sporadic failures in Piglit tests fbo-sys-blit and fbo-sys-sub-blit. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-06-07 13:26:43 -07:00
Paul Berry	03cc310313	intel: make intel_flush_front safe to call during initial MakeCurrent The patch that follows will fix a bug that prevents intel_flush_front() from being called often enough. In doing so, it will create a situation where intel_flush_front() is called during the initial call to glXMakeCurrent(). In this circumstance, ctx->DrawBuffer hasn't been initialized yet and is NULL. Fortunately, intel->front_buffer_dirty is false, so intel_flush_front() doesn't actually need to do anything. To avoid a segfault, swap the order of terms in intel_flush_front()'s if statement. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-06-07 13:26:36 -07:00
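The reordering described in 03cc310313 relies on `&&` short-circuit evaluation. A minimal sketch, assuming hypothetical stand-in types (`fake_context`, `fake_framebuffer`) and a hypothetical helper `needs_front_flush()` rather than the driver's real structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's context and framebuffer. */
struct fake_framebuffer {
    bool is_winsys;
};

struct fake_context {
    struct fake_framebuffer *draw_buffer; /* NULL during the initial MakeCurrent */
    bool front_buffer_dirty;
};

/* The cheap flag is tested first. When it is false, && short-circuits
 * and draw_buffer is never dereferenced, so a NULL pointer is harmless. */
static bool needs_front_flush(const struct fake_context *ctx)
{
    assert(ctx != NULL);
    return ctx->front_buffer_dirty && ctx->draw_buffer->is_winsys;
}
```

With the terms in the opposite order, the dereference would run first and crash whenever `draw_buffer` is still NULL.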
Eric Anholt	bc8bfdc42c	mesa: Expose MAX_FRAGMENT_INPUT_COMPONENTS on ES3 and desktop 3.2. piglit OpenGL ES 3.0/minmax now passes. This was also one of the subcase failures in OpenGL 3.2/minmax (and still is, because our value is too low for 3.2, but at least we report what it is). Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-07 12:55:07 -07:00
Eric Anholt	7500ad23eb	mesa: Expose texture array getters on GLES3. Part of fixing piglit OpenGL ES 3.0/minmax. v2: s/_gles3/_es3/ in extra name, for consistency (review by Matt). Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2013-06-07 12:55:06 -07:00
Eric Anholt	fd27e82ded	mesa: Fix the return value of TEXTURE_BINDING_2D_ARRAY. Noticed by inspection when reviewing the next commit. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-07 12:55:06 -07:00
Eric Anholt	11ace8a827	mesa: Expose texel offset limits in GLES3. Part of fixing piglit OpenGL ES 3.0/minmax. v2: s/_gles3/_es3/ in extra name, for consistency (review by Matt). Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2013-06-07 12:55:06 -07:00
Roland Scheidegger	fa8cefa892	util: add comment about bogus transfer flags	2013-06-07 21:15:01 +02:00
Roland Scheidegger	b47d13f425	util: fix util_clear_render_target and util_clear_depth_stencil layer handling These functions must clear all bound layers, not just the first. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-07 21:15:01 +02:00
Roland Scheidegger	201d7a352b	llvmpipe: move create_surface/destroy_surface functions to lp_surface.c Believe it or not but these two are actually the first two functions which really belong in this file nowadays. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-07 21:15:01 +02:00
Roland Scheidegger	d8146f240e	llvmpipe: add support for layered rendering Mostly just make sure the layer parameter gets passed through to the right places (and get clamped, can do this at setup time), fix up clears to clear all layers and disable opaque optimization. Luckily don't need to touch the jitted code. (Clears invoked via pipe's clear_render_target method will not work however since the pipe_util_clear function used for it doesn't handle clearing multiple layers yet.) v2: per Brian's suggestion, prettify var initialization and add some comments, add assertion for impossible layer specification for surface. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-07 21:15:01 +02:00
Roland Scheidegger	0f4c08aea2	gallium/docs: fix up transfer description for 1d arrays, add cube map arrays Transfers always use z/depth for layers no matter if it's a 1d or 2d array texture, we don't follow OpenGL's craziness there. Luckily this appears to only be a doc bug, everyone doing the right thing already. While here also document z/depth parameter for cube map arrays. v2: fix typo spotted by Eric Anholt Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-07 21:15:01 +02:00
Chia-I Wu	7916d5ed88	ilo: fix textureSize() for single-layered array textures We returned 0 instead of 1 for the number of layers when the array texture is single-layered. This fixed it on GEN7+.	2013-06-08 01:39:47 +08:00
Chia-I Wu	d6c2708e1e	util: add util_resource_is_array_texture() Checking if array_size is greater than 1 is not enough for single-layered array textures. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-08 01:37:40 +08:00
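Why `array_size > 1` is not enough can be sketched as follows. The enum, the struct, and both helper names are hypothetical simplifications, not the gallium definitions; the point is only that the texture target, not the layer count, identifies an array texture:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified texture targets. */
enum sketch_target {
    SKETCH_TEX_2D,
    SKETCH_TEX_3D,
    SKETCH_TEX_1D_ARRAY,
    SKETCH_TEX_2D_ARRAY,
    SKETCH_TEX_CUBE_ARRAY
};

struct sketch_resource {
    enum sketch_target target;
    unsigned array_size;
};

/* Size-based test: misclassifies an array texture that has one layer. */
static bool is_array_by_size(const struct sketch_resource *res)
{
    assert(res != NULL);
    return res->array_size > 1;
}

/* Target-based test: correct regardless of the number of layers. */
static bool is_array_by_target(const struct sketch_resource *res)
{
    assert(res != NULL);
    switch (res->target) {
    case SKETCH_TEX_1D_ARRAY:
    case SKETCH_TEX_2D_ARRAY:
    case SKETCH_TEX_CUBE_ARRAY:
        return true;
    default:
        return false;
    }
}
```

A 2D array texture with a single layer is the case the two tests disagree on.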
Brian Paul	90fa71b277	docs: update some environment variable info Drop the GALLIUM_NOSSE/PPC env vars, added ST_DEBUG and some of the VMware SVGA driver env vars.	2013-06-07 10:12:32 -06:00
Arnas Milasevicius	3069357ef0	gallium: Remove draw_arrays() and draw_arrays_instanced() functions Moved draw_arrays() to st_draw_feedback.c and removed draw_arrays_instanced(). draw_arrays() was used by nobody else. Now there's just one "draw" entrypoint into the draw module. Signed-off-by: Brian Paul <brianp@vmware.com>	2013-06-07 09:29:29 -06:00
Brian Paul	14541dacab	tgsi: replace tgsi_file_names[] with tgsi_file_name() function This change came from the discovery that the STATIC_ASSERT that checks the number of register file strings didn't actually work. Similar changes could be made for the other string arrays in tgsi_string.c Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-07 09:23:24 -06:00
Chia-I Wu	97d641eb22	u_vbuf: fix index buffer leak Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-06-07 19:33:30 +08:00
Chris Forbes	06a503ca71	i965/vs: add support for emitting gl_ClipVertex Removes the special-case suppression of gl_ClipVertex in the VUE map. Also calculate vertex outcodes for user clip planes based on gl_ClipVertex if written; otherwise gl_Position. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-07 20:50:33 +12:00
Chris Forbes	3615949990	i965/clip: Add support for gl_ClipVertex When clipping triangles against a user clip plane, and gl_ClipVertex is provided in the vertex, use it instead of hpos. TODO: A similar change should be made at some point for line clipping. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-07 20:50:33 +12:00
Chia-I Wu	9b34a7f29a	ilo: advertise PIPE_CAP_CUBE_MAP_ARRAY It was supported but not advertised. Also remove TODO tag for PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT, as it is not a TODO.	2013-06-07 15:37:40 +08:00
Chia-I Wu	cde49c71a3	ilo: add support for TEX2/TXB2/TXL2 in fs They were already supported, just being rejected in the TGSI translator.	2013-06-07 15:37:35 +08:00
Vinson Lee	f8df73f41c	glsl linker: Initialize member variable interface_namespace. Fixes "Uninitialized pointer field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-06 22:55:24 -07:00
Chia-I Wu	7142da6dd1	ilo: use slab allocator for transfers Slab allocator is perfect for transfer. Improved OpenArena performance by 1% with several casual runs.	2013-06-07 13:23:43 +08:00
Chia-I Wu	09f62a13fc	ilo: clean up states upon context destroy We need to unreference resources that we referenced.	2013-06-07 11:28:21 +08:00
Chia-I Wu	7cbf0a410e	ilo: unmap cp bo before destroying it The BOs stay mapped for their entire lifetimes on the chipsets we support, so do not forget to unmap them.	2013-06-07 11:28:20 +08:00
Chia-I Wu	27804b2fc7	ilo: enable bo reuse This magical line of code must have got lost at some point in the history...	2013-06-07 11:28:20 +08:00
Chia-I Wu	20d23b2275	ilo: construct 3DSTATE_SF in create_rasterizer_state() Add ilo_rasterizer_sf and initialize it in create_rasterizer_state().	2013-06-07 11:13:16 +08:00
Chia-I Wu	3c2fea206f	ilo: construct 3DSTATE_CLIP in create_rasterizer_state() Add ilo_rasterizer_clip and initialize it in create_rasterizer_state().	2013-06-07 11:13:16 +08:00
Chia-I Wu	4006f4ce26	ilo: use emit_SURFACE_STATE() for render targets Introduce ilo_surface_cso and initialize it in create_surface(). With the change, we can emit SURFACE_STATE directly from the CSO and remove emit_surf_SURFACE_STATE(). We do not deal with depth/stencil surfaces yet.	2013-06-07 11:13:16 +08:00
Chia-I Wu	5354dc7428	ilo: use emit_SURFACE_STATE() for constant buffers Introduce ilo_cbuf_cso and initialize it in set_constant_buffer(). As ilo_view_surface is embedded in ilo_cbuf_cso, switch to emit_SURFACE_STATE() for constant buffers and remove emit_cbuf_SURFACE_STATE().	2013-06-07 11:13:16 +08:00
Chia-I Wu	2d82885d3c	ilo: add emit_SURFACE_STATE() for sampler views Introduce ilo_view_cso and initialize it in create_sampler_view(). Add emit_SURFACE_STATE() to GPE, which can emit SURFACE_STATE from ilo_view_surface.	2013-06-07 11:13:16 +08:00
Chia-I Wu	39e947569e	ilo: add ilo_view_surface for SURFACE_STATE Define struct ilo_view_surface for SURFACE_STATE construction and emission.	2013-06-07 11:13:15 +08:00
Courtney Goeltzenleuchter	c6983ea035	ilo: convert generic depth-stencil-alpha pipe state to ilo pipe state Moving the work to create time reduces the work at emit time. Saves time overall as create work is only done once. Fix compiler warning in gen7_pipeline_sol. [olv: remember pipe_alpha_state instead of pipe_depth_stencil_alpha_state in ilo_dsa_state]	2013-06-07 11:13:15 +08:00
Chia-I Wu	70e78211d6	ilo: introduce vertex element CSO Introduce ilo_ve_cso and initialize it in create_vertex_elements_state(). This commit goes a step further by setting up mappings from HW VB to PIPE VB, which we failed to do previously. That allows us to support instanced rendering.	2013-06-07 11:13:15 +08:00
Chia-I Wu	d4fa98db0c	ilo: simplify emit_3DSTATE_DEPTH_BUFFER() Remove hiz and dsa from the parameters. We would know whether HiZ buffer exists from ilo_texture once it is supported. DSA state should not affect 3DSTATE_DEPTH_BUFFER.	2013-06-07 11:13:15 +08:00
Chia-I Wu	eea1be2072	ilo: introduce blend CSO Introduce ilo_blend_cso and initialize it in create_blend_state(). This saves us from having to construct hardware blend states in draw_vbo().	2013-06-07 11:13:15 +08:00
Chia-I Wu	b3c9e2161f	ilo: introduce sampler CSO Introduce ilo_sampler_cso and initialize it in create_sampler_state(). This saves us from having to perform CPU-intensive calculations to construct hardware sampler states in draw_vbo().	2013-06-07 11:13:15 +08:00
Chia-I Wu	99725d2f8a	ilo: construct SCISSOR_RECT in set_scissor_states() This allows us to memcpy() the state in draw_vbo(). Add ilo_init_states() and ilo_cleanup_states() that are called when contexts are created and destroyed respectively, and properly set the initial scissor state in ilo_init_states().	2013-06-07 11:13:15 +08:00
Chia-I Wu	e51806ee7a	ilo: introduce viewport CSO Introduce ilo_viewport_cso and initialize it in set_viewport_states(). This saves us from having to perform CPU-intensive calculations to construct hardware viewport states in draw_vbo().	2013-06-07 11:13:15 +08:00
Chia-I Wu	4228cf3746	ilo: switch to ilo states for shaders and resources Define and use struct ilo_sampler_state; struct ilo_view_state; struct ilo_cbuf_state; struct ilo_resource_state; struct ilo_global_binding; in ilo_context.	2013-06-07 11:13:15 +08:00
Chia-I Wu	94212915ee	ilo: switch to ilo states for CC stage Define and use struct ilo_dsa_state; struct ilo_blend_state; struct ilo_fb_state; in ilo_context.	2013-06-07 11:13:15 +08:00
Chia-I Wu	29b938d9f4	ilo: switch to ilo states for WM stage Define and use struct ilo_rasterizer_state; in ilo_context.	2013-06-07 11:13:15 +08:00
Chia-I Wu	130364ad1d	ilo: switch to ilo states for CLIP and SF stages Define and use struct ilo_viewport_state; struct ilo_scissor_state; in ilo_context.	2013-06-07 11:13:14 +08:00
Chia-I Wu	3bc8289f49	ilo: switch to ilo states for SOL stage Define and use struct ilo_so_state; in ilo_context.	2013-06-07 11:13:14 +08:00
Chia-I Wu	6b14b392d0	ilo: switch to ilo states for VF stage Define and use struct ilo_vb_state; struct ilo_ve_state; struct ilo_ib_state; in ilo_context.	2013-06-07 11:13:14 +08:00
Chia-I Wu	f0af292239	ilo: move hardware limits to ilo_gpe.h	2013-06-07 11:13:14 +08:00
Roland Scheidegger	644b8346fd	draw: trivial fix comment typo	2013-06-06 23:51:39 +02:00
Roland Scheidegger	769449b3e8	gallium/tgsi: add missing string for layer semantic Also report if a shader writes the layer semantic Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-06 23:51:38 +02:00
Roland Scheidegger	d0518c4c69	llvmpipe: bump 3d and cube map limits to 2048 and 8192 respectively These should just work, required by d3d10. Too large resources will get thrown out separately anyway. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-06 23:51:38 +02:00
Eric Anholt	38e77e545d	glsl: Fix uniform buffer object counting. We were counting uniforms located in UBOs against the default uniform block limit, while not doing any counting against the specific combined limit. Note that I couldn't quite find justification for the way I did this, but I think it's the only sensible thing: The spec talks about components, so each "float" in a std140 block would count as 1 component and a "vec4" would count as 4, though they occupy the same amount of space. Since GPU limits on uniform buffer loads are surely going to be about the size of the blocks, I just counted them that way. Fixes link failures in piglit arb_uniform_buffer_object/maxuniformblocksize when ported to geometry shaders on Paul's GS branch, since in that case the max block size is bigger than the default uniform block component limit. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-06-06 14:37:41 -07:00
Eric Anholt	93c8692ce9	glsl: Make a local variable to avoid restating this array lookup. v2: Convert another instance of the array lookup. (caught by Tapani) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-06-06 14:37:40 -07:00
Kenneth Graunke	757ad82867	intel: Use the CHIPSET macro in the PCI ID tables for the device name. Putting the human readable device names directly in the PCI ID list consolidates things in one place. It also makes it easy to customize the name on a per-PCI ID basis without a huge code explosion. Based on a patch by Kristian Høgsberg. v2: Fix 830M/845G names and #undef CHIPSET (caught by Emil Velikov). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-06 14:28:35 -07:00
Kenneth Graunke	ea92b700df	intel: Remove 'misc' parameter from CHIPSET macro in PCI ID tables. This has never actually been used for anything. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-06 14:28:35 -07:00
Andreas Boll	8bc788ea9e	build: Use PACKAGE_VERSION from autoconf Both variables had the same value. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-06 19:07:22 +02:00
Andreas Boll	c0f7ccc136	build: Unify PACKAGE_VERSION on autotools, scons and Android This patch unifies mesa's PACKAGE_VERSION on autotools, scons and Android build systems. Current behaviour is: - Autotools uses 9.2.0 as PACKAGE_VERSION - Scons and Android use 9.2-devel as PACKAGE_VERSION With this patch all three build systems use 9.2.0-devel as PACKAGE_VERSION. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-06 19:07:14 +02:00
Jonathan Gray	5bd808a2c7	radeon/winsys: correct RADEON_GEM_WAIT_IDLE use RADEON_GEM_WAIT_IDLE is declared DRM_IOW but mesa uses it with drmCommandWriteRead instead of drmCommandWrite which leads to the ioctl being unmatched and returning an error on at least OpenBSD. Problem originally noticed in libdrm by Mark Kettenis. Dave Airlie pointed out that mesa has the same issue. Signed-off-by: Jonathan Gray <jsg@jsg.id.au>	2013-06-06 11:01:18 +02:00
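The mismatch in 5bd808a2c7 comes from the direction bits encoded in the ioctl request number. A rough sketch using the standard `_IOW`/`_IOWR` macros from `<sys/ioctl.h>`; the payload struct and the `SKETCH_` constants are placeholders chosen for illustration, not the real radeon definitions:

```c
#include <assert.h>
#include <sys/ioctl.h>

/* Placeholder payload; the real argument struct lives in the radeon DRM headers. */
struct sketch_wait_idle {
    unsigned int handle;
    unsigned int pad;
};

#define SKETCH_IOCTL_BASE 'd'
#define SKETCH_IOCTL_NR   0x64

/* Request number as declared with a write-only direction. */
static unsigned long request_write_only(void)
{
    return _IOW(SKETCH_IOCTL_BASE, SKETCH_IOCTL_NR, struct sketch_wait_idle);
}

/* Request number a caller builds when it asks for write and read. */
static unsigned long request_write_read(void)
{
    return _IOWR(SKETCH_IOCTL_BASE, SKETCH_IOCTL_NR, struct sketch_wait_idle);
}

/* A kernel that compares the full request value rejects the mismatch. */
static int request_matches_declaration(unsigned long requested)
{
    return requested == request_write_only();
}
```

The two values share the same base, number, and size but differ in the direction field, which is why a kernel that matches on the whole value returns an error.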
Mike Stroyan	962204961d	configure.ac: Build dricommon for gallium swrast When building dri-swrast, use gallium_check_st to set HAVE_COMMON_DRI. Commit `07f2dee7` added setting of HAVE_COMMON_DRI in gallium_check_st. But the dri-swrast case did not use gallium_check_st. So dri/common was still not built. v2: set HAVE_COMMON_DRI=yes instead of using gallium_check_st NOTE: This is a candidate for the 9.1 branch. (Depends on `7de78ce5` and `07f2dee`) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61821 Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-06-06 08:54:07 +02:00
Rodrigo Vivi	ce67fb4715	i965: Adding more reserved PCI IDs for Haswell. In a DDX commit, Chris mentioned our tendency to find out about more PCI IDs only when users report them. So let's add all the new reserved Haswell IDs. NOTE: This is a candidate for stable branches. Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=63701 Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-05 10:44:15 -07:00
Rico Schüller	3998cfa933	mesa: remove outdated version lines in comments Signed-off-by: Brian Paul <brianp@vmware.com>	2013-06-05 08:54:27 -06:00
Richard Sandiford	7bdf1f2f1a	gallium: System z support The main change is to use MCJIT rather than the old JIT, which will never be supported for System z. The endianness part is by example since the patch was tested on a glibc system. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Signed-off-by: Brian Paul <brianp@vmware.com>	2013-06-05 08:36:24 -06:00
Roland Scheidegger	008fd03600	llvmpipe: improve alignment calculation for fetching/storing pixels This was always doing per-pixel alignment which isn't necessary, except for the buffer case (due to the per-element offset). The disabled code for calculating it was incorrect because it assumed that the full block would always be fetched, which may not be the case, so fix this up. The original code failed, for instance, for r10g10b10a2: the alignment would have been calculated as 4 (block_width) * 4 (bytes) so 16, but the actual fetch may have only fetched 2 values at a time, hence only alignment 8 - it is unclear what exactly would happen in this case (alignment larger than size to fetch). So just use the (already calculated) fetch size instead and get alignment from that which should always work, no matter if fetching 1, 2 or 4 pixels. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-05 00:29:47 +02:00
Roland Scheidegger	ffe2a1ca3c	llvmpipe: reduce alignment requirement for 1d resources from 4x4 to 4x1 For rendering to buffers, we cannot have any y alignment. So make sure that tile clear commands only clear up to the fb width/height, not more (do this for all resources actually as clearing more seems pointless for other resources too). For the jit fs function, skip execution of the lower half of the fragment shader for the 4x4 stamp completely, for depth/stencil only load/store the values from the first row (replace other row with undef). For the blend function, also only load half the values from fs output, replace the rest with undefs so that everything still operates on the full 4x4 block to keep code the same between 4x1 and 4x4 (except for load/store of course which also needs to skip (store) or replace these values with undefs (load)), at the cost of slightly less optimal code being produced in some cases. Also reduce 1d and 1d array alignment too, because they can be handled the same as buffers so don't need to waste memory. v2: don't try to run special blend code for 4x1, (very) slightly less complexity if we just use the same code as for 4x4 which may or may not make it easier to optimize in the future (as we care a lot more about 4x4 performance than 1d). v2: don't use undef values for unused fs src outputs with llvm 3.1 as it apparently can trigger a bug in llvm. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-05 00:29:47 +02:00
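The r10g10b10a2 arithmetic from 008fd03600 can be written out as a tiny sketch. Both helper names are hypothetical and only restate the numbers in the message; they are not the llvmpipe code:

```c
#include <assert.h>

/* Old assumption: a whole block row is always fetched at once. */
static unsigned alignment_from_block(unsigned block_width, unsigned bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return block_width * bytes_per_pixel;
}

/* New approach: derive the alignment from the size of the actual fetch. */
static unsigned alignment_from_fetch(unsigned pixels_fetched, unsigned bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return pixels_fetched * bytes_per_pixel;
}
```

For a 4-byte pixel and a block width of 4, the first helper gives 16, while a fetch of 2 pixels gives 8, so the block-based value overstates the alignment whenever fewer than 4 pixels are fetched.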
Roland Scheidegger	ef3e887084	llvmpipe: cleanup of generate_unswizzled_blend Some parameters were used inconsistently, for instance not using block_width/block_height/block_size for deferring number of pixels but rather relying on guesses from the number of fragment shaders etc, so fix this up (no actual change in behavior since the block size stays fixed). (Though most of the code would work with different block_height, with three exceptions, one being the hacked r11g11b10 conversions and twiddle code which only work with block_height 2 not 1, and the last one being blend vector type not being 128bit wide.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-05 00:29:47 +02:00
Roland Scheidegger	44993c1808	gallivm: enhance special sse2 4x4f and 2x8f -> 1x16ub conversion There's no good reason why it can't handle 2x4f->1x8ub, 1x4f->1x4ub and 1x8f->1x8ub cases, there might be legitimate reasons why we don't have enough input vectors for a full destination vector, and using pack intrinsics should still be much better than using generic conversion (it looks like convert_alpha from the blend code might hit this though I suspect it could be avoided). v2: add another test vector format to lp_test_conv so this gets tested. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-05 00:29:46 +02:00
Roland Scheidegger	ce82523db9	gallivm: (trivial) fix lp_build_concat_n The code was designed to handle no-op concat but failed (unless the caller was using same pointer for src and dst). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-05 00:29:46 +02:00
Brian Paul	f270baf074	mesa: change MAX_PROGRAM_ADDRESS_REGS to 1, clamp to it in state tracker We've never properly supported more than one address register. There isn't even a field in prog_src_register or prog_dst_register to indicate which address register to use if RelAddr!=0. In the state tracker, clamp MaxAddressRegs against MAX_PROGRAM_ADDRESS_REGS since many gallium drivers do support more. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65226 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-04 13:29:38 -06:00
Paul Berry	2fd785d126	intel: Don't try to blorp or blit CopyTexSubImage(1D_ARRAY). Blorp and the hardware blitter can't be used to implement CopyTexSubImage when the image type is 1D_ARRAY, because of a coordinate system mismatch (the Y coordinate in the source image is supposed to be matched up to the Z coordinate in the destination texture). The hardware blitter path (intel_copy_texsubimage) contained a perf debug warning for this case, but it failed to actually fall back. The blorp path didn't even check. Fixes piglit test "copyteximage 1D_ARRAY". Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-04 09:14:44 -07:00
Paul Berry	32d1f423bc	i965/gen6+: Fix multisample assertions in CopyTexSubImage hw blitter path. Commit `045612c` (intel: Add an assert for glCopyTexSubImage() being called on MSAA buffers) added an assertion to intel_copy_texsubimage() to make sure that multisampling was not in use, based on the assumption that glCopyTexSubImage() can't legally be used with multisampling. However, there is one case where glCopyTexSubImage() can legally be used with multisampling: when the source buffer is a multisampled window system buffer. If the source and destination color formats don't match, the blorp path will fail, so intel_copy_texsubimage() will be called. In this case, we need intel_copy_texsubimage() to return false so that we fall back to meta to do the copy. (The multisampled source buffer won't cause a problem for the meta path, because it uses glReadPixels, which forces a multisample resolve). It's still safe to assert that the destination image is single-sampled, because it's not legal to call glCopyTexSubImage() on multisampled textures. Fixes some failures with piglit tests "copyteximage {1D,2D,CUBE,RECT,2D_ARRAY}" (with "samples=..." argument). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-04 09:14:40 -07:00
Vinson Lee	7bafd88c15	mesa: Prevent possible out-of-bounds read by save_SamplerParameterfv. Fixes "Out-of-bounds access" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-06-03 23:01:46 -07:00
Dave Airlie	0677ea063c	i965: fix problem with constant out of bounds access (v3) Okay I now understand why Frank would want to run away, this is my attempt at fixing the CVE out of bounds access to constants outside the range. This attempt converts any illegal constant access to constant 0; per the GL spec, such out-of-range access is undefined behaviour. A future patch should add some debug for users to find this out, but this needs to be backported to stable branches. CVE-2013-1872 v2: drop the last hunk which was a separate fix (now in master). hopefully fix the indentations. v3: don't fail piglit, the whole 8/16 dispatch stuff was over my head, and I spent a while figuring it out, but this one is definitely safe, one piglit pass extra on my Ironlake. NOTE: This is a candidate for stable branches. Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-06-04 13:50:20 +10:00
Eric Anholt	bb525f1f11	intel: Fix copying of separate stencil data in glCopyTexSubImage(). We were copying the source stencil data onto the destination depth data. Fixes piglit copyteximage other than 1D_ARRAY. v2: Fix unintentional dropping of the "don't double-copy for packed depth/stencil" check. While blorp is only supported on separate stencil hardware at the moment, hopefully that will change soon. Review by Jordan. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-06-03 14:22:54 -07:00
Eric Anholt	c937aea3d1	meta: Fix temporary image type for float depth/stencil. Fixes assertion failure in piglit copyteximage. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-06-03 13:47:19 -07:00
Eric Anholt	f96de8ad96	intel: Fix performance regression from miptree blit changes. When making v2 of `da2880bea0`, I carefully checked all of the calls in that commit to see that I'd updated them, but forgot to update the new calls in the later commits such as e845c5cf7abce55759501a473459aff3bf25c9ca. As a result, we were getting Y tiled temporaries even though the whole point of the temporary was to untile! The steady state of the intro scene of lightsmark goes from 13 to 17 fps. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65154 Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-06-03 13:47:18 -07:00
Carl Worth	610fe6da79	glcpp: Add test case for recently fixed loop-control underflow bug. To trigger the bug, it suffices to have a line-continuation followed by a newline and then a non-line-continuation backslash. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-03 13:33:32 -07:00
Carl Worth	d8eeb1d330	glcpp: Fix post-decrement underflow in loop-control variable This loop-control condition with a post-decrement operator would lead to an underflow of collapsed_newlines. This in turn would cause a subsequent execution of the loop to labor inordinately trying to return the loop-control variable to a value of 0 again. Fix this by dis-intertwining the test and the decrement. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65112 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-06-03 13:33:31 -07:00
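The underflow pattern fixed in d8eeb1d330 can be illustrated with a deliberately small counter. This is a sketch, not the glcpp code: an 8-bit unsigned counter is assumed so that the wraparound is well defined and cheap to observe, and both function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Buggy shape: test and decrement are fused, so the decrement also runs
 * on the final, failing test and the counter wraps past zero. */
static unsigned drain_buggy(uint8_t *counter)
{
    unsigned iterations = 0;
    assert(counter != NULL);
    while ((*counter)--)
        iterations++;
    return iterations;
}

/* Fixed shape: test first, decrement only inside the loop body, so the
 * counter rests at exactly zero when the loop exits. */
static unsigned drain_fixed(uint8_t *counter)
{
    unsigned iterations = 0;
    assert(counter != NULL);
    while (*counter) {
        (*counter)--;
        iterations++;
    }
    return iterations;
}
```

After the buggy loop exits, the counter holds its maximum value instead of zero, so the next execution of the loop has to count all the way back down, which matches the "labor inordinately" behaviour the message describes.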
Chad Versace	7a9f4d3e71	i965: Fix glColorPointer(GL_FIXED) When a gl_client_array is created with glColorPointer, gl_client_array::Normalized is true. This caused the translation from the gl_client_array's type to a BRW_SURFACEFORMAT to assertion fail. Fixes the spinning cube's color in Android 4.2's ApiDemos.apk, "Graphics > OpenGL ES". Fixes assertion failure in mesa-demos/src/egl/opengles1/tri_x11 on Haswell and Ivybridge: brw_draw_upload.c:287: get_surface_type: Assertion `0' failed. No Piglit regressions on Haswell. Note: This is a candidate for the 9.1 branch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42182 Issue: AXIA-2954 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-06-03 13:03:28 -07:00
Zack Rusin	e54c924a0e	softpipe: draw_find_shader_output returns -1 on invalid outputs It was changed from 0 to allow shader outputs at 0 that are different from position. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-30 19:54:25 -04:00
Tom Stellard	124e1f91a7	radeonsi/compute: Upload work group, work item size in input buffer	2013-06-03 14:03:13 -04:00
Tom Stellard	3d831206a4	radeonsi/compute: Pass kernel arguments in a buffer v2 v2: - Fix memory leak in si_set_constant_buffer()	2013-06-03 14:03:08 -04:00
Tom Stellard	67e5c9ae0e	radeonsi/compute: Implement un-binding of global buffers	2013-06-03 10:24:54 -04:00
Tom Stellard	d2472ceb92	radeonsi/compute: Support multiple kernels in a compute program	2013-06-03 10:24:54 -04:00
Tom Stellard	3f24190325	radeonsi/compute: Add missing PIPE_COMPUTE caps	2013-06-03 10:24:54 -04:00
Jordan Justen	c754f7a8fd	i965 gen7: use SURFACE_STATE fields to select render level/layer Rather than pointing the surface_state directly at a single sub-image of the texture for rendering, we now point the surface_state at the top level of the texture, and configure the surface_state as needed based on this. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-02 20:39:38 -07:00
Jordan Justen	6bfd897fc4	mesa/texformat: add _mesa_tex_target_is_array function Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-02 20:39:38 -07:00
Jordan Justen	6a5469cff9	intel: add layered parameter to update_renderbuffer_surface Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-02 20:38:37 -07:00
Jordan Justen	8312caf673	intel_fbo: set gl_renderbuffer Depth field Set the renderbuffer's Depth field to match the texture's Depth when rendering to a texture. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-02 20:38:37 -07:00
Jordan Justen	a2d31371e9	intel: print image depth in debug message Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-06-02 20:38:37 -07:00
Brian Paul	e20a2df401	mesa: handle missing read buffer in _mesa_get_color_read_format/type() We were crashing when GL_READ_BUFFER == GL_NONE. Check for NULL pointers and reorganize the code. The spec doesn't say which error to generate in this situation, but NVIDIA raises GL_INVALID_OPERATION. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65173 NOTE: This is a candidate for the stable branches. Tested-by: Vedran Rodic <vrodic@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-06-02 18:12:07 -06:00
Brian Paul	dcc5b6bfb7	meta: move vertex array enables for mipmap generation Before, on the second call to GenerateMipmap we were enabling two vertex arrays for the current vertex array object, rather than the private generate-mipmap vertex array object. This caused things to blow up elsewhere. This patch moves the array enables into the block where the generate-mipmap vertex array object is created, as we do in the setup_ff_generate_mipmap() function. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60518 NOTE: This is a candidate for the stable branches. Tested-by: core13@gmx.net Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-06-02 18:06:17 -06:00
Brian Paul	8588350dc0	mesa: fix hodge podge indentation, update comments in texformat.c	2013-06-02 18:06:17 -06:00
Roland Scheidegger	6b53e2b038	gallium: add support for layered rendering Since pipe_surface already has all the necessary fields no interface changes are necessary except adding a new shader semantic value (TGSI_SEMANTIC_LAYER). (Note that what GL knows as "gl_Layer" variable d3d10 is naming "RENDER_TARGET_ARRAY_INDEX".) v2: drop cap bit (just tied to geometry shader), add docs.	2013-06-01 20:03:59 +02:00
Roland Scheidegger	458a9a0f85	gallivm: fix out-of-bounds access with mirror_clamp_to_edge address mode Surprising this bug survived so long, we were missing a clamp (in the linear filtering version). (Valgrind complained a lot about invalid reads with piglit texwrap, I've also seen spurious failures in this test which might have happened due to this. Valgrind probably didn't complain before the alignment reduction in llvmpipe to 4x4 since the test is using tiny textures so the reads were still always well within allocated area.) While here, also do an effective clamp (after half subtraction) of [0,length-0.5] instead of [0, length-1] which saves an instruction (the filtering weight could be different due to this, but only if both texels point to the same max texel so it doesn't matter). (Both changes are borrowed from PIPE_TEX_CLAMP_TO_EDGE case.) Note: This is a candidate for the stable branches. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-01 20:03:59 +02:00
Roland Scheidegger	f51fc7a71c	llvmpipe: fix bogus assertions for buffer surfaces One of the assertions made no sense for buffer rendertargets (due to the union), so drop it. (The same assertion is present already in the path for texture surfaces later.) v2: make assertion completely accurate (suggested by Jose). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-06-01 20:03:59 +02:00
Kenneth Graunke	4405ff4055	i965: Fix haswell_upload_cut_index when there's no index buffer. brw->ib.type is reset to -1 at the start of each batch. If there's no index buffer, it won't get updated to a sensible value, resulting in _mesa_primitive_restart_index's "Invalid index buffer type" assertion tripping. Fixes a regression since `7c87a3b5da`. NOTE: This is a candidate for the 9.1 branch (and should be squashed). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65195 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-31 21:54:49 -07:00
Roland Scheidegger	869c5d438f	llvmpipe: reduce alignment requirement for resources from 64x64 to 4x4 The overallocation was very bad especially for things like 1d array textures which got blown up by a factor of 64. (Even ordinary smallish 2d textures benefit a lot from this, a mipmapped 64x64 rgba8 texture previously used 7*16kB = 112kB instead of now ~22kB.) 4x4 is chosen because this is the size the jit functions run on, so making it smaller is going to be a bit more complicated. It is actually not strictly 4x4 pixel, since we'd want to avoid situations where different threads are rendering to the same cacheline so we keep cacheline size alignment in x direction (often 64bytes). To make this work introduce new task width/height parameters and make sure clears don't clear the whole tile if it's a partial tile. Likewise, the rasterizer may produce fragments outside the 4x4 blocks present in a tile, so don't call the jit function for them. This does not yet fix rendering to buffers (which cannot have any y alignment at all), and 1d/1d array textures are still overallocated by a factor of 4. v2: replace magic number 4 with LP_RASTER_BLOCK_SIZE, fix size of buffers allocated (needed in case we render to them). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-31 20:21:05 +02:00
Adam Jackson	e881c9a5dc	llvmpipe: Remove x/y from cmd_bin These were mostly just a waste of memory and cache pressure, and were really only used for debugging. This change reduces instruction count (as measured by callgrind's Ir event) of gnome-shell-perf-tool on Ivybridge by 3.5% ± 0.015% (n=20). Signed-off-by: Adam Jackson <ajax@redhat.com>	2013-05-31 20:21:05 +02:00
Vadim Girlin	eb4c992ea5	r600g/sb: fix broken assert Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-31 22:11:42 +04:00
Andreas Boll	5ea43e6549	glapi: Add some missing static_dispatch="false" annotations to es_EXT.xml This fixes the following build errors on powerpc: CC glapi_dispatch.lo In file included from glapi_dispatch.c:90:0: ../../../../../src/mapi/glapi/glapitemp.h:1640:1: error: no previous prototype for 'glReadBufferNV' [-Werror=missing-prototypes] ../../../../../src/mapi/glapi/glapitemp.h:4198:1: error: no previous prototype for 'glDrawBuffersNV' [-Werror=missing-prototypes] ../../../../../src/mapi/glapi/glapitemp.h:6377:1: error: no previous prototype for 'glFlushMappedBufferRangeEXT' [-Werror=missing-prototypes] ../../../../../src/mapi/glapi/glapitemp.h:6389:1: error: no previous prototype for 'glMapBufferRangeEXT' [-Werror=missing-prototypes] ../../../../../src/mapi/glapi/glapitemp.h:6401:1: error: no previous prototype for 'glBindVertexArrayOES' [-Werror=missing-prototypes] ../../../../../src/mapi/glapi/glapitemp.h:6413:1: error: no previous prototype for 'glDeleteVertexArraysOES' [-Werror=missing-prototypes] ../../../../../src/mapi/glapi/glapitemp.h:6433:1: error: no previous prototype for 'glGenVertexArraysOES' [-Werror=missing-prototypes] ../../../../../src/mapi/glapi/glapitemp.h:6445:1: error: no previous prototype for 'glIsVertexArrayOES' [-Werror=missing-prototypes] NOTE: This is a candidate for the 9.0 and 9.1 branches. Reviewed-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-31 17:18:57 +02:00
Vinson Lee	171199b2b7	mesa: Add missing break statement in _mesa_choose_tex_format. Fixes "Missing break in switch" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-30 23:12:32 -07:00
Alan Coopersmith	306f630e67	integer overflow in XF86DRIGetClientDriverName() [CVE-2013-1993 2/2] clientDriverNameLength is a CARD32 and needs to be bounds checked before adding one to it to come up with the total size to allocate, to avoid integer overflow leading to underallocation and writing data from the network past the end of the allocated buffer. NOTE: This is a candidate for stable release branches. Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com> Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-30 18:03:45 -07:00
Alan Coopersmith	2e5a268f18	integer overflow in XF86DRIOpenConnection() [CVE-2013-1993 1/2] busIdStringLength is a CARD32 and needs to be bounds checked before adding one to it to come up with the total size to allocate, to avoid integer overflow leading to underallocation and writing data from the network past the end of the allocated buffer. NOTE: This is a candidate for stable release branches. Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com> Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-30 18:03:39 -07:00
Brian Paul	51498a3e71	mesa: fix error checking of DXT sRGB formats in _mesa_base_tex_format() For formats such as GL_COMPRESSED_SRGB_S3TC_DXT1_EXT we need to have both the GL_EXT_texture_sRGB and GL_EXT_texture_compression_s3tc extensions. This patch adds the missing check for the latter. Found when checking out https://bugs.freedesktop.org/show_bug.cgi?id=65173 NOTE: This is a candidate for the stable branches. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-30 14:01:31 -06:00
Brian Paul	fb1785197f	mesa: asst. whitespace, formatting fixes in teximage.c	2013-05-30 14:01:31 -06:00
Zack Rusin	978d5ed06b	draw: fix vs/fs input/output mismatches When we've changed draw_find_shader_output to return -1 instead of 0 on non found attribs we broke the default behavior of draw, which was to always redirect those to the first (0th) slot. To preserve that behavior if draw_emit_vertex_attr notices a mismatched vertex attrib, it just redirects it to the first slot (instead of trying to use negative index in an array). Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-05-30 15:34:19 -04:00
Anuj Phogat	0a70fdfb3f	intel: Add multisample scaled blitting in blorp engine In traditional multisampled framebuffer rendering, color samples must be explicitly resolved via BlitFramebuffer before doing the scaled blitting of the framebuffer. So, scaled blitting of a multisample framebuffer takes two separate calls to BlitFramebuffer. This patch implements the functionality of doing multisampled scaled resolve using just one BlitFramebuffer call. Important changes involved in this patch are listed below: - Use float registers to scale and offset texture coordinates. - Change offset computation to consider float coordinates. - Round the scaled coordinates down to nearest integer. - Modify src texture coordinates clipping to account for scaling. - Linear filter is not yet implemented in blorp. So, don't use blorp engine to do single sampled scaled blitting. V3: Fix nearest filtering issue in scaled blits. Makes failing piglit fbo-blit-stretch test and framebuffer_blit_functionality_magnifying_blit.test in gles3 CTS pass. Observed no piglit, gles3 CTS regressions on sandybridge & ivybridge with this patch. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-30 10:50:30 -07:00
Anuj Phogat	6e28713a8d	intel: Change the register type from UW to UD in blorp engine These changes are required to implement scaled blitting in blorp in my next patch. No regressions observed in piglit quick-driver.tests with this patch. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-30 10:50:29 -07:00
Anuj Phogat	40e3298125	mesa: Implement ext_framebuffer_multisample_blit_scaled extension Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-30 10:50:29 -07:00
Kenneth Graunke	60f9b722ef	Revert "i965: fix problem with constant out of bounds access (v2)" This reverts commit `98dfd59a04`. The patch was clearly not Piglit tested, as it caused at least 225 tests to start crashing with assertion failures. That was before my desktop tanked and the test run died completely.	2013-05-29 23:31:09 -07:00
Courtney Goeltzenleuchter	8b1c9de166	ilo: simplify shader variant handling Remove hash function on shader variants. Nature of variants limits them to a small number and thus it's more efficient to just do a memory compare of the actual shader structures rather than compute and compare hashes.	2013-05-30 13:58:40 +08:00
Dave Airlie	98dfd59a04	i965: fix problem with constant out of bounds access (v2) This is my attempt at fixing this as the CVE is making RH security team care enough to make me look at this. (please upstream, security fixes are more important than whatever else you are doing, if for no other reason than it saves me having to fix stuff I've no real clue about). Since Frank's original fix was denied, here is my attempt to just alias all constants that are out of bounds < 0 or > nr_params to constant 0, hopefully this provides the undefined behaviour idr requires.. CVE-2013-1872 v2: drop the last hunk which was a separate fix (now in master). hopefully fix the indentations. NOTE: This is a candidate for stable branches. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-05-30 12:59:34 +10:00
Frank Henigman	02fe736cc0	intel: initialize fs_visitor::params_remap in constructor Set fs_visitor::params_remap to NULL in the constructor. This variable was potentially tested in fs_visitor::remove_dead_constants() before being set. NOTE: This is a candidate for stable release branches. Signed-off-by: Frank Henigman <fjhenigman@google.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-05-30 10:37:35 +10:00
Brian Paul	83aaf61e24	draw: add cast in debug_printf() to silence warning	2013-05-29 18:07:35 -06:00
Brian Paul	71682c1599	svga: add PIPE_CAP_MAX_VIEWPORTS to switch to silence warning	2013-05-29 18:07:11 -06:00
Zack Rusin	c08baef508	draw: make sure viewport index is fetched from leading vertex Viewport index should only be used on a per primitive basis, so instead of fetching it from each vertex, potentially making each vertex in a primitive use a different viewport index, which is obviously broken, make sure that we only fetch from the first vertex in the primitive making the viewport index the same for the entire primitive. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca<jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-25 09:49:20 -04:00
Zack Rusin	c88ce3480c	llvmpipe: clamp scissors to be between 0 and max We need to clamp to make sure invalid shader doesn't crash our driver. The spec says to return 0-th index for everything that's out of bounds. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca<jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-25 09:49:20 -04:00
Zack Rusin	d7d676252d	draw: clamp the viewports to always be between 0 and max If the viewport index is larger than the PIPE_MAX_VIEWPORTS, then the first (0-th) viewport should be used. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca<jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-25 09:49:20 -04:00
Zack Rusin	26fe24c479	gallium/docs: adds documentation for multi viewport cap Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-05-25 09:49:20 -04:00
Zack Rusin	4b5595b38b	draw: fixup draw_find_shader_output draw_find_shader_output like most of the code in draw used to depend on position always being at output slot 0. which meant that any other attribute being at 0 could signify an error. unfortunately position can be at any of the output slots, thus other attributes can occupy slot 0 and we need to mark the ones which were not found by something else. This commit changes draw_find_shader_output so that it returns -1 if it can't find the given attribute and adjust the code that depended on it returning >0 whenever it correctly found an attrib. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca<jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-05-25 09:49:20 -04:00
Zack Rusin	97b8ae429e	llvmpipe: implement support for multiple viewports Largely related to making sure the rasterizer can correctly pick out the correct scissor box for the current viewport. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca<jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-05-25 09:49:20 -04:00
Zack Rusin	7756aae815	draw: implement support for multiple viewports This adds support for multiple viewports to the draw module. Multiple viewports depend on the presence of geometry shaders which can write the viewport index. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca<jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-05-25 09:49:20 -04:00
Zack Rusin	eaabb4ead0	gallium: Add support for multiple viewports Gallium supported only a single viewport/scissor combination. This commit changes the interface to allow us to add support for multiple viewports/scissors. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: José Fonseca<jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-05-25 09:49:20 -04:00
Kenneth Graunke	e6efb900e7	mesa: Delete the ctx->Array._RestartIndex derived state. It's incorrect and isn't used any longer. v2: Actually flush vertices/flag _NEW_TRANSFORM on RestartIndex change. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-29 14:22:17 -07:00
Kenneth Graunke	51c0ffacb2	mesa: Ignore fixed-index primitive restart in ArrayElement(). GL_PRIMITIVE_RESTART_FIXED_INDEX is only supposed to apply to glDrawElements*. This code is for legacy drawing paths and display lists, so it shouldn't apply. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-29 14:22:14 -07:00
Kenneth Graunke	a41478e3f6	st/mesa: Go back to using ctx->Array.RestartIndex, not _RestartIndex. The derived _RestartIndex field is an attempt to support both GL_PRIMITIVE_RESTART and GL_PRIMITIVE_RESTART_FIXED_INDEX (part of ES 3.0). Gallium drivers don't appear to support ES 3.0 yet, so they don't need to use it. Plus, it's broken and going to go away soon. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-29 14:22:11 -07:00
Kenneth Graunke	49aba27973	i965: Fix can_cut_index_handle_restart_index() for byte/short types. Pre-Haswell hardware doesn't support an arbitrary restart index, and instead compares the index buffer value against 0xFF for byte-size buffers, 0xFFFF for short-size buffers, or 0xFFFFFFFF for unsigned integer buffers. OpenGL allows the restart index to be an arbitrary unsigned integer. When comparing against byte/short types, the index buffer value should be promoted to a full 32-bit integer before doing the comparison. The restart index is /not/ supposed to be masked to byte/short size. This means that with certain restart indexes, the comparison should always fail. For example, a restart index of 0xF000FFFF should never match any byte/short index buffer values due to the extra high bits. We must not enable hardware primitive restart in such a case. For now, fall back to software primitive restart as it's the simplest fix. In the future, we could detect restart indexes that will never match and skip both hardware and software primitive restart. NOTE: This is a candidate for stable branches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-29 14:22:08 -07:00
Kenneth Graunke	7c87a3b5da	i965: Use the correct restart index for fixed index mode on Haswell. The code that updates the ctx->Array._RestartIndex derived state mashed it to 0xFFFFFFFF when GL_PRIMITIVE_RESTART_FIXED_INDEX was enabled regardless of the index buffer type. It's supposed to be 0xFF for byte, 0xFFFF for short, or 0xFFFFFFFF for integer types. The new _mesa_primitive_restart_index() helper gets this right. The hardware appears to compare against the full 32-bit value some of the time, causing primitive restart not to occur when it should. The fact that it works some of the time is rather frightening. Fixes sporadic failures in the ES 3 instanced_arrays_primitive_restart conformance test when run in combination with other tests. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-29 14:22:06 -07:00
Kenneth Graunke	1569709663	vbo: Use the new primitive restart index helper function. This gets the correct restart index for unsigned byte/short types when using GL_PRIMITIVE_RESTART_FIXED_INDEX. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-29 14:22:04 -07:00
Kenneth Graunke	959d076b30	mesa: Add a helper function for determining the restart index. The derived state approach currently used (_RestartIndex) doesn't work: in the GL_PRIMITIVE_RESTART_FIXED_INDEX case, the restart index depends on the index buffer's data type, and that isn't known until draw time. The existing code also fails to obey the GL 4.3 rules which say that FIXED_INDEX takes precedence over normal primitive restart. This helper function correctly determines the restart index, and will replace the derived state. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-29 14:22:02 -07:00
Kenneth Graunke	37f278000c	vbo: Ignore PRIMITIVE_RESTART_FIXED_INDEX for glDrawArrays(). The derived _PrimitiveRestart enable flag combines the PrimitiveRestart and PrimitiveRestartFixedIndex enable flags. However, DrawArrays is not supposed to do FixedIndex restart: From the OpenGL 4.3 Core specification, section 10.3.5 (page 302): "If PRIMITIVE_RESTART_FIXED_INDEX is enabled, primitive restart is not performed for array elements transferred by any drawing command not taking a type parameter, including all of the Draw commands other than DrawElements." The OpenGL ES 3.0 specification agrees by omission: "When DrawElements, DrawElementsInstanced, or DrawRangeElements transfers a set of generic attribute array elements to the GL..." Notably, DrawArrays is not included in the list of draw calls that take PRIMITIVE_RESTART_FIXED_INDEX into consideration. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-29 14:21:51 -07:00
Eric Anholt	6220cc931f	i965/vs: Fix implied_mrf_writes() for integer division pre-gen6. Previously it would assertion fail in debug builds (though the correct value was returned in a non-debug build). Marking it as a candidate for stable even though it has no current consumers in the stable branches, in case one shows up in a later backport. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64727 NOTE: This is a candidate for stable branches. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-29 11:02:01 -07:00
Eric Anholt	0a0b323193	i965/fs: Fix test for smearing enabled on an instruction. We were expanding the live range too far, breaking register_coalesce_2() and compute_to_mrf() on 16-wide shaders. Turning it back on improves GLB2.7 performance by 0.239355% +/- 0.0850649% (n=398). shader-db stats are: total instructions in shared programs: 1627211 -> 1609262 (-1.10%) instructions in affected programs: 450351 -> 432402 (-3.99%) While 33 new 16-wide shaders are gained, 70 are lost. Despite that, tropics (the app that lost the most 16-wide) shows a .41% +/- .16% (n=7/8, first-run outlier removed) performance improvement on my HSW. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-29 10:20:26 -07:00
Eric Anholt	9a31c4f9ac	i965/fs: Fix segfault in instruction scheduling with LINTERP using last GRF. The scheduler didn't know about uniform-type accesses, and if a uniform access was last in a 16-wide, we'd walk off the end of the array. This never happened, because we'd never coalesce out all the GRFs, due to a bug to be fixed in the next commit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-29 10:16:44 -07:00
Eric Anholt	7e7600d10b	mesa: Fix test for optimistic coloring being necessary. i965 and radeon use ra_set_node_reg() to force payload registers to specific registers while exposing those registers to the allocator still. We were treating those register nodes as unsuccessfully allocated in the ra_simplify() step, leading to walking the registers again to do optimistic coloring even if there was nothing left to do. Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-29 10:16:44 -07:00
Anthony G. Basile	22f1add968	gallium: fix build on uclibc system execinfo.h and debug_symbol_name_glibc() are pure GNU-isms and do not build on uclibc systems. A previous patch addressed this issue, but there was an error. This patch corrects that error. See https://bugs.freedesktop.org/show_bug.cgi?id=51782 https://bugs.gentoo.org/show_bug.cgi?id=469768 Signed-off-by: Anthony G. Basile <blueness@gentoo.org> Signed-off-by: Brian Paul <brianp@vmware.com>	2013-05-29 08:32:35 -06:00
Eric Anholt	4dea6cf215	intel: Enable blit glCopyTexSubImage/glBlitFramebuffer with sRGB. Since the introduction of default-to-SARGB8 window system framebuffers, non-blorp hardware lost blit acceleration for these two paths between the window system and ARGB8888 textures. Since we shouldn't be doing any conversion anyway, just compatibility-check the linear variants of the formats. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61954 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>	2013-05-28 17:53:44 -07:00
Andreas Hartmetz	f43f07d588	radeonsi: Add ipo to LLVM_COMPONENTS r600g needs it too, so add ipo in the common radeon_llvm_check(). radeonsi compiled and linked, but it failed at dynamic link time with a missing symbol. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-05-28 17:08:00 -07:00
Roland Scheidegger	33fcce3682	llvmpipe: get rid of tiled/linear layout remains Eliminate the rest of the no longer needed layout logic. (It is possible some code could be simplified a bit further still.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-29 00:41:06 +02:00
Eric Anholt	b3abc93f47	intel: Remove dead intel_drawbuf_region(). Since the glBitmap() MRT change, it's unused. There was basically no way to responsibly use this function since MRT was introduced. Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-28 13:06:58 -07:00
Eric Anholt	0a39cb88de	intel: Fix format handling of blit glBitmap() Any 32-bit format got ARGB8888 handling (including, say, GL_RG1616), and anything else got 16-bit (including, say, GL_R8), which could potentially hang the GPU by writing out of bounds. NOTE: This is a candidate for the stable branches. Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-28 13:06:58 -07:00
Eric Anholt	1cb8de6fff	intel: Fix MRT handling of glBitmap(). We'd only hit color buffer 0 even if multiple draw buffers were bound. NOTE: This is a candidate for the stable branches. Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-28 13:06:57 -07:00
Eric Anholt	5f29dca070	intel: Rebuild PBO blit glTexImage() on top of miptrees. This will ensure that we have resolves if we ever extend this to glTexSubImage(), and fixes missing image start offset handling. The texture buffer alloc ended up getting moved up, because we want to look at the format of the image's actual mt to see if we'll end up blitting the right thing, in the case of packed depth/stencil uploads. This is the last caller of intelEmitCopyBlit() on a miptree-wrapped BO. Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-28 13:06:57 -07:00
Eric Anholt	3c3e83014b	intel: Rebuild PBO blit glReadPixels() on top of miptrees. The previous code was missing depth resolves, that had only been prevented due to no blitting of Y tiling. The pair of flip args in the new blit function means that we can just drop the pack->Invert fallback. Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-28 13:06:57 -07:00
Eric Anholt	8c3392e274	intel: Rework intel_miptree_create_for_region() to wrap a BO. I needed to do this for the PBO blit cases to use intel_miptree_blit(). But this also actually partially fixes a bug in EGLImage handling: We can't share regions across contexts, because regions have a refcount that isn't protected by a mutex, and different contexts can be simultaneously accessed from multiple threads. Now we just need to get regions out of __DRIImage. There was also a missing use of image->offset in the EGLImage renderbuffer storage code. Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-28 13:06:57 -07:00
Eric Anholt	e845c5cf7a	intel: Make a temporary miptree for the blit path of miptree mapping. In a bit of debug code, we no longer have the inter-slice x/y to print. But I think the level/slice is more useful in this case for looking at what's getting mapped, especially given that INTEL_DEBUG=blit will tell you the other value. Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-28 13:06:56 -07:00
Eric Anholt	4a13beef88	intel: Make a temporary miptree when doing blit uploads for glTexSubImage(). While this is a bit more CPU work, it also is less code to handle this path, and fixes problems with 32k-pitch textures and missing resolves. v2: Add error checking in new code. Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-28 13:06:56 -07:00
Eric Anholt	da2880bea0	intel: Extend the force_y_tiling flag to allow forcing no tiling. For a blit-uploaded temporary, it's faster on current hardware to memcpy the data into a linear CPU mapping than to go through the GTT. v2: Turn the not-fully-supported mask into 3 supported enum values. Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Reviewed-by: Paul Berry <stereotype441@gmail.com> (v2) Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v2)	2013-05-28 13:06:43 -07:00
Eric Anholt	045612c90e	intel: Add an assert for glCopyTexSubImage() being called on MSAA buffers. This is just in case someone else trips over this due to our weird reuse of this code in glBlitFramebuffer(). Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-28 12:40:44 -07:00
Eric Anholt	7638f5578e	i965: Allow glCopyTexSubImage() on depth textures. If the hw is pre-gen5 and can't blit depth, it'll cleanly error out. Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-28 12:40:39 -07:00
Eric Anholt	48a22340cf	i965: Prefer blorp glBlitFramebuffer() to the glCopyTexSubImage-based blit. I think we've measured no performance difference from this in the past, except that the blorp code can do things like multisample resolves. Prevents piglit regression in the next commit when a testcase started trying to do a multisampled resolve through the old glCopyTexSubImage() path. Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-28 12:40:35 -07:00
Eric Anholt	9720d436d1	i965: Consistently do depth resolves before blitting. We were protected for a long time by the fact that depth was Y tiled and you couldn't blit Y. Now that we can blit Y, we were failing to resolve depth in glCopyPixels(). Note in the comment about swrast, that the swrast map path does resolves appropriately already. Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-28 12:40:30 -07:00
Eric Anholt	6a7c27786c	intel: Make a wrapper for intelEmitCopyBlit using miptrees. I had previously asserted that it was hard to write a useful, simpler blit function, but I think this might be it. This has the side effect of extending the 32k pitch check to a few more places that were missing it. v2: Update comment for being moved inside intel_miptree_blit(). Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-28 12:40:25 -07:00
Eric Anholt	0ae294bf7c	intel: Rename intel_renderbuffer_tile_offsets. This makes it more consistent with intel_miptree_get_tile_offsets(). Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-28 12:40:21 -07:00
Eric Anholt	4e8eafd8f4	intel: Reduce intel_renderbuffer_tile_offsets to a thin wrapper. Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-28 12:40:15 -07:00
Eric Anholt	5c85e1cf55	intel: Make intel_miptree_get_tile_offsets return a page offset. Right now, the callers in i965 don't expect a nonzero page offset to actually occur (since that's being handled elsewhere), but it seems like a trap to leave it this way. Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-28 12:40:00 -07:00
José Fonseca	4eaa0999b5	glsl: Fix MSVC build. It appears that `sizeof(Class::member)` is either non-standard or merely unsupported in MSVC. So use `sizeof(instance->member)` instead, which is guaranteed to work everywhere. Also promote the assert to a static assert. Trivial.	2013-05-28 13:56:18 +01:00
Marek Olšák	d4a06d77f5	mesa: fix GLSL program objects with more than 16 samplers combined The problem is the sampler units are allocated from the same pool for all shader stages, so if a vertex shader uses 12 samplers (0..11), the fragment shader samplers start at index 12, leaving only 4 sampler units for the fragment shader. The main cause is probably the fact that samplers (texture unit -> sampler unit mapping, etc.) are tracked globally for an entire program object. This commit adapts the GLSL linker and core Mesa such that the sampler units are assigned to sampler uniforms for each shader stage separately (if a sampler uniform is used in all shader stages, it may occupy a different sampler unit in each, and vice versa, an i-th sampler unit may refer to a different sampler uniform in each shader stage), and the sampler-specific variables are moved from gl_shader_program to gl_shader. This doesn't require any driver changes, and it fixes piglit/max-samplers for gallium and classic swrast. It also works with any number of shader stages. v2: - converted tabs to spaces - added an assertion to _mesa_get_sampler_uniform_value Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-28 13:05:30 +02:00
Marek Olšák	b4cb857dbf	swrast: increase array size of TextureSample to match the size of ctx->Texture.Unit, and it will also fix piglit/max-samplers with the following commit. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-28 13:05:30 +02:00
Marek Olšák	15a4b6db21	mesa: declare UniformBufferBindings as an array with a static size Some Gallium drivers were crashing, because the array was not large enough. v2: clamp the per-shader maximum in st/mesa, then sum them all up NOTE: This is a candidate for the stable branches.	2013-05-28 13:05:30 +02:00
Michel Dänzer	cdad129f9c	radeonsi: Enable GLSL 1.30	2013-05-28 11:20:53 +02:00
Michel Dänzer	0495adbac5	radeonsi: Handle TGSI TXQ opcode	2013-05-28 11:20:53 +02:00
Michel Dänzer	3623111960	radeonsi: Add support for TGSI TXF opcode	2013-05-28 11:20:53 +02:00
Michel Dänzer	beaa5eb03a	radeonsi: Use tgsi_util_get_texture_coord_dim()	2013-05-28 11:20:53 +02:00
Michel Dänzer	0afeea5ad2	radeonsi: Handle TGSI_SEMANTIC_CLIPDIST	2013-05-28 11:20:16 +02:00
Michel Dänzer	784df2e115	radeonsi: Make border colour state handling safe for integer textures	2013-05-28 09:55:46 +02:00
Michel Dänzer	e369f40a9b	radeonsi: Fix hardware state for dual source blending Set up CB_SHADER_MASK register according to pixel shader exports, and enable some minimal state for colour buffer 1 in case dual source blending is used.	2013-05-28 09:55:46 +02:00
Vadim Girlin	08810ca9ef	r600g/sb: handle more cases for folding in gvn pass Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-28 05:24:53 +04:00
Christian König	5328c8001b	st/vdpau: destroy handle table only when it's empty Signed-off-by: Christian König <christian.koenig@amd.com>	2013-05-27 18:18:32 +02:00
Christian König	f796b67431	st/vdpau: remove vlCreateHTAB from surface functions Signed-off-by: Christian König <christian.koenig@amd.com>	2013-05-27 18:18:32 +02:00
Christian König	8ea34fa0e8	st/vdpau: invalidate the handles on destruction Fixes a problem with xbmc when switching channels. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-05-27 18:18:32 +02:00
Vadim Girlin	5de41575a1	r600g/sb: improve folding for SETcc Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-27 15:30:01 +04:00
Vadim Girlin	88e700329b	r600g/sb: optimize CNDcc instructions Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-27 15:29:56 +04:00
Vadim Girlin	725671a83a	r600g/sb: improve optimization of conditional instructions Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-27 15:19:20 +04:00
Chia-I Wu	5285c4c88e	ilo: enable multiple constant buffers This effectively enables uniform buffer object support.	2013-05-27 12:31:42 +08:00
Chia-I Wu	3a5dd39b1d	ilo: add support for indirect access of CONST in FS Unlike other register files, CONST is read with a message and indirect access is easier to implement.	2013-05-27 12:30:51 +08:00
Chia-I Wu	8e7987cc49	ilo: add support for TBOs on GEN6 This hunk was missing in the last commit.	2013-05-27 12:30:42 +08:00
Chia-I Wu	11c9aaf30a	ilo: advertise support for pure integer formats For pure integer formats, no filtering nor blending is needed.	2013-05-27 11:02:57 +08:00
Chia-I Wu	fb40aca879	ilo: add support for texture buffer objects Take care of sampler views that have buffers as the underlying resources. Update caps related to TBOs.	2013-05-27 11:02:57 +08:00
Chia-I Wu	441aa9326a	tgsi: add buffer texture to tgsi_util_get_texture_coord_dim() TGSI_TEXTURE_BUFFER is one-dimensional. Assert that exec_tex() is never called with TGSI_TEXTURE_BUFFER. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-27 11:02:06 +08:00
Vadim Girlin	63d09a0cb7	r600g/sb: improve handling of KILL instructions This patch improves handling of unconditional KILL instructions inside the conditional blocks, uncovering more opportunities for if-conversion. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-27 01:45:07 +04:00
Vadim Girlin	880f435a7e	r600g/sb: fix peephole optimization for PRED_SETE Fixes incorrect condition that prevented optimization for PRED_SETE/PRED_SETE_INT. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-27 01:45:07 +04:00
Vadim Girlin	ff2a611699	r600g/sb: fix scheduling of PRED_SET instructions PRED_SET instructions that update exec mask should be scheduled immediately prior to the "if-then-else" block, because any instruction that is inserted after alu clause with PRED_SET and before conditional block is also conditionally executed by hw (exec mask is already updated at that moment). Probably it's better to make PRED_SET a part of conditional "if-then-else" block in the IR to handle this more cleanly, but for now this temporary solution should prevent the problem. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-27 01:45:07 +04:00
Vadim Girlin	44a117ab9a	r600g/sb: fix handling of preloaded inputs for compute shaders For compute shaders we need to let the backend know that GPRs 0 and 1 are preloaded with some compute-specific input values, otherwise any use of these regs without previous definition is considered as undefined value and usually is simply replaced with 0. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-25 22:56:53 +04:00
Brian Paul	fd9fe4470b	xlib: add null ctx check in glXDestroyContext() Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64934 NOTE: This is a candidate for the stable branches. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-24 16:35:25 -06:00
Brian Paul	fd29e4acda	st/glx: add null ctx check in glXDestroyContext() Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64934 NOTE: This is a candidate for the stable branches. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-24 16:35:25 -06:00
Brian Paul	db4580cbdf	st/mesa: add switch cases for new IR enums to silence warnings	2013-05-24 16:35:25 -06:00
Brian Paul	820de34ceb	st/glx/xlib: assorted whitespace, comment fixes	2013-05-24 16:35:24 -06:00
Vadim Girlin	8e41ced4b3	r600g/sb: fix incorrect assert Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-24 21:00:54 +04:00
Vadim Girlin	e9aa46e665	r600g/sb: relax some restrictions for FETCH instructions This allows GVN rewrite pass to propagate non-const (register) values to FETCH source operands, helping to eliminate unnecessary copies in some cases. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-24 21:00:54 +04:00
Vadim Girlin	5a68a29706	r600g/sb: relax register allocation for compute shaders We have to assume that all GPRs in compute shader can be indirectly addressed because LLVM backend doesn't provide any indirect array info. That's why for compute shaders GPR array is created that covers all used GPRs (0..r600_bytecode::ngpr-1), but this seriously restricts register allocation in sb. This patch checks for actual use of indirect access in the shader and if it's not used then GPR array is not created, so that regalloc is not unnecessarily restricted. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-24 21:00:54 +04:00
Vadim Girlin	0b5b3f8816	r600g/sb: fix gpr array handling for compute shaders Fixes segfault with bfgminer and R600_DEBUG=sbcl. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-24 16:45:58 +04:00
Vadim Girlin	d1e0dc6275	r600g/sb: fix buffer overflow in sb_ostream Fixes segfault during bytecode dump with bfgminer kernel Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-24 16:40:58 +04:00
Tom Stellard	b1797c3a38	r600g/compute: Use common transfer_{map,unmap} functions for global resources Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-05-23 14:52:34 -07:00
Tom Stellard	65d67bcc4b	r600g/compute: Use common transfer_{map,unmap} functions for kernel inputs Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-05-23 14:52:34 -07:00
Kenneth Graunke	062317d667	i965: Go back to using the kernel SOL reset feature. It turns out the MI_LOAD_REGISTER_IMM approach doesn't work on Haswell, and regressed essentially all the transform feedback Piglit tests. This morally reverts `eaa6fbe6d5`. However, the code is still simpler than it was. On BeginTransformFeedback, we simply flush the batch and set the SOL reset flag so that the next batch will start with zeroed offsets. There's still no software counting. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64887 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-23 13:32:02 -07:00
Rob Clark	95670bdee2	freedreno: scissor fix Don't assume the state-tracker will set the scissor after the framebuffer state is changed. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-05-23 14:35:21 -04:00
Rob Clark	97fa811d14	freedreno: implement pipe->resource_copy_region() Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-05-23 14:35:21 -04:00
Kenneth Graunke	3ddfccb303	glsl linker: compare interface blocks during interstage linking Verify that interface blocks match when linking separate shader stages into a program. Fixes piglit glsl-1.50 tests: * linker/interface-blocks-vs-fs-member-count-mismatch.shader_test * linker/interface-blocks-vs-fs-member-order-mismatch.shader_test Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2013-05-23 09:37:12 -07:00
Jordan Justen	4a0bcd90cf	glsl linker: compare interface blocks during intrastage linking Verify that interface blocks match when combining compilation units at the same stage. (For example, when merging all vertex shaders.) Fixes piglit glsl-1.50 test: * linker/interface-blocks-multiple-vs-member-count-mismatch.shader_test v5 (Ken): Rename to link_interface_blocks.cpp and drop the separate .h file for consistency with other linker code. Remove "ok" variable. Fold cross_validate_interface_blocks into its caller. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-23 09:37:12 -07:00
Jordan Justen	d6863acb9f	glsl linker: support arrays of interface block instances With this change we now support interface block arrays. For example, cases like this: out block_name { float f; } block_instance[2]; This allows Mesa to pass the piglit glsl-1.50 test: * execution/interface-blocks-complex-vs-fs.shader_test Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-23 09:37:12 -07:00
Jordan Justen	c30ca431ba	glsl link_varyings: link interface blocks using the block name Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-23 09:37:12 -07:00
Jordan Justen	5ebf547312	glsl linker: remove interface block instance names Convert interface blocks with instance names into flat interface blocks without an instance name. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-23 09:37:12 -07:00
Jordan Justen	b24eeb078f	glsl ast_to_hir: support in/out for interface blocks Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-23 09:37:12 -07:00
Jordan Justen	cb29a7095f	glsl ast_to_hir: reject row/column_major for in/out interface blocks Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-23 09:37:12 -07:00
Jordan Justen	c00387497d	glsl ast_to_hir: move uniform block symbols to interface blocks namespace Uniform/interface blocks are a separate namespace from types. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-23 09:37:12 -07:00
Jordan Justen	3919c19468	glsl_symbol_table: add interface block namespaces For interface blocks, there are three separate namespaces for uniform, input and output blocks. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-23 09:37:12 -07:00
Jordan Justen	9368604d99	glsl parser: allow in & out for interface block members Previously uniform blocks allowed for the 'uniform' keyword to be used with members of a uniform block. With interface blocks 'in' can be used on 'in' interface block members and 'out' can be used on 'out' interface block members. The basic_interface_block rule will verify that the same qualifier type is used with the block and each member. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-23 09:37:11 -07:00
Jordan Justen	067cc08d6a	glsl ast_to_hir: reject interpolation qualifiers for uniform blocks Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-23 09:37:11 -07:00
Jordan Justen	4410eba598	glsl parser: handle interface block member qualifier An interface block member may specify the type: in { in vec4 in_var_with_qualifier; }; When specified with the member, it must match the same type as interface block type. It can also omit the qualifier: uniform { vec4 uniform_var_without_qualifier; }; When the type is not specified with the member, it will adopt the same type as the interface block. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-23 09:37:11 -07:00
Jordan Justen	4369acff5e	glsl parser: on desktop GL require GLSL 150 for instance names Interface blocks in GLSL 150 allow an instance name to be used. v2: * use state->check_version Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-23 09:37:11 -07:00
Jordan Justen	d36cb3617c	glsl parser: reject VS+in & FS+out interface blocks Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-23 09:37:11 -07:00
Jordan Justen	6d3d974e37	glsl: parse in/out types for interface blocks Previously only 'uniform' was allowed for uniform blocks. Now, in/out can be parsed, but it will only be allowed for GLSL >= 150. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-23 09:37:11 -07:00
Jordan Justen	744c270406	glsl parser: rename uniform block to interface block Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-23 09:37:11 -07:00
Jordan Justen	c9f58544be	glsl: rename ast_uniform_block to ast_interface_block Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-23 09:37:11 -07:00
Chris Forbes	7bfb4bea65	i965: Enable guardband clipping on Gen4/5. Enables guardband clipping when the viewport covers the entire render target. No piglit regressions on Ironlake. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-24 08:00:47 +12:00
Chris Forbes	a3d8e7c57c	ARB_fp: accept duplicate precision options Relaxes the validation of OPTION ARB_precision_hint_{nicest,fastest}; to allow duplicate options. The spec says that both /nicest/ and /fastest/ cannot be specified together, but could be interpreted either way for respecification of the same option. Other drivers (NVIDIA etc) accept this, and at least one Unity3D game expects it to succeed (Kerbal Space Program). V2: Add spec quote. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-24 07:50:51 +12:00
Vinson Lee	e3eeb72f24	ilo: Initialize need_flush in draw_vbo. need_flush was uninitialized if hw3d->new_batch was true. Fixes "Uninitialized scalar variable" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2013-05-23 15:31:42 +08:00
Vinson Lee	36e2c7cc1a	radeon: Initialize variables in radeon_llvm_context_init. 'type' was not fully initialized when calling lp_build_context_init. Fixes "Uninitialized scalar variable" defect reported by Coverity. NOTE: This is a candidate for the stable branches. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-05-22 23:06:23 -07:00
Eric Anholt	cf37e12024	intel: Count fragments in our blitter-based glBitmap() path. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59440 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-22 14:35:44 -07:00
Eric Anholt	0af614727a	i965: Shut up more compiler warnings from vector insert/extract changes. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-22 14:35:25 -07:00
Roland Scheidegger	2b291eaa90	softpipe: change TEX_TILE_SIZE and NUM_TEX_TILE_ENTRIES Initially we had NUM_TEX_TILE_ENTRIES of 50, however this was using too much memory (mostly because the tile cache is operating on fixed max current sampler views which could be fixed but that's another topic). So it was decreased to 4. However this is a ridiculously low number which can't actually really work (the number of tiles needed for as little as a single quad with linear_mipmap_linear is 2 to 8 for a 2d texture, and 4 to 16 for a 3d texture), as it just about guarantees there will be cache thrashing sometimes (just about always for 3d textures in fact, since while there are 4 entries the cache is direct mapped). So increase that number to 16 (which is still on the low side for direct mapped cache though I guess using something like 4-way associativity would be more effective than increasing this further) which has at least some good chance to avoid thrashing. Since we don't want to increase memory requirements however in turn decrease the tile size accordingly from 64 to 32 (as a bonus point this also decreases the cost of texture thrashing which might still happen sometimes). I've seen performance improvement in the order of factor ~200 (specifically, drawing the first frame from the replay from bug 41787 needs "only" ~10s instead of ~30min, meaning I can actually compare the output with other drivers...) with this. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-22 22:57:27 +02:00
Roland Scheidegger	2f567fb7b5	softpipe: disambiguate TILE_SIZE / TEX_TILE_SIZE These can be different (just like NUM_TEX_TILE_ENTRIES / NUM_ENTRIES), though currently they aren't. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-22 22:57:27 +02:00
Roland Scheidegger	80e2cc0f97	llvmpipe: disable simple_shader optimization This optimization disabled mask checks if the shader is simple enough. While this should work correctly, the problem is that it can hide real issues because shaders in practice are usually complex enough (8 instructions or 1 texture is already enough) so this doesn't get used, whereas dumbed-down tests which should hit all the same code paths suddenly do something quite different. This was the reason that bug 41787 could not be easily tracked as stencil test not working correctly (piglit would in fact have failed some tests without that optimization). So disable it for now, it's unclear if it's much of a win in any case. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-22 22:57:27 +02:00
Roland Scheidegger	e108716429	llvmpipe: fix early depth test / late depth write stencil issues We actually did early depth/stencil test and late depth/stencil write even when the shader could kill the fragment (alpha test or discard). Since it matters for the new stencil value if the fragment is killed by depth/stencil test or by the shader (in which case it will not reach the depth/stencil test) this simply cannot work (we also would possibly skip writing the new stencil value due to mask checks but this is a secondary issue). So use late depth test / late depth write instead in this case. (No piglit changes as it doesn't seem to hit such bogus early depth test / late depth write path.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-22 22:57:27 +02:00
Roland Scheidegger	82d7733b52	llvmpipe: fix issue with not writing new stencil values We did mask checks between depth/stencil testing and depth/stencil write. This meant that if the depth/stencil test killed off all fragments we never actually wrote the new stencil value. This issue affected all early/late test/write combinations. So move the mask check after depth/stencil write (for early depth test, could do the same for late depth test but might not be worth it at that point so just skip it there). This addresses https://bugs.freedesktop.org/show_bug.cgi?id=41787. Piglit does not hit this issue because of the simple_shader optimization in generate_fs_loop() which means we're skipping the mask checks. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-22 22:57:27 +02:00
Roland Scheidegger	3c91ef0f29	llvmpipe: (trivial) remove confusing code in stencil test This was meant to disable some code which isn't needed when depth/stencil isn't written. However, there's more code which wouldn't be needed in that case so having the condition there was just odd (llvm will drop all the code anyway). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-22 22:57:27 +02:00
Roland Scheidegger	5314f5d829	llvmpipe: fix bug in early depth test / late depth write handling Using wrong type if the format was less than 32bits. No piglit changes as it doesn't hit that path. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-22 22:57:27 +02:00
Alexander von Gluck IV	6d20e251f2	Haiku: Add Gallium winsys and target code * We generate a static library for Haiku Gallium targets as our port system combines the compiled rendering code into a modular ar for each module (for example, our port system combines llvm libsoftpipe.a libllvmpipe.a into a single ar for the Haiku build system). I'd like the Gallium hgl target scons build system to do this some day, however how is beyond me at the moment. This is a first step.	2013-05-22 14:31:44 -05:00
Chia-I Wu	ff68f61bed	ilo: set more fields of 3DSTATE_DEPTH_BUFFER Set lod/layer related fields of 3DSTATE_DEPTH_BUFFER. Since we always point to a single level/layer, those fields are always zero and this commit effectively makes no change. While at it, make it easier to disable manual slice offset calculation.	2013-05-22 20:25:57 +08:00
Chia-I Wu	f3da711bea	ilo: correctly set view extent in SURFACE_STATE The view extent was set to be the same as the depth while it should be set to the number of layers. It makes a difference for 3D textures. Also use this as a chance to clean up the code.	2013-05-22 18:12:01 +08:00
Chia-I Wu	bbb30398e5	ilo: avoid unnecessary emission of SO states No need to emit 3DSTATE_SO_BUFFER and 3DSTATE_SO_DECL_LIST when SO is disabled. As the implicit flush done by the commands is also gone, emit an explicit flush.	2013-05-22 18:09:17 +08:00
Eric Anholt	08f87ac333	i965: Skip etc-to-rgb transcode on BayTrail. The hardware does it, so no need for this workaround. Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-20 23:04:32 -07:00
Eric Anholt	c245efe7e8	mesa: Remove extension checking from ChooseTexFormat. This should already be handled by _mesa_base_tex_format() calls in TexImage*.	2013-05-21 15:20:28 -07:00
Eric Anholt	36e7c01101	mesa: Add ChooseTexFormat support for the new XBGR formats.	2013-05-21 15:20:28 -07:00
Kenneth Graunke	b29381567a	i965: Split BeginTransformFeedback hook into Gen6 and Gen7+ variants. Most of the work in BeginTransformFeedback is only necessary on Gen6. We may as well just skip it on Gen7+. v2: Add an intel->gen == 6 assert. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-21 13:29:40 -07:00
Kenneth Graunke	64a87f29ce	i965: Kill software primitive counting entirely. Now that we have hardware contexts, we don't need to continually reprogram the GS_SVBI_INDEX registers. They're automatically saved and restored with the context, so they can just increment over time. We only need to reset them when starting transform feedback. There's also no reason to delay until the next drawing operation; we can just emit the packet immediately. However, this means we must drop the initialization in brw_invariant_state, as BeginTransformFeedback may occur before the first drawing in a context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-21 13:29:27 -07:00
Kenneth Graunke	647fc0c50b	i965: Remove software geometry query code. EXT_transform_feedback isn't yet supported on Gen4-5, so none of this query code is actually used. This also means we can remove some of the surrounding support code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-21 13:29:25 -07:00
Kenneth Graunke	b863d44451	i965: Delete unused brw->sol.offset_0_batch_start field. This was only used for the non-hardware context code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-21 13:29:24 -07:00
Kenneth Graunke	eaa6fbe6d5	i965: Stop using the kernel SOL reset feature. We can just do it ourselves with MI_LOAD_REGISTER_IMM. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-21 13:29:22 -07:00
Kenneth Graunke	6837ebd00f	i965: Remove dead code for Gen7 SOL without hardware contexts. Failing to get a hardware context now means failing to load the driver, so this code will never get hit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-21 13:29:19 -07:00
Kenneth Graunke	58765bb481	i965: Add a macro for accessing the SO_WRITE_OFFSET[0-3] registers. Using a function-like macro makes it easy to loop over all four streams. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-21 13:29:06 -07:00
Ian Romanick	0ba1e65fb6	docs: Import 9.1.3 release notes, add news item. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-21 13:16:56 -07:00
Michel Dänzer	d42a2df19c	radeonsi: Fix user clip planes 4 more little piglits. NOTE: This is a candidate for the 9.1 branch.	2013-05-21 17:50:13 +02:00
Michel Dänzer	e3befbca5e	radeonsi: Handle TGSI_SEMANTIC_CLIPVERTEX 17 more little piglits. NOTE: This is a candidate for the 9.1 branch.	2013-05-21 17:50:13 +02:00
Michel Dänzer	eb19163a4d	radeonsi: Initial support for multiple constant buffers Just enough to support an additional internal constant buffer for the user clip planes. NOTE: This is a candidate for the 9.1 branch.	2013-05-21 17:50:12 +02:00
Michel Dänzer	4730dea5f5	radeonsi: Fix handling of TGSI_SEMANTIC_PSIZE Two more little piglits. NOTE: This is a candidate for the 9.1 branch.	2013-05-21 17:50:12 +02:00
Marek Olšák	2eac0aa1d8	radeonsi: increase array size for shader inputs and outputs and add assertions to prevent buffer overflow. This fixes corruption of the si_shader struct. NOTE: This is a candidate for the 9.1 branch. [ Cherry-pick of r600g commit `da33f9b919` ] Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-05-21 17:47:44 +02:00
Brian Paul	9772284df2	xlib: check for null ctx pointer in glXIsDirect() Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64745 Note: This is a candidate for the stable branches. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-21 07:35:12 -06:00
Brian Paul	1e9875acbe	st/glx/xlib: check for null ctx pointer in glXIsDirect() Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64745 Note: This is a candidate for the stable branches. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-21 07:35:12 -06:00
José Fonseca	8cabc7be1d	scons: Don't force stabs debug format for Mingw. - recent gdb handles DWARF fine (tested both with version 7.1.90.20100730 from mingw-w64 project, and 7.5-1 from mingw project) - http://people.freedesktop.org/~jrfonseca/bfdhelp/ was updated to handle DWARF - stabs requires ugly hacks to prevent compilation failures - mixing stabs/dwarf prevents proper backtraces (which is inevitable, given that the MinGW C runtime is pre-built with DWARF) For example, without this change I get: (gdb) bt #0 _wassert (_Message=0xf925060 L"Num < NumOperands && \"Invalid child # of SDNode!\"", _File=0xf60b488 L"llvm/include/llvm/CodeGen/SelectionDAGNodes.h", _Line=534) at ../../../../mingw-w64-crt/misc/wassert.c:51 #1 0x0368996b in _assert (_Message=0x39d7ee4 "Num < NumOperands && \"Invalid child # of SDNode!\"", _File=0x39d7e94 "llvm/include/llvm/CodeGen/SelectionDAGNodes.h", _Line=534) at ../../../../mingw-w64-crt/misc/wassert.c:44 #2 0x00000004 in ?? () #3 0x00000004 in ?? () #4 0x0f60b488 in ?? () #5 0x00000000 in ?? () While with this change I get: (gdb) bt #0 _wassert (_Message=0xfb982e8 L"Num < NumOperands && \"Invalid child # of SDNode!\"", _File=0xefbcb40 L"llvm/include/llvm/CodeGen/SelectionDAGNodes.h", _Line=534) at ../../../../mingw-w64-crt/misc/wassert.c:51 #1 0x039c996b in _assert (_Message=0x3d17f24 "Num < NumOperands && \"Invalid child # of SDNode!\"", _File=0x3d17ed4 "llvm/include/llvm/CodeGen/SelectionDAGNodes.h", _Line=534) at ../../../../mingw-w64-crt/misc/wassert.c:44 #2 0x033111cc in getOperand (Num=4, this=<optimized out>) at llvm/include/llvm/CodeGen/SelectionDAGNodes.h:534 #3 getOperand (i=4, this=<optimized out>) at llvm/include/llvm/CodeGen/SelectionDAGNodes.h:779 #4 llvm::SelectionDAG::getNode (this=0xf00cb08, Opcode=79, DL=..., VT=..., N1=..., N2=...) at llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:2859 #5 0x03377b20 in llvm::SelectionDAGBuilder::visitExtractElement (this=0xfb45028, I=...) at llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:2803 [...] Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-21 12:34:19 +01:00
Chia-I Wu	2b7463cf3a	ilo: use BLT engine to copy between textures Emit XY_SRC_COPY_BLT to do the job. Since ETC1 textures cannot be mapped for reading, as is required by util_copy_resource_region, this fixes copying of ETC1 textures.	2013-05-21 12:02:55 +08:00
Chia-I Wu	c44ebb4ef4	ilo: use BLT engine to copy between buffers Emit (possibly multiple) SRC_COPY_BLT to copy between buffers of arbitrary sizes.	2013-05-21 11:47:20 +08:00
Chia-I Wu	731cafe7b2	ilo: refactor blitter_xy_color_blt() Add gen6_XY_COLOR_BLT() and let blitter_xy_color_blt() call the function. Not sure if this path is still being hit by any application.	2013-05-21 11:47:20 +08:00
Chia-I Wu	0d42a9e941	ilo: replace cp hooks by cp owner and flush callback The problem with cp hooks is that when we switch from 3D ring to 2D ring, and when there are active queries, we will emit 3D commands to 2D ring because the new-batch hook is called. This commit introduces the idea of cp owner. When the cp is flushed, or when another owner takes place, the current owner is notified, giving it a chance to emit whatever commands there need to be. With this mechanism, we can resume queries when the 3D pipeline owns the cp, and pause queries when it loses the cp. Ring switch will just work. As we still need to know when the cp bo is reallocated, a flush callback is added.	2013-05-21 11:47:20 +08:00
Chia-I Wu	a04d8574c6	ilo: hardware contexts are only for the render ring The hardware context should not be passed for bo execution when the ring is not the render ring. Rename hw_ctx to render_ctx for clarity.	2013-05-21 11:47:19 +08:00
Chia-I Wu	1ed7b825cf	ilo: update format mappings Add more PIPE_FORMAT -> BRW_SURFACEFORMAT mappings, and update surface_format_info from i965.	2013-05-21 11:47:19 +08:00
Chia-I Wu	bd8090a5af	ilo: update headers from i965 Mainly for MI_LOAD_REGISTER_IMM and BCS_SWCTRL.	2013-05-21 11:47:19 +08:00
Anuj Phogat	06cd89a88c	i965: Fix build failure meta.h should be included in brw_state_upload.c to get access to function _mesa_meta_in_progress().	2013-05-20 16:15:57 -07:00
Kenneth Graunke	f09b91f782	i965: Implement transform feedback query support in hardware on Gen6+. Now that we have hardware contexts and can use MI_STORE_REGISTER_MEM, we can use the GPU's pipeline statistics counters rather than going out of our way to count primitives in software. Aside from being simpler, this also paves the way for Geometry Shaders, which can output an arbitrary number of primitives on the GPU. It will also allow us to use hardware primitive restart when these queries are in use. The GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN query is easy: it corresponds to the SO_NUM_PRIMS_WRITTEN/SO_NUM_PRIMS_WRITTEN0_IVB counters. The GL_PRIMITIVES_GENERATED query is trickier. Gen provides several statistics registers which /almost/ match the semantics required: - IA_PRIMITIVES_COUNT The number of primitives fetched by the VF or IA (input assembler). This undercounts when GS is enabled, as it can output many primitives. - GS_PRIMITIVES_COUNT The number of primitives output by the GS. Unfortunately, this doesn't increment unless the GS unit is actually enabled, and it usually isn't. - SO_PRIM_STORAGE_NEEDED*_IVB The amount of space needed to write primitives output by transform feedback. These naturally only work when transform feedback is on. We'd also have to add the counters for all four streams. - CL_INVOCATION_COUNT The number of primitives processed by the clipper. This doesn't work if the GS or SOL throw away primitives for rasterizer discard. However, it does increment even if the clipper is in REJECT_ALL mode. Dynamically switching between counters would be painfully complicated, especially since GS, rasterizer discard, and transform feedback can all be switched on and off repeatedly during a single query. The most usable counter is CL_INVOCATION_COUNT. The previous two patches reworked rasterizer discard support so that all primitives hit the clipper, making this work. v2: Occlusion query bug fixes removed and squashed in earlier patches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-20 13:03:18 -07:00
Kenneth Graunke	037a901a5b	i965: Handle rasterizer discard in the clipper rather than GS on Gen6. This has more of a negative impact than the previous patch, as on Gen6 passing primitives through to the clipper means we actually have to make the GS thread write them to the URB. I don't see another good solution though, and rasterizer discard is not the most common of cases, so hopefully it won't be too terrible. v2: Add a perf_debug; resolve rebase conflicts on the brw dirty flags; remove the rasterizer_discard field from brw_gs_prog_key. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v1] Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-20 13:03:18 -07:00
Kenneth Graunke	d1e4e9960c	i965: Handle rasterizer discard in the clipper rather than SOL on Gen7. In order to implement the GL_PRIMITIVES_GENERATED query in a sane fashion on our hardware, we can't discard primitives until the clipper. The patch after next explains the rationale. By setting the clipper to REJECT_ALL mode, all primitives get thrown away, so rendering is still appropriately disabled. This may negatively impact performance in the rasterizer discard case, but it's unclear how much and this hasn't been observed to be a bottleneck in any application we've looked at. The clipper is the very next stage in the pipeline, so I don't think it will be terrible. v2: Add a perf_debug; resolve rebase conflicts on the brw dirty flags. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-20 13:03:18 -07:00
Kenneth Graunke	5ebe9523f9	i965: Disable clipper statistics when meta operations are in progress. We don't currently use the clipper statistics, but we'll soon use CL_INVOCATION_COUNT to implement the GL_PRIMITIVES_GENERATED query. The number of primitives generated is not supposed to be altered during operations such as glGenerateMipmap. Prevents spec/EXT_transform_feedback/generatemipmap prims_generated from breaking when we start using pipeline statistics registers to implement the GL_PRIMITIVES_GENERATED query in a few commits. v2: Use the BRW_NEW_META_IN_PROGRESS flag for correct state handling. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v1] Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-20 13:03:18 -07:00
Kenneth Graunke	b96f93c453	i965: Create a BRW_NEW_META_IN_PROGRESS state flag. This will allow us to disable statistics during meta operations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-20 13:03:18 -07:00
Kenneth Graunke	bbf86712f8	i965: Add #defines for the pipeline statistics counter registers. These come from the Ivybridge PRM, Volume 1, Part 3. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-20 13:03:18 -07:00
Kenneth Graunke	e32cd5ffbb	i965: Rely on hardware contexts for query objects on Gen6+. Hardware contexts greatly simplify the query object code. The pipeline statistics counters get saved and restored with the context, which means that we don't need to worry about other workloads polluting them. This means that we can simply write a single pair of values (one at BeginQuery and one at EndQuery) rather than a series of pairs. This also means we don't need to worry about the BO getting full. We also don't need to delay BO allocation and starting snapshot until the first draw. The generation split here is a little off: technically, Ironlake can also support hardware contexts. However, the kernel currently doesn't, and even if it were to do so someday, we'd need to wait a while before bumping the kernel requirement to take advantage of it. v2: Incorporate Paul's feedback. - Clarify which functions are Gen4/5-only via assertions and comments. - Change how driver hook initialization happens. - Update comments. - Squash a bug fix from a later commit here where it belongs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v1] Acked-by: Paul Berry <stereotype441@gmail.com>	2013-05-20 13:03:18 -07:00
Kenneth Graunke	72b1e440dd	i965: Disable pixel statistics in BLORP. BLORP is used for operations like glClear, glCopyTexImage, and glBlitFramebuffer which aren't supposed to contribute fragments toward occlusion queries. This prevents Piglit tests from breaking in the next commit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-20 13:03:17 -07:00
Kenneth Graunke	92d2f5acfa	i965: Require hardware contexts (and thus Kernel 3.6) on Gen6+. Hardware contexts are necessary to reasonably support OpenGL 3.2. In particular, we currently maintain software counters for transform feedback buffer offsets and counters, which relies on knowing the number of primitives generated. Geometry shaders violate that assumption. At the time of writing, Debian has moved to Kernel 3.8, which means most people probably have a newer kernel by now. It's also worth noting that this patch won't land until Mesa 10 which is currently targeted for September. By that point, even more people will have a newer kernel. Also, don't bother trying to allocate contexts on pre-Gen6, as it currently will always fail, and if this changes in the future, we'll need to reevaluate our hw_ctx/gen checks. This patch leaves the code for flagging BRW_NEW_CONTEXT on new batchbuffers if hw_ctx == NULL since that still occurs pre-Gen6. Also remove the Gen7+ check for kernel 3.3, since it's now redundant. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-20 13:03:17 -07:00
Kenneth Graunke	50e60bf8da	i965: Bump kernel requirement to 3.3 on Ivybridge. Kernel 3.3 introduced the SOL reset execbuf parameter, needed for GL 3.0 on Ivybridge. Bumping the requirement will give an obvious error message rather than simply reporting GL 2.1. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-20 13:03:17 -07:00
Vincent Lejeune	9fd7ea786c	r600g/llvm: fix cubemap lod/bias	2013-05-20 20:23:19 +02:00
Vincent Lejeune	9a95fb1605	r600g/llvm: Fix texelFetchOffset-2D	2013-05-20 20:23:14 +02:00
Vincent Lejeune	32c9cbb38f	r600g/llvm: Fix cubearray textureSize	2013-05-20 20:23:09 +02:00
Vincent Lejeune	9c2943601e	r600g/llvm: Factorize code loading from const buffer.	2013-05-20 20:23:04 +02:00
Kenneth Graunke	01b79b2e3b	i965: Add cases for ir_triop_vector_insert that assert. brw_link_shader() unconditionally calls lower_vector_insert() with true as the second parameter. This means that both constant and variable indexed expressions will get lowered, so we should never see this in the backend. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-05-20 10:19:48 -07:00
Kenneth Graunke	e1e8876797	i965: Add cases for ir_binop_vector_extract that assert. do_vec_index_to_swizzle() should remove any vector extract operations with a constant index. It's unconditionally called from do_common_optimization(). do_vec_index_to_cond_assign() should remove the rest, and it is unconditionally called from brw_link_shader(). This means that we should never see ir_binop_vector_extract in the backend. Silences compiler warnings. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-05-20 10:19:30 -07:00
Roland Scheidegger	f6beb4c6b6	llvmpipe: enable z32s8x24 format Now that we can handle it both for sampling and as depth/stencil enable it. Passes nearly all additional piglit tests which are now performed, with two exceptions (one being a framebuffer blit which fails for all other formats including stencil too as we don't support stencil blits, the other reporting an unexpected GL error so doesn't look to be llvmpipe's fault).	2013-05-18 00:32:45 +02:00
Roland Scheidegger	070a9afb54	llvmpipe: handle z32s8x24 depth/stencil format We need to split up the depth and stencil values in this case, and there's some new logic required to handle float depth and stencil simultaneously. Also make sure we get the 64bit zs clear values and masks propagated correctly.	2013-05-18 00:32:33 +02:00
Roland Scheidegger	f3ad716e8f	llvmpipe: get rid of unused tiled/linear logic We have been rendering to linear color buffers for quite some time, and since switching to linear depth buffers all the tiled/linear logic was unused. So get rid of (most of) it - there's still some LAYOUT_NONE things and late allocation of resources which probably could be simplified. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-18 00:32:27 +02:00
Roland Scheidegger	87978518e9	llvmpipe: fix bogus handling of first_layer when setting up texture sampling The code avoided first_layer parameter in the sampler interface (and needing to do another calculation at runtime) by fixing up the base texture pointer instead. Unfortunately, this didn't actually work as we have mip-first texture layout so fixing up the base ptr by a fixed amount is very wrong if there are mipmaps present. The wrong offsets caused misrendering and crashes. Fix this by just adjusting the individual mip level offsets instead. Spotted by Jose. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-18 00:32:18 +02:00
Roland Scheidegger	d7e811c0b0	gallivm: handle z32s8x24 format for sampling Since we can only sample either depth or stencil but not both only load the required bits which makes things a bit easier (it requires special handling since the format doesn't fit into 32bit). The logic for deciding if depth or stencil should be sampled is a bit odd, but seems to be what other drivers and statetrackers do: if it's a format with both depth and stencil (or just with depth) then sample depth, for sampling stencil a sampler view format with only stencil is required. Also while here fix up stencil sampling for other formats as well, though this isn't supported by mesa (ARB_stencil_texturing), and while blits would use it they don't work either since they'd also need stencil export. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-18 00:31:49 +02:00
Roland Scheidegger	0346e9b3bb	st/mesa: fix weird UCMP opcode use for bool ubo load I don't know what this code was trying to do but whatever it was it couldn't have worked since negation of integer boolean inputs while not specified as outright illegal (not yet at least) won't do anything since it doesn't affect the result of comparison with zero at all. In fact it looks like the whole instruction can just be omitted. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-05-18 00:31:49 +02:00
Eric Anholt	a5b0452400	mesa: Make FinishRenderTexture just take the renderbuffer being finished. Now that the rb has a reference to the teximage, we didn't need anything else out of the attachment. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-17 13:04:05 -07:00
Eric Anholt	e98c39c109	mesa: Track the TexImage being rendered to in the gl_renderbuffer. We keep having to pass the attachments around with our gl_renderbuffers because that's the only way to find what the gl_renderbuffer actually refers to. This is a step toward removing that (though drivers still need the Zoffset as well). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-17 13:04:05 -07:00
Eric Anholt	7b085d1bfa	radeon: Remove dead radeon_wrap_texture(). I should have killed this in my previous cleanup. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-17 13:04:04 -07:00
Eric Anholt	c810e67c55	mesa: Make gl_renderbuffers backed by EGL images use FinishRenderTexture. This is the opportunity that radeon and intel drivers rely on for flushing render targets that may get reused as textures. Before EGL, that only happened for GL_TEXTURE attachments. Fixes piglits: KHR_gl_renderbuffer_image/renderbuffer-texture OES_EGL_image/renderbuffer-texture NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-17 13:04:04 -07:00
José Fonseca	6166ffeaf7	gallivm: Eliminate 8.8 fixed point intermediates from AoS sampling path. This change was meant as a stepping stone to use the PMADDUBSW SSSE3 instruction, but actually this refactoring by itself yields a 10% speedup on texture intensive shaders (e.g., Google Earth's ocean water w/o S3TC on an Ivy Bridge machine), while yielding exactly the same results, whereas PMADDUBSW only gave an extra 5%, at the expense of 2 bits of precision in the interpolation. I believe that the speedup of this change comes from the reduced register pressure (as 8.8 fixed point intermediates take twice the space of 8bit unorm). Also, not dealing with 8.8 simplifies lp_bld_sample_aos.c code substantially -- it's no longer necessary to have code duplicated for low and high register halves. Note about lp_build_sample_mipmap(): the path for num_quads > 1 is never executed (as it is faster on AVX to split the 256bit wide texture computation into two 128bit chunks, in order to leverage integer opcodes). This path might be useful in the future, so in order to verify this change did not break that path I had to apply this change: @@ -1662,11 +1662,11 @@ lp_build_sample_soa(struct gallivm_state *gallivm, /* * we only try 8-wide sampling with soa as it appears to * be a loss with aos with AVX (but it should work). * (It should be faster if we'd support avx2) */ - if (num_quads == 1 || !use_aos) { + if (/* num_quads == 1 || ! */ use_aos) { if (num_quads > 1) { if (mip_filter == PIPE_TEX_MIPFILTER_NONE) { LLVMValueRef index0 = lp_build_const_int32(gallivm, 0); /* and then run texfilt mesademo: LP_NATIVE_VECTOR_WIDTH=256 ./texfilt Ran whole piglit without regressions. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-05-17 20:23:00 +01:00
José Fonseca	5aaa4bafe0	gallivm: Add and use lp_build_lerp_3d. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-05-17 20:22:50 +01:00
Tom Stellard	e230d9debb	radeon/llvm: Run standard optimization passes on compute shader modules The SROA and function inliner passes are especially important, because they optimize away unsupported features: functions and indirect private memory access.	2013-05-17 07:38:01 -07:00
Kenneth Graunke	ccb041fe8e	intel: Don't spam "intelReadPixels: fallback to swrast" in non-PBO case. When an application is using PBOs, we attempt to use the BLT engine to perform ReadPixels. If that fails due to some restrictions, it's useful to raise a performance warning. In the non-PBO case, we always use a CPU mapping since getting the data into client memory requires a CPU-side copy. This is a very common case, so raising a performance warning is annoying. In particular, apitrace's image dumping code hits this path, causing it to print hundreds of thousands of performance warnings via ARB_debug_output. This tends to obscure actual errors or other important messages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-16 22:35:01 -07:00
Paul Berry	46ea804107	intel: Do a depth resolve before copying images between miptrees. When intel_finalize_mipmap_tree() calls intel_miptree_copy_teximage() to reassemble a depth miptree that has been broken apart into pieces (to deal with misalignment of levels/layers within the miptree), it just copies the depth data, not the HiZ data. This is reasonable, since the alignment restrictions of HiZ are a large part of the reason why the miptree had to be broken apart in the first place. However, in order for the depth copy to be sufficient, we need to do a depth resolve first, to make sure any deferred depth writes that are in the HiZ buffer get performed. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=64662 and https://bugs.freedesktop.org/show_bug.cgi?id=64659. NOTE: This is a candidate for stable release branches. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-05-16 14:42:54 -07:00
Niels Ole Salscheider	7e17e72cb7	r600g: fixup for MSAA texture support checking Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>	2013-05-16 12:03:47 -07:00
José Fonseca	4f518e1738	llvmpipe: Temporary workaround to prevent segfault on array textures.	2013-05-16 15:14:10 +01:00
José Fonseca	cb9913cdab	gallivm: Support pointers in lp_build_print_value(). Trivial.	2013-05-16 15:14:10 +01:00
Chia-I Wu	435aea6f32	ilo: emit 3DSTATE_STENCIL_BUFFER on GEN7+ Whether HiZ is enabled or not, separate stencil is supported and enforced on GEN7+. Now that we support separate stencil resources, we know how to emit 3DSTATE_STENCIL_BUFFER.	2013-05-16 18:33:59 +08:00
Chia-I Wu	6b894e6900	ilo: add support for stencil resources on GEN7+ For allocations, we need to support stencil-only and separate stencil resources. For mapping, we need to support software tiling and packing/unpacking for separate stencil resources.	2013-05-16 18:20:17 +08:00
Chia-I Wu	5c9b69d259	winsys/intel: test for and expose address swizzling Without knowing whether addresses are swizzled or not, we cannot manipulate a tiled surface in CPU.	2013-05-16 11:24:59 +08:00
Marek Olšák	639d0f73c1	st/mesa: handle texture_from_pixmap and other surface-based textures correctly There were 2 issues with it: 1) The texture format which should be used for texturing was only set in gl_texture_image::TexFormat, which wasn't used for sampler views. 2) Textures are sometimes reallocated under some circumstances in st_finalize_texture, which is unacceptable if the texture comes from a window system. The issues are resolved as follows: 1) If surface_based is true (texture_from_pixmap, etc.), store the format in a new variable st_texture_object::surface_format. 2) Don't reallocate a surface-based texture in st_finalize_texture. Also don't use st_ChooseTextureFormat in st_context_teximage, because the format is dictated by the caller. This fixes the glx-tfp piglit test. Reviewed-by: Adam Jackson <ajax@redhat.com>	2013-05-15 20:22:48 +02:00
Marek Olšák	5a3fac4d26	r600g: cleanup MSAA texture support checking Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-05-15 20:20:32 +02:00
Marek Olšák	61c995bc47	r600g: rewrite FMASK allocation, fix FMASK texturing with 2 and 4 samples This fixes and enables texturing with compressed MSAA colorbuffers on Evergreen and Cayman. For the first time, multisample textures work on Cayman. This requires the libdrm flag RADEON_SURF_FMASK. v2: require libdrm_radeon 2.4.45 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-05-15 20:19:45 +02:00
Eric Anholt	61506257f6	i965: Fill in brw_format_for_mesa_format for some non-rendering formats. This should have no change on driver operation, but it means that when you wonder why some format isn't supported natively, you can just look at the table above, instead of wondering if maybe there's an appropriate entry in the surface formats table that is already supported. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-15 09:43:46 -07:00
Eric Anholt	9db9bc3aa1	i965: Use native RGB_FLOAT16 support when available. Previously we would expand it to RGBA_FLOAT16. This format now comes out as framebuffer incomplete, but it seems worth the memory savings if that's what people are asking for (and GL3 does list it under "texture-only" color formats) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-15 09:43:46 -07:00
Eric Anholt	645b610b62	intel: Add support for blitting 6 byte-per-pixel formats. The next commit introduces what is apparently our first one, which tripped over this in glReadPixels. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-15 09:43:45 -07:00
Eric Anholt	028c11e8e3	i965: Use the Mesa surface formats for float RGB surfaces. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-15 09:43:45 -07:00
Eric Anholt	2e057076a8	i965: Use the new XRGB UNORM formats. This is a step on the way to removing some of our code for forcing alpha to 1, but I want easy bisecting so I'll add groups of formats separately. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-15 09:43:45 -07:00
José Fonseca	2a43dfda95	draw: More defensive coding in DRAW_GET_IDX. Doesn't make a difference ATM, but just in case.	2013-05-15 16:59:28 +01:00
José Fonseca	1883e1d3e9	draw: Fix vsplit regression when the ib can be used directly. `ib` is no longer offset by `istart`. Trivial.	2013-05-15 16:57:44 +01:00
Chris Forbes	53a5f11f0d	mesa: Stop clamping stencil reference value at specification time All drivers now clamp this to the appropriate range for the bound stencil buffer when emitting stencil state. NOTE: This is a candidate for stable branches. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-15 22:04:53 +12:00
Chris Forbes	978f91b829	swrast: Use accessor for stencil reference values NOTE: This is a candidate for stable branches. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-15 22:04:53 +12:00
Chris Forbes	db8a84de87	st: Use accessor for stencil reference values NOTE: This is a candidate for stable branches. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-15 22:04:53 +12:00
Chris Forbes	c411f40cba	radeon: Use accessor for stencil reference values V2: Drop spurious mask with 0xff. NOTE: This is a candidate for stable branches. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-15 22:04:34 +12:00
Chris Forbes	7bbe9b78ae	nouveau: Use accessor for stencil reference values NOTE: This is a candidate for stable branches. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-15 22:01:08 +12:00
Chris Forbes	f819ec46d5	intel: Use accessor for stencil reference values NOTE: This is a candidate for stable branches. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-15 22:01:06 +12:00
Chris Forbes	96a1bf1ba3	mesa: Use accessor for stencil reference values in glGet NOTE: This is a candidate for stable branches. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-15 22:01:03 +12:00
Chris Forbes	38f65162af	mesa: add accessor for effective stencil ref Clamps the stencil reference value to the range representable in the currently-bound draw framebuffer's stencil attachment. V2: Add spec quote. NOTE: This is a candidate for stable branches. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-15 22:00:55 +12:00
Chia-I Wu	c68424bac4	ilo: clean up transfer format conversion Map the bo directly, instead of calling transfer_map().	2013-05-15 15:21:50 +08:00
Chia-I Wu	cb57da421a	ilo: rework transfer mapping method choosing Always check if a bo is busy in choose_transfer_method() since we always need to map it in either map() or unmap(). Also determine how a bo is mapped in choose_transfer_method().	2013-05-15 15:21:50 +08:00
Chia-I Wu	b6c307744f	ilo: refactor transfer mapping Add tex_get_box_offset() to compute transfer offset from the pipe_box. Add tex_get_slice_stride() to compute slice stride for a transfer.	2013-05-15 15:21:50 +08:00
Chia-I Wu	5af8641ce0	ilo: no writeback without PIPE_TRANSFER_WRITE We should not write staging data back when PIPE_TRANSFER_WRITE is not set.	2013-05-15 15:08:54 +08:00
Chia-I Wu	46bb33bc21	ilo: minor cleanups for transfers Rename some functions and reorder some code.	2013-05-15 15:08:54 +08:00
Chia-I Wu	ca349e0217	ilo: simplify ilo_texture_get_slice_offset() Always return a tile-aligned offset. Also fix for W tiling.	2013-05-15 15:08:54 +08:00
Zack Rusin	013424678e	draw/gs: fix extracting of the clip The indices are not consecutive when using the geometry shader, which means we were extracting non existing values. Create an array of linear indices and always use it instead of the passed indices. Found by Jose. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-05-14 04:04:08 -04:00
Kenneth Graunke	a6961f391a	docs: Mark a few things as in progress.	2013-05-14 12:22:40 -07:00
Zack Rusin	5104ed3dbf	draw: try to prevent overflows on index buffers Pass in the size of the index buffer, when available, and use it to handle out of bounds conditions. The behavior in the case of an overflow needs to be the same as with other overflows in the vertex processing pipeline meaning that a vertex should still be generated but all attributes in it set to zero. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-05-14 03:10:56 -04:00
Zack Rusin	d5250da818	draw: use the total number of vertices for statistics The number of vertices to fetch doesn't necessarily equal the total number of input vertices, e.g. we might want to fetch a single vertex but then draw it twice. Let's use the correct number of input vertices in the statistics. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-05-14 03:10:33 -04:00
Zack Rusin	29853ab7b8	draw: don't crash on vertex buffer overflow We would crash when stride was bigger than the size of the buffer. The correct behavior is to just fetch zeros in this case. Unfortunately with user_buffers there's no way to validate the size because currently we're just not getting it. Adjust the draw interface to pass the size along the mapped buffer, which works perfectly for buffer backed vertex_buffers and, in future, it will allow us to plumb user_buffer sizes through the same interface. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-05-14 03:09:32 -04:00
Zack Rusin	386327c48f	gallivm/soa: implement indirect addressing in immediates The support is analogous to the way we handle indirect addressing in temporaries, except that we don't have to worry about storing (after declarations) and thus we'll be able to keep using the old code when indirect addressing isn't used. In other words we're still using constants directly, unless the instruction has immediate register with indirect addressing. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-05-14 03:09:15 -04:00
Zack Rusin	2866525b86	draw/gs: don't bind the tgsi state if we're using llvm paths Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-05-14 03:08:56 -04:00
Vinson Lee	ff256ec068	gallivm: Fix build with LLVM >= 3.4 r181680. Tested-by: Laurent Carlier <lordheavym@gmail.com> Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-05-14 09:06:14 -07:00
José Fonseca	36385c0bdf	mesa/st: Temporary workaround for fdo bug 64568. Effectively reverting the problematic hunk of commit `614ee25077`	2013-05-14 17:02:53 +01:00
Alex Deucher	29b8d6a1da	radeonsi: add Hainan pci ids Note: this is a candidate for the 9.1 branch Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-05-14 10:51:10 -04:00
Alex Deucher	d188f14941	radeonsi: update r600_get_llvm_processor_name for hainan Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-05-14 10:51:10 -04:00
Alex Deucher	4045c3d060	radeonsi: add support for hainan chips Note: this is a candidate for the 9.1 branch Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-05-14 10:51:10 -04:00
José Fonseca	c475ae5d3d	draw: Fix io_ptr/num_prims name in IR. Trivial.	2013-05-14 15:36:37 +01:00
José Fonseca	2f3d939e36	graw/tgsi_dump: Fix gdb macro. The macro was relying on "tokens" local variable to exist.	2013-05-14 15:36:37 +01:00
Vadim Girlin	560ddad261	r600g/sb: add missing cases for ARUBA chips Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-14 17:36:25 +04:00
Vadim Girlin	ecde4b07e2	r600g/sb: get rid of standard c++ streams Static initialization of internal libstdc++ data related to iostream causes segfaults with some apps. This patch replaces all uses of std::ostream and std::ostringstream in sb with custom lightweight classes. Prevents segfaults with ut2004demo and probably some other old apps. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-14 17:36:25 +04:00
Vadim Girlin	57d1be0d2d	r600g/sb: separate bytecode decoding and parsing Parsing and ir construction is required for optimization only, it's unnecessary if we only need to print shader dump. This should make new disassembler more tolerant to any new features in the bytecode. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-14 17:36:25 +04:00
Christian König	e195d301ae	vl/vdpau: fix PresentationQueueQuerySurfaceStatus The last queued surface always keeps displaying. Fixing a problem with XBMC. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-05-14 15:16:15 +02:00
Chia-I Wu	176ad54c04	ilo: rework ilo_texture Use ilo_buffer for buffer resources and ilo_texture for texture resources. A major cleanup is necessitated by the separation.	2013-05-14 16:07:22 +08:00
Chia-I Wu	768296dd05	ilo: rename ilo_resource to ilo_texture In preparation for the introduction of ilo_buffer.	2013-05-14 16:01:25 +08:00
Chia-I Wu	528ac68f7a	ilo: move transfer-related functions to a new file Resource mapping is distinct from resource allocation, and is going to get more and more complex. Move the related functions to a new file to make the separation clear.	2013-05-14 16:01:20 +08:00
Rodrigo Vivi	888fc7a891	i965: Add missing Haswell GT3 Desktop to IS_HSW_GT3 check. NOTE: This is a candidate for stable branches. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-13 17:00:46 -07:00
Jordan Justen	a16a2d7147	i965: write layer if gl_Layer is used in VS This is enabled by the AMD_vertex_shader_layer extension. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-13 13:57:57 -07:00
Jordan Justen	220f70667d	glsl: add AMD_vertex_shader_layer support This GLSL extension requires that AMD_vertex_shader_layer be enabled by the driver. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-13 13:57:52 -07:00
Jordan Justen	c9e981b8fb	extensions: add AMD_vertex_shader_layer This extension will require driver support, so it must be enabled by the driver. http://www.opengl.org/registry/specs/AMD/vertex_shader_layer.txt Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-13 13:57:03 -07:00
Chad Versace	1776eeedd3	mesa: Expose GL_OES_texture_npot on GLES1 Mesa's extension table incorrectly lists GL_OES_texture_npot as ES2-only. It's also an ES1 extension. This patch adds ES1 to the extensions API mask. From the GL_OES_texture_npot spec: OpenGL ES 1.0 or OpenGL ES 2.0 is required. This extension is written against OpenGL ES 1.1.12 and OpenGL ES 2.0.25. Signed-off-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-13 12:08:37 -07:00
Ian Romanick	a61a0dbed2	glsl: Death to array dereferences of vectors! Now that all the places that used to generate array dereferences of vectors have been changed to generate either ir_binop_vector_extract or ir_triop_vector_insert (or both), remove all support for dealing with this deprecated construct. As an added safeguard, modify ir_validate to reject ir_dereference_array of a vector. v2: Convert tabs to spaces. Suggested by Eric. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-13 12:05:19 -07:00
Ian Romanick	1e773626ee	glsl: Generate correct ir_binop_vector_extract code for out and inout parameters Like with type conversions on out parameters, some extra copies need to occur to handle these cases. The fundamental problem is that ir_binop_vector_extract is not an lvalue, but out and inout parameters must be lvalues. A previous patch dealt with a similar problem in the LHS of ir_assignment. v2: Convert tabs to spaces. Suggested by Eric. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-13 12:05:19 -07:00
Ian Romanick	c3bb07f875	glsl: Use vector-insert and vector-extract on elements of gl_ClipDistanceMESA Variable indexing into vectors using ir_dereference_array is being removed, so this lowering pass has to generate something different. v2: Convert tabs to spaces. Suggested by Eric. v3: Simplify code slightly by assuming that elements of gl_ClipDistanceMESA will always be vec4. Suggested by Paul. v4: Fairly substantial rewrite based on the rewrite of "glsl: Convert lower_clip_distance_visitor to be an ir_rvalue_visitor" Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-13 12:05:19 -07:00
Ian Romanick	d13fbeea96	glsl: Remove some stale comments about ir_call ir_call was changed long ago to be a statement rather than an expression. That makes this comment no longer valid. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-13 12:05:19 -07:00
Ian Romanick	065da16508	glsl: Convert lower_clip_distance_visitor to be an ir_rvalue_visitor Right now the lower_clip_distance_visitor lowers variable indexing into gl_ClipDistance into variable indexing into both the array gl_ClipDistanceMESA and the vectors of that array. For example, gl_ClipDistance[i] = f; becomes gl_ClipDistanceMESA[i >> 2][i & 3] = f; However, variable indexing into vectors using ir_dereference_array is being removed. Instead, ir_expression with ir_triop_vector_insert will be used. The above code will become gl_ClipDistanceMESA[i >> 2] = vector_insert(gl_ClipDistanceMESA[i >> 2], i & 3, f); In order to do this, an ir_rvalue_visitor will need to be used. This commit is really just a refactor to get ready for that. v4: Split the least amount of refactor from the rest of the code changes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-05-13 12:05:19 -07:00
Ian Romanick	3acb21517b	glsl: Generate ir_binop_vector_extract for indexing of vectors Now ir_dereference_array of a vector will never occur in the RHS of an expression. v2: Add back the { } around the if-statement body to make it more readable. Suggested by Eric. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-13 12:05:19 -07:00
Ian Romanick	89704eb1b0	glsl: Convert ir_binop_vector_extract in the LHS to ir_triop_vector_insert The ast_array_index code can't know whether to generate an ir_binop_vector_extract or an ir_triop_vector_insert. Instead it will always generate ir_binop_vector_extract, and the LHS and RHS have to be re-written. v2: Convert tabs to spaces. Suggested by Eric. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-13 12:05:19 -07:00
Ian Romanick	ee7a6dad30	glsl: Add lowering pass for ir_triop_vector_insert This will eventually replace do_vec_index_to_cond_assign. This lowering pass is called in all the places where do_vec_index_to_cond_assign or do_vec_index_to_swizzle is called. v2: Use WRITEMASK_* instead of integer literals. Use a more concise method of generating broadcast_index. Both suggested by Eric. v3: Use a series of scalar compares instead of a single vector compare. Suggested by Eric and Ken. It still uses 'if (cond) v.x = y;' instead of conditional assignments because ir_builder doesn't do conditional assignments, and I'd rather keep the code simple. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-13 12:05:19 -07:00
Ian Romanick	b881ddba7d	glsl: Lower ir_binop_vector_extract to conditional moves Lower ir_binop_vector_extract with a non-constant index to a series of conditional moves. This is exactly like ir_dereference_array of a vector with a non-constant index. v2: Convert tabs to spaces. Suggested by Eric. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-13 12:05:19 -07:00
Ian Romanick	943de9cdea	glsl: Lower ir_binop_vector_extract to swizzle Lower ir_binop_vector_extract with a constant index to a swizzle. This is exactly like ir_dereference_array of a vector with a constant index. v2: Convert tabs to spaces. Suggested by Eric. v3: Correctly call convert_vector_extract_to_swizzle in ir_vec_index_to_swizzle_visitor::visit_enter(ir_call *ir). Suggested by Ken. v4: Use CLAMP instead of MIN2(MAX2()). Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-13 12:05:18 -07:00
Ian Romanick	63e1147ea1	glsl: Refactor part of convert_vec_index_to_cond_assign Use a first function that extracts the vector being indexed and the index from the deref. Call the second function that does the real work. Coming patches will add a new ir_expression for variable indexing into a vector. Having the lowering pass split into two functions will make it much easier to lower the new ir_expression. v2: Convert tabs to spaces. Suggested by Eric. v3: Move some bits from a later patch back to this patch so that it actually compiles. Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-13 12:05:18 -07:00
Ian Romanick	dafd6918f3	glsl: Add ir_triop_vector_insert The new opcode is used to generate a new vector with a single field from the source vector replaced. This will eventually replace ir_dereference_array of vectors in the LHS of assignments. v2: Convert tabs to spaces. Suggested by Eric. v3: Add constant expression handling for ir_triop_vector_insert. This prevents the constant matrix inversion tests from regressing. Duh. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-13 12:05:18 -07:00
Ian Romanick	f274a2ca87	glsl: Add ir_binop_vector_extract The new opcode is used to get a single field from a vector. The field index may not be constant. This will eventually replace ir_dereference_array of vectors. This is similar to the extractelement instruction in LLVM IR. http://llvm.org/docs/LangRef.html#extractelement-instruction v2: Convert tabs to spaces. Suggested by Eric. v3: Add array index range checking to ir_binop_vector_extract constant expression handling. Suggested by Ken. v4: Use CLAMP instead of MIN2(MAX2()). Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-13 12:05:18 -07:00
Paul Berry	b0bb6103d2	glsl: Fix "make check" breakage after adding options to do_common_optimization. Commit `b765740` (glsl: Pass struct shader_compiler_options into do_common_optimization.) added a new parameter to do_common_optimization() but didn't update test_optpass.cpp, causing "make check" to break. This patch makes the proper updates to test_optpass.cpp so that the build succeeds again.	2013-05-13 07:55:37 -07:00
Kenneth Graunke	e413d3f15c	glsl: Add a pass to flip matrix/vector multiplies to use dot products. This pass flips (matrix * vector) operations to (vector * matrixTranspose) for certain built-in matrices (currently gl_ModelViewProjectionMatrix and gl_TextureMatrix). This is equivalent, but results in dot products rather than multiplies and adds. On some hardware, this is more efficient. This pass is conditionalized on ctx->mvp_with_dp4, the flag drivers set to indicate they prefer dot products. Improves performance in Lightsmark by 1.01131% +/- 0.162069% (n = 10) on a Haswell GT2 system. Passes Piglit on Ivybridge. v2: Use struct gl_shader_compiler_options instead of plumbing through another boolean flag for this purpose. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-12 09:36:46 -07:00
Kenneth Graunke	72a0b7a435	i965/vs: Set the PreferDP4 shader compiler option. Doing matrix multiplies with DP4s is fewer instructions than MUL/ADD, especially since we don't support MAD in the vertex shader. Not observed to improve performance in any fixed function applications, but is useful for the next patch. I've left this unset for the fragment shader because the scalar backend can't use DP4 and does have MAD support. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-12 09:36:44 -07:00
Kenneth Graunke	bbf029f7cf	mesa: Move the mvp_with_dp4 flag to ShaderCompilerOptions. This flag essentially tells the compiler whether it prefers dot products or multiply/adds for matrix operations. As such, ShaderCompilerOptions seems like the right place for it. This also lets us specify it on a per-stage basis. This patch makes all existing users set the flag for the Vertex Shader stage only, as it's currently only used for fixed-function vertex programs. That will change soon, and I wanted to preserve the existing behavior. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-12 09:36:43 -07:00
Kenneth Graunke	b765740a66	glsl: Pass struct shader_compiler_options into do_common_optimization. do_common_optimization may need to make choices about whether to emit certain kinds of instructions. gl_context::ShaderCompilerOptions contains exactly that information, so it makes sense to pass it in. Rather than passing the whole array, pass the structure for the stage that's currently being worked on. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-12 09:36:41 -07:00
Kenneth Graunke	6bb9acfb4e	glsl: Initialize ctx->ShaderCompilerOptions in standalone scaffolding. This code is copied from _mesa_init_shader_state(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-12 09:36:39 -07:00
Kenneth Graunke	1c95cea40b	glsl: Copy _mesa_shader_type_to_index() to standalone scaffolding. We can't include shaderobj.h from the standalone utilities, so we unfortunately have to copy this function. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-12 09:36:18 -07:00
Kenneth Graunke	a67b18e5a7	mesa: Add comments about bit-ordering of new XRGB/XBGR formats. Marek added these new formats in commit `f9fa725690`, but without comments relating to the packing. Sometimes the naming is confusing, so these comments are helpful in determining whether two formats are compatible. The new comments are based on my reading of format_unpack.c. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-05-12 09:32:42 -07:00
Marek Olšák	f486c52f9e	st/mesa: remove dependency on _NEW_BUFFER_OBJECT for vertex arrays _NEW_BUFFER_OBJECT means glBufferData was called. We can just set our own flag in BufferData. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-11 23:59:20 +02:00
Marek Olšák	b88cebb634	st/mesa: don't check for _NEW_PROGRAM when binding UBOs Probably copied from i965. However st/mesa has its flags ST_NEW_xxx_PROGRAM. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-11 23:45:02 +02:00
Marek Olšák	a17e87d4eb	st/mesa: fix a couple of issues in st_bind_ubos - don't reference a buffer for a local variable (that's never useful unless it can be the only reference to the buffer) - check if the buffer is not NULL - set buffer_size as specified with BindBufferRange NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-11 23:45:02 +02:00
Marek Olšák	1ba1d617bf	st/mesa: restore the transfer_inline_write path for BufferData Version 2 that shouldn't crash. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-11 23:45:02 +02:00
Marek Olšák	6a2ad679e6	st/mesa: initialize Const.MaxColorAttachments NOTE: This is a candidate for the stable branches. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-11 23:45:02 +02:00
Marek Olšák	52cb395bb1	gallium: add PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE for GL v2: fix typo 65535 -> 65536 Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-11 23:45:01 +02:00
Marek Olšák	b6d3373442	st/mesa: consolidate setting MaxTextureImageUnits Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-11 23:45:01 +02:00
Marek Olšák	614ee25077	st/mesa: initialize all program constants and UBO limits Also simplify UBO support checking. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-11 23:45:01 +02:00
Marek Olšák	d90f04a65b	glsl: fix the value of gl_MaxFragmentUniformVectors NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-11 23:45:01 +02:00
Marek Olšák	77d8fbcfd4	mesa: add & use a new driver flag for UBO updates instead of _NEW_BUFFER_OBJECT v2: move the flagging from intel_bufferobj_data to intel_bufferobj_alloc_buffer Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-11 23:45:01 +02:00
Marek Olšák	081c789c3e	mesa: skip _MaxElement computation unless driver needs strict bounds checking If Const.CheckArrayBounds is false, the only code using _MaxElement is glDrawRangeElements, so I changed it and explained in the code why _MaxElement is not very useful there. BTW, the big magic number was copied to the letter from _mesa_update_array_max_element. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-11 23:45:01 +02:00
Marek Olšák	db38e9a0e1	mesa: remove unused gl_array_object::NewArray Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-11 23:45:01 +02:00
Marek Olšák	74ca7f0974	mesa: remove unused gl_constants::MaxColorTableSize Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-11 23:45:01 +02:00
Marek Olšák	286d06ddc4	mesa: unify MaxVertexVaryingComponents and MaxGeometryVaryingComponents The limits should not be different and OpenGL requires both to be at least 32, which is also the maximum limit on radeon. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-11 23:45:01 +02:00
Marek Olšák	5e78433eec	mesa: move max texture image unit constants to gl_program_constants Const.MaxTextureImageUnits -> Const.FragmentProgram.MaxTextureImageUnits Const.MaxVertexTextureImageUnits -> Const.VertexProgram.MaxTextureImageUnits etc. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-11 23:45:01 +02:00
Marek Olšák	d27d29f1a6	mesa: consolidate definitions of max texture image units Shaders are unified on most hardware (= same limits in all stages). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-11 23:44:55 +02:00
Vinson Lee	5471e3949c	ilo: Initialize read_back in transfer_map_sys. Fixes "Uninitialized scalar variable" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2013-05-10 15:29:40 +08:00
Marek Olšák	da33f9b919	r600g: increase array size for shader inputs and outputs and add assertions to prevent buffer overflow. This fixes corruption of the r600_shader struct. NOTE: This is a candidate for the stable branches.	2013-05-10 03:23:31 +02:00
Chí-Thanh Christopher Nguyễn	121c2c8983	targets/dri-i915: Force c++ linker in all cases NOTE: This is a candidate for the 9.1 branch. Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=461696 Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-05-09 17:04:27 -07:00
Ben Widawsky	fc98c47115	i965: Actually use the user timeout in glClientWaitSync. Use the new libdrm functionality to actually do timed waits on the sync object. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-05-09 16:41:44 -07:00
Paulo Zanoni	f1d2b37317	i965: make GT3 machines work as GT3 instead of GT2 We were not allowed to say the "GT3" name, but we really needed to have the PCI IDs because too many people had such machines, so we had to make the GT3 machines work as GT2. Let's just say that GT2_PLUS was short for GT2_PLUS_1 :) NOTE: This is a candidate for stable branches. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-09 15:11:53 -07:00
Kenneth Graunke	d0b82b1add	i965: Add chipset limits for the Haswell GT3 variant. NOTE: This is a candidate for stable branches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>	2013-05-09 15:11:53 -07:00
Kenneth Graunke	eca2251f42	i965: Update URB partitioning code for Haswell's GT3 variant. Haswell's GT3 variant offers 32kB of URB space for push constants, while GT1 and GT2 match Ivybridge, providing 16kB. Update the code to reserve the full 32kB on GT3. v2: Specify push constant size correctly. I thought GT3 reinterpreted the value as multiples of 2kB, but it doesn't. You simply have to program an even number. NOTE: This is a candidate for stable branches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-09 15:11:52 -07:00
Kenneth Graunke	c56eba5adb	i965: Delete dead intel_span.c symlink.	2013-05-09 15:11:52 -07:00
Eric Anholt	0f3068a58b	i965/vs: Make virtual grf live intervals actually cover their used range. This is the same change as the previous commit to the FS. A very few VSes are regressed by 1 or 2 instructions, which look recoverable with a bit more dead code elimination. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-09 14:38:05 -07:00
Eric Anholt	e290372542	i965/fs: Make virtual grf live intervals actually cover their used range. Previously, we would sometimes not consider a write to a register to extend the end of the interval, nor would we consider a read before a write to extend the start. This made for a bunch of complicated logic related to how to treat the results when dead code might be present. Instead, just extend the interval and fix dead code elimination to know how to remove it. Interestingly, this actually results in a tiny bit more optimization: total instructions in shared programs: 1391220 -> 1390799 (-0.03%) instructions in affected programs: 14037 -> 13616 (-3.00%) v2: Fix a theoretical problem with the simd16 workaround if dst == src, where we would revert the bump of the live range. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)	2013-05-09 14:38:05 -07:00
Marek Olšák	dd6152b6ca	docs: document GALLIUM_HUD and LIBGL_SHOW_FPS	2013-05-09 23:28:05 +02:00
Courtney Goeltzenleuchter	daa90f91ff	ilo: Add support for HW primitive restart. Now tells Gallium that ilo supports primitive restart. Updated ilo_draw_vbo to be able to check that the indexed primitive being rendered can actually be supported in HW. If not, will break up into individual prims similar to what Mesa does. [olv: a minor fix after rebasing and formatting]	2013-05-10 00:06:14 +08:00
Brian Paul	009d79734f	svga: misc whitespace and comment fixes in svga_cmd.c	2013-05-09 07:43:46 -06:00
Brian Paul	60c71cce3f	docs: remove ^M chars from GL3.txt	2013-05-09 07:43:46 -06:00
Brian Paul	e0144019c0	st/mesa: generate GL_OUT_OF_MEMORY if we can't create the index buffer Before, if we failed to allocate the index buffer we'd silently return from st_draw_vbo() without drawing anything. We should raise GL_OUT_OF_MEMORY to give some indication that something went wrong. Note: This is a candidate for the stable branches. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-05-09 07:43:46 -06:00
Chia-I Wu	a8e4614071	ilo: add support for PIPE_FORMAT_ETC1_RGB8 It is decompressed to and stored as PIPE_FORMAT_R8G8B8X8_UNORM on-the-fly.	2013-05-09 16:05:48 +08:00
Chia-I Wu	183ea823fd	ilo: support mapping with a staging system buffer It can be used for unpacking compressed texture on-the-fly or to support explicit transfer flushing.	2013-05-09 16:05:47 +08:00
Chia-I Wu	baa44db065	ilo: allow for different mapping methods We want to or need to use a different mapping method when the resource is busy, the bo format differs from the requested format, etc.	2013-05-09 16:05:47 +08:00
Chia-I Wu	7cca1aac9d	ilo: allow bo format to differ from that requested For separate stencil buffer or formats not supported natively, the real format of the bo may differ from that requested.	2013-05-09 16:05:47 +08:00
Stéphane Marchesin	1c56fc1025	draw/llvm: Add additional llvm optimization passes It helps a bit with vertex shader performance on i915g (a couple percent faster with openarena). I have tried most other passes, and they weren't showing any measurable improvement. Note that my vertex shaders didn't have loops, so maybe the loop optimizations could still be useful in the future. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-08 22:05:54 -07:00
Eric Anholt	0b0d6f97cf	i965: Sync brw_format_for_mesa_format() table with new Mesa formats. I'm not filling them all in, to prevent any breakage in this commit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-08 15:31:07 -07:00
Eric Anholt	2755946427	i965: Update the surface formats table from the current specs. Unfortunately the surface formats table is now splattered across multiple chapters. All surface format enums from brw_defines.h are present, but only support for them that is mentioned in the public specs is included here. v2 (from Ken): Mark R32G32B32A32_SFIXED as unsupported on Ivybridge. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-08 15:31:06 -07:00
Eric Anholt	5d89487eb2	i965: Add surface format defines from the public specs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-08 14:27:30 -07:00
Fabian Bieler	4e9c7f9c5a	mesa/program: Don't copy propagate from swizzles. Do not propagate a copy if source and destination are identical. Otherwise code like MOV TEMP[0].xyzw, TEMP[0].wzyx MOV TEMP[1].xyzw, TEMP[0].xyzw is changed to MOV TEMP[0].xyzw, TEMP[0].wzyx MOV TEMP[1].xyzw, TEMP[0].wzyx This fixes Piglit test shaders/glsl-copy-propagation-self-2 for drivers that use Mesa IR. NOTE: This is a candidate for the stable branches. Signed-off-by: Fabian Bieler <fabianbieler@fastmail.fm> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-08 13:59:19 -07:00
Fabian Bieler	e1ff753d67	mesa/st: Don't copy propagate from swizzles. Do not propagate a copy if source and destination are identical. Otherwise code like MOV TEMP[0].xyzw, TEMP[0].wzyx MOV TEMP[1].xyzw, TEMP[0].xyzw is changed to MOV TEMP[0].xyzw, TEMP[0].wzyx MOV TEMP[1].xyzw, TEMP[0].wzyx This fixes Piglit test shaders/glsl-copy-propagation-self-2 for gallium drivers. NOTE: This is a candidate for the stable branches. Signed-off-by: Fabian Bieler <fabianbieler@fastmail.fm> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-08 13:59:14 -07:00
Eric Anholt	5d06c9ea0f	i965: Fix hangs on HSW since the gen6 blorp fix. The constant packets for gen6 are too small for gen7, and while IVB seems happy with them HSW blows up. Fix it by emitting the correct packets on gen7, for all stages. v2: Include the packets instead of just skipping them. NOTE: This is a candidate for the stable branches. Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-08 10:23:41 -07:00
Chad Versace	2878f4685c	egl/android: Fix error condition for EGL_ANDROID_image_native_buffer Emit EGL_BAD_CONTEXT if the user passes a context to eglCreateImageKHR(type=EGL_ANDROID_image_native_buffer). From the EGL_ANDROID_image_native_buffer spec: * If <target> is EGL_NATIVE_BUFFER_ANDROID and <ctx> is not EGL_NO_CONTEXT, the error EGL_BAD_CONTEXT is generated. Note: This is a candidate for the stable branches. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-05-08 08:44:05 -07:00
Stéphane Marchesin	38d2a16c01	i915: Use Y tiling for textures This basically reverts commit `2acc719374`. With the previous change, we're not batchbuffer limited any longer. So we actually start seeing a performance difference between X and Y tiling. X tiling is funny because it is faster for screen-aligned quads but slower in games. So let's use Y tiling which is 10% faster overall.	2013-05-08 02:07:00 -07:00
Stéphane Marchesin	fc24c7aede	i915g: Optimize batchbuffer sizes Now that we don't throttle at every batchbuffer, we can shrink the size of batchbuffers to achieve early flushing. This gives a significant speed boost in a lot of games (on the order of 20%).	2013-05-08 02:06:56 -07:00
Stéphane Marchesin	7f7c7fda83	i915g: Add more PIPE_CAP_* support	2013-05-08 01:37:55 -07:00
Chia-I Wu	00035670de	ilo: remove our own type inference tgsi_opcode_infer_{src,dst}_type() works just fine.	2013-05-08 11:33:34 +08:00
Chia-I Wu	b74af51a46	ilo: use tgsi_util_get_texture_coord_dim() And remove toy_tgsi_get_texture_coord_dim().	2013-05-08 11:07:46 +08:00
Chia-I Wu	75a48a53d8	tgsi: fix operand type of TGSI_OPCODE_NOT It should be TGSI_TYPE_UNSIGNED, not TGSI_TYPE_FLOAT. Fixed also gallivm not_emit_cpu() to use uint build context. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Roland Scheidegger <sroland@vmware.com>	2013-05-08 11:03:49 +08:00
Chia-I Wu	1f970816b1	tgsi: refactor tgsi_opcode_infer_src_type() Call tgsi_opcode_infer_type() from tgsi_opcode_infer_src_type(). Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Roland Scheidegger <sroland@vmware.com>	2013-05-08 11:03:47 +08:00
Chia-I Wu	364feb327d	tgsi: refactor tgsi_opcode_infer_dst_type() Move the body of tgsi_opcode_infer_dst_type() to a new helper function, tgsi_opcode_infer_type(), and call the helper function from tgsi_opcode_infer_dst_type(). The diff looks complicated simply because the code is moved around. A following commit will make tgsi_opcode_infer_src_type() call tgsi_opcode_infer_type(). Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Roland Scheidegger <sroland@vmware.com>	2013-05-08 11:03:43 +08:00
Chia-I Wu	8a52453f5d	tgsi: reorder opcodes in opcode type inference Reorder opcodes by their assigned numbers. This makes it easier to see the differences between tgsi_opcode_infer_src_type() and tgsi_opcode_infer_dst_type(). Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Roland Scheidegger <sroland@vmware.com>	2013-05-08 11:03:24 +08:00
Chia-I Wu	61d57ec276	tgsi: clean up exec_tex() Make use of tgsi_util_get_texture_coord_dim() to replace the big switch table. There is a subtle difference with this change. When TXP is used with an array texture, the layer is now also projected. This behavior matches the TGSI doc. Since GLSL does not allow TXP on an array texture, I am not sure which behavior is correct or preferred. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Roland Scheidegger <sroland@vmware.com>	2013-05-08 11:00:07 +08:00
Chia-I Wu	80857d2c8b	tgsi: add tgsi_util_get_texture_coord_dim() This util function returns the dimension of the texture coordinates for a texture target, and the location of the shadow reference value. For example, when the texture target is TGSI_TEXTURE_SHADOW2D, the dimension of the texture coordinates is 2, and the location of the ref value is 2 (that is, the Z channel). Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Roland Scheidegger <sroland@vmware.com>	2013-05-08 10:58:53 +08:00
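The lookup described in the commit above can be sketched roughly as follows. This is only an illustration: the enum, the function name `demo_texture_coord_dim()`, and the use of -1 for "no shadow reference" are hypothetical and are not the actual TGSI utility; only the SHADOW2D case (dimension 2, reference value at location 2) is taken from the commit message.

```c
#include <assert.h>

/* Hypothetical texture targets for this sketch (not the TGSI enum). */
enum demo_target {
    DEMO_TEX_2D,
    DEMO_TEX_3D,
    DEMO_TEX_SHADOW2D
};

/*
 * Return the number of texture coordinate components for a target and,
 * through shadow_ref, the channel holding the shadow reference value
 * (-1 when the target has none; that convention is an assumption).
 */
static int demo_texture_coord_dim(enum demo_target target, int *shadow_ref)
{
    int dim, ref;

    switch (target) {
    case DEMO_TEX_2D:
        dim = 2; ref = -1;
        break;
    case DEMO_TEX_3D:
        dim = 3; ref = -1;
        break;
    case DEMO_TEX_SHADOW2D:
        dim = 2; ref = 2;   /* the Z channel, as in the commit message */
        break;
    default:
        assert(!"unknown target");
        dim = 0; ref = -1;
        break;
    }

    if (shadow_ref)
        *shadow_ref = ref;
    return dim;
}
```

A single table-like lookup of this kind is what lets a caller replace a large per-target switch with one call.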
Bryan Cain	14a0bb81fe	nv50: initialize kick_notify callback in nv50_create Fixes infinite loop on startup in Portal and Left 4 Dead 2. NOTE: This is a candidate for the 9.0 and 9.1 branches.	2013-05-07 17:01:59 -05:00
Eric Anholt	3f09e528d5	i965: Use Y-tiled blits to untile for cached mappings of miptrees. Fixes a regression in firefox's unaccelerated compositing path for WebGL with the introduction of Y tiling. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64213 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-07 11:45:45 -07:00
Eric Anholt	d641a01d98	i965: Add support for Y-tiled blits on gen6+. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-07 11:45:45 -07:00
Eric Anholt	7a74808d78	i965: Count occlusion query samples for CopyPixels using the 2D engine. We accidentally "fixed" the piglit test for this when introducing Y tiling, since this path stopped being executed. In reenabling this path for Y tiling, we ended up regressing it again, so just fix it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59439 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-05-07 11:45:45 -07:00
Robert Bragg	f8c3242682	egl/wayland: Implement EGL_EXT_swap_buffers_with_damage Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2013-05-07 17:07:50 +01:00
Robert Bragg	6425b14515	egl: Add extension infrastructure for EGL_EXT_swap_buffers_with_damage Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2013-05-07 17:07:45 +01:00
Robert Bragg	95dda0d649	egl: Update to revision 21254 of eglext.h This pulls in EGL_EXT_swap_buffers_with_damage. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2013-05-07 17:07:44 +01:00
Roland Scheidegger	65102b708b	gallium: more tgsi documentation updates Adds the remaining integer opcodes, and some opcodes are moved to more appropriate places, along with getting rid of the (already nearly empty) ps_2_x section. Though the CAP bits for some of these are still a bit in the air so the documentation isn't quite as watertight as is desirable. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-07 16:13:23 +02:00
Vinson Lee	4ba9c9c5be	ilo: Add missing break statement in aos_tex TGSI_OPCODE_TEX2 case. Fixes "Missing break in switch" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2013-05-07 12:15:48 +08:00
Vadim Girlin	c9cf83b587	r600g/sb: optimize some cases for CNDxx instructions We can replace CNDxx with MOV (and possibly eliminate after propagation) in following cases: If src1 is equal to src2 in CNDxx instruction then the result doesn't depend on condition and we can replace the instruction with "MOV dst, src1". If src0 is const then we can evaluate the condition at compile time and also replace it with MOV. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-07 04:40:26 +04:00
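The two cases named in the commit above can be sketched on a toy instruction representation. Everything here is hypothetical (`demo_inst`, `demo_fold_cnde()`), and the sketch assumes CNDE semantics of "dst = (src0 == 0.0) ? src1 : src2"; it is not the r600g/sb implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy IR for illustration only. */
enum demo_op { DEMO_OP_CNDE, DEMO_OP_MOV };

struct demo_src {
    bool  is_const;   /* true: const_val is valid, false: reg is valid */
    float const_val;
    int   reg;
};

struct demo_inst {
    enum demo_op op;
    struct demo_src src[3];
};

static bool demo_src_equal(const struct demo_src *a, const struct demo_src *b)
{
    if (a->is_const != b->is_const)
        return false;
    return a->is_const ? a->const_val == b->const_val : a->reg == b->reg;
}

/*
 * Rewrite "CNDE dst, src0, src1, src2" into "MOV dst, srcN" when possible.
 * Returns true if the instruction was rewritten.
 */
static bool demo_fold_cnde(struct demo_inst *inst)
{
    int pick;

    assert(inst);
    if (inst->op != DEMO_OP_CNDE)
        return false;

    if (demo_src_equal(&inst->src[1], &inst->src[2]))
        pick = 1;                                   /* result ignores the condition */
    else if (inst->src[0].is_const)
        pick = (inst->src[0].const_val == 0.0f) ? 1 : 2;   /* evaluate at compile time */
    else
        return false;

    inst->op = DEMO_OP_MOV;
    inst->src[0] = inst->src[pick];
    return true;
}
```

Once the instruction is a plain MOV, a later copy-propagation pass may be able to remove it entirely, which is the "possibly eliminate after propagation" part of the message.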
Vadim Girlin	46dfad8b36	r600g/sb: fix memory leaks Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-07 04:40:26 +04:00
Vadim Girlin	1c28e7c5a1	r600g/sb: fix kcache handling on r6xx Use the same limit for kcache constants in alu group on r6xx as on other chips (two const pairs). Relaxing this will require additional checks to make sure that all 4 consts in the group come from 2 kcache sets (clause limit), probably without noticeable improvements of shader performance. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-07 04:40:26 +04:00
Eric Anholt	03ef60681e	intel: Remove renderbuffer delete setup from texture wrapping. This is already set by intel_new_renderbuffer(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-06 14:34:27 -07:00
Eric Anholt	77a405dba7	mesa: Make Mesa core set up wrapped texture renderbuffer state. Everyone was doing effectively the same thing, except for some funky code reuse in Intel, and swrast mistakenly recomputing _BaseFormat instead of using the texture's _BaseFormat. swrast's sRGB handling is left in place, though it should be done by using _mesa_get_render_format() at render time instead (as-is, it will miss updates to GL_FRAMEBUFFER_SRGB). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-06 14:34:14 -07:00
Eric Anholt	5b190d19d3	intel: Simplify renderbuffer-for-texture width setup. We're looking for the logical width of our level, which is what image->Width2/Height2 is. The previous code relied on MSAA textures being only level 0. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-06 14:33:43 -07:00
Eric Anholt	749a92786d	mesa: Make core Mesa allocate the texture renderbuffer wrapper. Every driver did the same thing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-06 14:33:38 -07:00
Eric Anholt	5b9609f59a	i965: Use brw_blorp_blit_miptrees() for CopyTexSubImage(). Now that depth resolves are handled there, we don't need to make the temporary renderbuffer. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-06 14:33:33 -07:00
Eric Anholt	40956c5519	i965: Move blorp resolve setup into brw_blorp_blit_miptrees(). There was some comment about trying to avoid marking resolves in updownsample, but if the downsample is never actually rendered to, then the required resolve tracked in the downsample will never be executed, so who cares? Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-06 14:33:27 -07:00
Tom Stellard	730c90a70e	gallivm: Fix build for LLVM < 3.3 The C API versions of the LLVM multithreaded functions were added in LLVM 3.3.	2013-05-06 11:17:03 -07:00
Tom Stellard	bb94d4d8fe	r600g/llvm: Parse config values in register / value pairs Rather than relying on a predetermined order for the config values.	2013-05-06 10:54:52 -07:00
Tom Stellard	df27320560	r600g/llvm: Don't feed LLVM output through r600_bytecode_build() The LLVM backend emits raw ISA now, so we can just use its output unmodified.	2013-05-06 10:54:52 -07:00
Tom Stellard	e917ed96ae	r600g/llvm: Don't emit CALL_FS for vertex shaders The LLVM backend takes care of this now.	2013-05-06 10:54:52 -07:00
Matt Turner	1d09a8c3cd	i965: Lower bitfieldInsert. v2: Only lower bitfieldInsert to BFM+BFI (and don't lower bitfieldExtract at all) since three-source instructions are now usable in the vertex shader. v3: Lower bitfield_insert in the same pass with everything else, since it doesn't produce any instructions to be lowered (the other two lowering passes that were in a previous iteration of this series emitted subtractions which needed to be lowered). Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2]	2013-05-06 10:17:14 -07:00
Matt Turner	acd2bccd85	i965/vs: Add support for bit instructions. v2: Rebase on LRP addition. Use fix_3src_operand() when emitting BFE and BFI2. Add BFE and BFI2 to is_3src_inst check in brw_vec4_copy_propagation.cpp. Subtract result of FBH from 31 (unless an error) to convert MSB counts to LSB counts Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-05-06 10:17:14 -07:00
Matt Turner	1f0f26d60c	i965/fs: Add support for bit instructions. Don't bother scalarizing ir_binop_bfm, since its results are identical for all channels. v2: Subtract result of FBH from 31 (unless an error) to convert MSB counts to LSB counts. v3: Use op0->clone() in ir_triop_bfi to prevent (var_ref channel_expressions) from appearing multiple times in the IR. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2]	2013-05-06 10:17:14 -07:00
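The "subtract from 31 (unless an error)" conversion mentioned in the bit-instruction commits can be sketched in plain C. The helper names are hypothetical, and modeling the error case as -1 for a zero input is an assumption of this sketch rather than a description of the hardware encoding.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count how many bit positions lie above the highest set bit, scanning
 * from bit 31 downward. Returns -1 when no bit is set (the "error" case
 * in this sketch).
 */
static int32_t demo_count_from_msb(uint32_t v)
{
    int32_t count = 0;

    if (v == 0)
        return -1;
    while (!(v & 0x80000000u)) {
        v <<= 1;
        count++;
    }
    return count;
}

/* GLSL-style findMSB() for unsigned input: index counted from the LSB end. */
static int32_t demo_find_msb(uint32_t v)
{
    int32_t count = demo_count_from_msb(v);

    /* Pass the error value through unchanged, otherwise flip the origin. */
    return count < 0 ? -1 : 31 - count;
}
```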
Matt Turner	fa958182b7	i965: Add support for emitting and disassembling bit instructions. Specifically bfe - for bitfieldExtract() bfi1 and bfi2 - for bitfieldInsert() bfrev - for bitfieldReverse() cbit - for bitCount() fbh - for findMSB() fbl - for findLSB() Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-05-06 10:17:14 -07:00
Matt Turner	c71bee757b	i965: Print the correct dst and shared-src types for 3-src instructions. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-05-06 10:17:14 -07:00
Matt Turner	526ffdfc03	i965/gen7: Set src/dst types for 3-src instructions. Also update asserts to allow BFE and BFI2, which take (unsigned) doubleword arguments. v2: Allow BRW_REGISTER_TYPE_UD for src1 and src2 as well. Assert that src2.type (instead of src0.type) matches dest.type since it's the primary argument and src0 and src1 might correctly have different types. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v1]	2013-05-06 10:17:13 -07:00
Matt Turner	2305047823	i965: Add 3-src destination and shared-source type macros. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-05-06 10:17:13 -07:00
Matt Turner	4049d48e02	i965: Add Gen7+ fields to brw_instruction and add comments. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-05-06 10:17:13 -07:00
Matt Turner	dafd050883	glsl: Add a pass to lower bitfield-insert into bfm+bfi. i965/Gen7+ and Radeon/Evergreen+ have bfm/bfi instructions to implement bitfieldInsert() from ARB_gpu_shader5. v2: Add ir_binop_bfm and ir_triop_bfi to st_glsl_to_tgsi.cpp. Remove spurious temporary assignment and dereference. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-05-06 10:17:13 -07:00
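As a rough illustration of the bfm+bfi split, the sketch below builds a mask first and then merges the inserted bits into the base value. The function names are hypothetical, the explicit shift of the insert value is an assumption of this sketch (real hardware instructions may handle alignment differently), and the sketch restricts `bits` to 0..31 to avoid undefined shifts in C.

```c
#include <assert.h>
#include <stdint.h>

/* "Bitfield mask": `bits` ones starting at bit `offset`. */
static uint32_t demo_bfm(uint32_t bits, uint32_t offset)
{
    assert(bits < 32 && offset < 32);
    return ((1u << bits) - 1u) << offset;
}

/* "Bitfield insert": take masked bits from `insert`, the rest from `base`. */
static uint32_t demo_bfi(uint32_t mask, uint32_t insert, uint32_t base)
{
    return (insert & mask) | (base & ~mask);
}

/* bitfieldInsert(base, insert, offset, bits) expressed through the two helpers. */
static uint32_t demo_bitfield_insert(uint32_t base, uint32_t insert,
                                     uint32_t offset, uint32_t bits)
{
    uint32_t mask = demo_bfm(bits, offset);

    return demo_bfi(mask, insert << offset, base);
}
```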
Matt Turner	9c04b8c28c	glsl: Add constant evaluation of bit built-ins. v2: Order bits from LSB end (31 - count) for ir_unop_find_msb. v3: Add ir_triop_bitfield_extract as an exception to the op[0]->type == op[1]->type assertion in ir_constant_expression.cpp. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2]	2013-05-06 10:17:13 -07:00
Matt Turner	499d8c6545	glsl: Add support for new bit built-ins in ARB_gpu_shader5. v2: Move use of ir_binop_bfm and ir_triop_bfi to a later patch. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-05-06 10:17:13 -07:00
Matt Turner	44d3287ecd	glsl: Add new bit built-ins IR and prototypes from ARB_gpu_shader5. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-05-06 10:17:13 -07:00
Matt Turner	f9e37879eb	glsl: Rework ir_reader to handle expressions with four operands. Needed to support the bitfieldInsert() built-in added by ARB_gpu_shader5. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-05-06 10:17:12 -07:00
Matt Turner	f99f78e49a	mesa: Add infrastructure for ARB_gpu_shader5. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-05-06 10:17:12 -07:00
Tom Stellard	914d797797	radeon/llvm: Always build libradeonllvm as static This library is very small, so there is not much to gain from building it as a shared library. Also, when linking statically with LLVM, a shared libradeonllvm exports LLVM symbols and creates problems when used with other shared objects that also link statically to LLVM. Reviewed-by: Mathias.Froehlich@web.de	2013-05-06 09:06:10 -07:00
Tom Stellard	024fe6852a	radeon/llvm: Use LLVM C API for compiling LLVM IR to ISA v2 The LLVM C API is considered stable and should never change, so it is much more desirable to use than the LLVM C++ API, which is constantly in flux. v2: - Split target initialization and lookup into separate functions Reviewed-by: Mathias.Froehlich@web.de	2013-05-06 09:06:06 -07:00
Tom Stellard	55eb8eaaa8	gallivm: Move LLVMStartMultithreaded() static initializer into gallivm This does not solve all of the problems with using LLVM in a multithreaded environment, but it should help in some cases. Reviewed-by: Mathias.Froehlich@web.de	2013-05-06 09:06:03 -07:00
Tom Stellard	7cc98ea88f	radeon/llvm: Don't use the global context when parsing LLVM IR This leads to crashes when multiple threads try to compile compute shaders at the same time. Fixes a crash in bfgminer when using more than one thread.	2013-05-06 09:06:00 -07:00
Eric Anholt	bd850cb4f2	i965: Remove GL_ARB_color_buffer_float from GL core contexts. Of the 3 controls in the extension, one was kept in GL core and the other two were explicitly deprecated and the reasonable default behavior was encoded in the spec. By not exposing the extension, we avoid shader recompiles when switching between float and unorm color buffers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-06 09:01:51 -07:00
Tom Stellard	ec143dc0b1	r600g/llvm: Update radeon family mappings for LLVM backend New processors were added to the backend to distinguish between GPUs with and without vertex caches.	2013-05-06 08:22:24 -07:00
Chia-I Wu	5cca6b6280	android: libsync is needed on Android 4.2+ for any driver Add libsync not only for MESA_BUILD_CLASSIC, but also for MESA_BUILD_GALLIUM. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-05-06 07:20:08 -07:00
Chia-I Wu	da109d56d5	android: add ilo to the build system It can be selected with BOARD_GPU_DRIVERS := ilo Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2013-05-06 07:20:07 -07:00
Eric Anholt	739b88330c	glsl: Flip around "if" statements with empty "then" blocks. This cleans up some funny-looking code in some unigine shaders I was looking at. Also slightly helps on planeshift and a few shaders in an upcoming Valve release. total instructions in shared programs: 1653715 -> 1653587 (-0.01%) instructions in affected programs: 16550 -> 16422 (-0.77%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-05-05 13:20:42 -07:00
Chia-I Wu	008346273c	ilo: correctly set return types of sampler messages Correctly set the types of the temporaries. We do not want type conversions when moving the results to the final destinations.	2013-05-05 14:36:39 +08:00
Vincent Lejeune	b42fe195a2	r600g/llvm: Undefines unrequired texture coord values This is a port of "r600g:mask unused source components for SAMPLE" patch from Vadim Girlin.	2013-05-04 23:38:50 +02:00
Maarten Lankhorst	c4150123aa	nvc0: fixup video decoding with 2D_ARRAY Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>	2013-05-04 20:56:23 +02:00
Chia-I Wu	8c347d4e57	gallium: fix type of flags in pipe_context::flush() It should be unsigned, not enum pipe_flush_flags. Fixed a build error: src/gallium/state_trackers/egl/android/native_android.cpp:426:29: error: invalid conversion from 'int' to 'pipe_flush_flags' [-fpermissive] v2: replace all occurrences of enum pipe_flush_flags by unsigned Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> [olv: document the parameter now that the type is unsigned]	2013-05-04 17:32:10 +08:00
Eric Anholt	cbf3462c35	i965: Enable fast clears on non-8x4-aligned sizes. Improves glb2.7 performance at a misaligned size by 2.3% +/- 0.7% (n=11). The workaround was to avoid bad primitive/surface sizes, but that's worked around as of `a14dc4f92c`. (One might note that pre-gen7 we don't know that the right half of an 8x4 at the right edge is actually our pixels, but we're already clobbering those pixels for depth resolves anyway and more work would be required to avoid that). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-05-03 20:59:51 -07:00
Brian Paul	76084907fb	vbo: add comments, const qualifiers Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-05-03 19:00:07 -06:00
Brian Paul	0baf32508a	mesa: whitespace, formatting fixes, etc in api_arrayelt.c Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-05-03 19:00:07 -06:00
Brian Paul	7c9e5afe81	vbo: use new no-op ArrayElement in _mesa_noop_vtxfmt_init() As we do for the other commands which can appear between glBegin/End. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-05-03 19:00:07 -06:00
Brian Paul	7b762305d5	mesa: change ctx->Driver.NeedFlush to GLbitfield and update comment Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-05-03 19:00:07 -06:00
Brian Paul	36c83ccca0	mesa: change ctx->Driver.SaveNeedFlush to boolean, and document it. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-05-03 19:00:07 -06:00
Brian Paul	af30987a69	vbo: update comments for vbo_save_NotifyBegin() Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-05-03 19:00:07 -06:00
Brian Paul	4ea05bcba6	vbo: implement primitive merging for glBegin/End sequences A surprising number of apps and benchmarks have poor code like this: glBegin(GL_LINE_STRIP); glVertex(v1); glVertex(v2); glEnd(); // Possibly some no-op state changes here glBegin(GL_LINE_STRIP); glVertex(v3); glVertex(v4); glEnd(); // repeat many, many times. The above sequence can be converted into: glBegin(GL_LINES); glVertex(v1); glVertex(v2); glVertex(v3); glVertex(v4); glEnd(); Similarly for GL_POINTS, GL_TRIANGLES, etc. Merging was already implemented for GL_QUADS in the display list code. Now other prim types are handled and it's also done for immediate mode. In one case: before after ----------------------------------------------- number of st_draw_vbo() calls: 141 45 number of _mesa_prims issued: 7520 632 Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-05-03 19:00:07 -06:00
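The merging idea in the commit above can be sketched on a simple array of primitive records. The types and function names are hypothetical and far simpler than the vbo module's data structures; the sketch only handles the case from the message (two-vertex line strips becoming GL_LINES) plus merging of adjacent independent primitives whose vertex ranges are contiguous.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical primitive record for this sketch. */
enum demo_mode { DEMO_LINES, DEMO_LINE_STRIP, DEMO_TRIANGLES };

struct demo_prim {
    enum demo_mode mode;
    unsigned start;   /* first vertex */
    unsigned count;   /* number of vertices */
};

/* Two prims can be merged if they are the same independent type and contiguous. */
static bool demo_can_merge(const struct demo_prim *p0, const struct demo_prim *p1)
{
    if (p0->mode != p1->mode)
        return false;
    if (p0->start + p0->count != p1->start)
        return false;
    return p0->mode == DEMO_LINES || p0->mode == DEMO_TRIANGLES;
}

/* Merge in place; returns the new number of prims. */
static unsigned demo_merge_prims(struct demo_prim *prims, unsigned n)
{
    unsigned out = 0, i;

    assert(prims || n == 0);
    for (i = 0; i < n; i++) {
        struct demo_prim p = prims[i];

        /* A line strip with exactly two vertices draws the same thing as GL_LINES. */
        if (p.mode == DEMO_LINE_STRIP && p.count == 2)
            p.mode = DEMO_LINES;

        if (out > 0 && demo_can_merge(&prims[out - 1], &p))
            prims[out - 1].count += p.count;
        else
            prims[out++] = p;
    }
    return out;
}
```

Strips with more than two vertices are left alone here, since appending to them would change which vertices get connected.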
Brian Paul	3702d25082	vbo: create a few utility functions for merging primitives To be used by following commit. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-05-03 19:00:07 -06:00
Zack Rusin	a232afdbfb	draw/pt: adjust overflow calculations gallium lies. buffer_size is not actually buffer_size but available size, which is 'buffer_size - buffer_offset' so by adding buffer offset we'd incorrectly compute overflow. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-05-03 07:07:33 -04:00
Zack Rusin	8490d21cbe	tgsi/ureg: make the dst register match the src indirection In ureg src registers could have an indirect register that was either a temp or an addr register, while dst registers allowed only addr. That made moving between them a little difficult so make them behave the same way and allow temp's and addr registers as indirect files for both (tgsi supports it, just ureg didn't). Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-05-03 07:07:33 -04:00
Roland Scheidegger	23025ed15d	gallium: tgsi documentation updates and clarification for integer opcodes. A lot of them were missing. Others were moved from the Compute ISA to a new Integer ISA section as that seemed more appropriate. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-03 21:36:28 +02:00
Roland Scheidegger	ae507b6260	llvmpipe: get rid of depth swizzling. Eliminating this we no longer need to copy between linear and swizzled layout. This is probably not quite ideal since it's a bit more work for now, could do some optimizations by moving depth testing outside the fragment shader loop (but tricky for early depth test as we have neither the mask nor the interpolated z in the right order handy). The large amount of tile/untile code that is no longer needed will be deleted in the next commit. No piglit regressions. v2: change a forgotten LAYOUT_NONE to LAYOUT_LINEAR. v3: fix (bogus) uninitialized variable warnings, add comments, fix a bad type Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-03 21:36:20 +02:00
Lauri Kasanen	e495d88453	r600g: Correctly initialize the shader key, v2 Assigning a struct only copies the members - any padding is left as is. Thus this code: struct foo_t foo; foo = bar; leaves the padding of foo intact, ie uninitialized random garbage. This patch fixes constant shader recompiles by initializing the struct to zero. For completeness, memcpy is used to copy the key to the shader struct. NOTE: This is a candidate for the stable branches. Signed-off-by: Lauri Kasanen <cand@gmx.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-05-03 19:28:57 +02:00
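The padding problem described in the commit above can be illustrated with a small sketch. The struct `demo_key` and its helpers are hypothetical and are not the r600g shader key; the approach (zero the whole struct, then copy and compare byte-wise) mirrors what the message describes and relies on compilers leaving zeroed padding untouched in practice.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical key type. On common ABIs there are padding bytes
 * between `flag` and `id`.
 */
struct demo_key {
    char flag;
    int  id;
};

/* Build a key whose padding bytes start out as zero. */
static void demo_key_init(struct demo_key *key, char flag, int id)
{
    assert(key);
    memset(key, 0, sizeof(*key));   /* clears members and padding */
    key->flag = flag;
    key->id = id;
}

/* Copy every byte, padding included, instead of using struct assignment. */
static void demo_key_copy(struct demo_key *dst, const struct demo_key *src)
{
    memcpy(dst, src, sizeof(*dst));
}

/* Byte-wise comparison, as a cache lookup on the raw key might do. */
static int demo_key_equal(const struct demo_key *a, const struct demo_key *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

If the padding were left uninitialized, two keys with identical members could compare as different, and every lookup would look like a cache miss.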
Lauri Kasanen	5ff81cfd86	st/xvmc/tests: Fix build failure, v2 v2: Removed extra libs as requested by Matt Turner. Signed-off-by: Lauri Kasanen <cand@gmx.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-05-03 19:14:54 +02:00
Andreas Boll	e62be5de53	scons: remove nouveau build One build system for linux/unix only drivers should be enough. Additionally the nouveau target was disabled anyway. Acked-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-03 18:44:57 +02:00
Andreas Boll	4ca44f2c5e	scons: remove radeon build One build system for linux/unix only drivers should be enough. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48694 Acked-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-03 18:44:43 +02:00
Alex Deucher	4539f8e20a	r600g: don't emit surface_sync after FLUSH_AND_INV_EVENT It shouldn't be needed since the FLUSH_AND_INV_EVENT has already made sure the destination caches are flushed. Additionally, we didn't previously emit the surface_sync until this commit: http://cgit.freedesktop.org/mesa/mesa/commit/?id=e5e4c07e7964a3258ed02b530bcdc24c0650204b Emitting them together causes hangs in compute on cayman/TN and hangs in Heaven on evergreen. Note: this patch is a candidate for the 9.1 branch, but requires: http://cgit.freedesktop.org/mesa/mesa/commit/?id=156bcca62c9f4e79e78929f72bc085757f36a65a as well. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-05-03 10:55:05 -04:00
Vadim Girlin	41005d7bd2	r600g/sb: zero-initialize bytecode structs Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-03 16:53:42 +04:00
Vadim Girlin	f92bd0958e	r600g/sb: fix constant propagation in gvn pass Fixes the bug that prevented propagation of literals in some cases. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-03 16:53:42 +04:00
Vadim Girlin	3c201a22ca	r600g/sb: don't run unnecessary passes Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-03 16:53:42 +04:00
Vadim Girlin	48ba5712f5	r600g/sb: silence warnings with gcc 4.8 Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-03 16:53:42 +04:00
Vadim Girlin	c49b6d7f27	r600g/sb: fix handling of interference sets in post_scheduler post_scheduler clears interference set for reallocatable values when the value becomes live first time, and then updates it to take into account modified order of operations, but this was not handled properly if the value appears first time as a source in copy operation. Fixes issues with webgl demo: http://madebyevan.com/webgl-water/ Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-03 16:53:42 +04:00
Vadim Girlin	e16ef1f454	r600g/sb: fix allocation of indirectly addressed input arrays Some inputs may be preloaded into predefined GPRs, so we can't reallocate arrays with such inputs. Fixes issues with webgl demo: http://oos.moxiecode.com/js_webgl/snake/ Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-03 16:53:41 +04:00
Vadim Girlin	a6fe055fa7	r600g/sb: use hex instead of binary constants This should fix build issues with GCC < 4.3 Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-03 16:53:41 +04:00
Vadim Girlin	4ca67dbf0c	r600g: use old shader disassembler by default New disassembler is not completely isolated yet from further processing in r600g/sb that is not required for printing the dump, so it has a higher probability of failing in case of any unexpected features in the bytecode. This patch adds "sbdisasm" flag for R600_DEBUG that allows using the new disassembler in r600g/sb for shader dumps when shader optimization is not enabled. If shader optimization is enabled, new disassembler is used by default. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-05-03 16:53:41 +04:00
Christian König	b4b3041132	radeon/uvd: enable interlaced buffers by default Kills tiling on UVD buffers, but we currently don't really need that. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-05-03 11:00:21 +02:00
Christian König	85b0880a17	vl/idct: fix for commit `7d2f2a0c89` We still need the option for handling 3D textures as well. Should fix: https://bugs.freedesktop.org/show_bug.cgi?id=64143 Signed-off-by: Christian König <christian.koenig@amd.com>	2013-05-03 11:00:21 +02:00
Christian König	379753869d	vl/buffers: fix typo in function name Signed-off-by: Christian König <christian.koenig@amd.com>	2013-05-03 11:00:20 +02:00
Christian König	9c353ea293	radeon/uvd: fix some MPEG4 artifacts Still not perfect, but a step in the right direction. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-05-03 11:00:20 +02:00
José Fonseca	abbbc9b667	draw: Update for u_assembled_primitive -> u_assembled_prim rename. Mesa build is too complex to rely on successful builds. On refactorings it is always a good idea to use git grep to prevent missing cases: $ git grep u_assembled_primitive src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline_llvm.c: u_assembled_primitive(in_prim);	2013-05-03 08:35:17 +01:00
Chia-I Wu	8b2a967e32	st/egl: fix build errors on Android 4.2 The differences from the previous releases that affect st/egl are - logging macros are prefixed with an 'A' - dequeueBuffer() and enqueueBuffer() require an additional argument for fence fd, acquired from libsync Additionally, include gralloc_drm.h with extern "C".	2013-05-03 13:04:00 +08:00
Chia-I Wu	7346ab3b43	ilo: use u_reduced_prims_for_vertices() We do not need our own prim_count() anymore.	2013-05-03 11:59:10 +08:00
Chia-I Wu	f87dccdc19	util/prim: add u_reduced_prims_for_vertices() The function returns the number of reduced/tessellated primitives for the given vertex count. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Zack Rusin <zackr@vmware.com>	2013-05-03 11:59:10 +08:00
Chia-I Wu	90d5190594	util/prim: assorted fixes for u_decomposed_prims_for_vertices() Switch to '>=' for comparisons, and it becomes obvious that the comparison for PIPE_PRIM_QUAD_STRIP was wrong. Add minimum vertex count check for PIPE_PRIM_LINE_LOOP. Return 1 for PIPE_PRIM_POLYGON with 3 vertices. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Zack Rusin <zackr@vmware.com>	2013-05-03 11:59:10 +08:00
Chia-I Wu	30671cecc0	util/prim: use vertex count info in u_validate_pipe_prim() As a side effect, primitives with adjacency are now correctly validated. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Zack Rusin <zackr@vmware.com>	2013-05-03 11:59:10 +08:00
Chia-I Wu	ddf0e3930f	util/prim: fix the name of the include guard It should be U_PRIM_H, not U_BLIT_H. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Zack Rusin <zackr@vmware.com>	2013-05-03 11:59:10 +08:00
Chia-I Wu	5dd3bd70a1	draw: use u_assembled_prim() instead of u_assembled_primitive() The latter function is also removed as a result of the change. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Zack Rusin <zackr@vmware.com>	2013-05-03 11:59:10 +08:00
Chia-I Wu	185692e72c	util/prim: clean up and add comments Move together (or add) functions to decompose/reduce/assemble a primitive, give them consistent names, and document them. Add u_prim_vertex_count() so that the vertex count information can be used elsewhere. u_assembled_primitive() will be removed in a follow-on commit. [olv: fix a warning when -Wold-style-declaration is enabled] Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Zack Rusin <zackr@vmware.com>	2013-05-03 11:58:57 +08:00
Chia-I Wu	64913002e4	util/prim: fix primitive trimming for triangles with adjacency Fix for PIPE_PRIM_TRIANGLES_ADJACENCY and PIPE_PRIM_TRIANGLE_STRIP_ADJACENCY. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Zack Rusin <zackr@vmware.com>	2013-05-03 11:39:12 +08:00
Eric Anholt	573d8813fd	i965/vs: Add instruction scheduling. While this is ignorant of dependency control, it's still good for a 0.39% +/- 0.08% performance improvement on GLBenchmark 2.7 (n=548) v2: Rewrite as a subclass of the base class for the FS instruction scheduler, inheriting the same latency information. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-05-02 15:54:47 -07:00
Eric Anholt	3b00a6acac	i965: Move most of the FS instruction scheduler code to a general class. About half of this is shareable with the VS code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-05-02 15:54:43 -07:00
Eric Anholt	ce22dd75b7	i965: Pull a couple of FS scheduling functions out to methods. These will get virtualized as we add VS scheduling support. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-05-02 15:54:39 -07:00
Eric Anholt	ee0223ba2a	i965: Move FS instruction scheduling to a non-FS-specific file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-05-02 15:54:35 -07:00
Eric Anholt	ab04f3b2d7	i965: Share the register file enum between the two backends. I need this so I can look at vec4 and fs registers' files from the same .cpp file without namespaces. As far as I can tell we never rely on the particular numerical values of the files, though I thought it sounded like a good idea when doing the VS (it turns out having 0 be BAD_FILE is nicer). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-05-02 15:54:31 -07:00
Eric Anholt	63c8155b09	i965: Make dump_instructions be a virtual method of the visitor. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-05-02 15:54:26 -07:00
Eric Anholt	74e670d0a3	i965/vs: Do round-robin register allocation on gen6+ like we do in the FS. This will free instruction scheduling to make better choices. No statistically significant performance difference on GLB2.7 (n=93). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-05-02 15:54:09 -07:00
Rob Bradford	15e64de9e6	wayland: Make eglQueryBufferWL succeed for width and height requests too Following the addition of the EGL_WIDTH and EGL_HEIGHT this function should return EGL_TRUE for those requested attributes too.	2013-05-02 16:46:04 -04:00
Zack Rusin	396b861ceb	draw/gs: don't crash when vs/gs signatures don't match instead of crashing just fill zeros at the input slots that don't match, that's the mandated behavior and it avoids debug asserts. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-05-02 02:43:42 -04:00
Zack Rusin	999cd79c9e	tgsi: allow negation of all integer types It's valid because we reuse certain arithmetic operations for both signed and unsigned types (e.g. uadd, umad, which have a bit unfortunate naming) Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-05-02 02:43:42 -04:00
Eric Anholt	1dfea559c3	i965: Fix SNB GPU hangs when a blorp batch is the first thing to execute. The GPU apparently goes looking for constants even though there are no shader stages enabled, and gets stuck because we haven't told it there are no constants to collect. If any other user of the 3D pipeline had run (even the Render accel of the X server!) since power on, then the in-GPU constant buffers would have been set up with some contents we didn't use, and we would succeed. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56416 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Dave Airlie <airlied@redhat.com> NOTE: This is a candidate for the stable branches.	2013-05-02 11:27:37 -07:00
Tom Stellard	156bcca62c	r600g: Don't set the dest cache bits on surface sync for R600_CONTEXT_FLUSH_AND_INV We are already emitting a EVENT_TYPE_CACHE_FLUSH_AND_INV_EVENT packet when this flush flag is set, so flushing the dest caches with a SURFACE_SYNC should not be necessary. The motivation for this change is that emitting a SURFACE_SYNC packet with the CB bits set was causing compute shaders to hang on Cayman. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-05-02 09:00:37 -07:00
Tom Stellard	5752be0cb7	r600g/compute: Fix build error in debug code Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-05-02 09:00:37 -07:00
Armin K	cd84353d57	radeon: Fix build with LLVM 3.3 Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-05-02 09:00:37 -07:00
Armin K	4742f9b00b	gallivm: Fix build with LLVM 3.3 Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-05-02 09:00:37 -07:00
Brian Paul	fcfbf4a19f	mesa: update comments, simplify code in vtxfmt.c Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:16 -06:00
Brian Paul	5dc0081ade	mesa: update GLvertexformat comments Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:16 -06:00
Brian Paul	200e09e393	mesa: remove GLvertexformat::EvalMesh1(), EvalMesh2() See previous commit comments. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:16 -06:00
Brian Paul	0f365b2d77	mesa: remove GLvertexformat::Rectf() As with the glDraw* functions, this doesn't have to be in GLvertexformat. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:16 -06:00
Brian Paul	49993a1a9d	mesa: simplify dispatch for glDraw* functions Remove all the glDraw* functions from the GLvertexformat structure. The point of that dispatch struct is to handle all the functions which dispatch differently depending on whether we're inside glBegin/End. glDraw* are never allowed inside glBegin/End so we can remove those entries. This simplifies the code paths and gets rid of quite a bit of code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:16 -06:00
Brian Paul	79679e258b	vbo: add new vbo_initialize_exec_dispatch(), vbo_initialize_save_dispatch() First step in simplifying the vertex array / glDraw dispatch code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:15 -06:00
Brian Paul	d0102500bd	mesa: remove _MESA_INIT_EVAL_VTXFMT() macro Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:15 -06:00
Brian Paul	43b3d3bc25	mesa: remove _MESA_INIT_ARRAYELT_VTXFMT() macro Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:15 -06:00
Brian Paul	95188fd10f	mesa: remove _MESA_INIT_DLIST_VTXFMT() macro Just expand the code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:15 -06:00
Brian Paul	84e62b7358	mesa: change _mesa_inside_dlist_begin_end() to handle PRIM_UNKNOWN If the currently compiled primitive state is PRIM_UNKNOWN we should not return true from _mesa_inside_dlist_begin_end(). This lets us simplify the calls to that function. Note, the call to _mesa_inside_dlist_begin_end() in vbo_save_EndList() should have probably been checking for PRIM_UNKNOWN too, but it wasn't. So there's no code change. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:15 -06:00
Brian Paul	daf19f28c6	mesa: add names of geometry shader prims in gl_enums.py Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:15 -06:00
Brian Paul	5472ae1fa9	vbo: fix initial value of ctx->Driver.CurrentSavePrimitive This is set during context creation/initialization. We know we're not inside glBegin/glEnd at this point so use PRIM_OUTSIDE_BEGIN_END. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:15 -06:00
Brian Paul	ecea61e414	vbo: fix error detection in vbo_save_playback_vertex_list() The old code didn't make sense. The clause in question did the same thing as the next else-if clause. If we're already executing a glBegin/End pair and we're starting a new primitive, that's an error. Fixes more failures in piglit gl-1.0-beginend-coverage test. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:15 -06:00
Brian Paul	a07437dc28	mesa: comments, formatting fixes in dlist code Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:15 -06:00
Brian Paul	e880b7cbf8	vbo: remove redundant vfmt->Begin = _save_Begin assignment The same assignment appears later in the function. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:15 -06:00
Brian Paul	3e7c16997a	mesa: don't install glDraw* functions into the BeginEnd dispatch table Functions like glDrawArrays, glDrawElements, etc. are illegal between glBegin/glEnd and should generate GL_INVALID_OPERATION. Fixes several piglit gl-1.0-beginend-coverage failures. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:15 -06:00
Brian Paul	d6f3ef92d7	vbo: fix parameter validation for saving dlist glDraw* functions The _save_OBE_DrawArrays/Elements/RangeElements() functions are called when building a display list and we know we're outside glBegin/End. We shouldn't call the normal _mesa_validate_DrawArrays/Elements() functions here because those functions only work properly in immediate mode or during dlist execution. At dlist compile time, we can't call _mesa_update_state(), etc. and examine the current state since it won't apply when the list is executed later. Fixes several failures in piglit's gl-1.0-beginend-coverage test. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:15 -06:00
Brian Paul	94c7caf406	mesa: add missing error check in _mesa_EndList() If we're in GL_COMPILE_AND_EXECUTE mode and inside glBegin, calling glEndList() should generate an error. Fixes a failure in piglit's gl-1.0-beginend-coverage test. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:15 -06:00
Brian Paul	c1a5c5c13d	mesa: remove unused PRIM_INSIDE_UNKNOWN_PRIM constant Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:15 -06:00
Brian Paul	d5bdce1142	mesa: simplify save_Begin() error checking The old code was hard to understand and not entirely correct. Note that PRIM_INSIDE_UNKNOWN_PRIM is no longer set anywhere so we'll be able to remove that next. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:15 -06:00
Brian Paul	bb459f6295	mesa: refactor _mesa_valid_prim_mode() ...in terms of new _mesa_is_valid_prim_mode(). We need a mode validator function that doesn't depend on current state for the display list code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:14 -06:00
Brian Paul	8be093e2f6	mesa: fix CurrentSavePrimitive <= GL_POLYGON tests Use the new PRIM_MAX value instead so that new geometry shader primitive types are accounted for. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:14 -06:00
Brian Paul	cce6e30613	mesa: adjust PRIM_x constants for geometry shaders These values pertain to display lists, and the new types of geometry shader primitives can be used in display lists. And add new PRIM_MAX constant for follow-on changes. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:14 -06:00
Brian Paul	aa782f260d	mesa: fix save_ShadeModel() logic and add new comments This removes the test for _mesa_inside_dlist_begin_end(). If ctx->Driver.CurrentSavePrimitive==PRIM_UNKNOWN (the initial value), _mesa_inside_dlist_begin_end() will, confusingly, return TRUE. So we didn't set the ctx->ListState.Current.ShadeModel value and it remained in its indeterminate state. This didn't affect correctness, but it defeated the intended optimization of dropping redundant glShadeModel() state changes in order to coalesce sequences of drawing commands. Verified with new piglit gl-1.0-dlist-shademodel test. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-05-02 09:03:14 -06:00
Adam Jackson	16296cc843	gallivm: Fix altivec intrinsics for 8xi16 add/sub Signed-off-by: Adam Jackson <ajax@redhat.com>	2013-05-02 10:34:08 -04:00
Lauri Kasanen	35c5b95b94	r600/sb: Fix build failure with non-standard libdrm installation prefix Just like radeon/uvd, r600/sb fails to find the libdrm includes. Signed-off-by: Lauri Kasanen <cand@gmx.com>	2013-05-02 14:57:00 +02:00
Lauri Kasanen	e2b985dc0f	radeon/uvd: Fix build failure with non-standard libdrm installation prefix Without this patch, radeon_uvd failed to find the libdrm includes: In file included from radeon_uvd.c:48: ../../winsys/radeon/drm/radeon_winsys.h:44:35: error: libdrm/radeon_surface.h: No such file or directory Signed-off-by: Lauri Kasanen <cand@gmx.com>	2013-05-02 14:54:03 +02:00
Jordan Justen	02f2bce08d	mesa: implement glFramebufferTexture Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-01 16:18:25 -07:00
Jordan Justen	5da8288911	mesa: add Layered field to framebuffers When checking framebuffer completeness, we test each attachment. We verify that all attachments are consistent in terms of layers. 1. They must all be layered, or all non-layered 2. If they are layered, they must match in depth Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-01 15:31:48 -07:00
Jordan Justen	a62808085a	mesa: add renderbuffer attachment Layered field If glFramebufferTexture is used, then the framebuffer attachment is layered. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-01 15:31:44 -07:00
Jordan Justen	a05e201d4a	mesa: add renderbuffer Depth field With glFramebufferTexture, a renderbuffer may support all layers of the texture, so we need the depth of the renderbuffer to check for consistency which is required for framebuffer completeness. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-01 15:30:48 -07:00
Andreas Boll	b8e41db053	mesa: add usage examples to get-pick-list and shortlog scripts NOTE: This is a candidate for the stable branches.	2013-05-01 21:42:02 +02:00
Andreas Boll	df01201132	docs: add info about bugzilla_mesa.sh script	2013-05-01 21:42:02 +02:00
Andreas Boll	ca79b72c00	mesa: Add a script to generate the list of fixed bugs This list appears in the fixed bugs section of the release notes. v2: Add usage examples NOTE: This is a candidate for the stable branches.	2013-05-01 21:42:02 +02:00
Andreas Boll	f6aab27d43	scons: remove IN_DRI_DRIVER Not used anymore.	2013-05-01 21:34:48 +02:00
Andreas Boll	be0fec4f5b	build: remove unused API_DEFINES Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-05-01 21:34:48 +02:00
Brian Paul	7f8434b866	configure: remove IN_DRI_DRIVER Not used anymore. v2: Andreas Boll <andreas.boll.dev@gmail.com> - split patch into two patches - remove more unused code Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-05-01 21:34:48 +02:00
Brian Paul	4ede5fb0c6	configure: remove FEATURE_GL/ES1/ES2 Not used anymore. v2: Andreas Boll <andreas.boll.dev@gmail.com> - split patch into two patches Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-05-01 21:34:48 +02:00
Andreas Boll	6b8f55c4da	intel: use automake conditionals for defining FEATURE_{ES1,ES2} Removes the need of API_DEFINES. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-05-01 21:34:48 +02:00
Andreas Boll	afa33a001a	egl-static: use automake conditionals for defining FEATURE_{GL,ES1,ES2} Removes the need of API_DEFINES. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-05-01 21:34:48 +02:00
Andreas Boll	3537d853d0	intel: remove executable bit from C file Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-05-01 21:34:48 +02:00
Brian Paul	aaab450d22	docs: s/Aprile/April/	2013-05-01 13:17:21 -06:00
Andreas Boll	85e5bc106c	docs: fix 9.1.2 release notes	2013-05-01 21:01:48 +02:00
Marek Olšák	8eef6ad2e2	vbo: fix possible use-after-free segfault after a VAO is deleted This is like the fifth attempt to fix the issue. Also with the new "validating" flag, we can set recalculate_inputs to FALSE earlier in vbo_bind_arrays, because _mesa_update_state won't change it. NOTE: This is a candidate for the stable branches. v2: fixed a typo Reviewed-by: Brian Paul <brianp@vmware.com>	2013-05-01 20:08:53 +02:00
Kenneth Graunke	b5b6460c40	i965/vs: Fix textureGrad() with shadow samplers on Haswell. The shadow comparator needs to be loaded into the Z component of the last DWord. Fixes es3conform's shadow_execution_vert and oglconform's shadow-grad advanced.textureGrad.1D tests on Haswell. NOTE: This is a candidate for stable branches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-01 10:42:51 -07:00
Kenneth Graunke	e2f887b243	i965: Lower textureGrad() for samplerCubeShadow. According to the Ivybridge PRM, Volume 4 Part 1, page 130, in the section for the sample_d message: "The r coordinate contains the faceid, and the r gradients are ignored by hardware." This doesn't match GLSL, which provides gradients for all of the coordinates. So we would need to do some math to compute the face ID before using sample_d. We currently don't have any code to do that. However, we do have a lowering pass that converts textureGrad to textureLod, which solves this problem. Since textureGrad on three components is sufficiently obscure, it's not a performance path. For now, only handle samplerCubeShadow; we need tests for samplerCube and samplerCubeArray. Fixes es3conform's shadow_comparison_frag test on Haswell. NOTE: This is a candidate for stable branches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-05-01 10:42:51 -07:00
Christian König	163b4da874	radeon/uvd: fix quant scan order for mpeg2 Signed-off-by: Christian König <christian.koenig@amd.com>	2013-05-01 13:33:46 +02:00
Christian König	3aafe2437d	st/vdpau: fix background handling in the mixer Signed-off-by: Christian König <christian.koenig@amd.com>	2013-05-01 13:33:46 +02:00
Christian König	7d2f2a0c89	vl/buffer: use 2D_ARRAY instead of 3D textures Signed-off-by: Christian König <christian.koenig@amd.com>	2013-05-01 13:33:46 +02:00
Christian König	e27f87b549	vl/compositor: cleanup background clearing Add an extra parameter to specify if we should clear the render target. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-05-01 13:33:46 +02:00
Brian Paul	236ea7900f	swrast: add casts for ImageSlices pointer arithmetic MSVC doesn't like pointer arithmetic with void * so use GLubyte *. Reviewed-by: Jose Fonseca<jfonseca@vmware.com>	2013-05-01 11:53:02 +01:00
Chia-I Wu	22c5e048bd	ilo: fix PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS On GEN7+, is->dev.has_gen7_sol_reset is required.	2013-05-01 17:41:39 +08:00
Chia-I Wu	16f81fcf1e	ilo: enable SO support on GEN7	2013-05-01 17:36:44 +08:00
Chia-I Wu	d26f70e208	ilo: reset SO write offsets for new SO targets When the SO targets are changed and no appending is requested, we need to send SOL_RESET on GEN7+.	2013-05-01 17:36:44 +08:00
Chia-I Wu	68e1f76e46	ilo: correctly program SO states for GEN7 With the commands supported by GPE, we can finally program the states.	2013-05-01 17:36:44 +08:00
Chia-I Wu	9557cd39e2	ilo: implement GEN7 SO GPE functions They were just stubs before.	2013-05-01 17:36:09 +08:00
Chia-I Wu	9069a3b065	ilo: add gen6_pipeline_update_max_svbi() Move max_svbi calculation to a helper function and make it available for other GENs.	2013-05-01 17:35:43 +08:00
Chia-I Wu	252a21c2cc	ilo: expose register indices of OUTs in ilo_shader pipe_stream_output_info tells us which of OUT[i] needs to be written out. We need the info to map OUT[i] to VUE offset.	2013-05-01 17:34:49 +08:00
Chia-I Wu	440557db4e	ilo: allow one-off flags to be specified for CP It will be used for SOL_RESET on GEN7.	2013-05-01 16:03:44 +08:00
Chia-I Wu	dd62e7bc02	ilo: fix tiling/size for special-purpose resources We do not allocate such resources yet though.	2013-05-01 12:00:32 +08:00
Chia-I Wu	7726e9500c	ilo: use UMS layout for render targets As we do not advertise MSAA support, this change should not make any difference yet.	2013-05-01 11:56:43 +08:00
Chia-I Wu	334abed828	ilo: support and prefer compact array spacing There is no reason to waste the memory when the HW can support compact array spacing (ARYSPC_LOD0).	2013-05-01 11:31:15 +08:00
Chia-I Wu	ce188bb252	ilo: move device limits to ilo_dev_info or to GPEs It seems a bit weird to have device limits in a context.	2013-05-01 11:23:11 +08:00
Chia-I Wu	bef98f9c3a	ilo: use ilo_dev_info in toy compiler We need only dev->gen, but it makes sense to expose other information to the compiler.	2013-05-01 11:22:57 +08:00
Chia-I Wu	51d749e7e2	ilo: use ilo_dev_info in GPE and 3D pipeline We need only dev->gen and dev->gt, but it makes sense to expose other information to the pipeline.	2013-05-01 11:22:20 +08:00
Chia-I Wu	bb1f635dcc	ilo: add ilo_dev_info shared by the screen and contexts The struct is used to describe the device information, such as PCI ID, GEN, GT, etc.	2013-05-01 11:20:41 +08:00
Chia-I Wu	355f3f7ab5	ilo: fix indentation of ilo_gpe_gen*.h	2013-05-01 11:20:32 +08:00
Kenneth Graunke	6c5cf8baa1	glsl: Ignore redundant prototypes after a function's been defined. Consider the following shader: vec4 f(vec4 v) { return v; } vec4 f(vec4 v); The prototype exactly matches the signature of the earlier definition, so there's absolutely no point in it. However, it doesn't appear to be illegal. The GLSL 4.30 specification offers two relevant quotes: "If a function name is declared twice with the same parameter types, then the return types and all qualifiers must also match, and it is the same function being declared." "User-defined functions can have multiple declarations, but only one definition." In this case the same function was declared twice, and there's only one definition, which fits both pieces of text. There doesn't appear to be any text saying late prototypes are illegal, so presumably it's valid. Unfortunately, it currently triggers an assertion failure: ir_dereference_variable @ <p1> specifies undeclared variable `v' @ <p2> When we process the second line, we look for an existing exact match so we can enforce the one-definition rule. We then leave sig set to that existing function, and hit sig->replace_parameters(&hir_parameters), unfortunately nuking our existing definition's parameters (which have actual dereferences) with the prototype's bogus unused parameters. Simply bailing out and ignoring such late prototypes is the safest thing to do. Fixes Piglit's late-proto.vert as well as 3DMark/Ice Storm for Android. NOTE: This is a candidate for stable branches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Ian Romanick <idr@freedesktop.org>	2013-04-30 16:43:42 -07:00
Ian Romanick	abfe486b9e	docs: Import 9.1.2 release notes, add news item. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-04-30 15:33:29 -07:00
Matt Turner	1b6281443d	build: Remove libws_xlib.la from GALLIUM_PIPE_LOADER_LIBS. The three users of GALLIUM_PIPE_LOADER_LIBS (OpenCL, gallium-gbm, gallium tests) don't appear to need libws_xlib.la. Tested-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 14:03:32 -07:00
Matt Turner	460996b937	build: Remove libpipe_loader.la from GALLIUM_PIPE_LOADER_LIBS. Tested-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 14:03:32 -07:00
Matt Turner	538e10f3ea	build: Remove HAVE_PIPE_LOADER_SW. It guarded the function prototype of pipe_loader_sw_probe, whose use (in pipe_loader.c) and definition (in pipe_loader_sw.c) were not guarded. Both are built into libpipe_loader.la if HAVE_LOADER_GALLIUM, which is enable_gallium_loader in configure.ac. Tested-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 14:03:32 -07:00
Matt Turner	ea6caf4cdf	build: Remove libws_null.la from GALLIUM_PIPE_LOADER_LIBS. Tested-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 14:03:32 -07:00
Matt Turner	242809942f	build: Rename PIPE_LOADER_HAVE_XCB to HAVE_PIPE_LOADER_XCB. For consistency, since we already have HAVE_PIPE_LOADER_{SW,DRM}. Tested-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 14:03:32 -07:00
Matt Turner	657cfe6252	configure.ac: Remove unused HAVE_PIPE_LOADER_XLIB macro. Added in `e1364530` but never used. Tested-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 14:03:31 -07:00
Paul Berry	bdf13dc832	i965: Stop passing num_samples to intel_miptree_alloc_hiz(). The number of samples is already available in the miptree data structure, so there's no need to pass it in. I suspect this may fix a subtle bug because in one case (intel_renderbuffer_update_wrapper) we were always passing zero for num_samples, even though the buffer in question was not guaranteed to be single-sampled. But I wasn't able to find a failing test case. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-30 13:46:57 -07:00
Zack Rusin	d48054ff22	draw: don't crash if GS doesn't emit anything Technically it's legal for a geometry shader to not emit any vertices. It's silly, but perfectly legal, so let's make draw stop crashing if it happens. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-27 17:28:04 -04:00
Eric Anholt	e56095dc2e	i965: Implement color clears using a simple shader in blorp. The upside is less CPU overhead in fiddling with GL error handling, the ability to use the constant color write message in most cases, and no GLSL clear shaders appearing in MESA_GLSL=dump output. The downside is more batch flushing and a total recompute of GL state at the end of blorp. However, if we're ever going to use the fast color clear feature of CMS surfaces, we'll need this anyway since it requires very special state setup. This increases the fail rate of some of the GLES3conform ARB_sync tests, because of the initial flush at the start of blorp. The tests already intermittently failed (because it's just a bad testing procedure), and we can return it to its previous fail rate by fixing the initial flush. Improves GLB2.7 performance 0.37% +/- 0.11% (n=71/70, outlier removed). v2: Rename the key member, use the core helper for sRGB, and use BRW_MASK_* enums, fix comment and indentation (review by Paul). v3: Rewrite a comment, drop a silly temporary variable (review by Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-30 11:59:23 -07:00
Eric Anholt	e34c857639	mesa: Make a Mesa core function for sRGB render encoding handling. v2: const-qualify ctx, and add a comment about the function (recommended by Brian and Kenneth). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2013-04-30 11:59:23 -07:00
Eric Anholt	db31bc5cfb	i965: Don't flush the batch at the end of blorp. Improves GLB2.7 performance 0.13% +/- 0.09% (n=104/105, outliers removed). More importantly, once color glClear()s are done through blorp in the next commit, this reduces regression in GLES3 conformance tests that rely on queueing up many glClear()s and having the GPU report being still busy in an ARB_sync query after that. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-30 11:59:23 -07:00
Vadim Girlin	fb1eed9ec5	r600g/sb: remove unused code Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-04-30 21:50:48 +04:00
Vadim Girlin	3f18dd818f	r600g/sb: collect shader statistics Collects various statistical information for each shader and total stats for contexts. Printed with R600_DEBUG=sb,sbstat Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-04-30 21:50:48 +04:00
Vadim Girlin	6ba7a162b6	r600g/sb: don't propagate dead values in GVN pass In some cases we use value::gvn_source field to link values that are known to be equal before gvn pass (e.g. results of DOT4 in different slots of the same alu group), but then source value may become dead later and this confuses further passes. This patch resets value::gvn_source to NULL in the dce_cleanup pass if it points to dead value. Fixes segfault during shader optimization with ETQW. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-04-30 21:50:48 +04:00
Vadim Girlin	3e476c311f	r600g/sb: use simple heuristic to limit register pressure It's not a complete register pressure tracking, yet it helps to prevent register allocation problems in some cases where they were observed. The problems are uncovered by false dependencies between fetch instructions introduced by some recent changes in TGSI and/or default backend. Sometimes we have code like this: ... SAMPLE R5.xyzw, R5.xyzw ... store R5.xyzw somewhere MOV R5.x, <next x coord> MOV R5.y, <next y coord> SAMPLE R5.xyzw, R5.xyzw ... <may be repeated a lot of times> With 2D resources, z and w in SAMPLE src reg aren't used and can be simply masked, but shader backend doesn't have this information, so it's considered as data dependency by optimization algorithms.	2013-04-30 21:50:48 +04:00
Vadim Girlin	6d6c8c88a3	r600g/sb: improve error checking in ra_coalesce pass	2013-04-30 21:50:47 +04:00
Vadim Girlin	188c893e65	r600g/sb: use source bytecode in case of optimization errors	2013-04-30 21:50:47 +04:00
Vadim Girlin	ad1df471d0	r600g: plug in optimizing backend Optimization is enabled with "R600_DEBUG=sb". Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-04-30 21:50:47 +04:00
Vadim Girlin	2cd7691793	r600g/sb: initial commit of the optimizing shader backend	2013-04-30 21:50:47 +04:00
Vadim Girlin	fbb065d629	r600g: use enum type for domains field in struct r600_resource This prevents the problems when the header is included in C++ code.	2013-04-30 21:50:47 +04:00
Vadim Girlin	d5b30fd036	r600g: add new flags to isa instruction tables	2013-04-30 21:50:47 +04:00
Vadim Girlin	a919424215	r600g: always create reverse lookup isa tables	2013-04-30 21:50:47 +04:00
Vadim Girlin	7d555f2f4c	r600g: mask unused source components for SAMPLE This results in more clean shader code and may improve the quality of optimized code produced by r600-sb due to eliminated false dependencies in some cases. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-04-30 21:50:47 +04:00
Eric Anholt	df410863d7	intel: Remove the last spans code! The remaining bits happen to do nothing that _swrast_span_render_start()/finish() don't do. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:45 -07:00
Eric Anholt	526cf46666	intel: Move the S8 offset calc function near its remaining usage. It's not really span code ever since we stopped using spans for S8. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:45 -07:00
Eric Anholt	e7c5e9949b	intel: Ensure renderbuffers are current when mapping them. In the case of rendering to windows in X, we would render to stale buffers (or not render at all!) if you hit a MapRenderbuffer as the first thing done to your window after new buffers are ready to be collected in DRI2. I think this also covers the weird comment about irb->mt being missing sometimes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:45 -07:00
Eric Anholt	0e8ef74c5f	mesa: Add a clarifying comment about rowStride of compressed textures. I always forget how we do this for compressed textures. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:45 -07:00
Eric Anholt	3750ff9e5f	mesa: Remove the Map field from texture images. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:44 -07:00
Eric Anholt	adf958d9c2	swrast: Always use MapTextureImage for mapping textures for swrast. Now that everything goes through ImageSlices[], we can rely on the driver's existing texture mapping function. A big block of code goes away on Radeon that looks like it was to deal with the validate that happened at SpanRenderStart, which no longer occurs since we don't need validation for the MapTextureImage hook. v2: Rewrite comment about ImageSlices, fix duplicated swImages, touch up unmap loop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:44 -07:00
Eric Anholt	ea05e259c9	nouveau: Replace swrast_texture_image->Map usage with ->Buffer. This code is trying to deal with providing a map in the case that AllocTexImageBuffer was called, which is hooked up to the swrast variant. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:44 -07:00
Eric Anholt	b78e48289f	nouveau: Just use MapTextureImage instead of duplicating the logic. MapTextureImage has the exact same logic, except it can also handle swrast-allocated buffers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:44 -07:00
Eric Anholt	f91823f026	swrast: Make a teximage's stored RowStride be in terms of bytes per row. For hardware drivers with pitch alignment requirements, a non-power-of-two-sized texture format won't end up being an integer number of pixels per row. Also, avoids having to change our units between MapTextureImage's rowStride and swrast's RowStride. This doesn't fully convert the compressed texel fetch path, but does make sure we don't drop any bits (not that we'd expect to). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:44 -07:00
Eric Anholt	35e179b18c	swrast: Replace use of teximage Map in 1D/2D paths with ImageSlices[0]. This gets us ready for the Map field to die. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:44 -07:00
Eric Anholt	0c883e46d8	swrast: Replace ImageOffsets with an ImageSlices pointer. This is a step toward allowing drivers to use their normal mapping paths, instead of requiring that all slice mappings come from an aligned offset from the first slice's map. This incidentally fixes missing slice handling in FXT1 swrast. v2: Use slice height helper function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:43 -07:00
Eric Anholt	e7ecc11311	swrast: Reuse _swrast_free_texture_image_buffer from drivers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:43 -07:00
Eric Anholt	0a484f1006	swrast: Move ImageOffsets allocation to shared code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:43 -07:00
Eric Anholt	f709c31c67	swrast: Clean up and explain the mapping process. v2: Move slice height calculation to a helper function (recommended by Brian). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:43 -07:00
Eric Anholt	741e540055	swrast: Factor out texture slice counting. This function is going to get used a lot more in upcoming patches. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:42 -07:00
Eric Anholt	dca4178130	radeon: Remove some dead teximage mapping code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:42 -07:00
Eric Anholt	0de08fb594	radeon: Add missing swrast field initialization. This is the equivalent of intel's `80513ec8b4`. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-30 10:40:42 -07:00
Vincent Lejeune	a6a4b70e2d	r600g/llvm: Fix opencl build	2013-04-30 16:38:47 +02:00
Alexander von Gluck IV	f1361ed084	Gallium: Use mmap on Haiku for executable memory vs malloc * Haiku now has DEP enabled by default.	2013-04-29 23:22:35 -05:00
Alexander von Gluck IV	60cc73c333	Mapi: Use mmap on Haiku for executable memory vs malloc * Haiku now has DEP enabled by default.	2013-04-29 23:22:35 -05:00
Alexander von Gluck IV	39bdf08628	Mesa: Use mmap on Haiku for executable memory vs malloc * Haiku now has DEP enabled by default.	2013-04-29 23:22:35 -05:00
Vincent Lejeune	51e9bfdc48	r600g/llvm: get use_kill from compiler shader	2013-04-30 02:17:18 +02:00
Eric Anholt	a79786af64	i965/fs: Print out the estimated cycle count in INTEL_DEBUG=wm This could be used by shader-db for hopefully more accurate regression testing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-29 11:44:35 -07:00
Eric Anholt	61ca2c4f73	i965/fs: Allow LRPs with uniform registers. Improves GLB2.7 performance on my HSW by 0.671455% +/- 0.225037% (n=62). v2: Make is_valid_3src() a method of the fs_reg. (recommended by Ken) Reviewed-by: Matt Turner <mattst88@gmail.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2013-04-29 11:41:35 -07:00
Eric Anholt	de7e8b1d01	intel: Be more conservative in disabling tiling to save memory. Improves GLB2.7 trex performance 1.01985% +/- 0.721366% on my IVB (n=10) and by 3.38771% +/- 0.584241% (n=15) on my HSW, due to a 32x32 ARGB8888 cubemap going from untiled to tiled. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-29 11:41:34 -07:00
Eric Anholt	73bc6061f5	i965: Disable Z16 on contexts that don't require it. It appears that Z16 on Intel hardware is in fact slower than Z24, so people are getting surprisingly hurt when trying to use Z16 as a performance-versus-precision tradeoff, or when they're targeting GLES2 and that's all you get. GL 3.0+ have Z16 on the list of required exact format sizes, but GLES doesn't, so choose the better-performing layout in that case. Improves GLB 2.7 trex performance at 1920x1080 by 10.7% +/- 1.1% (n=3) on my IVB system. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-29 11:41:34 -07:00
Eric Anholt	e409889213	intel: Report FBO incompleteness causes through GL_ARB_debug_output. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-29 11:41:34 -07:00
Eric Anholt	6ae473221a	intel: Fold the one last function intel_tex_format.c into the caller. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-29 11:41:34 -07:00
Eric Anholt	40b207b62f	mesa: Fix error checking for GS UBO getters. These are supposed to be present if both things are available, but we were enabling them if either one was.	2013-04-29 11:41:34 -07:00
Eric Anholt	072709da91	mesa: Add a clarifying comment about EXTRA_ error checking.	2013-04-29 11:41:34 -07:00
Eric Anholt	eac1199604	mesa: Add an extra clarifying set of braces to getter checking. For this multi-page single statement, my thought at the end was that the next block was mis-indented, rather than that the dropped indentation actually indicated the end of the loop.	2013-04-29 11:41:33 -07:00
Eric Anholt	2534f0a57d	mesa: Fix error checking for getters consisting of only API versions. In almost all of our cases, getters that are turned on for only some API variants will have an extension listed as one of the things that can enable it, and thus api_check gets set. For extra_gl30_es3 (used for NUM_EXTENSIONS, MAJOR_VERSION, MINOR_VERSION) on a GL 2.1 context, though, we would check twice, not find either one, but never actually throw the error.	2013-04-29 11:41:33 -07:00
Eric Anholt	d63a10afcc	mesa: Clarify the names of error checking variables for glGet. There's no reason to actually count these things, so the integer ++ behavior was just confusing.	2013-04-29 11:41:33 -07:00
Eric Anholt	4df1b986d3	i915: Add support for GL_EXT_texture_sRGB and GL_EXT_texture_sRGB_decode. This brings the driver up to GL 2.1.	2013-04-29 11:41:33 -07:00
Eric Anholt	97217a40f9	i915: Always enable GL 2.0 support. There's no point in shipping a non-GL2 driver today.	2013-04-29 11:41:33 -07:00
Eric Anholt	eb062ab07f	i915: Correctly set the OQ counter bits. While we may provide the extension, we need to tell applications that they can't actually use it: An implementation can either set QUERY_COUNTER_BITS_ARB to the value 0, or to some number greater than or equal to n. If an implementation returns 0 for QUERY_COUNTER_BITS_ARB, then the occlusion queries will always return that zero samples passed the occlusion test, and so an application should not use occlusion queries on that implementation.	2013-04-29 11:41:33 -07:00
Kenneth Graunke	5e46482993	i965: Move is_math/is_tex/is_control_flow() to backend_instruction. These are entirely based on the opcode, which is available in backend_instruction. It makes sense to only implement them in one place. This changes the VS implementation of is_tex() slightly, which now accepts FS_OPCODE_TXB and SHADER_OPCODE_LOD. However, since those aren't generated in the VS anyway, it should be fine. This also makes is_control_flow() available in the VS. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-29 11:10:50 -07:00
Zack Rusin	a6e7c22664	draw/so: fix overflow calculation only report overflow for missing targets if they're actually being used. if the targets are missing but are not being used by any slot in the stream output declaration we should correctly just ignore them. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-27 03:48:36 -04:00
José Fonseca	220ef8295c	llvmpipe: Fix queries when screen->num_threads == 0. That is, when llvmpipe is run in single-threaded mode. Trivial. Tested with LP_NUM_THREADS=0 glean --run results --overwrite --quick --tests occluQry	2013-04-29 15:40:06 +01:00
José Fonseca	c4bea00fb3	Revert "st/mesa: add a simple path to BufferData if it only discards buffer contents" This reverts commit `5649f886f7`. It causes segfaults when size is zero.	2013-04-29 15:13:57 +01:00
Jerome Glisse	c7a13dc5f5	r600g: force full cache for hyperz It seems that in some cases allowing half cache usage confuses the GPU and triggers a lockup. Force full cache use. Should fix: https://bugs.freedesktop.org/show_bug.cgi?id=59592 https://bugs.freedesktop.org/show_bug.cgi?id=60848 https://bugs.freedesktop.org/show_bug.cgi?id=60969 https://bugs.freedesktop.org/show_bug.cgi?id=61747 https://bugs.freedesktop.org/show_bug.cgi?id=62466 https://bugs.freedesktop.org/show_bug.cgi?id=62669 https://bugs.freedesktop.org/show_bug.cgi?id=62721 https://bugs.freedesktop.org/show_bug.cgi?id=63124 Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-04-29 10:06:29 -04:00
Rob Clark	3900a0e4df	freedreno: fix rebase screw-up Add back 2nd arg to emit_vertexbufs() which got lost in rebase. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-29 07:36:27 -04:00
Chris Forbes	79f786f936	i965/fs: Don't try to use bogus interpolation modes pre-Gen6. Interpolation modes other than perspective-barycentric-pixel-center (and their associated coefficients in the WM payload) only exist in Gen6 and later. Unfortunately, if a varying was declared as `centroid`, we would blindly read the nonexistent values, and so produce all manner of bad behavior -- texture swimming, snow, etc. Fixes rendering in Counter-Strike Source and Team Fortress 2 on Ironlake. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-30 06:50:16 +12:00
Matt Turner	a8eed0299d	i965/vs: Fix order of source arguments to LRP. The order of arguments matches DirectX, and is backwards from GLSL's mix() built-in. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63983	2013-04-28 14:38:14 -07:00
Zack Rusin	3bba787879	llvmpipe: stop crashing when one of the so targets is null Fixes a crash when one of the so targets is null. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-27 01:19:12 -04:00
Zack Rusin	0031cde1e1	draw/so: indicate overflow when buffer is missing We were crashing if one of the buffers wasn't set, we should just treat it as an overflow. It's useful when using so statistics because it allows one to figure out how much data would be generated by so without actually writing any of it. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-27 01:19:07 -04:00
Zack Rusin	f9f57312de	gallivm: fix indirect addressing of temps in soa mode we weren't adding the soa offsets when constructing the indices for the gather functions. That meant that we were always returning the data in the first vertex/primitive/pixel in the SoA structure and not correctly fetching from all structures. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-27 01:18:51 -04:00
Zack Rusin	3093ac6f4f	tgsi/ureg: Add a function to return the number of outputs We already hold the variable, just weren't providing access to it. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-26 23:05:45 -04:00
Zack Rusin	53d36d5fb0	draw/so: Fix overflow calculations We weren't taking the buffer offset, destination offset or the stride into consideration so we were frequently writing into an overflown buffer. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-26 23:04:26 -04:00
Zack Rusin	d996622cfa	draw/llvm: fix viewport transformations This was a very serious bug. We were always doing the viewport transformations on the first output of the vertex shader. That means that every application that was storing position in anything but OUT[0] was outputting untransformed vertices and had broken output for whatever it was storing at OUT[0]. Correctly take into consideration where the vertex position is actually stored. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-26 23:01:46 -04:00
Zack Rusin	5d9ef5b365	gallium: increase the number of available stream output decls There can be more stream output decls than shader outputs because individual components from them can be split and distributed among different so buffers. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-26 23:01:23 -04:00
Zack Rusin	562835bcdf	llvmpipe: implement so_overflow query Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-26 22:58:54 -04:00
Brian Paul	49dda2d92f	mesa: fix the compressed TexSubImage size checking code Before, we'd incorrectly generate an error if we tried to replace a non-4x4 block near the edge of a NPOT compressed texture. For example, if the dest image was 15 texels wide and xoffset=12 and width=3 we'd incorrectly generate GL_INVALID_OPERATION. Verified with new tests added to piglit s3tc-errors test. Note: This is a candidate for the stable branches. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-26 16:22:30 -06:00
Brian Paul	ff74cf62b1	llvmpipe: replace LP_MAX_THREADS with screen->num_threads in query code Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-26 16:22:24 -06:00
Brian Paul	38a751cbe8	llvmpipe: bump LP_MAX_THREADS to 16 On the mesa-users list, Burlen Loring reported a speed-up with 16 cores and his test/app. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-26 16:22:12 -06:00
Brian Paul	8fbc36ff48	mesa: updated read_buffer_enum_to_index() comment Remove the part about the value of gl_framebuffer::Name.	2013-04-26 08:30:25 -06:00
Christian König	e3ac293daa	r600/uvd: stop advertising MPEG4 on UVD 2.x chips v2 That is just not supported by the hardware. v2: fix compare Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-04-26 15:35:36 +02:00
Christian König	2c2c54b819	radeon/uvd: stop using anonymous unions Signed-off-by: Christian König <christian.koenig@amd.com>	2013-04-26 15:35:36 +02:00
Tapani Pälli	12b0bfa6e9	mesa: fix type comparison errors in sub-texture error checking code patch fixes a crash that happens if glTexSubImage2D is called with a negative xoffset. NOTE: This is a candidate for stable branches. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-26 06:47:44 -06:00
José Fonseca	c5e8573762	Revert "draw: Yield zeros for LLVM fetches of non-existing vertex elements." After more thought/discussion, it seems it is better to handle this sort of stuff in the state tracker. So this reverts commit `12096f334b`, except the variant->key -> key shorthands.	2013-04-26 12:15:39 +01:00
Chia-I Wu	5816a471af	ilo: add the driver to the build system Add ilo to targets/egl-static and add a new target dri-ilo. Update autoconf and automake rules.	2013-04-26 16:20:52 +08:00
Chia-I Wu	825aa60707	ilo: compile VS/GS/FS with the toy compiler	2013-04-26 16:20:52 +08:00
Chia-I Wu	7118ff8bb0	ilo: add a toy shader compiler This is a simple shader compiler that performs almost zero optimizations. The generated code is usually much larger compared to that generated by i965. The generated code also requires many more registers. Function-wise, it lacks register spilling and does not support most TGSI indirections. Other than those, it works alright.	2013-04-26 16:20:52 +08:00
Chia-I Wu	0fa2d0e98a	ilo: hook up pipe context GPGPU functions This just adds a stub.	2013-04-26 16:16:43 +08:00
Chia-I Wu	cf8f3dd373	ilo: hook up pipe context video functions This just hooks them up with auxiliary/vl layer.	2013-04-26 16:16:43 +08:00
Chia-I Wu	12dd397d0c	ilo: add support for time/occlusion/primitive queries	2013-04-26 16:16:43 +08:00
Chia-I Wu	e6186b0769	ilo: hook up pipe context 3D functions	2013-04-26 16:16:43 +08:00
Chia-I Wu	5b310f6230	ilo: add GEN7 support for 3D pipeline	2013-04-26 16:16:43 +08:00
Chia-I Wu	91ce766c35	ilo: add 3D pipeline for GEN6 The 3D pipeline is a high-level interface to emit 3D commands and states. It uses GEN6 GPE to do the real work.	2013-04-26 16:16:43 +08:00
Chia-I Wu	67233b56d6	ilo: add GEN7 GPE	2013-04-26 16:16:43 +08:00
Chia-I Wu	d3602dfac6	ilo: add GEN6 GPE GEN6 GPE (Graphics Processing Engine) is a low-level interface to emit 3D commands and states.	2013-04-26 16:16:43 +08:00
Chia-I Wu	72357cf3bb	ilo: hook up pipe context query functions None of the query types are supported yet.	2013-04-26 16:16:43 +08:00
Chia-I Wu	8f949bc1da	ilo: hook up pipe context transfer functions	2013-04-26 16:16:42 +08:00
Chia-I Wu	0754ff33e3	ilo: hook up pipe context blit functions	2013-04-26 16:16:42 +08:00
Chia-I Wu	89d1702b9b	ilo: hook up pipe context state functions	2013-04-26 16:16:42 +08:00
Chia-I Wu	520af66797	ilo: add functions to manage shaders This commit adds shader cache, shader state, shader variant, etc. It does not add the shader compiler though.	2013-04-26 16:16:42 +08:00
Chia-I Wu	86940bf41c	ilo: hook up pipe context flush function	2013-04-26 16:16:42 +08:00
Chia-I Wu	eed1e5a407	ilo: add command parser The command parser manages batch buffers and command submissions.	2013-04-26 16:16:42 +08:00
Chia-I Wu	3a4a570c34	ilo: hook up pipe screen resource functions	2013-04-26 16:16:42 +08:00
Chia-I Wu	b50e68cb67	ilo: hook up pipe screen format functions	2013-04-26 16:16:42 +08:00
Chia-I Wu	babb2b5c50	ilo: hook up pipe_screen param and fence functions	2013-04-26 16:16:42 +08:00
Chia-I Wu	e74d67738d	ilo: add debug flags settable through ILO_DEBUG	2013-04-26 16:16:42 +08:00
Chia-I Wu	63b5720105	ilo: new pipe driver for Intel GEN6+ This commit adds some boilerplate code. The header files found under include/ are copied from i965.	2013-04-26 16:16:41 +08:00
Chia-I Wu	380e6875b8	winsys/intel: new winsys for intel This is a wrapper for libdrm_intel to allow the pipe driver to stay OS agnostic.	2013-04-26 15:49:00 +08:00
José Fonseca	542c5b3703	gallivm: Fix trivial out-of-bounds indirection in lp_build_cube_lookup(). Courtesy of clang: src/gallium/auxiliary/gallivm/lp_bld_sample.c:1483:10: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds] tmp[2] = lp_build_swizzle_aos(coord_bld, ddx_ddy[1], swizzle02); ^ ~ src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here LLVMValueRef ddx_ddy[2], tmp[2], rho_vec; ^ src/gallium/auxiliary/gallivm/lp_bld_sample.c:1487:56: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds] rho_vec = lp_build_add(coord_bld, rho_vec, tmp[2]); ^ ~ src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here LLVMValueRef ddx_ddy[2], tmp[2], rho_vec; ^ src/gallium/auxiliary/gallivm/lp_bld_sample.c:1491:56: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds] rho_vec = lp_build_max(coord_bld, rho_vec, tmp[2]); ^ ~ src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here LLVMValueRef ddx_ddy[2], tmp[2], rho_vec; ^	2013-04-26 08:44:37 +01:00
Matt Turner	0c1d87b0d7	i965/vs: Add support for LRP instruction. Only 13 affected programs in shader-db, but they were all helped. total instructions in shared programs: 368877 -> 368851 (-0.01%) instructions in affected programs: 1576 -> 1550 (-1.65%) Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-25 18:27:39 -07:00
Matt Turner	c0f67a127b	i965/vs: Add a function to fix-up uniform arguments for 3-src insts. Three-source instructions have a vertical stride overloaded to 4, which prevents directly using vec4 uniforms as arguments. Instead we need to insert a MOV instruction to do the replication for the three-source instruction. With this in place, we can use three-source instructions in the vertex shader. While some thought needs to go into deciding whether it's better to use a three-source instruction rather than a sequence of equivalent instructions (when one or more sources are uniforms or immediates), this will allow us to skip a lot of ugly lowering code and use the BFE and BFI2 instructions directly. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-25 18:27:39 -07:00
Jerome Glisse	abb96fdea7	winsys/radeon: consolidate tracing into winsys v2 This moves the tracing timeout and printing into winsys and adds a debug environment variable for it (R600_DEBUG=trace_cs). Lots of files touched because of winsys API changes. v2: Do not write lockup file if ib uniq id does not match last one Signed-off-by: Jerome Glisse <jglisse@redhat.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-25 18:36:31 -04:00
Tom Stellard	53fbae7eac	r600g/compute: Removed unused and untested code There was a lot of code in evergreen_compute_internal.c that was not being used at all and most of it was duplicating code from other parts of the driver. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-04-25 13:32:22 -07:00
Tom Stellard	f986087d5c	r600g/compute: Use a constant buffer to store kernel parameters v2 v2: - Fix usage of set_constant_buffer() - Fix typo in comment Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-25 13:32:17 -07:00
Tom Stellard	ffadc71afb	r600g: Add evergreen_emit_cs_constant_buffers() v2 v2: - Bump R600_NUM_ATOMS Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-25 13:25:00 -07:00
Tom Stellard	83a00a1de8	r600g/compute: Don't use radeon_winsys::buffer_wait() after dispatching a kernel The state tracker should be responsible for waiting for the kernel to finish. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-04-25 13:24:51 -07:00
Tom Stellard	09e47f7a25	r600g/compute: Fix input buffer size calculation Buffer size should be in bytes not dwords. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-04-25 13:24:24 -07:00
Adam Jackson	904b03824b	linux: Don't emit a .note.ABI-tag section anymore (#26663 ) We don't support pre-2.6 kernels anyway - the install docs say 2.6.28 for DRI - and apparently this confuses ld.so's sorting when multiple libGLs are installed. Just remove it. Note: this is a candidate for the stable branches. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Adam Jackson <ajax@redhat.com>	2013-04-25 15:51:35 -04:00
Rob Clark	73de07cbbc	freedreno: use writecombine buffers Better than uncached for writes, which are common for vertex buffer upload, etc. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-25 15:10:56 -04:00
Rob Clark	f706d4d340	freedreno: don't patch and re-emit same shader as much New textures or vertex buffers don't always require patching and re-emitting the shaders. So do a better job of figuring out when we actually have to patch the shader. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-25 15:10:56 -04:00
Eric Anholt	578987ce1c	i965: Avoid recompiles for fragment clamping on non-clamping APIs. Removes 75/78 state-dependent recompiles in GLB2.7 (the remaining 3 are due to FBO-rendering size predictions). We currently expose GL_ARB_color_buffer_float on GL core, so we may mis-predict there, but I'm about to send a patch for removing that silly extension in that case. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-25 12:03:00 -07:00
Alex Deucher	b5145ca2a8	radeonsi: add new SI pci ids Note: this is a candidate for the 9.1 branch. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-04-25 14:22:46 -04:00
Alex Deucher	b3a856dfa9	r600g: add new richland pci ids Note: this is a candidate for the stable branches. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-04-25 14:21:15 -04:00
José Fonseca	12096f334b	draw: Yield zeros for LLVM fetches of non-existing vertex elements. If a bug in an app/state-tracker causes vertex buffer to fetch vertex elements that are not bound, simply return zeros instead of crashing. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-25 16:16:21 +01:00
José Fonseca	28e6a272fc	trace: Only close trace files on exit. Many applications don't exit cleanly, others may create and destroy a screen multiple times, so we only write </trace> tag and close at exit time.	2013-04-25 14:18:33 +01:00
José Fonseca	74d1153c9c	graw: Set the vertex shader constant buffer. We were setting the fragment shader, which wasn't needed.	2013-04-25 14:06:50 +01:00
José Fonseca	e88a1dba09	graw: Simple utilities to dump and disassemble TGSI tokens. Useful for core dumps, where calling tgsi_dump() from gdb is not an alternative.	2013-04-25 13:03:06 +01:00
José Fonseca	1687932d2b	scons: Support clang. clang supports most gcc options / extensions, with some exceptions. The biggest advantage of using clang is that compilation times are much shorter. One can tell scons to use clang when building by invoking it as CC=clang CXX=clang++ scons libgl-xlib	2013-04-25 11:59:01 +01:00
José Fonseca	f0c296773d	util/u_sse: Fix _mm_shuffle_epi8 prototype for clang. Clang does not support __artificial__. Instead match precisely what's in the clang headers.	2013-04-25 11:59:01 +01:00
José Fonseca	45a60e2e7a	scons: Remove redundant code. -fvisibility=hidden is already elsewhere for the whole tree.	2013-04-25 11:59:01 +01:00
Chris Forbes	8fd0190278	mesa: fix bogus comment about PrimitiveRestart fields Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-25 20:49:25 +12:00
Chris Forbes	447bf1fb52	i965: report correct sample positions From low to high bits, the sample positions are packed y0,x0,y1,x1... Fixes arb_texture_multisample-sample-position piglit. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-04-25 20:47:54 +12:00
Rob Clark	49a7624973	freedreno: fix bogus IMM const reg index We were assigning incorrect const register for immediates, and potentially writing immediate const to the wrong location. This fixes an incorrect-rendering bug with xonotic. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-24 21:09:46 -04:00
Rob Clark	9495ee12c6	freedreno: clear fixes and debugging Set a few extra registers to make sure we are in proper state for clearing. And also add some debug options to mark all state dirty in clear and gmem operations to aid in debugging. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-24 21:09:46 -04:00
Rob Clark	d5d6ec8843	freedreno: fix texture fetch type There is a bit we need to set for 2D vs 3D fetch, to tell the hw whether there are two or three valid input components. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-24 21:09:46 -04:00
Rob Clark	d086bb22bc	freedreno: fix temp register usage The previous approach of using the dst register as an intermediate temporary doesn't work in a lot of cases. For example, if the dst register is the same as one of the src registers. For now, just simplify it and always allocate a new register to use as an intermediate. In some cases this will result in more registers used than required. I think the best solution would be to implement an optimization pass to reduce the number of registers used, which would also solve the problem we have now of not being able to use GPRs that are assigned for TGSI_FILE_INPUT. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-24 21:09:46 -04:00
Rob Clark	7a837da556	freedreno: add noop driver It is useful for debugging. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-24 21:09:46 -04:00
Rob Clark	eec37f1cdc	freedreno: use u_math macros/helpers more Get rid of a few self-defined macros: ALIGN() -> align() min() -> MIN2() max() -> MAX2() Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-24 21:09:46 -04:00
Rob Clark	38d8b02eba	freedreno: implement fd_screen_destroy() Oops, didn't notice that I had left it stubbed out. Also, make things fail a bit more gracefully when things go wrong. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-24 21:09:46 -04:00
Rob Clark	a64e2d9d9f	freedreno: set SWAP bit based on format Really this should be set based on buffer format, not on color vs depth/stencil. Probably there should be more formats that set the bit as we add support for more render target formats. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-04-24 21:09:46 -04:00
Tom Stellard	d9a32b84e3	radeon/llvm: Fix segfault with a specific libelf implementation The libelf implementation that is distributed here: http://www.mr511.de/software/english.html requires calling elf_version() prior to calling elf_memory() Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-24 16:51:25 -07:00
Alex Deucher	5bbeae7a3d	r600g: use CP DMA for buffer clears on evergreen+ Lighter weight than using streamout. Only evergreen and newer asics support embedded data as src with CP DMA. Reviewed-by: Jerome Glisse <jglisse@redhat.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-04-24 18:54:31 -04:00
Chia-I Wu	9d0ad4c2f2	i965/gen7: fix encoding of (huge) surface size for BRW_SURFACE_BUFFER Unlike GEN6, the bits of entry count are distributed like this: width = (entry_count & 0x0000007f); /* bits [6:0] */ height = (entry_count & 0x001fff80) >> 7; /* bits [20:7] */ depth = (entry_count & 0x7fe00000) >> 21; /* bits [30:21] */ The maximum entry count is still limited to 2^27. This was noted while going over the PRM. No test is impacted, because 1<<20 (the bit that moved) is much larger than GL_UNIFORM_BLOCK_MAX_SIZE, GL_MAX_TEXTURE_BUFFER_SIZE, or MAX_*_UNIFORM_COMPONENTS. v2: Explain more in the commit message (by anholt) Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-24 12:56:17 -07:00
Chia-I Wu	75d402b211	i965/gen7: fix 3DSTATE_LINE_STIPPLE_PATTERN The inverse repeat count should take up bits 31:15 and is in U1.16. Fixes the "Restarting lines within a single Begin/End block" subtest of piglit linestipple, and gets the other failing subtests much closer to passing. v2: Rewrite commit message with more detailed piglit info (by anholt) Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-24 12:56:17 -07:00
Chia-I Wu	bc98950a2a	i965: fix SURFACE_STATE dumping Wrong fields were used when dumping width and height. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-24 12:56:17 -07:00
Matt Turner	d611f12d82	i965: Remove strange comments about math functions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-24 12:51:36 -07:00
Matt Turner	0c16c12e46	i965: Remove traces of nonexistent TAN math function. Never existed? At least never supported. Doesn't appear in 965, G45, or ILK documentation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-24 12:51:36 -07:00
Paul Berry	5bb90cfceb	glsl: Teach basic block analysis about break/continue/discard. Previously, the only kind of ir_jump that would terminate a basic block was "return". However, the other possible types of ir_jump ("break", "continue", and "discard") should terminate a basic block too. This patch modifies basic block analysis so that it terminates a basic block on any type of ir_jump, not just ir_return. Fixes piglit test dead-code-break-interaction.shader_test. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-24 09:57:37 -07:00
Paul Berry	70ca263623	glsl: Add virtual function ir_instruction::as_jump() Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-24 09:57:37 -07:00
Tom Stellard	f64058803a	r600g/llvm: Pass struct r600_bytecode to r600_llvm_compile This way we don't need to update the function signature every time we emit a new config value. This also fixes the build with --enable-opencl.	2013-04-24 12:42:41 -04:00
José Fonseca	e29525f79f	winsys/sw/xlib: Prevent shared memory segment leakage. Running piglit with this was causing all sorts of weird stuff happening to my desktop (Chromium webpages become blank, Qt Creator flickered, etc). I tracked this down to shared memory segment leakage when GL is not shutdown properly. The segments can be seen running `ipcs` and looking for nattch==0. This change fixes this by calling shmctl(IPC_RMID) soon after creation (which does not remove the segment immediately, but simply marks it for removal when no more processes are attached). This matches src/mesa/drivers/x11/xm_buffer.c behaviour. v2: - move shmctl(IPC_RMID) after XShmAttach() for *BSD, per Chris Wilson - remove stray debug printfs, spotted by Ian Romanick NOTE: This is a candidate for stable branches. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-24 16:54:58 +01:00
Zack Rusin	1a87473998	draw/gs: preserve leading vertex info for gs We need to handle the leading vertex information when assembling primitives for the geometry shader otherwise the resulting triangles will have vertices at incorrect input locations. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-23 06:17:59 -04:00
Laurent Carlier	addf00e2ad	r200: fix build regression introduced with `9a32203e16` Signed-off-by: Laurent Carlier <lordheavym@gmail.com> Signed-off-by: Marek Olšák <maraeo@gmail.com>	2013-04-24 16:48:29 +02:00
Christian König	c5c754d184	radeonsi: cleanup disabling tiling for UVD v3 Should fix: https://bugs.freedesktop.org/show_bug.cgi?id=63702 v2: add a comment that this is just a workaround v3: fix typo in comment Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-24 11:07:26 +02:00
Chad Versace	d3dfce3276	egl/dri2: Fix min/max swap interval of configs The commit below exposed a bug in dri2_add_config. commit `3998f8c6b5` Author: Ralf Jung <post@ralfj.de> Date: Tue Apr 9 14:09:50 2013 +0200 egl/x11: Fix initialisation of swap_interval This little code snippet near the bottom of dri2_add_config, if (double_buffer) { ... conf->base.MinSwapInterval = dri2_dpy->min_swap_interval; conf->base.MaxSwapInterval = dri2_dpy->max_swap_interval; } it never did what it claimed to do. The assignment never changed the value of conf->base.MaxSwapInterval, because dri2_dpy->max_swap_interval was, until the above exposing commit, uninitialized here. That is, conf->base.MaxSwapInterval was 0 before and after assignment. Ditto for the min swap interval. Above the troublesome code snippet, the call to _eglFilterArray rejects the config as unmatching if its swap interval bounds differ from the base config's. Before the exposing commit, at the call to _eglFilterArray, the swap interval bounds were always [0,0], and hence no config was rejected due to swap interval. After the exposing commit, _eglFilterArray incorrectly rejected some configs, which prevented dri2_egl_config::dri_double_config from getting set for the rejected config, which resulted in a NULL pointer getting passed into dri2CreateNewDrawable, and then segfault. The solution: set the swap interval bounds before _eglFilterArray. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63447 Tested-by: Lu Hua <huax.lu@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-04-24 08:05:13 +02:00
Kenneth Graunke	cef31bb290	mesa: Add unpack functions for A/I/L/LA [U]INT8/16/32 formats. NOTE: This is a candidate for stable branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-23 22:13:02 -07:00
Kenneth Graunke	995051ee34	mesa: Add unpack functions for R/RG/RGB [U]INT8/16/32 formats. NOTE: This is a candidate for stable branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-23 22:13:00 -07:00
Kenneth Graunke	531be501de	mesa: Add an unpack function for ARGB2101010_UINT. v2: Remove extra parenthesis (suggested by Brian). NOTE: This is a candidate for stable branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-23 22:12:58 -07:00
Kenneth Graunke	b1fded54c9	mesa: Fix unpack function for ETC2_SRGB8_PUNCHTHROUGH_ALPHA1. We accidentally set MESA_FORMAT_ETC2_RGB8_PUNCHTHROUGH_ALPHA1 twice, rather than setting the RGB8 and SRGB8 formats. NOTE: This is a candidate for stable branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63569 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-23 22:12:50 -07:00
Kenneth Graunke	097b39276c	mesa: Fix up some final license word wrapping issues by hand. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-23 22:07:14 -07:00
Kenneth Graunke	f0cb66b699	mesa: Restore 78-column wrapping of license text in C++-style comments. The previous commit introduced extra words, breaking the formatting. This text transformation was done automatically via the following shell command: $ git grep 'THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY' | sed 's/:.*$//' | xargs -I {} sh -c 'vim -e -s {} < vimscript2 where 'vimscript2' is a file containing: /THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY/;/^ $/ !fmt -w 78 -p '// ' :wq Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-23 22:07:12 -07:00
Kenneth Graunke	3d8d5b298a	mesa: Restore 78-column wrapping of license text in C-style comments. The previous commit introduced extra words, breaking the formatting. This text transformation was done automatically via the following shell command: $ git grep 'THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY' | sed 's/:.*$//' | xargs -I {} sh -c 'vim -e -s {} < vimscript where 'vimscript' is a file containing: /THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY/;/\\// !fmt -w 78 -p ' * ' :wq Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-23 22:07:09 -07:00
Kenneth Graunke	96ff2edc73	mesa: Add "OR COPYRIGHT HOLDERS" to license text disclaiming liability. This brings the license text in line with the MIT License as published on the Open Source Initiative website: http://opensource.org/licenses/mit-license.php Generated automatically by the following shell command: $ git grep 'THE AUTHORS BE LIABLE' | sed 's/:.*$//g' | xargs -I '{}' \ sed -i 's/THE AUTHORS/THE AUTHORS OR COPYRIGHT HOLDERS/' {} This introduces some wrapping issues, to be fixed in the next commit. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-23 22:07:06 -07:00
Kenneth Graunke	ca29382dc3	mesa: Change "BRIAN PAUL OR IBM" to "THE AUTHORS" in license text. See previous commit for the rationale. These weren't caught by the automatic conversion due to the "OR IBM" addition. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-23 22:07:04 -07:00
Kenneth Graunke	dd404bc94f	mesa: Change "BRIAN PAUL" to "THE AUTHORS" in license text. Generated automatically by the following shell command: $ git grep 'BRIAN PAUL BE LIABLE' | sed 's/:.*$//g' | xargs -I '{}' \ sed -i 's/BRIAN PAUL/THE AUTHORS/' {} The intention here is to protect all authors, not just Brian Paul. I believe that was already the sensible interpretation, but spelling it out is probably better. More practically, it also prevents people from accidentally copy & pasting the license into a new file which says Brian is not liable when he isn't even one of the authors. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-23 22:06:38 -07:00
Brian Paul	cab19eced5	mesa: make _mesa_save_vtxfmt_init() static It's called from nowhere else. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-23 21:12:25 -06:00
Brian Paul	71ee003041	docs: document issue with Viewperf proe-05 test 6	2013-04-23 21:09:17 -06:00
Brian Paul	f74da3e988	mesa: use new _mesa_inside_dlist_begin_end() function Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-23 21:09:17 -06:00
Brian Paul	976b529b7c	mesa: use new _mesa_inside_begin_end() function Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-23 21:09:17 -06:00
Marek Olšák	9a32203e16	mesa: remove unused opcodes AND, DP2A, NOT, NRM3, NRM4, OR, PRINT, XOR Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-24 03:23:24 +02:00
Marek Olšák	3140d132ef	mesa: don't flush vertices and don't flag _NEW_COLOR in ClearColor, ClearIndex Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-24 03:23:24 +02:00
Marek Olšák	9f3985238f	mesa: don't flush vertices and don't flag _NEW_COLOR for GL_CLAMP_READ_COLOR There used to be a derived state _ClampReadColor, so setting _NEW_COLOR made sense. The state is gone now. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-24 03:23:24 +02:00
Marek Olšák	43dac2700c	mesa: don't flag _NEW_DEPTH in Begin/EndQuery if driver implements the functions We don't want to set the flag for Gallium. I think only swrast needs the flag to be set for occlusion queries. v2: fix stats_wm updates in i965 Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-24 03:23:23 +02:00
Marek Olšák	629813d9de	mesa: don't flush vertices and don't flag _NEW_DEPTH in ClearDepth Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-24 03:23:23 +02:00
Marek Olšák	3975f52eb4	mesa: don't flush and don't flag _NEW_STENCIL in ClearStencil, ActiveStencilFace The functions don't affect driver state. There is no code that would rely on vertices being flushed prior to changing the states, and no code that would check for _NEW_STENCIL before using the states. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-24 03:23:23 +02:00
Marek Olšák	1e3b422685	mesa: don't set _NEW_BUFFERS in GenerateMipmap and BlitFramebuffer both functions don't change the framebuffer in any way (if mesa_meta is not used) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-24 03:23:23 +02:00
Marek Olšák	d883d00878	mesa: remove _NEW_PACKUNPACK No driver checks the flag. Nobody uses it. I also removed the FLUSH_VERTICES calls, because PixelStorei has no effect on rendering. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-24 03:23:23 +02:00
Marek Olšák	99bd76d834	mesa: convert _NEW_RASTERIZER_DISCARD to a driver flag Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-24 03:23:23 +02:00
Marek Olšák	b95cbe5e80	mesa,i965: use NewDriverState to communicate TFB state changes with the driver _NEW_TRANSFORM_FEEDBACK is not used by core Mesa, so it can be removed. Instead, a new private flag is added to i965 to serve the same purpose. If you're new to this: * When creating a context, you can set private dirty flags in gl_context::DriverFlags, e.g.: ctx->DriverFlags.NewStateX = BRW_NEW_STATE_X; * When StateX is changed, core Mesa does: ctx->NewDriverState |= ctx->DriverFlags.NewStateX; * When you have to draw, read and clear ctx->NewDriverState. * Pros: not touching NewState, the driver decides the mapping between GL states and hw state groups, unlimited number of flags in core Mesa (still limited number of flags in the driver though) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-24 03:23:23 +02:00
Marek Olšák	ef39bc4f2e	mesa: remove redundant _NEW_BUFFERS setting in ReadBuffer already set by _mesa_readbuffer Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-24 03:23:23 +02:00
Marek Olšák	5649f886f7	st/mesa: add a simple path to BufferData if it only discards buffer contents Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-24 03:23:23 +02:00
Marek Olšák	d23c7455ae	st/mesa: depth-stencil-alpha state also depends on _NEW_BUFFERS because the code looks at the visual if there is a depth or stencil buffer before enabling depth or stencil, respectively. NOTE: This is a candidate for the stable branches. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-24 03:23:23 +02:00
José Fonseca	2737abb44e	gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center. Squashed commit of the following: commit 04c5fa2cbb8e89d6f2fa5a75af1cca03b1f6b852 Author: José Fonseca <jfonseca@vmware.com> Date: Tue Apr 23 17:37:18 2013 +0100 gallium: s/lower_left_origin/bottom_edge_rule/ commit 4dff4f64fa83b9737def136fffd161d55e4f1722 Author: José Fonseca <jfonseca@vmware.com> Date: Tue Apr 23 17:35:04 2013 +0100 gallium: Move diagram to docs. commit 442a63012c8c3c3797f45e03f2ca20ad5f399832 Author: James Benton <jbenton@vmware.com> Date: Fri May 11 17:50:55 2012 +0100 gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center. This change is necessary to achieve correct results when using OpenGL FBOs. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-23 19:42:47 +01:00
Marek Olšák	b692076420	r600g: initialize CMASK and HTILE with the GPU using streamout This fixes a crash when a resource cannot be mapped to the CPU's address space because it's too big. This puts a global pipe_context in r600_screen, which is guarded by a mutex, so that we can use pipe_context when there isn't one around. Hopefully our multi-context support is solid. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> NOTE: This is a candidate for the 9.1 branch.	2013-04-23 20:26:20 +02:00
Marek Olšák	1ba46bbb4c	gallium/u_blitter: implement buffer clearing Although this might be useful for ARB_clear_buffer_object, I need it for initializing resources in r600g. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> v2: comment cleanups NOTE: This is a candidate for the 9.1 branch.	2013-04-23 20:26:20 +02:00
Vincent Lejeune	edd90a19ca	r600/llvm: Read stacksize from config header	2013-04-23 19:52:29 +02:00
Vincent Lejeune	a7f73f5155	/bin/bash: q: command not found	2013-04-23 19:52:02 +02:00
Tom Stellard	a0c8942bb4	radeon/llvm: Fix build with LLVM >= r180063	2013-04-23 11:53:05 -04:00
Tom Stellard	ead4db420e	gallivm: Fix build with LLVM >= r180063	2013-04-23 11:53:05 -04:00
Zack Rusin	1fb8c3ce55	draw: use the prim count for ia primitives Number of vertices to fetch doesn't always equal the number of input vertices. To correctly compute the number of IA primitives we need to use the total number of input vertices, not only those that need to be fetched. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-22 20:36:07 -04:00
Zack Rusin	76587d2e5e	tgsi/scan: set correct input limits for geometry shader TGSI geometry shader input declarations are of the IN[][2] format and the dimensions of the array have to be deduced from the input primitive property. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-22 20:36:07 -04:00
Zack Rusin	913ed25f18	draw: add code to reset instance dependent data We want to be able to reset certain parts of the pipeline, in particular the input primitive index, but only either with separate invocations of the draw_vbo or new instances. In all other cases (e.g. new invocations due to primitive restart) that data needs to be preserved. Add a function through which we can reset instance dependent data. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-22 20:36:07 -04:00
Zack Rusin	2aad06844f	softpipe: fix streamout with an empty geometry shader Same approach as in the llvmpipe, if the geometry shader is null and we have stream output then attach it to the vertex shader right before executing the draw pipeline. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-22 20:36:07 -04:00
Andreas Boll	723b78397f	configure.ac: Allow OpenGL ES1 and ES2 only with enabled OpenGL Building OpenGL ES1 and/or ES2 without OpenGL is not supported on mesa 9.0.x Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-04-23 03:16:10 +02:00
Matt Turner	7be536bb19	i965/fs: Don't save value returned by emit() if it's not used. Probably a copy-n-paste mistake. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-22 15:34:32 -07:00
Brian Paul	4d5827ea83	mesa: Remove extra MapBufferRange in create_beginend_table() Looks like a copy&paste typo. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-22 12:38:04 -06:00
José Fonseca	7c1bf8e381	gallium: Add a new clip_halfz rasterizer state. gl_rasterization_rules lumps too many different flags. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-22 18:39:06 +01:00
Kenneth Graunke	95c83824e6	i965: Fix a mistake in the comments for software counters. The code doesn't set brw->query.obj to NULL, it sets query->bo to NULL. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-22 10:34:49 -07:00
José Fonseca	c0538860bf	gallivm: Fix assignment of unsigned values to OUT register. TEMP is not the only register file that accepts unsigned. OUT too. Actually, what determines the appropriate type of the destination value is not the opcode, but rather the register. Also cleanup/simplify code. Add a few more asserts, but also make code more robust by handling it gracefully if an assert fails. This fixes segfault / assertion in the included vert-uadd.sh graw shader. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-22 18:23:42 +01:00
Matt Turner	ec646e4654	i965: Apply CMP NULL {Switch} work-around to other Gen7s. Listed in the restrictions section of CMP, but not on the work-arounds page. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-22 09:45:10 -07:00
Brian Paul	6654b9d1eb	st/mesa: minor indentation fixes	2013-04-22 10:08:06 -06:00
Eric Anholt	47c0b5ecdd	mesa: Introduce a globally-available minify() macro. This matches u_minify()'s behavior, for consistency. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-21 12:28:04 -07:00
Eric Anholt	1842dd08b8	mesa: Generalize TexStorage allocator between swrast and intel. This should be reusable for other non-gallium drivers, so we can make the extension always be available. v2: Add a more detailed comment than the old function had (recommended by Brian). Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2013-04-21 12:28:04 -07:00
Eric Anholt	e86170c2b8	mesa: Add performance debug for meta code. I noticed a fallback in regnum through sysprof, and wanted a nicer way to get information about it. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-21 12:28:03 -07:00
Eric Anholt	cbe8b75b58	intel: Mention how much data we're trying to subdata in perf debug. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-21 12:28:03 -07:00
José Fonseca	9fb5b2f45c	Revert "gallivm: Emit vector selects." It caused numerous regressions (LLVM 3.1) in blending. In particular: - lp_test_blend type=u8nx16 rgb_func=sub rgb_src_factor=zero rgb_dst_factor=inv_src_color alpha_func=rev_sub alpha_src_factor=one alpha_dst_factor=const_color ... MISMATCH Src: 0 0 0 b5 49 29 0 a2 0 21 de 0 c3 1b ec 0 Src1: 2d 85 14 0 f8 0 79 a1 99 0 d8 0 59 16 0 0 Dst: 0 a9 97 0 c0 0 78 0 0 8b aa f0 bd 0 78 f6 Con: 7d 0 c0 0 0 bb 77 0 0 0 50 0 40 51 0 0 Res: 0 0 0 0 0 29 0 0 0 0 c8 0 97 1b e3 0 Ref: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 type=u8nx16 rgb_func=max rgb_src_factor=one rgb_dst_factor=inv_const_color alpha_func=min alpha_src_factor=zero alpha_dst_factor=inv_src1_alpha ... MISMATCH Src: d 0 0 e9 0 37 35 f0 62 0 0 b2 e9 f7 0 5c Src1: 8f 0 bf 0 a8 5 0 0 c4 0 d7 7 92 a 0 17 Dst: cb 0 1e 0 0 0 19 8e 0 4d 0 0 0 0 3 46 Con: aa 5a 5f 8f 0 0 bc 92 0 88 0 0 b7 8a c0 88 Res: 44 0 13 0 0 0 7 8e 0 24 0 0 0 0 1 40 Ref: 44 0 13 0 0 37 35 0 62 24 0 0 e9 f7 1 0 This reverts commit `1e266c7ef0`.	2013-04-21 09:07:19 +01:00
José Fonseca	d8a4c4c524	llvmpipe: verify function on blend test.	2013-04-21 08:53:31 +01:00
José Fonseca	a79990bec0	llvmpipe: Don't support Z32_FLOAT_S8X24_UINT texture sampling support either. Because we don't support it, and the u_format fallback doesn't work for zs formats. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-20 23:25:36 +01:00
José Fonseca	c08b04992a	llvmpipe: Ignore depth-stencil state if format has no depth/stencil. Prevents assertion failures inside the driver for such state combinations. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-20 23:25:36 +01:00
José Fonseca	f701a5a0fe	gallivm: Disable LLVM 2.7 workaround on other versions. 2.7 was a particularly trouble ridden release. Furthermore, the bug no longer can be reproduced ever since the first_level state was taken into account. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-20 23:25:36 +01:00
José Fonseca	1e266c7ef0	gallivm: Emit vector selects. They are supported on LLVM 3.1, at least on x86. (I haven't tested on PPC though.) Actually lp_build_linear_mip_levels() already has been emitting them for some time. This avoids intrinsics, which tend to be an obstacle for certain optimization passes. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-20 23:25:36 +01:00
Rob Clark	26b39df08f	freedreno: move ir -> ir2 There will be a new IR for a3xx, which has a very different shader ISA (more scalar oriented). So rename to avoid conflicts later when I start adding a3xx support to the gallium driver. Signed-off-by: Rob Clark <Rob Clark robdclark@freedesktop.org>	2013-04-20 17:59:41 -04:00
Rob Clark	d8134792ae	freedreno: cleanup some cruft left over from fdre The standalone shader assembler needed some meta-data to know about attributes/varyings/etc, to do the shader linkage. We don't need these parts with gallium/tgsi, so just get rid of it. Signed-off-by: Rob Clark <Rob Clark robdclark@freedesktop.org>	2013-04-20 17:31:47 -04:00
Roland Scheidegger	85974e5fee	gallivm: implement switch opcode Should be able to handle all things which make this tricky to implement. Fallthroughs, including most notably into/out of default, should be handled correctly but are quite a mess. If we see largely unoptimized switches in the wild should probably think about some "real" switch optimization pass, e.g. things like this: switch case1 someinst brk case2 default case3 someinst brk case4 someinst endswitch are legal, but the pointless case2/case3 statements not only cause condition evaluation but will turn this into a "fake" fallthrough case (because mask and defaultmask are already updated for case2 when default is encountered) requiring executing code twice. If default is at the end though, there's never any code re-execution, and if that's not the case if there's no fallthrough in (not even a fake one) and out of default there's no code re-execution either. v2: add comments, and use enum for break type instead of magic boolean. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-20 02:27:53 +02:00
Roland Scheidegger	8f5d4283c0	gallivm: use uint build context for mask instead of float Unsurprisingly no one was using it except for grabbing builder. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-20 02:27:53 +02:00
Roland Scheidegger	107550e71a	gallivm/tgsi: fix up breakc It seems there was a typo in gallivm breakc handling (I am actually still not sure it is really needed but otherwise that statement really should go away). Also fix the wrong src argument type, even though they weren't really used. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-20 02:27:53 +02:00
Roland Scheidegger	e8d1b26a82	svga: remove TGSI_OPCODE_BREAKC instruction translation While initially that opcode probably was meant for something along the lines of sm3 break_comp it has never worked that way (not even the argument count was right) and now the opcode has quite different semantics so just remove it. (Discovered by Jose Fonseca)	2013-04-20 02:27:53 +02:00
Roland Scheidegger	794579105a	gallium: document breakc and switch/case/default/endswitch docs were missing, especially the opcode-from-hell switch however is anything but obvious. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-20 02:27:53 +02:00
Roland Scheidegger	443950c6aa	gallivm: increase nesting limit to 66 This is still not really correct, since at least for sm 4.0 the nesting limit is 64 per subroutine, and subroutine nesting itself has a limit of 32, so since we have a flat stack we'd need 32*64. But this should probably be better fixed with per-subroutine stacks, since otherwise these structures get really big (like 100kB for the lp_exec_mask). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-20 02:27:53 +02:00
Zack Rusin	12eab7cc56	draw: implement primitive assembler Input assembler needs to be able to decompose adjacency primitives into something that can be understood by the rest of the pipeline. The specs say that the adjacency primitives are only visible in the geometry shader, for everything else they need to be decomposed. Which in most of the cases is not an issue, because the geometry shader always decomposes them for us, but without geometry shader we were passing unchanged adjacency primitives to the rest of the pipeline and causing crashes everywhere. This commit introduces a primitive assembler which, if geometry shader is missing and the input primitive is one of the adjacency primitives, decomposes them into something that the rest of the pipeline can understand. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-18 11:51:22 -07:00
Zack Rusin	e4752d0f56	util/prim: fix decomposed counts for adjacency primitives Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-18 11:37:37 -07:00
Zack Rusin	c1299204ad	draw/so: uses the correct index with the pre clipped coordinates pre_clip_pos is a float[4] we just used (*float)[4] to be able to jump within the array of vertex_headers with it. So if the idx happened to be anything but 0, we'd actually read from some garbage in memory. Change it to just be a simple pointer instead of casting it to something that it's not. As suggested by Jose. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-18 11:36:38 -07:00
Eric Anholt	8b2662e900	glapi: Add counter information for glBufferData(), like glBufferSubData(). This causes this function to become asynchronous with glthread.	2013-04-19 10:13:00 -07:00
Eric Anholt	1a3ea852ea	glapi: Add parameter count information for uniforms. This is the kind of information that would have been present for GLX, if GLX supported modern GL. This allows these entrypoints to get automatic asynchronous marshalling code generated for glthread.	2013-04-19 10:13:00 -07:00
Paul Berry	57b7c20ca5	glapi: skip padding in get_called_parameter_string This bug is currently benign, since get_called_parameter_string() is currently only used for functions that return true for glx_function.has_different_protocol(), and none of those functions include padding. However, in order to implement marshalling of GL API functions, we'll need to use get_called_parameter_string() far more often. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-19 10:12:36 -07:00
Paul Berry	fe955dc6b6	mesa: Fix up program_parse.y to avoid uninitialized $$ Without this patch, $$.negate, $$.rgba_valid, and $$.xyzw_valid take on garbage values. At the moment this problem is benign (the garbage values happen to be zero), but in my experiments executing GL operations on a background thread, the garbage values change, leading to piglit failures. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-19 10:12:27 -07:00
Eric Anholt	ea6cf2b686	mesa: Use quotes on bool driconf options to prevent stdbool.h breakage. Since stdbool.h's "true" and "false" are #defines, they got expanded when used as macro arguments, and that expanded value was stored in the XML string, producing XML that driconf would then fail to parse. Currently no drivers included stdbool along with driconf, but I keep accidentally doing so on intel as we move towards using normal C. v2: rebase on master. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2013-04-19 10:10:22 -07:00
Brian Paul	cecbfce5eb	svga: whitespace, comment fixes in svga_pipe_query.c	2013-04-19 10:04:11 -06:00
Brian Paul	ef1b2b8da7	svga: whitespace, comment fixes in svga_pipe_fs/vs.c	2013-04-19 10:03:56 -06:00
José Fonseca	dbb690872e	gallivm: Fix half floats with MCJIT. Prevents: LLVM ERROR: Cannot select: intrinsic %llvm.x86.vcvtph2ps.128	2013-04-19 10:13:19 +01:00
Matt Turner	e87015f508	Revert "i965: Check reg.nr for BRW_ARF_NULL instead of reg.file." This reverts commit `ecdda414d3`. Commit was supposed to be a simple typo fix. Clearly needs more investigating. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63688	2013-04-18 21:52:27 -07:00
Matt Turner	34efd9295e	configure.ac: Remove gallium-g3dvl flag. It's next to useless, since it just allows you to turn off VDPAU and XvMC with a single switch. Just check whether Gallium drivers are enabled instead. Reviewed-by: Christian König <christian.koenig@amd.com>	2013-04-18 21:52:26 -07:00
Jerome Glisse	d0e9aaa31c	radeonsi: add support for compressed texture v2 Most tests pass; the remaining issues are with border color and swizzle. Based on ircnick<maelcum> patch. v2: Restaged commit hunk Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-04-18 17:25:38 -04:00
Jerome Glisse	dc21e30a62	radeonsi: add 2d tiling support for texture v3 v2: Remove left over code v3: Restage properly the commit so hunk of first one are not in second one. Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-04-18 17:25:38 -04:00
Vadim Girlin	f732036f12	gallium: handle drirc disable_glsl_line_continuations option NOTE: This is a candidate for the 9.1 branch Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-04-19 01:05:03 +04:00
José Fonseca	b72ff373fb	llvmpipe: Take in consideration all current constant buffers when mapping. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-04-18 20:48:12 +01:00
Christoph Bumiller	78eaaff696	nv50: add remaining RGBX formats Not all are supported as render targets. The state tracker fallback of using RGBA instead of RGBX currently fails for blending, we could work around this by clearing their alpha to 1 and modifying the color mask to disable writing alpha.	2013-04-18 21:04:22 +02:00
Christoph Bumiller	729abfd0f5	st/mesa: optionally apply texture swizzle to border color v2 This is the only sane solution for nv50 and nvc0 (really, trust me), but since on other hardware the border colour is tightly coupled with texture state they'd have to undo the swizzle, so I've added a cap. The dependency of update_sampler on the texture updates was introduced to avoid doing the apply_depthmode to the swizzle twice. v2: Moved swizzling helper to u_format.c, extended the CAP to provide more accurate information.	2013-04-18 20:35:40 +02:00
Christoph Bumiller	246ff8f887	nv50: set BORDER_COLOR_SRGB in sampler objects	2013-04-18 20:35:40 +02:00
Christoph Bumiller	2d5d054752	nv50: fix 4th component of Lx_SINT/UINT formats	2013-04-18 20:35:40 +02:00
Tom Stellard	3b20170b2f	r600g: Fix build with --enable-opencl	2013-04-18 11:24:48 -07:00
Brian Paul	877e3c1d42	mesa: enable GL_ARB_texture_float if TEXTURE_FLOAT_ENABLED is defined Per message on mesa-users list, this wasn't working before. Note: This is a candidate for the stable branches. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-18 10:41:08 -06:00
Roland Scheidegger	50cbcf0c46	gallivm: change cubemaps / derivatives handling, take 55 Turns out the previous "fix" for handling per-pixel face selection and derivatives didn't work out that well - the derivatives were wrong by quite a bit, in theory transformation of the derivatives into cube space should work, but would be _a lot_ more work than the "simplified" transform used. So, for explicit derivatives, I'm just giving up and go back to not honoring them. For implicit derivatives (and the fake explicit ones) however we try something a little different, we just calculate rho as we would for a 3d texture, that is after scaling the coords by the inverse major axis. This gives the same results as calculating the derivs after projection of the coords to the same face as long as all pixels hit the same face (and only without rho_no_opt, otherwise it should be a bit worse). And when not all pixels are hitting the same face, the results aren't so hot but not catastrophically bad (I believe not off by more than a factor of 2 without no_rho_approx and not more than sqrt(2) with no_rho_approx). I think this is better than just picking the wrong face but who knows... Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-18 17:06:43 +02:00
Roland Scheidegger	0d07f05ee8	gallivm: Add no_rho_approx debug option This will calculate rho correctly as sqrt(max((ds/dx)^2 + (dt/dx)^2 + (dr/dx)^2, (ds/dy)^2 + (dt/dy)^2 + (dr/dy)^2)) instead of max(\|ds/dx\|,\|dt/dx\|,\|dr/dx\|,\|ds/dy\|,\|dt/dy\|,\|dr/dy\|) (for 3 coords - 2 coords work analogously, for 1 coord there's no point doing the exact version), for both implicit and explicit derivatives. While such approximation seems to be allowed in OpenGL some APIs may be less forgiving, and the error can be quite large (sqrt(2) for 2 coords, sqrt(3) for 3 coords so wrong by nearly one mip level in the latter case). This also helps to single out "real" bugs from "expected" ones, so it is debug only (though at least combined with no_brilinear I didn't really see much of a performance difference but only tested with a debug build - at least with implicit mipmaps the instruction count is almost exactly the same though the instructions are more complex (1 sqrt and mul/adds instead of and/max mostly)). The code when the option isn't set stays exactly the same. v2: rename no_rho_opt to no_rho_approx. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-18 17:04:01 +02:00
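The difference between the two rho formulas in the no_rho_approx commit above can be illustrated with a small sketch. This is not gallivm code: the function names `rho_approx` and `rho_exact` and the sample derivative values are made up for illustration, and the exact form assumes rho is the length of the longer of the two screen-space derivative vectors.

```python
import math

def rho_approx(dx, dy):
    """Approximation: largest absolute partial derivative over both axes."""
    return max(abs(d) for d in (*dx, *dy))

def rho_exact(dx, dy):
    """Exact form: length of the longer derivative vector (d/dx or d/dy)."""
    len_x_sq = sum(d * d for d in dx)
    len_y_sq = sum(d * d for d in dy)
    return math.sqrt(max(len_x_sq, len_y_sq))

# Hypothetical worst case for 3 coords: all three x-derivatives are equal.
dx = (1.0, 1.0, 1.0)   # (ds/dx, dt/dx, dr/dx)
dy = (0.0, 0.0, 0.0)   # (ds/dy, dt/dy, dr/dy)

approx = rho_approx(dx, dy)
exact = rho_exact(dx, dy)
# The ratio is sqrt(3) here; log2 of it is the error in mip levels.
print(approx, exact, math.log2(exact / approx))
```

With these inputs the exact value is larger than the approximation by a factor of sqrt(3), which is a little under one mip level, consistent with the error bound quoted in the commit message.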
José Fonseca	a930136977	llvmpipe: Support half integer pixel center fs coord. Tested with graw/fs-fragcoord 2/3, and piglit glsl-arb-fragment-coord-conventions. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-18 14:18:25 +01:00
José Fonseca	b191be52f2	llvmpipe: Remove the static interpolation. No longer used. If we ever want the old behavior we can run a loop unroller pass. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-18 14:18:22 +01:00
José Fonseca	6e833d4d09	gallivm: Drop pos arg from lp_build_tgsi_soa. Never used. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-18 14:18:13 +01:00
Andreas Boll	34bec4a251	docs: update release notes for 9.2 Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-18 09:36:57 +02:00
José Fonseca	392f6cfced	ralloc: Move declarations before statements. Trivial. Should fix MSVC build.	2013-04-18 06:21:04 +01:00
Emil Velikov	c7b88ed16e	configure: enable vdpau and xvmc detection, with gallium Currently the vdpau and xvmc detection code, is enabled for all builds. The state trackers exist only within gallium. Enable whenever at least one gallium driver is selected v2: removed stray '-a' [mattst88 v3]: Removed stray $. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63645 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-04-17 18:19:34 -07:00
Matt Turner	ecdda414d3	i965: Check reg.nr for BRW_ARF_NULL instead of reg.file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-17 18:19:34 -07:00
Matt Turner	60e4c99488	i965: Implement work-around for CMP with null dest on Haswell. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-17 18:19:34 -07:00
Stuart Abercrombie	1a59cc777f	i915g: Release old fragment shader sampler views with current pipe We were trying to use a destroy method from a deleted context. This fix is based on what's in the svga driver. Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>	2013-04-17 18:15:12 -07:00
Paul Berry	417d8917d4	i965/vec4: Fix hypothetical use of uninitialized data in attribute_map[]. Fixes issue identified by Klocwork analysis: 'attribute_map' array elements might be used uninitialized in this function (vec4_visitor::lower_attributes_to_hw_regs). The attribute_map array contains the mapping from shader input attributes to the hardware registers they are stored in. vec4_vs_visitor::setup_attributes() only populates elements of this array which, according to core Mesa, are actually used by the shader. Therefore, when vec4_visitor::lower_attributes_to_hw_regs() accesses the array to lower a register access in the shader, it should in principle only access elements of attribute_map that contain valid data. However, if a bug ever caused the driver back-end to access an input that was not flagged as used by core Mesa, then lower_attributes_to_hw_regs() would access uninitialized memory, which could cause illegal instructions to get generated, resulting in a possible GPU hang. This patch makes the situation more robust by using memset() to pre-initialize the attribute_map array to zero, so that if such a bug ever occurred, lower_attributes_to_hw_regs() would generate a (mostly) harmless access to r0. In addition, it adds assertions to lower_attributes_to_hw_regs() so that if we do have such a bug, we're likely to discover it quickly. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-17 17:41:55 -07:00
Dave Airlie	47bd6e46fe	ralloc: don't write to memory in case of alloc fail. For some reason I made this happen under indirect rendering, I think we might have a leak, valgrind gave out, so I said I'd fix the basic problem. NOTE: This is a candidate for stable branches. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-04-18 09:50:42 +10:00
Brian Paul	815ca0bf38	mesa: generate glGetInteger/Boolean/Float/Doublev() code for all APIs No longer pass -a flag to the get_hash_generate.py script to specify OpenGL, ES1, ES2, etc. This updates the autoconf, scons and android build files too (so we can bisect). This is the last of the API-dependent conditional compilation in core Mesa. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-17 17:33:40 -06:00
Brian Paul	9835d90596	mesa: remove mfeatures.h No longer needed. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-17 17:33:40 -06:00
Brian Paul	b76f6d9557	mesa: remove #include "mfeatures.h" from numerous source files None of the remaining FEATURE_x symbols in mfeatures.h are used anymore. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-17 17:33:40 -06:00
Brian Paul	c6e00b6f6c	glapi: no longer emit #include "mfeatures.h" in generated files None of the symbols in mfeatures.h are used anymore. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-17 17:33:40 -06:00
Brian Paul	7fd12a8ae1	mesa: remove FEATURE_remap_table from remap.[ch] It was always defined. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-17 17:33:39 -06:00
Brian Paul	0bcced7716	glapi: remove FEATURE_remap_table test (it's always defined) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-17 17:33:39 -06:00
Zack Rusin	8e7f7e9693	draw/so: respect leading/provoking vertex info we were ignoring leading/provoking vertex settings which was breaking decomposition of some strips. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-17 15:43:50 -07:00
Zack Rusin	6bb217a489	softpipe/so: use the correct variable for reporting stream out we were using the wrong vars, reporting incorrect stream output statistics. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-17 15:28:54 -07:00
Zack Rusin	cb58c79efb	gallivm/gs: fix indirect addressing in geometry shaders We were always treating the vertex index as a scalar but when the shader is using indirect addressing it will be a vector of indices for each channel. This was causing some nasty crashes inside LLVM. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-17 15:28:54 -07:00
Brian Paul	02039066a8	st/wgl: fix issue with SwapBuffers of minimized windows If a window's minimized we get a zero-size window. Skip the SwapBuffers in that case to avoid some warning messages with the VMware svga driver. Internal bug #996695 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-17 16:23:19 -06:00
Ian Romanick	505ac6ddc6	intel: Don't dereference a NULL pointer if calloc fails The caller of NewTextureObject does the right thing if NULL is returned, so this function should do the right thing too. NOTE: This is a candidate for stable branches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-17 14:12:46 -07:00
Eric Anholt	50064164a4	i965: Trim trailing whitespace in brw_defines.h. It was all over the formats section I wanted to edit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-17 14:12:01 -07:00
Laurent Carlier	867f71db6b	r200: fix build failure introduced with `cbbcb0247e` Signed-off-by: Brian Paul <brianp@vmware.com>	2013-04-17 13:48:40 -06:00
Brian Paul	1079475481	st/mesa: clean up formatting in st_cb_msaa.c Insert blank lines, wrap lines, remove trailing whitespace, etc.	2013-04-17 12:28:13 -06:00
Brian Paul	3350ca223e	mesa: remove gl_context::_TriangleCaps No longer used anywhere. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-17 11:59:42 -06:00
Brian Paul	cbbcb0247e	mesa: remove DD_TRI_LIGHT_TWOSIDE flag v2: use conditional operator instead of bit shifting Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-17 11:59:42 -06:00
Brian Paul	c9bb052e31	mesa: remove DD_TRI_UNFILLED flag Use alternate code in intel, r200, radeon drivers. v2: use conditional operator instead of bit shifting Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-17 11:59:41 -06:00
Brian Paul	56dc53ed5b	mesa: remove DD_TRI_SMOOTH flag Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-17 11:59:41 -06:00
Brian Paul	b32fb8ac9e	mesa: remove DD_TRI_STIPPLE flag Make it a local macro for the i915 driver. v2: use conditional operator instead of bit shifting Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-17 11:59:41 -06:00
Brian Paul	dfb1474aac	mesa: remove DD_TRI_OFFSET flag Make it a local macro for the i915 driver. v2: use conditional operator instead of bit shifting Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-17 11:59:40 -06:00
Brian Paul	c6a81448f8	mesa: remove DD_POINT_ATTEN flag For the i915 driver, make it a local macro. v2: use conditional operator instead of bit shifting Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-17 11:59:40 -06:00
Brian Paul	4f57fbb507	mesa: remove DD_POINT_SMOOTH flag Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-17 11:59:40 -06:00
Brian Paul	8ac8ae8360	mesa: remove DD_LINE_STIPPLE flag For the i915 driver, make it a local macro. v2: use conditional operator instead of bit shifting Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-17 11:59:40 -06:00
Brian Paul	55b2033f0a	mesa: remove DD_SEPARATE_SPECULAR flag Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-17 11:59:39 -06:00
Brian Paul	c1c5d689c5	mesa: remove unused DD_LINE_SMOOTH flag Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-17 11:59:39 -06:00
Zack Rusin	f01f754ca1	draw/gs: make sure geometry shaders don't overflow The specification says that the geometry shader should exit if the number of emitted vertices is bigger or equal to max_output_vertices and we can't do that because we're running in the SoA mode, which means that our storing routines will keep getting called on channels that have overflown (even though they will be masked out, but we just can't skip them). So we need some scratch area where we can keep writing the overflown vertices without overwriting anything important or crashing. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-16 23:38:47 -07:00
Zack Rusin	be497ac9d3	draw/gs: Return early if the passed geometry shader is null Can happen if we were using stream output without geometry shader, by returning early we avoid a crash. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-16 23:38:47 -07:00
Zack Rusin	80ee4a407a	draw: implement pipeline statistics in the draw module This is a basic implementation of the pipeline statistics in the draw module. The interface is similar to the stream output statistics and also requires that the callers explicitly enable it. Included is the implementation of the interface in llvmpipe and softpipe. Only softpipe enables the pipeline statistics capability though because llvmpipe is lacking gathering of the fragment shading and rasterization statistics. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-16 23:38:47 -07:00
Zack Rusin	b739376cff	gallivm/gs: fix the end primitive calls The issue with SOA execution and end_primitive opcode is that it can be executed both when we haven't emitted any vertices, in which case we don't want to emit an empty primitive, and when the execution mask is zero and the execution should be skipped. We handled only the latter of those conditions. Now we're combining the execution mask with a mask created from emitted vertices to handle both cases. As a result we don't need the pending_end_primitive flag which was broken because it was static and could be affected by both above mentioned conditions at run-time. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-16 23:38:46 -07:00
Zack Rusin	93627e33cc	tgsi/exec: geometry shaders are executed on a single primitive which means that our execution mask in GS is equal to 1 not 0xf. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-16 23:38:46 -07:00
Zack Rusin	88db6f0a73	tgsi/exec: fix the udiv and umod instructions Same as with llvmpipe: we can't be dividing/moding by zero and we need to make sure that dividing/moding by zero produces 0xffffffff. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-16 23:38:46 -07:00
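The divide-by-zero rule described in the udiv/umod commit above can be sketched as follows. This is an illustration of the stated behavior only, not the tgsi_exec implementation; the helper names `udiv` and `umod` and the 32-bit masking are assumptions made for the example.

```python
MASK32 = 0xFFFFFFFF  # all-ones 32-bit result required for a zero divisor

def udiv(a, b):
    """Unsigned 32-bit divide; a zero divisor yields 0xffffffff."""
    a &= MASK32
    b &= MASK32
    return MASK32 if b == 0 else a // b

def umod(a, b):
    """Unsigned 32-bit modulo; a zero divisor yields 0xffffffff."""
    a &= MASK32
    b &= MASK32
    return MASK32 if b == 0 else a % b

print(hex(udiv(10, 3)), hex(umod(10, 3)))
print(hex(udiv(10, 0)), hex(umod(10, 0)))
```

Checking the divisor before dividing avoids the trap a native division by zero would raise, while still producing a defined result for every channel.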
José Fonseca	b8f6858fcb	gallivm: JIT symbol resolution with linux perf. Details on docs/llvmpipe.html Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-17 16:50:52 +01:00
José Fonseca	35ef27d485	draw: Silence uninitialized var warnings. Trivial.	2013-04-17 16:50:52 +01:00
Vincent Lejeune	2b9ed257c0	r600g/llvm: Use gprcount from llvm	2013-04-17 17:24:29 +02:00
Anuj Phogat	484b89ace9	intel: Add a null pointer check before dereferencing the pointer Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-17 08:17:47 -07:00
Emil Velikov	b03f6de63b	docs: Update 'Making new mesa release' Add a note to update PACKAGE_VERSION for Android and scons builds Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-17 08:48:15 -06:00
Emil Velikov	91984a732e	docs: Add some missing release notes Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-17 08:48:15 -06:00
Emil Velikov	cf9bf1d4a6	docs: move specs to a separate folder Handle legacy/obsolete specs as well List all specs in extensions.html Mark 'OLD' extensions as obsolete in extensions.html Update the spec location in old relnotes Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-17 08:48:14 -06:00
Emil Velikov	5fd3b3b085	docs: restructure release notes into separate folder relnotes-html > relnotes/html RELNOTES-* > relnotes/* fix links, css and frames Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-17 08:48:14 -06:00
José Fonseca	50b3fc6204	gallium: Disambiguate TGSI_OPCODE_IF. TGSI_OPCODE_IF condition had two possible interpretations: - src.x != 0.0f - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was false either for vertex and fragment shaders - gallivm/llvmpipe - postprocess - vl state tracker - vega state tracker - most old drivers - old internal state trackers - many graw examples - src.x != 0U - Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was true for both vertex and fragment shaders - tgsi_exec/softpipe - r600 - radeonsi - nv50 And drivers that use draw module also were a mess (because Mesa would emit float IFs, but draw module supports native integers so it would interpret IF arg as integers...) This sort of works if the source argument is limited to float +0.0f or +1.0f, integer 0, but would fail if source is float -0.0f, or integer in the float NaN range. It could also fail if source is integer 1, and hardware flushes denormalized numbers to zero. But with this change there are now two opcodes, IF and UIF, with clear meaning. Drivers that do not support native integers do not need to worry about UIF. However, for backwards compatibility with old state trackers and examples, it is advisable that native integer capable drivers also support the float IF opcode. I tried to implement this for r600 and radeonsi based on the surrounding code. I couldn't do this for nouveau, so I just shunted IF/UIF together, which matches the current behavior. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> v2: - Incorporate Roland's feedback. - Fix r600_shader.c merge conflict. - Fix typo in radeon, spotted by Michel Dänzer. - Incorporate Christoph Bumiller's patch to handle TGSI_OPCODE_IF(float) properly in nv50/ir.	2013-04-17 10:54:08 +01:00
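The ambiguity that the IF/UIF commit above removes can be shown by evaluating the same 32-bit register value under both interpretations. This is only a sketch: `if_float` and `uif_int` are hypothetical helper names, and the code models the two conditions from the commit message (`src.x != 0.0f` and `src.x != 0U`), not any driver's implementation.

```python
import struct

def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def if_float(bits):
    """Float interpretation (IF): src.x != 0.0f."""
    return bits_to_float(bits) != 0.0

def uif_int(bits):
    """Integer interpretation (UIF): src.x != 0U."""
    return bits != 0

# Bit patterns: +0.0f, +1.0f, and -0.0f (sign bit only).
for bits in (0x00000000, 0x3F800000, 0x80000000):
    print(hex(bits), if_float(bits), uif_int(bits))
```

The two interpretations agree for +0.0f and +1.0f but disagree for -0.0f, which compares equal to zero as a float while being a nonzero integer. The denormal case mentioned in the commit depends on hardware flush-to-zero behavior and is not modeled here.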
José Fonseca	f61b7da80e	gallium: Eliminate TGSI_OPCODE_IFC. Never used or implemented. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-17 10:54:08 +01:00
Kenneth Graunke	e7965598b7	i965: Enable the Bay Trail platform. This patch adds PCI IDs for Bay Trail (sometimes called Valley View). As far as the 3D driver is concerned, it's very similar to Ivybridge, so the existing code should work just fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-16 15:08:12 -07:00
Christian König	13ddf9baf2	r600/uvd: cleanup disabling tiling on pre EG asics Set transfer flag instead of fiddling with the tiling params directly. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-04-16 22:36:51 +02:00
Christian König	7490eeb3d6	autoconf: enable detection of vdpau and xvmc by default Since we now have UVD support we should enable them by default. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-04-16 22:36:20 +02:00
Ian Romanick	025f03f3b7	mesa/swrast: Move memory allocation outside the blit loop Assume the maximum pixel size (16 bytes per pixel). In addition to moving redundant malloc and free calls outside the loop, this fixes a potential resource leak when a surface is mapped and the malloc fails. This also makes blit_nearest look a bit more like blit_linear. v2: Use MAX_PIXEL_BYTES instead of 16. Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-16 10:18:14 -07:00
Ian Romanick	a27c6e1aea	mesa/swrast: Move free calls outside the attachment loop This was originally discovered by Klocwork analysis: Possible memory leak. Dynamic memory stored in 'srcBuffer0' allocated through function 'malloc' at line 566 can be lost at line 746 However, I think the problem is actually much worse. Since the memory is freed after the first pass through the loop, the released buffer may be used on the next iteration! NOTE: This is a candidate for stable release branches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-16 10:13:48 -07:00
Ian Romanick	6758498eb7	mesa/swrast: Refactor no-memory error checking in blit_linear Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-16 10:13:10 -07:00
Martin Andersson	4c3ed79566	r600g: Workaround for a hardware bug with nested loops on Cayman There is a hardware bug on Cayman where a BREAK/CONTINUE followed by LOOP_STARTxxx for nested loops may put the branch stack into a state such that ALU_PUSH_BEFORE doesn't work as expected. Work around this by replacing the ALU_PUSH_BEFORE with a PUSH + ALU Fixes piglit tests EXT_transform_feedback/order* v2: Use existing loop count and improve comment v3: [Vadim Girlin] Set jump address for PUSH instructions NOTE: This is a candidate for the 9.1 branch Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-04-16 18:02:11 +04:00
Marek Olšák	8616b224bf	gallium/hud: fix FPS computation for framerate > 4.2k	2013-04-16 13:56:47 +02:00
Marek Olšák	332af88c39	gallium/hud: increase vertex buffer size for background black rectangles Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-16 13:56:47 +02:00
Marek Olšák	0108114619	gallium/hud: update the contents of GALLIUM_HUD=help Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-16 13:56:47 +02:00
Marek Olšák	30284f8892	gallium/hud: remove pipeline-statistics- prefix in query names for the env var string not to be awfully long v2: fix bug in indexing of "name" Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-16 13:56:47 +02:00
Marek Olšák	dfe5367f0f	r600g: implement pipeline statistics query	2013-04-16 13:56:47 +02:00
Marek Olšák	817723baf8	winsys/radeon: use query_value for timestamp, remove query_timestamp	2013-04-16 13:56:47 +02:00
Marek Olšák	413ca78af3	r600g: add a debug flag for printing virtual addresses of resources	2013-04-16 13:56:47 +02:00
Marek Olšák	05fa3595e0	r600g: add a query returning the amount of time spent during bo_map sync.	2013-04-16 13:56:47 +02:00
Matt Turner	b3f1f665b0	build: Get rid of GALLIUM_WINSYS_DIRS configure still uses it to print the enabled winsys. Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:05:55 -07:00
Matt Turner	3a6e548a85	build: Get rid of GALLIUM_TARGET_DIRS configure still uses it to print the enabled targets. Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:05:55 -07:00
Matt Turner	2f7a37d858	build: Build pipe-loader before gallium tests And don't build it from other Makefiles. That's awful, and breaks distclean. Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:05:55 -07:00
Matt Turner	0d3b1b0e2e	build: Get rid of GALLIUM_MAKE_DIRS Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:05:55 -07:00
Matt Turner	69b69b1a0b	build: Stop using GALLIUM_STATE_TRACKERS_DIRS for SUBDIRS configure still uses it to print the enabled state trackers. Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:04:26 -07:00
Matt Turner	13a7010c21	build: Get rid of DRIVER_DIRS Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:04:26 -07:00
Matt Turner	8341effd4a	build: Stop AC_SUBST'ing DRI_DIRS and GALLIUM_DRIVERS_DIRS Neither are used in Makefile.ams. Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:04:26 -07:00
Matt Turner	70531b4a25	build: Remove GALLIUM_DIRS It's always constant anyway. Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:04:26 -07:00
Matt Turner	a9676ae44a	build: Get rid of SRC_DIRS Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:04:26 -07:00
Matt Turner	691c30404d	build: Get rid of CORE_DIRS A step toward working make dist/distcheck. Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:04:25 -07:00
Matt Turner	d5e9426b96	build: Move src/mapi/mapi/* to src/mapi/ Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:04:25 -07:00
Matt Turner	3c690524e2	build: Rename sources.mak -> Makefile.sources For the sake of consistency. Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-15 12:04:25 -07:00
Tom Stellard	d50343dff1	radeonsi: Read config values from the .AMDGPU.config ELF section Instead of emitting configuration values (e.g. number of gprs used) in a predefined order, the LLVM backend now emits these values in register/value pairs. The first dword contains the register address and the second dword contains the value to write. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-15 10:54:30 -07:00
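The register/value pair layout described in the .AMDGPU.config commit above could be decoded along these lines. This is a minimal sketch, not the radeonsi code: the function name `parse_config_pairs`, the little-endian byte order, and the register addresses in the sample data are all assumptions for illustration.

```python
import struct

def parse_config_pairs(section):
    """Split raw section bytes into (register_address, value) dword pairs."""
    if len(section) % 8 != 0:
        raise ValueError("section length must be a multiple of 8 bytes")
    # Each pair is two 32-bit dwords: register address, then value.
    # Little-endian order is assumed here.
    return [struct.unpack_from("<II", section, offset)
            for offset in range(0, len(section), 8)]

# Hypothetical section contents: two made-up registers and their values.
raw = struct.pack("<IIII", 0x1000, 24, 0x1004, 0)
for reg, value in parse_config_pairs(raw):
    print(hex(reg), value)
```

Keying each value by its register address means the consumer does not depend on the order in which the backend emits the values.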
Tom Stellard	9277b04c02	radeon/llvm: Handle ELF formatted binary output from the LLVM backend	2013-04-15 10:54:29 -07:00
Tom Stellard	7782d19cdc	radeon/llvm: Use a struct for storing compiled code	2013-04-15 10:13:10 -07:00
Roland Scheidegger	1d6eb23f2d	gallivm: fix small but severe bug in handling multiple lod level strides Inserting the value for the second quad in the wrong place for the following shuffle. This meant the row or image stride was undefined which is quite catastrophic, can lead to bogus texels fetched or just segfault. This code is only hit for SoA path currently, still surprising it didn't crash more or cause more visible issues (I think llvm used a broadcast shuffle for the undefined parts of the vector, hence the undefined value for the second quad was just the same as that from the first quad, so as long as both quads hit the same mip level everything was fine, and since lower mips always have the same large stride it made it less likely to hit out-of-bound memory in case of differing lods). Note: this is a candidate for stable branches. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-15 15:23:40 +02:00
Francisco Jerez	02b808b08a	clover: Fix usage of incorrect object as destination in clEnqueueCopyBufferToImage. Signed-off-by: Francisco Jerez <currojerez@riseup.net>	2013-04-13 14:24:10 +02:00
Francisco Jerez	1a8ad6c2e3	clover: Define platform class and merge with device_registry. Null platform IDs are OK according to the spec, but some applications have been reported to get paranoid and assume that our NULL platform is unusable. As it doesn't hurt to have device enumeration separate from the rest of the device code (quite the opposite, it makes the code cleaner), make the API use an actual platform object that keeps track of the available devices instead of the former NULL pointer. Reported-and-reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Francisco Jerez <currojerez@riseup.net>	2013-04-13 14:20:16 +02:00
Francisco Jerez	6ace452055	clover: Add missing fields to the module serializer. Signed-off-by: Francisco Jerez <currojerez@riseup.net>	2013-04-13 14:12:49 +02:00
Eric Anholt	1658efc42c	i965: Shut up the last release build warning. I don't see a sensible value to use in this path, but we shouldn't ever hit this outside of developer new-texture-target enabling. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:14 -07:00
Eric Anholt	dcb1b89c65	i965: Silence one more compile warning. We don't want to store this thing in the class, and we do need the definition to be at the top of the function and held onto until the end here, so there's not much to do besides (void) reference it. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:14 -07:00
Eric Anholt	dea70404eb	i965: Fix a warning in the release build. This was copy and pasted from can_reswizzle_dst(), and we can just fold it in instead to avoid the warning. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:14 -07:00
Eric Anholt	28170c5b7f	i965: Fix an unused variable warning in the release build. I think this actually clarifies what's going on in the asserts a bit, given how many regions we've got floating around. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:13 -07:00
Eric Anholt	248175ab3b	i965: Fix an unused variable warning in the release build. It's used in an assert, but we have this as a member of the class anyway. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:13 -07:00
Eric Anholt	6cec233c62	intel: Return failure properly in the texsubimage blit path. We assert that failure doesn't happen, but it fixes a warning in the release build and it would at least give working behavior for a user by falling back to the normal texsubimage path. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:13 -07:00
Eric Anholt	b681a89588	intel: Fix a warning in the release build. This was silly -- checking that we didn't overflow the array by dividing the array size by 2 and then multiplying it back up by 2. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:13 -07:00
Eric Anholt	1433936fe5	intel: Fix an unused variable warning in the release build. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:13 -07:00
Eric Anholt	9167ba8584	intel: Improve diagnostics for emit_linear_blit failure path. This fixes unused variable warnings in the release build, and should be more useful if it ever triggers. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:13 -07:00
Eric Anholt	aceba66795	i965: Fix error path for MCS allocation. Asserts don't stop execution in release builds, so we would continue on to use an uninitialized format value. Just take the failure path, which appears to continue up the call stack for a while. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:12 -07:00
Eric Anholt	331766b9a2	i830: Move assert-only code into the assert. The call has no side effects, and moving it into the assert cleans up a compile warning in the release build. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:12 -07:00
Eric Anholt	adf251406b	i965/fs: Fix some untriggered optimization bugs with uncompressed/sechalf. We have this support for firsthalf/sechalf instructions, which would be called in the !has_compr4 (aka original gen4) 16-wide case. We currently only support 16-wide for gen5+, so we weren't tripping over this, but it would have been a problem if we ever try to enable it. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:12 -07:00
Eric Anholt	eaca8a94e2	i965/fs: Add basic-block-level dead code elimination. This is a poor substitute for proper global dead code elimination that could replace both our current paths, but it was very easy to write. It particularly helps with Valve's shaders that are translated out of DX assembly, which has been register allocated and thus have a bunch of unrelated uses of the same variable (some of which get copy-propagated from and then left for dead). shader-db results: total instructions in shared programs: 1735753 -> 1731698 (-0.23%) instructions in affected programs: 492620 -> 488565 (-0.82%) v2: Fix comment typo Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:12 -07:00
Eric Anholt	36d0fde603	i965/fs: Remove incorrect note of writing attr in centroid workaround. This instruction doesn't update its IR destination, it just moves from payload to f0. This caused the dead code elimination pass I'm adding to dead-code-eliminate the first step of interpolation. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:12 -07:00
Eric Anholt	2cb7f1e766	i965/fs: Add a helper function for checking for partial register updates. These checks were all over, and every time I wrote one I had to try to decide again what the cases were for partial updates. v2: Fix inadvertent reladdr check removal. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:12 -07:00
Eric Anholt	df25b4f3cf	mesa: Add a macro to bitset for determining bitset size. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:12 -07:00
Eric Anholt	b5a0f59c0f	i965: Fix compiler warnings since the introduction of texture multisample. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-12 16:32:11 -07:00
Ian Romanick	1faaa411c7	mesa: Don't leak gl_context::BeginEnd at context destruction The other dispatch tables (Exec and Save) are freed, but BeginEnd is never freed. This was found by inspection while investigating the leak of shared state in _mesa_initialize_context. NOTE: This is a candidate for stable branches Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-12 16:24:48 -07:00
Ian Romanick	6e06550e4e	mesa: Don't leak shared state when context initialization fails Back up at line 1017 (not shown in patch), we add a reference to the shared state. Several places after that may divert to the error handler, but, as far as I can tell, nothing ever unreferences the shared state. Fixes issue identified by Klocwork analysis: Resource acquired to 'shared->TexMutex' at line 1012 may be lost here. Also there is one similar error on line 1087. NOTE: This is a candidate for the stable branches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-12 16:24:48 -07:00
Ian Romanick	f730c210b8	egl/dri2: NULL check value returned by dri2_create_surface dri2_create_surface can fail for a variety of reasons, including bad input data. Dereferencing the NULL pointer and crashing is not okay. Fixes issue identified by Klocwork analysis: Pointer 'surf' returned from call to function 'dri2_create_surface' at line 285 may be NULL and will be dereferenced at line 291. NOTE: This is a candidate for the stable branches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-12 16:24:48 -07:00
Ian Romanick	2cc0b3294a	mesa: NULL check the pointer before trying to dereference it Duh. Fixes issues identified by Klocwork analysis: Pointer 'table' returned from call to function 'calloc' at line 115 may be NULL and will be dereferenced at line 117. and Suspicious dereference of pointer 'table' before NULL check at line 119. NOTE: This is a candidate for the stable branches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-12 16:24:48 -07:00
Ian Romanick	ee55b845d2	glsl: Fix hypothetical NULL dereference related to process_array_type Ensure that process_array_type never returns NULL, and let process_array_type handle the case where the supplied base type is NULL. Fixes issues identified by Klocwork analysis: Pointer 'type' returned from call to function 'get_type' at line 1907 may be NULL and may be dereferenced at line 1912. and Pointer 'field_type' checked for NULL at line 4160 will be dereferenced at line 4165. Also there is one similar error on line 4174. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-12 16:24:44 -07:00
Ian Romanick	278c9af85e	glsl: Fix hypothetical NULL dereference in ast_process_structure_or_interface_block Fixes issue identified by Klocwork analysis: Pointer 'field_type' returned from call to function 'glsl_type' at line 4126 may be NULL and may be dereferenced at line 4139. Also there are 2 similar errors on line(s) 4165, 4174. In practice, it should be impossible to actually get NULL in here because a syntax error would have already caused compilation to halt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-12 16:24:39 -07:00
Tom Stellard	c6a86fb563	r300g: Fix bug in OMOD optimization https://bugs.freedesktop.org/show_bug.cgi?id=60503 NOTE: This is a candidate for the stable branches.	2013-04-12 08:33:31 -07:00
Emil Velikov	ac1118d53c	nvc0: set ret variable if launch desc allocation failed Pointed out by gcc nve4_compute.c: In function 'nve4_launch_grid': nve4_compute.c:511:7: warning: 'ret' may be used uninitialized in this function [-Wmaybe-uninitialized] if (ret) ^ Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Edit by Christoph Bumiller: Set it to -1 to indicate failure and only when it's actually required.	2013-04-12 17:15:14 +02:00
Emil Velikov	48bcb94dc3	nvc0: bail out early during nve4_compute_setup() Exit gracefully rather than trying to create a random object, whenever the chipset is unknown Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-04-12 17:10:11 +02:00
Emil Velikov	e28c266682	nvc0: compile nve4_cache_split_name() only in debug build As otherwise it is unused - pointed out by gcc nve4_compute.c:586:20: warning: 'nve4_cache_split_name' defined but not used [-Wunused-function] static const char *nve4_cache_split_name(unsigned value) ^ Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-04-12 17:09:03 +02:00
Emil Velikov	249f3d73cf	nv50/codegen: do not emitATOM() if the subOp is unknown For debug build we'll hit the assert, for release we are going to emit random data as subOp is used uninitialised. Spotted by gcc codegen/nv50_ir_emit_nv50.cpp: In member function 'void nv50_ir::CodeEmitterNV50::emitATOM(const nv50_ir::Instruction*)': codegen/nv50_ir_emit_nv50.cpp:1554:12: warning: 'subOp' may be used uninitialized in this function [-Wmaybe-uninitialized] uint8_t subOp; ^ Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-04-12 17:08:26 +02:00
Christoph Bumiller	4da54c91d2	nvc0: implement multisample textures	2013-04-12 13:02:18 +02:00
Christoph Bumiller	71c1c8a9b8	nvc0: patch up TEX cases with 5 or 6 sources on nve4 Hackishly fixes alignment requirement of 2nd tuple for now.	2013-04-12 11:41:35 +02:00
Christoph Bumiller	2b62ba7cb0	nvc0: fix 2D engine MS2 resolve	2013-04-12 11:41:35 +02:00
Christoph Bumiller	69804c2ab8	nv50,nvc0: add RGBX16/32_FLOAT formats	2013-04-12 11:41:35 +02:00
Matt Turner	195a6cca3c	i965/vs: Print error if vertex shader fails to compile. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-11 17:22:07 -07:00
Matt Turner	32a8e87766	i965: NULL check prog on shader compilation failure. Also change if (shader) to if (prog) for consistency. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-11 17:21:13 -07:00
José Fonseca	ed9687cf1b	scons: Add st_cb_msaa.c to source list.	2013-04-11 22:37:34 +01:00
Dave Airlie	f024c72476	r600g: add get_sample_position support (v3) v2: I rewrote this to use the sample positions properly. v3: rewrite properly to use bitfield to cast back to signed ints Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-04-11 21:09:29 +01:00
Dave Airlie	f152da6bf9	st/mesa: add support for ARB_texture_multisample (v3) This adds support to the mesa state tracker for ARB_texture_multisample. Hardware doesn't seem to use different texture instructions, so I don't think we need to create one for TGSI at this time. Thanks to Marek for fixes to sample number picking. v2: idr pointed out a bug in how we picked the max sample counts, use new internal format chooser interface to pick proper answers. v3: use st_choose_format directly, it was okay, fix anding of masks. Reviewed-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-04-11 21:09:29 +01:00
Dave Airlie	1d90ee5ef5	st/mesa: add support for get sample position This just calls into the gallium interface. Reviewed-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-04-11 21:09:28 +01:00
Dave Airlie	cc906396c7	gallium: add get_sample_position interface This is to be used to implement glGet GL_SAMPLE_POSITION. Reviewed-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-04-11 21:09:28 +01:00
Dave Airlie	184278a804	r600g: fix two issues in compressed msaa reading code I've no idea when sample_chan would ever be 4 here, but 4 is most definitely wrong, array textures have it as 3 as well. Also the cayman code though unused is obviously wrong. Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-04-11 21:09:27 +01:00
Paul Berry	e9fa3a9448	i965/vs: Don't hardcode DEBUG_VS in generic vec4 code. Since the vec4_visitor and vec4_generator classes are going to be re-used for geometry shaders, we can't enable their debug functionality based on (INTEL_DEBUG & DEBUG_VS) anymore. Instead, add a debug_flag boolean to these two classes, so that when they're instantiated the caller can specify whether debug dumps are needed. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:26 -07:00
Paul Berry	defdb310b7	i965/vs: Generalize computation of array strides in preparation for GS. Geometry shader inputs are arrays, but they use an unusual array layout: instead of all array elements for a given geometry shader input being stored consecutively, all geometry shader inputs are interleaved into one giant array. As a result, the array stride we use to access geometry shader inputs must be equal to the size of the input VUE, rather than the size of the array element. This patch introduces a new virtual function, vec4_visitor::compute_array_stride(), which will allow geometry shader compilation to specialize the computation of array stride to account for the unusual layout of geometry shader input arrays. It also renames the local variable that the ir_dereference_array visitor uses to store the stride, to avoid confusion. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:26 -07:00
Paul Berry	444fce6398	i965/vs: Generalize attribute setup code in preparation for GS. This patch introduces a new function, vec4_visitor::lower_attributes_to_hw_regs(), which replaces registers of type ATTR in the instruction stream with the hardware registers that store those attributes. This logic will need to be common between the vertex and geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:26 -07:00
Paul Berry	28fe02ce6e	i965/vs: Generalize vertex emission code in preparation for GS. This patch introduces a new function, vec4_visitor::emit_vertex(), which contains the code for emitting vertices that will need to be common between the vertex and geometry shaders. Geometry shaders will need to use a different message header, and a different opcode, for their URB writes, so we introduce virtual functions emit_urb_write_header() and emit_urb_write_opcode() to take care of the GS-specific behaviours. Also, since vertex emission happens at the end of the VS, but in the middle of the GS, we need to be sure to only call emit_shader_time_end() during VS vertex emission. We accomplish this by moving the call to emit_shader_time_end() into the VS implementation of emit_urb_write_opcode(). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:25 -07:00
Paul Berry	7214451bdc	i965/vs: rename vec4_generator::generate_vs_instruction. Since this function is going to get used for geometry shaders too, it deserves a more generic name: generate_vec4_instruction. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:25 -07:00
Paul Berry	9bb6840b28	i965/vs: Generalize data structures pointed to by vec4_generator. This patch removes the following field from vec4_generator, since it is not used: - struct brw_vs_compile c And changes the following field: - struct gl_vertex_program vp => struct gl_program *prog With these changes, vec4_generator no longer refers to any VS-specific data structures. This will pave the way for re-using it for geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> v2: Use the name "prog" rather than "p". Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:25 -07:00
Paul Berry	4d773603d3	i965/vs: Rename vec4_generator::prog to shader_prog. The next patch is going to change the type of vec4_generator::vp from struct gl_vertex_program * to struct gl_program *, and rename it. The sensible name to change it to is vec4_generator::prog. However, prog is already used. Since the existing vec4_generator::prog is of type struct gl_shader_program, it makes sense to rename it to shader_prog. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:25 -07:00
Paul Berry	5743bea0ba	i965/vs: move VS-specific data members to vec4_vs_visitor. This patch moves the following data structures from vec4_visitor to vec4_vs_visitor, since they contain VS-specific data: - struct brw_vs_compile c (renamed to vs_compile) - struct brw_vs_prog_data prog_data (renamed to vs_prog_data) - src_reg vp_temp_regs - src_reg vp_addr_reg Since brw_vs_compile and brw_vs_prog_data also contain vec4-generic data, the following pointers are added to the base class, to allow it to access the vec4-generic portions of these data structures: - struct brw_vec4_compile c - struct brw_vec4_prog_key key - struct brw_vec4_prog_data prog_data Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> v2: Use shorter names in the base class and longer names in the derived class. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:25 -07:00
Paul Berry	0ce95222af	i965/vs: move ARB_vertex_program functions to vec4_vs_visitor. This patch moves functions from vec4_visitor to vec4_vs_visitor that deal with ARB (assembly) vertex programs. There's no point in having these functions in the base class since we don't intend to support assembly programs for the GS stage. The following functions are moved: - setup_vp_regs - get_vp_dst_reg - get_vp_src_reg Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:25 -07:00
Paul Berry	42a3d63dd4	i965/vs: Add virtual function make_reg_for_system_value(). The system values handled by vec4_visitor::visit(ir_variable *) are VS-specific (vertex ID and instance ID). This patch moves the handling of those values into a new virtual function, make_reg_for_system_value(), so that this VS-specific code won't be inherited by geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:25 -07:00
Paul Berry	8941f73c7c	i965/vs: Make some vec4_visitor functions virtual. This patch makes the following vec4_visitor functions virtual, since they will need to be implemented differently for vertex and geometry shaders. Some of the functions are renamed to reflect their generic purpose, rather than their VS-specific behaviour: - setup_attributes - emit_attribute_fixups (renamed to emit_prolog) - emit_vertex_program_code (renamed to emit_program_code) - emit_urb_writes (renamed to emit_thread_end) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:25 -07:00
Paul Berry	e9be5a05f7	i965/vs: Make vec4_vs_visitor class derived from vec4_visitor. This patch just creates the derived class; later patches will migrate VS-specific functions and data structures from the base class into the derived class. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:25 -07:00
Paul Berry	5fff3752c8	i965/vs: split brw_vs_prog_data into generic and VS-specific parts. This will allow the generic parts to be re-used for geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> v2: Put urb_read_length and urb_entry_size in the generic struct. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:24 -07:00
Paul Berry	0c994f181c	i965/vs: split brw_vs_prog_key into generic and VS-specific parts. This will allow the generic parts to be re-used for geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:24 -07:00
Paul Berry	d7af636473	i965/vs: split brw_vs_compile into generic and VS-specific parts. This will allow the generic parts to be re-used for geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:24 -07:00
Paul Berry	09cd6e06d2	i965/vs: Remove brw_vs_prog_data pointer from brw_vs_compile. In patches that follow, we'll be splitting structs brw_vs_prog_data and brw_vs_compile into a vec4-generic base struct and a VS-specific derived struct (this will allow the vec4-generic code to be re-used for geometry shaders). Having brw_vs_compile point to brw_vs_prog_data makes it difficult to do this cleanly. Fortunately most of the functions that use brw_vs_compile (those in the vec4_visitor class) already have access to brw_vs_prog_data through a separate pointer (vec4_visitor::prog_data). So all we have to do is use that pointer consistently, and plumb prog_data through the few remaining functions that need access to it. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:24 -07:00
Paul Berry	deffbbed4e	i965: Generalize computation of VUE map in preparation for GS. This patch modifies the arguments to brw_compute_vue_map() so that they no longer bake in the assumption that we are generating a VUE map for vertex shader outputs. It also makes the function non-static so that we can re-use it for geometry shader outputs. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:24 -07:00
Paul Berry	b29613371c	i965/vs: Make type of vec4_visitor::vp more generic. The vec4_visitor functions don't use any VS specific data from vec4_visitor::vp. So rename it to "prog" and change its type from struct gl_vertex_program * to struct gl_program *. This will allow the code to be re-used for geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> v2: Use the name "prog" rather than "p". Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:24 -07:00
Paul Berry	fe97f26c86	i965: Rename backend_visitor::prog to shader_prog. The next patch is going to change the type of vec4_visitor::vp from struct gl_vertex_program * to struct gl_program *, and rename it. The sensible name to change it to is vec4_visitor::prog. However, prog is already used in backend_visitor (which vec4_visitor derives from). Since backend_visitor::prog is of type struct gl_shader_program *, it makes sense to rename it to shader_prog. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-11 09:25:24 -07:00
Paul Berry	5b0bd8ece8	glsl: Fix (and validate) comment above glsl_type::name. The comment above glsl_type::name claimed that it could sometimes be NULL. This was wrong--it is never NULL. Many error handling paths would segfault if it were. (Anonymous structs are assigned names like "#anon_struct_0001"--see the ast_struct_specifier constructor in glsl_parser_extras.cpp.) Fix the comment and add assertions to validate that it really is never NULL. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-04-11 09:25:24 -07:00
Christian König	5b2855bfe7	radeon/uvd: add UVD implementation v5 Just everything you need for UVD with r600g and radeonsi. v2: move UVD code to radeon subdir, clean up build system additions, remove an unused SI function, disable tiling on SI for now. v3: some minor indentation fix and rebased v4: dpb size calculation fixed v5: implement proper fall-back in case the kernel doesn't support UVD, based on patches from Andreas Boll but cleaned up a bit more. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-04-11 17:10:28 +02:00
Christian König	f91e4d2c9d	radeon/winsys: add uvd ring support to winsys v3 Separated from UVD patch for clarity. v2: sync with next tree for 3.10 v3: as pointed out by Andreas Boll check for drm minor >= 32 http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-3.10-wip Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-04-11 17:10:01 +02:00
Dave Airlie	cb12bf7606	st/mesa: fix UBO offsets. Reported and tested by degasus on #radeon. Note: This is a candidate for the 9.1 branch Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-04-11 15:20:19 +10:00
Ralf Jung	3998f8c6b5	egl/x11: Fix initialisation of swap_interval The EGLConfig attributes EGL_MIN/MAX_SWAP_INTERVAL were incorrectly set to 0 and 0. This prevented clients from setting the swap interval to a reasonable value, like 1 or 2. Swap interval worked correctly in Mesa 9.0. The commit below introduced the bug. commit `7e9bd2b2ed` Author: Eric Anholt <eric@anholt.net> Date: Tue Sep 25 14:05:30 2012 -0700 egl: Add support for driconf control of swapinterval. Note: This is a candidate for the 9.1 branch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63078 [chadv: Wrote commit message] Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-10 19:16:45 -07:00
Kenneth Graunke	cbe24ff7c8	intel: Fall back to X-tiling when larger than estimated aperture size. If a region is larger than the estimated aperture size, we map/unmap it by copying with the BLT engine. Which means we can't use Y-tiling. Fixes Piglit max-texture-size and tex3d-maxsize, which regressed in my recent change to use Y-tiling by default on Gen6+. This was due to a botched merge conflict resolution. v2: Return a mask of valid tilings from intel_miptree_select_tiling. This allows us to avoid the X-tiling fallback if Y-tiling is actually mandatory. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-10 16:54:31 -07:00
Kenneth Graunke	eef3dff3fd	intel: Refactor code in intel_miptree_choose_tiling(). This reduces the nesting level slightly, and in my opinion, makes it a bit easier to follow. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-10 16:54:31 -07:00
Kenneth Graunke	ba38ac062c	intel: Move the max_gtt_map_object_size estimation to intel_context. We need to know this in order to decide what tiling mode to use. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-10 16:54:31 -07:00
Fredrik Höglund	fb69dbb0d1	r600g: Add support for GL_ARB_texture_buffer_range Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-11 00:10:45 +02:00
Paul Berry	42767dc22f	i965/blorp: Remove unnecessary test in gen7_blorp_emit_depth_stencil_config. gen7_blorp_emit_depth_stencil_config() is only called when params->depth.mt is non-null. Therefore, it's not necessary to do an "if (params->depth.mt)" test inside it. The presence of this if test was misleading static analysis tools (and briefly, me) into thinking that gen7_blorp_emit_depth_stencil_config() might sometimes access uninitialized data and dereference a null pointer. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-10 13:17:53 -07:00
Marek Olšák	34c3f98641	r600g: fix valgrind warning on Cayman Warning: "Conditional jump or move depends on uninitialised value(s)".	2013-04-10 21:56:51 +02:00
Zack Rusin	fe29f99293	gallivm/tgsi: handle untyped moves both mov and ucmp can be used to move variables of any type. correctly note that about ucmp in the tgsi_info and make sure gallivm can handle that by correctly casting the untyped moves. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-10 12:37:17 -07:00
Zack Rusin	d56f2d5267	gallivm: fix loops and conditionals within GS We were using simple temporaries, without using alloca or phi nodes which meant that on every iteration of the loop our temporaries, which were holding the number of vertices and primitives which were emitted, were being reset to zero. Now we're using alloca to allocate those variables to preserve them across conditionals. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-10 12:33:59 -07:00
Zack Rusin	c1cd19c3b8	llvmpipe: implement PIPE_QUERY_SO_STATISTICS We were missing the implementation of PIPE_QUERY_SO_STATISTICS query, this change implements it on top of the existing facilities. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-10 12:32:56 -07:00
Zack Rusin	7466e0b6c8	gallivm: fix unsigned divide and remainder opcodes We want to both make sure we never divide by zero to not generate sigfpe and that divide by zero is guaranteed to return 0xffffffff. Based on José's idea. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-10 12:31:22 -07:00
Zack Rusin	1ad4a4eeb3	gallivm: fix breakc We break when the mask values are 0, not 1; plus it's a bit comparison, not a floating point comparison. This fixes both. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-04-10 12:25:34 -07:00
Chad Versace	e4484a0309	intel/hsw: Enable hiz (v2) Enable hiz by setting intel_context::has_hiz. However, to work around a hardware bug, we selectively enable hiz for only nicely aligned miptree slices. No Piglit regressions on Haswell 0x0d26 rev07 when based atop mesa-master-4ad3601. Improves the performance of GLB27_TRex_C24Z16_FixedTimeStep by 18.52% (hsw-0x0d26-rev07; kernel-3.9.0-rc1; GLBenchmark 2.7.0 Release a68901; samples=3). v2: Replace the check for IS_HASWELL(devid) in intel_miptree_slice_has_hiz() with a conditional set of has_hiz. [for anholt] Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-04-10 10:55:26 -07:00
Chad Versace	916d1ea7dc	i965: Remove brw_context::depthstencil::hiz_mt After recent refactorings, the field is written but no longer read. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-04-10 10:55:10 -07:00
Chad Versace	2d3bbc576c	intel: Replace checks for hiz_mt with intel_*_has_hiz() When appropriate, replace each check `hiz_mt != NULL` with either a call to intel_miptree_slice_has_hiz() or intel_renderbuffer_has_hiz(). No behavioral change. This prepares for selectively enabling hiz on individual miptree slices for Haswell. This refactoring had several side effects. 1. To prevent new warnings about discarding the const qualifier, I removed 'const' from some variable declarations in intel_validate_framebuffer(). The alternative was to add const qualifiers to multiple function signatures in the intel_renderbuffer_has_hiz call graph. Since the dominant convention in the Intel code is to not qualify function parameters as const, I chose to remove rather than add const qualifiers. 2. I changed the signature of brw_emit_depth_stencil_hiz() by replacing `struct intel_mipmap_tree hiz_mt` with `bool hiz`. The function used hiz_mt mostly as a boolean indicator of the presence of hiz, so the signature change is consistent with the patch's goal. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-04-10 10:55:10 -07:00
Chad Versace	5b79705526	i965: Change signature of brw_get_depthstencil_tile_masks() Add new parameters `depth_level` and `depth_layer`, which specify depth miptree's slice of interest. A following patch will pass the new parameters through to intel_miptree_slice_has_hiz(). Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-04-10 10:55:10 -07:00
Chad Versace	87f4541bc1	i965/blorp: Add fields brw_blorp_mip_info::level,layer The new fields define the 2D miptree slice to be used. A following patch will pass the new fields through to intel_miptree_slice_has_hiz(). Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-04-10 10:55:10 -07:00
Chad Versace	2a416a9b1b	intel: Add field intel_mipmap_slice::has_hiz On Haswell, HiZ will selectively be enabled on individual miptree slices to work around a hardware bug. The new field 'has_hiz' indicates if HiZ is enabled for a given slice. Also add two new accessor functions for this field: intel_miptree_slice_has_hiz and intel_renderbuffer_has_hiz. The new field and accessor functions are not yet used. Also, this patch introduces no behavioral change because, in this patch, intel_miptree_alloc_hiz() sets has_hiz for all slices. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-04-10 10:55:10 -07:00
Chad Versace	a14dc4f92c	i965/blorp: Align rectangle primitive for hiz ops The hardware docs and the simulator require that the rectangle primitive emitted during fast depth clears and hiz resolves must be aligned to 8x4 pixels. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-04-10 10:55:10 -07:00
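The 8x4 alignment requirement in the commit above comes down to rounding the rectangle's extent up to the next multiple of 8 pixels horizontally and 4 pixels vertically. Here is a rough C sketch under stated assumptions: `struct rect`, `align_up` and `align_hiz_rect` are hypothetical names, not the blorp code, and the rectangle's origin is assumed to be aligned already.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rectangle in pixels; (x1, y1) is the exclusive far corner. */
struct rect {
    uint32_t x0, y0, x1, y1;
};

/* Round v up to the next multiple of a, where a is a power of two. */
static uint32_t align_up(uint32_t v, uint32_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/* Grow the far edges so the rectangle covers whole 8x4 pixel blocks.
 * Assumes x0 and y0 are already multiples of 8 and 4 respectively. */
static struct rect align_hiz_rect(struct rect r)
{
    r.x1 = align_up(r.x1, 8);
    r.y1 = align_up(r.y1, 4);
    return r;
}
```

Rounding up rather than down keeps every pixel of the original rectangle inside the aligned one.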
Eric Anholt	d5f7aebac2	i965/vs: Use GRFs for pull constant offsets on gen7. This allows the computation of the offset to get written directly into the message source. shader-db results: total instructions in shared programs: 3308390 -> 3283025 (-0.77%) instructions in affected programs: 442998 -> 417633 (-5.73%) No difference in GLB2.7 low res (n=9). Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-10 09:45:21 -07:00
Eric Anholt	3badbf7f7f	i965/vs: When asked to make a dst_reg for a src.xxxx, just write to src.x. We have several places in our pull constant handling where we make a temporary src_reg for an int, and then turn it into a dst. In doing so, we were writing to the dst.xyzw, so we never register coalesced it with a later mov from dst.x to real_dst.x. These extra channels written would be removed if we had channel-wise DCE in the backend, but we don't. Fix it for now by just not writing these extra channels that won't get used. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-04-10 09:45:21 -07:00
Eric Anholt	007a88ed24	i965/gen6: Reduce updates of transform feedback offsets with HW contexts. The software-tracked transform feedback offsets (svbi_0_starting_index) are incorrect in the presence of primitive restart, so we were actually updating it with a bogus value if the batch wrapped and we emitted the packet again during a single transform feedback. By reducing state emission, we avoid the bug. Fixes piglit OpenGL 3.1/primitive-restart-xfb flush Reviewed-by: Paul Berry <stereotype441@gmail.com> NOTE: This is a candidate for the 9.1 branch.	2013-04-10 09:45:21 -07:00
Eric Anholt	62a18da341	i965/gen7: Skip resetting SOL offsets at batch start with HW contexts. The software-tracked transform feedback offsets (svbi_0_starting_index) are incorrect in the presence of primitive restart, so we can't reliably compute offsets for our buffer pointers after a batch flush. Thanks to HW contexts, our transform feedback offsets are now saved, so we can just keep using the ones from before the batch wrap. Fixes piglit OpenGL 3.1/primitive-restart-xfb flush Reviewed-by: Paul Berry <stereotype441@gmail.com> NOTE: This is a candidate for the 9.1 branch.	2013-04-10 09:45:21 -07:00
Christian König	ccf3e8fc9b	radeonsi: remove sampler writemask v3 v2: fix intrinsic name as well v3: LLVM revision incremented as well Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-10 10:41:29 +02:00
Niels Ole Salscheider	31f14f3def	pipe-loader: Fix out of source build Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>	2013-04-10 09:45:04 +02:00
Brian Paul	b74b510d64	st/mesa: remove #if FEATURE_GL/ES tests Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-09 18:43:40 -06:00
Brian Paul	c04e0b9f4b	mesa: remove old comment about FEATURE_GL Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-09 18:43:40 -06:00
Brian Paul	f490c6839b	mesa: remove #ifdef FEATURE_ES2, add some comments instead Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-09 18:43:40 -06:00
Brian Paul	9dc6f76e44	st/mesa: remove #include mfeatures.h None of these were needed. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-09 18:43:40 -06:00
Brian Paul	04bd972fc3	docs: initial 9.2 release notes file	2013-04-09 18:30:23 -06:00
Brian Paul	acd4fb8b5a	st/osmesa: re-use buffers in OSMesaMakeCurrent() Rather than creating a new buffer each time. Fixes problems found with vtk. Tested-by: Kevin H. Hobbs <hobbsk@ohio.edu>	2013-04-09 18:30:23 -06:00
Marek Olšák	4f1fd920c9	mesa: update derived framebuffer state in GetMultisamplefv This makes sure that ctx->DrawBuffer->Visual.samples is up-to-date. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-10 02:01:16 +02:00
Marek Olšák	b6475f9437	mesa: fix glGet queries depending on derived framebuffer state (v2) "ctx->DrawBuffer->Visual" might be invalid if (NewState &_NEW_BUFFERS) != 0. v2: also fix: - RGBA_INTEGER_MODE_EXT - RGBA_FLOAT_MODE_ARB (also check API support) - FRAMEBUFFER_SRGB_CAPABLE_EXT NOTE: This is a candidate for stable branches. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-10 02:01:16 +02:00
Paul Berry	34efd9214d	i965/gen7.5: Allow HW primitive restart for all primitive types. Gen7.5 (Haswell) hardware supports primitive restart for all primitive types. It also handles all possible primitive restart indices. Rather than specialize both can_cut_index_handle_restart_index() and the switch statement in can_cut_index_handle_prims() for Haswell, just return early if the hardware is Haswell because we know it can handle everything. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-09 15:37:36 -07:00
Paul Berry	a7388f8e6f	i965: Only use brw_draw.c's trim() function when necessary. brw_draw.c contains a trim() function which modifies the vertex count for quads and quad strips in order to discard dangling vertices. In principle this shouldn't be necessary, since hardware since Gen4 is capable of discarding dangling vertices by itself. However, it's necessary because as a hack to speed up rendering on Gen 4-5, we sometimes convert quads to trifans and quad strips to tristrips. The trim() function isn't necessary on Gen6 and up. This patch documents why and when the trim() function is necessary, and avoids calling it when it's not needed. This will avoid creating problems when we enable hardware support for primitive restart of quads and quad strips on Haswell. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-09 15:37:35 -07:00
Paul Berry	56ce7fa4b8	i965/vs: Fix DEBUG_SHADER_TIME when VS terminates with 2 URB writes. The call to emit_shader_time_end() before the second URB write was conditioned with "if (eot)", but eot is always false in this code path, so emit_shader_time_end() was never being called for vertex shaders that performed 2 URB writes. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-09 12:15:08 -07:00
Christian König	462647453c	st/vdpau: fix subtitle related bug v2 Drawing subtitles didn't increase the dirty area of the surface. Reported and tested by freeedrich on irc. v2: don't clear the surface Signed-off-by: Christian König <christian.koenig@amd.com>	2013-04-09 21:11:32 +02:00
Paul Berry	5306af2113	glsl/linker: Reduce scope of non-flat integer varying fix. In the mailing list discussion of "glsl/linker: fix varying packing for non-flat integer varyings." (commit `7862bde`), we concluded that since the bug only applies to integral variables, it is safer to just apply the bug fix to integer varyings. I forgot to make the change before pushing the patch upstream. (Note: we aren't aware of any bugs in commit 7862bde; it just seems wise to be on the safe side). This patch makes the change. Assuming commit `7862bde` gets cherry-picked back to 9.1, this commit should be cherry-picked too. NOTE: This is a candidate for the 9.1 release branch.	2013-04-09 10:37:16 -07:00
Paul Berry	32d2b2aa2c	glsl/linker: Adapt flat varying handling in preparation for geometry shaders. When a varying is consumed by transform feedback, but is not used by the fragment shader, assign_varying_locations() sets its interpolation type to "flat" in order to ensure that lower_packed_varyings never has to deal with non-flat integral varyings (the GLSL spec doesn't require integral vertex outputs to be flat if they aren't consumed by the fragment shader). A similar situation will arise when geometry shader support is added, since the GLSL spec only requires integral vertex shader outputs to be flat when they are consumed by the fragment shader. This patch modifies the linker to handle this situation too. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-09 10:25:57 -07:00
Paul Berry	8687c40c2d	glsl: Document lower_packed_varyings' "flat" requirement with an assert. To minimize the variety of type conversions that lower_packed_varyings needs to perform, it assumes that integral varyings are always qualified as "flat". link_varyings.cpp takes care of ensuring that this is the case (even in the circumstances where GLSL doesn't require it). This patch documents the assumption with an assertion, for ease in future debugging. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-09 10:25:19 -07:00
Paul Berry	7862bde8af	glsl/linker: fix varying packing for non-flat integer varyings. Commit `dfb57e7` (glsl: Fix error checking on "flat" keyword to match GLSL ES 3.00, GLSL 1.50) relaxed the rules for integral varyings: they only need to be declared as "flat" if they are fragment shader inputs. This allowed for the possibility of a vertex shader output being a non-flat integer, provided that it was not matched to a fragment shader input. A non-contrived situation where this might arise is if a vertex shader generates some integral outputs which are consumed by transform feedback, but not by the fragment shader. Unfortunately, lower_packed_varyings assumes that all integral varyings are flat, regardless of whether they are consumed by the fragment shader. As a result, attempting to create a non-flat integral vertex output of a size that required packing (i.e. a size other than ivec4 or uvec4) would cause an assertion failure in lower_packed_varyings. This patch prevents the assertion failure by forcing vertex shader outputs to be "flat" whenever they are not consumed by the fragment shader. This should have no effect on rendering since the "flat" keyword only affects the behaviour of fragment shader inputs. Fixes piglit test "spec/EXT_transform_feedback/nonflat-integral". NOTE: This is a candidate for the 9.1 release branch. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-04-09 10:25:15 -07:00
Paul Berry	778ce82b71	glsl: Check the size of ir_print_visitor's mode[] array with STATIC_ASSERT. ir_print_visitor::visit(ir_variable *)'s mode[] array needs to match the declaration of the enum ir_variable_mode. It's hard to verify that at compile time, but at least we can use a STATIC_ASSERT to make sure it's the right size. This required adding ir_var_mode_count to the enum.	2013-04-09 10:19:22 -07:00
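The size check described in the commit above follows a common pattern: end the enum with a count value and assert at compile time that the lookup table has exactly that many entries. A minimal C11 sketch, with hypothetical names (`enum var_mode`, `mode_names`, `mode_name`) standing in for the real ir_variable_mode and mode[] array, and C11 `static_assert` standing in for Mesa's STATIC_ASSERT macro:

```c
#include <assert.h> /* provides static_assert in C11 */

/* Hypothetical enum; the trailing *_COUNT value tracks the number of modes. */
enum var_mode {
    VAR_AUTO,
    VAR_UNIFORM,
    VAR_IN,
    VAR_OUT,
    VAR_MODE_COUNT
};

/* Printable prefix for each mode, indexed by enum var_mode. */
static const char *const mode_names[] = { "", "uniform ", "in ", "out " };

/* Fails the build if a mode is added to the enum but not to the table. */
static_assert(sizeof(mode_names) / sizeof(mode_names[0]) == VAR_MODE_COUNT,
              "mode_names[] must have one entry per enum var_mode value");

static const char *mode_name(enum var_mode m)
{
    return mode_names[m];
}
```

As the commit message notes, this only checks the size of the table; it cannot verify that the entries are in the right order.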
Paul Berry	67f226e179	glsl: Fix ir_print_visitor's handling of interpolation qualifiers. This patch updates the interp[] array to match the enum glsl_interp_qualifier. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Add a STATIC_ASSERT to make sure the array is the correct size. This required adding INTERP_QUALIFIER_COUNT to the enum.	2013-04-09 10:19:11 -07:00
Johannes Obermayr	c295874129	autotools: Better describe which cases OProfileJIT is required. Signed-off-by: José Fonseca <jfonseca@vmware.com>	2013-04-09 17:38:42 +01:00
Brian Paul	4ad360133c	softpipe: misc updates to image dumping in softpipe_flush()	2013-04-09 08:27:53 -06:00
Vinson Lee	04ffce3004	tgsi: Ensure struct tgsi_ind_register field Index is initialized. Fixes uninitialized scalar variable defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-08 18:59:34 -07:00
Martin Andersson	a8246927e3	r600g: Fix UMAD on Cayman The multiplication part of tgsi_umad did not work on Cayman, because it did not populate the correct vector slots. This fixed hardlocks in the EXT_transform_feedback/order tests. NOTE: This is a candidate for the stable branches. (might not be easy to cherry-pick though) Signed-off-by: Marek Olšák <maraeo@gmail.com>	2013-04-09 03:09:37 +02:00
Kenneth Graunke	b76539aabe	intel: Remove the texture_tiling driconf option. This option can force textures to be untiled. However, on Gen6+, depth buffers must be Y-tiled. MSAA buffers also must be Y-tiled. So setting this option on even a trivial application like glxgears causes assertion failures in a debug build, and likely GPU hangs in a release build. It's just giving users a license to shoot themselves in the foot. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-08 16:15:07 -07:00
Kenneth Graunke	55ecc448b9	i965: Prefer Y-tiling on Gen6+. In the past, we preferred X-tiling for color buffers because our BLT code couldn't handle Y-tiling. However, the BLT paths have been largely replaced by BLORP on Gen6+, which can handle any kind of tiling. We hadn't measured any performance improvement in the past, but that's probably because compressed textures were all untiled anyway. Improves performance in GLB27_TRex_C24Z16_FixedTime by 7.69231%. v2: Rebase on top of Eric's untiled-for-larger-than-aperture changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-08 16:15:07 -07:00
Kenneth Graunke	40e30c1ca1	i965: Use tiling even for compressed textures. The code has no rationale for why we would force compressed textures to be untiled, and it appears to work fine. Git archeology indicates that it's been that way dating back to when we first started tiling. Improves performance in GLB27_TRex_C24Z16_FixedTimeStep at 1280x720 by 10.0529% +/- 0.573075% (n=12). Improves performance in Xonotic by 4.56409% +/- 0.27965% (n=3). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-08 16:15:07 -07:00
Chad Versace	f709198b10	intel: Refactor selection of miptree tiling This patch (1) extracts from intel_miptree_create() the spaghetti logic that selects the tiling format, (2) rewrites that spaghetti into a lucid form, and (3) moves it to a new function, intel_miptree_choose_tiling(). No behavioral change. As a bonus, it is now evident that the force_y_tiling parameter to intel_miptree_create() does not really force Y tiling. v2 (Ken): Rebase on top of Eric's untiled-for-larger-than-aperture changes. This required passing in the miptree. Signed-off-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-08 16:15:06 -07:00
Chad Versace	aa391976df	intel: Allocate hiz in intel_renderbuffer_move_to_temp() When moving the renderbuffer to a new miptree, we neglected to allocate the hiz buffer for the new miptree. Oops. Fixes all Piglit depthstencil-render-miplevels tests from crash to pass on Sandybridge. Note: This is a candidate for the 9.1 branch. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-04-08 16:09:26 -07:00
Dave Airlie	d0bf48f8e9	st/mesa: fix levels in initial texture creation calim pointed out we were getting mipmap levels for array multisamples, this didn't make sense. So then I noticed this function takes last_level so we are passing in a too high value here. I think this should fix the case he was seeing. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-04-08 23:56:06 +01:00
Ian Romanick	58d93e3247	glsl: Don't early-out for error-type inputs Check the type of the array operand and the index operand before doing other checks. This simplifies the code a bit now (eliminating the error_emitted parameter), and enables some later functional changes. The shader uniform float x[6]; uniform sampler2D s; void main() { gl_Position.x = xx[s + 1]; } still generates (only) the two expected errors: 0:3(33): error: `xx' undeclared 0:3(39): error: Operands to arithmetic operators must be numeric Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-08 15:17:05 -07:00
Ian Romanick	a131b87706	glsl: Don't emit spurious errors for constant indexes of the wrong type Previously the shader uniform float x[6]; void main() { gl_Position.x = x[1.0]; } would have generated the errors 0:2(33): error: array index must be integer type 0:2(36): error: array index must be < 6 Now only 0:2(33): error: array index must be integer type will be generated. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-08 15:17:05 -07:00
Ian Romanick	a70d2f05dc	glsl: Collect all of the non-constant index error checks together This puts all of the checks together for easier reading. It also means that all the checks are blocked on array->type->is_array. Shortly this will allow elimination of some is_error check work-arounds in this function. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-08 15:17:05 -07:00
Ian Romanick	f9d8ca2817	glsl: Minor code compaction in _mesa_ast_array_index_to_hir Also, document the reason for not checking for type->is_array in some of the bound-checking cases. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-08 15:17:05 -07:00
Ian Romanick	2c333a878c	glsl: Don't return a value from check_builtin_array_max_size The last consumer of the return value was changed to not use it by the previous commit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-08 15:17:05 -07:00
Ian Romanick	666fafc144	glsl: Remove some unnecessary uses of error_emitted The error_emitted flag is used in semantic checking to prevent spurious cascading errors. For example, void foo(sampler2D s, float a) { float x = a + (1.2 + s); ... } should only generate a single error. Without the error_emitted flag for the first error, "a + ..." would also generate an error. However, a bunch of cases in _mesa_ast_array_index_to_hir that were setting error_emitted would mask legitimate errors. For example, vec4 a[7]; float b = a[3.14]; should generate two errors (float index and type mismatch in assignment). The uses of error_emitted would cause only the first to be emitted. This patch removes most of the places in _mesa_ast_array_index_to_hir that would set the error_emitted flag. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-08 15:17:05 -07:00
Ian Romanick	46934adb8d	glsl: Refactor handling of ast_array_index to a separate function I love 800+ line switch-statements as much as the next guy... Future commits will make changes to this part of the AST-to-HIR conversion, and extracting this code will make that a bit easier. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-08 15:17:05 -07:00
Ian Romanick	cd39ae7394	glsl: Make check_builtin_array_max_size externally visible A future commit will try to use this function in a different file. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-08 15:17:05 -07:00
Eric Anholt	ca9a7d975a	intel: Avoid making tiled miptrees we won't be able to blit. Doing so was breaking miptree mapping, which we really need to be able to handle. With this change, intel_miptree_map_direct() falls through to doing a CPU mapping on the buffer like we need. With the previous 2 patches, all of these should be fixed: piglit max-texture-size (all 3 patches required!) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37871 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44958 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53494 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-08 11:49:33 -07:00
Eric Anholt	dfed115090	intel: Do temporary CPU maps of textures that are too big to GTT map. This still fails, since 8192*4bpp == 32768, which is too big to use the blitter on. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-08 11:49:25 -07:00
Eric Anholt	b3a3cb9611	intel: Add support for writing to our linear-temporary-CPU-map case. This will be used for handling updates of large textures. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>.	2013-04-08 11:49:20 -07:00
Kenneth Graunke	97e40a524e	intel: Remove check for kernel 2.6.29. Now that we require 2.6.39, there's no need to also check for 2.6.29. Calling drm_intel_bufmgr_gem_enable_fenced_relocs() without checking should be safe, as it simply sets a flag. This does remove the check for zero fences available, but that doesn't seem worth checking. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-08 11:03:08 -07:00
Kenneth Graunke	394edb5af5	intel: Require kernel 2.6.39 for relaxed relocation support. Chris Wilson's relaxed relocation patch landed in March 2011. Anyone running pre-3.0 kernels probably isn't going to get the latest Mesa anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-08 11:03:08 -07:00
Kenneth Graunke	d7fd5696e6	i965: Remove a few BRW_STATE_... enum values. These were likely used for BRW_NEW_... dirty bit flags at one point, but they're unused now. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-08 11:03:08 -07:00
Kenneth Graunke	79c27e7528	i965: Remove brw->vb.info and struct brw_vertex_info. Nobody uses this value, so there's no need to set it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-08 11:03:08 -07:00
Kenneth Graunke	b29dc25572	i965: Remove the BRW_NEW_INPUT_DIMENSIONS flag. When I removed the proj_attrib_mask optimization, I also removed the last consumer of this bit without realizing it. Since nobody uses it, there's no point in flagging it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-08 11:03:08 -07:00
Matt Turner	2e177bc8a5	register_allocate: Fix the type of best_benefit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-08 10:30:40 -07:00
Tom Stellard	a5a76782d5	radeon/llvm: Bump minimum LLVM version to 3.3	2013-04-08 07:43:34 -07:00
Niels Ole Salscheider	b336f51cc7	clover: Fix linkage of libOpenCL Clover needs the irreader component of llvm v2: Check for irreader component irreader is only available with LLVM 3.3 >= 177971 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>	2013-04-08 07:08:10 -07:00
Vincent Lejeune	5019af2145	r600g/llvm: Add support for native isa for pre EG This fixes bug 62756 : https://bugs.freedesktop.org/show_bug.cgi?id=62756#c12	2013-04-08 15:11:59 +02:00
Marek Olšák	eff66bc9f8	gallium/util: add const to a parameter of util_max_layer	2013-04-06 23:57:15 +02:00
Marek Olšák	08275b25cc	st/mesa: don't expose ARB_color_buffer_float without driver support in GL core Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-06 23:57:12 +02:00
Marek Olšák	3264c3e997	mesa: allow drivers not to expose ARB_color_buffer_float in GL core profile Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-06 23:57:10 +02:00
Marek Olšák	9d4f67600b	mesa: move updating clamp control derived state out of mesa_update_state_locked It has 2 dependencies: glClampColor and the framebuffer, we might just as well do the update where those two are changed. v2: cosmetic changes from Brian's email Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-06 23:57:09 +02:00
Marek Olšák	755648c37f	mesa: don't set _ClampFragmentColor to TRUE if it has no effect This should reduce shader recompilations with drivers that emulate fragment color clamping, because we want the clamping to be enabled only if there is a signed normalized or floating-point colorbuffer. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-06 23:57:06 +02:00
Marek Olšák	21d407c1b8	mesa: refactor clamping controls, get rid of _ClampReadColor v2: cosmetic changes from Brian's email Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-06 23:57:04 +02:00
Chris Forbes	c4629ad3f9	mesa: don't memcmp() off the end of a cache key. Reported-by: `per` in #intel-gfx The size of the cache key varies, so store the actual size as well as the key blob itself, rather than just assuming it's the same as the size passed in. NOTE: This is a candidate for stable branches. V2: Don't leave silly holes in structure; use unsigned instead of GLuint. V3: Fix missing case for `last` match. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-04-06 18:30:08 +13:00
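The fix described in the commit above amounts to recording each key's real length next to the key blob and comparing lengths before calling memcmp(). A minimal C sketch of that idea, assuming a hypothetical `struct cache_entry` and `key_matches` helper rather than Mesa's actual cache structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical cache entry: the key blob plus its actual size in bytes. */
struct cache_entry {
    unsigned key_size;
    const void *key;
};

/* Compare a stored key with a lookup key. Checking the sizes first means
 * memcmp() never reads past the end of the shorter blob. */
static bool key_matches(const struct cache_entry *e,
                        const void *key, unsigned key_size)
{
    assert(e != NULL && key != NULL);
    return e->key_size == key_size &&
           memcmp(e->key, key, key_size) == 0;
}
```

Because `&&` short-circuits, memcmp() only runs when both blobs are known to be `key_size` bytes long.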
Tom Stellard	302f53dc20	radeonsi: Add compute support v3 v2: - Only dump shaders when env variable is set. v3: - Don't emit VGT registers Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-05 18:43:34 -04:00
Tom Stellard	4f7fe2cf2c	radeonsi: Set TCL1_ACTION_ENA when invalidating the texture cache Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-05 18:43:34 -04:00
Tom Stellard	0ccf82c557	radeonsi: Remove si_pm4_inval_vertex_cache() This function is a holdover from r600g and is identical to si_pm4_inval_texture_cache(), so it is not needed. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-05 18:43:34 -04:00
Tom Stellard	c5e5b3401c	gallium: PIPE_COMPUTE_CAP_IR_TARGET - allow drivers to specify a processor v2 This target string now contains four values instead of three. The old processor field (which was really being interpreted as arch) has been split into two fields: processor and arch. This allows drivers to pass a more detailed description of the hardware to compiler frontends. v2: - Adapt to libclc changes Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-04-05 18:43:34 -04:00
Wladimir	1a868acbec	util: add ETC as compressed format Add UTIL_FORMAT_LAYOUT_ETC to util_format_is_compressed. It was missing. Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-05 16:14:51 -06:00
Brian Paul	de99b6d117	gallium/u_blitter: fix is_blit_generic_supported() stencil checking Don't check if there's sampler support for stencil if we're not going to actually blit/copy stencil values. Fixes the case where we mistakenly said we can't support a blit of depth values from S8Z24 to X8Z24. Also, rename the is_stencil variable to dst_has_stencil to improve readability. NOTE: This is a candidate for the stable branches. Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-05 16:14:51 -06:00
Alexander Monakov	9cda356004	Honor GLX_DONT_CARE in MATCH_MASK NOTE: This is a candidate for stable branches. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47478 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62999 Bugzilla: http://bugs.winehq.org/show_bug.cgi?id=26763	2013-04-05 14:32:45 -07:00
Rob Clark	aac7f06ad8	freedreno: use autogenerated register defs Switch to use the envytools generated headers for register/bitfield definitions. This is the first step in preparing to add a3xx support, since it avoids having conflicting names for a3xx and a2xx registers. And since I'm using envytools for a3xx it is simpler to just use it for everything. This shouldn't cause any functional change, it is really just a lot of renaming. Signed-off-by: Rob Clark <robdclark@gmail.com>	2013-04-05 14:33:16 -04:00
José Fonseca	1fefc65d20	st/wgl: Install our windows message hook to threads created before the ICD is loaded. Otherwise we will not receive destroy windows events, causing framebuffers to leak. This happens particularly with java and jogl. Tested with java + jogl, MATLAB. VMware Internal Bug Number: 1013086. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-05 18:27:54 +01:00
Adam Jackson	ca70de9bd2	llvmpipe: Work without sse2 if llvm is new enough At least on llvm 3.2 this appears to work fine. Tested on an Athlon XP 2600+, which has sse and 3dnow but not sse2. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2013-04-05 11:32:53 -04:00
Jerome Glisse	b8998f976e	winsys/radeon: add command stream replay dump for faulty lockup v3 Build-time option: set RADEON_CS_DUMP_ON_LOCKUP to 1 in radeon_drm_cs.h to enable it. When enabled, after each cs submission the code will try to detect a lockup by waiting for one of the buffers of the cs to become idle; after a timeout it will consider that the cs triggered a lockup and will write a radeon_lockup.c file in the current directory that has all the information for replaying the cs. To build this file: gcc -O0 -g radeon_lockup.c -ldrm -o radeon_lockup -I/usr/include/libdrm v2: Add radeon_ctx.h file to mesa git tree v3: Slightly improve dumped file for easier editing, only dump first faulty cs Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-04-05 10:22:05 -04:00
Brian Paul	5192262833	st/xlib: add HUD support for xlib/GLX For the softpipe and llvmpipe drivers. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-04 17:00:42 -06:00
Brian Paul	f5071783c1	gallium/hud: add GALLIUM_HUD_PERIOD env var To set the graph update rate, in seconds. The default update rate has also been changed to 1/2 second. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-04 17:00:42 -06:00
Brian Paul	6211c45186	gallium/hud: initialize sampler state The default wrap mode (PIPE_TEX_WRAP_REPEAT) is incompatible with unnormalized texcoords (at least for softpipe). v2: use PIPE_TEX_WRAP_CLAMP_TO_EDGE Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-04 17:00:42 -06:00
Kenneth Graunke	edc52a8f28	glsl: Add an optimization pass to flatten simple nested if blocks. GLBenchmark 2.7's shaders contain conditional blocks like: if (x) { if (y) { ... } } where the outer conditional's then clause contains exactly one statement (the nested if) and there are no else clauses. This can easily be optimized into: if (x && y) { ... } This saves a few instructions in GLBenchmark 2.7: total instructions in shared programs: 11833 -> 11649 (-1.55%) instructions in affected programs: 8234 -> 8050 (-2.23%) It also helps CS:GO slightly (-0.05%/-0.22%). More importantly, however, it simplifies the control flow graph, which could enable other optimizations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-04 15:38:19 -07:00
Kenneth Graunke	967514ce68	i965: Use a variable for the push constant size in kB. This clarifies that the offset of 2 is actually 16 kB / 8kB units. It also keys both computations off of a single variable, which should make it easier to change in the future. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-04-04 15:38:19 -07:00
Kenneth Graunke	8cdb2d32ec	i965: Turn brw->urb.vs_size and gs_size into local variables. These variables are only used within a single function, so we may as well make them local variables. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-04-04 15:38:19 -07:00
Kenneth Graunke	b99ad7f02c	i965: Remove BRW_NEW_WM_INPUT_DIMENSIONS dirty bit. This was only produced by the brw_wm_input_dimensions atom, which was removed in the previous commit. So there's no need for the dirty bit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-04 15:38:19 -07:00
Kenneth Graunke	d198546bac	i965: Delete brw_vs_constval.c and the brw_wm_input_sizes atom. This was only used to compute proj_attrib_mask, which was removed by the previous commit. That makes this dead code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-04 15:38:19 -07:00
Kenneth Graunke	705c8247fa	i965: Remove now dead brw_wm_prog_key::proj_attrib_mask field. The previous commit removed the last user of this field, so there's no longer any point in setting it. Removing this should eliminate state-dependent recompiles, and make the precompile more reliable. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-04 15:38:19 -07:00
Kenneth Graunke	7183568869	i965: Remove fixed-function texture projection avoidance optimization. This optimization attempts to avoid extra attribute interpolation instructions for texture coordinates where the W-component is 1.0. Unfortunately, it requires a lot of complexity: the brw_wm_input_sizes state atom (all the brw_vs_constval.c code) needs to run on each draw. It computes the input_size_masks array, then uses that to compute proj_attrib_mask. Differences in proj_attrib_mask can cause state-dependent fragment shader recompiles. We also often fail to guess proj_attrib_mask for the fragment shader precompile, causing us to needlessly compile it twice. Furthermore, this optimization only applies to fixed-function programs; it does not help modern GLSL-based programs at all. Generally, older fixed-function programs run fine on modern hardware anyway. The optimization has existed in some form since the initial commit. When we rewrote the fragment shader backend, we dropped it for a while. Eric readded it in commit `eb30820f26` as part of an attempt to cure a ~1% performance regression caused by converting the fixed-function fragment shader generation code from Mesa IR to GLSL IR. However, no performance data was included in the commit message, so it's unclear whether or not it was successful. Time has passed, so I decided to re-measure this. Surprisingly, Eric's OpenArena timedemo actually runs /faster/ after removing this and the brw_wm_input_sizes atom. On Ivybridge at 1024x768, I measured a 1.39532% +/- 0.91833% increase in FPS (n = 55). On Ironlake, there was no statistically significant difference (n = 37). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-04-04 15:38:19 -07:00
Kenneth Graunke	32726b1af6	i965: Use ctx->Stencil._WriteEnabled in DEPTH_STENCIL_STATE. This is the same computation as the _WriteEnabled flag, so we may as well use it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-04-04 15:38:19 -07:00
Kenneth Graunke	01bd29d681	i965: Fix stencil write enable flag in 3DSTATE_DEPTH_BUFFER on Gen7+. ctx->Stencil.WriteMask is a statically sized array of 3 elements. Checking it against 0 actually is a NULL check, and can never fail, which meant that we always said stencil writes were enabled. Use the new core Mesa derived state flag to fix this. NOTE: This is a candidate for stable branches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-04-04 15:38:18 -07:00
Kenneth Graunke	1e3235d36e	mesa: Add new ctx->Stencil._WriteEnabled derived state flag. i965 needs to know whether stencil writes are enabled in several places, and gets the test wrong sometimes. While we could create a function to compute this, it seems generally useful enough to warrant a new piece of derived state. Also, all the plumbing is already in place. NOTE: This is a candidate for stable branches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-04-04 15:38:18 -07:00
Roland Scheidegger	9eef86bb55	gallivm: some minor cube map cleanup The ar_ge_as_at variable was just very very confusing since the condition was actually the other way around (as_at_ge_ar). So change the condition (and the selects depending on it) to match the variable name. And also change the chosen major axis in case the coord values are the same. OpenGL doesn't care one bit which one is chosen in this case but it looks like dx10 would require z chosen over y, and y chosen over x (previously did x chosen over y, y chosen over z). Since it's all the same effort just honor dx10's wishes. (Though actually, for some preferred orderings, we could save one (or two with derivatives) selects since the tnewx and tnewz (and the corresponding dmax values) are the same.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-04 23:22:10 +02:00
Eric Anholt	b6e9b54d06	i965: Ask the register allocator to round-robin through registers. The way we were allocating registers before, packing into low register numbers for Ironlake, resulted in an overly-constrained dependency graph for instruction scheduling. Improves GLBenchmark 2.1 performance by 4.5% +/- 0.7% (n=26). No difference on my old GLSL demo (n=20). No difference on nexuiz (n=15). v2: Fix off-by-one bug that made the change only work for 16-wide on i965. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-04 12:51:06 -07:00
Zack Rusin	be9a42e980	llvmpipe: implement ucmp and add a test for it Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-04 12:09:55 -07:00
Paul Berry	5db2249493	Avoid spurious GCC warnings in STATIC_ASSERT() macro. GCC 4.8 now warns about typedefs that are local to a scope and not used anywhere within that scope. This produced spurious warnings with the STATIC_ASSERT() macro (which used a typedef to provoke a compile error in the event of an assertion failure). This patch switches to a simpler technique that avoids the warning. v2: Avoid GCC-specific syntax. Also update p_compiler.h. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-04 09:52:18 -07:00
Erik Faye-Lund	456f40e18d	freedreno: document debug flag Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Brian Paul <brianp@vmware.com>	2013-04-04 10:41:50 -06:00
Brian Paul	e95514c0ea	st/wgl: add HUD support v2: fix a few minor issues spotted by Jose. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-04 10:41:35 -06:00
Brian Paul	0c1dcf906d	st/wgl: make stw_current_context() non-static Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-04 08:50:16 -06:00
Brian Paul	92e5e45ff1	util: add debug_memory_check_block(), debug_memory_tag() The former just checks that the given block is valid by checking the header and footer. The latter sets the memory block's tag. With extra debug code, we can use that for monitoring/checking particular allocations. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-04 08:50:15 -06:00
Brian Paul	a408ea9692	gallium/hud: replace malloc w/ MALLOC To match the FREE() call used later. Fixes things on Windows. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-04 08:50:15 -06:00
Vincent Lejeune	9276961223	r600g/llvm: Workaround for wrong tex.offset_*	2013-04-04 16:03:04 +02:00
Roland Scheidegger	ce5096a0a9	gallivm: honor explicit derivatives values for cube maps. This is trivial now, though need to make sure we pass all the necessary derivative values (which is 3 each for ddx/ddy not 2). Passes piglit arb_shader_texture_lod-texgradcube test. v2: add the forgotten abs() for all incoming derivatives (discovered by new piglit arb_shader_texture_lod-texgradcube test, though more by luck as it was failing only for exactly one pixel...). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-04 01:03:42 +02:00
Roland Scheidegger	f621015cb5	gallivm: do per-pixel cube face selection (finally!!!) This proved to be tricky, the problem is that after selection/mirroring we cannot calculate reasonable derivatives (if not all pixels in a quad end up on the same face the derivatives could get "randomly" exceedingly large). However, it is actually quite easy to simply calculate the derivatives before selection/mirroring and then transform them similar to the cube coordinates (they only need selection/projection, but not mirroring as we're not interested in the sign bit, of course). While there is a tiny bit more work to do (need to calculate derivs for 3 coords instead of 2, and additional selects) it also simplifies things somewhat for the coord selection itself (as we save some broadcast aos shuffles, and we don't need to calculate the average vector) - hence if derivatives aren't needed this should actually be faster. Also, this has the benefit that this will (trivially) work for explicit derivatives too, which we completely ignored before that (will be in a separate commit for better trackability). Note that while the way for getting rho looks very different, it should result in "nearly" the same values as before (the "nearly" is only because before the code would choose the face based on an "average" vector and hence the derivatives calculated according to this face, where now (for implicit derivatives) the derivatives are projected on the face selected for the first (top-left) pixel in a quad, so not necessarily the same face). The transformation done might not quite be state-of-the-art, calculating length(dx,dy) as max(dx,dy) certainly isn't either but this stays the same as before (that is I think a better transform would _somehow_ take the "derivative major axis" into account so that derivative changes in the major axis wouldn't get ignored). 
Should solve some accuracy problems with cubemaps (can easily be seen with the cubemap demo when switching wrapping/filtering), though we still don't do seamless filtering to fix it completely (so not per-sample but per-pixel is certainly better than per-quad and already sufficient for accurate results with nearest tex filter). As for performance, it seems to be a tiny bit faster too (maybe 3% or so with cubemap demo). Which I'd have expected with nearest/nearest filtering where this will be less instructions, but the difference seems to actually be larger with linear/linear_mipmap_linear where it is slightly more instructions, probably the code appears less serialized allowing better scheduling (on a sandy bridge cpu). It actually seems to be now at least as fast as the old path using a conditional when using 128bit vectors too (that is probably more a result of testing with a newer cpu though), for now that old path is still there but unused. No piglit regressions. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-04 01:03:42 +02:00
Roland Scheidegger	bdfbeb9633	gallivm: minor rho calculation optimization for 1 or 3 coords Using a different packing for the single coord case should save a shuffle. Plus some minor style fixes. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-04 01:03:42 +02:00
Roland Scheidegger	067a0ae420	gallivm: use f16c hw support for float->half and half->float conversion Should be way faster of course on cpus supporting this (includes AMD Bulldozer and Jaguar cores, Intel Ivy Bridge and up (except budget models)). Passes piglit fbo-blending-formats GL_ARB_texture_float -auto on Ivy Bridge. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-04-04 01:03:42 +02:00
Zack Rusin	302df7cc85	draw/llvmpipe: allow independent so attachments to the vs When geometry shaders are present, one needs to be able to create an empty geometry shader with stream output that needs to be resolved later and attached to the currently bound vertex shader. Let's add support for it to llvmpipe and draw. draw allows attaching independent stream output info to any vertex shader and llvmpipe resolves at draw time which vertex shader the given empty geometry shader should be linked to. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	246e68735f	llvmpipe: reset so buffers when not appending We need to reset the internal state of the so buffers or we'll keep appending even though we're not supposed to. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	7ca65a68e1	draw: remove unused function we use draw_set_mapped_so_targets nowadays Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	b16ae0f792	draw/llvm: use an enum instead of magic numbers I think this was there before and got accidentally removed during a merge. Same code as for the GS context, which is also using an enum instead of hardcoded numbers. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	49b7d933f8	draw/gs: cleanup some debugging code Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	822c21c776	draw/so: maintain an exact number of written vertices It's quite helpful during the rendering when we know exactly the count of the vertices available in the buffer. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	d8543bd752	draw: Implement support for primitive id We were largely ignoring primitive id. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	f6bfb62c50	draw/so: Fix bogus assert We do support so with multiple primitives. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	e6fc635351	draw/gs: Fix memory corruption with multiple primitives We were flushing with incorrect number of primitives. TGSI exec can only work with a single primitive at a time. Plus the fetching with multiple primitives on llvm paths wasn't copying the last element. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Zack Rusin	f313b0c850	gallivm: cleanup the gs interface Instead of void pointers use a base interface. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-03 10:16:25 -07:00
Brian Paul	ac114c6824	svga: add new memory-used HUD query To track the amount of memory used by all pipe_resources (textures and buffers). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-03 11:02:47 -06:00
Brian Paul	a69efa9482	util: add new util_resource_size() function in u_resource.[ch] Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-03 11:02:47 -06:00
Brian Paul	a3cccdec90	util: move functions from u_resource.c to u_transfer.c The functions are prototyped in u_transfer.h and are related to the other functions in u_transfer.c. The next patch will re-use the u_resource.c file for new code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-03 11:02:47 -06:00
Vincent Lejeune	159d934066	r600g/llvm: Do not override llvm provided stack_size	2013-04-03 18:39:49 +02:00
Vincent Lejeune	097a6ecdfe	r600g/llvm: Do not change cf_alu inst when adding alus	2013-04-03 18:22:40 +02:00
Marek Olšák	ff01e0db0e	radeonsi: add more cases for copying unsupported formats to resource_copy_region Ported from r600g commit: `8891b2f9c9` Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> NOTE: This is a candidate for the 9.1 branch.	2013-04-03 10:58:33 -04:00
Brian Paul	3838edaf5d	svga: add HUD queries for number of draw calls, number of fallbacks The fallbacks count is the number of drawing calls that use a "draw" module fallback, such as polygon stipple. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-03 09:56:08 -06:00
Brian Paul	49ed1f3cb3	svga: refactor occlusion query code This is in preparation for adding new query types for the HUD. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-03 09:56:07 -06:00
Brian Paul	a9ae7e9c28	gallium/hud: try L8 texture for font if I8 format isn't supported	2013-04-03 09:44:57 -06:00
Brian Paul	0289ebaa0f	svga: add case for PIPE_CAP_QUERY_PIPELINE_STATISTICS	2013-04-03 08:19:44 -06:00
Brian Paul	7e28debb6f	st/mesa: rewrite comment in st_manager.c	2013-04-03 08:16:36 -06:00
Christoph Bumiller	80eef069f0	nv50,nvc0: remove MS resolve formats hack Mesa now allows BlitFramebuffer resolve between RGBA and BGRA.	2013-04-03 13:19:15 +02:00
Christoph Bumiller	4de70bf43c	nvc0: fix 128 bit compressed storage type selection	2013-04-03 12:54:44 +02:00
Christoph Bumiller	8e1dd58a7e	nvc0: place staging textures in GART and map them directly	2013-04-03 12:54:44 +02:00
Christoph Bumiller	ba9b0b682f	nv50: account for pesky prefetch in size calculation of linear textures	2013-04-03 12:54:44 +02:00
Christoph Bumiller	f0a0d59f0f	nvc0: honour scaled coordinates setting for linear textures	2013-04-03 12:54:44 +02:00
Christoph Bumiller	d801545964	nvc0: fix for 2d engine R source formats writing RRR1 and not R001	2013-04-03 12:54:43 +02:00
Christoph Bumiller	6417d56c19	nv50,nvc0: disable DEPTH_RANGE_NEAR/FAR clipping during blit We send position.z == 0, DEPTH_RANGE may be some arbitrary range not including 0 (for example in piglit's hiz tests).	2013-04-03 12:54:43 +02:00
Christoph Bumiller	e45c969fe5	st/mesa: fix bitmap,drawpix,drawtex for PIPE_CAP_TGSI_TEXCOORD NOTE: Changed the semantic index for the drawtex coordinate to be the texture unit index instead of always 0. Not sure if this is correct but since the value seems to depend on the unit it would make sense to use different varying slots.	2013-04-03 12:54:43 +02:00
Christoph Bumiller	2a8145d36b	nouveau: accelerate buffer copies in resource_copy_region	2013-04-03 12:54:43 +02:00
Christoph Bumiller	3ed4bbd769	nvc0: demagic some of the NVE4_COMPUTE_UPLOAD methods It's actually the same as P2MF.	2013-04-03 12:54:43 +02:00
Christoph Bumiller	fb0334adb3	nvc0: read PM counters for each warp scheduler separately	2013-04-03 12:54:43 +02:00
Christoph Bumiller	7bac075f25	nvc0: add some metrics to driver specific queries	2013-04-03 12:54:43 +02:00
Christoph Bumiller	198f514aa6	nvc0: add some driver statistics queries	2013-04-03 12:54:43 +02:00
Christoph Bumiller	7628cc247f	nvc0: disable compressed storage type 0xdb for now Single-sample color compression doesn't seem that useful anyway.	2013-04-03 12:54:43 +02:00
Christoph Bumiller	ea12fc3f6c	nvc0: use correct hw query for PRIMITIVES_GENERATED It was the same as SO_STATISTICS[1] before.	2013-04-03 12:54:43 +02:00
Christoph Bumiller	6bca4e7085	nvc0: use fence to check state of queries that don't write sequence This still isn't optimal, since the fence will signal a bit late, but better than checking on the bo, which may never be ready if it is shared (which is likely).	2013-04-03 12:54:43 +02:00
Christoph Bumiller	3d2790cead	gallium/hud: add support for PIPE_QUERY_PIPELINE_STATISTICS Also, renamed "pixels-rendered" to "samples-passed" because the occlusion counter increments even if colour and depth writes are disabled, or (on some implementations) for killed fragments that passed the depth test when PS early_fragment_tests is set.	2013-04-03 12:54:43 +02:00
Christoph Bumiller	c620aad71c	gallium/docs: fix definition of PIPE_QUERY_SO_STATISTICS Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-03 12:54:43 +02:00
Christoph Bumiller	f35e96d973	gallium: add PIPE_CAP_QUERY_PIPELINE_STATISTICS Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-03 12:54:43 +02:00
Paul Berry	41e4bccc75	i965: Reduce code duplication in handling of depth, stencil, and HiZ. This patch consolidates duplicate code in the brw_depthbuffer and gen7_depthbuffer state atoms. Previously, these state atoms contained 5 chunks of code for emitting the _3DSTATE_DEPTH_BUFFER packet (3 for Gen4-6 and 2 for Gen7). Also a lot of logic for determining the appropriate buffer setup was duplicated between the Gen4-6 and Gen7 functions. This refactor splits the code into three separate functions: brw_emit_depthbuffer(), which determines the appropriate buffer setup in a mostly generation-independent way, brw_emit_depth_stencil_hiz(), which emits the appropriate state packets for Gen4-6, and gen7_emit_depth_stencil_hiz(), which emits the appropriate state packets for Gen7. Tested using Piglit on Gen5-7 (no regressions). v2: Re-word some comments. Fix an assertion that incorrectly prohibited packed depth/stencil formats on Gen6 (these are allowed provided that HiZ is disabled). Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-02 15:19:13 -07:00
Paul Berry	2ad0ed6349	Revert "glsl: Replace constant-index vector array accesses with swizzles" This reverts commit `dbf94d105a`, which was working around a bug in the handling of array indexing when constant folding built-in functions. Now that the constant folding bug has been fixed, the workaround is no longer needed.	2013-04-02 12:24:16 -07:00
Paul Berry	7d4f1e6467	glsl: Fix array indexing when constant folding built-in functions. Mesa constant-folds built-in functions by using a miniature GLSL interpreter (see ir_function_signature::constant_expression_evaluate_expression_list()). This interpreter had a bug in its handling of array indexing, which caused expressions like "m[i][j]" (where m is a matrix) to be handled incorrectly. Specifically, it incorrectly treated j as indexing into the whole matrix (rather than indexing just into the vector m[i]); as a result the offset computed for m[i] was lost and m[i][j] was treated as m[j][0]. Fixes piglit tests inverse-mat[234].{vert,frag}. NOTE: This is a candidate for the 9.1 and 9.0 branches. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57436	2013-04-02 12:24:08 -07:00
Roland Scheidegger	450950c57a	gallivm: bring back optimized but incorrect float to smallfloat optimizations Conceptually the same as previously done in float_to_half. Should cut down number of instructions from 14 to 10 or so, but will promote some NaNs to Infs, so it's disabled. It gets a bit tricky though handling all the cases correctly... Passes basic tests either way (though there are no tests testing special cases, but some manual tests injecting them seemed promising). v2: style and comment fixes suggested by Jose Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-02 18:24:31 +02:00
Roland Scheidegger	3febc4a1cd	gallivm: consolidate code for float-to-half and float-to-packed conversion. This replaces the existing float-to-half implementation. There are definitely a couple of differences - the old implementation had unspecified(?) rounding behavior, and could at least in theory construct Inf values out of NaNs. NaNs and Infs should now always be properly propagated, and rounding behavior is now towards zero (note this means too large but non-Infinity values get propagated to max representable value, not Infinity). The implementation will definitely not match util code, however (which does nearest rounding, which also means too large values will get propagated to Infinity). Also fix a bogus round mask probably leading to rounding bugs... v2: fix a logic bug in handling infs/nans. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-02 18:24:31 +02:00
Vadim Girlin	9be624b3ef	r600g: don't reserve more stack space than required v5 A reduced stack size allows running more threads in some cases, improving performance for the shaders that use the stack (that is, for the shaders with control flow instructions). E.g. with unigine-based apps. v4: implement exact computation taking into account wavefront size v5: add cases for RV620, RS880 Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-04-02 19:34:14 +04:00
Vadim Girlin	7e04227f39	r600g: fix range handling for tgsi input declarations v2 Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-04-02 19:34:14 +04:00
Marek Olšák	f8502b7e71	gallium/hud: do .xxxx swizzling for the font texture in the fragment shader This allows using L8 and R8 for the font if I8 isn't supported. Tested-by: Brian Paul <brianp@vmware.com>	2013-04-02 16:57:57 +02:00
Brian Paul	98b64cc20f	hud: flush/unmap the vertex buffer before drawing The VMware svga driver is picky about making sure the VBO is unmapped before drawing. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-04-02 08:17:28 -06:00
Brian Paul	bdd3770b78	draw: use pipe_transfer_unmap() to match pipe_transfer_map()	2013-04-02 08:17:28 -06:00
Roland Scheidegger	9b329f4c09	gallivm: fix signed small float to float conversion Introduced by `5f41e08cf3`, just a silly typo. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=62921.	2013-04-02 13:21:07 +02:00
Christian König	a0dca4409a	radeonsi: add instance divisor support v3 v2: reduce key size, don't copy key around too much. v3: remove key size reduction Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-02 13:01:43 +02:00
Christian König	cf9b31f78a	radeonsi: add start instance support This works differently than on R600, we need to add the start instance manually. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-02 13:01:43 +02:00
Christian König	e4ed58763a	radeonsi: add instanceid support Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-02 13:01:43 +02:00
Christian König	83df955ca9	radeon/llvm: move system value fetching to common code This should be used by both SI and R600. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2013-04-02 13:01:42 +02:00
Michel Dänzer	c6efb4870b	radeonsi: Handle arbitrary 2-byte formats in resource_copy_region Fixes mplayer -vo vdpau OSD. NOTE: This is a candidate for the 9.1 branch. Reported-by: Igor Vagulin <igor.vagulin@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Christian König <christian.koenig@amd.com>	2013-04-02 11:42:35 +02:00
Maarten Lankhorst	6d20c646d6	nvc0: Fix fd leak in nvc0_create_decoder NOTE: This is a candidate for the 9.0 and 9.1 branches. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-04-02 10:25:26 +02:00
Aras Pranckevicius	b2eee0869f	GLSL: fix lower_jumps to report progress properly A fix for lower_jumps progress reporting, very much like the similar fix in `c1e591eed`. NOTE: This is a candidate for stable branches. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-04-01 16:57:17 -07:00
Eric Anholt	62501c3af8	i965/fs: Allow CSE on pre-gen7 varying-index uniform loads All the other expression types allowed here have inst->mlen == 0, and this one has implied MRF writes for all of its payload, so nothing else in the implementation should need to change. Reduces SEND messages for loading from pull constants in kwin's Lanczos shader from 16 to 6. (Due to a deficiency in constant propagation, I can't use the hack I did in the previous commit to test the performance change) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61554 NOTE: This is a candidate for the 9.1 branch.	2013-04-01 16:17:26 -07:00
Eric Anholt	70b27e0e4b	i965/fs: Use LD messages for pre-gen7 varying-index uniform loads This comes at a minor performance cost at the moment (-3.2% +/- 0.2%, n=14 on my GM45 forced to load all uniforms through the varying-index path), but we get a whole vec4 at a time to reuse in the next commit. v2: Fix comment about channels in the other message. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> NOTE: This is a candidate for the 9.1 branch.	2013-04-01 16:17:26 -07:00
Eric Anholt	ce316f62ef	i965/fs: Don't double-emit SEND dependency workarounds at control flow. We weren't setting needs_dep[i] in the loops, so we'd continue on to potentially add the same workaround MOVs to the later basic block boundaries, too. We can either set needs_dep[i] to exit through the normal path, or we can just return since we know we're done. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-01 16:17:26 -07:00
Eric Anholt	3cf69b2284	i965/fs: Bake regs_written into the IR instead of recomputing it later. For sampler messages, it depends on the target gen, and on gen4 SIMD16-sampler-on-SIMD8-execution we were returning 4 instead of 8 like we should. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> NOTE: This is a candidate for the 9.1 branch.	2013-04-01 16:17:26 -07:00
Eric Anholt	8edc7cbe64	i965/fs: Clean up the setup of gen4 simd16 message destinations. I think this makes it much more obvious what's going on here. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-01 16:17:26 -07:00
Eric Anholt	9f43b84928	i965/fs: Do CSE on gen7's varying-index pull constant loads. This is our first CSE on a regs_written() > 1 instruction, so it takes a bit of extra fixup. Reduces the number of loads on kwin's Lanczos shader from 12 to 2. v2: Fix compiler warning (false positive on possibly-uninitialized variable) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61554 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) NOTE: This is a candidate for the 9.1 branch.	2013-04-01 16:17:25 -07:00
Eric Anholt	dca5fc1435	i965/fs: Improve performance of varying-index uniform loads on IVB. Like we have done for the VS and for constant-index uniform loads, we use the sampler engine to get caching in front of the L3 to avoid tickling the IVB L3 bug. This is also a bit of a functional change, as we're now loading a vec4 instead of a single dword, though we're not taking advantage of the other 3 components of the vec4 (yet). With the driver hacked to always take the varying-index path for all uniforms, improves performance of my old GLSL demo by 315% +/- 2% (n=4). This is a major fix for some blur shaders in compositors from the varying-index uniforms support I introduced in 9.1. v2: Move old offset computation into the pre-gen7 path. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61554 NOTE: This is a candidate for the 9.1 branch.	2013-04-01 16:17:25 -07:00
Eric Anholt	bc0e1591f6	i965/fs: Avoid inappropriate optimization with regs_written > 1. Right now we don't have anything with regs_written() > 1 and !inst->mlen, but that's about to change. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-01 16:17:25 -07:00
Eric Anholt	740350c982	i965: Make the fragment shader pull constants index by dwords, not vec4s. We want to load vec4s, since loading a vec4 instead of a dword is basically no increased latency. But for variable indexed access, the previous requirement of aligned vec4s for a sampler LD was hard to implement. Note that this change only affects those messages that use the surface format, like sampler LDs, but not to the untyped data cache loads we've used in other cases. No significant performance difference on my GLSL demo with uniforms forced to take the varying pull constants path (n=4). NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-01 16:17:25 -07:00
Eric Anholt	2f41a60145	i965: Make the constant surface interface take a normal byte size. This puts the rounding-up logic into the function itself instead of all the callers having to manage it. Also drop an "unused" comment in gen4, as the stride is used for texbos (and will be for uniforms soon). NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-01 16:17:25 -07:00
Eric Anholt	8c694dfe64	i965/fs: Move varying uniform offset computation into the helper func. I'm going to want to change the math for gen7 using sampler LD instructions in a way that gets CSE to occur like we'd hope. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-01 16:17:25 -07:00
Eric Anholt	59e858861c	i965/fs: Remove creation of a MOV instruction that's never used. We weren't inserting it into the list, so it did nothing. This line was replaced by the MOV/MUL block above. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-01 16:17:24 -07:00
Eric Anholt	1d6ead3804	i965/fs: Allow constant propagation into MACH. This happens quite a bit with varying-index uniform loads. We could also do better by avoiding the MACH entirely, but there's no reason not to at least take this step. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-01 16:17:24 -07:00
Vincent Lejeune	50fd9c4544	r600g/llvm: Update LLVM_REVISION.txt	2013-04-01 23:50:20 +02:00
Vincent Lejeune	8c8c4e3977	r600g/llvm: Use stack_size provided from llvm.	2013-04-01 23:43:57 +02:00
Vincent Lejeune	4ac0d85ca6	r600g/llvm: uses function attribute to pass shader type	2013-04-01 23:43:42 +02:00
Vincent Lejeune	af38695f51	r600g/llvm: Add support for cf_alu native encode	2013-04-01 23:43:27 +02:00
Haixia Shi	bc0cc2944f	ACTIVE_UNIFORM_MAX_LENGTH should include 3 extra characters for arrays. If the active uniform is an array, then the length of the uniform name should include the three extra characters for the "[0]" suffix, which is required by the GL 4.2 spec to be appended to the uniform name in glGetActiveUniform(). This avoids the situation where the output buffer does not have enough space to hold the "[0]" suffix, resulting in an incomplete array specification like "foobar[0". NOTE: This is a candidate for the 9.1 branch. Change-Id: I41e87ba347a7169eec8c575596cc3416adbe0728 Signed-off-by: Haixia Shi <hshi@chromium.org> Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-04-01 13:39:13 -07:00
Matt Turner	e2b40e253b	i965/fs: Fix bad interaction between tex swizzles and textureQueryLOD. Reported-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-01 13:11:43 -07:00
Eric Anholt	4ee892ee8a	i965: Remove the old brw_optimize() code. This is now done in the VS backend before instruction emit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-01 11:36:06 -07:00
Eric Anholt	4fee05b020	i965/vs: Add a pass to set dependency control fields on instructions. This is a more aggressive version of the old brw_optimize() path. Reduces cycles spent in the vertex shader on minecraft by 18.6% +/- 10.0% (n=15). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-01 11:36:05 -07:00
Eric Anholt	229a51cdbe	i965: Dump shader source for linked shader programs. We dump shader source in ir_to_mesa.cpp, and we dump linked programs here, but we had no reference from the linked programs to their source. This was preventing improvement of shader-db to use linked shader programs instead of individual shader files (which is bogus, because it means we optimize out VS outputs, and don't interpolate FS inputs!) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-04-01 11:30:36 -07:00
Mike Lothian	777a7f2003	clover: Fix build with LLVM 3.3	2013-04-01 10:50:23 -07:00
Brian Paul	1165ff1af1	llvmpipe: use triangle subdivision to avoid fixed-point overflow issues If we're drawing to a surface that's 2048 x 2048 pixels or larger there's danger of fixed-point overflow in the triangle rasterization code. That leads to various rendering glitches. Rather than implement some intricate changes to the rasterization code, simply subdivide triangles into smaller subtriangles to avoid the issue. Only do this when the drawing surface is larger than 2048 by 2048. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-01 08:40:35 -06:00
Brian Paul	95df2b2883	mesa: remove platform checks around __builtin_ffs, __builtin_ffsll Use the __builtin_ffs, __builtin_ffsll functions whenever we have GCC, not just for specific platforms. Fixes Solaris build. Note: This is a candidate for the stable branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62868 Signed-off-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-01 08:40:35 -06:00
Brian Paul	99811c344b	docs: add a new page documenting known application issues Let's try to update this when we find other broken applications... Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-04-01 08:40:35 -06:00
Brian Paul	fe30fa9ad6	drirc: set always_have_depth_buffer for Topogon Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-04-01 08:18:09 -06:00
Adam Jackson	e26d5940ff	gallivm: Minor comment cleanup Signed-off-by: Adam Jackson <ajax@redhat.com>	2013-04-01 09:45:38 -04:00
Dave Airlie	135bb3c1a9	mesa: fix texture storage multisample prototypes harder. I just noticed the warnings since I fixed the other bit. Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-04-01 19:54:56 +10:00
Vincent Lejeune	c3fb34ee8d	r600g/llvm: Update LLVM_REVISION	2013-03-31 21:37:20 +02:00
Vincent Lejeune	67a8ee7aaa	r600g/llvm: use native encode for tex	2013-03-31 21:35:47 +02:00
Dave Airlie	5b36bc05be	glapi: fix storage multisample build errors Reported on #radeon by udovdh Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-03-31 20:41:28 +10:00
Chris Forbes	2a528889a3	docs: mark ARB_texture_storage_multisample done Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-31 22:19:42 +13:00
Chris Forbes	d25b4d5e90	i965: enable ARB_texture_storage_multisample on Gen6+ This can be enabled everywhere that ARB_texture_multisample is supported -- ARB_texture_storage is supported on everything. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-31 22:19:40 +13:00
Chris Forbes	e0015c819c	mesa: allow multisample texture targets in [Get]TexParameter* ARB_texture_storage_multisample allows texture parameters to be queried for TEXTURE_2D_MULTISAMPLE and TEXTURE_2D_MULTISAMPLE_ARRAY targets. Some parameters may also be set, with the following exceptions: - TEXTURE_BASE_LEVEL may not be set to a nonzero value; generates INVALID_OPERATION - any state which appears in the `per-sampler` state table may not be set; generates INVALID_OPERATION V2: Don't introduce bogus handling of TEXTURE_MAX_LEVEL Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-31 22:19:36 +13:00
Chris Forbes	b15c558c85	mesa: improve reported function name in Tex*Multisample Now that there are 4 variants, just pass the function name into teximagemultisample rather than reconstructing it. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-31 22:19:34 +13:00
Chris Forbes	9cbfe98bfc	mesa: add enable bit for ARB_texture_storage_multisample Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-31 22:19:32 +13:00
Chris Forbes	719974b54c	glapi: add definition of ARB_texture_storage_multisample Adds XML for the extension, dispatch_sanity enabling, and the two new entrypoints. These are both implemented by calling the shared teximagemultisample() with immutable=GL_TRUE. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-31 22:19:28 +13:00
Chris Forbes	788b0f8535	mesa: add support for immutable textures to teximagemultisample() The new entrypoints will come later, but this adds the actual logic for supporting immutable multisample textures: - The immutability flag is set as desired. - Attempting to modify an immutable multisample texture produces INVALID_OPERATION. Note: The extension spec does not mention adding this behavior to TexImage*Multisample, but it seems like the reasonable thing to do. V2: - Cover missing error cases (unsized formats; texture object zero) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> [V1] Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-31 22:19:22 +13:00
Chris Forbes	7f32b9560b	mesa: extract _mesa_is_legal_tex_storage_format helper This is about to be used in teximagemultisample() when immutable=true. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-31 22:19:13 +13:00
Kenneth Graunke	fdc5941972	mesa: Delete VERT_ATTRIB_GENERIC_NV and VERT_BIT_GENERIC_NV macros. These haven't been used since we deleted NV_vertex_program support. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-30 19:19:45 -07:00
Eric Anholt	0967c362bf	i965: Fix an inconsistency in the VUE map with gl_ClipVertex on gen4/5. We are intentionally not allocating a slot for gl_ClipVertex. But by leaving the bit set in the slots_valid, the fragment shader's computation of where varyings are in urb entry coming out of the SF would be off by one. Fixes rendering in Freespace 2 SCP, and improves rendering in TF2. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62830 Tested-by: Joaquín Ignacio Aramendía <samsagax@gmail.com> NOTE: This is a candidate for the 9.1 branch. Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-03-30 17:24:18 -07:00
Eric Anholt	9dd19575d3	intel: Remove a never-taken debug print path. Alessandro Pignotti noted when I added this code in commit `0e723b135b` that it's in the else block for "if (busy)", so this debug print couldn't happen. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-30 17:23:50 -07:00
Brian Paul	c34bbe110d	st/mesa: add ir_lod case in GLSL->TGSI code to silence warning	2013-03-29 17:21:33 -06:00
Ian Romanick	e0131196ca	glsl: Generate masked write instead of vector array index for UBO lowering When reading a column from a row-major matrix, we would slot the single value read into the vector using an ir_dereference_array of the vector with a constant index. This will (eventually) get optimized to a masked-write, so just generate the masked write in the first place. v2: Remove unused variable 'chan'. Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Eric Anholt <eric@anholt.net>	2013-03-29 12:01:14 -07:00
Ian Romanick	65cc68f430	glsl: Replace open-coded dot-product with dot Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Eric Anholt <eric@anholt.net> Cc: Paul Berry <stereotype441@gmail.com>	2013-03-29 12:01:11 -07:00
Ian Romanick	dbf94d105a	glsl: Replace constant-index vector array accesses with swizzles Search and replace: ][0] -> ].x ][1] -> ].y ][2] -> ].z ][3] -> ].w Fixes piglit tests inverse-mat[234].{vert,frag}. These tests call the inverse function with constant parameters and expect proper constant folding to happen. My suspicion is that this patch papers over some bug in constant propagation involving array accesses. Either way, all of these accesses eventually get lowered to swizzles. This cuts out the middle man (saving a trivial amount of CPU). NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: Eric Anholt <eric@anholt.net> Cc: Paul Berry <stereotype441@gmail.com>	2013-03-29 12:01:07 -07:00
Ian Romanick	c770faea0a	glsl: Add missing bool case in glsl_type::get_scalar_type Since the case was missing, bvec4->get_scalar_type() would return bvec4, but vec4->get_scalar_type() would return float. NOTE: This is a candidate for stable branches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-03-29 12:01:01 -07:00
Kenneth Graunke	57a502518e	i965: Fix INTEL_DEBUG=shader_time for fragment shaders with discards. "discard" instructions generate HALT instructions which jump to a final HALT near the end of the shader. Previously, fs_generator created this final jump target when it saw the first FS_OPCODE_FB_WRITE, causing it to jump right before the FB write epilogue. This is normally good. However, INTEL_DEBUG=shader_time also has an epilogue section which records the final timestamp. The frontend emits IR for this just before FS_OPCODE_FB_WRITE. Unfortunately, this led to the following ordering: 1. Shader Time Epilogue 2. Final HALT (where discards jump) 3. Framebuffer Write Epilogue This meant that discarded pixels completely skipped the shader time epilogue, causing no ending timestamp to be written. This obviously led to inaccurate results. This patch adds a new FS_OPCODE_PLACEHOLDER_HALT in the IR stream just before any epilogue sections. This is where the final HALT should be generated, and makes it easy to ensure the correct ordering: 1. Final HALT 2. Shader Time Epilogue 3. Framebuffer Write Epilogue For shaders that don't discard, this opcode compiles away to nothing. The scheduler adds barrier dependencies to make sure that it doesn't get moved above any FS_OPCODE_DISCARD_JUMP instructions. One 8-wide shader in GLBenchmark 2.7 dropped from 2291.67 Gcycles to a mere 5.13 Gcycles. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 11:39:32 -07:00
Eric Anholt	20d846ce8b	i965: Add names for all instructions to dump_instruction() in FS and VS. I'd previously added the minimum names to understand my dumps, but this makes dumps in general much easier to read. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-29 11:39:21 -07:00
Matt Turner	ed6186f0e8	i965: Enable ARB_texture_query_lod. v2: Support Ironlake as well. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-29 10:21:14 -07:00
Matt Turner	b8aa9f7d3a	i965/fs: Generate LOD sampler message from ir_lod. v2: Support Ironlake as well. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-29 10:21:14 -07:00
Dave Airlie	110ca8b1f3	glsl: Implement ARB_texture_query_lod v2 [mattst88]: - Rebase. - #define GL_ARB_texture_query_lod to 1. - Remove comma after ir_lod in ir.h for MSVC. - Handled ir_lod in ir_hv_accept.cpp, ir_rvalue_visitor.cpp, opt_tree_grafting.cpp. - Rename textureQueryLOD to textureQueryLod, see https://www.khronos.org/bugzilla/show_bug.cgi?id=821 - Fix ir_reader of (lod ...). v3 [mattst88]: - Rename textureQueryLod to textureQueryLOD, pending resolution of Khronos 821. - Add ir_lod case to ir_to_mesa.cpp. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-29 10:20:26 -07:00
Matt Turner	0e0ab8a071	i965/fs: Use measured Gen7 instruction timings on Gen6. [ministat distribution plot omitted; x = before, + = after] Before (x): N 23, Min 8083.78, Max 8287.83, Median 8205.55, Avg 8162.7461, Stddev 68.307951. After (+): N 23, Min 8107.56, Max 8358.74, Median 8224.33, Avg 8186.1765, Stddev 71.506301. No difference proven at 95.0% confidence Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 10:13:27 -07:00
Matt Turner	f085b21b25	i965/fs: Increase and document MAD latency on Gen7. 58% of mad(8) generated in shader-db are reading registers from the same bank. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 10:13:27 -07:00
Matt Turner	414ea2f560	i965/fs: Add LRP instruction latency. Set its latency to what happens to be the default floating-point instruction latency. One day we may want to handle latency based on register bank information. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 10:13:27 -07:00
Matt Turner	ad4507b355	i965/fs: Add Haswell cycle timings Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 10:13:27 -07:00
Matt Turner	7997e59b65	i965: Note that write-after-write dependencies are blocking. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 10:13:26 -07:00
Matt Turner	f91e371fee	i965: Reword comment about the shared mathbox. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-29 10:13:26 -07:00
Roland Scheidegger	5f41e08cf3	gallivm: consolidate some half-to-float and r11g11b10-to-float code Similar enough that we can try to use shared code. v2: fix a stupid bug using wrong variable causing mayhem with Inf and NaNs. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-29 16:39:40 +01:00
Chris Forbes	4412f3bc13	mesa: provide default implementation of QuerySamplesForFormat Previously at least i915 failed to provide an implementation, but exposed ARB_internalformat_query anyway, leading to crashes when QueryInternalformativ was called. Default implementation just returns 1 for everything, so is suitable for any driver which does not support multisampling. V2: - Move from intel to core mesa. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-29 20:54:36 +13:00
Christoph Bumiller	ee624ced36	nvc0: implement MP performance counters There's more, but this only adds (most) of the counters that are handled directly by the shader processors. The other counter domains are not handled on the multiprocessor and there are no FIFO object methods for configuring them. Instead, they have to be programmed by the kernel via PCOUNTER, and the interface for this isn't in place yet.	2013-03-29 00:33:01 +01:00
Christoph Bumiller	480359bcf6	nvc0: enable compression when supported	2013-03-29 00:33:01 +01:00
Christoph Bumiller	25722e3454	nvc0: use NOUVEAU_GETPARAM_GRAPH_UNITS to get MP count	2013-03-29 00:33:00 +01:00
Christoph Bumiller	443b247878	nv50,nvc0: fix 3d blits, restore viewport after blit	2013-03-29 00:33:00 +01:00
Christoph Bumiller	090e73fc46	nv50: fix 3D render target setup	2013-03-29 00:33:00 +01:00
Brian Paul	b54ce3738a	llvmpipe: put .bmp extension on dumped image files	2013-03-28 17:17:26 -06:00
Brian Paul	e90c56bc4e	llvmpipe: add 'f' suffix to 1.0 in fixed_to_float()	2013-03-28 17:17:26 -06:00
Brian Paul	499aa3ddb4	draw: fix some build breakage when LLVM is not used Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62883 Tested-by: Vinson Lee <vlee@freedesktop.org>	2013-03-28 17:15:58 -06:00
Marek Olšák	9ad9141917	mesa: handle STATE_CURRENT_ATTRIB_MAYBE_VP_CLAMPED for parameter printing Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-28 20:02:50 +01:00
Kenneth Graunke	9fe47756b3	i965: Tidy shader time printing code by using printf's field widths. We can use %-6s%-6s rather than manually counting characters, resulting in much more readable code. This necessitates a small secondary change: using "total fs16" and "" now causes the "" string to be padded out to 6 characters, resulting in too much whitespace. Splitting it into "total" and "fs16" produces the same output as before. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:44 -07:00
Eric Anholt	6192e9b377	i965/vs: Include URB payload setup in shader_time. This much more accurately reflects the cost of the vertex shader, since the payload setup is often a significant fraction of the instructions in the VS. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:41 -07:00
Eric Anholt	55feb19704	i965/vs: Use a send from a 2-register VGRF for shader time writes. This will let us emit it later, after we're setting up MRFs for the URB write. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:37 -07:00
Eric Anholt	130138030a	i965/vs: Teach copy propagation about sends from GRFs. This incidentally also teaches it a bit about gen6 math -- we now allow unswizzled, unmodified GRF temps as the sources for math. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:34 -07:00
Eric Anholt	c3a22d42a8	i965/vs: Prepare split_virtual_grfs() for the presence of SENDs from GRFs. v2: Fix silly bool handling, and don't add new tabs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:29 -07:00
Eric Anholt	47e795d861	i965/fs: Include everything but the final FB write in shader_time. Previously, if you just wrote a constant color to the render target, no time got noted at all. This is convenient for doing single-instruction timings, but not so much for actual program analysis. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:23 -07:00
Eric Anholt	5c5218ea61	i965/fs: Switch shader_time writes to using GRFs. This avoids conflicts between shader_time and FB writes, so we can include more of the program under our profiling. This does mean hiding more of the message setup from the optimizer, which doesn't have a way to handle multi-reg sends from GRFs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:15 -07:00
Eric Anholt	5c039543db	i965: Provide more detailed information to match shader_time to programs. Ken asked me the other day what -1 vs 0 vs 3 vs other meant in our shader names, and I realized that it was really unclear. I'd like to do even better, like noting which one is the clear shader, but that would require exposing the metaops struct to the driver. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:11 -07:00
Eric Anholt	d2ba1c24b4	i965: Track ARB program state along with GLSL state for shader_time. This will let us do much better printouts for non-GLSL programs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 11:46:01 -07:00
Marek Olšák	a19f6e880a	st/dri: fix crash with HUD and single buffering	2013-03-28 18:17:21 +01:00
Marek Olšák	6b5dfa42c9	st/mesa: remove leftover printfs from ReadPixels Oops, I thought I had removed all debugging code.	2013-03-28 18:17:21 +01:00
Eric Anholt	eda434921d	i965/fs: Improve performance of copy propagation dataflow using bitsets. Reduces compile time of l4d2's slowest shader by 17.8% +/- 1.3% (n=10). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-28 09:48:50 -07:00
Zack Rusin	d066133a76	llvmpipe/draw: Fix texture sampling in geometry shaders We weren't correctly propagating the samplers and sampler views when they were related to geometry shaders. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:02 -07:00
Zack Rusin	186a6bffdd	draw/llvm: Cleanup the store debugging code Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:02 -07:00
Zack Rusin	10964fc73d	draw: Allocate the output buffer for output primitives We were allocating the output buffer but using the input primitives. We need to allocate that buffer using the maximum number of output, not input, primitives. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:02 -07:00
Zack Rusin	f20f981553	gallivm: Implement the breakc instruction Required by more modern examples. Like BRK but with a condition. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:02 -07:00
Zack Rusin	b66ffcf2f8	gallivm: implement implicit primitive flushing TGSI semantics currently require an implicit endprim at the end of GS if an ending primitive hasn't been emitted. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:02 -07:00
Zack Rusin	e96f4e3b85	gallium/llvm: implement geometry shaders in the llvm paths This commit implements code generation of the geometry shaders in the SOA paths. All the code is there but bugs are likely present. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:02 -07:00
Zack Rusin	edcebe665d	draw/gs: Fetch more than one primitive per invocation Allows executing gs on up to 4 primitives at a time. Will also be required by the llvm code because there we definitely don't want to flush with just a single primitive. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:01 -07:00
Zack Rusin	014c4d1cd7	draw/gs: Abstract the portions of GS that are tgsi specific To be able to add llvm paths later on we need to have some common interface for them. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:01 -07:00
Zack Rusin	a85c83e427	draw/llvm: Remove unused gs_constants from jit_context The member was never used and we'll need to handle it differently because gs will also need samplers/textures setup. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:01 -07:00
Zack Rusin	90ee8de700	graw/gs: add missing max output vertices to all tests A few tests were missing this crucial property. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-27 03:53:01 -07:00
Jerome Glisse	3f7d9710e8	radeonsi: add cs tracing v3 Same as on r600: trace cs execution by writing the cs offset after each state. This allows pinpointing a lockup inside the command stream and narrows down the scope of lockup investigation. v2: Use WRITE_DATA packet instead of WRITE_MEM v3: Remove useless nop packet Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-03-27 11:38:02 -04:00
Chris Forbes	21a2dfa55d	mesa: only check sample count if we actually wanted multisampling Fixes various test fallout from `90b5a2425a` on Pineview, which claims to support ARB_internalformat_query but doesn't actually provide the driverfunc. That driver is still broken [GetInternalformativ will still segfault!] but it was silly to be going through the sample count logic in the nonmultisampling case at all. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-27 07:49:12 +13:00
Christian König	c77159cc11	radeon/llvm: document LLVM commit We need at least that revision to work correctly now. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-26 15:08:00 +01:00
Christian König	1c10018925	radeonsi: add preloading for all samplers Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-03-26 12:57:43 +01:00
Christian König	0f6cf2bc79	radeonsi: add preloading of all constants Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-03-26 12:57:40 +01:00
Christian König	44e3224554	radeonsi: mark most intrinsics as readnone/nounwind Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-03-26 12:57:36 +01:00
Christian König	206f059e1f	radeonsi: mark all loads as constant Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-03-26 12:57:33 +01:00
Christian König	86f6fc2f1d	radeonsi: remove wqm intrinsic Now the backend handles that itself. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-03-26 12:57:30 +01:00
Christian König	6249db73ea	radeon/llvm: remove unneeded inclusion The include isn't needed and the file has moved with LLVM master. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-03-26 12:57:23 +01:00
Christian König	0f001fbff1	glsl_to_tgsi: avoid creating arrays if driver doesn't support them Avoid creating arrays if we replace indirect addressing anyway. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-26 10:22:27 +01:00
Christian König	462de2e65f	glsl_to_tgsi: make simplify_cmp work with arrays Even when we have arrays it is possible for simplify_cmp to work on temps, just not on arrays. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=62696 Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-26 10:22:27 +01:00
Marek Olšák	98a8e5b87e	gallium/docs: document get_driver_query_info	2013-03-26 01:37:40 +01:00
Marek Olšák	8ddae684af	r600g: add a driver query returning the amount of requested VRAM and GTT memory	2013-03-26 01:28:19 +01:00
Marek Olšák	2504380aaf	r600g: add a driver query returning the number of draw_vbo calls between begin_query and end_query	2013-03-26 01:28:19 +01:00
Marek Olšák	e40c634bd2	st/dri: integrate the HUD Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-26 01:28:19 +01:00
Marek Olšák	c91cf7d7d2	gallium: implement a heads-up display module Reviewed-by: Brian Paul <brianp@vmware.com> v2: lots of cosmetic changes	2013-03-26 01:28:19 +01:00
Marek Olšák	8ddcd715b7	gallium: add interface for driver queries like performance counters, etc. The pipe query interface is reused. The list of available queries can be obtained using pipe_screen::get_driver_query_info. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-26 01:28:19 +01:00
Marek Olšák	9cec5edea7	gallium/tgsi: fix valgrind warning "Conditional jump or move depends on uninitialised value(s)" Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-26 01:28:19 +01:00
Marek Olšák	17003b44b7	st/mesa: fix crash with blit-based GetTexImage https://bugs.freedesktop.org/show_bug.cgi?id=62573 Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-03-26 01:28:19 +01:00
Marek Olšák	d1b91e309b	cso: add constant buffer save/restore feature for postprocessing Postprocessing is an internal meta op and should restore the states it changes.	2013-03-26 01:28:18 +01:00
Marek Olšák	35c522dce4	radeonsi: fix crash while binding a NULL constant buffer Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-03-26 01:28:18 +01:00
Marek Olšák	a2378daf83	r600g: fix crash while binding a NULL constant buffer	2013-03-26 01:28:18 +01:00
Marek Olšák	53228fe2a8	r300g: fix crash while binding a NULL constant buffer	2013-03-26 01:28:18 +01:00
Martin Andersson	92855bcc95	r600g: Use virtual address for PIPE_QUERY_SO* in r600_emit_query_end Virtual address is used for PIPE_QUERY_SO* queries in r600_emit_query_begin, but not in r600_emit_query_end. This will trigger a GPU fault when one of those queries is made and virtual address is enabled. Note: this is a candidate for the 9.1 branch Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-03-25 18:18:23 -04:00
Rob Clark	634fb837ef	freedreno: use u_debug for debug env vars Signed-off-by: Rob Clark <robdclark@gmail.com>	2013-03-25 15:05:44 -04:00
Jordan Justen	e207c33020	glsl ir: add as_dereference_record Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-25 11:35:56 -07:00
Brian Paul	eb92f89587	gallium: undef PACKAGE_* macros to silence warnings Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-25 12:24:11 -06:00
Brian Paul	c0f16df938	gallivm: init vars to silence warnings Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-25 12:24:11 -06:00
Brian Paul	35aefe9226	swrast: init vars to silence warnings Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-25 12:24:11 -06:00
Rob Clark	980f1cf8a1	freedreno: prefer sw upload for textures Since we are UMA, in most cases the GPU blit doesn't make much sense for texture upload. Signed-off-by: Rob Clark <robdclark@gmail.com>	2013-03-25 13:05:44 -04:00
Rob Clark	732b0b5ebc	freedreno: track maximal scissor bounds Optimize out parts of the render target that are scissored out by taking into account maximal scissor bounds in fd_gmem_render_tiles(). This is a big win on things like gnome-shell which frequently do partial screen updates. Signed-off-by: Rob Clark <robdclark@gmail.com>	2013-03-25 13:05:44 -04:00
Adrian Marius Negreanu	8a4750fe5e	android: fix Android.mk bug in mesa/drivers/dri/common target-specific variables are undefined when used as pre-requisites. instead, use secondary-expansion. I noticed this when building the patch: i965: Add a driconf option to disable flush throttling Signed-off-by: Adrian Marius Negreanu <adrian.m.negreanu@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-03-25 09:52:19 -07:00
Eric Anholt	712bac1f41	mesa: Disable validate_ir_tree() on release builds. Since half of ir_validate uses asserts() (the other using printf() then abort()), there's not much use to calling it in a release build. Cuts 6.3% of the startup time of TF2. NOTE: This is a candidate for the stable branches. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-25 08:50:38 -07:00
Roland Scheidegger	92b8a37fdf	gallivm: move code for dealing with rgb9e5 and r11g11b10 formats to own file This is really not generic conversion stuff and the code very particular to these formats.	2013-03-24 22:54:45 +01:00
Vinson Lee	7d0c1f2437	llvmpipe: Fix assertions with assignment instead of comparison. Fixes assign instead of compare defects reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-03-24 14:49:22 -07:00
Paul Berry	a593a1b276	i965: Shrink brw_vue_map struct. This patch changes the arrays in brw_vue_map (which only ever contain values from -1 to 58) from ints to signed chars. This reduces the size of the struct from 488 bytes to 136 bytes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: fix STATIC_ASSERT to use 127 instead of 128. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-24 10:55:28 -07:00
Paul Berry	0a0deb92d9	i965/fs: Rename vp_outputs_written to input_slots_valid. With the introduction of geometry shaders, fragment inputs will no longer come exclusively from the vertex shader; sometimes they come from the geometry shader. So the name "vp_outputs_written" will become a misnomer. This patch renames vp_outputs_written to input_slots_valid, to reflect the true meaning of the bitfield from the fragment shader's point of view: it indicates which of the possible input slots contain valid data that was written by the previous shader stage. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-24 10:55:28 -07:00
Paul Berry	bf9bfe838e	i965: Use brw.vue_map_geom_out instead of VS output VUE map where appropriate. This patch modifies post-GS pipeline stages (transform feedback, clip, sf, fs) to refer to the VUE map through brw->vue_map_geom_out rather than brw->vs.prog_data->vue_map. This ensures that when geometry shader support is added, these pipeline stages will consult the geometry shader output VUE map when appropriate, rather than the vertex shader output VUE map. v2: Fixed some stale "CACHE_NEW_VS_PROG" comments. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-24 10:55:27 -07:00
Paul Berry	463ef47b16	i965: Store the geometry output VUE map in brw_context. Currently, the GPU pipeline has one active VUE map in effect at any given time--the one representing the layout of vertex data coming from the vertex shader. However, when geometry shaders are added, they will have their own independent VUE map. Later pipeline stages (clip, sf, fs) will need to consult the geometry shader VUE map if a geometry shader is in use, and the vertex shader VUE map otherwise. This patch adds a new field to brw_context, vue_map_geom_out, which contains the VUE map that should be used by later pipeline stages. It also adds a new state flag, BRW_NEW_VUE_MAP_GEOM_OUT, which is signalled whenever the contents of the VUE map changes. Since we don't support geometry shaders yet, vue_map_geom_out is currently set only by the brw_vs_prog state atom. v2: Don't set vue_map_geom_out in do_vs_prog--that's redundant and possibly problematic for precompiles. Only set it in brw_upload_vs_prog. Also, make a copy instead of using a pointer--this makes it possible to detect when the VUE map hasn't changed, so we can avoid redundant state uploads. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-24 10:55:27 -07:00
Paul Berry	8fbc22e880	i965: Move brw_vs_prog_data::outputs_written into VUE map. Future patches will allow for there to be separate VUE maps when both a geometry shader and a vertex shader are in use. When this happens, we will want to have correspondingly separate outputs_written bitfields. Moving outputs_written into the VUE map will make this easy. For consistency with the terminology used in the VUE map, the bitfield is renamed to "slots_valid" in the process. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-24 10:55:27 -07:00
Paul Berry	76ba30800d	i965/gen7: Use WE_all mode when enabling channel masks for URB write. Gen7 adds mask bits to the message header for a URB write which allow the write to apply only to certain channels. We don't use this functionality, so to ensure that the entire write always occurs, we emit an OR instruction to set the mask bits. With the advent of geometry shaders, URB writes won't just happen at the end of a thread; they will happen in mid-thread too. Thus, we can no longer rely on channel 0 being enabled, so we need to emit the OR instruction in WE_all mode to ensure that it is executed. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-24 10:55:27 -07:00
Paul Berry	8371c68a4b	i965: Rename BRW_VARYING_SLOT_MAX -> BRW_VARYING_SLOT_COUNT. The new name clarifies that it represents one more than the maximum possible brw_varying_slot value. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-24 10:55:27 -07:00
Paul Berry	ec9c3882d9	i965: Clarify nomenclature: vert_result -> varying This patch removes the terminology "vert_result" from the i965 driver, replacing it with "varying". The old terminology, "vert_result", was confusing because (a) it referred to the enum gl_vert_result, which no longer exists (it was replaced with gl_varying_slot), and (b) it implied a vertex output, but with the advent of geometry shaders, it could be either a vertex or a geometry output, depending what shaders are in use. The generic term "varying" is less confusing. No functional change. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Whitespace fixes.	2013-03-23 22:47:54 -07:00
Chris Forbes	f56fb9d248	i965: bump MAX_DEPTH_TEXTURE_SAMPLES to 4/8 Bump MAX_DEPTH_TEXTURE_SAMPLES to match what GetInternalformativ is claiming. Since that limit is what is actually enforced now, this doesn't actually change anything except the queried value. There's still no piglits verifying that multisample depth textures work, but this works in the Unigine demos. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-24 16:38:18 +13:00
Chris Forbes	2405da174e	mesa: use _mesa_check_sample_count() for multisample textures Extends _mesa_check_sample_count() to properly support the TEXTURE_2D_MULTISAMPLE and TEXTURE_2D_MULTISAMPLE_ARRAY targets, which have subtly different limits than renderbuffers. This resolves the remaining TODO in the implementation of TexImage*DMultisample. V2: - Don't introduce spurious block. - Do this in multisample.c instead. - Fix typo in error message. - Inline spec quotes Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-24 16:38:18 +13:00
Chris Forbes	90b5a2425a	mesa: helper for checking renderbuffer sample count Pulls the checking of the sample count into a helper function, and extends the existing logic to include the interactions with both ARB_texture_multisample and ARB_internalformat_query. _mesa_check_sample_count() checks a desired sample count against a combination of target/internalformat, and returns the error enum to be produced, if any. Unfortunately the conditions are messy and the errors vary. V2: - Tidy up spurious block. - Move _mesa_check_sample_count() to multisample.c instead; It doesn't really belong in fbobject.c or teximage.c. - Inlined spec quotes Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-24 16:38:18 +13:00
Chris Forbes	86b8380600	mesa: allow internalformat_query with multisample texture targets Now that we support ARB_texture_multisample, there are multiple targets accepted for this query, and they may have target-dependent limits, so pass the target to the driverfunc. For example, the sampling hardware may not be able to do general texelFetch() for some format/sample count combination, but the driver may still be able to implement a reasonable resolve operation, so it can be supported for renderbuffers. V2: - Don't break Gallium compile. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-24 16:38:18 +13:00
Dmitry Cherkassov	3cc2629b3b	clover: add dynamic_cast results checking down in clSetKernelArgument() code path. Signed-off-by: Dmitry Cherkassov <dcherkassov@gmail.com> Signed-off-by: Francisco Jerez <currojerez@riseup.net>	2013-03-24 02:43:34 +01:00
Roland Scheidegger	b50e362dbb	gallivm: Add code for rgb9e5 shared exponent format to float conversion And use this (and the code for r11g11b10 packed float to float conversion) in the soa texturing code (the generated code looks quite good). Should be an order of magnitude faster probably than using the fallback (not measured). Tested with piglit texwrap GL_EXT_packed_float and GL_EXT_texture_shared_exponent respectively (didn't find much else using it). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-24 02:09:02 +01:00
Marek Olšák	3e10ab6b22	gallium,st/mesa: don't use blit-based transfers with software rasterizers The blit-based paths for TexImage, GetTexImage, and ReadPixels aren't very fast with software rasterizer. Now Gallium drivers have the ability to turn them off. Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-23 13:19:16 +01:00
Marek Olšák	25e3094058	st/mesa: implement blit-based ReadPixels Initial version contributed by: Martin Andersson <g02maran@gmail.com> This is only used if the memcpy path cannot be used and if no transfer ops are needed. It's pretty similar to our TexImage and GetTexImage implementations. The motivation behind this is to be able to use ReadPixels every frame and still have at least 20 fps (or 60 fps with a powerful GPU and CPU) instead of 0.5 fps. Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-23 13:17:05 +01:00
Marek Olšák	d702c67ba5	mesa: add common format-independent memcpy-based ReadPixels path I'll need the _mesa_readpixels_needs_slow_path function for the blit-based version, but it's also useful to have this memcpy-based path in one place and not scattered across several functions. v2: add "const" to function parameters Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-23 13:17:05 +01:00
Marek Olšák	f8855a4214	mesa: add helper func for checking combined depthstencil buffers from st/mesa Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-23 13:17:05 +01:00
Marek Olšák	2dc2066b90	mesa: add a common function returning transfer ops for ReadPixels I'll need both new functions for later. For now, it consolidates the code for determining what the transfer ops should be and makes it a little bit smarter. v2: added "const" Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-23 13:17:05 +01:00
Marek Olšák	b2a4573c14	mesa: handle HALF_FLOAT like FLOAT in get_tex_rgba NOTE: This is a candidate for the stable branches. Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-23 13:17:05 +01:00
Roland Scheidegger	b101a094b5	llvmpipe: add EXT_packed_float render target format support New conversion code to handle conversion from/to r11g11b10 AoS to/from SoA floats, and also add code for conversion from rgb9e5 AoS to float SoA (which works pretty much the same as r11g11b10 except for the packing). (This code should also be used for texture sampling instead of relying on u_format conversion but it's not yet, so rgb9e5 is unused.) Unfortunately a crazy amount of hacks is necessary to get the conversion code running in llvmpipe's generate_unswizzled_blend, which isn't well suited for formats where the storage representation has nothing to do with what's needed for blending (moreover, the conversion will convert from packed AoS values, which is the storage format, to float SoA values, because this is much more natural for the conversion, and likewise from SoA values to packed AoS values - but the "blend" (which includes trivial things like partial mask) works on AoS values, so incoming fs values will go SoA->AoS, values from destination will go packed AoS->SoA->AoS, then do blend, then AoS->SoA->packed AoS which probably isn't the most efficient way though the shuffles are probably bearable). Passes piglit fbo-blending-formats (with GL_EXT_packed_float parameter), still need to verify Inf/NaNs (where most of the complexity in the conversion comes from actually). v2: drop the (very bogus) rgb9e5 part, and do component extraction in the helper code for r11g11b10 to float conversion, making the code slightly more compact (suggested by Jose), now that there are no other callers left this works quite well. (Could do the same for the opposite way but it's less than ideal there, final part of packing needs to be done in caller anyway and there'd be another conditional.) v3: minor style and comment fixes. Also fix a potential issue with negative zero being potentially returned by max(src, zero) as we don't have well-defined min/max behavior (fortunately no additional cost). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-22 20:10:53 +01:00
Michel Dänzer	31009b4521	r600g: Honour legacy debugging environment variables This helps minimize confusion / effort when moving between branches or helping others. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-03-22 10:29:49 +01:00
Matt Turner	81e585fabe	docs: Mark ARB_ES3_compatibility as done.	2013-03-21 15:59:21 -07:00
Rob Clark	eab8d6cbdb	freedreno: add pipe->blit Signed-off-by: Rob Clark <robdclark@gmail.com>	2013-03-21 17:33:51 -04:00
Paul Berry	eea30dff43	i965: Add a driconf option to disable flush throttling. Normally when submitting the first batch buffer after a flush, we check whether the GPU has completed processing of the first batch buffer of the previous frame. If it hasn't, we wait for it to finish before submitting any more batches. This prevents GPU-heavy and CPU-light applications from racing too far ahead of the current frame, but at the expense of possibly lower frame rates. Sometimes when benchmarking we want to disable this mechanism. This patch adds the driconf option "disable_throttling" to disable the throttling mechanism. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-21 13:24:43 -07:00
Matt Turner	12dc4be8a6	mesa: Implement TEXTURE_IMMUTABLE_LEVELS for ES 3.0. NOTE: This is a candidate for the 9.1 branch. Fixes piglit's texture-immutable-levels test. Reported-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-21 11:04:41 -07:00
Adam Jackson	38aa8ec937	glx: Build with VISIBILITY_CFLAGS in automake Note: This is a candidate for the stable branches. Signed-off-by: Adam Jackson <ajax@redhat.com>	2013-03-21 13:21:18 -04:00
Brian Paul	3804d67723	scons: check for existence of 'MSVC_VERSION' in env Evidently, MSVC_VERSION isn't always defined so check for it before checking the MSVC version. Suggested by Jose.	2013-03-21 09:24:40 -06:00
Brian Paul	10393038f8	softpipe: silence some asst. MSVC type warnings in sp_tex_sample.c	2013-03-21 09:24:35 -06:00
Brian Paul	b2d3f364db	softpipe: silence some MSVC signed/unsigned warnings	2013-03-21 09:24:35 -06:00
Brian Paul	2e3200d463	softpipe: silence some MSVC float/double warnings	2013-03-21 09:24:35 -06:00
Brian Paul	f7b07fd25c	rbug: silence some MSVC signed/unsigned warnings	2013-03-21 09:24:35 -06:00
Brian Paul	bfc8b8fac5	postprocess: silence some MSVC float/int warnings	2013-03-21 09:24:35 -06:00
Brian Paul	8bd5692a5d	meta: fix incorrect slice, r coordinate computation The arithmetic to convert a 3D texture slice to an R coordinate was incorrect. Found when MSVC warned of a divide by zero. Note that we don't actually ever hit this path. We don't decompress slices of 3D textures and we don't support 3D mipmap generation yet.	2013-03-21 09:24:35 -06:00
Brian Paul	a940c93aac	vega: fix MSVC warning about missing return statement	2013-03-21 09:24:35 -06:00
Brian Paul	52edca9df9	meta: minor indentation fix	2013-03-21 08:28:26 -06:00
Michel Dänzer	032e5548b3	radeonsi: Emit pixel shader state even when only the vertex shader changed Fixes random failures with piglit glsl-max-varyings. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Christian König <christian.koenig@amd.com>	2013-03-21 15:12:31 +01:00
Chad Versace	e34fe8bd20	android: Define PACKAGE_VERSION/BUGREPORT in CFLAGS This fixes the Android build. Commit `439c3d4` broke it. CC: Adrian M Negreanu <adrian.m.negreanu@intel.com> CC: Matt Turner <mattst@gmail.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-03-20 15:11:41 -07:00
Kenneth Graunke	d24819dce8	i965/vs: Add IR dumping for immediates. This makes dump_instructions more useful. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-20 10:40:44 -07:00
Kenneth Graunke	095c3755ee	glsl: Add built-in functions for GLSL 1.50. This makes basic built-in functions work in GLSL 1.50. It supports everything except the new Geometry Shader functions. The new 150.glsl file is 140.glsl plus ARB_texture_multisample.glsl; 150.frag is identical to 140.frag except for the #version bump. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-03-20 10:38:40 -07:00
Kenneth Graunke	bcdda04349	glsl: Add sampler2DMS/sampler2DMSArray types to GLSL 1.50. GLSL 1.50 includes support for the new sampler types introduced by the ARB_texture_multisample extension. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-03-20 10:38:38 -07:00
Kenneth Graunke	f1ca2ed538	glsl: Bump standalone compiler versions to 1.50. The version bumps are necessary in order to compile built-ins for 1.50. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-03-20 10:38:20 -07:00
Kenneth Graunke	d86efc075e	i965: Don't use texture swizzling to force alpha to 1.0 if unnecessary. Commit `33599433c7` began setting the texture swizzle mode to XYZ1 for RED, RG, and RGB textures in order to force alpha to 1.0 in case we actually stored the texture as RGBA. This had an unforeseen performance implication: the shader precompile assumes that the texture swizzle mode will be XYZW for non-shadow sampler types. By setting it to XYZ1, this means every shader used with a RED, RG, or RGB texture has to be recompiled. This is a very common case. Unfortunately, there's no way to improve the precompile, since RGBA textures still need XYZW, and there's no way to know by looking at the shader source what texture formats might be used. However, we only need to smash alpha to 1.0 if the texture's memory format actually has alpha bits. If not, the sampler already returns 1.0 for us without any special swizzling. XRGB8888, for example, is a very common case where this occurs. This partially fixes a performance regression since commit `33599433c7`. More work is required to fully fix it in all cases. This at least helps Warsow. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Carl Worth <cworth@cworth.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-20 10:37:34 -07:00
Kenneth Graunke	2dd22130cd	i965: Don't print a fatal-looking message if intelCreateContext fails. With the old context creation mechanism, an application asked the GL to give it a context. Failing to produce a context was a fatal error. Now, with GLX_ARB_create_context, the application can request a specific version. If it's higher than the maximum version we support, context creation will fail. But this is a normal error that applications recover from. In particular, the new glxinfo tries to create OpenGL 4.3, 4.2, 4.1, 4.0, 3.3, and 3.2 contexts before finally succeeding at creating a 3.1 context. This led to it printing the following message 6 times: "brwCreateContext: failed to init intel context" There's no need to alarm users (and developers) with such a message. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-20 10:37:34 -07:00
Eric Anholt	1f112ccf02	i965/gen7: Align all depth miplevels to 8 in the X direction. On an INTEL_DEBUG=perf piglit run on IVB, reduces the instances of "HW workaround: blit" (the printouts from the misaligned-depth workaround blits) from 725 to 675. It doesn't totally eliminate the workaround blit, because we still have problems with Y offsets that we can't fix (since texturing can only align miplevels up to 2 or 4, not 8). No regressions on piglit/es3conform on IVB. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-20 10:18:44 -07:00
Christoph Bumiller	529dbbfcf7	nvc0: fix max varying count, move CLIPVERTEX,FOG out of the way The card spews an error if I use all 128 generic slots. Apparently the real limit isn't just dictated by the address space layout.	2013-03-20 12:25:21 +01:00
Christoph Bumiller	8acaf862df	gallium: add TGSI_SEMANTIC_TEXCOORD,PCOORD v3 This makes it possible to identify gl_TexCoord and gl_PointCoord for drivers where sprite coordinate replacement is restricted. The new PIPE_CAP_TGSI_TEXCOORD decides whether these varyings should be hidden behind the GENERIC semantic or not. With this patch only nvc0 and nv30 will request that they be used. v2: introduce a CAP so other drivers don't have to bother with the new semantic v3: adapt to the introduction of the gl_varying_slot enum	2013-03-20 12:25:21 +01:00
Ian Romanick	3eaf823b90	docs: import release notes for 9.1.1, add news item Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-19 17:46:30 -07:00
Kristian Høgsberg	939789e48d	gallium-egl: Fix compile errors introduced in `de315f76a` The commit changed API in a helper library shared by both egl_dri2 and the gallium egl state tracker, but only egl_dri2 was updated to use the new interface. Tested-by: Giulio Camuffo <giuliocamuffo@gmail.com>	2013-03-19 20:17:47 -04:00
Paul Berry	995bbc2256	i965/fs: Avoid unnecessary recompiles due to POS bit of proj_attrib_mask. Previous to this patch, when using fixed function fragment shading, bit VARYING_BIT_POS of brw_wm_prog_key::proj_attrib_mask was being set differently during precompiles and normal usage. During precompiles it was being set only if the fragment shader reads from window position (which it never does), so it was always being set to 0. During normal usage it was being set if the vertex shader writes to all 4 components of gl_Position (which it usually does), so it was usually being set to 1. As a result, we were almost always doing an extra recompile for the fixed function fragment shader. The recompile was totally unnecessary, though, because brw_wm_prog_key::proj_attrib_mask is only consulted for fs_visitor::emit_general_interpolation(), which isn't used for VARYING_SLOT_POS. This patch avoids the unnecessary recompile by always setting bit VARYING_BIT_POS of brw_wm_prog_key::proj_attrib_mask to 1. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-19 16:56:58 -07:00
Paul Berry	db81d3b8f7	ff_fragment_shader: Don't do unnecessary (and dangerous) uniform setup. Previously, right after calling _mesa_glsl_link_shader(), the fixed function fragment shader code made several calls with the ostensible purpose of setting up uniforms for the fragment shader it just created. These calls are unnecessary, since _mesa_glsl_link_shader() calls driver->LinkShader(), which takes care of calling these functions (or their equivalent). Also, they are dangerous to call after _mesa_glsl_link_shader() has returned, because on back-ends such as i965 which do precompilation, _mesa_glsl_link_shader() may have already cached pointers to the existing uniform structures; attempting to set up the uniforms again invalidates those cached pointers. It was only by sheer coincidence that this wasn't manifesting itself as a bug. It turns out that i965's precompile mechanism was always setting bit 0 of brw_wm_prog_key::proj_attrib_mask to 0 for fixed function fragment shaders, but during normal usage this bit usually gets set to 1. As a result, the precompiled shader (with its invalid uniform pointers) was not being used. I'm about to introduce some changes that cause bit 0 of proj_attrib_mask to be set consistently between precompilation and normal usage, so to avoid regressions I need to get rid of the dangerous duplicate uniform setup code first. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-19 16:56:56 -07:00
Paul Berry	0af56c9d53	i965: Avoid unnecessary copy when depthstencil workaround invoked by clear. Since apps typically begin rendering with a call to glClear(), it is likely that when brw_workaround_depthstencil_alignment() moves a miplevel to a temporary buffer, it can avoid doing a blit, since the contents of the miplevel are about to be erased. This patch adds the necessary plumbing to determine when brw_workaround_depthstencil_alignment() is being called as a consequence of glClear(), and avoids the unnecessary blit when it is safe to do so. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Eliminate unnecessary call to _mesa_is_depthstencil_format(). Fix handling of depth buffer in depth/stencil format. v3: Use correct bitfields for clear_mask. Fix handling of depth buffer in depth/stencil format when hardware uses separate stencil. When invalidating, make sure we still reassociate the image to the new miptree. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-19 16:56:51 -07:00
Alex Deucher	49c1fc7044	r600g: don't emit SQ_DYN_GPR_RESOURCE_LIMIT_1 on cayman Doesn't exist on the asic and will cause a CS rejection if VM is disabled. Note: this is a candidate for the 9.1 branch. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-03-19 18:13:27 -04:00
Alex Deucher	a9914117ea	r600g: emit DB_SRESULTS_COMPARE_STATE0 on r6xx/r7xx Not using HiS yet, but matches what we do on evergreen+. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-03-19 18:13:26 -04:00
Brian Paul	c45d22e26a	winsys/svga: improve error/debug message output Use vmw_printf() just for extra debugging info (off by default). Use vmw_error() for real errors/failures/etc that we definitely want to report. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-19 15:18:38 -06:00
Brian Paul	460a4444e8	tgsi: fix uninitialized declaration array fields Fixes a few regressions since the TGSI array changes. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-19 15:15:37 -06:00
Kristian Høgsberg	1670737436	egl_dri2: Lower __DRI_IMAGE version requirement back to 1 We check the extension version manually instead and verify that we have the createImageFromFds function before enabling prime fd passing.	2013-03-19 16:13:38 -04:00
Maarten Lankhorst	7c3d8301af	radeon/llvm: Do not link against libgallium when building statically. NOTE: This is a candidate for the 9.1 branch. Tested-by: Vincent Lejeune <vljn@ovi.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-03-19 20:20:33 +01:00
Matt Turner	322c840bea	gles2: Add an ABI-check test Checks that no functions are exported that are not part of the ABI. Note that currently we are exporting functions that are aliased to functions that are part of the ABI. They shouldn't be exported, but the XML descriptions don't adequately describe this case.	2013-03-19 12:04:32 -07:00
Matt Turner	569bd281c1	gles1: Add an ABI-check test Checks that no functions are exported that are not part of the ABI. Note that currently we are exporting functions that are aliased to functions that are part of the ABI. They shouldn't be exported, but the XML descriptions don't adequately describe this case.	2013-03-19 12:04:31 -07:00
Andreas Boll	182895c4e6	gallium/egl: fix out-of-tree build Taken from downstream: http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/15-fix-oot-build.diff;h=7040999a22d3937d0578cfd85ee2c71d7dc614bb;hb=refs/heads/ubuntu%2B1 NOTE: This is a candidate for the 9.1 branch. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-03-19 18:12:38 +01:00
Andreas Boll	92e6260c19	osmesa: fix out-of-tree build Taken from downstream: http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/14-fix-osmesa-build.diff;h=00581d0e1833c5492d9050e1bf3d5e658cad782e;hb=refs/heads/ubuntu%2B1 v2: Move the added line immediately after -I$(top_srcdir)/src/mapi NOTE: This is a candidate for the 9.1 and 9.0 branches. Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-03-19 18:12:38 +01:00
Andreas Boll	06fff296e9	build: Enable x86 assembler on Hurd. Taken from downstream: http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/10-hurd-configure-tweaks.diff;h=984e17df1b8afdf8e4b36bee96aa5ab6a5691021;hb=refs/heads/ubuntu%2B1 Thanks to Pino Toscano. v2: Don't bother with x86_64. AFAICT GNU/Hurd doesn't support it so far. NOTE: This is a candidate for stable branches. Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Acked-by: Matt Turner <mattst88@gmail.com>	2013-03-19 18:12:38 +01:00
Andreas Boll	7962f28c43	mesa: use ieee fp on s390 and m68k Taken from downstream: http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/02_use-ieee-fp-on-s390-and-m68k.patch;h=d3d6c1d7fec3c72ecf320706167deb61c52636c3;hb=refs/heads/ubuntu%2B1 Fixes Debian bug #349437. Patch written by David Nusinow. NOTE: This is a candidate for stable branches. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Matt Turner <mattst88@gmail.com>	2013-03-19 18:12:37 +01:00
Roland Scheidegger	5af7b45986	gallivm: fix return opcode handling in main function of a shader If we're in some conditional or loop we must not return, or the code after the condition is never executed. (v2): And, we also can't just continue as if nothing happened, since the mask update code would later check if we actually have a mask, so we need to remember that there was a return in main where we didn't exit (to illustrate this, a ret in an if clause would cause a mask update which is still ok as we're in a conditional, but after the endif the mask update code would drop the mask hence bringing execution back to pixels which should have their execution mask set to zero by the ret). Thanks to Christoph Bumiller for figuring this out. This fixes https://bugs.freedesktop.org/show_bug.cgi?id=62357. Note: This is a candidate for the stable branches. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-19 18:04:05 +01:00
Rob Clark	afc1b7c21f	freedreno: clear fixes Some fixes for clearing only depth or only stencil. Signed-off-by: Rob Clark <robdclark@gmail.com>	2013-03-19 10:49:30 -04:00
Christian König	90862c8507	radeonsi: enable indirect addressing Fixing 16 piglit tests. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-19 15:16:18 +01:00
Christian König	5e616cf2c5	radeonsi: implement indirect addressing of constants Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-19 15:16:18 +01:00
Christian König	f5298b0a65	radeonsi: switch to using resource descriptors for constants v2 v2: remove superfluous mask, use buffer_size instead of constant Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-19 15:16:18 +01:00
Christian König	c05483fc00	radeon/llvm: rework input fetch and output store Cleanup the code and implement indirect addressing. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-19 15:16:18 +01:00
Brian Paul	b51f8593d8	tgsi: add initializer data to fix MSVC compile error	2013-03-19 07:55:48 -06:00
Christian König	897303f8ff	tgsi: add ArrayID documentation v2 v2: further improve the text with comments from Christoph Bumiller. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-19 13:38:32 +01:00
Christian König	21190fbd56	tgsi: use separate structure for indirect address v2 To further improve the optimization of source and destination indirect addressing we need the ability to store a reference to the declaration of the addressed operands. Since most of the fields in tgsi_src_register don't apply to an indirect addressing operand, replace it with a separate tgsi_ind_register structure and so make room for extra information. v2: rename Declaration to ArrayID, put the ArrayID into () instead of [] Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-19 13:38:32 +01:00
Christian König	16caeff2a5	tgsi: add ArrayID to declarations Remember which declarations are declared as "arrays" and so can be indirectly addressed. ArrayIDs start at 1, because for compatibility reasons zero is treated as no array present. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-19 13:38:32 +01:00
Christian König	d3e07bed90	tgsi: remove TGSI_FILE_(IMMEDIATE|TEMP)_ARRAY Nobody seems to be using it, and only nv50 had a partial implementation. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-19 13:38:32 +01:00
Christian König	affdff230b	glsl_to_tgsi: remove indirect addressing limitations They shouldn't be necessary any more. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-19 13:38:32 +01:00
Christian König	3f67251e3d	glsl_to_tgsi: allocate arrays separately v2 Instead of allocating everything as temporaries, use the new array allocation functions. v2: fix bug in simplify_cmp, declare arrays on demand Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-19 13:38:32 +01:00
Christian König	433b2ca46b	glsl_to_tgsi: use get_temp for all allocations Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-19 13:38:32 +01:00
Christian König	506d400275	tgsi/ureg: implement support for array temporaries Don't bother with free temporaries, just allocate them at the end and also emit them in their own declaration. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-19 13:38:32 +01:00
Christian König	52947b93b2	tgsi/ureg: cleanup local temporary emission v2 Instead of emitting each temporary separately, emit them in a chunk. v2: keep separate function for emitting temps Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-19 13:38:31 +01:00
Andreas Boll	36320bfa54	radeon/llvm: Link against libgallium.la to fix an undefined symbol Ported from downstream: http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/119-libllvmradeon-link.patch;h=ee47f8a07dbf33c32f8b57faed923680ed6648fb;hb=refs/heads/ubuntu%2B1 Fixes a regression introduced with `f70c385351` NOTE: This is a candidate for the 9.1 branch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62434 Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-03-19 12:07:51 +01:00
Kristian Høgsberg	de315f76a2	wayland: Add prime fd passing as a buffer sharing mechanism Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>	2013-03-18 21:15:41 -04:00
Kristian Høgsberg	2356e28452	Add dri image entry point for creating image from fd Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>	2013-03-18 21:03:54 -04:00
Kristian Høgsberg	664fe6dc84	wayland: allocate a __DRIimage for the color buffer No functional change here, but this will let us query the image for an fd handle later. Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>	2013-03-18 21:03:46 -04:00
Rob Clark	4e8f5c52bb	DRI2: HACK: no GLX_INTEL_swap_event if no ScheduleSwap If ddx does not support swap, don't advertise it. This is a hack to work around current xservers which advertise this extension even when it is clearly not supported. When: http://lists.x.org/archives/xorg-devel/2013-February/035449.html is merged in upstream xserver and makes its way into most distros then this hack can be removed. In the meantime, it is required to allow gnome-shell/clutter/etc to work properly with a DDX driver which does not support ScheduleSwap. Signed-off-by: Rob Clark <robdclark@gmail.com>	2013-03-18 14:16:43 -04:00
Paul Berry	5a13e051d9	i965/blorp: Add INTEL_DEBUG=blorp flag. This debug flag prints out the native GEN assembly for a blitting shader produced using BLORP. Hopefully this should be useful in developing additional BLORP features. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-03-18 09:27:25 -07:00
Alex Deucher	2da8ee16a8	r600g: properly set non_disp tiling mode for DMA (v2) Needs to be set for depth, stencil, and fmask just like other blocks. v2: drop additional cayman bits for now Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-03-17 13:32:48 -04:00
Alex Deucher	4409758a04	r600g: Use blitter rather than DMA for 128bpp on cayman (v3) On cayman, 128bpp surfaces require non_disp ordering for hw access to both linear and tiled surfaces. When we use the 3D engine we can set the non_disp ordering on both the tiled and linear sides (via CB or texture), but when we use the DMA engine, we can only set the non_disp ordering on the tiled side, so after a L2T operation with the DMA engine, the data ends up in the wrong order on the tiled side. v2: cayman/TN only v3: fix comments Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=60802 Note: this is a candidate for the 9.1 branch. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-03-17 13:32:48 -04:00
Paul Berry	346a1b9bb9	i965: Simplify separate stencil check The only format returned by _mesa_get_format_base_format() that satisfies _mesa_is_depthstencil_format() is GL_DEPTH_STENCIL, so we can simplify the check. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-16 10:15:51 -07:00
Maarten Lankhorst	f70c385351	gallium/build: Fix visibility CFLAGS in automake v2: Andreas Boll <andreas.boll.dev@gmail.com> - Fix formatting - use one CFLAG per line NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59238 Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-03-16 12:45:22 +01:00
José Fonseca	49ae9b08d4	scons: Warn when using MSVS versions prior to 2012. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-15 19:55:54 +00:00
Paul Berry	c5d5827951	i965: Apply depthstencil alignment workaround when doing fast clears. Fast depth clears have the same depth/stencil alignment requirements as other drawing operations. Therefore, we need to call brw_workaround_depthstencil_alignment() from both the clear and drawing paths. Without this fix, we get image corruption if the following conditions hold: (a) the first ever drawing operation to a depth miplevel (or the first drawing operation after having used the texture for sampling) is a clear, (b) the depth miplevel has a size that is eligible for fast depth clears, and (c) the depth miplevel has an offset within the miptree that isn't 8x8 aligned. Fixes piglit "depthstencil-render-miplevels" tests with size 273. NOTE: This is a candidate for stable branches Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-15 11:52:33 -07:00
Paul Berry	eed6baf762	Replace gl_frag_attrib enum with gl_varying_slot. This patch makes the following search-and-replace changes: gl_frag_attrib -> gl_varying_slot FRAG_ATTRIB_* -> VARYING_SLOT_* FRAG_BIT_* -> VARYING_BIT_* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-15 09:26:17 -07:00
Paul Berry	f117abe664	Get rid of _mesa_frag_attrib_to_vert_result(). Now that there is no difference between the enums that represent vertex outputs and fragment inputs, there's no need for a conversion function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-15 09:26:07 -07:00
Paul Berry	10a131211e	Get rid of _mesa_vert_result_to_frag_attrib(). Now that there is no difference between the enums that represent vertex outputs and fragment inputs, there's no need for a conversion function. But we still need to be able to detect when a given vertex output has no corresponding fragment input. So it is replaced by a new function, _mesa_varying_slot_in_fs(), which tells whether the given varying slot exists as an FS input or not. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-15 09:25:57 -07:00
Paul Berry	827c074fb1	mtypes.h: Modify gl_frag_attrib to refer to new gl_varying_slot enum. This paves the way for eliminating the gl_frag_attrib enum entirely. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-15 09:25:46 -07:00
Paul Berry	a6d807c86f	Replace gl_geom_result enum with gl_varying_slot. This patch makes the following search-and-replace changes: gl_geom_result -> gl_varying_slot GEOM_RESULT_* -> VARYING_SLOT_* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-15 09:25:36 -07:00
Paul Berry	d453225efc	mtypes.h: Modify gl_geom_result to refer to new gl_varying_slot enum. This paves the way for eliminating the gl_geom_result enum entirely. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-15 09:25:26 -07:00
Paul Berry	d7c60a4a4f	Replace gl_geom_attrib enum with gl_varying_slot. This patch makes the following search-and-replace changes: gl_geom_attrib -> gl_varying_slot GEOM_ATTRIB_* -> VARYING_SLOT_* GEOM_BIT_* -> VARYING_BIT_* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-15 09:25:15 -07:00
Paul Berry	094bcf399c	mtypes.h: Modify gl_geom_attrib to refer to new gl_varying_slot enum. This paves the way for eliminating the gl_geom_attrib enum entirely. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-15 09:25:05 -07:00
Paul Berry	36b252e947	Replace gl_vert_result enum with gl_varying_slot. This patch makes the following search-and-replace changes: gl_vert_result -> gl_varying_slot VERT_RESULT_* -> VARYING_SLOT_* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-15 09:24:54 -07:00
Paul Berry	9e729a79b0	mtypes.h: Modify gl_vert_result to refer to new gl_varying_slot enum. This paves the way for eliminating the gl_vert_result enum entirely. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-15 09:24:44 -07:00
Paul Berry	8a076c5f05	mtypes.h: Add new gl_varying_slot enum, and bitfield defines. Future patches will make use of the enum. It will eventually take the place of the existing enums gl_vert_result, gl_geom_attrib, gl_geom_result, and gl_frag_attrib, all of which represent essentially the same information but using inconsistent values. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-15 09:24:34 -07:00
Paul Berry	6bec74bfd9	i965: Change fragment input related bitfields to 64-bit. This patch updates the bitfields brw_context::wm.input_size_masks, tracker::size_masks, and brw_wm_prog_key::proj_attrib_mask, all of which are indexed by gl_frag_attrib, from 32-bit to 64-bit. This paves the way for supporting geometry shaders, and for merging the gl_frag_attrib and gl_vert_result enums. The combination of these two will require at least 55 bits in the bitfields. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Brian Paul <brianp@vmware.com>	2013-03-15 09:24:30 -07:00
Alex Deucher	03eef7f8ef	r600g: add Richland APU pci ids Note: this is a candidate for the stable branches. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-03-15 09:24:14 -04:00
Brian Paul	fec8733d4e	st/dri: add support for the always_have_depth_buffer option This involved adding another driOptionCache to dri_screen. The existing one just held the default values. But now we also need to have the values from the DRI config file so that we can get at the always_have_depth_buffer config option, which is per-screen.	2013-03-15 07:05:01 -06:00
Brian Paul	5d1b3097e2	driconf: add a miscellaneous section and always_have_depth_buffer option This option is needed for some applications that neglect to request a depth buffer when choosing a visual/fbconfig. The Linux app Topogun is an example of this problem.	2013-03-15 07:04:13 -06:00
Brian Paul	b3d184bac6	driconf: reorder options, reformat comments, etc Move the options into the proper section (Debug, Quality, Performance, etc). Update comments and add some whitespace to improve readability.	2013-03-15 07:04:08 -06:00
Philipp Brüschweiler	c07c18081e	wayland: fix segfault when using software rendering wayland_roundtrip() was given an incorrect parameter. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=62362 Note: This is a candidate for the stable branches. Signed-off-by: Brian Paul <brianp@vmware.com>	2013-03-15 06:50:23 -06:00
Brian Paul	f4a2c29d93	softpipe: fix up NUM_ENTRIES confusion There were two different NUM_ENTRIES #defines for the framebuffer tile cache and the texture tile cache. Rename the latter to fix the warnings: In file included from sp_flush.c:40:0: sp_tex_tile_cache.h:76:0: warning: "NUM_ENTRIES" redefined sp_tile_cache.h:78:0: note: this is the location of the previous definition In file included from sp_context.c:50:0: sp_tex_tile_cache.h:76:0: warning: "NUM_ENTRIES" redefined sp_tile_cache.h:78:0: note: this is the location of the previous definition Also, replace occurrences of NUM_ENTRIES with Element() macro to be safer. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-14 18:17:18 -06:00
Brian Paul	2f6970ae97	st/osmesa: silence some optimized build warnings	2013-03-14 18:09:42 -06:00
Brian Paul	6a9d7659d6	draw: init pre_clip_pos = NULL to fix optimized build warning	2013-03-14 18:09:42 -06:00
Brian Paul	622b1fcc18	glx: init screen = 0 to fix optimized build warning	2013-03-14 18:09:42 -06:00
Kenneth Graunke	91df4d746b	i965: Make INTEL_DEBUG=shader_time use the RAW surface format. Untyped Atomic Operation messages are illegal for non-RAW formats. The IVB hardware proceeds happily (after all, who cares what the format of the surface is if you're doing untyped ops on it?), but later hardware apparently doesn't. The simulator for gen7 does complain, though. v2: Rebase against updates to previous patches. (by anholt) NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-14 12:30:40 -07:00
Kenneth Graunke	125b34cffb	i965: Specialize SURFACE_STATE creation for shader time. This is basically a copy and paste of gen7_create_constant_surface, but with the parameters filled in to offer a simpler interface. It will diverge shortly. I didn't bother adding it to the vtable for now since shader time is only exposed on Gen7+. v2: Replace tabs in the new code (by anholt) Add back dropped memset() and add a comment about HSW channel selects. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-14 12:30:40 -07:00
Kenneth Graunke	f27a220cad	i965: Fix INTEL_DEBUG=shader_time for Haswell. Haswell's "Data Cache" data port is a single unit, but split into two SFIDs to allow for more message types without adding more bits in the message descriptor. Untyped Atomic Operations are now message 0010 in the second data cache data port, rather than 6 in the first. v2: Use the #defines from the previous commit. (by anholt) NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> (v1)	2013-03-14 12:30:40 -07:00
Eric Anholt	a2d08f170a	i965: Add definitions for gen7+ data cache messages. We were sparsely using some of these message types, but I'll just fill them all in now. It will be used for fixing shader_time on HSW. v2: Add missing MEDIA_BLOCK_READ. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-14 12:30:39 -07:00
Eric Anholt	db3a0f13ef	i965: Split shader_time entries into separate cachelines. This avoids some snooping overhead between EUs processing separate shaders (so VS versus FS). Improves performance of a minecraft trace with shader_time by 28.9% +/- 18.3% (n=7), and performance of my old GLSL demo by 93.7% +/- 0.8% (n=4). v2: Add a define for the stride with a comment explaining its units and why. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-14 12:30:39 -07:00
José Fonseca	a35a19a6ea	scons: Define _ALLOW_KEYWORD_MACROS on MSVC builds. scons/llvm.py defines inline globally to work around issues with LLVM C binding headers, so the only way to avoid aggravating xkeycheck.h errors is to set _ALLOW_KEYWORD_MACROS. This fixes MSVC 2012 build with LLVM. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-14 19:01:10 +00:00
José Fonseca	6a3d77e13d	softpipe: Shrink context size. - each softpipe_tex_tile_cache 50*64*64*4*4 = 3,276,800 bytes - each softpipe_context has 3*32 softpipe_tex_tile_cache, i.e, each softpipe context is 314,572,800 bytes, i.e, 300MB That is, in a 32bits process (around 3GB virtual memory max), we can only fit 10 contexts. This change is a short-term hack to shrink the context size. Longer term we'll need to change how the texture cache works. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-14 11:59:53 +00:00
Christian König	ce3aa0e775	radeon/llvm: fix LLVM dependencies Since commit `1c4f283151` we obviously depend on this. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-14 12:38:54 +01:00
Anuj Phogat	d78dcdf103	mesa: Fix FB blitting in case of zero size src or dst rect Framebuffer blitting operation should be skipped if any of the dimensions (width/height) of src/dst rect is zero. V2: Move the dimension check after error checking in _mesa_BlitFramebuffer. Fixes: fbblit(negative.nullblit.zeroSize) in Intel oglconform https://bugs.freedesktop.org/show_bug.cgi?id=59495 Note: Candidate for all the stable branches. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-03-13 17:58:09 -07:00
Roland Scheidegger	1826659272	tgsi: fix sample_d emit for arrays Those cases were apparently forgotten. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-14 00:22:55 +01:00
Roland Scheidegger	9e93d7c4fd	llvmpipe: don't assert when trying to render to surfaces with multiple layers instead just warn when creating the surface, rendering will simply happen to first layer. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-14 00:22:30 +01:00
Roland Scheidegger	81e728982d	softpipe: don't assert when creating surfaces with multiple layers We can't handle them yet, however we can safely just warn (we will just render to first layer, which is fine since we can't handle rendertarget system value either). Also make behavior more predictable with buffer surfaces (it would sometimes hit bogus asserts because of the union in the surface, instead create the surface but assert when trying to set a buffer in the framebuffer). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-14 00:21:56 +01:00
José Fonseca	4889315619	llvmpipe: Fix geometry shader token leak. Trivial. Matches softpipe's code.	2013-03-13 21:46:50 +00:00
Tom Stellard	c95177ea88	radeon/llvm: Add missing license headers Signed-off-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-13 16:01:31 +00:00
Tom Stellard	1c4f283151	radeon/llvm: Make radeon_llvm_util.cpp a C file All the functions in this file are now implemented in C.	2013-03-13 16:01:31 +00:00
Tom Stellard	3958c104c6	radeon/llvm: Optimize radeon_llvm_strip_unused_kernels() Just delete unused kernels rather than marking them as internal and running the GlobalDCE pass. Also implement this function in C and inline it into radeon_llvm_get_kernel_module()	2013-03-13 16:01:31 +00:00
Tom Stellard	2ace79dce5	radeon/llvm: Implement radeon_llvm_get_kernel_module() using the C API	2013-03-13 16:01:31 +00:00
Tom Stellard	b34b8576ec	radeon/llvm: Implement radeon_llvm_get_num_kernels() using the C API	2013-03-13 16:01:31 +00:00
Tom Stellard	7e9abbea15	radeon/llvm: Implement radeon_llvm_parse_bitcode() using C API Also make the function static since it is not used anywhere else.	2013-03-13 16:01:30 +00:00
Tom Stellard	97bfcddde0	r600g/llvm: Move llvm wrapper functions into the radeon directory	2013-03-13 16:01:30 +00:00
Jon TURNEY	28e1693630	Properly check GLX_INDIRECT_RENDERING in glapi/tests/check_table Actually use $DEFINES, so we can see if GLX_INDIRECT_RENDERING is defined If GLX_INDIRECT_RENDERING is defined, _GLAPI_SKIP_PROTO_ENTRY_POINTS will be defined, and libglapi won't contain the 'protocol entry points', so we should provide stubs in check_table.cpp	2013-03-13 14:55:52 +00:00
Jon TURNEY	ed8ddd57e9	Fix glapi/tests/check_table.cpp for standardized OpenGL function names It looks like this has been broken since commit `1a1db1746d` "Standardize names of OpenGL functions." Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2013-03-13 14:53:49 +00:00
Jon TURNEY	c7a319182f	Fix out-of-tree build of 'make check' in src/mapi/glapi/tests/ Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2013-03-13 14:53:36 +00:00
José Fonseca	cff70dcfb2	scons: Define PACKAGE_VERSION/BUGREPORT globally. Fixes the scons build.	2013-03-13 13:13:37 +00:00
Vinson Lee	a6bb7a9495	tests: Add $(top_srcdir)/include to AM_CPPFLAGS. Fixes this build error with make check. CC collision.o In file included from ../../../../../src/mesa/main/hash_table.h:34:0, from collision.c:31: ../../../../../src/mesa/main/compiler.h:51:53: fatal error: c99_compat.h: No such file or directory Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-03-12 23:14:39 -07:00
José Fonseca	f7ef83cdf4	scons: Define PACKAGE_xxx Should get the builds going again.	2013-03-13 01:29:47 +00:00
Brian Paul	6f86b934e6	docs: rewrite the OSMesa info / instructions Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-12 19:04:43 -06:00
Brian Paul	79eac7da6b	configure: wire-up new OSMesa gallium state tracker and target Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-12 19:04:43 -06:00
Brian Paul	be51f123c9	target/osmesa: add new Makefile.am Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-12 19:04:43 -06:00
Brian Paul	94263da46e	targets/osmesa: new OSMesa gallium target Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-12 19:04:43 -06:00
Brian Paul	7114b6a92d	st/osmesa: add new Makefile.am Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-12 19:04:43 -06:00
Brian Paul	73436a909e	st/osmesa: new OSMesa gallium state tracker Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-12 19:04:43 -06:00
Brian Paul	3c3668c5a1	st/mesa: add PIPE_FORMAT_R16G16B16A16_UNORM renderbuffer support To allow rendering in 16-bit/channel RGBA buffers. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-12 19:04:42 -06:00
José Fonseca	c526e1728f	scons: Re-add ','	2013-03-13 00:31:03 +00:00
José Fonseca	7bff1cc3f6	autotools: Add missing top-level include dir. Fixes autotools build failure. Not sure if there are more, as I have difficulties in building the full tree.	2013-03-13 00:25:09 +00:00
Matt Turner	5c6e1e97b3	configure.ac: Alphabetize freedreno makefiles.	2013-03-12 17:09:55 -07:00
Matt Turner	d89ef39418	build: Get rid of dead MESA_ASM_FILES variable Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-12 17:02:54 -07:00
Matt Turner	bd0c9d07d0	mesa/build: Get rid of dead ALL_FILES variable Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-12 17:02:47 -07:00
Matt Turner	51e065a96c	xmlpool/.gitignore: Remove 'Makefile' Handled by top level .gitignore. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-12 17:02:40 -07:00
Matt Turner	e59fc3faa5	mesa: Use PACKAGE_BUGREPORT macro. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-12 17:02:33 -07:00
Matt Turner	9065bab37e	mesa: Remove unused version #defines from version.h. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-12 17:02:28 -07:00
Matt Turner	439c3d4e31	mesa: Replace MESA_VERSION with PACKAGE_VERSION. One fewer place to have to update. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-12 17:02:21 -07:00
Zack Rusin	42c1b33f6d	draw/so: Fix stream output with geometry shaders If a geometry shader is present, its stream output info should be used instead of the vs and we shouldn't use the pre-clipped coordinates. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-12 16:22:26 -07:00
José Fonseca	57cd1d1454	include: Fix build with VS 11 (i.e, 2012). NOTE: Candidate for the stable branches. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-12 22:07:10 +00:00
José Fonseca	70fe7c6d3e	mesa,gallium,egl,mapi: One definition of C99 inline/__func__ to rule them all. We were in four already... NOTE: Candidate for the stable branches. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-12 22:06:27 +00:00
José Fonseca	96b3ca89b1	scons: Allows choosing VS 10 or 11. NOTE: Candidate for the stable branches. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-12 22:04:04 +00:00
Michel Dänzer	4dca602521	radeonsi: Fix off-by-one for maximum vertex element index in some cases In cases where the vertex element size is smaller than the vertex buffer stride, the previous calculation could end up 1 too low. This would result in the GPU using index 0 instead of the maximum index for those elements, which would be visible as intermittent distorted triangles. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-03-12 18:25:54 +01:00
Christoph Bumiller	8aa8b0539e	nvc0: avoid crash on updating RASTERIZE_ENABLE state When doing a blit with the 3D engine, the rasterizer or zsa cso may be NULL.	2013-03-12 12:55:37 +01:00
Christoph Bumiller	4d28aff48f	gallium/tests: check format in compute tests, make selectable	2013-03-12 12:55:37 +01:00
Christoph Bumiller	e2dded78ea	nvc0: add MP trap handler for nve4	2013-03-12 12:55:37 +01:00
Christoph Bumiller	ae59a7d35d	nvc0: they removed the NTID,NCTAID,GRIDID registers on nve4	2013-03-12 12:55:37 +01:00
Christoph Bumiller	e066f2f62f	nvc0: implement compute support for nve4	2013-03-12 12:55:37 +01:00
Christoph Bumiller	75f1f852b0	nvc0/ir: try to fix CAS (CompareAndSwap)	2013-03-12 12:55:37 +01:00
Christoph Bumiller	18fdfbdc32	nv50/ir: add CCTL (cache control) op	2013-03-12 12:55:37 +01:00
Christoph Bumiller	9db7e09cb4	nvc0/ir/emit: fix emission of large address offsets	2013-03-12 12:55:36 +01:00
Christoph Bumiller	175c185941	nvc0: add SHADER/COMPUTE_RESOURCE bind flags to format table	2013-03-12 12:55:36 +01:00
Christoph Bumiller	19ea0bd521	nouveau: align PIPE_BIND_SHADER,COMPUTE_RESOURCEs to 256 bytes	2013-03-12 12:55:36 +01:00
Christoph Bumiller	47f2179844	nv50,nvc0: copy writable flag on surface creation	2013-03-12 12:55:36 +01:00
Christoph Bumiller	7a91d3a2a4	nv50/ir: add support for different sampler and resource index on nve4 And remove non-working code for indirect sampler/resource selection. Will be added back later. Includes code from "nv50/ir/tgsi: Resource indirect indexing" by Francisco Jerez (when mixing the R and S handles we can only specify them via a register, i.e. indirectly, unless we upload all the used handle combinations to c[] space, which we don't for now).	2013-03-12 12:55:36 +01:00
Christoph Bumiller	99e4eba669	nv50/ir: implement splitting of 64 bit ops after RA	2013-03-12 12:55:36 +01:00
Christoph Bumiller	ac9f19e485	nvc0/ir: skip back edges when determining latest sched value	2013-03-12 12:55:36 +01:00
Christoph Bumiller	f07c46a4f4	nvc0/ir: use large issue delay after RET, too	2013-03-12 12:55:36 +01:00
Christoph Bumiller	b23ec3f8ba	nv50/ir: fix size adjustment for sched info for multiple functions	2013-03-12 12:55:36 +01:00
Christoph Bumiller	d39169cb6d	nv50/ir: print function inputs and outputs	2013-03-12 12:55:36 +01:00
Christoph Bumiller	1b4faa2b17	nv50/ir/ssa: add a few comments regarding RenamePass	2013-03-12 12:55:36 +01:00
Francisco Jerez	1535b754fb	nv50/ir/tgsi: Exclude local declarations from function prototypes.	2013-03-12 12:55:36 +01:00
Christoph Bumiller	9b563ef3f7	nv50/ir/opt: try to make use of SUCLAMP addend	2013-03-12 12:55:36 +01:00
Christoph Bumiller	a788be19e5	nv50/ir: don't assert on type in Modifier.applyTo if it is 0	2013-03-12 12:55:35 +01:00
Christoph Bumiller	c3a5bc0bdf	nv50/ir: add support for barriers nv50 part by Francisco Jerez.	2013-03-12 12:55:35 +01:00
Christoph Bumiller	a0a25191f2	nv50/ir/tgsi: add support for atomics	2013-03-12 12:55:35 +01:00
Christoph Bumiller	c2dfcd7f0e	nv50/ir/tgsi: handle TGSI_OPCODE_LOAD,STORE Squashed and (heavily) modified original patches by Francisco Jerez: nv50/ir/tgsi: Implement resource LOAD/STORE (wip). nv50/ir/tgsi: Emit SUST/SULD for surface access, and add CB LOAD/STORE support nv50/ir/tgsi: Fix/clean up the LOAD/STORE handling code. Left out for now: nv50/ir/tgsi: Resource indirect indexing Treating raw, read-only surfaces as constant buffers (CBs) was removed because CBs are limited to a size of 64 KiB which isn't desirable, and because this decision should probably be made by the state tracker. If we used a number of CB slots for surfaces, it might find that we cannot accommodate the advertised limit.	2013-03-12 12:55:35 +01:00
Christoph Bumiller	d105b3df14	nvc0/ir: don't replace load from input in COMPUTE progs with VFETCH	2013-03-12 12:55:35 +01:00
Christoph Bumiller	4506ed28de	nvc0/ir: implement lowering of surface ops for nve4	2013-03-12 12:55:35 +01:00
Christoph Bumiller	8ac68b071d	nvc0/ir: add formatted surface load lib code, move to extra header OpenGL is nice and makes the user specify a format with an image unit. OpenCL is evil and doesn't, and what's better than adding a huge load of functions that we call indirectly to handle the conversion ?	2013-03-12 12:55:35 +01:00
Christoph Bumiller	ce1951daed	nv50/ir: extend moveSources for delta < 0	2013-03-12 12:55:35 +01:00
Christoph Bumiller	c0fc3463e9	nvc0/ir: lower atomics in s[]	2013-03-12 12:55:35 +01:00
Christoph Bumiller	9c196779bc	nvc0/ir/emit: implement INSBF, EXTBF, PERMT and ATOM	2013-03-12 12:55:35 +01:00
Christoph Bumiller	c8f0c43f7a	nv50/ir/emit: handle OP_ATOM	2013-03-12 12:55:35 +01:00
Christoph Bumiller	d6c95f6819	nvc0/ir/target: some ops can't be predicated, e.g. CALL	2013-03-12 12:55:35 +01:00
Christoph Bumiller	1ed507ca46	nv50/ir/opt: CALLs cannot load	2013-03-12 12:55:35 +01:00
Christoph Bumiller	c893b94060	nv50/ir: add support for indirect BRA,CALL	2013-03-12 12:55:34 +01:00
Christoph Bumiller	efe55075b5	nvc0/ir/emit: implement move to and logic ops on predicates	2013-03-12 12:55:34 +01:00
Christoph Bumiller	ce7610f7d5	nvc0/ir/emit: implement surface related ops	2013-03-12 12:55:34 +01:00
Christoph Bumiller	3741b7d844	nv50/ir: initialize CodeEmitters' specialized target fields	2013-03-12 12:55:34 +01:00
Christoph Bumiller	b0fc2f13ec	nv50/ir/opt: make optimization aware of atomics, barriers, surface ops	2013-03-12 12:55:34 +01:00
Christoph Bumiller	22b762f9b4	nv50/ir: add various new OPs that will be needed for compute	2013-03-12 12:55:34 +01:00
Francisco Jerez	c82714c593	nv50/ir: Rename "mkLoad" to "mkLoadv" for consistency.	2013-03-12 12:55:34 +01:00
Christoph Bumiller	cc30ce8160	nv50/ir: fix comparison of system values	2013-03-12 12:55:34 +01:00
Francisco Jerez	4ddfdcea04	nv50/ir/tgsi: Translate grid-related system parameters.	2013-03-12 12:55:34 +01:00
Francisco Jerez	8446c31d0e	nv50/ir/tgsi: Accept COMPUTE programs.	2013-03-12 12:55:34 +01:00
Christoph Bumiller	e9294e11b4	nv50/ir/ra: make sure all used function inputs get assigned a reg A live range [0, 0) counts as empty. For function inputs this can be a problem, so insert a nop at the beginning to make it [0, 1). This is a bit of a hack but also the most simple solution.	2013-03-12 12:55:34 +01:00
Christoph Bumiller	ee431b12ec	nv50/ir/ra: also add pre-existing MERGE,SPLIT to constraint list	2013-03-12 12:55:34 +01:00
Christoph Bumiller	f1dfa414f4	nv50/ir/ra: fix confusion with conditional RegisterSet::occupy	2013-03-12 12:55:34 +01:00
Christoph Bumiller	d995f44f0b	nv50/ir/ra: swap copyCompound args if src is compound and dst isn't	2013-03-12 12:55:33 +01:00
Francisco Jerez	95ad9bca2f	nv50/ir/ra: Fix maxGPR calculation for programs with multiple functions.	2013-03-12 12:55:33 +01:00
Francisco Jerez	ca04e71024	nv50/ir/ra: Fix traversal before the beginning of the active list in buildRIG.	2013-03-12 12:55:33 +01:00
Francisco Jerez	fe17d8a7c0	nv50/ir/ra: Fix RegisterSet::occupy(const Value *v).	2013-03-12 12:55:33 +01:00
Francisco Jerez	49ded0e132	nv50/ir/ra: Fix argument const-ness in RegisterSet::idToUnits and idToBytes	2013-03-12 12:55:33 +01:00
Francisco Jerez	5959d4247a	nv50/ir/opt: Fix tryPropagateBranch for BBs with several exit branches. Comments and "if (bf->cfg.incidentCount() == 1)" condition added by Christoph Bumiller.	2013-03-12 12:55:33 +01:00
Francisco Jerez	572bf83ec0	nv50/ir: Clean up references to function values before destroying them.	2013-03-12 12:55:33 +01:00
Francisco Jerez	12f65e38c0	nouveau: Bail out from nouveau_fence_wait if flushing the pushbuf fails.	2013-03-12 12:55:33 +01:00
Vinson Lee	543d032885	mesa: Use correct functions for enum conversion. Fixes mixing enum types defects reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-11 23:44:10 -07:00
Rob Clark	6173cc19c4	freedreno: gallium driver for adreno Currently works on a220. Others in the a2xx family look pretty similar and should be pretty straightforward to support with the same driver. The a3xx has a new shader ISA, and while many registers appear similar, the register addresses have been completely shuffled around. I am not sure yet whether it is best to support with the same driver, but different compiler, or whether it should be split into a different driver. v1: original v2: build file updates from review comments, and remove GPL licensed header files from msm kernel v3: smarter temp/pred register assignment, fix clear and depth/stencil format issues, resource_transfer fixes, scissor fixes Signed-off-by: Rob Clark <robdclark@gmail.com>	2013-03-11 21:53:24 -04:00
José Fonseca	44a8e51354	d3d1x: Remove. Unused/unmaintained. Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2013-03-12 00:35:06 +00:00
José Fonseca	7db60f049f	nv50: Remove nv50_ir_from_sm4.* Unused, depends on d3d1x. Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2013-03-12 00:35:06 +00:00
Roland Scheidegger	5c41d1c222	gallivm: clean up passing derivatives around Previously, the derivatives were calculated and passed in a packed form to the sample code (for implicit derivatives, explicit derivatives were packed to the same format). There are several reasons why this wasn't such a good idea: 1) the derivatives may not even be needed (not as bad as it sounds since llvm will just throw the calculations needed for them away but still) 2) the special packing format really shouldn't be part of the sampler interface 3) depending on what the sample code actually does the derivatives will be processed differently, hence there is no "ideal" packing. For cube maps with explicit derivatives (which we don't do yet) for instance the packing looked downright useless, and for non-isotropic filtering we'd need different calculations too. So, instead just pass the derivatives as is (for explicit derivatives), or let the rho calculating sample code calculate them itself. This still does exactly the same packing stuff for implicit derivatives for now, though explicit ones are handled in a more straightforward manner (quick estimates show performance should be quite similar, though it is much easier to follow and also does the rho calculation per-pixel until the end, which we eventually need for spec compliance anyway). No piglit changes. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-12 00:24:22 +01:00
Chad Versace	b7262ac7ea	i965: Fix typo in doxygen hyperlink s/brw_state_upload/brw_upload_state/ Found because the link was broken. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-03-11 16:01:19 -07:00
Eric Anholt	11b8df0c01	mesa: Reduce memory usage for reg alloc with many graph nodes (part 2). After the previous fix that almost removes an allocation of 4*n^2 bytes, we can use a bitset to reduce another allocation from n^2 bytes to n^2/8 bytes. Between the previous commit and this one, the peak heap size for an oglconform ARB_fragment_program max instructions test on i965 goes from 4GB to 255MB. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55825 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:54 -07:00
Eric Anholt	6aa3afbfd6	mesa: Reduce the memory usage for reg alloc with many graph nodes (part 1) We were allocating an adjacency_list entry for every possible interference that could get created, but that usually doesn't happen. We can save a lot of memory by resizing the array on demand. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:54 -07:00
Eric Anholt	5daf867f6c	i965/fs: Improve CSE performance by expiring some available expressions. We're already walking the list, and we can easily know when something has no reason to be in the list any longer, so take a brief extra step to reduce our worst-case runtime (an oglconform test that emits the maximum instructions in a fragment program). I don't actually know what the worst-case runtime was, because it was too long and I got bored. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:54 -07:00
Eric Anholt	f179f419d1	i965/fs: Improve live variables calculation performance. We can execute way fewer instructions by doing our boolean manipulation on an "int" of bits at a time, while also reducing our working set size. Reduces compile time of L4D2's slowest shader from 4s to 1.1s (-72.4% +/- 0.2%, n=10) v2: Remove redundant masking (noted by Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:54 -07:00
Eric Anholt	4dc7e6dcbf	i965/fs: Also do the gen4 SEND dependency workaround against other SENDs. We were handling the dependency workaround for the first written reg of a send preceding the one we're fixing up, but didn't consider the other regs. Thus if you had two sampler calls that got allocated to the same set of regs, one might, rarely, overwrite the other. This was occurring in XBMC's GLSL shaders. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44567 NOTE: This is a candidate for the stable branches. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:53 -07:00
Eric Anholt	4c1fdae0a0	i965/fs: Switch to using sampler LD messages for uniform pull constants. When forcing the compiler to always generate pull constants instead of push constants (in order to have an easy to use testcase), improves performance of my old GLSL demo 23.3553% +/- 1.42968% (n=7). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60866 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:53 -07:00
Eric Anholt	1323772543	i965/fs: Fix broken rendering in large shaders with UBO loads. The lowering process creates a new vgrf on gen7 that should be represented in live interval analysis. As-is, it was getting a conflicting allocation with gl_FragDepth in the dolphin emulator, producing broken rendering. NOTE: This is a candidate for the 9.1 branch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61317 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:53 -07:00
Eric Anholt	c588cd2031	i965/fs: Add a comment about an implementation detail. I was going to fix the code above like the previous commit, but we already had that covered (otherwise all our uniform access would have been broken, unlike just pull constants). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:53 -07:00
Eric Anholt	f10f5e4980	i965/fs: Fix register allocation for uniform pull constants in 16-wide. We were allowing a compressed instruction to write a register that contained the last use of a uniform pull constant (either UBO load or push constant spillover), so it would get half its values smashed. Since we need to see the actual instruction to decide this, move the pre-gen6 pixel_x/y logic here, which should improve the performance of register allocation since virtual_grf_interferes() is called more than once per instruction. NOTE: This is a candidate for the stable branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61317 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:53 -07:00
Eric Anholt	f09a8e17e5	intel: Remove some unused debug flags. I was looking at the list to see what might be interesting to document for application developers, and it turns out some are completely dead. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-11 12:11:53 -07:00
Zack Rusin	7295fad204	draw/gs: Correctly iterate the emitted primitives We were assuming that each emitted primitive had the same number of vertices. That is incorrect. Emitted primitives can have an arbitrary number of vertices. Simply increment index on iteration to fix it. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-07 20:16:07 -08:00
Zack Rusin	e5406f7058	tgsi/exec: Correctly reset NumOutputs before parsing the shader Whenever we're binding the shaders we're incrementing NumOutputs, assuming the parser spots an output declaration, but we were never resetting the variable. That means that each subsequent bind of a geometry shader would add its number of outputs to the number of outputs bound by all previously run shaders and our indexes would get completely messed up. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-07 20:16:00 -08:00
Roland Scheidegger	9060c835fd	draw/llvm: another quick hack for drawing with no position output Also need to skip things if we have no cv value but pos value (happens with geometry shaders enabled). Needs a round of cleanup, though.	2013-03-11 17:07:51 +01:00
Roland Scheidegger	ef17cc9cb6	softpipe: don't use samplers with prebaked sampler and sampler_view state This is needed for handling the dx10-style sample opcodes. This also simplifies the logic by getting rid of sampler variants completely (sampler_views though OTOH have sort of variants because some of their state is different depending on the shader stage they are bound to). No significant performance difference (openarena run: 840 frames in 459.8 seconds vs. 840 frames in 460.5 seconds). v2: fix reference counting bug spotted by Jose. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-11 17:07:51 +01:00
Roland Scheidegger	f33c744fb9	tgsi: emit code for SVIEWINFO and SAMPLE_I Can handle them since the single sampler interface was introduced. v2: simplify txf/sample_i handling a bit according to Brian's feedback. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-11 17:07:51 +01:00
Roland Scheidegger	7b3a0bb45d	tgsi: fix wrong reg used for unit for TGSI_OPCODE_TXF Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-11 17:07:51 +01:00
Tom Stellard	a0676968b9	r600g/llvm: Fix build	2013-03-11 11:10:51 -04:00
Marek Olšák	e4e655fd11	r600g: add debug options disabling various copy-buffer-related features This will be invaluable for debugging and bug reports.	2013-03-11 13:44:46 +01:00
Marek Olšák	4b69c1a92d	mesa: don't allocate a texture if width or height is 0 in CopyTexImage NOTE: This is a candidate for the stable branches. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-11 13:44:14 +01:00
Marek Olšák	68ed4c9c89	gallium/util: attempt to fix blitting multisample texture arrays We don't have a test for this yet, but obviously the swizzle was wrong.	2013-03-11 13:43:36 +01:00
Marek Olšák	52efa01de0	r600g: allocate FMASK right after the texture, so that it's aligned with it This avoids the kernel CS checker errors with MSAA textures. Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	2c339f8015	r600g: remove r600.h, move the stuff elsewhere (mostly to r600_pipe.h) Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	ec7d775790	r600g: remove r600_hw_context_priv.h, move the stuff to r600_pipe.h Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	1724ef8908	r600g: remove deprecated state management code It's nice to see so much code that did pretty much nothing go away. Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	65cbf89567	r600g: atomize pixel shader Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	63042af933	r600g: atomize vertex shader Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	167263ecb1	r600g: inline r600_pipe_shader function also change names of other functions, so that they make sense Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	65b2a449bc	r600g: dump vertex elements state along with the fetch shader	2013-03-11 13:43:36 +01:00
Marek Olšák	3f0a51d677	gallium/util: dump instance_divisor	2013-03-11 13:43:36 +01:00
Marek Olšák	3832059b10	r600g: remove bytecode dumping Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	4bf0ebdd4f	r600g: use a single env var R600_DEBUG, disable bytecode dumping Only the disassembler is used to dump shaders. Here's a few examples how to use R600_DEBUG. Log compute info: R600_DEBUG=compute Dump all shaders: R600_DEBUG=fs,vs,gs,ps,cs Dump pixel shaders only: R600_DEBUG=ps Disable Hyper-Z: R600_DEBUG=nohyperz Disable the LLVM backend: R600_DEBUG=nollvm Or use any combination of the above, or print all options: R600_DEBUG=help Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	2ca73bc7f7	r600g: cleanup #include recursion between r600_pipe.h and evergreen_compute.h Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-11 13:43:36 +01:00
Marek Olšák	43d3e0cd3d	r600g: don't check for R600_ENABLE_S3TC env var	2013-03-11 13:43:36 +01:00
Stefan Brüns	b21a9d46e4	glapi/gen: Remove duplicate PYTHON_FLAGS PYTHON_GEN calls python with PYTHON_FLAGS Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>	2013-03-09 16:24:51 -08:00
Frank Henigman	89559c50e7	i965: Link i965_dri.so with C++ linker. Force C++ linking of i965_dri.so by adding a dummy C++ source file. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-03-08 21:21:53 -08:00
Maxence Le Doré	ba588dd45d	gallium/util: Correct shift value for TSC feature detection. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-03-08 21:21:53 -08:00
Matt Turner	07f2dee731	configure.ac: Build dricommon for DRI gallium drivers Commit `67ef7559` added an || test "x$enable_dri" check in an attempt to get the DRI common bits built in some necessary cases. That change was inappropriate as it made these common DRI pieces be built unconditionally, so some builds were broken. Subsequently, commit `998d975e3` changed the "|| test" to a "-a" conjunction within the existing test invocation. This made the '-a "x$enable_dri" = xyes' clause have no effect, (as it was inside an enclosing test for the same condition). So the new breakage from commit `67ef7559` was addressed, but the original problems were regressed. The immediately preceding commit removed the redundant condition. Now, finally this commit fixes the original problem as described in the commit message of `67ef7559`: this code should be compiled when using the DRI state tracker. In order to do so, the HAVE_*_DRI conditionals must be moved after the last assignment of HAVE_COMMON_DRI. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61821 Tested-by: Stéphane Marchesin <marcheu@chromium.org>	2013-03-08 21:21:46 -08:00
Matt Turner	7de78ce5e5	configure.ac: Remove redundant checks of enable_dri. The whole block is enclosed inside if test "x$enable_dri" = xyes.	2013-03-08 21:20:43 -08:00
Matt Turner	79a0977241	mesa: Allow ETC2/EAC formats with ARB_ES3_compatibility. Fixes piglit's oes_compressed_etc2_texture-miptree tests on Desktop GL. Reported-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-03-08 21:20:39 -08:00
Stéphane Marchesin	1662178863	i915g: Use PIPE_FLUSH_END_OF_FRAME to trigger throttling This helps with jittering, instead of throttling at every command buffer we only throttle once a frame.	2013-03-08 19:34:50 -08:00
Stéphane Marchesin	d815e8af39	i915g: Update TODO	2013-03-08 19:34:43 -08:00
Brian Paul	728240b64d	docs: document another Viewperf bug	2013-03-08 10:35:46 -07:00
Jan de Groot	17f1cb1d99	dri/nouveau: fix crash in nouveau_flush https://bugs.freedesktop.org/show_bug.cgi?id=61947 Note: this is a candidate for the stable branches	2013-03-07 19:55:07 +01:00
Brian Paul	057c46d791	draw: add const qualifier to silence compiler warning	2013-03-07 08:11:12 -07:00
Brian Paul	9915636fb8	llvmpipe: remove the power of two sizeof(struct cmd_block) assertion It fails on 32-bit systems (I only tested on 64-bit). Power of two size isn't required, so just remove the assertion. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-07 06:28:23 -07:00
Brian Paul	c2665aacdd	vbo: fix crash found with shared display lists This fixes a crash when a display list is created in one context but executed from a second one. The vbo_save_context::vertex_store member will be NULL if we never created a display list with the context. Just check for that before dereferencing the pointer. Fixes http://bugzilla.redhat.com/show_bug.cgi?id=918661 Note: This is a candidate for the stable branches.	2013-03-07 06:28:23 -07:00
Alan Hourihane	5984a911f9	mesa: fix glGetInteger*(GL_SAMPLER_BINDING). If the sampler object has been deleted on another context, an alternative context may reference the old sampler. So ensure the sampler object still exists. Note: this is a candidate for the stable branch. Signed-off-by: Alan Hourihane <alanh@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-07 10:13:40 +00:00
Christian König	eddf33f711	radeon/llvm: document LLVM commit We need at least that revision to work correctly now. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-07 10:06:24 +01:00
Christian König	a7a899584c	radeon/llvm: enable LICM and DCE pass v2 LICM stands for Loop Invariant Code Motion. Instructions that do not depend on the loop index are moved outside the loop body. DCE is Dead Code Elimination. v2: updated commit msg, thx to Vincent. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Vincent Lejeune <vljn at ovi.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-07 10:03:22 +01:00
Christian König	e4188ee13d	radeonsi: add LLVMNoUnwindAttribute to intrinsic So LLVM can better eliminate dead code. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-07 10:03:22 +01:00
Christian König	0666ffddd2	radeonsi: rework input interpolation Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-07 10:03:22 +01:00
Christian König	c497321d31	radeonsi: remove SI.vs.load.buffer.index Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-07 10:03:22 +01:00
Christian König	55fe5ccb39	radeon/llvm: make SGPRs proper function arguments v2 v2: remove unrelated changes Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-07 10:03:22 +01:00
Christian König	b8f4ca3d85	radeon/llvm: replace shader type intrinsic with function attribute Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-07 10:03:22 +01:00
Christian König	de80e560bc	radeonsi: switch to v*i8 for resources and samplers v2 v2: remove unrelated changes Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-03-07 10:03:22 +01:00
Christian König	2cb54833d0	r600g/llvm: Update CONSTANT_BUFFER address space definition To match recent LLVM changes. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-07 10:03:11 +01:00
Zack Rusin	2532147f8b	draw/llvm: fix inputs to the geometry shader We can't clip and viewport transform the vertices before we let the geometry shader process them. Let's make sure the generated vertex shader has both disabled if geometry shader is present. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-05 20:13:08 -08:00
Bryan Cain	8c74380b2d	draw: use geometry shader info in clip_init_state if appropriate Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-05 20:13:08 -08:00
Bryan Cain	30f246bf2c	draw: account for separate shader objects in geometry shader code The geometry shader code seems to have been originally written with the assumptions that there are the same number of VS outputs as GS outputs and that VS outputs are in the same order as their corresponding GS inputs. Since TGSI uses separate shader objects, these are both wrong assumptions. This was causing several valid vertex/geometry shader combinations to either render incorrectly or trigger an assertion. Conflicts: src/gallium/auxiliary/draw/draw_gs.c Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-05 20:13:08 -08:00
Alan Hourihane	cf0b4a30fc	Unreference sampler object when it's currently bound to texture unit. This change specifically unbinds a sampler object from the texture unit if it's bound to a unit. The spec calls for default object when deleting sampler objects which are currently bound. Note: this is a candidate for the stable branches Signed-off-by: Alan Hourihane <alanh@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-06 18:10:12 +00:00
Brian Paul	b21f8e364b	llvmpipe: fix incorrect 'j' array index in dummy texture code Use 0 instead. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-03-06 10:34:09 -07:00
Brian Paul	975d31f60d	llvmpipe: remove unused cmd_block_list struct	2013-03-06 10:34:09 -07:00
Brian Paul	a51b81558f	llvmpipe: add some scene limit sanity check assertions Note: This is a candidate for the stable branches. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-06 10:34:09 -07:00
Brian Paul	a31ebdffa0	llvmpipe: tweak CMD_BLOCK_MAX and LP_SCENE_MAX_SIZE We advertise a max texture/surfaces size of 8K x 8K but the old values for these limits didn't actually allow us to handle that surface size. For 8K x 8K we'll have 16384 bins. Each bin needs at least one cmd_block object which was 2192 bytes in size. Since 16384 * 2192 exceeded LP_SCENE_MAX_SIZE we'd silently fail in lp_scene_new_data_block() and not draw the complete scene. By reducing CMD_BLOCK_MAX to 29 we get nice 512-byte cmd_blocks. And by increasing LP_SCENE_MAX_SIZE to 9 MB we can allocate enough command blocks for 8K x 8K, plus a few regular data blocks. Fixes the (improved) piglit fbo-maxsize test. Note: This is a candidate for the stable branches. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-03-06 10:34:09 -07:00
Kenneth Graunke	492693c0a5	i965: Don't fill buffer with zeroes. This was only necessary because our bounds checking was off by one, and thus we read an extra pair of values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-06 08:27:54 -08:00
Kenneth Graunke	89e5c8e0fa	i965: Fix off-by-one in query object result gathering. If we've written N pairs of values to the buffer, then last_index = N, but the values are 0 .. N-1. Thus, we need to use <, not <=. This worked anyway because we fill the buffer with zeroes, so we just added an extra (0 - 0) to our results. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-06 08:27:47 -08:00
Christian König	886c5085e3	radeon/llvm: fix trivial warnings Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-06 12:08:54 +01:00
Christian König	a212483437	radeonsi: fix trivial warning Signed-off-by: Christian König <christian.koenig@amd.com>	2013-03-06 12:07:40 +01:00
Eric Anholt	88b20d5834	intel: Improve the matching (more formats!) for TexImage from PBOs. Mesa core is the place for encoding what format/type matches a mesa format, so rely on that. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-03-05 16:02:38 -08:00
Eric Anholt	731d474d98	intel: Improve the test for readpixels blit path format checking. We were allowing things like copying RG1616 to a user's ARGB8888 format, while we were denying anything that wasn't ARGB8888 or RGB565. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-03-05 16:02:38 -08:00
Eric Anholt	3c7e96ff01	intel: Fold intel_region_copy() into its one caller. This is similar code to intel_miptree_copy_slice, but the knobs are all set differently. v2: fix whitespace Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-03-05 16:02:38 -08:00
Eric Anholt	7604debabb	intel: Transition intel_region_map() to being a miptree operation. I'm trying to move us away from the region structure, and all the callers are currently dereferencing a miptree to get the region. In this change, the map_refcount is dropped. However, the bo->virtual is itself map refcounted, so that's already dealt with. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-03-05 16:02:38 -08:00
Eric Anholt	f4f288f317	intel: Remove num_mapped_regions tracking. The point of tracking the value was removed in February 2012 (`65b096aedd`), and this should have been removed at the same time. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-03-05 16:02:38 -08:00
Eric Anholt	3c9532314c	intel: Remove the struct intel_region reuse hash table. I don't see any reason for it -- it was introduced with the DRI2 invalidate work by krh in 2010 with no explanation. I suspect it was something about wanting the same drm_intel_bo struct underneath multiple openings of the BO within one process, but that's covered by libdrm at this point. As far as the struct region goes, it is not threadsafe, so multiple contexts sharing a region could have mixed up the map_count and assertion failed or worse. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-03-05 16:02:37 -08:00
José Fonseca	e77234be39	scons: Provide shorthand aliases for software winsyses.	2013-03-05 23:06:13 +00:00
José Fonseca	3950953f93	scons: Fix llvm-config not found error message. "% llvm_version" is bogus copy'n'paste cruft.	2013-03-05 23:06:13 +00:00
Ian Romanick	674f9239b9	mesa: Modify candidate search string Several commits on master for the 9.1 branch had "NOTE" messages in a slightly different format. NOTE: This is a candidate for stable branches Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-05 14:54:11 -08:00
Eric Anholt	65afa11dc6	mesa: Remove the special enum for _mesa_error debug output. Now all the per-message enums from mtypes are gone. Now we can extend unique message IDs into all generators of debug output without having to update mtypes.h for each one. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-03-05 14:25:01 -08:00
Eric Anholt	d9249935db	mesa: Remove the enum for the oom-within-debug-output case. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-03-05 14:25:01 -08:00
Eric Anholt	6816f67de6	mesa: Remove now-unused gl_winsys_error and gl_shader_error enums. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-03-05 14:25:00 -08:00
Eric Anholt	c72cf53817	mesa: Report ARB_debug_output for both shader errors and warnings. This ends up reusing the dynamic ID support, so a silly enum gets to go away. We don't assign good IDs to different messages yet, but at least that's tractable now. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-03-05 14:25:00 -08:00
Eric Anholt	f0a191ca0f	intel: Add missing perf debug for a stall on mapping a BO. I was testing the ARB_debug_output code and wrote an obvious sample that should have hit this, and got confused that my ARB_debug_output was broken. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-03-05 14:25:00 -08:00
Eric Anholt	14cec07177	i965: Make perf_debug() output to GL_ARB_debug_output in a debug context. I tried to ensure that performance in the non-debug case doesn't change (we still just check one condition up front), and I think the impact is small enough in the debug context case to warrant including all of it. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-03-05 14:25:00 -08:00
Eric Anholt	0a1c6bcfb0	intel: Finish renaming fallback_debug() to perf_debug(). They're about to change to handle GL_ARB_debug_output, so just make one function. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-03-05 14:25:00 -08:00
Eric Anholt	807eedf70f	intel: Hook up the WARN_ONCE macro to GL_ARB_debug_output. This doesn't provide detailed error type information, but it's important to get these relatively severe but rare error messages out to the developer through whatever mechanism they are using. v2: Rebase on new WARN_ONCE additions. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)	2013-03-05 14:25:00 -08:00
Eric Anholt	3025680578	mesa: Add support for GL_ARB_debug_output with dynamic ID allocation. We can emit messages now without always having to use the same ID for each, or having a giant table of all possible errors in mtypes.h. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-03-05 14:24:59 -08:00
Eric Anholt	7beb93456d	mesa: Merge handling of application-provided and built-in error sources. I want to have dynamic IDs so that we don't need to add to mtypes.h for every error we might want to add. To do so, I need to get rid of the static arrays and actually support all the crazy filtering of dynamic IDs that we already support for application-provided error sources. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-03-05 14:24:59 -08:00
Eric Anholt	88831a8d99	mesa: Fix _mesa_problem() on context destroy after application debug output This was apparently not noticed because we don't have any testing of application-generated debug output. However, as I'm changing the GL-generated debug output to use the same path as application/middleware-generated debug output, this obviously became an issue. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-03-05 14:24:59 -08:00
Eric Anholt	e0d1e3b785	mesa: Move debug type/severity enums to mesa core. These will get reused by new ARB_debug_output messages in drivers/core, instead of having the caller pass GL enums and have us immediately switch-statement those into enums. The source enums will be handled in the next commit, because the way different sources are handled at the moment is pretty strange. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-03-05 14:24:59 -08:00
Eric Anholt	c42148d16e	mesa: Replace open-coded _mesa_lookup_enum_by_nr(). The new one doesn't have the same behavior for GL_NO_ERROR, but we don't produce errors with GL_NO_ERROR as the error type. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-03-05 14:24:59 -08:00
Eric Anholt	e022461c64	mesa: Remove extra #define MAXSTRING duplicating MAX_DEBUG_MESSAGE_LENGTH. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-03-05 14:24:59 -08:00
Marcin Slusarz	f4ebcd133b	dri/nouveau: NV17_3D class is not available for NV1a chipset Should fix https://bugs.freedesktop.org/show_bug.cgi?id=60510 Note: this is a candidate for the stable branches Acked-by: Francisco Jerez <currojerez@riseup.net>	2013-03-05 21:19:17 +01:00
Roland Scheidegger	b9eb573600	tgsi: handle projection modifier for array textures. This partly reverts `6ace2e41da`. Apparently with GL_MESA_texture_array fixed-function texturing with texture arrays is possible, and hence we have to handle TXP. (Though no one seems to know the semantics, softpipe now does what it did before, which is to NOT project the array coord; llvmpipe, for instance, does project the array coord. Unlike before it will project the comparison coord for shadow1d array, as that clearly was an error.) This fixes https://bugs.freedesktop.org/show_bug.cgi?id=61828. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-05 20:10:37 +01:00
Roland Scheidegger	be6d18ba5e	st/mesa: translate ir offset parameters for non-TXF opcodes. Otherwise the state tracker will crash if the texture instructions have offsets. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-05 20:10:37 +01:00
Matt Turner	523b07e320	configure.ac: Remove stale comment about --x-* arguments. Should have been removed with `e273ed37`. Note: This is a candidate for the 9.1 branch. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-05 11:02:36 -08:00
Matt Turner	35189d768b	configure.ac: Don't check for X11 unconditionally. X11 is already checked conditionally below. Fixes OSMesa-only configurations to not require X11. Note: This is a candidate for the 9.1 branch. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-05 11:02:22 -08:00
Alan Hourihane	196443f3f5	Add missing GL_TEXTURE_CUBE_MAP entry in _mesa_legal_texture_dimensions This was hit on the glTexStorage2D() path. Note: this is a candidate for the stable branches Signed-off-by: Alan Hourihane <alanh@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-05 17:22:44 +00:00
Jon TURNEY	87fdcd87b1	Fix out-of-tree build of 'make check' in src/mesa/main/tests Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-03-05 13:33:16 +00:00
Dave Airlie	e21460b4d5	u_blitter: don't create illegal shaders for 1D/3D/RECT/CUBE MSAA Reviewed-by: Marek Olšák <maraeo@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-03-04 22:23:08 +00:00
Daniel Martin	998d975e38	Fix build of swrast only without libdrm Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Daniel Martin <consume.noise@gmail.com>	2013-03-04 10:11:01 -08:00
Brian Paul	b1390c7992	mesa: flush current state when querying GL_EDGE_FLAG Fixes http://bugs.freedesktop.org/show_bug.cgi?id=61395 Note: This is a candidate for the stable branches. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-04 08:41:45 -07:00
Jakub Bogusz	e29124717e	vdpau-softpipe: Build correct source file - vl_winsys_xsp.c Copy-and-paste problem introduced by commit `7f24483e`. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-03-03 22:53:26 -08:00
Kenneth Graunke	b88f74d63d	i965: Fix Crystal Well PCI IDs. The second digit was off by one, which meant we accidentally treated GTn as GT(n-1). This also meant no support for GT1 at all. NOTE: This is a candidate for stable branches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-03 13:53:58 -08:00
Vincent Lejeune	83e7d111af	r600g: Check comp_mask before merging export instructions Fixes a (rare) bug uncovered by LLVM where consecutive exports were merged even if they had incompatible masks.	2013-03-03 21:39:51 +01:00
Vadim Girlin	138b5b9a12	r600g: fix check_and_set_bank_swizzle for cayman Tested-by: Vincent Lejeune <vljn at ovi.com> Reviewed-by: Vincent Lejeune <vljn at ovi.com>	2013-03-03 21:38:49 +01:00
Brian Paul	0b6e72f8d7	st/mesa: add switch case for ir_txf_ms to silence warning Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-03-02 05:52:40 -07:00
Brian Paul	2ea0e30bed	mesa: add switch case for ir_txf_ms to silence warning Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-02 05:52:28 -07:00
Kenneth Graunke	cf0c0a7782	i965: Pull query BO reallocation out into a helper function. We'll want to reuse this for non-occlusion queries in the future. Plus, it's a single logical task, so having it as a helper function clarifies the code somewhat. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-01 22:09:04 -08:00
Kenneth Graunke	961c9b8cac	i965: Replace the global brw->query.bo variable with query->bo. Again, eliminating a global variable in favor of a per-query object variable will help in a future where we have more queries in hardware. Personally, I find this clearer: there's just the query object's BO, rather than two variables that usually shadow each other. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-01 22:09:04 -08:00
Kenneth Graunke	614944b897	i965: Turn if (query->bo) into an assertion. The code a few lines above calls brw_emit_query_begin() if !query->bo, and that creates query->bo. So it should always be non-NULL. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-01 22:09:04 -08:00
Kenneth Graunke	981a22b62b	i965: Unify query object BO reallocation code. If we haven't allocated a BO yet, we need to do that. Or, if there isn't enough room to write another pair of values, we need to gather up the existing results and start a new one. This is simple enough. However, the old code was awkwardly split into two blocks, with a write_depth_count() placed in the middle. The new depth count isn't relevant to gathering the old BO's data, so that can go after the reallocation is done. With the two blocks adjacent, we can merge them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-01 22:09:04 -08:00
Kenneth Graunke	90feda81de	i965: Use query->last_index instead of the global brw->query.index. Since we already have an index in the brw_query_object, there's no need to also keep a global variable that shadows it. Plus, if we ever add support for more types of queries that still need the per-batch before/after treatment we do for occlusion queries, we won't be able to use a single global variable. In contrast, per-query object variables will work fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-01 22:09:04 -08:00
Kenneth Graunke	ec5d502ec3	i965: Remove brw_query_object::first_index field as it's always 0. brw->query.index is initialized to 0 just a few lines before it's copied to first_index. Presumably the idea here was to reuse the query BO for subsequent queries of the same type, but since that doesn't happen, there's no need to have the extra code complexity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-01 22:09:04 -08:00
Kenneth Graunke	d92c7d8eed	i965: Add a pile of comments to brw_queryobj.c. This code was really difficult to follow, for a number of reasons: - Queries were handled in four different ways (TIMESTAMP writes a single value, TIME_ELAPSED writes a single pair of values, occlusion queries write pairs of values for the start and end of each batch, and other queries are done entirely in software). It turns out that there are very good reasons each query is handled the way it is, but insufficient comments explaining the rationale. - It wasn't immediately obvious which functions were driver hooks and which were helper functions. For example, brw_query_begin() is a driver hook that implements glBeginQuery() for all query types, but the similarly named brw_emit_query_begin() is a helper function that's only relevant for occlusion queries. Extra explanatory comments should save me and others from constantly having to ask how this code works and why various query types are handled differently. v2: Incorporate Eric's feedback: change "as soon as possible" to "the results will be present when mapped." Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-01 22:09:04 -08:00
Kenneth Graunke	d1b34baf9b	i965: Write TIMESTAMP query values into the first buffer element. For timestamp queries, we just write a single value to a BO. The natural place to write that is element 0, so we should do that. Previously, we wrote it into element 1 (the second slot) leaving element 0 filled with garbage. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-01 22:09:03 -08:00
Kenneth Graunke	3d71f4fbac	i965: Implement the new QueryCounter() hook. This moves the GL_TIMESTAMP handling out of EndQuery. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-01 22:09:03 -08:00
Kenneth Graunke	dfb056b892	mesa: Add a new QueryCounter() hook for TIMESTAMP queries. In OpenGL, most queries record statistics about operations performed between a defined beginning and ending point. However, TIMESTAMP queries are different: they immediately return a single value, and there is no start/stop mechanism. Previously, Mesa implemented TIMESTAMP queries by calling EndQuery without first calling BeginQuery. Apparently this is DirectX convention, and Gallium followed suit. I personally find the asymmetry jarring, however---having BeginQuery and EndQuery handle a different set of enum values looks like a bug. It's also a bit confusing to mix the one-shot query with the start/stop model. So, add a new QueryCounter driver hook for implementing TIMESTAMP. For now, fall back to EndQuery to support drivers that don't do the new mechanism. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-03-01 22:09:03 -08:00
Roland Scheidegger	6ace2e41da	tgsi: add texel offsets and derivatives to sampler interface Something I never got around to implementing, but this is the tgsi execution side for implementing texel offsets (for ordinary texturing) and explicit derivatives for sampling (though I guess the ordering of the components for the derivs parameters is debatable). There is certainly a runtime cost associated with this. Unless there are different interfaces used depending on the "complexity" of the texture instructions, this is impossible to avoid. Offsets are always active (I think checking if they are active or not is probably not worth it since it should mostly be an add), whereas the sampler_control is extended for explicit derivatives. For now softpipe (the only user of this) just drops all those new values on the floor (which is the part I never implemented...). Additionally this also fixes (discovered by accident) inconsistent projective divide for the comparison coord - the code did do the projection for shadow2d targets, but not shadow1d ones. This also drops checking for projection modifier on array targets, since they aren't possible in any extension I know of (hence we don't actually know if the array layer should also be divided or not). Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-02 02:54:31 +01:00
Roland Scheidegger	c7c7186045	draw: additional fix for the no-position case with llvm Similar fix to what is done for the non-llvm case, we could otherwise still hit the stages (near certainly with gs) which crash. It is probably a much better idea to skip trying to draw at that point anyway. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-02 02:54:31 +01:00
Roland Scheidegger	ea8b2ae8a5	draw: fix no position output in non-llvm pipeline. It seems easiest (and best) if we simply skip all the later stages (after stream output). (This is different to the llvm case at least for now where we will simply try to render garbage, though both behaviors should be correct.) Fixes piglit glsl-1.40-tf-no-position with softpipe. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-02 02:54:31 +01:00
Roland Scheidegger	de0593e333	draw/llvm: skip clipping and viewport transform if there's no position output With glsl 1.40 writing position is not required (useful for transform feedback, though in fact it's still possible to rasterize such geometry even if the results aren't too well defined). Prevents crashes in that case. Fixes piglit glsl-1.40-tf-no-position. Not quite sure this is 100% correct as it also skips clipdistance clipping which could still work (but not sure if the result would really be needed?) Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-03-02 02:54:31 +01:00
Roland Scheidegger	2ef13e7c55	llvmpipe: don't assert on illegal surface creation. Since `c8eb2d0e82` llvmpipe checks if it's actually legal to create a surface. The opengl state tracker doesn't quite obey this so for now just warn instead of assert. Also warn instead of disabled assert when creating sampler views (same reasoning). Addresses https://bugs.freedesktop.org/show_bug.cgi?id=61647. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-02 02:54:31 +01:00
Roland Scheidegger	4c12276607	llvmpipe: bump glsl version to 140 texel offsets should have been the last missing feature for 130, and in fact 140 as well (last there were texture buffers). In any case we still don't do OpenGL 3.0 (missing MSAA which will be difficult, plus EXT_packed_float, ARB_depth_buffer_float and EXT_framebuffer_sRGB). v2: bump to 140 instead - we have everything except we crash when not writing to gl_Position (but softpipe crashes as well) so let's just say this is a bug instead. Also (by Dave Airlie's suggestion) update llvm-todo.txt. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-02 02:54:30 +01:00
Roland Scheidegger	b3b3b389fa	gallivm: add support for texel offsets for ordinary texturing. This was previously only handled for texelFetch (much easier). Depending on the wrap mode this works slightly differently (for somewhat efficient implementation), hence have to do that separately in all roughly 137 places - it is easy if we use fixed point coords for wrapping, however some wrapping modes are near impossible with fixed point (the repeat stuff) hence we have to normalize the offsets if we can't do the wrapping in unnormalized space (which is a division which is slow but should still be much better than the alternative, which would be integer modulo for wrapping which is just unusable). This should still give accurate results in all cases that really matter, though it might be not quite conformant behavior for some apis (but we have much worse problems there anyway even without using offsets). (Untested, no piglit test.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-03-02 02:54:30 +01:00
Brian Paul	a99eb5c83f	svga: always link with C++ Even when we don't have LLVM since there's other C++ code in the resulting DRI driver object. Note: This is a candidate for the stable branches. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-03-01 17:31:32 -07:00
Brian Paul	f6c0612618	st/mesa: convert ir_triop_lrp to TGSI_OPCODE_LRP AFAICT, all gallium drivers implement TGSI_OPCODE_LRP. Tested with softpipe, llvmpipe, svga drivers. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-03-01 17:31:32 -07:00
Chris Forbes	7616586cff	docs: Mark some things done in GL3.txt	2013-03-02 12:02:25 +13:00
Martin Andersson	d96d8ed910	winsys/radeon: Only add bo to hash table when creating flink The problem is that we mix bo handles and flinked names in the hash table. Because kms type handles are not flinked they should not be added to the hash table. If we do that we will sooner or later get a situation where we will overwrite a correct entry because the bo handle was the same as a flinked name. Note: this is a candidate for the stable branches. Reviewed-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-03-01 17:52:40 -05:00
Chris Forbes	1d4dbeeaec	i965: enable ARB_texture_multisample on Gen6+ V2: Works on Ivy Bridge now too, so this can be 6+. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-02 11:40:50 +13:00
Chris Forbes	26c8479474	i965/fs: add support for ir_txf_ms on Gen6+ On Gen6, lower this to `ld` with lod=0 and an extra sample_index parameter. On Gen7, use `ld2dms`. We don't support CMS yet for multisample textures, so we just hardcode MCS=0. This is ignored for IMS and UMS surfaces. Note: If we do end up emitting specialized shaders based on the MSAA layout, we can emit a slightly shorter message here in the UMS case. Note: According to the PRM, `ld2dms` takes one more parameter, lod. However, it's always zero, and including it would make the message too long for SIMD16, so we just omit it. V2: Reworked completely, added support for Gen7. V3: - Introduce sample_index parameter rather than reusing lod - Removed spurious whitespace change - Clarify commit message V4: - Fix comment style - Emit SHADER_OPCODE_TXF_MS on Gen6. This was benignly wrong since it lowers to `ld` anyway on this gen, but still wrong. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-02 11:40:50 +13:00
Chris Forbes	6883c8845d	i965/vs: add support for ir_txf_ms on Gen6+ On Gen6, lower this to `ld` with lod=0 and an extra sample_index parameter. On Gen7, use `ld2dms`. This takes an additional MCS parameter to support compressed multisample surfaces, but we're not enabling them for multisample textures for now, so it's always ignored and can be safely omitted. V2: Reworked completely, added support for Gen7. V3: - Use new sample_index, sample_index_type rather than reusing lod - Clarify commit message. V4: - Fix comment style Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-02 11:40:49 +13:00
Chris Forbes	f52ce6a0ca	i965: add a new virtual opcode: SHADER_OPCODE_TXF_MS This is very similar to the TXF opcode, but lowers to `ld2dms` rather than `ld` on Gen7. V4: - add SHADER_OPCODE_TXF_MS to is_tex() functions, so regalloc thinks it actually writes the correct number of registers. Otherwise in nontrivial shaders some of the registers tend to get clobbered, producing bad results. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-03-02 11:40:49 +13:00
Chris Forbes	555dc6d74d	i965: take the target into account for Gen7 MSAA modes Gen7 has an erratum affecting the ld_mcs message, making it unsafe to use when the surface doesn't have an associated MCS. From the Ivy Bridge PRM, Vol4 Part1 p77 ("MCS Enable"): "If this field is disabled and the sampling engine <ld_mcs> message is issued on this surface, the MCS surface may be accessed. Software must ensure that the surface is defined to avoid GTT errors." To allow the shader to treat all surfaces uniformly, force UMS if the surface is to be used as a multisample texture, even if CMS would have been possible. V3: - Quoted erratum text Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-03-02 11:39:42 +13:00
Chris Forbes	8cc26ae993	i965: Support multisampling in surface_state for textures The surface_state setup for renderbuffers already worked; only the texturing side needed work. BLORP does something similar, but does its own surface_state setup. On Gen6, we just need to set the correct sample count. On Gen7: - set the correct sample count - set the correct layout mode - set GEN7_SURFACE_ARYSPC_LOD0 if it's set in the miptree. V2: - Clarify commit message - Rebased onto Paul's physical/logical dims cleanup - Added Gen7 support Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-03-02 11:35:24 +13:00
Chris Forbes	e62b6a10bc	i965: add support for multisample textures V2: - Fix for state moving from texobj to image - Rebased onto Paul's logical/physical cleanup - Fixed missing quantization of sample count - Fold in IMS renderbuffer wrapper fixes from later in the series - Use correct physical slice offset for UMS/CMS surfaces on Gen7 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-03-02 11:35:24 +13:00
Chris Forbes	575d3870bb	mesa: implement TexImage*Multisample V2: - fix formatting issues - generate GL_OUT_OF_MEMORY if teximage cannot be allocated - fix for state moving from texobj to image V3: - remove ridiculous stencil hack - alter format check to not allow a base format of STENCIL_INDEX - allow width/height/depth to be zero, to deallocate the texture - don't forget to call _mesa_update_fbo_texture V4: - fix indentation - don't throw errors on proxy texture targets Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2013-03-02 11:35:24 +13:00
Chris Forbes	61d42ffef4	mesa: support multisample textures in framebuffer completeness check - sample count must be the same on all attachments - fixedsamplepositions must be the same on all attachments (renderbuffers have fixedsamplepositions=true implicitly; only multisample textures can choose to have it false) V2: - fix wrapping to 80 columns, debug message, fix for state moving from texobj to image. - stencil texturing tweaks tidied up and folded in here. V3: - Removed silly stencil hacks entirely; the extension doesn't actually make stencil-only textures legal at all. - Moved sample count / fixed sample locations checks into existing attachment-type-specific blocks, as suggested by Eric V4: - Removed stencil hacks which were missed in V3 (thanks Eric) - Don't move the declaration of texImg; only required pre-V3. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> [V2] Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-02 11:35:22 +13:00
Chris Forbes	032896cbf9	i965: expose sample positions Moves the definition of the sample positions out of gen6_emit_3dstate_multisample, and unpacks them in gen6_get_sample_position. V2: Be consistent about `sample position` rather than `location`. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-02 11:35:20 +13:00
Chris Forbes	569c4a9f1c	i965: add support for sample mask on Gen6+ Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-02 11:35:17 +13:00
Chris Forbes	1822496f3a	mesa: implement sample mask V2: - fix multiline comment style - stop using ASSERT_OUTSIDE_BEGIN_END_AND_FLUSH since that doesn't exist anymore. V3: - check for the extension being enabled - tidier flagging of _NEW_MULTISAMPLE - fix weird indentation in get.c V4: - move flush later in SampleMaski() Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-02 11:35:16 +13:00
Chris Forbes	7c1017e292	mesa: implement GetMultisamplefv Actual sample locations deferred to a driverfunc since only the driver really knows where they will be. V2: - pass the draw buffer to the driverfunc; don't fallback to pixel center if driverfunc is missing. - rename GetSampleLocation to GetSamplePosition - invert y sample position for winsys FBOs, at Paul's suggestion Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-02 11:35:13 +13:00
Chris Forbes	abb5429537	i965: expose new max sample counts V2: For now, only expose a depth sample count of 1, since there are possible unresolved interactions with HiZ. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-02 11:35:08 +13:00
Chris Forbes	db5d5c30a6	mesa: add new max sample count state - GL_MAX_COLOR_TEXTURE_SAMPLES - GL_MAX_DEPTH_TEXTURE_SAMPLES - GL_MAX_INTEGER_SAMPLES V2: initialize limits to 1 in _mesa_init_constants as suggested by Brian and Paul Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-02 11:34:58 +13:00
Chris Forbes	ffb53b4f03	glsl: add support for ARB_texture_multisample V2: - emit `sample` parameter properly for multisample texelFetch() - fix spurious whitespace change - introduce a new opcode ir_txf_ms rather than overloading the existing ir_txf further. This makes doing the right thing in the driver somewhat simpler. V3: - fix weird whitespace V4: - don't forget to include the new opcode in tex_opcode_strs[] (thanks Kenneth for spotting this) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> [V2] Reviewed-by: Eric Anholt <eric@anholt.net> [V2] Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-02 11:33:54 +13:00
Chris Forbes	16af0aca09	tests: add ARB_texture_multisample enums to table Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-02 11:33:42 +13:00
Chris Forbes	d04a4dd003	mesa: add texobj support for ARB_texture_multisample Adds the new texture targets, and per-image state for GL_TEXTURE_SAMPLES and GL_TEXTURE_FIXED_SAMPLE_LOCATIONS. V2: - Allow multisample texture targets in glInvalidateTexSubImage too. This was already partly there, but I missed it the first time around since the interaction is defined in a newer extension. Fixed weird indentation. - Allow multisample array textures in glFramebufferTextureLayer. This was overlooked as the tests originally only used 2d multisample textures. V3: - Set min/mag filters sensibly for multisample textures. This can't actually be changed by the user, so it's more sensible to initialize it correctly than to hack around it being bogus later. V4: - Tidy up initial min/mag filter setup. Setup in _mesa_initialize_texture_object was bogus, but benign since finish_texture_init() clobbered everything with correct values. For V4, just do the setup in finish_texture_init(). V5: - Don't break glPopAttrib(GL_TEXTURE_BIT) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> [V2] Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-02 11:33:27 +13:00
Chris Forbes	0f83e415e4	glapi: add ARB_texture_multisample Adds new enums, dispatch machinery, and stubs for the 4 new entrypoints. V2: - Drop placeholder - Align enum values - Remove explicit exec=mesa; it is the dispatch flavor we want, but it's also the default. I misunderstood how this worked before; after actually reading the generator it makes good sense. V3: - Squash in stubs for new entrypoints, and dispatch_sanity tweaks, so we don't get build breakage between those patches. V4: - Fix various remaining whitespace issues Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> [1/3 V2] Reviewed-by: Matt Turner <mattst88@gmail.com> [V3] Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-02 11:33:20 +13:00
Eric Anholt	c0674fa5cd	intel: Use the new "ctx" local variable I just added some more. Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-01 12:10:22 -08:00
Eric Anholt	e15c21a957	i965: Make sRGB-capable framebuffers by default. The GLX extension lets you expose visuals that explicitly guarantee you that the GL_FRAMEBUFFER_SRGB_CAPABLE flag will be set, but we can set the flag even while the visual doesn't provide the guarantee. This appears to be consistent with other implementations, as we've seen several apps now that don't require an srgb visual and assume sRGB will work without checking the GL_FRAMEBUFFER_SRGB_CAPABLE flag. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55783 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60633 Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>	2013-03-01 12:10:16 -08:00
Eric Anholt	973ddc897d	intel: Fix software copying of miptree faces for weird formats. Now that we have W-tiled S8, we can't just region_map and poke at bits -- there has to be some swizzling. Rely on intel_miptree_map to get that job done. This should also get the highest performance path we know of for the mapping (interesting if I get around to finishing movntdqa some day). v2: Fix stale name of the bit in a comment. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-03-01 11:50:03 -08:00
Eric Anholt	6d6bd2ac7c	intel: Add a flag for miptree mapping to disable transcoding. I want to reuse intel_miptree_map() to replace some region mapping that's broken for separate stencil, but doing so would result in new demands on ETC transcode that we actually don't want to happen. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-03-01 11:50:03 -08:00
Eric Anholt	e63c959451	i965: Add WARN_ONCE for depthstencil workarounds we shouldn't be hitting. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-03-01 11:50:03 -08:00
Alex Deucher	a40ba43d78	r600g: enable CP DMA on 6xx Tested across several 6xx parts, no piglit regressions. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-03-01 12:11:31 -05:00
Marek Olšák	58bd926d9e	r600g: don't require dword alignment with CP DMA for buffer transfers which is a leftover from the days when we used streamout to copy buffers Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-03-01 13:46:32 +01:00
Marek Olšák	89e2898e9e	r600g: always map uninitialized buffer range as unsynchronized Any driver can implement this simple and efficient optimization. Team Fortress 2 hits it always. The DISCARD_RANGE codepath is not even used with TF2 anymore, so we avoid a ton of useless buffer copies. Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> NOTE: This is a candidate for the 9.1 branch.	2013-03-01 13:46:32 +01:00
Marek Olšák	44f37261fc	gallium/util: add helper code for 1D integer range Reviewed-by: Brian Paul <brianp@vmware.com> v2: cosmetic changes based on Brian's review Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> NOTE: This is a candidate for the 9.1 branch. (the next patch depends on it)	2013-03-01 13:46:32 +01:00
Marek Olšák	8f192a3c9e	r600g: cleanup deprecated register tables These registers are either already emitted elsewhere or moved to start_cs. Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-03-01 13:46:32 +01:00
Marek Olšák	f0636bc982	r600g: unify vgt states The states were split because we thought it caused a hardlock. Now we know the hardlock was caused by something else and has since been fixed. Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-03-01 13:46:32 +01:00
Marek Olšák	e5a250fdf9	r600g: flush and invalidate htile cache when appropriate Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> NOTE: This is a candidate for the 9.1 branch.	2013-03-01 13:46:32 +01:00
Marek Olšák	6f25de6711	r600g: atomize streamout enabling This doesn't fix any issue we know of, but there indeed is a weak spot in draw_vbo where streamout can fail. After streamout is enabled, the need_cs_space call can flush the context, which causes the streamout to be disabled right after it was enabled and bad things happen. One way to fix it is to atomize the beginning part, so that no context flush can happen between streamout enabling and the first drawing. Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-03-01 13:46:32 +01:00
Marek Olšák	9dd18f43a4	r600g: use async DMA with a non-zero src offset probably a typo Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> NOTE: This is a candidate for the 9.1 branch.	2013-03-01 13:46:32 +01:00
Marek Olšák	c77917d35f	r600g: pad the DMA CS to a multiple of 8 dwords Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> NOTE: This is a candidate for the 9.1 branch.	2013-03-01 13:46:32 +01:00
Jordan Justen	782d4f0f3c	intel: Enable __DRI_API_OPENGL_CORE api with dri2 contexts Without this set, dri_util.c:dri2CreateContextAttribs will reject requests to create a context with __DRI_API_OPENGL_CORE. This prevents a 3.2 core profile context from being created even when MESA_GL_VERSION_OVERRIDE=3.2 is used. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-28 21:51:00 -08:00
Jordan Justen	fde59a27fb	intel: update max versions based on MESA_GL_VERSION_OVERRIDE If the override version is >= 3.1, then update the max_gl_core_version. Otherwise, update max_gl_compat_version. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-28 21:50:56 -08:00
Jordan Justen	c4e059a359	mesa version: add _mesa_get_gl_version_override This will allow other code to get access to the override version before a context is available. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-28 21:50:50 -08:00
Jordan Justen	500b69e797	glsl: allow GLSL compiler version to be overridden to 1.50 Although GLSL 1.50 compiler support is not available, this change will allow MESA_GLSL_VERSION_OVERRIDE=150 to be used while 1.50 support is being developed. Since no drivers claim 1.50 GLSL support, this change should only impact Mesa when MESA_GLSL_VERSION_OVERRIDE=150 is set. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-28 21:49:59 -08:00
Matt Turner	4154ac066f	i965/fs: Put immediate operand as src2 Immediate operands can only be src2 in 2-source instructions. Fixes piglit failures since `0a1d145e` (oops!). Spotted-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-02-28 16:29:30 -08:00
Chad Versace	809fdc211f	intel: Remove intel_mipmap_tree::wraps_etc The field was equivalent to (etc_format != MESA_FORMAT_NONE), and therefore duplicate information. This patch removes the field and replaces all references to it with `etc_format != MESA_FORMAT_NONE`. No Piglit ETC test regresses on Intel Sandybridge. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-02-28 15:22:41 -08:00
Matt Turner	c001985cbf	ir_to_mesa: Translate ir_triop_lrp to OPCODE_LRP. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-28 13:19:00 -08:00
Matt Turner	428503fcdf	i965/vs: Assert that ir_triop_lrp was lowered. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-28 13:19:00 -08:00
Matt Turner	f78a7ff6b2	i965/fp: Use the LRP instruction for OPCODE_LRP. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-28 13:19:00 -08:00
Kenneth Graunke	0a1d145e5f	i965/fs: Use the LRP instruction for ir_triop_lrp when possible. v2 [mattst88]: - Add BRW_OPCODE_LRP to list of CSE-able expressions. - Fix op_var[] array size. - Rename arguments to emit_lrp to (x, y, a) to clear confusion. - Add LRP function to brw_fs.cpp/.h. - Corrected comment about LRP instruction arguments in emit_lrp. v3 [mattst88]: - Duplicate MAD code for LRP instead of using a function pointer. - Check for != GRF instead of == IMM in emit_lrp. - Lower LRP on gen < 6. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-28 13:19:00 -08:00
Kenneth Graunke	015a48743d	i965: Add support for emitting the LRP instruction. Like MAD, this is another three-source instruction. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-28 13:18:59 -08:00
Matt Turner	af2c64063e	glsl: Optimize ir_triop_lrp(x, y, a) with a = 0.0f or 1.0f Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-28 13:18:59 -08:00
Kenneth Graunke	93066ce129	glsl: Convert mix() to use a new ir_triop_lrp opcode. Many GPUs have an instruction to do linear interpolation which is more efficient than simply performing the algebra necessary (two multiplies, an add, and a subtract). Pattern matching or peepholing this is more desirable, but can be tricky. By using an opcode, we can at least make shaders which use the mix() built-in get the more efficient behavior. Currently, all consumers lower ir_triop_lrp. Subsequent patches will actually generate different code. v2 [mattst88]: - Add LRP_TO_ARITH flag to ir_to_mesa.cpp. Will be removed in a subsequent patch and ir_triop_lrp translated directly. v3 [mattst88]: - Move changes from the next patch to opt_algebraic.cpp to accept 3-src operations. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-28 13:18:59 -08:00
Kenneth Graunke	18281d6088	glsl: Rework ir_reader to handle expressions with three operands. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-28 13:18:59 -08:00
Kenneth Graunke	1afd33ec05	glsl: Consolidate ir_expression constructors that use explicit types. Previously, we had separate constructors for one, two, and four operand expressions. This patch consolidates them into a single constructor which uses NULL default parameters. The unary and binary operator constructors had assertions to verify that the caller supplied the correct number of operands for the expression, but the four-operand version did not. Since get_num_operands for ir_quadop_vector returns the number of vector_elements, we can safely add that without breaking the semantics of ir_quadop_vector. This also paves the way for expressions with three operands. Currently, none can be constructed since get_num_operands() never returns 3. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-28 13:18:59 -08:00
Matt Turner	f0213b1242	i965/vs/gen7: Allow MATH instructions to have MRF as a destination total instructions in shared programs: 346873 -> 346847 (-0.01%) instructions in affected programs: 364 -> 338 (-7.14%) (All affected shaders are from Lightsmark) Reviewed-by: Eric Anholt <eric@anholt.net>	2013-02-28 13:18:59 -08:00
Matt Turner	4eeb9ded9d	i965/fs/gen7: Allow MATH instructions to have MRF as a destination total instructions in shared programs: 1376297 -> 1375626 (-0.05%) instructions in affected programs: 35977 -> 35306 (-1.87%) Reviewed-by: Eric Anholt <eric@anholt.net>	2013-02-28 13:18:59 -08:00
Matt Turner	d5c3aa89dc	i965/gen7: Relax restrictions on fake MRFs Gen6 has write-only MRF registers, and for ease of implementation we partition off 16 general-purpose registers to act as MRFs on Gen7. Knowing that our Gen7 MRFs are actually GRFs, we can do things we can't do with real MRFs: - read from them; - return values directly to them from a send instruction; and - compute directly to them with math instructions. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-02-28 13:18:59 -08:00
Matt Turner	b9f6795e34	i965/fs: Remove duplicate scan_inst->mlen check Is already checked 20 lines below. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-02-28 13:18:59 -08:00
Tom Stellard	aa1c734b3c	clover: Fix build with LLVM 3.3 v2 v2: - Fix order that the clang libraries are passed to the linker to avoid missing symbol errors. Acked-by: Francisco Jerez <currojerez@riseup.net>	2013-02-28 16:01:23 -05:00
Jordan Justen	6f1538f8b4	attrib: push/pop FRAGMENT_PROGRAM_ARB state This requirement was added by ARB_fragment_program When the Steam overlay is enabled, this fixes: * Menu corruption with the Puddle game * The screen going black on Rochard when the Steam overlay is accessed NOTE: This is a candidate for the 9.0 and 9.1 branches. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-28 09:29:45 -08:00
Keith Kriewall	efd8311a54	scons: Fix Windows build with LLVM 3.2 Fixes fdo bug 61299 NOTE: This is a candidate for the stable branches. Signed-off-by: José Fonseca <jfonseca@vmware.com>	2013-02-28 15:40:02 +00:00
Adam Sampson	2506b03503	autotools: oprofilejit should be included in the list of LLVM components required NOTE: This is a candidate for the stable branch. Signed-off-by: José Fonseca <jfonseca@vmware.com>	2013-02-28 15:37:09 +00:00
Jerome Glisse	6bc7605745	r600g: workaround hyperz lockup on evergreen This workaround disables hyperz if writes to the zbuffer are disabled. Somehow, using hyperz when not writing to the zbuffer triggers a GPU lockup. See: https://bugs.freedesktop.org/show_bug.cgi?id=60848 Candidate for 9.1 Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-02-28 09:48:05 -05:00
Jordan Justen	c6ae10887e	texobj: add verbose api trace messages to several routines Motivated by wanting to see if GenTextures was called by an application while debugging another Steam overlay issue. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-02-27 23:02:12 -08:00
Roland Scheidegger	c8eb2d0e82	llvmpipe: check buffers in llvmpipe_is_resource_referenced. Now that buffers can be used as textures or render targets make sure they aren't skipped. Fix suggested by Jose Fonseca. v2: added a couple of assertions so we can actually guarantee we check the resources and don't skip them. Also added some comments that this is actually a lie due to the way the opengl buffer api works.	2013-02-28 03:39:54 +01:00
Roland Scheidegger	686f6c69bd	llvmpipe: support rendering to buffer render targets. Unfortunately not usable from OpenGL, and no cap bit. Pretty similar to a 1d texture, though allows specifying a start element. v2: also fix up renderbuffer width (which will get promoted to fb width) to be the number of elements Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-28 03:39:54 +01:00
Roland Scheidegger	2fcd3638be	util: fix issues with util_clear_render_target. For PIPE_BUFFER we need coord adjustments for the transfer. And for pure integer formats util_pack_color just crashes, need to handle that differently due to clear colors being ints/uints. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-28 03:39:53 +01:00
Roland Scheidegger	6b35c2b110	softpipe/draw/tgsi: simplify driver/tgsi sampler interface Use a single sampler adapter instead of per-sampler-unit samplers, and just pass along texture unit and sampler unit in the calls. The reason is that for dx10-style sample opcodes pre-wired samplers including all the texture state aren't really feasible (and for sample_i/sviewinfo we don't even have samplers). Of course right now softpipe doesn't actually do anything more than just look up all its pre-wired per-texunit/per-samplerunit sampler as it did before so this doesn't really achieve much except one more function call, however this is now all softpipe's fault (fixing that in a way which doesn't suck is still an unsolved problem). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-28 03:39:53 +01:00
Maxence Le Doré	0845d16976	gallivm: fix mis-matching AOS instruction emission Signed-off-by: José Fonseca <jfonseca@vmware.com>	2013-02-27 20:23:01 +00:00
Jon TURNEY	f816a9f522	glx: Fix glXCreateWindow() when GLX_DIRECT_RENDERING is undefined glXCreateWindow() and glXCreatePbuffer() always fail when built without GLX_DIRECT_RENDERING defined since commit `48331047`. Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2013-02-27 13:36:19 -05:00
Francisco Jerez	4deefd9ba6	configure.ac: Clarify the description of the --with-opencl-libdir parameter a little. https://bugs.freedesktop.org/show_bug.cgi?id=61415 Signed-off-by: Francisco Jerez <currojerez@riseup.net>	2013-02-27 12:27:13 +01:00
Vinson Lee	f987d23b28	radeonsi: Fix memory leak in si_set_constant_buffer. Fixes resource leak defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-02-26 20:03:11 -08:00
Vinson Lee	f88ed1658c	st/vega: Fix memory leak in combine_shaders. Fixes resource leak defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-26 20:01:58 -08:00
Kristian Høgsberg	112ccfab44	egl/wayland: Don't block on EGL_DEFAULT_DISPLAY under wayland Normally the application will own the main event queue and be responsible for moving events. In case of EGL_DEFAULT_DISPLAY, EGL opens the display and has to own the main queue so it can move the events itself. Call wl_display_dispatch_pending() to take ownership.	2013-02-26 12:49:49 -05:00
Ian Romanick	68a147e9a9	egl: Allow 24-bit visuals for 32-bit RGBA8888 configs Previously only the 32-bit X visual would match the 32-bit RGBA8888 configs. This resulted in every config with alpha getting the "magic" visual whose alpha is used by the compositor. This also resulted in no multisample visuals being advertised. How many ways could we lose? This patch inverts the problem... now you can't get the visual with alpha used by the compositor even if you want it. I think we need to invent a new value for EGL_TRANSPARENT_TYPE that apps can use to get this. I'm surprised that there isn't already a choice for EGL_TRANSPARENT_ALPHA. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Tian Ye <yex.tian@intel.com> Acked-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59783	2013-02-26 09:42:31 -08:00
Brian Paul	e2148ab043	st/mesa: remove some conditionals in update_raster_state() Just use simple assignments. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-26 09:16:52 -07:00
Alex Deucher	e5e4c07e79	r600g: add missing emit_flush for R600_CONTEXT_FLUSH_AND_INV case We set the cp_coher_cntl bits but never emit them. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jerome Glisse <jglisse@redhat.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-26 10:30:26 -05:00
Alex Deucher	d54bc5d227	r600g: synchronize streamout buffers on r6xx too (v3) Streamout buffers need to be synchronized on r6xx as well. v2: Add DEST flush as well. v3: drop DEST flush Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-26 10:30:10 -05:00
Brian Paul	62329d77b8	winsys/null: fix var typo templet->templat	2013-02-26 08:20:16 -07:00
Brian Paul	02bf645111	svga: fix comment typos	2013-02-26 08:20:16 -07:00
Marek Olšák	d8d58bdcb9	r300g: implement 3D transfers Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=61351	2013-02-26 01:14:20 +01:00
Marek Olšák	3857f450a6	gallium/util: add helper util_max_layer from r600g	2013-02-26 01:14:05 +01:00
Roland Scheidegger	52c44cee1e	llvmpipe: (trivial) get rid of old function prototypes. llvmpipe_init_screen/context_texture_funcs have long been replaced with the respective "resource" funcs.	2013-02-25 20:38:23 +01:00
Roland Scheidegger	c0ba1080df	draw: make sure pipeline is revalidated when sampler views or samplers change. Since with llvm execution parts of sampler view and sampler state are baked into the shader, we need to revalidate, otherwise the wrong shader might get used. (Not completely sure but I think this would not be required for non-llvm case, along with everything else in these functions.) This caused bugs in piglit arb_texture_buffer_object-formats, because we never noticed that the view format changed. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-25 20:38:23 +01:00
Roland Scheidegger	20183177a5	llvmpipe: support GL_ARB_texture_buffer_object/GL_ARB_texture_buffer_range This also fixes not honoring first/last_layer view parameters for array textures, plus not honoring last_level view parameter for all textures (neither is really used by OpenGL). This mostly passes piglit arb_texture_buffer_object tests (it needs, however, glsl 140 version override, plus GL 3.1 override, the latter only because mesa does not allow ARB_tbo in non-core contexts). Most arb_texture_buffer_object tests pass, with the exception of arb_texture_buffer_object-formats. With "arb" parameter it passes most weirdo formats before it segfaults in the state tracker, this looks to be some issue with using legacy formats in core context (fails the same in softpipe). With "core" parameter it passes with "fs", however fails with "vs" (for most formats). This will be fixed later (debugging shows we're completely missing the shader recompile depending on format). v2: based on Jose's feedback, fix comments, variable/function names. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-25 20:38:23 +01:00
Eric Anholt	50a5d5dea0	i965: Fix the W value of deprecated pointcoords on pre-gen6. When you didn't have a texcoord array bound (or a non-1 current w attrib), we were telling the fragment shader that it could just use "1" instead of doing expensive pre-gen6 math to invert it. If you drew the point with a non-1 W value, then you'd get the right size (since all the vertex computations worked), but we'd mis-interpolate the coordinate across the face. Fixes the mesa pointsprite demo on GM45. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30232 Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com> Note: This is a candidate for the stable branches.	2013-02-25 11:21:44 -08:00
Tapani Pälli	3cdb548bfb	mesa/es: NULL check in EGLImageTargetTexture2DOES check that pointer passed is valid and return error if not. Note: This is a candidate for the stable branches. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-02-25 09:17:31 -08:00
Tapani Pälli	331967c773	mesa: add missing case in _mesa_GetTexParameterfv() missing case GL_REQUIRED_TEXTURE_IMAGE_UNITS_OES is required by OES_EGL_image_external extension. Note: This is a candidate for the stable branches. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-02-25 09:17:20 -08:00
Andreas Boll	533dc3b690	docs: add news item for mesa-demos 8.1.0 release	2013-02-25 11:31:08 +01:00
Andreas Boll	d209926666	docs: import release notes for 9.1, add news item	2013-02-25 10:47:02 +01:00
Jordan Justen	0486d50320	glsl: Remove VS output varyings which are optimized out of the FS Previously when an input varying was optimized out of the FS we would still retain it as an output of the VS. We now build a hash of live FS input varyings rather than looking in the FS symbol table. (The FS symbol table will still contain the optimized out varyings.) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-23 16:20:28 -08:00
Vinson Lee	f6487e8911	vl: Fix off-by-one error in device_name_length allocation. Fixes out-of-bounds write reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel@daenzer.net>	2013-02-23 14:57:05 -08:00
John Kåre Alsaker	65aa1a194d	llvmpipe: Fix creation of shared and scanout textures. NOTE: This is a candidate for the stable branches. Signed-off-by: José Fonseca <jfonseca@vmware.com>	2013-02-23 18:36:58 +00:00
José Fonseca	fdb88967e3	util/u_blitter: Set pipe_sampler_state::normalized_coords correctly. We might want to revisit the normalized_coords semantics, but this is the current expected behavior. Fixes fdo bug 61091. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-23 18:36:57 +00:00
Brian Paul	2557d3f9c3	svga: remove some extraneous whitespace	2013-02-23 08:20:36 -07:00
Brian Paul	840d6faf68	st/mesa: fix debug_printf() format string warning Use %td for ptrdiff_t (aka GLsizeiptrARB).	2013-02-23 08:20:36 -07:00
José Fonseca	0d760a8160	util/dump: Use static assertion to detect string table size mismatches. Suggested by Brian Paul. Could probably be extended to other enums. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-23 13:32:34 +00:00
Vinson Lee	2fa9e4c97c	st/xvmc/tests: Ensure colorkey is initialized. Fixes uninitialized scalar variable defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Christian König <christian.koenig@amd.com>	2013-02-22 19:32:00 -08:00
Vinson Lee	54afbce934	st/vdpau: Fix memory leak in vlVdpBitmapSurfaceCreate. Fixes resource leak defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Christian König <christian.koenig@amd.com>	2013-02-22 19:30:03 -08:00
Vinson Lee	1bac4a1e6f	st/vdpau: Fix memory leak in vlVdpOutputSurfaceCreate. Fixes resource leak defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Christian König <christian.koenig@amd.com>	2013-02-22 19:29:56 -08:00
Tapani Pälli	b4dba5bba2	glapi: mark static_dispatch false for DiscardFramebufferEXT Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61199 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Tested-by: Brad King <brad.king@kitware.com> Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-02-22 17:18:08 -08:00
Brian Paul	b804fb8714	llvmpipe: rename polygon offset fields to something more specific Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-22 16:49:05 -07:00
Brian Paul	f93c580063	llvmpipe: add missing checks for polygon offset point/line modes The llvm pipeline handles regular filled triangle offsets, but it doesn't handle offsets for triangles drawn in point or line mode. Fixes failures found with new piglit polygon-mode-offset test. Note: This is a candidate for the stable branches. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-22 16:49:05 -07:00
Brian Paul	d6b8b116ee	draw: fix broken polygon offset stage There were several issues. We weren't handling different front/back polygon fill modes. We weren't checking whether the offset applied to fill mode vs. line mode vs. point mode. Fixes problems found with the Visualization Toolkit (VTK) test suite. Note: This is a candidate for the stable branches. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-22 16:49:05 -07:00
Brian Paul	a2c105e31e	st/mesa: fix polygon offset state translation logic The old logic was kind of twisted, but seemed to work in practice. Note: This is a candidate for the stable branches. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-02-22 16:49:05 -07:00
Brian Paul	8bb291b0f5	st/mesa: check for dummy programs in destroy_program_variants() When we destroy an ARB vp/fp whose ID was gen'd but not otherwise used we get a pointer to the dummy/placeholder program. We can't destroy that one so just skip it. This only failed during context tear-down because glDeleteProgramsARB() was already aware of dummy programs. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=38086 Note: This is a candidate for the stable branches. Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>	2013-02-22 16:49:05 -07:00
Brian Paul	8589cc41b3	st/mesa: fix trimming of GL_QUAD_STRIP We sometimes convert GL_QUAD_STRIP prims into GL_TRIANGLE_STRIP, but that changes the results of the u_trim_pipe_prim() call. We need to pass the original primitive type to the trim function. Note that OpenGL's GL_x prim type values match Gallium's PIPE_PRIM_x values. Fixes a failure in the new piglit degenerate-prims test. Note: This is a candidate for the stable branches. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-02-22 16:49:05 -07:00
Alex Deucher	8b5acad0e9	r600g: fixup PS_PARTIAL_FLUSH flag handling for cayman So we don't emit it twice if we ever use the flag on cayman. Note: this is a candidate for the 9.1 branch. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-02-22 18:43:27 -05:00
Alex Deucher	8442b67f5f	r600g: r6xx deadlock workaround (v6) Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=50655 https://bugs.freedesktop.org/show_bug.cgi?id=47116 v2: flush along with workaround. v3: just need a flush v4: try WAIT_UNTIL v5: switch to PS partial flush v6: rework patch Note: this is a candidate for the 9.1 branch. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-02-22 18:23:46 -05:00
Alex Deucher	7ebf83f109	r600g: add PS_PARTIAL_FLUSH flag PS_PARTIAL flushes seem to be required in certain cases to prevent hangs, especially on r6xx. Note: this is a candidate for the 9.1 branch. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-02-22 18:23:31 -05:00
Ian Romanick	7ae6864f0d	i965: Enable OpenGL ES 3.0 on Sandy Bridge Regardless of what we put in the screen structure, all of the extensions that compute_version_es2 checks are present and 3.0 will be exposed anyway. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-02-22 13:57:44 -08:00
Lauri Kasanen	0a82828ad5	configure: Fix build with automake < 1.11 Commit `86d30dea3c` broke building with older automake versions with this error: Makefile:769: *** Recursive variable am__v_YACC_ references itself (eventually). Stop. This patch fixes it. Fix stolen from xorg-macros. Signed-off-by: Lauri Kasanen <cand@gmx.com>	2013-02-22 13:15:14 -08:00
Anuj Phogat	cff862f90d	meta: Allocate texture before initializing texture coordinates tex->Sright and tex->Ttop are initialized during texture allocation. This fixes depth buffer blitting failures in khronos conformance tests when run on desktop GL 3.0. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=59495 Note: This is a candidate for stable branches. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-02-22 12:03:59 -08:00
Eric Anholt	92a204b493	mesa: Fix setup of ctx->Point.PointSprite for GLES2. The recent change for GL core broke the older setup, which broke gl_PointCoord on pre-gen6 (where gl_PointCoord is undefined if point sprites are disabled). Fixes the new piglit GLES-2.0/glsl-fs-pointcoord test. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32429 Note: This is a candidate for the stable branches. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-22 10:55:39 -08:00
Eric Anholt	7b0731d940	i965/fs: Fix broken math on values loaded from uniform buffers on gen6. In a debug build this led to assertion failures, but on a non-debug build the hardware would just reference the whole vec8 instead of the same channel 8 times. Fixes the new piglit glsl-1.40/uniform-buffer/fs-exp2. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57121 Note: This is a candidate for the stable branches Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-22 10:50:50 -08:00
José Fonseca	cd01cc3b48	tgsi: Improve execution debugging. - zero temps/outputs instead of copying (otherwise we won't be able to see the temps/outputs assignments for small shaders where nothing changes across big areas) - also show the inputs (as it's often impossible to infer from the rest) Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-22 16:19:58 +00:00
José Fonseca	f8436c17e4	util/u_dump: Update texture target strings.	2013-02-22 16:19:58 +00:00
Sergey Matyukevich	21e8af0b09	util/debug: Always use __builtin_frame_address on gcc. Should work around fdo bug 57563. Signed-off-by: José Fonseca <jfonseca@vmware.com>	2013-02-22 16:19:58 +00:00
Michel Dänzer	f6b40ddd2d	radeon/llvm: Remove stale comment about radeon_llvm_emit_prepare_cube_coords	2013-02-22 13:06:07 +01:00
Marek Olšák	aac8138744	r600g: fix random corruption with CP DMA in TF2 NOTE: This is a candidate for the 9.1 branch.	2013-02-22 12:49:15 +01:00
Michel Dänzer	3447cc4856	radeonsi: Don't pretend there is any R8G8B8 support The hardware can't do it.	2013-02-22 11:44:24 +01:00
Andreas Boll	c1f2c3a80f	llvmpipe/build: add DLOPEN_LIBS and PTHREAD_LIBS to the lp_test_* targets Fixes undefined symbols. NOTE: This is a candidate for the 9.1 branch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61052 Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-22 10:21:43 +01:00
Andreas Boll	c1eb585f3d	targets/xa-vmwgfx: Force c++ linker to fix undefined symbols NOTE: This is a candidate for the 9.1 branch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61200 Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-02-22 10:21:43 +01:00
Roland Scheidegger	b6f15954b4	llvmpipe: Fix rendering into PIPE_FORMAT_X8*_UNORM. Mesa state tracker recently started using PIPE_FORMAT_X8B8G8R8_UNORM, causing segfaults in texture-packed-formats, because swizzle[chan] was 0xff for the padding channel (X). Signed-off-by: José Fonseca <jfonseca@vmware.com>	2013-02-22 09:00:45 +00:00
José Fonseca	8ed1279b10	trace: Never close stdout/stderr. This could happen, when a trace screen was destroyed and then recreated.	2013-02-22 08:45:07 +00:00
José Fonseca	59025d6e95	trace: Fix set_constant_buffer dumping. We were dumping the trace driver pointer, instead of the pointer from the underlying pipe driver.	2013-02-22 08:40:47 +00:00
Vinson Lee	b92984b2fa	r600g: Fix memory leak in r600_shader_select. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reported-by: Michel Dänzer <michel@daenzer.net> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-21 21:49:24 -08:00
Roland Scheidegger	66c3cd0be3	llvmpipe: simplify buffer allocation logic. Now that buffer formats have been clarified, we don't need all that logic any longer. (Note that it never would have worked in any case: because blockwidth and blockheight were swapped, any allocation with a multi-byte format would have had zero size.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-22 04:34:07 +01:00
Roland Scheidegger	2cfee2295f	gallium/docs: improve text about resources a bit. This clarifies some things and gets rid of some old stuff. The most significant one is probably that buffers cannot have formats (nearly all drivers completely ignored format and used width0 as byte size already in any case). There seems to be no use case for "structured" buffers. (Note while d3d11 has new Structured Buffers, these still aren't associated with a format, rather a byte stride, which we can't do yet either way.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-22 04:34:07 +01:00
Roland Scheidegger	f972567671	draw: make sure key size is calculated consistently. Some parts calculated key size by using shader information, others by using the pipe_vertex_element information. Since it is perfectly valid to have more vertex_elements set than the vertex shader is using, those may not be the same, so we weren't copying over all vertex_element state - this caused the tgsi dump to assert (iterates over all vertex elements). More importantly, in this situation it would also break vertex texturing completely (since the sampler state derived from the key is at a different position than expected). Fix this by deriving key->nr_vertex_elements from the shader information instead of the pipe_vertex_element state (unlike dx10, we can't have "holes" in pipe_vertex_element state, so this should be safe). (Note that actual llvm shader generation does not use the pipe_vertex_element state from the key itself in any case (although I guess it could) but uses the one from draw.pt (which should be the same though contains all elements) instead.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-22 04:34:07 +01:00
Tom Stellard	10bcc843f8	r300g/compiler: Fix bug in OMOD folding The OMOD value was only being folded to one instruction in cases where the MUL instruction was reading a value written by more than one instruction. NOTE: This is a candidate for the stable branches. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-21 22:07:28 -05:00
Tom Stellard	5e1321ddf4	r300g/tests: Add helper functions for creating a full program Now you can convert assembly strings into a full struct radeon_compiler object and use it to test individual compiler passes. NOTE: This is a candidate for the stable branches. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-21 22:07:27 -05:00
Tom Stellard	bcf2e157ca	r300g/tests: Exit test runner with a valid status code This way make check can report whether or not the tests pass. NOTE: This is a candidate for the stable branches. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-21 22:07:27 -05:00
Tom Stellard	5355fc1e87	r300g/compiler: Make r300_vertprog_swizzle_caps visible in other files This will be used by the test suite in later commits. NOTE: This is a candidate for the stable branches. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-21 22:07:27 -05:00
Tom Stellard	c3df498ff9	r300g/compiler: Fix typo in comment Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-21 22:07:27 -05:00
Tom Stellard	27d140b960	r300g/compiler: Add missing license headers These are all files that I authored, but forgot to add the license headers. NOTE: This is a candidate for the stable branches. Signed-off-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-21 22:07:27 -05:00
Carl Worth	f5a8084692	i965: Avoid segfault in gen6_upload_state This fixes a bug introduced in commit `258453716f` and triggered whenever "rb" is NULL. Fixes at least one cause of bug #59445: [SNB/IVB/HSW Bisected]Oglc draw-buffers2(advanced.blending.none) segfault https://bugs.freedesktop.org/show_bug.cgi?id=59445 (Segfaults are still possible in that test case, but they have been present since before commit `258453716f`, which is what's being fixed here.) Reviewed-by: Eric Anholt <eric@anholt.net>	2013-02-21 12:09:24 -08:00
Alex Deucher	2e4ef989a2	r600g: don't enable ReZ mode on evergreen Can cause lockups in certain cases when zfunc/zenable/zwrite change without a flush in between. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=60969 and lockups on Civ4 with wine. This is a candidate for the 9.1 branch. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-21 11:59:07 -05:00
Andreas Boll	f7d87332b0	docs: import release notes for 9.0.3, add news item	2013-02-21 17:31:42 +01:00
Michel Dänzer	b63b3012c9	radeonsi: Don't match TGSI_SEMANTIC_POSITION fs inputs to vs outputs	2013-02-21 10:07:18 +01:00
Michel Dänzer	954bc4ac34	radeonsi: Fix w component of TGSI_SEMANTIC_POSITION fragment shader inputs. It's the reciprocal of the register value. Fixes piglit fragcoord_w and glsl-fs-fragcoord-zw-perspective. NOTE: This is a candidate for the 9.1 branch.	2013-02-21 10:06:52 +01:00
Michel Dänzer	18272c9b1b	radeonsi: Fix up and enable flat shading. Requires corresponding LLVM R600 backend fix to work correctly, but even without that it doesn't hang anymore. 13 more little piglits. Depends on LLVM: r175193, r175733 NOTE: This is a candidate for the 9.1 branch.	2013-02-21 09:14:36 +01:00
Vinson Lee	0d51906c07	radeonsi: Fix memory leak in si_shader_select. Fixes resource leak defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-02-20 23:29:12 -08:00
Paul Berry	54d9c8a04a	i965: Consign COORD_REPLACE VS hacks to Pre-Gen6. Pre-Gen6, the SF thread requires exact matching between VS output slots (aka VUE slots) and FS input slots, even when the corresponding VS output slot is unused due to being overwritten by point coordinate replacement (glTexEnvi(GL_POINT_SPRITE, GL_COORD_REPLACE, GL_TRUE)). As a result, we have a special hack in the VS to ensure when any texture coordinate is subject to point coordinate replacement, it is always allocated space in the VUE, even if it isn't written to by the VS. This hack isn't needed from Gen6 onwards, since SF (Gen7: SBE) swizzling has the ability to insert the point coordinate into gl_TexCoord[] without needing a corresponding unused VUE slot. Note that no modification of SF setup code is required for this patch--get_attr_override() already does the right thing. However, we make a slight comment change to clarify why this works. In addition to eliminating unnecessary VS recompiles and saving precious URB space on Gen6+, this will save us the trouble of having to adjust this hack when we implement geometry shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-20 13:48:45 -08:00
Ian Romanick	8b586322e7	mesa: Don't install glEvalMesh in the beginend dispatch table NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59740 Reviewed-by: Eric Anholt <eric@anholt.net>	2013-02-20 12:46:58 -08:00
Roland Scheidegger	83f7cde182	gallivm: fix indirect src register fetches requiring bitcast For constant and temporary register fetches, the bitcasts weren't done correctly for the indirect case, leading to crashes due to type mismatches. Simply do the bitcasts after fetching (much simpler than fixing up the load pointer for the various cases). This fixes https://bugs.freedesktop.org/show_bug.cgi?id=61036 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-20 19:37:30 +01:00
Roland Scheidegger	fbbcc1fcc4	llvmpipe: lp_resource_copy cleanup We don't need to flush resources for each layer, and since we don't actually care about layer at all in the flush function just drop the parameter. Also we can use util_copy_box instead of repeated util_copy_rect. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-20 19:37:30 +01:00
Roland Scheidegger	95181ed2fd	llvmpipe: fix lp_resource_copy using more than one 3d slice These used to be illegal a very long time ago, then for some more time nothing really emitted these so this code path wasn't hit. Just trivially iterate over box->depth. (Might be worth refactoring at some point since nowadays all the code doesn't really do much except for depth textures.) This fixes https://bugs.freedesktop.org/show_bug.cgi?id=61093 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-20 19:37:30 +01:00
Tapani Pälli	413941e1a3	gles2: a stub implementation for GL_EXT_discard_framebuffer This patch implements a stub for GL_EXT_discard_framebuffer with required checks listed by the extension specification. This extension is required by GLBenchmark 2.5 when compiled with OpenGL ES 2.0 as the rendering backend. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>	2013-02-20 10:01:45 -08:00
Michel Dänzer	73bf626713	r600g/Cayman: Fix blending using destination alpha factor but non-alpha dest Only compile tested, but should fix at least some piglit fbo-blending tests. NOTE: This is a candidate for the stable branches. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-20 14:43:17 +01:00
Michel Dänzer	95bced5929	radeonsi: Fix blending using destination alpha factor but non-alpha destination 11 more little piglits. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-20 12:58:52 +01:00
Marek Olšák	72f4490b55	radeonsi: implement 3D transfers That means we can map and read multiple slices with one transfer_map call. [ Cherry-picked from r600g commit `1aebb6911e` ] 11 more little piglits on master, 1 more on the 9.1 branch (Marek's glTex(Sub)Image improvements on master broke the other 10). NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-02-20 12:30:59 +01:00
Marek Olšák	a84c4edeed	radeonsi: add assertions to prevent creation of invalid surfaces [ Cherry-picked from r600g commit `ef11ed61a0` ] NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-02-20 12:30:32 +01:00
Marek Olšák	c4faab63c4	radeonsi: use u_box_origin_2d helper function [ Cherry-picked from r600g commit `b278aba423` ] NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-02-20 12:15:22 +01:00
Vinson Lee	c403a52666	configure.ac: Do not check for clock_gettime on MinGW. MinGW does not have clock_gettime. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-02-19 21:17:37 -08:00
Zack Rusin	076403c30d	DRI2: Don't disable GLX_INTEL_swap_event unconditionally GLX_INTEL_swap_event is broken on the server side, where it's currently unconditionally enabled. This completely breaks systems running on drivers which don't support that extension. There's no way to test for its presence on this side, so instead of disabling it unconditionally, just disable it for drivers which are known to not support it. It makes sense because most drivers do support it right now. We'll be able to remove this once Xserver properly advertises GLX_INTEL_swap_event. Note: This is a candidate for stable branches. Signed-off-by: Zack Rusin <zackr@vmware.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60052 Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2013-02-19 12:50:16 -08:00
Eric Anholt	4c64f65f5d	i965/fs: Enable CSE on uniform pull constant loads. Improves on a major performance regression for the dolphin wii emulator from its move to using UBOs. Performance in the UBO codepath (as replayed through apitrace) is up 21.1% +/- 2.3% (n=26/29). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-19 10:34:03 -08:00
Eric Anholt	c2a6e529c3	i965/fs: Only do CSE when the dst types match. We could potentially do some CSE even when the dst types aren't the same on gen6 where there is no implicit dst type conversion iirc, or in the case of uniform pull constant loads where the dst type doesn't impact what's stored. But it's not worth worrying about. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> NOTE: This is a candidate for the 9.1 branch.	2013-02-19 10:33:41 -08:00
Eric Anholt	aebd3f46e3	i965/fs: Delay setup of uniform loads until after pre-regalloc scheduling. This should fix the register allocation explosion on the GLES 3.0 test on gen6. It also gives us an instruction that will fit our CSE handling. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> NOTE: This is a candidate for the 9.1 branch.	2013-02-19 10:33:32 -08:00
Eric Anholt	49bdebad38	i965/fs: Fix copy propagation with smearing. We were correctly relaying the smear from MOV's src, but if the MOV didn't do a smear, we don't want to smash the smear value from the instruction being propagated into. Prevents a regression in the upcoming UBO change. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> NOTE: This is a candidate for the 9.1 branch.	2013-02-19 10:33:15 -08:00
Eric Anholt	de7cb1cff3	i965/fs: Add a bit more instruction dumping useful for upcoming work. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-19 10:33:00 -08:00
Tom Stellard	7cd248aa79	radeon/llvm: Fix build with LLVM 3.3	2013-02-19 15:52:55 +00:00
Tom Stellard	1f006717db	r600g: Add $(DEFINES) to AM_CXXFLAGS This way llvm_wrapper.cpp is compiled with -DHAVE_LLVM=0x....	2013-02-19 15:52:55 +00:00
Paul Berry	444246c7e3	i965: Remove unused userclip flags. brw_vs_prog_data::userclip hasn't been used since commit `f0cecd4` (i965: Move VUE map computation to once at VS compile time). brw_gs_prog_key::userclip_active hasn't been used since commit `9f3d321` (i965: Make the userclip flag for the VUE map come from VS prog data). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-19 07:35:52 -08:00
Brian Paul	dfbcb1849c	llvmpipe: fix handling of 0 x 0 framebuffer size Bump up the size to 1 x 1. This fixes a number of potential failure points in the code. See also http://bugs.freedesktop.org/show_bug.cgi?id=61012	2013-02-19 07:19:19 -07:00
Brian Paul	e2091f64cb	st/xlib: initialize the drawable size in create_xmesa_buffer() Otherwise, the PBuffer's size was never set. This also initializes the buffer size for windows, pixmaps, etc. Fixes http://bugs.freedesktop.org/show_bug.cgi?id=61012 Note: This is a candidate for the stable branches.	2013-02-19 07:19:19 -07:00
Stefan Brüns	5876a5dbc0	glx: fix glGetTexLevelParameteriv for indirect rendering A single element in a GLX reply is contained in the header itself. The number of elements is denoted in the "n" field of the reply. If "n" is 1, the length of additional data is 0. The XXX_data_length() function of xcb does not return the length of the (optional, n>1) data but the number of elements. Fixes http://bugs.freedesktop.org/show_bug.cgi?id=59876 Note: This is a candidate for the stable branches. Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de> Signed-off-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-02-19 07:19:19 -07:00
Brian Paul	63c30d7e4f	st/mesa: implement glBitmap unpacking from a PBO, for the cache path We weren't mapping the PBO when using the bitmap cache (but we had the PBO code for the non-cache path.) Fixes http://bugs.freedesktop.org/show_bug.cgi?id=61026 Note: This is a candidate for the stable branches.	2013-02-19 07:19:19 -07:00
Brian Paul	5da967aff5	draw: fix non-perspective interpolation in interp() This fixes a regression from `ab74fee5e1`. When we use the clip coordinate to compute the screen-space interpolation factor, we need to first apply the divide-by-W step to the clip coordinate. Fixes http://bugs.freedesktop.org/show_bug.cgi?id=60938 Note: This is a candidate for the 9.1 branch.	2013-02-19 07:19:18 -07:00
Marek Olšák	07cdfdb708	st/mesa: remove what is left from u_blit Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-18 17:57:41 +01:00
Marek Olšák	40ee93c4e8	st/mesa: simplify and improve CopyTexSubImage It has become a bit messy. Changes: - finally correct checking for transfer ops depending on the base format - making sure the base internal format and the texture format match (we were ignoring it, but it's important for correctness) - the way-too-strict rule that both src and dst base formats must be the same was dropped; ensuring the simpler and more permissive rule mentioned above is enough - stop using util_blit_pixels; pipe->blit is flexible enough, and now that we have RGBX and red-alpha formats, pipe->blit can be used for more cases Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-18 17:57:41 +01:00
Marek Olšák	6520a86c67	st/mesa: don't do sRGB conversion in CopyTexSubImage Assuming I understand EXT_texture_sRGB correctly. NOTE: This is a candidate for the stable branches. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-18 17:57:41 +01:00
Marek Olšák	0a1479c829	st/mesa: implement blit-based TexImage and TexSubImage A temporary texture is created such that it matches the format and type combination and pixels are copied to it using memcpy. Then the blit is used to copy the temporary texture to the texture image being modified by TexImage or TexSubImage. The blit takes care of the format and type conversion and swizzling. The result is a very fast texture upload involving as little CPU as possible. This improves performance in apps which upload textures during rendering. An example is the Wine OpenGL backend for DirectDraw, which I used to test the game StarCraft. Profiling had shown that TexSubImage was taking 50% of CPU time without this patch, which was the main motivation for this work, and now TexSubImage only takes 14% of CPU time. I had to underclock my CPU to see any difference in the game and this patch does make the game a lot faster if the CPU is slow (or using the powersave cpufreq profile). Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-18 17:57:41 +01:00
Marek Olšák	a6e0ac9571	st/mesa: fix blit-based GetTexImage for 1D array textures This is not easy to hit, because we have 3 code paths now (tried in this order): - memcpy-based (skips the blit) -> _mesa_tex_getimage - blit-based - slow pixel packing -> _mesa_tex_getimage The main difference later in the code is the parameters of _mesa_image_address3d. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-18 17:57:41 +01:00
Marek Olšák	91acf6225a	st/mesa: fix blit-based GetTexImage for depth/stencil formats BTW, we have 0 tests for glGetTexImage(format=GL_DEPTH*). Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-18 17:57:41 +01:00
Marek Olšák	0181e18d0f	st/mesa: factor out code for determining blit.mask from CopyTexSubImage I'll need this later. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-18 17:57:41 +01:00
Michel Dänzer	9c1107b3e1	radeonsi: Fix PIPE_FORMAT_X32_S8X24_UINT sampler hardware format 4 more little piglits. NOTE: This is a candidate for the 9.1 branch.	2013-02-18 15:59:02 +01:00
Michel Dänzer	8356962853	radeonsi: Use stencil surface level information for stencil texturing 7 more little dwarves^W piglits. NOTE: This is a candidate for the 9.1 branch.	2013-02-18 15:58:37 +01:00
Michel Dänzer	f9adf79876	radeonsi: properly implement S8Z24 depth-stencil format Based on r600g commit `2b9659c9e6` . Fixes crashes with 4 piglit tests which are now hitting these formats. NOTE: This is a candidate for the 9.1 branch.	2013-02-18 15:58:05 +01:00
Vincent Lejeune	0527317e1f	r600g/llvm: Support for TBO Reviewed-by: Tom Stellard <thomas.stellard at amd.com>	2013-02-18 15:08:59 +01:00
Vincent Lejeune	c116598f86	r600g/llvm: Set Inputs/Outputs count to 32 (api reported value) Reviewed-by: Tom Stellard <thomas.stellard at amd.com>	2013-02-18 15:08:54 +01:00
Vincent Lejeune	90e6f47ac8	r600g/llvm: Fix alpha_to_one piglit tests Reviewed-by: Tom Stellard <thomas.stellard at amd.com>	2013-02-18 15:08:50 +01:00
Vincent Lejeune	ef8fde6acb	r600g/llvm: Add support for UBO NOTE: This is a candidate for the Mesa stable branch. Reviewed-by: Tom Stellard <thomas.stellard at amd.com>	2013-02-18 15:08:45 +01:00
Christopher James Halse Rogers	dd599188d2	i965: Fix leak in blorp CopyTexSubImage2D _mesa_delete_renderbuffer does not call the driver-specific renderbuffer delete function, so the blorp code was leaking the Intel-specific bits, including some GEM objects. Call the renderbuffer's ->Delete() method instead, which does the right thing. Fixes Unity rapidly sending the machine into the arms of the OOM-killer Note: This is a candidate for the 9.1 branch. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-02-16 08:11:14 -08:00
Roland Scheidegger	f1ab67c13a	gallivm/tgsi: fix issues with sample opcodes We need to encode them as Texture instructions since the NumOffsets field is encoded there. However, we don't encode the actual target in there, this is derived from the sampler view src later. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-16 02:40:59 +01:00
Roland Scheidegger	cb2e678294	gallivm/tgsi: fix src modifier fetching with non-float types. Need to take the type into account. Also, if we want to allow mov's with modifiers we need to pick a type (assume float). v2: don't allow all modifiers on all types, in particular don't allow absolute on non-float types and don't allow negate on unsigned. Also treat UADD as signed (despite the name) since it is used for handling both signed and unsigned integer arguments and otherwise modifiers don't work. Also add tgsi docs clarifying this. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-16 02:40:51 +01:00
Roland Scheidegger	c25ae5d27b	gallivm: fix issues with trunc/round/floor/ceil with no arch rounding The emulation of these if there's no rounding instruction available is a bit more complicated than what the code did. In particular, doing fp-to-int/int-to-fp will not work if the exponent is large enough (and with NaNs, Infs). Hence such values need to be filtered out and the original value returned in this case (which fortunately should always be exact). This comes at the expense of performance (if your cpu doesn't support rounding instructions). Furthermore, floor/ifloor/ceil/iceil were affected by precision issues for values near negative (for floor) or positive (for ceil) zero, fix that as well (fixing this issue might not actually be slower except for ceil/iceil if the type is not signed which is probably rare - note iceil has no callers left in any case). Also add some new rounding test values in lp_test_arit to actually test for that stuff (which previously would have failed without sse41). This fixes https://bugs.freedesktop.org/show_bug.cgi?id=59701.	2013-02-16 02:40:44 +01:00
Roland Scheidegger	70daad6a99	gallivm: DIV shouldn't be deprecated. (Though it looks like glsl won't emit it.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-16 02:40:36 +01:00
Matt Turner	00f6fe6c66	mesa: Use PROGRAM_ERROR_STRING_ARB instead of the _NV name Since NV_fragment_program is now gone. No functional change, since the values are identical. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-15 10:28:12 -08:00
Brian Paul	2ef530cf68	trace: add context pointer sanity checking To help catch mixed up context pointer bugs in the future, add a trace_context_check() function and some new assertions. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-15 11:11:34 -07:00
Brian Paul	82d62cf04f	trace: fix incorrect trace_surface::base.context pointer When a trace_surface object is created in trace_surf_create() we weren't correctly setting the surface's context pointer. Instead of it being the trace context, it was the wrapped driver's context. This caused things to blow up sometimes during surface deallocation. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-15 11:11:34 -07:00
Brian Paul	3b0de75c4d	mesa: remove old version comment from gl.h	2013-02-15 09:25:15 -07:00
Brian Paul	70135e915a	trace: whitespace, comment clean-ups	2013-02-15 09:25:15 -07:00
Brian Paul	7b836a7d25	trace: move struct tr_list to tr_texture.h That's the only place it's used.	2013-02-15 09:25:15 -07:00
Brian Paul	4be5a06752	st/mesa: fix format query for GL_ARB_texture_rg The GL_ARB_texture_rg spec says that we need to support both texturing and rendering for the GL_RED and GL_RG formats. So move the format check up into the rendertarget_mapping[] list. Also, add PIPE_FORMAT_R8_UNORM to the list of formats required. Note: This is a candidate for the stable branches. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-15 09:25:14 -07:00
Eric Anholt	c37992c54d	i965/fs: Do a general SEND dependency workaround for the original 965. We'd been ad-hoc inserting instructions in some SEND messages with no knowledge of when it was required (so extra instructions), but not all SENDs (so not often enough). This should do much better than that, though it's still flow-control-ignorant. v2: Use BRW_MAX_MRF instead of magic numbers. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58960 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> NOTE: Candidate for the stable branches.	2013-02-15 06:17:46 -08:00
Kristian Høgsberg	6dbe94c12c	egl-wayland: Fix left-over wl_display_roundtrip() usage We have to use the EGL wayland event queue for roundtrip, so use the wayland_roundtrip() helper, which does just that.	2013-02-14 20:48:05 -05:00
Eric Anholt	5bb05c6e6d	i965/gen7: Set up all samplers even if samplers are sparsely used. In GLSL, sampler indices are allocated contiguously from 0. But in the case of ARB_fragment_program (and possibly fixed function), an app that uses texture 0 and 2 will use sampler indices 0 and 2, so we were only allocating space for samplers 0 and 1 and setting up sampler 0. We would read garbage for sampler 2, resulting in flickering textures and an angry simulator. Fixes bad rendering in 0 A.D. and ETQW. This was fixed for pre-gen7 by `28f4be9eb9` Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=25201 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58680 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> NOTE: This is a candidate for stable branches.	2013-02-14 15:14:09 -08:00
Marek Olšák	34dc4d6b67	r600g: add support for red-alpha render targets	2013-02-14 14:59:36 +01:00
Marek Olšák	ec5376f5d8	r300g: add support for red-alpha render targets	2013-02-14 14:59:36 +01:00
Marek Olšák	5d3b8ad24b	st/mesa: try to find exact format matching user format and type for DrawPixels Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-14 14:51:46 +01:00
Marek Olšák	2b9659c9e6	r600g: properly implement S8Z24 depth-stencil format for Evergreen I should say "fix", but it has never been used until now. S8Z24 is the format equivalent to the GL_UNSIGNED_INT_24_8 packing, so we'll start to see it more often with st/mesa now making smart decisions about formats. The DB<->CB copy can change the channel ordering for transfers, other than that, the internal DB format doesn't really matter. R600-R700 support is possible except shadow mapping. FMT_24_8 is broken if the SAMPLE_C instruction is used (no idea why). Also the sampler swizzling was broken in theory and the fact it worked was a lucky coincidence. radeonsi might need to port this. Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-02-14 14:51:46 +01:00
Michel Dänzer	c840270ebe	radeonsi: Handle TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS 8 more little piglits. NOTE: This is a candidate for the 9.1 branch.	2013-02-14 10:51:44 +01:00
Michel Dänzer	f34ad85765	radeonsi: Fix array indices for detecting integer vertex formats	2013-02-14 10:31:21 +01:00
Vinson Lee	0d5ce524ab	glsl: Initialize ir_texture member variable. Fixes uninitialized pointer field defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-13 23:10:48 -08:00
Eric Anholt	b8906adb66	intel: Allow blit readpixels even when the pack alignment is set. The default alignment is 4, so this fast path was rarely hit. Rather than introduce logic to handle alignment, just use the Mesa core function. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46632 Cc: neil@linux.intel.com Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-13 18:10:20 -08:00
Eric Anholt	516d8be502	i965: Remove writemask support from brw_SAMPLE(). The code was rather broken for non-XYZW on 8-wide, but all of our callers were using XYZW anyway. For my experiments with using writemask on texturing, I've been using manual header setup in the compiler backends, since we want to actually know what registers are written for optimization and register allocation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-13 18:10:20 -08:00
Eric Anholt	bf91f0b039	i965/fs: Use a helper function for checking for flow control instructions. In 2 of our checks, we were missing BREAK and CONTINUE. NOTE: Candidate for the stable branches. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-13 17:47:06 -08:00
bma	ce3dfa19ab	shaderapi: Fix AttachShader error Detect a duplicate Shader type as an error instead of silently allowing it, restrict to ES2 API. v2: Tapani Pälli <tapani.palli@intel.com> - make the check run time instead of compile time v3: chadv - Quote spec on which error to generate. Signed-off-by: bma <Bo.Ma@windriver.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-and-tested-by: Chad Versace <chad.versace@linux.intel.com>	2013-02-13 14:09:47 -08:00
Tom Stellard	0898047e7b	configure.ac: Add components to LLVM_COMPONENTS when using llvm shared libs This is required when LLVM is built with CMake, which creates one shared library for each component.	2013-02-13 17:01:08 -05:00
Eric Anholt	cb4616d32d	i965: Re-enable the -RHW workaround for original gen4 chips. Fixes broken clipping in supertuxkart and presumably many other applications. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51471 NOTE: Candidate for the stable branches. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-13 10:19:21 -08:00
Eric Anholt	ddc2b453d0	i965/gen4: Work around missing sRGB RGB DXT1 support. The hardware just doesn't support it. I suspect this was a regression from the move to fixed MESA_FORMATs for compressed textures and that previously we were storing uncompressed for this or something. Fixes GPU hangs in piglit "texwrap GL_EXT_texture_sRGB-s3tc bordercolor swizzled" on my GM965. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-13 10:19:21 -08:00
Paul Berry	dfb57e7d1b	glsl: Fix error checking on "flat" keyword to match GLSL ES 3.00, GLSL 1.50. All of the GLSL specs from GLSL 1.30 (and GLSL ES 3.00) onward contain language requiring certain integer variables to be declared with the "flat" keyword, but they differ in exactly when the rule is enforced: (a) GLSL 1.30 and 1.40 say that vertex shader outputs having integral type must be declared as "flat". There is no restriction on fragment shader inputs. (b) GLSL 1.50 through 4.30 say that fragment shader inputs having integral type must be declared as "flat". There is no restriction on vertex shader outputs. (c) GLSL ES 3.00 says that both vertex shader outputs and fragment shader inputs having integral type must be declared as "flat". Previously, Mesa's behaviour was consistent with (a). This patch makes it consistent with (b) when compiling desktop shaders, and (c) when compiling ES shaders. Rationale for desktop shaders: once we add geometry shaders, (b) really seems like the right choice, because it requires "flat" in just the situations where it matters. Since we may want to extend geometry shader support back before GLSL 1.50 (via ARB_geometry_shader4), it seems sensible to apply this rule to all GLSL versions. Also, this matches the behaviour of the nVidia proprietary driver for Linux, and the expectations of Intel's oglconform test suite. Rationale for ES shaders: since the behaviour specified in GLSL ES 3.00 matches neither pre-GLSL-1.50 nor post-GLSL-1.50 behaviour, it seems likely that this was a deliberate choice on the part of the GLES folks to be more restrictive. Also, the argument in favor of (b) doesn't apply to GLES, since it doesn't support geometry shaders at all. Some discussion about this has already happened on the Mesa-dev list. 
See: http://lists.freedesktop.org/archives/mesa-dev/2013-February/034199.html Fixes piglit tests: - glsl-1.30/compiler/interpolation-qualifiers/nonflat-*.frag - glsl-1.30/compiler/interpolation-qualifiers/vs-flat-int-0{2,3,4,5}.vert - glsl-es-3.00/compiler/interpolation-qualifiers/varying-struct-nonflat-{int,uint}.frag Fixes oglconform tests: - glsl-q-inperpol negative.fragin.{int,uint,ivec,uvec} Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-02-13 07:58:08 -08:00
Paul Berry	93c913485e	glsl: don't allow non-flat integral types in varying structs/arrays. In the GLSL 1.30 spec, section 4.3.6 ("Outputs") says: "If a vertex output is a signed or unsigned integer or integer vector, then it must be qualified with the interpolation qualifier flat." The GLSL ES 3.00 spec further clarifies, in section 4.3.6 ("Output Variables"): "Vertex shader outputs that are, or contain, signed or unsigned integers or integer vectors must be qualified with the interpolation qualifier flat." (Emphasis mine.) The language in the GLSL ES 3.00 spec is clearly correct and should be applied to all shading language versions, since varyings that contain ints can't be interpolated, regardless of which shading language version is in use. (Note that in GLSL 1.50 the restriction is changed to apply to fragment shader inputs rather than vertex shader outputs, to accommodate the fact that in the presence of geometry shaders, vertex shader outputs are not necessarily interpolated. That will be addressed by a future patch). NOTE: This is a candidate for stable branches. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-02-13 07:58:01 -08:00
Paul Berry	d5948f2f5e	glsl: Allow default precision qualifiers to be set for sampler types. From GLSL ES 3.00 section 4.5.4 ("Default Precision Qualifiers"): "The precision statement precision precision-qualifier type; can be used to establish a default precision qualifier. The type field can be either int or float or any of the sampler types, and the precision-qualifier can be lowp, mediump, or highp." GLSL ES 1.00 has similar language. GLSL 1.30 doesn't allow precision qualifiers on sampler types, but this seems like an oversight (since the intention of including these in GLSL 1.30 is to allow compatibility with ES shaders). Previously, Mesa followed GLSL 1.30 and only allowed default precision qualifiers to be set for float and int. This patch makes it follow GLSL ES rules in all cases. Fixes Piglit tests default-precision-sampler.{vert,frag}. Partially addresses https://bugs.freedesktop.org/show_bug.cgi?id=60737. NOTE: This is a candidate for stable branches. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-02-13 07:57:58 -08:00
Marek Olšák	60aa5f360a	st/mesa: fix texture buffer objects Broken by `624528834f`. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-13 16:38:19 +01:00
Kenneth Graunke	8cabe26f5d	i965: Use derived state for Haswell's 3DSTATE_VF packet. Otherwise, we fail to correctly handle GL_PRIMITIVE_RESTART_FIXED_INDEX. Fixes gles3conform's primitive_restart_mode test. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-02-12 20:24:28 -08:00
Marek Olšák	ea63491629	st/mesa: accelerate glGetTexImage for all formats using a blit This commit allows using glGetTexImage during rendering and still maintain interactive framerates. This improves performance of WarCraft 3 under Wine. The framerate is improved from 25 fps to 39 fps in the main menu, and from 0.5 fps to 32 fps in the game. v2: fix choosing the format for decompression	2013-02-13 02:13:10 +01:00
Marek Olšák	cd41833b44	gallium: add red-alpha texture formats and a couple of util functions This is for glGetTexImage and it will be used for samplers only (which some drivers already implement by reading util_format_description). v2: incorporate Brian's suggestion Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-13 02:13:10 +01:00
Jerome Glisse	974b482aca	r600g: fix lockup when hyperz & alpha test are enabled together. v3 Seems that alpha test being enabled confuses the GPU on the order in which it should perform the Z testing. So force the order programmed through db shader control. v2: Only force z order when alpha test is enabled v3: Update db shader when binding new dsa + spelling fix Signed-off-by: Jerome Glisse <jglisse@redhat.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-12 17:03:56 -05:00
Jordan Justen	496928a442	CopyTexImage: Don't check sRGB vs LINEAR for desktop GL In OpenGL 4.3, new language was added that would require this check. But, if this check results in broken applications then perhaps it will be reversed. For now, remove this check and re-evaluate when desktop GL 4.3 is closer. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-02-12 11:22:49 -08:00
Christian König	8c80894fb3	radeonsi: remove constant index limitation v3 With the llvm patches, fixing 14 piglit tests in total. v2: increase the const limit v3: document the const limit Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-02-12 18:57:12 +01:00
Christian König	8514f5ac01	radeonsi: support constants as TEX coordinates Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-02-12 18:57:12 +01:00
Paul Berry	f8426eea35	glsl: Fix unsupported version error for GLSL ES 3.00, future proof for 3.30. When the user specifies an unsupported GLSL version, _mesa_glsl_parse_state::process_version_directive() nicely gives them an error message telling them which GLSL versions are supported. Previous to this patch, the logic for determining whether a given language version was supported was independent from the logic to generate this error message string; as a result, we had a bug where GLSL 3.00 would never be listed in the error message as an available language version, even if it was really available. To make matters worse, the code for generating the error message string assumed that desktop GL versions were always separated by 0.10, an assumption that will be wrong as soon as we support GLSL 3.30. This patch fixes both problems by adding a table of supported GLSL versions to _mesa_glsl_parse_state; this table is used both to generate the error message and to check whether a given version is supported. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-02-12 08:06:35 -08:00
Roland Scheidegger	9870459522	gallium/docs: fix typos in sample opcode descriptions	2013-02-12 16:51:11 +01:00
Roland Scheidegger	2947f00bc4	nv50: fix bogus parameters when processing sample instructions Discovered accidentally when changing SAMPLE_L definition. Turns out the lod arguments were already correct for the new definition but the compare and derivs were not. Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2013-02-12 16:51:11 +01:00
Roland Scheidegger	427d36a227	gallium: fix tgsi SAMPLE_L opcode to use separate source for explicit lod It looks like using coord.w as explicit lod value is a mistake, most likely because some dx10 docs had it specified that way. Seems this was changed though: http://msdn.microsoft.com/en-us/library/windows/desktop/hh447229%28v=vs.85%29.aspx - let's just hope it doesn't depend on runtime build version or something. Not only would this need translation (so go against the stated goal these opcodes should be close to dx10 semantics) but it would prevent usage of this opcode with cube arrays, which is apparently possible: http://msdn.microsoft.com/en-us/library/windows/desktop/bb509699%28v=vs.85%29.aspx (Note not only does this show cube arrays using explicit lod, but also the confusion with this opcode: it lists an explicit lod parameter value, but then states last component of location is used as lod). (For "true" hw drivers, only nv50 had code to handle it, and it appears the code was already right for the new semantics, though fix up the seemingly wrong c/d arguments while there.) v2: fix comment, separate out other changes. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-12 16:51:11 +01:00
Brian Paul	4bfdef87e6	util: fix incorrect Z bit masking in util_clear_depth_stencil() For PIPE_FORMAT_Z24_UNORM_S8_UINT, the Z bits are in the 24 least significant bits. Fixes http://bugs.freedesktop.org/show_bug.cgi?id=60527 and http://bugs.freedesktop.org/show_bug.cgi?id=60524 and http://bugs.freedesktop.org/show_bug.cgi?id=60047 Note: This is a candidate for the stable branches. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-12 08:11:05 -07:00
Matt Turner	a79ce0c925	radeon: Remove dead STANDALONE_MMIO defines These were, at some point in the past, used to request that Xorg's compiler.h export a static inline xf86ReadMmio32 instead of a function pointer. compiler.h only has this option for DEC Alpha. But Xorg's compiler.h isn't being included by either of these two files and the radeon driver still works on Alpha, so the definitions are dead and not needed. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-02-11 23:18:11 -08:00
Roland Scheidegger	8b8bca06df	llvmpipe: implement dual source blending link up the fs outputs and blend inputs, and make sure the second blend source is correctly loaded and converted (which is quite complex). There's a slight refactoring of the monster generate_unswizzled_blend() function where it makes sense to factor out alpha conversion (which needs to run twice for dual source blend). This passes piglit arb_blend_func_extended tests. v2: remove new but ultimately not used function... Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-12 03:41:48 +01:00
Kenneth Graunke	a73181be6d	docs: Mark a few things done in GL3.txt.	2013-02-11 15:55:29 -08:00
Kenneth Graunke	3d7c09e8b0	i965: Add missing dirty bits to INTEL_DEBUG=state arrays. These are more recent additions, and no one remembered to update the INTEL_DEBUG=state code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-02-11 15:54:10 -08:00
Kenneth Graunke	b9c5997bb3	i965: Reorganize brw_bits to match the order in brw_context.h. This reorders the "brw_bits" array in brw_state_upload.c to match the order of the #defines in brw_context.h. Otherwise, it's really hard to see if any are missing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-02-11 15:54:07 -08:00
Kenneth Graunke	0ac6d5a7fb	i965: Use BRW_NEW_CONTEXT for gen7_disable rather than BRW_NEW_BATCH. These don't need to be re-disabled on every batch if we're using hardware contexts. (If we're not, this is equivalent.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-02-11 15:54:01 -08:00
Jerome Glisse	323a448825	r600g: make sure async blit is done 8 * pitch at a time v2 The blit must be aligned on 8 horizontal blocks. v2: no need to align the remainder Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-02-11 18:44:18 -05:00
Martin Andersson	a37835c8ed	winsys/radeon: fix bo with virtual address referencing mismatch If the same context tries to flink and open the object, use the same bo struct instead of opening a new gem handle for the object. This way we avoid having 2 different handles pointing to the same kernel object, which can later lead to trouble with virtual addresses. Fix: https://bugs.freedesktop.org/show_bug.cgi?id=60200 Signed-off-by: Martin Andersson <g02maran@gmail.com> Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-02-11 18:38:00 -05:00
Eric Anholt	e776b632c0	vbo: Merge GL_QUADS drawing requests in display lists. minecraft apparently has its piles of display lists each contain 6 instances of glBegin(GL_QUADS)/verts/glEnd(), which appear in the compiled list as 6 prims of 4 verts each in one draw call. We can reduce driver overhead even more by making that one prim of 24 verts. Improves minecraft performance by 1.6% +/- .25% (n=446) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-02-11 13:14:52 -08:00
Eric Anholt	50202f0961	vbo: Print display list debug using printf() like dlist.c does. Otherwise, the stderr and stdout debug end up interleaved wrong when I pipe them to a file. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-02-11 13:14:51 -08:00
Eric Anholt	b9a66da258	i965: Remove some stale comments about the brw_constant_buffer atom. These have been wrong since `f428255bde` back in 2009! Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-11 13:14:51 -08:00
Eric Anholt	e07457d0ae	i965: Simplify VS push constant upload code since removal of old path. We used to have clip planes optionally included in the push constants, resulting in a variable amount of data uploaded, but no more. This also means less wasted space in the batch for our push constants. v2: Update _NEW_TRANSFORM state bit information. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2013-02-11 13:14:51 -08:00
Eric Anholt	11766b1bbb	i965: Add perf debug for a corner case. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-11 13:14:51 -08:00
Eric Anholt	936a3ca6fd	i965: Fix access mode of index buffer rebase. It doesn't matter with our current implementation of MapBufferRange, but it was wrong -- the result pointer is read by intel_upload_data(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-11 13:14:51 -08:00
Eric Anholt	016928b163	i965: Fix indentation of index buffer rebase code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-11 13:14:51 -08:00
Marek Olšák	cb6470775c	mesa: fix GetTexImage if mesa format and internal format don't match Tested with softpipe only exposing RGBA formats. NOTE: This is a candidate for the stable branches. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-11 19:43:01 +01:00
Marek Olšák	c8379204ab	mesa: don't use memcpy fast path for GetTexImage if base format is different The Mesa format can be RGBA8888_REV, the format/type can be GL_RGBA/GL_UNSIGNED_BYTE, but the actual texture internal format can be LUMINANCE_ALPHA, INTENSITY, etc. Therefore we should look at the base internal format as well. NOTE: This is a candidate for the stable branches. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-11 19:43:01 +01:00
Marek Olšák	09a99867ab	mesa: don't use _mesa_base_tex_format for format parameter of GetTexImage _mesa_base_tex_format doesn't accept GL_BGR and GL_ABGR_EXT, etc. v2: add a (now hopefully complete) helper function to deal with this NOTE: This is a candidate for the stable branches. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-11 19:43:01 +01:00
Marek Olšák	5587c8619a	mesa: adjust usage of swapBytes/littleEndian in format_matches_format_and_type - swapBytes has no effect on 8-bit single-component formats - GL_SHORT is in host byte order, so checking for littleEndian is unnecessary, I decided to make the change for single-component formats only Based on suggestions from Michel Dänzer. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-02-11 19:43:01 +01:00
Marek Olšák	dcdffaaf43	mesa: remove per-format memcpy codepaths from texstore functions It's obsoleted by the common function _mesa_texstore_memcpy. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-11 19:43:01 +01:00
Marek Olšák	4bf27ed7ed	mesa: implement common texstore memcpy function for all formats Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-11 19:43:01 +01:00
Marek Olšák	967b21df6a	mesa: fill in Z32_FLOAT_X24S8 in _mesa_format_matches_format_and_type Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-11 19:43:01 +01:00
Marek Olšák	a0510fa773	mesa: fill in signed cases and RGBA16 in _mesa_format_matches_format_and_type Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-02-11 19:43:01 +01:00
Marek Olšák	a0fb71888f	mesa: fill in INT/UINT format cases in _mesa_format_matches_format_and_type Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-02-11 19:43:01 +01:00
Marek Olšák	43395da55a	mesa: fill in YCBCR cases in _mesa_format_matches_format_and_type based on the texstore code Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-02-11 19:43:01 +01:00
Marek Olšák	87f94e6f80	mesa: fill in SRGB cases in _mesa_format_matches_format_and_type Texstore takes the same codepath as the corresponding linear formats. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-11 19:43:01 +01:00
Adhemerval Zanella	1ab2c55bf4	llvmpipe: fix vertex_header mask store in big-endian This patch fixes the vertex_header mask bitfield store on big-endian architectures by bit-swapping the fields accordingly. Reviewed-by: Adam Jackson <ajax@redhat.com>	2013-02-11 13:41:28 -05:00
Adhemerval Zanella	a8016b2f60	llvmpipe: remove lp_swizzled_cbuf Unused since `75da95c5`. Reviewed-by: Adam Jackson <ajax@redhat.com>	2013-02-11 13:41:28 -05:00
Andreas Boll	44a5d7371c	docs: document removal of makedepend build dependency Build dependency removed with `424f200881` Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-02-11 18:11:20 +01:00
Andreas Boll	d59bd61445	docs: update making a new mesa release info Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-11 10:58:33 +01:00
Andreas Boll	ab10d2d8a5	docs: use proper title for index.html Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-11 10:58:33 +01:00
Andreas Boll	bf9e19d308	docs: mention some other supported APIs v2: add ES3 Reviewed-by: Brian Paul <brianp@vmware.com> (v1)	2013-02-11 10:58:33 +01:00
Andreas Boll	babc638c72	docs: update sourcetree glsl directory is located in src and not in src/egl v2: remove ppc, move glapi from src/mesa to src/mapi Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-11 10:58:33 +01:00
Andreas Boll	dbbe108951	docs: replace CVS with git Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-11 10:58:33 +01:00
Vinson Lee	990bd49fba	configure.ac: Do not check for rt on Mac OS X. There is no rt library on Mac OS X. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58872 Acked-by: Matt Turner <mattst88@gmail.com>	2013-02-09 15:21:08 -08:00
Ian Romanick	0e2f26d5ea	intel: Do not expose OES_compressed_ETC1_RGB8_texture or ARB_texture_rgb10_a2ui pre-GEN4 Older hardware cannot do ARB_texture_rgb10_a2ui, and the translation code for OES_compressed_ETC1_RGB8_texture was never implemented in the i915 driver. NOTE: This is a candidate for all stable branches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-08 19:28:53 -08:00
Roland Scheidegger	75d99673a8	softpipe: clean up lod computation This should handle the new lod_zero modifier more correctly. The runtime-conditional is a bit more complex however we now also do scalar lod computation when appropriate which should more than make up for it. The refactoring should also fix an issue with explicit lods (lod clamp wasn't applied to them). Also, always pass lod as the 5th element from tgsi executor, which simplifies things (get rid of annoying conditionals later). v2: based on Brian's feedback, use switch in a couple of places, fix up some function parameter names, fix up comments. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-08 18:54:40 -08:00
Roland Scheidegger	4f1d757b86	softpipe: try to beat new dx10-style sample opcodes into shape There were several bugs in how this was handled; most opcodes wouldn't even have fetched the right arguments. Also, the tex "target" is coming from the sampler view, hence it cannot have information about shadow comparisons - fortunately this is not only sampler state but also needs to have matching instruction, so just use this instead to identify shadow comparisons. Still untested (compiles...). Note that sample_i and sviewinfo are still busted (just assert). (The problem is that the interface for doing the opengl-equivalent functions txf and txq is tied to the sampler itself but these opcodes have no sampler associated with them. Oops...) Also, even the other sample instructions will not work correctly since they always operate on samplers which include the texture state. Fixing this wouldn't be that difficult but most likely make softpipe quite a bit slower when using the OpenGL tex opcodes (as the samplers have pre-baked function calls in the sampler state depending on texture state and that stuff would need to be evaluated at runtime), so leave it for now. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-08 18:54:40 -08:00
Roland Scheidegger	614982d320	gallivm: fix up size queries for dx10 sviewinfo opcode Need to calculate the number of mip levels (if it would be worthwhile could store it in dynamic state). While here, the query code also used chan 2 for the lod value. This worked with mesa state tracker but it seems safer to use chan 0. Still passes piglit textureSize (with some handwaving), though the non-GL parts are (largely) untested. v2: clarify and expect the sviewinfo opcode to return ints, not floats, just like the OpenGL textureSize (dx10 supports dst modifiers with resinfo). Also simplify some code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-08 18:54:40 -08:00
Roland Scheidegger	0a8043bb76	gallivm: hook up dx10 sampling opcodes They are similar to old-style tex opcodes but with separate sampler and texture units (and other arguments in different places). Also adjust the debug tgsi dump code. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-08 18:54:40 -08:00
Vinson Lee	db7612d15d	intel: Ensure variable intel is used in i915 builds. Fixes unused pointer value defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-02-08 18:51:27 -08:00
Vinson Lee	85a9a7f09c	glsl: Ensure glsl_type constructors initialize gl_type. Fixes uninitialized scalar field defects reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-08 18:50:08 -08:00
Jerome Glisse	9a47684564	winsys/radeon: improve debugging printing Make sure one can identify virtual address failure from allocation failure. Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-02-08 20:30:09 -05:00
Roland Scheidegger	1d71106f5c	softpipe: get rid of tgsi_sampler_control param in img_filter None of the filters used it (why would they). Maybe that param was just there because some of the lines were considered to be too short... Reviewed-by: Dave Airlie <airlied@redhat.com>	2013-02-08 16:32:30 -08:00
Roland Scheidegger	66b6d51214	softpipe: fix using optimized filter function This optimized filter (when using repeat wrap modes, linear min/mag/mip filters, pot textures) only applies to 2d textures, but nothing prevented it from being used for other textures (likely leading to very bogus sample results). Note: This is a candidate for the 9.0 branch. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-08 16:32:30 -08:00
Roland Scheidegger	49f8825c49	gallivm: fix typo in lp_build_mul_norm The signed case didn't do what the comment indicated. Should increase rounding precision (at the expense of performance since the former code was effectively a no-op). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-08 16:32:30 -08:00
Roland Scheidegger	67906f91c9	llvmpipe: first steps of adding dual source blend support This adds support of the additional blending factors to the blend function itself, and also enables testing of it in lp_test_blend (which passes). Still need to add the glue code of linking fs shader outputs to blend inputs in llvmpipe, and probably need to add special handling if destination doesn't include alpha (which lp_test_blend doesn't test). Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-08 16:32:30 -08:00
Roland Scheidegger	8e44f4117a	llvmpipe: refactoring of visibility counter handling There can be other per-thread data than just vis_counter, so pass a struct around instead (some of our non-public code uses this already and this difference is a major cause of merge pain). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-08 16:32:30 -08:00
Jerome Glisse	3310acdf47	xorg: fix exa finish access The exa core will already set the pointer to NULL prior to calling the callback function. So don't bail out in the callback if it's already NULL. Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-02-08 19:01:19 -05:00
Kristian Høgsberg	1fe007399c	egl-wayland: Make sure we allocate a back buffer even if nothing was rendered At eglSwapBuffer time, we blindly assume we have a back buffer, but the back buffer only gets allocated when somebody tries to render something. NOTE: This is a candidate for the 9.0 and 9.1 branches. https://bugs.freedesktop.org/show_bug.cgi?id=60086	2013-02-08 11:23:18 -05:00
Paul Berry	a4b9678a54	Consolidate some redundant definitions of ARRAY_SIZE() macro. Previous to this patch, there were 13 identical definitions of this macro in Mesa source. That's ridiculous. This patch consolidates 6 of them to a single definition in src/mesa/main/macros.h. Unfortunately, I wasn't able to eliminate the remaining definitions, since they occur in places that don't include src/mesa/main/macros.h: - include/pci_ids/pci_id_driver_map.h - src/egl/drivers/dri2/egl_dri2.h - src/egl/main/egldefines.h - src/gbm/main/backend.c - src/gbm/main/gbm.c - src/glx/glxclient.h - src/mapi/mapi/stub.c I'm open to suggestions as to how to deal with the remaining redundancy. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-08 06:51:22 -08:00
Paul Berry	dc92b2d11f	intel/pre-gen6: Disable EXT_framebuffer_multisample. Previously, the i965 driver enabled EXT_framebuffer_multisample even on pre-gen6 chipsets. However, since we don't support multisampling on these chips, we set GL_MAX_SAMPLES=1 (the minimum allowed by EXT_framebuffer_multisample), and if the client ever requested a multisample buffer, we quietly supplied them with a single-sampled buffer instead. After some discussion on the mailing list (see thread "ext_framebuffer_multisample: check for num_samples<=1"), it's clear that this was the wrong approach. The correct approach is to only expose EXT_framebuffer_multisample when we truly support multisampling; that frees us to set a sensible value of GL_MAX_SAMPLES=0 on other chipsets, so that we never have to deal with a client requesting a multisample buffer when multisampling isn't supported. This change causes the following piglit tests to be skipped on chipsets prior to Gen6: - "ARB_framebuffer_sRGB/blit {renderbuffer,texture} {linear,linear_to_srgb,srgb,srgb_to_linear} {downsample,msaa,upsample} {disabled,enabled}" - EXT_framebuffer_multisample/blit-mismatched-formats - EXT_framebuffer_multisample/blit-mismatched-sizes - EXT_framebuffer_multisample/dlist - EXT_framebuffer_multisample/interpolation 0 * - EXT_framebuffer_multisample/minmax - EXT_framebuffer_multisample/negative-copypixels - EXT_framebuffer_multisample/negative-copyteximage - EXT_framebuffer_multisample/negative-max-samples - EXT_framebuffer_multisample/negative-mismatched-samples - EXT_framebuffer_multisample/negative-readpixels - EXT_framebuffer_multisample/renderbuffer-samples - EXT_framebuffer_multisample/renderbufferstorage-samples - EXT_framebuffer_multisample/samples This is expected, since the above tests exercise MSAA functionality, and shouldn't be run on systems prior to Gen6. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-02-08 06:51:22 -08:00
Vinson Lee	b681ed6ac9	glsl: Initialize all tfeedback_candidate_generator member variables. Fixes uninitialized pointer field defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-02-07 21:51:20 -08:00
Vinson Lee	7c544e55da	nv30: Fix memory leak. Fixes resource leak defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-07 21:45:01 -08:00
Ian Romanick	82691f1293	glsl: Change loop_analysis to not look like a resource leak Previously the loop_state was allocated in the loop_analysis constructor, but not freed in the (nonexistent) destructor. Moving the allocation of the loop_state makes this code appear less sketchy. Either way, there is no actual leak. The loop_state is freed by the single caller of analyze_loop_variables. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: Dave Airlie <airlied@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57753	2013-02-07 21:18:42 -08:00
Paul Berry	04f0d6cc22	mesa: Don't check (offset + size <= bufObj->Size) in BindBufferRange. In the documentation for BindBufferRange, OpenGL specs from 3.0 through 4.1 contain this language: "The error INVALID_VALUE is generated if size is less than or equal to zero or if offset + size is greater than the value of BUFFER_SIZE." This text was dropped from OpenGL 4.2, and it does not appear in the GLES 3.0 spec. Presumably the reason for the change is that some clients change the size of the buffer after calling BindBufferRange. We don't want to generate an error at the time of the BindBufferRange call just because the old size of the buffer was too small, when the buffer is about to be resized. Since this is a deliberate relaxation of error conditions in order to allow clients to work, it seems sensible to apply it to all versions of GL, not just GL 4.2 and above. (Note that there is no danger of this change allowing a client to access data beyond the end of a buffer. We already have code to ensure that that doesn't happen in the case where the client shrinks the buffer after calling BindBufferRange). Eliminates a spurious error message in the gles3 conformance test "transform_feedback_offset_size". Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-02-07 21:16:37 -08:00
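The relaxed BindBufferRange check described above can be sketched in C roughly as follows. This is a simplified illustration, not Mesa's code: the function names `bind_range_is_valid` and `usable_range_size` are hypothetical, and clamping the range at use time is only an assumption about how an out-of-bounds access might be prevented after the buffer shrinks.

```c
#include <stdbool.h>

/* Bind-time validation after the change: only a non-positive size is
 * rejected. The offset + size <= buffer size comparison is gone, so the
 * offset is deliberately ignored here. */
bool bind_range_is_valid(long offset, long size)
{
    (void)offset;
    return size > 0;
}

/* Use-time clamp (assumed behavior): never report more bytes than the
 * buffer currently holds past the bound offset. */
long usable_range_size(long buffer_size, long offset, long size)
{
    if (offset >= buffer_size)
        return 0;                       /* range starts past the end */

    long available = buffer_size - offset;
    return size < available ? size : available;
}
```

With this split, binding a 128-byte range at offset 64 to a 100-byte buffer succeeds, and only 36 bytes are usable until the client resizes the buffer.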
Ian Romanick	f29ab4ece5	i965: Set UniformBufferOffsetAlignment to sizeof(vec4) This matches the behavior of the Windows driver, but a bspec reference would be nice. NOTE: This is a candidate for the 9.0 and 9.1 branches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-07 21:16:08 -08:00
Matt Turner	3ee602314f	mesa: Allow glGet* queries of MAX_VARYING_COMPONENTS in ES 3 Should have been done in `d9948e49` but I missed it because MAX_VARYING_FLOATS doesn't appear in the ES 3 spec, but is the same value as MAX_VARYING_COMPONENTS. NOTE: Candidate for the 9.1 branch Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-02-07 17:53:13 -08:00
Daniel van Vugt	6e226ab5ac	gbm: Remember to init format on gbm_dri_bo_create. https://bugs.freedesktop.org/show_bug.cgi?id=60143	2013-02-07 20:00:52 -05:00
Eric Anholt	7242b03622	glx: Centralize the code for context flushing. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-07 13:13:02 -08:00
Eric Anholt	95080ca8d4	glx: Add a little comment about what dri2FlushFrontBuffer() does. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-07 13:13:02 -08:00
Michel Dänzer	c093f12406	radeonsi: Handle scaled and integer formats for samplers and vertex elements. Also, add assertions to stress that render targets don't support scaled formats. 20 more little piglits.	2013-02-07 19:07:43 +01:00
Michel Dänzer	23405ef467	radeonsi: Don't advertise PIPE_FORMAT_L8A8_SRGB support. The hardware can't do it.	2013-02-07 19:07:43 +01:00
Michel Dänzer	a9816cc784	radeonsi: Remove incorrect (and dead) assignment in tex_fetch_args(). The proper return type is assigned at the end of the function.	2013-02-07 19:07:43 +01:00
Michel Dänzer	07eddc444c	radeonsi: Use unique names for referring to texture sampling intrinsics. Append the overloaded vector type used for passing in the addressing parameters. Without this, LLVM uses the same function signature for all those types, which cannot work. Fixes problems e.g. with FlightGear and Red Eclipse.	2013-02-07 19:07:43 +01:00
Marek Olšák	74a17a764d	r300g: put textures with usage=staging in GTT and make them linear	2013-02-07 17:43:19 +01:00
Jerome Glisse	681707abf2	r600g: fix slice tile max for compressed texture and async dma Was using the pixel size instead of the number of block for the slice tile max computation which resulted in dma writing at wrong address. Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-02-07 10:42:22 -05:00
Marek Olšák	9ba1e23647	radeonsi: use new RGBX formats	2013-02-07 00:20:24 +01:00
Marek Olšák	4dc142d521	r300g: fix blending and alpha-test with RGBX16F and enable MSAA for it	2013-02-07 00:20:24 +01:00
Marek Olšák	27e216a075	r300g: use new RGBX formats	2013-02-07 00:20:24 +01:00
Marek Olšák	3c351b7c33	r600g: use new RGBX formats	2013-02-07 00:20:24 +01:00
Marek Olšák	dd21ecdc42	st/mesa: use new RGBX formats Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-07 00:20:24 +01:00
Marek Olšák	f9fa725690	mesa: add RGBX formats for existing GL RGB texture formats v2: fix compilation of swrast	2013-02-07 00:20:24 +01:00
Marek Olšák	70bf7bae1d	gallium: add RGBX formats for existing GL RGB texture formats Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-07 00:20:23 +01:00
Kenneth Graunke	7d467f3c15	i965/blorp: Support blits between ARGB and XRGB formats. Now that we have support for overriding alpha to 1.0, we can handle blitting between these formats in either direction. For now, we only support two XRGB formats: MESA_FORMAT_XRGB8888 and MESA_FORMAT_RGBX8888_REV. Most places only appear to worry about the former, so ignore the latter for now. We can always add it later. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Martin Steigerwald <martin@lichtvoll.de>	2013-02-06 10:01:03 -08:00
Kenneth Graunke	c0554141a9	i965/blorp: Support overriding destination alpha to 1.0. Currently, Blorp requires the source and destination formats to be equal. However, we'd really like to be able to blit between XRGB and ARGB formats; our BLT engine paths have supported this for a long time. For ARGB -> XRGB, nothing needs to occur: the missing alpha is already interpreted as 1.0. For XRGB -> ARGB, we need to smash the alpha channel to 1.0 when writing the destination colors. This is fairly straightforward with blending. For now, this code is never used, as the source and destination formats still must be equal. The next patch will relax that restriction. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Martin Steigerwald <martin@lichtvoll.de>	2013-02-06 10:00:53 -08:00
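The two blit directions described above can be illustrated on the CPU with packed 8-bit pixels. This is only a sketch: the commit does the override with blending on the GPU, and the helper names and the layout (alpha or X in the top byte of a 32-bit word) are assumptions made for the example.

```c
#include <stdint.h>

/* XRGB -> ARGB: the X byte carries no meaning, so the destination alpha
 * is forced to fully opaque (1.0, i.e. 0xFF in 8 bits). */
uint32_t xrgb_to_argb(uint32_t xrgb)
{
    return xrgb | 0xFF000000u;
}

/* ARGB -> XRGB: nothing needs to change, because readers of the
 * destination ignore the top byte and treat alpha as 1.0. */
uint32_t argb_to_xrgb(uint32_t argb)
{
    return argb;
}
```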
Kenneth Graunke	0b3bebbaac	i965: Implement CopyTexSubImage2D via BLORP (and use it by default). The BLT engine has many limitations. Currently, it can only blit X-tiled buffers (since we don't have a kernel API to whack the BLT tiling mode register), which means all depth/stencil operations get punted to meta code, which can be very CPU-intensive. Even if we used the BLT engine, it can't blit between buffers with different tiling modes, such as an X-tiled non-MSAA ARGB8888 texture and a Y-tiled CMS ARGB8888 renderbuffer. This is a fundamental limitation, and the only way around that is to use BLORP. Previously, BLORP only handled BlitFramebuffer. This patch adds an additional frontend for doing CopyTexSubImage. It also makes it the default. This is partly to increase testing and avoid hiding bugs, and partly because the BLORP path can already handle more cases. With trivial extensions, it should be able to handle everything the BLT can. This helps PlaneShift massively, which tries to CopyTexSubImage2D between depth buffers whenever a player casts a spell. Since these are Y-tiled, we hit meta and software ReadPixels paths, eating 99% CPU while delivering ~1 FPS. This is particularly bad in an MMO setting because people cast spells all the time. It also helps Xonotic in 4X MSAA mode. At default power management settings, I measured a 6.35138% +/- 0.672548% performance boost (n=5). (This data is from v1 of the patch.) No Piglit regressions on Ivybridge (v3) or Sandybridge (v2). v2: Create a fake intel_renderbuffer to wrap the destination texture image and then reuse do_blorp_blit rather than reimplementing most of it. Remove unnecessary clipping code and conditional rendering check. v3: Reuse formats_match() to centralize checks; delete temporary renderbuffers. Reorganize the code. v4: Actually copy stencil when dealing with separate stencil buffers but packed depth/stencil formats. Tested by a new Piglit test. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> [v4] Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v3] Reviewed-and-tested-by: Carl Worth <cworth@cworth.org> [v2] Tested-by: Martin Steigerwald <martin@lichtvoll.de> [v3]	2013-02-06 10:00:22 -08:00
Kenneth Graunke	29aef6cce8	mesa: Put extern "C" guards in renderbuffer.h. I need to use this from C++ code. NOTE: This is a candidate for the 9.1 branch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-02-06 09:59:53 -08:00
Brian Paul	48b01e6a10	llvmpipe: remove extraneous const qualifier	2013-02-06 09:16:58 -07:00
Marek Olšák	bc2ceb97f1	gallium/util: remove duplicated function util_format_is_rgb_no_alpha It only checks if alpha is present, so it's the same as util_format_has_alpha. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-06 14:51:32 +01:00
Marek Olšák	b92057a983	st/mesa: get rid of GET_CURRENT_CONTEXT in st_choose_format Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-06 14:51:32 +01:00
Marek Olšák	2e6f10d0b7	st/mesa: adjust texture format selection to try the closest base format first Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-06 14:51:32 +01:00
Marek Olšák	b89b80a91d	st/mesa: put RGBX8 and RGBA8 in the default format lists Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-06 14:51:32 +01:00
Marek Olšák	c1856da75d	st/mesa: add the rest of RGB8 format/type combos to exact_format_mapping tables These formats were added a few months after these tables were committed. No idea why we have the table though. AFAIK, texstore always takes the slow path for GL_RGBn. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-06 14:51:32 +01:00
Marek Olšák	ebe86b8082	mesa: fixup inconsistent naming of RG16 formats Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-06 14:51:31 +01:00
Marek Olšák	cf37aef414	r600g: report correct control flow depth	2013-02-06 14:51:31 +01:00
Marek Olšák	fc86394882	glsl: fix incorrect comment about do_common_optimization	2013-02-06 14:51:31 +01:00
Marek Olšák	4362bdadf3	st/mesa: emit saturates in the vertex shader if Shader Model 3.0 is supported v2: change the requirement from GLSL 1.30 to SM 3.0 (R500 can do this)	2013-02-06 14:51:31 +01:00
Marek Olšák	48689ca14a	st/mesa: advertise ARB_shading_language_packing for GLSL >= 1.30 Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-06 14:51:31 +01:00
Marek Olšák	afd4178fec	st/mesa: do most of GLSL lowering outside of the optimization do-while loop based on the intel driver Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-06 14:51:31 +01:00
Marek Olšák	7325f1faaa	st/mesa: remove dead code depending on EmitCondCodes EmitCondCodes is always false. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-06 14:51:31 +01:00
Marek Olšák	85efb2fff0	r300g: try to use color varyings for texcoords if max texcoord limit is exceeded +35 piglits	2013-02-06 14:45:22 +01:00
Marek Olšák	1d3561d877	r300/compiler: copy-propagate saturate mode when possible Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-02-06 14:45:20 +01:00
Marek Olšák	ae8696c7ee	r300/compiler: add support for saturate output modifier in r500 vertex shaders The GLSL compiler can simplify clamp(v,0,1) to saturate. The state tracker doesn't use it yet, but it will. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2013-02-06 14:45:16 +01:00
Marek Olšák	499f7de12e	r300g: fix blending with RGBX formats Change DST_ALPHA to ONE.	2013-02-06 14:31:23 +01:00
Marek Olšák	f40a7fc34a	r300g: fix blending with blend color and RGBA formats NOTE: This is a candidate for the stable branches.	2013-02-06 14:31:23 +01:00
José Fonseca	5048e69392	egl/dri: Don't invoke dri2_dpy->flush if it's NULL. I'd like to test Mesa OpenGL ES along side with NVIDIA libGL drivers. But without this change, I get a NULL pointer dereference. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-06 09:22:26 +00:00
Vinson Lee	d08cee5d80	glsl: Initialize ast_parameter_declarator member variables. Fixes uninitialized pointer field defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-05 22:11:32 -08:00
Brian Paul	ff60509157	svga: fix sRGB rendering We weren't emitting the SVGA_RS_OUTPUTGAMMA state so sRGB rendering didn't work properly. Fixes piglit's framebuffer-srgb test. Note: This is a candidate for the stable branches. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-05 12:34:55 -07:00
Tom Stellard	8aaee4d64e	r600g/compute: Fix segfault caused by new shader disassembler	2013-02-05 15:41:33 +00:00
Michel Dänzer	02a423b239	Require libdrm_radeon 2.4.42 for radeonsi. It has new PCI IDs and an important tiled surface layout fix.	2013-02-05 15:12:14 +01:00
Eric Anholt	86536a321d	i965: Disable write masking when setting up texturing m0. v2/Kayden: Also disable write masking in the vec4 backend. Fixes 78 oglconform glsl-bif-tex-* subcases. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> [v1] Reviewed-by: Eric Anholt <eric@anholt.net> [v2]	2013-02-04 17:29:41 -08:00
Tapani Pälli	e062a4187d	intel: Fix regression in intel_create_image_from_name stride handling Strangely, the DRIimage interface we have passes the pitch in pixels instead of bytes, which anholt missed in the change to using bytes for region pitch. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-02-04 13:59:02 -08:00
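The unit mismatch behind this regression is easy to see in a small sketch. The helper below is hypothetical (it is not the driver's function) and assumes `cpp` means bytes per pixel: a pitch that arrives in pixels has to be scaled before it is stored as a byte pitch.

```c
/* Convert a pitch given in pixels (as the image interface passes it,
 * per the commit message) into the byte pitch a region expects.
 * cpp = bytes per pixel. */
int pitch_pixels_to_bytes(int pitch_pixels, int cpp)
{
    return pitch_pixels * cpp;
}
```

For a 256-pixel-wide ARGB8888 image (4 bytes per pixel), the byte pitch is 1024; using the pixel count unscaled would understate the stride by a factor of four.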
Eric Anholt	5751d0cb2d	i965: Fix segfaults from `45a28a927a` If you look up a level that isn't in the miptree, you crash. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-02-04 13:58:55 -08:00
Alex Deucher	4161d70bba	radeonsi: add Oland pci ids Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Note: this is a candidate for the 9.1 branch.	2013-02-04 15:44:38 -05:00
Alex Deucher	af0af75881	radeonsi: default PA_SC_RASTER_CONFIG to 0 That should work in all cases. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Note: this is a candidate for the 9.1 branch.	2013-02-04 15:44:07 -05:00
Alex Deucher	83e4407f44	radeonsi: add support for Oland chips Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Note: this is a candidate for the 9.1 branch	2013-02-04 15:43:21 -05:00
Paul Berry	99b78337e3	glsl: Support transform feedback of varying structs. Since transform feedback needs to be able to access individual fields of varying structs, we can no longer match up the arguments to glTransformFeedbackVaryings() with variables in the vertex shader. Instead, we build up a hashtable which records information about each possible name that is a candidate for transform feedback, and then match up the arguments to glTransformFeedbackVaryings() with the contents of that hashtable. Populating the hashtable uses the program_resource_visitor infrastructure, so the logic is shared with how we handle uniforms. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-02-04 10:36:47 -08:00
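The name-matching idea in this commit can be sketched as a lookup over every name that could be captured, including individual struct fields. Everything here is illustrative: the `tf_candidate` type, the table contents, and `find_candidate` are hypothetical, and a linear scan stands in for the hashtable the commit describes.

```c
#include <stddef.h>
#include <string.h>

/* One entry per name that is a candidate for transform feedback. */
struct tf_candidate {
    const char *name;   /* fully qualified name, e.g. "light.pos" */
    int location;       /* made-up slot number for the example */
};

/* Hypothetical table: a plain varying plus the fields of a struct
 * varying named "light". The struct itself is not a candidate. */
static const struct tf_candidate candidates[] = {
    { "color",     0 },
    { "light.pos", 1 },
    { "light.dir", 2 },
};

/* Match one requested varying name against the candidate table. */
const struct tf_candidate *find_candidate(const char *name)
{
    size_t count = sizeof(candidates) / sizeof(candidates[0]);

    for (size_t i = 0; i < count; i++) {
        if (strcmp(candidates[i].name, name) == 0)
            return &candidates[i];
    }
    return NULL;   /* unknown name: the caller reports the error */
}
```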
Paul Berry	53febac02c	glsl: Use parse_program_resource_name to parse transform feedback varyings. Previously, transform feedback varyings were parsed in an ad-hoc fashion that wasn't compatible with structs (or array of structs). This patch makes it use parse_program_resource_name(), which correctly handles both. Note that parse_program_resource_name()'s technique for handling mal-formed input strings is to simply let them through and rely on the fact that a future name lookup will fail. Because of this, tfeedback_decl::init() no longer needs to return a boolean error code--it always succeeds, and if the input was mal-formed the error will be detected later. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-02-04 10:36:44 -08:00
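A rough sketch of the parsing approach described above: split a trailing `[N]` subscript off a resource name, and let anything malformed pass through untouched so that a later name lookup fails. The function `parse_array_suffix` is a hypothetical simplification written for this example; it does not reproduce the signature or full behavior of parse_program_resource_name().

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns the array index from a trailing "[N]", or -1 if there is none.
 * *base_len receives the length of the name without the subscript.
 * Malformed input is not rejected here; it is returned as a plain name. */
long parse_array_suffix(const char *name, size_t *base_len)
{
    size_t len = strlen(name);
    *base_len = len;

    if (len == 0 || name[len - 1] != ']')
        return -1;

    /* Walk backwards over the digits that precede the ']'. */
    size_t first_digit = len - 1;
    while (first_digit > 0 && isdigit((unsigned char)name[first_digit - 1]))
        first_digit--;

    /* Require at least one digit and an opening '[' before it. */
    if (first_digit == len - 1 || first_digit == 0 ||
        name[first_digit - 1] != '[')
        return -1;

    long index = 0;
    for (size_t i = first_digit; i < len - 1; i++)
        index = index * 10 + (name[i] - '0');

    *base_len = first_digit - 1;
    return index;
}
```

For `"foo[12]"` this yields index 12 with a base length of 3; for `"foo[]"` or `"foo"` it returns -1 and leaves the whole string as the name.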
Paul Berry	b4db34cc4c	glsl: Rename uniform_field_visitor to program_resource_visitor. There's actually nothing uniform-specific in uniform_field_visitor. It is potentially useful for all kinds of program resources (in particular, future patches will use it for transform feedback varyings). This patch renames it to program_resource_visitor, and clarifies several comments, to reflect the fact that it is useful for more than just uniforms. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-02-04 10:36:40 -08:00
Paul Berry	b92900d26a	mesa/glsl: Separate parsing logic from _mesa_get_uniform_location. The parsing logic is moved to a new function in the GLSL module, parse_program_resource_name(). This name was chosen because it should eventually be useful for handling everything that OpenGL 4.3 calls "program resources" (e.g. uniforms, vertex inputs, fragment outputs, and transform feedback varyings). Future patches will make use of this function for linking transform feedback varyings. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-02-04 10:36:35 -08:00
Quentin Glidic	11bd1b0f58	gallium/egl: Fix include dirs for VPATH build NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Quentin Glidic <sardemff7+git@sardemff7.net>	2013-02-04 10:36:50 -08:00
Abdiel Janulgue	eaeb314372	intel: make sure to setup image dimension in image_from_planar setup Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=60212 Tested-by: Scott Moreau <oreaus@gmail.com> Tested-by: Tiago Vignatti <tiago.vignatti@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2013-02-04 10:18:22 -08:00
Matt Turner	2db1f73849	builtin_compiler/build: Don't use *_FOR_BUILD when not cross compiling Previously we were relying on CFLAGS_FOR_BUILD to be the same as CFLAGS when not cross compiling, but this assumption didn't take into consideration 32-bit builds on 64-bit systems. More generally, not honoring CFLAGS is bad. Automake is evidently too stupid to accept if CROSS_COMPILING CC = @CC_FOR_BUILD@ ... else CC = @CC@ endif without warning that CC has been already defined. The warnings are harmless, but I'd prefer to avoid future reports about them, so define proxy variables, which are assigned inside the conditional and then unconditionally assigned to CC et al. NOTE: This is a candidate for the 9.1 branch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59737 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60038	2013-02-04 09:35:45 -08:00
Brian Paul	805cf07dc3	st/mesa: emit SQRT opcode when driver supports it	2013-02-04 09:33:44 -07:00
Brian Paul	13f3ae5b83	gallium/drivers: handle PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED query Initially, only softpipe/llvmpipe support SQRT.	2013-02-04 09:33:44 -07:00
Brian Paul	2d367e40d9	gallivm: implement support for SQRT opcode	2013-02-04 09:33:44 -07:00
Brian Paul	ad30e4545b	tgsi: add support for new SQRT opcode	2013-02-04 09:33:44 -07:00
Brian Paul	d276a40e15	gallium: add SQRT shader opcode The glsl-to-tgsi translator will emit SQRT to implement GLSL's sqrt() and distance() functions if the PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED query says it's supported by the driver. Otherwise, sqrt(x) is implemented with x*rsq(x). The problem with this is sqrt(0) must be handled specially because rsq(0) might be Inf/NaN/undefined (and then 0*rsq(0) is Inf/NaN/undefined). In the glsl-to-tgsi code we use an extra CMP to check if x is zero and then replace the result of x*rsq(x) with zero. In the end, this makes sqrt() generate much more reasonable code for drivers that can do square roots. Note that many of piglit's generated shader tests use the GLSL distance() function.	2013-02-04 09:33:44 -07:00
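The fallback path for drivers without SQRT can be sketched in C as follows. This is an illustration of the arithmetic only, not the glsl-to-tgsi code: `rsq` is a software stand-in for a hardware reciprocal square root (here a few Newton iterations, an assumption made so the example is self-contained), and `sqrt_via_rsq` is a hypothetical name.

```c
#include <math.h>   /* only for the INFINITY macro */

/* Stand-in for a hardware RSQ instruction. Returns Inf for 0, which is
 * one of the results the commit message says rsq(0) may produce. */
float rsq(float x)
{
    if (x == 0.0f)
        return INFINITY;

    /* Newton iteration for 1/sqrt(x), starting below the true value. */
    float y = (x > 1.0f) ? 1.0f / x : 1.0f;
    for (int i = 0; i < 40; i++)
        y = y * (1.5f - 0.5f * x * y * y);
    return y;
}

/* sqrt(x) emulated as x * rsq(x). For x == 0 the product is 0 * Inf,
 * which is NaN, so a compare-and-select replaces it with zero, playing
 * the role of the extra CMP instruction. */
float sqrt_via_rsq(float x)
{
    float product = x * rsq(x);
    return (x == 0.0f) ? 0.0f : product;
}
```

The select costs an extra instruction on every square root, which is why a native SQRT opcode produces tighter code when the driver supports it.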
Michel Dänzer	6455d40b7e	radeonsi: Remove spurious traces of R16G16B16 support. The hardware can't do it, and these were causing warnings in some piglit tests. NOTE: This is a candidate for the 9.1 branch.	2013-02-04 17:03:26 +01:00
Michel Dänzer	6bcb823844	radeonsi: Enable texture arrays. 28/30 piglit tests pass. NOTE: This is a candidate for the 9.1 branch.	2013-02-04 17:03:25 +01:00
Michel Dänzer	120efeef8b	radeonsi: Improve packing of texture address parameters. In particular, the LOD bias and depth comparison values are packed before the 'normal' texture coordinates, and the array slice and LOD values are appended. NOTE: This is a candidate for the 9.1 branch.	2013-02-04 17:03:25 +01:00
Michel Dänzer	e5fb7347a7	radeonsi: Adapt to sample intrinsics changes. Fix up intrinsic names, and bitcast texture address parameters to integers. NOTE: This is a candidate for the 9.1 branch.	2013-02-04 17:03:25 +01:00
Brian Paul	624528834f	st/mesa: simplify the update_single_texture() function In particular, rework the sRGB/linear format selection code. There's no reason to mess with the Mesa format. Just do everything in terms of the gallium pipe_format. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-02-04 08:28:17 -07:00
Brian Paul	5f81549f6c	st/mesa: merge st_ChooseTextureFormat_renderable() into st_ChooseTextureFormat() That was the only place it was being called from.	2013-02-04 08:28:17 -07:00
Brian Paul	f54a9f4ff2	st/mesa: improve the format choosing code for DrawPixels The code before was getting a pipe format, then calling st_pipe_format_to_mesa_format() and then converting back again with st_mesa_format_to_pipe_format(). This removes one conversion step.	2013-02-04 08:28:17 -07:00
Andreas Boll	38d65a9769	gallium: handle unhandled PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60098 Signed-off-by: Brian Paul <brianp@vmware.com>	2013-02-04 08:28:17 -07:00
Brian Paul	4df42890c5	st/mesa: don't choose DXT formats if we can't do DXT compression If we call gl[Copy]TexImage2D() with a generic compression format (e.g. intFormat=GL_COMPRESSED_RGBA) we can't choose a DXT format if we don't have the external DXT compression library. We weren't actually enforcing this before since the pipe_screen::is_format_supported(DXT) query has no dependency on the DXT compression library. Now if we're given a generic compressed format and we can't do DXT compression we'll fall back to a non-compressed format. v2: use util_format_is_s3tc() function and add more comments about the allow_dxt parameter. Note: This is a candidate for the stable branches. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-02-04 07:58:21 -07:00
Brian Paul	478056b81a	mesa: don't use format chooser code for glCompressedTexImage When glCompressedTexImage is called the internalFormat is a specific format for the incoming image and the hardware format should be the same (since we never do format transcoding). So use the simpler _mesa_glenum_to_compressed_format() function. This change is also needed for the next patch. Note: This is a candidate for the stable branches.	2013-02-04 07:58:21 -07:00
Kenneth Graunke	44aa2e15f6	i965: Fix the SF Vertex URB Read Length calculation for Gen7 platforms. Ivybridge doesn't appear to have the same errata as Sandybridge; no corruption was observed by setting it to more than the minimal correct value. It's possible that we were simply lucky, since the URB entries are 1024-bit on Ivybridge vs. 512-bit Sandybridge. Or perhaps the underlying hardware issue is fixed. Either way, we may as well program the minimum value since it's now readily available, likely to be more efficient, and possibly more correct. v2: Use GEN7_SBE_* defines rather than GEN6_SF_*. (A copy and paste mistake.) They're the same, but using the right names is better. NOTE: This is a candidate for all stable branches. Reviewed-by: Paul Berry <stereotype441@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-03 13:41:09 -08:00
Kenneth Graunke	09fbc29828	i965: Fix the SF Vertex URB Read Length calculation for Sandybridge. (This commit message was primarily written by Paul Berry, who explained what's going on far better than I would have.) Previous to this patch, we thought that the only restrictions on 3DSTATE_SF's URB read length were (a) it needs to be large enough to read all the VUE data that the SF needs, and (b) it can't be so large that it tries to read VUE data that doesn't exist. Since the VUE map already tells us how much VUE data exists, we didn't bother worrying about restriction (a); we just did the easy thing and programmed the read length to satisfy restriction (b). However, we didn't notice this erratum in the hardware docs: "[errata] Corruption/Hang possible if length programmed larger than recommended". Judging by the context surrounding this erratum, it's pretty clear that it means "URB read length must be exactly the size necessary to read all the VUE data that the SF needs, and no larger". Which means that we can't program the read length based on restriction (b)--we have to program it based on restriction (a). The URB read size needs to precisely match the amount of data that the SF consumes; it doesn't work to simply base it on the size of the VUE. Thankfully, the PRM contains the precise formula the hardware expects. Fixes random UI corruption in Steam's "Big Picture Mode", random terrain corruption in PlaneShift, and Piglit's fbo-5-varyings test. NOTE: This is a candidate for all stable branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56920 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60172 Tested-by: Jordan Justen <jordan.l.justen@intel.com> (v1/Piglit) Tested-by: Martin Steigerwald <martin@lichtvoll.de> (PlaneShift) Reviewed-by: Paul Berry <stereotype441@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-03 13:40:45 -08:00
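Restriction (a) above can be sketched as a small calculation driven by the highest attribute the SF actually reads. The commit does not spell out the PRM formula, so the one used here is an assumption: the read length is taken to be counted in pairs of attributes and rounded up. The function name is hypothetical as well.

```c
/* Assumed formula: the SF reads attributes 0..max_source_attr, and the
 * URB read length is expressed in units of two attributes, rounded up
 * so the last attribute is always covered. */
unsigned sf_urb_read_length(unsigned max_source_attr)
{
    unsigned attrs_read = max_source_attr + 1;
    return (attrs_read + 1) / 2;
}
```

Under this assumption, a shader whose highest source attribute is 4 needs a read length of 3, regardless of how much larger the VUE itself is.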
Kenneth Graunke	5e9bc7bd12	i965: Compute the maximum SF source attribute. The maximum SF source attribute is necessary to compute the Vertex URB read length properly, which will be done in the next commit. NOTE: This is a candidate for all stable branches. Reviewed-by: Paul Berry <stereotype441@gmail.com> Tested-by: Martin Steigerwald <martin@lichtvoll.de> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-03 13:40:43 -08:00
Kenneth Graunke	b3efc5bea8	i965: Refactor Gen6+ SF attribute override code. The next patch will benefit from easy access to the source attribute number and whether or not we're swizzling. It doesn't want the final attr_override DWord form, however. NOTE: This is a candidate for all stable branches. Reviewed-by: Paul Berry <stereotype441@gmail.com> Tested-by: Martin Steigerwald <martin@lichtvoll.de> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-03 13:40:31 -08:00
Kenneth Graunke	488ddb247c	glsl: Remove hash table from ir_set_program_inouts pass. Back when ir_var_in and ir_var_out signified both function parameters and shader input/outputs, we had trouble distinguishing the two when looking at a dereference. Now that we have separate ir_var_shader_in and ir_var_shader_out modes, we can determine this easily. Removing the hash table saves memory and CPU overhead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-02-03 13:38:16 -08:00
Kenneth Graunke	b56d6badad	i965: Remove dead field brw_wm_prog_data::error.	2013-02-03 13:38:16 -08:00
Kenneth Graunke	7eda7a455b	i965: Remove dead field brw_context::constant_map. This was used by the old VS backend, but that's long gone.	2013-02-03 13:38:16 -08:00
Vinson Lee	8a4d952d10	r600g: Fix memory leak. Fixes resource leak defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-01 22:52:22 -08:00
Vinson Lee	080e91aa07	egl/dri2: Fix memory leak. Fixes resource leak defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-01 22:50:34 -08:00
Vinson Lee	cea341fce8	nv30: Fix memory leak. Fixes resource leak defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-01 22:50:26 -08:00
Vinson Lee	4cd4deab48	nv50: Fix memory leak. Fixes resource leak defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-01 22:50:16 -08:00
Vinson Lee	0580f165ed	nvc0: Fix memory leak. Fixes resource leak defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-01 22:50:01 -08:00
Vinson Lee	985e710c0d	swrast: Fix memory leak. Fixes resource leak defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-02-01 22:49:45 -08:00
Quentin Glidic	1e857130f0	configure.ac: Fix --with-llvm-shared-libs The third argument of AC_ARG_WITH is evaluated for any provided value, not only on --with-, so it must not force-enable the feature. Also, setting $with_llvm_shared_libs in the opencl check was overriding the user switch. https://bugs.freedesktop.org/show_bug.cgi?id=59851 Signed-off-by: Quentin Glidic <sardemff7+git@sardemff7.net>	2013-02-01 22:53:46 +00:00
Tom Stellard	257006e2a4	r600g/llvm: Select the correct GPU type for RV670 RV670 belongs in the R600 chip class https://bugs.freedesktop.org/show_bug.cgi?id=58666 NOTE: This is a candidate for the 9.1 branch	2013-02-01 22:53:30 +00:00
Abdiel Janulgue	6c7e95cb89	intel: implement create image from texture Save miptree level info to DRIImage: - Appropriately-aligned base offset pointing to the image - Additional x/y adjustment offsets from above. v8: -Bump intelImageExtension version v9: -Don't use internal _eglError but implement error reporting in new DRI interface instead. This fixes Android build problems based on feedback from Adrian M Negreanu and Chad Versace. -Move the non-tile-aligned check and error-reporting to intel_set_texture_image_region v10: -Don't #include "egl/main/eglcurrent.h". [chadv] Reviewed-by: Eric Anholt <eric@anholt.net> (v6) Acked-by: Chad Versace <chad.versace@linux.intel.com> (v10) Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2013-02-01 11:58:13 -08:00
Abdiel Janulgue	8e2454c562	intel: Account for mt->offset in intel_miptree_map We need to take into account the offset from original bo when using glTexSubImage() and other functions that manipulate the subregion of an exported texture. Offsets are appended to mapped region address and when blitting from a source region. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2013-02-01 11:58:12 -08:00
Abdiel Janulgue	11f5c82e83	intel: Create a miptree using offsets in intel_set_texture_image_region When binding a region to a texture image, re-create the miptree base-level considering the offset and dimension information exported by DRIImage. v8: - Move the alignment surface address checks from the image-from-texture code to the texture-from-image side. This allows the error reporting to conform to OES_EGL_Image and to prevent mixing up EGL and GL errors. Reported by Chad Versace. - Addressed an existing issue in renderbuffer case where there is a possibility of creating EGL images out of depthstencil textures which isn't really possible. This was spotted by Eric earlier. Reviewed-by: Eric Anholt <eric@anholt.net> (v6) Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v8) Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2013-02-01 11:58:12 -08:00
Abdiel Janulgue	45a28a927a	i965: Account for offsets when updating SURFACE_STATE. If the offsets are present, this lets us specify a particular level and slice in a shared region using the base level of an exported mip-map tree. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2013-02-01 11:58:12 -08:00
Abdiel Janulgue	163b35e416	intel: add pixel offset calculator for miptree levels Add helper to calculate fine-grained x and y adjustment pixels to an image within a miptree level for tiled regions. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2013-02-01 11:58:12 -08:00
Abdiel Janulgue	7014df0d1d	intel: Expose intel_miptree_create_internal as intel_miptree_create_layout. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2013-02-01 11:58:12 -08:00
Abdiel Janulgue	f9e4e5f9f9	intel: expose dimensions and offsets of a miptree level in DRIImage v8: - Append has_depthstencil field in DRIImage structure. Reviewed-by: Eric Anholt <eric@anholt.net> (v6) Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v8) Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2013-02-01 11:58:12 -08:00
Abdiel Janulgue	7b7af48e01	dri2: Create image from texture Add create image from texture extension and bump version. v8: - Add appropriate image errors codes in DRI interface so we don't have to use internal EGL functions in driver. Suggested by Chad Versace. Reviewed-by: Eric Anholt <eric@anholt.net> (v6) Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v8) Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2013-02-01 11:58:12 -08:00
Michel Dänzer	a8a5055f2d	radeonsi: Fix draws using user index buffer. Was broken since commit `bf469f4edc` ('gallium: add void *user_buffer in pipe_index_buffer'). Fixes 11 piglit tests and lots of missing geometry e.g. in TORCS. NOTE: This is a candidate for the 9.1 branch.	2013-02-01 18:53:03 +01:00
Brian Paul	1bb52bab9e	st/mesa: whitespace/indentation fix	2013-02-01 08:00:28 -07:00
Brian Paul	3cb4915344	svga: check for NaN shader immediates The svga device doesn't handle them. Replace with zeros. Fixes several piglit tests, such as "glsl-const-builtin-inversesqrt". Reviewed-by: Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-02-01 08:00:28 -07:00
Brian Paul	9eff5e905f	svga: add, use SVGA3D_SURFACE_HINT_VOLUME flag Reviewed-by: Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-02-01 08:00:28 -07:00
Brian Paul	9a91ce9448	trace: measure time for each gallium call To get a rough idea of how much time is spent in each gallium driver function. The time is measured in microseconds.	2013-02-01 08:00:28 -07:00
Brian Paul	b516bf46ef	trace: add void to function definition	2013-02-01 08:00:28 -07:00
Brian Paul	fe20e3ebb5	trace: allow GALLIUM_TRACE=stdout/stderr	2013-02-01 08:00:28 -07:00
Marek Olšák	225228a7f5	radeonsi: port some of get_shader_param changes from r600g Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-02-01 15:16:35 +01:00
Marek Olšák	cc5fdaf2dc	mesa: don't expose IBM_rasterpos_clip in a core context glRasterPos doesn't exist in the core profile. NOTE: This is a candidate for the stable branches (9.0 and 9.1). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-02-01 15:16:35 +01:00
Marek Olšák	a06f03d795	r300g: always put MSAA resources in VRAM This along with the latest drm-fixes branch should help with bad performance of MSAA. Remember: Nx MSAA can't be more than N times slower (where N=2,4,6). Anyway, I recommend at least 512 MB of VRAM for Full HD 6x MSAA. NOTE: This is a candidate for the 9.1 branch.	2013-02-01 15:16:35 +01:00
Michel Dänzer	3b888f534c	configure.ac: GLX cannot work without OpenGL GLX uses mapi/glapi/libglapi.la, which is only built for OpenGL. If the user specified --enable-xlib-glx --disable-opengl, error out, as these cannot be both observed at the same time. If the user just specified --disable-opengl but not --disable-glx, print a warning and disable GLX as well. NOTE: This is a candidate for the stable branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59364 Tested-by: Tom Stellard <thomas.stellard@amd.com>	2013-02-01 11:42:09 +01:00
Vadim Girlin	9824755dae	r600g: remove broken assert from r600_isa.c Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-02-01 13:19:35 +04:00
Vadim Girlin	e42111ecba	r600g: implement shader disassembler v3 R600_DUMP_SHADERS environment var now allows to choose dump method: 0 (default) - no dump 1 - full dump (old dump) 2 - disassemble 3 - both v2: fix output for burst_count > 1 v3: use more human-readable output for kcache data in CF_ALU_xxx clauses, improve output for ALU_EXTENDED, other minor fixes Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-02-01 12:08:42 +04:00
Vadim Girlin	022122ee63	r600g: use tables with ISA info v3 v3: added some flags including condition codes for ALU, fixed issue with CF reverse lookup (overlapping ranges of CF_ALU_xxx and other CF instructions) rebased on current master Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-02-01 12:08:42 +04:00
Vinson Lee	b68a3b865b	glapi: Do not use backtrace on MinGW. execinfo.h is not available on MinGW. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-01-31 23:23:12 -08:00
Jerome Glisse	5e0c956cb2	r600g: add cs memory usage accounting and limit it v3 We are now seing cs that can go over the vram+gtt size to avoid failing flush early cs that goes over 70% (gtt+vram) usage. 70% is use to allow some fragmentation. The idea is to compute a gross estimate of memory requirement of each draw call. After each draw call, memory will be precisely accounted. So the uncertainty is only on the current draw call. In practice this gave very good estimate (+/- 10% of the target memory limit). v2: Remove left over from testing version, remove useless NULL checking. Improve commit message. v3: Add comment to code on memory accounting precision Signed-off-by: Jerome Glisse <jglisse@redhat.com> Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-01-31 14:23:52 -05:00
Marek Olšák	5c86a728d4	r600g: fix htile buffer leak NOTE: This is a candidate for the 9.1 branch.	2013-01-31 15:35:18 +01:00
Andreas Boll	6ea753b056	mesa: bump version to 9.2 (devel) Now that branch 9.1 is created, bump the minor version in master. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-01-31 09:01:15 +01:00
Matt Turner	a527b2192e	Revert "mesa: Return INVALID_OPERATION when type is known but not allowed" This reverts commit `2906e2034c`. Fixes a regression in the glean depthStencil test. Reverting this does not affect any tests in es3conform, so a more recent patch must have also fixed the failure this one was intended to fix. Reported-by: lu hua <huax.lu@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59494	2013-01-30 10:56:01 -08:00
Kenneth Graunke	7cccf46ec4	mesa: Add TexBufferRange to dispatch_sanity. Christoph implemented this, so we should expect it to be present now. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60082	2013-01-30 10:48:05 -08:00
Christoph Bumiller	4bdf5454a5	nv50,nvc0: fix/enable texture buffer objects	2013-01-30 13:10:11 +01:00
Christoph Bumiller	a901d54f67	st/mesa: add support for GL_ARB_texture_buffer_range v2: Update to handle BufferSize being -1 and return a NULL sampler view if the specified range would cause out of bounds access. Reviewed-by: Brian Paul <brianp@vmware.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2013-01-30 13:10:11 +01:00
Christoph Bumiller	0fcd2c5e2f	gallium: add PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT Reviewed-by: Brian Paul <brianp@vmware.com>	2013-01-30 13:10:11 +01:00
Christoph Bumiller	785a8c3beb	mesa: implement GL_ARB_texture_buffer_range v2: Record texObj.BufferSize as -1 in TexBuffer(non-Range) instead of the buffer's current size so we know we always have to use the full size of the buffer object (i.e. even if it changes without the user calling TexBuffer again) for the texture. Clarify invalid offset alignment error message. v3: Use extra GL_CORE-only section in get_hash_params.py for TEXTURE_BUFFER_OFFSET_ALIGNMENT. v4: Remove unnecessary check for profile in _mesa_TexBufferRange. Add check for extension enable in get_tex_level_parameter_buffer. v5: Fix position in gl_API.xml. Add comment about meaning of BufferSize == -1. v6: Add back checks for core profile and add a note about it. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-01-30 13:10:10 +01:00
Matt Turner	02b6da1e87	build: Add missing comma in AS_IF Reported-by: Lauri Kasanen<curaga@operamail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47248#c15	2013-01-29 13:19:18 -08:00
Brian Paul	ce6bf2d4c5	mesa: remove ctx->Driver.Error() hook Not used by any driver anymore. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-01-29 12:32:13 -07:00
Stéphane Marchesin	67e7263e45	glx: Check that swap_buffers_reply is non-NULL before using it Check that the return value from xcb_dri2_swap_buffers_reply is non-NULL before accessing the struct members. Note: This is a candidate for the 9.0 branch. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-01-29 11:15:22 -08:00
Brian Paul	70c5297439	mesa: fix comment typo: s/formaat/format/	2013-01-29 11:53:24 -07:00
José Fonseca	42f762dcf6	llvmpipe: Don't advertise S8_UNORM (with feeble attempt at supporting it). S8_UNORM was inadvertedly supported together with Z16_UNORM. I tried to update the code to accomodate stencil-only -- it seemed a simple thing to do -- but "fbo-stencil clear GL_STENCIL_INDEX8" still fails, and it's not worth debugging. Therefore although this change tries to update for S8_UNORM, it also disables it completely. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-01-29 16:41:56 +00:00
José Fonseca	3b683700ef	llvmpipe: Fix deferred depth writes for Z16_UNORM. This special path hadn't been exercised by my earlier testing, and mask values weren't being properly truncated to match the values. This change fixes that. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-01-29 16:41:56 +00:00
Roland Scheidegger	0eb588a37c	draw: fix draw_llvm_variant_key struct padding to avoid recompiles The struct padding got broken by `c789b981b2`. This caused serious performance regression because part of the key was uninitialized and hence the shader always recompiled (at least on release builds...). While here also fix key size calculation when the number of samplers and the number of sampler views are different. v2: add comment Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-01-29 08:40:52 -08:00
Marek Olšák	845130951f	docs/relnotes-9.1: document new features in radeon drivers	2013-01-29 17:35:17 +01:00
Brian Paul	d83336ce3e	docs: more VMware guest driver info, tips	2013-01-29 08:59:53 -07:00
Brian Paul	c80bacba2e	st/mesa: only enable GL_EXT_framebuffer_multisample if GL_MAX_SAMPLES >= 2 We never really have multisampling with one sample per pixel. See also http://bugs.freedesktop.org/show_bug.cgi?id=59873 Note: This is a candidate for the 9.0 branch. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-01-29 08:59:53 -07:00
Brian Paul	8f3c81d018	mesa: don't enable GL_EXT_framebuffer_multisample for software drivers Note: This is a candidate for the 9.0 branch. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-01-29 08:59:53 -07:00
Brian Paul	2180f32972	osmesa: use _mesa_generate_mipmap() for mipmap generation, not meta See previous commit for more info. Note: This is a candidate for the 9.0 branch. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-01-29 08:59:53 -07:00
Brian Paul	89551ae04f	xlib: use _mesa_generate_mipmap() for mipmap generation, not meta The swrast fragment program interpreter has trouble computing the right texture LOD because it doesn't have easy access to input derivatives. This causes the GLSL-based meta generate mipmap code to fetch texels from the wrong mipmap level. One possible fix would be to set the GL_TEXTURE_MIN/MAX_LOD parameters to limit sampling from the right level. But let's just use the _mesa_generate_mipmap() fallback since it's a lot faster than using the fragment shader interpreter. Fixes http://bugs.freedesktop.org/show_bug.cgi?id=54240 Note: This is a candidate for the 9.0 branch. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-01-29 08:59:53 -07:00
Brian Paul	d60da27273	st/mesa: set ctx->Const.MaxSamples = 0, not 1 The gallium docs for pipe_screen::is_format_supported() says that samples==0 or samples==1 both mean that multisampling is not supported. Return GL_MAX_SAMPLES==0 instead of 1 for consistency with other drivers. Note: This is a candidate for the 9.0 branch. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2013-01-29 08:59:53 -07:00
Brian Paul	4e41ae5fc1	xlib: stop use _mesa_enable_extension(), just set the boolean flags Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-01-29 08:59:53 -07:00
Brian Paul	becec657d6	xlib: fix incorrect GL_ANGLE_texture_compression_dxt enable Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-01-29 08:59:53 -07:00
José Fonseca	0ca384fb39	llvmpipe: Support Z16_UNORM as depth-stencil format. Simply by adjusting the vector element width after/before reading/writing the depth-stencil values. Ran several GL_DEPTH_COMPONENT16 piglit tests without regressions. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-01-29 07:06:36 +00:00
Kenneth Graunke	9add4e8038	i965: Add chipset limits for Haswell GT1/GT2. The maximum number of URB entries come from the 3DSTATE_URB_VS and 3DSTATE_URB_GS state packet documentation; the thread count information comes from the 3DSTATE_VS and 3DSTATE_PS state packet documentation. NOTE: This is a candidate for the 9.0 branch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>	2013-01-28 17:08:28 -08:00
Kenneth Graunke	7b07808f74	intel: Un-hardcode lengths from blitter commands. The packet length may change at some point in the future. Specifying it explicitly (rather than hardcoding it in the command #define) allows us to change it much more easily in the future. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-01-28 16:47:52 -08:00
Matt Turner	1b3ec16cc2	Remove APIspec.dtd Left behind by `a8ab7e33`.	2013-01-28 16:48:38 -08:00
Matt Turner	6324521789	docs: List new extensions added in Mesa 9.1 I did not list the *_get_program_binary extensions since they're not useful to anyone with their current implementation (that supports 0 binary formats).	2013-01-28 16:48:38 -08:00
Eric Anholt	99fe2b36cf	intel: Use a CPU map of the batch on LLC-sharing architectures. Before, we were keeping a CPU-only buffer to accumulate the batchbuffer in, which was an improvement over mapping the batch through the GTT directly (since any readback or other failure to stream through write combining correctly would hurt). However, on LLC-sharing architectures we can do better by mapping the batch directly, which reduces the cache footprint of the application since we no longer have this extra copy of a batchbuffer around. Improves performance of GLBenchmark 2.1 offscreen on IVB by 3.5% +/- 0.4% (n=21). Improves Lightsmark performance by 1.1 +/- 0.1% (n=76). Improves cairo-gl performance by 1.9% +/- 1.4% (n=57). No statistically significant difference in GLB2.1 on SNB (n=37). Improves cairo-gl performance by 2.1% +/- 0.1% (n=278).	2013-01-29 11:25:14 +11:00
Jerome Glisse	e1598cb642	r600g: use uint64_t instead of unsigned long for proper 32bits cpu support Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-01-28 19:09:52 -05:00
Jerome Glisse	da638781f6	r600g: real fix for non 3.8 kernel Signed-off-by: Jerome Glisse <jglisse@redhat.com>	2013-01-28 17:17:00 -05:00
Vinson Lee	1559994cba	i965: Fix assignment instead of comparison in asserts. Fixes side effect in assertion defects reported by Coverity. Note: This is a candidate for the 9.1 branch. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-01-28 13:51:10 -08:00
Tapani Pälli	407029591c	android: use gralloc_drm_get_gem_handle api Currently a gralloc internal structure is exposed to Mesa, Use a query function instead to maintain ABI compatibility. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-01-28 12:49:41 -08:00
Paul Berry	8e4bb4bc09	intel: Typo fix: "pitsh" -> "pitch" Comment change only.	2013-01-28 12:31:25 -08:00

3592 changed files with 379825 additions and 184515 deletions

									
										1

.dir-locals.el
									
												
				@@ -8,4 +8,5 @@

					    (c-set-offset 'innamespace '0)

					    (c-set-offset 'inline-open '0)))

				  )

				 (makefile-mode (indent-tabs-mode . t))

				 )

									
										3

Android.common.mk
									
												
				@@ -33,8 +33,11 @@ endif

				LOCAL_C_INCLUDES += \

					$(MESA_TOP)/include

				MESA_VERSION=$(shell cat $(MESA_TOP)/VERSION)

				# define ANDROID_VERSION (e.g., 4.0.x => 0x0400)

				LOCAL_CFLAGS += \

					-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \

					-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\" \

					-DANDROID_VERSION=0x0$(MESA_ANDROID_MAJOR_VERSION)0$(MESA_ANDROID_MINOR_VERSION)

				LOCAL_CFLAGS += \
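The `ANDROID_VERSION` define in this hunk is assembled by plain string concatenation of the major and minor version variables, following the comment's own example (4.0.x => 0x0400). A minimal sketch of that encoding in Python is below; the function name `android_version_macro` is hypothetical and exists only for illustration, and the single-digit restriction is an assumption that follows from the fixed `0x0<major>0<minor>` layout, not something the makefile checks.

```python
def android_version_macro(major, minor):
    """Mimic the makefile's 0x0<major>0<minor> string concatenation.

    Hypothetical helper: the makefile only pastes the two variables
    into the literal, so this sketch rejects values that would not
    fit the four-hex-digit layout.
    """
    if not (0 <= major <= 9 and 0 <= minor <= 9):
        raise ValueError("layout assumes single-digit major and minor")
    return "0x0%d0%d" % (major, minor)


if __name__ == "__main__":
    # Example from the makefile comment: 4.0.x => 0x0400
    print(android_version_macro(4, 0))
```

Because the value is a hexadecimal literal, C code can compare it numerically, for example `#if ANDROID_VERSION >= 0x0402`.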

									
										5

Android.mk
									
												
				@@ -24,7 +24,7 @@

				# BOARD_GPU_DRIVERS should be defined.  The valid values are

				#

				#   classic drivers: i915 i965

				-#   gallium drivers: swrast i915g nouveau r300g r600g radeonsi vmwgfx

				+#   gallium drivers: swrast i915g ilo nouveau r300g r600g radeonsi vmwgfx

				#

				# The main target is libGLES_mesa.  For each classic driver enabled, a DRI

				# module will also be built.  DRI modules will be loaded by libGLES_mesa.

				@@ -42,7 +42,7 @@ DRM_TOP := external/drm

				DRM_GRALLOC_TOP := hardware/drm_gralloc

				classic_drivers := i915 i965

				-gallium_drivers := swrast i915g nouveau r300g r600g radeonsi vmwgfx

				+gallium_drivers := swrast i915g ilo nouveau r300g r600g radeonsi vmwgfx

				MESA_GPU_DRIVERS := $(strip $(BOARD_GPU_DRIVERS))

				@@ -78,6 +78,7 @@ endif

				ifneq ($(strip $(MESA_GPU_DRIVERS)),)

				SUBDIRS := \

					src/loader \

					src/mapi \

					src/glsl \

					src/mesa \

									
										16

Makefile.am
									
												
				@@ -26,17 +26,10 @@ ACLOCAL_AMFLAGS = -I m4

				doxygen:

					cd doxygen && $(MAKE)

				check-local:

					$(MAKE) -C src/mapi/glapi/tests check

					$(MAKE) -C src/mapi/shared-glapi/tests check

					$(MAKE) -C src/mesa/main/tests check

					$(MAKE) -C src/glx/tests check

				.PHONY: doxygen

				# Rules for making release tarballs

				PACKAGE_VERSION=9.1-devel

				PACKAGE_DIR = Mesa-$(PACKAGE_VERSION)

				PACKAGE_NAME = MesaLib-$(PACKAGE_VERSION)

				@@ -52,18 +45,13 @@ EXTRA_FILES = \

					bin/ltmain.sh					\

					bin/missing					\

					bin/ylwrap					\

				        bin/test-driver					\

					src/glsl/glsl_parser.cpp			\

					src/glsl/glsl_parser.h				\

					src/glsl/glsl_lexer.cpp				\

					src/glsl/glcpp/glcpp-lex.c			\

					src/glsl/glcpp/glcpp-parse.c			\

					src/glsl/glcpp/glcpp-parse.h			\

					src/mesa/main/api_exec_es1.c			\

					src/mesa/main/api_exec_es1_dispatch.h		\

					src/mesa/main/api_exec_es1_remap_helper.h	\

					src/mesa/main/api_exec_es2.c			\

					src/mesa/main/api_exec_es2_dispatch.h		\

					src/mesa/main/api_exec_es2_remap_helper.h	\

					src/mesa/program/lex.yy.c			\

					src/mesa/program/program_parse.tab.c		\

					src/mesa/program/program_parse.tab.h		\

				@@ -76,7 +64,7 @@ IGNORE_FILES = \

				parsers: configure

					$(MAKE) -C src/glsl glsl_parser.cpp glsl_parser.h glsl_lexer.cpp glcpp/glcpp-lex.c glcpp/glcpp-parse.c glcpp/glcpp-parse.h

					-$(MAKE) -C src/mesa/program lex.yy.c program_parse.tab.c program_parse.tab.h

					+$(MAKE) -C src/mesa program/lex.yy.c program/program_parse.tab.c program/program_parse.tab.h

				# Everything for new a Mesa release:

				ARCHIVES = $(PACKAGE_NAME).tar.gz \

									
										20

SConstruct
									
												
				@@ -59,16 +59,16 @@ else:

				Help(opts.GenerateHelpText(env))

				# fail early for a common error on windows

				if env['gles']:

				    try:

				        import libxml2

				    except ImportError:

				        raise SCons.Errors.UserError, "GLES requires libxml2-python to build"

				#######################################################################

				# Environment setup

				with open("VERSION") as f:

				  mesa_version = f.read().strip()

				env.Append(CPPDEFINES = [

				    ('PACKAGE_VERSION', '\\"%s\\"' % mesa_version),

				    ('PACKAGE_BUGREPORT', '\\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\\"'),

				])

				# Includes

				env.Prepend(CPPPATH = [

					'#/include',

				@@ -80,9 +80,6 @@ env.Append(CPPPATH = [

					'#/src/gallium/winsys',

				])

				if env['msvc']:

				    env.Append(CPPPATH = ['#include/c99'])

				# for debugging

				#print env.Dump()

				@@ -115,9 +112,6 @@ if env['crosscompile'] and not env['embedded']:

				    host_env['hostonly'] = True

				    assert host_env['crosscompile'] == False

				    if host_env['msvc']:

				        host_env.Append(CPPPATH = ['#include/c99'])

				    target_env = env

				    env = host_env

				    Export('env')

1

VERSION Normal file


				`@@ -0,0 +1 @@`
				`10.2.0-devel`

									
										52

bin/bugzilla_mesa.sh
									
										Executable file
									
												
				@@ -0,0 +1,52 @@

				#!/bin/bash

				# This script is used to generate the list of fixed bugs that

				# appears in the release notes files, with HTML formatting.

				#

				# Note: This script could take a while until all details have

				#       been fetched from bugzilla.

				#

				# Usage examples:

				#

				# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3

				# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 > bugfixes

				# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee bugfixes

				# $ DRYRUN=yes bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3

				# $ DRYRUN=yes bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 | wc -l

				# regex pattern: trim before url

				trim_before='s/.*\(http\)/\1/'

				# regex pattern: trim after url

				trim_after='s/\(show_bug.cgi?id=[0-9]*\).*/\1/'

				# regex pattern: always use https

				use_https='s/http:/https:/'

				# extract fdo urls from commit log

				urls=$(git log $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before -e $trim_after -e $use_https | sort | uniq)

				# if DRYRUN is set to "yes", simply print the URLs and don't fetch the

				# details from fdo bugzilla.

				#DRYRUN=yes

				if [ "x$DRYRUN" = xyes ]; then

					for i in $urls

					do

						echo $i

					done

				else

					echo "<ul>"

					echo ""

					for i in $urls

					do

						id=$(echo $i | cut -d'=' -f2)

						summary=$(wget --quiet -O - $i | grep -e '<title>.*</title>' | sed -e 's/ *<title>Bug [0-9]\+ &ndash; \(.*\)<\/title>/\1/')

						echo "<li><a href=\"$i\">Bug $id</a> - $summary</li>"

						echo ""

					done

					echo "</ul>"

				fi
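The URL extraction step of this script (grep, three `sed` substitutions, then `sort | uniq`) can be sketched in Python as follows. This is an approximation under stated assumptions, not a port of the script: the function name `extract_bug_urls` is hypothetical, the substring test stands in for the `grep` pattern, and Python's `sorted` may order entries differently from a locale-aware `sort`.

```python
import re


def extract_bug_urls(log_text):
    """Collect unique freedesktop.org bug URLs from commit log text.

    Hypothetical helper mirroring the shell pipeline's three
    substitutions: trim_before, trim_after and use_https.
    """
    urls = set()
    for line in log_text.splitlines():
        # Stand-in for: grep 'bugs.freedesktop.org/show_bug'
        if "bugs.freedesktop.org/show_bug" not in line:
            continue
        # trim_before: drop everything ahead of the last "http"
        line = re.sub(r".*(http)", r"\1", line, count=1)
        # trim_after: drop everything after the numeric bug id
        line = re.sub(r"(show_bug.cgi\?id=[0-9]*).*", r"\1", line, count=1)
        # use_https: normalise the scheme
        line = re.sub(r"http:", "https:", line, count=1)
        urls.add(line)
    # Approximates: sort | uniq
    return sorted(urls)


if __name__ == "__main__":
    sample = (
        "    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59364\n"
        "    See also http://bugs.freedesktop.org/show_bug.cgi?id=59873 Note\n"
    )
    for url in extract_bug_urls(sample):
        print(url)
```

Normalising to `https` before de-duplicating means the same bug referenced with both schemes is listed once.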

									
										8

bin/get-pick-list.sh
									
												
				@@ -1,6 +1,12 @@

				#!/bin/sh

				# Script for generating a list of candidates for cherry-picking to a stable branch

				#

				# Usage examples:

				#

				# $ bin/get-pick-list.sh

				# $ bin/get-pick-list.sh > picklist

				# $ bin/get-pick-list.sh | tee picklist

				# Grep for commits with "cherry picked from commit" in the commit message.

				git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\

				@@ -8,7 +14,7 @@ git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\

					sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked

				# Grep for commits that were marked as a candidate for the stable tree.

				-git log --reverse --pretty=%H -i --grep='^[[:space:]]*NOTE: This is a candidate' HEAD..origin/master |\

				+git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)' HEAD..origin/master |\

				while read sha

				do

					# Check to see whether the patch is on the ignore list.
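The widened `--grep` pattern in this hunk accepts either a `NOTE: ... candidate` line or a `CC:` line naming mesa-stable, case-insensitively because of `-i`. A rough Python equivalent is sketched below for experimenting with commit messages. The names `CANDIDATE_RE` and `is_stable_candidate` are hypothetical, `\s` only approximates `[[:space:]]`, and testing each line separately is an assumption about how `git log --grep` applies the `^` anchor.

```python
import re

# Approximate translation of the new --grep pattern (BRE -> Python syntax).
CANDIDATE_RE = re.compile(
    r"^(\s*NOTE: .*[Cc]andidate|CC:.*mesa-stable)",
    re.IGNORECASE,
)


def is_stable_candidate(message):
    """Return True if any line of the commit message marks a stable candidate."""
    return any(CANDIDATE_RE.search(line) for line in message.splitlines())


if __name__ == "__main__":
    msg = "r600g: fix htile buffer leak\n\nNOTE: This is a candidate for the 9.1 branch."
    print(is_stable_candidate(msg))
```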

251

bin/perf-annotate-jit Executable file


@@ -0,0 +1,251 @@
 #!/usr/bin/env python
 #
 # Copyright 2012 VMware Inc
 # Copyright 2008-2009 Jose Fonseca
 #
 # Permission is hereby granted, free of charge, to any person obtaining a copy
 # of this software and associated documentation files (the "Software"), to deal
 # in the Software without restriction, including without limitation the rights
 # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 # copies of the Software, and to permit persons to whom the Software is
 # furnished to do so, subject to the following conditions:
 #
 # The above copyright notice and this permission notice shall be included in
 # all copies or substantial portions of the Software.
 #
 # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 # THE SOFTWARE.
 #
 """Perf annotate for JIT code.
 Linux `perf annotate` does not work with JIT code.  This script takes the data
 produced by `perf script` command, plus the diassemblies outputed by gallivm
 into /tmp/perf-XXXXX.map.asm and produces output similar to `perf annotate`.
 See docs/llvmpipe.html for usage instructions.
 The `perf script` output parser was derived from the gprof2dot.py script.
 """
 import sys
 import os.path
 import re
 import optparse
 import subprocess
 class Parser:
     """Parser interface."""
     def __init__(self):
         pass
     def parse(self):
         raise NotImplementedError
 class LineParser(Parser):
     """Base class for parsers that read line-based formats."""
     def __init__(self, file):
         Parser.__init__(self)
         self._file = file
         self.__line = None
         self.__eof = False
         self.line_no = 0
     def readline(self):
         line = self._file.readline()
         if not line:
             self.__line = ''
             self.__eof = True
         else:
             self.line_no += 1
         self.__line = line.rstrip('\r\n')
     def lookahead(self):
         assert self.__line is not None
         return self.__line
     def consume(self):
         assert self.__line is not None
         line = self.__line
         self.readline()
         return line
     def eof(self):
         assert self.__line is not None
         return self.__eof
 mapFile = None
 def lookupMap(filename, matchSymbol):
     global mapFile
     mapFile = filename
     stream = open(filename, 'rt')
     for line in stream:
         start, length, symbol = line.split()
         start = int(start, 16)
         length = int(length,16)
         if symbol == matchSymbol:
             return start
     return None
 def lookupAsm(filename, desiredFunction):
     stream = open(filename + '.asm', 'rt')
     while stream.readline() != desiredFunction + ':\n':
         pass
     asm = []
     line = stream.readline().strip()
     while line:
         addr, instr = line.split(':', 1)
         addr = int(addr)
         asm.append((addr, instr))
         line = stream.readline().strip()
     return asm
 samples = {}
 class PerfParser(LineParser):
     """Parser for linux perf callgraph output.
     It expects output generated with
         perf record -g
         perf script
     """
     def __init__(self, infile, symbol):
         LineParser.__init__(self, infile)
 	self.symbol = symbol
     def readline(self):
         # Override LineParser.readline to ignore comment lines
         while True:
             LineParser.readline(self)
             if self.eof() or not self.lookahead().startswith('#'):
                 break
     def parse(self):
         # read lookahead
         self.readline()
         while not self.eof():
             self.parse_event()
         asm = lookupAsm(mapFile, self.symbol)
         addresses = samples.keys()
         addresses.sort()
         total_samples = 0
 	sys.stdout.write('%s:\n' % self.symbol)
         for address, instr in asm:
             try:
                 sample = samples.pop(address)
             except KeyError:
                 sys.stdout.write(6*' ')
             else:
                 sys.stdout.write('%6u' % (sample))
                 total_samples += sample
             sys.stdout.write('%6u: %s\n' % (address, instr))
         print 'total:', total_samples
         assert len(samples) == 0
         sys.exit(0)
     def parse_event(self):
         if self.eof():
             return
         line = self.consume()
         assert line
         callchain = self.parse_callchain()
         if not callchain:
             return
     def parse_callchain(self):
         callchain = []
         while self.lookahead():
             function = self.parse_call(len(callchain) == 0)
             if function is None:
                 break
             callchain.append(function)
         if self.lookahead() == '':
             self.consume()
         return callchain
     call_re = re.compile(r'^\s+(?P<address>[0-9a-fA-F]+)\s+(?P<symbol>.*)\s+\((?P<module>[^)]*)\)$')
     def parse_call(self, first):
         line = self.consume()
         mo = self.call_re.match(line)
         assert mo
         if not mo:
             return None
         if not first:
             return None
         function_name = mo.group('symbol')
         if not function_name:
             function_name = mo.group('address')
         module = mo.group('module')
         function_id = function_name + ':' + module
         address = mo.group('address')
         address = int(address, 16)
         if function_name != self.symbol:
             return None
         start_address = lookupMap(module, function_name)
         address -= start_address
         #print function_name, module, address
         samples[address] = samples.get(address, 0) + 1
         return True
 def main():
     """Main program."""
     optparser = optparse.OptionParser(
         usage="\n\t%prog [options] symbol_name")
     (options, args) = optparser.parse_args(sys.argv[1:])
     if len(args) != 1:
         optparser.error('wrong number of arguments')
     symbol = args[0]
     p = subprocess.Popen(['perf', 'script'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
     parser = PerfParser(p.stdout, symbol)
     parser.parse()
 if __name__ == '__main__':
     main()
 # vim: set sw=4 et:
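The core of the script above is mapping a sampled address from a `perf script` callchain line to an offset inside a JIT function, using the start address recorded in the perf map file. The following Python 3 sketch isolates that step. It reuses the `call_re` pattern from the script; the helper names `parse_map` and `sample_offset` and the sample addresses are hypothetical, and the three-column hexadecimal map layout is taken from what `lookupMap` parses.

```python
import re

# Same pattern as call_re in the script above.
CALL_RE = re.compile(
    r'^\s+(?P<address>[0-9a-fA-F]+)\s+(?P<symbol>.*)\s+\((?P<module>[^)]*)\)$'
)


def parse_map(text):
    """Parse 'start length symbol' lines (hex numbers) into a dict."""
    symbols = {}
    for line in text.splitlines():
        start, length, symbol = line.split()
        symbols[symbol] = (int(start, 16), int(length, 16))
    return symbols


def sample_offset(call_line, symbols):
    """Return the sample's offset within its function, or None if unknown."""
    mo = CALL_RE.match(call_line)
    if mo is None:
        return None
    entry = symbols.get(mo.group('symbol'))
    if entry is None:
        return None
    start, _length = entry
    return int(mo.group('address'), 16) - start


if __name__ == "__main__":
    # Made-up addresses and symbol names, for illustration only.
    symbols = parse_map("7f3a1c001000 200 fs_variant_0\n")
    line = "    7f3a1c0012ab fs_variant_0 (/tmp/perf-1234.map)"
    print(sample_offset(line, symbols))
```

Parsing the map once into a dictionary avoids reopening the file for every sample, which the script's `lookupMap` does on each call.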

									
										6

bin/shortlog_mesa.sh
									
												
				@@ -2,6 +2,12 @@

				# This script is used to generate the list of changes that

				# appears in the release notes files, with HTML formatting.

				#

				# Usage examples:

				#

				# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3

				# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3 > changes

				# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee changes

				typeset -i in_log=0

									
										3

common.py

@@ -91,6 +91,7 @@ def AddOptions(opts):
 	opts.Add(EnumOption('platform', 'target platform', host_platform,
 											 allowed_values=('cygwin', 'darwin', 'freebsd', 'haiku', 'linux', 'sunos', 'windows')))
 	opts.Add(BoolOption('embedded', 'embedded build', 'no'))
 	opts.Add(BoolOption('analyze', 'enable static code analysis where available', 'no'))
 	opts.Add('toolchain', 'compiler toolchain', default_toolchain)
 	opts.Add(BoolOption('gles', 'EXPERIMENTAL: enable OpenGL ES support', 'no'))
 	opts.Add(BoolOption('llvm', 'use LLVM', default_llvm))
@@ -100,4 +101,4 @@ def AddOptions(opts):
 	opts.Add(BoolOption('quiet', 'DEPRECATED: profile build', 'yes'))
 	opts.Add(BoolOption('texture_float', 'enable floating-point textures and renderbuffers', 'no'))
 	if host_platform == 'windows':
 		opts.Add(EnumOption('MSVS_VERSION', 'MS Visual C++ version', None, allowed_values=('7.1', '8.0', '9.0')))
 		opts.Add('MSVC_VERSION', 'Microsoft Visual C/C++ version')

1354

configure.ac

File diff suppressed because it is too large

272

docs/GL3.txt

@@ -7,153 +7,189 @@ infrastructure is complete but it may be the case that few (if any) drivers
 implement the features.
 OpenGL Core and Compatibility context support
 OpenGL 3.1 and later versions are only supported with the Core profile.
 There are no plans to support GL_ARB_compatibility. The last supported OpenGL
 version with all deprecated features is 3.0. Some of the later GL features
 are exposed in the 3.0 context as extensions.
 Feature                                               Status
 ----------------------------------------------------- ------------------------
 GL 3.0:
 GL 3.0 --- all DONE: i965, nv50, nvc0, r600, radeonsi
 GLSL 1.30                                             DONE
 glBindFragDataLocation, glGetFragDataLocation         DONE
 Conditional rendering (GL_NV_conditional_render)      DONE (i965, r300, r600, swrast)
 Map buffer subranges (GL_ARB_map_buffer_range)        DONE (i965, r300, r600, swrast)
 Clamping controls (GL_ARB_color_buffer_float)         DONE (i965, r300, r600)
 Float textures, renderbuffers (GL_ARB_texture_float)  DONE (i965, r300, r600)
 GL_EXT_packed_float                                   DONE (i965, r600)
 GL_EXT_texture_shared_exponent                        DONE (i965, r600, swrast)
 Float depth buffers (GL_ARB_depth_buffer_float)       DONE (i965, r600)
 Framebuffer objects (GL_ARB_framebuffer_object)       DONE (i965, r300, r600, swrast)
 Half-float                                            DONE
 Non-normalized Integer texture/framebuffer formats    DONE (i965, r600)
 1D/2D Texture arrays                                  DONE
 Per-buffer blend and masks (GL_EXT_draw_buffers2)     DONE (i965, r600, swrast)
 GL_EXT_texture_compression_rgtc                       DONE (i965, r300, r600, swrast)
 Red and red/green texture formats                     DONE (i965, swrast, gallium)
 Transform feedback (GL_EXT_transform_feedback)        DONE (i965, r600)
 Vertex array objects (GL_APPLE_vertex_array_object)   DONE (i965, r300, r600, swrast)
 sRGB framebuffer format (GL_EXT_framebuffer_sRGB)     DONE (i965, r600)
 glClearBuffer commands                                DONE
 glGetStringi command                                  DONE
 glTexParameterI, glGetTexParameterI commands          DONE
 glVertexAttribI commands                              DONE
 Depth format cube textures                            DONE
 GLX_ARB_create_context (GLX 1.4 is required)          DONE
   GLSL 1.30                                             DONE ()
   glBindFragDataLocation, glGetFragDataLocation         DONE
   Conditional rendering (GL_NV_conditional_render)      DONE (r300, swrast)
   Map buffer subranges (GL_ARB_map_buffer_range)        DONE (r300, swrast)
   Clamping controls (GL_ARB_color_buffer_float)         DONE (r300)
   Float textures, renderbuffers (GL_ARB_texture_float)  DONE (r300)
   GL_EXT_packed_float                                   DONE ()
   GL_EXT_texture_shared_exponent                        DONE (swrast)
   Float depth buffers (GL_ARB_depth_buffer_float)       DONE ()
   Framebuffer objects (GL_ARB_framebuffer_object)       DONE (r300, swrast)
   GL_ARB_half_float_pixel                               DONE (all drivers)
   GL_ARB_half_float_vertex                              DONE (r300, swrast)
   GL_EXT_texture_integer                                DONE ()
   GL_EXT_texture_array                                  DONE ()
   Per-buffer blend and masks (GL_EXT_draw_buffers2)     DONE (swrast)
   GL_EXT_texture_compression_rgtc                       DONE (r300, swrast)
   GL_ARB_texture_rg                                     DONE (r300, swrast)
   Transform feedback (GL_EXT_transform_feedback)        DONE ()
   Vertex array objects (GL_ARB_vertex_array_object)     DONE (all drivers)
   sRGB framebuffer format (GL_EXT_framebuffer_sRGB)     DONE ()
   glClearBuffer commands                                DONE
   glGetStringi command                                  DONE
   glTexParameterI, glGetTexParameterI commands          DONE
   glVertexAttribI commands                              DONE
   Depth format cube textures                            DONE ()
   GLX_ARB_create_context (GLX 1.4 is required)          DONE
   Multisample anti-aliasing                             DONE (r300)
 GL 3.1:
 GL 3.1 --- all DONE: i965, nv50, nvc0, r600, radeonsi
 GLSL 1.40                                             DONE (i965, r600)
 Forward compatibile context support/deprecations      DONE (i965, r600)
 Instanced drawing (GL_ARB_draw_instanced)             DONE (i965, gallium, swrast)
 Buffer copying (GL_ARB_copy_buffer)                   DONE (i965, r300, r600, swrast)
 Primitive restart (GL_NV_primitive_restart)           DONE (i965, r600)
 vertex texture image units                         DONE
 Texture buffer objs (GL_ARB_texture_buffer_object)    DONE for OpenGL 3.1 contexts (i965, r600)
 Rectangular textures (GL_ARB_texture_rectangle)       DONE (i965, r300, r600, swrast)
 Uniform buffer objs (GL_ARB_uniform_buffer_object)    DONE (i965, r600, swrast)
 Signed normalized textures (GL_EXT_texture_snorm)     DONE (i965, r300, r600)
   GLSL 1.40                                             DONE ()
   Forward compatible context support/deprecations       DONE ()
   Instanced drawing (GL_ARB_draw_instanced)             DONE (swrast)
   Buffer copying (GL_ARB_copy_buffer)                   DONE (r300, swrast)
   Primitive restart (GL_NV_primitive_restart)           DONE (r300)
 vertex texture image units                         DONE ()
   Texture buffer objs (GL_ARB_texture_buffer_object)    DONE for OpenGL 3.1 contexts ()
   Rectangular textures (GL_ARB_texture_rectangle)       DONE (r300, swrast)
   Uniform buffer objs (GL_ARB_uniform_buffer_object)    DONE (swrast)
   Signed normalized textures (GL_EXT_texture_snorm)     DONE (r300)
 GL 3.2:
 GL 3.2 --- all DONE: i965, nv50, nvc0, r600, radeonsi
 Core/compatibility profiles                           DONE
 GLSL 1.50                                             not started
 Geometry shaders (GL_ARB_geometry_shader4)            partially done (Zack)
 BGRA vertex order (GL_ARB_vertex_array_bgra)          DONE (i965, r300, r600, swrast)
 Base vertex offset(GL_ARB_draw_elements_base_vertex)  DONE (i965, r300, r600, swrast)
 Frag shader coord (GL_ARB_fragment_coord_conventions) DONE (i965, r300, r600, swrast)
 Provoking vertex (GL_ARB_provoking_vertex)            DONE (i965, r300, r600, swrast)
 Seamless cubemaps (GL_ARB_seamless_cube_map)          DONE (i965, r600)
 Multisample textures (GL_ARB_texture_multisample)     not started
 Frag depth clamp (GL_ARB_depth_clamp)                 DONE (i965, r600, swrast)
 Fence objects (GL_ARB_sync)                           DONE (i965, r300, r600, swrast)
 GLX_ARB_create_context_profile                        DONE
   Core/compatibility profiles                           DONE
   GLSL 1.50                                             DONE ()
   Geometry shaders                                      DONE ()
   BGRA vertex order (GL_ARB_vertex_array_bgra)          DONE (r300, swrast)
   Base vertex offset(GL_ARB_draw_elements_base_vertex)  DONE (r300, swrast)
   Frag shader coord (GL_ARB_fragment_coord_conventions) DONE (r300, swrast)
   Provoking vertex (GL_ARB_provoking_vertex)            DONE (r300, swrast)
   Seamless cubemaps (GL_ARB_seamless_cube_map)          DONE ()
   Multisample textures (GL_ARB_texture_multisample)     DONE ()
   Frag depth clamp (GL_ARB_depth_clamp)                 DONE (swrast)
   Fence objects (GL_ARB_sync)                           DONE (r300, swrast)
   GLX_ARB_create_context_profile                        DONE
 GL 3.3:
 GL 3.3 --- all DONE: i965, nv50, nvc0, r600, radeonsi
 GLSL 3.30                                             new features in this version pretty much done
 GL_ARB_blend_func_extended                            DONE (i965, r600, softpipe)
 GL_ARB_explicit_attrib_location                       DONE (i915, i965, r300, r600, swrast)
 GL_ARB_occlusion_query2                               DONE (i965, r300, r600, swrast)
 GL_ARB_sampler_objects                                DONE (i965, r300, r600)
 GL_ARB_shader_bit_encoding                            DONE
 GL_ARB_texture_rgb10_a2ui                             DONE (i965, r600)
 GL_ARB_texture_swizzle                                DONE (same as EXT version) (i965, r300, r600, swrast)
 GL_ARB_timer_query                                    DONE (i965, r600)
 GL_ARB_instanced_arrays                               DONE (i965, r300, r600)
 GL_ARB_vertex_type_2_10_10_10_rev                     DONE (i965, r600)
   GLSL 3.30                                             DONE ()
   GL_ARB_blend_func_extended                            DONE (softpipe)
   GL_ARB_explicit_attrib_location                       DONE (all drivers that support GLSL)
   GL_ARB_occlusion_query2                               DONE (r300, swrast)
   GL_ARB_sampler_objects                                DONE (all drivers)
   GL_ARB_shader_bit_encoding                            DONE ()
   GL_ARB_texture_rgb10_a2ui                             DONE ()
   GL_ARB_texture_swizzle                                DONE (r300, swrast)
   GL_ARB_timer_query                                    DONE ()
   GL_ARB_instanced_arrays                               DONE (r300)
   GL_ARB_vertex_type_2_10_10_10_rev                     DONE ()
 GL 4.0:
 GLSL 4.0                                             not started
 GL_ARB_texture_query_lod                             not started
 GL_ARB_draw_buffers_blend                            DONE (i965, r600, softpipe)
 GL_ARB_draw_indirect                                 not started
 GL_ARB_gpu_shader5                                   not started
 GL_ARB_gpu_shader_fp64                               not started
 GL_ARB_sample_shading                                not started
 GL_ARB_shader_subroutine                             not started
 GL_ARB_tessellation_shader                           not started
 GL_ARB_texture_buffer_object_rgb32                   DONE (i965, softpipe)
 GL_ARB_texture_cube_map_array                        DONE (i965, softpipe)
 GL_ARB_texture_gather                                not started
 GL_ARB_transform_feedback2                           DONE
 GL_ARB_transform_feedback3                           DONE
   GLSL 4.0                                             not started
   GL_ARB_texture_query_lod                             DONE (i965, nv50, nvc0)
   GL_ARB_draw_buffers_blend                            DONE (i965, nv50, nvc0, r600, radeonsi, softpipe)
   GL_ARB_draw_indirect                                 DONE (i965)
   GL_ARB_gpu_shader5                                   started
   - 'precise' qualifier                                not started
   - Dynamically uniform sampler array indices          not started
   - Dynamically uniform UBO array indices              not started
   - Implicit signed -> unsigned conversions            not started
   - Fused multiply-add                                 DONE
   - Packing/bitfield/conversion functions              DONE
   - Enhanced textureGather                             DONE
   - Geometry shader instancing                         DONE
   - Geometry shader multiple streams                   not started
   - Enhanced per-sample shading                        DONE
   - Interpolation functions                            started
   - New overload resolution rules                      not started
   GL_ARB_gpu_shader_fp64                               not started
   GL_ARB_sample_shading                                DONE (i965, nv50, nvc0)
   GL_ARB_shader_subroutine                             not started
   GL_ARB_tessellation_shader                           not started
   GL_ARB_texture_buffer_object_rgb32                   DONE (i965, nvc0, r600, radeonsi, softpipe)
   GL_ARB_texture_cube_map_array                        DONE (i965, nv50, nvc0, r600, softpipe)
   GL_ARB_texture_gather                                DONE (i965, nv50, nvc0)
   GL_ARB_transform_feedback2                           DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_transform_feedback3                           DONE (i965, nv50, nvc0, r600, radeonsi)
 GL 4.1:
 GLSL 4.1                                             not started
 GL_ARB_ES2_compatibility                             DONE (i965, r300, r600)
 GL_ARB_get_program_binary                            not started
 GL_ARB_separate_shader_objects                       some infrastructure done
 GL_ARB_shader_precision                              not started
 GL_ARB_vertex_attrib_64bit                           not started
 GL_ARB_viewport_array                                not started
   GLSL 4.1                                             not started
   GL_ARB_ES2_compatibility                             DONE (i965, nv50, nvc0, r300, r600, radeonsi)
   GL_ARB_get_program_binary                            DONE (0 binary formats)
   GL_ARB_separate_shader_objects                       DONE (all drivers)
   GL_ARB_shader_precision                              not started
   GL_ARB_vertex_attrib_64bit                           not started
   GL_ARB_viewport_array                                DONE (i965, nv50, r600)
 GL 4.2:
 GLSL 4.2                                             not started
 GL_ARB_texture_compression_bptc                      not started
 GL_ARB_compressed_texture_pixel_storage              not started
 GL_ARB_shader_atomic_counters                        not started
 GL_ARB_texture_storage                               DONE (r300, r600, swrast, gallium)
 GL_ARB_transform_feedback_instanced                  DONE
 GL_ARB_base_instance                                 DONE (i965, nv50, nvc0, r600, radeonsi)
 GL_ARB_shader_image_load_store                       not started
 GL_ARB_conservative_depth                            DONE (softpipe)
 GL_ARB_shading_language_420pack                      not started
 GL_ARB_internalformat_query                          not started
 GL_ARB_map_buffer_alignment                          DONE (r300, r600, radeonsi)
   GLSL 4.2                                             not started
   GL_ARB_texture_compression_bptc                      not started
   GL_ARB_compressed_texture_pixel_storage              not started
   GL_ARB_shader_atomic_counters                        DONE (i965)
   GL_ARB_texture_storage                               DONE (all drivers)
   GL_ARB_transform_feedback_instanced                  DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_base_instance                                 DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_shader_image_load_store                       in progress (curro)
   GL_ARB_conservative_depth                            DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_420pack                      DONE (all drivers that support GLSL 1.30)
   GL_ARB_internalformat_query                          DONE (i965, nv50, nvc0, r300, r600, radeonsi)
   GL_ARB_map_buffer_alignment                          DONE (all drivers)
 GL 4.3:
 GLSL 4.3                                             not started
 ARB_arrays_of_arrays                                 not started
 ARB_ES3_compatibility                                not started
 ARB_clear_buffer_object                              not started
 ARB_compute_shader                                   started (gallium)
 ARB_copy_image                                       not started
 KHR_debug                                            some work done (ARB_debug_output)
 ARB_explicit_uniform_location                        not started
 ARB_fragment_layer_viewport                          not started
 ARB_framebuffer_no_attachments                       not started
 ARB_internalformat_query2                            not started
 ARB_invalidate_subdata                               not started
 ARB_multi_draw_indirect                              not started
 ARB_program_interface_query                          not started
 ARB_robust_buffer_access_behavior                    not started
 ARB_shader_image_size                                not started
 ARB_shader_storage_buffer_object                     not started
 ARB_stencil_texturing                                not started
 ARB_texture_buffer_range                             not started
 ARB_texture_query_levels                             not started
 ARB_texture_storage_multisample                      not started
 ARB_texture_view                                     not started
 ARB_vertex_attrib_binding                            not started
   GLSL 4.3                                             not started
   GL_ARB_arrays_of_arrays                              started
   GL_ARB_ES3_compatibility                             DONE (i965)
   GL_ARB_clear_buffer_object                           DONE (all drivers)
   GL_ARB_compute_shader                                started (Paul Berry)
   GL_ARB_copy_image                                    not started
   GL_KHR_debug                                         DONE (all drivers)
   GL_ARB_explicit_uniform_location                     not started
   GL_ARB_fragment_layer_viewport                       not started
   GL_ARB_framebuffer_no_attachments                    not started
   GL_ARB_internalformat_query2                         not started
   GL_ARB_invalidate_subdata                            DONE (all drivers)
   GL_ARB_multi_draw_indirect                           DONE (i965)
   GL_ARB_program_interface_query                       not started
   GL_ARB_robust_buffer_access_behavior                 not started
   GL_ARB_shader_image_size                             not started
   GL_ARB_shader_storage_buffer_object                  not started
   GL_ARB_stencil_texturing                             DONE (i965/gen8+)
   GL_ARB_texture_buffer_range                          DONE (nv50, nvc0, i965, r600, radeonsi)
   GL_ARB_texture_query_levels                          DONE (i965)
   GL_ARB_texture_storage_multisample                   DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_texture_view                                  DONE (i965)
   GL_ARB_vertex_attrib_binding                         DONE (all drivers)
 GL 4.4:
   GLSL 4.4                                             not started
   GL_MAX_VERTEX_ATTRIB_STRIDE                          not started
   GL_ARB_buffer_storage                                DONE (i965, nv30, nv50, nvc0, r300, r600, radeonsi)
   GL_ARB_clear_texture                                 not started
   GL_ARB_enhanced_layouts                              not started
   GL_ARB_multi_bind                                    DONE (all drivers)
   GL_ARB_query_buffer_object                           not started
   GL_ARB_texture_mirror_clamp_to_edge                  DONE (i965, nv30, nv50, nvc0, r300, r600, radeonsi, swrast)
   GL_ARB_texture_stencil8                              not started
   GL_ARB_vertex_type_10f_11f_11f_rev                   DONE (i965, nv50, nvc0, r600, radeonsi)
 More info about these features and the work involved can be found at

256

docs/README.CYGWIN

@@ -1,256 +0,0 @@
                           Mesa Cygwin/X11 Information
 WARNING
 =======
 If you installed X11 (packages xorg-x11-devel and xorg-x11-bin-dlls ) with the
 latest setup.exe from Cygwin the GL (Mesa) libraries and include are already
 installed in /usr/X11R6.
 The following will explain how to "replace" them.
 Installation
 ============
 How to compile Mesa on Cygwin/X11 systems:
 . Shared libs:
     type 'make cygwin-sl'.
     When finished, the Mesa DLL will be in the Mesa-x.y/lib/ and
     Mesa-x.y/bin directories.
 . Static libs:
     type 'make cygwin-static'.
     When finished, the Mesa libraries will be in the Mesa-x.y/lib/ directory.
 Header and library files:
    After you've compiled Mesa and tried the demos I recommend the following
    procedure for "installing" Mesa.
    Copy the Mesa include/GL directory to /usr/X11R6/include:
 	cp -a include/GL /usr/X11R6/include
    Copy the Mesa library files to /usr/X11R6/lib:
 	cp -a lib/* /usr/X11R6ocal/lib
    Copy the Mesa bin files (used by the DLL stuff) to /usr/X11R6/bin:
 	cp -a lib/cyg* /usr/X11R6/bin
 Xt/Motif widgets:
    If you want to use Mesa or OpenGL in your Xt/Motif program you can build
    the widgets found in either the widgets-mesa or widgets-sgi directories.
    The former were written for Mesa and the later are the original SGI
    widgets.  Look in those directories for more information.
    For the Motif widgets you must have downloaded the lesstif package.
 Using the library
 =================
 Configuration options:
    The file src/mesa/main/config.h has many parameters which you can adjust
    such as maximum number of lights, clipping planes, maximum texture size,
    etc.  In particular, you may want to change DEPTH_BITS from 16 to 32
    if a 16-bit depth buffer isn't precise enough for your application.
 Shared libraries:
    If you compile shared libraries (Win32 DLLS) you may have to set an
    environment variable to specify where the Mesa libraries are located.
    Set the PATH variable to include /your-dir/Mesa-2.6/bin.
    Otherwise, when you try to run a demo it may fail with a message saying
    that one or more DLL couldn't be found.
 Xt/Motif Widgets:
    Two versions of the Xt/Motif OpenGL drawing area widgets are included:
       widgets-sgi/	SGI's stock widgets
       widgets-mesa/	Mesa-tuned widgets
    Look in those directories for details
 Togl:
    Togl is an OpenGL/Mesa widget for Tcl/Tk.
    See http://togl.sourceforge.net for more information.
 X Display Modes:
    Mesa supports RGB(A) rendering into almost any X visual type and depth.
    The glXChooseVisual function tries its best to pick an appropriate visual
    for the given attribute list.  However, if this doesn't suit your needs
    you can force Mesa to use any X visual you want (any supported by your
    X server that is) by setting the MESA_RGB_VISUAL and MESA_CI_VISUAL
    environment variables.  When an RGB visual is requested, glXChooseVisual
    will first look if the MESA_RGB_VISUAL variable is defined.  If so, it
    will try to use the specified visual.  Similarly, when a color index
    visual is requested, glXChooseVisual will look for the MESA_CI_VISUAL
    variable.
    The format of accepted values is:  <visual-class> <depth>
    Here are some examples:
    using the C-shell:
 	% setenv MESA_RGB_VISUAL "TrueColor 8"		// 8-bit TrueColor
 	% setenv MESA_CI_VISUAL "PseudoColor 12"	// 12-bit PseudoColor
 	% setenv MESA_RGB_VISUAL "PseudoColor 8"	// 8-bit PseudoColor
    using the KornShell:
 	$ export MESA_RGB_VISUAL="TrueColor 8"
 	$ export MESA_CI_VISUAL="PseudoColor 12"
 	$ export MESA_RGB_VISUAL="PseudoColor 8"
 Double buffering:
    Mesa can use either an X Pixmap or XImage as the backbuffer when in
    double buffer mode.  Using GLX, the default is to use an XImage.  The
    MESA_BACK_BUFFER environment variable can override this.  The valid
    values for MESA_BACK_BUFFER are:  Pixmap and XImage (only the first
    letter is checked, case doesn't matter).
    A pixmap is faster when drawing simple lines and polygons while an
    XImage is faster when Mesa has to do pixel-by-pixel rendering.  If you
    need depth buffering the XImage will almost surely be faster.  Exper-
    iment with the MESA_BACK_BUFFER variable to see which is faster for
    your application.
 Colormaps:
    When using Mesa directly or with GLX, it's up to the application writer
    to create a window with an appropriate colormap.  The aux, tk, and GLUT
    toolkits try to minimize colormap "flashing" by sharing colormaps when
    possible.  Specifically, if the visual and depth of the window matches
    that of the root window, the root window's colormap will be shared by
    the Mesa window.  Otherwise, a new, private colormap will be allocated.
    When sharing the root colormap, Mesa may be unable to allocate the colors
    it needs, resulting in poor color quality.  This can happen when a
    large number of colorcells in the root colormap are already allocated.
    To prevent colormap sharing in aux, tk and GLUT, define the environment
    variable MESA_PRIVATE_CMAP.  The value isn't significant.
 Gamma correction:
    To compensate for the nonlinear relationship between pixel values
    and displayed intensities, there is a gamma correction feature in
    Mesa.  Some systems, such as Silicon Graphics, support gamma
    correction in hardware (man gamma) so you won't need to use Mesa's
    gamma facility.  Other systems, however, may need gamma adjustment
    to produce images which look correct.  If in the past you thought
    Mesa's images were too dim, read on.
    Gamma correction is controlled with the MESA_GAMMA environment
    variable.  Its value is of the form "Gr Gg Gb" or just "G" where
    Gr is the red gamma value, Gg is the green gamma value, Gb is the
    blue gamma value and G is one gamma value to use for all three
    channels.  Each value is a positive real number typically in the
    range 1.0 to 2.5.  The defaults are all 1.0, effectively disabling
    gamma correction.  Examples using csh:
 	% setenv MESA_GAMMA "2.3 2.2 2.4"	// separate R,G,B values
 	% setenv MESA_GAMMA "2.0"		// same gamma for R,G,B
    The demos/gamma.c program may help you to determine reasonable gamma
    value for your display.  With correct gamma values, the color intensities
    displayed in the top row (drawn by dithering) should nearly match those
    in the bottom row (drawn as grays).
    Alex De Bruyn reports that gamma values of 1.6, 1.6 and 1.9 work well
    on HP displays using the HP-ColorRecovery technology.
    Mesa implements gamma correction with a lookup table which translates
    a "linear" pixel value to a gamma-corrected pixel value.  There is a
    small performance penalty.  Gamma correction only works in RGB mode.
    Also be aware that pixel values read back from the frame buffer will
    not be "un-corrected" so glReadPixels may not return the same data
    drawn with glDrawPixels.
    For more information about gamma correction see:
    http://www.inforamp.net/~poynton/notes/colour_and_gamma/GammaFAQ.html
 Overlay Planes
    Overlay planes in the frame buffer are supported by Mesa but require
    hardware and X server support.  To determine if your X server has
    overlay support you can test for the SERVER_OVERLAY_VISUALS property:
 	xprop -root | grep SERVER_OVERLAY_VISUALS
 HPCR glClear(GL_COLOR_BUFFER_BIT) dithering
    If you set the MESA_HPCR_CLEAR environment variable then dithering
    will be used when clearing the color buffer.  This is only applicable
    to HP systems with the HPCR (Color Recovery) system.
 Extensions
 ==========
    There are three Mesa-specific GLX extensions at this time.
    GLX_MESA_pixmap_colormap
       This extension adds the GLX function:
          GLXPixmap glXCreateGLXPixmapMESA( Display *dpy, XVisualInfo *visual,
                                            Pixmap pixmap, Colormap cmap )
       It is an alternative to the standard glXCreateGLXPixmap() function.
       Since Mesa supports RGB rendering into any X visual, not just True-
       Color or DirectColor, Mesa needs colormap information to convert RGB
       values into pixel values.  An X window carries this information but a
       pixmap does not.  This function associates a colormap to a GLX pixmap.
       See the xdemos/glxpixmap.c file for an example of how to use this
       extension.
    GLX_MESA_release_buffers
       Mesa associates a set of ancillary (depth, accumulation, stencil and
       alpha) buffers with each X window it draws into.  These ancillary
       buffers are allocated for each X window the first time the X window
       is passed to glXMakeCurrent().  Mesa, however, can't detect when an
       X window has been destroyed in order to free the ancillary buffers.
       The best it can do is to check for recently destroyed windows whenever
       the client calls the glXCreateContext() or glXDestroyContext()
       functions.  This may not be sufficient in all situations though.
       The GLX_MESA_release_buffers extension allows a client to explicitly
       deallocate the ancillary buffers by calling glxReleaseBuffersMESA()
       just before an X window is destroyed.  For example:
          #ifdef GLX_MESA_release_buffers
             glXReleaseBuffersMESA( dpy, window );
          #endif
          XDestroyWindow( dpy, window );
       This extension is new in Mesa 2.0.
    GLX_MESA_copy_sub_buffer
       This extension adds the glXCopySubBufferMESA() function.  It works
       like glXSwapBuffers() but only copies a sub-region of the window
       instead of the whole window.
       This extension is new in Mesa version 2.6
 Summary of X-related environment variables:
    MESA_RGB_VISUAL - specifies the X visual and depth for RGB mode (X only)
    MESA_CI_VISUAL - specifies the X visual and depth for CI mode (X only)
    MESA_BACK_BUFFER - specifies how to implement the back color buffer (X only)
    MESA_PRIVATE_CMAP - force aux/tk libraries to use private colormaps (X only)
    MESA_GAMMA - gamma correction coefficients (X only)
 ----------------------------------------------------------------------
 README.CYGWIN - lassauge April 2004 - based on README.X11
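
The `MESA_GAMMA` format and the lookup-table approach described in the gamma correction section of the removed README can be sketched as follows. This is an illustration only, not Mesa's code: the function names are hypothetical, and the 8-bit table with exponent 1/gamma is an assumed, conventional formulation that the README does not spell out.

```python
def parse_mesa_gamma(value):
    """Parse a MESA_GAMMA string of the form "Gr Gg Gb" or "G".

    Returns a (red, green, blue) tuple of positive floats.
    """
    parts = [float(p) for p in value.split()]
    if len(parts) == 1:
        # A single value applies to all three channels.
        parts = parts * 3
    if len(parts) != 3 or any(g <= 0.0 for g in parts):
        raise ValueError("expected one or three positive numbers")
    return tuple(parts)


def gamma_table(gamma, size=256):
    """Build a table mapping a linear pixel value to a corrected one.

    Assumption: corrected = max * (value / max) ** (1 / gamma).
    """
    top = size - 1
    return [int(round(top * (i / top) ** (1.0 / gamma))) for i in range(size)]


red, green, blue = parse_mesa_gamma("2.3 2.2 2.4")
tables = [gamma_table(g) for g in (red, green, blue)]
print(tables[0][:4])
```

With a gamma of 1.0 the table is the identity mapping, which matches the README's statement that the defaults effectively disable gamma correction.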

102

docs/README.MITS

@@ -1,102 +0,0 @@
 			Mesa 3.0 MITS Information
 This software is distributed under the terms of the GNU Library
 General Public License, see the LICENSE file for details.
 This document is a preliminary introduction to help you get
 started. For more detaile information consult the web page.
 http://10-dencies.zkm.de/~mesa/
 Version 0.1 (Yes it's very alpha code so be warned!)
 Contributors:
   Emil Briggs    	(briggs@bucky.physics.ncsu.edu)
   David Bucciarelli 	(tech.hmw@plus.it)
   Andreas Schiffler 	(schiffler@zkm.de)
 . Requirements:
      Mesa 3.0.
      An SMP capable machine running Linux 2.x
      libpthread installed on your machine.
 . What does MITS stand for?
      MITS stands for Mesa Internal Threading System. By adding
      internal threading to Mesa it should be possible to improve
      performance of OpenGL applications on SMP machines.
 . Do applications have to be recoded to take advantage of MITS?
      No. The threading is internal to Mesa and transparent to
      applications.
 . Will all applications benefit from the current implementation of MITS?
      No. This implementation splits the processing of the vertex buffer
      over two threads. There is a certain amount of overhead involved
      with the thread synchronization and if there is not enough work
      to be done the extra overhead outweighs any speedup from using
      dual processors. You will not for example see any speedup when
      running Quake because it uses GL_POLYGON and there is only one
      polygon for each vertex buffer processed. Test results on a
      dual 200 Mhz. Pentium Pro system show that one needs around
 -200 vertices in the vertex buffer before any there is any
      appreciable benefit from the threading.
 . Are there any parameters that I can tune to try to improve performance.
      Yes. You can try to vary the size of the vertex buffer which is
      define in VB_MAX located in the file src/vb.h from your top level
      Mesa distribution. The number needs to be a multiple of 12 and
      the optimum value will probably depend on the capabilities of
      your machine and the particular application you are running.
 . Are there any ways I can modify the application to improve its
    performance with the MITS?
      Yes. Try to use as many vertices between each Begin/End pair
      as possbile. This will reduce the thread synchronization
      overhead.
 . What sort of speedups can I expect?
      On some benchmarks performance gains of up to 30% have been
      observerd. Others may see no gain at all and in a few rare
      cases even some degradation.
 . What still needs to be done?
      Lots of testing and benchmarking.
      A portable implementation that works within the Mesa thread API.
      Threading of additional areas of Mesa to improve performance
      even more.
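
The answers above describe splitting vertex-buffer processing over two threads and note that synchronization overhead only pays off with enough vertices per buffer. A minimal sketch of that idea follows. It is not the MITS code: `struct slice`, `scale_slice` and `scale_two_threads` are hypothetical names, and a simple per-element scale stands in for the real vertex transform.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical work description: one contiguous slice of a vertex array. */
struct slice {
    float *v;      /* first element of the slice */
    size_t count;  /* number of elements in the slice */
    float scale;   /* stand-in for the real per-vertex transform */
};

/* Thread entry point: process one slice in place. */
static void *scale_slice(void *arg)
{
    struct slice *s = arg;
    for (size_t i = 0; i < s->count; i++)
        s->v[i] *= s->scale;
    return NULL;
}

/* Split the buffer in half: a worker thread handles the upper half
 * while the calling thread handles the lower half.
 * Returns 0 on success, or the pthread error code if the join fails. */
int scale_two_threads(float *v, size_t count, float scale)
{
    size_t half = count / 2;
    struct slice lo = { v, half, scale };
    struct slice hi = { v + half, count - half, scale };
    pthread_t worker;

    if (pthread_create(&worker, NULL, scale_slice, &hi) != 0) {
        /* No second thread available: do all the work on this thread. */
        scale_slice(&lo);
        scale_slice(&hi);
        return 0;
    }
    scale_slice(&lo);
    /* The join is the synchronization cost paid once per buffer. */
    return pthread_join(worker, NULL);
}
```

With only a handful of elements, the cost of creating and joining the worker is likely to exceed the work saved, which matches the README's remark that small vertex buffers see no speedup.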
 Installation:
 . This assumes that you already have a working Mesa 3.0 installation
       from source.
 . Place the tarball MITS.tar.gz in your top level Mesa directory.
 . Unzip it and untar it. It will replace the following files in
       your Mesa source tree so back them up if you want to save them.
 	 README.MITS
          Make-config
 	 Makefile
 	 mklib.glide
          src/vbxform.c
 	 src/vb.h
 . Rebuild Mesa using the command
           make linux-386-glide-mits

207

docs/README.QUAKE


@@ -1,207 +0,0 @@
              Info on using Mesa 3.0 with Linux Quake I and Quake II
 Disclaimer
 ----------
 I am _not_ a Quake expert by any means.  I pretty much only run it to
 test Mesa.  There have been a lot of questions about Linux Quake and
 Mesa so I'm trying to provide some useful info here.  If this file
 doesn't help you then you should look elsewhere for help.  The Mesa
 mailing list or the news://news.3dfx.com/3dfx.linux.glide newsgroup
 might be good.
 Again, all the information I have is in this file.  Please don't email
 me with questions.
 If you have information to contribute to this file please send it to
 me at brianp@elastic.avid.com
 Linux Quake
 -----------
 You can get Linux Quake from http://www.idsoftware.com/
 Quake I and II for Linux were tested with, and include, Mesa 2.6.  You
 shouldn't have too many problems if you simply follow the instructions
 in the Quake distribution.
 RedHat 5.0 Linux problems
 -------------------------
 RedHat Linux 5.x uses the GNU C library ("glibc" or "libc6") whereas
 previous RedHat and other Linux distributions use "libc5" for its
 runtime C library.
 Linux Quake I and II were compiled for libc5.  If you compile Mesa
 on a RedHat 5.x system the resulting libMesaGL.so file will not work
 with Linux Quake because of the different C runtime libraries.
 The symptom of this is a segmentation fault soon after starting Quake.
 If you want to use a newer version of Mesa (like 3.x) with Quake on
 RedHat 5.x then read on.
 The solution to the C library problem is to force Mesa to use libc5.
 libc5 is in /usr/i486-linux-libc5/lib on RedHat 5.x systems.
 Emil Briggs (briggs@tick.physics.ncsu.edu) nicely gave me the following
 info:
 >   I only know what works on a RedHat 5.0 distribution. RH5 includes
 > a full set of libraries for both libc5 and glibc. The loader ld.so
 > uses the libc5 libraries in /usr/i486-linux-libc5/lib for programs
 > linked against libc5 while it uses the glibc libraries in /lib and
 > /usr/lib for programs linked against glibc.
 >
 > Anyway I changed line 41 of mklib.glide to
 >     GLIDELIBS="-L/usr/local/glide/lib -lglide2x -L/usr/i486-linux-libc5/lib"
 >
 > And I started quake2 up with a script like this
 > #!/bin/csh
 > setenv LD_LIBRARY_PATH /usr/i486-linux-libc5/lib
 > setenv MESA_GLX_FX f
 > ./quake2 +set vid_ref gl
 > kbd_mode -a
 > reset
 I've already patched the mklib.glide file.  You'll have to start Quake
 with the script shown above though.
 **********************
 Daryll Strauss writes:
 Here's my thoughts on the problem. On a RH 5.x system, you can NOT build
 a libc5 executable or library. Red Hat just doesn't include the right
 stuff to do it.
 Since Quake is a libc5 based application, you are in trouble. You need
 libc5 libraries.
 What can you do about it? Well there's a package called gcc5 that does
 MOST of the right stuff to compile with libc5. (It brings back older
 header files, makes appropriate symbolic links for libraries, and sets
 up the compiler to use the correct directories) You can find gcc5 here:
 ftp://ecg.mit.edu/pub/linux/gcc5-1.0-1.i386.rpm
 No, this isn't quite enough. There are still a few tricks to getting
 Mesa to compile as a libc5 application. First you have to make sure that
 every compile uses gcc5 instead of gcc. Second, in some cases the link
 line actually lists -L/usr/lib which breaks gcc5 (because it forces you
 to use the glibc version of things)
 If you get all the stuff correctly compiled with gcc5 it should work.
 I've run Mesa 3.0B6  and its demos in a window with my Rush on a Red Hat
 5.1 system. It is a big hassle, but it can be done. I've only made Quake
 segfault, but I think that's from my libRush using the wrong libc.
 Yes, mixing libc5 and glibc is a major pain. I've been working to get
 all my libraries compiling correctly with this setup. Someone should
 make an RPM out of it and feed changes back to Brian once they get it
 all working. If no one else has done so by the time I get the rest of my
 stuff straightened out, I'll try to do it myself.
 							- |Daryll
 *********************
 David Bucciarelli (tech.hmw@plus.it) writes:
 I'm using the Mesa-3.0beta7 and the RedHat 5.1 and QuakeII is
 working fine for me.  I had only to make a small change to the
 Mesa-3.0/mklib.glide file, from:
     GLIDELIBS="-L/usr/local/glide/lib -lglide2x
 -L/usr/i486-linux-libc5/lib -lm"
 to:
     GLIDELIBS="-L/usr/i486-linux-libc5/lib -lglide2x"
 and to make two symbolic links:
 [david@localhost Mesa]$ ln -s libMesaGL.so libMesaGL.so.2
 [david@localhost Mesa]$ ln -s libMesaGLU.so libMesaGLU.so.2
 I'm using the Daryll's Linux glide rpm for the Voodoo2 and glibc (it
 includes also the Glide for the libc5). I'm not using the /dev/3Dfx and
 running QuakeII as root with the following env. var:
 export
 LD_LIBRARY_PATH=/dsk1/home/david/src/gl/Mesa/lib:/usr/i486-linux-libc5/lib
 I think that all problems are related to the glibc, Quake will never
 work if you get the following output:
 [david@localhost Mesa]$ ldd lib/libMesaGL.so
         libglide2x.so => /usr/lib/libglide2x.so (0x400f8000)
         libm.so.6 => /lib/libm.so.6 (0x40244000)
         libc.so.6 => /lib/libc.so.6 (0x4025d000)
         /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x00000000)
 You must get the following outputs:
 [david@localhost Mesa]# ldd lib/libMesaGL.so
         libglide2x.so => /usr/i486-linux-libc5/lib/libglide2x.so
 (0x400f3000)
 [root@localhost quake2]# ldd quake2
         libdl.so.1 => /lib/libdl.so.1 (0x40005000)
         libm.so.5 => /usr/i486-linux-libc5/lib/libm.so.5 (0x40008000)
         libc.so.5 => /usr/i486-linux-libc5/lib/libc.so.5 (0x40010000)
 [root@localhost quake2]# ldd ref_gl.so
         libMesaGL.so.2 =>
 /dsk1/home/david/src/gl/Mesa/lib/libMesaGL.so.2 (0x400eb000)
         libglide2x.so => /usr/i486-linux-libc5/lib/libglide2x.so
 (0x401d9000)
         libX11.so.6 => /usr/i486-linux-libc5/lib/libX11.so.6
 (0x40324000)
         libXext.so.6 => /usr/i486-linux-libc5/lib/libXext.so.6
 (0x403b7000)
         libvga.so.1 => /usr/i486-linux-libc5/lib/libvga.so.1
 (0x403c1000)
         libm.so.5 => /usr/i486-linux-libc5/lib/libm.so.5 (0x403f5000)
         libc.so.5 => /usr/i486-linux-libc5/lib/libc.so.5 (0x403fd000)
 ***********************
 Steve Davies (steve@one47.demon.co.uk) writes:
 Try using:
     export LD_LIBRARY_PATH=/usr/i486-linux-libc5/lib
     ./quake2 +set vid_ref gl
 to start the game... Works for me, but assumes that you have the
 compatability libc5 RPMs installed.
 ***************************
 WWW resources - you may find additional Linux Quake help at these URLs:
 http://quake.medina.net/howto
 http://webpages.mr.net/bobz
 http://www.linuxgames.com/quake2/
 ----------------------------------------------------------------------

52

docs/README.THREADS


@@ -1,52 +0,0 @@
 Mesa Threads README
 -------------------
 Thread safety was introduced in Mesa 2.6 by John Stone and
 Christoph Poliwoda.
 It was redesigned in Mesa 3.3 so that thread safety is
 supported by default (on systems which support threads,
 that is).  There is no measurable penalty on single
 threaded applications.
 NOTE that the only _driver_ which is thread safe at this time
 is the OS/Mesa driver!
 At present the mthreads code supports three thread APIS:
 1) POSIX threads (aka pthreads).
 2) Solaris / Unix International threads.
 3) Win32 threads (Win 95/NT).
 Support for other thread libraries can be added src/glthread.[ch]
 In order to guarantee proper operation, it is
 necessary for both Mesa and application code to use the same threads API.
 So, if your application uses Sun's thread API, then you should build Mesa
 using one of the targets for Sun threads.
 The mtdemos directory contains some example programs which use
 multiple threads to render to osmesa rendering context(s).
 Linux users should be aware that there exist many different POSIX
 threads packages. The best solution is the linuxthreads package
 (http://pauillac.inria.fr/~xleroy/linuxthreads/) as this package is the
 only one that really supports multiprocessor machines (AFAIK). See
 http://pauillac.inria.fr/~xleroy/linuxthreads/README for further
 information about the usage of linuxthreads.
 If you are interested in helping with thread safety work in Mesa
 join the Mesa developers mailing list and post your proposal.
 Regards,
   John Stone           -- j.stone@acm.org  johns@cs.umr.edu
   Christoph Poliwoda   -- poliwoda@volumegraphics.com
 Version info:
    Mesa 2.6 - initial thread support.
    Mesa 3.3 - thread support mostly rewritten (Brian Paul)

44

docs/README.UVD Normal file


@@ -0,0 +1,44 @@
 The software may implement third party technologies (e.g. third party
 libraries) that are not licensed to you by AMD and for which you may need
 to obtain licenses from other parties.  Unless explicitly stated otherwise,
 these third party technologies are not licensed hereunder.  Such third
 party technologies include, but are not limited, to H.264, MPEG-2, MPEG-4,
 AVC, and VC-1.
 For MPEG-2 Encoding Products ANY USE OF THIS PRODUCT IN ANY MANNER OTHER
 THAN PERSONAL USE THAT COMPLIES WITH THE MPEG-2 STANDARD FOR ENCODING VIDEO
 INFORMATION FOR PACKAGED MEDIA IS EXPRESSLY PROHIBITED WITHOUT A LICENSE
 UNDER APPLICABLE PATENTS IN THE MPEG-2 PATENT PORTFOLIO, WHICH LICENSES IS
 AVAILABLE FROM MPEG LA, LLC, 6312 S. Fiddlers Green Circle, Suite 400E,
 Greenwood Village, Colorado 80111 U.S.A.
 WARRANTY DISCLAIMER: THE SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY
 KIND.  AMD DISCLAIMS ALL WARRANTIES, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING
 BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 PARTICULAR PURPOSE, TITLE, NON-INFRINGEMENT, THAT THE SOFTWARE WILL RUN
 UNINTERRUPTED OR ERROR-FREE OR WARRANTIES ARISING FROM CUSTOM OF TRADE OR
 COURSE OF USAGE.  THE ENTIRE RISK ASSOCIATED WITH THE USE OF THE SOFTWARE IS
 ASSUMED BY YOU.  Some jurisdictions do not allow the exclusion of implied
 warranties, so the above exclusion may not apply to You.
 LIMITATION OF LIABILITY AND INDEMNIFICATION:  AMD AND ITS LICENSORS WILL NOT,
 UNDER ANY CIRCUMSTANCES BE LIABLE FOR ANY PUNITIVE, DIRECT, INCIDENTAL,
 INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING FROM USE OF THE SOFTWARE OR
 THIS AGREEMENT EVEN IF AMD AND ITS LICENSORS HAVE BEEN ADVISED OF THE
 POSSIBILITY OF SUCH DAMAGES.  In no event shall AMD's total liability to You
 for all damages, losses, and causes of action (whether in contract, tort
 (including negligence) or otherwise) exceed the amount of $100 USD.  You agree
 to defend, indemnify and hold harmless AMD and its licensors, and any of their
 directors, officers, employees, affiliates or agents from and against any and
 all loss, damage, liability and other expenses (including reasonable
 attorneys' fees), resulting from Your use of the Software or violation of the
 terms and conditions of this Agreement.
 U.S. GOVERNMENT RESTRICTED RIGHTS: The Software is provided with "RESTRICTED
 RIGHTS." Use, duplication, or disclosure by the Government is subject to the
 restrictions as set forth in FAR 52.227-14 and DFAR252.227-7013, et seq., or
 its successor.  Use of the Software by the Government constitutes
 acknowledgement of AMD's proprietary rights in them.
 EXPORT RESTRICTIONS: The Software may be subject to export restrictions as
 stated in the Software License Agreement.

43

docs/README.VCE Normal file


@@ -0,0 +1,43 @@
 The software may implement third party technologies (e.g. third party
 libraries) that are not licensed to you by AMD and for which you may need
 to obtain licenses from other parties.  Unless explicitly stated otherwise,
 these third party technologies are not licensed hereunder.  Such third
 party technologies include, but are not limited, to H.264, MPEG-2, MPEG-4,
 AVC, and VC-1.
 For MPEG-2 Intermediate Products: ANY USE OF THIS PRODUCT IN ANY MANNER OTHER
 THAN PERSONAL USE THAT COMPLIES WITH THE MPEG-2 STANDARD IS EXPRESSLY
 PROHIBITED WITHOUT A LICENSE UNDER APPLICABLE PATENTS IN THE MPEG-2 PATENT
 PORTFOLIO, WHICH LICENSES IS AVAILABLE FROM MPEG LA, LLC, 6312 S. Fiddlers
 Green Circle, Suite 400E, Greenwood Village, Colorado 80111 U.S.A.
 WARRANTY DISCLAIMER: THE SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY
 KIND.  AMD DISCLAIMS ALL WARRANTIES, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING
 BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 PARTICULAR PURPOSE, TITLE, NON-INFRINGEMENT, THAT THE SOFTWARE WILL RUN
 UNINTERRUPTED OR ERROR-FREE OR WARRANTIES ARISING FROM CUSTOM OF TRADE OR
 COURSE OF USAGE.  THE ENTIRE RISK ASSOCIATED WITH THE USE OF THE SOFTWARE IS
 ASSUMED BY YOU.  Some jurisdictions do not allow the exclusion of implied
 warranties, so the above exclusion may not apply to You.
 LIMITATION OF LIABILITY AND INDEMNIFICATION:  AMD AND ITS LICENSORS WILL NOT,
 UNDER ANY CIRCUMSTANCES BE LIABLE FOR ANY PUNITIVE, DIRECT, INCIDENTAL,
 INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING FROM USE OF THE SOFTWARE OR
 THIS AGREEMENT EVEN IF AMD AND ITS LICENSORS HAVE BEEN ADVISED OF THE
 POSSIBILITY OF SUCH DAMAGES.  In no event shall AMD's total liability to You
 for all damages, losses, and causes of action (whether in contract, tort
 (including negligence) or otherwise) exceed the amount of $100 USD.  You agree
 to defend, indemnify and hold harmless AMD and its licensors, and any of their
 directors, officers, employees, affiliates or agents from and against any and
 all loss, damage, liability and other expenses (including reasonable
 attorneys' fees), resulting from Your use of the Software or violation of the
 terms and conditions of this Agreement.
 U.S. GOVERNMENT RESTRICTED RIGHTS: The Software is provided with "RESTRICTED
 RIGHTS." Use, duplication, or disclosure by the Government is subject to the
 restrictions as set forth in FAR 52.227-14 and DFAR252.227-7013, et seq., or
 its successor.  Use of the Software by the Government constitutes
 acknowledgement of AMD's proprietary rights in them.
 EXPORT RESTRICTIONS: The Software may be subject to export restrictions as
 stated in the Software License Agreement.

17

docs/README.WIN32


@@ -1,6 +1,6 @@
 File: docs/README.WIN32
 Last updated: 23 April 2011
 Last updated: 21 June 2013
 Quick Start
@@ -30,6 +30,21 @@ At this time, only the gallium GDI driver is known to work.
 Source code also exists in the tree for other drivers in
 src/mesa/drivers/windows, but the status of this code is unknown.
 Recipe
 ------
 Building on windows requires several open-source packages. These are
 steps that work as of this writing.
 - install python 2.7
 - install scons (latest)
 - install mingw, flex, and bison
 - install pywin32 from here: http://www.lfd.uci.edu/~gohlke/pythonlibs
   get pywin32-218.4.win-amd64-py2.7.exe
 - install git
 - download mesa from git
   see http://www.mesa3d.org/repository.html
 - run scons
 General
 -------

									
										83

docs/application-issues.html
									
										Normal file
									
												
				@@ -0,0 +1,83 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Application Issues</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Application Issues</h1>

				<p>

				This page documents known issues with some OpenGL applications.

				</p>

				<h2>Topogun</h2>

				<p>

				<a href="http://www.topogun.com/">Topogun</a> for Linux (version 2, at least)

				creates a GLX visual without requesting a depth buffer.

				This causes bad rendering if the OpenGL driver happens to choose a visual

				without a depth buffer.

				</p>

				<p>

				Mesa 9.1.2 and later (will) support a DRI configuration option to work around

				this issue.

				Using the <a href="http://dri.freedesktop.org/wiki/DriConf">driconf</a> tool,

				set the "Create all visuals with a depth buffer" option before running Topogun.

				Then, all GLX visuals will be created with a depth buffer.

				</p>

				<h2>Old OpenGL games</h2>

				<p>

				Some old OpenGL games (approx. ten years or older) may crash during

				start-up because of an extension string buffer-overflow problem.

				</p>

				<p>

				The problem is a modern OpenGL driver will return a very long string

				for the glGetString(GL_EXTENSIONS) query and if the application

				naively copies the string into a fixed-size buffer it can overflow the

				buffer and crash the application.

				</p>

				<p>

				The work-around is to set the MESA_EXTENSION_MAX_YEAR environment variable

				to the approximate release year of the game.

				This will cause the glGetString(GL_EXTENSIONS) query to only report extensions

				older than the given year.

				</p>

				<p>

				For example, if the game was released in 2001, do

				<pre>

				export MESA_EXTENSION_MAX_YEAR=2001

				</pre>

				before running the game.

				</p>

				<h2>Viewperf</h2>

				<p>

				See the <a href="viewperf.html">Viewperf issues</a> page for a detailed list

				of Viewperf issues.

				</p>

				</div>

				</body>

				</html>
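
The "Old OpenGL games" entry in the page above describes applications that copy the `glGetString(GL_EXTENSIONS)` result into a fixed-size buffer. As a hedged illustration of the application-side alternative, the sketch below sizes the copy from the string itself and searches it by whole token. `copy_extensions` and `has_extension` are hypothetical helper names, and the string is passed in as a plain argument so the example does not need a GL context; in a real program it would come from `glGetString(GL_EXTENSIONS)`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate an extension string of any length.
 * The buffer is sized from the string, never from a fixed constant.
 * The caller frees the result; returns NULL if allocation fails. */
char *copy_extensions(const char *ext)
{
    assert(ext != NULL);
    size_t len = strlen(ext);
    char *copy = malloc(len + 1);
    if (copy != NULL)
        memcpy(copy, ext, len + 1);  /* includes the terminating NUL */
    return copy;
}

/* Hypothetical helper: return 1 if `name` appears as a complete
 * space-separated token in `ext`, 0 otherwise. No copy is needed. */
int has_extension(const char *ext, const char *name)
{
    assert(ext != NULL && name != NULL);
    size_t n = strlen(name);
    const char *p = ext;

    if (n == 0)
        return 0;
    while ((p = strstr(p, name)) != NULL) {
        int starts = (p == ext) || (p[-1] == ' ');
        int ends = (p[n] == ' ') || (p[n] == '\0');
        if (starts && ends)
            return 1;
        p += n;  /* skip this partial match and keep looking */
    }
    return 0;
}
```

An application written this way does not depend on the length of the extension list, so it should not need the `MESA_EXTENSION_MAX_YEAR` work-around; the variable remains the fix for binaries that cannot be changed.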

									
										37

docs/autoconf.html
									
												
				@@ -123,24 +123,6 @@ directories.</p>

				There are also a few general options for altering the Mesa build:

				</p>

				<dl>

				<dt><code>--with-x</code></dt>

				<dd><p>When the X11 development libraries are

				needed, the <code>pkg-config</code> utility <a href="#pkg-config">will

				be used</a> for locating them. If they cannot be found through

				<code>pkg-config</code> a fallback routing using <code>imake</code> will

				be used. In this case, the <code>--with-x</code>,

				<code>--x-includes</code> and <code>--x-libraries</code> options can

				control the use of X for Mesa.</p>

				</dd>

				<dt><code>--enable-gl-osmesa</code></dt>

				<dd><p>The <a href="osmesa.html">OSMesa

				library</a> can be built on top of libGL for drivers that provide it.

				This option controls whether to build libOSMesa. By default, this is

				enabled for the Xlib driver and disabled otherwise. Note that this

				option is different than using OSMesa as the driver.</p>

				</dd>

				<dt><code>--enable-debug</code></dt>

				<dd><p>This option will enable compiler

				options and macros to aid in debugging the Mesa libraries.</p>

				@@ -155,12 +137,12 @@ assembly will not be used.</p>

				<dt><code>--enable-32-bit</code></dt>

				<dt><code>--enable-64-bit</code></dt>

				<dd><p>By default, the

				build will compile code as directed by the environment variables

				<dd><p>By default, the build will compile code as directed by the environment

				variables

				<code>CC</code>, <code>CFLAGS</code>, etc. If the compiler is

				<code>gcc</code>, these options offer a helper to add the compiler flags

				to force 32- or 64-bit code generation as used on the x86 and x86_64

				architectures.</p>

				architectures. Note that these options are mutually exclusive.</p>

				</dd>

				</dl>

				@@ -171,19 +153,19 @@ architectures.</p>

				There are several different driver modes that Mesa can use. These are

				described in more detail in the <a href="install.html">basic

				installation instructions</a>. The Mesa driver is controlled through the

				configure option --with-driver. There are currently three supported

				options in the configure script.

				configure options <code>--enable-xlib-glx</code>, <code>--enable-osmesa</code>,

				and <code>--enable-dri</code>.

				</p>

				<h3 id="xlib">Xlib</h3><p>This is the default mode for building Mesa.

				<h3 id="xlib">Xlib</h3><p>

				It uses Xlib as a software renderer to do all rendering. It corresponds

				to the option <code>--with-driver=xlib</code>. The libX11 and libXext

				to the option <code>--enable-xlib-glx</code>. The libX11 and libXext

				libraries, as well as the X11 development headers, will be needed to

				support the Xlib driver.

				<h3 id="dri">DRI</h3><p>This mode uses the DRI hardware drivers for

				accelerated OpenGL rendering. Enable the DRI drivers with the option

				<code>--with-driver=dri</code>. See the <a href="install.html">basic

				<code>--enable-dri</code>. See the <a href="install.html">basic

				installation instructions</a> for details on prerequisites for the DRI

				drivers.

				@@ -223,7 +205,8 @@ and <code>/usr/local/lib</code>, respectively.

				<h3 id="osmesa">OSMesa </h3><p> No libGL is built in this

				mode. Instead, the driver code is built into the Off-Screen Mesa

				(OSMesa) library. See the <a href="osmesa.html">Off-Screen Rendering</a>

				page for more details.

				page for more details.  It corresponds to the option

				<code>--enable-osmesa</code>.

				<!-- OSMesa specific options -->

				<dl>

									
										2

docs/conform.html
									
												
				@@ -19,7 +19,7 @@

				<p>

				The SGI OpenGL conformance tests verify correct operation of OpenGL

				implementations.  I, Brian Paul, have been given a copy of the tests

				for testing Mesa.  The tests are not publically available.

				for testing Mesa.  The tests are not publicly available.

				</p>

				<p>

				This file has the latest results of testing Mesa with the OpenGL 1.2

									
										1

docs/contents.html
									
												
				@@ -71,6 +71,7 @@

				<li><a href="llvmpipe.html" target="_parent">Gallium llvmpipe driver</a>

				<li><a href="vmware-guest.html" target="_parent">VMware SVGA3D guest driver</a>

				<li><a href="postprocess.html" target="_parent">Gallium post-processing</a>

				<li><a href="application-issues.html" target="_parent">Application Issues</a>

				<li><a href="viewperf.html" target="_parent">Viewperf Issues</a>

				</ul>

									
										97

docs/devinfo.html
									
												
				@@ -17,7 +17,7 @@

				<h1>Development Notes</h1>

				<h2>Adding Extentions</h2>

				<h2>Adding Extensions</h2>

				<p>

				To add a new GL extension to Mesa you have to do at least the following.

				@@ -36,7 +36,7 @@ To add a new GL extension to Mesa you have to do at least the following.

				   </pre>

				</li>

				<li>

				   In the src/mesa/glapi/ directory, add the new extension functions and

				   In the src/mapi/glapi/gen/ directory, add the new extension functions and

				   enums to the gl_API.xml file.

				   Then, a bunch of source files must be regenerated by executing the

				   corresponding Python scripts.

				@@ -56,6 +56,11 @@ To add a new GL extension to Mesa you have to do at least the following.

				   If the new extension adds new GL state, the functions in get.c, enable.c

				   and attrib.c will most likely require new code.

				</li>

				<li>

				   The dispatch tests check_table.cpp and dispatch_sanity.cpp

				   should be updated with details about the new extension's functions. These

				   tests are run using 'make check'.

				</li>

				</ul>

				@@ -155,6 +160,29 @@ of <tt>bool</tt>, <tt>true</tt>, and

				src/mesa/state_tracker/st_glsl_to_tgsi.cpp can serve as examples.

				</p>

				<h2>Submitting patches</h2>

				<p>

				You should always run the Mesa Testsuite before submitting patches.

				The Testsuite can be run using the 'make check' command. All tests

				must pass before patches will be accepted; this may mean you have

				to update the tests themselves.

				</p>

				<p>

				Patches should be sent to the Mesa mailing list for review.

				When submitting a patch make sure to use git send-email rather than attaching

				patches to emails. Sending patches as attachments prevents people from being

				able to provide in-line review comments.

				</p>

				<p>

				When submitting follow-up patches you can use --in-reply-to to make v2, v3,

				etc patches show up as replies to the originals. This usually works well

				when you're sending out updates to individual patches (as opposed to

				re-sending the whole series). Using --in-reply-to makes

				it harder for reviewers to accidentally review old patches.

				</p>

				<h2>Marking a commit as a candidate for a stable branch</h2>

				@@ -167,11 +195,31 @@ you should add an appropriate note to the commit message.

				Here are some examples of such a note:

				</p>

				<ul>

				  <li>NOTE: This is a candidate for the 9.0 branch.</li>

				  <li>NOTE: This is a candidate for the 8.0 and 9.0 branches.</li>

				  <li>NOTE: This is a candidate for the stable branches.</li>

				  <li>CC: &lt;mesa-stable@lists.freedesktop.org&gt;</li>

				  <li>CC: "9.2 10.0" &lt;mesa-stable@lists.freedesktop.org&gt;</li>

				  <li>CC: "10.0" &lt;mesa-stable@lists.freedesktop.org&gt;</li>

				</ul>

				Simply adding the CC to the mesa-stable list address is adequate to nominate

				the commit for the most-recently-created stable branch. It is only necessary

				to specify a specific branch name, (such as "9.2 10.0" or "10.0" in the

				examples above), if you want to nominate the commit for an older stable

				branch. And, as in these examples, you can nominate the commit for the older

				branch in addition to the more recent branch, or nominate the commit

				exclusively for the older branch.

				This "CC" syntax for patch nomination will cause patches to automatically be

				copied to the mesa-stable@ mailing list when you use "git send-email" to send

				patches to the mesa-dev@ mailing list. Also, if you realize that a commit

				should be nominated for the stable branch after it has already been committed,

				you can send a note directly to the mesa-stable@lists.freedesktop.org where

				the Mesa stable-branch maintainers will receive it. Be sure to mention the

				commit ID of the commit of interest (as it appears in the mesa master branch).

				The latest set of patches that have been nominated, accepted, or rejected for

				the upcoming stable release can always be seen on the

				<a href="http://cworth.org/~cworth/mesa-stable-queue/">Mesa Stable Queue</a>

				page.

				<h2>Cherry-picking candidates for a stable branch</h2>

				@@ -193,22 +241,13 @@ branch is relevant.

				</p>

				<h3>Verify and update version info</h3>

				<dl>

				  <dt>Makefile.am</dt>

				  <dd>PACKAGE_VERSION</dd>

				  <dt>configure.ac</dt>

				  <dd>AC_INIT</dd>

				  <dt>src/mesa/main/version.h</dt>

				  <dd>MESA_MAJOR, MESA_MINOR, MESA_PATCH and MESA_VERSION_STRING</dd>

				</dl>

				<h3>Verify and update version info in VERSION</h3>

				<p>

				Create a docs/relnotes-x.y.z.html file.

				The bin/shortlog_mesa.sh script can be used to create a HTML-formatted list

				of changes to include in the file.

				Link the new docs/relnotes-x.y.z.html file into the main <a href="relnotes.html">relnotes.html</a> file.

				Create a docs/relnotes/x.y.z.html file.

				The bin/bugzilla_mesa.sh and bin/shortlog_mesa.sh scripts can be used to

				create the HTML-formatted lists of bugfixes and changes to include in the file.

				Link the new docs/relnotes/x.y.z.html file into the main <a href="relnotes.html">relnotes.html</a> file.

				</p>

				<p>

				@@ -217,7 +256,7 @@ Update <a href="index.html">docs/index.html</a>.

				<p>

				Tag the files with the release name (in the form <b>mesa-x.y</b>)

				with: <code>git tag -a mesa-x.y</code>

				with: <code>git tag -s mesa-x.y -m "Mesa x.y Release"</code>

				Then: <code>git push origin mesa-x.y</code>

				</p>

				@@ -226,13 +265,14 @@ Then: <code>git push origin mesa-x.y</code>

				<p>

				Make the distribution files.  From inside the Mesa directory:

				<pre>

					./autogen.sh

					make tarballs

				</pre>

				<p>

				After the tarballs are created, the md5 checksums for the files will

				be computed.

				Add them to the docs/relnotes-x.y.html file.

				Add them to the docs/relnotes/x.y.html file.

				</p>

				<p>

				@@ -242,15 +282,18 @@ compile everything, and run some demos to be sure everything works.

				<h3>Update the website and announce the release</h3>

				<p>

				Follow the directions on SourceForge for creating a new "release" and

				uploading the tarballs.

				Make a new directory for the release on annarchy.freedesktop.org with:

				<br>

				<code>

				mkdir /srv/ftp.freedesktop.org/pub/mesa/x.y

				</code>

				</p>

				<p>

				Basically, to upload the tarball files with:

				<br>

				<code>

				rsync -avP ssh Mesa*-X.Y.* USERNAME@frs.sourceforge.net:uploads/

				rsync -avP -e ssh MesaLib-x.y.* USERNAME@annarchy.freedesktop.org:/srv/ftp.freedesktop.org/pub/mesa/x.y/

				</code>

				</p>

				@@ -266,10 +309,10 @@ sftp USERNAME,mesa3d@web.sourceforge.net

				<p>

				Make an announcement on the mailing lists:

				<em>m</em><em>e</em><em>s</em><em>a</em><em>-</em><em>d</em><em>e</em><em>v</em><em>@</em><em>l</em><em>i</em><em>s</em><em>t</em><em>s</em><em>.</em><em>f</em><em>r</em><em>e</em><em>e</em><em>d</em><em>e</em><em>s</em><em>k</em><em>t</em><em>o</em><em>p</em><em>.</em><em>o</em><em>r</em><em>g</em>,

				<em>m</em><em>e</em><em>s</em><em>a</em><em>-</em><em>u</em><em>s</em><em>e</em><em>r</em><em>s</em><em>@</em><em>l</em><em>i</em><em>s</em><em>t</em><em>s</em><em>.</em><em>f</em><em>r</em><em>e</em><em>e</em><em>d</em><em>e</em><em>s</em><em>k</em><em>t</em><em>o</em><em>p</em><em>.</em><em>o</em><em>r</em><em>g</em>

				<em>mesa-dev@lists.freedesktop.org</em>,

				<em>mesa-users@lists.freedesktop.org</em>

				and

				<em>m</em><em>e</em><em>s</em><em>a</em><em>-</em><em>a</em><em>n</em><em>n</em><em>o</em><em>u</em><em>n</em><em>c</em><em>e</em><em>@</em><em>l</em><em>i</em><em>s</em><em>t</em><em>s</em><em>.</em><em>f</em><em>r</em><em>e</em><em>e</em><em>d</em><em>e</em><em>s</em><em>k</em><em>t</em><em>o</em><em>p</em><em>.</em><em>o</em><em>r</em><em>g</em>

				<em>mesa-announce@lists.freedesktop.org</em>

				</p>

				</div>

									
										10

docs/dispatch.html
									
												
				@@ -25,7 +25,7 @@ href="#overview">overview of Mesa's implementation</a>.</p>

				<h2>1. Complexity of GL Dispatch</h2>

				<p>Every GL application has at least one object called a GL <em>context</em>.

				This object, which is an implicit parameter to ever GL function, stores all

				This object, which is an implicit parameter to every GL function, stores all

				of the GL related state for the application.  Every texture, every buffer

				object, every enable, and much, much more is stored in the context.  Since

				an application can have more than one context, the context to be used is

				@@ -51,7 +51,7 @@ example, <tt>glFogCoordf</tt> may operate differently depending on whether

				or not fog is enabled.</p>

				<p>In multi-threaded environments, it is possible for each thread to have a

				differnt GL context current.  This means that poor old <tt>glVertex3fv</tt>

				different GL context current.  This means that poor old <tt>glVertex3fv</tt>

				has to know which GL context is current in the thread where it is being

				called.</p>

				@@ -207,13 +207,13 @@ few preprocessor defines.</p>

				<li>If <tt>GLX_USE_TLS</tt> is defined, method #4 is used.</li>

				<li>If <tt>HAVE_PTHREAD</tt> is defined, method #3 is used.</li>

				<li>If <tt>WIN32_THREADS</tt> is defined, method #2 is used.</li>

				<li>If none of the preceeding are defined, method #1 is used.</li>

				<li>If none of the preceding are defined, method #1 is used.</li>

				</ul>

				<p>Two different techniques are used to handle the various different cases.

				On x86 and SPARC, a macro called <tt>GL_STUB</tt> is used.  In the preamble

				of the assembly source file different implementations of the macro are

				selected based on the defined preprocessor variables.  The assmebly code

				selected based on the defined preprocessor variables.  The assembly code

				then consists of a series of invocations of the macros such as:

				<blockquote>

				@@ -242,7 +242,7 @@ first technique, is to insert <tt>#ifdef</tt> within the assembly

				implementation of each function.  This makes the assembly file considerably

				larger (e.g., 29,332 lines for <tt>glapi_x86-64.S</tt> versus 1,155 lines for

				<tt>glapi_x86.S</tt>) and causes simple changes to the function

				implementation to generate many lines of diffs.  Since the assmebly files

				implementation to generate many lines of diffs.  Since the assembly files

				are typically generated by scripts (see <a href="#autogen">below</a>), this

				isn't a significant problem.</p>
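Note on the hunks above: the per-thread lookup that dispatch.html describes (each thread may have a different GL context current, and an entry point such as <tt>glVertex3fv</tt> must find the one belonging to the calling thread) can be modelled in a few lines. The following is only a sketch in Python under stated assumptions. `Context`, `make_current`, `draw_in_thread` and the dictionary-based dispatch table are hypothetical names chosen for illustration; Mesa's real implementation is C and generated assembly and is not shown here.

```python
import threading

# Hypothetical model of per-thread GL dispatch; not Mesa's glapi code.
_tls = threading.local()


class Context:
    """Stand-in for a GL context: holds state and its own dispatch table."""

    def __init__(self, name):
        self.name = name
        self.vertices = []
        # Each context carries its own table of function implementations.
        self.dispatch = {"Vertex3fv": self._vertex3fv}

    def _vertex3fv(self, v):
        self.vertices.append(tuple(v))


def make_current(ctx):
    """Bind ctx as the current context for the calling thread only."""
    _tls.current = ctx


def glVertex3fv(v):
    """Entry point: find the calling thread's context, then dispatch."""
    ctx = getattr(_tls, "current", None)
    if ctx is None:
        raise RuntimeError("no current context in this thread")
    ctx.dispatch["Vertex3fv"](v)


def draw_in_thread(ctx, v):
    """Make ctx current in a new thread and issue one call there."""
    def worker():
        make_current(ctx)
        glVertex3fv(v)

    t = threading.Thread(target=worker)
    t.start()
    t.join()
```

The context never appears in the `glVertex3fv` signature; it is recovered from thread-local storage on every call, which is the cost that the different dispatch methods listed in the document try to minimise.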

									
										22

docs/egl.html
									
												
				@@ -88,7 +88,7 @@ drivers will be installed to <code>${libdir}/egl</code>.</p>

				<dd>

				<p>List the platforms (window systems) to support.  Its argument is a comma

				seprated string such as <code>--with-egl-platforms=x11,drm</code>.  It decides

				separated string such as <code>--with-egl-platforms=x11,drm</code>.  It decides

				the platforms a driver may support.  The first listed platform is also used by

				the main library to decide the native platform: the platform the EGL native

				types such as <code>EGLNativeDisplayType</code> or

				@@ -223,7 +223,7 @@ the X server directly using (XCB-)DRI2 protocol.</p>

				<dd>

				<p>This driver is based on Gallium3D.  It supports all rendering APIs and

				hardwares supported by Gallium3D.  It is the only driver that supports OpenVG.

				hardware supported by Gallium3D.  It is the only driver that supports OpenVG.

				The supported platforms are X11, DRM, FBDEV, and GDI.</p>

				<p>This driver comes with its own hardware drivers

				@@ -232,16 +232,6 @@ The supported platforms are X11, DRM, FBDEV, and GDI.</p>

				</dd>

				<dt><code>egl_glx</code></dt>

				<dd>

				<p>This driver provides a wrapper to GLX.  It uses exclusively GLX to implement

				the EGL API.  It supports both direct and indirect rendering when the GLX does.

				It is accelerated when the GLX is.  As such, it cannot provide functions that

				is not available in GLX or GLX extensions.</p>

				</dd>

				</dl>

				<h2>Packaging</h2>

				<p>The ABI between the main library and its drivers are not stable.  Nor is

				@@ -262,10 +252,6 @@ is disabled by default.</p>

				<code>src/egl/</code>.  The sources of the <code>egl</code> state tracker can

				be found at <code>src/gallium/state_trackers/egl/</code>.</p>

				<p>The suggested way to learn to write a EGL driver is to see how other drivers

				are written.  <code>egl_glx</code> should be a good reference.  It works in any

				environment that has GLX support, and it is simpler than most drivers.</p>

				<h3>Lifetime of Display Resources</h3>

				<p>Contexts and surfaces are examples of display resources.  They might live

				@@ -273,8 +259,8 @@ longer than the display that creates them.</p>

				<p>In EGL, when a display is terminated through <code>eglTerminate</code>, all

				display resources should be destroyed.  Similarly, when a thread is released

				throught <code>eglReleaseThread</code>, all current display resources should be

				released.  Another way to destory or release resources is through functions

				through <code>eglReleaseThread</code>, all current display resources should be

				released.  Another way to destroy or release resources is through functions

				such as <code>eglDestroySurface</code> or <code>eglMakeCurrent</code>.</p>

				<p>When a resource that is current to some thread is destroyed, the resource

									
										64

docs/envvars.html
									
												
				@@ -32,6 +32,8 @@ sometimes be useful for debugging end-user issues.

				<li>LIBGL_ALWAYS_INDIRECT - forces an indirect rendering context/connection.

				<li>LIBGL_ALWAYS_SOFTWARE - if set, always use software rendering

				<li>LIBGL_NO_DRAWARRAYS - if set do not use DrawArrays GLX protocol (for debugging)

				<li>LIBGL_SHOW_FPS - print framerate to stdout based on the number of glXSwapBuffers

				    calls per second.

				</ul>

				@@ -45,7 +47,7 @@ sometimes be useful for debugging end-user issues.

				<li>MESA_NO_SSE - if set, disables Intel SSE optimizations

				<li>MESA_DEBUG - if set, error messages are printed to stderr.  For example,

				   if the application generates a GL_INVALID_ENUM error, a corresponding error

				   message indicating where the error occured, and possibly why, will be

				   message indicating where the error occurred, and possibly why, will be

				   printed to stderr.<br>

				   If the value of MESA_DEBUG is 'FP' floating point arithmetic errors will

				   generate exceptions.

				@@ -119,10 +121,38 @@ See the <a href="xlibdriver.html">Xlib software driver page</a> for details.

				<h2>i945/i965 driver environment variables (non-Gallium)</h2>

				<ul>

				<li>INTEL_STRICT_CONFORMANCE - if set to 1, enable sw fallbacks to improve

				    OpenGL conformance.  If set to 2, always use software rendering.

				<li>INTEL_NO_BLIT - if set, disable hardware-accelerated glBitmap,

				    glCopyPixels, glDrawPixels.

				<li>INTEL_NO_HW - if set to 1, prevents batches from being submitted to the hardware.

				   This is useful for debugging hangs, etc.</li>

				<li>INTEL_DEBUG - a comma-separated list of named flags, which do various things:

				<ul>

				   <li>tex - emit messages about textures.</li>

				   <li>state - emit messages about state flag tracking</li>

				   <li>blit - emit messages about blit operations</li>

				   <li>miptree - emit messages about miptrees</li>

				   <li>perf - emit messages about performance issues</li>

				   <li>perfmon - emit messages about AMD_performance_monitor</li>

				   <li>bat - emit batch information</li>

				   <li>pix - emit messages about pixel operations</li>

				   <li>buf - emit messages about buffer objects</li>

				   <li>reg - emit messages about regions</li>

				   <li>fbo - emit messages about framebuffers</li>

				   <li>fs - dump shader assembly for fragment shaders</li>

				   <li>gs - dump shader assembly for geometry shaders</li>

				   <li>sync - emit messages about synchronization</li>

				   <li>prim - emit messages about drawing primitives</li>

				   <li>vert - emit messages about vertex assembly</li>

				   <li>dri - emit messages about the DRI interface</li>

				   <li>sf - emit messages about the strips &amp; fans unit (for old gens, includes the SF program)</li>

				   <li>stats - enable statistics counters. you probably actually want perfmon or intel_gpu_top instead.</li>

				   <li>urb - emit messages about URB setup</li>

				   <li>vs - dump shader assembly for vertex shaders</li>

				   <li>clip - emit messages about the clip unit (for old gens, includes the CLIP program)</li>

				   <li>aub - dump batches into an AUB trace for use with simulation tools</li>

				   <li>shader_time - record how much GPU time is spent in each shader</li>

				   <li>no16 - suppress generation of 16-wide fragment shaders. useful for debugging broken shaders</li>

				   <li>blorp - emit messages about the blorp operations (blits &amp; clears)</li>

				   <li>nodualobj - suppress generation of dual-object geometry shader code</li>

				</ul>

				</ul>
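Note on the hunk above: INTEL_DEBUG is documented as a comma-separated list of named flags. A minimal sketch of how such a value could be split and checked follows. It assumes that whitespace around names is ignored and that empty entries are skipped; the driver's own parser may behave differently. `KNOWN_FLAGS` is a hand-picked subset of the names listed above, and `parse_debug_flags` and `flags_from_env` are hypothetical helpers, not Mesa functions.

```python
import os

# Subset of the flag names listed in the hunk above (illustrative only).
KNOWN_FLAGS = {
    "tex", "state", "blit", "miptree", "perf", "perfmon",
    "bat", "fs", "vs", "gs", "no16", "shader_time",
}


def parse_debug_flags(value):
    """Split a comma-separated flag string into (recognized, unknown) sets."""
    names = {part.strip() for part in value.split(",") if part.strip()}
    return names & KNOWN_FLAGS, names - KNOWN_FLAGS


def flags_from_env(environ=os.environ):
    """Read INTEL_DEBUG from the given mapping; an unset variable yields no flags."""
    return parse_debug_flags(environ.get("INTEL_DEBUG", ""))
```

Returning the unrecognized names separately makes a misspelled flag easy to report instead of silently ignoring it.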

				@@ -144,14 +174,13 @@ Mesa EGL supports different sets of environment variables.  See the

				<h2>Gallium environment variables</h2>

				<ul>

				<li>GALLIUM_HUD - draws various information on the screen, like framerate,

				    cpu load, driver statistics, performance counters, etc.

				    Set GALLIUM_HUD=help and run e.g. glxgears for more info.

				<li>GALLIUM_LOG_FILE - specifies a file for logging all errors, warnings, etc.

				    rather than stderr.

				<li>GALLIUM_PRINT_OPTIONS - if non-zero, print all the Gallium environment

				    variables which are used, and their current values.

				<li>GALLIUM_NOSSE - if non-zero, do not use SSE runtime code generation for

				    shader execution

				<li>GALLIUM_NOPPC - if non-zero, do not use PPC runtime code generation for

				    shader execution

				<li>GALLIUM_DUMP_CPU - if non-zero, print information about the CPU on start-up

				<li>TGSI_PRINT_SANITY - if set, do extra sanity checking on TGSI shaders and

				    print any errors to stderr.

				@@ -159,6 +188,9 @@ Mesa EGL supports different sets of environment variables.  See the

				<LI>DRAW_NO_FSE - ???

				<li>DRAW_USE_LLVM - if set to zero, the draw module will not use LLVM to execute

				    shaders, vertex fetch, etc.

				<li>ST_DEBUG - controls debug output from the Mesa/Gallium state tracker.

				Setting to "tgsi", for example, will print all the TGSI shaders.

				See src/mesa/state_tracker/st_debug.c for other options.

				</ul>

				<h3>Softpipe driver environment variables</h3>

				@@ -169,14 +201,14 @@ Mesa EGL supports different sets of environment variables.  See the

				    to stderr

				<li>SOFTPIPE_NO_RAST - if set, rasterization is no-op'd.  For profiling purposes.

				<li>SOFTPIPE_USE_LLVM - if set, the softpipe driver will try to use LLVM JIT for

				    vertex shading procesing.

				    vertex shading processing.

				</ul>

				<h3>LLVMpipe driver environment variables</h3>

				<ul>

				<li>LP_NO_RAST - if set LLVMpipe will no-op rasterization

				<li>LP_DEBUG - a comma-separated list of debug options is acceptec.  See the

				<li>LP_DEBUG - a comma-separated list of debug options is accepted.  See the

				    source code for details.

				<li>LP_PERF - a comma-separated list of options to selectively no-op various

				    parts of the driver.  See the source code for details.

				@@ -185,6 +217,16 @@ Mesa EGL supports different sets of environment variables.  See the

				    cores present.

				</ul>

				<h3>VMware SVGA driver environment variables</h3>

				<ul>

				<li>SVGA_FORCE_SWTNL - force use of software vertex transformation

				<li>SVGA_NO_SWTNL - don't allow software vertex transformation fallbacks

				(will often result in incorrect rendering).

				<li>SVGA_DEBUG - for dumping shaders, constant buffers, etc.  See the code

				for details.

				<li>See the driver code for other, lesser-used variables.

				</ul>

				<p>

				Other Gallium drivers have their own environment variables.  These may change

									
										34

docs/extensions.html
									
												
				@@ -23,19 +23,27 @@ The specifications follow.

				<ul>

				<li><a href="MESA_agp_offset.spec">MESA_agp_offset.spec</a>

				<li><a href="MESA_copy_sub_buffer.spec">MESA_copy_sub_buffer.spec</a>

				<li><a href="MESA_packed_depth_stencil.spec">MESA_packed_depth_stencil.spec</a>

				<li><a href="MESA_pack_invert.spec">MESA_pack_invert.spec</a>

				<li><a href="MESA_pixmap_colormap.spec">MESA_pixmap_colormap.spec</a>

				<li><a href="MESA_release_buffers.spec">MESA_release_buffers.spec</a>

				<li><a href="MESA_resize_buffers.spec">MESA_resize_buffers.spec</a>

				<li><a href="MESA_set_3dfx_mode.spec">MESA_set_3dfx_mode.spec</a>

				<li><a href="MESA_sprite_point.spec">MESA_sprite_point.spec</a> (obsolete)

				<li><a href="MESA_texture_signed_rgba.spec">MESA_texture_signed_rgba.spec</a>

				<li><a href="MESA_trace.spec">MESA_trace.spec</a> (obsolete)

				<li><a href="MESA_window_pos.spec">MESA_window_pos.spec</a>

				<li><a href="MESA_ycbcr_texture.spec">MESA_ycbcr_texture.spec</a>

				<li><a href="specs/MESA_agp_offset.spec">MESA_agp_offset.spec</a>

				<li><a href="specs/MESA_copy_sub_buffer.spec">MESA_copy_sub_buffer.spec</a>

				<li><a href="specs/MESA_drm_image.spec">MESA_drm_image.spec</a>

				<li><a href="specs/MESA_multithread_makecurrent.spec">MESA_multithread_makecurrent.spec</a>

				<li><a href="specs/OLD/MESA_packed_depth_stencil.spec">MESA_packed_depth_stencil.spec</a> (obsolete)

				<li><a href="specs/MESA_pack_invert.spec">MESA_pack_invert.spec</a>

				<li><a href="specs/MESA_pixmap_colormap.spec">MESA_pixmap_colormap.spec</a>

				<li><a href="specs/OLD/MESA_program_debug.spec">MESA_program_debug.spec</a> (obsolete)

				<li><a href="specs/MESA_release_buffers.spec">MESA_release_buffers.spec</a>

				<li><a href="specs/OLD/MESA_resize_buffers.spec">MESA_resize_buffers.spec</a> (obsolete)

				<li><a href="specs/MESA_set_3dfx_mode.spec">MESA_set_3dfx_mode.spec</a>

				<li><a href="specs/MESA_shader_debug.spec">MESA_shader_debug.spec</a>

				<li><a href="specs/OLD/MESA_sprite_point.spec">MESA_sprite_point.spec</a> (obsolete)

				<li><a href="specs/MESA_swap_control.spec">MESA_swap_control.spec</a>

				<li><a href="specs/MESA_swap_frame_usage.spec">MESA_swap_frame_usage.spec</a>

				<li><a href="specs/MESA_texture_array.spec">MESA_texture_array.spec</a>

				<li><a href="specs/MESA_texture_signed_rgba.spec">MESA_texture_signed_rgba.spec</a>

				<li><a href="specs/OLD/MESA_trace.spec">MESA_trace.spec</a> (obsolete)

				<li><a href="specs/MESA_window_pos.spec">MESA_window_pos.spec</a>

				<li><a href="specs/MESA_ycbcr_texture.spec">MESA_ycbcr_texture.spec</a>

				<li><a href="specs/WL_bind_wayland_display.spec">WL_bind_wayland_display.spec</a>

				</ul>

				</div>

									
										8

docs/faq.html
									
												
				@@ -137,7 +137,7 @@ Just follow the Mesa <a href="install.html">compilation instructions</a>.

				<h2>1.6 Are there other open-source implementations of OpenGL?</h2>

				<p>

				Yes, SGI's <a href="http://oss.sgi.com/projects/ogl-sample/index.html">

				OpenGL Sample Implemenation (SI)</a> is available.

				OpenGL Sample Implementation (SI)</a> is available.

				The SI was written during the time that OpenGL was originally designed.

				Unfortunately, development of the SI has stagnated.

				Mesa is much more up to date with modern features and extensions.

				@@ -353,7 +353,7 @@ That's where Mesa development is discussed.

				</p>

				<p>

				The <a href="http://www.opengl.org/documentation">

				OpenGL Specification</a> is the bible for OpenGL implemention work.

				OpenGL Specification</a> is the bible for OpenGL implementation work.

				You should read it.

				</p>

				<p>Most of the Mesa development work involves implementing new OpenGL

				@@ -375,7 +375,7 @@ For a Gallium3D hardware driver, the r300g, r600g and the i915g are good example

				</p>

				<p>The DRI website has more information about writing hardware drivers.

				The process isn't well document because the Mesa driver interface changes

				over time, and we seldome have spare time for writing documentation.

				over time, and we seldom have spare time for writing documentation.

				That being said, many people have managed to figure out the process.

				</p>

				<p>

				@@ -390,7 +390,7 @@ The <a href="http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_compres

				indicates that there are intellectual property (IP) and/or patent issues

				to be dealt with.

				</p>

				<p>We've been unsucessful in getting a response from S3 (or whoever owns

				<p>We've been unsuccessful in getting a response from S3 (or whoever owns

				the IP nowadays) to indicate whether or not an open source project can

				implement the extension (specifically the compression/decompression

				algorithms).

									
										252

docs/index.html
									
												
				@@ -2,7 +2,7 @@

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa News</title>

				  <title>The Mesa 3D Graphics Library</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				@@ -16,10 +16,160 @@

				<h1>News</h1>

				<h2>April 18, 2014</h2>

				<p>

				<a href="relnotes/10.1.1.html">Mesa 10.1.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>April 18, 2014</h2>

				<p>

				<a href="relnotes/10.0.5.html">Mesa 10.0.5</a> is released.

				This is a bug-fix release.

				<br>

				NOTE: Since the 10.1.1 release is being released concurrently, it is

				anticipated that 10.0.5 will be the final release in the 10.0

				series. Users of 10.0 are encouraged to migrate to the 10.1 series in

				order to obtain future fixes.

				</p>

				<h2>March 12, 2014</h2>

				<p>

				<a href="relnotes/10.0.4.html">Mesa 10.0.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>March 4, 2014</h2>

				<p>

				<a href="relnotes/10.1.html">Mesa 10.1</a> is released.

				This is a new development release.

				See the release notes for more information about the release.

				</p>

				<h2>February 3, 2014</h2>

				<p>

				<a href="relnotes/10.0.3.html">Mesa 10.0.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>January 9, 2014</h2>

				<p>

				<a href="relnotes/10.0.2.html">Mesa 10.0.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>December 12, 2013</h2>

				<p>

				<a href="relnotes/10.0.1.html">Mesa 10.0.1</a>

				and <a href="relnotes/9.2.5.html">Mesa 9.2.5</a> are released.

				These are both bug-fix releases.

				</p>

				<h2>November 30, 2013</h2>

				<p>

				<a href="relnotes/10.0.html">Mesa 10.0</a> is released.

				This is a new development release.

				See the release notes for more information about the release.

				</p>

				<h2>November 27, 2013</h2>

				<p>

				<a href="relnotes/9.2.4.html">Mesa 9.2.4</a> is released.

				This is a bug fix release.

				</p>

				<h2>November 13, 2013</h2>

				<p>

				<a href="relnotes/9.2.3.html">Mesa 9.2.3</a> is released.

				This is a bug fix release.

				</p>

				<h2>October 18, 2013</h2>

				<p>

				<a href="relnotes/9.2.2.html">Mesa 9.2.2</a> is released.

				This is a bug fix release.

				</p>

				<h2>October 4, 2013</h2>

				<p>

				<a href="relnotes/9.2.1.html">Mesa 9.2.1</a> and

				<a href="relnotes/9.1.7.html">Mesa 9.1.7</a> are released,

				both bug-fix releases.

				</p>

				<h2>August 27, 2013</h2>

				<p>

				<a href="relnotes/9.2.html">Mesa 9.2</a> is released.

				This is a new development release.

				See the release notes for more information about the release.

				</p>

				<h2>August 1, 2013</h2>

				<p>

				<a href="relnotes/9.1.6.html">Mesa 9.1.6</a> is released.

				This is a bug fix release.

				</p>

				<h2>July 17, 2013</h2>

				<p>

				<a href="relnotes/9.1.5.html">Mesa 9.1.5</a> is released.

				This is a bug fix release.

				</p>

				<h2>July 1, 2013</h2>

				<p>

				<a href="relnotes/9.1.4.html">Mesa 9.1.4</a> is released.

				This is a bug fix release.

				</p>

				<h2>May 21, 2013</h2>

				<p>

				<a href="relnotes/9.1.3.html">Mesa 9.1.3</a> is released.

				This is a bug fix release.

				</p>

				<h2>April 30, 2013</h2>

				<p>

				<a href="relnotes/9.1.2.html">Mesa 9.1.2</a> is released.

				This is a bug fix release.

				</p>

				<h2>March 19, 2013</h2>

				<p>

				<a href="relnotes/9.1.1.html">Mesa 9.1.1</a> is released.

				This is a bug fix release.

				</p>

				<h2>February 24, 2013</h2>

				<p>

				Mesa demos 8.1.0 is released.

				See the <a href="http://lists.freedesktop.org/archives/mesa-dev/2013-February/035180.html">announcement</a> for more information about the release.

				You can download it from <a href="ftp://ftp.freedesktop.org/pub/mesa/demos/8.1.0/">ftp.freedesktop.org/pub/mesa/demos/8.1.0/</a>.

				</p>

				<h2>February 22, 2013</h2>

				<p>

				<a href="relnotes/9.1.html">Mesa 9.1</a> is released.

				This is a new development release.

				See the release notes for more information about the release.

				</p>

				<h2>February 21, 2013</h2>

				<p>

				<a href="relnotes/9.0.3.html">Mesa 9.0.3</a> is released.

				This is a bug fix release.

				</p>

				<h2>January 22, 2013</h2>

				<p>

				<a href="relnotes-9.0.2.html">Mesa 9.0.2</a> is released.

				<a href="relnotes/9.0.2.html">Mesa 9.0.2</a> is released.

				This is a bug fix release.

				</p>

				@@ -27,7 +177,7 @@ This is a bug fix release.

				<h2>November 16, 2012</h2>

				<p>

				<a href="relnotes-9.0.1.html">Mesa 9.0.1</a> is released.

				<a href="relnotes/9.0.1.html">Mesa 9.0.1</a> is released.

				This is a bug fix release.

				</p>

				@@ -35,7 +185,7 @@ This is a bug fix release.

				<h2>October 24, 2012</h2>

				<p>

				<a href="relnotes-8.0.5.html">Mesa 8.0.5</a> is released.

				<a href="relnotes/8.0.5.html">Mesa 8.0.5</a> is released.

				This is a bug fix release.

				</p>

				@@ -43,7 +193,7 @@ This is a bug fix release.

				<h2>October 8, 2012</h2>

				<p>

				<a href="relnotes-9.0.html">Mesa 9.0</a> is released.

				<a href="relnotes/9.0.html">Mesa 9.0</a> is released.

				This is the first version of Mesa to support OpenGL 3.1 and GLSL 1.40

				(with the i965 driver).

				See the release notes for more information about the release.

				@@ -53,7 +203,7 @@ See the release notes for more information about the release.

				<h2>July 10, 2012</h2>

				<p>

				<a href="relnotes-8.0.4.html">Mesa 8.0.4</a> is released.

				<a href="relnotes/8.0.4.html">Mesa 8.0.4</a> is released.

				This is a bug fix release.

				</p>

				@@ -61,7 +211,7 @@ This is a bug fix release.

				<h2>May 18, 2012</h2>

				<p>

				<a href="relnotes-8.0.3.html">Mesa 8.0.3</a> is released.

				<a href="relnotes/8.0.3.html">Mesa 8.0.3</a> is released.

				This is a bug fix release.

				</p>

				@@ -69,7 +219,7 @@ This is a bug fix release.

				<h2>March 21, 2012</h2>

				<p>

				<a href="relnotes-8.0.2.html">Mesa 8.0.2</a> is released.

				<a href="relnotes/8.0.2.html">Mesa 8.0.2</a> is released.

				This is a bug fix release.

				</p>

				@@ -77,14 +227,14 @@ This is a bug fix release.

				<h2>February 16, 2012</h2>

				<p>

				<a href="relnotes-8.0.1.html">Mesa 8.0.1</a> is released.  This is a bug fix

				<a href="relnotes/8.0.1.html">Mesa 8.0.1</a> is released.  This is a bug fix

				release.  See the release notes for more information about the release.

				</p>

				<h2>February 9, 2012</h2>

				<p>

				<a href="relnotes-8.0.html">Mesa 8.0</a> is released.

				<a href="relnotes/8.0.html">Mesa 8.0</a> is released.

				This is the first version of Mesa to support OpenGL 3.0 and GLSL 1.30

				(with the i965 driver).

				See the release notes for more information about the release.

				@@ -94,7 +244,7 @@ See the release notes for more information about the release.

				<h2>November 27, 2011</h2>

				<p>

				<a href="relnotes-7.11.2.html">Mesa 7.11.2</a> is released.  This is a bug fix

				<a href="relnotes/7.11.2.html">Mesa 7.11.2</a> is released.  This is a bug fix

				release.  This release was made primarily to fix build problems with 7.11.1 on

				Mandriva and to fix problems related to glCopyTexImage to luminance-alpha

				textures.  The later was believed to have been fixed in 7.11.1 but was not.

				@@ -103,36 +253,36 @@ textures.  The later was believed to have been fixed in 7.11.1 but was not.

				<h2>November 17, 2011</h2>

				<p>

				<a href="relnotes-7.11.1.html">Mesa 7.11.1</a> is released.  This is a bug

				<a href="relnotes/7.11.1.html">Mesa 7.11.1</a> is released.  This is a bug

				fix release.

				</p>

				<h2>July 31, 2011</h2>

				<p>

				<a href="relnotes-7.11.html">Mesa 7.11</a> (final) is released.  This is a new

				<a href="relnotes/7.11.html">Mesa 7.11</a> (final) is released.  This is a new

				development release.

				</p>

				<h2>June 13, 2011</h2>

				<p>

				<a href="relnotes-7.10.3.html">Mesa 7.10.3</a> is released.  This is a bug

				<a href="relnotes/7.10.3.html">Mesa 7.10.3</a> is released.  This is a bug

				fix release.

				</p>

				<h2>April 6, 2011</h2>

				<p>

				<a href="relnotes-7.10.2.html">Mesa 7.10.2</a> is released.  This is a bug

				<a href="relnotes/7.10.2.html">Mesa 7.10.2</a> is released.  This is a bug

				fix release.

				</p>

				<h2>March 2, 2011</h2>

				<p>

				<a href="relnotes-7.9.2.html">Mesa 7.9.2</a> and

				<a href="relnotes-7.10.1.html">Mesa 7.10.1</a> are released.  These are

				<a href="relnotes/7.9.2.html">Mesa 7.9.2</a> and

				<a href="relnotes/7.10.1.html">Mesa 7.10.1</a> are released.  These are

				stable releases containing bug fixes since the 7.9.1 and 7.10 releases.

				</p>

				@@ -140,7 +290,7 @@ stable releases containing bug fixes since the 7.9.1 and 7.10 releases.

				<h2>October 4, 2010</h2>

				<p>

				<a href="relnotes-7.9.html">Mesa 7.9</a> (final) is released.  This is a new

				<a href="relnotes/7.9.html">Mesa 7.9</a> (final) is released.  This is a new

				development release.

				</p>

				@@ -148,7 +298,7 @@ development release.

				<h2>September 27, 2010</h2>

				<p>

				<a href="relnotes-7.9.html">Mesa 7.9.0-rc1</a> is released.  This is a

				<a href="relnotes/7.9.html">Mesa 7.9.0-rc1</a> is released.  This is a

				release candidate for the 7.9 development release.

				</p>

				@@ -156,7 +306,7 @@ release candidate for the 7.9 development release.

				<h2>June 16, 2010</h2>

				<p>

				<a href="relnotes-7.8.2.html">Mesa 7.8.2</a> is released.  This is a bug-fix

				<a href="relnotes/7.8.2.html">Mesa 7.8.2</a> is released.  This is a bug-fix

				release collecting fixes since the 7.8.1 release.

				</p>

				@@ -164,18 +314,18 @@ release collecting fixes since the 7.8.1 release.

				<h2>April 5, 2010</h2>

				<p>

				<a href="relnotes-7.8.1.html">Mesa 7.8.1</a> is released.  This is a bug-fix

				<a href="relnotes/7.8.1.html">Mesa 7.8.1</a> is released.  This is a bug-fix

				release for a few critical issues in the 7.8 release.

				</p>

				<h2>March 28, 2010</h2>

				<p>

				<a href="relnotes-7.7.1.html">Mesa 7.7.1</a> is released.  This is a bug-fix

				<a href="relnotes/7.7.1.html">Mesa 7.7.1</a> is released.  This is a bug-fix

				release fixing issues found in the 7.7 release.

				</p>

				<p>

				Also, <a href="relnotes-7.8.html">Mesa 7.8</a> is released.  This is a new

				Also, <a href="relnotes/7.8.html">Mesa 7.8</a> is released.  This is a new

				development release.

				</p>

				@@ -183,37 +333,37 @@ development release.

				<h2>December 21, 2009</h2>

				<p>

				<a href="relnotes-7.6.1.html">Mesa 7.6.1</a> is released.  This is a bug-fix

				<a href="relnotes/7.6.1.html">Mesa 7.6.1</a> is released.  This is a bug-fix

				release fixing issues found in the 7.6 release.

				</p>

				<p>

				Also, <a href="relnotes-7.7.html">Mesa 7.7</a> is released.  This is a new

				Also, <a href="relnotes/7.7.html">Mesa 7.7</a> is released.  This is a new

				development release.

				</p>

				<h2>September 28, 2009</h2>

				<p>

				<a href="relnotes-7.6.html">Mesa 7.6</a> is released.  This is a new feature

				<a href="relnotes/7.6.html">Mesa 7.6</a> is released.  This is a new feature

				release.  Those especially concerned about stability may want to wait for the

				follow-on 7.6.1 bug-fix release.

				</p>

				<p>

				<a href="relnotes-7.5.2.html">Mesa 7.5.2</a> is also released.

				<a href="relnotes/7.5.2.html">Mesa 7.5.2</a> is also released.

				This is a stable release fixing bugs since the 7.5.1 release.

				</p>

				<h2>September 3, 2009</h2>

				<p>

				<a href="relnotes-7.5.1.html">Mesa 7.5.1</a> is released.

				<a href="relnotes/7.5.1.html">Mesa 7.5.1</a> is released.

				This is a bug-fix release which fixes bugs found in version 7.5.

				</p>

				<h2>July 17, 2009</h2>

				<p>

				<a href="relnotes-7.5.html">Mesa 7.5</a> is released.

				<a href="relnotes/7.5.html">Mesa 7.5</a> is released.

				This is a new features release.  People especially concerned about

				stability may want to wait for the follow-on 7.5.1 bug-fix release.

				</p>

				@@ -221,7 +371,7 @@ stability may want to wait for the follow-on 7.5.1 bug-fix release.

				<h2>June 23, 2009</h2>

				<p>

				<a href="relnotes-7.4.4.html">Mesa 7.4.4</a> is released.

				<a href="relnotes/7.4.4.html">Mesa 7.4.4</a> is released.

				This is a stable release that fixes a regression in the i915/i965 drivers

				that slipped into the 7.4.3 release.

				</p>

				@@ -229,35 +379,35 @@ that slipped into the 7.4.3 release.

				<h2>June 19, 2009</h2>

				<p>

				<a href="relnotes-7.4.3.html">Mesa 7.4.3</a> is released.

				<a href="relnotes/7.4.3.html">Mesa 7.4.3</a> is released.

				This is a stable release fixing bugs since the 7.4.2 release.

				</p>

				<h2>May 15, 2009</h2>

				<p>

				<a href="relnotes-7.4.2.html">Mesa 7.4.2</a> is released.

				<a href="relnotes/7.4.2.html">Mesa 7.4.2</a> is released.

				This is a stable release fixing bugs since the 7.4.1 release.

				</p>

				<h2>April 18, 2009</h2>

				<p>

				<a href="relnotes-7.4.1.html">Mesa 7.4.1</a> is released.

				<a href="relnotes/7.4.1.html">Mesa 7.4.1</a> is released.

				This is a stable release fixing bugs since the 7.4 release.

				</p>

				<h2>March 27, 2009</h2>

				<p>

				<a href="relnotes-7.4.html">Mesa 7.4</a> is released.

				<a href="relnotes/7.4.html">Mesa 7.4</a> is released.

				This is a stable release fixing bugs since the 7.3 release.

				</p>

				<h2>January 22, 2009</h2>

				<p>

				<a href="relnotes-7.3.html">Mesa 7.3</a> is released.

				<a href="relnotes/7.3.html">Mesa 7.3</a> is released.

				This is a new development release.

				Mesa 7.4 will follow and will have bug fixes relative to 7.3.

				</p>

				@@ -265,14 +415,14 @@ Mesa 7.4 will follow and will have bug fixes relative to 7.3.

				<h2>September 20, 2008</h2>

				<p>

				<a href="relnotes-7.2.html">Mesa 7.2</a> is released.

				<a href="relnotes/7.2.html">Mesa 7.2</a> is released.

				This is a stable, bug-fix release.

				</p>

				<h2>August 26, 2008</h2>

				<p>

				<a href="relnotes-7.1.html">Mesa 7.1</a> is released.

				<a href="relnotes/7.1.html">Mesa 7.1</a> is released.

				This is a new development release.

				It should be relatively stable, but those especially concerned about

				stability should wait for the 7.2 release or use Mesa 7.0.4 (the

				@@ -282,14 +432,14 @@ previous stable release).

				<h2>August 16, 2008</h2>

				<p>

				<a href="relnotes-7.0.4.html">Mesa 7.0.4</a> is released.

				<a href="relnotes/7.0.4.html">Mesa 7.0.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>April 4, 2008</h2>

				<p>

				<a href="relnotes-7.0.3.html">Mesa 7.0.3</a> is released.

				<a href="relnotes/7.0.3.html">Mesa 7.0.3</a> is released.

				This is a bug-fix release.

				</p>

				@@ -318,28 +468,28 @@ but other drivers will be coming...

				<h2>November 10, 2007</h2>

				<p>

				<a href="relnotes-7.0.2.html">Mesa 7.0.2</a> is released.

				<a href="relnotes/7.0.2.html">Mesa 7.0.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>August 3, 2007</h2>

				<p>

				<a href="relnotes-7.0.1.html">Mesa 7.0.1</a> is released.

				<a href="relnotes/7.0.1.html">Mesa 7.0.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>June 22, 2007</h2>

				<p>

				<a href="relnotes-7.0.html">Mesa 7.0</a> is released.

				<a href="relnotes/7.0.html">Mesa 7.0</a> is released.

				This is a stable release featuring OpenGL 2.1 support.

				</p>

				<h2>April 27, 2007</h2>

				<p>

				<a href="relnotes-6.5.3.html">Mesa 6.5.3</a> is released.

				<a href="relnotes/6.5.3.html">Mesa 6.5.3</a> is released.

				This is a development release which will lead up to the Mesa 7.0 release

				(which will advertise OpenGL 2.1 API support).

				</p>

				@@ -370,33 +520,33 @@ See the <a href="repository.html">repository page</a> for more information.

				<h2>December 2, 2006</h2>

				<p>

				<a href="relnotes-6.5.2.html">Mesa 6.5.2</a> has been released.

				<a href="relnotes/6.5.2.html">Mesa 6.5.2</a> has been released.

				This is a new development release.

				</p>

				<h2>September 15, 2006</h2>

				<p>

				<a href="relnotes-6.5.1.html">Mesa 6.5.1</a> has been released.

				<a href="relnotes/6.5.1.html">Mesa 6.5.1</a> has been released.

				This is a new development release.

				</p>

				<h2>March 31, 2006</h2>

				<p>

				<a href="relnotes-6.5.html">Mesa 6.5</a> has been released.

				<a href="relnotes/6.5.html">Mesa 6.5</a> has been released.

				This is a new development release.

				</p>

				<h2>February 2, 2006</h2>

				<p>

				<a href="relnotes-6.4.2.html">Mesa 6.4.2</a> has been released.

				<a href="relnotes/6.4.2.html">Mesa 6.4.2</a> has been released.

				This is a stable, bug-fix release.

				</p>

				<h2>November 29, 2005</h2>

				<p>

				<a href="relnotes-6.4.1.html">Mesa 6.4.1</a> has been released.

				<a href="relnotes/6.4.1.html">Mesa 6.4.1</a> has been released.

				This is a stable, bug-fix release.

				</p>

				@@ -404,7 +554,7 @@ This is stable, bug-fix release.

				<h2>October 24, 2005</h2>

				<p>

				<a href="relnotes-6.4.html">Mesa 6.4</a> has been released.

				<a href="relnotes/6.4.html">Mesa 6.4</a> has been released.

				This is a stable, bug-fix release.

				</p>

				@@ -723,8 +873,8 @@ OpenGL 1.5 features.

					- demo of per-pixel lighting with a fragment program (demos/fplight.c)

					- new version (18) of glext.h header

					- new spriteblast.c demo of GL_ARB_point_sprite

					- faster glDrawPixels in X11 driver in some cases (see RELNOTES-5.1)

					- faster glCopyPixels in X11 driver in some cases (see RELNOTES-5.1)

					- faster glDrawPixels in X11 driver in some cases (see relnotes/5.1)

					- faster glCopyPixels in X11 driver in some cases (see relnotes/5.1)

				    Bug fixes:

					- really enable OpenGL 1.4 features in DOS driver.

					- fixed issues in glDrawPixels and glCopyPixels for very wide images

									
										4

docs/install.html
									
												
				@@ -44,10 +44,6 @@ On Windows with MinGW, install flex and bison with:

				</li>

				<li>python - Python is needed for building the Gallium components.

				Version 2.6.4 or later should work.

				<br>

				<br>

				To build OpenGL ES 1.1 and 2.0 you'll also need

				<a href="http://xmlsoft.org/sources/win32/python/libxml2-python-2.7.7.win32-py2.7.exe">libxml2-python</a>.

				</li>

				</ul>

									
										10

docs/license.html
									
												
				@@ -75,9 +75,10 @@ in all copies or substantial portions of the Software.

				THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS

				OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,

				FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL

				BRIAN PAUL BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN

				AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN

				CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

				THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER

				LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,

				OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE

				SOFTWARE.

				</pre>

				@@ -102,6 +103,9 @@ Device drivers    src/mesa/drivers/*     MIT, generally

				Ext headers       include/GL/glext.h     Khronos

				                  include/GL/glxext.h

				C11 thread        include/c11/threads*.h Boost (permissive)

				emulation

				</pre>

				<p>

									
										99

docs/llvmpipe.html
									
												
				@@ -130,38 +130,38 @@ need to ask, don't even try it.

				<h1>Profiling</h1>

				To profile llvmpipe you should pass the options

				<p>

				To profile llvmpipe you should build as

				</p>

				<pre>

				  scons build=profile &lt;same-as-before&gt;

				</pre>

				<p>

				This will ensure that frame pointers are used both in C and JIT functions, and

				that no tail call optimizations are done by gcc.

				</p>

				To better profile JIT code you'll need to build LLVM with oprofile integration.

				<h2>Linux perf integration</h2>

				<p>

				On Linux, it is possible to have symbol resolution of JIT code with <a href="http://perf.wiki.kernel.org/">Linux perf</a>:

				</p>

				<pre>

				  ./configure \

				      --prefix=$install_dir \

				      --enable-optimized \

				      --disable-profiling \

				      --enable-targets=host-only \

				      --with-oprofile

				  make -C "$build_dir"

				  make -C "$build_dir" install

				  find "$install_dir/lib" -iname '*.a' -print0 | xargs -0 strip --strip-debug

					perf record -g /my/application

					perf report

				</pre>

				Then you should define

				<p>

				When run inside Linux perf, llvmpipe will create a /tmp/perf-XXXXX.map file with

				symbol address table.  It also dumps assembly code to /tmp/perf-XXXXX.map.asm,

				which can be used by the bin/perf-annotate-jit script to produce disassembly of

				the generated code annotated with the samples.

				</p>
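<p>
As a rough illustration of how such a symbol address table can be consumed,
the sketch below parses map text and resolves an address to a symbol name.
It assumes the usual Linux perf JIT map layout (one symbol per line: start
address in hex, size in hex, then the name); the function names
<code>parse_perf_map</code> and <code>lookup</code> and the sample symbol
names in the usage note are hypothetical and not part of Mesa.
</p>

```python
import bisect


def parse_perf_map(text):
    """Parse perf map text into a sorted list of (start, size, name).

    Assumed line layout: "<hex start> <hex size> <symbol name>".
    Lines that do not match this layout are skipped.
    """
    symbols = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        try:
            start = int(parts[0], 16)
            size = int(parts[1], 16)
        except ValueError:
            continue
        symbols.append((start, size, parts[2].strip()))
    symbols.sort()
    return symbols


def lookup(symbols, addr):
    """Return the name of the symbol whose range contains addr, or None."""
    starts = [start for start, _, _ in symbols]
    # Index of the last symbol that starts at or before addr.
    i = bisect.bisect_right(starts, addr) - 1
    if i >= 0:
        start, size, name = symbols[i]
        if addr < start + size:
            return name
    return None
```

<p>
For example, with a made-up table containing the line
<code>7f00 20 fs_variant_0</code>, <code>lookup(symbols, 0x7f10)</code> would
return <code>fs_variant_0</code>, while an address outside every range yields
<code>None</code>.
</p>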

				<pre>

				  export LLVM=/path/to/llvm-2.6-profile

				</pre>

				and rebuild.

				<p>You can obtain a call graph via

				<a href="http://code.google.com/p/jrfonseca/wiki/Gprof2Dot#linux_perf">Gprof2Dot</a>.</p>

				<h1>Unit testing</h1>

				@@ -203,11 +203,66 @@ for posterior analysis, e.g.:

				  We use LLVM-C bindings for now. They are not documented, but follow the C++

				  interfaces very closely, and appear to be complete enough for code

				  generation. See 

				  http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html

				  for a stand-alone example.  See the llvm-c/Core.h file for reference.

				  <a href="http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html">

				  this stand-alone example</a>.  See the llvm-c/Core.h file for reference.

				</li>

				</ul>

				<h1 id="recommended_reading">Recommended Reading</h1>

				<ul>

				  <li>

				    <p>Rasterization</p>

				    <ul>

				      <li><a href="http://www.cs.unc.edu/~olano/papers/2dh-tri/">Triangle Scan Conversion using 2D Homogeneous Coordinates</a></li>

				      <li><a href="http://www.drdobbs.com/parallel/rasterization-on-larrabee/217200602">Rasterization on Larrabee</a> (<a href="http://devmaster.net/posts/2887/rasterization-on-larrabee">DevMaster copy</a>)</li>

				      <li><a href="http://devmaster.net/posts/6133/rasterization-using-half-space-functions">Rasterization using half-space functions</a></li>

				      <li><a href="http://devmaster.net/posts/6145/advanced-rasterization">Advanced Rasterization</a></li>

				      <li><a href="http://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/">Optimizing Software Occlusion Culling</a></li>

				    </ul>

				  </li>

				  <li>

				    <p>Texture sampling</p>

				    <ul>

				      <li><a href="http://chrishecker.com/Miscellaneous_Technical_Articles#Perspective_Texture_Mapping">Perspective Texture Mapping</a></li>

				      <li><a href="http://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml">Texturing As In Unreal</a></li>

				      <li><a href="http://www.gamasutra.com/view/feature/3301/runtime_mipmap_filtering.php">Run-Time MIP-Map Filtering</a></li>

				      <li><a href="http://alt.3dcenter.org/artikel/2003/10-26_a_english.php">Will "brilinear" filtering persist?</a></li>

				      <li><a href="http://ixbtlabs.com/articles2/gffx/nv40-rx800-3.html">Trilinear filtering</a></li>

				      <li><a href="http://devmaster.net/posts/12785/texture-swizzling">Texture Swizzling</a></li>

				    </ul>

				  </li>

				  <li>

				    <p>SIMD</p>

				    <ul>

				      <li><a href="http://www.cdl.uni-saarland.de/projects/wfv/#header4">Whole-Function Vectorization</a></li>

				    </ul>

				  </li>

				  <li>

				    <p>Optimization</p>

				    <ul>

				      <li><a href="http://www.drdobbs.com/optimizing-pixomatic-for-modern-x86-proc/184405807">Optimizing Pixomatic For Modern x86 Processors</a></li>

				      <li><a href="http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html">Intel 64 and IA-32 Architectures Optimization Reference Manual</a></li>

				      <li><a href="http://www.agner.org/optimize/">Software optimization resources</a></li>

				      <li><a href="http://software.intel.com/en-us/articles/intel-intrinsics-guide">Intel Intrinsics Guide</a></li>

				    </ul>

				  </li>

				  <li>

				    <p>LLVM</p>

				    <ul>

				      <li><a href="http://llvm.org/docs/LangRef.html">LLVM Language Reference Manual</a></li>

				      <li><a href="http://npcontemplation.blogspot.co.uk/2008/06/secret-of-llvm-c-bindings.html">The secret of LLVM C bindings</a></li>

				    </ul>

				  </li>

				  <li>

				    <p>General</p>

				    <ul>

				      <li><a href="http://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/">A trip through the Graphics Pipeline</a></li>

				      <li><a href="http://msdn.microsoft.com/en-us/library/gg615082.aspx#architecture">WARP Architecture and Performance</a></li>

				    </ul>

				  </li>

				</ul>

				</div>

				</body>

				</html>

									
										4

docs/opengles.html
									
												
				@@ -16,7 +16,7 @@

				<h1>OpenGL ES</h1>

				<p>Mesa implements OpenGL ES 1.1 and OpenGL ES 2.0.  More informations about

				<p>Mesa implements OpenGL ES 1.1 and OpenGL ES 2.0.  More information about

				OpenGL ES can be found at <a href="http://www.khronos.org/opengles/">

				http://www.khronos.org/opengles/</a>.</p>

				@@ -48,7 +48,7 @@ EGL drivers for your hardware.</p>

				<h3>Dispatch Table</h3>

				<p>OpenGL ES has an additional indirection when dispatching fucntions</p>

				<p>OpenGL ES has an additional indirection when dispatching functions</p>

				<pre>

				  Mesa:       glFoo() --&gt; _mesa_Foo()

									
										2

docs/openvg.html
									
												
				@@ -20,7 +20,7 @@

				The current version of the OpenVG state tracker implements OpenVG 1.1.

				</p>

				<p>

				More informations about OpenVG can be found at

				More information about OpenVG can be found at

				<a href="http://www.khronos.org/openvg/">

				http://www.khronos.org/openvg/</a> .

				</p>

									
										71

docs/osmesa.html
									
												
				@@ -18,77 +18,62 @@

				<p>

				Mesa's off-screen rendering interface is used for rendering into

				user-allocated blocks of memory.

				Mesa's off-screen interface is used for rendering into user-allocated memory

				without any sort of window system or operating system dependencies.

				That is, the GL_FRONT colorbuffer is actually a buffer in main memory,

				rather than a window on your display.

				There are no window system or operating system dependencies.

				One potential application is to use Mesa as an off-line, batch-style renderer.

				</p>

				<p>

				The <b>OSMesa</b> API provides three basic functions for making off-screen

				The OSMesa API provides three basic functions for making off-screen

				renderings: OSMesaCreateContext(), OSMesaMakeCurrent(), and

				OSMesaDestroyContext().  See the Mesa/include/GL/osmesa.h header for

				more information about the API functions.

				</p>

				<p>

				The OSMesa interface may be used with any of three software renderers:

				</p>

				<ol>

				<li>llvmpipe - this is the high-performance Gallium LLVM driver

				<li>softpipe - this is the reference Gallium software driver

				<li>swrast - this is the legacy Mesa software rasterizer

				</ol>

				<p>

				There are several examples of OSMesa in the mesa/demos repository.

				</p>

				<h2>Deep color channels</h2>

				<h1>Building OSMesa</h1>

				<p>

				For some applications 8-bit color channels don't have sufficient

				precision.

				OSMesa supports 16-bit and 32-bit color channels through the OSMesa interface.

				When using 16-bit channels, channels are GLushorts and RGBA pixels occupy

				8 bytes.

				When using 32-bit channels, channels are GLfloats and RGBA pixels occupy

				16 bytes.

				</p>

				Configure and build Mesa with something like:

				<p>

				Before version 6.5.1, Mesa had to be recompiled to support exactly

				one of 8, 16 or 32-bit channels.

				With Mesa 6.5.1, Mesa can be compiled for either 8, 16 or 32-bit channels

				and render into any of the smaller size channels.

				For example, if Mesa's compiled for 32-bit channels, you can also render

				16 and 8-bit channel images.

				</p>

				<p>

				To build Mesa/OSMesa for 16 and 8-bit color channel support:

				<pre>

				      make realclean

				      make linux-osmesa16

				configure --enable-osmesa --disable-driglx-direct --disable-dri --with-gallium-drivers=swrast

				make

				</pre>

				<p>

				To build Mesa/OSMesa for 32, 16 and 8-bit color channel support:

				Make sure you have LLVM installed first if you want to use the llvmpipe driver.

				</p>

				<p>

				When the build is complete you should find:

				</p>

				<pre>

				      make realclean

				      make linux-osmesa32

				lib/libOSMesa.so  (swrast-based OSMesa)

				lib/gallium/libOSMesa.so  (gallium-based OSMesa)

				</pre>

				<p>

				You'll wind up with a library named libOSMesa16.so or libOSMesa32.so.

				Otherwise, most Mesa configurations build an 8-bit/channel libOSMesa.so library

				by default.

				Set your LD_LIBRARY_PATH to point to one directory or the other to select

				the library you want to use.

				</p>

				<p>

				If performance is important, compile Mesa for the channel size you're

				most interested in.

				</p>

				<p>

				If you need to compile on a non-Linux platform, copy Mesa/configs/linux-osmesa16

				to a new config file and edit it as needed.  Then, add the new config name to

				the top-level Makefile.  Send a patch to the Mesa developers too, if you're

				inclined.

				When you link your application, link with -lOSMesa.

				</p>

				</div>

									
										172

docs/relnotes.html
									
												
				@@ -21,57 +21,79 @@ The release notes summarize what's new or changed in each Mesa release.

				</p>

				<ul>

				<li><a href="relnotes-9.1.html">9.1 release notes</a>

				<li><a href="relnotes-9.0.2.html">9.0.2 release notes</a>

				<li><a href="relnotes-9.0.1.html">9.0.1 release notes</a>

				<li><a href="relnotes-9.0.html">9.0 release notes</a>

				<li><a href="relnotes-8.0.5.html">8.0.5 release notes</a>

				<li><a href="relnotes-8.0.4.html">8.0.4 release notes</a>

				<li><a href="relnotes-8.0.3.html">8.0.3 release notes</a>

				<li><a href="relnotes-8.0.2.html">8.0.2 release notes</a>

				<li><a href="relnotes-8.0.1.html">8.0.1 release notes</a>

				<li><a href="relnotes-8.0.html">8.0 release notes</a>

				<li><a href="relnotes-7.11.2.html">7.11.2 release notes</a>

				<li><a href="relnotes-7.11.1.html">7.11.1 release notes</a>

				<li><a href="relnotes-7.11.html">7.11 release notes</a>

				<li><a href="relnotes-7.10.3.html">7.10.3 release notes</a>

				<li><a href="relnotes-7.10.2.html">7.10.2 release notes</a>

				<li><a href="relnotes-7.10.1.html">7.10.1 release notes</a>

				<li><a href="relnotes-7.10.html">7.10 release notes</a>

				<li><a href="relnotes-7.9.2.html">7.9.2 release notes</a>

				<li><a href="relnotes-7.9.1.html">7.9.1 release notes</a>

				<li><a href="relnotes-7.9.html">7.9 release notes</a>

				<li><a href="relnotes-7.8.3.html">7.8.3 release notes</a>

				<li><a href="relnotes-7.8.2.html">7.8.2 release notes</a>

				<li><a href="relnotes-7.8.1.html">7.8.1 release notes</a>

				<li><a href="relnotes-7.8.html">7.8 release notes</a>

				<li><a href="relnotes-7.7.1.html">7.7.1 release notes</a>

				<li><a href="relnotes-7.7.html">7.7 release notes</a>

				<li><a href="relnotes-7.6.1.html">7.6.1 release notes</a>

				<li><a href="relnotes-7.6.html">7.6 release notes</a>

				<li><a href="relnotes-7.5.2.html">7.5.2 release notes</a>

				<li><a href="relnotes-7.5.1.html">7.5.1 release notes</a>

				<li><a href="relnotes-7.5.html">7.5 release notes</a>

				<li><a href="relnotes-7.4.4.html">7.4.4 release notes</a>

				<li><a href="relnotes-7.4.3.html">7.4.3 release notes</a>

				<li><a href="relnotes-7.4.2.html">7.4.2 release notes</a>

				<li><a href="relnotes-7.4.1.html">7.4.1 release notes</a>

				<li><a href="relnotes-7.4.html">7.4 release notes</a>

				<li><a href="relnotes-7.3.html">7.3 release notes</a>

				<li><a href="relnotes-7.2.html">7.2 release notes</a>

				<li><a href="relnotes-7.1.html">7.1 release notes</a>

				<li><a href="relnotes-7.0.4.html">7.0.4 release notes</a>

				<li><a href="relnotes-7.0.3.html">7.0.3 release notes</a>

				<li><a href="relnotes-7.0.2.html">7.0.2 release notes</a>

				<li><a href="relnotes-7.0.1.html">7.0.1 release notes</a>

				<li><a href="relnotes-7.0.html">7.0 release notes</a>

				<li><a href="relnotes-6.5.3.html">6.5.3 release notes</a>

				<li><a href="relnotes-6.5.2.html">6.5.2 release notes</a>

				<li><a href="relnotes-6.5.1.html">6.5.1 release notes</a>

				<li><a href="relnotes-6.5.html">6.5 release notes</a>

				<li><a href="relnotes-6.4.2.html">6.4.2 release notes</a>

				<li><a href="relnotes-6.4.1.html">6.4.1 release notes</a>

				<li><a href="relnotes-6.4.html">6.4 release notes</a>

				<li><a href="relnotes/10.1.1.html">10.1.1 release notes</a>

				<li><a href="relnotes/10.1.html">10.1 release notes</a>

				<li><a href="relnotes/10.0.5.html">10.0.5 release notes</a>

				<li><a href="relnotes/10.0.4.html">10.0.4 release notes</a>

				<li><a href="relnotes/10.0.3.html">10.0.3 release notes</a>

				<li><a href="relnotes/10.0.2.html">10.0.2 release notes</a>

				<li><a href="relnotes/10.0.1.html">10.0.1 release notes</a>

				<li><a href="relnotes/10.0.html">10.0 release notes</a>

				<li><a href="relnotes/9.2.5.html">9.2.5 release notes</a>

				<li><a href="relnotes/9.2.4.html">9.2.4 release notes</a>

				<li><a href="relnotes/9.2.3.html">9.2.3 release notes</a>

				<li><a href="relnotes/9.2.2.html">9.2.2 release notes</a>

				<li><a href="relnotes/9.2.1.html">9.2.1 release notes</a>

				<li><a href="relnotes/9.2.html">9.2 release notes</a>

				<li><a href="relnotes/9.1.7.html">9.1.7 release notes</a>

				<li><a href="relnotes/9.1.6.html">9.1.6 release notes</a>

				<li><a href="relnotes/9.1.5.html">9.1.5 release notes</a>

				<li><a href="relnotes/9.1.4.html">9.1.4 release notes</a>

				<li><a href="relnotes/9.1.3.html">9.1.3 release notes</a>

				<li><a href="relnotes/9.1.2.html">9.1.2 release notes</a>

				<li><a href="relnotes/9.1.1.html">9.1.1 release notes</a>

				<li><a href="relnotes/9.1.html">9.1 release notes</a>

				<li><a href="relnotes/9.0.3.html">9.0.3 release notes</a>

				<li><a href="relnotes/9.0.2.html">9.0.2 release notes</a>

				<li><a href="relnotes/9.0.1.html">9.0.1 release notes</a>

				<li><a href="relnotes/9.0.html">9.0 release notes</a>

				<li><a href="relnotes/8.0.5.html">8.0.5 release notes</a>

				<li><a href="relnotes/8.0.4.html">8.0.4 release notes</a>

				<li><a href="relnotes/8.0.3.html">8.0.3 release notes</a>

				<li><a href="relnotes/8.0.2.html">8.0.2 release notes</a>

				<li><a href="relnotes/8.0.1.html">8.0.1 release notes</a>

				<li><a href="relnotes/8.0.html">8.0 release notes</a>

				<li><a href="relnotes/7.11.2.html">7.11.2 release notes</a>

				<li><a href="relnotes/7.11.1.html">7.11.1 release notes</a>

				<li><a href="relnotes/7.11.html">7.11 release notes</a>

				<li><a href="relnotes/7.10.3.html">7.10.3 release notes</a>

				<li><a href="relnotes/7.10.2.html">7.10.2 release notes</a>

				<li><a href="relnotes/7.10.1.html">7.10.1 release notes</a>

				<li><a href="relnotes/7.10.html">7.10 release notes</a>

				<li><a href="relnotes/7.9.2.html">7.9.2 release notes</a>

				<li><a href="relnotes/7.9.1.html">7.9.1 release notes</a>

				<li><a href="relnotes/7.9.html">7.9 release notes</a>

				<li><a href="relnotes/7.8.3.html">7.8.3 release notes</a>

				<li><a href="relnotes/7.8.2.html">7.8.2 release notes</a>

				<li><a href="relnotes/7.8.1.html">7.8.1 release notes</a>

				<li><a href="relnotes/7.8.html">7.8 release notes</a>

				<li><a href="relnotes/7.7.1.html">7.7.1 release notes</a>

				<li><a href="relnotes/7.7.html">7.7 release notes</a>

				<li><a href="relnotes/7.6.1.html">7.6.1 release notes</a>

				<li><a href="relnotes/7.6.html">7.6 release notes</a>

				<li><a href="relnotes/7.5.2.html">7.5.2 release notes</a>

				<li><a href="relnotes/7.5.1.html">7.5.1 release notes</a>

				<li><a href="relnotes/7.5.html">7.5 release notes</a>

				<li><a href="relnotes/7.4.4.html">7.4.4 release notes</a>

				<li><a href="relnotes/7.4.3.html">7.4.3 release notes</a>

				<li><a href="relnotes/7.4.2.html">7.4.2 release notes</a>

				<li><a href="relnotes/7.4.1.html">7.4.1 release notes</a>

				<li><a href="relnotes/7.4.html">7.4 release notes</a>

				<li><a href="relnotes/7.3.html">7.3 release notes</a>

				<li><a href="relnotes/7.2.html">7.2 release notes</a>

				<li><a href="relnotes/7.1.html">7.1 release notes</a>

				<li><a href="relnotes/7.0.4.html">7.0.4 release notes</a>

				<li><a href="relnotes/7.0.3.html">7.0.3 release notes</a>

				<li><a href="relnotes/7.0.2.html">7.0.2 release notes</a>

				<li><a href="relnotes/7.0.1.html">7.0.1 release notes</a>

				<li><a href="relnotes/7.0.html">7.0 release notes</a>

				<li><a href="relnotes/6.5.3.html">6.5.3 release notes</a>

				<li><a href="relnotes/6.5.2.html">6.5.2 release notes</a>

				<li><a href="relnotes/6.5.1.html">6.5.1 release notes</a>

				<li><a href="relnotes/6.5.html">6.5 release notes</a>

				<li><a href="relnotes/6.4.2.html">6.4.2 release notes</a>

				<li><a href="relnotes/6.4.1.html">6.4.1 release notes</a>

				<li><a href="relnotes/6.4.html">6.4 release notes</a>

				</ul>

				<p>

				@@ -80,29 +102,31 @@ Versions of Mesa prior to 6.4 are summarized in the

				</p>

				<ul>

				<li><a href="RELNOTES-6.3.2">RELNOTES-6.3.2</a>

				<li><a href="RELNOTES-6.3">RELNOTES-6.3</a>

				<li><a href="RELNOTES-6.2.1">RELNOTES-6.2.1</a>

				<li><a href="RELNOTES-6.2">RELNOTES-6.2</a>

				<li><a href="RELNOTES-6.1">RELNOTES-6.1</a>

				<li><a href="RELNOTES-6.0">RELNOTES-6.0</a>

				<li><a href="RELNOTES-5.1">RELNOTES-5.1</a>

				<li><a href="RELNOTES-5.0.2">RELNOTES-5.0.2</a>

				<li><a href="RELNOTES-5.0.1">RELNOTES-5.0.1</a>

				<li><a href="RELNOTES-5.0">RELNOTES-5.0</a>

				<li><a href="RELNOTES-4.1">RELNOTES-4.1</a>

				<li><a href="RELNOTES-4.0.3">RELNOTES-4.0.3</a>

				<li><a href="RELNOTES-4.0.2">RELNOTES-4.0.2</a>

				<li><a href="RELNOTES-4.0.1">RELNOTES-4.0.1</a>

				<li><a href="RELNOTES-4.0">RELNOTES-4.0</a>

				<li><a href="RELNOTES-3.5">RELNOTES-3.5</a>

				<li><a href="RELNOTES-3.4.2">RELNOTES-3.4.2</a>

				<li><a href="RELNOTES-3.4.1">RELNOTES-3.4.1</a>

				<li><a href="RELNOTES-3.4">RELNOTES-3.4</a>

				<li><a href="RELNOTES-3.3">RELNOTES-3.3</a>

				<li><a href="RELNOTES-3.2.1">RELNOTES-3.2.1</a>

				<li><a href="RELNOTES-3.2">RELNOTES-3.2</a>

				<li><a href="RELNOTES-3.1">RELNOTES-3.1</a>

				<li><a href="relnotes/6.3.2">6.3.2 release notes</a>

				<li><a href="relnotes/6.3.1">6.3.1 release notes</a>

				<li><a href="relnotes/6.3">6.3 release notes</a>

				<li><a href="relnotes/6.2.1">6.2.1 release notes</a>

				<li><a href="relnotes/6.2">6.2 release notes</a>

				<li><a href="relnotes/6.1">6.1 release notes</a>

				<li><a href="relnotes/6.0.1">6.0.1 release notes</a>

				<li><a href="relnotes/6.0">6.0 release notes</a>

				<li><a href="relnotes/5.1">5.1 release notes</a>

				<li><a href="relnotes/5.0.2">5.0.2 release notes</a>

				<li><a href="relnotes/5.0.1">5.0.1 release notes</a>

				<li><a href="relnotes/5.0">5.0 release notes</a>

				<li><a href="relnotes/4.1">4.1 release notes</a>

				<li><a href="relnotes/4.0.3">4.0.3 release notes</a>

				<li><a href="relnotes/4.0.2">4.0.2 release notes</a>

				<li><a href="relnotes/4.0.1">4.0.1 release notes</a>

				<li><a href="relnotes/4.0">4.0 release notes</a>

				<li><a href="relnotes/3.5">3.5 release notes</a>

				<li><a href="relnotes/3.4.2">3.4.2 release notes</a>

				<li><a href="relnotes/3.4.1">3.4.1 release notes</a>

				<li><a href="relnotes/3.4">3.4 release notes</a>

				<li><a href="relnotes/3.3">3.3 release notes</a>

				<li><a href="relnotes/3.2.1">3.2.1 release notes</a>

				<li><a href="relnotes/3.2">3.2 release notes</a>

				<li><a href="relnotes/3.1">3.1 release notes</a>

				</ul>

				</div>

									
										150

docs/relnotes/10.0.1.html
									
										Normal file
									
												
				@@ -0,0 +1,150 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.0.1 Release Notes / (December 12, 2013)</h1>

				<p>

				Mesa 10.0.1 is a bug fix release which fixes bugs found since the 10.0 release.

				</p>

				<p>

				Mesa 10.0.1 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>
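<p>
Because the reported version depends on the driver, applications usually
check it at run time. The sketch below shows one way to extract the major and
minor numbers from a desktop GL_VERSION string, assuming the string begins
with "major.minor" as the OpenGL specification describes. The helper name
<code>parse_gl_version</code> is hypothetical, and obtaining the string from a
live context is left out.
</p>

```python
import re


def parse_gl_version(version):
    """Return (major, minor) from a desktop GL_VERSION string.

    Assumes the string starts with "<major>.<minor>", optionally followed
    by a release number and vendor-specific text.
    """
    match = re.match(r"\s*(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized GL_VERSION string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def supports(version, major, minor):
    """True if the reported version is at least major.minor."""
    return parse_gl_version(version) >= (major, minor)
```

<p>
With an illustrative string such as "2.1 Mesa 10.0.1",
<code>supports(s, 3, 3)</code> is false, which is the case an application
needs to handle when the driver does not expose OpenGL 3.3.
</p>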

				<h2>MD5 checksums</h2>

				<pre>

				0a72ca5b36046a658bf6038326ff32ed  MesaLib-10.0.1.tar.bz2

				01bde35c912e504ba62caf1ef9f7022c  MesaLib-10.0.1.tar.gz

				59a174a11a89e6b1b8ee9c3f7e3c388c  MesaLib-10.0.1.zip

				</pre>
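<p>
A downloaded tarball can be checked against the list above. The following is
a minimal sketch using Python's standard <code>hashlib</code> module; the
helper names are hypothetical, and it assumes the checksum text uses the
"digest, whitespace, filename" layout shown in the list.
</p>

```python
import hashlib


def parse_checksums(text):
    """Map filename -> expected MD5 digest from "digest  filename" lines."""
    expected = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            digest, filename = parts
            expected[filename] = digest.lower()
    return expected


def md5_of_chunks(chunks):
    """Return the hex MD5 digest of an iterable of byte strings."""
    md5 = hashlib.md5()
    for chunk in chunks:
        md5.update(chunk)
    return md5.hexdigest()


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large tarballs fit in memory."""
    with open(path, "rb") as handle:
        return md5_of_chunks(iter(lambda: handle.read(chunk_size), b""))


def verify(path, filename, expected):
    """True if the file at path matches the digest listed for filename."""
    return md5_of_file(path) == expected.get(filename)
```

<p>
A mismatch usually means a truncated or corrupted download. MD5 only guards
against accidental corruption, not deliberate tampering.
</p>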

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64323">Bug 64323</a> - Severe misrendering in Left 4 Dead 2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68838">Bug 68838</a> - GLSL: struct declarations produce a &quot;empty declaration warning&quot; in 9.2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69155">Bug 69155</a> - [NV50 gallium] [piglit] bin/varying-packing-simple triggers memory corruption/failures</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70250">Bug 70250</a> - weston-terminal rendering corrupted with output transform 90 and 270</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70601">Bug 70601</a> - [SNB Bisected]Piglit spec/ARB_texture_float/multisample-formats 2 GL_ARB_texture_float fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72230">Bug 72230</a> - Unable to extract MesaLib-10.0.0.tar.{gz,bz2} with bsdtar</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72325">Bug 72325</a> - [swrast] piglit glean fbo regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72327">Bug 72327</a> - [swrast] piglit glean pointSprite regression</li>

				</ul>

				<h2>Changes</h2>

				<p>The full set of changes can be viewed by using the following git command:</p>

				<pre>

				  git log mesa-10.0..mesa-10.0.1

				</pre>

				<p>Axel Davy (2):</p>

				<ul>

				  <li>egl/wayland: Flush the wl_display at the end of SwapBuffers</li>

				  <li>Enable throttling in SwapBuffers</li>

				</ul>

				<p>Chad Versace (2):</p>

				<ul>

				  <li>i965/hsw: Apply non-msrt fast color clear w/a to all HSW GTs</li>

				  <li>i965: Add extra-alignment for non-msrt fast color clear for all hw (v2)</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>swrast: fix readback regression since inversion fix</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>automake: include only one copy VERSION in tarball</li>

				</ul>

				<p>Ian Romanick (3):</p>

				<ul>

				  <li>docs: Add 10.0 release md5sums</li>

				  <li>Remove a057b83 from the pick list</li>

				  <li>glsl: Don't emit empty declaration warning for a struct specifier</li>

				</ul>

				<p>Ilia Mirkin (8):</p>

				<ul>

				  <li>mesa: don't leak performance monitors on context destroy</li>

				  <li>nv50: Fix GPU_READING/WRITING bit removal</li>

				  <li>nouveau: avoid leaking fences while waiting</li>

				  <li>nv50: wait on the buf's fence before sticking it into pushbuf</li>

				  <li>nv50: enable h264 and mpeg4 for nv98+ (vp3, vp4.0)</li>

				  <li>nouveau/video: update h264 picparm field names based on usage</li>

				  <li>nouveau/video: update a few more h264 picparm field names</li>

				  <li>nv50: report 15 max inputs for fragment programs</li>

				</ul>

				<p>Jordan Justen (1):</p>

				<ul>

				  <li>dri megadriver_stub: add compatibility for older DRI loaders</li>

				</ul>

				<p>Kristian Høgsberg (2):</p>

				<ul>

				  <li>egl/wayland: Damage INT32_MAX x INT32_MAX region for eglSwapBuffers</li>

				  <li>egl/wayland: Send commit after flushing the driver context</li>

				</ul>

				<p>Maarten Lankhorst (1):</p>

				<ul>

				  <li>nouveau: Fix compiler warning regression</li>

				</ul>

				<p>Paul Berry (1):</p>

				<ul>

				  <li>i965/gen6: Fix multisample resolve blits for luminance/intensity 32F formats.</li>

				</ul>

				<p>Thomas Hellstrom (1):</p>

				<ul>

				  <li>st/xa: Bump major version number to 2</li>

				</ul>

				<p>Tom Stellard (2):</p>

				<ul>

				  <li>r300/compiler/tests: Fix segfault</li>

				  <li>r300/compiler/tests: Fix line length check in test parser</li>

				</ul>

				</div>

				</body>

				</html>

									
										161

docs/relnotes/10.0.2.html
									
										Normal file
									
												
				@@ -0,0 +1,161 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.0.2 Release Notes / (January 9, 2014)</h1>

				<p>

				Mesa 10.0.2 is a bug fix release which fixes bugs found since the 10.0.1 release.

				</p>

				<p>

				Mesa 10.0.2 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>
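<p>
Because the reported version depends on the driver, an application may want to
check the string returned by glGetString(GL_VERSION) before relying on OpenGL 3.3
features. The following is a minimal sketch, assuming the desktop GL convention
that the string begins with "major.minor". The helper names
(<code>parse_gl_version</code>, <code>supports_gl33</code>) and the sample strings
are illustrative assumptions, not part of Mesa.
</p>

```python
import re


def parse_gl_version(version_string):
    """Return (major, minor) parsed from a desktop GL_VERSION string.

    Assumes the string starts with "<major>.<minor>", optionally followed
    by a release number and vendor-specific text.
    """
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized GL_VERSION string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def supports_gl33(version_string):
    """True if the reported version is at least OpenGL 3.3."""
    return parse_gl_version(version_string) >= (3, 3)


if __name__ == "__main__":
    # Hypothetical strings of the kind a driver might report.
    for sample in ("3.3 (Core Profile) Mesa 10.0.2", "3.0 Mesa 10.0.2"):
        print(sample, "->", supports_gl33(sample))
```

<p>
In a real program the string would come from the current context. The check only
reflects what that context reports, so a 3.3 context still has to be requested
at creation time.
</p>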

				<h2>MD5 checksums</h2>

				<pre>

				de7d14baf0101b697c140d2f47ef27e9  MesaLib-10.0.2.tar.gz

				8544c0ab3e438a08b5103421ea15b6d2  MesaLib-10.0.2.tar.bz2

				181b0d6c1afca38e98a930d0e564ed90  MesaLib-10.0.2.zip

				</pre>
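<p>
A downloaded archive can be compared against the checksums listed above. The
sketch below uses Python's standard <code>hashlib</code> module. The function
names and the local file path are assumptions made for illustration.
</p>

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_lines(text):
    """Map file name -> expected digest from lines of the form 'digest  name'."""
    expected = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            expected[parts[1]] = parts[0].lower()
    return expected


def verify(path, name, checksums):
    """True if the file at `path` matches the digest listed for `name`."""
    return name in checksums and md5_of_file(path) == checksums[name]


if __name__ == "__main__":
    listed = parse_checksum_lines(
        "de7d14baf0101b697c140d2f47ef27e9  MesaLib-10.0.2.tar.gz\n"
        "8544c0ab3e438a08b5103421ea15b6d2  MesaLib-10.0.2.tar.bz2\n"
        "181b0d6c1afca38e98a930d0e564ed90  MesaLib-10.0.2.zip\n"
    )
    # Assumes the archive was saved in the current directory.
    print(verify("MesaLib-10.0.2.tar.gz", "MesaLib-10.0.2.tar.gz", listed))
```

<p>
Reading in chunks keeps memory use flat regardless of archive size.
</p>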

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70740">Bug 70740</a> - HiZ on SNB causes GPU hang with WebGL web app</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72026">Bug 72026</a> - SIGSEGV in fs_visitor::visit(ir_dereference_variable*)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72264">Bug 72264</a> - GLSL error reporting</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72369">Bug 72369</a> - glitches in serious sam 3 with the sb shader backend</li>

				</ul>

				<h2>Changes</h2>

				<p>The full set of changes can be viewed by using the following git command:</p>

				<pre>

				  git log mesa-10.0.1..mesa-10.0.2

				</pre>

				<p>Aaron Watry (8):</p>

				<ul>

				  <li>clover: Remove unused variable</li>

				  <li>pipe_loader/sw: close dev-&gt;lib when initialization fails</li>

				  <li>radeon/compute: Stop leaking LLVMContexts in radeon_llvm_parse_bitcode</li>

				  <li>r600/compute: Free compiled kernels when deleting compute state</li>

				  <li>r600/compute: Use the correct FREE macro when deleting compute state</li>

				  <li>radeon/llvm: Free target data at end of optimization</li>

				  <li>st/vdpau: Destroy context when initialization fails</li>

				  <li>r600/pipe: Stop leaking context-&gt;start_compute_cs_cmd.buf on EG/CM</li>

				</ul>

				<p>Alex Deucher (1):</p>

				<ul>

				  <li>r600g: fix SUMO2 pci id</li>

				</ul>

				<p>Alexander von Gluck IV (1):</p>

				<ul>

				  <li>Haiku: Add in public GL kit headers</li>

				</ul>

				<p>Anuj Phogat (1):</p>

				<ul>

				  <li>mesa: Fix error code generation in glBeginConditionalRender()</li>

				</ul>

				<p>Carl Worth (2):</p>

				<ul>

				  <li>docs: Add md5sums for the 10.0.1 release.</li>

				  <li>Update version to 10.0.2</li>

				</ul>

				<p>Chad Versace (1):</p>

				<ul>

				  <li>i965/gen6: Fix HiZ hang in WebGL Google Maps</li>

				</ul>

				<p>Erik Faye-Lund (1):</p>

				<ul>

				  <li>glcpp: error on multiple #else/#elif directives</li>

				</ul>

				<p>Henri Verbeet (1):</p>

				<ul>

				  <li>i915: Add support for gl_FragData[0] reads.</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>nv50: fix a small leak on context destroy</li>

				</ul>

				<p>Jonathan Liu (2):</p>

				<ul>

				  <li>st/mesa: use pipe_sampler_view_release()</li>

				  <li>llvmpipe: use pipe_sampler_view_release() to avoid segfault</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>i965: Fix 3DSTATE_PUSH_CONSTANT_ALLOC_PS packet creation.</li>

				  <li>Revert "mesa: Remove GLXContextID typedef from glx.h."</li>

				</ul>

				<p>Kevin Rogovin (1):</p>

				<ul>

				  <li>Use line number information from entire function expression</li>

				</ul>

				<p>Kristian Høgsberg (1):</p>

				<ul>

				  <li>dri_util: Don't assume __DRIcontext-&gt;driverPrivate is a gl_context</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>mesa: fix interpretation of glClearBuffer(drawbuffer)</li>

				  <li>st/mesa: fix glClear with multiple colorbuffers and different formats</li>

				</ul>

				<p>Paul Berry (2):</p>

				<ul>

				  <li>glsl: Teach ir_variable_refcount about ir_loop::counter variables.</li>

				  <li>glsl: Fix inconsistent assumptions about ir_loop::counter.</li>

				</ul>

				<p>Vadim Girlin (1):</p>

				<ul>

				  <li>r600g/sb: fix stack size computation on evergreen</li>

				</ul>

				</div>

				</body>

				</html>

									
										206

docs/relnotes/10.0.3.html
									
										Normal file
									
												
				@@ -0,0 +1,206 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.0.3 Release Notes / (February 3, 2014)</h1>

				<p>

				Mesa 10.0.3 is a bug fix release which fixes bugs found since the 10.0.2 release.

				</p>

				<p>

				Mesa 10.0.3 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				5f9f463ef08129f6762106b434910adb  MesaLib-10.0.3.tar.bz2

				fb3997b6500e153bc32370cb3fc4ca9e  MesaLib-10.0.3.tar.gz

				a07b4b6b9eb449b88a6cb5061e51c331  MesaLib-10.0.3.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72708">Bug 72708</a> - Master fails to build with older gcc due to -msse4.1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72926">Bug 72926</a> - [REGRESSION,swrast] Memory-related crash with anti-aliasing enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73096">Bug 73096</a> - Query GL_RGBA_SIGNED_COMPONENTS_EXT missing</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73100">Bug 73100</a> - Please use AC_PATH_TOOL instead of AC_PATH_PROG for llvm-config</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73418">Bug 73418</a> - OpenCL hangs graphics on CAYMAN</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73473">Bug 73473</a> - Potential crash bug in src/gallium/auxiliary/rtasm/rtasm_execmem.c</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73915">Bug 73915</a> - sample shading + centroid broken since f5cfb4a</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73956">Bug 73956</a> - SIGSEGV when passing GL_NONE to glReadBuffer</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74026">Bug 74026</a> - Compiler rejects chained assignments involving array dereferences</li>

				</ul>

				<h2>Changes</h2>

				<p>The full set of changes can be viewed by using the following git command:</p>

				<pre>

				  git log mesa-10.0.2..mesa-10.0.3

				</pre>

				<p>Aaron Watry (2):</p>

				<ul>

				  <li>radeon: Move gfx/dma cs cleanup to r600_common_context_cleanup</li>

				  <li>st/dri: prevent leak of dri option default values</li>

				</ul>

				<p>Andreas Fänger (1):</p>

				<ul>

				  <li>swrast: fix delayed texel buffer allocation regression for OpenMP</li>

				</ul>

				<p>Anuj Phogat (3):</p>

				<ul>

				  <li>glsl: Disable ARB_texture_rectangle in shader version 100.</li>

				  <li>i965: Use sample barycentric coordinates with per sample shading</li>

				  <li>i965: Ignore 'centroid' interpolation qualifier in case of persample shading</li>

				</ul>

				<p>Brian Paul (3):</p>

				<ul>

				  <li>mesa: implement missing glGet(GL_RGBA_SIGNED_COMPONENTS_EXT) query</li>

				  <li>st/mesa: fix glReadBuffer(GL_NONE) segfault</li>

				  <li>draw: fix incorrect vertex size computation in LLVM drawing code</li>

				</ul>

				<p>Carl Worth (5):</p>

				<ul>

				  <li>Add md5sums for 10.0.2. release.</li>

				  <li>cherry-ignore: Ignore several patches not yet ready for the stable branch</li>

				  <li>Drop another couple of patches.</li>

				  <li>cherry-ignore: Ignore 4 patches at teh request of the author, (Anuj).</li>

				  <li>Update version to 10.0.3</li>

				</ul>

				<p>Chad Versace (1):</p>

				<ul>

				  <li>i965/gen6/blorp: Emit more flushes to workaround hangs</li>

				</ul>

				<p>Chris Forbes (1):</p>

				<ul>

				  <li>i965: fold offset into coord for textureOffset(gsampler2DRect)</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>mesa: use signed temporary variable to store _ColorDrawBufferIndexes</li>

				  <li>st/mesa: use signed temporary variable to store _ColorDrawBufferIndexes</li>

				  <li>nv50: access only the available amount of textures</li>

				  <li>nv50: access only the available amount of constbuf</li>

				  <li>gallium/rtasm: handle mmap failures appropriately</li>

				</ul>

				<p>Eric Anholt (2):</p>

				<ul>

				  <li>i965: Fix handling of MESA_pack_invert in blit (PBO) readpixels.</li>

				  <li>i965: Don't do the temporary-and-blit-copy for INVALIDATE_RANGE maps.</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>mesa: Add COMPRESSED_RGBA_S3TC_DXT1_EXT to COMPRESSED_TEXTURE_FORMATS for GLES</li>

				  <li>radeon / r200: Pass the API into _mesa_initialize_context</li>

				</ul>

				<p>Ilia Mirkin (2):</p>

				<ul>

				  <li>mesa: fix GL_COLOR_SUM enum for drivers without ARB_vertex_program</li>

				  <li>st/vdpau: don't return a device if the screen doesn't support NPOT</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>mesa: Use IROUND instead of roundf.</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>glsl: Rename "expr" to "lhs_expr" in vector_extract munging code.</li>

				  <li>glsl: Fix chained assignments of vector channels.</li>

				</ul>

				<p>Lauri Kasanen (1):</p>

				<ul>

				  <li>mesa: Fix build to properly check for supported compiler flags</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>st/mesa: use sRGB formats for MSAA resolving if destination is sRGB</li>

				  <li>gallium/util: util_format_srgb should not return FORMAT_NONE for sRGB formats</li>

				</ul>

				<p>Matt Turner (2):</p>

				<ul>

				  <li>glcpp: Define GL_EXT_shader_integer_mix in both GL and ES.</li>

				  <li>glx: Update glxext.h to revision 24777.</li>

				</ul>

				<p>Michał Górny (1):</p>

				<ul>

				  <li>Use AC_PATH_TOOL instead of AC_PATH_PROG for llvm-config.</li>

				</ul>

				<p>Paul Berry (1):</p>

				<ul>

				  <li>i965: Ensure that all necessary state is re-emitted if we run out of aperture.</li>

				</ul>

				<p>Paul Seidler (1):</p>

				<ul>

				  <li>build: move ARCH_LIBS definition outside of ASM definition</li>

				</ul>

				<p>Thomas Sondergaard (4):</p>

				<ul>

				  <li>mesa: Preliminary support for MSVC_VERSION=12.0</li>

				  <li>mesa: Fix compile error with MSVC 2013</li>

				  <li>mesa: Work around internal compiler error</li>

				  <li>mesa: Namespace qualify fma to override ambiguity with fma from math.h</li>

				</ul>

				<p>Tom Stellard (1):</p>

				<ul>

				  <li>r600g/compute: Emit DEALLOC_STATE on cayman after dispatching a compute shader.</li>

				</ul>

				</div>

				</body>

				</html>

									
										191

docs/relnotes/10.0.4.html
									
										Normal file
									
												
				@@ -0,0 +1,191 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.0.4 Release Notes / (March 12, 2014)</h1>

				<p>

				Mesa 10.0.4 is a bug fix release which fixes bugs found since the 10.0.3 release.

				</p>

				<p>

				Mesa 10.0.4 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				5a3c5b90776ec8a9fcd777c99e0607e2  MesaLib-10.0.4.tar.gz

				8b148869d2620b0720c8a8d2b7eb3e38  MesaLib-10.0.4.tar.bz2

				da2418d25bfbc273660af7e755fb367e  MesaLib-10.0.4.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71870">Bug 71870</a> - Metro: Last Light rendering issues</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72895">Bug 72895</a> - Missing trees in flightgear 2.12.1 with mesa 10.0.1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74251">Bug 74251</a> - Segfault in st_finalize_texture with Texture Buffer</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74723">Bug 74723</a> - main/shaderapi.c:407: detach_shader: Assertion `shProg-&gt;Shaders[j]-&gt;Type == 0x8B31 || shProg-&gt;Shaders[j]-&gt;Type == 0x8B30' failed.</li>

				</ul>

				<h2>Changes</h2>

				<p>The full set of changes can be viewed by using the following git command:</p>

				<pre>

				  git log mesa-10.0.3..mesa-10.0.4

				</pre>

				<p>Anuj Phogat (4):</p>

				<ul>

				  <li>mesa: Generate correct error code in glDrawBuffers()</li>

				  <li>mesa: Add GL_TEXTURE_CUBE_MAP_ARRAY to legal_get_tex_level_parameter_target()</li>

				  <li>glsl: Fix condition to generate shader link error</li>

				  <li>i965: Fix the region's pitch condition to use blitter</li>

				</ul>

				<p>Brian Paul (8):</p>

				<ul>

				  <li>r200: move driContextSetFlags(ctx) call after ctx var is initialized</li>

				  <li>radeon: move driContextSetFlags(ctx) call after ctx var is initialized</li>

				  <li>gallium/auxiliary/indices: replace free() with FREE()</li>

				  <li>draw: fix incorrect color of flat-shaded clipped lines</li>

				  <li>st/mesa: avoid sw fallback for getting/decompressing textures</li>

				  <li>mesa: update assertion in detach_shader() for geom shaders</li>

				  <li>mesa: do depth/stencil format conversion in glGetTexImage</li>

				  <li>softpipe: use 64-bit arithmetic in softpipe_resource_layout()</li>

				</ul>

				<p>Carl Worth (4):</p>

				<ul>

				  <li>docs: Add md5sums for 10.0.3 release</li>

				  <li>main: Avoid double-free of shader Label</li>

				  <li>get-pick-list: Update to only find patches nominated for the 10.0 branch</li>

				  <li>Update version to 10.0.4</li>

				</ul>

				<p>Chris Forbes (1):</p>

				<ul>

				  <li>i965: Validate (and resolve) all the bound textures.</li>

				</ul>

				<p>Christian König (1):</p>

				<ul>

				  <li>radeon/uvd: fix feedback buffer handling v2</li>

				</ul>

				<p>Daniel Kurtz (1):</p>

				<ul>

				  <li>glsl: Add locking to builtin_builder singleton</li>

				</ul>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>dri/nouveau: Pass the API into _mesa_initialize_context</li>

				  <li>nv50: correctly calculate the number of vertical blocks during transfer map</li>

				  <li>dri/i9*5: correctly calculate the amount of system memory</li>

				</ul>

				<p>Fredrik Höglund (3):</p>

				<ul>

				  <li>mesa: Preserve the NewArrays state when copying a VAO</li>

				  <li>glx: Fix the default values for GLXFBConfig attributes</li>

				  <li>glx: Fix the GLXFBConfig attrib sort priorities</li>

				</ul>

				<p>Hans (2):</p>

				<ul>

				  <li>util: don't define isfinite(), isnan() for MSVC &gt;= 1800</li>

				  <li>mesa: don't define c99 math functions for MSVC &gt;= 1800</li>

				</ul>

				<p>Ian Romanick (6):</p>

				<ul>

				  <li>meta: Release resources used by decompress_texture_image</li>

				  <li>meta: Release resources used by _mesa_meta_DrawPixels</li>

				  <li>meta: Fallback to software for GetTexImage of compressed GL_TEXTURE_CUBE_MAP_ARRAY</li>

				  <li>meta: Consistenly use non-Apple VAO functions</li>

				  <li>glcpp: Only warn for macro names containing __</li>

				  <li>glsl: Only warn for macro names containing __</li>

				</ul>

				<p>Ilia Mirkin (3):</p>

				<ul>

				  <li>nv30: report 8 maximum inputs</li>

				  <li>nouveau/video: make sure that firmware is present when checking caps</li>

				  <li>nouveau: fix chipset checks for nv1a by using the oclass instead</li>

				</ul>

				<p>Julien Cristau (1):</p>

				<ul>

				  <li>glx/dri2: fix build failure on HURD</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>glsl: Don't lose precision qualifiers when encountering "centroid".</li>

				  <li>i965: Create a hardware context before initializing state module.</li>

				</ul>

				<p>Kusanagi Kouichi (1):</p>

				<ul>

				  <li>targets/vdpau: Always use c++ to link</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>st/mesa: fix crash when a shader uses a TBO and it's not bound</li>

				</ul>

				<p>Matt Turner (1):</p>

				<ul>

				  <li>glsl: Initialize ubo_binding_mask flags to zero.</li>

				</ul>

				<p>Paul Berry (2):</p>

				<ul>

				  <li>glsl: Make condition_to_hir() callable from outside ast_iteration_statement.</li>

				  <li>glsl: Fix continue statements in do-while loops.</li>

				</ul>

				<p>Tom Stellard (1):</p>

				<ul>

				  <li>r600g/compute: PIPE_CAP_COMPUTE should be false for pre-evergreen GPUs</li>

				</ul>

				<p>Topi Pohjolainen (1):</p>

				<ul>

				  <li>i965/blorp: do not use unnecessary hw-blending support</li>

				</ul>

				</div>

				</body>

				</html>

									
										173

docs/relnotes/10.0.5.html
									
										Normal file
									
												
				@@ -0,0 +1,173 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.0.5 Release Notes / (April 18, 2014)</h1>

				<p>

				Mesa 10.0.5 is a bug fix release which fixes bugs found since the 10.0.4 release.

				</p>

				<p>

				Mesa 10.0.5 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				db606aadd0fe321f3664099677d159bc  MesaLib-10.0.5.tar.gz

				e6009ccd8898d7104bb325b6af9ec354  MesaLib-10.0.5.tar.bz2

				c8ab9e502542bf32299a4df85b0b704d  MesaLib-10.0.5.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=58660">Bug 58660</a> - CAYMAN broken with HyperZ on</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64471">Bug 64471</a> - Radeon HD6570 lockup in Brütal Legend with HyperZ</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66352">Bug 66352</a> - GPU lockup in L4D2 on TURKS with HyperZ</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68799">Bug 68799</a> - [APITRACE] Hyper-Z lockup with Falcon BMS 4.32u6 on CAYMAN</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71547">Bug 71547</a> - compilation failure :#error &quot;SSE4.1 instruction set not enabled&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72685">Bug 72685</a> - [radeonsi hyperz] Artifacts in Unigine Sanctuary</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73088">Bug 73088</a> - [HyperZ] Juniper (6770): Gone Home / Unigine Heaven 4.0 lock up system after several minutes of use</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74428">Bug 74428</a> - hyperz causes gpu hang in Counter-strike: Source</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74803">Bug 74803</a> - [r600g] HyperZ broken on RV630 (Cogs shadows are broken)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74863">Bug 74863</a> - [r600g] HyperZ broken on RV770 and CYPRESS (Left 4 Dead 2 trees corruption) bisected!</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74892">Bug 74892</a> - HyperZ GPU lockup with radeonsi 7970M PITCAIRN and Distance Alpha game</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74988">Bug 74988</a> - Buffer overrun (segfault) decompressing ETC2 texture in GLBenchmark 3.0 Manhattan</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75279">Bug 75279</a> - XCloseDisplay() takes one minute around nouveau_dri.so, freezing Firefox startup</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77102">Bug 77102</a> - gallium nouveau has no profile in vdpau and libva</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77207">Bug 77207</a> - [ivb/hsw] batch overwritten with garbage</li>

				</ul>

				<h2>Changes</h2>

				<p>The full set of changes can be viewed by using the following git command:</p>

				<pre>

				  git log mesa-10.0.4..mesa-10.0.5

				</pre>

				<p>Alex Deucher (1):</p>

				<ul>

				  <li>radeon: reverse DBG_NO_HYPERZ logic</li>

				</ul>

				<p>Brian Paul (9):</p>

				<ul>

				  <li>mesa: add unpacking code for MESA_FORMAT_Z32_FLOAT_S8X24_UINT</li>

				  <li>mesa: fix copy &amp; paste bugs in pack_ubyte_SARGB8()</li>

				  <li>mesa: fix copy &amp; paste bugs in pack_ubyte_SRGB8()</li>

				  <li>mesa: fix unpack_Z32_FLOAT_X24S8() / unpack_Z32_FLOAT() mix-up</li>

				  <li>st/mesa: add null pointer checking in query object functions</li>

				  <li>mesa: fix glMultiDrawArrays inside a display list</li>

				  <li>cso: fix sampler view count in cso_set_sampler_views()</li>

				  <li>svga: replace sampler assertion with conditional</li>

				  <li>svga: move LIST_INITHEAD(dirty_buffers) earlier in svga_context_create()</li>

				</ul>

				<p>Carl Worth (3):</p>

				<ul>

				  <li>docs: Add md5sums for the 10.0.4 release.</li>

				  <li>Ignore patches which don't apply.</li>

				  <li>Update version to 10.0.5</li>

				</ul>

				<p>Christian König (2):</p>

				<ul>

				  <li>st/mesa: recreate sampler view on context change v3</li>

				  <li>st/mesa: fix sampler view handling with shared textures v4</li>

				</ul>

				<p>Courtney Goeltzenleuchter (1):</p>

				<ul>

				  <li>mesa: add bounds checking to eliminate buffer overrun</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>mesa: return v.value_int64 when the requested type is TYPE_INT64</li>

				  <li>glx: drop obsolete _XUnlock_Mutex in __glXInitialize error path</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>i965: Fix buffer overruns in MSAA MCS buffer clearing.</li>

				</ul>

				<p>Ilia Mirkin (6):</p>

				<ul>

				  <li>nouveau: fix fence waiting logic in screen destroy</li>

				  <li>nv50: adjust blit_3d handling of ms output textures</li>

				  <li>mesa/main: condition GL_DEPTH_STENCIL on ARB_depth_texture</li>

				  <li>nouveau: add forgotten GL_COMPRESSED_INTENSITY to texture format list</li>

				  <li>nouveau: there may not have been a texture if the fbo was incomplete</li>

				  <li>nouveau: fix firmware check on nvd7/nvd9</li>

				</ul>

				<p>Johannes Nixdorf (1):</p>

				<ul>

				  <li>configure.ac: fix the detection of expat with pkg-config</li>

				</ul>

				<p>Jonathan Gray (1):</p>

				<ul>

				  <li>gallium: add endian detection for OpenBSD</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>draw: Duplicate TGSI tokens in draw_pipe_pstipple module.</li>

				</ul>

				<p>Matt Turner (1):</p>

				<ul>

				  <li>mesa: Wrap SSE4.1 code in #ifdef __SSE4_1__.</li>

				</ul>

				<p>Paul Berry (1):</p>

				<ul>

				  <li>i965/gen7: Prefer vertical alignment of 4 when possible.</li>

				</ul>

				</div>

				</body>

				</html>

									
										146

docs/relnotes/10.0.html
									
										Normal file
									
												
				@@ -0,0 +1,146 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.0 Release Notes / (November 30th, 2013)</h1>

				<p>

				Mesa 10.0 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 10.0.1.

				</p>

				<p>

				Mesa 10.0 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				b38626b96c664db67a534d7859682436  MesaLib-10.0.0.tar.gz

				f3fe55d9735bea158bbe97ed9a0da819  MesaLib-10.0.0.tar.bz2

				c6ee1ce51e3bf35947d2978b872daf51  MesaLib-10.0.0.zip

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>GL_AMD_seamless_cubemap_per_texture on i965.</li>

				<li>GL_ARB_conservative_depth on i965.</li>

				<li>GL_ARB_texture_gather on i965.</li>

				<li>GL_ARB_texture_query_levels on i965.</li>

				<li>GL_ARB_texture_mirror_clamp_to_edge.</li>

				<li>GL_ARB_transform_feedback2, GL_ARB_transform_feedback3, and GL_ARB_transform_feedback_instanced on i965/Gen7 (with appropriate kernel support).</li>

				<li>GL_ARB_sample_shading on i965.</li>

				<li>GL_ARB_shader_atomic_counters on i965.</li>

				<li>GL_ARB_vertex_attrib_binding</li>

				<li>GL_ARB_vertex_type_10f_11f_11f_rev on i965 and r600g</li>

				<li>GL_KHR_debug</li>

				<li>GLX_MESA_query_renderer</li>

				</ul>

				<h2>Bug fixes</h2>

				<p>Attempts have been made to <b>not</b> include bugs fixed in previous 9.2

				releases or bugs that were regressions during 10.0 development. This list is

				likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=47755">Bug 47755</a> - [glsl-compiler] no error checking when Interpolation qualifier for built-in variable is different in vertex and fragment shader</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=52171">Bug 52171</a> - [gallium/r600/clover] Simple benchmarks failed to run</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=53077">Bug 53077</a> - [IVB] Output error with msaa when both of framebuffer and source color's alpha are not 1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54867">Bug 54867</a> - bug in r300 compiler</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60929">Bug 60929</a> - [r600-llvm] mono games with opengl are blocking on start</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62142">Bug 62142</a> - Mesa/demo mipmap_limits upside down with running by SOFTWARE</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62698">Bug 62698</a> - [bisected] WebGL demo &quot;Consumed&quot;: texstate.c:628: update_texture_state: Assertion „__builtin_popcount(enabledTargets) == 1“ failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64225">Bug 64225</a> - bfgminer --scyte generates Segmentation Fault on Northern Island</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64226">Bug 64226</a> - python-opencl package generate segmentation fault at pipe_r600.so</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64261">Bug 64261</a> - [SNB Bisected]Ogles3conform GL3Tests_color_buffer_float_color_buffer_float_clamp_fixed.test fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66213">Bug 66213</a> - Certain Mesa Demos Rendering Inverted (vertically)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66806">Bug 66806</a> - [softpipe] glxgears floating point exception</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=67921">Bug 67921</a> - [bisected commit 883987] crosscompiling fails with util/u_cpu_detect.c:247:4: error: 'asm' undeclared (first use in this function)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68162">Bug 68162</a> - [radeonsi] texture rendering is broken in Source-Engine games</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68451">Bug 68451</a> - Texture flicker in native Dota2 in mesa 9.2.0rc1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68503">Bug 68503</a> - Graphical glitches in Serious Sam 3 when SB is enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68792">Bug 68792</a> - Problems during playback of h264 files using UVD and VLC on AMD E-350 CPU</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68845">Bug 68845</a> - VDPAU/UVD regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69078">Bug 69078</a> - Modern Warfare (1, 2 and 3) broken in Wine on SNB</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69321">Bug 69321</a> - starting openCL crashes/boots system</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70042">Bug 70042</a> - Major texture flickering in Dota 2 (r600g on HD 6950)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70088">Bug 70088</a> - Glamor on r600g crashes Xserver</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70123">Bug 70123</a> - Freeze caused by 'winsys/radeon: remove cs_queue_empty' commit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70327">Bug 70327</a> - Casting floating point variable to integer not working properly while constant gets converted properly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70891">Bug 70891</a> - CL_INVALID_BUILD_OPTIONS results in CL_INVALID_DEVICE when asking for build log</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70913">Bug 70913</a> - [PIGLIT,radeonsi] crash in &quot;spec/EXT_framebuffer_multisample/sample-alpha-to-coverage 4 depth&quot; (buffer overflow)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71022">Bug 71022</a> - configure: error: Expat required for DRI.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71110">Bug 71110</a> - xorg_driver.c:1030:2: error: too many arguments to function ‘DamageUnregister’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71172">Bug 71172</a> - Segfault when running glxinfo. NV25GL [Quadro4 900 XGL]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71512">Bug 71512</a> - dlopen.h:54: undefined reference to `dlopen'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71870">Bug 71870</a> - Metro: Last Light rendering issues</li>

				</ul>

				<h2>Changes</h2>

				<ul>

				<li>Removed X.Org state tracker (unmaintained and broken)</li>

				<li>Removed the video-accel r300 targets</li>

				<li>Removed the video-accel softpipe targets</li>

				</ul>

				</div>

				</body>

				</html>

									
										254

docs/relnotes/10.1.1.html
									
										Normal file
									
												
				@@ -0,0 +1,254 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.1.1 Release Notes / April 18, 2014</h1>

				<p>

				Mesa 10.1.1 is a bug fix release which fixes bugs found since the 10.1 release.

				</p>

				<p>

				Mesa 10.1.1 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				96e63674ccfa98e7ec6eb4fee3f770c3  MesaLib-10.1.1.tar.gz

				1fde7ed079df7aeb9b6a744ca033de8d  MesaLib-10.1.1.tar.bz2

				e64d0a562638664b13d2edf22321df59  MesaLib-10.1.1.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71547">Bug 71547</a> - compilation failure :#error &quot;SSE4.1 instruction set not enabled&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74868">Bug 74868</a> - r600g: Diablo III Crashes After a few minutes</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74988">Bug 74988</a> - Buffer overrun (segfault) decompressing ETC2 texture in GLBenchmark 3.0 Manhattan</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75279">Bug 75279</a> - XCloseDisplay() takes one minute around nouveau_dri.so, freezing Firefox startup</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75543">Bug 75543</a> - OSMesa Gallium OSMesaMakeCurrent</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75660">Bug 75660</a> - u_inlines.h:277:pipe_buffer_map_range: Assertion `length' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76323">Bug 76323</a> - GLSL compiler ignores layout(binding=N) on uniform blocks</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76377">Bug 76377</a> - DRI3 should only be enabled on Linux due to a udev dependency</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76749">Bug 76749</a> - [HSW] DOTA world lighting has no effect</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77102">Bug 77102</a> - gallium nouveau has no profile in vdpau and libva</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77207">Bug 77207</a> - [ivb/hsw] batch overwritten with garbage</li>

				</ul>

				<h2>Changes</h2>

				<p>Aaron Watry (1):</p>

				<ul>

				  <li>gallium/util: Fix memory leak</li>

				</ul>

				<p>Alexander von Gluck IV (1):</p>

				<ul>

				  <li>haiku: Fix build through scons corrections and viewport fixes</li>

				</ul>

				<p>Anuj Phogat (2):</p>

				<ul>

				  <li>mesa: Set initial internal format of a texture to GL_RGBA</li>

				  <li>mesa: Allow GL_DEPTH_COMPONENT and GL_DEPTH_STENCIL combinations in glTexImage{123}D()</li>

				</ul>

				<p>Brian Paul (12):</p>

				<ul>

				  <li>softpipe: use 64-bit arithmetic in softpipe_resource_layout()</li>

				  <li>mesa: don't call ctx-&gt;Driver.ClearBufferSubData() if size==0</li>

				  <li>st/osmesa: check buffer size when searching for buffers</li>

				  <li>mesa: fix copy &amp; paste bugs in pack_ubyte_SARGB8()</li>

				  <li>mesa: fix copy &amp; paste bugs in pack_ubyte_SRGB8()</li>

				  <li>c11/threads: don't include assert.h if the assert macro is already defined</li>

				  <li>mesa: fix unpack_Z32_FLOAT_X24S8() / unpack_Z32_FLOAT() mix-up</li>

				  <li>st/mesa: add null pointer checking in query object functions</li>

				  <li>mesa: fix glMultiDrawArrays inside a display list</li>

				  <li>cso: fix sampler view count in cso_set_sampler_views()</li>

				  <li>svga: replace sampler assertion with conditional</li>

				  <li>svga: move LIST_INITHEAD(dirty_buffers) earlier in svga_context_create()</li>

				</ul>

				<p>Carl Worth (3):</p>

				<ul>

				  <li>cherry-ignore: Ignore a few patches</li>

				  <li>glsl: Allow explicit binding on atomics again</li>

				  <li>Update VERSION to 10.1.1</li>

				</ul>

				<p>Chia-I Wu (1):</p>

				<ul>

				  <li>i965/vec4: fix record clearing in copy propagation</li>

				</ul>

				<p>Christian König (2):</p>

				<ul>

				  <li>st/mesa: recreate sampler view on context change v3</li>

				  <li>st/mesa: fix sampler view handling with shared textures v4</li>

				</ul>

				<p>Courtney Goeltzenleuchter (1):</p>

				<ul>

				  <li>mesa: add bounds checking to eliminate buffer overrun</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>nv50: add missing brackets when handling the samplers array</li>

				  <li>mesa: return v.value_int64 when the requested type is TYPE_INT64</li>

				  <li>configure: enable dri3 only for linux</li>

				  <li>glx: drop obsolete _XUnlock_Mutex in __glXInitialize error path</li>

				  <li>configure: cleanup libudev handling</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>i965: Fix buffer overruns in MSAA MCS buffer clearing.</li>

				</ul>

				<p>Hans (2):</p>

				<ul>

				  <li>util: don't define isfinite(), isnan() for MSVC &gt;= 1800</li>

				  <li>mesa: don't define c99 math functions for MSVC &gt;= 1800</li>

				</ul>

				<p>Ian Romanick (7):</p>

				<ul>

				  <li>linker: Split set_uniform_binding into separate functions for blocks and samplers</li>

				  <li>linker: Various trivial clean-ups in set_sampler_binding</li>

				  <li>linker: Fold set_uniform_binding into call site</li>

				  <li>linker: Clean up "unused parameter" warnings</li>

				  <li>linker: Set block bindings based on UniformBlocks rather than UniformStorage</li>

				  <li>linker: Set binding for all elements of UBO array</li>

				  <li>glsl: Propagate explicit binding information from the AST all the way to the linker</li>

				</ul>

				<p>Ilia Mirkin (8):</p>

				<ul>

				  <li>nouveau: fix fence waiting logic in screen destroy</li>

				  <li>nv50: adjust blit_3d handling of ms output textures</li>

				  <li>loader: add special logic to distinguish nouveau from nouveau_vieux</li>

				  <li>mesa/main: condition GL_DEPTH_STENCIL on ARB_depth_texture</li>

				  <li>nouveau: add forgotten GL_COMPRESSED_INTENSITY to texture format list</li>

				  <li>nouveau: there may not have been a texture if the fbo was incomplete</li>

				  <li>nvc0/ir: move sample id to second source arg to fix sampler2DMS</li>

				  <li>nouveau: fix firmware check on nvd7/nvd9</li>

				</ul>

				<p>Johannes Nixdorf (1):</p>

				<ul>

				  <li>configure.ac: fix the detection of expat with pkg-config</li>

				</ul>

				<p>Jonathan Gray (7):</p>

				<ul>

				  <li>gallium: add endian detection for OpenBSD</li>

				  <li>loader: use 0 instead of FALSE which isn't defined</li>

				  <li>loader: don't limit the non-udev path to only android</li>

				  <li>megadriver_stub.c: don't use _GNU_SOURCE to gate the compat code</li>

				  <li>egl/dri2: don't require libudev to build drm/wayland platforms</li>

				  <li>egl/dri2: use drm macros to construct device name</li>

				  <li>configure: don't require libudev for gbm or egl drm/wayland</li>

				</ul>

				<p>José Fonseca (4):</p>

				<ul>

				  <li>c11/threads: Fix nano to milisecond conversion.</li>

				  <li>mapi/u_thread: Use GetCurrentThreadId</li>

				  <li>c11/threads: Don't implement thrd_current on Windows.</li>

				  <li>draw: Duplicate TGSI tokens in draw_pipe_pstipple module.</li>

				</ul>

				<p>Kenneth Graunke (4):</p>

				<ul>

				  <li>i965/fs: Fix register comparisons in saturate propagation.</li>

				  <li>glsl: Fix lack of i2u in lower_ubo_reference.</li>

				  <li>i965: Stop advertising GL_MESA_ycbcr_texture.</li>

				  <li>glsl: Try vectorizing when seeing a repeated assignment to a channel.</li>

				</ul>

				<p>Marek Olšák (13):</p>

				<ul>

				  <li>r600g: fix texelFetchOffset GLSL functions</li>

				  <li>r600g: fix blitting the last 2 mipmap levels for Evergreen</li>

				  <li>mesa: fix the format of glEdgeFlagPointer</li>

				  <li>r600g,radeonsi: fix MAX_TEXTURE_3D_LEVELS and MAX_TEXTURE_ARRAY_LAYERS limits</li>

				  <li>st/mesa: fix per-vertex edge flags and GLSL support (v2)</li>

				  <li>mesa: mark GL_RGB9_E5 as not color-renderable</li>

				  <li>mesa: fix texture border handling for cube arrays</li>

				  <li>mesa: allow generating mipmaps for cube arrays</li>

				  <li>mesa: fix software fallback for generating mipmaps for cube arrays</li>

				  <li>mesa: fix software fallback for generating mipmaps for 3D textures</li>

				  <li>st/mesa: fix generating mipmaps for cube arrays</li>

				  <li>st/mesa: drop the lowering of quad strips to triangle strips</li>

				  <li>r600g: implement edge flags</li>

				</ul>

				<p>Matt Turner (4):</p>

				<ul>

				  <li>mesa: Wrap SSE4.1 code in #ifdef __SSE4_1__.</li>

				  <li>i965/fs: Fix off-by-one in saturate propagation.</li>

				  <li>i965/fs: Don't propagate saturate modifiers into partial writes.</li>

				  <li>i965/fs: Don't propagate saturation modifiers if there are source modifiers.</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>r600g: Don't leak bytecode on shader compile failure</li>

				</ul>

				<p>Mike Stroyan (1):</p>

				<ul>

				  <li>i965: Avoid dependency hints on math opcodes</li>

				</ul>

				<p>Thomas Hellstrom (5):</p>

				<ul>

				  <li>winsys/svga: Replace the query mm buffer pool with a slab pool v3</li>

				  <li>winsys/svga: Update the vmwgfx_drm.h header to latest version from kernel</li>

				  <li>winsys/svga: Fix prime surface references also for guest-backed surfaces</li>

				  <li>st/xa: Bind destination before setting new state</li>

				  <li>st/xa: Make sure unused samplers are set to NULL</li>

				</ul>

				<p>Tom Stellard (1):</p>

				<ul>

				  <li>configure: Use LLVM shared libraries by default</li>

				</ul>

				</div>

				</body>

				</html>
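The MD5 checksums section of the 10.1.1 release notes above lists one digest per release archive in the usual `md5sum` layout (hex digest, whitespace, file name). A minimal sketch of checking downloaded archives against that list follows. The helper names (`parse_checksum_list`, `md5_of_file`, `verify`) are illustrative and not part of Mesa, and the archives are assumed to sit in the current directory. MD5 is only suitable for catching accidental corruption, not deliberate tampering.

```python
import hashlib
from pathlib import Path


def parse_checksum_list(text):
    """Parse md5sum-style lines ("<hex digest>  <file name>") into a dict."""
    expected = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        digest, name = parts
        expected[name] = digest.lower()
    return expected


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def verify(directory, expected):
    """Map each listed file name to True, False, or None (file not found)."""
    results = {}
    for name, digest in expected.items():
        path = Path(directory) / name
        results[name] = md5_of_file(path) == digest if path.is_file() else None
    return results


# Digests copied from the 10.1.1 release notes above.
MESA_10_1_1 = """
96e63674ccfa98e7ec6eb4fee3f770c3  MesaLib-10.1.1.tar.gz
1fde7ed079df7aeb9b6a744ca033de8d  MesaLib-10.1.1.tar.bz2
e64d0a562638664b13d2edf22321df59  MesaLib-10.1.1.zip
"""

if __name__ == "__main__":
    # Assumption: the archives were downloaded into the current directory.
    for name, ok in verify(".", parse_checksum_list(MESA_10_1_1)).items():
        status = {True: "OK", False: "MISMATCH", None: "not found"}[ok]
        print(f"{name}: {status}")
```

Reading in chunks keeps memory use flat for large archives. Reporting a missing file as `None` rather than `False` keeps "not downloaded" distinct from "corrupted".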

									
										75

docs/relnotes/10.1.html
									
										Normal file
									
												
				@@ -0,0 +1,75 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.1 Release Notes / March 4, 2014</h1>

				<p>

				Mesa 10.1 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 10.1.1.

				</p>

				<p>

				Mesa 10.1 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				3ec43f79dbcd9aa2a4a27bf1f51655b6  MesaLib-10.1.0.tar.bz2

				08e796ec7122aa299d32d4f67a254315  MesaLib-10.1.0.tar.gz

				bd365356543f4b38e57c1ddf7a317c40  MesaLib-10.1.0.zip

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>GL_ARB_draw_indirect on i965.</li>

				<li>GL_ARB_clear_buffer_object</li>

				<li>GL_ARB_viewport_array on i965.</li>

				<li>GL_ARB_map_buffer_alignment on all drivers that did not previously support

				it.</li>

				<li>GL_AMD_shader_trinary_minmax.</li>

				<li>GL_EXT_framebuffer_blit on r200 and radeon.</li>

				<li>Reduced memory usage for display lists.</li>

				<li>OpenGL 3.3 support on nv50, nvc0, r600 and radeonsi</li>

				</ul>

				<h2>Bug fixes</h2>

				TBD.

				<h2>Changes</h2>

				<ul>

				<li>Removed support for the GL_MESA_texture_array extension.  This extension

				  enabled the use of texture array with fixed-function and assembly fragment

				  shaders.  No applications are known to use this extension.</li>

				</ul>

				</div>

				</body>

				</html>

									
										74

docs/relnotes/10.2.html
									
										Normal file
									
												
				@@ -0,0 +1,74 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.2 Release Notes / TBD</h1>

				<p>

				Mesa 10.2 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 10.2.1.

				</p>

				<p>

				Mesa 10.2 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				TBD.

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>GL_ARB_buffer_storage on i965, nv30, nv50, nvc0, r300, r600, and radeonsi</li>

				<li>GL_ARB_multi_bind on all drivers</li>

				<li>GL_ARB_sample_shading on nv50 (GT21x only), nvc0</li>

				<li>GL_ARB_separate_shader_objects (desktop OpenGL) and

				  GL_EXT_separate_shader_objects (OpenGL ES 2.0 and 3.0) on all drivers</li>

				<li>GL_ARB_stencil_texturing on i965/gen8+</li>

				<li>GL_ARB_texture_cube_map_array on nv50 (GT21x only)</li>

				<li>GL_ARB_texture_gather on nv50 (GT21x only), nvc0</li>

				<li>GL_ARB_texture_query_lod on nv50 (GT21x only), nvc0</li>

				<li>GL_ARB_texture_view on i965/gen7</li>

				<li>GL_ARB_vertex_type_10f_11f_11f_rev on nv50, nvc0, radeonsi</li>

				<li>GL_ARB_viewport_array on nv50, r600</li>

				<li>GL_INTEL_performance_query on i965/gen5+</li>

				</ul>

				<h2>Bug fixes</h2>

				TBD.

				<h2>Changes</h2>

				<ul>

				</ul>

				</div>

				</body>

				</html>

0

docs/RELNOTES-3.1 → docs/relnotes/3.1


0

docs/RELNOTES-3.2 → docs/relnotes/3.2


0

docs/RELNOTES-3.2.1 → docs/relnotes/3.2.1


0

docs/RELNOTES-3.3 → docs/relnotes/3.3


0

docs/RELNOTES-3.4 → docs/relnotes/3.4


0

docs/RELNOTES-3.4.1 → docs/relnotes/3.4.1


0

docs/RELNOTES-3.4.2 → docs/relnotes/3.4.2


0

docs/RELNOTES-3.5 → docs/relnotes/3.5


0

docs/RELNOTES-4.0 → docs/relnotes/4.0


0

docs/RELNOTES-4.0.1 → docs/relnotes/4.0.1


0

docs/RELNOTES-4.0.2 → docs/relnotes/4.0.2


0

docs/RELNOTES-4.0.3 → docs/relnotes/4.0.3


0

docs/RELNOTES-4.1 → docs/relnotes/4.1


0

docs/RELNOTES-5.0 → docs/relnotes/5.0


0

docs/RELNOTES-5.0.1 → docs/relnotes/5.0.1


0

docs/RELNOTES-5.0.2 → docs/relnotes/5.0.2


2

docs/RELNOTES-5.1 → docs/relnotes/5.1


@@ -106,7 +106,7 @@ Vertex/Fragment program debugger
 GL_MESA_program_debug is an experimental extension to support
 interactive debugging of vertex and fragment programs.  See the
-docs/MESA_program_debug.spec file for details.
+docs/specs/OLD/MESA_program_debug.spec file for details.
 The bulk of the vertex/fragment program debugger is implemented
 outside of Mesa.  The GL_MESA_program_debug extension just has minimal

0

docs/RELNOTES-6.0 → docs/relnotes/6.0


0

docs/RELNOTES-6.0.1 → docs/relnotes/6.0.1


0

docs/RELNOTES-6.1 → docs/relnotes/6.1


0

docs/RELNOTES-6.2 → docs/relnotes/6.2


0

docs/RELNOTES-6.2.1 → docs/relnotes/6.2.1


0

docs/RELNOTES-6.3 → docs/relnotes/6.3


0

docs/RELNOTES-6.3.1 → docs/relnotes/6.3.1


0

docs/RELNOTES-6.3.2 → docs/relnotes/6.3.2


0

docs/RELNOTES-6.4 → docs/relnotes/6.4


									
										4

docs/relnotes-6.4.1.html → docs/relnotes/6.4.1.html
									
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 6.4.1 / November 29, 2006</h1>

									
										4

docs/relnotes-6.4.2.html → docs/relnotes/6.4.2.html
									
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 6.4.2 / February 2, 2006</h1>

									
										4

docs/relnotes-6.4.html → docs/relnotes/6.4.html
									
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 6.4 / October 24, 2005</h1>

									
										4

docs/relnotes-6.5.1.html → docs/relnotes/6.5.1.html
									
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 6.5.1 Release Notes / September 15, 2006</h1>

									
										4

docs/relnotes-6.5.2.html → docs/relnotes/6.5.2.html
									
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 6.5.2 Release Notes / December 2, 2006</h1>

									
										6

docs/relnotes-6.5.3.html → docs/relnotes/6.5.3.html
									
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 6.5.3 Release Notes / April 27, 2007</h1>

				@@ -56,7 +56,7 @@ for the same reason.

				<ul>

				<li>OpenGL 2.0 and 2.1 API support.

				<li>Entirely new Shading Language code generator.  See the

				<a href="shading.html">Shading Language</a> page for more information.

				<a href="../shading.html">Shading Language</a> page for more information.

				<li>Much faster software execution of vertex, fragment shaders.

				<li>New vertex buffer object (vbo) infrastructure

				<li>Updated glext.h file (version 39)

									
										4

docs/relnotes-6.5.html → docs/relnotes/6.5.html
									
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 6.5 Release Notes / March 31, 2006</h1>

									
										4

docs/relnotes-7.0.1.html → docs/relnotes/7.0.1.html
									
												
				@@ -2,7 +2,7 @@

				<html lang="en">

				<head>

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.0.1 Release Notes / August 3, 2007</h1>

									
										4

docs/relnotes-7.0.2.html → docs/relnotes/7.0.2.html
									
												
				@@ -2,7 +2,7 @@

				<html lang="en">

				<head>

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.0.2 Release Notes / November 10, 2007</h1>

									
										4

docs/relnotes-7.0.3.html → docs/relnotes/7.0.3.html
									
												
				@@ -2,7 +2,7 @@

				<html lang="en">

				<head>

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.0.3 Release Notes / April 4, 2008</h1>

									
										4

docs/relnotes-7.0.4.html → docs/relnotes/7.0.4.html
									
												
				@@ -2,7 +2,7 @@

				<html lang="en">

				<head>

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.0.4 Release Notes / August 16, 2008</h1>

									
										4

docs/relnotes-7.0.html → docs/relnotes/7.0.html
									
												
				@@ -2,7 +2,7 @@

				<html lang="en">

				<head>

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.0 Release Notes / June 22, 2007</h1>

									
										4

docs/relnotes-7.1.html → docs/relnotes/7.1.html
									
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.1 Release Notes / August 26, 2008</h1>

									
										6

docs/relnotes-7.10.1.html → docs/relnotes/7.10.1.html
									
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.10.1 Release Notes / March 2, 2011</h1>

				@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 2.1.

				</p>

				<p>

				See the <a href="install.html">Compiling/Installing page</a> for prerequisites

				See the <a href="../install.html">Compiling/Installing page</a> for prerequisites

				for DRI hardware acceleration.

				</p>

									
										6

docs/relnotes-7.10.2.html → docs/relnotes/7.10.2.html
									
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.10.2 Release Notes / April 6, 2011</h1>

				@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 2.1.

				</p>

				<p>

				See the <a href="install.html">Compiling/Installing page</a> for prerequisites

				See the <a href="../install.html">Compiling/Installing page</a> for prerequisites

				for DRI hardware acceleration.

				</p>

									
										6

docs/relnotes-7.10.3.html → docs/relnotes/7.10.3.html
									
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.10.3 Release Notes / June 13, 2011</h1>

				@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 2.1.

				</p>

				<p>

				See the <a href="install.html">Compiling/Installing page</a> for prerequisites

				See the <a href="../install.html">Compiling/Installing page</a> for prerequisites

				for DRI hardware acceleration.

				</p>

									
										8

docs/relnotes-7.10.html → docs/relnotes/7.10.html
									
												View File
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.10 Release Notes / January 7, 2011</h1>

				@@ -27,7 +27,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 2.1.

				</p>

				<p>

				See the <a href="install.html">Compiling/Installing page</a> for prerequisites

				See the <a href="../install.html">Compiling/Installing page</a> for prerequisites

				for DRI hardware acceleration.

				</p>

				@@ -699,7 +699,7 @@ bc644be551ed585fc4f66c16b64a91c9  MesaGLUT-7.10.tar.gz

				  <li>st/egl: Plug pbuffer leaks.</li>

				  <li>st/egl: Fix eglCopyBuffers.</li>

				  <li>st/egl: Assorted fixes for dri2_display_get_configs.</li>

				  <li>docs/egl: Update egl.html.</li>

				  <li>docs/egl: Update ../egl.html.</li>

				  <li>st/egl: Fix eglChooseConfig when configs is NULL.</li>

				  <li>docs: Add an example for EGL_DRIVERS_PATH.</li>

				  <li>autoconf: Fix --with-driver=xlib --enable-openvg.</li>

									
										6

docs/relnotes-7.11.1.html → docs/relnotes/7.11.1.html
									
												View File
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.11.1 Release Notes / November 17, 2011</h1>

				@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 2.1.

				</p>

				<p>

				See the <a href="install.html">Compiling/Installing page</a> for prerequisites

				See the <a href="../install.html">Compiling/Installing page</a> for prerequisites

				for DRI hardware acceleration.

				</p>

									
										6

docs/relnotes-7.11.2.html → docs/relnotes/7.11.2.html
									
												View File
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.11.2 Release Notes / November 27, 2011</h1>

				@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 2.1.

				</p>

				<p>

				See the <a href="install.html">Compiling/Installing page</a> for prerequisites

				See the <a href="../install.html">Compiling/Installing page</a> for prerequisites

				for DRI hardware acceleration.

				</p>

									
										6

docs/relnotes-7.11.html → docs/relnotes/7.11.html
									
												View File
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.11 Release Notes / July 31, 2011</h1>

				@@ -27,7 +27,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 2.1.

				</p>

				<p>

				See the <a href="install.html">Compiling/Installing page</a> for prerequisites

				See the <a href="../install.html">Compiling/Installing page</a> for prerequisites

				for DRI hardware acceleration.

				</p>

									
										4

docs/relnotes-7.2.html → docs/relnotes/7.2.html
									
												View File
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.2 Release Notes / 20 September 2008</h1>

									
										6

docs/relnotes-7.3.html → docs/relnotes/7.3.html
									
												View File
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.3 Release Notes / 22 January 2009</h1>

				@@ -27,7 +27,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 2.1.

				</p>

				<p>

				See the <a href="install.html">Compiling/Installing page</a> for prerequisites

				See the <a href="../install.html">Compiling/Installing page</a> for prerequisites

				for DRI hardware acceleration.

				</p>

									
										6

docs/relnotes-7.4.1.html → docs/relnotes/7.4.1.html
									
												View File
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.4.1 Release Notes / 18 April 2009</h1>

				@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 2.1.

				</p>

				<p>

				See the <a href="install.html">Compiling/Installing page</a> for prerequisites

				See the <a href="../install.html">Compiling/Installing page</a> for prerequisites

				for DRI hardware acceleration.

				</p>

									
										6

docs/relnotes-7.4.2.html → docs/relnotes/7.4.2.html
									
												View File
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.4.2 Release Notes / May 15, 2009</h1>

				@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 2.1.

				</p>

				<p>

				See the <a href="install.html">Compiling/Installing page</a> for prerequisites

				See the <a href="../install.html">Compiling/Installing page</a> for prerequisites

				for DRI hardware acceleration.

				</p>

									
										6

docs/relnotes-7.4.3.html → docs/relnotes/7.4.3.html
									
												View File
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.4.3 Release Notes / 19 June 2009</h1>

				@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 2.1.

				</p>

				<p>

				See the <a href="install.html">Compiling/Installing page</a> for prerequisites

				See the <a href="../install.html">Compiling/Installing page</a> for prerequisites

				for DRI hardware acceleration.

				</p>

									
										6

docs/relnotes-7.4.4.html → docs/relnotes/7.4.4.html
									
												View File
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.4.4 Release Notes / 23 June 2009</h1>

				@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 2.1.

				</p>

				<p>

				See the <a href="install.html">Compiling/Installing page</a> for prerequisites

				See the <a href="../install.html">Compiling/Installing page</a> for prerequisites

				for DRI hardware acceleration.

				</p>

									
										6

docs/relnotes-7.4.html → docs/relnotes/7.4.html
									
												View File
												
				@@ -3,7 +3,7 @@

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				@@ -11,7 +11,7 @@

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 7.4 Release Notes / 27 March 2009</h1>

				@@ -25,7 +25,7 @@ glGetString(GL_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 2.1.

				</p>

				<p>

				See the <a href="install.html">Compiling/Installing page</a> for prerequisites

				See the <a href="../install.html">Compiling/Installing page</a> for prerequisites

				for DRI hardware acceleration.

				</p>

Compare commits

7877 Commits

mesa-9.1.4 ... 10.2-branc

1

.dir-locals.el

View File

3

Android.common.mk

View File

5

Android.mk

View File

16

Makefile.am

View File

20

SConstruct

View File

1

VERSION Normal file

View File

52

bin/bugzilla_mesa.sh Executable file

View File

8

bin/get-pick-list.sh

View File

251

bin/perf-annotate-jit Executable file

View File

6

bin/shortlog_mesa.sh

View File

3

common.py

View File

1354

configure.ac

View File

272

docs/GL3.txt

View File

256

docs/README.CYGWIN

View File

102

docs/README.MITS

View File

207

docs/README.QUAKE

View File

52

docs/README.THREADS

View File

44

docs/README.UVD Normal file

View File

43

docs/README.VCE Normal file

View File

17

docs/README.WIN32

View File

83

docs/application-issues.html Normal file

View File

37

docs/autoconf.html

View File

2

docs/conform.html

View File

1

docs/contents.html

View File

97

docs/devinfo.html

View File

10

docs/dispatch.html

View File

22

docs/egl.html

View File

64

docs/envvars.html

View File

34

docs/extensions.html

View File

8

docs/faq.html

View File

252

docs/index.html

View File

4

docs/install.html

View File

10

docs/license.html

View File

99

docs/llvmpipe.html

View File

4

docs/opengles.html

View File

2

docs/openvg.html

View File

71

docs/osmesa.html

View File

172

docs/relnotes.html

View File

150

docs/relnotes/10.0.1.html Normal file

View File

161

docs/relnotes/10.0.2.html Normal file

View File

206

docs/relnotes/10.0.3.html Normal file

View File

191

docs/relnotes/10.0.4.html Normal file

View File

173

docs/relnotes/10.0.5.html Normal file

View File

146

docs/relnotes/10.0.html Normal file

View File

254

docs/relnotes/10.1.1.html Normal file

View File

75

docs/relnotes/10.1.html Normal file

View File

74

docs/relnotes/10.2.html Normal file

View File

0

docs/RELNOTES-3.1 → docs/relnotes/3.1

View File

0

docs/RELNOTES-3.2 → docs/relnotes/3.2

View File

0

docs/RELNOTES-3.2.1 → docs/relnotes/3.2.1

View File

0

docs/RELNOTES-3.3 → docs/relnotes/3.3

View File

0

docs/RELNOTES-3.4 → docs/relnotes/3.4

View File

0

docs/RELNOTES-3.4.1 → docs/relnotes/3.4.1

View File

0

docs/RELNOTES-3.4.2 → docs/relnotes/3.4.2

View File

0

docs/RELNOTES-3.5 → docs/relnotes/3.5

View File

0

docs/RELNOTES-4.0 → docs/relnotes/4.0

View File

0

docs/RELNOTES-4.0.1 → docs/relnotes/4.0.1

View File

0

docs/RELNOTES-4.0.2 → docs/relnotes/4.0.2

View File

0

docs/RELNOTES-4.0.3 → docs/relnotes/4.0.3

View File

0

docs/RELNOTES-4.1 → docs/relnotes/4.1

View File

0

docs/RELNOTES-5.0 → docs/relnotes/5.0

View File

0

docs/RELNOTES-5.0.1 → docs/relnotes/5.0.1

View File

0

docs/RELNOTES-5.0.2 → docs/relnotes/5.0.2

View File

2

docs/RELNOTES-5.1 → docs/relnotes/5.1

View File

0

docs/RELNOTES-6.0 → docs/relnotes/6.0

View File

0

docs/RELNOTES-6.0.1 → docs/relnotes/6.0.1

View File

0

docs/RELNOTES-6.1 → docs/relnotes/6.1

View File

0

docs/RELNOTES-6.2 → docs/relnotes/6.2

View File

0

docs/RELNOTES-6.2.1 → docs/relnotes/6.2.1

View File

0

docs/RELNOTES-6.3 → docs/relnotes/6.3

View File

0

docs/RELNOTES-6.3.1 → docs/relnotes/6.3.1

View File

0

docs/RELNOTES-6.3.2 → docs/relnotes/6.3.2

View File

0

docs/RELNOTES-6.4 → docs/relnotes/6.4

View File

4

docs/relnotes-6.4.1.html → docs/relnotes/6.4.1.html

View File

4

docs/relnotes-6.4.2.html → docs/relnotes/6.4.2.html

View File

4

docs/relnotes-6.4.html → docs/relnotes/6.4.html

View File

4

docs/relnotes-6.5.1.html → docs/relnotes/6.5.1.html

View File

4

docs/relnotes-6.5.2.html → docs/relnotes/6.5.2.html

View File

6

docs/relnotes-6.5.3.html → docs/relnotes/6.5.3.html

View File