Comparing 035cce83f7...c2a0600d5b - mesa

fran/mesa

Author	SHA1	Message	Date
Kenneth Graunke	c2a0600d5b	i965: Don't set NirOptions for stages that will use the vec4 backend. We've started using NirOptions != NULL to mean "we're using NIR for this stage." However, when INTEL_USE_NIR=1, we set it for a bunch of stages that still use the vec4 backend, and thus definitely aren't using NIR. For example, if INTEL_USE_NIR=1 we disable the GLSL IR cubemap normalization pass, even for vertex shaders and geometry shaders. This is wrong, but breaks a very uncommon case. When I started deleting GLSL IR for stages where we claimed to be using NIR, this bug quickly became apparent. For now, only set it for fragment shaders, and vertex shaders if brw->scalar_vs is set. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-10 16:22:48 -07:00
Nick Sarnie	f9048ee3c8	gallivm: Fix build since llvm-3.7.0svn r234495 Revert `50e9fa2ed6` as LLVM reverted their change. Signed-off-by: Nick Sarnie <commendsarnex@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>	2015-04-10 13:30:23 -04:00
Ville Syrjälä	50db8bd1b5	i965/disasm: Print the type after the swizzle also for 3src src operands The disassembly currently has the swizzle after the type for 3src source operands, and the other way around for 2src. Flip the type and swizzle around for 3src so that the output matches 2src. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2015-04-10 14:53:12 +03:00
Kenneth Graunke	ae17f34850	i965: Move brw_link_shader's GLSL IR transformations into a helper. This function was getting a bit large and unwieldy. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-04-10 02:16:37 -07:00
Kenneth Graunke	10d85ffc5a	i965: Change brw_shader to gl_shader in brw_link_shader(). Nothing actually wanted brw_shader fields - we just had to type shader->base all over the place for no reason. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-04-10 02:16:35 -07:00
Kenneth Graunke	500da98e0b	nir: Constify nir_lower_sampler's gl_shader_program pointer. Now that we're not generating linker errors, we don't actually modify this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-04-10 02:16:33 -07:00
Kenneth Graunke	709b88ccd8	nir: Remove linker_error calls from nir_lower_samplers(). These should never happen. Plus, NIR passes really shouldn't be reporting linker errors - this is past link time. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-04-10 02:16:31 -07:00
Kenneth Graunke	99264b7f37	nir: Make nir_lower_samplers take a gl_shader_stage, not a gl_program *. We don't actually need a gl_program struct. We only used it to translate prog->Target (i.e. GL_VERTEX_PROGRAM) to the gl_shader_stage (i.e. MESA_SHADER_VERTEX). We may as well just pass that. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-04-10 02:16:29 -07:00
Kenneth Graunke	4b27391cad	nir: Move gl_shader_stage enum from mtypes.h to shader_enums.h. I want to use this in some code that doesn't currently include mtypes.h. It seems like a better place for it anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-04-10 02:16:27 -07:00
Kenneth Graunke	feafe70399	nir: Fix #include guards in shader_enums.h. This header was originally going to be called pipeline.h, but it got renamed at the last minute. Make the include guards match. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-04-10 02:16:25 -07:00
Kenneth Graunke	d0f39a2fcd	nir: Constify prog_to_nir's gl_program pointer. prog_to_nir should not modify the incoming Mesa IR program - just translate it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-04-10 02:15:58 -07:00
Vinson Lee	50e9fa2ed6	gallivm: Fix build since llvm-3.7.0svn r234460. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89963 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-04-09 10:41:26 -07:00
Roland Scheidegger	a873b79fa5	draw: (trivial) don't print the shader twice with GALLIVM_DEBUG=tgsi (or ir) Neither the shader nor the key change when doing elts or linear variant, so this was just annoying (probably mildly useful at some point when we printed the IR per function too). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-04-09 01:32:30 +02:00
Roland Scheidegger	586536a4e1	gallivm: don't use control flow when doing indirect constant buffer lookups llvm goes crazy when doing that, using way more memory and time, though there's probably more to it - this points to a very much similar issue as fixed in `8a9f5ecdb1`. In any case I've seen a quite plain looking vertex shader with just ~50 simple tgsi instructions (but with a dozen or so such indirect constant buffer lookups) go from a terribly high ~440ms compile time (consuming 25MB of memory in the process) down to a still awful ~230ms and 13MB with this fix (with llvm 3.3), so there's still obvious improvements possible (but I have no clue why it's so slow...). The resulting shader is most likely also faster (certainly seemed so though I don't have any hard numbers as it may have been influenced by compile times) since generally fetching constants outside the buffer range is most likely an app error (that is we expect all indices to be valid). It is possible this fixes some mysterious vertex shader slowdowns we've seen ever since we are conforming to newer apis at least partially (the main draw loop also has similar looking conditionals which we probably could do without - if not for the fetch at least for the additional elts condition.) v2: use static vars for the fake bufs, minor code cleanups Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-04-09 01:32:30 +02:00
Brian Paul	09e7e2016b	glsl: check for forced_language_version in is_version() This is a follow-on fix from the earlier "glsl: allow ForceGLSLVersion to override #version directives" change. Since we're not changing the language_version field, we have to check forced_language_version here. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-04-08 17:03:16 -06:00
Neil Roberts	4deca1274c	i965/skl: Fix the order of the arguments for the LD sampler message In Skylake the order of the arguments for sample messages with the LD type are u, v, lod, r whereas previously they were u, lod, v, r. This fixes 144 Piglit tests including ones that directly use texelFetch and also some using the meta stencil blit path which appears to use texelFetch in its shader. v2: Fix sampling 1D textures Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-04-08 12:08:41 +01:00
Zhenyu Wang	eb51c6d55f	i965: Fix depth field setting in surface state for raw buffer on Gen7/8 On Gen7/8 for RAW surface format, the depth field (surf[3]) in surface state means [30:21] bits of number of entries which is different from other surface format which uses [26:21] bits field. Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-04-08 13:20:17 +08:00
Dave Airlie	6b722c390b	u_tile: fix warnings about incompatible casts. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-04-08 10:31:42 +10:00
Glenn Kennard	f2947807c8	r600g/sb: Enable SB for geometry shaders Add SV_GEOMETRY_EMIT special variable type to track the implicit dependencies between CUT/EMIT_VERTEX/MEM_RING instructions so GCM/scheduler doesn't reorder them. Mark emit instructions as unkillable so DCE doesn't eat them. Enable only for evergreen/cayman as there are a few unexplained GS piglit regressions on R6xx/R7xx with SB enabled otherwise. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-04-08 08:18:35 +10:00
Glenn Kennard	06bb68da4a	r600g/sb: Update last_cf for loops CF_END could end up emitted in the middle of a shader on cayman when there was a loop at the very end. Fixes glsl-1.50-geometry-end-primitive and ext_transform_feedback-geometry-shaders-basic piglit tests. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-04-08 08:18:17 +10:00
Dave Airlie	61393bdcdc	u_tile: fix stencil texturing tests under softpipe arb_stencil_texturing-draw failed under softpipe because we got a float back from the texturing function, and then tried to U2F it, stencil texturing returns ints, so we should fix the tiling to retrieve the stencil values as integers not floats. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-04-08 08:17:32 +10:00
Jason Ekstrand	11694737fc	nir: Make nir__instr_create take a nir_shader instead of a void context Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-04-07 14:34:21 -07:00
Kenneth Graunke	a10d493715	nir: Implement a nir_sweep() pass. This pass performs a mark and sweep pass over a nir_shader's associated memory - anything still connected to the program will be kept, and any dead memory we dropped on the floor will be freed. The expectation is that this will be called when finished building and optimizing the shader. However, it's also fine to call it earlier, and many times, to free up memory earlier. v2: (feedback from Jason Ekstrand) - Skip sweeping impl->start_block, as it's already in the CF list. - Don't sweep SSA defs (they're owned by their defining instruction) - Don't steal phi sources (they're owned by nir_phi_instr). - Don't steal tex->src (it's owned by the tex_inst itself) - Don't sweep dereference chains (top-level dereferences are owned by the instruction; sub-dereferences are owned by the parent deref). - Don't sweep sources and destinations (SSA defs are handled as part of the defining instruction, and registers are handled as part of function implementations). - Just steal instructions; don't walk them (no longer required). v3: (feedback from Jason Ekstrand) - Steal indirect sources from nir_src/nir_dest. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-07 14:34:14 -07:00
Kenneth Graunke	de2014cf1e	nir: Allocate dereferences out of their parent instruction or deref. Jason pointed out that variable dereferences in NIR are really part of their parent instruction, and should have the same lifetime. Unlike in GLSL IR, they're not used very often - just for intrinsic variables, call parameters & return, and indirect samplers for texturing. Also, nir_deref_var is the top-level concept, and nir_deref_array/nir_deref_record are child nodes. This patch attempts to allocate nir_deref_vars out of their parent instruction, and any sub-dereferences out of their parent deref. It enforces these restrictions in the validator as well. This means that freeing an instruction should free its associated dereference chain as well. The memory sweeper pass can also happily ignore them. v2: Rename make_deref to evaluate_deref and make it take a nir_instr * instead of void *. This involves adding &instr->instr everywhere. (Requested by Jason Ekstrand.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-07 14:34:14 -07:00
Kenneth Graunke	4f4b04b7c7	nir: Allocate nir_ssa_def::uses/if_uses out of the instruction. We can't allocate them out of the nir_ssa_def itself, because it may not be ralloc'd (for example, nir_dest embeds a nir_ssa_def). However, allocating them out of the instruction should work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-07 14:34:13 -07:00
Kenneth Graunke	900498bd11	nir: Allocate nir_phi_src values out of the nir_phi_instr. Phi sources are part of the phi instruction and should have the same lifetime. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-07 14:34:13 -07:00
Kenneth Graunke	b05d53404c	nir: Allocate nir_call_instr::params out of the nir_call itself. The lifetime of the params array needs to match that of the nir_call_instr itself. So, allocate it using the instruction itself as the context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-07 14:34:13 -07:00
Kenneth Graunke	73d106822e	i965: Add the ability to render to I8/L8 and I16/L16 UNORM formats. This allows those formats to work with the meta PBO upload path. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-04-07 14:34:02 -07:00
Kenneth Graunke	60dcd97257	i965: Use SET_FIELD in 3DSTATE_STREAMOUT packets. Suggested by Topi Pohjolainen. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-04-07 14:34:02 -07:00
Jason Ekstrand	2e3b35a1cb	nir/lower_tex_projector: Don't use designated initializers These don't work in MSVC or in older versions of GCC Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89899 Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2015-04-07 11:49:39 -07:00
Tapani Pälli	1aa5738e66	glsl: relax input->output validation for SSO programs Commit `18004c3` introduced more restrictive validation to linker between inputs and outputs. This patch skips the additional check for programs that utilize GL_ARB_separate_shader_objects, where inputs and outputs might not match exactly during linking but only when constructing the final pipeline. This made some of the GL_ARB_program_interface_query test shaders fail to link, these tests can be used to verify the change. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-04-07 08:11:07 +03:00
Ilia Mirkin	ae720c66cb	nv50,nvc0: limit the y-tiling of 3d textures to the first level's tiling We limit y-tiling to 0x20 when depth is involved. However the function is run for each miplevel, and the hardware expects miplevel 0 to have the highest tiling settings. Perform the y-tiling limit on all levels of a 3d texture, not just the ones that have depth. Fixes: texelFetch fs sampler3D 98x129x1-98x129x9 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Nick Tenney <nick.tenney@gmail.com> # GT216 Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>	2015-04-06 23:06:55 -04:00
Dave Airlie	ad84689f73	r600g: fix op3 abs issue This code to handle absolute values on op3 srcs was a bit too simple, it really needs a temp reg per src, not one per channel, make it easier and let sb clean up the mess. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89831 Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-04-07 11:40:16 +10:00
Iago Toral Quiroga	2042a2f961	i965: Do not render primitives in non-zero streams when TF is disabled Haswell hardware seems to ignore Render Stream Select bits from 3DSTATE_STREAMOUT packet when the SOL stage is disabled even if the PRM says otherwise. Because of this, all primitives are sent down the pipeline for rasterization, which is wrong. If SOL is enabled, Render Stream Select is honored and primitives bound to non-zero streams are discarded after stream output. Since the only purpose of primitives sent to non-zero streams is to be recorded by transform feedback, we can simply discard all geometry bound to non-zero streams when transform feedback is disabled to prevent it from ever reaching the rasterization stage. Notice that this patch introduces a small change in the behavior we get when a geometry shader emits more vertices than the maximum declared: before, a vertex that was emitted to a non-zero stream when TF was disabled would still count for the purposes of checking that we don't exceed the maximum number of output vertices declared by the shader. With this change, these vertices are completely ignored and won't increase the output vertex count, making more room for other (hopefully more useful) vertices. Fixes piglit test arb_gpu_shader5-emitstreamvertex_nodraw on Haswell and Broadwell. v2 (Ken): Drop is_haswell check in favor of doing this unconditionally. Broadwell needs the workaround as well, and it doesn't hurt to do it in general. Also tweak comments - the Haswell PRM does actually mention this ("Command Reference: Instructions" page 797). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83962 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2015-04-06 16:00:41 -07:00
Kenneth Graunke	f368d0fa1f	i965: Add forgotten multi-stream code to Gen8 SOL state. Fixes Piglit's arb_gpu_shader5-xfb-streams-without-invocations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Cc: mesa-stable@lists.freedesktop.org	2015-04-06 14:07:28 -07:00
Kenneth Graunke	f9e5dc0a85	i965: Fix instanced geometry shaders on Gen8+. Jordan added this in commit `741782b594` for Gen7 platforms. I missed this when adding the Broadwell code. Fixes Piglit's spec/arb_gpu_shader5/invocation-id-{basic,in-separate-gs} with MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader5 set. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Cc: mesa-stable@lists.freedesktop.org	2015-04-06 14:06:26 -07:00
Kenneth Graunke	a09c5b8527	i965: Free dead GLSL IR one last time. While working on NIR's memory allocation model, I realized the GLSL IR memory model was broken. During glCompileShader, we allocate everything out of the _mesa_glsl_parse_state context, and reparent it to gl_shader at the end. During glLinkProgram, we allocate everything out of a temporary context, then reparent it to the exec_list containing the linked IR. But during brw_link_shader - the driver's final opportunity to do lowering and optimization - we just allocated everything out of the permanent context given to us by the linker. That memory stayed forever. Notably, passes like brw_fs_channel_expressions cause us to churn the majority of the code, so we really want to free dead IR here. Saves 125MB of memory when replaying a Dota 2 trace on Broadwell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-06 14:03:43 -07:00
Kenneth Graunke	797d606127	i965: Implement SIMD16 texturing on Gen4. This allows SIMD16 mode to work for a lot more programs. Texturing is also more efficient in SIMD16 mode than SIMD8. Several messages don't actually exist in SIMD8 mode, so we did SIMD16 messages and threw away half of the data. Now we compute real data in both halves. Also, the SIMD16 "sample" message doesn't require all three coordinate components to exist (like the SIMD8 one), so we can shorten the message lengths, cutting register usage a bit. I chose to implement the visitor functionality in a separate function, since mixing true SIMD16 with SIMD8 code that uses SIMD16 fallbacks seemed like a mess. The new code bails on a few cases where we'd have to do two SIMD8 messages - we just fall back to SIMD8 for now. Improves performance in "Shadowrun: Dragonfall - Director's Cut" by about 20% on GM45 (measured with LIBGL_SHOW_FPS=1 while standing around in the first mission). v2: Add ir_txf to the has_lod case (caught by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-04-06 13:49:02 -07:00
Kenneth Graunke	8aee87fe4c	i965: Use SIMD16 instead of SIMD8 on Gen4 when possible. Gen5+ systems allow you to specify multiple shader programs - both SIMD8 and SIMD16 - and the hardware will automatically dispatch to the most appropriate one, given the number of subspans to be processed. However, that is not the case on Gen4. Instead, you program a single shader. If you enable multiple dispatch modes (SIMD8 and SIMD16), the shader is supposed to contain a series of jump instructions at the beginning. The hardware will launch the shader at a small offset, hitting one of the jumps. We've always thought that sounds like a pain, and weren't clear how it affected performance - is it worth having multiple shader types? So, we never bothered with SIMD16 until now. This patch takes a simpler approach: try and compile a SIMD16 shader. If possible, set the no_8 flag, telling the hardware to just use the SIMD16 variant all the time. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-04-06 13:49:02 -07:00
Kenneth Graunke	108b92b1e9	i965: Respect the no_8 flag on Gen4-5. This flag means to ignore the SIMD8 program and only use the SIMD16 one. It was originally meant for repdata clear shaders, but I plan to use it for other things on Gen4 as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-04-06 13:49:02 -07:00
Kenneth Graunke	62050886c8	i965/fp: Set coord_components correctly for cube textures. I've no idea why this was 4. It certainly seems wrong. Prevents assertion failures in fp-incomplete-tex with some upcoming patches of mine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-04-06 13:49:01 -07:00
Ian Romanick	dd7d068784	glsl/cse: Maintain a list of free ae_entry objects The CSE algorithm will continuously allocate new ae_entry objects. As each new basic block is exited, all of the previously allocated objects are dumped. Instead, put them in a free list and re-use them in the next basic block. Reduce, reuse, recycle! Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com>	2015-04-06 11:53:59 -07:00
Matt Turner	d131630c08	nir: Remove fsin_reduced/fcos_reduced. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-06 10:13:22 -07:00
Matt Turner	c8d65dd713	st/mesa: Remove unused emit_scs(). Was only used by the sin_reduced/cos_reduced cases, which themselves were impossible to reach. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-06 10:13:22 -07:00
Matt Turner	5fb735b756	program: Remove unused emit_scs(). Was only used by the sin_reduced/cos_reduced cases, which themselves were impossible to reach. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-06 10:13:22 -07:00
Matt Turner	cdb1eb9a3f	i965/vec4: Remove emit_scs() prototype. This has never existed. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-06 10:13:22 -07:00
Matt Turner	5c71cf8531	glsl: Remove never used sin_reduced/cos_reduced. These were added in commit `f2616e56`, presumably in preparation for translating ARB vp/fp into GLSL IR. That never happened, and neither did a lowering pass that actually generated these instructions. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-06 10:13:22 -07:00
Antia Puentes	490621f0f2	glsl: Update the #line behaviour on GLSL 3.30+ and GLSL ES+ From GLSL 3.30 and GLSL ES 1.00 on, after processing the line directive (including its new-line), the implementation should behave as if it is compiling at the line number passed as argument. In previous versions, it behaved as if compiling at the passed line number + 1. Partially fixes https://bugs.freedesktop.org/show_bug.cgi?id=88815 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-04-06 08:55:10 +02:00
Antia Puentes	c0a7014601	glsl: respect the source number set by #line <line> <source> From GLSL 1.30.10, section 3.3 (Preprocessor): "#line line source-string-number ... After processing this directive (including its new-line), the implementation will behave as if it is compiling at ... source string number source-string-number. Subsequent source strings will be numbered sequentially, until another #line directive overrides that numbering." In the previous implementation the source number was always zero. Subsequent source strings are still not numbered sequentially, because in the glShaderSource implementation we are concatenating the source code strings into one long string. Partially fixes https://bugs.freedesktop.org/show_bug.cgi?id=88815 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-04-06 08:50:41 +02:00
Iago Toral Quiroga	47597f8f5c	i965: Make sure we always mark array surfaces as such Even if they only have one slice, otherwise textureSize() won't produce correct results for the depth value. Fixes 10 dEQP tests in this category: dEQP-GLES3.functional.shaders.texture_functions.texturesize.sampler2darray* Reviewed-by: Mark Janes <mark.a.janes at intel.com>	2015-04-06 08:07:42 +02:00
Rob Clark	8b0b81339b	freedreno/ir3: add NIR compiler The NIR compiler frontend is an alternative to the TGSI f/e, producing the same ir3 IR and using the same backend passes for scheduling, etc. It is not enabled by default yet, as there are still some regressions. To enable, use 'FD_MESA_DEBUG=nir'. It is enough to use with, for example, xonotic or supertuxkart. With the NIR f/e, scalarizing and a number of other lowering steps happen in NIR, so we don't have to do them in ir3. Which simplifies the f/e and allows the lowered instructions to pass through other optimization stages. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-05 16:36:40 -04:00
Ilia Mirkin	700d949ea1	freedreno/a3xx: don't decode srgb on mem2gmem Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:35 -04:00
Ilia Mirkin	b060b56772	freedreno/a3xx: pass sprite coord mode through to program emit Use the correct sprite replacement depending on the flip of the coord mode, using either T or 1-T depending on whether we have an upper-left or lower-left coordinate origin. This fixes all the point sprite piglits. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:35 -04:00
Ilia Mirkin	1de72dfc8a	freedreno/a3xx: add UBO support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:35 -04:00
Ilia Mirkin	c7811f56c2	freedreno/ir3: insert nop between sfu/mem operations Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:35 -04:00
Ilia Mirkin	14dfd8cc43	freedreno: dirty context when reallocating a bound bo Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:35 -04:00
Ilia Mirkin	bde2045fa2	freedreno: keep track of buffer valid ranges Copies nouveau_buffer and radeon_buffer. This allows a write to proceed to an uninitialized part of a buffer even when the GPU is using the previously-initialized portions. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:35 -04:00
Ilia Mirkin	dacf22e0a3	freedreno: mark resources as being read so that writes flush the queue Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:34 -04:00
Ilia Mirkin	2e1445c8f3	freedreno: don't bother setting resource timestamps Waiting on a bo being ready is handled in fd_bo_cpu_prep. No need to keep separate timestamps around. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:34 -04:00
Ilia Mirkin	1fee3061d5	freedreno: add a reading flag to indicate gpu is reading rsc Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:34 -04:00
Ilia Mirkin	ea0952a9db	freedreno: fix resource flushing confusion A resource flush is an upload of a hypothetically-staging texture to the GPU. For a UMA system, this will largely be a no-op or cache-maintenance. Move the render flush logic into transfer_map where it belongs, and clear out the transfer_flush function. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:34 -04:00
Ilia Mirkin	bfb0a8eb69	freedreno: remove tex_resource pipe_sampler_view already contains a texture, remove the redundant tex_resource member which pointed at the same thing. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-05 16:36:34 -04:00
Rob Clark	6cd9c94ce4	freedreno/ir3: handle FRAG IN's without interpolation specified Fallback to picking based on semantic name. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-05 16:36:34 -04:00
Rob Clark	f513f006ce	freedreno/ir3/cmdline: add @const headers for immediates Since NIR f/e currently encodes immediates in instructions (rather than passing via const), we need to ensure that when const's are used they get initialized to the proper values. Otherwise comparing NIR to TGSI compiler, it will use proper immediate values in one case, and randomly initialize values in the other. Which confuses ir3test. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-05 16:36:34 -04:00
Rob Clark	6bc12bb5fd	freedreno/ir3/cmdline: remove hack for old compiler Since we dropped the old compiler, we don't need this hack anymore. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-05 16:36:34 -04:00
Rob Clark	f370e95421	freedreno/ir3: handle const/immed/abs/neg in cp Be smarter about propagating copies from const or immed, or with abs/neg modifiers. Also, realize that absneg.s and absneg.f are really "fancy" mov instructions. This opens up the possibility to remove more copies. It helps the TGSI frontend a bit, but will be really needed for the NIR f/e which builds everything up in SSA form (ie. will always insert a mov from const or immediate). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-05 16:36:34 -04:00
Rob Clark	104713d9f2	freedreno/ir3: split float/int abs/neg Even though in the end, they map to the same bits, the backend will need to be able to differentiate float abs/neg vs integer abs/neg. Rather than making the backend figure it out based on instruction opcode (which when combined with mov/absneg instructions, can be awkward), just split out different flags for each so the frontend can signal its intentions more clearly. Also, since (neg) for bitwise op's is actually a bitwise-not, split it out into bnot flag. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-05 12:44:01 -04:00
Rob Clark	203f37540a	freedreno/ir3: add ir3 builder helpers Add helpers for constructing SSA forms of instructions. Only partial cat5/cat6 coverage.. but we can add stuff as needed. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-05 12:44:01 -04:00
Rob Clark	b1c9fb9fca	freedreno/ir3: fix sam argument order comment Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-05 12:44:01 -04:00
Rob Clark	101142c401	xa: support for drivers which use NIR We need to pull in libnir.la and its dependency libglsl_util.la. Also, _mesa_error_no_memory() must be defined. Fortunately with libnir.la (vs pulling in all of libglsl.la) we don't also need libstdc++. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-05 09:24:17 -04:00
Rob Clark	1c857727a1	build: add libnir.la If we want to use NIR from state trackers that don't already pull in the whole of glsl (ie. anything other than mesa state tracker), we need a separate more minimal libnir. Possibly NIR should be better split out from glsl, but for now, generate a second smaller libnir.la for those who just want NIR but not all of glsl. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-04-05 09:24:17 -04:00
Rob Clark	52282fa42d	gallium/ttn: MOD is an integer instruction Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-05 09:24:17 -04:00
Rob Clark	7579ae422a	gallium/ttn: add UMAD Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-05 09:24:17 -04:00
Rob Clark	f2ecc95e44	nir: add lowering for idiv/udiv/umod Based on the algo from NV50LegalizeSSA::handleDIV() and handleMOD(). See also trans_idiv() in freedreno/ir3/ir3_compiler.c (which was an adaptation of the nv50 code from Ilia Mirkin). A python/numpy script which implements the same algorithm (and is possibly useful for debugging or analysis) can be found here: http://people.freedesktop.org/~robclark/div-lowering.py I've tested this on i965 hacked up to insert the idiv lowering pass, and on freedreno with NIR frontend. Signed-off-by: Rob Clark <robclark@freedesktop.org> Tested-by: Eric Anholt <eric@anholt.net> (vc4)	2015-04-05 09:20:35 -04:00
Rob Clark	7880bea2fb	nir: fix typo for f2b/i2b/b2i expressions (v2) v2: discovered that i2b/b2i are also confused Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-05 08:56:24 -04:00
Rob Clark	6829d76e02	nir: add option to lower slt/sge/seq/sne In freedreno these get implemented as the matching f* instruction plus a u2f to convert the result to float 1.0/0.0. But less lines of code to just let nir_opt_algebraic handle this for us, plus opens up some small window for other opt passes to improve (ie. if some shader ended up with both a flt and slt with same src args, for example). v2: use b2f rather than u2f Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-05 08:56:24 -04:00
Mathias Froehlich	24b78fe54e	mesa: Remove unused variables left over from `107ae27e57`. Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2015-04-05 09:40:47 +02:00
Mathias Fröhlich	fdd90fcb15	i965: Implement support for ARB_clip_control. Switch between the two clip space definitions already available in hardware. Update winding order dependent state according to the clip control state. This change did not introduce new piglit quick.test regressions on an Ivybridge Mobile and a GM45 Express chipset. Also it enables and passes the clip-control and clip-control-depth-precision tests on these two chipsets. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2015-04-05 08:01:47 +02:00
Mathias Froehlich	107ae27e57	mesa: Remove the _WindowMap from gl_viewport_attrib. The _WindowMap can be dropped from gl_viewport_attrib now. Simplify gl_viewport_attrib handling where possible. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2015-04-05 08:01:47 +02:00
Mathias Froehlich	29e6c7dbc5	tnl: Maintain the _WindowMap matrix in TNLcontext v2. This is the only real user of _WindowMap which has the depth buffer scaling multiplied in. Maintain the _WindowMap of the one and only viewport inside TNLcontext. v2: Remove unneeded parentheses. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2015-04-05 08:01:47 +02:00
Mathias Froehlich	472913ea75	radeon: Make use of _mesa_get_viewport_xform v2. Instead of _WindowMap just use the translation and scale of the viewport transform directly. Thereby avoid dividing by _DepthMaxF again. v2: Change order of assignments. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2015-04-05 08:01:46 +02:00
Mathias Froehlich	a8ceb8e450	i965: Make use of _mesa_get_viewport_xform. Instead of _WindowMap just use the translation and scale of the viewport transform directly. Thereby avoid dividing by _DepthMaxF again. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2015-04-05 08:01:46 +02:00
Ilia Mirkin	ba353935a3	nv50: allocate more offset space for occlusion queries Commit `1a170980a0` started writing to q->data[4]/[5] but kept the per-query space at 16, which meant that in some cases we would write past the end of the buffer. Rotate by 32, like nvc0 does. This ensures that we always have 32 bytes in front of us, and the data writes will go within the allocated space. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89679 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Nick Tenney <nick.tenney@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>	2015-04-04 11:30:03 -04:00
Jason Ekstrand	9c53e80b9b	nir/lower_samplers: Use the right memory context for realloc'ing tex sources As of `da5ec2a`, we allocate instruction sources out of the instruction itself. When we realloc the texture sources we need to use the right memory context or ralloc will get angry and assert-fail Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-04-03 17:02:20 -07:00
Jason Ekstrand	1bd1fc248c	i965: Use brw_nir_cubemap_normalize for NIR shaders Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-04-03 14:12:49 -07:00
Jason Ekstrand	52e718097f	nir: Add a cubemap normalizing pass This commit adds a pass to L1-normalize cube-map coordinates. Some hardware such as i965 requires that largest cube-map coordinate is +-1. We had a pass to perform this normalization in GLSL IR but we need it in NIR for cube maps on ARB programs to work correctly. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> v2 (Suggested by Eric): - Do a vector fabs and split into components later - Move to core NIR Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-03 14:12:49 -07:00
Jason Ekstrand	bff4213326	i965: Check the INTEL_USE_NIR environment variable once at context creation Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-04-03 14:12:49 -07:00
Jason Ekstrand	dccc57eaba	nir/from_ssa: Don't set reg->parent_instr for ssa_undef instructions Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-04-03 14:04:31 -07:00
Jason Ekstrand	7bdba4a245	nir: Add a src_get_parent_instr function Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-04-03 14:04:12 -07:00
Eric Anholt	cb966fb2be	i965: Use the tex projector lowering pass instead of hand-rolling it. This only impacts the ARB_fp path. We can't quite disable the GLSL-level lowering pass, because it needs to apply before brw_do_lower_unnormalized_offset(). total instructions in shared programs: 5667857 -> 5667847 (-0.00%) instructions in affected programs: 1114 -> 1104 (-0.90%) helped: 16 HURT: 6 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-03 11:50:27 -07:00
Eric Anholt	ea811b7868	nir: Add a lowering pass for texture projectors. Not much hardware wants them these days, and it might give us a chance to do CSE or algebraic at the NIR level. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-03 11:50:24 -07:00
Eric Anholt	64bdfc698d	nir: Add an interface to turn a nir_src into a nir_ssa_def. We use nir_ssa_defs for nir_builder args, so this takes a nir_src and makes one so it can be passed in. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-03 11:50:22 -07:00
Eric Anholt	ec02970205	nir: Add an interface for the builder to insert instructions before. So far we'd only used nir_builder to build brand new programs. But if we're doing modifications to instructions (like in a lowering pass), then we want to generate new stuff before the instruction we're modifying. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-03 11:50:18 -07:00
Jose Fonseca	328375d274	gallium: fix gcc compile errors when using _XOPEN_SOURCE=600 but not std=c99 The fpclassify stuff either needs std=c99 or _XOPEN_SOURCE=600 passed to gcc, but when using the latter the lrint family of functions will be defined too.	2015-04-03 19:22:09 +02:00
Carl Worth	b9b66985c3	i965: Rename do_<stage>_prog to brw_compile_<stage>_prog (and export) This is in preparation for these functions to be called from other files. This commit is intended to have no functional change. It exists in preparation for some upcoming code movement in preparation for the shader cache. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-04-02 22:15:45 -07:00
Carl Worth	a57672f18d	i965: Split out per-stage dirty-bit checking into separate functions The dirty-bit checking from each brw_upload_<stage>_prog function is split out into a new brw_<stage>_state_dirty function. This commit is intended to have no functional change. It exists in preparation for some upcoming code movement in preparation for the shader cache. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-04-02 22:15:45 -07:00
Carl Worth	28510d69ff	i965: Split out brw_<stage>_populate_key into their own functions This commit splits portions of the existing brw_upload_vs_prog and brw_upload_gs_prog functions into new brw_vs_populate_key and brw_gs_populate_key functions. This follows the same style as is already present for all other stages (see brw_wm_populate_key, etc.). This commit is intended to have no functional change. It exists in preparation for some upcoming code movement in preparation for the shader cache. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-04-02 22:15:45 -07:00
Ilia Mirkin	01d3b750b3	nv50/ir: avoid folding immediates into imad operations Commit `09ee907266` added logic to fold immediates into mad operations, but the emission code is only there for fmad. Only allow it on float types. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 18:42:31 -04:00
Ilia Mirkin	603d28f32c	nv50/ir: fix imad emission when dst == src2 Commit `fb63df2215` added 4-byte mad support, but only supported emission for floats. Disable it for ints for now. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 18:35:59 -04:00
Kenneth Graunke	da5ec2ac0b	nir: Allocate nir_tex_instr::sources out of the instruction itself. The lifetime of the sources array needs to match the nir_tex_instr itself. So, allocate it using the instruction itself as the context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-02 14:20:03 -07:00
Kenneth Graunke	7380c641b1	nir: Allocate predecessor and dominance frontier sets from block itself. These sets are part of the block, and their lifetime needs to match the block itself. So, allocate them using the block itself as the context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-02 14:20:02 -07:00
Kenneth Graunke	131444e1c5	nir: Allocate register fields out of the register itself. The lifetime of each register's use/def/if_use sets needs to match the register itself. So, allocate them using the register itself as the context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-02 14:20:01 -07:00
Kenneth Graunke	587b3a20a1	nir: Make nir_create_function() strdup the function name. glsl_to_nir passes in the ir_function's name field; we were copying the pointer, but not duplicating the memory. We want to be able to free the linked GLSL IR program after translating to NIR, so we'll need to create a copy of the function name that the NIR shader actually owns. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-02 14:20:00 -07:00
Kenneth Graunke	f61b6c3e48	nir: Free dead variables when removing them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-02 14:19:58 -07:00
Kenneth Graunke	f4e4491080	nir: Combine remove_dead_local_vars() and remove_dead_global_vars(). We can just pass a pointer to the list of variables, and reuse the code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-02 14:19:56 -07:00
Kenneth Graunke	33f0f68d59	ralloc: Implement a new ralloc_adopt() API. ralloc_adopt() reparents all children from one context to another. Conceptually, ralloc_adopt(new_ctx, old_ctx) behaves like this pseudocode: foreach child of old_ctx: ralloc_steal(new_ctx, child) However, ralloc provides no way to iterate over a memory context's children, and ralloc_adopt does this task more efficiently anyway. One potential use of this is to implement a memory-sweeper pass: first, steal all of a context's memory to a temporary context. Then, walk over anything that should be kept, and ralloc_steal it back to the original context. Finally, free the temporary context. This works when the context is something that can't be freed (i.e. an important structure). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-02 14:19:41 -07:00
Jason Ekstrand	ca3b4d6d17	nir/opt_peephole_ffma: Fix a couple typos in a comment Acked-by: Matt Turner <mattst88@gmail.com>	2015-04-02 11:09:37 -07:00
Ilia Mirkin	4609ba6ea3	mesa: add ARB_depth_buffer_float to ES3.0 required extension list Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-02 13:35:18 -04:00
Eric Anholt	a9152376b4	vc4: Add support for nir_iabs. Tested using the GLSL 1.30 tests for integer abs(). Not currently used, but it was one of the new opcodes used by robclark's idiv lowering.	2015-04-02 10:32:35 -07:00
Jason Ekstrand	e50cf5faa5	i965/generator: Get rid of the ! in the unreachable statement Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2015-04-02 10:21:18 -07:00
Jason Ekstrand	0573d0e484	nir/print: Correctly print swizzles for explicitly sized alu sources Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-02 10:21:18 -07:00
Ilia Mirkin	4a3c0e9950	freedreno/a3xx: add MRT support The hardware only supports 4 MRTs. It should be possible to emulate support for 8, but doesn't seem worth the trouble. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 00:09:14 -04:00
Ilia Mirkin	6f4c1976f4	freedreno: convert blit program to array for each number of rts Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 00:09:14 -04:00
Ilia Mirkin	d9992ab35a	freedreno: add support for laying out MRTs in gmem Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 00:09:14 -04:00
Ilia Mirkin	602bc6c88d	freedreno: add core infrastructure support for MRTs Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 00:09:14 -04:00
Ilia Mirkin	d13803c76f	freedreno/ir3: add support for FS_COLOR0_WRITES_ALL_CBUFS property This will enable the driver to tell which regids to link up to which MRT outputs. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 00:09:14 -04:00
Ilia Mirkin	f27ec59084	freedreno/a3xx: add independent blend function support This is needed for MRT support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 00:09:14 -04:00
Ilia Mirkin	8efa3e340d	freedreno: remove alpha key from ir3_shader This complication is unnecessary and makes MRTs more complicated and likely to generate tons of variants. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-02 00:09:14 -04:00
Stéphane Marchesin	70eed78cac	i915g: Implement EGL_EXT_image_dma_buf_import This adds all the plumbing to get EGL_EXT_image_dma_buf_import in i915g. Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>	2015-04-01 20:13:37 -07:00
Matt Turner	a03d0ba78f	i965/fs: Relax type check in cmod propagation. The thing we want to avoid is int/float comparisons, but int/unsigned comparisons with 0 are equivalent. total instructions in shared programs: 6194829 -> 6193996 (-0.01%) instructions in affected programs: 117192 -> 116359 (-0.71%) helped: 471 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-04-01 13:43:57 -07:00
Matt Turner	781badee7a	nir: Remove useless ftrunc inside f2i/f2u. No shader-db changes, probably because they're all removed by the GLSL compiler optimization added in commit `69ad5fd4`. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-01 13:43:57 -07:00
Matt Turner	97e6c1b957	nir: Recognize (a < b || a < c) as a < max(b, c). Doesn't work for analogous && cases, because of NaNs. total instructions in shared programs: 6195712 -> 6194829 (-0.01%) instructions in affected programs: 42000 -> 41117 (-2.10%) helped: 403 Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-01 13:43:57 -07:00
Matt Turner	a2b6e908cf	nir: Add addition/multiplication identities of exp/log. instructions in affected programs: 2858 -> 2808 (-1.75%) helped: 12 Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-01 13:43:57 -07:00
Matt Turner	099c729b4c	nir: Add identities for the log function. The rcp(log(x)) pattern affects instruction counts. instructions in affected programs: 144 -> 138 (-4.17%) helped: 6 Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-01 13:43:57 -07:00
Matt Turner	8a6ae384b2	nir: Add identities for the exponential function. No changes in shader-db. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-01 13:43:57 -07:00
Matt Turner	e26783d445	nir: Recognize another open coded lrp. total instructions in shared programs: 6195924 -> 6195768 (-0.00%) instructions in affected programs: 4876 -> 4720 (-3.20%) helped: 58 HURT: 10 Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-01 13:43:57 -07:00
Matt Turner	e82437e141	nir: Recognize open coded lrp. total instructions in shared programs: 6197614 -> 6195924 (-0.03%) instructions in affected programs: 34773 -> 33083 (-4.86%) helped: 147 HURT: 6 Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-01 13:43:57 -07:00
Kenneth Graunke	25e214db00	nir: Use _mesa_flsll(InputsRead) in prog->nir. InputsRead is a 64-bit bitfield. Using _mesa_fls would silently truncate off the high bits, claiming inputs 32..56 (VARYING_SLOT_MAX) were never read. Using <= here was a hack I threw in at the last minute to fix programs which happened to use input slot 32. Switch back to using < now that the underlying problem is fixed. Fixes crashes in "Euro Truck Simulator 2" when using prog->nir, which uses input slot 33. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 13:30:13 -07:00
Kenneth Graunke	3d166b313d	mesa: Implement _mesa_flsll(). This is _mesa_fls() for 64-bit values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 13:30:13 -07:00
Kenneth Graunke	4b38c5c783	nir: In prog->nir, don't wrap dot products with ptn_channel(..., X). ptn_move_dest and nir_fadd already take care of replicating the last channel out, so we can just use a scalar and skip splatting it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-04-01 13:30:13 -07:00
Jason Ekstrand	218e45e2f7	i965: Use the same nir options for all gens If we tell NIR to split ffma's, then we don't need separate options anymore. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:04 -07:00
Jason Ekstrand	b9d7454571	i965/nir: Run DCE again before going out of SSA We run lowering and optimization passes that might leave garbage lying around. This keeps the FS cse from having to clean it up. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:04 -07:00
Jason Ekstrand	37703040a1	i965/nir: Run the ffma peephole after the rest of the optimizations The idea here is that fusing multiply-add combinations too early can reduce our ability to perform CSE and value-numbering. Instead, we split ffma opcodes up-front, hope CSE cleans up, and then fuse after-the-fact. Unless an algebraic pass does something silly where it inserts something between the multiply and the add, splitting and re-fusing should never cause a problem. We run the late algebraic optimizations after this so that things like compare-with-zero don't hurt our ability to fuse things. shader-db results for fragment shaders on Haswell: total instructions in shared programs: 4390538 -> 4379236 (-0.26%) instructions in affected programs: 989359 -> 978057 (-1.14%) helped: 5308 HURT: 97 GAINED: 78 LOST: 5 This does, unfortunately, cause some substantial hurt to a shader in Kerbal Space Program. However, the damage is caused by changing a single instruction from a ffma to an add. This, in turn, decreases register pressure in one part of the program causing it to fail to register allocate and spill. Given the overwhelmingly positive results in other shaders and the fact that the NIR for the Kerbal shaders is actually better, this should be considered a positive. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:04 -07:00
Jason Ekstrand	7f344721b1	nir/peephole_ffma: Be less aggressive about fusing multiply-adds shader-db results for fragment shaders on Haswell: total instructions in shared programs: 4395688 -> 4389623 (-0.14%) instructions in affected programs: 355876 -> 349811 (-1.70%) helped: 1455 HURT: 14 GAINED: 5 LOST: 0 Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:04 -07:00
Jason Ekstrand	a8c8b3b872	nir: Add a dedicated ffma peephole optimization i965/nir: Use the dedicated ffma peephole total instructions in shared programs: 4418748 -> 4394618 (-0.55%) instructions in affected programs: 1292790 -> 1268660 (-1.87%) helped: 5999 HURT: 457 GAINED: 4 LOST: 9 Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:04 -07:00
Jason Ekstrand	e06a3d0282	nir: Move the compare-with-zero optimizations to the late section total instructions in shared programs: 4422307 -> 4422363 (0.00%) instructions in affected programs: 4230 -> 4286 (1.32%) helped: 0 HURT: 12 While this does hurt some things, the losses are minor and it prevents the compare-with-zero optimization from fighting with ffma which is much more important. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:03 -07:00
Jason Ekstrand	da294f9b2f	nir/algebraic: Add a separate section for "late" optimizations i965/nir: Use the late optimizations Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:03 -07:00
Jason Ekstrand	1779dc060f	nir/algebraic: Remove a duplicate optimization This optimization is repeated verbatim above Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:03 -07:00
Jason Ekstrand	22ee7eeb4e	nir/algebraic: #define around structure definitions Previously, we couldn't generate two algebraic passes in the same file because of multiple structure definitions. To solve this, we play the age-old header file trick and just #define around it. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 12:51:03 -07:00
Jason Ekstrand	793a94d6b5	nir/print: Don't print extra swizzle components Previously, NIR would just print 4 swizzle components if the swizzle was anything other than foo.xyzw. This creates lots of noise if, for example, you have a one-component element with a swizzle of foo.xxxx. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-04-01 12:49:49 -07:00
Emil Velikov	d99135b2e9	configure: nuke --with-max-{width,height} Unused as of commit 630ab0d27ba(mesa: remove last of MAX_WIDTH, MAX_HEIGHT). Update all the remaining references to the defines. v2: Use the correct variable name in the comments Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-04-01 19:43:34 +00:00
Emil Velikov	bd4925c6ac	gallium: ship tgsi_to_nir.h in the tarball Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-04-01 19:33:37 +00:00
Emil Velikov	4008975e6f	configure.ac: error out if python/mako is not found when required In case of using a distribution tarball (or a dirty git tree) one can have the generated sources locally. Make configure.ac error out otherwise, to alert about the unmet requirement(s) of python/mako. v2: Check only for a single file for each dependency. Suggested-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 19:33:37 +00:00
Matt Turner	3384179faa	glsl: Make sure not to dereference NULL. Found by Coverity.	2015-04-01 12:25:29 -07:00
Laura Ekstrand	142909f19d	main: create_buffers unlocks mutex when throwing OUT_OF_MEMORY. Ilia Mirkin found that I had forgotten to free the mutex in the error case. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-04-01 12:07:28 -07:00
Jose Fonseca	3321724c10	automake,scons: Put NIR source files in a separate var to fix SCons build. SCons does not build NIR yet. Trivial.	2015-04-01 19:49:09 +01:00
Jose Fonseca	7f0682cebf	automake: Fix out-of-source builds. Add include path for generated nir_opcodes.h. Trivial.	2015-04-01 19:48:09 +01:00
Brian Paul	1625d7a87a	mesa: don't include colormac.h in format code Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2015-04-01 12:04:28 -06:00
Brian Paul	2768a0b1b4	mesa: remove unneeded #include of colormac.h Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2015-04-01 12:04:28 -06:00
Brian Paul	f1d55017d7	tnl: remove unneeded #include of colormac.h Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2015-04-01 12:04:28 -06:00
Brian Paul	8ac9407a83	swrast: remove unneeded #include of colormac.h Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2015-04-01 12:04:28 -06:00
Brian Paul	2ad8af1a0c	mesa: remove unused macros from colormac.h Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2015-04-01 12:04:28 -06:00
Eric Anholt	15b03b7964	nir: Recognize a pattern of bool frobbing from TGSI KILL_IF. TGSI's conditional discards take float arg and negate it, so GLSL to TGSI generates a b2f and negates that value. Only, in NIR we want a proper bool once again, so we compare with 0. This is a lot of pointless extra instructions. total instructions in shared programs: 39735 -> 39702 (-0.08%) instructions in affected programs: 1342 -> 1309 (-2.46%) Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-01 10:57:01 -07:00
Eric Anholt	6e8d4a2f80	nir: Recognize a pattern for doing b2f without the opcode. Since we have patterns based on b2f, generate them if we see the b2f equivalent using an iand. This is common when generating NIR from TGSI. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-04-01 10:57:01 -07:00
Eric Anholt	26261bca21	vc4: Add shader-db dumping of NIR instruction count. I was previously using temporary disables of VC4 optimization to show the benefits of improved NIR optimization, but this can get me quick and dirty numbers for NIR-only improvements without having to add hacks to disable VC4's code (disabling of which might hide ways that the NIR changes would hurt actual VC4 codegen).	2015-04-01 10:57:01 -07:00
Eric Anholt	73e2d4837d	vc4: Convert to consuming NIR. NIR brings us better optimization than I would have bothered to write within the driver, developers sharing future optimization work, and the ability to share device-specific lowering code that we and other GLES2-level drivers need. total uniforms in shared programs: 13421 -> 13422 (0.01%) uniforms in affected programs: 62 -> 63 (1.61%) total instructions in shared programs: 39961 -> 39707 (-0.64%) instructions in affected programs: 15494 -> 15240 (-1.64%) v2: Add missing imov support, and assert that there are no dest saturates. v3: Rebase on the target-specific algebraic series. v4: Rebase on gallium-includes-from-NIR changes in master. v5: Rebase on variables being in lists instead of hash tables. v6: Squash in intermediate changes that used the NIR-to-TGSI pass (which I'm not committing)	2015-04-01 10:57:01 -07:00
Eric Anholt	783ad697d2	gallium: Add tgsi_to_nir to get a nir_shader for a TGSI shader. This will be used by the VC4 driver for doing device-independent optimization, and hopefully eventually replacing its whole IR. It also may be useful to other drivers for the same reason. v2: Add all of the instructions I was relying on tgsi_lowering to remove, and more. v3: Rebase on SSA rework of the builder. v4: Use the NIR ineg operation instead of doing a src modifier. v5: Don't use ineg for fnegs. (infer_src_type on MOV doesn't do what I expect, again). v6: Fix handling of multi-channel KILL_IF sources. v7: Make ttn_get_f() return a swizzle of a scalar load_const, rather than a vector load_const. CSE doesn't recognize that srcs out of those channels are actually all the same. v8: Rebase on nir_builder auto-sizing, make the scalar arguments to non-ALU instructions actually be scalars. v9: Add support for if/loop instructions, additional texture targets, and untested support for indirect addressing on temps. v10: Rebase on master, drop bad comment about control flow and just choose the X channel, use int comparison opcodes in LIT for now, drop unused pipe_context argument.. v11: Fix translation of LRP (previously missed because I mis-translated back out), use nir_builder init helpers. v12: Rebase on master, adding explicit include of mtypes.h to get INTERP_QUALIFIER_* v13: Rebase on variables being in lists instead of hash tables, drop use of mtypes.h in favor of util/pipeline.h. Use Ken's nir_builder swizzle and fmov/imov_alu helpers, drop "struct" in front of nir_builder, use nir_builder directly as the function arg in a lot of cases, drop redundant members of ttn_compile that are also in nir_builder, drop some half-baked malloc failure handling. v14: The indirect uniform src0 should be scalar, not vector (noticed as odd by robclark, confirmed by cwabbott). 
Apply Ken's review to initialize s->num_uniforms and friends, skip ttn_channel for dot products, and use the simpler discard_if intrinsic. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v13) Acked-by: Rob Clark <robclark@freedesktop.org>	2015-04-01 10:57:01 -07:00
Eric Anholt	486dcfbbd9	vc4: Tell shader-db how big our UBOs are, if present. I had regressed them for a while with the NIR work.	2015-04-01 10:57:01 -07:00
Eric Anholt	a3a07d46d1	mesa: Make a shared header for 3D pipeline enum / #defines. NIR uses these enums/#defines in nir_variables and associated intrinsics, but I want to be able to use them from TGSI->NIR and NIR->TGSI. Otherwise, we had to pull in all of mtypes.h. This doesn't cover all of the enums we might want from a shared compiler core (like varying slots or vert attribs), but it at least covers what I need at the moment (system values and interp qualifiers). v2: Move to src/glsl since util/ is really vague. Include in Makefile.am list. Use plain bitshifts and stdint types instead of undefined BITFIELD64_BIT. v3: Rename to shader_enums.h. Move it into Makefile.sources. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2, with recommendation to rename)	2015-04-01 10:57:01 -07:00
Emil Velikov	5604d7675e	nir: add nir_builder.h to the tarball The header was added with commit 2a135c470e3 (nir: Add an ALU op builder kind of like ir_builder.h) but did not make it into the sources list. Fortunately it remained unused until a recent commit faf6106c6f6 (nir: Implement a Mesa IR -> NIR translator.) v2: Remove the bogus dependency. Tweak commit message. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 14:46:42 +01:00
Emil Velikov	4984cb7ef8	xmlpool: remove the clean target ... by folding it into CLEANFILES. Don't worry about $(LANG) as it is essentially the first folder of $(POS). With the latter already handled. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 14:46:41 +01:00
Emil Velikov	a665b9b3c8	xmlpool: don't forget to ship the MOS This will allow us to finally remove python from the build time dependencies list. Considering that you're building from a release tarball of course :-) Cc: Bernd Kuhls <bernd.kuhls@t-online.de> Reported-by: Bernd Kuhls <bernd.kuhls@t-online.de> Cc: "10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-04-01 14:46:41 +01:00
Emil Velikov	c07df0f201	osmesa: don't try to bundle osmesa.def SConscript Both of which were removed with commit 69db422218b(scons: Don't build osmesa.) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-04-01 14:46:41 +01:00
Emil Velikov	1d36c52f5d	docs: note that classic osmesa/libEGL no longer builds with scons Plus nuke the final reference to osmesa from README.WIN32. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-04-01 14:46:35 +01:00
Iago Toral Quiroga	3818dfcf3c	i965: Handle scratch accesses where reladdr also points to scratch space This is a problem when we have IR like this: (array_ref (var_ref temps) (swiz x (expression ivec4 bitcast_f2i (swiz xxxx (array_ref (var_ref temps) (constant int (2)) ) )) )) ) ) where we are indexing an array with the result of an expression that accesses the same array. In this scenario, temps will be moved to scratch space and we will need to add scratch reads/writes for all accesses to temps, however, the current implementation does not consider the case where a reladdr pointer (obtained by indexing into temps through an expression) points to a register that is also stored in scratch space (as in this case, where the expression used to index temps accesses temps[2]), and thus, requires a scratch read before it is accessed. v2 (Francisco Jerez): - Handle also recursive reladdr addressing. - Do not memcpy dst_reg into src_reg when rewriting reladdr. v3 (Francisco Jerez): - Reduce complexity by moving recursive reladdr scratch access handling to a separate recursive function. - Do not skip demoting reladdr index registers to scratch space if the top level GRF has already been visited. v4 (Francisco Jerez) - Remove redundant checks. - Simplify code by making emit_resolve_reladdr return a register with the original src data except for reg, reg_offset and reladdr. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89508 Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-04-01 15:35:23 +02:00
Roland Scheidegger	e3252defd2	gallivm: (trivial) fix the logic deciding if function call should be used... Copy and paste bug with the img filter decision. Since there's only 2 different filters anyway just drop this bit.	2015-04-01 13:26:19 +02:00
Martin Peres	59af7ed28c	mesa/fbo: lock ctx->Shared->Mutex when allocating renderbuffers This mutex is used to make sure the shared context does not change while some shared code is looking into it. Calling BindRenderbufferEXT or BindRenderbuffer with a gles context would not take the mutex before allocating an entry. Commit `a34669b` then moved the allocation out of bind_renderbuffer into allocate_renderbuffer before using it for the CreateRenderBuffer entry point. This thus also made this entry point unsafe. The issue has been hinted by Ilia Mirkin. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-04-01 09:36:27 +03:00
Martin Peres	fa38321551	mesa/fbo: do not assign a value that is never read later on The issue has been detected by Coverity. v2: - move the declaration of obj to the else clause (Brian Paul) v3: Review by Brian Paul - get rid of the obj declaration in favor of a direct reference Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-04-01 09:36:27 +03:00
Dave Airlie	8f7338f284	egl: add initial EGL_MESA_image_dma_buf_export v2.4 At the moment to get an EGL image to a dma-buf file descriptor, you have to use EGL_MESA_drm_image, and then use libdrm to convert this to a file descriptor. This extension just provides an API modelled on EGL_MESA_drm_image, to return a dma-buf file descriptor. v2: update spec for new API proposal add internal queries to get the fourcc back from intel driver. v2.1: add gallium pieces. v2.2: add offsets to spec and API, rename fd->fds, stride->strides in API. rewrite spec a bit more, add some q/a v2.3: add modifiers to query interface and 64-bit type for that (Daniel Stone) specify what happens to num fds vs num planes differences. (Chad Versace) v2.4: fix grammar (Daniel Stone) Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-04-01 14:10:04 +10:00
Jordan Justen	22ccdf12dd	i965/state: Remove brw->state.dirty We now use brw->NewGLState and brw->ctx.NewDriverState instead. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-31 16:40:24 -07:00
Jordan Justen	7ecf3530d8	i965/state: Don't use brw->state.dirty.mesa Now, we only use brw->NewGLState. I used this bash & sed command in the i965 directory: for file in *.[ch] *.[ch]pp; do sed -i -e 's/brw->state\.dirty\.mesa/brw->NewGLState/g' $file; done Followed by manual changes to brw_state_upload.c. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-31 16:40:24 -07:00
Jordan Justen	4e56a9ad46	i965/state: Don't use brw->state.dirty.brw Now, we only use ctx->NewDriverState. I used this bash & sed command in the i965 directory: for file in *.[ch] *.[ch]pp; do sed -i -e 's/state\.dirty\.brw/ctx.NewDriverState/g' $file; done Followed by manual changes to brw_state_upload.c. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-31 16:40:24 -07:00
Jordan Justen	20ef23b227	i965/state: Add compute pipeline with empty atom lists Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-31 16:40:24 -07:00
Jordan Justen	a8e39e1903	i965/state: Only upload render programs for render state uploads Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-31 16:40:24 -07:00
Jordan Justen	d70f4e6daf	i965/state: Create separate dirty state bits for each pipeline When clearing the state for a pipeline, we will save changed state for the other pipelines. v3: * Adjust brw_upload_pipeline_state * Don't pull pipeline state bits into common state bits * Don't clear pipeline state bits * Adjust 'clear' phase * brw_clear_dirty_bits is now brw_render_state_finished * Move cross-pipeline state flagging to brw_pipeline_state_finished * Move pipeline clears to brw_pipeline_state_finished Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-31 16:40:24 -07:00
Jordan Justen	db11955072	i965/state: Support multiple pipelines in brw->num_atoms brw->num_atoms is converted to an array, but currently just an array of length 1. Adds brw_copy_pipeline_atoms which copies the atoms for a pipeline, and sets brw->num_atoms[p] for pipeline p. v2: * Rename brw->atoms[] to render_atoms * Rename brw_add_pipeline_atoms to brw_copy_pipeline_atoms * Rename brw_pipeline_first_atom to brw_get_pipeline_atoms Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-31 16:40:23 -07:00
Jordan Justen	736a31d462	i965/state: Rename brw_clear_dirty_bits to brw_render_state_finished Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-31 16:40:23 -07:00
Jordan Justen	2c02baa487	i965/state: Rename brw_upload_state to brw_upload_render_state Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-31 16:40:23 -07:00
Roland Scheidegger	611bd80f3b	gallivm: do some hack heuristic to disable texture functions We've seen some cases where performance can hurt quite a bit. Technically, the more simple the function the more overhead there is for using a function for this (and the fewer benefits this provides). Hence don't do this if we expect the generated code to be simple. There's an even more important reason why this hurts performance, which is shaders reusing the same unit with some of the same inputs, as llvm cannot figure out the calculations are the same if they are performed in the function (even just reusing the same unit without any input being the same provides such optimization opportunities though not very much). This is something which would need to be handled by IPO passes however.	2015-04-01 00:56:12 +02:00
Matt Turner	47c4b38540	i965/fs: Allow CSE to handle MULs with negated arguments. mul x, -y is equivalent to mul -x, y; and mul x, y is the negation of mul x, -y. With NIR: total instructions in shared programs: 6167779 -> 6161193 (-0.11%) instructions in affected programs: 983511 -> 976925 (-0.67%) helped: 4106 HURT: 16 GAINED: 18 LOST: 7 Without NIR: total instructions in shared programs: 6192323 -> 6185299 (-0.11%) instructions in affected programs: 987875 -> 980851 (-0.71%) helped: 4146 HURT: 16 GAINED: 16 LOST: 0	2015-03-31 14:14:36 -07:00
Matt Turner	438c1c0080	i965: Mark brw_inst_bits' brw_inst* parameter const. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-31 14:14:36 -07:00
Matt Turner	ac6102bcc5	glsl: Remove bogus Makefile dependency.	2015-03-31 14:14:36 -07:00
Matt Turner	2c38f891ad	glsl: Reassociate multiplication of mat*mat*vec. The typical case of mat4*mat4*vec4 is 80 scalar multiplications, but mat4*(mat4*vec4) is only 32. On HSW (with vec4 vertex shaders): instructions in affected programs: 4420 -> 3194 (-27.74%) On BDW (with scalar vertex shaders): instructions in affected programs: 12756 -> 6726 (-47.27%) Implementing a general matrix chain ordering is harder (or at least tedious) because of having to walk the GLSL IR to create a list of multiplicands. I'm guessing that this patch handles 90+% of cases, but of course to tell definitively you'd have to implement the general thing. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-03-31 14:01:15 -07:00
Matt Turner	cf2dc1624f	glsl: Implement type inferencing of matrix types. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-03-31 14:01:15 -07:00
Matt Turner	73f6f9b9be	glsl: Factor out a get_mul_type() function. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-03-31 14:01:15 -07:00
Marcin Ślusarz	f9e2295560	nouveau: synchronize "scratch runout" destruction with the command stream When nvc0_push_vbo calls nouveau_scratch_done it does not mean scratch buffers can be freed immediately. It means "when hardware advances to this place in the command stream the scratch buffers can be freed". To fix it, just postpone scratch runout destruction after current fence is signalled. The bug existed for a very long time. Nobody noticed, because "scratch runout" code path is rarely executed. Fixes hang at the very beginning of first mission in "Serious Sam 3" on nve7/gk107. It manifested as: nouveau E[ PFIFO][0000:01:00.0] read fault at 0x000a9e0000 [PTE] from GR/GPC0/PE_2 on channel 0x007f853000 [Sam3[17056]] Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-31 22:04:31 +02:00
Brian Paul	3db0317351	docs: document Viewperf 12 issues Signed-off-by: Brian Paul <brianp@vmware.com>	2015-03-31 11:50:20 -06:00
Neil Roberts	fe026d7ce5	i965/skl: Avoid using the 1D stencil layout for stencil-only images Commit `cf67ca9ffa` made the layouting code pick a special layout for 1D images on Skylake. This should not be used for depth and stencil buffers because these need to be treated as 2D tiled images. However the patch was missing a check for images with a base format of GL_STENCIL_INDEX. In practice I don't think it's currently possible to hit this because Mesa doesn't support GL_ARB_texture_stencil8 and it's not possible to create a 1D renderbuffer, but it'll be good to be ready for when the extension is supported. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-31 18:22:01 +01:00
Tom Stellard	fda7558057	clover: Return CL_BUILD_ERROR for CL_PROGRAM_BUILD_STATUS when compilation fails v2 v2: - Don't use _errs map Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-03-31 15:40:51 +00:00
Tom Stellard	4c53d2acbb	radeonsi/compute: Default to the same PIPE_SHADER_CAP values as other shader types v2 v2: - Fix typo Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-03-31 15:40:51 +00:00
Leo Liu	a714fbacf7	radeon/vce: implement video usability information support This will help encoding VUI into the bitstream v2: make backward compatible Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-03-31 12:31:58 -04:00
Leo Liu	8e3668a7c0	st/omx/enc: export framerate to vce driver The framerate will be used for video usability info support by VCE driver Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-03-31 12:31:58 -04:00
Roland Scheidegger	489866938f	llvmpipe: enable ARB_texture_gather Just announce support for 4 components. While here also increase the max/min texel offsets (the limit is completely artificial, was chosen because that's what other hardware did, however there's other drivers using larger limits). Over a thousand little piglits skip->pass. v2: update docs/GL3.txt Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-31 17:23:51 +02:00
Roland Scheidegger	0753b135f6	gallivm: implement TG4 for ARB_texture_gather This is quite trivial, essentially just follow all the same code you'd use with linear min/mag (and no mip) filter, then just skip the filtering after looking up the texels in favor of direct assignment of the right channel to the result. (This is though not true for the multi-offset version if we'd want to support it - for this would probably need to do something along the lines of 4x nearest sampling due to the necessity of doing coord wrapping individually per texel.) Supports multi-channel formats. From the SM5 gather cap bit, should support non-constant offsets, plus shadow comparisons (the former untested), but not component selection (should be easy to implement but all this stuff is not really exposable anyway for now). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-31 17:23:51 +02:00
Roland Scheidegger	73c6914195	gallivm: add gather support to sampler interface Luckily thanks to the revamped interface this is a lot less work now... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-31 17:23:51 +02:00
Roland Scheidegger	1863ed21ff	gallivm: simplify sampler interface This has got a bit out of control with more and more parameters added. Worse, whenever something in there changes all callees have to be updated for that, even though they don't really do much with any parameter in there except pass it on to the actual sampling function. Hence simply put almost everything into a struct. Also instead of relying on some arguments being NULL, be explicit and set this in a key (which is just reused for function generation for simplicity). (The code still relies on them being NULL in the end for now.) Technically there is a minimal functional change here for shadow sampling: if shadow sampling is done is now determined explicitly by the texture function (either sample_c or the gl-style tex func inherit this from target) instead of the static texture state. These two should always match, however. Otherwise, it should generate all the same code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-31 17:23:51 +02:00
Jose Fonseca	0fc5b80e7a	util/debug: Update MgwHelp link, drop BfdHelp link.	2015-03-31 09:42:06 +01:00
Michel Dänzer	b8797a7875	gallivm: Fix build against LLVM 3.7 SVN r233648 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-31 15:05:01 +09:00
Eric Anholt	1dcc1ee314	vc4: Drop integer multiplies with 0 to moves of 0. This cleans up more instructions generated by uniform array indexing multiplies. total instructions in shared programs: 39989 -> 39961 (-0.07%) instructions in affected programs: 896 -> 868 (-3.12%)	2015-03-30 12:57:45 -07:00
Eric Anholt	8c5dcdbccb	vc4: Add a constant folding pass. This cleans up some pointless operations generated by the in-driver mul24 lowering (commonly generated by making a vec4 index for a matrix in a uniform array). I could fill in other operations, but pretty much anything else ought to be getting handled at the NIR level, I think. total uniforms in shared programs: 13423 -> 13421 (-0.01%) uniforms in affected programs: 346 -> 344 (-0.58%)	2015-03-30 12:57:45 -07:00
Brian Paul	dbe67d76e0	glsl: allow ForceGLSLVersion to override #version directives Previously, the ctx->Const.ForceGLSLVersion setting only worked if the shader lacked a #version directive. Now, the ForceGLSLVersion setting will override the #version directive too. This change should be safe since it should be rare to have an app that has a mix of shader versions and we only wanted to override the #version for shaders which lacked the #version directive. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-30 11:25:39 -06:00
Eric Anholt	c519c4d85e	vc4: Don't bother masking out the low 24 bits for integer multiplies The hardware just uses the low 24 lines, saving us an AND to drop the high bits. total uniforms in shared programs: 13433 -> 13423 (-0.07%) uniforms in affected programs: 356 -> 346 (-2.81%) total instructions in shared programs: 40003 -> 39989 (-0.03%) instructions in affected programs: 910 -> 896 (-1.54%)	2015-03-30 09:23:39 -07:00
Eric Anholt	5df8bf86fe	vc4: Make integer multiply use 24 bits for the low parts. The hardware uses the low 24 bits in integer multiplies, so we can have fewer high bits (and so probably drop them more frequently).	2015-03-30 09:23:39 -07:00
Samuel Iglesias Gonsalvez	18004c338f	glsl: fail when a shader's input var has not an equivalent out var in previous GLSL ES 3.00 spec, 4.3.10 (Linking of Vertex Outputs and Fragment Inputs), page 45 says the following: "The type of vertex outputs and fragment input with the same name must match, otherwise the link command will fail. The precision does not need to match. Only those fragment inputs statically used (i.e. read) in the fragment shader must be declared as outputs in the vertex shader; declaring superfluous vertex shader outputs is permissible." [...] "The term static use means that after preprocessing the shader includes at least one statement that accesses the input or output, even if that statement is never actually executed." And it includes a table with all the possibilities. Similar table or content is present in other GLSL specs: GLSL 4.40, GLSL 1.50, etc but for more stages (vertex and geometry shaders, etc). This patch detects that case and returns a link error. It fixes the following dEQP test: dEQP-GLES3.functional.shaders.linkage.varying.rules.illegal_usage_1 However, it adds a new regression in piglit because the test hasn't a vertex shader and it checks the link status. bin/glslparsertest \ tests/spec/glsl-1.50/compiler/gs-also-uses-smooth-flat-noperspective.geom pass \ 1.50 --check-link This piglit test is wrong according to the spec wording above, so if this patch is merged it should be updated. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-03-30 13:29:05 +02:00
Michel Dänzer	d64adc3a79	radeonsi: Cache LLVMTargetMachineRef in context instead of in screen Fixes a crash in genymotion with several threads compiling shaders concurrently. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89746 Cc: 10.5 <mesa-stable@lists.freedesktop.org> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-03-30 15:15:10 +09:00
Tapani Pälli	ce83a6ec81	glsl: fix unreachable(!"") to unreachable("") Correct error with commit `151fb1e` where assert was renamed to unreachable without removing ! from string argument. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-30 08:16:00 +03:00
Emil Velikov	938b17940f	docs: add news item and link release notes for mesa 10.5.2 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-28 19:21:31 +00:00
Emil Velikov	dc8d8a2951	docs: Add sha256 sums for the 10.5.2 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `ff87ae1e00`)	2015-03-28 19:21:31 +00:00
Emil Velikov	6e19f6b4d0	Add release notes for the 10.5.2 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `5e59f895c4`)	2015-03-28 19:21:31 +00:00
Ilia Mirkin	ee670c9efa	freedreno/a3xx: add support for point sprite coordinate replacement This does not (yet) support different coordinate origins, so the tests still fail due to fbo flipping. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-28 14:54:41 -04:00
Ilia Mirkin	995f55a6ce	freedreno/a3xx: make vs-set point size work This appears to need the A2XX version of the point list, so select it at draw time if necessary. Experimentally, always using the A2XX version causes hangs when PSIZE isn't actually emitted. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-28 14:54:41 -04:00
Ilia Mirkin	7fc5da8b93	freedreno/a3xx: point size should not be divided by 2 The division is probably a holdover from the days when the fixed point inline functions generated by headergen were broken. Also reduce the maximum point size to 4092 (vs 4096), which is what the blob does. Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-28 14:54:41 -04:00
Ilia Mirkin	738c8319ac	freedreno/a3xx: fix 3d texture layout The SZ2 field contains the layer size of a lower miplevel. It only contains 4 bits, which limits the maximum layer size it can describe. In situations where the next miplevel would be too big, the hardware appears to keep minifying the size until it hits one of that size. Unfortunately the hardware's ideas about sizes can differ from freedreno's which can still lead to issues. Minimize those by stopping to minify as soon as possible. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>	2015-03-28 14:54:41 -04:00
Ilia Mirkin	3735643df3	freedreno/a3xx: LAYERSZ2 appears to have no effect on arrays Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-28 14:54:40 -04:00
Kenneth Graunke	72b06fb08e	nir: Fix copy and pasted error message in nir_validate. These are nir_cf_nodes, not ALU instructions. Also, use unreachable() to preempt said review feedback. v2: Do it right (thanks Ilia). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-28 09:36:46 -07:00
Kenneth Graunke	31dc63d5ca	i965/nir: Use NIR for ARB_vertex_program support on Gen8+. Everything is already in place; we simply have to take the scalar code generation path. This gives us SIMD8 VS programs, instead of SIMD4x2. v2: Rebase on the patch that drops brw->gen >= 8. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-03-27 21:16:51 -07:00
Kenneth Graunke	ac69ab7302	i965: Move env_var_as_boolean to intel_debug.c. I need to use this in brw_vec4.cpp, so it can't be static anymore. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-27 21:16:43 -07:00
Kenneth Graunke	826d3afb8f	i965/fs: Add ARB_fragment_program support to the NIR backend. Use prog_to_nir where we would normally call glsl_to_nir, handle program parameter lists, and skip a few things that don't exist. Using NIR generates much better shader code than Mesa IR, since we get real optimizations, as opposed to prog_optimize: total instructions in shared programs: 314007 -> 279892 (-10.86%) instructions in affected programs: 285173 -> 251058 (-11.96%) helped: 2001 HURT: 67 GAINED: 4 LOST: 7 v2: Change early return in nir_setup_uniforms to if/else (Jordan). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-03-27 21:16:34 -07:00
Kenneth Graunke	bf2c3bc316	nir: Lower subtraction to add with negation when !lower_negate. prog->nir will generate fsub opcodes, but i965 doesn't implement them. We may as well lower them at the NIR level, since it's trivial to do. Suggested by Connor Abbott. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-03-27 21:16:34 -07:00
Kenneth Graunke	faf6106c6f	nir: Implement a Mesa IR -> NIR translator. Shamelessly ripped off from Eric Anholt's tgsi_to_nir pass. This is not built on SCons, like the rest of NIR. v2: - Delete redundant c->s, c->impl, and c->cf_node_list pointers (Ken) - Use nir_builder directly instead of ptn_compile in more places (Ken) - Drop 'struct' keyword in front of nir_builder (Ken) - Add a file level Doxygen comment (Ken) - Use scalar constants instead of splatting (Eric) - Use nir_builder helpers for constants, moves, and swizzles (Connor) v3: Minor indentation improvements. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-03-27 21:16:34 -07:00
Kenneth Graunke	06f7bea96a	nir: Add builder helpers for MOVs with ALU sources and swizzling MOVs. These will be useful for prog->nir and tgsi->nir. v2: Don't forget to mark nir_swizzle as inline (Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-03-27 21:16:33 -07:00
Kenneth Graunke	75c922e0fe	nir: Add nir_builder helpers for creating load_const intrinsics. Both prog->nir and tgsi->nir will want to use these. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-03-27 21:16:33 -07:00
Ben Widawsky	74fd226e34	i965/skl: Don't use the PMA depth stall workaround The PMA depth stall must be enabled (optimization turned off) under certain circumstances on gen8. This was supposedly fixed for Gen9, which means we do not need to check, or toggle the state. The hardware is supposed to enable the hardware optimization by default, unlike BDW, so we also don't need to set it at init. For whatever reason this improves stability on ETQW with the bug mentioned below. References: https://bugs.freedesktop.org/show_bug.cgi?id=89039 (doesn't fix) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: Anuj Phogat <anuj.phogat@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-27 21:04:41 -07:00
Ben Widawsky	9d32d35850	i965/skl: Disable partial resolve in VC Recomendation [sic] is to set this field to 1 always. Programming it to default value of 0, may have -ve impact on performance for MSAA WLs. Another don't suck bit which needs to get set. The patch wasn't as well tested as I would have liked, primarily I don't have perf numbers for it, but it's getting to a point where it is in danger of being lost. v2: v1 was a mix of two patches. Since 0x7004 is masked, we only need to set it once at initialization and make sure the pma workaround doesn't set the mask bit (which it doesn't). Move LRI to init gpu state (Ken) Add a comment. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-27 21:04:37 -07:00
Roland Scheidegger	b2424fb030	llvmpipe: simplify address calculation for 4x4 blocks These functions looked quite complicated, even though what they actually did was trivial (ever since we dropped swizzled rendering). Also drop lookup of format block per bytes done for each block, and do it once per scene instead. This improves everybody's favorite "benchmark" by 3% or so, though lp_rast_shade_quads_all() which calls this shows up still quite high for a function which does little more than call the jit function. (This would most likely be much better handled by the jit function itself, the strides are passed through anyway already, though for being able to handle layers it would definitely add some complexity.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-28 02:59:42 +01:00
Roland Scheidegger	764fc2be5a	gallivm: fix texture function name (key) when using txf/ld When using the texel fetch functions rather than ordinary texturing, the arguments are all int vecs instead of float vecs, not to mention the actual function would look completely different. Hence this must be included in the texture function name (which serves as the key) otherwise things crash badly when a shader accesses the same texture and sampler unit with both txf/ld and ordinary texturing instructions with otherwise matching keys.	2015-03-28 02:58:43 +01:00
Timothy Arceri	2cb149c289	glsl: mark uniform and input interface blocks as read only Reviewed-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-28 10:18:40 +11:00
Ilia Mirkin	58030a8f99	nv50/ir/gk110: fix offset flag position for TXD opcode Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-27 19:02:19 -04:00
Ilia Mirkin	49b86007aa	nv50/ir: take postFactor into account when doing peephole optimizations Multiply operations can have a post-factor on them, which other ops don't support. Only perform the peephole optimizations when there is no post-factor involved. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89758 Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-27 19:02:19 -04:00
Jan Vesely	a99a16a0cf	gallivm: Fix build since llvm r233411 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-03-27 18:57:02 -04:00
Eric Anholt	afa9fc1561	nir: Add optional lowering of flrp. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-03-27 13:29:48 -07:00
Roland Scheidegger	56076be2ac	gallivm: use llvm function calls for texturing instead of inlining There are issues with inlining everything, most notably llvm will use much more memory (and be slower) when compiling. Ideally we'd probably use functions for shader functions too but texture sampling usually is responsible for quite some IR (it can easily reach 80% of total IR instructions) so this seems like a good start. This still generates a different function for all different combinations just like before, however it is possible llvm is missing some optimization opportunities - it is believed though such opportunities should be somewhat rare, but at least for now it can still be switched off (at compile time only). It should probably make compiled code also smaller because the same function should be used for different variants in the same module (so for the opaque/partial or linear/elts variants). No piglit change (though it does indeed speed up unrealistic tests like fp-indirections2 by a factor of 30 or so). Has a small negative performance impact in openarena - I suspect this could be fixed by running some IPO passes (despite the private linkage, llvm right now does NO optimization at all wrt anything going past the call, even if there's just one caller - so things like values stored before the call and then always written by the function etc. will not be optimized away, nor will dead arguments (which we mostly shouldn't have) be eliminated, always constant arguments promoted etc.). v2: use proper return values instead of pointer function arguments. llvm supports aggregate return values, which do wonders here eliminating unnecessary stack variables - everything in fact will be returned in registers even without any IPO optimizations. It makes the code simpler too. With this I could not measure a performance impact in openarena any longer (though since there's still no constant value propagation etc. into the tex functions this does not mean it couldn't have a negative impact elsewhere). v3: fix some minor issues suggested by Jose, and do disassembly (and the profiling) without hacks. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-27 19:25:53 +01:00
Roland Scheidegger	8dad9455ff	gallivm: pass jit_context pointer through to sampling The callbacks used for getting the dynamic texture/sampler state were using the jit_context from the generated jit function. This works just fine, however that way it's impossible to generate separate functions for texture sampling, as will be done in the next commit. Hence, pass this pointer through all interfaces so it can be passed to a separate function (technically, it would probably be possible to extract this pointer from the current function instead, but this feels hacky and would probably require some more hacks if we'd use real functions instead of inlining all shader functions at some point). There should be no difference in the generated code for now. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-27 19:25:53 +01:00
Christian König	787aa26cb7	gallium/vl: partially revert "Use util_cpu_to_le{16,32} in many more places." The data in memory is in big endian format and needs to be converted into CPU byte order. So the patch actually reversed what needs to be done. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-27 11:30:32 +01:00
Ilia Mirkin	626434893a	tgsi: fix out-of-bounds access for cube arrays The CUBE_ARRAY case uses r[4]. Make sure that the stack variable is there. Noticed by Coverity. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-03-26 21:02:09 -04:00
Ilia Mirkin	f95a6b2ff4	st/mesa: initialize have_fma in constructor Spotted by Coverity. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-26 21:02:09 -04:00
Ilia Mirkin	1b87d73a9f	gallium/util: remove u_linkage Does not appear to be used in tree. Coverity spotted some errors in the bitmask stuff, but the whole thing appears to be unused. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-26 21:02:09 -04:00
Ilia Mirkin	2e34faaf06	gallium/hud: avoid overflowing hud graph name size Spotted by Coverity. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-03-26 21:02:08 -04:00
Ilia Mirkin	9d1b5febb6	st/mesa: update arrays when the current attrib has been updated Fixes the recently-sent gl-2.0-vertex-const-attr piglit test. Makes sure to revalidate arrays when only the current attribute has been updated via glVertexAttrib*. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89754 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>	2015-03-26 21:01:59 -04:00
Dave Airlie	91e3533481	st_glsl_to_tgsi: only do mov copy propagation on temps (v2) Don't propagate ARRAYs This should fix: https://bugs.freedesktop.org/show_bug.cgi?id=89759 v2: just specify arrays so we get input propagation Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-26 12:03:44 +10:00
Kenneth Graunke	ef09cfb51e	i965: Drop unnecessary brw->gen >= 8 check from scalar VS code. brw->scalar_vs already implies that brw->gen >= 8. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-03-25 16:19:26 -07:00
Kenneth Graunke	649173b473	i965/fs: Implement texture projection support. Our fragment program backend implements support for TXP directly, and there's no NIR lowering pass to remove the projection. When we switch fragment program support over to NIR, we need to support it somehow. It's easy enough to support directly. v2: Split out offset/tex_offset rename (requested by Jordan). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-03-25 16:17:19 -07:00
Kenneth Graunke	0a9bcf9e39	i965/fs: Rename offset to tex_offset to avoid shadowing offset(). fs_visitor::nir_emit_texture() created an fs_reg variable called offset, which shadowed the offset() helper function in brw_ir_fs.h. Rename the variable to tex_offset so we can still call offset(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-03-25 16:17:19 -07:00
Kenneth Graunke	3120345f40	nir: Add glsl_float_type() wrapper. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-03-25 16:17:19 -07:00
Matt Turner	871f1080d0	glsl: Use INFINITY instead of std::numeric_limits<float>::infinity(). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-25 15:06:48 -07:00
Emil Velikov	5dc573e5de	automake: add missing egl files to the tarball Namely the Haiku EGL driver backend and the SConscript for the dri2 EGL driver backend. Cc: Alexander von Gluck IV <kallisti5@unixzen.com> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-25 21:00:02 +00:00
Ian Romanick	6075780247	glsl: Constify ir_instruction::equals v2: Don't be lazy. Constify the as_foo functions and use those instead of ugly casts. Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-03-25 10:41:08 -07:00
Ian Romanick	dec9664e35	glsl: Constify the as_foo functions Now that they're all implemented using macros, this is trivial. v2: Remove redundant parenthesis. Suggested by Curro. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-03-25 10:40:52 -07:00
Ian Romanick	0c4ee62045	glsl: Implement remaining as_foo functions with macros The downcast functions for non-leaf classes were previously implemented "by hand." Now they are implemented using macros based on the is_foo functions added in the previous patch. v2: Remove redundant parenthesis. Suggested by Curro (on the next patch). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-03-25 10:39:09 -07:00
Ian Romanick	a284c63006	glsl: Add is_rvalue, is_dereference, and is_jump methods These functions determine when an IR node is one of the non-leaf classes. v2: Adjust indentation to line up. Suggested by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-03-25 10:34:59 -07:00
Jose Fonseca	25d6cdd2ff	util/u_atomic: Ignore warnings interlocked accesses. These are due to how we implemented the atomic tests, not the atomic implementation itself. It's also difficult to refactor the code to avoid the warnings due to the use of macros -- the code would be quite hairy. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-25 10:42:48 +00:00
Jose Fonseca	28c54400af	scons: Disable MSVC warnings about inconsistent function annotation. Somehow, merely including any of the *intrin.h headers causes dozens of these warnings (when compiling pretty much every source file). MSVC does not always complain the same -- so it's possible we're doing something weird --, but silence these warnings in the meanwhile. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-25 10:42:45 +00:00
Jose Fonseca	cb88edbd4e	mesa: Avoid MSVC C6334 warning in /analyze mode. MSVC's implementation of signbit(x) uses sizeof(x) with expressions to dispatch to an internal function based on the argument's type (float, double, etc), but that raises a flag with MSVC's own static analyzer, and because this is an inline function in a header it causes substantial warning spam. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-25 10:42:43 +00:00
Jose Fonseca	fdb507e3d6	c99_math: Don't reimplement lrint and friends on MSVC 2013. MSVC 2013 declares these functions, both for C and C++ source files. This was caught with MSVC in analyze mode. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-25 10:42:41 +00:00
Jose Fonseca	69db422218	scons: Don't build osmesa. There doesn't seem much interest on osmesa on Windows, particularly classic osmesa. If there is indeed interest in osmesa on Windows, we should instead integrate src/gallium/targets/osmesa into SCons. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-25 10:42:38 +00:00
Jose Fonseca	47870d658b	scons: Don't build loader on Windows. EGL was the last user. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-25 10:42:35 +00:00
Jose Fonseca	f9b8c9299d	scons: Don't build egl on Windows. Useless, as there are no drivers for it. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-25 10:42:32 +00:00
Jose Fonseca	5db57b8a55	scons: Fix git_sha1.h generation fallback. I didn't mean to remove the 'if not os.path.exists(filename)' statement. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-25 10:42:26 +00:00
Martin Peres	31a30fb342	docs: Update progress on ARB_direct_state_access. v2: - Fix the state of the Program pipelines and Query objects (Laura) Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-25 10:05:45 +02:00
Martin Peres	bf11c195a5	main: Added entry points for NamedRenderbufferStorage/Multisample v2: Review from Laura Ekstrand - get rid of a change that should not have happened in this patch - improve the error messages - fix alignments - fix a capitalization in a function name in an error message v3: Review from Laura Ekstrand - move the test for the validity of the renderbuffer to less generic functions - get rid of some changes that accidentally landed in the wrong commit - revert some alignment fixes v3: Review from Laura Ekstrand - check that the lookup returns a valid renderbuffer - cosmetic changes to some error messages Reviewed-by: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-25 10:05:45 +02:00
Martin Peres	245e5c4813	main: Added entry point for glGetNamedRenderbufferParameteriv v2: - improve an error message v3: - move a test to less generic functions - fix an alignment v4: - take the caller as a parameter instead of bool dsa - check that the lookup returns a valid renderbuffer Reviewed-by: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-25 10:05:45 +02:00
Martin Peres	a34669b961	main: Added entry point for glCreateRenderbuffers v2: - refactor bindRenderBuffer and create_render_buffers to fix an assertion Reviewed-by: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-25 10:05:45 +02:00
Martin Peres	73a9d0fbe5	main: Added entry point for glCreateSamplers Because of the current way the code is architectured, there is no functional difference between the DSA and the non-DSA path. Reviewed-by: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-25 10:05:45 +02:00
Martin Peres	b09f2ee8f7	main: Added entry point for glCreateProgramPipelines v2: - add spaces in an error message (Laura) Reviewed-by: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-25 10:05:45 +02:00
Martin Peres	19e6efc0ad	main: Added entry points for glGetQueryBufferObject* These entry points will be fleshed out when the GL_ARB_query_buffer_object extension gets implemented. In the meantime, return GL_INVALID_OPERATION as suggested by Ian. Reviewed-by: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-25 10:05:45 +02:00
Martin Peres	c3c1ed874e	main: Added entry point for glCreateQueries v2: - display the name of the target instead of its id (Laura) Reviewed-by: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-25 10:05:45 +02:00
Martin Peres	6ead10d08f	main: Added entry point for glGetTransformFeedbacki64_v v2: Review from Laura Ekstrand - use the transform feedback object lookup wrapper v3: - use the new name of _mesa_lookup_transform_feedback_object_err v4: Review from Laura Ekstrand - fix some alignment problems Reviewed-by: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-25 10:05:45 +02:00
Martin Peres	8799ecddb6	main: Added entry point for glGetTransformFeedbacki_v v2: Review from Laura Ekstrand - use the transform feedback object lookup wrapper v3: - use the new name of _mesa_lookup_transform_feedback_object_err Reviewed-by: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-25 10:05:45 +02:00
Martin Peres	e59d2434a0	main: Added entry point for glGetTransformFeedbackiv v2: Review from Laura Ekstrand - use the transform feedback object lookup wrapper Reviewed-by: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-25 10:05:45 +02:00
Martin Peres	296d82376e	main: Added entry point for glTransformFeedbackBufferRange v2: review from Laura Ekstrand - use the refactored code to lookup the objects - improve some error messages - factor out the gl method name computation - better handle the spec differences between the DSA and non-DSA cases - quote the spec a little more v3: review from Laura Ekstrand - use the new name of _mesa_lookup_bufferobj_err - swap the comments around the offset and size checks v4: review from Laura Ekstrand - add more spec quotes - properly fix the comments around the offset and size checks v5: review from Laura Ekstrand - add quotes on the spec citations - revert some changes in the printf format v6: review from Laura Ekstrand - remove a redundant "gl" in a method name Signed-off-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>	2015-03-25 10:05:45 +02:00
Martin Peres	a5d165afed	main: Added entry point for glTransformFeedbackBufferBase v2: Review from Laura Ekstrand - give more helpful error messages - factor the lookup code for the xfb and objBuf - replace some already-existing tabs with spaces - add comments to explain the cases where xfb == 0 or buffer == 0 - fix the condition for binding the transform buffer or not v3: Review from Laura Ekstrand - rename _mesa_lookup_bufferobj_err to _mesa_lookup_transform_feedback_bufferobj_err and make it static to avoid a future conflict - make _mesa_lookup_transform_feedback_object_err static v4: Review from Laura Ekstrand - add the pdf page number when quoting the spec - rename some of the symbols to follow the public/private conventions v5: Review from Laura Ekstrand - properly rename some of the symbols to follow the public/private conventions - fix some alignments - add quotes around a spec citation - add back a newline I accidentally deleted - add spaces around the ternary operator usages Signed-off-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>	2015-03-25 10:05:45 +02:00
Martin Peres	c86cb2da25	main: Added entry point for glCreateTransformFeedbacks v2: Review from Laura Ekstrand - generate the name of the gl method once - shorten some lines to stay in the 78 chars limit v3: Review from Fredrik Höglund <fredrik@kde.org> - rename gl_mthd_name to func - set EverBound in _mesa_create_transform_feedbacks in the dsa case v4: - rename _mesa_create_transform_feedbacks to create_transform_feedbacks and make it static Reviewed-by: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-25 10:05:45 +02:00
Martin Peres	fc76fac419	main: fix the validation of the number of samples Maybe this should be the job of the dispatch layer. v2: - add the section name and pdf page number of the quote (Laura) - OpenGL 3.0 core does not exist, get rid of "core" Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-25 10:05:45 +02:00
Martin Peres	7bd8b48084	main: replace tabs by 8 spaces in fbobject.c Reviewed-by: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-25 10:05:45 +02:00
Martin Peres	cd0763b78f	main: replace tabs by 8 spaces in bufferobj.c Reviewed-by: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-25 10:05:45 +02:00
Kristian Høgsberg	169b389a34	mesa: Apply visibility flags to src/Makefile.am targets We were building libglsl_util.la without our visibility flags and leaking hash_table_* symbols. Signed-off-by: Kristian Høgsberg <kristian.h.kristensen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-24 22:02:57 -07:00
Matt Turner	babd0fa3e2	nir: Fix typo.	2015-03-24 19:14:40 -07:00
Matt Turner	3fb56805f0	nir: Recognize sat(add(b2f(a), b2f(b))) as a logical OR. Transform this into b2f(or(a, b)). instructions in affected programs: 432 -> 430 (-0.46%) helped: 2 Acked-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-24 14:43:37 -07:00
Matt Turner	c31158d2cb	nir: Recognize mul(b2f(a), b2f(b)) as a logical AND. Transform this into b2f(and(a, b)). total instructions in shared programs: 6205448 -> 6204391 (-0.02%) instructions in affected programs: 284030 -> 282973 (-0.37%) helped: 903 HURT: 6 Acked-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-03-24 14:43:37 -07:00
Matt Turner	b481ebbe39	glsl: Recognize sat(add(b2f(a), b2f(b))) as a logical OR. Transform this into b2f(or(a, b)). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-24 14:43:37 -07:00
Matt Turner	c8e8f66036	glsl: Recognize mul(b2f(a), b2f(b)) as a logical AND. Transform this into b2f(and(a, b)). total instructions in shared programs: 6190291 -> 6189225 (-0.02%) instructions in affected programs: 267247 -> 266181 (-0.40%) helped: 866 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-24 14:43:37 -07:00
Matt Turner	95729d2458	nir: Handle mixed scalar/vector arguments to logical and/or/xor. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-03-24 14:43:37 -07:00
Matt Turner	c8acbd1bfd	glsl: Allow vector logic ops to be generated. They're not accessible from the source language, but optimizations are allowed to generate them. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-24 14:42:51 -07:00
Emil Velikov	248eb54eb6	configure.ac: move AC_MSG_RESULT reporting back into the m4 macro The one who does AC_MSG_CHECKING should provide the AC_MSG_RESULT. Fixes: `ced9425327` ("configure: Introduce new output variable to ax_check_python_mako_module.m4") Cc: "10.5" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89328 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>	2015-03-24 20:49:32 +00:00
Jonathan Gray	726d99b197	gallium/util: Use HAVE___BUILTIN_FFS* macros. Make use of the builtin ffs macros and split out ffsll to a separate block. Needed for at least OpenBSD which does not have ffsll in libc. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-24 20:49:32 +00:00
Emil Velikov	8cce7b05f1	i965: add the remaining files to the tarball Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-24 20:49:31 +00:00
Emil Velikov	9950eec173	glsl: add the remaining files to the tarball Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-24 20:49:31 +00:00
Emil Velikov	b2439602be	makefile: add all headers to the tarball Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-24 20:49:31 +00:00
Emil Velikov	113d59fb55	gbm: remove gbm_gallium_drm from the loader No longer used as of commit 48c7461d5a0(st/gbm: remove state-tracker) v2: Add commit message. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> (v1)	2015-03-24 20:49:31 +00:00
Anuj Phogat	d8208312a3	glsl: Generate link error for non-matching gl_FragCoord redeclarations in different fragment shaders. This also applies to a case when gl_FragCoord is redeclared with no layout qualifiers in one fragment shader and not declared but used in other fragment shader. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Khronos Bug#12957 Cc: "10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-03-24 11:16:31 -07:00
Eric Anholt	7bc39c8418	vc4: Add a dump-the-surface-contents routine. This has been useful once again while trying to debug stride issues between render targets and texturing.	2015-03-24 10:39:12 -07:00
Eric Anholt	4df13f55b6	vc4: Allow DRI3 on simulation, as well. The problem I'd seen before seems to be gone.	2015-03-24 10:39:12 -07:00
Eric Anholt	7f797e3d17	vc4: Fix pitch alignment of linear textures. Fixes some non-power-of-two texture rendering when I force ARGB8888 to raster.	2015-03-24 10:39:12 -07:00
Eric Anholt	b3ea377f86	vc4: Write the alignment of level width consistently in validation. 16 / cpp happens to be the same as utile_w on the only raster format supported (4 bytes per pixel), but simulator/hw source code generally talks in terms of utiles.	2015-03-24 10:39:12 -07:00
Eric Anholt	8975a09494	vc4: Fix use of a bool as an enum. The enum compared to was 0, so it worked out, but it sure looked wrong.	2015-03-24 10:39:12 -07:00
Eric Anholt	04605c21f6	vc4: Decide the HW's format before laying out the miptree. I'm experimenting with a workaround for raster texture misrendering on hardware, and this lets me look at the format chosen when computing strides.	2015-03-24 10:39:12 -07:00
Eric Anholt	25d60763d9	vc4: Use our device-specific ioctls for create/mmap. They don't do anything special for us, but I've been told by kernel maintainers that relying on dumb for my acceleration-capable buffers is not OK.	2015-03-24 10:39:12 -07:00
Eric Anholt	af3d747194	vc4: Make a new #define for making code conditional on the simulator. I'd like to compile as much of the device-specific code as possible when building for simulator, and using if (using_simulator) instead of ifdefs helps.	2015-03-24 10:39:12 -07:00
Eric Anholt	9bafcf630a	vc4: Add some useful debug printfs for miptrees. I keep rewriting these.	2015-03-24 10:39:12 -07:00
Ilia Mirkin	baa22c8825	glsl: avoid calling base_alignment when samplers are involved Earlier commit `53bf7c8fd2` changed the logic to always call base_alignment on structs. `1ec715ce8b` hacked the function to return 0 for sampler fields, but didn't handle sampler arrays. Instead of extending the hack, avoid calling base_alignment in the first place on non-UBO uniforms. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89726 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tapani Palli <tapani.palli@intel.com>	2015-03-24 10:10:13 -04:00
Ilia Mirkin	43277fcd59	Revert "nv50,nvc0: remove bogus 64_FLOAT formats" This reverts commit `20346808cf`. The conversion is actually done since these are the *B macro variants and no vtx format is supplied, which makes them go through the translate module. This restores the following piglit tests to passing: draw-vertices user gl-2.0-vertexattribpointer Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-23 20:57:52 -04:00
Mario Kleiner	1110113a7f	mapi: Make private copies of name strings provided by client. glXGetProcAddress("glFoo") ends up in stub_add_dynamic() to create dynamic stubs for dynamic functions. stub_add_dynamic() doesn't store the caller provided name string "Foo" in a mesa private copy, but just stores a pointer to the "glFoo" string passed to glXGetProcAddress - a pointer into arbitrary memory outside mesa's control. If the caller passes some dynamically allocated/changing memory buffer to glXGetProcAddress(), or the caller gets unmapped from memory, e.g., some dynamically loaded application plugin which uses OpenGL, this ends badly - with a dangling pointer. strdup() the name string provided by the client to avoid this problem. Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-23 22:17:03 +00:00
Tom Stellard	dfb1ae9d91	clover: Return 0 as storage size for local kernel args that are not set v2 The storage size for local kernel args can be queried before the arguments are set by using the CL_KERNEL_LOCAL_MEM_SIZE param of clGetKernelWorkGroupInfo(). The spec says that if local kernel arguments have not been specified, then we should assume their size is 0. v2: - Implement using c++11 member initialization. Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>	2015-03-23 17:20:21 +00:00
Tom Stellard	769b366b83	gallivm: Use MCInstrInfo in the disassembler for querying instruction info This fixes the build since llvm r232885 and also simplifies the code.	2015-03-23 14:43:10 +00:00
Giuseppe Bilotta	7932b30892	clover: use get_device_vendor instead of get_vendor The pipe's get_vendor method returns something more akin to a driver vendor string in most cases, instead of the actual device vendor. Use get_device_vendor instead, which was introduced specifically for this purpose. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-03-23 13:25:34 +00:00
Giuseppe Bilotta	76039b38f0	gallium: implement get_device_vendor() for existing drivers The only hackish ones are llvmpipe and softpipe, which currently return the same string as for get_vendor(), while ideally they should return the CPU vendor. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-03-23 13:25:34 +00:00
Giuseppe Bilotta	31d4e6fbff	gallium: introduce get_device_vendor() entrypoint for pipes This will be needed by Clover to return the correct information to CL_DEVICE_VENDOR info queries. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-03-23 13:25:34 +00:00
Giuseppe Bilotta	9280f17e82	gallium: remove trailing whitespace in p_screen.h Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-03-23 13:25:34 +00:00
Tom Stellard	6e17936bf8	clover: The unit for CL_DEVICE_MEM_BASE_ADDR_ALIGN is bits not bytes Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-03-23 13:22:42 +00:00
Tom Stellard	2b12b1752a	clover: Add all the mandatory 1.1 extensions to the extension string Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-03-23 13:22:42 +00:00
Tom Stellard	96f9cc9181	clover: Add a space at the end of CL_DEVICE_OPENCL_C_VERSION This is required by the spec. Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-03-23 13:22:42 +00:00
Francisco Jerez	3d1bba7c9b	i965/vec4: Fix handling of multiple register reads and writes in dead_code_eliminate(). Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:52:57 +02:00
Francisco Jerez	2babde35b9	i965/vec4: Calculate live intervals with subregister granularity. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:52:57 +02:00
Francisco Jerez	e6e655ef76	i965/vec4: Define helpers to calculate the common live interval of a range of variables. These will be especially useful when we start keeping track of liveness information for each subregister. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:52:49 +02:00
Francisco Jerez	eddb87402e	i965/vec4: Define helper functions to convert a register to a variable index. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:13:05 +02:00
Francisco Jerez	ce030a6399	i965/vec4: Don't lose the force_writemask_all flag during CSE. And set it in the MOV instructions that copy the temporary to the original destination if the generator instruction had it set. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:13:00 +02:00
Francisco Jerez	1db9c0cd0c	i965/vec4: Fix handling of multiple register reads and writes in opt_cse(). Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:12:56 +02:00
Francisco Jerez	d041a43c0f	i965/vec4: Fix handling of multiple register reads and writes during copy propagation. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:12:52 +02:00
Francisco Jerez	588859e18c	i965/vec4: Fix handling of multiple register reads and writes in split_virtual_grfs(). Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:12:48 +02:00
Francisco Jerez	9304f60cbe	i965/vec4: Fix handling of multiple register reads and writes in opt_register_coalesce(). Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:12:40 +02:00
Francisco Jerez	74c7e5d351	i965: Define method to check whether a backend_reg is inside a given range. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:12:36 +02:00
Francisco Jerez	bf6eb37e0b	i965/vec4: Remove dependency of vec4_live_variables on the visitor. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:12:13 +02:00
Francisco Jerez	2e7622a487	i965/vec4: Trivial copy propagate clean-up. Fix typo and punctuation in a comment, break long line and add space before curly bracket. Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2015-03-23 14:09:33 +02:00
Francisco Jerez	7526ee36bc	i965/vec4: Add argument index and type checks to SEL saturate propagation. SEL saturate propagation already implicitly relies on these assumptions. Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2015-03-23 14:09:33 +02:00
Francisco Jerez	24073b2cd7	i965/vec4: Fix broken saturate mask check in copy propagation. try_copy_propagate() was checking the bit of the saturate mask for the arg-th component of the source to decide whether the whole source should be saturated (WTF?). We need to swizzle the original saturate mask and check that for all enabled channels the saturate flag is either set or unset, as we cannot saturate a subset of destination components only. Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2015-03-23 14:09:33 +02:00
Francisco Jerez	18dc59c212	i965/vec4: Don't lose copy propagation saturate bits for not written components. Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2015-03-23 14:09:33 +02:00
Francisco Jerez	a3733defbe	Revert "i965/vec4: Don't lose the saturate modifier in copy propagation." This reverts commit `0dfec59a27`. The change prevented propagation of copies with the saturate flag set, making the whole saturate mask tracking completely useless. A proper fix follows. Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2015-03-23 14:09:33 +02:00
Francisco Jerez	21c829e5cc	i965/vec4: Remove unused method definition. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:09:33 +02:00
Francisco Jerez	516d45f78a	i965/vec4: Some more trivial swizzle clean-up. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:09:33 +02:00
Francisco Jerez	430c6bf70e	i965/vec4: Improve src_reg/dst_reg conversion constructors. This simplifies the src_reg/dst_reg conversion constructors using the swizzle utils introduced in a previous patch. It also makes them more useful by changing their semantics slightly: dst_reg(src_reg) used to set the writemask to XYZW if the src_reg swizzle was anything other than XXXX, which was almost certainly not what the caller intended if the swizzle was non-trivial. After this patch the same components that are present in the swizzle will be enabled in the resulting writemask. src_reg(dst_reg) used to set the first components of the swizzle to the enabled components of the writemask and then replicate the last enabled component to fill the swizzle, which, in cases where the writemask didn't have exactly the first n components set, would in general not be compatible with the original dst_reg. E.g.: \| ADD(tmp, src_reg(tmp), src_reg(1)); would not do what one would expect (add one to each of the enabled components of tmp) if tmp didn't have a writemask of the described form (e.g. YZ, YW, XZW would all fail). This pattern actually occurs in many different places in the VEC4 back-end, it's a wonder that it hasn't caused piglit failures until now. After this patch src_reg(dst_reg) will construct a swizzle with each enabled component at its natural position (e.g. Y at the second position, Z at the third, and so on). The resulting swizzle will behave like the identity when used in any instruction with the original writemask. I've manually verified that none of the callers of both conversion constructors were relying on the previous broken semantics. There are no piglit regressions on any generation. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:09:33 +02:00
Francisco Jerez	62fd335338	i965/vec4: Pass argument by reference to src_reg/dst_reg conversion constructors. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:09:32 +02:00
Francisco Jerez	23bda945f5	i965/vec4: Remove swizzle_for_size() in favour of brw_swizzle_for_size(). It could be objected that swizzle_for_size() is "faster" than brw_swizzle_for_size(). It's not measurably better in any reasonable CPU-bound benchmark on VLV according to the Finnish benchmarking system (including the SynMark2 DrvShComp shader compilation benchmark). Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:09:32 +02:00
Francisco Jerez	5bcca9f8dc	i965/vec4: Remove broken vector size deduction in setup_builtin_uniform_values(). This seemed to be trying to deduce the number of uniform vector components from the parameter swizzle, but the algorithm would always give 4 as result. Instead grab the correct number of components from the GLSL type. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:09:32 +02:00
Francisco Jerez	132cdcc468	i965/vec4: Simplify visitor handling of swizzles using the swizzle utils. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:09:32 +02:00
Francisco Jerez	9a17e4e900	i965/vec4: Simplify opt_register_coalesce() using the swizzle utils. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:09:32 +02:00
Francisco Jerez	05ec72d8ec	i965/vec4: Simplify reswizzle() using the swizzle utils. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:09:32 +02:00
Francisco Jerez	7b30493dc4	i965/vec4: Simplify opt_reduce_swizzle() using the swizzle utils. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:09:32 +02:00
Francisco Jerez	eb9bd3a1b0	i965: Fix signedness of backend_reg::reg_offset. And make it 16-bit so it packs nicely with the previous field. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:09:32 +02:00
Francisco Jerez	7e816c7feb	i965/vec4: Fix signedness of dst_reg::writemask. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:09:32 +02:00
Francisco Jerez	7678fb9c63	i965/vec4: Don't use GL types in the IR data structures. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:09:32 +02:00
Francisco Jerez	7bc02c786d	i965/vec4: Fix signedness of brw_is_single_value_swizzle() argument. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:09:32 +02:00
Francisco Jerez	cff670b009	i965: Define some useful swizzle helper functions. This defines helper functions implementing some common swizzle transformations that are usually open-coded in the compiler back-end, causing a lot of clutter. Some optimization passes will become almost trivial implemented in terms of these functions (e.g. vec4_visitor::opt_reduce_swizzle()). Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 14:09:29 +02:00
Tapani Pälli	3cf99701ba	glsl: fix names in lower_constant_arrays_to_uniforms Patch changes lowering pass to use unique name for each uniform so that arrays from different stages cannot end up having same name. v2: instead of global counter, use pointer to achieve unique name (Kenneth Graunke) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89590 Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>	2015-03-23 11:18:39 +02:00
Jason Ekstrand	a6d4a108d2	i965/nir: Use signed integer type for booleans FS instructions with NIR on i965: total instructions in shared programs: 2663561 -> 2619051 (-1.67%) instructions in affected programs: 1612965 -> 1568455 (-2.76%) helped: 5455 HURT: 12 FS instructions with NIR on g4x: total instructions in shared programs: 2352633 -> 2307908 (-1.90%) instructions in affected programs: 1441842 -> 1397117 (-3.10%) helped: 5463 HURT: 11 FS instructions with NIR on ilk: total instructions in shared programs: 3997305 -> 3934278 (-1.58%) instructions in affected programs: 2189409 -> 2126382 (-2.88%) helped: 8969 HURT: 22 FS instructions with NIR on hsw (snb and ivb were similar): total instructions in shared programs: 4109389 -> 4109242 (-0.00%) instructions in affected programs: 109869 -> 109722 (-0.13%) helped: 339 HURT: 190 No SIMD16 programs were gained or lost on any platform Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 01:01:14 -07:00
Jason Ekstrand	41d64fa184	i965/nir: Do boolean resolves on GEN <= 5 v2: A couple comment clean-ups from Matt Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 01:01:14 -07:00
Jason Ekstrand	a55af2699f	i965: Add a NIR analysis pass for determining when a boolean resolve is needed v2: Fix the spelling of analyze and re-arrange code for better readability as per Connor's comments. v3: Make the naming of things more consistent and add a pile of comments v4: Stop trying to avoid vectors Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-03-23 01:01:14 -07:00
Jason Ekstrand	2612e569e0	i965/nir: Properly set the predicate on the SEL used in min/max Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 01:01:14 -07:00
Jason Ekstrand	80390f91a0	i965/nir: Use NIR lowering for ffma for gen < 6 Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-23 01:01:14 -07:00
Jason Ekstrand	235c728020	i965/nir: Use emit_lrp for emitting flrp Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-23 01:01:14 -07:00
Jason Ekstrand	a3e05898e9	i965/fs: Make emit_lrp return an fs_inst Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-23 01:01:14 -07:00
Dave Airlie	484f9f4fcd	i965: define I915_PARAM_REVISION we are broken against the libdrm 2.4.60 minimum specified, so fix it for now. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-23 09:55:33 +10:00
Jose Fonseca	397b491173	gallivm: Silence unused variable warnings on release builds. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-22 08:23:24 +00:00
Jose Fonseca	06ac717810	scons: Silence conversion from 'size_t' to 'type', possible loss of data on MSVC. Most cases seem harmless, though that might not always be the case. Maybe one day we can get gcc to complain about these and fix them throughout the code, but until then let's silence them. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-22 08:23:24 +00:00
Jose Fonseca	15c5595bb1	scons: Ensure inttypes.h is always pre-included on MSVC. It's a bit hackish, but I couldn't find another solution. See code comment for details. The warning is useful, so universally disabling it doesn't sound like a good idea. Fixes warning C4005: 'xxx' : macro redefinition Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-22 08:23:24 +00:00
Jose Fonseca	e4d95982ee	scons: Silence MSVC C4351 warning. It warns about change in MSVC behavior -- array initialisation used to be non-standard, but is standard now, assuming I understand correctly http://en.cppreference.com/w/cpp/language/zero_initialization . Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-22 08:23:24 +00:00
Jose Fonseca	e518d97d7e	scons: Match some of LLVM warning options. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-22 08:23:24 +00:00
Jose Fonseca	31e47a59ad	scons: Cleanup flex/bison settings specification. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-22 08:23:24 +00:00
Jose Fonseca	9c1c657e19	scons: Prefer winflexbison, and use --wincompat when available. This avoids the MSVC warning C4013: 'isatty' undefined; assuming extern returning int with certain versions of flex. Reviewed-by: Brian Paul <brianp@vmware.com> v2: Add win flex-bison link to docs/install.html.	2015-03-22 08:23:24 +00:00
Jose Fonseca	015e8b6384	scons: Define YY_USE_CONST on MSVC. This prevents the MSVC from warning C4090: 'function' : different 'const' qualifiers when compiling flex generated lexers. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-22 08:23:24 +00:00
Jose Fonseca	357d1fc81a	scons: Tell MSVC STL library to not use exceptions. MSVC defaults to no exceptions unless /EH option is passed (which we don't), while MSVC's STL defaults to use exceptions unless _HAS_EXCEPTIONS=0 is defined, which we didn't. This fixes warning C4530: C++ exception handler used, but unwind semantics are not enabled. Specify /EHsc Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-22 08:23:24 +00:00
Jose Fonseca	e6330f9f56	scons: Ensure git_sha1.h's directory exists. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-22 08:23:24 +00:00
Jose Fonseca	8f0274c6c6	configure: Bail out with MinGW targets. We only support native Windows builds with SCons. Tested with: ./configure --host=i686-w64-mingw32 Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-22 08:23:24 +00:00
Jose Fonseca	8d5c303ab9	include: Ensure float.h is included for DBL_MAX. I didn't actually hit the issue in practice, but just happen to notice while looking at the code. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-22 08:23:24 +00:00
Jose Fonseca	60eff44277	st/vdpau: Avoid constness cast warnings. Fixes MSVC warning C4090: '=' : different 'const' qualifiers Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-03-22 08:23:24 +00:00
Jose Fonseca	fb78cccd7b	glsl: Disable MSVC switch warning on a per-file basis. This addresses ...\glsl_parser.cpp(...) : warning C4065: switch statement contains 'default' but no 'case' labels This is on code generated by bison, which we have little control. It seems useful to have this warning otherwise enabled. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-22 08:23:23 +00:00
Jose Fonseca	d01a7cdae5	glsl: Avoid GLboolean vs bool arithmetic MSVC warnings. Note that GLboolean is an alias for unsigned char, which lacks the implicit true/false semantics that C++/C99 bool have. Reviewed-by: Brian Paul <brianp@vmware.com> v2: Change gl_shader::IsES and gl_shader_program::IsES to be bool as recommended by Ian Romanick. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-22 08:23:23 +00:00
Emil Velikov	7c7954b09d	galahad: actually remove the driver Should have been part of 429a4355259(galahad: remove driver). Seems like I've erroneously committed the trimmed patch. Reported-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-21 22:35:27 +00:00
Emil Velikov	bbaf22a998	egl: cut down static storage size for {Version,ClientAPI}String Both seems to be excessively long, namely: ClientAPIString can get up-to 47 based on current code, while the name of the driver can dictate the length of the VersionString, currently it is around 11. Let's pad each to 100, rather than the current 1000. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-21 17:22:19 +00:00
Emil Velikov	0bff196b22	docs: note the removal of gbm_gallium, galahad and identity Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-21 17:21:30 +00:00
Emil Velikov	429a435525	galahad: remove driver Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2015-03-21 17:18:28 +00:00
Emil Velikov	84041bab3f	gallium/docs: remove information about identity driver Removed from tree. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2015-03-21 17:18:25 +00:00
Emil Velikov	55029b2bab	docs: update the egl_platforms list Add the missing wayland, null, android and haiku platforms. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2015-03-21 17:16:44 +00:00
Emil Velikov	0d6e7620f3	egl/main: drop platform fbdev specific code st/egl was the only one which had support for this platform. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2015-03-21 17:16:41 +00:00
Emil Velikov	65a8d4d6dd	winsys/sw/fbdev: remove unused software winsys st/egl was its only user. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2015-03-21 17:16:38 +00:00
Emil Velikov	1081ed9dc3	winsys/sw/wayland: remove unused winsys st/egl was its only user. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2015-03-21 17:16:35 +00:00
Emil Velikov	48c7461d5a	st/gbm: remove state-tracker st/egl was its only user. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2015-03-21 17:16:27 +00:00
Roland Scheidegger	e8039208c4	llvmpipe: use global llvm context for PIPE_SUBSYSTEM_EMBEDDED There are 2 reasons why we'd want to use the global context: 1) There still seems to be one memory "leak" left when using multiple llvm contexts (it is not a true leak as the memory disappears into some still addressable pool but nevertheless the memory consumption grows). See http://cgit.freedesktop.org/~jrfonseca/llvm-jitstress/ 2) These contexts get kinda big - even when disposing modules etc. after compiling a shader the LLVMContext can easily be over 100kB. So when there are lots of llvm contexts around it adds up. The downside is that at least right now this is absolutely not thread safe, so this only works safely in environments where multiple pipe contexts are not used concurrently. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-21 01:52:03 +01:00
Emil Velikov	b2dccfd17e	docs: add news item and link release notes for mesa 10.4.7 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-21 00:54:14 +00:00
Emil Velikov	0030eef62b	docs: Add sha256 sums for the 10.4.7 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `cb154bb221`)	2015-03-21 00:53:22 +00:00
Emil Velikov	befb5d1c94	Add release notes for the 10.4.7 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `d26f3c1f86`)	2015-03-21 00:53:21 +00:00
Dave Airlie	ad6ede260f	mesa: reorder gl_light_attrib reduces from 2664->2656. Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-21 08:14:41 +10:00
Dave Airlie	b99c7defac	mesa: reorder gl_framebuffer this reduces it from 1088 -> 1080 bytes Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-21 08:14:38 +10:00
Dave Airlie	727eb4c4e7	mesa: fix hole in vertex_array_object this just removes 4 bytes from this object. Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-21 08:14:33 +10:00
Dave Airlie	974e4783a5	mesa: repack gl_texture_attrib. This removes a hole, and puts the large allocation at the end, Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-21 08:14:29 +10:00
Dave Airlie	2dbd8284e7	mesa: reduce gl_colorbuffer_attrib and gl_fog_attrib These 392->388 and 72->68. Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-21 08:14:25 +10:00
Dave Airlie	2c016ed35f	mesa: reorder gl_image_unit reduces 40->32 but reduces use in context from 7680->6144. Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-21 08:14:21 +10:00
Dave Airlie	0ff4726a06	mesa: reorder gl_program, gl_shader, gl_shader_program gl_program : 1344->1336 gl_shader: 488->472 gl_shader_program: 352->344. Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-21 08:14:16 +10:00
Dave Airlie	7b634fed59	mesa: reorder gl_transform_feedback_object Reduces size from 184 to 176 bytes. Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-21 08:14:09 +10:00
Dave Airlie	e17b0435c5	mesa: reorder prog_instruction reduces size from 64 to 56 bytes. Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-21 08:14:05 +10:00
Dave Airlie	401b11843b	mesa: reorder gl_array_attrib drops 80 bytes to 72. Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-21 08:14:00 +10:00
Dave Airlie	b3f6e0bb58	mesa: reorder gl_client_array drops from 56 to 48 bytes, drops gl_vertex_array_object from 4584 to 4320 bytes Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-21 08:13:56 +10:00
Dave Airlie	cbaff50828	mesa: reorder gl_texture_unit drops size from 520 -> 512 bytes, which then makes gl_texture_attrib go from 99984 to 98440. Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-21 08:13:51 +10:00
Dave Airlie	83606b4904	mesa: reorder gl_point_attrib this drops the size from 52 bytes to 48 bytes. Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-21 08:13:47 +10:00
Dave Airlie	684c914014	mesa: reorder gl_multisample_attrib drops size from 28 bytes to 20. Acked-by: Brian Paul <brianp@vmware.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-21 08:13:17 +10:00
Ian Romanick	a04b520890	i965/fs: Use correct null destination register in cmod tests Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89670 Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: Vinson Lee <vlee@freedesktop.org>	2015-03-20 12:27:02 -07:00
Connor Abbott	ccb9cbc849	i965/fs: bail on move-to-flag in sel peephole Fixes a piglit regression (shaders/glsl-fs-vec4-indexing-temp-dst-in-nested-loop-combined) with my series for GVN. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-03-20 11:53:11 -04:00
Francisco Jerez	1cc00f1875	i965: Mask out unused Align16 components in brw_untyped_atomic. This is currently not a problem because the vec4 visitor happens to mask out unused components from the destination, but it might become an issue when we start using atomics without writeback message. In any case it seems sensible to set it again here because the consequences of setting the wrong writemask (random graphics memory corruption) are difficult to debug and can easily go unnoticed. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-20 17:01:35 +02:00
Francisco Jerez	959d16e38e	i965: Pass number of components explicitly to brw_untyped_atomic and _surface_read. And calculate the message response size based on the number of components rather than the other way around. This simplifies their interface somewhat and allows the caller to request a writeback message with more than one vector component in SIMD4x2 mode. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-20 17:01:35 +02:00
Francisco Jerez	a815cd8449	i965: Don't disable exec masking for sampler message sends. This was telling the sampler to do texture fetches for all channels in the non-constant surface index case, which could have reduced throughput unnecessarily when some of the channels were disabled by control flow. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-20 17:01:35 +02:00
Francisco Jerez	a902a5d6ba	i965: Factor out logic to build a send message instruction with indirect descriptor. This is going to be useful because the Gen7+ uniform and varying pull constant, texturing, typed and untyped surface read, write, and atomic generation code on the vec4 and fs back-end all require the same logic to handle conditionally indirect surface indices. In pseudocode: \| if (surface.file == BRW_IMMEDIATE_VALUE) { \| inst = brw_SEND(p, dst, payload); \| set_descriptor_control_bits(inst, surface, ...); \| } else { \| inst = brw_OR(p, addr, surface, 0); \| set_descriptor_control_bits(inst, ...); \| inst = brw_SEND(p, dst, payload); \| set_indirect_send_descriptor(inst, addr); \| } This patch abstracts out this frequently recurring pattern so we can now write: \| inst = brw_send_indirect_message(p, sfid, dst, payload, surface) \| set_descriptor_control_bits(inst, ...); without worrying about handling the immediate and indirect surface index cases explicitly. v2: Rebase. Improve documentation and commit message. (Topi) Preserve UW destination type cargo-cult. (Topi, Ken, Matt) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-20 17:01:35 +02:00
Francisco Jerez	fd149628e1	i965: Set nr_params to the number of uniform components in the VS/GS path. Both do_vs_prog and do_gs_prog initialize brw_stage_prog_data::nr_params to the number of uniform vectors required by the shader rather than the number of uniform components, contradicting the comment. This is inconsistent with what the state upload code and scalar path expect but it happens to work until Gen8 because vec4_visitor interprets it as a number of vectors on construction and later on overwrites its original value with the number of uniform components referenced by the shader. Also there's no need to add the number of samplers, they're not actually passed in as uniforms. Fixes a memory corruption issue on BDW with SIMD8 VS. Cc: "10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-20 16:55:36 +02:00
Kenneth Graunke	706b916960	i965/skl: Break down SIMD16 3-source instructions when required. Several steppings of Skylake fail when using SIMD16 with 3-source instructions (such as MAD). This implements WaDisableSIMD16On3SrcInstr and fixes ~190 Piglit tests. Based on a patch by Neil Roberts. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-03-20 13:25:41 +00:00
Neil Roberts	bc4b18d297	i965: Refactor SIMD16-to-2xSIMD8 checks. The places that were checking whether 3-source instructions are supported have now been combined into a small helper function. This will be used in the next patch to add an additional restriction. Based on a patch by Kenneth Graunke. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-20 13:25:41 +00:00
Neil Roberts	c02c4b567c	i965: Store the GPU revision number in brw_context brwContextInit now queries the GPU revision number via a new parameter for DRM_I915_GETPARAM. This new parameter requires a kernel patch and a patch to libdrm. If the kernel doesn't support it then it will continue but set the revision number to -1. The intention is to use this to implement workarounds that are only needed on certain steppings of the GPU. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-03-20 13:25:40 +00:00
Fredrik Höglund	2fd21d8a84	mesa: Make sure the buffer exists in _mesa_lookup_bufferobj_err Generate GL_INVALID_OPERATION and return NULL when the buffer object hasn't been created. All callers expect this. v2: Use a more concise error message. Cc: Laura Ekstrand <laura@jlekstrand.net> Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>	2015-03-20 01:25:29 +01:00
Dave Airlie	9d97cd2e3e	u_primconvert: add primitive restart support This adds primitive restart support to the prim conversion. This involves changing the API for the translate functions as we need to pass the prim restart index and the original number of indices into the translate functions. Primitive restart is supported for quads, quad strips and polygons. This deals with the case where the actual number of output primitives is less than the initially calculated number by filling the rest of the output primitives with the restart index; the other option is to reduce the output prim number, but that would make the generator code a bit messier. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-20 09:46:30 +10:00
Kenneth Graunke	201aef9d13	i965/fp: Emit discard jumps. This should improve the performance of any shaders using the KIL instruction. I'm a bit surprised we missed this. Unfortunately, I have not been able to measure any performance improvements from this patch. It does make ARB_fragment_program behave similarly to GLSL code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-19 16:14:51 -07:00
Kenneth Graunke	8a0946f3b1	i965/fs: Make an emit_discard_jump() function to reduce duplication. This is already copied in two places, and I want to copy it to a third place. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-19 16:14:51 -07:00
Laura Ekstrand	09bfa326a9	main: Add TEXTURE_CUBE_MAP support in CopyTextureSubImage3D. So it turns out that this doesn't actually fix any bugs or add any features, strictly speaking. However, it does avoid a lot of kludginess. Previously, if you called glCopyTextureSubImage3D(texcube, 0, 0, 0, zoffset = 3, ... it would grab the texture image object for face = 0 in teximage.c instead of the desired face = 3. But Line 274 of brw_blorp_blit.cpp would correct for this by updating the slice to 3. This commit does the correct thing before calling any drivers, which should make the functionality much more robust and uniform across all drivers. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-19 16:07:57 -07:00
Laura Ekstrand	037e36a8aa	main: Simplify debug messages for CopyTexSubImageD. Reviewed-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-19 16:07:44 -07:00
Ian Romanick	a44b95cd57	glsl: Annotate as_foo functions that the this pointer cannot be NULL We use the idiom ir_foo x = y->as_foo(); if (x == NULL) return; all over the place. GCC generates some quite lovely code for this. One such example: 340a5b: 83 7d 18 04 cmpl $0x4,0x18(%rbp) 340a5f: 0f 85 06 04 00 00 jne 340e6b 340a65: 48 85 ed test %rbp,%rbp 340a68: 0f 84 fd 03 00 00 je 340e6b This case used as_expression() (ir_type_expression is 4). Note that it checks the ir_type, then checks that the pointer isn't NULL. There is some disconnect in GCC around the condition in the as_foo functions. return ir_type == ir_type_##TYPE ? (ir_##TYPE ) this : NULL; \ It believes "this" could be NULL, so it emits check outside the function just for fun. This patch uses assume() to tell GCC that it need not bother with extra NULL checking of the pointer returned by the as_foo functions. text data bss dec hex filename 4836430 158688 26248 5021366 4c9eb6 i965_dri-before.so 4836173 158688 26248 5021109 4c9db5 i965_dri-after.so v2: Replace 'if (this == NULL) unreachable("this cannot be NULL")' with assume(this != NULL). Suggested by Ilia Mirkin. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-19 15:35:42 -07:00
Paul Berry	bf9d921936	main: Change the type argument of use_shader_program() to gl_shader_stage. This allows it to be called from a loop. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-03-19 13:38:51 -07:00
Paul Berry	57b2652322	main: Clean up a strange construction in use_shader_program(). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-03-19 13:38:51 -07:00
Jason Ekstrand	46c35c61e9	i965/nir: Sort uniforms direct-first and use two different uniform registers Previously, we put all the uniforms into one big array. The problem with this approach is that, as soon as there was one indirect array access, the backend would decide that the entire large array should be pull constants. This commit splits the array in half: first direct-only uniforms and then potentially-indirect uniforms. This may not be optimal, but it does let the backend promote things to push constants. Shader-db results on HSW: total instructions in shared programs: 4114840 -> 4112172 (-0.06%) instructions in affected programs: 43316 -> 40648 (-6.16%) helped: 116 HURT: 0 v2: Set param_size[num_direct_uniforms] only if we have indirect uniforms. This caused a bug that, strangely enough, only showed up on Broadwell vertex shaders. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-03-19 13:18:39 -07:00
Jason Ekstrand	8a33f95b7a	nir/lower_io: Add a assign_locations function that sorts by [in]direct use v2: Delete the set of indirectly accessed variables when we're done with it v3: Rename from _packed to _scalar Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-03-19 13:18:39 -07:00
Jason Ekstrand	25db44a845	nir/lower_io: Make variable location assignment a manual operation Previously, we just assigned variable locations in nir_lower_io. Now, we force the user to assign variable locations for us. This gives the backend a bit more control over where variables are placed. v2: Rename from _packed to _scalar Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-03-19 13:18:39 -07:00
Jason Ekstrand	639115123e	nir: Use a list instead of a hash_table for inputs, outputs, and uniforms We never did a single hash table lookup in the entire NIR code base that I found so there was no real benefit to doing it that way. I suppose that for linking, we'll probably want to be able to lookup by name but we can leave building that hash table to the linker. In the meantime this was causing problems with GLSL IR -> NIR because GLSL IR doesn't guarantee us unique names of uniforms, etc. This was causing massive rendering issues in the unreal4 Sun Temple demo. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-03-19 13:18:38 -07:00
Brian Paul	8f255f948b	gallivm: remove unused 'builder' variable Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-19 12:56:35 -06:00
Brian Paul	1cd3745911	mesa: use more descriptive error messages for glUniform errors Different errors for type mismatches, size mismatches and matrix/non-matrix mismatches. Use a common format of "uniformName"@location in the messages. Reviewed-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-19 12:56:35 -06:00
Matt Turner	b0d422cd2a	i965/fs: Print spills:fills and number of promoted constants. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-03-19 11:15:57 -07:00
Ian Romanick	b616164c95	i965/fs: Emit better b2f of an expression on GEN4 and GEN5 On platforms that do not natively generate 0u and ~0u for Boolean results, b2f expressions that look like f = b2f(expr cmp 0) will generate better code by pretending the expression is f = ir_triop_sel(0.0, 1.0, expr cmp 0) This is because the last instruction of "expr" can generate the condition code for the "cmp 0". This avoids having to do the "-(b & 1)" trick to generate 0u or ~0u for the Boolean result. This means code like mov(16) g16<1>F 1F mul.ge.f0(16) null g6<8,8,1>F g14<8,8,1>F (+f0) sel(16) m6<1>F g16<8,8,1>F 0F will be generated instead of mul(16) g2<1>F g12<8,8,1>F g4<8,8,1>F cmp.ge.f0(16) g2<1>D g4<8,8,1>F 0F and(16) g4<1>D g2<8,8,1>D 1D and(16) m6<1>D -g4<8,8,1>D 0x3f800000UD v2: When the comparison is either == 0.0 or != 0.0, use the knowledge that the true (or false) case already results in zero to allow better code generation by possibly avoiding a load-immediate instruction. v3: Apply the optimization even when neither comparator is zero. Shader-db results: GM45 (0x2A42): total instructions in shared programs: 3551002 -> 3550829 (-0.00%) instructions in affected programs: 33269 -> 33096 (-0.52%) helped: 121 Iron Lake (0x0046): total instructions in shared programs: 4993327 -> 4993146 (-0.00%) instructions in affected programs: 34199 -> 34018 (-0.53%) helped: 129 No change on other platforms. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Palli <tapani.palli@intel.com>	2015-03-19 10:21:08 -07:00
Matt Turner	036e347f3c	util: Optimize _mesa_roundeven with SSE 4.1. The SSE 4.1 ROUND instructions let us implement roundeven directly. Otherwise we assume that the rounding mode has not been modified (as we do in the rest of Mesa) and use rint(). glibc uses the ROUND instruction in rint() after a cpuid check. This patch just lets us inline it directly when we're already building for SSE 4.1. Reviewed-by: Carl Worth <cworth@cworth.org>	2015-03-18 21:06:26 -07:00
Matt Turner	5de86102f9	util: Add a roundeven test. Reviewed-by: Carl Worth <cworth@cworth.org>	2015-03-18 21:06:26 -07:00
Matt Turner	dd0d3a2c0f	mesa: Replace _mesa_round_to_even() with _mesa_roundeven(). Eric's initial patch adding constant expression evaluation for ir_unop_round_even used nearbyint. The open-coded _mesa_round_to_even implementation came about without much explanation after a reviewer asked whether nearbyint depended on the application not modifying the rounding mode. Of course (as Eric commented) we rely on the application not changing the rounding mode from its default (round-to-nearest) in many other places, including the IROUND function used by _mesa_round_to_even! Worse, IROUND() is implemented using the trunc(x + 0.5) trick which fails for x = nextafterf(0.5, 0.0). Still worse, _mesa_round_to_even unexpectedly returns an int. I suspect that could cause problems when rounding large integral values not representable as an int in ir_constant_expression.cpp's ir_unop_round_even evaluation. Its use of _mesa_round_to_even is clearly broken for doubles (as noted during review). The constant expression evaluation code for the packing built-in functions also mistakenly assumed that _mesa_round_to_even returned a float, as can be seen by the cast through a signed integer type to an unsigned (since negative float -> unsigned conversions are undefined). rint() and nearbyint() implement the round-half-to-even behavior we want when the rounding mode is set to the default round-to-nearest. The only difference between them is that nearbyint() raises the inexact exception. This patch implements _mesa_roundeven{f,}, a function similar to the roundeven function added by a yet unimplemented technical specification (ISO/IEC TS 18661-1:2014), with a small difference in behavior -- we don't bother raising the inexact exception, which I don't think we care about anyway. At least recent Intel CPUs can quickly change a subset of the bits in the x87 floating-point control register, but the exception mask bits are not included. 
rint() does not need to change these bits, but nearbyint() does (twice: save old, set new, and restore old) in order to raise the inexact exception, which would incur some penalty. Reviewed-by: Carl Worth <cworth@cworth.org>	2015-03-18 21:06:26 -07:00
Matt Turner	bb22aa08e4	i965/fs: Ignore type in cmod prop if scan_inst is CMP. total instructions in shared programs: 6263270 -> 6203091 (-0.96%) instructions in affected programs: 2606529 -> 2546350 (-2.31%) helped: 14301 GAINED: 5 LOST: 3 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-18 21:03:09 -07:00
Jason Ekstrand	e1f3ddef8c	i965/nir: Make our environment variable checking smarter Before, we enabled NIR if you set INTEL_USE_NIR to anything, which meant that INTEL_USE_NIR=false would actually turn on NIR. In preparation for turning NIR on by default, this commit makes it smarter by allowing the INTEL_USE_NIR variable to work as either a force-enable or a force-disable. Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2015-03-18 16:40:22 -07:00
Dave Airlie	37e3a116f8	egl: don't fill client apis string forever. We never reset the string on eglTerminate, so it grows forever on multiple eglInitialize calls. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-19 08:28:38 +10:00
Jose Fonseca	cebc62f106	swrast: Use BITFIELD64_BIT for arrayAttribs. As VARYING_SLOT_MAX can be bigger than 32. I'll probably stop building swrast with MSVC in the near future, but this seems a real bug regardless. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-18 21:51:54 +00:00
Jose Fonseca	d3e9aa8d88	scons: Don't link program_lexer.l/y twice. program/lex.yy.c and program/program_parse.tab.c are already included in the PROGRAM_FILES variable. We still need to specify the dependency relationship though. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-18 21:51:54 +00:00
Jose Fonseca	a56f1a8b32	gallivm: Use INFINITY directly. Already done below. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-18 21:51:40 +00:00
Jose Fonseca	1d30fd85dd	scons: Silence MSVC warnings about overflows in constant arithmetic. These get triggered even when using the standard C99 INFINITY/NAN constants. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-18 21:51:40 +00:00
José Fonseca	bbac03ecca	scons: Disable MSVC signed/unsigned mismatch warnings. By default gcc ignores the issue, and as result code that mixes signed/unsigned is so widespread through the code base that it ends up being little more than noise, potentially obscuring more pertinent warnings. Maybe one day we enable the corresponding gcc warnings and cleanup, but until then, this change disables them. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-03-18 21:51:40 +00:00
Laura Ekstrand	2ccfce3f4c	docs: Update progress on ARB_direct_state_access. Acked-by: Matt Turner <mattst88@gmail.com>	2015-03-18 13:59:39 -07:00
Brian Paul	627991dbf7	dri: add _glapi_set_nop_handler(), _glapi_new_nop_table() to dri_test.c I wasn't aware of these _glapi_ stub functions when I committed `4bdbb588a9`. Fixes "make check" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89662 Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2015-03-18 12:46:11 -06:00
Brian Paul	9263986401	mesa: remove MSVC warning pragmas Removing this block of pragmas doesn't seem to increase the number of warnings generated by MSVC. Other than signed/unsigned comparison warnings there are very few other warnings nowadays. Acked-by: Matt Turner <mattst88@gmail.com>	2015-03-18 09:01:50 -06:00
Brian Paul	ea1b066a34	mesa: add void to format_array_format_table_init() declaration Silences an MSVC warning where it's called from call_once(). Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-18 09:01:50 -06:00
Brian Paul	9fbbd60c1d	mapi: move some #includes from .h file to .c files Just include things where they're needed. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-18 09:01:50 -06:00
Brian Paul	4009d22b61	mesa: make _mesa_alloc_dispatch_table() static Never called from outside of context.c Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-18 09:01:50 -06:00
Brian Paul	4bdbb588a9	mesa: reimplement dispatch table no-op function handling Use the new _glapi_new_nop_table() and _glapi_set_nop_handler() to improve how we handle calling no-op GL functions. If there's a current context for the calling thread, generate a GL_INVALID_OPERATION error. This will happen if the app calls an unimplemented extension function or it calls an illegal function between glBegin/glEnd. If there's no current context, print an error to stdout if it's a debug build. The dispatch_sanity.cpp file has some previous checks removed since the _mesa_generic_nop() function no longer exists. This fixes the piglit gl-1.0-dlist-begin-end and gl-1.0-beginend-coverage tests on Windows. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-18 09:01:50 -06:00
Brian Paul	201e36e77d	mapi: add new _glapi_new_nop_table() and _glapi_set_nop_handler() _glapi_new_nop_table() creates a new dispatch table populated with pointers to no-op functions. _glapi_set_nop_handler() is used to register a callback function which will be called from each of the no-op functions. Now we always generate a separate no-op function for each GL entrypoint. This allows us to do proper stack clean-up for Windows __stdcall and lets us report the actual function name in error messages. Before this change, for non-Windows release builds we used a single no-op function for all entrypoints. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-18 09:01:50 -06:00
Rob Clark	aee26d292f	freedreno/ir3: fix infinite recursion in sched One more case we need to handle. One of the src instructions for the indirect could also end up being ourself. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-18 10:42:33 -04:00
Rob Clark	62cc003b7d	freedreno: fix spelling Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-18 10:42:33 -04:00
Marek Olšák	42715ad793	docs/GL3: don't list nv30 Suggested by Ilia Mirkin.	2015-03-18 12:04:27 +01:00
Marek Olšák	4e46af0195	docs/GL3: don't list swrast Let's face it: This driver is unlikely to get more love. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-18 12:04:27 +01:00
Marek Olšák	2b5379651f	docs/GL3: don't list r300 r300g already supports everything it can. There's no point in listing the driver here. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-18 12:04:27 +01:00
Marek Olšák	a984abdad3	radeonsi: increase coords array size for radeon_llvm_emit_prepare_cube_coords radeon_llvm_emit_prepare_cube_coords uses coords[4] in some cases (TXB2 etc.) Discovered by Coverity. Reported by Ilia Mirkin. Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-03-18 12:04:27 +01:00
Jonathan Gray	8475526a38	configure: check if compiler supports -Werror=vla. Check if the compiler supports -Werror=vla before using it. -Wvla was introduced with GCC 4.3 and is not present in 4.2. Fixes the build on OpenBSD. v2: Fix statement order, and quote $save_CFLAGS. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89433 Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Signed-off-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-18 10:53:20 +00:00
Chris Wilson	eeb504e0ae	i965: Defer the throttle until we submit new commands Currently, we throttle before the user begins preparing commands for the next frame when we acquire the draw/read buffers. However, construction of the command buffer can itself take significant time relative to the frame time. If we move the throttle from the buffer acquire to the command submit phase we can allow the user to improve concurrency between the CPU and GPU (i.e. reduce the amount of time we waste inside the throttle). v2: Whitespace + delay throttling until after the next submission for greater parallelism Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: Chad Versace <chad.versace@linux.intel.com> Cc: Ian Romanick <idr@freedesktop.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> [v1]	2015-03-18 09:33:33 +00:00
Chris Wilson	64788b2e8d	i965: Throttle to the previous frame In order to facilitate the concurrency offered by triple buffering and to offset the latency induced by swapping via an external process, which may incur extra rendering itself, only throttle to the previous frame and not the last. The second issue that mostly affects swap benchmarks, but also can incur jitter in the throttling, is that the throttle bo is closer to the next SwapBuffers rather than immediately after the previous SwapBuffers. Throttling to the previous frame doubles the maximum possible latency at the benefit of improving throughput and reducing jitter. v2: Rename "first_post_swapbuffer" batches array to a plain throttle_batch[] as the pluralisation was contorting the name and not making it clear as to whether it was the first batch or first_post_swap batch. Not least of which was that not all throttle points are SwapBuffers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: Chad Versace <chad.versace@linux.intel.com> Cc: Ian Romanick <idr@freedesktop.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2015-03-18 09:33:33 +00:00
Chris Wilson	8b9bd19021	i965: Throttle rendering to an fbo When rendering to an fbo, even though it may be acting as a winsys frontbuffer or just generally, we never throttle. However, when rendering to an fbo, there is no natural frame boundary. Conventionally we use SwapBuffers and glFinish, but potential callers often avoid glFinish for being too heavy handed (waiting on all outstanding rendering to complete). The kernel provides a soft-throttling option for this case that waits for rendering older than 20ms to be complete (that's a little too lax to be used for swapbuffers, but is here a useful safety net). The remaining choice is then either never to throttle, throttle after every draw call, or at an intermediate user-defined point such as glFlush and thus all the implied flushes. This patch opts for the latter as that is the current method used for flushing to front buffers. v2: Defer the throttling from inside the flush to the next intel_prepare_render() and switch non-fbo frontbuffer throttling over to use the same lax method. The issue being that glFlush()/intel_prepare_read() is just as likely to be called inside a tight loop and not at "frame" boundaries. v3: Rename from need_front_throttle to need_flush_throttle to avoid any ambiguity between front buffer rendering and fbo rendering. (Chad) v4: Whitespace Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: Chad Versace <chad.versace@linux.intel.com> Cc: Ian Romanick <idr@freedesktop.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2015-03-18 09:33:33 +00:00
Jason Ekstrand	27bf37ba05	nir/peephole_select: Allow uniform/input loads and load_const Shader-db results on HSW: total instructions in shared programs: 4174156 -> 4157291 (-0.40%) instructions in affected programs: 145397 -> 128532 (-11.60%) helped: 383 HURT: 0 GAINED: 20 LOST: 22 There are two more tests lost than gained. However, comparing this with GLSL IR vs. NIR results, the overall delta is reduced from 85/44 gained/lost on current master to 71/32 with this commit. Therefore, I think it's probably a boon since we are getting "closer" to where we were before. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-03-17 17:11:05 -07:00
Jason Ekstrand	1be862c0c4	nir/peephole_select: Copy instructions into the block before the if Previously we tried to do poor-man's copy propagation as we created the select instructions. Instead, this commit just moves the instructions from the blocks inside the if into the block before. Copy propagation will take care of making sure we don't have any extra mov's in there for us. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-03-17 17:11:05 -07:00
Jason Ekstrand	8cf40ed05d	nir/peephole_select: Rename are_all_move_to_phi and use a switch Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-03-17 17:11:05 -07:00
Mario Kleiner	cc5ddd584d	glx: Handle out-of-sequence swap completion events correctly. (v2) The code for emitting INTEL_swap_events swap completion events needs to translate from 32-Bit sbc on the wire to 64-Bit sbc for the events and handle wraparound accordingly. It assumed that events would be sent by the server in the order their corresponding swap requests were emitted from the client, iow. sbc count should be always increasing. This was correct for DRI2. This is not always the case under the DRI3/Present backend, where the Present extension can execute presents and send out completion events in a different order than the submission order of the present requests, due to client code specifying targetMSC target vblank counts which are not strictly monotonically increasing. This confused the wraparound handling. This patch fixes the problem by handling 32-Bit wraparound in both directions. As long as successive swap completion events real 64-Bit sbc's don't differ by more than 2^30, this should be able to do the right thing. How this is supposed to work: awire->sbc contains the low 32-Bits of the true 64-Bit sbc of the current swap event, transmitted over the wire. glxDraw->lastEventSbc contains the low 32-Bits of the 64-Bit sbc of the most recently processed swap event. glxDraw->eventSbcWrap is a 64-Bit offset which tracks the upper 32-Bits of the current sbc. The final 64-Bit output sbc aevent->sbc is computed from the sum of awire->sbc and glxDraw->eventSbcWrap. Under DRI3/Present, swap completion events can be received slightly out of order due to non-monotonic targetMsc specified by client code, e.g., present request submission: Submission sbc: 1 2 3 targetMsc: 10 11 9 Reception of completion events: Completion sbc: 3 1 2 The completion sequence 3, 1, 2 would confuse the old wraparound handling made for DRI2 as 1 < 3 --> Assumes a 32-Bit wraparound has happened when it hasn't. 
The client can queue multiple present requests, in the case of Mesa up to n requests for n-buffered rendering, e.g., n = 2-4 in the current Mesa GLX DRI3/Present implementation. In the case of direct Pixmap presents via xcb_present_pixmap() the number n is limited by the amount of memory available. We reasonably assume that the number of outstanding requests n is much less than 2 billion due to memory constraints and common sense. Therefore while the order of received sbc's can be a bit scrambled, successive 64-Bit sbc's won't deviate by much, a given sbc may be a few counts lower or higher than the previous received sbc. Therefore any large difference between the incoming awire->sbc and the last recorded glxDraw->lastEventSbc will be due to 32-Bit wraparound and we need to adapt glxDraw->eventSbcWrap accordingly to adjust the upper 32-Bits of the sbc. Two cases, corresponding to the two if-statements in the patch: a) Previous sbc event was below the last 2^32 boundary, in the previous glxDraw->eventSbcWrap epoch, the new sbc event is in the next 2^32 epoch, therefore the low 32-Bit awire->sbc wrapped around to zero, or close to zero --> awire->sbc is apparently much lower than the glxDraw->lastEventSbc recorded for the previous epoch --> We need to increment glxDraw->eventSbcWrap by 2^32 to adjust the current epoch to be one higher than the previous one. --> Case a) also handles the old DRI2 behaviour. b) Previous sbc event was above closest 2^32 boundary, but now a late event from the previous 2^32 epoch arrives, with a true sbc that belongs to the previous 2^32 segment, so the awire->sbc of this late event has a high count close to 2^32, whereas glxDraw->lastEventSbc is closer to zero --> awire->sbc is much greater than glxDraw->lastEventSbc. --> We need to decrement glxDraw->eventSbcWrap by 2^32 to adjust the current epoch back to the previous lower epoch of this late completion event. 
We assume such a wraparound to a higher (a) epoch or lower (b) epoch has happened if awire->sbc and glxDraw->lastEventSbc differ by more than 2^30 counts, as such a difference can only happen on wraparound, or if somehow 2^30 present requests would be pending for a given drawable inside the server, which is rather unlikely. v2: Explain the reason for this patch and the new wraparound handling much more extensively in the commit message, no code change wrt. initial version. Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-03-17 23:54:02 +00:00
Emil Velikov	3f94a5afcb	r600g: constify r600_shader_tgsi_instruction lists. Massive list of constant data. Annotate it as such. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-03-17 23:52:39 +00:00
Emil Velikov	63cf2b4448	r600g: kill off r600_shader_tgsi_instruction::{tgsi_opcode,is_op3} Both of which are no longer used. Use designated initializer to make things obvious as people add/remove TGSI_OPCODEs. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-03-17 23:52:35 +00:00
Emil Velikov	5e68c6b322	r600g: use the tgsi opcode from parse.FullToken.FullInstruction ... rather than the local one in inst_info->tgsi_opcode. This will allow us to simplify struct r600_shader_tgsi_instruction. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-03-17 23:52:32 +00:00
Ian Romanick	6db5e134b6	i965/fs: Apply gl_FrontFacing ? -1 : 1 optimization only for floats At the very least, unreal4/sun-temple/102.shader_test uses this pattern for a signed integer result. However, that shader did not hit the optimization in the first place because it uses !gl_FrontFacing. I changed the shader to remove the logical-not and reverse the other operands. I verified that incorrect code is generated before this change and correct code is generated after. Fixes fs-frontfacing-ternary-1-neg-1.shader_test. No shader-db changes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-17 15:01:44 -07:00
Ian Romanick	4a53445b0d	i965/fs: Change try_opt_frontfacing_ternary to eliminate asserts If we check for the case that is actually necessary, the asserts become superfluous. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-17 15:00:28 -07:00
Ian Romanick	ce3f46397d	i965/fs: Handle CMP.nz ... 0 and AND.nz ... 1 similarly in cmod propagation Especially on platforms that do not natively generate 0u and ~0u for Boolean results, we generate a lot of sequences where a CMP is followed by an AND with 1. emit_bool_to_cond_code does this, for example. On ILK, this results in a sequence like: add(8) g3<1>F g8<8,8,1>F -g4<0,1,0>F cmp.l.f0(8) g3<1>D g3<8,8,1>F 0F and.nz.f0(8) null g3<8,8,1>D 1D (+f0) iff(8) Jump: 6 The AND.nz is obviously redundant. By propagating the cmod, we can instead generate add.l.f0(8) null g8<8,8,1>F -g4<0,1,0>F (+f0) iff(8) Jump: 6 Existing code already handles the propagation from the CMP to the ADD. Shader-db results: GM45 (0x2A42): total instructions in shared programs: 3550829 -> 3550788 (-0.00%) instructions in affected programs: 10028 -> 9987 (-0.41%) helped: 24 Iron Lake (0x0046): total instructions in shared programs: 4993146 -> 4993105 (-0.00%) instructions in affected programs: 9675 -> 9634 (-0.42%) helped: 24 Ivy Bridge (0x0166): total instructions in shared programs: 6291870 -> 6291794 (-0.00%) instructions in affected programs: 17914 -> 17838 (-0.42%) helped: 48 Haswell (0x0426): total instructions in shared programs: 5779256 -> 5779180 (-0.00%) instructions in affected programs: 16694 -> 16618 (-0.46%) helped: 48 Broadwell (0x162E): total instructions in shared programs: 6823088 -> 6823014 (-0.00%) instructions in affected programs: 15824 -> 15750 (-0.47%) helped: 46 No change on Sandy Bridge or on any platform when NIR is used. v2: Add unit tests suggested by Matt. Remove spurious writes_flag() check on scan_inst when scan_inst is known to be BRW_OPCODE_CMP (also suggested by Matt). v3: Fix some comments and remove some explicit int() casts in fs_reg constructors in the unit tests. Both suggested by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-17 14:59:43 -07:00
Matt Turner	d35720da9b	i965: Mark paths in linear <-> tiled functions as unreachable(). text data bss dec hex filename 9663 0 0 9663 25bf intel_tiled_memcpy.o before 8215 0 0 8215 2017 intel_tiled_memcpy.o after Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-03-17 14:09:56 -07:00
Matt Turner	6c6e2a15aa	egl: Remove eglQueryString virtual dispatch. Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-03-17 14:09:56 -07:00
Laura Ekstrand	827da841a1	main: Correct _mesa_error with no format in bufferobj.c. This fixes Bug 89616, a build failure due to line 1639 of bufferobj.c: _mesa_error(ctx, GL_INVALID_OPERATION, func); Trivial.	2015-03-17 13:30:54 -07:00
Laura Ekstrand	579297c8bd	main: Cosmetic changes to GetBufferSubData. Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2015-03-17 10:18:34 -07:00
Laura Ekstrand	23eab47bbe	main: Add entry point for GetNamedBufferSubData. Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2015-03-17 10:18:34 -07:00
Laura Ekstrand	3706ace244	main: Cosmetic updates to GetBufferPointerv. v3: Review from Fredrik Hoglund -Split cosmetic refactor of GetBufferPointerv out into a separate commit Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2015-03-17 10:18:34 -07:00
Laura Ekstrand	105ddc6aea	main: Add entry point for GetNamedBufferPointerv. v3: Review from Fredrik Hoglund -Split cosmetic refactor of GetBufferPointerv out into a separate commit Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2015-03-17 10:18:34 -07:00
Laura Ekstrand	1e45752aaf	main: Add entry points for GetNamedBufferParameteri[64]v. Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2015-03-17 10:18:34 -07:00
Laura Ekstrand	efcb830d49	main: Refactor GetBufferParameteri[64]v. v2: Split into a refactor commit and an entry point commit. Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2015-03-17 10:18:34 -07:00
Laura Ekstrand	1cfc18da8d	main: Add entry point for FlushMappedNamedBufferRange. Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2015-03-17 10:18:34 -07:00
Laura Ekstrand	ee5fae6e89	main: Refactor FlushMappedBufferRange. v2:-Remove "_mesa" from in front of static software fallback. -Split out the refactor from the addition of the DSA entry points. Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2015-03-17 10:18:34 -07:00
Laura Ekstrand	f7f5df9954	main: Add entry point for UnmapNamedBuffer. v2: review from Ian Romanick - Restore VBO_DEBUG and BOUNDS_CHECK - Remove _mesa from static software fallback unmap_buffer. Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2015-03-17 10:18:34 -07:00
Laura Ekstrand	a0cc03929e	main: Add entry points for MapNamedBuffer[Range]. Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2015-03-17 10:18:34 -07:00
Laura Ekstrand	4f513bc330	main: Refactor MapBuffer[Range]. v2: review from Jason Ekstrand - Split refactor from addition of DSA entry points. review from Ian Romanick - Remove "_mesa" from static software fallback map_buffer_range - Restore VBO_DEBUG and BOUNDS_CHECK Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2015-03-17 10:18:34 -07:00
Laura Ekstrand	16244525fb	main: Minor whitespace fixes in ClearNamedBuffer[Sub]Data. Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2015-03-17 10:18:34 -07:00
Laura Ekstrand	5030d0a4f7	main: Add entry points for ClearNamedBuffer[Sub]Data. Reviewed-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-17 10:18:33 -07:00
Laura Ekstrand	9fa6c3637a	main: Refactor ClearBuffer[Sub]Data. v2: review by Jason Ekstrand - Split refactor of clear buffer sub data from addition of DSA entry points. Reviewed-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-17 10:18:33 -07:00
Laura Ekstrand	4adaad5fcc	main: Add entry point for CopyNamedBufferSubData. v2: remove _mesa in front of static software fallback. Reviewed-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-17 10:18:33 -07:00
Laura Ekstrand	9cb732b8e9	main: Improve errors and style in BufferSubData. - More explicit error reporting. - Removed legacy style. Reviewed-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-17 10:18:33 -07:00
Laura Ekstrand	566ccdf11b	main: Add entry point for NamedBufferSubData. v2: review by Ian Romanick - Remove "_mesa" from name of static software fallback buffer_sub_data. - Remove mappedRange from _mesa_buffer_sub_data. - Removed some cosmetic changes to a separate commit. Reviewed-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-17 10:18:33 -07:00
Laura Ekstrand	cb56835f87	main: Add entry point for NamedBufferData. v2: review from Ian Romanick - Fix space in ARB_direct_state_access.xml. - Remove "_mesa" from the name of buffer_data static fallback. - Restore VBO_DEBUG and BOUNDS_CHECK. - Fix beginning of comment to start on same line as /* Reviewed-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-17 10:18:33 -07:00
Laura Ekstrand	a76808dc19	main: Add entry point for NamedBufferStorage. Reviewed-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-17 10:18:33 -07:00
Laura Ekstrand	2cf48c37c1	main: Add entry point for CreateBuffers. Reviewed-by: Martin Peres <martin.peres@linux.intel.com>	2015-03-17 10:18:33 -07:00
Laura Ekstrand	44ecf0793d	Revert "main: _mesa_cube_level_complete checks NumLayers." This reverts commit `1ee000a0b6`. Failures with the GLES3 conformance suite and Synmark2 OGLHdrBloom revealed that this commit was in error. Extensive testing with Piglit prior to patch review and upstreaming did not reveal this problem because, in the few Piglit tests that test for cube completeness, NumLayers = 6. This is because all of the existing tests use TextureStorage to initialize the texture, which sets NumLayers. A new Piglit test has been sent to the mailing list that reproduces the bug related to this patch ("texturing: Testing glGenerateMipmap(GL_TEXTURE_CUBE_MAP) without glTexStorage2D"). Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-17 10:04:10 -07:00
Neil Roberts	5a06ee7384	i965/skl: Send a message header when doing constant loads SIMD4x2 Commit `0ac4c27275` made it add a header for the send message when using SIMD4x2 on Skylake because without this it will end up using SIMD8D. However the patch missed the case when a sampler is being used to implement constant loads from a buffer surface in a SIMD4x2 vertex shader. This fixes 29 Piglit tests, mostly related to the ARL instruction in vertex programs. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Tested-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-17 16:32:11 +00:00
Tapani Pälli	627c683086	i965/fs: in MAD optimizations, switch last argument to be immediate Commit `bb33a31` introduced optimizations that transform cases of MAD into simpler forms but it did not take into account that src[0] can not be immediate and did not report progress. Patch switches src[0] and src[1] if src[0] is immediate and adds progress reporting. If both sources are immediates, this is taken care of by the same opt_algebraic pass on later run. v2: Fix for all cases, use temporary fs_reg (Matt, Kenneth) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89569 Reviewed-by: Francisco Jerez <currojerez@riseup.net> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.5" <mesa-stable@lists.freedesktop.org>	2015-03-17 07:59:30 +02:00
Vinson Lee	60f77b22b1	common.py: Fix PEP 8 issues. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-16 22:55:08 -07:00
Roland Scheidegger	2372275d2f	gallivm: abort properly when running out of buffer space in lp_disassembly Before this actually ran into an infinite loop printing out "invalid"... Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-17 00:46:48 +01:00
Marek Olšák	9d1682d619	docs/GL3: also mark GLES3/GS5 for radeonsi as done	2015-03-16 23:27:25 +01:00
Emil Velikov	c066669b8d	st/dri: remove unused include from the automake/scons build st/dri/common hasn't been around for a while. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-16 20:59:52 +00:00
Emil Velikov	55f0c0a29f	auxiliary/os: fix the android build - s/drm_munmap/os_munmap/ Squash this silly typo introduced with commit c63eb5dd5ec(auxiliary/os: get the mmap/munmap wrappers working with android) Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-16 20:59:36 +00:00
Emil Velikov	5664f57df3	gallium/sw/kms: trivial cleanups Remove the forward declaration and make use of the DEBUG_PRINT macro for debug builds. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-16 20:59:22 +00:00
Emil Velikov	771cd266b9	loader: include <sys/stat.h> for non-sysfs builds Required by fstat(), otherwise we'll error out due to implicit function declaration. Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89530 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reported-by: Vadim Rutkovsky <vrutkovs@redhat.com> Tested-by: Vadim Rutkovsky <vrutkovs@redhat.com>	2015-03-16 20:48:07 +00:00
Felix Janda	aead7fe2e2	c11/threads: Use PTHREAD_MUTEX_RECURSIVE by default Previously PTHREAD_MUTEX_RECURSIVE_NP had been used on linux for compatibility with old glibc. Since mesa defines __GNU_SOURCE__ on linux PTHREAD_MUTEX_RECURSIVE is also available since at least 1998. So we can unconditionally use the portable version PTHREAD_MUTEX_RECURSIVE. Cc: "10.5" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88534 Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-16 20:41:39 +00:00
Marek Olšák	b5f19db976	radeonsi: implement TGSI_OPCODE_BFI (v2) v2: Don't use the intrinsics, the shader backend can recognize these patterns and generates optimal code automatically. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-03-16 14:58:19 +01:00
Marek Olšák	d3723c614f	radeonsi: add a helper for extracting bitfields from parameters (v2) This will be used a lot (especially by tessellation). v2: don't use the bfe intrinsic Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-03-16 14:58:19 +01:00
Antia Puentes	9735a62a2c	i965: Emit IF/ELSE/ENDIF/WHILE JIP with type W on Gen7 IvyBridge and Haswell PRM say that the JIP should be emitted with type W but we were using UD. The previous implementation did not show adverse effects, but IMHO it is safer to follow the specification thoroughly. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Antia Puentes <apuentes@igalia.com>	2015-03-16 12:56:17 +01:00
Marek Olšák	dc39413640	radeonsi: move scratch reloc state setup - move it to its own function - do it after all states are emitted - bump SI_MAX_DRAW_CS_DWORDS Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-03-16 12:54:19 +01:00
Marek Olšák	567c8d7300	radeonsi: don't emit PA_SC_LINE_STIPPLE if not rendering lines Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-03-16 12:54:19 +01:00
Marek Olšák	1f4bb38264	radeonsi: don't emit PA_SC_LINE_STIPPLE after every rasterizer state change Do it only when the line stipple state is changed. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-03-16 12:54:19 +01:00
Marek Olšák	f5832f3f9d	radeonsi: move PA_SU_SC_MODE_CNTL to rasterizer state This requires enabling the optional GL provoking vertex behavior for quads. + some cosmetic changes, so that the register is set exactly the same as on r600. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-03-16 12:54:19 +01:00
Marek Olšák	98a2398222	radeonsi: implement line and polygon smoothing Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-03-16 12:54:19 +01:00
Marek Olšák	303d23e10d	radeonsi: add shader code for smoothing The fragment shader multiplies the alpha channel with gl_SampleMaskIn. If blending is enabled, it looks like MSAA. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-03-16 12:54:19 +01:00
Marek Olšák	4f20a8f278	radeonsi: split sample locations into its own state atom Sample locations are not updated as often as framebuffers. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-03-16 12:54:18 +01:00
Marek Olšák	f7796a966d	radeonsi: add basic code for overrasterization This will be used for line and polygon smoothing. This is GCN-only even though it's in shared code. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-03-16 12:54:18 +01:00
Marek Olšák	1921fa4304	radeonsi: small cleanup in si_shader_selector_key Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-03-16 12:54:18 +01:00
Marek Olšák	52ff1edc51	radeonsi: simplify accessing alpha pointer in si_llvm_emit_fs_epilogue Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-03-16 12:54:18 +01:00
Marek Olšák	955ebf2890	radeonsi: add support for easy opcodes from ARB_gpu_shader5 I have to use the BFE intrinsics, because BFE is one of the most complex instructions that can't be matched easily. BFE has 3 conditional branches and one of them is quite big. In the isel DAG, lowered BFE has 27 nodes (including leaves).	2015-03-16 12:54:18 +01:00
Marek Olšák	755a2907a3	radeonsi: implement bit-finding opcodes from ARB_gpu_shader5 Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-03-16 12:54:18 +01:00
Marek Olšák	ca90cde81e	radeonsi: implement gl_SampleMaskIn Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-03-16 12:54:18 +01:00
Marek Olšák	f9fd0c4a55	radeonsi: add support for SQRT Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-03-16 12:54:18 +01:00
Marek Olšák	d73c1c1304	radeonsi: add support for FMA Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-03-16 12:54:18 +01:00
Marek Olšák	dfea35666e	gallium/radeon: don't use LLVMReadOnlyAttribute for ALU None of the instructions use a pointer argument. (+ small cosmetic changes) Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-03-16 12:54:18 +01:00
Marek Olšák	9da9c8e3f4	tgsi: handle bitwise opcodes in tgsi_opcode_infer_type (v2) v2: set the same types as the destination type in tgsi_exec Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-16 12:54:18 +01:00
Marek Olšák	216543ea54	gallium: add FMA and DFMA opcodes (v3) Needed by ARB_gpu_shader5. v2: select DMAD for FMA with double precision v3: add and select DFMA Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-16 12:54:18 +01:00
Rob Clark	e92bc6b38e	freedreno: update generated headers Fix a3xx texture layer-size. Signed-off-by: Rob Clark <robclark@freedesktop.org> Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>	2015-03-15 18:00:19 -04:00
Rob Clark	d3fb949c03	freedreno/ir3: remove old compiler Now that piglit is no longer falling back to old compiler for any tests, we can remove it. Hurray \o/ Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-15 13:27:03 -04:00
Rob Clark	feb858b788	freedreno/ir3: avoid scheduler deadlock Deadlock can occur if we schedule an address register write, yet some instructions which depend on that address register value also depend on other unscheduled instructions that depend on a different address register value. To solve this, before scheduling an address register write, ensure that all the other dependencies of the instructions which consume this address register are already scheduled. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-15 13:26:56 -04:00
Rob Clark	7208e96bb8	freedreno/ir3: bit of cleanup Add an array_insert() macro to simplify inserting into dynamically sized arrays, add a comment, and remove unused prototype inherited from the original freedreno.git/fdre-a3xx test code, etc. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-15 13:26:44 -04:00
Kenneth Graunke	db095eb43b	i965: De-duplicate is_expression_commutative() functions. Create a backend_inst::is_commutative() method to replace two static functions that did the exact same thing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-03-15 03:14:53 -07:00
Chris Forbes	f68a973dfb	i965/gen4-5: Cope with immutable-format texture revalidation This is unfortunately sometimes necessary due to rebasing levels when rendering into them. 16 piglits crash -> pass, when building mesa with debug enabled. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-14 15:55:17 +13:00
Emil Velikov	8ed1b65b62	docs: add news item and link release notes for mesa 10.5.1 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-13 23:36:33 +00:00
Emil Velikov	5f72847a88	docs: Add sha256 sums for the 10.5.1 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `2abba086ca`)	2015-03-13 23:35:02 +00:00
Emil Velikov	6c96608937	Add release notes for the 10.5.1 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `11c0ff60ef`)	2015-03-13 23:35:00 +00:00
Ilia Mirkin	620e29b748	freedreno: fix slice pitch calculations For example if width were 65, the first slice would get 96 while the second would get 32. However the hardware appears to expect the second pitch to be 64, based on halving the 96 (and aligning up to 32). This fixes texelFetch piglit tests on a3xx below a certain size. Going higher they break again, but most likely due to unrelated reasons. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2015-03-13 16:05:16 -04:00
Ilia Mirkin	89b26d5a36	freedreno/a3xx: use the same layer size for all slices We only program in one layer size per texture, so that means that all levels must share one size. This makes the piglit test bin/texelFetch fs sampler2DArray have the same breakage as its non-array version instead of being completely off, and makes bin/ext_texture_array-gen-mipmap start passing. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2015-03-13 16:05:16 -04:00
Ian Romanick	e76a8dc8ed	i965/vs: Add missing resolve_bool_comparison calls on GEN4 and GEN5 The ir_unop_any problem was discovered by some later optimization passes that generate ir_triop_csel. I was also able to reproduce it by modifying the gl-2.0-vertexattribpointer vertex shader to generate its result using color = mix(vec4(0, 1, 0, 0), vec4(1, 0, 0, 0), bvec4(any(greaterThan(diff, vec4(tolerance))))); instead of an if-statement. This also required using #version 130 and MESA_GLSL_VERSION_OVERRIDE=130. I have not nominated this for stable releases because I don't think there's any way to trigger the problem without GLSL 1.30 or optimizations that don't exist in stable. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Abdiel Janulgue <abdiel.janulgue@intel.com>	2015-03-13 12:57:32 -07:00
Chris Forbes	21ff9bfe1c	i965/disasm: Fix format strings Most of the brw_inst_* api returns 64bit values. This fixes disassembly of sampler messages, etc. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-14 07:51:18 +13:00
Chris Forbes	7c3095d6b7	i965/disasm: Mark format() as being printf-style. This allows us to get warnings from GCC when we mess up the format strings. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-14 07:50:48 +13:00
Matt Turner	97399fc751	docs: List ARB_shading_language_packing/EXT_shader_integer_mix. Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-03-13 10:42:38 -07:00
Matt Turner	8d3aa5926b	glsl: Expose built-in packing functions under GLSL 4.2. ARB_shading_language_packing is part of GLSL 4.2, not 4.0 as I mistakenly believed. The following functions are available only with ARB_shading_language_packing, GLSL 4.2 (not GLSL 4.0), or ES 3.0: - packSnorm2x16 - unpackSnorm2x16 - packHalf2x16 - unpackHalf2x16 Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-03-13 10:42:38 -07:00
Matt Turner	dac2e7deaa	egl: Create queryable strings in eglInitialize(). Creating/recreating the strings in eglQueryString() is extra work and isn't thread-safe, as exhibited by shader-db's run.c using libepoxy. Multiple threads in run.c call eglReleaseThread() around the same time. libepoxy calls eglQueryString() to determine whether eglReleaseThread() exists, and our EGL implementation passes a pointer to the version string to libepoxy while simultaneously overwriting the string, leading to a failure in libepoxy. Moreover, the EGL spec says (emphasis mine): "eglQueryString returns a pointer to a static, zero-terminated string" This patch moves some auxiliary functions from eglmisc.c to eglapi.c so that they may be used to create the extension, API, and version strings once during eglInitialize(). The auxiliary functions are renamed from _eglUpdate* to _eglCreate*, and some checks made unnecessary by calling the functions from eglInitialize() are removed. Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-03-13 10:42:38 -07:00
Samuel Iglesias Gonsalvez	b43bbfa90a	glsl: optimize (0 cmp x + y) into (-x cmp y). The optimization done by commit `34ec1a24d` did not take it into account. Fixes: dEQP-GLES3.functional.shaders.random.all_features.fragment.20 Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>	2015-03-13 16:40:20 +01:00
Eduardo Lima Mitev	cf6f33ee68	mesa: Check for valid PBO access in gl(Compressed)Tex(Sub)Image calls This patch adds two types of checks to the gl(Compressed)Tex(Sub)Image family of functions when a pixel buffer object is bound to GL_PIXEL_UNPACK_BUFFER: - That the buffer is not mapped. - The total data size is within the boundaries of the buffer size. It does so by calling auxiliary validation functions from the PBO API: _mesa_validate_pbo_source() for non-compressed texture calls, and _mesa_validate_pbo_source_compressed() for compressed texture calls. The first check is defined in Section 6.3.2 'Effects of Mapping Buffers on Other GL Commands' of the GLES 3.1 spec, page 57: "Any GL command which attempts to read from, write to, or change the state of a buffer object may generate an INVALID_OPERATION error if all or part of the buffer object is mapped. However, only commands which explicitly describe this error are required to do so. If an error is not generated, using such commands to perform invalid reads, writes, or state changes will have undefined results and may result in GL interruption or termination." Similar wording exists in GL 4.5 spec, page 76. In the case of gl(Compressed)Tex(Sub)Image(2,3)D, the specification doesn't force implementations to throw an error. However, since Mesa doesn't currently implement checks to determine when it is safe to read/write from/to a mapped PBO, we should always return the error if all or parts of it are mapped. The 2nd check is defined in Section 8.5 'Texture Image Specification' of the OpenGL 4.5 spec, page 203: "An INVALID_OPERATION error is generated if a pixel unpack buffer object is bound and storing texture data would access memory beyond the end of the pixel unpack buffer." Fixes 4 dEQP tests: * dEQP-GLES3.functional.negative_api.texture.compressedteximage2d_invalid_buffer_target * dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage2d_invalid_buffer_target * dEQP-GLES3.functional.negative_api.texture.compressedteximage3d_invalid_buffer_target * dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage3d_invalid_buffer_target Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>	2015-03-13 16:40:20 +01:00
Eduardo Lima Mitev	7c084752c6	mesa: Separate PBO validation checks from buffer mapping, to allow reuse Internal PBO functions such as _mesa_map_validate_pbo_source() and _mesa_validate_pbo_compressed_teximage() perform validation and buffer mapping within the same call. This patch takes out the validation into separate functions to allow reuse of functionality by other code (i.e, gl(Compressed)Tex(Sub)Image). Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>	2015-03-13 16:40:20 +01:00
Eduardo Lima Mitev	7b5bb97cef	mesa: Set the correct image size in _mesa_validate_pbo_access() _mesa_validate_pbo_access() provides a generic way to check that a requested pixel transfer operation on a PBO falls within the boundaries of the buffer. It is used in various other places, and depending on the caller, some arguments are used or not. In particular, the 'clientMemSize' argument is used only by calls that are knowledgeable of the total size of the user data involved in a pixel transfer, such as the case of compressed texture image calls. Other calls don't provide 'clientMemSize' directly since it is made implicit from the size and format of the texture, and its data type. In these cases, a sufficiently big value is passed to 'clientMemSize' (INT_MAX) to avoid an incorrect constraint. The problem is that _mesa_validate_pbo_access() uses uint pointers to make the calculations, which are 64 bits long on 64-bit platforms, while the dummy INT_MAX passed in 'clientMemSize' is just 32 bits. This causes a constraint that is not desired. This patch fixes that by checking that if 'clientMemSize' is INT_MAX, then UINTPTR_MAX is assumed instead. This is an ugly workaround to the fact that _mesa_validate_pbo_access() intends to be a one function fits all. The clean solution here would be to break it into different functions that provide the adequate API for each of the possible code paths and validation needs. Since there are callers relying on passing INT_MAX to 'clientMemSize', this patch is necessary to deal with the problem above while a cleaner implementation of the PBO API is not implemented. Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>	2015-03-13 16:40:20 +01:00
Eduardo Lima Mitev	f6f7bfb5e1	meta: Remove error checks for texture <-> pixel-buffer transfers that don't belong in driver code The implementation of texture <-> pixel-buffer transfers in drivers common layer includes certain error checks and argument validation that don't belong there, considering how the Mesa codebase is laid out. These are higher level validations that, if necessary, should be performed earlier (i.e, in GL API entry points). This patch simply removes these error checks from driver code. For more information, see discussion at http://lists.freedesktop.org/archives/mesa-dev/2015-February/077417.html. Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>	2015-03-13 16:40:20 +01:00
Brian Paul	558dcd8770	util: convert slab macros to inline functions Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-13 08:03:43 -06:00
Brian Paul	d24a20e967	egl: fix cast to silence compiler warning eglcurrent.c: In function '_eglSetTSD': eglcurrent.c:57:4: warning: passing argument 2 of 'tss_set' discards 'const' qualifier from pointer target type [enabled by default] tss_set(_egl_TSD, (const void *) t); ^ In file included from ../../../include/c11/threads.h:72:0, from eglcurrent.c:32: ../../../include/c11/threads_posix.h:357:1: note: expected 'void *' but argument is of type 'const void *' tss_set(tss_t key, void *val) ^ Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-13 08:03:43 -06:00
Alexandre Demers	a38e6c4fbd	gallivm: (trivial) Fix typo in comment introduced by 70dc8a Fix typo in comment introduced by 70dc8a Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-13 13:52:52 +00:00
Seán de Búrca	1a469a34d5	mesa: improve ARB_copy_image internal format compat check The memory layout of compatible internal formats may differ in bytes per block, so TexFormat is not a reliable measure of compatibility. For example, GL_RGB8 and GL_RGB8UI are compatible formats, but GL_RGB8 may be laid out in memory as B8G8R8X8. If GL_RGB8UI has a 3 byte-per-block memory layout, the existing compatibility check will fail. Additionally, the current check allows any two compressed textures which share block size to be used, whereas the spec gives an explicit table of compatible formats. v2: Use a switch instead of array iteration for block class and show the correct GL error when internal formats are mismatched. v3: Include spec citations for new compatibility checks, rearrange check order to ensure that compressed, view-compatible formats return the correct result, and make style fixes. Original commit message amended for clarity. v4: Reformatted spec citations. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-12 16:40:03 -07:00
Kenneth Graunke	f3e4b2c9d2	nir: Fix non-determinism in nir_lower_vars_to_ssa(). Previously, we stored derefs in a hash table, using the malloc'd pointer as the key. Then, we walked through the hash table and generated code, based on the order of the hash table's elements. Memory addresses returned by malloc are pretty much random, which meant that the hash was random, and the hash table's elements would be walked in some random order. This led to successive compiles of the same shader using different variable names and slightly different orderings of phi-nodes. Code could not be diff'd, and the final assembly would sometimes change slightly too. It turns out the only point of the hash table was to avoid inserting the same node multiple times for different dereferences. We never actually searched the hash table! This patch uses an intrusive linked list instead. Since exec_list uses head and tail sentinels, checking prev or next against NULL will tell us whether the node is already in the list. Pair programming with Jason Ekstrand. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-03-12 13:25:39 -07:00
Jason Ekstrand	67388c1ef2	util: Fix foreach_list_typed_safe when exec_node is not at offset 0. __next and __prev are pointers to the structure containing the exec_node link, not the embedded exec_node. NULL checks would fail unless the embedded exec_node happened to be at offset 0 in the parent struct. v2: Jason Ekstrand <jason.ekstrand@intel.com>: Use "(__node)->__field.next != NULL" to check for the end of the list instead of the "&__next->__field != NULL". The former is far more obviously correct as it matches what the non-safe versions do. The original code tried to avoid any use of __next as the client code may delete it during its execution. However, since the looping condition is checked after the iteration clause but before the client code is executed, we know that __node is valid during the looping condition. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-12 13:25:39 -07:00
Kenneth Graunke	547c760964	i965: Use NIR for scalar VS when INTEL_USE_NIR is set. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-12 08:29:49 -07:00
Kenneth Graunke	7ef0b6b367	i965/fs: Add VS output support to nir_setup_outputs(). Adapted from fs_visitor::visit(ir_variable *). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-12 08:29:49 -07:00
Kenneth Graunke	eb137117b7	i965/fs: Handle VS inputs in the NIR backend. (Jason noted that this is not a good long term solution, and we should instead improve nir_lower_io so that this extra set of MOVs is unnecessary. I tend to agree, but decided we could do that as a follow-up improvement.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-12 08:29:48 -07:00
Kenneth Graunke	a5c4e7fcf5	i965/fs: Refactor fs_visitor::nir_setup_inputs(). No functional change. In preparation for supporting vertex shaders, this adds a switch statement on shader stage (since vertex attributes and fragment shader varyings will need different handling). It also renames "varying" to "input", to be more general. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-12 08:29:48 -07:00
Kenneth Graunke	34628a838a	i965: Implement NIR intrinsics for loading VS system values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-12 08:29:48 -07:00
Kenneth Graunke	2c79f6f9c3	nir: Add intrinsics for SYSTEM_VALUE_BASE_VERTEX and VERTEX_ID_ZERO_BASE Ian and I added these around the time Connor was developing NIR. Now that both exist, we should make them work together! Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-12 08:29:48 -07:00
Kenneth Graunke	b9dea9bc45	i965/nir: Lower to registers a bit later. We can't safely call nir_optimize() with registers present, since several passes called in the loop can't handle registers, and will fail asserts. Notably, nir_lower_vec_alus() and nir_opt_algebraic() really don't want registers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-12 08:29:48 -07:00
Kenneth Graunke	1f0067811c	i965/nir: Optimize after nir_lower_var_copies(). Array variable copy splitting generates a bunch of stuff we want to clean up before proceeding. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-12 08:29:48 -07:00
Kenneth Graunke	1d8ef6ba60	i965/fs: Store a pointer to brw_sampler_prog_key_data in the visitor. The NIR backend hardcodes brw_wm_prog_key at the moment, which won't work when we support scalar VS. We could use get_tex(), but it's a static method. I was going to promote it to fs_visitor, but then realized that both parameters (stage and key) are already members. It then occurred to me that we could just set up a pointer in the constructor, and skip having a function altogether. This patch also converts all existing users to use key_tex. v2: Make key_tex a "const brw_sampler_prog_key_data *" instead of non-const; word-wrap some lines. (Review comments from Topi.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-12 08:29:48 -07:00
Brian Paul	48b0a3c1c9	tnl: HAVE_LE32_VERTS is never defined, remove associated code Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-12 07:52:45 -06:00
Brian Paul	6d3b86c3af	mesa: move LONGSTRING into generated enums.c enums.c is the only place this directive is needed. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-12 07:52:45 -06:00
Brian Paul	f8ed0bbfef	mesa: remove _ASMAPI, ASMAPIP Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-12 07:52:45 -06:00
Brian Paul	09ffa04cd9	mesa: remove _XFORMAPI Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-12 07:52:45 -06:00
Brian Paul	10035361b5	swrast: remove _BLENDAPI _BLENDAPI boils down to __cdecl on Windows, but __cdecl is the default calling convention so this serves no purpose. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-12 07:52:45 -06:00
Brian Paul	6ca5eaf49c	mesa: use ARRAY_SIZE in _mesa_QueryMatrixxOES() Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-12 07:52:45 -06:00
Brian Paul	c3984c1155	mesa: remove register keyword, add const in _mesa_QueryMatrixxOES() Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-12 07:52:45 -06:00
Brian Paul	97f6d50f72	mesa: reindent querymatrix.c Use 3-space indents, not 4. Move some comments after the case statements. Acked-by: Matt Turner <mattst88@gmail.com>	2015-03-12 07:52:45 -06:00
Brian Paul	be4e198be0	mesa: move fpclassify work-arounds into c99_math.h v2: Use #error in the #else clause, per Jose. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-12 07:52:35 -06:00
Jose Fonseca	70dc8a9930	gallivm: Prevent double delete on LLVM 3.6 std::unique_ptr takes ownership of MM, and a double delete could ensue in case of an error, as pointed out by Chris Vine in https://bugs.freedesktop.org/show_bug.cgi?id=89387 Reviewed-by: Chris Vine <chris@cvine.freeserve.co.uk>	2015-03-12 10:01:09 +00:00
Emil Velikov	30916a5ef0	autogen.sh: pass --force to autoreconf, quote ORIGDIR By passing --force autoreconf will update all the aux files, which would otherwise be ignored if one updates autoconf/automake. Quote the ORIGDIR variable to prevent fall-outs, when its name contains space. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-11 23:28:26 +00:00
Emil Velikov	a385d18598	glx: remove support for non-multithreaded platforms Implicitly required for a while, although commit `9385c592c6` (mapi: remove u_thread.h) was the one that put the final nail on the coffin. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-11 23:28:26 +00:00
Emil Velikov	42144170d1	glx: remove final reference to THREADS Left over from commit 18db13f5865(mapi: THREADS was always defined, remove it) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-11 23:28:26 +00:00
Emil Velikov	39f90e6b9b	configure: require pthreads for POSIX builds This has been an implicit rule for building mesa for a long time. Let's make it official and just bail out at configure time. This way we can clean up some of our glx code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-11 23:28:25 +00:00
Emil Velikov	a806df3f23	egl/main: convert thread management to use c11 threads Convert the code to use the C11 threads implementation, and nuke the Windows non-pthreads code-path. The c11/threads_win32.h abstraction should be better than the current code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-11 23:28:25 +00:00
Emil Velikov	efe87f1a80	egl/main: use c11/threads' mutex directly Remove the inline wrappers/abstraction layer. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-11 23:28:25 +00:00
Jason Ekstrand	90e50908d7	nir/worklist: Don't change the start index when computing the tail index Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2015-03-11 15:18:16 -07:00
Thomas Helland	8fb8fe46fa	nir: Optimize a + neg(a) Shader-db i965 instructions: total instructions in shared programs: 1711180 -> 1711159 (-0.00%) instructions in affected programs: 825 -> 804 (-2.55%) helped: 9 HURT: 0 GAINED: 3 LOST: 3 Shader-db NIR instructions: total instructions in shared programs: 606187 -> 606179 (-0.00%) instructions in affected programs: 298 -> 290 (-2.68%) helped: 4 HURT: 0 GAINED: 0 LOST: 0 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Thomas Helland <thomashelland90@gmail.com>	2015-03-11 14:21:05 -07:00
Thomas Helland	0525f2e851	nir: Optimize (a*b)+(a*c) -> a*(b+c) Shader-db i965 instructions: total instructions in shared programs: 1715894 -> 1710802 (-0.30%) instructions in affected programs: 443080 -> 437988 (-1.15%) helped: 1502 HURT: 13 GAINED: 4 LOST: 4 Shader-db NIR instructions: total instructions in shared programs: 607710 -> 606187 (-0.25%) instructions in affected programs: 208285 -> 206762 (-0.73%) helped: 769 HURT: 8 GAINED: 0 LOST: 0 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Thomas Helland <thomashelland90@gmail.com>	2015-03-11 14:21:05 -07:00
Marius Predut	09b0325409	vbo: improve the code style by adjusting the C preprocessor directives Brian Paul's review suggestion: there's more macro use here than necessary. Removed and redefined some #define preprocessing directives. Removed the directive input parameter 'T'. No functional changes. Signed-off-by: Marius Predut <marius.predut@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-11 09:34:25 -06:00
Brian Paul	9816acff2c	mesa: remove CPU_TO_LE32() for AIX This is the only remnant of AIX-specific code in Mesa. Probably long unused. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-11 09:34:25 -06:00
Brian Paul	3158b3abb3	mesa: remove #define __volatile Not actually used anywhere in Mesa. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-11 09:34:24 -06:00
Brian Paul	d7193ce42c	mesa: use strdup() instead of _mesa_strdup() We were already using strdup() in various places in Mesa. Get rid of the _mesa_strdup() wrapper. All the callers pass a non-NULL argument so the NULL check isn't needed either. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-11 09:34:24 -06:00
Brian Paul	5376bc74cc	st/glx: use strdup() instead of _mesa_strdup() Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-11 09:34:24 -06:00
Brian Paul	279c5965aa	xlib: use strdup() instead of _mesa_strdup() Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-11 09:34:24 -06:00
Brian Paul	14ba6c9325	i915: add parens to silence operator precedence warning Signed-off-by: Brian Paul <brianp@vmware.com>	2015-03-11 09:34:07 -06:00
Iago Toral Quiroga	6ac1bc90c4	i965: Fix out-of-bounds accesses into pull_constant_loc array The piglit test glsl-fs-uniform-array-loop-unroll.shader_test was designed to do an out-of-bounds access into a uniform array to make sure that we handle that situation gracefully inside the driver. However, as Ken describes in bug 79202, Valgrind reports that this is leading to an out-of-bounds access in fs_visitor::demote_pull_constants(). Before accessing the pull_constant_loc array we should make sure that the uniform we are trying to access is valid. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79202 Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-11 08:03:40 +01:00
Jordan Justen	5750595ca9	i965/gen6 gs: Convert brw_imm_ud/brw_imm_d to src_reg Same idea as this patch, only for gen6_gs_visitor: commit `49a938a265` Author: Jordan Justen <jordan.l.justen@intel.com> Date: Fri Feb 20 12:12:25 2015 -0800 i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data Suggested-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-10 00:14:53 -07:00
Jordan Justen	e5269ca28e	i965/fs: Use unsigned for CS/VS atomics pixel mask immediate data brw_imm_ud(0xffff) should have been converted to fs_reg(0xffffu) to make sure the uint32_t fs_reg constructor was matched. commit `49a938a265` Author: Jordan Justen <jordan.l.justen@intel.com> Date: Fri Feb 20 12:12:25 2015 -0800 i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-10 00:13:08 -07:00
Jordan Justen	6626e3548b	i965/gen8: Don't allocate hiz miptree structure We now skip allocating a hiz miptree for gen8. Instead, we calculate the required hiz buffer parameters and allocate a bo directly. v2: * Update hz_height calculation as suggested by Topi v3: * Bail if we failed to create the bo (Ben) v4: * CEILING => DIV_ROUND_UP * Make sure mt->logical_depth0 being 0 would not cause trouble * Fail if Y tiling is not returned Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67564 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-03-09 23:56:51 -07:00
Jordan Justen	81124aefe8	i965/gen7: Don't allocate hiz miptree structure We now skip allocating a hiz miptree for gen7. Instead, we calculate the required hiz buffer parameters and allocate a bo directly. v2: * Update hz_height calculation as suggested by Topi v3: * Bail if we failed to create the bo (Ben) v4: * CEILING => DIV_ROUND_UP * Make sure mt->logical_depth0 being 0 would not cause trouble * Fail if Y tiling is not returned Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67564 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-03-09 23:56:51 -07:00
Jordan Justen	31b851dccb	i965/gen8: Don't rely directly on the hiz miptree structure We are still allocating a miptree for hiz, but we only use fields from intel_miptree_aux_buffer. This will allow us to switch over to not allocating a miptree. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-03-09 23:56:51 -07:00
Jordan Justen	26eabd189d	i965/gen7: Don't rely directly on the hiz miptree structure We are still allocating a miptree for hiz, but we only use fields from intel_miptree_aux_buffer. This will allow us to switch over to not allocating a miptree. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-03-09 23:56:51 -07:00
Jordan Justen	aedcd466bb	i965/hiz: Start to separate miptree out from hiz buffers Today we allocate a miptree for the hiz buffer. We needed this in the past because we would point the hardware at offsets of the hiz buffer. Since the hiz format is not documented, this is not a good idea. Since moving to support layered rendering on Gen7+, we no longer point at an offset into the buffer on Gen7+. Therefore, to support hiz on Gen7+, we don't need a full miptree structure allocated. This patch starts to create a new auxiliary buffer structure (intel_miptree_aux_buffer) that can be a more simplistic miptree side-band buffer associated with a miptree. (For example, to serve the needs of the hiz buffer.) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-03-09 23:56:50 -07:00
Dave Airlie	4d318b61fc	mesa/scissor: fix typos in debug names Just noticed this when working on virgl. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-10 16:45:45 +10:00
Samuel Pitoiset	e5cd42ed9a	nvc0: fix wrong max value for driver queries The maximum value of a Gallium HUD's panel is automatically adjusted when the current value is greater than the max. If we set the pipe_query_driver_info::max_value to UINT64_MAX, the maximum value is never adjusted and this results in a flat line instead of a pretty curve which is correctly scaled. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-09 20:47:05 -04:00
Vinson Lee	13f4963ed2	i965: Silence GCC maybe-uninitialized warning. brw_shader.cpp: In function ‘bool brw_saturate_immediate(brw_reg_type, brw_reg)’: brw_shader.cpp:618:31: warning: ‘sat_imm.brw_saturate_immediate(brw_reg_type, brw_reg)::<anonymous union>::ud’ may be used uninitialized in this function [-Wmaybe-uninitialized] reg->dw1.ud = sat_imm.ud; ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-09 17:28:39 -07:00
Vinson Lee	282f67becd	i915: Fix GCC unused-but-set-variable warning in release build. i915_fragprog.c: In function ‘i915ValidateFragmentProgram’: i915_fragprog.c:1453:11: warning: variable ‘k’ set but not used [-Wunused-but-set-variable] int k; ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-09 17:28:39 -07:00
Vinson Lee	5f759836ad	Add macro for unused function attribute. Suggested-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-09 17:28:39 -07:00
Ben Widawsky	7aba4ab1f3	meta: Plug memory leak It looks like this has existed since commit `f5a477ab76` Author: Ian Romanick <ian.d.romanick@intel.com> Date: Mon Dec 16 11:54:08 2013 -0800 meta: Refactor shader generation code out of mipmap generation path Valgrind was complaining on fbo-generatemipmap-formats v2: Instead, do the allocation after the early return block (v2) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-09 16:32:33 -07:00
Kenneth Graunke	e95969cd95	i965/fs: Don't issue FB writes for bound but unwritten color targets. We used to loop over all color attachments, and emit FB writes for each one, even if the shader didn't write to a corresponding output variable. Those color attachments would be filled with garbage (undefined values). Football Manager binds a framebuffer with 4 color attachments, but draws to it using a shader that only writes to gl_FragData[0..2]. This meant that color attachment 3 would be filled with garbage, resulting in rendering artifacts. Now we skip writing to it, fixing rendering. Writes to gl_FragColor initialize outputs[0..nr_color_regions-1] to GRFs, while writes to gl_FragData[i] initialize outputs[i]. Thanks to Jason Ekstrand for tracking this down. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86747 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: mesa-stable@lists.freedesktop.org	2015-03-09 16:07:04 -07:00
Kenneth Graunke	4ebeb71573	i965/fs: Make emit_shader_time_end() insert before EOT. Previously, we emitted the shader-time epilogue from emit_fb_writes(), during the middle of looping through color regions (or emit_urb_writes for the VS). This is duplicated several times and rather awkward. I need to fix a bug in our FB write handling, and it will be a lot easier if we move emit_shader_time_end() out of there. Now, we simply emit FB writes/URB writes, and subsequently have emit_shader_time_end() insert instructions before the final SEND with EOT. Not only is this simpler, it's actually a slight improvement: we now include the MOVs to set up the final FB write payload in our shader-time measurements. Note that INTEL_DEBUG=shader_time only exists on Gen7+, and uses send-from-GRF. (In the past, we might have hit trouble where both attempt to use MRFs for messages; that's not a problem now.) v2: Rebase on v3 of the previous patch and other shader_time fixes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> [v1] Acked-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2015-03-09 16:07:04 -07:00
Kenneth Graunke	e43af8d09f	i965/fs: Make get_timestamp() pass back the MOV rather than emitting it. This makes another part of the INTEL_DEBUG=shader_time code emittable at arbitrary locations, rather than just at the end of the instruction stream. v2: Don't lose smear! Caught by Topi Pohjolainen. v3: Don't set smear on the destination of the MOV. Thanks Topi! Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2015-03-09 16:07:04 -07:00
Kenneth Graunke	bea854c7f3	i965/fs: Make emit_shader_time_write return rather than emit. Instead of emit_shader_time_write, we now do emit(SHADER_TIME_ADD(...)). The advantage is that we can also insert a shader time write at an arbitrary location in the instruction stream, rather than being restricted to emitting at the end. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2015-03-09 16:07:04 -07:00
Kenneth Graunke	f1adc45dbe	i965/fs: Set smear on shader_time diff register. The ADD(diff, diff, fs_reg(-2u)) instruction reads diff, which is a width 1 register. We need to read it as <0,1,0> with a subreg of 0, which is what smear accomplishes. Fixes assertion: brw_eu_emit.c:285: validate_reg: Assertion `hstride == 0' failed. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2015-03-09 16:07:03 -07:00
Kenneth Graunke	ef9cc7d0c1	i965/fs: Set force_writemask_all on shader_time instructions. These computations don't have anything to do with the currently executing channels, so they should use force_writemask_all. This fixes assert failures. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2015-03-09 16:07:03 -07:00
Alexandre Demers	7a37d5c3a4	r600g: Use R600_MAX_VIEWPORTS instead of 16 Lets define R600_MAX_VIEWPORTS instead of using 16 here and there in the code when looping through viewports and scissors. It is easier to understand what this number represents. v2: Missed a case where R600_MAX_VIEWPORTS should have been used. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-03-09 23:02:05 +01:00
Ian Romanick	85df48b45a	i915: Remove unused IS_GEN2 macro Inspired by Damien's recent libdrm changes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-03-09 14:09:21 -07:00
Ian Romanick	07a062997a	i915: Remove (mostly) unused IS_915 macro Inspired by Damien's recent libdrm changes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-03-09 14:09:16 -07:00
Ian Romanick	117288dbf3	i915: Remove (mostly) unused IS_PNV, IS_PNVG, and IS_PNVGM macros Inspired by Damien's recent libdrm changes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-03-09 14:09:06 -07:00
Ian Romanick	19fda9fc83	i915: Remove IS_9XX macro Since the i915 / i965 split, IS_9XX just means IS_GEN3. Inspired by Damien's recent libdrm changes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-03-09 14:08:57 -07:00
Ian Romanick	6d41316b79	i915: Remove unused IS_MOBILE macro Inspired by Damien's recent libdrm changes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-03-09 14:08:49 -07:00
Ian Romanick	e7d94be1ec	i965: Don't write past the end of the application supplied buffer Both the AMD and Intel APIs provide a dataSize parameter, and this function would merrily ignore it. Neither API specifies what to do when the buffer isn't big enough. I take the easy route of writing all the complete bits of data that will fit. With more complete specs, we could probably do something different. I noticed this while looking into an unused parameter warning. The warning was actually useful! brw_performance_monitor.c: In function 'brw_get_perf_monitor_result': brw_performance_monitor.c:1261:37: warning: unused parameter 'data_size' [-Wunused-parameter] GLsizei data_size, ^ v2: Fix checks to include offset in the calculation. Noticed by Jan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>	2015-03-09 14:07:14 -07:00
Ian Romanick	78a211cee5	i965: Silence unused parameter warning All dd functions take a gl_context as the first parameter. Instead of removing it, just silence the warning. brw_performance_monitor.c: In function 'brw_new_perf_monitor': brw_performance_monitor.c:1354:41: warning: unused parameter 'ctx' [-Wunused-parameter] brw_new_perf_monitor(struct gl_context *ctx) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-03-09 14:07:14 -07:00
Ian Romanick	3a6a732c43	i965: Silence many 'static' is not at beginning of declaration warnings What a useful warning. #ThanksGCC brw_performance_monitor.c:153:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] const static struct gl_perf_monitor_counter gen5_raw_chaps_counters[] = { ^ brw_performance_monitor.c:185:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] const static int gen5_oa_snapshot_layout[] = ^ brw_performance_monitor.c:221:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] const static struct gl_perf_monitor_group gen5_groups[] = { ^ brw_performance_monitor.c:240:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] const static struct gl_perf_monitor_counter gen6_raw_oa_counters[] = { ^ brw_performance_monitor.c:281:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] const static int gen6_oa_snapshot_layout[] = ^ brw_performance_monitor.c:317:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] const static struct gl_perf_monitor_counter gen6_statistics_counters[] = { ^ brw_performance_monitor.c:332:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] const static int gen6_statistics_register_addresses[] = { ^ brw_performance_monitor.c:346:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] const static struct gl_perf_monitor_group gen6_groups[] = { ^ brw_performance_monitor.c:356:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] const static struct gl_perf_monitor_counter gen7_raw_oa_counters[] = { ^ brw_performance_monitor.c:402:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] const static int gen7_oa_snapshot_layout[] = ^ brw_performance_monitor.c:470:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] const static struct gl_perf_monitor_counter gen7_statistics_counters[] = { ^ brw_performance_monitor.c:493:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] const static int gen7_statistics_register_addresses[] = { ^ brw_performance_monitor.c:515:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] const static struct gl_perf_monitor_group gen7_groups[] = { ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-03-09 14:07:14 -07:00
Ian Romanick	c82c8b2201	i965/fs: Silence unused parameter warning I don't think opt_cmod_propagation_local ever used the fs_visitor. brw_fs_cmod_propagation.cpp:52:40: warning: unused parameter 'v' [-Wunused-parameter] opt_cmod_propagation_local(fs_visitor *v, bblock_t *block) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-03-09 14:07:14 -07:00
Ian Romanick	f9779e4a8f	i965/fs: Silence unused parameter warning Unused since `b18fd23`. brw_fs.cpp:2878:44: warning: unused parameter 'dispatch_width' [-Wunused-parameter] clear_deps_for_inst_src(fs_inst *inst, int dispatch_width, bool *deps, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-03-09 14:07:13 -07:00
Ian Romanick	e4f26acc08	i965/fs: Silence unused parameter warning brw_fs_visitor.cpp:2162:56: warning: unused parameter 'offset_components' [-Wunused-parameter] fs_reg offset_value, unsigned offset_components, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-03-09 14:07:13 -07:00
Laura Ekstrand	1e552db522	main: Add entry point for TextureBufferRange. v2: Review by Martin Peres - Get rid of difficult-to-follow code copied and pasted from the original TexBufferRange Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-09 13:33:54 -07:00
Laura Ekstrand	311b3686fe	main: Add check_texture_buffer_target. Creates a shared function to ensure that texture buffer target is GL_TEXTURE_BUFFER. Helps to clean up the Tex[ture]Buffer[Range] functions. v2: Review from Anuj Phogat - Split rebase of Tex[ture]Buffer[Range] Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-09 13:33:54 -07:00
Laura Ekstrand	5f8c6eabbe	main: Add check_texture_buffer_range. Creates a shared function that TexBufferRange and TextureBufferRange can use to check the buffer range. This cleans up TexBufferRange considerably. v2: Review from Anuj Phogat - Split rebase of Tex[ture]Buffer[Range] Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-09 13:33:54 -07:00
Laura Ekstrand	0f6372946b	main: Cosmetic changes for Texture Buffers. Adds a useful comment and some whitespace. Fixes an error message. v2: Review from Anuj Phogat - Split rebase of Tex[ture]Buffer[Range] Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-09 13:33:54 -07:00
Laura Ekstrand	6b78a1fb89	main: Refactor _mesa_texture_buffer_range. Changes how the caller is identified in error messages, moves a check for ARB_texture_buffer_object from the entry points to the shared code in _mesa_texture_buffer_range, and removes an unused argument (GLenum target). v2: Review from Anuj Phogat - Split rebase of Tex[ture]Buffer[Range] Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-09 13:33:54 -07:00
Laura Ekstrand	d03337306a	main: Use _mesa_lookup_bufferobj_err to simplify Tex[ture]Buffer[Range]. v2: Review from Anuj Phogat - Split rebase of Tex[ture]Buffer[Range] - Closing curly brace on the same line as else Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-09 13:33:54 -07:00
Laura Ekstrand	768ca8b83e	main: Add utility function _mesa_lookup_bufferobj_err. This function is exposed to mesa driver internals so that texture buffer objects and array objects can use it. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-09 13:33:53 -07:00
Laura Ekstrand	ff011340a4	main: Checking for cube completeness in GetCompressedTextureImage. v2: Review from Anuj Phogat - Remove redundant copies of the cube map block comment - Replace redundant "if (!texImage) return;" statements with assert(texImage) Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-09 13:33:53 -07:00
Laura Ekstrand	4080c330fa	main: Add TEXTURE_CUBE_MAP support for glCompressedTextureSubImage3D. v2: Review from Anuj Phogat - Remove redundant copies of the cube map block comment - Replace redundant "if (!texImage) return;" statements with assert(texImage) Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-09 13:33:53 -07:00
Laura Ekstrand	70eab80f80	main: assert(texImage) in ARB_DSA texture cube map functions. ARB_direct_state_access functions that deal with texture cube maps need to make sure that texture images are not NULL before operating on them. In the following cases, the error check functions already throw an error if texImage == NULL, so an assert can be raised instead. v2: Review from Anuj Phogat - Replace redundant "if (!texImage) return;" statements with assert(texImage) Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-09 13:33:53 -07:00
Laura Ekstrand	c3e92faeb4	main: Remove redundant copy of cube map block comment in GetTextureImage. The comment describing why ARB_direct_state_access texture cube map functions use _mesa_cube_level_complete is very long. To save room in the files, readers are now referred to one central comment on texturesubimage in teximage.c. v2: Review from Anuj Phogat - Remove redundant copies of the cube map block comment Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-09 13:33:53 -07:00
Laura Ekstrand	8979368f12	main: Remove redundant NumLayers checks. ARB_direct_state_access texture functions that operate on cube maps no longer need to verify that cube map texture objects contain six texture images because _mesa_cube_level_complete now does that for them. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-09 13:33:53 -07:00
Laura Ekstrand	1ee000a0b6	main: _mesa_cube_level_complete checks NumLayers. _mesa_cube_level_complete now verifies that a cube map texture object actually has six texture images before proceeding. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-09 13:33:53 -07:00
Marek Olšák	c939231e72	r300g: fix sRGB->sRGB blits Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>	2015-03-09 21:22:22 +01:00
Marek Olšák	9953586af2	r300g: fix a crash when resolving into an sRGB texture Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>	2015-03-09 21:03:49 +01:00
Marek Olšák	113601086d	r300g: use memset for clearing the shader key	2015-03-09 20:58:32 +01:00
Marek Olšák	4815c187b7	r300g: remove the broken SNORM->UNORM shader lowering pass Not used anymore.	2015-03-09 20:58:32 +01:00
Marek Olšák	74a757f92f	r300g: fix RGTC1 and LATC1 SNORM formats Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>	2015-03-09 20:58:32 +01:00
Stefan Dösinger	f710b99071	r300g: Fix the ATI1N swizzle (RGTC1 and LATC1) This fixes the GL_COMPRESSED_RED_RGTC1 part of piglit's rgtc-teximage-01 test as well as the precision part of Wine's 3dc format test (fd.o bug 89156). The Z component seems to contain a lower precision version of the result, probably a temporary value from the decompression computation. The Y and W component contain different data that depends on the input values as well, but I could not make sense of them (Not that I tried very hard). GL_COMPRESSED_SIGNED_RED_RGTC1 still seems to have precision problems in piglit, and both formats are affected by a compiler bug if they're sampled by the shader with a swizzle other than .xyzw. Wine uses .xxxx, which returns random garbage. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89156 Signed-off-by: Marek Olšák <marek.olsak@amd.com> Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>	2015-03-09 20:58:32 +01:00
Tom Stellard	51b43c559f	radeonsi: Add additional information to shader dumps This adds SGPR count, VGPR count, shader size, LDS size, and scratch usage to shader dumps. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-03-09 13:53:33 +00:00
Tom Stellard	bbfa1c3239	radeonsi/compute: Use value from compiler for COMPUTE_PGM_RSRC1.FLOAT_MODE Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-03-09 13:53:33 +00:00
Tom Stellard	a646b00cfc	clover: Return the minimum required value for CL_DEVICE_SINGLE_FP_CONFIG v2 This means dropping CL_FP_DENORM from the current return value. v2: - Add comments about minimum values for OpenCL 1.2. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>	2015-03-09 13:53:33 +00:00
Ilia Mirkin	cb3eb43ad6	freedreno/ir3: get the # of miplevels from getinfo This fixes ARB_texture_query_levels to actually return the desired value. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org> Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>	2015-03-09 10:50:39 -04:00
Ilia Mirkin	8ac957a51c	freedreno/ir3: fix array count returned by TXQ Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org> Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>	2015-03-09 10:50:39 -04:00
Ilia Mirkin	f3dfe6513c	freedreno: move fb state copy after checking for size change Fixes: `1f3ca56b` ("freedreno: use util_copy_framebuffer_state()") Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org> Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>	2015-03-09 10:50:39 -04:00
Kenneth Graunke	b9c2fa15e3	nir: Make the printer include nir_variable::location too. Being able to see both location and driver_location can be useful when debugging IO mistakes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-03-09 01:34:03 -07:00
Iago Toral Quiroga	a72fb69604	i965/fs: Implement SIMD16 dual source blending. From the SNB PRM, volume 4, part 1, page 193: "The dual source render target messages only have SIMD8 forms due to maximum message length limitations. SIMD16 pixel shaders must send two of these messages to cover all of the pixels. Each message contains two colors (4 channels each) for each pixel in the message payload." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82831 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-09 08:15:13 +01:00
Kenneth Graunke	8dcc1f2c10	nir: Only do gl_FrontFacing workaround in glsl_to_nir for the FS. Vertex shaders can have shader inputs where location happens to be VARYING_SLOT_FACE. Without predicating this on the shader stage, we suddenly end up with load_front_face intrinsics in vertex shaders, which is nonsensical. Fixes spec/arb_vertex_buffer_object/pos-array when using NIR for VS. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-08 20:04:02 -07:00
Kenneth Graunke	c6f2abe67e	nir: Plumb the shader stage into glsl_to_nir(). The next commit needs to know the shader stage in glsl_to_nir(). To facilitate that, we pass the gl_shader rather than the raw exec_list of instructions. This has both the exec_list and the stage. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-08 20:04:01 -07:00
Kenneth Graunke	b200cbb0a4	nir: Add native_integers to nir_shader_compiler_options. glsl_to_nir, tgsi_to_nir, and prog_to_nir all want to know whether the driver supports native integers. Presumably other passes may as well. Adding this to nir_shader_compiler_options is an easy way to provide that information, as it's accessible via nir_shader::options. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-08 20:03:57 -07:00
Kenneth Graunke	a55da73be4	nir: Try to make sense of the nir_shader_compiler_options code. The code in glsl_to_nir is entirely dead, as we translate from GLSL to NIR at link time, when there isn't a _mesa_glsl_parse_state to pass, so every caller passes NULL. glsl_to_nir seems like the wrong place to try and create the shader compiler options structure anyway - tgsi_to_nir, prog_to_nir, and other translators all would have to duplicate that code. The driver should set this up once with whatever settings it wants, and pass it in. Eric also added a NirOptions field to ctx->Const.ShaderCompilerOptions[] and left a comment saying: "The memory for the options is expected to be kept in a single static copy by the driver." This suggests the plan was to do exactly that. That pointer was not marked const, however, and the dead code used a mix of static structures and ralloced ones. This patch deletes the dead code in glsl_to_nir, instead making it take the shader compiler options as a mandatory argument. It creates an (empty) options struct in the i965 driver, and makes NirOptions point to that. It marks the pointer const so that we can actually do so without generating "discards const qualifier" compiler warnings. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-03-08 20:03:46 -07:00
Kenneth Graunke	2561aea6b3	nir: Delete nir_shader::user_structures and num_user_structures. Nothing actually uses these, and the only caller of glsl_to_nir() (brw_fs_nir.cpp) always passes NULL for the _mesa_glsl_parse_state pointer, meaning they'll always be NULL and 0, respectively. Just delete them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-08 20:03:44 -07:00
Kenneth Graunke	9f1e250e77	glsl: Mark array access when copying to a temporary for the ?: operator. Piglit's spec/glsl-1.20/compiler/structure-and-array-operations/ array-selection.vert test contains the following code: gl_Position = (pick_from_a_or_b ? a : b)[i]; where "a" and "b" are uniform vec4[2] variables. ast_to_hir creates a temporary vec4[2] variable, conditional_tmp, and generates an if-block to copy one or the other: (declare (temporary) (array vec4 2) conditional_tmp) (if (var_ref pick_from_a_or_b) ((assign () (var_ref conditional_tmp) (var_ref a))) ((assign () (var_ref conditional_tmp) (var_ref b)))) However, we failed to update max_array_access for "a" and "b", so it remained 0 - here, the whole array is being accessed. At link time, update_array_sizes() used this bogus information to change the types of "a" and "b" to vec4[1]. We then had assignments from a vec4[1] to a vec4[2], which is highly illegal. This tripped assertions in nir_split_var_copies with scalar VS. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: mesa-stable@lists.freedesktop.org	2015-03-08 20:03:36 -07:00
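The failure mode described in 9f1e250e77 can be modeled in a few lines of C. This is a simplified sketch and not Mesa's code: `struct array_var`, `mark_whole_array_access()`, and `shrink_to_max_access()` are hypothetical names standing in for the per-variable bookkeeping and the link-time resize the message refers to.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-variable array bookkeeping. */
struct array_var {
   unsigned length;            /* declared number of elements */
   unsigned max_array_access;  /* highest index known to be used */
};

/* A whole-array copy touches every element, so record the last index. */
static void
mark_whole_array_access(struct array_var *var)
{
   if (var->length > 0 && var->max_array_access < var->length - 1)
      var->max_array_access = var->length - 1;
}

/* Link-time shrink: keep only the elements that are known to be used. */
static void
shrink_to_max_access(struct array_var *var)
{
   var->length = var->max_array_access + 1;
}
```

Without the marking step, a `vec4[2]` whose recorded maximum access is still 0 shrinks to one element, which is the type mismatch the commit message describes.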
Kenneth Graunke	a84f66a9b6	i965/nir: Resolve source modifiers on Gen8+ logic operations. On Gen8+, AND/OR/XOR/NOT don't support the abs() source modifier, and negate changes meaning to bitwise-not (~, not -). This isn't what NIR expects, so we should resolve the source modifiers via a MOV. +30 Piglits (fs-op-bit{and,or,xor}-not-abs-*). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2015-03-08 20:03:35 -07:00
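The difference between the two meanings of negate in a84f66a9b6 can be illustrated with a small model. This is a sketch under stated assumptions, not the i965 backend: `struct src_operand`, `resolve_source_modifiers()`, and `gen8_and()` are hypothetical names, and the "MOV" is modeled as a function that applies the modifiers arithmetically and returns a plain operand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical operand: raw bits plus the two source modifiers. */
struct src_operand {
   uint32_t bits;
   bool negate;
   bool abs;
};

/* Model of a MOV: apply abs, then negate, with their arithmetic meaning,
 * and hand back an operand with no modifiers left on it. */
static struct src_operand
resolve_source_modifiers(struct src_operand src)
{
   uint32_t v = src.bits;
   if (src.abs && (v & 0x80000000u))
      v = 0u - v;                      /* two's-complement absolute value */
   if (src.negate)
      v = 0u - v;                      /* arithmetic negation */
   struct src_operand out = { v, false, false };
   return out;
}

/* Model of a Gen8+ AND as the message describes it: abs is not supported,
 * and negate on a logic operation means bitwise-not. */
static uint32_t
gen8_and(struct src_operand a, struct src_operand b)
{
   assert(!a.abs && !b.abs);
   uint32_t x = a.negate ? ~a.bits : a.bits;
   uint32_t y = b.negate ? ~b.bits : b.bits;
   return x & y;
}
```

Passing a negated operand straight to `gen8_and()` yields the bitwise complement, while resolving it first yields the arithmetic negation that the message says NIR expects.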
Dave Airlie	7c25a4a84d	st/mesa: drop unused texture function This has no users. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-09 10:43:27 +10:00
Dave Airlie	c5e69409d7	mesa/st: remove unused TexData this isn't hooked up to anything at all from what I can see. Seems like a left over from commit 5d67d4fbebb(st/mesa: remove st_TexImage(), use core Mesa code instead). Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-03-09 09:48:49 +10:00
Rob Clark	fd17db6fe5	freedreno: replace glsl130 debug flag with glsl120 Now that relative-dst works, we should never fall back to the old compiler. (Which is almost true, other than a couple edge case sched fails in piglit). So replace glsl130 flag to force GLSL 130 and integers on a3xx/a4xx with a glsl120 flag to force GLSL 120 and !integers. If this commit breaks any game/app/etc use FD_MESA_DEBUG=glsl120 as a workaround and please let me know. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-08 17:42:43 -04:00
Rob Clark	0e8d58b80a	gallium/docs: add some freedreno compiler docs Enable the 'sphinx.ext.graphviz' extension, and add in a section for driver specific docs, with freedreno compiler docs beneath. The goal is for more complete compiler docs, and hopefully some docs about other parts of the driver (such as how tiling works, etc). Note that there is also a Distribution -> Drivers section. Although that appears to be simply just a list of drivers. Not sure if that should move under the 'Drivers' section or left alone. I did add a one-line section for freedreno in the existing Distribution -> Drivers section. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-08 17:42:43 -04:00
Rob Clark	060d349920	freedreno/ir3: relative dst To simplify RA, assign arrays that are written to first. Since enough dependency information is in the graph to preserve the order of reads and writes of an array, all SSA names for the array collapse into one, so just assign the entire thing by array-id. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-08 17:42:43 -04:00
Rob Clark	b7703212d8	freedreno/ir3: split out array_fanin() helper We'll need this too for relative dst.. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-08 17:42:43 -04:00
Rob Clark	17754b70d7	freedreno/ir3: drop deref nodes The meta-deref instruction doesn't really do what we need for relative destination. Instead, since each instruction can reference at most a single address value, track the dependency on the address register via instr->address. This lets us express the dependency regardless of whether it is used for dst and/or src. The foreach_ssa_src{_n} iterator macros now also iterate the address register so, at least in SSA form, the address register behaves as an additional virtual src to the instruction. Which is pretty much what we want, as far as scheduling/etc. TODO: For now, the foreach_src{_n} iterators are unchanged. We could wrap the address in an ir3_register and make the foreach_src_{_n} iterators behave the same way. But that seems unnecessary at this point, since we mainly care about the address dependency when in SSA form. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-08 17:42:43 -04:00
Rob Clark	f8f7548f46	freedreno/ir3: helpful iterator macros I remembered that we are using c99.. which makes some sugary iterator macros easier. So introduce iterator macros to iterate all src registers and all SSA src instructions. The _n variants also return the src #, since there are a handful of places that need this. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-08 17:42:43 -04:00
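A rough idea of what C99 makes possible for iterator macros like those in f8f7548f46: the loop variable can be declared inside the `for` statement, so the macro needs no declarations at the call site. The sketch below is illustrative only. `struct instr` and the macro bodies are assumptions and do not reproduce the ir3 definitions; only the general shape (a plain variant and an `_n` variant that also exposes the src number) follows the message.

```c
#include <assert.h>

/* Hypothetical instruction: regs[0] is the destination, the rest are srcs. */
struct instr {
   unsigned regs_count;
   int regs[4];
};

/* Iterate all src values; n is the src number (0-based, dst skipped).
 * Note that `break` in the body only leaves the inner one-shot loop. */
#define foreach_src_n(srcval, n, instr) \
   for (unsigned n = 0; n + 1 < (instr)->regs_count; n++) \
      for (int srcval = (instr)->regs[n + 1], once_##n = 1; once_##n; once_##n = 0)

/* Same, for callers that do not need the src number. */
#define foreach_src(srcval, instr) foreach_src_n(srcval, i_##srcval, instr)

static int
sum_srcs(const struct instr *in)
{
   int total = 0;
   foreach_src(r, in)
      total += r;
   return total;
}

static int
weighted_srcs(const struct instr *in)
{
   int total = 0;
   foreach_src_n(r, n, in)
      total += (int)n * r;
   return total;
}
```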
Rob Clark	26b79ac3e4	freedreno/ir3: fix register usage calculations For cat1 instructions, use reg() as well for relative src, to ensure proper accounting of register usage. Also, for relative instructions, use reg->size rather than reg->wrmask to determine the number of components read/written. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-08 17:42:43 -04:00
Rob Clark	3ecc834e75	freedreno/ir3: couple tweaks for cmdline compiler Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-08 17:42:43 -04:00
Rob Clark	0f797f7b7d	freedreno/ir3: split up ssa_dst And a couple other trivial renames, to prepare for relative dst. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-08 17:42:43 -04:00
Rob Clark	27648efa20	freedreno/ir3: fix failed assert in grouping Turns out there are scenarios where we need to insert mov's in "front" of an input. Triggered by shaders like: VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[9] DCL SAMP[0] DCL TEMP[0], LOCAL 0: MOV TEMP[0].xy, IN[1].xyyy 1: MOV TEMP[0].w, IN[1].wwww 2: TXF TEMP[0], TEMP[0], SAMP[0], 1D_ARRAY 3: MOV OUT[1], TEMP[0] 4: MOV OUT[0], IN[0] 5: END Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-08 17:42:43 -04:00
Jon TURNEY	72d4f6c67f	c99_alloca.h: Also use <alloca.h> for cygwin Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-07 18:18:32 +00:00
Vinson Lee	1ca39ec03c	i915: Fix GCC unused-variable warning in release build. i915_debug_fp.c: In function ‘i915_disassemble_program’: i915_debug_fp.c:302:11: warning: unused variable ‘size’ [-Wunused-variable] GLuint size = program[0] & 0x1ff; ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-03-06 21:41:46 -08:00
Mark Janes	b28c037d64	r300g: Fix build, invalid extern "C" around header inclusion. A previous patch to fix header inclusion within extern "C" neglected to fix the occurrences of this pattern in r300 files. When the helper to detect this issue was pushed to master, it broke the build for the r300 driver. This patch fixes the r300 build. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89477 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-06 22:08:44 -05:00
Mark Janes	c4b91a1f5c	nouveau: Fix build, invalid extern "C" around header inclusion. A previous patch to fix header inclusion within extern "C" neglected to fix the occurrences of this pattern in nouveau files. When the helper to detect this issue was pushed to master, it broke the build for the nouveau driver. This patch fixes the nouveau build. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89477 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-06 22:08:11 -05:00
Ilia Mirkin	20346808cf	nv50,nvc0: remove bogus 64_FLOAT formats There is no HW support for these and the VBO pusher doesn't know about them. No need to, either, since the st will be lowering them to 2x32. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-06 22:06:05 -05:00
Emil Velikov	1e5f833a0d	docs: add news item and link release notes for mesa 10.4.6/10.5.0 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-07 00:33:06 +00:00
Emil Velikov	ac9679b1c5	docs: Add sha256 sums for the 10.5.0 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `0d3e4ed134`)	2015-03-07 00:25:05 +00:00
Emil Velikov	b48774e7d8	docs: Update 10.5.0 release notes Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `97357d475f`)	2015-03-07 00:25:01 +00:00
Emil Velikov	19c5bee101	docs: Add sha256 sums for the 10.4.6 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `fc9dd495b2`)	2015-03-07 00:24:57 +00:00
Emil Velikov	9fe27c7b99	Add release notes for the 10.4.6 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `542a754524`)	2015-03-07 00:24:54 +00:00
Chia-I Wu	bca6c8572f	ilo: clarify valid and preferred tilings We did it right until the switch to gen_surface_tiling, which has GEN8_TILING_W. Generally, GEN8_TILING_W may be valid but not preferred.	2015-03-07 04:32:39 +08:00
Chia-I Wu	bf061a3d2e	ilo: clean up Gen6 WAs Add a helper function for each WA and make PIPE_CONTROL flags match the WA descriptions. Call gen6_wa_pre_pipe_contro() only before PIPE_CONTROLs. Fix missing gen6_wa_pre_3dstate_vs_toggle() in the rectlist path.	2015-03-07 02:17:54 +08:00
Chia-I Wu	ba5670fc50	ilo: add generic ilo_render_3dprimitive() It replaces gen[6-8]_3dprimitive().	2015-03-07 01:45:52 +08:00
Chia-I Wu	8b2eecfbf8	ilo: add generic ilo_render_pipe_control() It replaces gen[6-8]_pipe_control() and a direct gen6_PIPE_CONTROL() call in ilo_render_emit_flush().	2015-03-07 01:40:23 +08:00
Chia-I Wu	35b713ad75	ilo: fix padding of linear sampler views Should use the temporary variable in the loop instead of layout->bo_height.	2015-03-07 01:38:35 +08:00
Chia-I Wu	dda4823844	ilo: do not check for interleaved_samples interleaved_samples is only zero-initialized when layout_want_mcs() is called. We should not check for it. There is also no need to.	2015-03-07 01:38:35 +08:00
Emil Velikov	56ede80940	Revert "egl/main: use c11/threads' mutex directly" This reverts commit `6cee785c69`. Not meant to go in yet. Lacking review.	2015-03-06 17:07:40 +00:00
Emil Velikov	eb14d28e6d	Revert "egl/main: convert thread management to use c11 threads" This reverts commit `33eff85336`. Not meant to go in yet. Lacking review.	2015-03-06 17:07:34 +00:00
Emil Velikov	3b1d69910d	Revert "configure: require pthreads for POSIX builds" This reverts commit `50714cec2b`. Not meant to go in yet. Lacking review.	2015-03-06 17:07:29 +00:00
Emil Velikov	8f2eaae10c	Revert "glx: remove final reference to THREADS" This reverts commit `8b15a883e0`. Not meant to go in yet. Lacking review.	2015-03-06 17:07:23 +00:00
Emil Velikov	5e3276f5c7	Revert "glx: remove support for non-multithreaded platforms" This reverts commit `38591295cd`. Not meant to go in yet. Lacking review.	2015-03-06 17:07:11 +00:00
Emil Velikov	1c1fd82b4b	glx: remove unneeded ifdef _WIN32 guard The C99 header exists on other platforms as well. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-06 16:49:03 +00:00
Emil Velikov	3f16751639	util: rework _MSC_VER >= 1200 checks Replace the _MSC_VER >= 1200 with defined (_MSC_VER) and compact if/else statements. We require MSVC 2008 or later with commit 46110c5d564. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-06 16:48:50 +00:00
Emil Velikov	38591295cd	glx: remove support for non-multithreaded platforms Implicitly required for a while, although commit `9385c592c6` (mapi: remove u_thread.h) was the one that put the final nail on the coffin. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-06 16:46:18 +00:00
Emil Velikov	8b15a883e0	glx: remove final reference to THREADS Left over from commit 18db13f5865(mapi: THREADS was always defined, remove it) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-06 16:46:18 +00:00
Emil Velikov	50714cec2b	configure: require pthreads for POSIX builds This has been an implicit rule for building mesa for a long time. Let's make it official and just bail out at configure time. This way we can clean up some of our glx code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-06 16:46:17 +00:00
Emil Velikov	33eff85336	egl/main: convert thread management to use c11 threads Convert the code to use the C11 threads implementation, and nuke the Windows non-pthreads code-path. The c11/threads_win32.h abstraction should be better than the current code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-06 16:46:17 +00:00
Emil Velikov	6cee785c69	egl/main: use c11/threads' mutex directly Remove the inline wrappers/abstraction layer. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-06 16:46:17 +00:00
José Fonseca	bfb4db83b6	include: Add helper header to help trap includes inside extern C. This is just to help repro and fix these issues with any C++ compiler -- Committing this will of course wait until all issues are addressed. $ scons src/glsl/ scons: Reading SConscript files ... Checking for GCC ... yes Checking for Clang ... no Checking for X11 (x11 xext xdamage xfixes glproto >= 1.4.13)... yes Checking for XCB (x11-xcb xcb-glx >= 1.8.1 xcb-dri2 >= 1.8)... yes Checking for XF86VIDMODE (xxf86vm)... yes Checking for DRM (libdrm >= 2.4.38)... yes Checking for UDEV (libudev >= 151)... yes warning: LLVM disabled: not building llvmpipe scons: done reading SConscript files. scons: Building targets ... scons: building associated VariantDir targets: build/linux-x86_64-debug/glsl Compiling src/glsl/ast_array_index.cpp ... Compiling src/glsl/ast_expr.cpp ... Compiling src/glsl/ast_function.cpp ... Compiling src/glsl/ast_to_hir.cpp ... Compiling src/glsl/ast_type.cpp ... Compiling src/glsl/builtin_functions.cpp ... In file included from include/c99_compat.h:28:0, from src/mapi/u_compiler.h:4, from src/mapi/u_thread.h:47, from src/mapi/glapi/glapi.h:47, from src/mesa/main/mtypes.h:42, from src/mesa/main/errors.h:47, from src/mesa/main/imports.h:41, from src/mesa/main/core.h:44, from src/glsl/builtin_functions.cpp:58: include/no_extern_c.h:48:1: error: template with C linkage template<class T> class _IncludeInsideExternCNotPortable; ^ In file included from include/c99_compat.h:28:0, from include/c11/threads.h:38, from src/mapi/u_thread.h:49, from src/mapi/glapi/glapi.h:47, from src/mesa/main/mtypes.h:42, from src/mesa/main/errors.h:47, from src/mesa/main/imports.h:41, from src/mesa/main/core.h:44, from src/glsl/builtin_functions.cpp:58: include/no_extern_c.h:48:1: error: template with C linkage template<class T> class _IncludeInsideExternCNotPortable; ^ Compiling src/glsl/builtin_types.cpp ... Compiling src/glsl/builtin_variables.cpp ... scons: *** [build/linux-x86_64-debug/glsl/builtin_functions.os] Error 1 scons: building terminated because of errors. Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2015-03-06 12:38:55 +00:00
Iago Toral Quiroga	7f10e1678e	i965: free scratch buffers when destroying the context If scratch space is needed for a shader stage we try to reuse the last scratch buffer bound to that stage. If we can't, we free the old scratch buffer and allocate a new one. This means we always keep the last scratch buffer for a particular shader stage around for the entire life span of the context. These buffers are being reported by Valgrind as definitely lost after destroying the OpenGL context. For example, for the geometry shader stage: ==18350== 248 bytes in 1 blocks are definitely lost in loss record 85 of 150 ==18350== at 0x4C2CC70: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==18350== by 0xA1B35D6: drm_intel_gem_bo_alloc_internal (intel_bufmgr_gem.c:724) ==18350== by 0xA1B383F: drm_intel_gem_bo_alloc (intel_bufmgr_gem.c:794) ==18350== by 0xA1AEFA3: drm_intel_bo_alloc (intel_bufmgr.c:52) ==18350== by 0x9D08E31: brw_get_scratch_bo (brw_program.c:226) ==18350== by 0x9D2A0F2: do_gs_prog (brw_vec4_gs.c:280) ==18350== by 0x9D2A635: brw_gs_precompile (brw_vec4_gs.c:401) ==18350== by 0x9D14F68: brw_shader_precompile(gl_context, gl_shader_program) (brw_shader.cpp:76) ==18350== by 0x9D157B8: brw_link_shader (brw_shader.cpp:269) ==18350== by 0x9B0941E: _mesa_glsl_link_shader (ir_to_mesa.cpp:3038) ==18350== by 0x99AE4ED: link_program (shaderapi.c:917) ==18350== by 0x99AF365: _mesa_LinkProgram (shaderapi.c:1385) So make sure that by the time we destroy the context we check if we have live scratch buffers for the various stages and release them if that is the case. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-03-06 13:13:24 +01:00
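The lifetime issue in 7f10e1678e can be sketched as follows. This is an assumption-laden model, not the i965 code: `struct context`, `get_scratch_bo()`, `context_destroy()`, and the `live_bos` counter are hypothetical, and plain `calloc`/`free` stand in for buffer-object allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { STAGE_VS, STAGE_GS, STAGE_FS, STAGE_COUNT };

struct scratch_bo { size_t size; };

/* Hypothetical context: one cached scratch buffer per shader stage. */
struct context { struct scratch_bo *scratch[STAGE_COUNT]; };

static int live_bos = 0;   /* counts outstanding buffers so leaks show up */

static struct scratch_bo *
bo_alloc(size_t size)
{
   struct scratch_bo *bo = calloc(1, sizeof(*bo));
   if (bo) {
      bo->size = size;
      live_bos++;
   }
   return bo;
}

static void
bo_free(struct scratch_bo *bo)
{
   if (bo) {
      free(bo);
      live_bos--;
   }
}

/* Reuse the stage's last buffer if it is big enough, else replace it. */
static void
get_scratch_bo(struct context *ctx, int stage, size_t size)
{
   struct scratch_bo *old = ctx->scratch[stage];
   if (old && old->size >= size)
      return;
   bo_free(old);
   ctx->scratch[stage] = bo_alloc(size);
}

/* The fix: release whatever each stage still holds at destroy time. */
static void
context_destroy(struct context *ctx)
{
   for (int i = 0; i < STAGE_COUNT; i++) {
      bo_free(ctx->scratch[i]);
      ctx->scratch[i] = NULL;
   }
}
```

Because the last buffer of each stage is kept for reuse, only the destroy path can release it; skipping that loop leaves one buffer per used stage outstanding.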
Ville Syrjälä	970dc23603	i965: Fix URB size for CHV Increase the device info .urb.size for CHV to match the default URB size (192kB). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2015-03-06 11:50:49 +02:00
Samuel Iglesias Gonsalvez	ced9425327	configure: Introduce new output variable to ax_check_python_mako_module.m4 This output variable gives more flexibility for future changes in autoconf to detect if it is needed to auto-generate files and check for the auto-generation dependencies. It is still returning error when Python is not installed. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org>	2015-03-06 09:39:41 +01:00
Andrey Sudnik	0dfec59a27	i965/vec4: Don't lose the saturate modifier in copy propagation. Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89224 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-05 15:47:19 -08:00
Matt Turner	78df9d5e30	i965/vec4: Handle saturate in dump_instruction(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-05 15:47:18 -08:00
Chia-I Wu	ebad062e9a	ilo: enable L3 cache in MOCS This enables L3 cache in MOCS almost everywhere.	2015-03-06 04:50:19 +08:00
Chia-I Wu	c7d17f8a80	ilo: track if a ilo_view_surface is a scanout Scanouts require a different cache type.	2015-03-06 04:43:20 +08:00
Chia-I Wu	e7c74ef43d	ilo: clean up SURFACE_STATE and BINDING_TABLE_STATE Add ilo_builder_surface_pointer() to replace ilo_builder_surface_write(). Make Gen8+ take a different path in gen6_SURFACE_STATE().	2015-03-06 04:43:20 +08:00
Brian Paul	8b2c845ea0	mapi: actually remove unused u_thread.h I thought this was in the previous commit in the series. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-05 13:39:22 -07:00
Rob Clark	60096ed906	freedreno/ir3: fix silly typo for binning pass shaders Was resulting in gl_PointSize write being optimized out, causing particle system type shaders to hang if hw binning enabled. Fixes neverball, OGLES2ParticleSystem, etc. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-05 15:36:47 -05:00
Timothy Arceri	1a96d9ef1c	glsl: let interface linking code validate its arrays Currently intrastage arrays are validated twice for interface blocks. Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2015-03-06 07:26:57 +11:00
Timothy Arceri	c5a56a63f9	glsl: use common intrastage array validation Use common intrastage array validation for interface blocks. This change also allows us to support interface blocks that are arrays of arrays. V2: Reinsert unsized array asserts in interstage_match() Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2015-03-06 07:26:50 +11:00
Timothy Arceri	50859c688c	glsl: move array validation into its own function V2: return true when var->type is unsized but max access is within valid range Reviewed-by: Mark Janes <mark.a.janes@intel.com>	2015-03-06 07:26:41 +11:00
Kenneth Graunke	aa0705c06c	i965: Split Gen4-5 BlitFramebuffer code; prefer BLT over Meta. A while back I switched intel_blit_framebuffer to prefer Meta over the BLT. This meant that Gen8 platforms would start using the 3D engine for blits, just like we do on Gen6-7.5. However, I hadn't considered Gen4-5 when making that change. The BLT engine appears to be substantially faster on 965GM than using Meta to drive the 3D engine. This isn't too surprising: original Gen4 doesn't support tile offsets (that came on G45), and the level/layer fields don't work for cubemap rendering, so for inconvenient miplevel alignments, we end up blitting or copying data to/from temporaries in order to render to it. We may as well just use the blitter. I chose to use the BLT on Gen4-5 because they use the same ring for both 3D and BLT; Gen6+ splits it out. Fixes regressions on 965GM due to botched tile offset code (we should fix those properly as well, but they're longstanding bugs - for now, put things back to the status quo). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89430 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org>	2015-03-05 10:36:03 -08:00
Chia-I Wu	4ddd981e40	ilo: add more convenient intel_bo_{ref,unref}() They both check for NULL and intel_bo_ref() returns the referenced bo. They replace intel_bo_{reference,unreference}().	2015-03-06 02:25:03 +08:00
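The convenience described in 4ddd981e40 (both helpers tolerate NULL, and the ref helper returns its argument) might look roughly like this. The sketch uses hypothetical `example_bo_*` names and a plain integer refcount; it does not reproduce the ilo winsys implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer object with a simple, non-atomic reference count. */
struct example_bo { int refcount; };

static struct example_bo *
example_bo_create(void)
{
   struct example_bo *bo = calloc(1, sizeof(*bo));
   if (bo)
      bo->refcount = 1;
   return bo;
}

/* NULL-safe; returns the bo so it can be used in an assignment. */
static struct example_bo *
example_bo_ref(struct example_bo *bo)
{
   if (bo)
      bo->refcount++;
   return bo;
}

/* NULL-safe; frees the bo when the last reference goes away. */
static void
example_bo_unref(struct example_bo *bo)
{
   if (bo && --bo->refcount == 0)
      free(bo);
}
```

Returning the pointer lets a caller write `dst->bo = example_bo_ref(src->bo);` without a separate NULL check.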
Chia-I Wu	70ef171e91	ilo: add intel_bo_set_tiling() Make intel_winsys_alloc_bo() always allocate a linear bo, and add intel_bo_set_tiling() to set the tiling. Document the purpose of tiling.	2015-03-06 02:25:03 +08:00
Chia-I Wu	0ac706535a	ilo: replace intel_tiling_mode by gen_surface_tiling The former is used by the kernel driver to set up fence registers and to pass tiling info across processes. It lacks INTEL_TILING_W, which made our code less expressive.	2015-03-06 02:25:03 +08:00
Chia-I Wu	eb32ac1956	ilo: update genhw headers The main change is non-inline <enum>s are now generated as C enums.	2015-03-06 02:25:03 +08:00
Mark Janes	237dcb4aa7	Fix invalid extern "C" around header inclusion. System headers may contain C++ declarations, which cannot be given C linkage. For this reason, include statements should never occur inside extern "C". This patch moves the C linkage statements to enclose only the declarations within a single header. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-05 10:21:40 -08:00
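The header layout that 237dcb4aa7 asks for can be shown in a compact form. This is a generic sketch and not taken from any Mesa header; `example_count_set_bits()` is a made-up function. System includes sit outside the linkage block, and only the header's own declarations are wrapped.

```c
/* System headers stay outside extern "C": under a C++ compiler they may
 * declare templates or overloads, which cannot have C linkage. */
#include <assert.h>
#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Only this header's own declarations get C linkage. */
size_t example_count_set_bits(unsigned value);

#ifdef __cplusplus
}
#endif

/* Definition (would normally live in the matching .c file). */
size_t
example_count_set_bits(unsigned value)
{
   size_t n = 0;
   while (value) {
      n += value & 1u;
      value >>= 1;
   }
   return n;
}
```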
Matt Turner	2e4c95dfe2	i965: Tell intel_get_memcpy() which direction the memcpy() is going. The SSSE3 swizzling code was written for fast uploads to the GPU and assumed the destination was always 16-byte aligned. When we began using this code for fast downloads as well we didn't do anything to account for the fact that the destination pointer given by glReadPixels() or glGetTexImage() is not guaranteed to be suitably aligned. With SSSE3 enabled (at compile-time), some applications would crash when an SSE aligned-store instruction tried to store to an unaligned destination (or an assertion that the destination is aligned would trigger). To remedy this, tell intel_get_memcpy() whether we're uploading or downloading so that it can select whether to assume the destination or source is aligned, respectively. Cc: 10.5 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89416 Tested-by: Uriy Zhuravlev <stalkerg@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-05 10:18:28 -08:00
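The rule in 2e4c95dfe2 (assume the destination is aligned for uploads, the source for downloads) can be sketched without any SSE intrinsics. Everything here is hypothetical: `enum copy_direction`, `pointer_assumed_aligned()`, and `copy_checked()` are illustration names, and `memcpy` stands in for the swizzling fast path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum copy_direction { COPY_UPLOAD, COPY_DOWNLOAD };

static bool
is_aligned16(const void *p)
{
   return ((uintptr_t)p & 15u) == 0;
}

/* Uploads write into the mapped buffer, downloads read from it, so the
 * mapped side is the only pointer that may be assumed 16-byte aligned. */
static const void *
pointer_assumed_aligned(enum copy_direction dir,
                        const void *dst, const void *src)
{
   return dir == COPY_UPLOAD ? dst : src;
}

static void *
copy_checked(enum copy_direction dir, void *dst, const void *src, size_t n)
{
   /* A real fast path would use aligned loads or stores on this pointer
    * and unaligned ones on the other. */
   assert(is_aligned16(pointer_assumed_aligned(dir, dst, src)));
   return memcpy(dst, src, n);
}
```

With the direction passed in, a download into a misaligned user pointer is fine as long as the source mapping is aligned, which is the case the original code got wrong.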
Mark Janes	5f9ee6a02f	mesa/x86: missing stdio inclusions Several patches added include statements where required by the m64 build. Some files are only compiled for m32, and require similar changes. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-05 10:16:25 -08:00
Tom Stellard	c97e902a1a	clover: Enable cl_khr_fp64 for devices that support doubles v4 v2: - Report correct values for CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE and CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE. - Only define cl_khr_fp64 if the extension is supported. - Remove trailing space from extension string. - Rename device query function from cl_khr_fp64() to has_doubles(). v3: - Return 0 for device::doubled_fp_confg() when doubles aren't supported. v4: - Remove device query for double fp_config. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-03-05 14:07:37 +00:00
Emil Velikov	8d8ca64c28	xmlpool: make sure we ship options.h The header is included in ../xmlpool.h, which is used directly in a number of places in mesa. Note that we can also add it (alongside t_option.h) to noinst_HEADERS, but neither solution fixes the issue that brought us here - namely: Do not regenerate the header if it already exists. Cc: "10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-05 14:45:55 +00:00
Emil Velikov	fe5fddd7e2	mapi: fix *glapi dependency tracking I.e. add {shared-,}glapi/glapi_mapi_tmp.h to the SOURCES list. Otherwise there will be no knowledge that the file is required by others for the build. Thus autotools won't pick it up for the distribution tarball. v2: Don't forget about the static glapi. Spotted by Matt. Cc: "10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-05 14:45:55 +00:00
Emil Velikov	2c0f72d538	mesa: drop Makefile from get_hash.h dependency list Not required. Additionally this had the side effect of generating the file, despite its existence. Cc: "10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-05 14:45:55 +00:00
Emil Velikov	d22391cb16	mesa: fix dependency tracking of generated sources Some of the files generated were not in the SOURCES variable, thus although generated prior to compilation the dependency tracking was incomplete. The latter of which resulted in the files missing from the distribution tarball. Cc: "10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-05 14:45:54 +00:00
Emil Velikov	3f6c28f2a9	mesa: rename format_info.c to format_info.h The file is auto-generated, and #included by formats.c. Let's rename it to reflect the latter. This will also help us fix the dependency tracking by adding it to the _SOURCES variable, without the side effect of it being compiled (twice). v2: Update .gitignore to reflect the rename. Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-05 14:45:54 +00:00
Emil Velikov	abae3434c4	mesa/main: update .gitignore Drop the no longer present get_es{1,2}.c from the list. v2: Keep the format_info.c rename hunk out of this patch. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-05 14:45:54 +00:00
Emil Velikov	d1fbea038b	egl/main: remove no-longer needed definition of stdint types All the users directly include the header, plus we have in-tree replacements for non C99 compilers which we already use. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-05 14:45:54 +00:00
Emil Velikov	bf0e4d219a	egl/drivers: include stdint.h where needed Currently these files are including it indirectly via eglcompiler.h The latter of which will be removed with follow up commits. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-05 14:45:54 +00:00
Emil Velikov	74c40b9b56	egl/main: drop the declaration of PUBLIC keyword. Should no longer be used. As many places indirectly include eglcompiler.h keep this change separate, so that it can be easily reverted, if needed. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-05 14:45:54 +00:00
Emil Velikov	dd438ae34b	egl/main: no longer export internal function With the split of the gallium egl module we had previously it required access to some of the internal functions. As the only build (automake) that did this no longer builds it we can now appropriately hide those functions. Cc: 10.5 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-05 14:45:53 +00:00
Emil Velikov	d780012cd7	egl/main: replace __FUNCTION__ with __func__ The latter is a C99 standard, and our current wrapper c99_compat.h should handle non-compliant compilers. Drop the c99_compat.h inclusion from eglcompiler.h altogether, as it's no longer required. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-05 14:45:53 +00:00
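For reference, `__func__` is the predefined identifier standardized in C99, while `__FUNCTION__` is a compiler extension. A minimal illustration follows; the function name is made up for the example.

```c
#include <assert.h>
#include <string.h>

/* __func__ behaves like a static const char array holding the name of
 * the enclosing function, so returning a pointer to it is safe. */
static const char *
current_function_name(void)
{
   return __func__;
}
```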
Emil Velikov	7bd1693877	egl/main: replace INLINE with inline Drop the custom keyword in favour of the C99 one. All the places using it now directly include c99_compat.h which should handle things on platforms which lack it. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-05 14:45:53 +00:00
Brian Paul	9385c592c6	mapi: remove u_thread.h Just use c11 threads directly. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:43 -07:00
Brian Paul	262cd683e2	mapi: use c11 call_once() instead of pthread_once() Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:43 -07:00
Brian Paul	18db13f586	mapi: THREADS was always defined, remove it THREADS was defined if HAVE_PTHREADS or _WIN32 was defined. That's always the case. The build would die in c11/threads.h otherwise. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:43 -07:00
Brian Paul	fac77912b5	mesa: remove THREADS check, printf calls in debug.c THREADS is going away in the next commit. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:43 -07:00
Brian Paul	458c7490c2	mapi: rewrite u_current_init() function without u_thread_self() Remove u_thread_self() since u_thread.h is going away soon. Create a simple thread ID abstraction which wraps WIN32 or c11 threads. This also gets rid of the questionable casting of thrd_t to an unsigned long. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:43 -07:00
Brian Paul	6b5eb7bce6	mapi: fix preprocessor check in u_current_destroy() So it matches the preprocessor check around the u_current_init_tsd() code. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:43 -07:00
Brian Paul	c3f352e836	mapi: remove u_macros.h Only U_STRINGIFY() is used in entry.c Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:43 -07:00
Brian Paul	83926b8193	osmesa: include stdio.h Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:43 -07:00
Brian Paul	80524549f0	xlib: include stdio.h Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:43 -07:00
Brian Paul	8f1a11bfc4	st/osmesa: include stdio.h Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:43 -07:00
Brian Paul	8c68987d09	st/xlib: include stdio.h Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:43 -07:00
Brian Paul	68579c4a5c	st/xlib: include stdio.h Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:43 -07:00
Brian Paul	fe976ceb76	st/mesa: include stdio.h where needed Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:43 -07:00
Brian Paul	2655afc7e6	swrast: include stdio.h where needed Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:42 -07:00
Brian Paul	78ee6fdb23	nouveau: include stdio.h where needed Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:42 -07:00
Brian Paul	f330ab9383	dri/common: include stdio.h where needed Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:42 -07:00
Brian Paul	db9a088d32	glsl: include stdio.h where needed Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:42 -07:00
Brian Paul	db29869205	mesa: include stdio.h where needed Instead of relying on glapi.h or some other header to provide it. Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:42 -07:00
Brian Paul	028968a3ce	mesa: include c11/threads.h in mtypes.h Let's directly include c11/threads.h instead of relying on glapi.h to provide it. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-03-05 06:59:42 -07:00
Neil Roberts	7286a68991	meta: Fix the y offset for 1D_ARRAY in _mesa_meta_pbo_TexSubImage The yoffset needs to be interpreted as a slice offset for 1D array textures. This patch implements that by moving the yoffset into zoffset similar to how it moves the height into depth. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org>	2015-03-05 13:24:53 +00:00
Neil Roberts	a08bff1e98	meta: Allow GL_UN/PACK_IMAGE_HEIGHT in _mesa_meta_pbo_Get/TexSubImage Now that a layered source PBO is interpreted as a single tall 2D image it's quite easy to accept the image height packing option by just creating an image that is tall enough to include the image padding. I'm not sure whether the image height property should affect 1D_ARRAY textures. My intuition and interpretation of the GL spec (which is a bit vague) would be that it shouldn't. However the software fallback path in Mesa uses the property for packing but not for unpacking. The binary NVidia driver uses it for both. This patch doesn't use it for either case so it is different from the software fallback. There is some discussion about this here: http://lists.freedesktop.org/archives/mesa-dev/2015-February/077925.html This is tested by the texsubimage Piglit test with the array and pbo arguments. Previously this test was skipping this code path because it always sets the image height. I've also tested it by modifying the getteximage-targets test. It wasn't using this code path before because it was using the default texture object so this code couldn't successfully create a frame buffer. I also modified it to add some image padding with the image height in the PBO. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org>	2015-03-05 13:24:45 +00:00
Neil Roberts	7d10d2feee	Revert "common: Fix PBOs for 1D_ARRAY." This reverts commit `546aba143d`. I think the changes to the calls to glBlitFramebuffer from this patch are no different to what it was doing previously because it used to set height to 1 before doing the blits. However it was introducing some problems with the blit for layer 0 because this was no longer special cased. It didn't fix problems with the yoffset which needs to be interpreted as a slice offset. I think a better solution would be to modify the original if statement to cope with the yoffset. Conflicts: src/mesa/drivers/common/meta_tex_subimage.c Cc: "10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-05 13:23:10 +00:00
Vinson Lee	29c23644cc	glsl: Fix GCC unused-variable warning in release build. CXX ast_array_index.lo ast_array_index.cpp: In function ‘void update_max_array_access(ir_rvalue*, int, YYLTYPE*, _mesa_glsl_parse_state*)’: ast_array_index.cpp:86:30: warning: unused variable ‘interface_type’ [-Wunused-variable] const glsl_type *interface_type = ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2015-03-04 17:20:25 -08:00
Chia-I Wu	b5eb6f769d	ilo: improve WA handling in rectlist path Add wrappers for 3DPRIMITIVE to make sure we clear current_pipe_control_dw1 and deferred_pipe_control_dw1 after it. Add missing gen7_wa_post_ps_and_later().	2015-03-04 15:28:05 -07:00
Chia-I Wu	1424bdd61b	ilo: clean up Gen7.5 WAs These WAs gen7_wa_post_3dstate_push_constant_alloc_ps() gen7_wa_pre_vs() gen7_wa_pre_3dstate_sf_depth_bias() first half of gen7_wa_pre_depth() gen7_wa_post_ps_and_later() are Gen7-specific. Update copy-and-pasted gen8_wa_pre_depth() also.	2015-03-04 15:28:05 -07:00
Tom Stellard	a398168f72	clover: Fix build since llvm r231270	2015-03-04 13:10:56 -08:00
Chia-I Wu	68d2e395d9	ilo: add ILO_DEBUG=hang When set, detect and dump the hanging batch buffer.	2015-03-05 04:52:49 +08:00
Chia-I Wu	af4cff5d6f	ilo: add some more winsys functions Add intel_winsys_get_reset_stats(), intel_winsys_import_userptr(), and intel_bo_map_async(). The latter two are stubs, but we are not going to use them immediately either.	2015-03-04 13:42:17 -07:00
Matt Turner	1e128e9b69	i965/fs: Don't propagate cmod to inst with different type. Cc: 10.5 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89317 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-04 12:37:34 -08:00
Matt Turner	ade0b580e7	r300g: Check return value of snprintf(). Would have at least prevented the crash the previous patch fixed. Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540970 Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-03-04 11:15:09 -08:00
Matt Turner	f5e2aa1324	r300g: Use PATH_MAX instead of limiting ourselves to 100 chars. When built with Gentoo's package manager, the Mesa source directory exists seven directories deep. The path to the .test file is too long and is silently truncated, leading to a crash. Just use PATH_MAX. Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540970 Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-03-04 11:15:09 -08:00
Brian Paul	67e0a4f6e8	glx/tests: add -I src/ to fix make check Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-04 11:02:09 -07:00
Kristian Høgsberg	10c82c6c5f	i965: Fix uint64_t overflow in intel_client_wait_sync() DRM_IOCTL_I915_GEM_WAIT takes an int64_t for the timeout value but GL_ARB_sync takes a uint64_t. Further, the ioctl used to wait indefinitely when passed a negative timeout, but it's been broken and now returns immediately in that case. Thus, if an application passes UINT64_MAX to wait forever, we overflow to -1LL and return immediately. Work around this mess by clamping the wait timeout to INT64_MAX. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-03-04 09:55:31 -08:00
Daniel Stone	65c8965d03	egl: Take alpha bits into account when selecting GBM formats This fixes piglit when using PIGLIT_PLATFORM=gbm Tom Stellard: - Fix ARGB2101010 format Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-03-04 15:48:18 +00:00
Rob Clark	b709adf7cc	freedreno/ir3: fix old compiler after `f6b2e8af74` If first_driver_param is left as zero (calloc'd struct), the result is c0 getting clobbered. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-04 11:37:58 -05:00
Brian Paul	34ff9bc669	gallivm: init MM = NULL to silence warning Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-04 08:33:48 -07:00
Brian Paul	8aa9191878	mapi: remove u_compiler.h Just include c99_compat.h or util/macros.h where needed. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-04 08:33:48 -07:00
Brian Paul	4ab713423f	mapi: use util/macros.h instead of locally defined macros The next step is to get rid of u_compiler.h completely. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-04 08:33:48 -07:00
Brian Paul	41c87cc566	mapi: replace INLINE with inline Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-04 08:33:48 -07:00
Brian Paul	5bebd7099a	mesa: consolidate PUBLIC macro definition Define the macro in src/util/macros.h rather than in two different places. Note that USED isn't actually used anywhere at this time. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-04 08:33:48 -07:00
Brian Paul	25656753d7	st/xlib: include p_compiler.h to get PUBLIC definition To prevent build break with following changes. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-04 08:33:48 -07:00
Brian Paul	25a847d9cc	mapi: remove unneeded ARRAY_SIZE #define include util/macros.h instead. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-04 08:33:48 -07:00
Brian Paul	0339e7dbda	glx: use ARRAY_SIZE from macros.h Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-03-04 08:33:48 -07:00
Jose Fonseca	6e836d2c86	scons: Update for the fact that we require GCC 4.2 Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-04 15:12:22 +00:00
Jose Fonseca	d0b1c74b73	svga: Set MSVC2013 compat flags. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-04 15:12:19 +00:00
Jose Fonseca	2c25008e8e	softpipe,trace: Set MSVC 2008 compat flags. Although we don't deploy these, we need to use them for debugging. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-04 15:12:17 +00:00
Jose Fonseca	00faf9f000	scons: Use -Werror MSVC compatibility flags per-directory. Matching what we already do with autotools builds. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-03-04 15:12:06 +00:00
Jose Fonseca	3acd7a34ab	st/vega: Remove. OpenVG API seems to have dwindled away. The code would still be interesting if we wanted to implement NV_path_rendering but given the trend of the next gen graphics APIs, it seems unlikely that this becomes ARB or core. v2: Remove a few "openvg" references left, per Emil Velikov. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> v3: Update release notes.	2015-03-04 11:01:45 +00:00
Jose Fonseca	5564c361b5	st/egl: Remove. Largely superseded by src/egl, and WGL/GLX_EXT_create_context_es_profile extensions. Note this will break Android.mk with gallium drivers -- somebody familiar with that build infrastructure will need to update it to use gallium drivers through egl_dri2. v2: Remove the _EGL_BUILT_IN_DRIVER_GALLIUM define from src/egl/main/Android.mk; and update the src/egl/main/Sconscript to create a SharedLibrary, add versioning, create symlink - copy the bits from egl-static, per Emil Velikov. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> v3: Disallow undefined symbols in libEGL.so. Update release notes	2015-03-04 11:01:42 +00:00
Jose Fonseca	17b2825d76	windows/gdi: Remove. This classic driver is so far behind the Gallium softpipe/llvmpipe based one that it's hard to imagine it ever being useful. v2: Drop drivers/windows from src/mesa/Makefile.am:EXTRA_DIST per Emil Velikov. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> v3: Update release notes.	2015-03-04 11:01:38 +00:00
Jose Fonseca	40a4797384	nir: Use helper macros for dealing with VLAs. v2: - Single statement, by using memset return value as suggested by Ian Romanick. - No internal declaration, as suggested by Jason Ekstrand. - Move macros to a header. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-04 10:52:02 +00:00
Marc-Andre Lureau	073a5d2e84	gallium/auxiliary/indices: fix start param Since commit `28f3f8d`, the indices generators take a start parameter. However, some index values have been left to start at 0. This fixes the glean/fbo test with the virgl driver, and copytexsubimage with freedreno. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>	2015-03-04 00:15:22 -05:00
Vinson Lee	b77576edc1	scons: Define _DEFAULT_SOURCE. Fix GCC cpp warnings with glibc >= 2.19. /usr/include/features.h:148:3: warning: #warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" [-Wcpp] # warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-03-03 17:23:48 -08:00
Frank Henigman	e43729943e	intel: fix EGLImage renderbuffer _BaseFormat Correctly set _BaseFormat field when creating a gl_renderbuffer with EGLImage storage. Change-Id: I8c9f7302d18b617f54fa68304d8ffee087ed8a77 Signed-off-by: Frank Henigman <fjhenigman@google.com> Reviewed-by: Stéphane Marchesin <marcheu@chromium.org> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-03-03 10:58:42 -08:00
Rob Clark	8e67fd798e	freedreno/a4xx: re-enable int (conditional on glsl130) Re-enable integer, now that we can handle flat varyings. Still, ofc, conditional on FD_MESA_DEBUG=glsl130, until we can deprecate _old compiler.. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-03 10:41:00 -05:00
Rob Clark	e9f2abe349	freedreno/ir3: handle flat bypass for a4xx We may not need this for later a4xx patchlevels, but we do at least need this for patchlevel 0. Bypass bary.f for fetching varyings when flat shading is needed (rather than configure via cmdstream). This requires a special dummy bary.f w/ (ei) flag to signal to scheduler when all varyings are consumed. And requires shader variants based on rasterizer flatshade state to handle TGSI_INTERPOLATE_COLOR. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-03 10:41:00 -05:00
Rob Clark	9d732d3125	freedreno/ir3: add support for memory (cat6) instructions Scheduled basically the same as texture (cat5) instructions, using (sy) flag for synchronization. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-03 10:41:00 -05:00
Rob Clark	20b50a0712	freedreno/ir3: fix up cat6 instruction encodings I think there is at least one more sub-encoding, but these two should be enough to cover the common load/store instructions. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-03 10:41:00 -05:00
Rob Clark	4abb789bca	tgsi/lowering: don't forget interp for BCOLOR inputs To lower two sided color, tgsi_lowering creates additional BCOLOR inputs (matching up to the BCOLOR outputs on the vert shader). These inputs should copy the interpolation state of their matching COLOR input. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-03 10:41:00 -05:00
Rob Clark	583a8a8f65	freedreno/a3xx,a4xx: silence some warnings fd3_emit.c: In function ‘fd3_emit_vertex_bufs’: fd3_emit.c:377:11: warning: unused variable ‘semantic’ [-Wunused-variable] uint8_t semantic = sem2name(vp->inputs[i].semantic); and fd4_emit.c: In function ‘fd4_emit_vertex_bufs’: fd4_emit.c:304:11: warning: unused variable ‘semantic’ [-Wunused-variable] uint8_t semantic = sem2name(vp->inputs[i].semantic); Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-03-03 10:41:00 -05:00
Brian Paul	5ece288876	c99_alloca.h: add case for __sun Reviewed-by: Alan Coopersmith <alan.coopersmith@oracle.com>	2015-03-03 08:40:13 -07:00
Jose Fonseca	80c5bd7ef0	configure: Leverage gcc warn options to enable safe use of C99 features where possible. The main objective of this change is to enable Linux developers to use more of C99 throughout Mesa, with confidence that the portions that need to be built with MSVC -- and only those portions --, stay portable. This is achieved by using the appropriate -Werror= options only on the places they need to be used. Unfortunately we still need MSVC 2008 on a few portions of the code (namely llvmpipe and its dependencies). I hope to eventually eliminate this so that we can use C99 everywhere, but there are technical/logistic challenges (specifically, newer Windows SDKs no longer bundle MSVC, instead require a full installation of Visual Studio, and that has hindered adoption of newer MSVC versions on our build processes.) Thankfully we have more direct control over our OpenGL driver, which is why we're now able to migrate to MSVC 2013 for most of the tree. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-03 09:25:11 +00:00
Ben Widawsky	3d4d77a5dc	i965: Fix assertion in brw_reg_type_letters While using various debugging features (optimization debug, instruction dumping, etc) this function is called in order to get a readable letter for the type of unit. On GEN8, two new units were added, the Qword and the Unsigned Qword (Q, and UQ respectively). The existing assertion tries to determine that the argument passed in is within the correct boundary, however, it was using UQ as the upper limit instead of Q. To my knowledge you can only hit this case with the branch I am currently working on, so it doesn't fix any known issues. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-02 19:55:20 -08:00
Ben Widawsky	37c2687645	i965: Rename some PIPE_CONTROL flags I'm not really sure of the origins of the existing flag names. Modern docs have some slightly different names. Having the correct names makes it easier to determine if existing PIPE_CONTROL flag settings are correct, as well as making adding new PIPE_CONTROLs easier. This originally came up while I was trying to implement workarounds and spotted some things called, "flush" which should have been called "invalidate." Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-02 19:28:43 -08:00
Matt Turner	e214000f25	i965/fs: Don't use backend_visitor::instructions after creating the CFG. This is a fix for a regression introduced in commit `a9f8296d` ("i965/fs: Preserve the CFG in a few more places."). The errata this code works around is described in a comment before the function: "[DevBW, DevCL] Errata: A destination register from a send can not be used as a destination register until after it has been sourced by an instruction with a different destination register." The framebuffer write's sources must be in message registers, which SEND instructions cannot have as a destination. There's no way for this errata to affect anything at the end of the program. Just remove the code. Cc: 10.4, 10.5 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84613 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-03-02 18:13:28 -08:00
Jason Ekstrand	c4925d7f3b	main/base_tex_format: Properly handle STENCIL_INDEX1/4/16 This takes "fbo-stencil blit GL_STENCIL_INDEX1/4/16" from crash to pass on BDW. Cc: 10.5 <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-02 11:06:44 -08:00
Jason Ekstrand	b1ab02d9c0	meta/TexSubImage: Stash everything other than PIXEL_TRANSFER/store in meta_begin Previously, there were bugs where if the app set a scissor it could affect the area of the texture that was downloaded. There was also potential that the framebuffer SRGB state could affect downloads. This ensures that those will get saved/restored and can't affect the texture download. Cc: 10.5 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89292 Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-03-02 11:06:37 -08:00
Matt Turner	93a8c702a6	i915: Remove hand-rolled memcpy implementation. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-03-02 10:38:49 -08:00
Matt Turner	54d7925012	i965: Remove hand-rolled memcpy implementation. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-03-02 10:38:49 -08:00
Matt Turner	da20bf068e	i965: Consider scratch writes to have side effects. We could do better by tracking scratch reads and writes. Cc: 10.5 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88793 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-03-02 10:24:49 -08:00
Matt Turner	491d42135a	mesa: Correct backwards NULL check. Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-03-02 10:24:33 -08:00
Matt Turner	87109acbed	mesa: Free memory allocated for luminance in readpixels. Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-03-02 10:24:18 -08:00
Matt Turner	2b2fa18652	mesa: Indent break statements and add a missing one. Always indenting break statements makes spotting missing ones easier. Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-03-02 10:24:16 -08:00
Vinson Lee	3de01d2fe4	c99_alloca.h: Include stdlib.h on all non-Windows. Fix build on FreeBSD. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89364 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Tested-by: Brian Paul <brianp@vmware.com>	2015-03-02 09:26:36 -07:00
Brian Paul	6f0e9c2e39	mesa: remove extra definition of ARRAY_SIZE in src/mesa/main/macros.h Already defined in src/util/macros.h Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-02 08:55:31 -07:00
Brian Paul	e1437d6c0a	mesa: remove the Elements() macro definition No longer used. Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-02 08:55:31 -07:00
Brian Paul	692bd4a1ab	util: replace Elements() with ARRAY_SIZE() Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-03-02 08:55:31 -07:00
Brian Paul	6633271159	radeon: replace Elements() with ARRAY_SIZE() Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-02 08:55:31 -07:00
Brian Paul	9775dbc335	r200: replace Elements() with ARRAY_SIZE() Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-02 08:55:30 -07:00
Brian Paul	ea760c2090	nouveau: replace Elements() with ARRAY_SIZE() Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-02 08:55:30 -07:00
Brian Paul	49a7f8c919	i965: replace Elements() with ARRAY_SIZE() Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-02 08:55:30 -07:00
Brian Paul	b565771003	i915: replace Elements() with ARRAY_SIZE() Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-02 08:55:30 -07:00
Brian Paul	0a77ffcd5a	mapi: replace Elements() with ARRAY_SIZE() Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-02 08:55:30 -07:00
Brian Paul	c16c719647	glsl: replace Elements() with ARRAY_SIZE() Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-02 08:55:30 -07:00
Brian Paul	70b401029c	st/dri: replace Elements() with ARRAY_SIZE() Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-02 08:55:30 -07:00
Brian Paul	2f0143ca96	st/mesa: replace Elements() with ARRAY_SIZE() Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-02 08:55:30 -07:00
Brian Paul	c7136ff646	mesa/program: replace Elements() with ARRAY_SIZE() Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-02 08:55:30 -07:00
Brian Paul	16f7b77275	mesa/swrast: replace Elements() with ARRAY_SIZE() Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-02 08:55:30 -07:00
Brian Paul	766f5cf8f8	mesa/vbo: replace Elements() with ARRAY_SIZE() Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-02 08:55:30 -07:00
Brian Paul	c2e130f820	mesa/main: replace Elements() with ARRAY_SIZE() We've been using a mix of these two macros for a while now. Let's just use the latter everywhere. It seems to be the convention used by other open-source projects. Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-03-02 08:55:30 -07:00
Brian Paul	cd6db1989a	mesa: trim down #includes in api_loopback.h Acked-by: Matt Turner <mattst88@gmail.com>	2015-03-02 08:55:30 -07:00
Brian Paul	775049b6ad	mesa: trim down includes of compiler.h In some cases, glheader.h is the right #include. Also remove some instances of struct _glapi_table declarations. Acked-by: Matt Turner <mattst88@gmail.com>	2015-03-02 08:55:30 -07:00
Jose Fonseca	fa5140bb18	scons: Fix HAVE___* definition. These definitions must be moved before `cppdefines` is used to have effect. Trivial.	2015-03-02 14:23:51 +00:00
Jose Fonseca	9a07435ff8	identity: Remove. It's unmaintained, and most likely broken: I use trace driver every now and then, and every time I do I need to fix it up. It's also unused: identity_screen_create is never called. Above all, it's dead weight: if identity driver had the infrastructure for other pass-through drivers (like trace and rbug), then it would make sense in its own right. But as it is implemented, it's just another driver to (forget) to update whenever there is a gallium interface change. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-03-02 14:12:46 +00:00
Francisco Jerez	7bfbaf4a5a	i965: Remove the create_raw_surface vtbl hook. It's a wrapper around emit_buffer_surface_state with format=RAW, pitch=1, rw=true and the remaining arguments ordered differently. There's no point in having a separate vtbl pointer for that. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-03-02 14:33:13 +02:00
Francisco Jerez	65f9b83e05	i965: Add missing defines for render cache messages. And remove duplicated definition of OWORD_DUAL_BLOCK_WRITE. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2015-03-02 14:33:13 +02:00
Neil Roberts	cf67ca9ffa	i965/skl: Lay out a 1D miptree horizontally On Gen9+ the 1D miptree is laid out with all of the mipmap levels in a horizontal line. Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-03-02 11:57:37 +00:00
Neil Roberts	0f1e86afd6	i965/skl: Lay out 3D textures the same as array textures On Gen9+ the 3D textures use the same mipmap layout as 2D array textures. Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-03-02 11:57:37 +00:00
Neil Roberts	aef8a48979	i965/skl: Fix the maximum thread count format for the PS According to the bspec for some reason the format of the maximum number of threads field has changed from U8-2 to U8-1 for the PS. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-03-02 11:57:37 +00:00
Marek Olšák	27a34f62ba	draw: fix division-by-zero for empty geometry shaders Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89372 Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-03-02 12:46:36 +01:00
Chris Forbes	b51ff50a76	i965/gs: Check newly-generated GS-out VUE map against correct stage Previously, we compared our new GS-out VUE map to the existing VS-out VUE map, which is bogus. This would mostly manifest as redundant dirty flagging where the GS is in use but the VS and GS output layouts differ; but there is a scary case where we would fail to flag a GS-out layout change if it happened to match the VS-out layout. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.5, 10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88885	2015-03-01 11:13:35 +13:00
Brian Paul	213c41bf5d	i965: add GLSL_TYPE_DOUBLE switch case to silence warning Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-28 13:39:58 -07:00
Brian Paul	7783131a51	mesa: include macros.h in stencil.h Since it uses the CLAMP macro. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-28 13:39:58 -07:00
Brian Paul	8a25e73df3	mesa: move finite macro to imports.h Move it to the only place it's used. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-28 13:39:57 -07:00
Brian Paul	977c56df09	mesa: remove _NORMAPI, _NORMAPIP macros Was only used in one place. Use equivalent _XFORMAPIP there instead. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-28 13:39:57 -07:00
Brian Paul	61d344ebba	mesa: move FLT_MAX_EXP to c99_math.h Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-28 13:39:57 -07:00
Brian Paul	20dc94ba3c	mesa: move ONE_DIV_SQRT_LN2 to prog_statevars.c Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-28 13:39:57 -07:00
Brian Paul	cbf788a348	mesa: remove unused uninitialized_var() macro Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-28 13:39:57 -07:00
Matt Turner	e71a7f8013	mesa: Check return value of __get_cpuid(). The use of the uninitialized_var() macro was to silence an uninitialized variable warning that I assumed stemmed from gcc being unable to see inside __get_cpuid() or understand its inline assembly. In fact, it was because the __get_cpuid() function can fail, and not initialize its arguments. Instead, check for failure and return early. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-28 12:20:31 -08:00
Matt Turner	5666d9266f	i965/fs/nir: Mark fallthrough.	2015-02-28 10:46:41 -08:00
Matt Turner	54cd2f7c96	i965/fs/nir: Mark fallthrough.	2015-02-28 10:38:21 -08:00
Matt Turner	d528907fd2	i965: Avoid applying negate to wrong MAD source. For some given GLSL IR like (+ (neg x) (* 1.2 x)), the try_emit_mad function would see that one of the +'s sources was a negate expression and set mul_negate = true without confirming that it was actually a multiply. Cc: 10.5 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89315 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89095 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-27 20:24:12 -08:00
Matt Turner	43ef2657a0	i965/vec4: Fix implementation of i2b. I broke this in commit `2881b123d`. I must have misread i2b as b2i. Cc: 10.5 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88246 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-27 20:24:12 -08:00
Ian Romanick	b8a1637119	i965/fs/nir: Use emit_math for nir_op_fpow It appears that all the other instructions that need it already use it. This one just got missed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org>	2015-02-27 18:47:04 -08:00
Matt Turner	76cd0f00f4	mapi: Don't rely on GNU void pointer arithmetic. Commit `79daa510c` added -Werror=pointer-arith to CFLAGS, which makes arithmetic on void pointers an error. See https://gcc.gnu.org/onlinedocs/gcc/Pointer-Arith.html Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-27 16:57:10 -08:00
Kenneth Graunke	982723dfa2	Revert "configure: Leverage gcc warn options to enable safe use of C99 features where possible." This reverts commit `79daa510c7`. I apparently hadn't done a clean build when testing this; it broke the build for Tom, Ben, and myself. We like the idea; let's try a v2.	2015-02-27 16:13:10 -08:00
Jonathan Gray	7983a3d2e0	auxilary/os: correct sysctl use in os_get_total_physical_memory() The length argument passed to sysctl was the size of the pointer not the type. The result of this is sysctl calls would fail on 32 bit BSD/Mac OS X. Additionally the wrong pointer was passed as an argument to store the result of the sysctl call. Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-02-27 23:17:22 +00:00
Brian Paul	667dac9d40	glsl: silence uninitialized var warning on MinGW Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-27 15:22:25 -07:00
Brian Paul	bf8d049488	mesa: silence unused var warning in get_tex_rgba_uncompressed() Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-27 15:22:25 -07:00
Brian Paul	48f229d759	mesa: move declaration before code To fix MinGW warning. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-27 15:22:24 -07:00
Brian Paul	5b089e5f15	meta: silence declaration after code warning on MinGW Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-27 15:22:24 -07:00
Brian Paul	544f56b75a	meta: silence uninitialized variable warnings for MinGW Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-27 15:22:24 -07:00
Brian Paul	098e5bf3b3	c99_alloca.h: fix #include for MinGW As with MSVC, include malloc.h but don't redefine alloca. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89364 Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-02-27 15:22:24 -07:00
Brian Paul	943784bbcd	gallium/util: add debug_print_usage_enum() debug helper Signed-off-by: Brian Paul <brianp@vmware.com>	2015-02-27 15:22:04 -07:00
Brian Paul	b14cec0b8e	gallium/util: fix 'statement with no effect' warning Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-02-27 15:20:15 -07:00
Kenneth Graunke	53295bebc8	i965: Fix I/L/LA SNORM formats. _mesa_choose_tex_format (texformat.c) tries I8_SNORM, L8_SNORM, and either L8A8_SNORM or A8L8_SNORM, none of which are supported by our driver. Failing that, it falls back to RGBX for luminance, and RGBA for intensity and luminance alpha. So, we need to use swizzle overrides to obtain the correct values. Fixes Piglit's EXT_texture_snorm/fbo-blending-formats and fbo-clear-formats. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-02-27 11:36:27 -08:00
Kenneth Graunke	ea696be5ac	i965/fs: Patch the instruction generating discards; don't use CMP.Z. CMP.Z doesn't work on Gen4-5 because the boolean isn't guaranteed to be 0 or 0xFFFFFFFF - only the low bit is defined. We can call emit_bool_to_cond_code to generate the condition in f0.0; the last instruction will generate the flag value. We can patch it to use f0.1, and negate the condition. Fixes discard tests on Gen4-5. Haswell shader-db stats: total instructions in shared programs: 5770279 -> 5769112 (-0.02%) instructions in affected programs: 64342 -> 63175 (-1.81%) helped: 1069 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-27 11:36:24 -08:00
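The reason a compare-against-zero is unsafe when only the low bit of a boolean is defined can be modelled in plain C. This is a sketch of the logic only, not of the EU instruction sequence; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Whole-word test: only valid if a boolean is exactly 0 or 0xFFFFFFFF. */
bool is_false_whole_word(uint32_t b)
{
    return b == 0;
}

/* Low-bit test: valid when only bit 0 of the boolean is defined. */
bool is_false_low_bit(uint32_t b)
{
    return (b & 1u) == 0;
}

/* A "false" value whose undefined upper bits happen to be set. */
void check_bool_tests(void)
{
    uint32_t b = 0xFFFF0000u;
    assert(is_false_low_bit(b));
    assert(!is_false_whole_word(b));
}
```

With garbage in the upper bits the two tests disagree, which is the situation the commit describes for Gen4-5.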
Kenneth Graunke	4ebacf8aa6	i965/fs: Introduce brw_negate_cmod(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-27 11:36:08 -08:00
Laura Ekstrand	0fad07af9a	main: Fix whitespace in teximage.c. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-27 11:11:45 -08:00
Tom Stellard	da85ab4b65	radeonsi/compute: Enable PIPE_SHADER_CAP_DOUBLES v2 v2: - Simplify ifdef Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-27 14:57:52 +00:00
Tom Stellard	75514555aa	clover: Don't unconditionally define cl_khr_fp64 This should be done by the frontend for devices that support this extension. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-02-27 14:57:44 +00:00
Tom Stellard	ed07255149	pipe-loader: Fix build with dri drivers enabled, and vl state trackers disabled Configure arguments: ./configure --disable-dri3 --disable-xvmc --enable-opencl --with-gallium-drivers=r300,r600,radeonsi --with-egl-platforms=drm Build error: make[3]: *** No rule to make target `../../../../src/gallium/auxiliary/libgalliumvlwinsys.la', needed by `pipe_r300.la'. Stop. Cc: "10.5" <mesa-stable@lists.freedestkop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-02-27 14:51:33 +00:00
Jose Fonseca	79daa510c7	configure: Leverage gcc warn options to enable safe use of C99 features where possible. The main objective of this change is to enable Linux developers to use more of C99 throughout Mesa, with confidence that the portions that need to be built with MSVC -- and only those portions --, stay portable. This is achieved by using the appropriate -Werror= options only on the places they need to be used. Unfortunately we still need MSVC 2008 on a few portions of the code (namely llvmpipe and its dependencies). I hope to eventually eliminate this so that we can use C99 everywhere, but there are technical/logistic challenges (specifically, newer Windows SDKs no longer bundle MSVC, instead require a full installation of Visual Studio, and that has hindered adoption of newer MSVC versions on our build processes.) Thankfully we have more direct control over our OpenGL driver, which is why we're now able to migrate to MSVC 2013 for most of the tree. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-27 14:30:36 +00:00
Jose Fonseca	f320ecf218	nir: Use alloca instead of variable length arrays. This is to enable the code to build with -Werror=vla in the short term, and enable the code to build with MSVC2013 soon after. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-27 14:30:36 +00:00
Brian Paul	84a1e3d61e	mesa: restore #include stdarg.h in imports.h https://bugs.freedesktop.org/show_bug.cgi?id=89345 Signed-off-by: Brian Paul <brianp@vmware.com>	2015-02-27 07:04:49 -07:00
Brian Paul	06ed81044f	c99_math.h: add defines for M_PI, M_E, M_LOG2E Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89342 Signed-off-by: Brian Paul <brianp@vmware.com>	2015-02-27 07:04:49 -07:00
Vinson Lee	8170eba7e7	r300g/tests: Include stdio.h. Fix build error. CC compiler/tests/r300_compiler_tests-radeon_compiler_regalloc_tests.o compiler/tests/radeon_compiler_regalloc_tests.c: In function ‘test_runner_rc_regalloc’: compiler/tests/radeon_compiler_regalloc_tests.c:57:3: error: implicit declaration of function ‘fprintf’ [-Werror=implicit-function-declaration] fprintf(stderr, "Failed to load program\n"); ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2015-02-26 21:01:32 -08:00
Brian Paul	40cfa0c347	radeon/compiler: include stdio.h Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89343 Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-02-26 17:53:05 -07:00
Laura Ekstrand	549078cb5a	main: Fix target checking for CompressedTexSubImage*D. This fixes a dEQP test failure. In the test, glCompressedTexSubImage2D was called with target = 0 and failed to throw INVALID ENUM. This failure was caused by _mesa_get_current_tex_object(ctx, target) being called before the target checking. To remedy this, target checking was made into its own function and called prior to _mesa_get_current_tex_object. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89311 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-26 14:24:11 -08:00
Laura Ekstrand	ca65764d60	main: Fix target checking for CopyTexSubImage*D. This fixes a dEQP test failure. In the test, glCopyTexSubImage2D was called with target = 0 and failed to throw INVALID ENUM. This failure was caused by _mesa_get_current_tex_object(ctx, target) being called before the target checking. To remedy this, target checking was separated from the main error-checking function and called prior to _mesa_get_current_tex_object. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89312 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-26 13:31:59 -08:00
Brian Paul	688d7656c5	c99: in c99_math.h check that _USE_MATH_DEFINES is defined with MSVC Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-26 12:21:30 -07:00
Brian Paul	fb2ddef157	mesa: remove unused INLINE macro from compiler.h We now use 'inline' everywhere in Mesa. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-02-26 11:02:14 -07:00
Brian Paul	164b3cd757	st/mesa: replace INLINE with inline Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-02-26 11:02:14 -07:00
Brian Paul	0dc6b72455	swrast: replace INLINE with inline Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-02-26 11:02:14 -07:00
Brian Paul	f51f2af76d	radeon: replace INLINE with inline Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-02-26 11:02:14 -07:00
Brian Paul	bbedb85898	r200: replace INLINE with inline Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-02-26 11:02:13 -07:00
Brian Paul	8e9fe53ce9	i915: replace INLINE with inline Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-02-26 11:02:13 -07:00
Jose Fonseca	46110c5d56	include,auxiliary: Remove support for MSVC older than 2008. MSVC 2008 (shipped with Windows SDK 7.0.7600) is the oldest we need to support, at least on llvmpipe, gallium/auxiliary, and util modules. The remaining modules (in particular all OpenGL-specific code) can be built with MSVC 2013. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-02-26 16:53:16 +00:00
Brian Paul	fd090fdadd	mesa: don't include stdint.h in compiler.h Not needed. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-26 08:38:39 -07:00
Brian Paul	95855dd32f	mesa: don't include math.h in compiler.h Not needed by anything in that header. Include math.h or c99_math.h where needed instead. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-26 08:38:39 -07:00
Brian Paul	4f25a18011	mesa: trim down #includes in compiler.h Don't include stuff we don't need. Fix a few #includes elsewhere to keep things building. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-26 08:38:39 -07:00
Brian Paul	538e13d4a1	r300g: remove dependency on compiler.h It only needs typical stdio.h and stdlib.h functions. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-26 08:38:38 -07:00
Brian Paul	609cb60d4b	mesa: don't include limits.h in compiler.h Not needed. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-26 08:38:38 -07:00
Brian Paul	13730bcaf3	mesa: don't include float.h in compiler.h Not needed. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-26 08:38:38 -07:00
Brian Paul	ddf4b2e363	mesa: only include ctype.h where it's used Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-26 08:38:38 -07:00
Brian Paul	135b8c6530	mesa: include stdarg.h only where it's used Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-26 08:38:38 -07:00
Brian Paul	6b06697b0d	mesa: remove M_PI, M_E, M_LOG2E macro definitions Should be defined in math.h. If not, we can add them to c99_math.h Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-26 08:38:38 -07:00
Brian Paul	6cb431c19c	glsl: #include c99_math.h instead of core.h We only need the M_LOG2E definition. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-26 08:38:38 -07:00
Brian Paul	36ea81d067	gallium: whitespace, comment formatting fixes in p_defines.h Just to keep things consistent. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-02-26 08:38:38 -07:00
Brian Paul	e09fe38935	util: add debug_print_bind_flags() debug helper Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-02-26 08:38:38 -07:00
Brian Paul	2069f2c7fa	gallium: renumber PIPE_BIND_ flags Note that PIPE_BIND_COMMAND_ARGS_BUFFER and PIPE_BIND_LINEAR were both bit 21 before. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-02-26 08:38:38 -07:00
Neil Roberts	a44606eb81	meta: In pbo_{Get,}TexSubImage don't repeatedly rebind the source tex A layered PBO image is now interpreted as a single tall 2D image so the z argument in _mesa_meta_bind_fbo_image is ignored. Therefore this was just redundantly rebinding the same image repeatedly. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-26 12:04:21 +00:00
Marius Predut	1a93e7690d	mesa: use fi_type in vertex attribute code For 32-bit builds, floating point operations use x86 FPU registers, not SSE registers. If we're actually storing an integer in a float variable, the value might get modified when written to memory. This patch changes the VBO code to use the fi_type (float/int union) to store/copy vertex attributes. Also, this can improve performance on x86 because moving floats with integer registers instead of FP registers is faster. Neil Roberts review: - include changes on all places that are storing attribute values. - check with and without -O3 compiler flag. Brian Paul review: - use fi_type type instead of gl_constant_value type - fix a bunch of nit-picks. - fix compiler warnings Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82668 Signed-off-by: Marius Predut <marius.predut@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-25 16:35:49 -07:00
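The float/int union idea in the row above can be sketched as follows. The union below is a stand-in written for this note; its name and member names are assumptions and may not match Mesa's `fi_type`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a float/int union in the spirit of fi_type. */
typedef union {
    float f;
    int32_t i;
    uint32_t u;
} fi_sketch;

/* Copy attribute slots through the integer member, so the source
 * never asks for a floating-point load or store of the value. */
void copy_attribs(fi_sketch *dst, const fi_sketch *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    for (size_t n = 0; n < count; n++)
        dst[n].u = src[n].u;
}
```

Copying through `u` means an integer payload stored in an attribute slot (for example a bit pattern that happens to look like a NaN) is moved bit for bit rather than being treated as a float value.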
Anuj Phogat	4705346463	i965/gen8: Use HALIGN_16 if MCS is enabled for non-MSRT Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-02-25 14:11:59 -08:00
Anuj Phogat	84199fa647	i965: Pass pointer to miptree as function parameter in intel_horizontal_texture_alignment_unit This will be used by next patch in the series. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-02-25 14:11:53 -08:00
Anuj Phogat	94d88cb468	i965: Allocate texture buffer in intelTexImage before calling _mesa_meta_pbo_TexSubImage(). This will be used in later patches and will be required in Skylake to get the tile resource mode of miptree before calling _mesa_meta_pbo_TexSubImage(). Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-02-25 14:11:46 -08:00
Anuj Phogat	82f6d17300	i965: Make a function to check the conditions to use the blitter No functional changes in the patch. Just makes the code look cleaner. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-02-25 14:11:41 -08:00
Anuj Phogat	6960a3962c	i965: Move the comment to the right place Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-02-25 14:11:37 -08:00
Anuj Phogat	524a729f68	i965: Fix condition to use Y tiling in blitter in intel_miptree_create() Y tiling is supported in blitter on SNB+. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-02-25 14:11:32 -08:00
Anuj Phogat	688309374d	meta: Pass null pointer for the pixel data to avoid unnecessary data upload to a temporary pbo created in _mesa_meta_pbo_GetTexSubImage(). Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-02-25 14:11:28 -08:00
Anuj Phogat	068ba4ac78	meta: Fix buffer object assignment to account for both pack and unpack bo's create_texture_for_pbo() is shared by _mesa_meta_pbo_GetTexSubImage() and _mesa_meta_pbo_TexSubImage() functions. So, we need to account for both pack and unpack buffer objects. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-02-25 14:11:23 -08:00
Anuj Phogat	618c4c4b6a	meta: Use GL_STREAM_READ for pbo created with GL_PIXEL_PACK_BUFFER create_texture_for_pbo() is used by both _mesa_meta_pbo_GetTexSubImage() and _mesa_meta_pbo_TexSubImage() functions with different PBO targets. Use GL_STREAM_READ with GL_PIXEL_PACK_BUFFER and GL_STREAM_DRAW with GL_PIXEL_UNPACK_BUFFER. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-02-25 14:11:14 -08:00
Anuj Phogat	8d6ae49a8b	meta: Add assertion check for ctx->Meta->SaveStackDepth before using it for dereferencing. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-02-25 14:10:59 -08:00
Anuj Phogat	0a4ea87344	meta: Do power of two samples check only for samples > 0 otherwise samples=0 passes the check, which is invalid. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-02-25 14:10:47 -08:00
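One common way a zero sample count slips through a power-of-two test is the usual bit trick, which also accepts 0. The sketch below assumes that shape of check; it is not a quote of the meta code, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Classic bit trick: true for powers of two, but also true for 0. */
bool is_pow2_or_zero(unsigned v)
{
    return (v & (v - 1)) == 0;
}

/* Guarded form: reject 0 before applying the bit trick. */
bool is_valid_sample_count(unsigned samples)
{
    return samples > 0 && is_pow2_or_zero(samples);
}

/* Example consumer that relies on the guarded check. */
unsigned sample_count_log2(unsigned samples)
{
    unsigned shift = 0;
    assert(is_valid_sample_count(samples));
    while (samples > 1) {
        samples >>= 1;
        shift++;
    }
    return shift;
}
```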
Matt Turner	cb25087c7b	glsl: Rewrite and fix min/max to saturate optimization. There were some bugs, and the code was really difficult to follow. We would optimize min(max(x, b), 1.0) into max(sat(x), b) but not pay attention to the order of min/max and also do max(min(x, b), 1.0) into max(sat(x), b) Corrects four shaders from Champions of Regnum that do min(max(x, 1), 10) and corrects rendering of Mass Effect under VMware Workstation. Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89180 Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-25 08:44:49 -08:00
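The ordering problem in the row above can be checked numerically. The C sketch below uses scalar floats as a stand-in for the GLSL expressions; the helper names are hypothetical, and the sample inputs x = 0.5, b = 0.25 were chosen for this note.

```c
#include <assert.h>

float min_f(float a, float b) { return a < b ? a : b; }
float max_f(float a, float b) { return a > b ? a : b; }

/* sat(x) clamps x to [0, 1]. */
float sat(float x) { return min_f(max_f(x, 0.0f), 1.0f); }

/* The two source forms named in the commit message. */
float min_of_max(float x, float b) { return min_f(max_f(x, b), 1.0f); }
float max_of_min(float x, float b) { return max_f(min_f(x, b), 1.0f); }

/* The rewritten form, max(sat(x), b). */
float sat_rewrite(float x, float b) { return max_f(sat(x), b); }

void check_minmax_order(void)
{
    /* min(max(x, b), 1.0) matches the rewrite at these inputs... */
    assert(min_of_max(0.5f, 0.25f) == sat_rewrite(0.5f, 0.25f));
    /* ...but max(min(x, b), 1.0) does not: 1.0 versus 0.5. */
    assert(max_of_min(0.5f, 0.25f) == 1.0f);
    assert(sat_rewrite(0.5f, 0.25f) == 0.5f);
}
```

Applying the same rewrite to both orderings therefore changes the result for the second one, which is the kind of bug the commit message describes.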
Rob Clark	864340219b	freedreno: drop ARRAY_SIZE macro Since now ARRAY_SIZE has been added to util/macros.h. Fixes a bunch of: freedreno_util.h:79:0: warning: "ARRAY_SIZE" redefined #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0])) ^ In file included from ../../../../src/gallium/include/pipe/p_compiler.h:36:0, from ../../../../src/gallium/include/pipe/p_context.h:31, from freedreno_context.h:32, from freedreno_context.c:29: ../../../../src/util/macros.h:29:0: note: this is the location of the previous definition # define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x))) ^ Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-02-25 08:37:58 -05:00
Neil Roberts	67e3302497	i965: Don't force x-tiling for 16-bpp formats on Gen>7 Sandybridge doesn't support y-tiling for surface formats with 16 or more bpp. There was previously an override to explicitly allow this for Gen7. However, this restriction is also removed in Gen8+ so we should use y-tiling there too. This is important to do for Skylake which doesn't support x-tiling for 3D surfaces. Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-02-25 13:19:34 +00:00
Andreas Boll	6d164f65c5	glx: Fix returned values of GLX_RENDERER_PREFERRED_PROFILE_MESA If the renderer supports the core profile the query returned incorrectly 0x8 as value, because it was using (1U << __DRI_API_OPENGL_CORE) for the returned value. The same happened with the compatibility profile. It returned 0x1 (1U << __DRI_API_OPENGL) instead of 0x2. Internal DRI defines: dri_interface.h: #define __DRI_API_OPENGL 0 dri_interface.h: #define __DRI_API_OPENGL_CORE 3 Those two bits are supposed for internal usage only and should be translated to GLX_CONTEXT_CORE_PROFILE_BIT_ARB (0x1) for a preferred core context profile and GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB (0x2) for a preferred compatibility context profile. This patch implements the above translation in the glx module. v2: Fix the incorrect behavior in the glx module Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-25 08:23:38 +01:00
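The translation described in the row above can be sketched as a small mapping. The numeric values are the ones quoted in the commit message; the function name, the un-prefixed `DRI_API_*` macro names, and the 0 returned for any other input are assumptions made for this sketch.

```c
#include <assert.h>

/* Internal DRI API indices, as quoted in the commit message
 * (renamed here without the leading underscores). */
#define DRI_API_OPENGL      0
#define DRI_API_OPENGL_CORE 3

/* GLX profile bits, as quoted in the commit message. */
#define GLX_CONTEXT_CORE_PROFILE_BIT_ARB          0x1
#define GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB 0x2

/* Hypothetical translation from an internal DRI mask to a GLX bit. */
unsigned preferred_profile_to_glx(unsigned dri_mask)
{
    if (dri_mask == (1U << DRI_API_OPENGL_CORE))
        return GLX_CONTEXT_CORE_PROFILE_BIT_ARB;
    if (dri_mask == (1U << DRI_API_OPENGL))
        return GLX_CONTEXT_COMPATIBILITY_PROFILE_BIT_ARB;
    return 0; /* assumption: no preferred profile reported */
}

void check_profile_translation(void)
{
    assert(preferred_profile_to_glx(0x8) == 0x1);
    assert(preferred_profile_to_glx(0x1) == 0x2);
}
```

The untranslated values (0x8 for core, 0x1 for compatibility) are what the query returned before the fix; note that the internal compatibility mask collides with the GLX core bit.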
Andreas Boll	06924972d5	dri/common: Update comment about driQueryRendererIntegerCommon Since `87d3ae0b45` driQueryRendererIntegerCommon handles __DRI2_RENDERER_PREFFERED_PROFILE too. Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-25 08:23:33 +01:00
Ilia Mirkin	720ba6ca97	glsl: add double support for packing varyings Doubles are always packed, but a single double will never cross a slot boundary -- single slots can still be wasted in some situations. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-24 22:07:29 -05:00
Laura Ekstrand	546aba143d	common: Fix PBOs for 1D_ARRAY. Corrects the way that _mesa_meta_pbo_TexSubImage and _mesa_meta_pbo_GetTexSubImage handle 1D_ARRAY textures. Fixes a failure in the Piglit arb_direct_state_access/gettextureimage-targets test. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Tested-by: Laura Ekstrand <laura@jlekstrand.net> Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>	2015-02-24 17:33:44 -08:00
Laura Ekstrand	ccc5ce6f72	common: Correct PBO 2D_ARRAY handling. Changes PBO uploads and downloads to use a tall (height * depth) 2D texture for blitting. This fixes the bug where 2D_ARRAY, 3D, and CUBE_MAP_ARRAY textures are not properly uploaded and downloaded. Removes the option to use a 2D ARRAY texture for the PBO during upload and download. This option didn't work because the miptree couldn't be set up reliably. v2: Review from Jason Ekstrand and Neil Roberts: -Delete the depth parameter from create_texture_for_pbo -Abandon the option to create a 2D ARRAY texture in create_texture_for_pbo Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>	2015-02-24 17:30:13 -08:00
Laura Ekstrand	06084652fe	common: Correct texture init for meta pbo uploads and downloads. This moves the line setting immutability for the texture to after _mesa_initialize_texture_object so that the initializer function will not cancel it out. Moreover, because of the ARB_texture_view extension, immutable textures must have NumLayers > 0, or depth will equal (0-1)=0xFFFFFFFF during SURFACE_STATE setup, which triggers assertions. v2: Review from Kenneth Graunke: - Include more explanation in the commit message. - Make texture setup bug fixes into a separate patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>	2015-02-24 17:27:52 -08:00
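The `(0-1)=0xFFFFFFFF` remark in the row above is ordinary unsigned wraparound. A short sketch, with hypothetical function names and an assumed 32-bit field width:

```c
#include <assert.h>
#include <stdint.h>

/* Raw "layers minus one" encoding: wraps when num_layers is 0. */
uint32_t depth_minus_one(uint32_t num_layers)
{
    return num_layers - 1;
}

/* Guarded form: insist on at least one layer before encoding. */
uint32_t depth_minus_one_checked(uint32_t num_layers)
{
    assert(num_layers > 0);
    return num_layers - 1;
}
```

`depth_minus_one(0)` yields 0xFFFFFFFF, which is why a texture left with zero layers trips assertions later during state setup.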
Brian Paul	88ff8dee02	mesa: remove DEG2RAD macro Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 17:10:28 -07:00
Brian Paul	ab68219a59	mesa: remove MAX_GLUSHORT, move MAX_GLUINT The latter is only used in one place in swrast. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 17:10:28 -07:00
Brian Paul	f847ddb64d	mesa: move signbit() macro to c99_math.h Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 17:10:28 -07:00
Brian Paul	612143b2d0	mesa: remove unused isblank() function Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 17:10:28 -07:00
Brian Paul	e033d2c642	glcpp: remove unneeded #include of core.h isblank() is not used in the code. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 17:10:28 -07:00
Brian Paul	9fd7e9d831	mesa: remove sqrtf macro Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 17:10:28 -07:00
Kenneth Graunke	ee3f674572	i965: Remove redundant discard jumps. With the previous optimization in place, some shaders wind up with multiple discard jumps in a row, or jumps directly to the next instruction. We can remove those. Without NIR on Haswell: total instructions in shared programs: 5777258 -> 5775872 (-0.02%) instructions in affected programs: 20312 -> 18926 (-6.82%) helped: 716 With NIR on Haswell: total instructions in shared programs: 5773163 -> 5771785 (-0.02%) instructions in affected programs: 21040 -> 19662 (-6.55%) helped: 717 v2: Use the CFG rather than the old instructions list. Presumably the placeholder halt will be in the last basic block. v3: Make sure placeholder_halt->prev isn't the head sentinel (caught twice by Eric Anholt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-24 15:24:53 -08:00
Kenneth Graunke	30f51f1a1a	glsl: Optimize "if (cond) discard;" to a conditional discard. st_glsl_to_tgsi and ir_to_mesa have handled conditional discards for a long time; the previous patch added that capability to i965. i965 (Haswell) shader-db stats: Without NIR: total instructions in shared programs: 5792133 -> 5776360 (-0.27%) instructions in affected programs: 737585 -> 721812 (-2.14%) helped: 6300 HURT: 68 GAINED: 2 With NIR: total instructions in shared programs: 5787538 -> 5769569 (-0.31%) instructions in affected programs: 767843 -> 749874 (-2.34%) helped: 6522 HURT: 35 GAINED: 6 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-24 15:24:53 -08:00
Kenneth Graunke	8eb6c10999	i965/fs: Handle conditional discards. The discard condition tells us which channels we want killed. We want to invert that condition to get the channels that should survive (remain live) in f0.1. Emit a CMP to negate it. Nothing generates these today, but that will change shortly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-24 15:24:52 -08:00
Kenneth Graunke	8e62bd52f8	nir: Introduce nir_intrinsic_discard_if. This is a conditional discard, which takes a boolean source. Note that we don't generate ir_discard::condition today, so this shouldn't break drivers (since none implement this intrinsic yet). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-24 15:24:52 -08:00
Kenneth Graunke	23d42b46e3	glsl: Delete dead discard conditions in constant folding. opt_constant_folding() already detects conditional assignments where the condition is constant, and either deletes the assignment or the condition. Make it handle discards in the same fashion. Spotted happening in the wild in Tropico 5 shaders. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-24 15:24:52 -08:00
Kenneth Graunke	d77b186871	glsl: Handle conditional discards in lower_discard_flow(). This pass wasn't prepared to handle conditional discards. Instead of initializing the "discarded" temporary to "true", set it to the condition. Then, refer to the variable for the condition, to avoid duplicating the expression tree. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-24 15:24:52 -08:00
Kenneth Graunke	44b45da994	glsl: Make ir_rvalue_visitor visit ir_discard::condition. This was forgotten. I omitted the NULL check since we don't check ir_assignment::condition either. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-24 15:24:52 -08:00
Kenneth Graunke	926d8b0510	glsl: Make ir_validate check the type of ir_discard::condition. Copy and pasted from the ir_if::condition handling, plus a NULL check. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-24 15:24:52 -08:00
Matt Turner	6f5604601c	Revert "i965/fs: Remove force_writemask_all assertion for execsize < 8." This reverts commit `0d8f27eab7`. "This doesn't seem to be necessary." <- I was wrong! Tested-by: Markus Wick <markus@selfnet.de> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-02-24 14:08:04 -08:00
Matt Turner	2c7a703b05	i965/fs: Emit MOV(1) instructions with force_writemask_all. Fixes rendering with Dolphin. Tested-by: Markus Wick <markus@selfnet.de> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-02-24 14:08:04 -08:00
Matt Turner	467077b834	i965/fs: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0. total instructions in shared programs: 5695356 -> 5689775 (-0.10%) instructions in affected programs: 486231 -> 480650 (-1.15%) helped: 2604 LOST: 1	2015-02-24 14:08:04 -08:00
Matt Turner	b8582d18e6	i965/fs/nir: Optimize integer multiply by a 16-bit constant. Gen8+ support was just broken, since MUL now consumes 32-bits from both sources. Fixes 986 piglit tests on my BDW. total instructions in shared programs: 7753873 -> 7753522 (-0.00%) instructions in affected programs: 28164 -> 27813 (-1.25%) helped: 77 GAINED: 47 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-24 14:08:04 -08:00
Matt Turner	7a997a3863	i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0. total instructions in shared programs: 7756214 -> 7753873 (-0.03%) instructions in affected programs: 455452 -> 453111 (-0.51%) helped: 2333 Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-24 14:08:04 -08:00
Jason Ekstrand	c750ecaa12	nir/register: Add a parent_instr field This adds a parent_instr field similar to the one for ssa_def. The difference here is that the parent_instr field on a nir_register can be NULL if the register does not have a unique definition or if that definition does not dominate all its uses. We set this field in the out-of-SSA pass so that backends can get SSA-like information even after they have gone out of SSA. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-24 14:08:04 -08:00
Marek Olšák	fc59695b92	st/mesa: remove unused/broken function st_print_shaders Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-24 22:59:57 +01:00
Brian Paul	a86054bac7	st/mesa: remove struct qualifier from st_src_reg parameter It's a class. Silences MSVC warning.	2015-02-24 14:44:19 -07:00
Brian Paul	a2b366b92c	mesa: remove INV_SQRTF() macro Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 14:44:19 -07:00
Brian Paul	bbb2d84032	mesa: remove ceilf, floorf macros Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 14:44:19 -07:00
Brian Paul	bdd0402ca3	mesa: remove expf macro Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 14:44:19 -07:00
Brian Paul	cffedcf163	mesa: remove logf macro Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 14:44:19 -07:00
Brian Paul	f5816d77e2	mesa: remove powf macro Use the wrapper in c99_math.h if needed. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 14:44:19 -07:00
Brian Paul	bad154e677	mesa: remove unused exp2f, log2f, truncf wrappers Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 14:44:19 -07:00
Brian Paul	aeabf4ede5	mesa: remove unused acosf, asinf, atan2f, etc. macros Not used anywhere. If any of these are needed, they should be added to c99_math.h Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 14:44:19 -07:00
Brian Paul	bd7f7aac56	mesa: replace FABSF with fabsf Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 14:44:19 -07:00
Brian Paul	46ce78d4c6	mesa: replace FLOORF with floorf Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 14:44:19 -07:00
Brian Paul	b2c13534f7	mesa: remove unused CEILF macro Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 14:44:19 -07:00
Brian Paul	79b480ccc0	mesa: replace LOGF, EXPF with logf, expf Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 14:44:19 -07:00
Brian Paul	e25f7772ca	mesa: replace FREXPF, LDEXPF with frexpf, ldexpf Start getting rid of some imports.h macros. Use the c99 functions instead. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 14:44:19 -07:00
Brian Paul	e6eddbb96a	targets/libgl-xlib: add src/ include dir to fix build	2015-02-24 14:44:19 -07:00
Brian Paul	a55831e8fa	swrast: fix a few release build warnings	2015-02-24 14:44:19 -07:00
Marek Olšák	1180e61a1b	r600g,radeonsi: fix streamout after pipeline stats have been used EVENT_TYPE_PIPELINESTAT_STOP disables streamout queries too. Luckily, pipeline stats are enabled by default, so we don't even have to emit EVENT_TYPE_PIPELINESTAT_START. Tested on Hawaii, Bonaire, Redwood, RV730. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-24 21:21:04 +01:00
Marek Olšák	fdf2c04737	radeonsi: small cleanup around current_rast_prim - remove the last parameter of si_emit_rasterizer_prim_state - remove the last unused parameter of si_emit_draw_registers - use current_rast_prim in si_emit_draw_registers Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-24 21:21:04 +01:00
Marek Olšák	0b1f31ab7f	radeonsi: set current_rast_prim in the right place Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-24 21:21:04 +01:00
Marek Olšák	4eb0ccf9e7	radeonsi: simplify obtaining a shader property in si_emit_clip_regs Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-24 21:21:04 +01:00
Marek Olšák	5349437154	radeonsi: only preload VertexID for the GS copy shader The copy shader doesn't use any other preloaded VGPRs. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-24 21:21:04 +01:00
Marek Olšák	ffd701e677	radeonsi: dump the shader key when dumping shaders Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-24 21:21:04 +01:00
Marek Olšák	93daf5a2f6	r600g,radeonsi: cleanup of hex literals 0x3F800000 -> fui(1.0) 0x00000000 -> 0 Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-24 21:21:04 +01:00
Marek Olšák	fa913a2dc6	radeonsi: set PA_SU_HARDWARE_SCREEN_OFFSET to 0 It was probably 0 already, but it doesn't hurt to set it. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-24 21:21:04 +01:00
Marek Olšák	558f51f1c5	st/mesa: cleanup st_translate_geometry_program Mostly dead code or code that didn't do anything. Computing gs_num_outputs at the end was also useless. It's already set correctly. Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-24 21:21:04 +01:00
Marek Olšák	94746cadc0	st/mesa: inline st_free_tokens Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-24 21:21:04 +01:00
Marek Olšák	b039302fb7	st/mesa: cleanup st_geometry_program structure It's full of unused variables and variables only used in st_translate_geometry_program. Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-24 21:21:04 +01:00
Marek Olšák	002aa75022	mesa: add a missing GS support check in GetActiveUniformBlockiv Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-24 21:21:04 +01:00
Glenn Kennard	d80701df8a	r600g: Implement GL_ARB_draw_indirect for EG/CM Requires Evergreen/Cayman and radeon kernel module 2.41.0 or newer. Expected piglit fails due to hardware limitations: * arb_draw_indirect-draw-arrays-prim-restart Restarts not applied for DrawArrays commands * arb_draw_indirect-vertexid Base vertex offset is not included in vertex id Marek: bump vgt_state num_dw by 3 (= space needed for one register write) Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-02-24 21:21:04 +01:00
Rob Clark	dd70e78674	freedreno/a4xx: aniso filtering Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-02-24 14:23:38 -05:00
Rob Clark	c70097ae86	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-02-24 14:23:38 -05:00
Rob Clark	daccbd27ce	freedreno/a4xx: add ARB_instanced_arrays support Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-02-24 14:23:38 -05:00
Rob Clark	e13398714c	freedreno/a4xx: handle index_bias (i.e. base_vertex) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-02-24 14:23:38 -05:00
Rob Clark	283bb4848e	freedreno/a4xx: add support for vertexid and instanceid sysvals ir3 bits of it already in place from a3xx patch.. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-02-24 14:23:38 -05:00
Rob Clark	4aef0d79ee	freedreno/a4xx: pass number of instances to draw a4xx has its own draw packet, so needs equivalent update to what a3xx already got. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-02-24 14:23:38 -05:00
Emil Velikov	86d88e2fbb	docs: add news item and link release notes for mesa 10.4.5 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-02-24 16:10:52 +00:00
Emil Velikov	d60c628f2a	docs: Add sha256 sums for the 10.4.5 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `41bdeda102`)	2015-02-24 16:10:52 +00:00
Emil Velikov	1d761be43a	Add release notes for the 10.4.5 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `a5c608e951`)	2015-02-24 16:10:52 +00:00
Leo Liu	9c7b343bc0	st/omx/dec/h264: fix picture out-of-order with poc type 0 v2 poc counter should be reset with IDR frame, otherwise there would be a re-order issue with frames before and after IDR v2: add commit message Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>	2015-02-24 10:39:49 -05:00
Emil Velikov	fece147be5	install-lib-links: remove the .install-lib-links file With earlier commit (install-lib-links: don't depend on .libs directory) we moved the location of the file from .libs/ to the current dir. Although we did not attribute that in the former case autotools was doing us a favour and removing the file. Explicitly remove the file at clean-local time, otherwise we'll end up with dangling files. Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-02-24 15:33:25 +00:00
Francisco Jerez	f8f3aa78d8	clover: Set appropriate flag defaults on memory object creation. According to the spec when no device access mode is specified clCreateBuffer and clCreateImage* should default to read/write, and clCreateSubBuffer should default to the parent's device access flags. clCreateSubBuffer is also required to inherit the host access and host pointer flags from the parent. Reviewed-and-tested-by: EdB <edb+mesa@sigluy.net>	2015-02-24 16:18:14 +02:00
EdB	0e8460a528	clover: Add CL_MEM_HOST_* flag checks. Those flags have been introduced in OpenCL 1.2. [ Francisco Jerez: Rebase. Throw CL_INVALID_VALUE from clCreateSubBuffer if the subbuffer drops access flags from its parent. Use single function taking the set of allowed host access flags to validate memory transfer operands. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-02-24 16:17:18 +02:00
Francisco Jerez	80d3c1e537	clover: Factor out memory object flags validation to a helper function. And define constants for commonly used subsets of flags to save some typing. Reviewed-and-tested-by: EdB <edb+mesa@sigluy.net>	2015-02-24 16:15:48 +02:00
Eric Anholt	49d3c6a8e6	vc4: Update to current kernel sources. New BO create and mmap ioctls are added. The submit ABI gains a flags argument, and the pointers are fixed at 64-bit. Shaders are now fixed at the start of their BOs.	2015-02-24 13:49:12 +00:00
Eric Anholt	1d1e820a6d	r600: Fix build after `984f306937` Same as for the CLAMP macro, undef it before including a header file that tries to make fields with that name.	2015-02-24 13:49:12 +00:00
Tobias Klausmann	98ae01c822	st/nine: Mark end of non-void function unreachable Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-24 12:21:00 +00:00
Tobias Klausmann	984f306937	gallium: include util/macros.h The most common macros are defined there, no use to duplicate these Clean up the already redefined macros Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-24 12:20:59 +00:00
Alex Henrie	9913ce14e7	driconf: Update Catalan translation Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>	2015-02-24 09:03:45 +00:00
Alex Henrie	d28a4b523d	driconf: Update Spanish translation Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>	2015-02-24 09:03:45 +00:00
Eduardo Lima Mitev	0c47e5492b	mesa: Add missing error checks to GetProgramInfoLog, GetShaderInfoLog and GetProgramiv Fixes 3 dEQP tests: * dEQP-GLES3.functional.negative_api.state.get_program_info_log * dEQP-GLES3.functional.negative_api.state.get_shader_info_log * dEQP-GLES3.functional.negative_api.state.get_programiv Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-24 08:58:54 +01:00
Iago Toral Quiroga	fe74fee8fa	i965: Fix non-AA wide line rendering with fractional line widths "(...)Let w be the width rounded to the nearest integer (...). If the line segment has endpoints given by (x0,y0) and (x1,y1) in window coordinates, the segment with endpoints (x0,y0-(w-1)/2) and (x1,y1-(w-1)/2) is rasterized, (...)" The hardware is not rounding the line width, so we should do it. Also, we should be careful not to go beyond the hardware limits for the line width after it gets rounded. Gen6-7 define a maximum line width slightly below 8.0, so we should advertise a maximum line width lower than 7.5 to make sure that 7.0 is the maximum integer line width that we can select. Since the line width granularity in these platforms is 0.125, we choose 7.375. Other platforms advertise rounded maximum line widths, so those are fine. Fixes the following 3 dEQP tests: dEQP-GLES3.functional.rasterization.primitives.lines_wide dEQP-GLES3.functional.rasterization.fbo.texture_2d.primitives.lines_wide dEQP-GLES3.functional.rasterization.fbo.rbo_singlesample.primitives.lines_wide Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-24 08:58:54 +01:00
Iago Toral Quiroga	6148e3aae7	mesa: Fix ctx->Texture.CubeMapSeamless The intel driver code, and apparently all other Mesa drivers, call _mesa_initialize_context early in the CreateContext hook. That function will end up calling _mesa_init_texture which will do: ctx->Texture.CubeMapSeamless = _mesa_is_gles3(ctx); But this won't work at this point, since _mesa_is_gles3 requires ctx->Version to be set and that will not happen until late in the CreateContext hook, when _mesa_compute_version is called. We can't just move the call to _mesa_compute_version before _mesa_initialize_context since it needs that available extensions have been computed, which again requires other things to be initialized, etc. Instead, we enable seamless cube maps since GLES2, which should work for most implementations, and expect drivers that don't support this to disable it manually as part of their context initialization setup. Fixes the following 192 dEQP tests: dEQP-GLES3.functional.texture.filtering.cube.formats.* dEQP-GLES3.functional.texture.filtering.cube.sizes.* dEQP-GLES3.functional.texture.filtering.cube.combinations.* dEQP-GLES3.functional.texture.mipmap.cube.* dEQP-GLES3.functional.texture.vertex.cube.filtering.* dEQP-GLES3.functional.texture.vertex.cube.wrap.* dEQP-GLES3.functional.shaders.texture_functions.texturelod.samplercube_fixed_* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-24 08:58:54 +01:00
Eduardo Lima Mitev	dccdf1d687	mesa: Return error if BeginQuery is called with an existing object of different type Section 2.14 Asynchronous Queries, page 84 of the OpenGL ES 3.0.4 spec states: "BeginQuery generates an INVALID_OPERATION error if any of the following conditions hold: [...] id is the name of an existing query object whose type does not match target; [...]" Similar wording exists in the OpenGL 4.5 spec, section 4.2. QUERY OBJECTS AND ASYNCHRONOUS QUERIES, page 43. Fixes 1 dEQP test: * dEQP-GLES3.functional.negative_api.fragment.begin_query Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-24 08:58:53 +01:00
Eduardo Lima Mitev	3699866463	mesa: Return INVALID_OPERATION when querying a never bound Query obj Section 2.14 Asynchronous Queries, page 84 of the OpenGL ES 3.0.4 spec states: "The command void GenQueries( sizei n, uint *ids ); returns n previously unused query object names in ids. These names are marked as used, for the purposes of GenQueries only, but no object is associated with them until the first time they are used by BeginQuery." This means that any attempt to use or query a Query object id before it has ever been bound by calling glBeginQuery should be assumed to be an invalid object. Fixes 1 dEQP test: dEQP-GLES3.functional.negative_api.state.get_query_objectuiv Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-24 08:58:53 +01:00
Iago Toral Quiroga	4db4a559ad	mesa: Add _mesa_is_array_texture helper Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-24 08:58:53 +01:00
Eduardo Lima Mitev	2aa71e9485	mesa: Fix error validating args for TexSubImage3D The zoffset and depth values were not being considered when calling error_check_subtexture_dimensions(). Fixes 2 dEQP tests: * dEQP-GLES3.functional.negative_api.texture.texsubimage3d_neg_offset * dEQP-GLES3.functional.negative_api.texture.texsubimage3d_invalid_offset Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.4 10.5" <mesa-stable@lists.freedestkop.org>	2015-02-24 08:58:53 +01:00
Samuel Iglesias Gonsalvez	fbd6eba72b	i965/blorp: round to nearest when converting float into integer Fixes: dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_nearest dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_linear dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_y_nearest dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_y_linear dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_y_nearest dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_y_linear dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_x_nearest dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_x_linear dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_y_nearest dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_y_linear No piglit regressions. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-24 08:58:53 +01:00
Carl Worth	4a6c6c49a7	i965: Perform program state upload outside of atom handling Across the board of the various generations, the initial few atoms in all of the atom lists are basically the same, (performing uploads for the various programs). The only difference is that prior to gen6 there's an ff_gs upload in place of the later gs upload. In this commit, instead of using the atom lists for this program state upload, we add a new function brw_upload_programs that calls into the per-stage upload functions which in turn check dirty bits and return immediately if nothing needs to be done. This commit is intended to have no functional change. The motivation is that future code, (such as the shader cache), wants to have a single function within which to perform various operations before and after program upload, (with some local variables holding state across the upload). It may be worth looking at whether some of the other functionality currently handled via atoms might also be more cleanly handled in a similar fashion. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-23 14:54:15 -08:00
Vivek Kasireddy	1e96eece30	egl, wayland: RGB565 format support on Back-buffer In current code, color format is always hardcoded to __DRI_IMAGE_FORMAT_ARGB8888 when buffer or DRI image is allocated in function calls, get_back_bo and dri2_get_buffers, regardless of current target's color format. This problem may lead to incorrect render pitch calculation, which eventually ends up with wrong offset of pixels in the frame buffer when the image is in different color format from dri surf's, especially with different bpp. (e.g. RGB565-16bpp) Attached code patch simply adds RGB565 and XRGB8888 cases to two functions noted above to resolve the issue. v2: added a case of XRGB8888, format and bpp selection is done via switch-case (not "if-else" anymore) Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Signed-off-by: Dongwon Kim <dongwon.kim@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Cc: "10.5" <mesa-stable@lists.freedesktop.org>	2015-02-23 14:07:02 -08:00
Brian Paul	cbd287f094	mesa: move math-related function into new c99_math.h file The alternative would be to include math.h in c99_compat.h but that seems heavy-handed. This patch also replaces INLINE with inline in the c99 math function wrappers. Fixes MSVC build. Acked-by: Matt Turner <mattst88@gmail.com>	2015-02-23 14:45:14 -07:00
Jason Ekstrand	9b9ef2aeee	nir/gcm: Add some missing break statements Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-23 13:20:13 -08:00
Jason Ekstrand	cb4b2ad44a	nir: Copy-propagate vecN operations that are actually moves We were already doing this for ALU operations but we haven't for non-ALU operations. This changes that. total NIR instructions in shared programs: 2039883 -> 2022338 (-0.86%) NIR instructions in affected programs: 1768850 -> 1751305 (-0.99%) helped: 14244 HURT: 124 total FS instructions in shared programs: 4083960 -> 4084036 (0.00%) FS instructions in affected programs: 7302 -> 7378 (1.04%) helped: 12 HURT: 51 Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-23 13:19:05 -08:00
Francisco Jerez	f80af89d48	ra: Disable round-robin strategy for optimistically colorable nodes. The round-robin allocation strategy is expected to decrease the amount of false dependencies created by the register allocator and give the post-RA scheduling pass more freedom to move instructions around. On the other hand it has the disadvantage of increasing fragmentation and decreasing the number of equally-colored nearby nodes, which increases the likelihood of failure in presence of optimistically colorable nodes. This patch disables the round-robin strategy for optimistically colorable nodes. These typically arise in situations of high register pressure or for registers with large live intervals, in both cases the task of the instruction scheduler shouldn't be constrained excessively by the dense packing of those nodes, and a spill (or on Intel hardware a fall-back to SIMD8 mode) is invariably worse than a slightly less optimal scheduling. Shader-db results on the i965 driver: total instructions in shared programs: 5488539 -> 5488489 (-0.00%) instructions in affected programs: 1121 -> 1071 (-4.46%) helped: 1 HURT: 0 GAINED: 49 LOST: 5 v2: Re-enable round-robin already for the lowest one of the nodes pushed optimistically onto the stack (Connor). v3: Use UINT_MAX instead of ~0, open-code MIN2 (Jason, Connor). Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-23 20:55:40 +02:00
Francisco Jerez	34c93fd7f1	i965/fs: Fix lower_load_payload() not to use an incorrect half for immediates and uniforms. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-23 20:55:40 +02:00
Francisco Jerez	ea7b4d25c8	i965/fs: Fix lower_load_payload() to take into account non-zero reg_offset. Fixes metadata guess when instructions in the program specify a destination register with non-zero reg_offset and when the payload of a LOAD_PAYLOAD spans several registers. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-23 20:55:40 +02:00
Francisco Jerez	08b4c8f7bf	i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload(). MRFs cannot be read from anyway so they cannot possibly be a valid source of LOAD_PAYLOAD. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-23 20:55:40 +02:00
Francisco Jerez	8e47f51a5a	i965/fs: Less broken handling of force_writemask_all in lower_load_payload(). It's perfectly fine to read the second half of a register written with force_writemask_all from a first half MOV instruction or vice versa, and lower_load_payload shouldn't mark the whole MOV as belonging to the second half in that case. Replicate the same metadata to both halves of the destination when writemasking is disabled. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-23 20:55:40 +02:00
Matt Turner	57d80d11b1	mesa/vbo: Use unreachable to silence uninitialized var warning. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-23 10:49:57 -08:00
Matt Turner	bb2a897dbc	mesa: Move START/END_FAST_MATH macros to their only use. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-23 10:49:48 -08:00
Matt Turner	08bc7cf8f6	mesa: Remove definition of NULL. If your stdlib.h doesn't define this you should fix your stdlib.h. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-23 10:49:47 -08:00
Matt Turner	bfcdb84383	mesa: Use assert() instead of ASSERT wrapper. Acked-by: Eric Anholt <eric@anholt.net>	2015-02-23 10:49:47 -08:00
Matt Turner	52049f8fd8	mesa: Remove CHECK macro. There's some commentary about how it's defined by other "modules", and maybe that was true in 2000 when the code was added. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-23 10:41:22 -08:00
Matt Turner	6a587a4461	mesa: Remove dead CAPI define. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-23 10:41:22 -08:00
Matt Turner	14ded5ee61	gallium: Use util_cpu_to_le{16,32} in many more places. ... and util_le{16,32}_to_cpu. I think I've used the right ones for describing the actual operation performed (even though they're both just "byte-swap this if I'm on big-endian"). The Linux Kernel has typedefs __le32/__be32 and friends that static analysis tools can use to check that byte-orderings are correct. It might be interesting to apply that here as well. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-23 10:41:22 -08:00
Matt Turner	3492e88090	gallium/util: Use HAVE___BUILTIN_* macros. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-23 10:41:22 -08:00
Matt Turner	5a191f49ad	mesa: Move C99 MSVC compatibility code from u_math.h to c99_compat.h. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-23 10:41:21 -08:00
Matt Turner	0b6d43e329	i965: Link test programs with gtest before pthreads. Cc: "10.5" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540962	2015-02-23 10:41:21 -08:00
Brian Paul	5dc6c8c570	osmesa: add gallium include dirs to Makefile.am Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89260 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-23 10:07:48 -07:00
Brian Paul	44375a3b13	util: move pipe_prim_names array into u_prim_name() Also, wrapping the array in #ifdef DEBUG / #endif doesn't seem necessary. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-23 10:02:39 -07:00
Brian Paul	f1c67e37e6	util: rewrite debug_print_transfer_flags() using debug_dump_flags() Add missing PIPE_TRANSFER_PERSISTENT, PIPE_TRANSFER_COHERENT flags. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-23 10:02:39 -07:00
Eduardo Lima Mitev	0bfe21e8e0	mesa: Adds missing error condition in _mesa_check_sample_count() This corrects a trivial error introduced in commit `19252fee46`. That patch was merged recently and omits one condition (that 'samples' is greater than zero) in one of the error checks. That error will definitely cause regressions. Also corrects the reference to the specification above the error check, which was wrongly quoting OpenGL instead of OpenGL-ES. Reviewed-by: Martin Peres <martin.peres@linux.intel.com>	2015-02-23 15:04:26 +01:00
Marek Olšák	050bf75c8b	radeonsi: fix a warning caused by previous commit Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>	2015-02-23 11:45:00 +01:00
Marek Olšák	7820a11e3d	radeonsi: fix point sprites Broken by `a27b74819a`. This fix is critical and should be ported to stable ASAP. Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>	2015-02-23 11:40:55 +01:00
Ben Widawsky	6e62a52865	i965/skl: Use 1 register for uniform pull constant payload When under dispatch_width=16 the previous code would allocate 2 registers for the payload when only one is needed. This manifested itself through bugs on SKL which needs to mess with this instruction. Ken thought this might impact shader-db, but apparently it doesn't. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89118 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88999 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Timo Aaltonen <timo.aaltonen@canonical.com>	2015-02-22 12:27:35 -08:00
Eric Anholt	4359954d84	nir: Generalize the optimization of subs of subs from 0. I initially wrote this based on the "(('fneg', ('fneg', a)), a)" above, but we can generalize it and make it more potentially useful. In the specific original case of a 0 for our new 'a' argument, it'll get further algebraic optimization once the 0 is an argument to the new add. No shader-db effects. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-21 14:57:14 -08:00
Eric Anholt	345c2b288a	nir: Collapse repeated bcsels on the same argument. vc4 results: total instructions in shared programs: 39881 -> 39794 (-0.22%) instructions in affected programs: 6302 -> 6215 (-1.38%) Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-21 14:57:14 -08:00
Eric Anholt	a38038ca5e	nir: When faced with a csel on !condition, just flip the arguments. total NIR instructions in shared programs: 39426 -> 39411 (-0.04%) NIR instructions in affected programs: 3748 -> 3733 (-0.40%) Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-21 14:57:14 -08:00
Eric Anholt	8e1152cb33	nir: Allow nir_opt_algebraic to see booleanness through &&, \|\|, ^, !. We have some useful optimizations to drop things like 'ine a, 0' on a boolean argument, but if 'a' came from logical operations on bools, it couldn't tell. These kinds of constructs appear as a result of TGSI->NIR quite frequently (at least with if flattening), so being a little more aggressive in detecting booleans can pay off. v2: Add ixor as a booleanness-preserving op (Suggestion by Connor). vc4 results: total instructions in shared programs: 40207 -> 39881 (-0.81%) instructions in affected programs: 6677 -> 6351 (-4.88%) Reviewed-by: Matt Turner <mattst88@gmail.com> (v1) Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-21 14:57:14 -08:00
Eric Anholt	dc982f4a85	nir: Add a couple of simplifications of csel operations. vc4 was already cleaning these up, but it does shave 4 NIR instructions in shader-db. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-21 14:57:14 -08:00
Ilia Mirkin	c2ece77678	glsl: ensure that enter/leave record get a record type May make life easier for tools like Coverity. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-21 17:27:24 -05:00
Ilia Mirkin	1763494b31	tgsi: avoid returning pointer to local var, make it static Spotted by Coverity. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-02-21 17:27:24 -05:00
Rob Clark	51e335742e	freedreno/a4xx: set PC_PRIM_VTX_CNTL.VAROUT properly Fixes xonotic, some webgl stuff, and really pretty much anything with more than 4 varyings. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-02-21 17:11:02 -05:00
Rob Clark	fb1301e40a	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-02-21 17:11:02 -05:00
Rob Clark	bdf023482a	freedreno/a4xx: bit of cleanup Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-02-21 17:11:02 -05:00
Rob Clark	9153dd4b7e	loader: not having a pci-id should not be a warn If there is no pci-id, which is valid for vc4 and freedreno, just emit an info msg. Keep malformed but existing pci-id's as a warning. Mostly just to clean up a warning that confuses users for the non-pci devices. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-02-21 17:11:02 -05:00
Rob Clark	e17437386c	freedreno: implement fence I never actually implemented the stubbed out fence stuff back in the early days. Fix that. We'll need a few libdrm_freedreno changes to handle timeout properly, so ignore that for now to avoid a libdrm_freedreno dependency bump. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-02-21 17:11:02 -05:00
Rob Clark	6855226653	freedreno/a2xx: fix increment in assert Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88883 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-02-21 17:11:01 -05:00
Jordan Justen	49a938a265	i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data The brw_imm_ud will yield a HW_REG which then will introduce a barrier for certain optimization opportunities. No piglit regressions seen with gen8 (simd8vs). Suggested-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-21 11:40:53 -08:00
Jordan Justen	17fbd854e0	i965/fs: Set pixel/sample mask for compute shaders atomic ops For fragment programs, we pull this mask from the payload header. The same mask doesn't exist for compute shaders, so we set all bits to enabled. Previously we were setting 0xff to support SIMD8 VS, but with CS we support SIMD16, and therefore we change this to 0xffff. Related commits for SIMD8 VS: commit `d9cd982d55` Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Sun Feb 15 20:06:59 2015 -0800 i965/simd8vs: Fix SIMD8 atomics commit `4a95be9772` Author: Jordan Justen <jordan.l.justen@intel.com> Date: Tue Feb 17 09:57:35 2015 -0800 i965/simd8vs: Fix SIMD8 atomics (read-only) Note: this mask is ANDed with the execution mask, so some channels may not end up issuing the atomic operation. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-02-21 11:40:53 -08:00
Chia-I Wu	9fe81879c5	ilo: R32G32B32_FLOAT need no special care on Gen8+ Gen8+ must use VALIGN_4. Unlike prior Gens, R32G32B32_FLOAT should supposedly support VALIGN_4.	2015-02-21 11:33:54 +08:00
Chia-I Wu	226109436f	ilo: 128 BPP formats can use TiledY on Gen7.5+ The restriction is lifted.	2015-02-21 11:33:54 +08:00
Ilia Mirkin	f8e4792b22	nvc0: enable double support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-20 19:51:50 -05:00
Ilia Mirkin	5491458843	nvc0/ir: remove merge/split pairs to allow normal propagation to occur Because the TGSI interface creates merges for each instruction source and then splits them back out, there are a lot of unnecessary merge/split pairs which do essentially nothing. The various modifier/etc propagation doesn't know how to walk through those, so just remove them when they're unnecessary. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-20 19:51:50 -05:00
Ilia Mirkin	93812dc10a	nvc0/ir: add support for new TGSI double opcodes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-20 19:51:43 -05:00
Ilia Mirkin	ef8f09be33	nvc0/ir: handle zero and negative sqrt arguments Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-20 19:30:28 -05:00
Ilia Mirkin	88127874a3	nvc0/ir: no instruction can load a double immediate Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-20 19:30:28 -05:00
Ilia Mirkin	b87b498b88	nvc0/ir: fix lowering of RSQ/RCP/SQRT/MOD to work with F64 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-20 19:30:28 -05:00
Ilia Mirkin	93ebe91bae	gm107/ir: fix F2F flipped stype/dtype flags Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-20 19:30:27 -05:00
Ilia Mirkin	dbf4a674b9	gm107/ir: fix DSET boolean float flag Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-20 19:30:27 -05:00
Ilia Mirkin	727018bb0c	gm107/ir: fix DMUL opcode encoding Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-20 19:30:27 -05:00
Ilia Mirkin	493ad88e1b	gk110/ir: add emission of dadd/dmul/dmad opcodes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-20 19:30:27 -05:00
Ilia Mirkin	fd0b1a4cbf	nvc0/ir: add emission of dadd/dmul/dmad opcodes, fix minmax Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-20 19:30:27 -05:00
Roland Scheidegger	88305dfd0b	mesa: don't enable NV_fragment_program_option with swrast Since dropping some NV_fragment_program opcodes (commits `868f95f1da`, `a3688d686f`) we can no longer parse all opcodes necessary for this extension, leading to bugs (https://bugs.freedesktop.org/show_bug.cgi?id=86980). Hence don't announce support for it in swrast (no other driver enabled it). (Note that remnants of some NV_fp/vp extensions remain, they could be dropped but are required as hacks for getting viewperf11 catia to run.)	2015-02-21 01:23:00 +01:00
Brian Paul	9dbe5e1dca	drivers/x11: add gallium include dirs to Makefile.am Fixes xlib driver build after `e8c5cbfd92`. Acked-by: Matt Turner <mattst88@gmail.com>	2015-02-20 16:25:07 -07:00
Marek Olšák	0feb0b7373	vbo: fix an uninitialized-variable warning It looks like a bug to me. Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-21 00:16:35 +01:00
Marek Olšák	41f49a2fd4	gallium/sw/kms: fix a type-mismatch warning Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-21 00:16:35 +01:00
Marek Olšák	1a44566132	gallium/sw/kms: don't redefine DEBUG Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-21 00:16:35 +01:00
Marek Olšák	f900233928	targets/d3dadapter9: remove an unused variable Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-21 00:16:35 +01:00
Marek Olšák	ab947d2dd8	tgsi: fix type-mismatch warning Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-21 00:16:34 +01:00
Marek Olšák	6f273ec408	gallivm: fix uninitialized-variable warnings Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-21 00:16:34 +01:00
Matt Turner	b21ad12485	mesa: Have configure define NDEBUG, not mtypes.h. mtypes.h had been defining NDEBUG (used by assert) if DEBUG was not defined. Confusing and bizarre that you don't get NDEBUG if you don't include mtypes.h. ... which is just what happened in commit `bef38f62e`. Let's let configure define this for us if not using --enable-debug. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-20 14:10:38 -08:00
Kenneth Graunke	b6393d7040	nir: Fix the Mesa build without -DDEBUG. With -DDEBUG -UNDEBUG, this assert uses reg_state::stack_size, which doesn't exist, breaking the build: assert(state->states[index].index < state->states[index].stack_size); Switch it to ifndef NDEBUG, so the field will exist if the assertion actually generates code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-20 13:43:44 -08:00
Eric Anholt	bef38f62e0	nir: Drop dependency on mtypes.h for core NIR. One less new directory necessary for gallium code that wants to interact with NIR. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-20 11:36:34 -08:00
Eric Anholt	90b4bf2e6e	glsl: Only include mtypes from glsl_types.h for the C++ code that needs it. It's used in one of the methods, not in the structure definitions. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-20 11:36:34 -08:00
Eric Anholt	b53d035825	util: Move Mesa's bitset.h to util/. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-20 11:36:34 -08:00
Eric Anholt	8aa381e3cd	mesa: Make bitset.h not rely on Mesa-specific types and functions. Note that we can't use u_math.h's align() because it's a function instead of a macro, while BITSET_DECLARE needs a constant expression for nouveau's usage in global declarations. v2: Stick some parens around the bits macro argument usage (review by Jose). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-20 11:36:34 -08:00
Eric Anholt	41b1882ed4	mesa: Use u_math.h from macros.h This avoids duplication of some macros and other definitions across the tree. Note that COPY_4FV switches from a memcpy-based implementation to an assignment of 4 floats. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-20 11:36:34 -08:00
Eric Anholt	5ca019358f	gallium/util: Don't include unused debug functions from u_math.h It introduces references to gallium util/ symbols which means we don't get to include it from outside-of-gallium code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-20 11:36:34 -08:00
Eric Anholt	e8c5cbfd92	mesa: Add gallium include dirs to more parts of the tree. v2: Try to patch up the scons bits. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-02-20 11:36:34 -08:00
Marek Olšák	f5ac5e20b1	gallium/radeon: fix an uninitialized-variable warning	2015-02-20 20:20:10 +01:00
Ilia Mirkin	c85a686d02	gallium: add new double-related shader caps to all the getters Missed a few drivers in the earlier changes, this should fix up all the ones that print unknown caps or don't have a default statement. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-20 14:09:25 -05:00
Brian Paul	71b155a2cb	svga: add missing _DROUND,DFRACEXP_DLDEXP_SUPPORTED switch cases To silence unhandled switch case warnings.	2015-02-20 08:09:40 -07:00
Marek Olšák	7692704b14	radeonsi: don't use SQC_CACHES to flush ICACHE and KCACHE on SI This reverts `73c2b0d18c`. It doesn't seem to be reliable. It's probably missing a wait packet or something, because it's just a register write and doesn't wait for anything. SURFACE_SYNC at least seems to wait until the flush is done. Just guessing. Let's not complicate things and revert this. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88561 Cc: 10.5 <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-02-20 12:06:22 +01:00
Iago Toral Quiroga	2a06728ba0	i965/gen6: Fix GL_GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB In gen6 we need to compute the primitive count in the generated GS program. The current implementation only counts full primitives, that is, if the output primitive type is a triangle strip, it won't count individual triangles in the strip, only complete strips. If we want to count basic primitives instead we have two options: rework the assembly code we generate for strip primitives or simply use CL_INVOCATION_COUNT to resolve the query and let the hardware do that work for us. This patch implements the latter approach. Fixes the following piglit test: bin/arb_pipeline_statistics_query-geom -auto Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89210 Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-02-20 11:24:11 +01:00
Eduardo Lima Mitev	097b933b55	mesa: Check that draw buffers are valid for glDrawBuffers on GLES3 Section 4.2 (Whole Framebuffer Operations) of the OpenGL 3.0 specification says: "Each buffer listed in bufs must be BACK, NONE, or one of the values from table 4.3 (NONE, COLOR_ATTACHMENTi)". Fixes 1 dEQP test: * dEQP-GLES3.functional.negative_api.buffer.draw_buffers Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-20 09:35:12 +01:00
Samuel Iglesias Gonsalvez	fe1e89a026	glsl: don't allow invariant qualifiers for interface blocks GLSL 1.50 and GLSL 4.40 specs, they both say the same in "Interface Blocks" section: "If optional qualifiers are used, they can include interpolation qualifiers, auxiliary storage qualifiers, and storage qualifiers and they must declare an input, output, or uniform member consistent with the interface qualifier of the block" From GLSL ES 3.0, chapter 4.3.7 "Interface Blocks", page 38: "GLSL ES 3.0 does not support interface blocks for shader inputs or outputs." and from GLSL ES 3.0, chapter 4.6.1 "The invariant qualifier", page 52. "Only variables output from a shader can be candidates for invariance." This patch fixes the following dEQP tests: dEQP-GLES3.functional.shaders.declarations.invalid_declarations.invariant_uniform_block_2_vertex dEQP-GLES3.functional.shaders.declarations.invalid_declarations.invariant_uniform_block_2_fragment No piglit regressions. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> v2: - Enable this check for GLSL. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-20 09:35:08 +01:00
Eric Anholt	85316d059c	vc4: Keep an array of pointers to instructions defining the temps around. The optimization passes are always regenerating it and throwing it away, but it's not hard to keep track of.	2015-02-19 23:35:17 -08:00
Eric Anholt	877b48a531	vc4: Move qir_uniform() and the constant-value versions to vc4_qir.c/h. I may want them in optimization passes, and they're not really particular to the program translation stage.	2015-02-19 23:35:17 -08:00
Eric Anholt	14dc281c13	vc4: Enforce one-uniform-per-instruction after optimization. This lets us more intelligently decide which uniform values should be put into temporaries, by choosing the most reused values to push to temps first. total uniforms in shared programs: 13457 -> 13433 (-0.18%) uniforms in affected programs: 1524 -> 1500 (-1.57%) total instructions in shared programs: 40198 -> 40019 (-0.45%) instructions in affected programs: 6027 -> 5848 (-2.97%) I noticed this opportunity because with the NIR work, some programs were happening to make different uniform copy propagation choices that significantly increased instruction counts.	2015-02-19 23:35:17 -08:00
Eric Anholt	09c844fcd9	vc4: Rename add_uniform() to qir_uniform().	2015-02-19 23:35:17 -08:00
Eric Anholt	96f6efc561	vc4: Shut up runtime warnings about new pipe caps.	2015-02-19 23:35:13 -08:00
Matt Turner	e0137fd6f7	i965/vec4: Add and use byte-MOV instruction for unpack 4x8. Previously we were using a B/UB source in an Align16 instruction, which is illegal. It for some reason works on all platforms, except Broadwell. Cc: "10.5" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86811 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-19 21:16:44 -08:00
Matt Turner	dada30462b	i965/blorp: Emit MADs. Low hanging fruit: cuts a couple of instructions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-19 21:16:43 -08:00
Matt Turner	30ec53f30e	i965/blorp: Optimize clamping tex coords. Each emit_cond_mov() emits a CMP of its first two arguments using the specified conditional mod, followed by a predicated MOV of the fifth argument into the fourth. In all four cases here, it was just implementing MIN/MAX which we can do in a single SEL instruction. Also reorder the instructions for a slightly better schedule. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-19 21:16:43 -08:00
Matt Turner	3b7f683f3b	i965: Use greater-equal cmod to implement maximum. The docs specifically call out SEL with .l and .ge as the implementations of MIN and MAX respectively. Among other things, SEL with these conditional mods are commutative. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-19 21:16:43 -08:00
Matt Turner	f8b435ae6a	i965: Don't emit saturates for instructions without destinations. We were special casing OPCODE_END but no other instructions that have no destination, like OPCODE_KIL, leading us to emitting MOVs with null destinations. total instructions in shared programs: 5705243 -> 5701539 (-0.06%) instructions in affected programs: 124104 -> 120400 (-2.98%) helped: 904 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-19 21:16:43 -08:00
Matt Turner	7f8dd91d16	i965/fs: Consider MOV.SAT to interfere if it has a source modifier. The saturate propagation pass recognizes that the second instruction below does not interfere with an attempt to propagate the saturate modifier from instruction 3 to 1. 1: add(8) dst0 src0 src1 2: mov.sat(8) dst1 dst0 3: mov.sat(8) dst2 dst0 Unfortunately, we did not consider the case of instruction 2 having a source modifier on dst0. Take for instance: 1: add(8) dst0 src0 src1 2: mov.sat(8) dst1 -dst0 3: mov.sat(8) dst2 dst0 Consider such an instruction to interfere. Increases instruction counts in Anomaly 2, which could be a bug fix depending on the values the first instruction produces. instructions in affected programs: 53228 -> 53934 (1.33%) HURT: 360 Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-19 21:16:43 -08:00
Matt Turner	871ad3f08b	i965/fs: Use fs_inst::overwrites_reg() in saturate propagation. This is safer and matches the conditional_mod propagation pass. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-19 21:16:43 -08:00
Matt Turner	bf3389ec49	i965/fs: Add unit tests for saturate propagation pass. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-19 21:16:43 -08:00
Timothy Arceri	9acb011a3e	glsl: Use the without_array predicate Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-20 16:11:15 +11:00
Ilia Mirkin	5000a5f67b	nv50: add PIPELINE_STATISTICS query support, based on nvc0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Nick Tenney <nick.tenney@gmail.com>	2015-02-19 23:12:35 -05:00
Ilia Mirkin	f883df74e0	svga: add missing : Fixes: `924ee3f408` ("gallium: add shader cap for dldexp/dfracexp support") Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 20:18:02 -05:00
Jason Ekstrand	c7002fad90	nir/GCM: Pull unpinned instructions out of blocks while pinning This lets us be slightly more efficient by not walking the CFG extra times. Also, it may make it easier to ensure that GVN happens on only unpinned instructions. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	8dfe6f672f	nir/GCM: Use pass_flags instead of bitsets for tracking visited/pinned Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	190073c737	nir: Add a global code motion (GCM) pass v2 Jason Ekstrand <jason.ekstrand@intel.com>: - Use nir_dominance_lca for computing least common ancestors - Use the block index for comparing dominance tree depths - Pin things that do partial derivatives Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	a52a4b5223	nir/instr: Change "live" to a more generic "pass_flags" field Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	3d25afc51c	nir: Make nir_[cf_node/instr]_[prev/next] return null if at the end Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	902b0ccc9a	nir/from_ssa: Don't try to read an invalid instruction Right now, the nir_instr_prev function blindly looks up the previous element in the exec list and casts it to an instruction even if it's the tail sentinel. The next commit will change this to return null if it's the first instruction. Making this change first avoids getting a segfault between commits. The only reason we never noticed is that, thanks to the way things are laid out in nir_block, the casted instruction's type was never parallel_copy. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	0281fd0786	nir/validate: Validate SSA defs the same way we do for registers Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	34952b5671	nir/validate: Validate if_uses on registers Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	98ecb25f89	nir: Properly clean up CF nodes when we remove them Previously, if you removed a CF node that still had instructions in it, none of the use/def information from those instructions would get cleaned up. Also, we weren't removing if statements from the if_uses of the corresponding register or SSA def. This commit fixes both of these problems. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	e025943134	nir: use nir_foreach_ssa_def for indexing ssa defs This is both simpler and more correct. The old code didn't properly index load_const instructions. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	0167c38cac	nir/from_ssa: Use the nir_block_dominance function instead of our own Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	f481a9425c	nir/dominance: Add a constant-time mechanism for comparing blocks This is mostly thanks to Connor. The idea is to do a depth-first search that computes pre and post indices for all the blocks. We can then figure out if one block dominates another in constant time by two simple comparison operations. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:17 -08:00
Jason Ekstrand	b4c5489c8a	nir/dominance: Expose the dominance intersection function Being able to find the least common ancestor in the dominance tree is a useful thing that we may want to do in other passes. In particular, we need it for GCM. v2: Handle NULL inputs by returning the other block Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-19 17:06:16 -08:00
Ilia Mirkin	6316c90cc0	st/mesa: lower DFRACEXP/DLDEXP when they are not supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-19 19:39:15 -05:00
Ilia Mirkin	e4a3f48a45	st/mesa: disable lowering of dops to dfrac when dround is available Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-19 19:38:26 -05:00
Ilia Mirkin	e556bfc8ff	st/mesa: add support for new double opcodes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-19 19:32:55 -05:00
Ilia Mirkin	924ee3f408	gallium: add shader cap for dldexp/dfracexp support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-19 19:32:52 -05:00
Ilia Mirkin	899d779cb7	gallium: add a cap to enable double rounding opcodes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-19 19:32:49 -05:00
Ilia Mirkin	12dedca523	gallium: add some more double opcodes to avoid unnecessary lowering Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-19 19:32:35 -05:00
Dave Airlie	1759689d18	docs/GL3.txt: softpipe now supports GL_ARB_gpu_shader_fp64 Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-02-20 10:12:00 +10:00
Dave Airlie	8c6a0ebaad	st/mesa: add st fp64 support (v7.1) This adds support to the state tracker for ARB_gpu_shader_fp64. The details are explained in comments within the code. v2 : add double to int/unsigned conversion v3: handle fp64 consts better v4: use DRSQ v4.1: add d2b v4.2: drop DDIV v5: split out some prep patches. v5.1: add some comments. v5.2: more comments v6: simplify down the double instruction generation loop. v7: Merge Ilia's two cleanup patches. v7.1: minor fixups for Ilia patch + cleanups Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-02-20 10:06:56 +10:00
Dave Airlie	0178358a2d	mesa/st_tgsi_to_glsl: prepare add_constant for fp64 This just moves stuff around a little to make the next patch cleaner. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-02-20 10:06:47 +10:00
Dave Airlie	12150a5bee	st/glsl_to_tgsi: convert dst to an array This is just prep work for fp64 support where we need an array of 2 dst values. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-02-20 10:05:52 +10:00
Dave Airlie	c442d0961e	i965: just avoid warnings with fp64 This just fills in some blanks to avoid warnings in the i965 driver. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-02-20 09:44:28 +10:00
Kenneth Graunke	75f6ed617f	glsl: Add compute to _mesa_shader_stage_to_string(); use unreachable. This is basically Ian's review feedback for my patch that added _mesa_shader_stage_to_abbrev() - it just makes both consistent again. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-19 15:15:46 -08:00
Kenneth Graunke	5cdfa839c2	i965/vec4: Print "VS" or "GS" when compiles fail, not "vec4". This is now trivial to do right. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-19 15:15:46 -08:00
Kenneth Graunke	e60318fbcd	i965/vec4: Replace debug_flag with debug_enabled. backend_visitor now handles this, so we can delete the vec4_visitor specific code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-19 15:15:45 -08:00
Kenneth Graunke	eeacbc1a02	i965: Make scheduler cycle estimates use the proper stage name. Previously, the vec4 backend labeled shaders as "vec4" - now it uses the specific names "VS" and "GS". The FS backend now correctly prints "VS" for vertex shaders (rather than "fs"). It also prints "FS" instead of "fs" for fragment shaders; preserving that behavior didn't seem essential. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-19 15:15:45 -08:00
Kenneth Graunke	2bd139e18c	i965/fs: Un-hardcode DEBUG_WM, "FS", and "fragment". These code paths can (or will) be used for other shader stages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-19 15:15:45 -08:00
Kenneth Graunke	7e35a81264	i965: Create backend_visitor fields for debugging messages. We introduce three new fields in backend_visitor: - debug_enabled: whether or not INTEL_DEBUG & DEBUG_<stage flag> - stage_name: "vertex", "fragment", etc. for use in messages - stage_abbrev: "VS", "FS", etc. for use in messages Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-19 15:15:45 -08:00
Kenneth Graunke	7c891e8ddd	i965: Add a function to translate MESA_SHADER_* into DEBUG_* enums. When compiling, we have a gl_shader_stage (MESA_SHADER_*) enum, and want to know whether debugging is enabled for that stage. This allows us to easily translate it into the corresponding debug flag. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-19 15:15:45 -08:00
Kenneth Graunke	7555d1bafb	glsl: Create a _mesa_shader_stage_to_abbrev() function. This is similar to _mesa_shader_stage_to_string(), but returns "VS" instead of "vertex". v2: Use unreachable() and add MESA_SHADER_COMPUTE (requested by Ian). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-19 15:15:45 -08:00
Kenneth Graunke	231267bf01	i965/fs: Use VARYING_SLOT checks rather than strcmp(). Comparing the location field is equivalent and more efficient. We'll also need this when we start using NIR for ARB programs, as our NIR converter will set the location field correctly, but probably won't use the GLSL names for these concepts. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-19 15:15:45 -08:00
Kenneth Graunke	a07cd42f1e	i965/fs: Remove type parameter from emit_vs_system_value(). Every VS system value has type D. We can always add this back if that changes, but for now, it's extra typing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-19 15:15:45 -08:00
Dave Airlie	2e9f4eadfb	glsl: add lowering for double divide to rcp/mul It looks like no hw does div anyways, so we should just lower at the GLSL level. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-02-20 08:58:06 +10:00
Dave Airlie	0e82817247	softpipe/tgsi: expose doubles for softpipe. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-02-20 08:52:11 +10:00
Dave Airlie	fa43e0443e	tgsi: add support for flt64 constants These act like flt32 except they take up two slots, and you can only add 2 x flt64 constants in one slot. The main reason they are different is we don't want to match half of a flt64 constant against a flt32 constant in the matching code, we need to make sure we treat both parts of the flt64 as a single structure. Cleaned up printing/parsing by Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-02-20 08:51:49 +10:00
Dave Airlie	3cd1338534	gallium: add double opcodes and TGSI execution (v4.2) This patch adds support for a set of double opcodes to TGSI. It is an update of work done originally by Michal Krol on the gallium-double-opcodes branch. The opcodes have a hint where they came from in the header file. v2: add unsigned/int <-> double v2.1: update docs. v3: add DRSQ (Glenn), fix review comments (Glenn). v4: drop DDIV v4.1: cleanups, fix some docs bugs, (Ilia) rework store_dest and fetch_source fns. (Ilia) 4.2: fixup float comparisons (Ilia) This is based on code by Michal Krol <michal@vmware.com> Roland and Glenn also reviewed earlier versions. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-02-20 08:49:12 +10:00
Brian Paul	14b9bf630c	gallium/util: indentation fix	2015-02-19 15:36:59 -07:00
Brian Paul	21c57a697f	st/mesa: add GLSL_TYPE_DOUBLE, new ir_unop_* switch cases To silence compiler warnings about unhandled switch cases. v2: move GLSL_TYPE_DOUBLE case to the "Invalid type in type_size" section, per Ilia. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 15:36:59 -07:00
Brian Paul	2f5597787c	nir: add missing GLSL_TYPE_DOUBLE case in type_size() To silence compiler warning about unhandled switch case. v2: move GLSL_TYPE_DOUBLE to the "not reached" section, per Ilia. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 15:36:59 -07:00
Brian Paul	62a8883f32	st/mesa: fix sampler view reference counting bug in glDraw/CopyPixels Use pipe_sampler_view_reference() instead of ordinary assignment. Also add a new sanity check assertion. Fixes piglit gl-1.0-drawpixels-color-index test crash. But note that the test still fails. Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 15:36:59 -07:00
Brian Paul	89c96afe3c	swrast: fix multiple color buffer writing If a fragment program wrote to more than one color buffer, the first fragment color got replicated to all dest buffers. This fixes 5 piglit FBO tests, including fbo-drawbuffers-arbfp. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45348 Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-19 15:36:59 -07:00
Brian Paul	fbac86ad2a	mesa: remove unused _math_trans_4chan() Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-19 15:36:59 -07:00
Lucas Stach	5c1aac17ad	install-lib-links: don't depend on .libs directory This snippet can be included in Makefiles that may, depending on the project configuration, not actually build any installable libraries. In that case we don't have anything to depend on and this part of the makefile may be executed before the .libs directory is created, so do not depend on it being there. Cc: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>	2015-02-19 10:02:02 -08:00
Francisco Jerez	6c34fd20be	i965/vec4: Calculate register allocation q values manually. This fixes a regression in the running time of Piglit introduced by commit `78e9043475`, which increased the number of register allocation classes set up by the VEC4 back-end from 2 to 16. The algorithm used by ra_set_finalize() to calculate them is unnecessarily expensive, do it manually like the FS back-end does. Reported-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-19 14:09:12 +02:00
Francisco Jerez	35a77a148f	i965: Don't compact instructions with unmapped bits. Some instruction bits don't have a mapping defined to any compacted instruction field. If they're ever set and we end up compacting the instruction they will be forced to zero. Avoid using compaction in such cases. v2: Align multiple lines of an expression to the same column. Change conditional compaction of 3-source instructions to an assertion. (Matt) v3: The 3-source instruction bit 105 is part of SourceIndex on CHV. Add assertion that reserved bit 7 is not set. (Matt) Document overlap with UIP and 64-bit immediate fields. v4: Make some more unmapped bit checks assertions. (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-19 14:06:42 +02:00
Francisco Jerez	6c07279e5a	i965: Handle F16TO32/F32TO16 with dword src/dst consistently on both back-ends. Due to the way it's implemented in hardware, the F16TO32/F32TO16 instructions require the source/destination register to be of some 16-bit type in Align1 mode, while they require it to be some 32-bit type in Align16 mode (and as an undocumented feature the high 16 bits of the destination register are zeroed out in the case of the F32TO16 instruction on Gen7). Make their behaviour consistent so you can specify a 32 bit register type as source or destination and get predictable results in the most significant bits no matter what access mode is being used. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-19 14:06:42 +02:00
Francisco Jerez	437d401e63	i965/gen8: Fix F32TO16 in vec4 mode if the source and destination registers alias. We cannot zero out the destination register if it overlaps with the source. Use an Align1 instruction instead to zero out the high 16 bits after the conversion to half float. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-19 14:06:42 +02:00
Francisco Jerez	509f58740c	i965/fs: Replace ud_reg_to_w() with a more general helper function. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-19 14:06:42 +02:00
Francisco Jerez	63d6d09a3b	i965/vec4: Don't attempt to reduce swizzles of send from GRF instructions. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-19 14:06:42 +02:00
Francisco Jerez	bda7698fce	i965/vec4: Fix constant propagation across different types. If the source type differs from the original type of the constant we need to bit-cast it before propagating, otherwise the original type information will be lost. If the constant was a vector float there isn't much we can do, because the result of bit-casting the component values of a vector float cannot itself be represented as an immediate. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-19 14:06:42 +02:00
Samuel Iglesias Gonsalvez	187ace73a9	glsl: A shader cannot redefine or overload built-in functions in GLSL ES 3.00 Create a new search function to look for matching built-in functions by name and use it for built-in function redefinition or overload in GLSL ES 3.00. GLSL ES 3.0 spec, chapter 6.1 "Function Definitions", page 71 "A shader cannot redefine or overload built-in functions." While in GLSL ES 1.0 specification, chapter 8 "Built-in Functions" "User code can overload the built-in functions but cannot redefine them." So this check is specific to GLSL ES 3.00. This patch fixes the following dEQP tests: dEQP-GLES3.functional.shaders.functions.invalid.overload_builtin_function_vertex dEQP-GLES3.functional.shaders.functions.invalid.overload_builtin_function_fragment dEQP-GLES3.functional.shaders.functions.invalid.redefine_builtin_function_vertex dEQP-GLES3.functional.shaders.functions.invalid.redefine_builtin_function_fragment No piglit regressions. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-19 10:05:33 +01:00
Eduardo Lima Mitev	19252fee46	mesa: Adds check for integer internal format and num samples in glRenderbufferStorageMultisample Per GLES3 specification, section 4.4 Framebuffer objects page 198, "If internalformat is a signed or unsigned integer format and samples is greater than zero, then the error INVALID_OPERATION is generated.". Fixes 1 dEQP test: * dEQP-GLES3.functional.negative_api.buffer.renderbuffer_storage_multisample Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-19 09:35:41 +01:00
Eduardo Lima Mitev	dbc160a3f8	mesa: Returns correct error values from gl(Get)SamplerParameter() on GL-ES 3.0+ '3.8.2 Sampler Objects' section of the GL-ES 3.0 specification states: "An INVALID_OPERATION error is generated if sampler is not the name of a sampler object previously returned from a call to GenSamplers." In desktop GL, a GL_INVALID_VALUE is returned instead. Fixes 6 dEQP failing tests: * dEQP-GLES3.functional.negative_api.shader.get_sampler_parameteriv * dEQP-GLES3.functional.negative_api.shader.get_sampler_parameterfv * dEQP-GLES3.functional.negative_api.shader.sampler_parameteri * dEQP-GLES3.functional.negative_api.shader.sampler_parameteriv * dEQP-GLES3.functional.negative_api.shader.sampler_parameterf * dEQP-GLES3.functional.negative_api.shader.sampler_parameterfv Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-19 09:35:37 +01:00
Ilia Mirkin	e8e22cf65f	glsl: remove bogus 'd' constant qualifiers 0.0 is a double anyways. Apparently my version of gcc was happy with 0.0d as well, but this is not true of all compilers. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89218 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 01:45:54 -05:00
Ilia Mirkin	0cade4ea2b	st/mesa: don't die for ETC2 formats when no driver support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 01:41:28 -05:00
Eric Anholt	2a135c470e	nir: Add an ALU op builder kind of like ir_builder.h v2: Rebase on the nir_opcodes.h python code generation support. v3: Use SSA values, and set an appropriate writemask on dot products. v4: Make the arguments be SSA references as well. This lets you stack up expressions in the arguments of other expressions, at the cost of having to insert a fmov/imov if you want to swizzle. Also, add the generated file to NIR_GENERATED_FILES. v5: Use more pythonish style for iterating the list. v6: Infer the size of the dest from the size of the srcs, and auto-swizzle a single small src out to the appropriate size. v7: Add little helpers for initializing the struct, add a typedef for the struct like other nir types have. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v6) Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v7)	2015-02-18 22:28:42 -08:00
Ilia Mirkin	de798bb937	docs: mark ARB_gpu_shader_fp64 as done in core No driver support... yet. But core is ready. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:35 -05:00
Ilia Mirkin	e790a3c910	glsl/tests: add DOUBLE types Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-19 00:28:35 -05:00
Ilia Mirkin	2e7e7b8af6	glsl: add a lowering pass for frexp/ldexp with double arguments Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-19 00:28:35 -05:00
Dave Airlie	fffbf37124	glsl: lower double optional passes (v2) These lowering passes are optional for the backend to request; currently the TGSI softpipe backend and most likely the r600g backend would want to use these passes as is. They aim to hit the gallium opcodes from the standard rounding/truncation functions. v2: also lower floor in mod_to_floor Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:35 -05:00
Dave Airlie	e6354a2850	glsl: implement double builtin functions This implements the bulk of the builtin functions for fp64 support. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:35 -05:00
Dave Airlie	2e626318e0	glsl/lower_instructions: add double lowering passes This lowers double dot product and lrp to fma. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:35 -05:00
Dave Airlie	8be5ee23de	glsl: enable/disable certain lowering passes for doubles We want to restrict some lowering passes to floats only, and enable others for doubles. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:35 -05:00
Tapani Pälli	3bbaf71994	glsl: validate output types for shader stages Patch fixes Piglit test: arb_gpu_shader_fp64/preprocessor/fs-output-double.frag and adds additional validation for shader outputs. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:35 -05:00
Dave Airlie	94f9ed701a	glsl: add double support to lower_mat_op_to_vec Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:35 -05:00
Dave Airlie	3773072169	glsl: Linking support for doubles Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:35 -05:00
Dave Airlie	7aa3ffe2c5	glsl: Support double loop control Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Dave Airlie	53383476d1	glsl: Support double inouts Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Dave Airlie	a10275f762	glsl/lexer: Support double floats Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Dave Airlie	942574bb24	glsl/parser: Support double floats Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Dave Airlie	ba3bab264d	glsl/ast: Support double floats Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Dave Airlie	24626444c3	glsl: Add ubo lowering support for doubles Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Dave Airlie	8609b53716	glsl: Add support for doubles in optimization passes Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Dave Airlie	41e9adfd83	glsl/ir: Add builder support for functions with double floats Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Dave Airlie	eeae6251be	glsl/ir: Add builtin constant function support for doubles Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Dave Airlie	753ba6b999	glsl/ir: Add cloning support for doubles Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Dave Airlie	57c6c3d3bd	glsl/ir: Add printing support for doubles Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Dave Airlie	5a69bdb599	glsl/ir: Add builtin function support for doubles v2: add d2b, more ir_constant stuff (Ilia) Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Ilia Mirkin	53bf7c8fd2	glsl: fix uniform linking logic in the presence of structs Add an enter/leave record callback so that the offset may be aligned to the proper value. Otherwise only leaf fields are called, and the first field needs to be aligned to the outer struct's base alignment while the last field needs to be aligned to the inner struct's base alignment. This removes most usage of the last field/record type values passed into visit_field. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-19 00:28:34 -05:00
Ilia Mirkin	1ec715ce8b	glsl: teach std140_base_alignment about samplers These functions are about to be used more aggressively for determining uniform layout. Samplers may be inside of structs, and it's easier to reuse the existing base alignment logic. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-02-19 00:28:34 -05:00
Dave Airlie	fe23bb85ba	glsl: Uniform linking support for doubles Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:34 -05:00
Dave Airlie	3af8db94cd	glsl: Add double builtin type generation Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:33 -05:00
Dave Airlie	277f4d75a7	glsl: add ARB_gpu_shader_fp64 to the glsl extensions. (v2) v2: add define bit (Tapani Pälli) Patch makes following Piglit tests pass: arb_gpu_shader_fp64/preprocessor/define.vert arb_gpu_shader_fp64/preprocessor/define.frag Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:33 -05:00
Dave Airlie	5cc486b4e3	mesa: add double uniform support. (v5) This adds support for the new uniform interfaces from ARB_gpu_shader_fp64. v2: support ARB_separate_shader_objects ProgramUniformd (Ian) don't allow boolean uniforms to be updated (issue 15) (Ian) v3: fix size_mul v4: Teach uniform update to take into account double precision (Topi) v5: add transpose for double case (Ilia) Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:33 -05:00
Dave Airlie	bf257d2c90	glsl: Add double builtin type This causes a lot of warnings about unchecked type in switch statements - fix them later. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:33 -05:00
Dave Airlie	6227af2690	mesa: add ARB_gpu_shader_fp64 extension info (v2) This just adds the entries to extensions.c and mtypes.h v2: use core profile only (Ian) Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:33 -05:00
Dave Airlie	3c915e5c16	glapi: add ARB_gpu_shader_fp64 (v2) Just add the xml file covering this extension, and dummy interface files in mesa, and fix up sanity tests. v2: Enable ProgramUniformd from ARB_separate_shader_objects (Ian) use 40 instead of 43 for dispatch_sanity.cpp (Chris) uncomment PU sanity tests. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:28:33 -05:00
Ilia Mirkin	069dab7576	freedreno: add missing PIPE_CAP_RESOURCE_FROM_USER_MEMORY to switch Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:25:03 -05:00
Ilia Mirkin	92fc8f04d6	freedreno/a3xx: add ARB_instanced_arrays support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:25:03 -05:00
Ilia Mirkin	f6b2e8af74	freedreno/a3xx: add support for vertexid and instanceid sysvals Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:25:03 -05:00
Ilia Mirkin	2c6e3d822b	freedreno: pass number of instances to draw Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:25:03 -05:00
Ilia Mirkin	e4ddfeea65	freedreno/a3xx: add ETC2 decoding support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-19 00:25:03 -05:00
Ilia Mirkin	33edda7d97	st/mesa: pass etc2 textures to driver if supported If the driver actually supports ETC2, don't decode it in software. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-02-19 00:25:03 -05:00
Ilia Mirkin	845b9e4294	llvmpipe,softpipe: only support ETC1, not the upcoming ETC2 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-18 22:32:25 -05:00
Ilia Mirkin	0821efcb33	gallium: add ETC2 format support No actual decoding is added, similar faking mechanism to bptc. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-02-18 22:32:25 -05:00
Ilia Mirkin	d622afdbc3	freedreno/a3xx: add hardware ETC1 support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-18 22:32:25 -05:00
Eric Anholt	935ee6b652	gallium/dri: Shut up a compiler warning. The compiler doesn't see that buffers is set in the !image case and used in the !image case. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-18 15:15:29 -08:00
Eric Anholt	6eadde51bb	nir: Recognize and reduce duplicated fsats. No effect on vc4 shader-db. v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:51 -08:00
Eric Anholt	1907a3a7ee	nir: Add a flag for lowering fsat. vc4 cse/algebraic-disabled stats: total instructions in shared programs: 44356 -> 44354 (-0.00%) instructions in affected programs: 55 -> 53 (-3.64%) v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:51 -08:00
Eric Anholt	e5ecf8e427	nir: Add a flag for lowering ffma. vc4 cse/algebraic-disabled stats: total uniforms in shared programs: 13966 -> 13791 (-1.25%) uniforms in affected programs: 435 -> 260 (-40.23%) total instructions in shared programs: 44732 -> 44356 (-0.84%) instructions in affected programs: 9599 -> 9223 (-3.92%) v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:51 -08:00
Eric Anholt	42a8ace66e	nir: Add a flag for lowering fneg/ineg. vc4 cse/algebraic-disabled stats: total instructions in shared programs: 44911 -> 44732 (-0.40%) instructions in affected programs: 11371 -> 11192 (-1.57%) v2: Fix broken iabs(isub(0, a)) transformation. v3: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:51 -08:00
Eric Anholt	cb95a228e8	nir: Add a flag for lowering fsqrt(x) to frcp(frsqrt(x)). vc4 cse/algebraic-disabled stats: total uniforms in shared programs: 13972 -> 13966 (-0.04%) uniforms in affected programs: 408 -> 402 (-1.47%) total instructions in shared programs: 44973 -> 44911 (-0.14%) instructions in affected programs: 1551 -> 1489 (-4.00%) v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:50 -08:00
Eric Anholt	ccf14bca4b	nir: Add lowering of POW instructions if the lower flag is set. This could be done in a separate pass like we do in GLSL IR, but it seems to me like having the definitions of the transformations in the two directions next to each other makes a lot of sense. v2: Reorder the comment about the transformation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-18 14:47:50 -08:00
Eric Anholt	8e9dbfff17	nir: Conditionalize the POW reconstruction on shader compiler options. Mesa has a shader compiler struct flagging whether GLSL IR's opt_algebraic and other passes should try and generate certain types of opcodes or patterns. Extend that to NIR by defining our own struct, which is automatically generated from the Mesa struct in glsl_to_nir and provided directly by the driver in TGSI-to-NIR. v2: Split out the previous two prep patches. v3: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2)	2015-02-18 14:47:50 -08:00
Eric Anholt	955a6bb57d	nir: Add an optional expression controlling nir_algebraic xforms. This will be used so that we can customize the transforms for the target GPU, so we don't un-lower expressions that had already been lowered (or introduce new lowering transformations that not all GPUs want) v2: Drop the complication of having the condition->index dictionary, since we don't actually expect there to be many different conditions (change by Kenneth). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-18 14:47:50 -08:00
Eric Anholt	f90bb54734	nir: Add a nir_shader_compiler_options struct pointed to by the shaders. This will be used to give the optimization passes a chance to customize behavior for the particular target device. v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2015-02-18 14:47:50 -08:00
Jordan Justen	4a95be9772	i965/simd8vs: Fix SIMD8 atomics (read-only) An update for `d9cd982d55`. A similar change was needed for CS to allow the piglit test tests/spec/arb_compute_shader/execution/simple-barrier-atomics.shader_test to pass. The previous change (`d9cd982d`) should fix cases that write atomics, such as atomicCounterIncrement, and this change will fix cases that only read atomics, such as atomicCounter. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-02-18 14:33:36 -08:00
Chia-I Wu	b0e26173b2	ilo: fix PCB alloc asserts on Gen7.5 GT3 GT3 has two slices and all limits are doubled.	2015-02-18 14:20:29 -07:00
Chia-I Wu	68573f57ee	ilo: fix compiler warnings Fix -Wmaybe-uninitialized warnings. The change to ilo_blit_resolve_slices_for_hiz() is a potential bug fix.	2015-02-18 14:20:29 -07:00
Adam Jackson	b290330e3b	i915: For the love of all that is holy, stop saying "IGD" a001 and a011 are pineview chips. Say so. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2015-02-18 14:51:16 -05:00
Emil Velikov	8a71fd8d49	auxiliary/vl: honour the DRI2PROTO_CFLAGS Otherwise for non-default installations the build will fail to find the headers and error out. Cc: "10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-02-18 11:02:50 +00:00
Emil Velikov	dd7b6670a2	auxiliary/vl: Build vl_winsys_dri.c only when needed. With commit c39dbfdd0f7 (auxiliary/vl: bring back the VL code for the dri targets) we did not fully consider users of dri-swrast alone. Thus we ended up trying to compile the dri2 specific code on platforms which lack it - Cygwin for example. Cc: "10.5" <mesa-stable@lists.freedesktop.org> Reported-by: Jon TURNEY <jon.turney@dronecode.org.uk> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2015-02-18 11:02:50 +00:00
Emil Velikov	3018c4a56a	automake: Use AM_DISTCHECK_CONFIGURE_FLAGS Currently we use DISTCHECK_CONFIGURE_FLAGS, which is reserved for the user. As with other variables, one should use the AM_ variable within the makefile. Cc: "10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-18 11:02:44 +00:00
Emil Velikov	b0eada1707	glx: do not leak the dri2 extension information The XExtensionInfo is allocated dynamically (if the pointer is NULL) in the XEXT_GENERATE_FIND_DISPLAY macro. On the other hand the macro XEXT_GENERATE_CLOSE_DISPLAY does not check/free the memory. Follow the example set by dri1 and appledri, and use a static variable. Spotted while hunting "still reachable" leaks in Waffle. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-02-18 11:02:25 +00:00
Michel Dänzer	4db985a5fa	Revert "radeon/llvm: enable unsafe math for graphics shaders" This reverts commit `0e9cdedd2e`. It caused the grass to disappear in The Talos Principle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89069 Cc: "10.5 10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-02-18 17:06:32 +09:00
Ilia Mirkin	b7a85bee83	st/mesa: add ARB_pipeline_statistics_query support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-02-18 02:10:47 -05:00
Ben Widawsky	e206785b57	i965: implement ARB_pipeline_statistics_query NOTE: The implementation was initially one patch, this. All the history is kept here, even though all the core mesa changes were moved to the parent of this patch. This patch implements ARB_pipeline_statistics_query. This addition to GL does not add a new API. Instead, it adds new tokens to the existing query APIs. The work to hook up the new tokens is trivial due to its similarity to the previous work done for the query APIs. I've implemented all the new tokens to some degree, but have stubbed out the untested ones at the entry point for Begin(). Doing this should allow the remainder of the code to be left in. The new tokens give GL clients a way to obtain stats about the GL pipeline. Generally, you get the number of things going in, invocations, and number of things coming out, primitives, of the various stages. There are two immediate uses for this, performance information, and debugging various types of misrendering. I doubt one can use these for debugging very complex applications, but for piglit tests, it should be quite useful. Tessellation shaders, and compute shaders are not addressed in this patch because there is no upstream implementation. I've implemented how I believe tessellation shader stats will work for Intel hardware (though there is a bit of ambiguity). Compute shaders are a bit more interesting though, and I don't yet know what we'll do there. For the lazy, here is a link to the relevant part of the spec: https://www.opengl.org/registry/specs/ARB/pipeline_statistics_query.txt Running the piglit tests http://lists.freedesktop.org/archives/piglit/2014-November/013321.html (http://cgit.freedesktop.org/~bwidawsk/piglit/log/?h=pipe_stats) yields the following results: > piglit-run.py -t stats tests/all.py output/pipeline_stats > [5/5] pass: 5 Running Test(s): 5 v2: - Don't allow pipeline_stats to be per stream (Ilia). This may (not sure) be needed for AMD_transform_feedback4, which we do not support. > If AMD_transform_feedback4 is supported then GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB counts primitives emitted to any of the vertex streams for which STREAM_RASTERIZATION_AMD is enabled. - Remove comment from GL3.txt because it is only used for extensions that are part of required versions (Ilia) - Move the new tokens to a new XML doc instead of using the main GL4x.xml (Ilia) - Add a fallthrough comment (Ilia) - Only divide PS invocations by 4 on HSW+ (Ben) v3: - Add ARB_pipeline_statistics_query to relnotes.html - Add ARB_pipeline_statistics_query.xml to the Makefile.am, and master XML (Ilia) - Correct extension number (Ilia) - Add link to xml in the main GL API xml (Ilia) - remove special GS case from gen6_end_query (Ian) - Make lookup table static so gcc doesn't initialize it on every call (Ian) - Use if (_mesa_has_geometry_shaders(ctx)) instead of explicit checks (Ian) - Core mesa parts moved into a prep patch (Ilia) v4: - Change to 10.6 relnotes since we missed 10.5 window - Moved compute shader stuff into the switch statement (Jordan) - Jordan: Add compute shader support v5: - Fixed relnote style (Ilia) v6: - Rebased on master which beat me to adding the first relnotes - essentially this undoes v5 (which had a typo anyway) - Some code style fixes (Ken) - Remove some excess comments (Ken) - Unify tessellation failure style - unreachable (Ken) - Fix workaround comment for PS invocations (Ken) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-17 23:01:12 -08:00
Ben Widawsky	86ffc36d3c	mesa: Add support for the ARB_pipeline_statistics_query extension This was originally part of a single patch which added the extension, and implemented it for i965 classic. For information about the evolution of the patch, please see the subsequent commit. One difference here as compared to the original mega patch is this does build support for the compute shader query. Since it cannot be tested on any platform, it will always return NULL for now. Jordan has already written a patch to address this, and when that patch lands, this logic can be modified. v2: Fix typo in subject (Brian Paul) Add checks for desktop gl (Ilia) Fail for any callers for now (Ilia) Update QueryCounterBits for new tokens (Ilia) Jordan: Use _mesa_has_compute_shaders Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> v3: Rebased on patch which adds the proper information to unstub tessellation shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-17 23:01:11 -08:00
Jordan Justen	2cd2831500	mesa: Add _mesa_has_compute_shaders v2 (Ben): Change GLboolean to bool as requested by Ian Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2015-02-17 23:00:15 -08:00
Fabian Bieler	599cbe5508	mesa: Add ARB_tessellation_shader to extension table. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-17 22:06:19 -08:00
Kenneth Graunke	d523fefa75	i965: Prefer Meta over the BLT for BlitFramebuffer. There's some debate about whether we should use Meta or BLORP, but either should run circles around the BLT engine. In particular, this means that Gen8+ will use the 3D engine for blits, like we do on Gen6-7. Improves performance in "copypixrate -blit -back" (from Mesa demos) by 232.037% +/- 3.15795% (n=10) on Broadwell GT3e. v2: Rebase on Laura's changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org>	2015-02-17 22:06:06 -08:00
Matt Turner	bb33a31c38	i965/fs: Add algebraic optimizations for MAD. total instructions in shared programs: 5764176 -> 5763808 (-0.01%) instructions in affected programs: 25121 -> 24753 (-1.46%) helped: 164 HURT: 2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-17 20:44:09 -08:00
Matt Turner	8cfd1e2ac6	i965/fs: Emit MAD instructions when possible. Previously we didn't emit MAD instructions since they cannot take immediate arguments, but with the opt_combine_constants() pass we can handle this properly. total instructions in shared programs: 5920017 -> 5733278 (-3.15%) instructions in affected programs: 3625153 -> 3438414 (-5.15%) helped: 22017 HURT: 870 GAINED: 91 LOST: 49 Without constant pooling, this patch is a complete loss: total instructions in shared programs: 5912589 -> 5987888 (1.27%) instructions in affected programs: 3190050 -> 3265349 (2.36%) helped: 1564 HURT: 17827 GAINED: 27 LOST: 101 And since the constant pooling patch by itself hurt a bunch of things, from before constant pooling to this patch the results are: total instructions in shared programs: 5895414 -> 5747946 (-2.50%) instructions in affected programs: 3617993 -> 3470525 (-4.08%) helped: 20478 HURT: 4469 GAINED: 54 LOST: 146 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-17 20:44:09 -08:00
Matt Turner	36bc5f06dd	i965/fs: Allow immediates in MAD and LRP instructions. And then the opt_combine_constants() pass will pull them out into registers. This will allow us to do some algebraic optimizations on MAD and LRP. total instructions in shared programs: 5946656 -> 5931320 (-0.26%) instructions in affected programs: 778247 -> 762911 (-1.97%) helped: 3780 HURT: 6 GAINED: 12 LOST: 12 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-17 20:44:09 -08:00
Matt Turner	2dad1e3abd	i965/fs: Add pass to combine immediates. total instructions in shared programs: 5885407 -> 5940958 (0.94%) instructions in affected programs: 3617311 -> 3672862 (1.54%) helped: 3 HURT: 23556 GAINED: 31 LOST: 165 ... but will allow us to always emit MAD instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-17 20:44:09 -08:00
Matt Turner	0d8f27eab7	i965/fs: Remove force_writemask_all assertion for execsize < 8. This doesn't seem to be necessary. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-17 20:44:09 -08:00
Matt Turner	662c645318	i965/cfg: Add function to generate a dot file of the dominator tree. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-17 20:44:09 -08:00
Matt Turner	b06eef05d0	i965/cfg: Add function to generate a dot file of the CFG. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-17 20:44:09 -08:00
Matt Turner	0e3dbc0248	i965/cfg: Calculate the immediate dominators. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-17 20:44:09 -08:00
Matt Turner	08f304bb3b	i965/cfg: Allow cfg::dump to be called without a visitor. The fs_visitor's dump_instruction() implementation calls cfg_t() indirectly through calculate_live_intervals, so if you have an infinite loop in the CFG code, you can't call cfg::dump(fs_visitor *) to debug it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-17 20:44:09 -08:00
Matt Turner	1af5c4a526	i965: Allow exec_list sentinels as arguments to insert functions. To insert an instruction at the end of a basic block, we typically do something like inst = block->last_non_control_flow_inst(); inst->insert_after(block, new_inst); But blocks can consist of a single control flow instruction, so inst will actually be the exec_list's head sentinel. We shouldn't use it as if it were a regular instruction, but it is safe to insert something after it. This patch avoids assert-failing because an exec_list sentinel wasn't in the basic block's instruction list. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-17 20:44:09 -08:00
Alan Coopersmith	b7ce7c00e3	Make _mesa_swizzle_and_convert argument types in .c match those in .h Caused Solaris Studio compilers to fail to build with errors about incompatible function redefinitions. Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-17 18:16:33 -08:00
Alan Coopersmith	4671dca0ee	Use __typeof instead of typeof with Solaris Studio compilers While the C compiler accepts typeof, C++ requires __typeof. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86944 Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-17 18:16:33 -08:00
Alan Coopersmith	d602fbd861	Avoid fighting with Solaris headers over isnormal() When compiling in C99 or C++11 modes, Solaris defines isnormal() as a macro via <math.h>, which causes the function definition to become too mangled to compile. Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-17 18:16:33 -08:00
Alan Coopersmith	815b3bd096	Remove extraneous ; after DECL_TYPE usage The macro is defined to provide a trailing ; so this caused the expansion to end in ";;" which made the Solaris Studio compilers issue warnings for every line of: "builtin_type_macros.h", line 113: Warning: extra ";" ignored. for every file that included the header, filling build logs with thousands of useless warnings. Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-17 18:16:33 -08:00
Alan Coopersmith	60ad5103b9	Bracket arguments to tr so they work with Solaris tr https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/autoconf-2.69/html_node/Limitations-of-Usual-Tools.html#index-g_t_0040command_007btr_007d-1842 Without this fix, egl fails to build on Solaris, with the error: <command-line>:0:22: error: '_EGL_PLATFORM_x11' undeclared (first use in this function) egldisplay.c:207:31: note: in expansion of macro '_EGL_NATIVE_PLATFORM' native_platform = _EGL_NATIVE_PLATFORM; ^ Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org>	2015-02-17 18:16:32 -08:00
Kenneth Graunke	76960a55e6	glsl: Reduce memory consumption of copy propagation passes. opt_copy_propagation and opt_copy_propagation_elements create new ACP and Kill sets each time they enter a new control flow block. For if blocks, they also copy the entire existing ACP set contents into the new set. When we exit the control flow block, we discard the new sets. However, we weren't freeing them - so they lived on until the pass finished. This can waste a lot of memory (57MB on one pessimal shader). This patch makes the pass allocate ACP entries using this->acp as the memory context, and Kill entries out of this->kill. It also steals kill entries when moving them from the inner kill list to the parent. It then frees the lists, including their contents. v2: Move ralloc_free(this->acp) just before this->acp = orig_acp (suggested by Eric Anholt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.5 10.4" <mesa-stable@lists.freedesktop.org>	2015-02-17 17:33:27 -08:00
Chris Forbes	eda3dd0076	i965: Add device limits for tess threads & URB entries This should cover all platforms prior to Skylake. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ben Widawsky <ben@bwidawsk.net>	2015-02-17 17:33:27 -08:00
Dave Airlie	e8e4437ed0	r600g/sb: treat undefined values like constants When we schedule an instruction with an undefined value, we eventually will use 0, which is a constant; however, sb wasn't taking this into account and was creating ops with illegal scalar swizzles. This replaces my fix for op3 in t slots. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-02-18 11:13:06 +10:00
Kenneth Graunke	598d144cef	i915c: Use the actual MIN instruction. Matt Turner noticed that the hardware has always had a MIN instruction, but the driver always used MAX+MOV for no apparent reason. This should cut an instruction, and a temporary, allowing more programs to run in hardware. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-17 15:24:15 -08:00
Kenneth Graunke	7bf774034a	i915g: Use the actual MIN instruction. Matt Turner noticed that the hardware has always had a MIN instruction, but the driver always used MAX+MOV for no apparent reason. This should cut an instruction, and a temporary, allowing more programs to run in hardware. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-17 15:24:15 -08:00
Kenneth Graunke	27b6ef7eca	i965: Add a function to disassemble an instruction from the 4 dwords. I used this a while back when debugging GPU hangs, and it seems like it could be useful, so I figured I'd add it so people can use it in the debugger. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-17 15:24:15 -08:00
Kenneth Graunke	0b499abb51	i965: Do Sandybridge workaround flushes before each primitive. Sandybridge requires the post-sync non-zero workaround in a ton of places, and if you ever miss one, the GPU usually hangs. Currently, we try to track exactly when a workaround flush is necessary (via the brw->batch.need_workaround_flush flag). This is tricky to get right, and we've botched it several times in the past. This patch unconditionally performs the post-sync non-zero flush at the start of each primitive's state upload (including BLORP). We drop the needs_workaround_flush flag, and drop all the other callers, as the flush has already been performed. We have no data to indicate that simply flushing all the time will hurt performance, and it has the potential to help stability. v2: Add post-sync workaround to initial GPU state upload to be extra cautious (suggested by Chad Versace). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2015-02-17 15:24:14 -08:00
Laura Ekstrand	92163482bd	main: Fixed _mesa_GetCompressedTexImage_sw to copy slices correctly. Previously array textures were not working with GetCompressedTextureImage, leading to failures in the test arb_direct_state_access/getcompressedtextureimage.c. Tested-by: Laura Ekstrand <laura@jlekstrand.net> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>	2015-02-17 13:45:48 -08:00
Ian Romanick	4470bf1f49	i965/vec4: Silence unused parameter warnings brw_vec4_copy_propagation.cpp:243:59: warning: unused parameter 'reg' [-Wunused-parameter] int arg, struct copy_entry *entry, int reg) ^ brw_vec4_generator.cpp:869:57: warning: unused parameter 'inst' [-Wunused-parameter] vec4_generator::generate_unpack_flags(vec4_instruction *inst, ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-17 12:29:58 -08:00
Ian Romanick	2524f9b80d	mesa/main: Silence unused parameter warning Just remove the _mesa_free_lighting_data function. The body has been empty since the shine table was moved into the tnl module (commit `ba1d921`). main/light.c:1216:46: warning: unused parameter 'ctx' [-Wunused-parameter] _mesa_free_lighting_data( struct gl_context *ctx ) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-17 12:29:58 -08:00
Ian Romanick	1424bbfb57	util/hash: Silence comparison between signed and unsigned integer warnings in tests delete_management.c:56:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < size; i++) { ^ delete_management.c:69:27: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = size - 100; i < size; i++) { ^ delete_management.c:79:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] assert(key_value(entry->key) >= size - 100 && ^ delete_management.c:79:70: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] assert(key_value(entry->key) >= size - 100 && ^ insert_many.c:56:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < size; i++) { ^ insert_many.c:62:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < size; i++) { ^ insert_many.c:67:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] assert(ht->entries == size); ^ random_entry.c:62:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] for (i = 0; i < size; i++) { ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-17 12:29:58 -08:00
Ian Romanick	3d8f9570cd	util/hash: Silence unused parameter warnings in tests delete_and_lookup.c:37:21: warning: unused parameter ‘key’ [-Wunused-parameter] badhash(const void *key) ^ delete_and_lookup.c:43:10: warning: unused parameter ‘argc’ [-Wunused-parameter] main(int argc, char **argv) ^ delete_and_lookup.c:43:23: warning: unused parameter ‘argv’ [-Wunused-parameter] main(int argc, char **argv) ^ collision.c:34:10: warning: unused parameter ‘argc’ [-Wunused-parameter] main(int argc, char **argv) ^ collision.c:34:23: warning: unused parameter ‘argv’ [-Wunused-parameter] main(int argc, char **argv) ^ destroy_callback.c:50:10: warning: unused parameter ‘argc’ [-Wunused-parameter] main(int argc, char **argv) ^ destroy_callback.c:50:23: warning: unused parameter ‘argv’ [-Wunused-parameter] main(int argc, char **argv) ^ insert_many.c:46:10: warning: unused parameter ‘argc’ [-Wunused-parameter] main(int argc, char **argv) ^ insert_many.c:46:23: warning: unused parameter ‘argv’ [-Wunused-parameter] main(int argc, char **argv) ^ insert_and_lookup.c:34:10: warning: unused parameter ‘argc’ [-Wunused-parameter] main(int argc, char **argv) ^ insert_and_lookup.c:34:23: warning: unused parameter ‘argv’ [-Wunused-parameter] main(int argc, char **argv) ^ null_destroy.c:32:10: warning: unused parameter ‘argc’ [-Wunused-parameter] main(int argc, char **argv) ^ null_destroy.c:32:23: warning: unused parameter ‘argv’ [-Wunused-parameter] main(int argc, char **argv) ^ random_entry.c:52:10: warning: unused parameter ‘argc’ [-Wunused-parameter] main(int argc, char **argv) ^ random_entry.c:52:23: warning: unused parameter ‘argv’ [-Wunused-parameter] main(int argc, char **argv) ^ remove_null.c:34:10: warning: unused parameter ‘argc’ [-Wunused-parameter] main(int argc, char **argv) ^ remove_null.c:34:23: warning: unused parameter ‘argv’ [-Wunused-parameter] main(int argc, char **argv) ^ replacement.c:34:10: warning: unused parameter ‘argc’ [-Wunused-parameter] main(int argc, char **argv) ^ replacement.c:34:23: warning: unused parameter ‘argv’ [-Wunused-parameter] main(int argc, char **argv) ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-17 12:29:58 -08:00
Ian Romanick	147afac80c	glcpp: Silence GCC warning glcpp/glcpp.c:124:1: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration] const static struct option ^ Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-17 12:29:58 -08:00
Marek Olšák	2ead74888a	radeonsi: fix a crash if a stencil ref state is set before a DSA state + minor indentation fixes Discovered by Axel Davy. This can't be reproduced with any app, because all state trackers set a DSA state first. Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2015-02-17 17:41:00 +01:00
Marek Olšák	7713d594e4	r600g,radeonsi: implement GL_AMD_pinned_memory v2: update release notes Reviewed-by: Christian König <christian.koenig@amd.com>	2015-02-17 17:31:48 +01:00
Marek Olšák	c688988b0d	winsys/radeon: test the userptr ioctl to see if it's present There is no other way to check for support. Reviewed-by: Christian König <christian.koenig@amd.com>	2015-02-17 17:31:48 +01:00
Marek Olšák	064847122a	winsys/radeon: allow unaligned size for user-memory buffers This is not required, but being user-friendly doesn't hurt. Reviewed-by: Christian König <christian.koenig@amd.com>	2015-02-17 17:31:48 +01:00
Marek Olšák	e8d727a2b6	winsys/radeon: allow mapping a user buffer OpenGL requires this. Reviewed-by: Christian König <christian.koenig@amd.com>	2015-02-17 17:31:48 +01:00
Marek Olšák	8b587ee701	gallium: add interface and state tracker support for GL_AMD_pinned_memory v2: add alignment restrictions to docs, fix indentation in headers Reviewed-by: Christian König <christian.koenig@amd.com>	2015-02-17 17:31:48 +01:00
Marek Olšák	11ebb03c26	mesa: implement GL_AMD_pinned_memory It's not possible to query the current buffer binding, because the extension doesn't define GL_..._BUFFER__BINDING_AMD. Drivers should check the target parameter of Drivers.BufferData. If it's equal to GL_EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD, the memory should be pinned. That's all there is to it. A piglit test is on the piglit mailing list. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-02-17 17:31:48 +01:00
Christian König	4fa61b1a23	winsys/radeon: add user pointer support Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-02-17 17:31:48 +01:00
Marek Olšák	e8625a29fe	mesa: fix AtomicBuffer typo in _mesa_DeleteBuffers Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-17 17:31:48 +01:00
Marek Olšák	218b15715e	radeonsi: initialize TC_L2_dirty to false after buffer allocation I forgot to do this, though "true" should have no effect on correctness. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-17 17:31:48 +01:00
Marek Olšák	a27b74819a	radeonsi: small fix in SPI state Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-17 17:31:48 +01:00
Marek Olšák	5f1cef76f9	r600g,radeonsi: use fences to implement PIPE_QUERY_GPU_FINISHED Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89014 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-17 17:31:48 +01:00
Marek Olšák	f1103f6a1e	r600g,radeonsi: demote TIMESTAMP_DISJOINT query to be a software query The query result is always constant. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-17 17:31:48 +01:00
Dave Airlie	59292b38eb	st/glsl_to_tgsi: fix whitespace Every time I open this file in emacs with show trailing whitespace, or git add from it, my screen flares with red. Just do a general cleanup; it makes working on fp64 support not as jarring. I'm not saying this is perfect, it's just better than before. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-02-17 14:49:19 +10:00
Ilia Mirkin	b53fbec01d	glsl/tests: add IMAGE type. This fixes a warning when running make check. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-02-17 11:26:06 +10:00
Chia-I Wu	faaf13f6bf	ilo: always set up BLEND_STATE on Gen8 There is now a DW0 that seems to be always referenced.	2015-02-17 04:59:33 +08:00
Chia-I Wu	6d4475d7bf	ilo: fix alpha test on Gen8 Should use GEN8_BLEND_DW0_ALPHA_TEST_ENABLE instead of GEN6_RT_DW1_ALPHA_TEST_ENABLE (and others).	2015-02-17 04:59:33 +08:00
Ben Widawsky	d9cd982d55	i965/simd8vs: Fix SIMD8 atomics The short version: we need to set bits in R0.7 which provide a mask to be used for PS kill samples/pixels. Since the VS has no such concept, we just need to set all 1s. The longer version... Execution for SIMD8 atomics is defined as follows: SIMD8: The low 8 bits of the execution mask are ANDed with 8 bits of the Pixel/Sample Mask from the message header. For the typed messages, the Slot Group in the message descriptor selects either the low or high 8 bits. For the untyped messages, the low 8 bits are always selected. The resulting mask is used to determine which slots are read into the destination GRF register (for read), or which slots are written to the surface (for write). If the header is not present, only the low 8 bits of the execution mask are used. The message header for untyped messages is defined in R0.7: "This field contains the 16-bit pixel/sample mask to be used for SIMD16 and SIMD8 messages. All 16 bits are used for SIMD16 messages. For typed SIMD8 messages, Slot Group selects which 8 bits of this field are used. For untyped SIMD8 messages, the low 8 bits of this field are used." Furthermore, "The message header for the untyped messages only needs to be delivered for pixel shader threads, where the execution mask may indicate pixels/samples that are enabled only due to derivative (LOD) calculations, but the corresponding slot on the surface must not be accessed." We're not using a pixel shader here, but AFAICT, this mask is used for all stages. This leaves two options: remove the header, or make the VS code emit the correct thing for the header. I believe one of the goals of using SIMD8 VS was to get as much code reuse as possible, and so I chose the latter. Since the VS has no such thing as kill instructions, the mask is derived simply as all 1's. v2: Add a comment to the code (stolen from Curro on the mailing list) Change the control flow style (Curro + Jason) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87258 Cc: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-16 12:22:44 -08:00
Brian Paul	9ac3700146	mesa: move assertion after declarations in texstore.c To fix MSVC build.	2015-02-16 08:39:25 -07:00
Brian Paul	4d2cee4d5e	mesa: silence uninitialized var warning in get_tex_rgba_uncompressed() Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-16 08:33:28 -07:00
Neil Roberts	bb77745681	meta: Fix saving the results of the current occlusion query When restoring the current state in _mesa_meta_end it was previously trying to copy the on-going sample count of the current occlusion query into the new query after restarting it so that the driver will continue adding to the previous value. This wouldn't work for two reasons. Firstly, the query might not be ready yet so the Result member will usually be zero. Secondly the saved query is stored as a pointer to the query object, not a copy of the struct, so it is actually restarting the exact same object. Copying the result value is just copying between identical addresses with no effect. The call to _mesa_BeginQuery will have always reset it back to zero. This patch fixes it by making it actually wait for the query object to be ready before grabbing the previous result. The downside of doing this is that it could introduce a stall but I think this situation is unlikely so it might not matter too much. A better solution might be to introduce a real suspend/resume mechanism to the driver interface. This could be implemented in the i965 driver by saving the depth count multiple times like it does in the i945 driver. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88248 Reviewed-by: Carl Worth <cworth@cworth.org> Cc: "10.5" <mesa-stable@lists.freedesktop.org>	2015-02-16 12:09:17 +00:00
Francisco Jerez	946e29847b	i965/vec4: Override destination register writemask in sampler message send. This line was removed by accident in commit `16b9112574` causing a regression in the ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_vert Khronos conformance test. It's necessary because the swizzle_result() code below expects all four components of the vector to be valid. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89094 Tested-by: Lu Hua <huax.lu@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-16 13:51:08 +02:00
Iago Toral Quiroga	0a811e1d1e	i965: Fix a crash in the texture gradient lowering pass with cube samplers We need to swizzle the rhs to match the number of components in the writemask, otherwise we'll hit an assertion in ir_assignment. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-16 10:53:48 +01:00
Iago Toral Quiroga	ba426522dd	mesa: Fix element count for byte-swaps in texstore, readpix and texgetimage Some old format conversion code in pack.c implemented byte-swapping like this: GLint comps = _mesa_components_in_format(dstFormat); GLint swapSize = _mesa_sizeof_packed_type(dstType); if (swapSize == 2) _mesa_swap2((GLushort *) dstAddr, n * comps); else if (swapSize == 4) _mesa_swap4((GLuint *) dstAddr, n * comps); where n is the pixel count. But this is incorrect for packed formats, where _mesa_sizeof_packed_type is already returning the size of a pixel instead of the size of a single component, so multiplying this by the number of components in the format results in a larger element count for _mesa_swap than we want. Unfortunately, we followed the same implementation for byte-swapping in the rewrite of the format conversion code for texstore, readpixels and texgetimage. This patch computes the correct element counts for _mesa_swap calls by computing the bytes per pixel in the image and dividing that by the swap size to obtain the number of swaps required per pixel. Then multiplies that by the number of pixels in the image to obtain the swap count that we need to use. Also, when handling byte-swapping in texstore_rgba, we were ignoring the image's depth. This patch fixes this too. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org>	2015-02-16 10:51:18 +01:00
Iago Toral Quiroga	4b249d2eed	mesa: Handle transferOps in texstore_rgba In the recent rewrite of the format conversion code we did not handle this. This patch adds the missing support. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89068 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org>	2015-02-16 10:49:41 +01:00
Matt Turner	a2299bfbbd	i965/fs: Handle U/UW-type immediates in the generator.	2015-02-15 14:29:08 -08:00
Matt Turner	7a83f7d481	i965/fs: Handle W/UW-type immediates in dump_instructions().	2015-02-15 14:29:08 -08:00
Matt Turner	74ef90acd7	i965: Let dump_instructions() work before calculate_cfg(). Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-02-15 12:24:11 -08:00
Matt Turner	fa124a337c	i965/fs: Call calculate_cfg() before optimize(). The CFG is fundamental to the FS IR, not merely a piece of optimization. Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-02-15 12:24:11 -08:00
Matt Turner	eb47d0efd3	i965: Optimize multiplication by -1 into a negated MOV. instructions in affected programs: 968 -> 942 (-2.69%) helped: 4 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-15 12:24:10 -08:00
Matt Turner	e8a6f2ad65	i965: Add an is_negative_one() method. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-15 12:24:10 -08:00
Matt Turner	72b9f8db2a	i965/vec4/vp: Use vec4_visitor::CMP. ... instead of emit(BRW_OPCODE_CMP, ...). In commit `6b3a301f` I changed vec4_visitor::CMP to set the destination's type to that of src0. In the following commit (`2335153f`) I removed an apparently now unnecessary work around for Gen8 that did the same thing. But there was a single place that emitted a CMP instruction without using the vec4_visitor::CMP function. Use it there. And change dst_null_d to dst_null_f for good measure, since ARB vp doesn't have integers. Cc: "10.5" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89032 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-15 12:24:10 -08:00
Chia-I Wu	69b1693ef3	ilo: fix some state pointer commands on Gen8 3DSTATE_CC_STATE_POINTERS seems to be ignored when bit 0 of DW1 is not set. Follow i965 and set the bit for 3DSTATE_CC_STATE_POINTERS and 3DSTATE_BLEND_STATE_POINTERS. Add gen checks for all state pointer commands.	2015-02-15 13:32:41 +08:00
Ilia Mirkin	854eb06bee	nvc0: allow holes in xfb target lists Tested with a modified xfb-streams test which outputs to streams 0, 2, and 3. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>	2015-02-14 17:15:54 -05:00
Ilia Mirkin	80d373ed5b	st/mesa: treat resource-less xfb buffers as if they weren't there If a transform feedback buffer's size is 0, st_bufferobj_data doesn't end up creating a buffer for it. There's no point in trying to write to such a buffer, so just pretend as if it's not really there. This fixes arb_gpu_shader5-xfb-streams-without-invocations on nvc0. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>	2015-02-14 17:15:54 -05:00
Ilia Mirkin	68e4f3f572	nvc0: bail out of 2d blits with non-A8_UNORM alpha formats This fixes the teximage-colors uploads with GL_ALPHA format and non-GL_UNSIGNED_BYTE type. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>	2015-02-14 17:15:54 -05:00
Jason Ekstrand	3c57a59527	i965/nir: Don't support gl_FrontFacing as an input variable Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-14 13:47:16 -08:00
Jason Ekstrand	dd110cdfd8	nir: Make gl_FrontFacing a system_value GLSL IR labels gl_FrontFacing as an input variable and not a system value. This commit makes NIR silently translate gl_FrontFacing to a system value so that it properly gets translated into a load_system_value intrinsic. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-14 13:47:16 -08:00
Jason Ekstrand	785b22caee	i965/nir: Add support for nir_intrinsic_load_front_face Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-14 13:47:16 -08:00
Jason Ekstrand	929f43851e	nir/lower_phis_to_scalar: Fix some logic in is_phi_scalarizable Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-14 13:46:59 -08:00
Shawn Starr	7df256add2	clover: Use Legacy PassManager for LLVM trunk (3.7) Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Shawn Starr <shawn.starr@rogers.com>	2015-02-14 01:31:57 +00:00
Chia-I Wu	8323796840	ilo: fix JIP/UIP on Gen8 UIP is in DW2 and JIP is in DW3 on Gen8. Also, the units are in bytes.	2015-02-14 06:52:36 +08:00
Chia-I Wu	c62507f42c	ilo: do not set GEN6_THREADCTRL_SWITCH It is not needed on Gen6+, and it appears to be broken on Gen8.	2015-02-14 06:52:36 +08:00
Chia-I Wu	7504b357d4	ilo: correct ISA UIP/JIP decoding for Gen8 JIP is int32_t and UIP is in DW2 on Gen8.	2015-02-14 06:52:36 +08:00
Chia-I Wu	f8126fed95	ilo: prepare for 64-bit immediates decoding Replace imm32 by imm64. Add more ways (UD, D, etc) to access the immediate.	2015-02-14 06:52:36 +08:00
Chia-I Wu	9ed376a76c	ilo: cleanup ISA DW1 decoding Decode the higher and lower 16 bits separately.	2015-02-14 06:52:36 +08:00
Chia-I Wu	db362983d1	ilo: cleanup ISA DW0 decoding Add disasm_inst_decode_dw0_opcode_gen6() to decode the opcode. Simplify branch_ctrl/acc_wr_ctrl decoding.	2015-02-14 06:52:36 +08:00
Chia-I Wu	5fc0dd8953	ilo: update some outdated gen checks Update gen checks for 3DSTATE_POLY_STIPPLE_OFFSET, 3DSTATE_POLY_STIPPLE_PATTERN, 3DSTATE_LINE_STIPPLE, and 3DSTATE_AA_LINE_PARAMETERS.	2015-02-14 06:52:36 +08:00
Chia-I Wu	8b9446dbeb	ilo: fix rectlist length on Gen8 5 PIPE_CONTROLs, 2 3DSTATE_WM_HZ_OP, and depth buffer setup require 65 DWords.	2015-02-14 06:52:36 +08:00
Chia-I Wu	baba8b2745	ilo: fix 3DSTATE_VF_TOPOLOGY The pipe primitive type was wrongly translated twice.	2015-02-14 06:52:36 +08:00
Jose Fonseca	c944b91190	os,llvmpipe: Set rasterizer thread names on Linux. To help identify llvmpipe rasterizer threads -- especially when there can be so many. We can eventually generalize this to other OSes, but for that we must restrict the function to be called from the current thread. See also http://stackoverflow.com/a/7989973 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-02-13 19:42:21 +00:00
Jose Fonseca	b09f25428f	util/u_atomic: Don't test p_atomic_add with booleans. Add another class of tests. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=89112 I failed to spot this in my previous change, because bool was a typedef for char on the system I tested. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-02-13 19:39:27 +00:00
Tapani Pälli	e333035c47	mesa: fix OES_texture_float texture render target behavior Current implementation allowed usage of unsized type texture GL_FLOAT and GL_HALF_FLOAT as a render target as this was 'expected behavior' by WEBGL_oes_texture_float and is also allowed by the oes-texture-float WebGL test. However this broke some ES3 conformance tests that do not accept such behavior. Patch sets such an fbo incomplete as expected by the ES3 conformance tests. Textures with sized types like RGBA32F will still continue to work as render targets. v2: code style cleanups (Ian Romanick, Matt Turner) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88905 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.5" <mesa-stable@lists.freedesktop.org>	2015-02-13 07:51:13 +02:00
Eric Anholt	3f1e1287fd	vc4: Make SF be a flag on the QIR instructions. Right now the places that used to emit a mov.sf just put the SF on the previous instruction when it generated the source of the SF value. Even without optimization to push the sf up further (and thus potentially kill more MOVs), this gets us: total uniforms in shared programs: 13455 -> 13457 (0.01%) uniforms in affected programs: 3 -> 5 (66.67%) total instructions in shared programs: 40296 -> 40198 (-0.24%) instructions in affected programs: 12595 -> 12497 (-0.78%)	2015-02-12 16:33:16 -08:00
Eric Anholt	4413861dd8	r200: Drop unused variable. Quiets compiler warning since `e7f2f2dea5`. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-02-12 16:33:16 -08:00
Eric Anholt	55de910f90	i965: Quiet another compiler warning about uninitialized values. The compiler can't tell that we're always going to hit the first if block on the first time through the loop. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-12 16:33:16 -08:00
Eric Anholt	f65e26478b	i965: Move some asserts to unreachable. If execution was supposed to be supported in this case, we'd run into trouble from completely uninitialized sat_imm values. v2: Drop the '!' before the string. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-12 16:32:10 -08:00
Eric Anholt	6489cb1ae6	i965: Shut up a compiler warning about uninitialized var. We always pass this argument, even if it won't be used by the particular texture op. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-12 16:29:53 -08:00
Carl Worth	55a57834bf	Revert use of Mesa IR optimizer for ARB_fragment_programs Commit `f82f2fb3dc` added use of the Mesa IR optimizer for both ARB_fragment_program and ARB_vertex_program, but only justified the vertex-program portions with measured performance improvements. Meanwhile, the optimizer was seen to generate hundreds of unused immediates without discarding them, causing failures. Discard the use of the optimizer for now to fix the regression. (In the future, we anticipate things moving from Mesa IR to NIR for better optimization anyway.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82477 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> CC: "10.3 10.4 10.5" <mesa-stable@lists.freedesktop.org>	2015-02-12 13:33:12 -08:00
Jose Fonseca	1ba9f9e62c	util/u_atomic: Use lower-case variables in _Interlocked* helpers.	2015-02-12 19:32:21 +00:00
Jose Fonseca	531d47baa8	util/u_atomic: Add _InterlockedExchangeAdd8/16 for older MSVC. We need to build certain parts of Mesa (namely gallium, llvmpipe, and therefore util) with Windows SDK 7.0.7600, which includes MSVC 2008. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-02-12 19:32:21 +00:00
Jose Fonseca	d2438f5920	util/u_atomic: Test p_atomic_add() for 8bit integers. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-02-12 19:32:21 +00:00
Ilia Mirkin	b1e70f2423	docs: add ARB_draw_indirect to ES 3.1 list Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-12 11:12:29 -05:00
Axel Davy	63986f9580	egl: Soften several HAVE_DRM_PLATFORM to HAVE_LIBDRM To fix the build when libdrm is not found, commit `a594cec7e3` put several parts of the egl code under #ifdef HAVE_DRM_PLATFORM. HAVE_DRM_PLATFORM means the egl drm platform is being built. What should have been used instead is HAVE_LIBDRM. At a few locations, the HAVE_DRM_PLATFORM checks introduced have already been replaced by HAVE_LIBDRM; this patch replaces the remaining occurrences. This patch makes, for example, EGL_EXT_image_dma_buf_import be advertised by egl under x11 when the drm egl platform is not built, whereas previously it required the drm egl platform to be built. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-02-12 13:20:22 +00:00
Emil Velikov	c39dbfdd0f	auxiliary/vl: bring back the VL code for the dri targets With commit c642e87d9f4 (auxiliary/vl: rework the build of the VL code) we split out the VL code into a separate static library that was meant to be used by the VL targets alone - va, vdpau, xvmc. The commit failed to consider the way we handle vdpau-gl interop and broke it. Bring back the functionality by keeping the vl <> vl_stub separation as requested by Christian. v2: Update the omx target as well. Update mesa-stable email address. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86837 Cc: "10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Andy Furniss <adf.lists@gmail.com>	2015-02-12 13:19:26 +00:00
Emil Velikov	153539bd9d	configure: rework wayland_scanner handling(fix make distcheck) Currently having the wayland-scanner is optional, which causes problems when autotools parses through the makefiles, and tries to generate all the BUILT_SOURCES. As the config option --with-egl-platform=wayland is not the default, we won't end up setting the WAYLAND_SCANNER variable, which in turn will cause some files to not get generated. There has been a wayland-scanner package as of wayland 1.2 which provides a variable for the scanner binary, so let's use that one and fall back to manually searching via AC_PATH_PROG when needed. Cc: "10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-12 13:19:20 +00:00
Emil Velikov	72e602905d	nir: add missing header to the sources list Cc: "10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-12 13:19:13 +00:00
Emil Velikov	556fc4b84d	nir: resolve nir.h dependency list (fix make distcheck) Use nir/nir_opcodes.h as is (w/o the absolute path), as it is the target name used to generate the actual file. Otherwise the target is missing, the file won't get generated and the build will fail. Cc: "10.5" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-12 13:18:52 +00:00
Martin Peres	9f7efa78a8	docs: update GL3.txt to state my current work on the dsa extension Signed-off-by: Martin Peres <martin.peres@linux.intel.com>	2015-02-12 11:24:37 +02:00
Ben Widawsky	e93566a15c	i965/vs/skl: Use vec4 datatypes for message header We're using a SIMD4x2 sampler message, which has execsize 4, and so the register width must be <= 4. Use <4,4,1> regioning instead of <8,8,1> regioning to access the same data but avoid tripping the assert. Fixes the following piglit tests: spec/glsl-1.20/compiler/structure-and-array-operations/array-selection.vert spec/glsl-es-3.00/compiler/uniform_block/interface-name-basic.vert spec/glsl-es-3.00/compiler/uniform_block/interface-name-field-clashes-with-struct.vert spec/glsl-es-3.00/compiler/uniform_block/interface-name-field-clashes-with-function.vert spec/glsl-es-3.00/compiler/uniform_block/interface-name-array.vert glslparsertest/glsl2/condition-07.vert spec/glsl-es-3.00/compiler/uniform_block/interface-name-field-clashes-with-variable.vert v2: Better commit message courtesy of Ken. I had a discussion with Ken, and we both question how we end up with a mov and execsize 4. For now though, this fixes the piglit tests, so we can worry about it later. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-11 21:41:58 -08:00
Chia-I Wu	cba6a4a129	ilo: update screen init for Gen8 This is very preliminary and is only tested with glxgears. All information about Gen8 is derived from i965 and beignet.	2015-02-12 08:05:07 +08:00
Chia-I Wu	cb1cdecf64	ilo: update outdated render command emissions for Gen8	2015-02-12 07:56:13 +08:00
Chia-I Wu	9ab4fc4e63	ilo: update rectlist command emission for Gen8	2015-02-12 07:56:13 +08:00
Chia-I Wu	4caf8d9761	ilo: update draw command emission for Gen8	2015-02-12 07:56:13 +08:00
Chia-I Wu	d8927ab02f	ilo: update surface state emission for Gen8	2015-02-12 07:56:13 +08:00
Chia-I Wu	7832a3013b	ilo: update dynamic state emission for Gen8	2015-02-12 07:56:13 +08:00
Chia-I Wu	8682cbab3e	ilo: update outdated gen assertions for Gen8	2015-02-12 07:56:12 +08:00
Chia-I Wu	c173a5288f	ilo: add new WM related helpers for Gen8	2015-02-12 07:56:12 +08:00
Chia-I Wu	8c2cbc8955	ilo: update VS related functions for Gen8	2015-02-12 07:56:12 +08:00
Chia-I Wu	0e3381154c	ilo: update VF related functions for Gen8	2015-02-12 07:56:12 +08:00
Chia-I Wu	a57805cb75	ilo: update SAMPLER_STATE for Gen8	2015-02-12 07:56:12 +08:00
Chia-I Wu	7e7e45db65	ilo: update SAMPLER_BORDER_COLOR_STATE for Gen8	2015-02-12 07:56:12 +08:00
Chia-I Wu	8976a190b2	ilo: update depth clear value for Gen8	2015-02-12 07:56:12 +08:00
Chia-I Wu	0b7fdce4f5	ilo: update ilo_zs_surface for Gen8	2015-02-12 07:56:12 +08:00
Chia-I Wu	aa7109f059	ilo: update ilo_view_surface for Gen8	2015-02-12 07:56:12 +08:00
Chia-I Wu	7922982d4f	ilo: update texture layout for Gen8	2015-02-12 07:56:12 +08:00
Chia-I Wu	47dc2ae6e2	ilo: update ilo_blend_state and related functions for Gen8	2015-02-12 07:56:12 +08:00
Chia-I Wu	e8455128aa	ilo: update ilo_dsa_state and related functions for Gen8	2015-02-12 07:56:12 +08:00
Chia-I Wu	9aeee99e4d	ilo: update multisample related states for Gen8	2015-02-12 07:56:12 +08:00
Chia-I Wu	6366fbc1a8	ilo: update WM and PS related functions for Gen8	2015-02-12 07:56:11 +08:00
Chia-I Wu	584d3369b6	ilo: update SBE related functions for Gen8	2015-02-12 07:56:11 +08:00
Chia-I Wu	4cb592ec17	ilo: update SF related functions for Gen8	2015-02-12 07:56:11 +08:00
Chia-I Wu	05e2eb57cd	ilo: update CLIP related functions for Gen8	2015-02-12 07:56:11 +08:00
Chia-I Wu	9ab0165375	ilo: update SF_CLIP_VIEWPORT for Gen8	2015-02-12 07:56:11 +08:00
Chia-I Wu	b64aeebbcc	ilo: update streamout related functions for Gen8	2015-02-12 07:56:11 +08:00
Chia-I Wu	6f77bd3bdc	ilo: update 3DSTATE_{DS,HS,GS} for Gen8	2015-02-12 07:56:11 +08:00
Chia-I Wu	3be0504399	ilo: update 3DSTATE_CONSTANT_x for Gen8	2015-02-12 07:56:11 +08:00
Chia-I Wu	49306afe7b	ilo: update 3DSTATE_URB_x for Gen8	2015-02-12 07:56:11 +08:00
Chia-I Wu	d43ae05d76	ilo: update 3DSTATE_PUSH_CONSTANT_ALLOC_x for Gen8	2015-02-12 07:56:11 +08:00
Chia-I Wu	f43332ca2f	ilo: update render engine common helpers for Gen8	2015-02-12 07:56:11 +08:00
Chia-I Wu	8d9f69bef2	ilo: update BLT helpers for Gen8	2015-02-12 07:56:11 +08:00
Chia-I Wu	574f8d0229	ilo: update MI helpers for Gen8	2015-02-12 07:56:11 +08:00
Chia-I Wu	bfc8a72609	ilo: add functions for Gen8 relocs Extend ilo_builder_writer_reloc() for Gen8 memory addressing. Add new wrappers, ilo_builder_surface_reloc64() and ilo_builder_batch_reloc64().	2015-02-12 07:56:11 +08:00
Chia-I Wu	a7911620f6	ilo: update the toy compiler for Gen8 Based on what we know from the classic driver.	2015-02-12 07:56:11 +08:00
Chia-I Wu	0066c22c40	ilo: update genhw headers Accumulated changes for various renames and additions, including Gen8 definitions. Some of the dynamic state __SIZE no longer means the size of an element, but the size of an array of elements. The changes can be seen in ilo_render_dynamic.c.	2015-02-12 07:56:10 +08:00
Chia-I Wu	5933d84ad6	ilo: clean up ilo_gpe_init_dsa() Add dsa_get_stencil_enable_gen6(), dsa_get_depth_enable_gen6(), and dsa_get_alpha_enable_gen6() to be called from ilo_gpe_init_dsa().	2015-02-12 07:56:10 +08:00
Chia-I Wu	aa354b92d2	ilo: clean up ilo_gpe_init_blend() Make ilo_blend_state more space efficient and forward-looking.	2015-02-12 07:56:10 +08:00
Chia-I Wu	1d07055b50	ilo: clean up sample patterns Use signed int for sample positions and add helpers to access them. Call them patterns instead of positions.	2015-02-12 07:56:10 +08:00
Matt Turner	69ad5fd4ce	glsl: Optimize (f2i(trunc x)) into (f2i x). total instructions in shared programs: 5950326 -> 5949286 (-0.02%) instructions in affected programs: 88264 -> 87224 (-1.18%) helped: 692	2015-02-11 13:50:19 -08:00
Matt Turner	c262b2b582	glsl: Optimize round-half-up pattern. Hurts some Psychonauts shaders, but after the next patch (which this enables) they're fewer instructions than before this patch.	2015-02-11 13:50:19 -08:00
Matt Turner	a5455ab1ca	glsl: Add trunc() to ir_builder.	2015-02-11 13:50:19 -08:00
Matt Turner	d91390634f	i965: Add LINTERP/CINTERP to can_do_cmod(). LINTERP is implemented as a PLN instruction or a LINE+MAC. PLN and MAC can do conditional mod. CINTERP is just a MOV. total instructions in shared programs: 5952103 -> 5950284 (-0.03%) instructions in affected programs: 324573 -> 322754 (-0.56%) helped: 1819 We lose the SIMD16 in one Unigine Heaven shader which appears six times in shader-db.	2015-02-11 13:50:19 -08:00
Matt Turner	245c7848fc	program: Remove _mesa_nop_vertex_program/_mesa_nop_fragment_program. Dead since commit `284ce20901` Author: Eric Anholt <eric@anholt.net> Date: Fri Aug 20 10:52:14 2010 -0700 Remove remnants of the old glsl compiler. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-11 13:50:19 -08:00
Matt Turner	4c42e1116b	nir: Recognize open-coded fmin/fmax. And unfortunately other shaders do the same thing but with >=/<= which we can't apply this optimization to because of NaNs. instructions in affected programs: 23309 -> 22938 (-1.59%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-11 13:50:19 -08:00
Eric Anholt	56e21647e2	nir: Add algebraic opt for int comparisons with identical operands. No change on shader-db on i965. v2: Reword the comment due to feedback from Erik Faye-Lund Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v1) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> (v1)	2015-02-11 11:52:38 -08:00
Eric Anholt	2919bdf466	nir: Fix load_const comparisons for CSE. We want the size of a float per component, not the size of a whole vec4. NIR instructions on i965: total instructions in shared programs: 1261937 -> 1261929 (-0.00%) instructions in affected programs: 114 -> 106 (-7.02%) Looking at one of these examples (tesseract), it's from vec4 load_consts for a MRT solid fill, which do get CSEed now that we don't memcmp off the end of the const value and into the SSA def. For the 1-component loads that are common in i965, we were only memcmping off into the rest of the usually zero-filled const_value. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-11 11:52:38 -08:00
Matt Turner	09d6ea9ae3	i965/fs: Remove conditional mod when optimizing a SEL into a MOV. Missed in commit `ca675b73`, but got right in the companion commit `3c28b2c0`.	2015-02-11 10:26:49 -08:00
Jeremy Huddleston Sequoia	e68b67b53f	darwin: build fix xfont.c:237:14: error: implicit declaration of function 'GetGLXDRIDrawable' is invalid in C99 [-Werror,-Wimplicit-function-declaration] glxdraw = GetGLXDRIDrawable(CC->currentDpy, CC->currentDrawable); ^ Fixes regression from `291be28476` Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2015-02-10 22:22:33 -08:00
Jeremy Huddleston Sequoia	1c67a5687a	darwin: build fix ../../../src/mesa/main/compiler.h:47:10: fatal error: 'util/macros.h' file not found Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2015-02-10 20:35:10 -08:00
Matt Turner	ea0f0eb6c0	glsl: Optimize 1/exp(x) into exp(-x). Lots of shaders divide by exp2(...) which we turn into a multiplication by the reciprocal. We can avoid the reciprocal by simply negating exp2's argument. total instructions in shared programs: 5947154 -> 5946695 (-0.01%) instructions in affected programs: 118661 -> 118202 (-0.39%) helped: 380 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-10 17:48:44 -08:00
Matt Turner	a9065cef48	nir: Remove casts from void*. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-10 17:48:42 -08:00
Matt Turner	bb1e007157	nir: Replace assert(0) with unreachable(). Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-10 17:48:31 -08:00
Matt Turner	942b56ad05	nir: Remove unused has_indirect variable. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-10 17:48:16 -08:00
Matt Turner	fff0b2eab5	i965/vec4: Emit MADs from (x + abs(y * z)). Same as commit `3654b6d4` to the fs backend. total instructions in shared programs: 5945788 -> 5945787 (-0.00%) instructions in affected programs: 36 -> 35 (-2.78%) helped: 1 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-10 17:48:15 -08:00
Matt Turner	3d581f9996	i965/vec4: Emit MADs from (x + -(y * z)). Same as commit `c4fab711` to the fs backend. total instructions in shared programs: 5945998 -> 5945788 (-0.00%) instructions in affected programs: 74665 -> 74455 (-0.28%) helped: 399 HURT: 180 It hurts some programs because we make no attempts in the vec4 backend to avoid MADs if they have constant (or vector uniform) arguments. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-10 17:47:37 -08:00
Neil Roberts	5b29b2922a	i965/skl: Implement WaDisable1DDepthStencil Skylake+ doesn't support setting a depth buffer to a 1D surface but it does allow pretending it's a 2D texture with a height of 1 instead. This fixes the GL_DEPTH_COMPONENT_* tests of the copyteximage piglit test (and also seems to avoid a subsequent GPU hang). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89037 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-10 18:00:21 +00:00
Francisco Jerez	1b224290fb	i965/gen7-8: Implement glMemoryBarrier(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-10 19:09:25 +02:00
Francisco Jerez	46b03d5400	i965: Generalize the update_null_renderbuffer_surface vtbl hook to non-renderbuffers. Null surfaces are going to be useful to have something to point unbound image units to, as the ARB_shader_image_load_store extension requires us to behave deterministically in cases where some shader tries to access an unbound image unit: Invalid stores and atomics are supposed to be discarded and invalid loads are supposed to return zero, which is precisely what the null surface does. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-10 19:09:25 +02:00
Francisco Jerez	342b7ce7d4	i965: Allocate binding table space for shader images. v2: Bump the number of supported image uniforms to 32 (Ken). Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-10 19:09:25 +02:00
Francisco Jerez	36a17f0f99	i965: Don't tile 1D miptrees. It doesn't really improve locality of texture fetches; quite the opposite, it's a waste of memory bandwidth and space due to tile alignment. v2: Check mt->logical_height0 instead of mt->target (Ken). Add short comment explaining why they shouldn't be tiled. Reviewed-by: Neil Roberts <neil@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-10 19:09:25 +02:00
Francisco Jerez	b40bcd24e0	i965/vec4: Don't set any dependency control bits for F32TO16 on Gen8. It's expanded to several instructions. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 19:09:25 +02:00
Francisco Jerez	aef83957e1	i965: Handle negated unsigned immediate values in constant propagation. Negation of UD/UW sources behaves the same as for D/W sources, taking the two's complement of the source, except for bitwise logical operations on Gen8 and up which take the one's complement. Fixes crash in a GLSL shader with subtraction of two unsigned values. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 19:09:25 +02:00
Francisco Jerez	64fde7b31c	i965/vec4: Take into account non-zero reg_offset during register allocation. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 19:09:25 +02:00
Francisco Jerez	78e9043475	i965/vec4: Add register classes up to MAX_VGRF_SIZE. In preparation for some send from GRF instructions that will require larger payloads. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 19:09:25 +02:00
Francisco Jerez	530445330b	i965/vec4: Init mlen for several send from GRF instructions. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 19:09:25 +02:00
Francisco Jerez	5f878d1b47	i965/vec4: Don't infer MRF dependencies for send from GRF instructions. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 19:09:24 +02:00
Francisco Jerez	de666fc102	i965/vec4: Fix the scheduler to take into account reads and writes of multiple registers. v2: Avoid nested ternary operators in vec4_instruction::regs_read(). (Matt) Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 19:09:24 +02:00
Francisco Jerez	8ad486077e	i965/vec4: Make vec4_visitor::implied_mrf_writes() return zero for sends from GRF. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 19:09:24 +02:00
Francisco Jerez	16b9112574	i965/vec4: Pass dst register to the vec4_instruction constructor. So regs_written gets initialized with a sensible value. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 19:09:24 +02:00
Francisco Jerez	0c902a8f78	i965/vec4: Initialize vec4_instruction::predicate and ::predicate_inverse. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 19:09:24 +02:00
Francisco Jerez	388b136e67	i965/vec4: Implement equals() method for dst_reg too. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 19:09:24 +02:00
Francisco Jerez	3df2cb2f86	i965/fs: Fix fs_inst::regs_written calculation for instructions with scalar dst. Scalar registers are required to have zero stride, fix the regs_written calculation not to assume that the instruction writes zero registers in that case. v2: Rename CEILING() to DIV_ROUND_UP(). (Matt, Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 16:05:51 +02:00
Francisco Jerez	f2668f9f21	i965/fs: Fix stack allocation of fs_inst and stop stealing src array provided on construction. Using 'ralloc*(this, ...)' is wrong if the object has automatic storage or was allocated through any other means. Use normal dynamic memory instead. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 16:05:51 +02:00
Francisco Jerez	c472793a2a	i965/fs: Remove duplicate include of brw_shader.h The second one was inside an extern "C" block, luckily it was being discarded by the preprocessor. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 16:05:51 +02:00
Francisco Jerez	dfe957c02b	i965: Move up fs_inst::flag_subreg to backend_instruction. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 16:05:51 +02:00
Francisco Jerez	639696aa05	i965: Move up fs_inst::regs_written to backend_instruction. It will also be useful in the VEC4 back-end. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 16:05:51 +02:00
Francisco Jerez	4ed52e8bc4	i965/vec4: Remove dependency of vec4_instruction on the visitor class. The only reason why you need a vec4_visitor to construct a vec4_instruction is to initialize vec4_instruction::ir and ::annotation. Instead set them from vec4_visitor::emit() just like fs_visitor does. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 16:05:50 +02:00
Francisco Jerez	a3ee6c7d19	i965/fs: Remove dependency of fs_inst on the visitor class. The fs_visitor argument of fs_inst::regs_read() wasn't used at all. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 16:05:50 +02:00
Francisco Jerez	bfbb0e84e1	i965: Move IR object definitions to separate header files. One should be able to manipulate i965 IR without pulling the whole FS/VEC4 visitor classes -- Optimization passes and other transformations would ideally be visitor-agnostic. Among other issues this avoids a circular dependency between the header file where such visitor-agnostic code will be defined and the main FS/VEC4 header where both IR (layer below) and visitor (layer above) happen to be defined. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 16:05:50 +02:00
Francisco Jerez	447879eb88	i965: Factor out virtual GRF allocation to a separate object. Right now virtual GRF book-keeping and allocation is performed in each visitor class separately (among other hundred different things), leading to duplicated logic in each visitor and preventing layering as it forces any code that manipulates i965 IR and needs to allocate virtual registers to depend on the specific visitor that happens to be used to translate from GLSL IR. v2: Use realloc()/free() to allocate VGRF book-keeping arrays (Connor). Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 16:05:47 +02:00
Francisco Jerez	e6146e6f14	glsl: Forbid calling the constructor of any opaque type. The spec doesn't define any opaque type constructors. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-10 15:49:43 +02:00
Francisco Jerez	c4111dfa0a	glsl: Return correct number of coordinate components for cubemap array images. Cubemap array images are unlike cubemap array samplers in that they don't need an additional coordinate to index individual cubemaps in the array, instead they behave like a 2D array of 6n layers, with n the number of cubemaps in the array. Take this exception into account. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-10 15:49:43 +02:00
Francisco Jerez	fcc2fd53df	mesa: Bump MAX_IMAGE_UNIFORMS to 32. So the i965 driver can expose 32 image uniforms per shader stage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-10 15:37:56 +02:00
Francisco Jerez	818585b9f9	mesa: Rename the CEILING() macro to DIV_ROUND_UP(). Some people have complained that code using the CEILING() macro is difficult to understand because it's not immediately obvious what it is supposed to do until you go and look up its definition. Use a more descriptive name that matches the similar utility macro in the Linux kernel. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-10 15:37:47 +02:00
Tiziano Bacocco	1e02f2badf	nv50,nvc0: Mark PIPE_QUERY_TIMESTAMP_DISJOINT as ready immediately Without this, when an application issues that query, it would try to wait for the result from the gpu, and since no query has actually been issued, it will wait forever. Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-10 08:02:17 -05:00
Roy Spliet	09ee907266	nv50/ir: Fold IMM into MAD Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be done post-RA because it requires that SDST == SSRC2. V2: improve readability and add comments to clarify decisions V3: Remove redundant code... compiler already attempts to put the IMM in SSRC1 Signed-off-by: Roy Spliet <rspliet@eclipso.eu> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-10 08:02:13 -05:00
Roy Spliet	3dc39d0bca	nv50/ir: Add emit support for MAD IMM format But don't enable generation of it in the opProperties, because we can't guarantee the SDST==SRC2 constraint until after register assignment. We'll add a post-RA folding pass to utilise this. Signed-off-by: Roy Spliet <rspliet@eclipso.eu> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-10 08:02:02 -05:00
Roy Spliet	fb63df2215	nv50/ir: Add support for MAD 4-byte opcode Add emission rules for negative and saturate flags for MAD 4-byte opcodes, and get rid of some of the constraints. Obviously tested with a wide variety of shaders. V2: Document MAD as supported short form V3: Split up IMM from short-form modifiers Signed-off-by: Roy Spliet <rspliet@eclipso.eu> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-10 08:01:46 -05:00
Ilia Mirkin	354206f407	nv50/ir: change the way float face is returned The old way made it impossible for the optimizer to reason about what was going on. The new way is the same number of instructions (the neg gets folded into the cvt) but enables the optimizer to be cleverer if comparing to a constant (most common case). [The optimizer is presently not sufficiently clever to work this out, but it could relatively easily be made to be. The old way would have required significant complexity to work out.] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-10 08:01:46 -05:00
Kenneth Graunke	480ee1f0b4	nir: Mark nir_print_instr's instr pointer as const. Printing instructions doesn't modify them, so we can mark the parameter const. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-10 03:37:55 -08:00
Kenneth Graunke	08a06b6b89	i965: Fix integer border color on Haswell. +82 Piglits - 100% of border color tests now pass on Haswell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Cc: mesa-stable@lists.freedesktop.org	2015-02-09 13:18:58 -08:00
Kenneth Graunke	e1e73443c5	i965: Use a gl_color_union for sampler border color. This should have no effect, but will make it easier to implement other bug fixes. v2: Eliminate "unsigned one" local; just use the value where necessary. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org	2015-02-09 13:18:58 -08:00
Kenneth Graunke	8cb18760cc	i965: Override swizzles for integer luminance formats. The hardware's integer luminance formats are completely unusable; currently we fall back to RGBA. This means we need to override the texture swizzle to obtain the XXX1 values expected for luminance formats. Fixes spec/EXT_texture_integer/texwrap formats bordercolor [swizzled] on Broadwell - 100% of border color tests now pass on Broadwell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Cc: mesa-stable@lists.freedesktop.org	2015-02-09 13:18:54 -08:00
Carl Worth	b16de0b713	util/u_atomic: Add new macro p_atomic_add This provides for atomic addition, which will be used by an upcoming shader-cache patch. A simple test is added to "make check" as well. Note: The various O/S functions differ on whether they return the original value or the value after the addition, so I did not provide an add_return() macro which would be sensitive to that difference. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Aaron Watry <awatry@gmail.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-02-09 10:47:44 -08:00
Jason Ekstrand	345e8cc849	util/hash_table: Try to hit a double-insertion bug in the collision test Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-07 17:01:05 -08:00
Jason Ekstrand	623c3a858d	util/set: Do a full search when adding new items Previously, the set_insert function would bail early if it found a deleted slot that it could re-use. However, this is a problem if the key being inserted is already in the set but further down the list. If this happens, the element ends up getting inserted in the set twice. This commit makes it so that we walk over all of the possible entries for the given key and then, if we don't find the key, place it in the available free entry we found. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-07 17:01:05 -08:00
Jason Ekstrand	c9287e797b	util/hash_table: Do a full search when adding new items Previously, the hash_table_insert function would bail early if it found a deleted slot that it could re-use. However, this is a problem if the key being inserted is already in the hash table but further down the list. If this happens, the element ends up getting inserted in the hash table twice. This commit makes it so that we walk over all of the possible entries for the given key and then, if we don't find the key, place it in the available free entry we found. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-02-07 17:01:05 -08:00
James Legg	1581e12aba	mesa: Make renderbuffer FBO attachments not layered For framebuffer completeness checks, consider renderbuffers as not layered. Previously, they would have counted as layered if a layered texture had previously been bound to the same attachment point. This could cause framebuffer completeness checks to incorrectly fail with GL_FRAMEBUFFER_INCOMPLETE_LAYER_TARGETS, even if no layered attachments were present. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89026	2015-02-08 13:54:15 +13:00
Emil Velikov	49299ef6fa	Post-branch version bump to 10.6.0-devel, add release notes template Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-02-07 19:12:20 +00:00
Brian Paul	d1e21325cf	gallium/hud: also try R8_UNORM format for font texture Convert the code to try formats from an array rather than a bunch of if/else cases. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-02-07 11:03:37 -07:00
Brian Paul	6447e9dbfa	gallium/hud: flush stdout in print_help(), for Windows Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-02-07 11:03:37 -07:00
Ben Widawsky	7ea1e37497	i965: Add more stringent blitter assertions Blits to or from a y-tiled surface must always be a multiple of the tile size. From page 16 of the HSW PRM (https://01.org/linuxgraphics/sites/default/files/documentation/intel-gfx-prm-osrc-hsw-memory-views.pdf#16) "The pitch of a tiled enclosing region must be an integral number of tile widths" Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-07 08:08:59 -08:00
Ben Widawsky	efde74c89d	i965: Consolidate some of the intel_blit logic An upcoming patch is going to introduce some code here, and having this code organized as the patch does makes it a bit easier to read later. There should be no functional change here. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-07 08:07:56 -08:00
Park, Jeongmin	0467a52dc3	st/dri: Make depth buffer optional for postprocessing Since only pp_jimenezmlaa uses depth buffer, we can make it optional. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-02-07 12:12:00 +01:00
Park, Jeongmin	2e6ba6afdb	postprocess: Check for depth buffer in pp_jimenezmlaa Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88962 Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-02-07 12:12:00 +01:00
Ben Widawsky	8030e269e9	i965/vec4: Correct MUL destination hazard As it turns out, we were over-thinking the cause of the hang on Cherryview. It's simply errata for Cherryview. commit `88fea85f09` Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Fri Nov 21 10:47:41 2014 -0800 i965/vec4/gen8: Handle the MUL dest hazard exception This is an explanation to why we never saw the hang on BDW. NOTE: The problem the original patch was trying to fix does still exist. It will have to be fixed at some point. v2: Modify commit message, s/CHV/BDW Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84212 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-06 17:54:17 -08:00
Emil Velikov	e660f0dd80	docs: add news item and link release notes for mesa 10.4.4 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-02-07 00:51:08 +00:00
Emil Velikov	d8278be310	docs: Add sha256 sums for the 10.4.4 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `54da987bae`)	2015-02-07 00:48:04 +00:00
Emil Velikov	7d796a59de	Add release notes for the 10.4.4 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `62eb27ac8b`)	2015-02-07 00:48:02 +00:00
Eric Anholt	bff4cbdafa	nir: Fix broken fsat recognizer. We've probably never seen this ridiculous pattern in the wild, so it didn't matter. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-06 15:57:55 -08:00
Eric Anholt	6706537dd4	nir: Slightly simplify algebraic code generation by reusing a struct. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-06 15:57:55 -08:00
Eric Anholt	9e35af08af	tgsi/ureg: Add some missing opcodes to opcode_tmp.h I wanted all of these for NIR-to-TGSI. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-02-06 15:50:07 -08:00
Eric Anholt	f3dbf3689a	tgsi/ureg: Move ureg_dst_register() to the header. I wanted to use it for nir-to-tgsi. The equivalent ureg_src_register() is also located here. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-02-06 15:50:07 -08:00
Marek Olšák	40fa7d44ab	gallium/u_tests: test a NULL buffer sampler view Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-02-06 22:27:07 +01:00
Marek Olšák	56e709bffb	gallium/u_tests: test a NULL constant buffer This expects (0,0,0,0), though it can be changed to something else or allow more than one set of values to be considered correct. This is currently the radeonsi behavior. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-02-06 22:27:07 +01:00
Marek Olšák	9e8a6d8486	gallium/u_tests: test a NULL texture sampler view v2: allow one of the two values	2015-02-06 22:27:06 +01:00
Marek Olšák	63e51baedc	gallium/u_tests: restructure the only test, refactor out reusable code Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-02-06 22:27:06 +01:00
Marek Olšák	dcf996c31e	gallium: run gallium tests if GALLIUM_TESTS=1 is set Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-02-06 22:27:06 +01:00
Marek Olšák	0271ac72d1	gallium/postprocessing: fix crash at context destruction Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-06 20:03:06 +01:00
Xavier Bouchoux	2fd21c4098	r600g/sb: fix a bug in constants folding optimisation pass ADD R6.y.1, R5.w.1, ~1\|3f800000 ADD R6.y.2, \|R6.y.1\|, -0.0001\|b8d1b717 was wrongly being converted to ADD R6.y.1, R5.w.1, ~1\|3f800000 ADD R6.y.2, R5.w.1, -1.0001\|bf800347 because abs() modifier was ignored. Signed-off-by: Xavier Bouchoux <xavierb@gmail.com> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-02-06 20:03:06 +01:00
Xavier Bouchoux	acef65503e	r600g: fix abs() support on ALU 3 source operands instructions Since alu does not support abs() modifier on source operands, spill and apply the modifiers to a temp register when needed. Signed-off-by: Xavier Bouchoux <xavierb@gmail.com> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-02-06 20:03:06 +01:00
David Heidelberg	bae23a1756	r300g: small code cleanup (v2) v2: incorporated changes from Marek Olšák Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: David Heidelberg <david@ixit.cz>	2015-02-06 18:27:30 +01:00
Iago Toral Quiroga	71a36e0a2c	glsl: GLSL ES identifiers cannot exceed 1024 characters v2 (Ian Romanick) - Move the check to the lexer before rallocing a copy of the large string. Fixes the following 2 dEQP tests: dEQP-GLES3.functional.shaders.keywords.invalid_identifiers.max_length_vertex dEQP-GLES3.functional.shaders.keywords.invalid_identifiers.max_length_fragment Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-06 12:21:42 +01:00
Kenneth Graunke	d4a461caaf	i965: Fix INTEL_DEBUG=shader_time for SIMD8 VS (and GS). We were incorrectly attributing VS time to FS8 on Gen8+, which now use fs_visitor for vertex shaders. We don't hit this for geometry shaders yet, but we may as well add support now - the fix is obvious, and we'll just forget later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2015-02-05 20:01:03 -08:00
Kenneth Graunke	32f1d4e286	i965/fs: Use inst->eot rather than opcodes in register allocation. Previously, we special cased FB writes and URB writes in the register allocation code. What we really wanted was to handle any message with EOT set. This saves us from extending the list with new opcodes in the future. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-02-05 20:01:02 -08:00
Kenneth Graunke	10d8a1a88e	i965/fs: Delete is_last_send(); just check inst->eot. This helper function basically just checks inst->eot, but also asserts that only opcodes we expect to terminate threads have EOT set. As far as I'm aware, we've never had such a bug. Removing it means that we don't have to extend the list for new opcodes. Cherryview and Skylake introduce an optimization where sampler messages can have EOT set; scalar GS/HS/DS will likely introduce new opcodes as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2015-02-05 20:00:42 -08:00
Michel Dänzer	a338dc0186	st/mesa: Don't use PIPE_USAGE_STREAM for GL_PIXEL_UNPACK_BUFFER_ARB The latter currently implies CPU read access, so only PIPE_USAGE_STAGING can be expected to be fast. Mesa demos src/tests/streaming_rect on Kaveri (radeonsi): Unpatched: 42 frames in 1.023 seconds = 41.056 FPS Patched: 615 frames in 1.000 seconds = 615.000 FPS Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88658 Cc: "10.3 10.4" <mesa-stable@lists.freedestkop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-02-06 10:55:53 +09:00
Tiziano Bacocco	17abefa12b	st/nine: Implement dummy vbo behaviour when vs is missing inputs Use a dummy vertex buffer object when vs inputs have no corresponding entries in the vertex declaration. This dummy buffer will give to the shader float4(0,0,0,0). This fixes several artifacts on some games. Signed-off-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>	2015-02-06 00:07:20 +01:00
Axel Davy	90585cbc9a	gallium/targets/d3dadapter9: Free card device The drm fd wasn't released, causing a crash for wine tests on nouveau, which seems to have a bug when a lot of device descriptors are open. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:20 +01:00
Axel Davy	8b3a9d5c9f	gallium/targets/d3dadapter9: Release the pipe_screen at destruction. We weren't releasing hal and ref, causing some issues (threads not released, etc) Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:19 +01:00
Axel Davy	8f50614910	gallium/targets/d3dadapter9: Fix device detection for render-nodes When on a render node the unique ioctl doesn't work. This patch drops the code to detect the device, which relied on an ioctl, and replaces it by the mesa loader function. The mesa loader function is more complete and won't fail for render-nodes. Alternatively we could also have used the pipe cap to determine the vendor and device id from the driver. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:19 +01:00
Axel Davy	2c54d154e8	st/nine: Dummy sampler should have a=1 Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:19 +01:00
Axel Davy	9ac74e604b	st/nine: Fix update_framebuffer binding cbufs the pixel shader wouldn't render to Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:19 +01:00
Axel Davy	ee606b4780	st/nine: Clear: better behave if rt_mask is different to the one of the framebuffer bound Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:19 +01:00
Axel Davy	d8d48f6f71	st/nine: Fix multisampling support detection Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:19 +01:00
Tiziano Bacocco	a1d369e804	st/nine: Fix enabled lights in stateblocks Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>	2015-02-06 00:07:19 +01:00
Axel Davy	1543defc5e	st/nine: Fix depth stencil formats bindings flags. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:19 +01:00
Axel Davy	49214a3dfc	st/nine: Fix gpu memory leak in swapchain Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:19 +01:00
Axel Davy	d538007734	st/nine: SetResourceResize should track nr_samples too Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:19 +01:00
Tiziano Bacocco	1c1d26cd97	st/nine: D3DRS_FILLMODE set to 0 is D3DFILL_SOLID Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>	2015-02-06 00:07:19 +01:00
Tiziano Bacocco	50f0e011da	st/nine: Setting D3DRS_ALPHAFUNC to 0 means D3DCMP_NEVER Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com>	2015-02-06 00:07:19 +01:00
Axel Davy	dfe5e84e74	st/nine: Implement fallback behaviour when rts and ds don't match This seems to be the behaviour on Win. Previous behaviour led to different issues depending on the driver. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:19 +01:00
Axel Davy	8b901e3011	st/nine: Fix present_buffers allocation If has_present_buffers was false at first, but after a device reset, it turns true (for example if we begin to render to a multisampled back buffer), there was a crash due to present_buffers being uninitialised. This patch fixes it. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:19 +01:00
Patrick Rudolph	792af626d4	st/nine: Check for aligned offset in each vertex element Fixes wine test test_vertex_declaration_alignment() Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2015-02-06 00:07:19 +01:00
Patrick Rudolph	63221c6f09	st/nine: Fix buffer overflow in {Get\|Set}PixelShaderConstantF Previous code checked against 256 rather than the correct limit, which is 224 for sm3 hardware. Fixes wine test test_pixel_shader_constant() Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2015-02-06 00:07:19 +01:00
Patrick Rudolph	2dcad120a0	st/nine: Set [out] argument to NULL for some functions Wine tests, and probably some apps, check for errors by checking for NULL instead of error codes. Fixes wine test test_surface_blocks() Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2015-02-06 00:07:19 +01:00
Patrick Rudolph	9aa3ebd0e7	st/nine: Remove duplicated debug message Likely a rebase error Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2015-02-06 00:07:19 +01:00
Patrick Rudolph	33617ef296	st/nine: Return E_FAIL for unused vertexdeclaration type Add returncode E_FAIL. Return E_FAIL for any vertexdeclaration element with type unused. Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2015-02-06 00:07:19 +01:00
Patrick Rudolph	faf94f6eea	st/nine: Missing sanity check for CALLOC return E_OUTOFMEMORY if allocation of usage_map fails Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2015-02-06 00:07:19 +01:00
Axel Davy	75676886e4	st/nine: Implement ATOC hack ATOC is a hack for alpha to coverage that is supported by NV and Intel. You need to check the support for it with CheckDeviceFormat. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:19 +01:00
Axel Davy	0a4aaf1d41	st/nine: Implement AMD alpha to coverage This D3D hack is supposed to be supported by all AMD SM2+ cards. Apps use it without checking if they are on AMD. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:19 +01:00
Axel Davy	bf0adf248f	st/nine: Add D3DFMT_DF16 support This depth buffer format, like D3DFMT_INTZ, can be used to read the depth buffer values when bound to a shader. Some apps may use this format to get better performance when they don't need the precision of INTZ (24 bits for depth, 8 for stencil, whereas DF16 is just 16 bits for depth) We don't add support for DF24 yet, because it implies support for FETCH4, which we don't support for now. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:19 +01:00
Axel Davy	34292754d2	st/nine: Change the value of some advertised caps These values are taken from wine. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:19 +01:00
Axel Davy	25f1e5584c	st/nine: NineDevice9_SetClipPlane: pPlane must be non-NULL Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:19 +01:00
Axel Davy	02a89dc163	st/nine: Implement fallback for D3DFMT_D24S8, D3DFMT_D24X8 and D3DFMT_INTZ Some drivers support PIPE_FORMAT_S8_UINT_Z24_UNORM, some others PIPE_FORMAT_Z24_UNORM_S8_UINT, some both. It doesn't matter which one we use, since the d3d formats they map to aren't lockable (app can read it directly). Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:18 +01:00
Axel Davy	27e438e356	st/nine: Refactor format d3d9 to pipe conversion Move the checks of whether the format is supported into a common place. The advantage is that this lets us handle the case where a d3d9 format can be mapped to several formats and cards don't support all of them. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:18 +01:00
Axel Davy	f8713b1bfd	st/nine: Refactor nine_d3d9_to_pipe_format_map The order of the format is changed to have an increasing ordering of the d3d9 format values. Some missing formats are added and matched to PIPE_FORMAT_NONE Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:18 +01:00
Axel Davy	4cf5701160	st/nine: Improve CheckDeviceFormat debug output Because the debug output of this function was cut in two parts, sometimes the second part wasn't printed when we returned early, even though we would like to get it. The reason for the separation was that it's only at the end of the function that we can print what we map the d3d9 arguments to, but we can always retrieve that info by hand. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:18 +01:00
Axel Davy	42ac71a4e2	st/nine: Implement RESZ hack This D3D hack allows resolving a multisampled depth buffer into a single sampled one. Note that the implementation is slightly incorrect. When querying the content of D3DRS_POINTSIZE, it should return the resz code if it has been set. This behaviour will be implemented when state changes are reworked. For now the current behaviour is ok, since apps use the D3DCREATE_PUREDEVICE flag when creating the device, which means they won't read states and in exchange get better performance. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:18 +01:00
Axel Davy	5c61f6344a	st/nine: fix early basetexture destruction Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:18 +01:00
Patrick Rudolph	dfeca90419	st/nine: Do not leak private data in volume9. This->data was allocated by nine, but not freed. Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2015-02-06 00:07:18 +01:00
Patrick Rudolph	b3afcc0968	st/nine: Check block alignment for compressed textures in NineSurface9_CopySurface Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Patrick Rudolph <siro@das-labor.org>	2015-02-06 00:07:18 +01:00
Axel Davy	65ce2b2848	st/nine: Commit sampler views again if srgb state changed. This fixes a wine test and some minor visual issues on some games. The patch is not optimal, there is probably a more efficient way to fix this issue, but the code there already has some inefficiencies. There are plans to rewrite that part of the code to make it more efficient. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:18 +01:00
Axel Davy	2d2286d17c	st/nine: Fix use of D3DSP_NOSWIZZLE D3DSP_NOSWIZZLE already contains the shift. Detected with Clang. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:18 +01:00
Axel Davy	1f3b7d4039	st/nine: Check for the correct number of constants. This removes an unneeded hack for Anno 1404. This app does not check the number of supported constants, and relies on the shader compilation to fail if it uses too many constants. This patch also checks for the correct number of constants for ps. Note that we don't check the official limitations for old vs and ps versions. The restrictions were fixed, unlike for the number of vertex shader constants for later versions. Likely apps use the correct number, and it's not a problem for us if an app wants to use more. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:18 +01:00
Axel Davy	d0aeb4422b	st/nine: Introduce failure handling for shader parsing. Instead of crashing on buggy shaders, we should return an error. This patch introduces this behaviour in the case of invalid constant access Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:18 +01:00
Axel Davy	6fcc2c8872	st/nine: Print warnings for r500 when shader is likely to go wrong r500 hasn't enough float constants for vs to fill all needs. Overlapping issues can happen with complex shaders. The fix would be to recompile shaders to include the integer and boolean constants, instead of reserving slots for them. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:18 +01:00
Axel Davy	70a523818f	st/nine: Declare constants only up to the maximum needed. Previously 276 constants were declared every time. This patch makes shaders declare constants up to the maximum constant needed and moves the moment we print the TGSI shader after the moment we declare the constants. This is needed for r500, since when indirect addressing is used, it cannot reduce the amount of constants needed, and it is restricted to 256 constant slots. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:18 +01:00
Axel Davy	a249c7a161	st/nine: Refactor how user constbufs sizes are calculated Count explicitly the slots for float, int and bool constants, and deduce the constbuf size in nine_shader. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:18 +01:00
Axel Davy	65ca8e4b3d	st/nine: Explicit nine requirements This patch raises nine requirements and disables nine for old hw that doesn't match them. Currently for these cards only games that don't have tight requirements would work well with nine. However nine is missing several checks regarding these limitations. To make code and future patches less heavy, dropping support for these old cards seems a good solution. That makes r500 the only dx9 generation cards supported by nine. It seems to be the one with the fewest limitations for nine. Still not everything is ok, and we'll have for example to implement shader recompilation for these cards to include integer and boolean constants in the shader. Eventually when this is done, we can reintroduce support for older cards. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:18 +01:00
Axel Davy	eb1c12d20d	gallium: Add MULTISAMPLE_Z_RESOLVE cap Resolving a multisampled depth texture into a single sampled texture is supported on >= SM4.1 hw. It is possible some previous hw supports it. The ability was tested on radeonsi and nvc0. Apparently it is also supported for radeon >= r700. This patch adds the MULTISAMPLE_Z_RESOLVE cap and adds it to the drivers. It is advertised for drivers for which it is sure the ability is supported. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-02-06 00:07:18 +01:00
Laura Ekstrand	77cc799853	GL: Update glext.h to Revision 29735 (20150202). Khronos modified glext.h to get rid of GL_TEXTURE_BINDING, a special enum added for ARB_direct_state_access. This enum was ruled unimplementable. Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Laura Ekstrand <laura@jlekstrand.net>	2015-02-05 11:41:26 -08:00
Jose Fonseca	08efcc0960	llvmpipe: Trivially advertise PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT. Nothing special needs to be done. Even though llvmpipe copies constant (ie uniform) buffers internally, the application is supposed to flush and sync, so all should work. All bufferstorage piglit tests pass. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-02-05 16:16:47 +00:00
Matt Turner	2335153ff2	i965: Remove now unnecessary Gen8 CMP destination type override. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-04 12:14:35 -08:00
Matt Turner	6b3a301f61	i965: Set CMP's destination type to src0's type. Allows CMP instructions with float sources to be compacted and coissued. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-04 12:14:34 -08:00
Matt Turner	7e60794392	i965/fs: Implement the WaCMPInstFlagDepClearedEarly work-around. Prevents piglit regressions from the next patch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-04 12:14:34 -08:00
Jose Fonseca	661c8bb220	gallium/util: Don't implement u_bit_scan64 on MSVC. As ffsll doesn't exist in MSVC yet, and u_bit_scan64 is only used by radeonsi which is never built with MSVC. This is just a stop-gap fix to unbreak MSVC build until we refactor these mathematical portability wrappers into src/util. Trivial.	2015-02-04 15:22:59 +00:00
Jose Fonseca	46f1033067	gallium/util: Define ffsll on MinGW. Trivial. (Fixing MSVC will be far less so, as _BitScanForward64 is only supported on x64.)	2015-02-04 14:58:20 +00:00
Marek Olšák	6c5af1dc4e	radeonsi: implement polygon stippling Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-04 14:34:13 +01:00
Marek Olšák	6895dfb184	radeonsi: add polygon stipple texture slot Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-04 14:34:13 +01:00
Marek Olšák	1fe7ba8c69	radeonsi: deduce rasterizer primitive type at the beginning of draw_vbo I will need this for polygon stippling. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-04 14:34:13 +01:00
Marek Olšák	8f65e6eae8	radeonsi: allow 64 descriptors per array We need a slot for the stipple texture and the pixel shader already uses 32 textures (16 API slots + 16 FMASK slots). Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-04 14:34:13 +01:00
Marek Olšák	9af943c32e	radeonsi: add support for sampler views where resource = NULL The hardware obeys swizzles even if the resource is NULL. This will be used by set_polygon_stipple. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-04 14:34:13 +01:00
Marek Olšák	70e4243f07	radeonsi: add support for NULL texture sampler views that return (0,0,0,1) This used to hang. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-04 14:34:13 +01:00
Marek Olšák	82f64a68a4	radeonsi: fix a crash when binding a NULL sampler view list Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-04 14:34:13 +01:00
Marek Olšák	b142dd2f24	radeonsi: move the buffer descriptor to the end of the image descriptor This will allow supporting NULL textures. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-04 14:34:13 +01:00
Marek Olšák	afe1e6acdd	radeonsi: don't use tgsi_parse_context to get processor type Also remove unused "tokens". Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-04 14:34:13 +01:00
Marek Olšák	50908a8918	radeonsi: fix instanced arrays with non-zero start instance Fixes piglit ARB_base_instance/arb_base_instance-drawarrays. Cc: 10.3 10.4 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-04 14:34:13 +01:00
Marek Olšák	658f1d4cfe	r600g,radeonsi: don't append to streamout buffers that haven't been used yet The FILLED_SIZE counter is uninitialized at the beginning, so we can't use it. Instead, use offset = 0, which is what we always do when not appending. This unexpectedly fixes spec/ARB_texture_multisample/sample-position/*. Yes, the test does use transform feedback. Cc: 10.3 10.4 <mesa-stable@lists.freedesktop.org> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-04 14:34:13 +01:00
Marek Olšák	b616429ca8	gallium: set PIPE_MAX_SAMPLERS to 18 For drivers that use higher slots not to crash in tgsi_shader_info. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-04 14:34:13 +01:00
Marek Olšák	8fc542aa89	gallium/u_pstipple: add ability to specify a fixed texture unit E.g. r600g can use slot 17, which is outside of the API range. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-04 14:34:13 +01:00
Marek Olšák	50433ea526	gallium/util: add u_bit_scan64 Same as u_bit_scan, but for uint64_t. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-04 14:34:13 +01:00
Marek Olšák	f2328ffdc8	tgsi: add tgsi_get_processor_type helper from radeon Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-04 14:34:13 +01:00
Kenneth Graunke	ccbe15f332	i965/fs: Fix saturate on MAD and LRP with the NIR backend. Fixes misrendering in "Witcher 2" with INTEL_USE_NIR=1, and probably many other programs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-04 00:34:57 -08:00
Iago Toral Quiroga	1b029f8a4a	mesa: Fix _mesa_format_convert fallback path when src is not an array format When a rebase swizzle is provided and we call _mesa_swizzle_and_convert after unpacking the source format we were always passing normalized=false. We should pass true or false depending on the formats involved in the conversion for the byte and float paths (the integer path cannot ever be normalized). Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2015-02-04 08:08:34 +01:00
Park, Jeongmin	6fd4a61ad6	st/osmesa: Fix osbuffer->textures indexing Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88930 Cc: 10.4 <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-03 15:46:56 -07:00
Connor Abbott	ab24e12706	i965/nir: use redundant phi optimization Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Tested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-03 16:00:13 -05:00
Connor Abbott	a135f34080	nir: add an optimization to remove useless phi nodes This removes phi nodes whose sources all point to the same thing. Shader-db results: total NIR instructions in shared programs: 2045293 -> 2041209 (-0.20%) NIR instructions in affected programs: 126564 -> 122480 (-3.23%) helped: 615 HURT: 0 total FS instructions in shared programs: 4321840 -> 4320392 (-0.03%) FS instructions in affected programs: 24622 -> 23174 (-5.88%) helped: 138 HURT: 0 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Tested-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-03 16:00:13 -05:00
Jason Ekstrand	572d1f6e41	nir/validate: Ensure that phi sources are SSA-only Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-03 12:52:42 -08:00
Jason Ekstrand	5420774510	nir/validate: Validate that only float ALU outputs are saturated Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-03 12:46:55 -08:00
Jason Ekstrand	c0df85cca4	nir/lower_source_mods: Don't lower saturate for non-float outputs Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-02-03 12:46:38 -08:00
Jason Ekstrand	8776b1b14b	i965/fs_nir: Get rid of get_alu_src Originally, get_alu_src was supposed to handle resolving swizzles and things like that. However, now that basically every instruction we have only takes scalar sources, we don't really need it anymore. The only case where it's still marginally useful is for the mov and vecN operations that are left over from SSA form. We can handle those cases as a special case easily enough. As a side-effect, we don't need the vec_to_movs pass anymore. v2 Jason Ekstrand <jason.ekstrand@intel.com>: - Rework the way we detect if we need an extra copy for swizzling. The old code involved a pile of confusing switch fall-throughs; we now use a loop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-03 12:33:11 -08:00
Jason Ekstrand	112d738b91	i965/fs: Use NIR's scalarizing abilities and stop handling vectors Now that we can scalarize with NIR, there's no need for all this code anymore. Let's get rid of it and just do scalar operations. v2: run copy prop before lowering phi nodes v3: Get rid of the "emit(...)->saturate = foo" pattern v4: Run alu_to_scalar as an optimization pass total instructions in shared programs: 5998321 -> 5974070 (-0.40%) instructions in affected programs: 732075 -> 707824 (-3.31%) helped: 3137 HURT: 191 GAINED: 18 LOST: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-03 12:33:11 -08:00
Jason Ekstrand	f2adcd36cb	nir: Add a pass to lower vector phi nodes to scalar phi nodes v2 Jason Ekstrand <jason.ekstrand@intel.com>: - Add better comments - Use nir_ssa_dest_init and nir_src_for_ssa more places - Fix some void * casts v3 Jason Ekstrand <jason.ekstrand@intel.com>: - Rework the way we determine whether or not to scalarize a phi node to make the recursion non-bogus - Treat load_const instructions as scalarizable v4 Jason Ekstrand <jason.ekstrand@intel.com>: - Allow uniform and input loads to be scalarizable v5 Jason Ekstrand <jason.ekstrand@intel.com>: - Also consider loads of inputs (varying, uniform, or ubo) to be scalarizable. We were already doing this for load_var on uniforms and inputs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-03 12:33:11 -08:00
Matt Turner	e87928a494	i965/fs: Add support for constant propagating into sources with modifiers. All but 16 of the programs helped were ARB fp programs. total instructions in shared programs: 5949286 -> 5945470 (-0.06%) instructions in affected programs: 275162 -> 271346 (-1.39%) helped: 1197 GAINED: 1 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-02-03 12:25:14 -08:00
Matt Turner	cfa2165642	i965/vec4: Use abs/negate functions in const propagation. No changes in shader-db. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-02-03 12:25:14 -08:00
Matt Turner	dbd4c22a37	i965: Add function to take the abs of immediates. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-02-03 12:25:14 -08:00
Matt Turner	638beee24a	i965: Add function to negate immediates. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-02-03 12:25:14 -08:00
Matt Turner	1f4bdad316	i965: Mark UB/B immediates as unreachable. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-02-03 12:25:14 -08:00
Matt Turner	32e98e8ef0	gallium/util: Don't use __builtin_clrsb in util_last_bit(). Unclear circumstances lead to undefined symbols on x86. Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=536916 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-02-03 12:25:14 -08:00
Matt Turner	d8be1b9aba	glsl/list: Note that exec_lists may not be realloc'd. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-03 12:25:14 -08:00
Nils Wallménius	cfb5b1c59e	st/mesa: mark constant array of swizzles as static const This saves about 0.5k in the text section for a gallium driver on amd64. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-02-04 09:07:13 +13:00
Eduardo Lima Mitev	0ed3bffc08	mesa: Returns a GL_INVALID_VALUE error on several APIs when buffer size is negative Section 2.3.1 (Errors) of the OpenGL 4.5 spec says: "If a negative number is provided where an argument of type sizei or sizeiptr is specified, an INVALID_VALUE error is generated." This patch adds checks for negative buffer size values passed to different APIs. It also moves up the check on other APIs that already had it, making it the first error check performed in the function, for consistency. While there may be other APIs throughout the code lacking this check (or at least not at the beginning of the function), this patch focuses on the cases that break the dEQP tests listed below. It could be a good exercise for the future to check all other cases, and improve consistency in the order of the checks throughout the whole Mesa code base. This fixes 5 dEQP tests: * dEQP-GLES3.functional.negative_api.state.get_attached_shaders * dEQP-GLES3.functional.negative_api.state.get_shader_source * dEQP-GLES3.functional.negative_api.state.get_active_uniform * dEQP-GLES3.functional.negative_api.state.get_active_attrib * dEQP-GLES3.functional.negative_api.shader.program_binary Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-03 13:19:36 +01:00
Samuel Iglesias Gonsalvez	284bd1ecdf	mesa: fix error value in GetFramebufferAttachmentParameteriv for OpenGL ES 3.0 Section 6.1.13 "Framebuffer Object Queries" of OpenGL ES 3.0 spec: "If the default framebuffer is bound to target, then attachment must be BACK, identifying the color buffer; DEPTH, identifying the depth buffer; or STENCIL, identifying the stencil buffer." OpenGL ES 3.0, section 2.5 (GL Errors): "If a command that requires an enumerated value is passed a symbolic constant that is not one of those specified as allowable for that command, an INVALID_ENUM error is generated." Then change the returned error to INVALID_ENUM. Fixes: dEQP-GLES3.functional.fbo.api.attachment_query_default_fbo Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-03 13:19:36 +01:00
Iago Toral Quiroga	5dfb085ff3	glsl: Improve precision of mod(x,y) Currently, Mesa uses the lowering pass MOD_TO_FRACT to implement mod(x,y) as y * fract(x/y). This implementation has a down side though: it introduces precision errors due to the fract() operation. Even worse, since the result of fract() is multiplied by y, the larger y gets the larger the precision error we produce, so for large enough numbers the precision loss is significant. Some examples on i965: Operation Precision error ----------------------------------------------------- mod(-1.951171875, 1.9980468750) 0.0000000447 mod(121.57, 13.29) 0.0000023842 mod(3769.12, 321.99) 0.0000762939 mod(3769.12, 1321.99) 0.0001220703 mod(-987654.125, 123456.984375) 0.0160663128 mod( 987654.125, 123456.984375) 0.0312500000 This patch replaces the current lowering pass with a different one (MOD_TO_FLOOR) that follows the recommended implementation in the GLSL man pages: mod(x,y) = x - y * floor(x/y) This implementation eliminates the precision errors at the expense of an additional add instruction on some systems. On systems that can do negate with multiply-add in a single operation this new implementation would come at no additional cost. v2 (Ian Romanick) - Do not clone operands because when they are expressions we would be duplicating them and that can lead to suboptimal code. Fixes the following 16 dEQP tests: dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.mediump_* dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.highp_* Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-03 13:19:36 +01:00
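The two lowerings named in the commit above can be compared side by side with a small sketch. This is not Mesa code: the function names are made up for illustration, and single precision is only emulated by rounding to float32 after every step, so the exact errors will not necessarily match the i965 figures quoted in the commit message.

```python
import math
import struct


def f32(value):
    """Round a Python float (a double) to the nearest IEEE-754 single."""
    return struct.unpack("f", struct.pack("f", value))[0]


def mod_to_fract(x, y):
    """Old lowering: mod(x, y) = y * fract(x / y), rounded after each step."""
    q = f32(x / y)
    fract = f32(q - math.floor(q))
    return f32(y * fract)


def mod_to_floor(x, y):
    """New lowering: mod(x, y) = x - y * floor(x / y), rounded after each step."""
    q = f32(x / y)
    product = f32(y * math.floor(q))
    return f32(x - product)


def reference_mod(x, y):
    """Double-precision reference using the same floor-based definition."""
    return x - y * math.floor(x / y)


if __name__ == "__main__":
    # Input pairs taken from the commit message's table.
    for x, y in [(121.57, 13.29), (3769.12, 321.99), (987654.125, 123456.984375)]:
        x32, y32 = f32(x), f32(y)
        ref = reference_mod(x32, y32)
        err_fract = abs(mod_to_fract(x32, y32) - ref)
        err_floor = abs(mod_to_floor(x32, y32) - ref)
        print(x, y, err_fract, err_floor)
```

Running the script prints the absolute error of each lowering against the double-precision reference; hardware with a fused multiply-add may behave differently from this step-by-step rounding.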
Eduardo Lima Mitev	c27d23f0c8	mesa: Allow querying for GL_PRIMITIVE_RESTART_FIXED_INDEX under GLES 3 GLES 3.0.0 spec introduces context state PRIMITIVE_RESTART_FIXED_INDEX (2.8.1 Transferring Array Elements, page 26) which is not currently possible to query using glGet() funcs. Fixes 4 dEQP tests: dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getboolean * dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getinteger * dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getinteger64 * dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getfloat Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-03 13:19:36 +01:00
Iago Toral Quiroga	ec7dcaf578	glsl: can't have 'const' qualifier used with struct or interface block members Fixes the following 2 dEQP tests: dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_const_vertex dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_const_fragment Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-03 13:19:36 +01:00
Iago Toral Quiroga	5d655a43e6	glsl: interface blocks must be declared at global scope Fixes the following 2 dEQP tests: dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_in_main_vertex dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_in_main_fragment Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-03 13:19:36 +01:00
Iago Toral Quiroga	6dd346c232	i965: Fix negate with unsigned integers For code such as: uint tmp1 = uint(in0); uint tmp2 = -tmp1; float out0 = float(tmp2); We produce code like: mov(8) g5<1>.xF -g9<4,4,1>.xUD which does not produce correct results. This code produces the results we would expect if tmp1 and tmp2 were signed integers instead. It seems that a similar problem was detected and addressed when using negations with unsigned integers as part of conditionals, but it looks like the problem has a wider impact than that. This patch fixes the problem by preventing copy-propagation of negated UD registers in all scenarios, not only in conditionals. Fixes the following 24 dEQP tests: dEQP-GLES3.functional.shaders.operator.unary_operator.minus._uint_ dEQP-GLES3.functional.shaders.operator.unary_operator.minus._uvec2_ dEQP-GLES3.functional.shaders.operator.unary_operator.minus._uvec3_ dEQP-GLES3.functional.shaders.operator.unary_operator.minus._uvec4_ Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-03 13:19:36 +01:00
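The difference between the expected and the miscompiled result in the commit above can be illustrated with plain integer arithmetic. This is a sketch with hypothetical helper names, not driver code; it assumes, as the commit message describes, that the faulty path behaves as if the operand were a signed integer. Python floats are doubles, so the printed values are not rounded to float32 the way a GLSL `float` would be.

```python
def negate_uint32(value):
    """Negate a 32-bit unsigned integer with wraparound (modulo 2**32)."""
    return (-value) & 0xFFFFFFFF


def as_int32(bits):
    """Reinterpret a 32-bit pattern as a signed two's-complement integer."""
    return bits - 0x100000000 if bits & 0x80000000 else bits


def expected_result(in0):
    """float(-uint(in0)) with the negation done in unsigned arithmetic."""
    return float(negate_uint32(in0 & 0xFFFFFFFF))


def miscompiled_result(in0):
    """Result if the negate is applied as though the value were signed."""
    return float(-as_int32(in0 & 0xFFFFFFFF))


if __name__ == "__main__":
    print(expected_result(1))     # unsigned wraparound: 4294967295.0
    print(miscompiled_result(1))  # signed interpretation: -1.0
```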
Jose Fonseca	5b941ce857	scons: Fix Windows builds with LLVM 3.5. LLVMBitReader dependency was introduced, as pointed out by Rob Conde.	2015-02-03 10:18:51 +00:00
Ilia Mirkin	bc321db75b	st/mesa: add EXT_polygon_offset_clamp support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-02-02 20:44:22 -05:00
Ilia Mirkin	7c211a12aa	gallium: add a cap to determine whether the driver supports offset_clamp Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-02-02 20:44:02 -05:00
Ilia Mirkin	2ce29ce5af	i965/gen6+: enable EXT_polygon_offset_clamp Replace the hard-coded 0's with the context clamp value. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-02-02 20:35:36 -05:00
Ilia Mirkin	81998dda63	mesa: add support for GL_EXT_polygon_offset_clamp Nothing enables the extension yet, but the values are now available. The spec calls for it to only be exposed for GL 3.3+, which is core-only in mesa. Instead we allow any driver to enable it, including in a compat context for any GL version. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-02-02 20:35:36 -05:00
Ilia Mirkin	83321009de	glapi: add GL_EXT_polygon_offset_clamp Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2015-02-02 20:35:36 -05:00
Kenneth Graunke	0f06f12c11	glsl: Pick ast_conditional branch regardless of op1/2 being constant. If the ?: operator's condition is a constant value, and both branches were pure expressions, we can just make the resulting value one or the other. Previously, we only did this if op[1] and op[2] were also constant values - but there's no actual reason for that restriction. No changes in shader-db, probably because we usually optimize this later anyway. But it does make us generate less stupid code up front. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-02-02 17:14:55 -08:00
Kenneth Graunke	534f07ee85	i965: Add a better PRM citation for the IMS dimension mangling. Paul originally had to reverse engineer these formulas based on the description about how the sampler works. The description here is not the easiest to follow - especially given that it's from the Sandybridge era, when the hardware only did 4x multisampling. Jordan and I recently found another part of the documentation where they simply state that IMS dimensions must be adjusted by a set of formulas. Quoting this section provides an easy to follow explanation for the code, including 2x/4x/8x/16x. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-02-02 17:14:38 -08:00
Laura Ekstrand	e9b86cb5d6	swrast: Whitespace fixes. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-02 13:22:26 -08:00
Laura Ekstrand	e187c2f543	DD: Refactor BlitFramebuffer. In preparation for glBlitNamedFramebuffer, the DD table function BlitFramebuffer needs to accept two arbitrary framebuffer objects rather than assuming ctx->ReadBuffer and ctx->DrawBuffer. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-02-02 13:21:20 -08:00
Laura Ekstrand	ad2c64abbd	GL: Update glext.h to Khronos Revision 29537. Khronos Revision 29537 fixes ARB_direct_state_access function prototypes that had GLsizei where they should have had GLsizeiptr. This mainly affects functions related to buffer objects. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-02-02 10:39:55 -08:00
Jason Ekstrand	2cebaac479	i965: Don't use tiled_memcpy to download from RGBX or BGRX surfaces Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88841 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-02-02 10:18:42 -08:00
Neil Roberts	af8fd694d4	dir-locals.el: Don't set variables for non-programming modes This limits the style changes to modes inherited from prog-mode. The main reason to do this is to avoid setting fill-column for people using Emacs to edit commit messages because 78 characters is too many to make it wrap properly in git log. Note that makefile-mode also inherits from prog-mode so the fill column should continue to apply there. v2: Apply to all the .dir-locals.el files, not just the one in the root directory. Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2015-02-02 12:02:55 +00:00
Iago Toral Quiroga	68155e5a36	i965: Fix intel_miptree_copy_teximage for GL_TEXTURE_1D_ARRAY For GL_TEXTURE_1D_ARRAY targets we store the depth of the array in the Height field and leave Depth=1 in the underlying texture object. When we call intel_miptree_copy_teximage in the process of re-creating a miptree (possibly because the number of miplevels has changed) we didn't account for this, so we were only copying texture images for the first slice. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-02-02 09:29:18 +01:00
Eric Anholt	753c327151	vc4: Kill a bunch of color write calculation when colormask is all off. I could have done this in the bit that generates the ANDs and ORs, but it's probably generally useful. Sadly, I still need this even if I move to NIR, because I can't yet express my read of the destination color in NIR, which I would need to move my blend/logicop/colormask handling into NIR. total uniforms in shared programs: 13497 -> 13455 (-0.31%) uniforms in affected programs: 101 -> 59 (-41.58%) total instructions in shared programs: 40797 -> 40296 (-1.23%) instructions in affected programs: 1639 -> 1138 (-30.57%)	2015-02-01 16:07:24 -08:00
Fredrik Höglund	0508032413	docs: Update ARB_direct_state_access Mark vertex array objects as started.	2015-02-01 23:00:42 +01:00
Martin Peres	9272022353	doc: break down ARB_direct_state_access in GL3.txt A student was wondering what was going on + I started working on it too. CC: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Laura Ekstrand <laura@jlekstrand.net> Signed-off-by: Fredrik Höglund <fredrik@kde.org>	2015-02-01 22:50:35 +01:00
Eric Anholt	12ebd7e20e	vc4: Dump the VPM read index in QIR disasm. Since the VPM reads have to be in order, it's useful to see their indices in the dump.	2015-02-01 12:53:08 -08:00
Jason Ekstrand	6094619c02	i965/pixel_read: Don't try to do a tiled_memcpy from a multisampled buffer The GL spec guarantees that glGetTexImage will never get a multisampled texture, but this is not true for glReadPixels. If we get a multisampled buffer, we have to do a multisample resolve on it before we can pull the data down for the user. Since this isn't practical to handle in tiled_memcpy, we just fall back to the other paths that can handle this. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-31 08:54:32 -08:00
Francisco Jerez	11f5d8a5d4	i965: Enable L3 caching of buffer surfaces. And remove the mocs argument of the emit_buffer_surface_state vtbl hook. Its semantics vary greatly from one generation to another, so it kind of encourages the caller to pass 0 which is the only valid setting across generations. After this commit the hardware-specific code decides what the best cacheability settings are for buffer surfaces, just like we do for textures. This together with some additional changes coming is expected to improve performance of pull constants, buffer textures, atomic counters and image objects on Gen7 and up. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-31 17:01:49 +02:00
José Fonseca	11a955aef4	egl: Pass the correct X visual depth to xcb_put_image(). The dri2_x11_add_configs_for_visuals() function happily matches a 32 bits EGLconfig with a 24 bits X visual. However it was passing 32bits depth to xcb_put_image(), making X server unhappy: https://github.com/apitrace/apitrace/issues/313#issuecomment-70571911 Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-31 09:14:36 +00:00
Jason Ekstrand	5c31184cf5	intel/pixel_read: Properly flip the results for window system buffers Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88841 Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-01-30 18:56:56 -08:00
Jason Ekstrand	837a4c42a6	i965/tiled_memcpy: Support a signed linear pitch Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-01-30 18:56:56 -08:00
Jason Ekstrand	7cc3bb2318	main: Add STENCIL_INDEX formats to base_tex_format This fixes a bug on BDW when our meta-based stencil blit path assert-fails due to an invalid internal format even though we do support the ARB_stencil_texturing extension. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-30 15:49:45 -08:00
Jason Ekstrand	16875bc5cd	teximage: Don't indent switch cases No functional change. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-30 15:49:45 -08:00
Brian Paul	b930ef1ce8	mesa: remove some dead display list code The size of a Node is always four bytes so no need for the old code that was used when sizeof(Node)==8. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-30 13:27:18 -07:00
Brian Paul	20bc72b791	mesa: remove stale comment in dlist.c code sizeof(Node) is always 4 bytes. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-30 13:27:18 -07:00
Brian Paul	613974b774	mesa: s/union gl_dlist_node/Node/ in dlist.c code Just minor clean-up. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-30 13:27:17 -07:00
Brian Paul	53b01938ed	mesa: fix display list 8-byte alignment issue The _mesa_dlist_alloc() function is only guaranteed to return a pointer with 4-byte alignment. On 64-bit systems which don't support unaligned loads (e.g. SPARC or MIPS) this could lead to a bus error in the VBO code. The solution is to add a new _mesa_dlist_alloc_aligned() function which will return a pointer to an 8-byte aligned address on 64-bit systems. This is accomplished by inserting a 4-byte NOP instruction in the display list when needed. The only place this actually matters is the VBO code where we need to allocate a 'struct vbo_save_vertex_list' which needs to be 8-byte aligned (just as if it were malloc'd). The gears demo and others hit this bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88662 Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-01-30 08:48:19 -07:00
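The padding rule described in the commit above can be sketched as offset arithmetic. This is an illustration only: `alloc_aligned` is a hypothetical name, and the layout (one 4-byte opcode node followed by the payload) is an assumption made for the example, not a description of the actual `_mesa_dlist_alloc_aligned()` implementation.

```python
NODE_SIZE = 4  # bytes per display list node, as stated in the commit message


def alloc_aligned(offset, payload_size, pointer_size=8):
    """Return (nop_count, payload_offset, next_offset) for one allocation.

    offset is the current byte position in a hypothetical display list
    block. The payload is assumed to follow a single 4-byte opcode node.
    On 64-bit systems a 4-byte NOP is inserted first whenever the payload
    would otherwise start at an address that is not 8-byte aligned.
    """
    if offset % NODE_SIZE != 0:
        raise ValueError("offset must be a multiple of the node size")
    header = NODE_SIZE
    nop_count = 0
    if pointer_size == 8 and (offset + header) % 8 != 0:
        nop_count = 1
    payload_offset = offset + nop_count * NODE_SIZE + header
    payload_nodes = -(-payload_size // NODE_SIZE)  # ceiling division
    next_offset = payload_offset + payload_nodes * NODE_SIZE
    return nop_count, payload_offset, next_offset


if __name__ == "__main__":
    print(alloc_aligned(0, 16))  # payload would start at 4, so a NOP is added
    print(alloc_aligned(4, 16))  # payload already starts at 8, no NOP needed
```

On a 32-bit system (`pointer_size=4`) the sketch never inserts a NOP, which matches the commit's statement that the extra alignment only matters on 64-bit systems.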
José Fonseca	fbc3e030e6	util/u_atomic: Provide a _InterlockedCompareExchange8 for older MSVC. Fixes build with Windows SDK 7.0.7600. Tested with u_atomic_test, both on x86 and x86_64. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-01-30 15:24:34 +00:00
José Fonseca	d7f2dfb67e	util/u_atomic: Use _Interlocked* intrinsics for non 64bits. The intrinsics are universally available, whereas older Windows SDKs (e.g. 7.0.7600) don't have the non-intrinsic entrypoint. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-01-30 15:24:33 +00:00
Neil Roberts	a7eec6d620	i965/skl: Force a BINDING_TABLE_POINTER_* after push constant command According to the SKL bspec the 3DSTATE_CONSTANT_* commands only take effect on the next corresponding 3DSTATE_BINDING_TABLE_POINTER_* command. This patch just makes it set the BRW_NEW_SURFACES state when uploading the push constants to ensure the binding tables will be updated. This fixes the fbo-blending-formats Piglit test and possibly others. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-30 12:25:13 +00:00
Topi Pohjolainen	083fb215e1	meta: Don't write depth when decompressing tex-images Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-30 09:59:13 +02:00
Topi Pohjolainen	c49c750579	meta: Don't write depth when generating miptrees Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-30 09:59:04 +02:00
Topi Pohjolainen	941aced635	meta/blit: Compile programs with and without depth When color buffers alone are concerned the depth is not needed. No regression on BDW where meta blit is used instead of blorp. I also disabled blorp temporarily for fbo-blits on IVB and saw no regressions there either. I also compared several graphics benchmarks on BDW and saw neither regressions nor improvements. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-30 09:58:32 +02:00
Topi Pohjolainen	97caf5fa04	meta/blit: Write depth only when asked for Implementing an idea from Ken, on i965 the shader program for 2D blits becomes significantly simpler. Before: pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 1Q compacted }; pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 1Q compacted }; send(8) g2<1>UW g6<8,8,1>F sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 1Q }; mov(8) g123<1>F g2<8,8,1>F { align1 1Q compacted }; mov(8) g124<1>F g3<8,8,1>F { align1 1Q compacted }; mov(8) g125<1>F g4<8,8,1>F { align1 1Q compacted }; mov(8) g126<1>F g5<8,8,1>F { align1 1Q compacted }; mov(8) g127<1>F g2<8,8,1>F { align1 1Q compacted }; nop ; sendc(8) null g123<8,8,1>F render RT write SIMD8 LastRT Surface = 0 mlen 5 rlen 0 { align1 1Q EOT }; After: pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 1Q compacted }; pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 1Q compacted }; send(8) g124<1>UW g6<8,8,1>F sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 1Q }; sendc(8) null g124<8,8,1>F render RT write SIMD8 LastRT Surface = 0 mlen 4 rlen 0 { align1 1Q EOT }; v2 (Matt): Removed unintended white-space change Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-30 09:57:51 +02:00
Topi Pohjolainen	4c157d34c0	meta/blit: Add plumbing for shaders without depth Currently all blit programs are unconditionally compiled with gl_FragDepth. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-30 09:54:53 +02:00
Jason Ekstrand	604ae33c8b	nir/opt_algebraic: Add some constant bcsel reductions total instructions in shared programs: 5998190 -> 5997603 (-0.01%) instructions in affected programs: 54276 -> 53689 (-1.08%) helped: 293 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-29 17:11:13 -08:00
Jason Ekstrand	7f19cd5a56	nir/opt_algebraic: Add some boolean simplifications total instructions in shared programs: 5998321 -> 5998287 (-0.00%) instructions in affected programs: 4520 -> 4486 (-0.75%) helped: 8 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-29 17:11:10 -08:00
Jason Ekstrand	70273c5cd5	nir/algebraic: Support specifying variable as constant or by type Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-29 17:07:45 -08:00
Jason Ekstrand	81f77e4f3a	nir/algebraic: Fail to compile if a variable is used in a replace but not the search Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-29 17:07:45 -08:00
Jason Ekstrand	026b5cc792	nir/search: Allow for matching variables based on types This allows you to match on an unknown value but only if it is of a given type. 90% of the uses of this are for matching only booleans, but adding the generality of arbitrary types is no more complex. nir_algebraic.py doesn't handle this yet but that's ok because the C language will ensure that the default type on all variables is void. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-29 17:07:45 -08:00
Jason Ekstrand	d8999bcdce	nir/search: Add support for matching unknown constants There are some algebraic transformations that we want to do but only if certain things are constants. For instance, we may want to replace a * (b + c) with (a * b) + (a * c) as long as a and either b or c is constant. While this generates more instructions, some of it will get constant folded. nir_algebraic.py doesn't handle this yet, but that's ok because the C language will make sure that false is the default for now. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-29 17:07:45 -08:00
Jason Ekstrand	5ab1489ae6	nir: Add an invalid type This allows us to indicate a concept of an invalid type. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-29 17:07:45 -08:00
Roland Scheidegger	f01e8d3ba5	gallium/docs: fix docs wrt ARL/ARR/FLR since the address reg holds integer values, ARL/ARR do an implicit float-to-int conversion, so clarify that. Thus it is also incorrect to say that FLR really does the same as ARL. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-01-29 22:08:12 +01:00
Eric Anholt	fc884eadf1	nir: Add variants of some of the comparison simplifications. We end up with these from TGSI-to-NIR because the pass generating the comparisons doesn't know if the arg is actually a bool input or not. vc4 results: total instructions in shared programs: 41801 -> 41508 (-0.70%) instructions in affected programs: 4253 -> 3960 (-6.89%) Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-29 11:44:06 -08:00
Eric Anholt	2b9c3bace7	vc4: Fix point size handling when it's the first output.	2015-01-29 11:43:33 -08:00
Eric Anholt	9a3a60cb13	nir: Don't try to to-SSA ALU instructions that are already SSA. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-29 11:43:33 -08:00
Eric Anholt	68d476167c	nir: Fix a bit of broken indentation. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-29 11:42:08 -08:00
Eric Anholt	36c604c824	nir: Add a couple of helpers for glsl types. This will be used by tgsi_to_nir, which needs to get vec4 types for declaring shader input/output variables. v2: Add a missing space. Reviewed-by: Matt Turner <mattst88@gmail.com> (v2) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-29 11:41:17 -08:00
Emil Velikov	765cfe9a90	docs: fix mesa 10.4.3 release date Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-01-29 14:02:48 +00:00
Kalyan Kondapally	e638841b87	Mesa: Advertise GL_OES_texture_float extensions support with i965. This patch advertises support for GL_OES_texture_float extensions when using i965 drivers. Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com> Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-01-29 08:22:12 +02:00
Kalyan Kondapally	2c2a92d5b8	Mesa: Add support for HALF_FLOAT_OES type. This patch adds needed support for accepting HALF_FLOAT_OES as valid type for TexImageD and TexSubImageD when Texture Float extensions are supported. Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com> Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-01-29 08:21:41 +02:00
Kalyan Kondapally	a63c8a524b	Mesa: Add support for GL_OES_texture_float extensions. This patch series adds support for following GLES2 Texture Float extensions: 1)GL_OES_texture_float, 2)GL_OES_texture_half_float, 3)GL_OES_texture_float_linear, 4)GL_OES_texture_half_float_linear. This patch adds basic infrastructure and needed boolean flags to advertise support for these extensions, by default the support is disabled. Next patch in the series introduces support for HALF_FLOAT_OES token. v4: take assert away and make valid_filter_for_float conditional (Tapani), fix the alphabetical order (Emil) Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com> Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-01-29 08:16:47 +02:00
Eric Anholt	dd4d9a4e62	nir: Make vec-to-movs handle src/dest aliasing. It now emits vector MOVs instead of a series of individual MOVs, which should be useful to any vector backends. This pushes the problem of src/dest aliasing of channels on a scalar chip to the backend, but if there are any vector operations in your shader then you needed to be handling this already. Fixes fs-swap-problem with my scalarizing patches. v2: Rename to insert_mov(), and add a comment about what it does. v3: Rewrite the comment. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v3)	2015-01-28 16:33:34 -08:00
Eric Anholt	d70eb38517	gallium: Replace u_simple_list.h with util/simple_list.h The code was exactly the same, except util/ has c++ guards and a struct simple_node declaration. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-01-28 16:33:34 -08:00
Eric Anholt	7c99187c6a	mesa: Port a variant of `68afbe89c7` to util/ The idea is that after a remove_from_list(), you might want to be able to do a remove_from_list() on it again or an is_empty_list(). This is apparently relied on by r300g. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-01-28 16:33:34 -08:00
Eric Anholt	8ab6759cef	mesa: Move simple_list.h to src/util. We have two copies of it in the tree, I'm going to delete one. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-01-28 16:33:34 -08:00
Tom Stellard	2397a72129	radeonsi: Enable VGPR spilling for all shader types v5 v2: - Only emit write SPI_TMPRING_SIZE once per packet. - Use context global scratch buffer. v3: - Patch shaders using WRITE_DATA packet instead of map/unmap. - Emit ICACHE_FLUSH, CS_PARTIAL_FLUSH, PS_PARTIAL_FLUSH, and VS_PARTIAL_FLUSH when patching shaders. v4: - Code cleanups. - Remove unnecessary multiplies. v5: - Patch shaders in system memory and re-upload to vram. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-28 21:03:47 +00:00
Tom Stellard	5dcd97f25c	radeonsi/compute: Allocate the scratch buffer during state creation This moves scratch buffer allocation from si_launch_grid() to si_create_compute_state(). This helps to reduce the overhead of launching a kernel and also fixes a bug in the code that would cause the scratch buffer to be too small if a kernel with smaller scratch size was launched before a kernel with a larger scratch size. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-28 21:03:46 +00:00
Tom Stellard	32206c5e56	radeonsi: Add radeon_shader_binary member to struct si_shader Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-28 21:03:46 +00:00
Tom Stellard	37559f8dfc	radeonsi/compute: Rename si_compute::program to si_compute::shader Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-28 21:03:46 +00:00
Marek Olšák	5935edd47c	radeonsi: Avoid leaking memory when rebuilding shader states Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-28 21:03:46 +00:00
Jason Ekstrand	bb26ebac13	nir/opcodes: Use a return type of tfloat for ldexp Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-28 13:21:40 -08:00
Jason Ekstrand	7ac79eea1a	Revert "util: Move the alternate fpclassify implementation to util" This reverts commits `d6eb572905` and `58e8468d11`. This is no longer necessary as we aren't using it in NIR anymore. Also, it broke the build on some strange systems so let's put it back in querymatrix where it came from. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88852 Acked-by: Matt Turner <mattst88@gmail.com>	2015-01-28 13:20:26 -08:00
Jason Ekstrand	f0340ff625	Revert "nir/opcodes: Use fpclassify() instead of isnormal() for ldexp" This reverts commit `d7d340fb2f`. We have an isnormal() implementation available, the only problem was that we had the wrong return type (fixed in a later patch). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88806 Acked-by: Matt Turner <mattst88@gmail.com>	2015-01-28 13:19:47 -08:00
Jason Ekstrand	58e8468d11	util: Predicate the fpclassify fallback on !defined(__cplusplus) The problem is that the fallbacks we have at the moment don't work in C++. While we could theoretically fix the fallbacks it would also raise the issue of correctly detecting the fpclassify function. So, for now, we'll just disable it until we actually have a C++ user. Reported-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: EdB <edb+mesa@sigluy.net>	2015-01-28 11:47:56 -08:00
Sven Arvidsson	3b7747c022	drirc: set allow_glsl_extension_directive_midshader for Dead Island. Signed-off-by: Sven Arvidsson <sa@whiz.se> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87076 Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-01-28 14:50:28 +01:00
Jason Ekstrand	d7d340fb2f	nir/opcodes: Use fpclassify() instead of isnormal() for ldexp Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88806 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-28 03:42:41 -08:00
Jason Ekstrand	d6eb572905	util: Move the alternate fpclassify implementation to util Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-28 03:42:41 -08:00
Jason Ekstrand	5e8468e6da	i965/tex: Don't create read-write textures with non-renderable formats I haven't actually seen this bug in the wild, but it's possible that someone could ask to do an S3TC PBO download or something. This protects us from accidentally creating a render target with a compressed or otherwise non-renderable format. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-28 01:28:32 -08:00
Jason Ekstrand	34723c0861	i965/gen8: Include the buffer offset when emitting renderbuffer relocs Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88792 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-28 01:28:31 -08:00
Tapani Pälli	291d7ef84d	mesa: improve error messaging for format CSV parser Patch adds 2 error messages that point the user directly to the misspelled or impossible swizzle field for a format. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-28 10:40:15 +02:00
EdB	6ee5effac1	clover/llvm: Dump the OpenCL C code earlier. [ Francisco Jerez: As discussed on the mailing list, this is intended to produce more useful debug output in cases where the compilation terminates unexpectedly. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-01-28 02:27:41 +02:00
EdB	13d23a9a17	clover/llvm: Move CLOVER_DEBUG stuff into anonymous namespace. [ Francisco Jerez: As we're at it make debug_options[] local to its only user and remove temporary. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-01-28 02:27:41 +02:00
Dave Airlie	349df23eb0	r600g: add support for primitive id without geom shader (v2) GLSL 1.50 specifies a fragment shader may have a primitive id input without a geometry shader present. On r600 hw there is a special GS scenario for this, you have to enable GS_SCENARIO_A and pass the primitive id through the vertex shader which operates in GS_A mode. This is a first pass attempt at this, and passes the piglit tests that test for this. v1.1: clean up debug print + no need to assign key value to setup output. v2: add r600 support Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-01-28 09:51:21 +10:00
Dave Airlie	cc2fc095bf	r600g: move selecting the pixel shader earlier. In order to detect that a pixel shader has a prim id input when we have no geometry shader we need to reorder the shader selection so the pixel shader is selected first, then the vertex shader key can take into account the primitive id input requirement and lack of geom shader. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-01-28 09:51:02 +10:00
Michel Dänzer	5c83a0d2ce	st/clover: Pass target instead of target.begin() to std::string() Fixes reading beyond allocated memory: ==1936== Invalid read of size 1 ==1936== at 0x4C2C1B4: strlen (vg_replace_strmem.c:412) ==1936== by 0x9E00C30: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.20) ==1936== by 0x5B44FAE: clover::compile_program_llvm(clover::compat::string const&, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&, pipe_shader_ir, clover::compat::string const&, clover::compat::string const&, clover::compat::string&) (invocation.cpp:698) ==1936== by 0x5B39A20: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63) ==1936== by 0x5B20152: clBuildProgram (program.cpp:182) ==1936== by 0x400F41: main (hello_world.c:109) ==1936== Address 0x56fee1f is 0 bytes after a block of size 15 alloc'd ==1936== at 0x4C28C20: malloc (vg_replace_malloc.c:296) ==1936== by 0x5B398F0: alloc (compat.hpp:59) ==1936== by 0x5B398F0: vector<std::basic_string<char> > (compat.hpp:98) ==1936== by 0x5B398F0: string<std::basic_string<char> > (compat.hpp:327) ==1936== by 0x5B398F0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63) ==1936== by 0x5B20152: clBuildProgram (program.cpp:182) ==1936== by 0x400F41: main (hello_world.c:109) Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2015-01-27 16:55:29 +09:00
Michel Dänzer	ee31c8d706	r600g,radeonsi: Fix calculation of IR target cap string buffer size Fixes writing beyond the allocated buffer: ==31855== Invalid write of size 1 ==31855== at 0x50AB2A9: vsprintf (iovsprintf.c:43) ==31855== by 0x508F6F6: sprintf (sprintf.c:32) ==31855== by 0xB59C7EC: r600_get_compute_param (r600_pipe_common.c:526) ==31855== by 0x5B2B7DE: get_compute_param<char> (device.cpp:37) ==31855== by 0x5B2B7DE: clover::device::ir_target() const (device.cpp:201) ==31855== by 0x5B398E0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63) ==31855== by 0x5B20152: clBuildProgram (program.cpp:182) ==31855== by 0x400F41: main (hello_world.c:109) ==31855== Address 0x56fed5f is 0 bytes after a block of size 15 alloc'd ==31855== at 0x4C29180: operator new(unsigned long) (vg_replace_malloc.c:324) ==31855== by 0x5B2B7C2: allocate (new_allocator.h:104) ==31855== by 0x5B2B7C2: allocate (alloc_traits.h:357) ==31855== by 0x5B2B7C2: _M_allocate (stl_vector.h:170) ==31855== by 0x5B2B7C2: _M_create_storage (stl_vector.h:185) ==31855== by 0x5B2B7C2: _Vector_base (stl_vector.h:136) ==31855== by 0x5B2B7C2: vector (stl_vector.h:278) ==31855== by 0x5B2B7C2: get_compute_param<char> (device.cpp:35) ==31855== by 0x5B2B7C2: clover::device::ir_target() const (device.cpp:201) ==31855== by 0x5B398E0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63) ==31855== by 0x5B20152: clBuildProgram (program.cpp:182) ==31855== by 0x400F41: main (hello_world.c:109) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-01-27 16:54:38 +09:00
Connor Abbott	f1a9252def	nir: fix a bug with constant folding non-per-component instructions Before, we were only copying the first N channels, where N is the size of the SSA destination, which is fine for per-component instructions, but non-per-component instructions like fdot3 can have more source components than destination components. Fix this using the helper function introduced in the last patch. v2: use new helper name Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-26 21:26:36 -05:00
Connor Abbott	816f0515a2	nir: add a helper function for getting the number of source components Unlike with non-SSA ALU instructions, where if they're per-component you have to look at the writemask to know which source channels are being used, SSA ALU instructions always have all the possible channels enabled so we can just look at the number of components in the SSA definition for per-component instructions to say how many source components are being used. v2: use new name nir_ssa_alu_instr_src_components() Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-26 21:26:36 -05:00
Sisinty Sasmita Patra	90bd943f2a	i965: Implement a tiled fast-path for glReadPixels and glGetTexImage Added intel_readpixels_tiled_memcpy and intel_gettexsubimage_tiled_memcpy functions. These are the fast paths for glReadPixels and glGetTexImage. On Chrome, using the RoboHornet 2D Canvas toDataURL test, this patch cuts the amount of time spent in glReadPixels by more than half and reduces the time of the entire test by 10%. v2: Jason Ekstrand <jason.ekstrand@intel.com> - Refactor to make the functions look more like the old intel_tex_subimage_tiled_memcpy - Don't export the readpixels_tiled_memcpy function - Fix some pointer arithmetic bugs in partial image downloads (using ReadPixels with a non-zero x or y offset) - Fix a bug when ReadPixels is performed on an FBO wrapping a texture miplevel other than zero. v3: Jason Ekstrand <jason.ekstrand@intel.com> - Better documentation for the *_tiled_memcpy functions - Add target restrictions for renderbuffers wrapping textures v4: Jason Ekstrand <jason.ekstrand@intel.com> - Only check the return value of brw_bo_map for error and not bo->virtual v5: Jason Ekstrand <jason.ekstrand@intel.com> - Don't unnecessarily repeat a comment Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-01-26 17:29:35 -08:00
Sisinty Sasmita Patra	b52959c602	i965/tiled_memcpy: Add tiled-to-linear paths This commit adds tiled copy functions for copying from tiled memory to linear memory. These are very similar to the existing linear-to-tiled paths. v2: Jason Ekstrand <jason.ekstrand@intel.com> - New commit message - Various whitespace fixes - Added ptrdiff_t casts as done in commit `225a09790` v3: Jason Ekstrand <jason.ekstrand@intel.com> - Fixed a comment Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-01-26 17:29:34 -08:00
Sisinty Sasmita Patra	009be40b7d	i965: Refactor tiled memcpy functions and move them into their own file This commit refactors the tiled_memcpy code in intel_tex_subimage.c and moves it into its own intel_tiled_memcpy files. Also, xtile_copy and ytile_copy are renamed to linear_to_xtiled and linear_to_ytiled respectively. The *_faster functions are similarly renamed. There was also a bit of logic to select between the libc provided memcpy function and our custom memcpy that does an RGBA -> BGRA swizzle. This was moved into an intel_get_memcpy function so that rgba8_copy can live (and be inlined) in intel_tiled_memcpy.c. v2: Jason Ekstrand <jason.ekstrand@intel.com> - Better commit message - Fix up the copyright on the intel_tiled_memcpy files - Various whitespace fixes - Moved a bunch of stuff that did not need to be exposed from intel_tiled_memcpy.h to intel_tiled_memcpy.c - Added proper documentation for intel_get_memcpy - Incorporated the ptrdiff_t tweaks from commit `225a09790` v3: Jason Ekstrand <jason.ekstrand@intel.com> - Fixed a comment - Move the tile size constants into the .c file Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-01-26 17:29:34 -08:00
Jason Ekstrand	f883aac06e	i965/tex_subimage: Use the fast tiled path for rectangle textures There's no reason why we should be doing this for 2D textures and not rectangles. Just a matter of adding another hunk to the condition. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-01-26 17:29:34 -08:00
Dave Airlie	ea9ae5d51a	mesa/autoconf: attempt to use gnu99 on older gcc compilers anonymous structs/union don't work with c99 but do work with gnu99 on gcc 4.4. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-01-27 10:27:56 +10:00
Felix Janda	2e2087a9eb	mesa: simplify detection of fpclassify Fixes compilation with musl libc. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-26 14:07:57 -08:00
Jason Ekstrand	dd74369a0a	nir/opcodes: Don't go through doubles when constant-folding iabs Previously, we called the abs() function in math.h. However, this involves unnecessarily going through double. This commit changes it to use integers directly with a ternary. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-26 11:25:02 -08:00
Jason Ekstrand	9bd28fe3a3	nir/opcodes: Simplify and fix the unpack_half_*_split_* constant expressions Previously, these functions were explicitly writing to dst.x and dst.y. However they both return only one component so writing to dst.y is invalid. Also, since they only return one component, we don't need the explicit assignment in the expression and can simplify it to use an implicit assignment. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-26 11:25:02 -08:00
Jason Ekstrand	27c6e3e4ca	nir: Use pointers for nir_src_copy and nir_dest_copy This avoids the overhead of copying structures and better matches the newly added nir_alu_src_copy and nir_alu_dest_copy. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-26 11:24:58 -08:00
Kenneth Graunke	9f5fee8804	i965: Handle CMP.nz ... 0 and MOV.nz similarly in cmod propagation. "MOV.nz null src" and "CMP.nz null src 0" are equivalent instructions. Previously, we deleted MOV.nz instructions when the instruction generating the MOV's source also wrote the flag register (as the flag register already contains the desired value). However, we wouldn't delete CMP.nz instructions that served the same purpose. We also didn't attempt true cmod propagation on MOV.nz instructions, while we would for the equivalent CMP.nz form. This patch fixes both limitations, treating both forms equally. CMP.nz instructions will now be deleted (helping the NIR backend), and MOV.nz instructions will have their .nz propagated. No changes in shader-db without NIR. With NIR, total instructions in shared programs: 6006153 -> 5969364 (-0.61%) instructions in affected programs: 2087139 -> 2050350 (-1.76%) helped: 10704 HURT: 0 GAINED: 2 LOST: 2 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-26 10:13:18 -08:00
Jan Vesely	9cbb9165b9	clover: Fix build with llvm after r226981 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88783 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2015-01-26 09:46:41 -05:00
Niels Ole Salscheider	4b94c3fc31	configure: Link against all LLVM targets when building clover Since `8e7df519bd`, we initialise all targets in clover. This fixes bug 85380. v2: Mention correct bug in commit message Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-25 18:11:03 +02:00
Connor Abbott	0aa31bf9c3	nir/constant_folding: use the new constant folding infrastructure Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-24 21:35:35 -08:00
Jason Ekstrand	89285e4d47	nir: add new constant folding infrastructure Add a required field to the Opcode class, const_expr, that contains an expression or statement that computes the result of the opcode given known constant inputs. Then take those const_expr's and expand them into a function that takes an opcode and an array of constant inputs and spits out the constant result. This means that when adding opcodes, there's one less place to update, and almost all the opcodes are self-documenting since the information on how to compute the result is right next to the definition. The helper functions in nir_constant_expressions.c were taken from ir_constant_expressions.cpp. v3 Jason Ekstrand <jason.ekstrand@iastate.edu> - Use mako to generate one function per opcode instead of doing piles of string splicing v4 Jason Ekstrand <jason.ekstrand@iastate.edu> - More comments and better indentation in the mako - Add a description of the constant expression language in nir_opcodes.py - Added nir_constant_expressions.py to EXTRA_DIST in Makefile.am Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-24 21:35:35 -08:00
Connor Abbott	fa4bc6c130	nir: use Python to autogenerate opcode information Before, we used a system where a file, nir_opcodes.h, defined some macros that were included to generate the enum values and the nir_op_infos structure. This worked pretty well, but for development the error messages were never very useful, Python tools couldn't understand the opcode list, and it was difficult to use nir_opcodes.h to do other things like autogenerate a builder API. Now, we store opcode information in nir_opcodes.py, and we have nir_opcodes_c.py to generate the old nir_opcodes.c and nir_opcodes_h.py to generate nir_opcodes.h, which contains all the enum names and gets included into nir.h like before. In addition to solving the above problems, using Python and Mako to generate everything means that it's much easier to keep information centralized as we add new things like constant propagation that require per-opcode information. v2: - make Opcode derive from object (Dylan) - don't use assert like it's a function (Dylan) - style fixes for fnoise, use xrange (Dylan) - use iterkeys() in nir_opcodes_h.py (Dylan) - use pydoc-style comments (Jason) - don't make fmin/fmax commutative and associative yet (Jason) Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> v3 Jason Ekstrand <jason.ekstrand@intel.com> - Alphabetize source file lists - Generate nir_opcodes.h in the builddir instead of the source dir - Include $(builddir)/src/glsl/nir in the i965 build - Rework nir_opcodes.h generation so it generates a complete header file instead of one that has to be embedded inside an enum declaration	2015-01-24 21:33:56 -08:00
Emil Velikov	d2811c29da	docs: add news item and link release notes for mesa 10.4.3 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-01-24 13:18:10 +00:00
Emil Velikov	48818a0fc7	docs: Add sha256 sums for the 10.4.3 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `49a5bce780`)	2015-01-24 13:14:56 +00:00
Emil Velikov	9f35423270	Add release notes for the 10.4.3 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `e92bfa3f95`)	2015-01-24 13:14:54 +00:00
Matt Turner	94e7b59a75	i965: Convert CMP.GE -(abs)reg 0 -> CMP.Z reg 0. total instructions in shared programs: 5952059 -> 5951603 (-0.01%) instructions in affected programs: 138812 -> 138356 (-0.33%) GAINED: 1 LOST: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:40 -08:00
Matt Turner	40ae302a3c	i965/fs: Add support for removing MOV.NZ instructions. For some reason, we occasionally write the flag register with a MOV.NZ instruction: add(8) g25<1>F -g6<0,1,0>F g15<8,8,1>F cmp.l.f0(8) g26<1>D g25<8,8,1>F 0F mov.nz.f0(8) null g26<8,8,1>D A MOV.NZ instruction on the result of a CMP is like comparing for equality with true in C. It's useless. Removing it allows us to generate: add.l.f0(8) null -g6<0,1,0>F g15<8,8,1>F total instructions in shared programs: 5955701 -> 5951657 (-0.07%) instructions in affected programs: 302910 -> 298866 (-1.34%) GAINED: 1 LOST: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:40 -08:00
Matt Turner	9a3a294224	i965/fs: Allow flipping cond mod for negated arguments. This allows us to apply the optimization in cases where the CMP's argument is negated, by flipping the conditional mod. For example, it allows us to optimize this: add(8) temp a b cmp.l.f0(8) null -temp 0.0 into add.g.f0(8) temp a b total instructions in shared programs: 5958360 -> 5955701 (-0.04%) instructions in affected programs: 466880 -> 464221 (-0.57%) GAINED: 0 LOST: 1 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:40 -08:00
Matt Turner	d6317beb46	i965/fs: Propagate cmod across flag read if it contains the same value. total instructions in shared programs: 5959463 -> 5958900 (-0.01%) instructions in affected programs: 70031 -> 69468 (-0.80%) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:40 -08:00
Matt Turner	3fb5b2bc47	i965/fs: Add unit tests for cmod propagation pass. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:40 -08:00
Matt Turner	19f9cb72c8	i965/fs: Add pass to propagate conditional modifiers. total instructions in shared programs: 5974160 -> 5959463 (-0.25%) instructions in affected programs: 1743737 -> 1729040 (-0.84%) GAINED: 0 LOST: 12 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:40 -08:00
Matt Turner	3759a89ad3	i965/fs: Eliminate null-dst instructions without side-effects. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:40 -08:00
Matt Turner	7452f18b22	i965/fs: Apply conditional mod specially to split MAD/LRP. Otherwise we'll apply the conditional mod to only one of SIMD8 instructions and trigger an assertion. NoDDClr/NoDDChk have the same problem but we never apply those to these instructions, so I'm leaving them for a later time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:40 -08:00
Matt Turner	eed7223243	i965/fs: Add a pass to fixup 3-src instructions that have a null dest. 3-src instructions can only have GRF/MRF destinations. It's really difficult to deal with that restriction in dead code elimination (that wants to give instructions null destinations to show that their result isn't used) while allowing 3-src instructions to have conditional mod, so don't, and just give them a destination before register allocation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:39 -08:00
Matt Turner	215b081c2a	i965: Add is_3src() to backend_instruction. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:39 -08:00
Matt Turner	0654ca7d7e	i965: Add backend_instruction::can_do_cmod(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:39 -08:00
Matt Turner	71486e9f2d	i965/cfg: Add a foreach_block_reverse macro. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 17:57:39 -08:00
Matt Turner	65dd4a255a	i965/cfg: Add a foreach_inst_in_block_reverse_safe macro. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 17:57:39 -08:00
Matt Turner	579157e6c1	glsl: Add a foreach_in_list_reverse_safe macro. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:57:39 -08:00
Matt Turner	c638ea3d19	i965: Don't make instructions with a null dest a barrier to scheduling. Now that we properly track accumulator dependencies, the scheduler is able to schedule instructions between the mach and mov in the common integer multiplication pattern: mul acc0, x, y mach null, x, y mov dest, acc0 Since a null destination implies no dependency on the destination, we can also safely schedule instructions (that don't write the accumulator) between the mul and mach. GAINED: 103 LOST: 43 Causes one program to spill (643 -> 1076 instructions). I committed this patch last year (commit `42a26cb5`) but reverted it (commit `0d3f83f4`) after inexplicable artifacts in Kerbal Space Program (bug 78648). Tapani reapplied this patch and could not reproduce the bug with current Mesa. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 17:57:39 -08:00
Ian Romanick	f02f1af9f7	i965/fs: Allow SIMD16 on pre-SNB when try_replace_with_sel is successful If try_replace_with_sel is able to replace the flow control with a SEL instruction, then there is no flow control... failing SIMD16 because of nonexistent flow control is wrong. No piglit regressions on any i965 platform in Jenkins. total instructions in shared programs: 4382707 -> 4382707 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 GAINED: 2089 LOST: 0 No other platforms affected in shader-db. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-23 17:34:47 -08:00
Eric Anholt	0680d170d1	nir: Expose nir_print_instr() for debug prints It's nice to have this present in your default cases so you can see what instruction is triggering an abort. v2: Just pass a NULL state, now that it won't crash when you do. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 17:30:11 -08:00
Eric Anholt	6445a40520	nir: When asked to print with a NULL state, just use bare variable names. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 17:30:01 -08:00
Eric Anholt	447ddfc137	nir: Add nir_lower_alu_to_scalar. This is the equivalent of brw_fs_channel_expressions.cpp, which I wanted for vc4. v2: Use the nir_src_for_ssa() helper, and another instance of nir_alu_src_copy(). v3: Drop the non-SSA support. All intended callers will have SSA-only ALU ops. v4: Use insert_before, drop stale bcsel/fcsel comment, drop now-unused unsupported() function, drop lower_context struct. v5: Completely rename the pass to nir_lower_alu_to_scalar(), add an assert about weird input_sizes[]. Reviewed-by: Jason Ekstrand <jason.ekstrand@iastate.edu>	2015-01-23 16:37:23 -08:00
Eric Anholt	b200127816	nir: Make some helpers for copying ALU src/dests. There aren't many users yet, but I wanted to do this from my scalarizing pass. v2: Constify the src arguments. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 16:37:16 -08:00
Kenneth Graunke	15063d2ad0	nir: Add algebraic optimizations for division and reciprocal. These also exist in opt_algebraic.cpp. total NIR instructions in shared programs: 2011430 -> 2011211 (-0.01%) NIR instructions in affected programs: 42221 -> 42002 (-0.52%) helped: 198 total i965 instructions in shared programs: 6020553 -> 6020116 (-0.01%) i965 instructions in affected programs: 84322 -> 83885 (-0.52%) helped: 394 HURT: 1 (by 1 instruction) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	bbd60f6d79	nir: Add algebraic optimizations for exponential/logarithmic functions. Most of these exist in the GLSL IR algebraic pass already. However, SSA allows us to find more instances of the patterns. total NIR instructions in shared programs: 2015593 -> 2011430 (-0.21%) NIR instructions in affected programs: 124189 -> 120026 (-3.35%) helped: 604 total i965 instructions in shared programs: 6025505 -> 6018717 (-0.11%) i965 instructions in affected programs: 261295 -> 254507 (-2.60%) helped: 1295 HURT: 3 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	391fb32bbe	nir: Add algebraic optimizations for simplifying comparisons. The first batch removes bonus fnot/inot operations, possibly allowing other optimizations to better recognize patterns. The next batch replaces a fadd and constant 0.0 with an fneg - negation is usually free on GPUs, while addition is not. total NIR instructions in shared programs: 2020814 -> 2015593 (-0.26%) NIR instructions in affected programs: 411143 -> 405922 (-1.27%) helped: 2233 HURT: 214 A few shaders are hurt by a few instructions due to moving neg such that it has a constant operand, which is then folded, resulting in two distinct load_consts for x and -x. We can always clean that up later. total i965 instructions in shared programs: 6035392 -> 6025505 (-0.16%) i965 instructions in affected programs: 784980 -> 775093 (-1.26%) helped: 4508 HURT: 2 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	551a752a59	nir: Add algebraic optimizations for pointless shifts. The GLSL IR optimization pass contained these; we may as well include them too. v2: Fix a >> 0 and a << 0 optimizations (caught by Matt). No change in the number of NIR instructions on a shader-db run. total i965 instructions in shared programs: 6035397 -> 6035392 (-0.00%) i965 instructions in affected programs: 542 -> 537 (-0.92%) helped: 2 (in glamor) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	3e56572c49	nir: Add a bunch of algebraic optimizations on logic/bit operations. Matt and I noticed a bunch of "val <- ior a a" operations in a shader, so we decided to add an algebraic optimization for that. While there, I decided to add a bunch more of them. v2: Delete bogus fand/for optimizations (caught by Jason). total NIR instructions in shared programs: 2023511 -> 2020814 (-0.13%) NIR instructions in affected programs: 149634 -> 146937 (-1.80%) helped: 1032 total i965 instructions in shared programs: 6035392 -> 6035397 (0.00%) i965 instructions in affected programs: 537 -> 542 (0.93%) HURT: 2 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	978b0a9cda	nir: Implement CSE on intrinsics that can be eliminated and reordered. Matt and I noticed that one of the shaders hurt by INTEL_USE_NIR=1 had load_input and load_uniform intrinsics repeated several times, with the same parameters, but each one generating a distinct SSA value. This made ALU operations on those values appear distinct as well. Generating distinct SSA values is silly - these are read only variables. CSE'ing them makes everything use a single SSA value, which then allows other operations to be CSE'd away as well. Generalizing a bit, it seems like we should be able to safely CSE any intrinsics that can be eliminated and reordered. I didn't implement support for variables for the time being. v2: Assert that info->num_variables == 0 (requested by Jason). total NIR instructions in shared programs: 2435936 -> 2023511 (-16.93%) NIR instructions in affected programs: 2413496 -> 2001071 (-17.09%) helped: 16872 total i965 instructions in shared programs: 6028987 -> 6008427 (-0.34%) i965 instructions in affected programs: 640654 -> 620094 (-3.21%) helped: 2071 HURT: 585 GAINED: 14 LOST: 25 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	cbdd623f13	nir: Pull nir_instr_can_cse()'s SSA checks out of the switch. This should not be a change in behavior, as all current cases that potentially answer "yes" require SSA. The next patch will introduce another case that requires SSA. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	d7743bb1c2	i965/nir: Report NIR instruction counts (in SSA form) via KHR_debug. This allows us to count NIR instructions via shader-db. Use "run" as normal. The results file will contain both NIR and assembly. Then, to generate a NIR report: ./report.py <(grep NIR results/foo) <(grep NIR results/bar) Or, to generate an i965 report: ./report.py <(grep -v NIR results/foo) <(grep -v NIR results/bar) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	f3e06fcc6a	i965/nir: Print NIR on INTEL_DEBUG=fs. This is useful for debugging and looking for optimization opportunities. It will need to be expanded when we add support for other scalar stages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-23 14:53:26 -08:00
Kenneth Graunke	faa38e16aa	i965/nir: Do optimizations again just before lowering source mods. We want to run CSE and algebraic optimizations again after lowering IO. Some of the passes in the optimization loop don't handle saturates and other modifiers, so run it before lowering to source modifiers. total instructions in shared programs: 6046190 -> 6045768 (-0.01%) instructions in affected programs: 22406 -> 21984 (-1.88%) helped: 47 HURT: 0 GAINED: 0 LOST: 0 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 14:53:25 -08:00
Matt Turner	9b5efac461	loader: Remove NEED_OPENGL_COMMON check. HAVE_DRICOMMON is sufficient since OpenGL must be enabled for DRI.	2015-01-23 14:28:44 -08:00
Matt Turner	2e7b62cbb9	gitignore: Ignore .tar.xz files.	2015-01-23 14:28:44 -08:00
Matt Turner	dd6f641303	mesa: Build with subdir-objects.	2015-01-23 14:28:44 -08:00
Matt Turner	145919b2ab	glsl: Build a libglsl_util library. Rather than sourcing files with ../dir/file.c which leads to distclean wiping out ../dir's .deps directory.	2015-01-23 14:28:44 -08:00
Matt Turner	a37ae2ab92	mapi: Build with subdir-objects.	2015-01-23 14:28:44 -08:00
Matt Turner	961def1074	mapi: Remove vgapi from SUBDIRS. OpenVG is disabled via autotools.	2015-01-23 14:28:44 -08:00
Matt Turner	ce98519266	mesa: Drop inclusion of glapi_gen.mk. Some glapi headers used to be generated from this Makefile.am, but no longer.	2015-01-23 14:28:43 -08:00
Matt Turner	618c3b35f1	glsl: Build with subdir-objects. Apparently $(top_srcdir) is not expanded in a source list when using subdir-objects, so remove that. It's not clear to me why we were going to such lengths to prefix each source file anyway.	2015-01-23 14:28:42 -08:00
Matt Turner	a8b880bd63	nir: Add headers to distribution.	2015-01-23 14:27:39 -08:00
Matt Turner	ae494281a4	nir: Add nir_{opt_,}algebraic.py to distribution.	2015-01-23 14:26:53 -08:00
Matt Turner	4db329ddff	mesa: Add format_{un,}pack.py to distribution.	2015-01-23 14:26:53 -08:00
Matt Turner	195488e945	mesa: Remove pack_tmp.h from sources. Missed in commit `3a4de321`.	2015-01-23 13:35:25 -08:00
Connor Abbott	68a9d0b36f	nir: add generated file to .gitignore Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-23 10:20:46 -08:00
Ville Syrjälä	f4b31d29d7	i965: Fix min_vs_entries for CHV According to BSpec the correct number for min_vs_entries is 34 for CHV. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2015-01-23 12:09:41 +02:00
Ville Syrjälä	99754446ab	i965: Fix max_wm_threads for CHV Change max_wm_threads to match the spec on CHV. The max number of threads in 3DSTATE_PS is always programmed to 64 and the hardware internally scales that depending on the GT SKU. So this doesn't change the max number of threads actually used, but it does affect the scratch space calculation. On CHV the old value was too small, so the amount of scratch space allocated wasn't sufficient to satisfy the actual max number of threads used. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2015-01-23 12:09:35 +02:00
Connor Abbott	c8761c8559	glsl: fix stale comment Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-23 00:23:51 -05:00
Jason Ekstrand	6be2434031	i965/emit: Assert that src1 is not an MRF after doing the MRF->GRF conversion When emitting texturing from indirect texture units, we need to be able to scratch around in the header message. Since we only do this for >= HSW, this is ok since there are no MRFs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj phogat <anuj.phogat@gmail.com>	2015-01-22 16:00:34 -08:00
Jason Ekstrand	7de8a3e13e	i965/emit: Do the sampler index adjustment directly in header.0.3 Prior to this commit, the adjust_sampler_state_pointer function took an extra register that it could use as scratch space. The usual candidate was the destination of the sampler instruction. However, if that register ever aliased anything important such as the sampler index, this would scratch all over important data. Fortunately, the calculation is such that we can just do it in place and we don't need the scratch space at all. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-01-22 15:19:13 -08:00
Axel Davy	8751734613	st/nine: Correctly handle when ff vs should have no texture coord input/output Previous code semantic was: . if ff ps will not run a ff stage, then do not output texture coords for this stage for vs . if XYZRHW is used (position_t), use only the mode where input coordinates are copied to the outputs. Problem is when apps don't give texture inputs. When apps specify PASSTHRU, it means copy texture coord input to texture coord output if there is such input. The case where there is no texture coord input wasn't handled correctly. Drivers like r300 dislike when vs has inputs that are not fed. Moreover if the app uses ff vs with a programmable ps, we shouldn't look at the parameters of the ff ps to decide whether to output texture coordinates. The new code semantic is: . if XYZRHW is used, restrict to PASSTHRU . if PASSTHRU is used and no texture input is declared, then do not output texture coords for this stage The case where ff ps needs a texture coord input and ff vs doesn't output it is not handled, and should probably be a runtime error. This fixes 3Dmark05, which uses ff vs with programmable ps. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-01-22 22:16:24 +00:00
Axel Davy	77fcff37cf	st/nine: Change comment relating to vertex shader inputs not matching declaration Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-01-22 22:16:24 +00:00
Axel Davy	f8a74410f1	st/nine: Allocate vs constbuf buffer for indirect addressing once. When the shader does indirect addressing on the constants, we allocate a temporary constant buffer to which we copy the constants from the app given user constants and the constants filled in the shader. This patch makes this buffer be allocated once. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:24 +00:00
Axel Davy	e0f75044c8	st/nine: Allocate the correct size for the user constant buffer Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:24 +00:00
Axel Davy	b9cbea9dbc	st/nine: Add variables containing the size of the constant buffers Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:24 +00:00
Axel Davy	a721987077	st/nine: Fix sm3 relative addressing for non-debug build Relative addressing needs the constant buffer to get all the correct constants, even those defined by the shader. The code to copy the shader constants to the constant buffer was enabled only for debug build. Enable it always. Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-01-22 22:16:23 +00:00
Axel Davy	4b7a9cfddb	st/nine: Remove unused code for ps Since constant indirect addressing is not allowed for ps, we can remove our code to handle that. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:23 +00:00
Axel Davy	9690bf33d7	st/nine: Correct rules for relative addressing and constants. Relative addressing for constants is possible only for vs float constants. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:23 +00:00
Axel Davy	bce94ce831	st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGB Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:23 +00:00
Axel Davy	9e23b64c15	st/nine: Implement TEXDP3TEX Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:23 +00:00
Axel Davy	09eb1e901f	st/nine: Implement TEXDP3 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:23 +00:00
Axel Davy	f19e699368	st/nine: Implement TEXDEPTH Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:23 +00:00
Axel Davy	3676ab02fb	st/nine: Implement TEXM3x3SPEC Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:22 +00:00
Axel Davy	2b9f079ae3	st/nine: Implement TEXM3x2TEX Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:22 +00:00
Axel Davy	fdff111dc8	st/nine: implement TEXM3x2DEPTH Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:22 +00:00
Axel Davy	7865210670	st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC The fix is that this line: "src[s] = tx->regs.vT[s];" is wrong if s doesn't start from 0. Instead access tx->regs.vT directly when needed. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:22 +00:00
Axel Davy	b1259544e3	st/nine: Fill missing dst and src number for some instructions. Not filling them correctly results in bad padding and later crash. Reviewed-by: David Heidelberg <david@ixit.cz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:22 +00:00
Axel Davy	5399119fb1	st/nine: Implement TEXCOORD special behaviours texcoord for ps < 1_4 should clamp the values between 0 and 1. texcrd (texcoord ps 1_4) does not clamp and can be used with two modifiers _dw and _dz that mean the channels are divided by w or z. Implement those in shared code, since the same modifiers can be used for texld ps 1_4. v2: replace DIV by RCP + MUL v3: Remove a useless MOV Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:22 +00:00
Axel Davy	30704bbc6e	st/nine: Fix CALLNZ implementation Nothing seems to indicate the negation modifier would be stored in the instruction flags instead of the source modifier. tx_src_param has already handled it if it is in the source modifier. In addition, when the card supports native integers, the booleans are stored in 32-bit ints and are equal to 0 or 0xFFFFFFFF. Given 0xFFFFFFFF is NaN if it was a float, better use UIF than IF. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-01-22 22:16:22 +00:00
Axel Davy	6378d74937	st/nine: Fix some fixed function pipeline operation Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:21 +00:00
Axel Davy	018407b5d8	st/nine: Clamp ps 1.X constants This is wine (and windows) behaviour. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:21 +00:00
Axel Davy	8bbc5e2781	st/nine: Remove duplicated code for ps texcoord input declaration Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-01-22 22:16:21 +00:00
Axel Davy	3ca67f8810	st/nine: Fix CND implementation Signed-off-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:21 +00:00
Axel Davy	dd055176cc	st/nine: Match REP implementation to LOOP Previous implementation was behaving fine, but improve it by: . Improved documentation . Decreasing counter (comparing to 0 is likely to be faster than to constant) . Move the counter update at the end for better performance for shaders that break the loop earlier than when the count is done. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-01-22 22:16:21 +00:00
Axel Davy	6a8e5e48be	st/nine: Rewrite LOOP implementation, and a0 aL handling Previous implementation didn't work well with nested loops. Instead of using several address registers, put a0 and aL into normal registers, and copy them to one address register when we need to use them. Wine tests loop_index_test() and nested_loop_test() now pass correctly. Fixes r600g crash while loading Bioshock - bug https://bugs.freedesktop.org/show_bug.cgi?id=85696 Tested-by: David Heidelberg <david@ixit.cz> Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:21 +00:00
Axel Davy	c9aa9a0add	st/nine: Correct LOG on negative values We should take the absolute value of the input. Also return -FLT_MAX instead of -Inf for an input of 0. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:20 +00:00
Axel Davy	f5e8e3fb80	st/nine: Handle NRM with input of null norm When the input's xyz are 0.0, the output should be 0.0. This is due to the fact that Inf * 0 = 0 for dx9. To handle this case, cap the result of RSQ to FLT_MAX. We have FLT_MAX * 0 = 0. Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:20 +00:00
Axel Davy	2487f73574	st/nine: Handle RSQ special cases We should use the absolute value of the input as input to ureg_RSQ. Moreover, an input of 0.0 should return FLT_MAX. Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:20 +00:00
Axel Davy	c12f8c2088	st/nine: Fix POW implementation POW doesn't match directly TGSI, since we should take the absolute value of src0. Fixes black textures in some games Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:20 +00:00
Axel Davy	e0dd9ca985	st/nine: Fix typo for M4x4 Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-01-22 22:16:20 +00:00
Axel Davy	53dc992f20	st/nine: Correctly declare NineTranslateInstruction_Mkxn inputs Let's say we have c1 and c2 declared in the shader and c0 given by the app Then here we would have read c0, c1 and c2 given by the app, instead of the correct c0, c1, c2. This correction fixes several issues in some games. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:20 +00:00
Axel Davy	9fb58a74a0	st/nine: Saturate oFog and oPts vs outputs According to docs and Wine, these two vs outputs have to be saturated. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:20 +00:00
Axel Davy	a214838181	st/nine: Remove some shader unused code Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:19 +00:00
Axel Davy	d08c7b0b88	st/nine: Convert integer constants to floats before storing them when cards don't support integers The shader code is already behaving as if they are floats when the card doesn't support integers Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:19 +00:00
Axel Davy	d9d18fe39f	st/nine: Rework of boolean constants Convert them to shader booleans at earlier stage. Previous code is fine, but later patch will make integers being converted at earlier stage, so do the same for booleans Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:19 +00:00
Axel Davy	77f0ecf9ce	st/nine: Add ATI1 and ATI2 support Adds ATI1 and ATI2 support to nine. They map to PIPE_FORMAT_RGTC1_UNORM and PIPE_FORMAT_RGTC2_UNORM, but need special handling. Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Xavier Bouchoux <xavierb@gmail.com> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:19 +00:00
Axel Davy	b0b5430322	st/nine: Check if srgb format is supported before trying to use it. According to msdn, we must act as if user didn't ask srgb if we don't support it. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:19 +00:00
Stanislaw Halik	82810d3b66	st/nine: Hack to generate resource if it doesn't exist when getting view Buffers in the MANAGED pool are supposed to have the content in a ram buffer, a copy in VRAM if there is enough memory (driver manages memory and decide when to delete the buffer in VRAM). This is not implemented properly in nine, and a VRAM copy is going to be created when the RAM memory is filled, and the VRAM copy will get synced with the RAM memory updates. Due to some issues (in the implementation or in app logic), it can happen we try to create a sampler view of the resource while we haven't created the VRAM resource. This hack creates the resource when we hit this case, which prevents crashing, but doesn't help with the resource content. This fixes several games crashing at launch. Acked-by: Axel Davy <axel.davy@ens.fr> Acked-by: David Heidelberg <david@ixit.cz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Stanislaw Halik <sthalik@misaki.pl> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:18 +00:00
Axel Davy	47280d777d	st/nine: NineBaseTexture9: update sampler view creation While previous code was having the correct behaviour in general, this new code is more readable (without checking all gallium formats manually) and has a more defined behaviour for depth stencil resources. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:18 +00:00
Axel Davy	0abfb80dac	st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of bad format Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-01-22 22:16:18 +00:00
Axel Davy	0d2c22e648	st/nine: Fix crash when deleting non-implicit swapchain The implicit swapchains are destroyed when the device instance is destroyed. However for non-implicit swapchains, it is not the case, and the application can have kept a reference on the swapchain buffers to reuse them. Fixes problems with battle.net launcher. Cc: "10.4" <mesa-stable@lists.freedesktop.org> Tested-by: Nick Sarnie <commendsarnex@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-01-22 22:16:18 +00:00
Axel Davy	9232161178	st/nine: CubeTexture: fix GetLevelDesc This->surfaces contains the surfaces associated to the levels and faces. This->surfaces[6*Level] is what we want here, since it gives us a face descriptor for the level 'Level'. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Xavier Bouchoux <xavierb@gmail.com> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:18 +00:00
Axel Davy	18c7e70226	st/nine: NineBaseTexture9: fix setting of last_layer Use same similar settings as u_sampler_view_default_template Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:18 +00:00
Axel Davy	05e20e1045	st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS The cap means D3DFVF_XYZRHW vertices will see clipping. This is not the case when PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION is supported, since it'll disable clipping. Reviewed-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:18 +00:00
Xavier Bouchoux	dc88989189	st/nine: Fix D3DRS_POINTSPRITE support It's done by testing the existence of the point sprite output register after parsing the vertex shader. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Xavier Bouchoux <xavierb@gmail.com> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:17 +00:00
Axel Davy	d2f2a550cf	st/nine: Add new texture format strings Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:17 +00:00
Xavier Bouchoux	072e2ba8e1	st/nine: Add missing c++ declaration for IDirect3DVolumeTexture9 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Xavier Bouchoux <xavierb@gmail.com> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:17 +00:00
Xavier Bouchoux	8bb550b958	st/nine: Additional defines to d3dtypes.h Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Xavier Bouchoux <xavierb@gmail.com> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-22 22:16:17 +00:00
Axel Davy	3bc75fcf22	st/nine: Fix clip state logic The clip state was reset everytime, incurring an overhead. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2015-01-22 22:16:17 +00:00
David Heidelberger	23fae79735	st/nine: query: remove unused variable (trivial) Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: David Heidelberg <david@ixit.cz>	2015-01-22 22:16:16 +00:00
Eric Anholt	fc6938d23e	nir: Fix setup of constant bool initializers. brw_fs_nir has only seen scalar bools so far, thanks to vector splitting, and the ralloc of in glsl_to_nir.cpp will usually get you a 0-filled chunk of memory, so reading too large of a value will usually get you the right bool value. But once we start doing vector bools in a few commits, we end up getting bad values. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-22 13:52:19 -08:00
Eric Anholt	534a4ec82f	nir: Make an easier helper for setting up SSA defs. Almost all instructions we nir_ssa_def_init() for are nir_dests, and you have to keep from forgetting to set is_ssa when you do. Just provide the simpler helper, instead. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-22 13:52:19 -08:00
Jonathan Gray	c5be9c126d	glsl: Link glsl_test with pthreads library. Otherwise pthread_mutex_lock will be an undefined reference on OpenBSD. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88219 Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>	2015-01-22 21:29:43 +00:00
Vinson Lee	9db7b12cb2	scons: Add X11 include path if X11 is available. Mac OS X XQuartz places X11 headers at /opt/X11/include. This patch fixes this Mac OS X SCons build error. Compiling src/gallium/state_trackers/glx/xlib/glx_api.c ... In file included from src/gallium/state_trackers/glx/xlib/glx_api.c:34: include/GL/glx.h:30:10: fatal error: 'X11/Xlib.h' file not found ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-01-22 21:29:43 +00:00
José Fonseca	fea35bbf6d	meta: Move loop declaration to top of block. Fixes MSVC build. Trivial.	2015-01-22 20:06:17 +00:00
Jason Ekstrand	d5d4ba9139	i965/tex_subimage: use meta instead of the blitter for PBO TexSubImage Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-01-22 10:37:13 -08:00
Jason Ekstrand	779923194c	i965/tex_image: Use meta instead of the blitter for PBO TexImage and GetTexImage Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-01-22 10:37:09 -08:00
Jason Ekstrand	ef0499af25	i965/pixel_read: Use meta_pbo_GetTexSubImage for PBO ReadPixels Since the meta path can do strictly more than the blitter path, we just remove the blitter path entirely. Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-01-22 10:36:25 -08:00
Jason Ekstrand	8546fe900c	meta: Add an implementation of GetTexSubImage for PBOs Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-01-22 10:36:24 -08:00
Jason Ekstrand	7f396189f0	meta: Add a BlitFramebuffers-based implementation of TexSubImage This meta path, designed for use with PBO's, creates a temporary texture out of the PBO and uses BlitFramebuffers to do the actual texture upload. v2 Jason Ekstrand <jason.ekstrand@intel.com>: - Add support for handling simple packing options v3 Jason Ekstrand <jason.ekstrand@intel.com>: - Refactor to split out the texture-from-pbo code - Rename to _mesa_meta_pbo_TexSubImage Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-01-22 10:36:24 -08:00
Jason Ekstrand	e24d17e08c	formats: Use a hash table for _mesa_format_from_array_format Going through the for loop every time has noticeable overhead. This fixes things up so we only do that once ever and then just do a hash table lookup which should be much cheaper. v2 Jason Ekstrand <jason.ekstrand@intel.com>: - Use once_flag and call_once from c11/threads.h instead of pthreads Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-01-22 10:35:43 -08:00
Jason Ekstrand	333226522c	i965: Implement SetTextureStorageForBufferObject Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-01-22 10:21:07 -08:00
Jason Ekstrand	117a1d69de	i965: Apply the miptree offset to surface state for renderbuffers Previously, we were completely ignoring the mt->offset field for renderbuffers. While it does have some alignment constraints, it is valid to use it. This patch adds the code to each of the 4 surface state setup functions to handle it. Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-01-22 10:21:07 -08:00
Jason Ekstrand	404660e3c7	i965/mipmap_tree: Add a depth parameter to create_for_bo Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-01-22 10:21:07 -08:00
Jason Ekstrand	3298b1235a	mesa/dd: Add a function for creating a texture from a buffer object Reviewed-by: Neil Roberts <neil@linux.intel.com>	2015-01-22 10:21:07 -08:00
Tapani Pälli	adc8cdfa35	glsl: do not allow interface block to have name already taken Fixes currently failing Piglit case interface-blocks-name-reused-globally.vert v2: combine var declaration with assignment (Ian) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-22 07:54:19 +02:00
Matt Turner	28b7c6b285	nir: Replace assert(0) with unreachable(). Fixes a couple of warnings in the process. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-21 21:06:37 -08:00
Matt Turner	6de077f01d	i965/vec4: Fix fprintf argument ordering. Introduced in commit `3167a80b`.	2015-01-21 20:17:26 -08:00
Jason Ekstrand	f88c6a4997	nir: Stop using designated initializers Designated initializers with anonymous unions don't work in MSVC or GCC < 4.6. With a couple of constructor methods, we don't need them any more and the code is actually cleaner. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88467 Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-21 19:55:02 -08:00
Tobias Klausmann	76086d7120	mesa: change assert to unreachable in two format functions This fixes two problems reported by osc: I: Program returns random data in a function E: Mesa no-return-in-nonvoid-function ../../src/mesa/main/format_utils.c:180 E: Mesa no-return-in-nonvoid-function ../../src/mesa/main/glformats.c:2714 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2015-01-21 13:17:27 -08:00
Jason Ekstrand	7da60eca4f	nir: Add src and dest constructors Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-21 12:21:10 -08:00
Jan Vesely	3c3e60e050	mesa: Add assert to check number of vector elements The below code crashes when vector_elements <= 0 Fixes Warray-bounds warnings Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-01-21 14:06:02 +00:00
Jan Vesely	3cb10cce37	mesa: Fix some signed-unsigned comparison warnings v2: s/unsigned int/unsigned/ in prog_optimize.c Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-01-21 14:05:52 +00:00
Jan Vesely	da1f92779d	mesa: remove comparisons that are always true Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-01-21 14:05:04 +00:00
Jason Ekstrand	194f6235b3	nir: Add a nir_foreach_phi_src helper macro Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-20 16:53:29 -08:00
Ben Widawsky	169d7e5cb1	i965: Extract scalar region checking logic There are currently 2 users of this functionality. I have 2 more users coming up, and having a simple function makes the results much cleaner. The existing interface semantics was proposed by Matt. v2 (Ken): Rename to region_matches()/has_scalar_region(). Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-20 15:24:40 -08:00
Ben Widawsky	9394f58383	i965: Add QWORD sizes to type_sz macro GEN8 added the QWORD as a valid type for certain operations on the EU. In order to calculate the number of registers used one must have the type size as part of the equation. Quoting the formula in the code: regs_written = (dst.width * dst.stride * type_sz(dst.type) + 31) / 32; Adding this separately for bisection since there is no simple way to add an assert in the type_sz function. NOTE: As a side note, I was confused for a while because it's impossible to calculate the region, ie. registers needed, without vstride. However, at this point these are all part of the IR, and so no vstride must exist. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-20 15:24:40 -08:00
Eric Anholt	b368c91f26	vc4: Fix build since `8ed5305d28`	2015-01-20 14:19:29 -08:00
Rob Clark	fd6e18d651	freedreno/a4xx: sysmem bypass Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-20 13:27:28 -05:00
Rob Clark	5da3bec44b	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-20 13:27:19 -05:00
Tom Stellard	17a2f11a06	radeonsi: Re-enable LLVM IR dumps This was inadvertently disabled by `761e36b4ca`.	2015-01-20 09:55:44 -05:00
Tom Stellard	73bc0fdb6f	radeonsi/compute: Use relocs for scratch pointer rather than user sgprs v2 Instead of passing a pointer to the scratch buffer via user sgprs, we now patch the shader with the buffer address using reloc information from the LLVM generated ELF. v2: - Make sure not to break older LLVM.	2015-01-20 09:55:44 -05:00
Tom Stellard	dfdaf3eb7e	radeon: Teach radeon_elf_read() how to parse reloc information v3 v2: - Use strdup for copying reloc names. - Free reloc memory. v3: - Add free_relocs parameter to radeon_shader_binary_free_members()	2015-01-20 09:55:43 -05:00
Tom Stellard	5667aa58c4	radeon: Add a helper function for freeing members of radeon_shader_binary	2015-01-20 09:55:43 -05:00
Kenneth Graunke	c4fd0c9052	i965: Work around mysterious Gen4 GPU hangs with minimal state changes. Gen4 hardware appears to GPU hang frequently when using Chromium, and also when running 'glmark2 -b ideas'. Most of the error states contain 3DPRIMITIVE commands in quick succession, with very few state packets between them - usually VERTEX_BUFFERS/ELEMENTS and CONSTANT_BUFFER. I trimmed an apitrace of the glmark2 hang down to two draw calls with a glUniformMatrix4fv call between the two. Either draw by itself works fine, but together, they hang the GPU. Removing the glUniform call makes the hangs disappear. In the hardware state, this translates to removing the CONSTANT_BUFFER packet between the two 3DPRIMITIVE packets. Flushing before emitting CONSTANT_BUFFER packets also appears to make the hangs disappear. I observed a slowdown in glxgears by doing it all the time, so I've chosen to only do it when BRW_NEW_BATCH and BRW_NEW_PSP are unset (i.e. we haven't done a CS_URB_STATE change or already flushed the whole pipeline). I'd much rather understand the problem, but at this point, I don't see how we'd ever be able to track it down further. We have no real tools, and the hardware people moved on years ago. I've analyzed 20+ error states and read every scrap of documentation I could find. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80568 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85367 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Matt Turner <mattst88@gmail.com> Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>	2015-01-19 13:13:51 -08:00
Kenneth Graunke	a5ca86a983	i965/nir: Enable SIMD16 support in the NIR FS backend. With the previous commits in place, it just works. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-19 13:13:50 -08:00
Kenneth Graunke	45123ee818	i965/nir: Use offset() instead of altering reg_offset directly. offset() properly handles reg_width, so it'll work for SIMD16. While we're in the area, simplify a few cases, and use retype() to cut a few more lines of code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-19 13:13:48 -08:00
Kenneth Graunke	3f263ffbb3	i965/nir: Replace fs_reg(GRF, virtual_grf_alloc(...)) with vgrf(...). brw_fs_nir.cpp creates almost all of its registers via: fs_reg reg = fs_reg(GRF, virtual_grf_alloc(num_components)); When we add SIMD16 support, we'll need to set reg->width = 16 and double the VGRF size...on pretty much every VGRF it allocates. This patch replaces that pattern with a new "vgrf" helper method: fs_reg reg = vgrf(num_components); The new function correctly takes reg_width into account. For now, reg_width is always 1, so this should have no functional change. v2: Just make vgrf() account for reg_width right away, rather than changing the behavior in the next patch. v3: Replace one last virtual_grf_alloc I missed. It's used in code that only runs for dispatch_width == 8, so it doesn't matter, but consistency is nice. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-19 13:13:46 -08:00
Kenneth Graunke	d1533d87cc	i965: Replace fs_reg(fs_visitor, type) with fs_visitor::vgrf(type). I dislike how fs_reg has a constructor that knows about fs_visitor. Apart from that, it stands alone, with no need to interact with the rest of the compiler. Which is sensible - a class that represents a register should do just that. Allocating virtual register numbers should be left up to the compiler (fs_visitor). This patch replaces the constructor with a new fs_visitor::vgrf method, eliminating fs_reg's dependency on fs_visitor. It ends up being no more code. v2: Rebase from May 2014 -> January 2015. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-19 13:13:34 -08:00
Marek Olšák	5b01512df3	st/mesa: don't set vs.key.clamp_color if a shader doesn't write any colors And update some comments.	2015-01-19 20:15:27 +01:00
Marek Olšák	ccc5b60b06	winsys/radeon: increase the size of buffer cache This should fix this performance regression: https://bugs.freedesktop.org/show_bug.cgi?id=88227 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-19 20:15:27 +01:00
Carl Worth	3b8ccca8a3	Rename sha1.c and sha1.h to mesa-sha1.c and mesa-sha1.h The filename of sha1.h was conflicting with the system-provided sha1.h, (and in some configurations, our sha1.c was unsuccessfully attempting to include "sha1.h" and <sha1.h> as two different files). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88523	2015-01-19 10:53:07 -08:00
Martin Peres	7a182d2335	mesa: fix a trivial spelling mistake Signed-off-by: Martin Peres <martin.peres@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-19 01:23:07 -08:00
Tapani Pälli	d74a817b86	mesa: support GL_RGB for GL_EXT_texture_type_2_10_10_10_REV Commit `8ec6534` changed the texture upload path and how the texture format is checked. This commit adds support for GL_RGB with GL_UNSIGNED_INT_2_10_10_10_REV, as specified by the EXT_texture_type_2_10_10_10_REV extension. This fixes a regression in the ES3 conformance test ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels v2: add MESA_FORMAT_R10G10B10X2_UNORM format (Iago Toral) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88385 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-01-19 08:11:45 +02:00
Micah Fedke	d36fa60191	mesa: Add ARB_shader_precision infrastructure Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-01-19 16:33:21 +13:00
Kenneth Graunke	461103ef64	i965/fs: Fix the dummy fragment shader. We hit an assertion that the destination of the FB write should not be an immediate. (I don't know what we were thinking.) Use ARF null. Trying to substitute real shaders with the dummy shader would crash when trying to upload non-existent uniforms. Say there are none. It also wouldn't generate any code because we didn't compute the CFG, and code generation now requires it. Compute it. Gen4-5 also require a message header to be present. On Gen6+, there were assertion failures in SF/SBE state because urb_setup was memset to 0 instead of -1, causing it to think there were attributes when nothing was set up right. Set to no attributes. Finally, you have to ensure "Setup URB Entry Read Length" is non-zero or you get GPU hangs, at least on Crestline. It now works on at least Crestline and Haswell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-17 14:20:41 -08:00
Kristian Høgsberg	8c6018e9bc	gbm: Define _DEFAULT_SOURCE to avoid warning glibc 2.19 introduced _DEFAULT_SOURCE as a replacement for _BSD_SOURCE, and deprecates _BSD_SOURCE with an annoying warning. Defining both is how you're supposed to transition so let's do that. It gets rid of the warning and we can figure out when/if we can drop _BSD_SOURCE later. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2015-01-16 21:54:54 -08:00
Vinson Lee	9075823c17	sha1: Fix gcry_md_hd_t typo. Fix build error. CC libmesautil_la-sha1.lo sha1.c: In function '_mesa_sha1_final': sha1.c:210:22: error: 'grcy_md_hd_t' undeclared (first use in this function) gcry_md_hd_t h = (grcy_md_hd_t) ctx; ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88519 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2015-01-16 16:25:39 -08:00
Vinson Lee	10a4f1e77a	nir: s/malloc.h/stdlib.h/ Fix build error on Mac OS X. CC nir_to_ssa.lo nir_to_ssa.c:29:10: fatal error: 'malloc.h' file not found ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88478 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2015-01-16 16:14:51 -08:00
Kristian Høgsberg	a9f657ded1	i965: Fix up too-wide comment Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2015-01-16 14:42:27 -08:00
Kristian Høgsberg	9bf2c7166a	gbm/dri: Fix const confusion The driver name is no longer const, it's always allocated dynamically one way or another. Drop const from dri_screen_create_dri2 driver_name argument to avoid warning. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2015-01-16 14:29:40 -08:00
Carl Worth	59216f53ec	configure: Add machinery for --enable-shader-cache (and --disable-shader-cache) We don't actually have the code for the shader cache just yet, but this configure machinery puts everything in place so that the shader cache can be optionally compiled in. Specifically, if the user passes no option (neither --disable-shader-cache, nor --enable-shader-cache), then this feature will be automatically detected based on the presence of a usable SHA-1 library. If no suitable library can be found, then the shader cache will be automatically disabled, (and reported in the final output from configure). The user can force the shader-cache feature to not be compiled, (even if a SHA-1 library is detected), by passing --disable-shader-cache. This will prevent the compiled Mesa libraries from depending on any library for SHA-1 implementation. Finally, the user can also force the shader cache on with --enable-shader-cache. This will cause configure to trigger a fatal error if no suitable SHA-1 implementation can be found for the shader-cache feature. Bug fix by José Fonseca <jfonseca@vmware.com>: Fix to put conditional assignment in Makefile.am, not Makefile.sources to avoid breaking scons build. Note: As recommended by José, with this commit the scons build will not compile any of the SHA-1-using code. This is waiting for someone to write SConstruct detection of the available SHA-1 libraries, (and set the appropriate HAVE_SHA1_* variables). Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-16 13:47:40 -08:00
Carl Worth	a24bdce46f	mesa: Add mesa SHA-1 functions The upcoming shader cache uses the SHA-1 algorithm for cryptographic naming. These new mesa_sha1 functions are implemented with any one of several different cryptographic libraries. This code was copied from the xserver repository, (where it has apparently been functioning well on a variety of operating systems), and comes licensed with a license identical to that of Mesa. Bug fixes by José Fonseca <jfonseca@vmware.com>: Fix to put conditional assignment in Makefile.am, not Makefile.sources to avoid breaking scons build. Fix include file for CryptoAPI section. Fix missing cast in openssl section. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-16 13:47:40 -08:00
Carl Worth	670826b431	configure: Add copyright and license block to configure.ac Prior to copying in code from the xserver configure.ac file, it makes sense to have the license of this file clearly marked, (to show that it's licensed identically to the configure.ac file from the xserver repository). And since the text of the license refers to "the above copyright notice" it also makes sense to have an actual copyright attribution in place. I generated this list of names by looking at the output of: git shortlog -n --format=%aD -- configure.ac (and arbitrarily stopping for contributors with fewer than 15 commits). Then for each name, I looked for existing Copyright attributions in the mesa source tree with the same name, (and using "Intel Corporation" as the copyright holder where I knew that was appropriate).	2015-01-16 13:47:40 -08:00
Carl Worth	977ddecb69	glsl: Add unit tests for blob.c In addition to exercising all of the functions in blob.h, this includes a stress test that forces some reallocing, and also tests to verify the alignment and overrun-detection code in blob.c.	2015-01-16 13:47:40 -08:00
Tapani Pälli	ffcad3a548	glsl: Add blob_overwrite_bytes and blob_overwrite_uint32 These functions are useful when serializing an unknown number of items to a blob. The caller can first save the current offset, write a placeholder uint32, write out (and count) the items, then use blob_overwrite_uint32 with the saved offset to replace the placeholder value. Then, when deserializing, the reader will first read the count and know how many subsequent items to expect. (I wrote this code after reading a very similar patch written by Tapani when he wrote serialization code for IR. Since I re-used the idea of his code so directly, I've credited him as the author of this code. --Carl) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-16 13:47:40 -08:00
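The placeholder-then-overwrite pattern described in ffcad3a548 can be sketched as follows. This is a minimal illustration, not Mesa's blob API: `struct tiny_buf` and every `tiny_*` function are hypothetical stand-ins with a fixed-size buffer, and only the overall idea (save the offset, write a placeholder, count while writing, patch the count afterwards) comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size buffer standing in for a blob. */
struct tiny_buf {
   uint8_t data[64];
   size_t size;
};

/* Append a uint32 and return the offset it was written at. */
static size_t
tiny_write_uint32(struct tiny_buf *b, uint32_t v)
{
   size_t offset = b->size;
   assert(offset + sizeof(v) <= sizeof(b->data));
   memcpy(b->data + offset, &v, sizeof(v));
   b->size += sizeof(v);
   return offset;
}

/* Replace a previously written uint32 without growing the buffer. */
static void
tiny_overwrite_uint32(struct tiny_buf *b, size_t offset, uint32_t v)
{
   assert(offset + sizeof(v) <= b->size);
   memcpy(b->data + offset, &v, sizeof(v));
}

static uint32_t
tiny_read_uint32(const struct tiny_buf *b, size_t offset)
{
   uint32_t v;
   assert(offset + sizeof(v) <= b->size);
   memcpy(&v, b->data + offset, sizeof(v));
   return v;
}

/* Serialize only the non-zero items; their count is not known up front. */
static void
write_nonzero(struct tiny_buf *b, const uint32_t *items, size_t n)
{
   size_t count_offset = tiny_write_uint32(b, 0); /* placeholder count */
   uint32_t count = 0;

   for (size_t i = 0; i < n; i++) {
      if (items[i] != 0) {
         tiny_write_uint32(b, items[i]);
         count++;
      }
   }

   /* Patch the real count over the placeholder. */
   tiny_overwrite_uint32(b, count_offset, count);
}
```

A reader of this layout first reads the count at the saved offset and then knows how many items follow, which matches the deserialization order the commit message describes.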
Carl Worth	1c9877327e	glsl: Add blob.c---a simple interface for serializing data This new interface allows for writing a series of objects to a chunk of memory (a "blob"). The allocated memory is maintained within the blob itself, (and re-allocated by doubling when necessary). There are also functions for reading objects from a blob as well. If code attempts to read beyond the available memory, the read functions return 0 values (or its moral equivalent) without reading past the allocated memory. Once the caller is done with the reads, it can check blob->overrun to determine whether any invalid values were previously returned due to attempts to read too far. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-16 13:47:40 -08:00
Tapani Pälli	165575d0a8	mesa: Add iterate method for string_to_uint_map The upcoming shader cache needs this to be able to cache hash data from the gl_shader_program structure. Edited-by: Carl Worth <cworth@cworth.org>: There is an internal implementation detail that the hash table underlying the struct string_to_uint_map stores each value internally as (value+1). The user needn't be very concerned with this (other than knowing that a value of UINT_MAX cannot be stored) since put() adds 1 and get() subtracts 1. So in this commit, rather than call the user's function directly with hash_table_call_foreach, we call through a wrapper that fixes up the off-by-one values before the caller's callback sees them. And with this wrapper in place, we also give a better signature to the callback function being passed to iterate(), so that this callback function can actually expect a char* and an unsigned argument, (rather than a couple of void* ). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-01-16 13:47:40 -08:00
Carl Worth	62d5b4b03a	util: Make unreachable at least be an assert Previously, if __builtin_unreachable() was unavailable, the unreachable macro was defined to do nothing. We do better here, by at least still making it an assert. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-16 13:47:40 -08:00
Carl Worth	f87ffd5cc3	glsl: Add convenience function get_sampler_instance This is similar to the existing functions get_instance, get_array_instance, etc. for getting a type singleton. The new get_sampler_instance() function will be used by the upcoming shader cache. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-16 13:47:40 -08:00
Kenneth Graunke	127c972492	i965: Fix some oddities in FB_WRITE register width and execution size. Previously, we generated this for FB writes in SIMD16 mode: load_payload(16) vgrf5@8+0.0:F, vgrf1:F, vgrf2:F, vgrf3:F, vgrf4:F fb_write(8) (null):UD, vgrf5@8+0.0:F 1sthalf The LOAD_PAYLOAD's destination had its register width set to 8, and the FB_WRITE had its execution size set to 8. This seems wrong, and while it probably doesn't affect anything, we should fix it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-16 12:39:35 -08:00
Kenneth Graunke	faaca23734	i965/fs: Make lower_load_payload etc. appear in INTEL_DEBUG=optimizer. In order to support calling lower_load_payload() inside a condition, this patch makes OPT() a statement expression: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html We recently did the equivalent change in the vec4 backend (commit `9b8bd67768`). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-16 12:38:26 -08:00
Neil Roberts	a4ab08bf45	format_utils: Use a more precise conversion when decreasing bits When converting to a format that has fewer bits the previous code was just shifting off the bits. This doesn't provide very accurate results. For example when converting from 8 bits to 5 bits it is equivalent to doing this: x * 32 / 256 This works as if it's taking a value from a range where 256 represents 1.0 and scaling it down to a range where 32 represents 1.0. However this is not correct because it is actually 255 and 31 that represent 1.0. We can do better with a formula like this: (x * 31 + 127) / 255 The +127 is to make it round correctly. The new code has a special case to use uint64_t when the result of the multiplication would overflow an unsigned int. This function is inline and only ever called with constant values so hopefully the if statements will be folded. The main incentive to do this is to make the CPU conversion path pick the same values as the hardware would if it did the conversion. This fixes failures with the ‘texsubimage pbo’ test when using the patches from here: http://lists.freedesktop.org/archives/mesa-dev/2015-January/074312.html v2: Use 64-bit arithmetic when src_bits+dst_bits > 32 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-16 13:53:15 +00:00
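The difference between the two conversions in a4ab08bf45 is easy to check numerically. The sketch below is an assumption-laden illustration, not the format_utils code: both function names are hypothetical, `unsigned` is assumed to be 32 bits wide, the input is assumed to fit in `src_bits`, and the multiplication is always done in 64 bits for simplicity instead of only when it would overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reduce an unsigned normalized value from src_bits
 * to dst_bits with rounding, generalizing (x * 31 + 127) / 255. */
static inline unsigned
unorm_reduce_bits(unsigned x, unsigned src_bits, unsigned dst_bits)
{
   assert(dst_bits >= 1 && dst_bits <= src_bits && src_bits <= 32);

   /* The largest representable values are what stand for 1.0. */
   const uint64_t src_max = (UINT64_C(1) << src_bits) - 1;
   const uint64_t dst_max = (UINT64_C(1) << dst_bits) - 1;

   /* Adding src_max / 2 makes the integer division round to nearest. */
   return (unsigned)(((uint64_t)x * dst_max + src_max / 2) / src_max);
}

/* The older behavior described in the commit: shift off the low bits. */
static inline unsigned
shift_reduce_bits(unsigned x, unsigned src_bits, unsigned dst_bits)
{
   assert(dst_bits >= 1 && dst_bits <= src_bits && src_bits <= 32);
   return x >> (src_bits - dst_bits);
}
```

For the 8-bit to 5-bit case, an input of 7 represents 7/255, or about 0.85 in 5-bit units. Shifting truncates it to 0, while the rounded formula gives 1. Both methods agree at the endpoints 0 and 255.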
Iago Toral Quiroga	6367ca8b41	i965/gen6: Fix crash with VS+TF after rendering with GS Rendering with a GS and then using transform feedback with a program that does not have a GS can crash in gen6. The reason for this is that brw_begin_transform_feedback checks brw->geometry_program to decide if there is a GS program, but this is not correct: brw->geometry_program is updated when issuing drawing commands, so after rendering with a GS it will be non-NULL until we draw again with a program that does not have a GS. If the next program uses TF, we will call glBeginTransformFeedback before issuing the drawing command and hence brw->geometry_program will be non-NULL if the previous rendering used a GS. The right thing to do here is to check ctx->_Shader->CurrentProgram[MESA_SHADER_GEOMETRY] instead. This is what the gen7 code path does too. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=87694 Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2015-01-16 14:16:59 +01:00
Jason Ekstrand	bc6e57e019	nir/live_variables: Use a worklist This is a rework of the liveness algorithm using a worklist as suggested by Connor. Doing so reduces the number of times we walk over the instructions because we don't have to do an entire pointless walk over the instructions just to figure out it's time to stop. Also, the stuff after the last loop in the function will only ever get visited once. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 16:54:21 -08:00
Jason Ekstrand	4839d1aed1	nir: Add a worklist helper structure A worklist is a common concept in optimizations. This adds a structure that we can reuse for many different types of optimizations. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 16:54:21 -08:00
Brian Paul	0aaaa13ec9	nir: fix incorrect argument passed to validate_src() in validate_tex_instr() Silences a compiler warning. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 17:41:42 -07:00
Brian Paul	aa479a69d6	nir: silence compiler warning from visit_src() call v2: use proper argument Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 17:09:02 -07:00
Brian Paul	337eca4ac8	mesa: move GET_CURRENT_CONTEXT() to top of _mesa_init_renderbuffer() To fix MSVC build. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-15 16:15:34 -07:00
Mike Mason	e407fb1af4	mesa: Fix render buffer initial internal format in GLES 3 Changes the initial internal format of a render buffer to GL_RGBA4 in GLES 3. This fixes a failure in the following DrawElements test: dEQP-GLES3.functional.state_query.rbo.renderbuffer_internal_format Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-01-15 13:29:48 -08:00
Jason Ekstrand	153b8b3525	util/hash_set: Rework the API to know about hashing Previously, the set API required the user to do all of the hashing of keys as it passed them in. Since the hashing function is intrinsically tied to the comparison function, it makes sense for the hash set to know about it. Also, it makes for a somewhat clumsy API as the user is constantly calling hashing functions many of which have long names. This is especially bad when the standard call looks something like _mesa_set_add(ht, _mesa_pointer_hash(key), key); In the above case, there is no reason why the hash set shouldn't do the hashing for you. We leave the option for you to do your own hashing if it's more efficient, but it's no longer needed. Also, if you do do your own hashing, the hash set will assert that your hash matches what it expects out of the hashing function. This should make it harder to mess up your hashing. This is analogous to `94303a0750` where we did this for hash_table Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-15 13:21:27 -08:00
Jason Ekstrand	4c99e3ae78	util: Move main/set to util/hash_set Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-15 13:21:27 -08:00
Jason Ekstrand	8ed5305d28	hash_table: Rename insert_with_hash to insert_pre_hashed We already have search_pre_hashed. This makes the APIs match better. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-15 13:21:27 -08:00
Matt Turner	f0aec4ee1e	i965: Don't consider null dst instructions as matching non-null dst. When performing common subexpression elimination on instructions with non-null destinations we emit a MOV to copy the result to a new register that must have no other uses. In the case of: cmp.g.f0.0(8) null:D, vgrf43:F, 0.500000f ... cmp.g.f0.0(8) vgrf113:D, vgrf43:F, 0.500000f we put the first instruction in the AEB and decided that we could reuse its result when we found the second. Unfortunately, that meant that we'd emit a MOV from the first's destination, which is null. Don't do anything if the entry's destination is null and the instruction's destination is non-null. Tested-by: Tapani Pälli <tapani.palli@intel.com>	2015-01-15 10:11:42 -08:00
Matt Turner	41d9f232b6	i965/vec4: Make sure that imm writes are to registers in the same file. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87887	2015-01-15 10:11:42 -08:00
Matt Turner	3654b6d43c	i965/fs: Emit MADs from (x + abs(y * z)). Just use the abs source modifier on both of the multiplicand arguments. instructions in affected programs: 300 -> 296 (-1.33%) Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-01-15 10:10:44 -08:00
Matt Turner	c4fab711ed	i965/fs: Emit MADs from (x + -(y * z)). Just use the negation source modifier on one of the multiplicand arguments. total instructions in shared programs: 5889529 -> 5880016 (-0.16%) instructions in affected programs: 600846 -> 591333 (-1.58%) Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2015-01-15 10:10:44 -08:00
Jason Ekstrand	0d05d1226e	nir/algebraic: Only replace an instruction once Without the break, it was possible that an instruction would match multiple expressions. If this happened, you could end up trying to replace it multiple times and get a segfault. This makes it so that, after a successful replacement, it moves on to the next instruction. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:24 -08:00
Jason Ekstrand	c56adc68e2	i965/nir: Do a final copy lowering pass before lowering locals to regs Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:24 -08:00
Jason Ekstrand	0f85310975	nir/vars_to_ssa: Use the copy lowering from lower_var_copies Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:24 -08:00
Jason Ekstrand	d3636da902	nir: Add a pass for lowering copy instructions Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:24 -08:00
Jason Ekstrand	700ba5daaf	nir/vars_to_ssa: Refactor get_deref_node This refactor allows you to more easily get the deref node associated with a given variable. We then use that new functionality in the deref_may_be_aliased function instead of creating a 1-element deref chain. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:24 -08:00
Jason Ekstrand	55b5058e69	nir: Rename lower_variables to lower_vars_to_ssa The original name wasn't particularly descriptive. This one indicates that it actually gives you SSA values as opposed to the old pass which lowered variables to registers. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:24 -08:00
Jason Ekstrand	4aa6162f6e	nir/tex_instr: Add a nir_tex_src struct and dynamically allocate the src array This solves a number of problems. First is the ability to change the number of sources that a texture instruction has. Second, it solves the dilemma that may occur if a texture instruction has more than 4 sources. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:24 -08:00
Jason Ekstrand	dcb1acdea0	nir/validate: Only build in debug mode Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:24 -08:00
Jason Ekstrand	347ab2bf24	nir/lower_variables: Improve documentation Additional description was added to a variety of places. Also, we no longer use the term "leaf" to describe fully-qualified direct derefs. Instead, we simply use the term "direct" or spell it out completely. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:23 -08:00
Jason Ekstrand	8016fa39e1	nir/lower_variables: Use a for loop for get_deref_node Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:23 -08:00
Jason Ekstrand	0c0ca8b6ae	nir: Use the actual FNV-1a hash for hashing derefs We also switch to using loops rather than recursion. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:23 -08:00
Jason Ekstrand	a3b73ccf6d	util/hash_table: Pull the details of the FNV-1a into helpers This way the basics of the FNV-1a hash can be reused to easily create other hashing functions. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-15 07:20:23 -08:00
Jason Ekstrand	e4115ca9d8	nir: Make intrinsic flags into an enum This should be much better for debugging as GDB will pick up on the fact that it's an enum and actually tell you what you're looking at instead of giving you some arbitrary hex value you have to go look up. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:23 -08:00
Jason Ekstrand	ed13f4e716	nir: Use static inlines instead of macros for list getters This should make debugging a lot easier as GDB handles static inlines much better than macros. Also, static inlines are typesafe. Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:23 -08:00
Jason Ekstrand	b95fae034f	nir/variable: Remove the constant_value field This was a left-over relic of GLSL IR that we aren't using for anything. If we ever want that value again, we can add it back, but NIR constant folding should be just as good as GLSL IR's if not better pretty soon, so I'm not worried about it. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:23 -08:00
Jason Ekstrand	8599b30c67	nir: Add some documentation Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:23 -08:00
Jason Ekstrand	ad9d0a9ea6	nir/lower_variables: Follow the Cytron paper more closely Previously, our variable renaming algorithm, while similar to the one in the Cytron paper, was not the same. While I'm pretty sure it was correct, it will be easier for readers of the code in the variable renaming pass if it follows more closely. This commit removes the automatic stack popping we were doing and replaces it with explicit popping like Cytron does. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:23 -08:00
Jason Ekstrand	b1d114a48c	nir/print: Various cleanups recommended by Eric Cc: Eric Anholt <eric@anholt.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:23 -08:00
Jason Ekstrand	e2763339fe	nir/lower_variables: Add a bunch of comments and re-arrange a few things This commit seeks to make the lower_variables pass much more clear by adding a pile of comments and re-arranging a few things. There are no functional or algorithmic changes. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:23 -08:00
Jason Ekstrand	40ca129ed5	nir: Rename parallel_copy_copy to parallel_copy_entry and add a foreach macro parallel_copy_copy was a silly name. Also, things were getting long and annoying, so I added a foreach macro. For historical reasons, several of the original iterations over parallel copy entries in from_ssa used the _safe variants of the loop. However, all of these no longer ever remove an entry so it's ok to make them all use the normal iterator. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:23 -08:00
Jason Ekstrand	1b720c6ed8	nir/from_ssa: Clean up parallel copy handling and document it better Previously, we were doing a lazy creation of the parallel copy instructions. This is confusing, hard to get right, and involves some extra state tracking of the copies. This commit adds an extra walk over the basic blocks to add the block-end parallel copies up front. This should be much less confusing and, consequently, easier to get right. This commit also adds more comments about parallel copies to help explain what all is going on. As a consequence of these changes, we can now remove the at_end parameter from nir_parallel_copy_instr. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:23 -08:00
Jason Ekstrand	de73d1e173	nir: Rename nir_block_following_if to nir_block_get_following_if The new name is a little longer but less confusing. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:23 -08:00
Jason Ekstrand	cb53aacaa1	i965/fs_nir: Handle sample ID, position, and mask better Before, we were emitting the full pile of setup instructions for sample_id and sample_pos every time they were used. With this commit, we emit them in their own pass once at the beginning of the shader and simply emit uses later on. When it comes time for setting up VS, we can put setup for its special values in the same pass. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:22 -08:00
Jason Ekstrand	813316d150	nir/opcodes: Remove the per_component info field Originally, this field was intended for determining if the given instruction acted per-component or if it had mismatching source and destination sizes that would have to be interpreted specially. However, we can easily derive this from output_size == 0, so it's not really that useful. Also, the values we were setting in nir_opcodes.h for this field were completely bogus and it was never used. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:22 -08:00
Jason Ekstrand	e2a8f9e5cc	nir/search: Use nir_op_infos to determine if an operation is commutative Prior to this commit, we had a big switch statement for this. Now it's baked into the opcode metadata so we can just use that. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:22 -08:00
Jason Ekstrand	46f3e1ab50	nir/opcodes: Add algebraic properties metadata This commit adds some algebraic properties to the metadata of each opcode in NIR. In particular, you now know, just from the metadata, if a given opcode is commutative or associative. This will be useful for algebraic transformation passes that want to be able to match a + b as well as b + a in one go. v2: Make algebraic properties all caps. This was more consistent with the intrinsics flags and seems better for flags in general. Also, the enums are now declared with (1 << n) rather than hex values. v3: fmin and fmax technically aren't commutative or associative. Things get funny when one of the arguments is a NaN. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:22 -08:00
Jason Ekstrand	2c7da78805	nir: Make load_const SSA-only As it was, we weren't ever using load_const in a non-SSA way. This allows us to substantially simplify the load_const instruction. If we ever need a non-SSA constant load, we can do a load_const and an imov. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:22 -08:00
Jason Ekstrand	675ffdef30	nir: Make nir_ssa_undef_instr_create initialize the destination Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:22 -08:00
Jason Ekstrand	951a7f23a0	i965/nir: Move the other lowering passes to before out-of-SSA Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:22 -08:00
Jason Ekstrand	5c16be1c52	nir/lower_system_values: Handle SSA destinations Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:22 -08:00
Jason Ekstrand	821e75a160	nir/lower_atomics: Use/support SSA Previously, lower_atomics was non-SSA only. We assert-failed if the destination of an atomic operation intrinsic was an SSA def and we used temporary registers for computing offsets. This commit changes both of these behaviors. We now use SSA values for computing offsets (so we can optimize them) and we handle SSA destinations. We also move the pass to run before we go out of SSA on i965 as it now generates SSA values. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:22 -08:00
Jason Ekstrand	8ddb03d56d	nir/live_variables: Use the new ssa_def iterator Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:22 -08:00
Jason Ekstrand	28a3e164e2	nir: Use nir_foreach_ssa_def for setting up ssa destinations Before, we were using foreach_dest and switching on whether the destination was an SSA value. This works, except not all destinations are SSA values so we have to special-case ssa_undef instructions. Now that we have a foreach_ssa_def function, we can iterate over all of the register destinations in one pass and iterate over the SSA destinations in a second. This way, if we add other ssa-only instructions, we won't have to worry about adding them to the special case we have for ssa_undef. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:22 -08:00
Jason Ekstrand	193fea9eb6	nir: Add a foreach_ssa_def function There are some functions whose destinations are SSA-only and so aren't a nir_dest. This provides a function that is capable of iterating over the SSA definitions defined by those functions. If you want registers, you should use the old iterator. v2: Kenneth Graunke <kenneth@whitecape.org>: - Fix nir_foreach_ssa_def's return value. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:22 -08:00
Jason Ekstrand	bc0735857f	nir/lower_variables: Use a real dominance DFS for variable renaming Previously, we were just iterating over the program "in order" which kind-of approximates a DFS, but not really. In particular, we got the following case wrong: loop { a = 3; if (foo) { a = 5; } else { break; } use(a); } where use(a) would get 3 instead of 5 because of premature popping of the SSA def stack. Now, since we do an actual DFS, we should evaluate use(a) immediately after a = 5 and we should be ok. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:22 -08:00
Jason Ekstrand	dfb3abbaec	nir: Remove predication We stopped generating predicates in glsl_to_nir some time ago. Right now, it's all dead untested code that I'm not convinced always worked in the first place. If we decide we want them back, we can revert this patch. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:21 -08:00
Jason Ekstrand	b3fd098e7d	nir: Make bcsel a fully vector operation Previously, the condition was a scalar that applied to all components simultaneously. As of this commit, the condition is a vector and each component is switched separately. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:21 -08:00
Jason Ekstrand	295faf9462	nir: Call nir_metadata_preserve more places Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:21 -08:00
Jason Ekstrand	b6c81b3ff4	nir/metadata: Rename metadata_dirty to metadata_preserve nir_metadata_dirty was a terrible name because the parameter it takes is the metadata to be preserved. This is really confusing because it looks like it's doing the opposite of what it is actually doing. Now it's named sensibly. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:21 -08:00
Jason Ekstrand	3c2c0a164c	i965/fs_nir: Add support for indirect texture arrays v2 Jason Ekstrand <jason.ekstrand@intel.com>: - Use the nir_tex_src_sampler_offset source type instead of the sampler_indirect thing that I cooked up before. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-01-15 07:20:21 -08:00
Jason Ekstrand	60ec60a600	nir: Rework the way samplers are lowered v2 Jason Ekstrand <jason.ekstrand@intel.com>: - Use the nir_tex_src_sampler_offset source type instead of the sampler_indirect thing that I cooked up before. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-01-15 07:20:21 -08:00
Jason Ekstrand	4cdabcc0fa	nir/tex_instr_create: Initialize all 4 sources This helps a lot with things like lowering passes that may need to add sources. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:21 -08:00
Jason Ekstrand	62ac0ee804	nir/tex_instr: Rename the indirect source type and add an array size In particular, we rename nir_tex_src_sampler_index to _sampler_offset and add a sampler_array_size field to nir_tex_instr. This way we can pass the size of sampler arrays through to backends even after removing the variable information and, with it, the type. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:21 -08:00
Jason Ekstrand	534d145e5e	nir: Use a source for uniform buffer indices instead of an index In GLSL-to-NIR we were just setting the base index to 0 whenever there was an indirect so having it expressed as a sum makes no sense. Also, while a base offset may make sense for the memory location (first element in the array, etc.) it makes less sense for the actual uniform buffer index. This may change later, but it seems to make more sense for now. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:21 -08:00
Jason Ekstrand	6a5604ca6a	nir: Constant fold array indirects Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:21 -08:00
Jason Ekstrand	cd4b995254	nir: Make texture instruction names more consistent This commit renames nir_instr_as_texture to nir_instr_as_tex and renames nir_instr_type_texture to nir_instr_type_tex to be consistent with nir_tex_instr. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:21 -08:00
Jason Ekstrand	d6fe35a418	nir: Remove the ffma peephole This is no longer needed because it's now part of the algebraic optimization pass Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:21 -08:00
Jason Ekstrand	f77f4c00ce	nir: Add a basic constant folding pass Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:20 -08:00
Jason Ekstrand	d5410bd8f6	nir: Add an algebraic optimization pass This pass uses the previously built algebraic transformations framework and should act as an example for anyone else wanting to make an algebraic transformation pass for NIR. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:20 -08:00
Jason Ekstrand	0e145a951e	nir: Add infrastructure for generating algebraic transformation passes This commit builds on the nir_search.h infrastructure by adding a bit of python code that makes it stupid easy to write an algebraic transformation pass. The nir_algebraic.py file contains four python classes that correspond directly to the datastructures in nir_search.c and allow you to easily generate the C code to represent them. Given a list of search-and-replace operations, it can then generate a function that applies those transformations to a shader. The transformations can be specified manually, or they can be specified using nested tuples. The nested tuples make a neat little language for specifying expression trees and search-and-replace operations in a very readable and easy-to-edit fashion. The generated code is also fairly efficient. Instead of blindly calling nir_replace_instr with every single transformation and on every single instruction, it uses a switch statement on the instruction opcode to do a first-order culling and only calls nir_replace_instr if the opcode is known to match the first opcode in the search expression. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:20 -08:00
Jason Ekstrand	0057dfd673	nir: Add an expression matching framework This framework provides a simple way to do simple search-and-replace operations on NIR code. The nir_search.h header provides four simple data structures for representing expressions: nir_value and three subtypes: nir_variable, nir_constant, and nir_expression. An expression tree can then be represented by nesting these data structures as needed. The nir_replace_instr function takes an instruction, an expression, and a value; if the instruction matches the expression, it is replaced with a new chain of instructions to generate the given replacement value. The framework keeps track of swizzles on sources and automatically generates the correct swizzles for the replacement value. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:20 -08:00
Jason Ekstrand	a94d1c2481	nir/glsl: Emit abs, neg, and sat operations instead of source modifiers Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:20 -08:00
Jason Ekstrand	8edcd1de14	nir: Make the type casting operations static inline functions Previously, the casting operations were macros. While this is usually fine, the casting macro used the input parameter twice leading to strange behavior when you passed the result of another function into it. Since we know the source and destination types explicitly, we don't lose anything by making it a function. Also, this gives us a nice little macro for creating cast functions that will hopefully prevent mistyping. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:20 -08:00
Jason Ekstrand	919426631b	nir: Add a lowering pass for adding source modifiers where possible Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:20 -08:00
Jason Ekstrand	1d83a8eb7a	nir: Add neg, abs, and sat opcodes Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:20:20 -08:00
Jason Ekstrand	a1c259d666	i965/fs_nir: Implement the ARB_gpu_shader5 interpolation intrinsics Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-01-15 07:19:41 -08:00
Jason Ekstrand	e257a51124	i965/fs_nir: Add a has_indirect flag and clean up some of the input/output code Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:03 -08:00
Jason Ekstrand	a3ad7fdf33	nir: Add a helper for getting a constant value from an SSA source Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:03 -08:00
Jason Ekstrand	940ccc45ad	nir/glsl: Add support for gpu_shader5 interpolation intrinsics Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:03 -08:00
Jason Ekstrand	45bdcc257e	nir: Add gpu_shader5 interpolation intrinsics Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:03 -08:00
Jason Ekstrand	e3fa49c9e6	nir/validate: Validate intrinsic source/destination sizes Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:03 -08:00
Jason Ekstrand	27663dbe8e	nir: Vectorize intrinsics We used to have the number of components built into the intrinsic. This meant that all of our load/store intrinsics had vec1, vec2, vec3, and vec4 variants. This led to piles of switch statements to generate the correct intrinsic names, and introspection to figure out the number of components. We can make things much nicer by allowing "vectorized" intrinsics. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:03 -08:00
Jason Ekstrand	d1d12efb36	nir: Remove the old variable lowering code Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:03 -08:00
Jason Ekstrand	faad82b4e7	nir/validate: Ensure that outputs are write-only and inputs are read-only Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	26865f858d	i965/fs_nir: Use the new variable lowering code This commit switches us over to the new variable lowering code which is capable of properly handling lowering indirects as we go. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	29e607e5cf	nir/glsl: Generate SSA NIR With this commit, the GLSL IR -> NIR pass generates NIR in more-or-less SSA form. It's SSA in the sense that it doesn't have any registers, but it isn't really useful SSA because it still has a pile of load/store intrinsics that we will need to get rid of. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	6962c332e5	nir: Add a pass to lower global variables to local variables Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	619b2e2499	nir: Add a pass for lowering input/output loads/stores Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	aff431293b	nir: Add a pass to lower local variables to registers Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	d477beab07	nir: Add a pass to lower local variable accesses to SSA values This pass analyzes all of the load/store operations and, when a variable is never aliased (potentially used by an indirect operation), it is lowered directly to an SSA value. This pass translates to SSA directly and does not require any fixup by the original to-SSA pass. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	615ba5ad04	nir: Add a copy splitting pass Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	68778d52cd	nir: Automatically update SSA if uses Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	7c5284d0e5	i965/fs_nir: Don't dump the shader. This is killing piglit. I'll leave the logging local Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	9318ce8c5a	nir/glsl: Don't allocate a state_slots array for 0 state slots Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	9d62df3800	nir: Validate that the sources of a phi have the same size as the destination Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	24249599b1	nir/copy_propagate: Don't cause size mismatches on phi node sources Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	6a52d2af2f	nir: Don't require a function in ssa_def_init Instead, we give SSA definitions a temporary index of 0xFFFFFFFF if the instruction does not have a block and a proper index when it actually gets added to the list. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	829aa98320	nir: Use an integer index for specifying structure fields Previously, we used a string name. It was nice for translating out of GLSL IR (which also does that) but cumbersome the rest of the time. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	4f8230e247	nir: Add a concept of a wildcard array dereference Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	b5143edaee	nir: Make array deref direct vs. indirect an enum Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:02 -08:00
Jason Ekstrand	8219ff1796	nir: Clean up nir_deref helper functions Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	895eee505c	nir/lower_samplers: Use the nir_instr_rewrite_src function Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	cd01de0812	nir: Add a helper for rewriting an instruction source Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	04fb073344	i965/fs_nir: Properly saturate multiplies Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	5690c2b54c	nir/from_ssa: Don't lower constant SSA values to registers Backends want to be able to do special things with constant values such as put them into immediates or make decisions based on whether or not a value is constant. Before, constants always got lowered to a load_const into a register and then a register use. Now we leave constants as SSA values so backends can special-case them if they want. Since handling constant SSA values is trivial, this shouldn't be a problem for backends. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	c2abfc0b86	i965/fs_nir: Handle SSA constants Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	e0aa4c6272	i965/fs_nir: Use an array rather than a hash table for register lookup Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	20adc516e2	i965/fs_nir: Add the CSE pass and actually run in a loop Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	6bdce55c44	nir: Add a basic CSE pass This pass is still fairly basic. It only handles ALU operations, constant loads, and phi nodes. No texture ops or intrinsics yet. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	20a5812606	nir: Add a fused multiply-add peephole	2015-01-15 07:19:01 -08:00
Jason Ekstrand	02ee1d22a1	nir: Validate that the SSA def and register indices are unique Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	c937bdb3c2	i965/fs_nir: Turn on the peephole select optimization Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	13ec15bdbf	nir: Add a peephole select optimization Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	ef7ebb908e	nir/nir: Patch up phi predecessors in move_successors Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	02eef48343	nir/nir: Use safe iterators when iterating over the CFG Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	c6582e884d	glsl/list: Add a foreach_list_typed_safe_reverse macro Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	dc4e660dfa	nir/nir: Fix a bug in move_successors The unlink_blocks function moves successors around to make sure that, if there is a remaining successor, it is in the first successors slot and not the second. To fix this, we simply get both successors up front. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	2bd5a24a5e	i965/fs_nir: Validate optimization passes Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	10adf8fc85	nir: Differentiate between signed and unsigned versions of find_msb We also make the return types match GLSL. The GLSL spec specifies that findMSB and findLSB return a signed integer. Previously, nir had them return unsigned. This updates nir's behavior to match what GLSL expects. We also update the nir-to-fs generator to take the new instructions. While we're at it, we fix the case where the input to findMSB is zero. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	a76ccbfacf	nir/print: Don't reindex things These indices should now be reasonably stable/consistent. Redoing the indices in the print functions makes it harder to debug problems. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:01 -08:00
Jason Ekstrand	73522ec83f	nir: Validate all lists in the validator Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	8b3dfdce76	glsl/list: Fix the exec_list_validate function Some time while refactoring things to make it look nicer before pushing to master, I completely broke the function. This fixes it to be correct. Just goes to show you why you shouldn't push code that has no users yet... Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	4285aaecdc	i965/fs_nir: Do retyping for ALU sources in get_nir_alu_src Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	943ddb9458	nir: Add a better out-of-SSA pass This commit rewrites the out-of-SSA pass to not be nearly as naive. It's based on "Revisiting Out-of-SSA Translation for Correctness, Code Quality, and Efficiency" by Boissinot et al. It should be fairly close to state of the art. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	4f44120ff5	nir: Add a function for comparing two sources Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	366181d826	nir: Add a parallel copy instruction type Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	7de6b7fc3e	nir: Add a function for rewriting all the uses of a SSA def Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	946012f10f	nir: Automatically handle SSA uses when an instruction is inserted Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	fbc443ad56	nir: Add an initialization function for SSA definitions Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	f86902e75d	nir: Add an SSA-based liveness analysis pass. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	c9a21c725d	nir: set reg_alloc and ssa_alloc when indexing registers and SSA values Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	d7e482d32c	nir: Add a function to detect if a block is immediately followed by an if Since we don't actually have an "if" instruction, this is a very common pattern when iterating over instructions. This adds a helper function for it to make things a little less painful. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	dfdf0c4673	nir: Add a foreach_block_reverse function Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	07556442a7	nir/foreach_block: Return false if the callback on the last block fails Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	49911cf4db	nir: Add a basic metadata management system Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	ea1eefe13f	nir/lower_variables_scalar: Silence a compiler warning Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	63eb32950e	i965/fs_nir: Convert the shader to/from SSA Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	9d986d19d0	nir: Add a lower_vec_to_movs pass Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:19:00 -08:00
Jason Ekstrand	2943522d80	nir: Add a naive from-SSA pass This pass is kind of stupidly implemented but it should be enough to get us up and going. We probably want something better that doesn't generate all of the redundant moves eventually. However, the i965 backend should be able to handle the movs, so I'm not too worried about it in the short term.	2015-01-15 07:18:59 -08:00
Jason Ekstrand	ff0a9fcf33	i965/fs_nir: Don't duplicate emit_general_interpolation Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:18:59 -08:00
Jason Ekstrand	b1fe8604c6	i965/fs: Don't take an ir_variable for emit_general_interpolation Previously, emit_general_interpolation took an ir_variable and pulled the information it needed from that. This meant that in fs_fp, we were constructing a dummy ir_variable just to pass into it. This commit makes emit_general_interpolation take only the information it needs and gets rid of the fs_fp cruft. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:18:59 -08:00
Jason Ekstrand	b600f1a381	nir: Add intrinsics to do alternate interpolation on inputs Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:18:59 -08:00
Jason Ekstrand	4b4f90dbff	nir: Add NIR_TRUE and NIR_FALSE constants and use them for boolean immediates Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:18:59 -08:00
Jason Ekstrand	744b4e9348	i965/fs_nir: Add atomic counters support Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:18:59 -08:00
Jason Ekstrand	6e46c98ec1	nir/lower_atomics: Multiply array offsets by ATOMIC_COUNTER_SIZE Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:18:59 -08:00
Jason Ekstrand	95fbd6e1ee	i965/fs_nir: Handle coarse/fine derivatives Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:18:59 -08:00
Jason Ekstrand	d40b5ca5c5	nir/glsl: Add support for coarse and fine derivatives Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:18:59 -08:00
Jason Ekstrand	8c75a7ce59	nir: Add fine and coarse derivative opcodes Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:18:59 -08:00
Jason Ekstrand	458a6ce500	nir/glsl: Add support for saturate Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:18:59 -08:00
Jason Ekstrand	4582341ea7	i965/fs_nir: Add support for sample_pos and sample_id	2015-01-15 07:18:59 -08:00
Jason Ekstrand	7cd1537aae	Fix up varying pull constants Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:18:59 -08:00
Jason Ekstrand	4bb81f6d02	Fix what I think are a few NIR typos Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:18:59 -08:00
Jason Ekstrand	b092bc9805	i965/fs_nir: Use the correct texture offset immediate Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:18:59 -08:00
Jason Ekstrand	c181ff268e	i965/fs_nir: Use the correct types for texture inputs Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:18:59 -08:00
Jason Ekstrand	c2ded36bb6	i965/fs_nir: Make the sampler register always unsigned Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:18:59 -08:00
Jason Ekstrand	ae2880d131	i965/fs: Only use nir for 8-wide non-fast-clear shaders. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-01-15 07:18:59 -08:00
Connor Abbott	2faf7f87d6	i965/fs: add a NIR frontend This is similar to the GLSL IR frontend, except consuming NIR. This lets us test NIR as part of an actual compiler. v2: Jason Ekstrand <jason.ekstrand@intel.com>: Make brw_fs_nir build again Only use NIR if INTEL_USE_NIR is set whitespace fixes	2015-01-15 07:18:59 -08:00
Connor Abbott	9afc566e2d	i965/fs: Don't pass through the coordinate type All we really need is the number of components.	2015-01-15 07:18:58 -08:00
Connor Abbott	616a48ebc6	i965/fs: make emit_fragcoord_interpolation() not take an ir_variable	2015-01-15 07:18:58 -08:00
Connor Abbott	7602385ac5	nir: add an SSA-based dead code elimination pass v2: Jason Ekstrand <jason.ekstrand@intel.com>: whitespace fixes	2015-01-15 07:18:58 -08:00
Connor Abbott	8b7cb7674c	nir: add an SSA-based copy propagation pass	2015-01-15 07:18:58 -08:00
Connor Abbott	4553887d4a	nir: add a pass to convert to SSA v2: Jason Ekstrand <jason.ekstrand@intel.com>: whitespace fixes	2015-01-15 07:18:58 -08:00
Connor Abbott	b559ee709b	nir: calculate dominance information	2015-01-15 07:18:58 -08:00
Connor Abbott	cff1deff72	nir: add an optimization to turn global registers into local registers After linking and inlining, this allows us to convert these registers into SSA values and optimise more code.	2015-01-15 07:18:58 -08:00
Connor Abbott	613bf6818a	nir: add a pass to lower atomics v2: Jason Ekstrand <jason.ekstrand@intel.com> whitespace fixes	2015-01-15 07:18:58 -08:00
Connor Abbott	8692c6a023	nir: add a pass to lower system value reads v2: Jason Ekstrand <jason.ekstrand@intel.com>: whitespace fixes	2015-01-15 07:18:58 -08:00
Connor Abbott	8cdcfce5ce	nir: add a pass to lower sampler instructions	2015-01-15 07:18:58 -08:00
Connor Abbott	370e875b32	nir: add a pass to remove unused variables After we lower variables, we want to delete them in order to free up some memory. v2: Jason Ekstrand <jason.ekstrand@intel.com>: whitespace fixes	2015-01-15 07:18:58 -08:00
Connor Abbott	494790b2a9	nir: keep track of the number of input, output, and uniform slots	2015-01-15 07:18:58 -08:00
Connor Abbott	c2f36cf125	nir: add a pass to lower variables for scalar backends	2015-01-15 07:18:58 -08:00
Connor Abbott	7f0daaa5e7	nir: add a glsl-to-nir pass v2: Jason Ekstrand <jason.ekstrand@intel.com>: Make glsl_to_nir build again fix whitespace	2015-01-15 07:18:58 -08:00
Connor Abbott	dbb76421da	nir: add a validation pass This is similar to ir_validate.cpp. v2: Jason Ekstrand <jason.ekstrand@intel.com>: whitespace fixes	2015-01-15 07:18:58 -08:00
Connor Abbott	98fa28bff7	nir: add a printer This is similar to ir_print_visitor.cpp. v2: Jason Ekstrand <jason.ekstrand@intel.com>: whitespace fixes	2015-01-15 07:18:58 -08:00
Jason Ekstrand	9b1139649d	SQUASH: Fix comments from eric Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-15 07:18:58 -08:00
Jason Ekstrand	8b4c860580	SQUASH: Add an assert	2015-01-15 07:18:58 -08:00
Connor Abbott	2812e5de93	nir: add core helper functions These include functions for adding and removing various bits of IR and helpers for iterating over all the sources and destinations of an instruction. This is similar to ir.cpp. v2: Jason Ekstrand <jason.ekstrand@intel.com>: whitespace and automake fixes	2015-01-15 07:18:58 -08:00
Jason Ekstrand	f521a3c543	SQUASH: Use the enum for the variable mode	2015-01-15 07:18:57 -08:00
Connor Abbott	30c4678f64	nir: add the core datastructures This includes all the instructions, ifs, loops, functions, etc. This is similar to the information in ir.h. v2: Jason Ekstrand <jason.ekstrand@intel.com>: Include ralloc and hash_table from the util directory whitespace fixes Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-By glenn.kennard <glenn.kennard@gmail.com>	2015-01-15 07:18:57 -08:00
Connor Abbott	b5ca34a211	nir: add a simple C wrapper around glsl_types.h v2: Jason Ekstrand <jason.ekstrand@intel.com>: whitespace and automake fixes Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-15 07:18:57 -08:00
Connor Abbott	77e7a00267	nir: add initial README Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-15 07:18:57 -08:00
Connor Abbott	ab2ae63854	exec_list: add a list_foreach_typed_reverse() macro Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-15 07:18:57 -08:00
Eric Anholt	84ef2d4156	vc4: Add some dumping for STORE_TILE_BUFFER_GENERAL.	2015-01-15 22:21:29 +13:00
Eric Anholt	1b241c59e8	vc4: Add dumping for the TILE_RENDERING_MODE_CONFIG packet. I wanted to read it, so I wrote parsing.	2015-01-15 22:19:25 +13:00
Eric Anholt	d0d6d24723	vc4: Fix CL dumping trying to dump too far. Execution will end at the cl->next, because that's what ct0ea/ct1ea get programmed to.	2015-01-15 22:19:25 +13:00
Eric Anholt	0471f72755	vc4: Fix texture type masking. Everything from ETC1 to RGBA64 was getting its top bit dropped, but we didn't use any of those formats.	2015-01-15 22:19:25 +13:00
Eric Anholt	6313a2c8f0	vc4: Colormask should apply after all other fragment ops (like logic op). Theoretically it should apply after dithering as well, but dithering for 565 happens in fixed function in the TLB store.	2015-01-15 22:19:25 +13:00
Eric Anholt	0289a26201	vc4: No turning unpack arguments into small immediates. Since unpack only happens on things read from the A register file, we have to leave them as something that can be allocated to A (temp or uniform).	2015-01-15 22:19:25 +13:00
Eric Anholt	772c47aefe	vc4: Move the tests for src needing to be an A register to vc4_qir.c. I want it from another location.	2015-01-15 22:19:25 +13:00
Eric Anholt	8f2fb68026	vc4: Don't swap the raddr on instructions doing unpacks. It would mean different unpacking behavior, since only the A file does unpack (with PM==0).	2015-01-15 22:19:25 +13:00
Eric Anholt	5d5707707f	vc4: Don't let pairing happen with badly mismatched unpack flags. No difference on shader-db, but prevents definite regressions in the blending changes.	2015-01-15 22:19:25 +13:00
Eric Anholt	3820866e40	vc4: Don't let pairing happen with badly mismatched pack flags. No difference on shader-db, but will become more important as I introduce more use of pack flags with the blending changes.	2015-01-15 22:19:25 +13:00
Eric Anholt	d1f2fc834d	vc4: Fix early Z behavior on hardware. It turns out the simulator was not treating this bit the same as the RPi, and I'd forgotten to remove it when turning on early Z. The result was that you'd get big chunks of your rendering missing.	2015-01-15 22:19:25 +13:00
Michel Dänzer	82b7ee62fc	Revert "radeonsi: only set BC_OPTIMIZE_DISABLE when necessary" This reverts commit `0543630d0b`. It caused flickering artifacts in Steam games such as Team Fortress 2 or Left 4 Dead 2. We could probably only enable this optimization by also making sure the shader code only uses either SI_PARAM_LINEAR_CENTROID or SI_PARAM_LINEAR_CENTER, not both. This would probably require a shader variant. Sorry I didn't remember this when reviewing the reverted change. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-01-15 15:09:48 +09:00
Michel Dänzer	a6a75f1286	st/clover: Adapt to TargetLibraryInfo.h move in LLVM SVN r226078 Trivial.	2015-01-15 12:57:05 +09:00
Ian Romanick	0a0d2c9443	mesa: Micro-optimize _mesa_is_valid_prim_mode You would not believe the mess GCC 4.8.3 generated for the old switch-statement. On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects Gl32Batch7: 32-bit: Difference at 95.0% confidence -0.37374% +/- 0.184057% (n=40) 64-bit: Difference at 95.0% confidence 0.966722% +/- 0.338442% (n=40) The regression on 32-bit is odd. Callgrind says the caller, _mesa_is_valid_prim_mode is faster. Before it says 2,293,760 cycles, and after it says 917,504. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-14 17:09:50 -08:00
Ian Romanick	ead200d156	mesa: Check for vertex program the same way in desktop GL and ES On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects Gl32Multithread: 32-bit: Difference at 95.0% confidence 0.416027% +/- 0.163529% (n=40) 64-bit: Difference at 95.0% confidence 0.494771% +/- 0.259985% (n=40) Gl32Batch7 had no difference proven at 95.0% confidence (n=120) on 32-bit or 64-bit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-14 17:09:50 -08:00
Ian Romanick	d5f936367f	mesa: Drop index buffer bounds check The previous check was insufficient (as it did not take 'indices' into consideration), and DX10 hardware does not need this check anyway. Since index_bytes is no longer used, remove it. On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects Gl32Batch7: 32-bit: Difference at 95.0% confidence 1.66929% +/- 0.230107% (n=40) 64-bit: Difference at 95.0% confidence -1.40848% +/- 0.288038% (n=40) The regression on 64-bit is odd. Callgrind says the caller, validate_DrawElements_common is faster. Before it says 10,321,920 cycles, and after it says 8,945,664. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-14 17:09:50 -08:00
Ian Romanick	a4aeb534ea	mesa: Only check for a current vertex shader in core profile This doesn't affect performance, but it feels more correct. On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects Gl32Batch7: 32-bit: No difference proven at 95.0% confidence (n=120) 64-bit: No difference proven at 95.0% confidence (n=120) Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-14 17:09:50 -08:00
Ian Romanick	d6c6b186cf	mesa: Only validate shaders that can exist in the context On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects Gl32Batch7: 32-bit: Difference at 95.0% confidence 0.495267% +/- 0.202063% (n=40) 64-bit: Difference at 95.0% confidence 3.57576% +/- 0.288175% (n=40) Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-14 17:09:50 -08:00
Ian Romanick	14aadbe827	i965: Store the atoms directly in the context Instead of having an extra pointer indirection in one of the hottest loops in the driver. On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects Gl32Batch7: 32-bit: Difference at 95.0% confidence 1.98515% +/- 0.20814% (n=40) 64-bit: Difference at 95.0% confidence 1.5163% +/- 0.811016% (n=60) v2 (Ken): Cut size of array from 64 to 57 to save memory. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-14 17:01:27 -08:00
Ian Romanick	6ed53c27ef	i965: Micro-optimize brw_get_index_type With the switch-statement, GCC 4.8.3 produces a small pile of code with a branch. 00000000 <brw_get_index_type>: 000000: 8b 54 24 04 mov 0x4(%esp),%edx 000004: b8 01 00 00 00 mov $0x1,%eax 000009: 81 fa 03 14 00 00 cmp $0x1403,%edx 00000f: 74 0d je 00001e <brw_get_index_type+0x1e> 000011: 31 c0 xor %eax,%eax 000013: 81 fa 05 14 00 00 cmp $0x1405,%edx 000019: 0f 94 c0 sete %al 00001c: 01 c0 add %eax,%eax 00001e: c3 ret However, this could be two instructions. 00000000 <brw_get_index_type>: 000000: 2d 01 14 00 00 sub $0x1401,%eax 000005: d1 e8 shr %eax 000007: 90 nop 000008: 90 nop 000009: 90 nop 00000a: 90 nop 00000b: c3 ret The function was also moved to the header so that it could be inlined at the two call sites. Without this, 32-bit also needs to pull the parameter from the stack. This means there is a push, a call, a move, and a ret added to a two instruction function. The above code shows the function with __attribute__((regparm=1)), but even this adds several extra instructions. There is also an extra instruction on 64-bit to move the parameter to %eax for the subtract. On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects Gl32Batch7: 32-bit: Difference at 95.0% confidence 0.818589% +/- 0.234661% (n=40) 64-bit: Difference at 95.0% confidence 0.54554% +/- 0.354092% (n=40) Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-14 16:56:47 -08:00
Ian Romanick	3f1f1d0df4	meta: Put _mesa_meta_in_progress in the header file ...so that it can be inlined in the two places that call it. On Bay Trail-D using Fedora 20 compile flags (-m64 -O2 -mtune=generic for 64-bit and -m32 -march=i686 -mtune=atom for 32-bit), affects Gl32Batch7: 32-bit: No difference proven at 95.0% confidence (n=120) 64-bit: Difference at 95.0% confidence 1.24042% +/- 0.382277% (n=40) Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-14 16:55:53 -08:00
Kenneth Graunke	3167a80bb1	i965: Fix "vertex" vs. "geometry" and "VS" vs. "GS" in debug output. We were happily printing "Native code for unnamed vertex shader" and "VS vec4" program for geometry shaders in our INTEL_DEBUG=gs output, as well as the KHR_debug output used by shader-db. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-14 16:55:43 -08:00
Kenneth Graunke	68ed14d6ad	i965: Pass a shader stage abbreviation to fs_generator(). A lot of messages hardcoded the string "FS", which is confusing on Broadwell, where we use this code for VS support as well. shader-db particularly got confused, as it reported two "FS SIMD8" shaders, and no vertex shaders at all. Craziness ensued. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-14 16:55:38 -08:00
Samuel Iglesias Gonsalvez	efef6c8280	configure: add check for GNU indent Only GNU indent is supported when indenting autogenerated format_pack.c and format_unpack.c files. Some non-GNU indent implementations (Mac OS X and FreeBSD) add extra whitespace that breaks the build of those files. Fall back to 'cat' if a non-GNU indent is found. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=88335 Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-14 12:52:22 +01:00
Samuel Iglesias Gonsalvez	6d43a4c338	configure: change required Python Mako version to 0.3.4 Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2015-01-14 12:52:22 +01:00
Iago Toral Quiroga	c6a2628950	mesa: rename RGBA8888_* format constants to something appropriate. The 8888 suggests 8-bit components which is not correct, so replace that with the actual size of the components in each format. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-14 07:57:31 +01:00
Jason Ekstrand	ae417957e0	i965/miptree_map_blit: Don't do the initial copy if INVALIDATE_RANGE is set Before we were always copying from the buffer being mapped into the temporary buffer. However, if INVALIDATE_RANGE is set, then we know that the data is going to be junk after we unmap so there's no point in doing the blit. This is important because doing the blit will cause a stall 3 lines later when we map the buffer. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-13 22:06:51 -08:00
Tapani Pälli	f52fe39d31	mesa/glsl/glapi: enable GL_EXT_draw_buffers extension Patch enables ES2 extension that utilizes existing ES3 functionality. Changes make all the subtests to run and pass in WebGL conformance test 'webgl-draw-buffers' when running Chrome on OpenGL ES, also Piglit test 'draw_buffers_gles2' passes. v2: remove unused boolean (Ilia Mirkin) v3: proper error checking for invalid values (Chad Versace) v4: run error check explicitly for ES2 and ES3 (Kenneth Graunke) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-01-14 07:48:51 +02:00
Jason Ekstrand	3a5c7e47fd	i965/fs: Allow constant propagation between different types This will be needed for NIR because it is typeless and treats all constants as uint32 values and reinterprets them when they are used later. This commit allows those values to be properly propagated. Also, this helps some synmark shaders because it allows us to copy propagate a 0x00000000UD into a 0.0F in a load_payload, which then lets us combine 4 load_payloads. instructions in affected programs: 2288 -> 2144 (-6.29%) Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-13 13:24:52 -08:00
Chad Versace	610c7486c2	egl/wayland: Fix unused variable warnings Remove ctx variables unused as of `70e8ccc459`.	2015-01-13 11:33:23 -08:00
Mike Mason	90d2a85193	mesa: Enable GL_RGB/GL_RGBA in GLES3 glGetInternalformativ Removes commit `7894278` changes and moves fix to _mesa_GetInternalformativ(). The original commit enabled the GL_RGB and GL_RGBA unsized internal formats as valid for render buffers in GLES3, but this is incorrect. They should have only been enabled for GetInternalformativ() Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88079 Reviewed-by: Chad Versace <chad.versace@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-13 11:23:46 -08:00
Rob Clark	876550ff97	freedreno/ir3: handle "holes" in inputs If, for example, only the x/y/w components of in.xyzw are actually used, we still need to have a group of four registers and assign all four components. The hardware can't write in.xy and in.w to discontiguous registers. To handle this, pad with a dummy NOP instruction, to keep the neighbor chain contiguous. This fixes a problem noticed with firefox OMTC. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-13 08:17:18 -05:00
Iago Toral Quiroga	b6819cd554	mesa: Fix error reporting for some cases of incomplete FBO attachments According to the OpenGL and OpenGL ES specs (sections "FRAMEBUFFER COMPLETENESS" and "Whole Framebuffer Completeness"), the image for color, depth or stencil attachments must be renderable, otherwise the attachment is considered incomplete and we should report GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT. Currently, we detect this situation properly but report a different error. This fixes the following 3 piglit tests: dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb_unsigned_int_2_10_10_10_rev dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgba_unsigned_int_2_10_10_10_rev dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb16f Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-13 12:19:32 +01:00
Eduardo Lima Mitev	038894c7cb	mesa: Returns a GL_INVALID_VALUE error if num of texs in glDeleteTextures is negative Per GLES3 manual for glDeleteTextures <https://www.khronos.org/opengles/sdk/docs/man3/html/glDeleteTextures.xhtml>, GL_INVALID_VALUE is generated if n is negative. Fixes 1 dEQP test: * dEQP-GLES3.functional.negative_api.texture.deletetextures Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-13 12:19:32 +01:00
Eduardo Lima Mitev	2012f62d4a	mesa: Returns a GL_INVALID_VALUE error if num of fbos in glDeleteRenderbuffers is negative Per GLES3 manual for glDeleteRenderbuffers <https://www.khronos.org/opengles/sdk/docs/man3/html/glDeleteRenderbuffers.xhtml>, GL_INVALID_VALUE is generated if n is negative. Fixes 1 dEQP test: * dEQP-GLES3.functional.negative_api.buffer.delete_renderbuffers Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-13 12:19:32 +01:00
Eduardo Lima Mitev	f77a473497	mesa: Returns a GL_INVALID_VALUE error if num of fbos in glDeleteFramebuffers is negative Per GLES3 manual for glDeleteFramebuffers <https://www.khronos.org/opengles/sdk/docs/man3/html/glDeleteFramebuffers.xhtml>, GL_INVALID_VALUE is generated if n is negative. Fixes 1 dEQP test: * dEQP-GLES3.functional.negative_api.buffer.delete_framebuffers Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-13 12:19:32 +01:00
Eduardo Lima Mitev	f408c333e2	mesa: Allows querying GL_SAMPLER_BINDING on GLES3 profile From GLES3 specification (page 123), "The currently bound sampler may be queried by calling GetIntegerv with pname set to SAMPLER_BINDING". Fixes 4 dEQP tests: * dEQP-GLES3.functional.state_query.integers.sampler_binding_getboolean * dEQP-GLES3.functional.state_query.integers.sampler_binding_getinteger * dEQP-GLES3.functional.state_query.integers.sampler_binding_getinteger64 * dEQP-GLES3.functional.state_query.integers.sampler_binding_getfloat Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-13 12:19:32 +01:00
Samuel Iglesias Gonsalvez	719e3f016e	main: round floating-point value to nearest integer in glGetSamplerParameteriv() Previously, a cast was done to convert from float to int but there were rounding errors. The spec specifies in Data Conversion chapter that Floating-point values are rounded to the nearest integer. This patch fixes the following 2 dEQP tests: dEQP-GLES3.functional.state_query.sampler.sampler_texture_min_lod_getsamplerparameteri dEQP-GLES3.functional.state_query.sampler.sampler_texture_max_lod_getsamplerparameteri Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-13 12:19:32 +01:00
Samuel Iglesias Gonsalvez	d8d59202af	main: round floating-point value to nearest integer in glGetTexParameteriv() Previously, a cast was done to convert from float to int but there were rounding errors. The spec specifies in Data Conversion chapter that Floating-point values are rounded to the nearest integer. This patch fixes the following 8 dEQP tests: dEQP-GLES3.functional.state_query.texture.texture_2d_texture_min_lod_gettexparameteri dEQP-GLES3.functional.state_query.texture.texture_2d_texture_max_lod_gettexparameteri dEQP-GLES3.functional.state_query.texture.texture_3d_texture_min_lod_gettexparameteri dEQP-GLES3.functional.state_query.texture.texture_3d_texture_max_lod_gettexparameteri dEQP-GLES3.functional.state_query.texture.texture_2d_array_texture_min_lod_gettexparameteri dEQP-GLES3.functional.state_query.texture.texture_2d_array_texture_max_lod_gettexparameteri dEQP-GLES3.functional.state_query.texture.texture_cube_map_texture_min_lod_gettexparameteri dEQP-GLES3.functional.state_query.texture.texture_cube_map_texture_max_lod_gettexparameteri Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-13 12:19:32 +01:00
Samuel Iglesias Gonsalvez	8e49a3e028	main: fix return GL_FRAMEBUFFER_ATTACHMENT_TEXTURE_LEVEL value Return the proper value for two-dimensional array texture and three-dimensional textures. From OpenGL ES 3.0 spec, chapter 6.1.13 "Framebuffer Object Queries", page 234: "If pname is FRAMEBUFFER_ATTACHMENT_TEXTURE_LAYER and the texture object named FRAMEBUFFER_ATTACHMENT_OBJECT_NAME is a layer of a three-dimensional texture or a two-dimensional array texture, then params will contain the number of the texture layer which contains the attached image. Otherwise params will contain the value zero." Furthermore, FRAMEBUFFER_ATTACHMENT_TEXTURE_LAYER is an alias of FRAMEBUFFER_ATTACHMENT_TEXTURE_3D_ZOFFSET_EXT. This patch fixes dEQP test: dEQP-GLES3.functional.state_query.fbo.framebuffer_attachment_texture_layer Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-13 12:19:32 +01:00
Iago Toral Quiroga	c260d61e76	i965: Fix bitcast operations with negate (ceil) Commit `0ae9ca12a8` put source modifiers out of the bitcast operations by adding a MOV operation that would handle them separately. It missed the case of ceil though: the implementation negates both its source and destination operands. The source operand will be used for RNDD, which we can handle normally, but we need to fix the modifier for the negated result. v2: - RNDD can handle the source modifier so no need to put that one in a separate MOV. Fixes the following 42 dEQP tests: dEQP-GLES3.functional.shaders.builtin_functions.common.ceil._vertex dEQP-GLES3.functional.shaders.builtin_functions.common.ceil._fragment dEQP-GLES3.functional.shaders.builtin_functions.precision.ceil._vertex. dEQP-GLES3.functional.shaders.builtin_functions.precision.ceil._fragment. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-13 12:19:32 +01:00
Iago Toral Quiroga	d42e090386	mesa: Depth and stencil attachments must be the same in OpenGL ES3 "9.4. FRAMEBUFFER COMPLETENESS ... Depth and stencil attachments, if present, are the same image." Notice that this restriction is not included in the OpenGL ES2 spec. Fixes 18 dEQP tests in: dEQP-GLES3.functional.fbo.completeness.attachment_combinations.* Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-13 12:19:32 +01:00
Eduardo Lima Mitev	b8b1d83c71	mesa: Initializes the stencil value masks to 0xFF instead of ~0u '4.1.4 Stencil Test' section of the GL-ES 3.0 specification says: "In the initial state, [...] the front and back stencil mask are both set to the value 2^s − 1, where s is greater than or equal to the number of bits in the deepest stencil buffer* supported by the GL implementation." Since the maximum supported precision for stencil buffers is 8 bits, mask values should be initialized to 2^8 - 1 = 0xFF. Currently, these masks are initialized to max unsigned integer (~0u), because in OpenGL 3.0 and before, the initial mask values were: "In the initial state, stenciling is disabled, the front and back stencil reference value are both zero, the front and back stencil comparison functions are both ALWAYS, and the front and back stencil mask are both all ones." The problem is that it causes the mask values to overflow to -1 when converted to signed integer by glGet* APIs. Fixes 6 dEQP failing tests: * dEQP-GLES3.functional.state_query.integers.stencil_value_mask_getfloat * dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_getfloat * dEQP-GLES3.functional.state_query.integers.stencil_value_mask_separate_getfloat * dEQP-GLES3.functional.state_query.integers.stencil_value_mask_separate_both_getfloat * dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_separate_getfloat * dEQP-GLES3.functional.state_query.integers.stencil_back_value_mask_separate_both_getfloat Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-13 12:19:32 +01:00
Eduardo Lima Mitev	aa727c1dd9	i965: Sets missing vertex shader constant values for HighInt format The range's min and max, and the precision value are not set correctly for the vertex shader constants. Fixes 1 dEQP test: dEQP-GLES3.functional.state_query.shader.precision_vertex_highp_int Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-13 12:19:32 +01:00
Marek Olšák	bed6f20f28	r600g: fix build failure when building the driver without LLVM	2015-01-12 23:20:26 +01:00
Laura Ekstrand	0e6f0eea1a	main: Remove comparison unsigned int >= 0. Fixes "macro compares unsigned to 0 (NO_EFFECT)" found by Coverity Scan. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-12 10:23:17 -08:00
Juha-Pekka Heikkila	c503ce1044	mesa/main: In _mesa_CompressedTextureSubImage3D() check found texObj Check returned texObj is not null. If texObj is null there is already GL_INVALID_OPERATION error set. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2015-01-12 09:56:43 -08:00
José Fonseca	457d40e9e8	mesa: Move declarations to top of block. To fix MSVC build. Trivial.	2015-01-12 12:40:01 +00:00
Samuel Iglesias Gonsalvez	c471b09bf4	mesa: restrict use of GL_ABGR_EXT format to allowed data types GL_UNSIGNED_SHORT_5_5_5_1, GL_UNSIGNED_SHORT_1_5_5_5_REV, GL_UNSIGNED_INT_10_10_10_2, GL_UNSIGNED_INT_2_10_10_10_REV data types are not explicitly allowed to work with GL_ABGR_EXT format neither in GL nor GL_EXT_abgr specs. Removed the corresponding mesa formats as there are no other functions using them inside Mesa anymore. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:30 +01:00
Iago Toral Quiroga	769de5165c	mesa: Remove _mesa_rebase_rgba_uint and _mesa_rebase_rgba_float These are no longer used anywhere now that we have _mesa_format_convert. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:30 +01:00
Samuel Iglesias Gonsalvez	8993b9818c	mesa: Remove _mesa_pack_int_rgba_row() and auxiliary functions These are no longer used. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:30 +01:00
Iago Toral Quiroga	d28d9376e2	mesa: Remove _mesa_(un)pack_index_span These are not used anywhere. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	3a4de32144	mesa: Remove _mesa_pack_rgba_span_float and tmp_pack.h _mesa_pack_rgba_span_float was the last of the color span functions and we have replaced all calls to it with calls to _mesa_format_convert, so we can remove it together with tmp_pack.h which was used to generate the pack functions for multiple types that were used from the various color span functions that have been removed. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	873437e209	mesa: Remove _mesa_unpack_color_span_float And various helper functions that went unused after removing it. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	3ba92bac76	mesa: Remove (signed) integer pack and span functions. These are no longer used now that we moved to _mesa_format_convert. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	2280fdeb61	mesa: Remove _mesa_unpack_color_span_ubyte This is no longer used anywhere after moving to _mesa_format_convert. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	c540800aa5	mesa: Remove _mesa_make_temp_float_image Now that we have _mesa_format_convert we don't need this. This was only used to create temporary RGBA float images in the process of storing some compressed formats. These can call _mesa_texstore with a RGBA/float dst to achieve the same goal. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	4468386a3c	mesa: Remove _mesa_make_temp_ubyte_image Now that we have _mesa_format_convert we don't need this. texstore_rgba will use the GL_COLOR_INDEX to RGBA conversion helpers instead and compressed formats that used _mesa_make_temp_ubyte_image to create an ubyte RGBA temporary image can call _mesa_texstore with a RGBA/ubyte dst to achieve the same goal. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	43a76a9e44	mesa: Remove _mesa_unpack_color_span_uint This is no longer used. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Eduardo Lima Mitev	87c595c17b	mesa: Replace _mesa_unpack_bitmap with _mesa_unpack_image() _mesa_unpack_bitmap() was introduced by commit `02b801c` to handle the case when data is stored in PBO by display lists, in the context of this bug: Incorrect pixels read back if draw bitmap texture through Display list https://bugs.freedesktop.org/show_bug.cgi?id=10370 Since _mesa_unpack_image() already handles the case of GL_BITMAP, this patch removes _mesa_unpack_bitmap() and makes affected calls go through _mesa_unpack_image() instead. The sample test attached to the original bug report passes with this change and there are no piglit regressions. Signed-off-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	ea79ab3e8c	mesa: Let _mesa_swizzle_and_convert take array format types instead of GL types In the future we would like to have a format conversion library that is independent of GL so we can share it with Gallium. This is a step in that direction. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	a55f67fcb0	st/mesa: Use _mesa_format_convert to implement st_GetTexImage. Instead of using _mesa_pack_rgba_span_float. This should allow us to remove that function in a later patch. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	84eb402c01	swrast: Use _mesa_format_convert to implement draw_rgba_pixels. This is the only place that uses _mesa_unpack_color_span_float so after this we should be able to remove that function. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	a629f0612d	mesa: Use _mesa_format_convert to implement get_tex_rgba_compressed. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	77bd2b288f	mesa: use _mesa_format_convert to implement get_tex_rgba_uncompressed. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	5038d839b8	mesa: use _mesa_format_convert to implement glReadPixels. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	8ec6534b26	mesa: Use _mesa_format_convert to implement texstore_rgba. Notice that _mesa_format_convert does not handle byte-swapping scenarios, GL_COLOR_INDEX or MESA_FORMAT_YCBCR(_REV), so these must be handled separately. Also, remove all the code that goes unused after using _mesa_format_convert. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	2ec8718dae	mesa: Add helpers to extract GL_COLOR_INDEX to RGBA float/ubyte We only use _mesa_make_temp_ubyte_image in texstore.c to convert GL_COLOR_INDEX to RGBA, but this helper does more stuff than this. All uses of this helper can be replaced with calls to _mesa_format_convert except for this GL_COLOR_INDEX conversion. This patch extracts the GL_COLOR_INDEX to RGBA logic to a separate helper so we can use that instead from texstore.c. In future patches we will replace all remaining calls to _mesa_make_temp_ubyte_image in the repository (related to compressed formats) with calls to _mesa_format_convert so we can remove _mesa_make_temp_ubyte_image and related functions. v2: - Remove ‘for’ loop initial declaration. They are only allowed in C99 or C11 mode. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	d71a1adff2	mesa: Add RGBA to Luminance conversion helpers For glReadPixels with a Luminance destination format we compute luminance values from RGBA as L=R+G+B. This, however, requires ad-hoc implementation, since pack/unpack functions or _mesa_swizzle_and_convert won't do this (and thus, neither will _mesa_format_convert). This patch adds helpers to do this computation so they can be used to support conversion to luminance formats. The current implementation of glReadPixels does this computation as part of the span functions in pack.c (see _mesa_pack_rgba_span_float), that do this together with other things like type conversion, etc. We do not want to use these functions but use _mesa_format_convert instead (later patches will remove the color span functions), so we need to extract this functionality as helpers. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Iago Toral Quiroga	a177b30f1f	mesa: Add _mesa_swap2_copy and _mesa_swap4_copy We have _mesa_swap{2,4} but these do in-place byte-swapping only. The new functions receive an extra parameter so we can swap bytes on a source input array and store the results in a (possibly different) destination array. This is useful to implement byte-swapping in pixel uploads, since in this case we need to swap bytes on the src data which is owned by the application so we can't do an in-place byte swap. v2: - Include compiler.h in image.h, which is necessary to build in MSVC as indicated by Brian Paul. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:29 +01:00
Samuel Iglesias Gonsalvez	dcef50b9b5	mesa/pack: use _mesa_format_from_format_and_type in _mesa_pack_rgba_span_from_* We had previously added the needed mesa formats, so we can simplify the code further. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Iago Toral Quiroga	559a1072da	mesa: Add helper to convert a GL format and type to a mesa (array) format. v2 after review by Jason Ekstrand: - Move _mesa_format_from_format_and_type to glformats - Return a mesa_format for GL_UNSIGNED_INT_8_8_8_8(_REV) v3: - Adapted to the new implementation of mesa_array_format as a plain uint32_t bitfield. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Iago Toral Quiroga	b1f0229140	mesa: Add a helper _mesa_compute_rgba2base2rgba_component_mapping This will come in handy when callers of _mesa_format_convert need to compute the rebase swizzle parameter to use. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Iago Toral Quiroga	3171a09c25	mesa: Add a rebase_swizzle parameter to _mesa_format_convert The new parameter allows callers to provide a rebase swizzle that the function needs to use to match the requirements of the base internal format involved. This is necessary when the source or destination internal formats (depending on whether we are doing the conversion for a pixel download or a pixel upload respectively) do not match the base formats of the source or destination formats of the conversion. This can happen when the driver does not support the internal formats and uses a different format to store pixel data internally. For example, a texture upload from RGB to Luminance in a driver that does not support textures with a Luminance format may decide to store the Luminance data as RGBA. In this case we want to store the RGBA values as (R,R,R,1). Following the same example, when we download from that texture to RGBA we want to read (R,0,0,1). The rebase_swizzle parameter allows these transforms to happen. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Iago Toral Quiroga	1aaed75330	mesa: Expose compute_component_mapping as _mesa_compute_component_mapping This is necessary to handle conversions between array types where the driver does not support the dst format requested by the client and chooses a different format instead. We will need this in _mesa_format_convert, so move it to format_utils.c, prefix it with '_mesa_' and make it available to other files. v2: - Move _mesa_compute_component_mapping to glformats Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Jason Ekstrand	deca11c0dc	mesa: Add an implementation of a master convert function. v2 by Iago Toral <itoral@igalia.com>: - When testing if we can directly pack we should use the src format to check if we are packing from an RGBA format. The original code used the dst format for the ubyte case by mistake. - Fixed incorrect number of bits for dst, it was computed using the src format instead of the dst format. - If the dst format is an array format, check if it is signed. We were only checking this for the case where it was not an array format, but we need to know this in both scenarios. - Fixed incorrect swizzle transform for the cases where we convert between array formats. - Compute is_signed and bits only once and for the dst format. We were computing these for the src format too but they were overwritten by the dst values immediately after. - Be more careful when selecting the integer path. Specifically, check that both src and dst are integer types. Checking only one of them should suffice since OpenGL does not allow conversions between normalized and integer types, but putting extra care here makes sense and also makes the actual requirements for this path more clear. - The format argument for pack functions is the destination format we are packing to, not the source format (which has to be RGBA). - Expose RGBA8888_* to other files. These will come in handy when in need to test if a given array format is RGBA or in need to pass RGBA formats to mesa_format_convert. v3 by Samuel Iglesias <siglesias@igalia.com>: - Add an RGBA8888_INT definition. v4 by Iago Toral <itoral@igalia.com> after review by Jason Ekstrand: - Added documentation for _mesa_format_convert. - Added additional explanatory comments for integer conversions. - Ensure that we use _mesa_swizzle_and_convert for all signed source formats. - Squashed: do not directly (un)pack to RGBA UINT if the source is not unsigned. v5 by Iago Toral <itoral@igalia.com>: - Adapted to the new implementation of mesa_array_format as a plain uint32_t bitfield. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Samuel Iglesias Gonsalvez	ba5418c60d	mesa/pack: refactor _mesa_pack_rgba_span_float() Use autogenerated format pack functions and take advantage of some macros to reduce source code, facilitating its maintenance. Unfortunately, dstType == GL_UNSIGNED_SHORT cannot be simplified like the others, so keep it as it is. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Samuel Iglesias Gonsalvez	41a785b09c	mesa/main/pack_tmp.h: Add float conversion support We will use this in a later patch to refactor _mesa_pack_rgba_span_float. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Samuel Iglesias Gonsalvez	1a5ec9624a	mesa/pack: use autogenerated format_pack functions Take advantage of new mesa formats and new format_pack functions to reduce source code in _mesa_pack_rgba_span_from_ints() and _mesa_pack_rgba_span_from_uints(). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Samuel Iglesias Gonsalvez	8c82b22a16	mesa: use format conversion functions in swrast This commit adds a macro to facilitate the task of using format conversions functions but keeps the same API. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Samuel Iglesias Gonsalvez	c5a5c9a7db	mesa/formats: add new mesa formats and their pack/unpack functions. This will be used to refactor code in pack.c and support conversion to/from these types in a master convert function that will be added later. v2: - Fix autogeneration of MESA_FORMAT_A2R10G10B10_UNORM pack/unpack functions Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Samuel Iglesias Gonsalvez	f8d160fc96	mesa/format_pack: Add _mesa_pack_int_rgba_row() This will be used to unify code in pack.c. v2: - Modify pack_int_() function generator to use c.datatype() and f.datatype() v3: - Only autogenerate pack_int_() functions for non-normalized integer formats. v4: - Use _mesa_unsigned_to_unsigned() in pack_int_*() because, in order to be able to pack both signed and unsigned formats, we need to sign-extend. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Samuel Iglesias Gonsalvez	9567e1048b	mesa: Add _mesa_pack_uint_rgba_row() format conversion function We will use this later on to handle uint conversion scenarios in a master convert function. v2: - Modify pack_uint_() function generation to use c.datatype() and f.datatype(). - Remove UINT_TO_FLOAT() macro usage from pack_uint() - Remove "if not f.is_normalized()" conditional as pack_uint() functions are only autogenerated for non normalized formats. v3: - Add clamping for non-normalized integer formats in pack_uint() Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Jason Ekstrand	e1fdcddafe	mesa: Autogenerate format_unpack.c Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> v2 by Samuel Iglesias <siglesias@igalia.com>: - Add usage of INDENT_FLAGS in Makefile.am v3 by Samuel Iglesias <siglesias@igalia.com>: - Modify unpack_float_() and unpack_ubyte_() function generation to use c.datatype() and f.datatype() - Fix out-of-tree build v4 by Samuel Iglesias <siglesias@igalia.com>: - format_unpack.c.mako is now format_unpack.py, with the template code inlined. It now auto-generates format_unpack.c - Add format_unpack.c to gitignore. - Simplify Makefile.am change - Modify SConscript to build format_unpack.c with scons v5 by Samuel Iglesias <siglesias@igalia.com>: - Don't allow float to non-normalized integer format conversions. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Jason Ekstrand	e0439f7505	mesa: Autogenerate most of format_pack.c We were auto-generating it before. The problem was that the autogeneration tool we were using was called "copy, paste, and edit". Let's use a more sensible solution. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> v2 by Samuel Iglesias <siglesias@igalia.com> - Remove format_pack.c as it is now autogenerated - Add usage of INDENT_FLAGS in Makefile.am - Remove trailing blank line v3 by Samuel Iglesias <siglesias@igalia.com> - Merge format_convert.py into format_parser.py - Adapt pack__ function generations - Fix out-of-tree build v4 by Samuel Iglesias <siglesias@igalia.com> - _get_datatype() is now a helper function v5 by Samuel Iglesias <siglesias@igalia.com> - format_pack.c.mako is now format_pack.py, with the template code inlined. It now auto-generates format_pack.c - Simplify Makefile.am change. - Modify SConscript to build format_pack.c with scons. - Remove run_mako.py - Add format_pack.c to gitignore v6 by Samuel Iglesias <siglesias@igalia.com>: - Don't allow float to non-normalized integer format conversions. - Add non-normalized formats support for ubyte packing functions. Merge the previously separated patch. - Add clamping for non-normalized integer formats in pack_ubyte*() v7 by Samuel Iglesias <siglesias@igalia.com>: - Add assert to check that sRGB formats are 8-bit size. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Samuel Iglesias Gonsalvez	2b37bea010	configure: require python mako module It is now a hard dependency because of the autogeneration of format pack and unpack functions. Update the documentation to reflect this change. v2: - Inline python script in m4 file and use PYTHON2 v3: - Remove semicolons and quotes and change coding style - Add Ilia Mirkin suggestion to use Python's split functionality. - Use AX_CHECK_PYTHON_MAKO_MODULE name. - Change to MIT license Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Jason Ekstrand	f89793946a	mesa: Add a _mesa_is_format_color_format helper Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-12 11:20:28 +01:00
Iago Toral Quiroga	3c19251f28	mesa: Let _mesa_get_format_base_format also handle mesa_array_format. If we need the base format for a mesa_array_format we have to find the matching mesa_format first. This is expensive because it requires looping through all existing mesa formats until we find the right match. We can resolve the base format of an array format directly by looking at its swizzle information. Also, we can have _mesa_get_format_base_format accept a uint32_t which can pack either a mesa_format or a mesa_array_format and resolve the base format for either type. This way clients do not need to check if they have a mesa_format or a mesa_array_format and call different functions depending on the case. Another reason to resolve the base format for array formats directly is that we don't have matching mesa_format enums for every possible array format, so for some GL format/type combinations we can produce array formats that don't have a corresponding mesa format, in which case we would not be able to find the base format. Example: format=GL_RGB, type=GL_UNSIGNED_SHORT. This type would map to something like MESA_FORMAT_RGB_UNORM16, but we don't have that. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:28 +01:00
Jason Ekstrand	3da735cc4c	main: Add a concept of an array format An array format is a 32-bit integer format identifier that can represent any format that can be represented as an array of standard GL datatypes. While the MESA_FORMAT enums provide several of these, they don't account for all of them. v2 by Iago Toral Quiroga <itoral@igalia.com>: - Implement mesa_array_format as a plain bitfield uint32_t type instead of using a struct inside a union to access the various components packed in it. This is necessary to support bigendian properly, as pointed out by Ian. - Squashed: Make float types normalized v3 by Iago Toral Quiroga <itoral@igalia.com>: - Include compiler.h in formats.h, which is necessary to build in MSVC as indicated by Brian Paul. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2015-01-12 11:20:28 +01:00
Iago Toral Quiroga	382d097e54	swrast: Remove unused variable. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:27 +01:00
Samuel Iglesias Gonsalvez	fea1be8d0b	mesa: Fix _mesa_swizzle_and_convert integer conversions to clamp properly Fix various conversion paths that involved integer data types of different sizes (uint16_t to uint8_t, int16_t to uint8_t, etc) that were not being clamped properly. Also, one of the paths was incorrectly assigning the value 12, instead of 1, to the constant "one". v2: - Create auxiliary clamping functions and use them in all paths that required clamp because of different source and destination sizes and signed-unsigned conversions. v3: - Create MIN_INT macro and use it. v4: - Add _mesa_float_to_[un]signed() and mesa_half_to_[un]signed() auxiliary functions. - Add clamp for float-to-integer conversions in _mesa_swizzle_and_convert() Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:27 +01:00
Jason Ekstrand	483b043488	mesa/format_utils: Prefix and expose the conversion helper functions Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> v2 by Samuel Iglesias <siglesias@igalia.com>: - Fix compilation errors Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:27 +01:00
Iago Toral Quiroga	3473a84fb2	mesa: Fix incorrect assertion in init_teximage_fields_ms _BaseFormat is a GLenum (unsigned int) so testing if its value is greater than 0 to detect the cases where _mesa_base_tex_format returns -1 doesn't work. Fixing the assertion breaks the arb_texture_view-lifetime-format piglit test on nouveau, since that test calls _mesa_base_tex_format with GL_R16F with a context that does not have ARB_texture_float, so it returns -1 for the BaseFormat, which was not being caught properly by the ASSERT in init_teximage_fields_ms until now. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:27 +01:00
Samuel Iglesias Gonsalvez	b2b39ce257	mesa: Fix get_texbuffer_format(). We were returning incorrect mesa formats for GL_LUMINANCE_ALPHA16I_EXT and GL_LUMINANCE_ALPHA32I_EXT. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:27 +01:00
Jason Ekstrand	96fe6191cb	mesa: Fix A1R5G5B5 packing/unpacking As with B5G6R5, these have been left broken with comments saying they are. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-12 11:20:27 +01:00
Jason Ekstrand	3e4669a8f3	mesa/colormac: Remove an unused macro The PACK_565_REV macro is no longer used. It was also extremely confusing because it's actually a byteswapped 565 not reversed 565. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2015-01-12 11:20:27 +01:00
Jason Ekstrand	ec0bfba496	mesa: Fix packing/unpacking of MESA_FORMAT_R5G6B5_UNORM Apparently, the packing/unpacking functions for these formats have differed from the format description in formats.h. Instead of fixing this, people simply left a comment saying it was broken. Let's actually fix it for real. v2 by Samuel Iglesias <siglesias@igalia.com>: - Fix comment in formats.h Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-12 11:20:27 +01:00
Jason Ekstrand	7d1b08ac44	mesa: Fix clamping to -1.0 in snorm_to_float This patch fixes the return of a wrong value when x is lower than -MAX_INT(src_bits) as the result would not be between [-1.0 1.0]. v2 by Samuel Iglesias <siglesias@igalia.com>: - Modify snorm_to_float() to avoid doing the division when x == -MAX_INT(src_bits) Cc: 10.4 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-12 11:20:27 +01:00
Emil Velikov	3b5f206475	docs: add news item and link release notes for mesa 10.3.7/10.4.2 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2015-01-12 10:46:38 +00:00
Emil Velikov	8e34db76e1	docs: Add sha256 sums for the 10.4.2 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `14f1659b43`)	2015-01-12 10:46:38 +00:00
Emil Velikov	1631f74a1c	Add release notes for the 10.4.2 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `02f2e97c3e`)	2015-01-12 10:46:38 +00:00
Emil Velikov	134593f0c0	docs: Add sha256 sums for the 10.3.7 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `20e0546cc2`)	2015-01-12 10:46:38 +00:00
Emil Velikov	4a8105e5cc	Add release notes for the 10.3.7 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `6b00e5585a`)	2015-01-12 10:46:38 +00:00
Kenneth Graunke	f95733ddb7	i965: Respect the no_8 flag on Gen6, not just Gen7+. When doing repclears, we only want to use the SIMD16 program, not the SIMD8 one. Kristian added this to the Gen7+ code, but apparently we missed it in the Gen6 code. This patch copies that code over. Approximately doubles the performance in a clear microbenchmark from mesa-demos (clearspd -width 500 -height 500 +color) on Sandybridge. Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com> References: https://code.google.com/p/chrome-os-partner/issues/detail?id=34681	2015-01-12 00:41:07 -08:00
Ian Romanick	f591712efe	mesa: Always generate GL_INVALID_OPERATION in _mesa_GetProgramBinary There are no binary formats supported, so what are you doing? At least this gives the application developer some feedback about what's going on. The spec gives no guidance about what to do in this scenario. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Leight Bade <leith@mapbox.com>	2015-01-12 12:01:09 +13:00
Ian Romanick	4fd8b30123	mesa: Ensure that length is set to zero in _mesa_GetProgramBinary v2: Fix assignment of length. Noticed by Julien Cristau. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Leight Bade <leith@mapbox.com>	2015-01-12 12:01:06 +13:00
Ian Romanick	201b9c1818	mesa: Add missing error checks in _mesa_ProgramBinary Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87516 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Leight Bade <leith@mapbox.com>	2015-01-12 12:00:45 +13:00
Eric Anholt	ff1948a1be	vc4: Clamp the inputs to the blend equation to [0, 1]. Fixes the remaining ARB_color_buffer_float rendering tests.	2015-01-11 17:17:20 +13:00
Eric Anholt	1519a1928a	vc4: Add a little helper for clamping to [0,1].	2015-01-11 17:17:20 +13:00
Eric Anholt	1a328120d3	vc4: Fix up statechange management for uncompiled/compiled FS/VS. No need to recheck the FS compile when the VS source has changed, but there is a need to recheck the VS compile when the compiled VS has changed (since the live inputs may change). Fixes es3conform's blend test.	2015-01-11 17:17:20 +13:00
Eric Anholt	c122662984	vc4: Fix clear color setup for RGB565. The util_pack_color() thing only sets up the low bits of the union, so only return them, too. Fixes intermittent failure on fbo-alphatest-formats and es3conform's framebuffer-objects test under simulation.	2015-01-11 17:17:19 +13:00
Eric Anholt	355156d2f7	vc4: Avoid the save/restore of r3 for raddr conflicts, just use ra31. Turns out this was harmful in code quality: total instructions in shared programs: 39487 -> 38845 (-1.63%) instructions in affected programs: 22522 -> 21880 (-2.85%) This costs us yet another register (which is painful since it means more programs might fail to compile). However, the alternative was causing us trouble where we'd save/restore r3 while it contained a MIN-ed direct texture offset, causing the kernel to fail to validate our shaders (such as in GLB2.7).	2015-01-11 08:57:24 +13:00
Eric Anholt	a8e14c293b	vc4: Allow dead code elimination of VPM reads. This gets a bunch of dead reads out of the CSes, which don't read most attributes generally. total instructions in shared programs: 39753 -> 39487 (-0.67%) instructions in affected programs: 4721 -> 4455 (-5.63%)	2015-01-10 20:55:37 +13:00
Eric Anholt	b920ecf793	vc4: Cook up the draw-time VPM setup info during shader compile. This will give the compiler the chance to dead-code eliminate unused VPM reads. This is particularly a big deal in the CS where a bunch of vattrs are just not going to be used.	2015-01-10 15:24:56 +13:00
Eric Anholt	c772c92153	vc4: Split two notions of instructions having side effects. Some ops can't be DCEd, while some of the ops that are just important due to the args they have can be.	2015-01-10 15:24:46 +13:00
Eric Anholt	a58ae83882	vc4: Redo VPM reads as a read file. This will let us do copy propagation of the VPM reads.	2015-01-10 14:35:24 +13:00
Eric Anholt	06b6a72a3e	vc4: Fix miscalculation of the VPM space. We pass in a byte offset, not dword. I'm rather scared that this actually managed to pass piglit, but it does fix gears.	2015-01-10 14:35:06 +13:00
Eric Anholt	92a0b0bd70	vc4: Pack VPM attr contents according to just the size of the attribute. total instructions in shared programs: 40960 -> 39753 (-2.95%) instructions in affected programs: 20871 -> 19664 (-5.78%)	2015-01-10 13:54:12 +13:00
Eric Anholt	72cb6619cb	vc4: Restructure color packing as a series of channel replacements. I'm using this in some WIP commits for doing blending in 8888 instead of vec4. But it also gives us these results immediately, thanks to allowing more uniforms/immediates in the arguments: total instructions in shared programs: 41027 -> 40960 (-0.16%) instructions in affected programs: 4381 -> 4314 (-1.53%)	2015-01-10 13:54:12 +13:00
Eric Anholt	3093bfacf0	vc4: Fix the no-copy-propagating-from-TLB_COLOR_READ check. Our MOV's dst obviously won't be the TLB_COLOR_READ's def, because we're ssa.	2015-01-10 13:54:12 +13:00
Eric Anholt	1d04432677	vc4: Move global seqno short-circuiting to vc4_wait_seqno(). Any other caller would want it, too.	2015-01-10 13:54:12 +13:00
Eric Anholt	24d9487432	state_tracker: Fix assertion failures in conditional block movs. If you had a conditional assignment of an array or struct (say, from the if-lowering pass), we'd try doing swizzle_for_size() on the aggregate type, and it would assertion fail due to vector_elements==0. Instead, extend emit_block_mov() to handle emitting the conditional operations, which also means we'll have appropriate writemasks/swizzles on the CMPs within a struct containing various-sized members. Fixes 20 testcases in es3conform on vc4. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-01-10 13:54:12 +13:00
Matt Turner	3d8188d4f8	i965: Consider SEL.{GE,L} to be commutative operations. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-08 15:38:16 -08:00
Matt Turner	7f813bf53d	i965/cfg: Fix end_ip of last basic block. start_ip and end_ip are inclusive. Increases instruction counts in 64 shaders in shader-db, likely indicative of them previously being misoptimized.	2015-01-08 15:38:16 -08:00
Brian Paul	df461ac952	mesa: compute row stride outside of loop and fix MSVC compilation error Can't do void pointer arithmetic with MSVC. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-01-08 14:35:16 -07:00
Brian Paul	e2bf5b183b	mesa: fix MSVC compilation errors Move assertions after declarations and don't use void pointer arithmetic. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-01-08 14:35:07 -07:00
Laura Ekstrand	8d2542fc9d	main: Checking for cube completeness in TextureSubImage. This is part of a potential solution to a spec bug. Cube completeness is a concept from glGenerateMipmap, but it seems reasonable to check for it in TextureSubImage when target=GL_TEXTURE_CUBE_MAP. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:30 -08:00
Laura Ekstrand	efbc1c86a6	main: Checking for cube completeness in GetTextureImage. This is part of a potential solution to a spec bug. Cube completeness is a concept from glGenerateMipmap, but it seems reasonable to check for it in GetTextureImage when the target is GL_TEXTURE_CUBE_MAP. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:30 -08:00
Laura Ekstrand	b66dd38a37	main: Added _mesa_cube_level_complete to check for the completeness of an arbitrary cube map level. Reviewed-by: Chad Versace <chad.versace@intel.com>	2015-01-08 11:37:30 -08:00
Laura Ekstrand	2546d901be	main: glDeleteTextures now throws GL_INVALID_VALUE if n is negative. This is in conformance with the OpenGL spec. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:30 -08:00
Laura Ekstrand	50d679381d	main: Refactor in teximage.c to handle NULL from _mesa_get_current_tex_object. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:30 -08:00
Laura Ekstrand	98e64e538a	main: Added entry point for glTextureBuffer. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:30 -08:00
Laura Ekstrand	499004e56a	main: Fix texObj->Immutable flag update in _mesa_texture_image_multisample. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	a7d69516b8	main: Added entry points for glTextureStorage[23]DMultisample. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	91089d6d65	main: Added entry point for glGenerateTextureMipmap. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	239e3fb876	main: Added entry points for glCompressedTextureSubImage*D. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	8b5482ec03	main: Added entry point for glGetCompressedTextureImage. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	a739bdeb1d	main: Added entry point for glGetTextureImage. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	f51f6805f5	main: Nameless texture creation and deletion. Does not affect normal creation and deletion paths. In implementing ARB_DIRECT_STATE_ACCESS functions, it is often necessary to abstract the functionality of a traditional GL API function into a backend that both the traditional and dsa API functions can share. For instance, glTexParameteri and glTextureParameteri both call _mesa_texture_parameteri, which takes a context object and a texture object as arguments. The existence of such backend functions provides the opportunity for driver internals (such as meta) to pass around the actual texture object rather than its ID or target, saving on texture object storage and look-up overhead. This patch provides nameless texture creation and deletion for meta. This will be used in an upcoming refactor of meta. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	d6b7c40cec	main: Added entry points for CopyTextureSubImage*D. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	bad39f6c1e	main: Fixed some comments in texparam.c Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	c2c5077864	main: Added entry points for glGetTextureParameteriv, Iiv, and Iuiv. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	89912d04a1	main: Added entry point for glGetTextureParameterfv. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	86bb3be319	main: Added entry points for glGetTextureLevelParameteriv, fv. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	bf5c588cde	main: legal_get_tex_level_parameter_target now handles GL_TEXTURE_CUBE_MAP. ARB_DIRECT_STATE_ACCESS functions allow an effective target of GL_TEXTURE_CUBE_MAP. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	d954f6023b	main: Added entry points for glTextureParameteriv, Iiv, Iuiv. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	354d789f3b	main: Added entry point for glTextureParameteri. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	2ce5db3930	main: Added entry point for glTextureParameterfv. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	abc688e33a	main: Added entry point for glTextureParameterf. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	5ad5393f3b	main: Added get_texobj_by_name in texparam.c. This is a convenience function for TextureParameter functions. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	795ba44754	main: set_tex_parameterf now handles errors according to the OpenGL 4.5 Specification. Beginning in the OpenGL 4.3 core specification, certain error handling has changed. One example shown here is that INVALID_ENUM is thrown instead of INVALID_OPERATION when a user attempts to set sampler parameters for a multisample target. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:29 -08:00
Laura Ekstrand	f4dce7a6a6	main: set_tex_parameteri now handles errors according to the OpenGL 4.5 Specification. Beginning in the OpenGL 4.3 core specification, some error handling has changed (see OpenGL 4.5 core spec, 30.10.2014, Section 8.10 Texture Parameters, pages 228-29). As an example, changing sampler states with a multisample target throws INVALID_ENUM rather than INVALID_OPERATION. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:28 -08:00
Laura Ekstrand	77aabd8be2	main: Added entry point for BindTextureUnit. The following preparations were made in texstate.c and texstate.h to better facilitate the BindTextureUnit function: Dylan Noblesmith: mesa: add _mesa_get_tex_unit() mesa: factor out _mesa_max_tex_unit() This is about to appear in a lot more places, so reduce boilerplate copy paste. add _mesa_get_tex_unit_err() checking getter function Reduce boilerplate across files. Laura Ekstrand: Made note of why BindTextureUnit should throw GL_INVALID_OPERATION if the unit is out of range. Added assert(unit > 0) to _mesa_get_tex_unit. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:28 -08:00
Laura Ekstrand	4b381e84db	main: Corrected comment on _mesa_is_zero_size_texture. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:28 -08:00
Laura Ekstrand	b8939fd3d1	main: Added entry points for glTextureSubImage*D. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:28 -08:00
Laura Ekstrand	5a5fe9f308	main: Added entry points for glTextureStorage*D. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:28 -08:00
Laura Ekstrand	97c838cf85	main: Added entry point for glCreateTextures. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:28 -08:00
Laura Ekstrand	15ddc2d94b	main: Removed trailing whitespaces in texture code. main: Removed trailing whitespace in texstate.c. main: Deleted trailing whitespaces in texobj.c. main: Fixed whitespace errors in teximage.h and teximage.c. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:28 -08:00
Laura Ekstrand	ea1fb258ba	main: Renamed _mesa_get_compressed_teximage to _mesa_GetCompressedTexImage_sw. This reflects the new naming convention for software fallbacks. To avoid confusion with ARB_DIRECT_STATE_ACCESS backend functions, software fallbacks now have the form _mesa_[Driver function name]_sw. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:28 -08:00
Laura Ekstrand	460365cde3	main: Renamed _mesa_get_teximage to _mesa_GetTexImage_sw. This reflects the new naming convention for software fallbacks. To avoid confusion with ARB_DIRECT_STATE_ACCESS backend functions, software fallbacks now have the form _mesa_[Driver function name]_sw. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:28 -08:00
Laura Ekstrand	16f6d9cf5f	main: Changed _mesa_alloc_texture_storage to _mesa_AllocTextureStorage_sw. In order to implement ARB_DIRECT_STATE_ACCESS, many GL API functions must now rely on a backend that both traditional and DSA functions can use. For instance, _mesa_TexStorage2D and _mesa_TextureStorage2D both call a backend function _mesa_texture_storage that takes a context and a texture object as arguments. The backend is named _mesa_texture_storage so that Meta can call it and avoid looking up the context and the texture object. However, backend names often look very close to the names of software fallbacks (ie. _mesa_alloc_texture_storage). For this reason, software fallbacks have been renamed for clarity to have the form _mesa_[Driver function name]_sw. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:28 -08:00
Laura Ekstrand	35371d6560	main: Moved _mesa_get_current_tex_object from teximage.c to texobj.c. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:28 -08:00
Laura Ekstrand	d7528fce5a	main: Moved _mesa_lock_texture and _mesa_unlock_texture to texobj.h from teximage.h. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:28 -08:00
Laura Ekstrand	838ef5b781	i965: blit_texture_to_pbo() now accepts TEXTURE_CUBE_MAP. ARB_DIRECT_STATE_ACCESS permits the user to use TEXTURE_CUBE_MAP as a target. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:28 -08:00
Laura Ekstrand	60e3bfddaf	main: Added utility function _mesa_lookup_texture_err(). Most ARB_DIRECT_STATE_ACCESS functions take an object's ID and use it to look up the object in its hash table. If the user passes a fake object ID (ie. a non-generated name), the implementation should throw INVALID_OPERATION. This is a convenience function for texture objects. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:28 -08:00
Laura Ekstrand	56875181c7	glapi: Added ARB_direct_state_access.xml file. main: Added ARB_direct_state_access to extensions.c as dummy_false. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2015-01-08 11:37:28 -08:00
José Fonseca	6c9b695a9c	st/wgl: Ignore ulVersion in DrvValidateVersion. We never used ulVersion for proper version checks. Most 3rd party drivers use version 1, but recently NVIDIA OpenGL driver started using a different version number, so the handy trick of renaming Mesa's ICDs as nvoglv32.dll on Windows machines with NVIDIA hardware for quick testing of Mesa software renderers stopped working. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-01-08 18:57:04 +00:00
José Fonseca	0dba2af2fb	mesa: Address `assignment makes integer from pointer without a cast` gcc warning. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-08 18:57:04 +00:00
Kristian Høgsberg	0ac4c27275	i965/skl: Always use a header for SIMD4x2 sampler messages SKL+ overloads the SIMD4x2 SIMD mode to mean either SIMD8D or SIMD4x2 depending on bit 22 in the message header. If the bit is 0 or there is no header we get SIMD8D. We always want SIMD4x2 in vec4 and for fs pull constants, so use a message header in those cases and set bit 22 there. Based on an initial patch from Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2015-01-08 10:13:32 -08:00
Kristian Høgsberg	cec8eff28e	i965/skl: Report more accurate number of samples for format Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-07 21:51:35 -08:00
Rob Clark	e7026ac486	freedreno/ir3: fix pos_regid > max_reg We can't (or don't know how to) turn this off. But it can end up being stored to a higher reg # than what the shader uses, leading to corruption. Also we currently aren't clever enough to turn off frag_coord/frag_face if the input is dead-code, so just fixup max_reg/max_half_reg. Re-org this a bit so both vp and fp reg footprint fixup are called by a common fxn used also by ir3_cmdline. Also add a few more output lines for ir3_cmdline to make it easier to see what is going on. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-07 19:37:28 -05:00
Rob Clark	1e5c207dba	freedreno/ir3: start on indirect gpr reads Handle TEMP[ADDR[]] src registers by generating a fanin to group array elements, similarly to how texture fetch instructions work. NOTE: For all the scalar instructions generated for a single tgsi vector operation which uses an array src (or possibly even uses the same array as multiple srcs), re-use the same fanin node. Since a vector operation operates on all components at the same time, it should never see more than one version of the same array. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-07 19:37:28 -05:00
Rob Clark	63e5b72da8	freedreno/ir3: make reg array dynamic To use fanin's to group registers in an array, we can potentially have a much larger array of registers. Rather than continuing to bump up the array size, just make it dynamically allocated when the instruction is created. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-07 19:37:28 -05:00
Rob Clark	9a9f2a893b	freedreno/ir3: simplify RA Group inputs/outputs, in addition to fanin/fanout, as they must also exist in sequential scalar registers. This lets us simplify RA by working in terms of neighbor groups. NOTE: has the slight problem that it can't optimize out mov's for things like: MOV OUT[n], IN[m] To avoid this, instead of trying to figure out what mov's we can eliminate, we first remove all mov's prior to grouping, and then re-insert mov's as needed while grouping inputs/outputs/fanins. Eventually we'd prefer the frontend to not insert extra mov's in the first place (so we don't have to bother removing them). This is the plan for an eventual NIR based frontend, so separate out the instr grouping (which will still be needed for NIR frontend) from the mov elimination (which won't). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-07 19:37:28 -05:00
Rob Clark	dddfe6c21e	freedreno/ir3: regmask support for relative addr For temp arrays, a 32bit mask won't be sufficient.. but otoh we don't need to support an arbitrary mask. So for this case use a simple size field rather than a bitmask. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-07 19:37:28 -05:00
Rob Clark	9bb865b3cf	freedreno/ir3: split up ssa_src Slight bit of refactoring that will be needed for indirect gpr addressing (TEMP[ADDR[]]). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-07 19:37:28 -05:00
Rob Clark	d15db9e7c0	freedreno/ir3: drop instr_clone() stuff Unnecessary and overly complicated. And gets in the way for temp arrays (TEMP[ADDR[]]). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-07 19:37:28 -05:00
Rob Clark	212b909643	freedreno/ir3: runtime enable RA debug for DEBUG builds Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-07 19:37:28 -05:00
Rob Clark	8c3952051e	freedreno/ir3: handle relative addr in ir3_dump Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-07 19:37:28 -05:00
Rob Clark	56370b9feb	freedreno/ir3: legalize vs unused sam dst components We probably could be more clever elsewhere and mask out components that are not used. But either way, legalize should realize that there is also a write-after-write hazard with texture sample instructions. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-07 19:37:28 -05:00
Rob Clark	063e2ef76a	freedreno/ir3: hack for old compiler Old compiler doesn't have ir3_block's.. so we need a special path. This hack can be dropped when ir3_compiler_old is retired. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-07 19:37:28 -05:00
Rob Clark	18899d1b80	tgsi: track max array per file NOTE IN[] and OUT[] don't need (have?) ArrayID's.. and TEMP[] can optionally have them. So we implicitly assume that ArrayID==0 always exists for each file. This is why array_max[file] is never less than zero. You can tell from indirect_files(_read/written) if the legacy array-id zero was actually used. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2015-01-07 19:37:28 -05:00
Rob Clark	49b4a6331f	tgsi: keep track of read vs written indirects At least temporarily, I need to fallback to old compiler still for relative dest (for freedreno), but I can do relative src temp. Only a temporary situation, but seems easy/reasonable for tgsi-scan to track this. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-01-07 19:37:28 -05:00
Marek Olšák	d7cd9bfc7f	Revert "radeonsi: reduce the size of si_pm4_state" This reverts commit `9141d88555`. It broke OpenCL.	2015-01-08 00:10:36 +01:00
Tom Stellard	e28f9d0e60	radeonsi: Fix crash when destroying si_screen We were invalidating si_screen:tm by calling r600_destroy_common_screen() which frees the si_screen object. This caused the driver to crash in LLVMDisposeTargetMachine() since we were passing it an invalid pointer. https://bugs.freedesktop.org/show_bug.cgi?id=88170	2015-01-07 16:28:40 -05:00
José Fonseca	2b7fd5b11d	mesa: Don't use _mesa_generic_nop on Windows. It doesn't work on Windows because of STDCALL calling convention -- it's the callee's responsibility to pop the arguments, and the number of arguments varies with the prototype --, so the stack pointer ends up getting corrupted. This is just a non-invasive stop-gap fix. A proper fix would be more elaborate, and require either: - a variation of __glapi_noop_table which sets GL_INVALID_OPERATION error - stop using APIENTRY on all internal _mesa_* functions. Tested with piglit gl-1.0-beginend-coverage (it now fails instead of crashing). VMware PR1350505 Reviewed-by: Brian Paul <brianp@vmware.com>	2015-01-07 19:35:35 +00:00
José Fonseca	fd1f79f7dd	glapi: Force frame pointer elimination on Windows. To catch mismatches in cdecl vs stdcall calling convention. See code comment for more detailed explanation. Tested with piglit gl-1.0-beginend-coverage (it now also crashes on debug builds.) VMware PR1350505. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-01-07 19:35:34 +00:00
Marek Olšák	1829f9c928	radeonsi: enable LLVM optimizations that assume no NaNs for non-compute shaders v2: complete rewrite Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2015-01-07 18:27:54 +01:00
Marek Olšák	d8185aa9a8	radeonsi: emit SURFACE_SYNC last This fixes a case where a transform feedback buffer is fed back as an index buffer, because SURFACE_SYNC must be after VS_PARTIAL_FLUSH. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	7c9ec6ca7e	radeonsi: flush all CB/DB caches unconditionally when changing the framebuffer This is easier to read and will work better with shader image stores. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	a1bbccf521	radeonsi: change TC cache flushing strategy for textures Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	ca9c5b2be5	radeonsi: improve and fix streamout flushing - we don't usually need to flush TC L2 - we should flush KCACHE (not really an issue now since we always flush KCACHE when updating descriptors, but it could be a problem if we used CE, which doesn't require flushing KCACHE) - add an explicit VS_PARTIAL_FLUSH flag Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	18a30c9778	radeonsi: use TC L2 for CP DMA operations with shader resources on CIK So that TC L2 doesn't need to be flushed. The only problem is with index buffers, which don't use TC. A simple solution is added that flushes TC L2 before a draw call (TC_L2_dirty). Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	11b76369f5	radeonsi: use TC L2 for updating descriptors on CIK This allows not flushing TC L2 on CIK later. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	02ba7334d3	radeonsi: don't use TC L2 for updating descriptors on SI It's causing problems, because we mix uncached CP DMA with cached WRITE_DATA when updating the same memory. The solution for SI is to use uncached access here, because CP DMA doesn't support cached access. CIK will be handled in the next patch. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	edf18da85d	radeonsi: only flush the right set of caches for CP DMA operations That's either framebuffer caches or caches for shader resources. The motivation is that framebuffer caches need to be flushed very rarely here. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	73c2b0d18c	radeonsi: implement separate ICACHE and KCACHE flush for SI Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	0aecf9e2d1	radeonsi: add a combined flag for flushing a framebuffer Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	2bfe9d4538	radeonsi: rename flush flags, split the TC flag into L1 and L2 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	d217819e78	r600g,radeonsi: separate cache flush flags I will rename them for radeonsi. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	d14f2ab4ad	r600g: move r6xx-specific streamout flush flagging into r600g Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	0543630d0b	radeonsi: only set BC_OPTIMIZE_DISABLE when necessary SPI_PS_IN_CONTROL is moved into the SPI mapping state. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	5d8e838dae	radeonsi: do not define FACE as an ordinary PS input Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	15a7fff69a	radeonsi: remove flatshade from the shader key Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	13de9475fc	radeonsi: remove special handling of TGSI_INTERPOLATE_COLOR in shader codegen It doesn't do anything useful. And colors are floating-point, so we can use fs.interp, remove "flatshade" from the shader key, and rely on the FLAT_SHADE state only (in the next patch). Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	e3d4bdd6a8	radeonsi: implement VERTEXID_NOBASE and BASEVERTEX system values Only done for completeness. Not used by anything yet. Tested by advertising PIPE_CAP_VERTEXID_NOBASE. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	d7c6f397f4	radeonsi: fix VertexID for OpenGL This fixes all failing piglit VertexID tests. Cc: 10.4 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	368b0a7340	radeonsi: clarify a hw bug in shader exports Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	d1d2af2398	radeonsi: use ordered compares for SSG and face selection Ordered compares are what you have in C. Unordered compares are the result of negating ordered compares (they return true if either argument is NaN). That special NaN behavior is completely useless here, and unordered compares produce horrible code with all stable LLVM versions. (I think that has been fixed in LLVM git) Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	a38e8de643	radeonsi: remove unused and not useful variables Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	638fa8016a	radeonsi: remove init config from states It really doesn't do anything there. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	9141d88555	radeonsi: reduce the size of si_pm4_state - the relocs array is unused, remove it - ndw is at most 115 (init), set 140 as the maximum - compute needs 4 buffers per state, graphics only needs 1; set 4 as the maximum Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-01-07 12:06:43 +01:00
Marek Olšák	1b82eb677d	tgsi: add uses_centroid into tgsi_shader_info	2015-01-07 12:06:43 +01:00
Marek Olšák	eaae92a349	st/mesa: fix GL_PRIMITIVE_RESTART_FIXED_INDEX Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-07 12:06:43 +01:00
Marek Olšák	8f5d309521	vbo: ignore primitive restart if FixedIndex is enabled in DrawArrays From GL 4.4 Core profile: If both PRIMITIVE_RESTART and PRIMITIVE_RESTART_FIXED_INDEX are enabled, the index value determined by PRIMITIVE_RESTART_FIXED_INDEX is used. If PRIMITIVE_RESTART_FIXED_INDEX is enabled, primitive restart is not performed for array elements transferred by any drawing command not taking a type parameter, including all of the Draw commands other than DrawElements. Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-01-07 12:06:42 +01:00
Eric Anholt	426fd535d9	vc4: Fix scaling W projection of the Z coordinate when there's a Z offset. Fixes piglit glsl-fs-fragcoord-zw-perspective, es3conform gl_FragCoord_z_frag, and the rest of the piglit glsl 1.10 interpolation tests.	2015-01-06 17:22:13 -08:00
Eric Anholt	49b5c901e8	vc4: Fix deletion from the program cache. The key is, oddly enough, in the key field, not in the data field (which is the vc4_compiled_shader *). Fixes regular failures in fp-long-alu.	2015-01-06 15:41:36 -08:00
Eric Anholt	b295403971	vc4: Skip storing the Z/S contents when it's invalidated. Improves framerate of 5 seconds of es2gears by 1.57473% +/- 0.669409% (n=67). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-01-06 15:40:41 -08:00
Eric Anholt	239db93888	gallium: Plumb the swap INVALIDATE_ANCILLARY flag through more layers. v2: Instead of telling the driver that the window system ancillaries have been invalidated (when the driver doesn't know which of its buffers are the window system's!), introduce a method for invalidating specific surfaces. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-01-06 15:40:41 -08:00
Eric Anholt	70e8ccc459	egl: Inform the client API when ancillary buffers may become undefined. This is part of the EGL spec, and is useful for a tiled renderer to avoid the memory bandwidth cost of storing the depth/stencil buffers. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-01-06 15:40:40 -08:00
Vinson Lee	5ae1305124	ax_prog_flex.m4: Merge upstream OpenBSD fixes. Merge the following upstream autoconf-archive patches. ax_prog_flex: change grep syntax to accept e.g. "flex.real" in case a wrapper or symlink is used. AX_PROG_FLEX: avoid use of grep empty string escape extension (fix for OpenBSD) AX_PROG_FLEX: Also accept gflex. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jonathan Gray <jsg@openbsd.org>	2015-01-06 15:06:54 -08:00
Tom Stellard	a8ef880a1b	radeon/llvm: Use amdgcn triple for SI+ on LLVM >= 3.6	2015-01-06 12:53:21 -08:00
Tom Stellard	761e36b4ca	radeonsi: Cache LLVMTargetMachine object in si_screen Rather than building a new one every compile. This should reduce some of the overhead of compiling shaders. One consequence of this change is that we lose the MachineInstrs dumps when dumping the shaders via R600_DEBUG. The LLVM IR and assembly is still dumped, and if you still want to see the MachineInstr dump, you can run the dumped LLVM IR through llc.	2015-01-06 12:53:21 -08:00
Brian Paul	934e41c0b3	mesa: create, use new _mesa_texture_base_format() function Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-05 13:50:55 -07:00
Brian Paul	f262ed6e3d	mesa: remove unused ctx parameter for _mesa_select_tex_image() Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-05 13:50:55 -07:00
Brian Paul	05279fa563	swrast: use new _mesa_base_tex_image() helper Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-05 13:50:55 -07:00
Brian Paul	58e8dd6b9d	st/mesa: use new _mesa_base_tex_image() helper This involved adding a new st_texture_image_const() helper also. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-05 13:50:55 -07:00
Brian Paul	3a400cbb66	mesa: add _mesa_base_tex_image() helper function Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-05 13:50:54 -07:00
Brian Paul	d0fa559e49	mesa: simplify a conditional in detach_shader() Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-05 13:50:54 -07:00
Brian Paul	c0a445037b	mesa: minor whitespace fixes in shaderapi.c Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-05 13:50:54 -07:00
Brian Paul	6d9aed19f3	mesa: make _mesa_reference_shader_program() an inline function which wraps _mesa_reference_shader_program_(), similar to what we do for other reference-counted objects. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-05 13:50:54 -07:00
Brian Paul	3f687e995f	mesa: update comment on delete_shader_program() Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-05 13:50:54 -07:00
Brian Paul	5b7e7cfb2b	mesa: rearrange error handling in glProgramParameteri() Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-05 13:50:54 -07:00
Brian Paul	41dc2fee4e	mesa: fix error strings in shaderapi.c The _mesa_-prefixed function names should not appear in GL error messages. Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-05 13:50:54 -07:00
Brian Paul	a6822e3135	glsl: use the is_gl_identifier() helper in a couple more places Reviewed-by: Eric Anholt <eric@anholt.net>	2015-01-05 13:50:54 -07:00
Brian Paul	83b344021b	meta: init var to silence uninitialized variable warning	2015-01-05 13:50:54 -07:00
Brian Paul	d294365d06	draw: silence uninitialized variable warning v2: move initialization of llvm_gs to declaration. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2015-01-05 13:50:54 -07:00
Brian Paul	04e35cc4aa	gallivm: silence a couple compiler warnings Silence warnings about possibly uninitialized variables when making a release build. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-01-05 13:50:54 -07:00
Leonid Shatz	5fea39ace3	gallium/util: make sure cache line size is not zero The "normal" detection (querying clflush size) already made sure it is non-zero, however another method did not. This led to crashes if this value happened to be zero (apparently can happen in virtualized environments at least). This fixes https://bugs.freedesktop.org/show_bug.cgi?id=87913 Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-05 17:58:39 +01:00
Roland Scheidegger	b59c7ed0ab	gallium/util: fix crash with daz detection on x86 The code used PIPE_ALIGN_VAR for the variable used by fxsave, however this does not work if the stack isn't aligned. Hence use PIPE_ALIGN_STACK function decoration to fix the segfault which can happen if stack alignment is only 4 bytes. This fixes https://bugs.freedesktop.org/show_bug.cgi?id=87658. Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2015-01-05 17:58:38 +01:00
Ilia Mirkin	21a280f87c	nvc0: add name to magic number Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-01-05 00:34:33 -05:00
Ilia Mirkin	7228302009	nvc0: regenerate rnndb headers The headers hadn't been regenerated in a long time and had seen a number of manual modifications. A few changes: - remove nvc0_2d entirely, use the nv50 header which has the nvc0 values too - remove 3ddefs, it's identical to the nv50 file - move macros out into a separate file Also the upstream rnndb changed the overall chip naming convention; this was fixed up manually in the generated files until a better solution is determined. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-01-05 00:34:33 -05:00
Ilia Mirkin	7ed02b111a	nv50: regenerate rnndb headers The headers hadn't been regenerated in a long time, and there were a few minor divergences. Among other things, rnndb has changed naming to G80/etc, for now I've not tackled switching that over and manually replaced the nvidia codenames back to the chip ids. However no other modifications of the headergen'd headers was done. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-01-05 00:34:33 -05:00
Tobias Klausmann	1f8c0be27e	nv50: enable texture compression Compression seems to be supported for only some formats. Enable it for those. Previously this was disabled for everything despite the code looking like it was actually enabled. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-01-05 00:34:33 -05:00
Ilia Mirkin	e452cfb149	nv50/ir: enable sat modifier for OP_SUB SUB is handled the same as ADD, so no reason not to allow a saturate modifier on it. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-01-05 00:34:33 -05:00
Roy Spliet	44673512a8	nv50/ir: Add sat modifier for mul Signed-off-by: Roy Spliet <rspliet@eclipso.eu> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-01-05 00:34:33 -05:00
Ilia Mirkin	ec3e1e6194	nv50,nvc0: avoid doing work inside of an assert assert is compiled out in release builds - don't put logic into it. Note that this particular instance is only used for vp debugging and is normally compiled out. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-01-05 00:34:33 -05:00
Ilia Mirkin	fb1afd1ea5	nv50/ir: fix texture offsets in release builds asserts get compiled out in release builds, so they can't be relied upon to perform logic. Reported-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Roy Spliet <rspliet@eclipso.eu> Cc: "10.2 10.3 10.4" <mesa-stable@lists.freedesktop.org>	2015-01-05 00:34:33 -05:00
Kenneth Graunke	5464257263	i965: Micro-optimize swizzle_to_scs() and make it inlinable. brw_swizzle_to_scs has been showing up in my CPU profiling, which is rather silly - it's a tiny amount of code. It really should be inlined, and can easily be implemented with fewer instructions. The enum translation is as follows: SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_W, SWIZZLE_ZERO, SWIZZLE_ONE 0 1 2 3 4 5 4 5 6 7 0 1 SCS_RED, SCS_GREEN, SCS_BLUE, SCS_ALPHA, SCS_ZERO, SCS_ONE which is simply (swizzle + 4) & 7. Haswell needs extra textureGather workarounds to remap GREEN to BLUE, but Broadwell and later do not. This patch replicates swizzle_to_scs in gen7_wm_surface_state.c and gen8_surface_state.c, since the Gen8+ code can be simplified to a mere two instructions. Both copies can be marked static for easy inlining. v2: Put the commit message in the code as comments (requested by Jason Ekstrand). Also fix a typo. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-01-04 21:31:40 -08:00
Kenneth Graunke	f3ad1804eb	i965: Support MESA_FORMAT_R8G8B8X8_SRGB. Valve games use GL_SRGB8 textures. Instead of supporting that properly, we fell back to MESA_FORMAT_R8G8B8A8_SRGB (with an alpha channel), which meant that we had to use texture swizzling to override the alpha to 1.0 when sampling. This meant shader recompiles on Gen < 7.5 platforms. By supporting MESA_FORMAT_R8G8B8X8_SRGB, the hardware just returns 1.0 for us, so we can just use SWIZZLE_XYZW, and avoid any recompiles. All generations of hardware have supported the format for sampling and filtering; we can easily support rendering by using the R8G8B8A8_SRGB format and writing garbage to the X channel. (We do this already for the non-SRGB version of this format.) This removes all remaining shader recompiles in a time demo of "Counter Strike: Global Offensive" (32 -> 0) on Sandybridge. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87886 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-01-04 21:31:40 -08:00
Kenneth Graunke	51b9382da8	i965: Fix BLORP sRGB MSAA overrides to cope with X vs. A formats. The logic in brw_blorp_surface_info::set uses brw_format_for_mesa_format for source surfaces, and brw->render_target_format[] for destination surfaces. We should do the same in the sRGB MSAA overrides. Currently, this isn't a problem, since SRGB MSAA buffers are all RGBA. The next commit will introduce RGBX SRGB MSAA buffers, at which point we need to get the RGBX -> RGBA format overrides for rendering right. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-01-04 21:31:40 -08:00
Kenneth Graunke	1f1102c834	i965: Copy shader->shadow_samplers to prog->ShadowSamplers. ir_to_mesa does this - apparently we just forgot or something. Without this, we'll guess the wrong texture swizzle (XYZW for color instead of XXX1 for depth) when doing precompiles. This cuts 26 shader recompiles in a time demo of "Counter Strike: Global Offensive" (58 -> 32) on Sandybridge. Haswell still has 0 recompiles. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87886 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-01-04 21:31:40 -08:00
Kenneth Graunke	0b98b2bf53	i965: Make the precompile ignore DEPTH_TEXTURE_MODE on Gen7.5+. Gen7.5+ platforms that support the "Shader Channel Select" feature leave key->tex.swizzles[i] as SWIZZLE_NOOP except when GL_DEPTH_TEXTURE_MODE is GL_ALPHA (which is really uncommon). So, the precompile should leave them as SWIZZLE_NOOP (aka SWIZZLE_XYZW) as well. We didn't notice this because prog->ShadowSamplers is not set correctly. The next patch will fix that problem. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87886 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2015-01-04 21:31:40 -08:00
Kenneth Graunke	d41cf9fb60	i965: Implement WaCsStallAtEveryFourthPipecontrol on IVB/BYT. According to the documentation, we need to do a CS stall on every fourth PIPE_CONTROL command to avoid GPU hangs. The kernel does a CS stall between batches, so we only need to count the PIPE_CONTROLs in our batches. v2: Get the generation check right (caught by Chris Wilson), combine the ++ with the check (suggested by Daniel Vetter). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-01-04 17:21:33 -08:00
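The counting scheme described in `d41cf9fb60` can be sketched as below. This is a minimal illustration under stated assumptions, not the driver code: `struct batch_counter`, `FLAG_CS_STALL`, and `extra_flags_for_pipe_control()` are hypothetical names, the hardware generation check is left out, and the counter is assumed to restart with each batch.

```c
#include <stdint.h>

/* Hypothetical flag bit standing in for the driver's CS-stall flag. */
#define FLAG_CS_STALL (1u << 0)

/* Hypothetical per-batch counter (assumed to be zeroed for each new batch). */
struct batch_counter {
    unsigned pipe_controls_since_last_cs_stall;
};

/* Returns extra flags to OR into the PIPE_CONTROL that is about to be emitted. */
static uint32_t
extra_flags_for_pipe_control(struct batch_counter *c, uint32_t flags)
{
    /* A PIPE_CONTROL that already stalls resets the count. */
    if (flags & FLAG_CS_STALL) {
        c->pipe_controls_since_last_cs_stall = 0;
        return 0;
    }

    /* The increment is combined with the check, as the v2 note suggests. */
    if (++c->pipe_controls_since_last_cs_stall == 4) {
        c->pipe_controls_since_last_cs_stall = 0;
        return FLAG_CS_STALL;
    }

    return 0;
}
```

With this shape, three consecutive non-stalling calls return 0 and the fourth returns `FLAG_CS_STALL`.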
Marek Olšák	3793a1b421	r300g: handle vertex format PIPE_FORMAT_NONE	2015-01-04 23:54:47 +01:00
Marek Olšák	48094d0e65	glsl_to_tgsi: fix a bug in copy propagation This fixes the new piglit test: arb_uniform_buffer_object/2-buffers-bug Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2015-01-03 13:25:30 +01:00
Kenneth Graunke	916516b251	i965: Make INTEL_DEBUG=state ignore state flags with a count of 1. There are too many state flags to fit in one terminal screen, even with a very tall terminal. Everything is flagged once, so a value of 1 means that it hasn't ever happened again, and thus isn't terribly interesting. Skipping those makes it easier to see the interesting values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-03 01:45:15 -08:00
Kenneth Graunke	408e298942	i965: Fix INTEL_DEBUG=optimizer with VF types. Hardcoding stderr is wrong; INTEL_DEBUG=optimizer uses other files. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-03 01:45:15 -08:00
Kenneth Graunke	9b8bd67768	i965: Show opt_vector_float() and later passes in INTEL_DEBUG=optimizer. In order to support calling opt_vector_float() inside a condition, this patch makes OPT() a statement expression: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html We've used that elsewhere already. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-01-03 01:45:15 -08:00
Jeremy Huddleston Sequoia	61711316f5	swrast: Fix -Wduplicate-decl-specifier warning swrast.c:67:12: warning: duplicate 'const' declaration specifier [-Wduplicate-decl-specifier] const char const *swrast_vendor_string = "Mesa Project"; ^ swrast.c:68:12: warning: duplicate 'const' declaration specifier [-Wduplicate-decl-specifier] const char const *swrast_renderer_string = "Software Rasterizer"; ^ Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2015-01-01 19:55:43 -08:00
Roy Spliet	c3260f8d98	nv50/ir: Fold sat into mad The mad instruction emitter already supported the saturate modifier, but the ModifierFolding pass never tried folding cvt sat operations in for NV50. Signed-off-by: Roy Spliet <rspliet@eclipso.eu> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-01-01 21:40:35 -05:00
Ilia Mirkin	9e94b87b60	nv50/ir: fold MAD when one of the multiplicands is const Fold MAD dst, src0, immed, src2 (or src0/immed swapped) when - immed = 0 -> MOV dst, src2 - immed = +/- 1 -> ADD dst, src0, src2 These types of MAD patterns were observed in some st/nine shaders. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-01-01 21:40:35 -05:00
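The folding rule in `9e94b87b60` can be illustrated with a small decision function. This is only a sketch: `enum op`, `struct mad_fold`, and `fold_mad_immediate()` are hypothetical names, not nv50/ir code. The commit message lists `ADD dst, src0, src2` for both +1 and -1; the sketch assumes the -1 case is handled by negating `src0`, and it does not consider NaN or infinite operands.

```c
#include <stdbool.h>

/* Hypothetical opcodes for the sketch. */
enum op { OP_MAD, OP_ADD, OP_MOV };

struct mad_fold {
    enum op op;       /* opcode after folding */
    bool negate_src0; /* only meaningful when op == OP_ADD */
};

/*
 * Decide how "MAD dst, src0, imm, src2" (dst = src0 * imm + src2)
 * simplifies for a given immediate multiplicand.
 */
static struct mad_fold
fold_mad_immediate(float imm)
{
    struct mad_fold r = { OP_MAD, false };

    if (imm == 0.0f) {
        r.op = OP_MOV;          /* src0 * 0 + src2 -> MOV dst, src2 */
    } else if (imm == 1.0f) {
        r.op = OP_ADD;          /* src0 * 1 + src2 -> ADD dst, src0, src2 */
    } else if (imm == -1.0f) {
        r.op = OP_ADD;          /* src0 * -1 + src2 -> ADD dst, -src0, src2 */
        r.negate_src0 = true;   /* assumption: expressed as a source negate */
    }

    return r;
}
```

Any other immediate leaves the instruction as a MAD.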
Alexander von Gluck IV	290553b6d6	gallium/state_tracker: Rewrite Haiku's state tracker * More gallium-like * Leverage stamps properly and don't call mesa functions	2015-01-01 21:33:36 -05:00
Marek Olšák	b77eaafcdc	radeonsi: fix warnings	2015-01-01 14:42:32 +01:00
Kenneth Graunke	c633528cba	i965: Fix start/base_vertex_location for >1 prims but !BRW_NEW_VERTICES. This is a partial revert of `c89306983c`. It split the {start,base}_vertex_location handling into several steps: 1. Set brw->draw.start_vertex_location = prim[i].start and brw->draw.base_vertex_location = prim[i].basevertex. (This happened once per _mesa_prim, in the main drawing loop.) 2. Add brw->vb.start_vertex_bias and brw->ib.start_vertex_offset appropriately. (This happened in brw_prepare_shader_draw_parameters, which was called just after brw_prepare_vertices, as part of state upload, and only happened when BRW_NEW_VERTICES was flagged.) 3. Use those values when emitting 3DPRIMITIVE (once per _mesa_prim). If we drew multiple _mesa_prims, but didn't flag BRW_NEW_VERTICES on the second (or later) primitives, we would do step #1, but not #2. The first _mesa_prim would get correct values, but subsequent ones would only get the first half of the summation. The reason I originally did this was because I needed the value of gl_BaseVertexARB to exist in a buffer object prior to uploading 3DSTATE_VERTEX_BUFFERS. I believed I wanted to upload the value of 3DPRIMITIVE's "Base Vertex Location" field, which was computed as: (prims[i].indexed ? prims[i].start : prims[i].basevertex) + brw->vb.start_vertex_bias. The latter value wasn't available until after brw_prepare_vertices, and the former weren't available in the state upload code at all. Hence the awkward split. However, I believe that including brw->vb.start_vertex_bias was a mistake. It's an extra bias we apply when uploading vertex data into VBOs, to move [min_index, max_index] to [0, max_index - min_index]. From the GL_ARB_shader_draw_parameters specification: "<gl_BaseVertexARB> holds the integer value passed to the <baseVertex> parameter to the command that resulted in the current shader invocation. In the case where the command has no <baseVertex> parameter, the value of <gl_BaseVertexARB> is zero."
I conclude that gl_BaseVertexARB should only include the baseVertex parameter from glDrawElements, not any internal biases we add for optimization purposes. With that in mind, gl_BaseVertexARB only needs prim[i].start or prim[i].basevertex. We can simply store that, and go back to computing start_vertex_location and base_vertex_location in brw_emit_prim(), like we used to. This is much simpler, and should actually fix two bugs. Fixes missing geometry in Unvanquished. Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85529 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-12-31 17:10:47 -08:00
Kenneth Graunke	faa615a798	i965: Use WARN_ONCE for the single-primitive-exceeded-aperture message. This makes it show up via ARB_debug_output and is also less code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-12-31 17:06:51 -08:00
Eric Anholt	a6f6d6188c	u_primconvert: Fix leak of the upload BO on context destroy. v2: Conditionalize it on having done any uploads (Turns out u_upload_destroy() isn't safe with a NULL arg). Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)	2014-12-31 13:50:17 -08:00
Eric Anholt	37478c638a	vc4: Fix memory leak as of `0404e7fe0a`. Can't reset the CL before looking at how much we had put in it.	2014-12-31 11:34:28 -08:00
Ilia Mirkin	be0311c962	nv50,nvc0: set vertex id base to index_bias Fixes the piglits which check that gl_VertexID includes the base vertex offset: arb_draw_indirect-vertexid elements gl-3.2-basevertex-vertexid Note that this leaves out the original G80, for which this will continue to fail. It could be fixed by passing a driver constbuf value in, but that's beyond the scope of this change. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>	2014-12-30 23:30:23 -05:00
Tiziano Bacocco	609c3e51f5	nv50,nvc0: implement half_pixel_center LAST_LINE_PIXEL has actually been renamed to PIXEL_CENTER_INTEGER in rnndb; use that method to implement the rasterizer setting, used for st/nine. Signed-off-by: Tiziano Bacocco <tizbac2@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2014-12-30 20:11:55 -05:00
Eric Anholt	3ba57bae47	vc4: Only render tiles where the scissor ever intersected them. This gives a 2.7x improvement in x11perf -rect100, since we only end up load/storing the x11perf window, not the whole screen.	2014-12-30 14:33:52 -08:00
Eric Anholt	0404e7fe0a	vc4: Move draw call reset handling to a helper function. This will be more important in the next commit, when there's more state to reset to nonzero values, and I want an early exit from the submit function.	2014-12-30 14:30:59 -08:00
Eric Anholt	effb39e899	vc4: Drop the content of vc4_flush_resource(). The callers all follow it with a flush of the context, and the flush of the context gives us more information about how things are being flushed.	2014-12-30 14:30:59 -08:00
Emil Velikov	64dcb2bb0a	docs: add news item and link release notes for mesa 10.3.6/10.4.1 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-12-30 02:50:43 +00:00
Emil Velikov	4fa6024b5f	docs: Add sha256 sums for the 10.4.1 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-12-30 02:45:36 +00:00
Emil Velikov	73ec4e2265	Add release notes for the 10.4.1 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-12-30 02:45:34 +00:00
Emil Velikov	dd0f2f3695	docs: Add sha256 sums for the 10.3.6 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-12-30 02:45:30 +00:00
Emil Velikov	184246b6d9	Add release notes for the 10.3.6 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-12-30 02:45:29 +00:00
Matt Turner	6c18279b9f	mesa: Remove __SSE4_1__ guards from sse_minmax.c. See commit `e07c9a288`. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-12-29 12:17:06 -08:00
Matt Turner	798c094e62	i965/vec4: Do separate copy followed by constant propagation after opt_vector_float(). total instructions in shared programs: 5877012 -> 5876617 (-0.01%) instructions in affected programs: 33140 -> 32745 (-1.19%) From before the commit that allows VF constant propagation (which hurt some programs) to here, the results are: total instructions in shared programs: 5877951 -> 5876617 (-0.02%) instructions in affected programs: 123444 -> 122110 (-1.08%) with no programs hurt. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-29 10:08:18 -08:00
Matt Turner	d61c519822	i965/vec4: Allow constant propagation of VF immediates. total instructions in shared programs: 5877951 -> 5877012 (-0.02%) instructions in affected programs: 155923 -> 154984 (-0.60%) Helps 1233, hurts 156 shaders. The hurt shaders are addressed in the next commit. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-29 10:08:18 -08:00
Matt Turner	c855f49c99	i965/vec4: Add parameter to skip doing constant propagation. After CSEing some MOV ..., VF instructions we have code like mov tmp, [1F, 2F, 3F, 4F]VF mov r10, tmp mov r11, tmp ... use r10 use r11 We want to copy propagate tmp into the uses of r10 and r11, but not constant propagate the VF immediate into the uses of tmp. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-29 10:08:18 -08:00
Matt Turner	bbdd3198a5	i965/vec4: Do CSE, copy propagation, and DCE after opt_vector_float(). total instructions in shared programs: 5869005 -> 5868220 (-0.01%) instructions in affected programs: 70208 -> 69423 (-1.12%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-29 10:08:18 -08:00
Matt Turner	7463e6d61b	i965/vec4: Perform CSE on MOV ..., VF instructions. Port of commit `a28ad9d4` from the fs backend. No shader-db changes since we don't emit MOV ..., VF instructions yet. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-29 10:08:18 -08:00
Matt Turner	44573458bd	i965/vec4: Add pass to gather constants into a vector-float MOV. Currently only handles consecutive instructions with the same destination that collectively write all channels. total instructions in shared programs: 5879798 -> 5869011 (-0.18%) instructions in affected programs: 465236 -> 454449 (-2.32%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-29 10:08:18 -08:00
Matt Turner	7bc6e455e2	i965: Add support for saturating immediates. I don't feel great about assert(!"unimplemented: ...") but these cases do only seem possible under some currently impossible circumstances. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-29 10:08:18 -08:00
Matt Turner	3978585bcc	i965: Add fs_reg/src_reg constructors that take vf[4]. Sometimes it's easier to generate 4x values into an array, and the memcpy is 1 instruction, rather than 11 to piece 4 arguments together. I'd forgotten to remove the prototype from fs_reg from a previous patch, so it's already there for us here. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-29 10:05:03 -08:00
Alexander von Gluck IV	0c7f895995	gallium/target: Drop no longer needed Haiku viewport override * Drop no longer needed mesa headers * Haiku LLVM pipe working with LLVM 3.5.0 on x86_64	2014-12-27 06:12:54 +00:00
Alexander von Gluck IV	2b3a570920	gallium/st: Clean up Haiku depth mapping, fix colorspace errors	2014-12-27 05:55:29 +00:00
Eric Anholt	cb5a37249c	vc4: Handle unaligned accesses in CL emits. As of `229bf4475f` we started getting SIGBUS from unaligned accesses on the hardware, for reasons I haven't figured out. However, we should be avoiding unaligned accesses anyway, and our CL setup certainly would have produced them.	2014-12-25 15:47:39 -10:00
Eric Anholt	db6e054eb0	vc4: Don't bother zero-initializing the shader reloc indices. They should all be set to real values by the time they're read, and ideally if you used valgrind you'd see uninitialized value uses.	2014-12-25 12:25:41 -10:00
Eric Anholt	0b607b54ce	vc4: Fix the argument type for cl_u16(). It doesn't matter, since it just got truncated to 16 inside, anyway.	2014-12-25 12:25:41 -10:00
Alexander von Gluck IV	890ef622d6	egl: Fix non-dri SCons builds re #87657 * Revert change to egl main producing Shared Libraries * Check for dri before including dri code	2014-12-25 10:34:49 -05:00
Michel Dänzer	b3057f8097	radeonsi: Don't modify PA_SC_RASTER_CONFIG register value if rb_mask == 0 E.g. this could happen on older kernels which don't support the RADEON_INFO_SI_BACKEND_ENABLED_MASK query yet. The code in si_write_harvested_raster_configs() doesn't deal with this correctly and would probably mangle the value badly. Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-12-25 12:06:22 +09:00
Eric Anholt	229bf4475f	vc4: Optimize CL emits by doing size checks up front. The optimizer obviously doesn't have the ability to rewrite these to skip the size checks per call, so we have to do it manually. Improves a norast benchmark on simulation by 0.779706% +/- 0.405838% (n=6087).	2014-12-24 10:28:26 -10:00
Eric Anholt	20e3a2430e	vc4: Avoid repeated hindex lookups in the loop over tiles. Improves norast performance of a microbenchmark by 11.1865% +/- 2.37673% (n=20).	2014-12-24 08:28:33 -10:00
Kenneth Graunke	4616b2ef85	i965: Add missing BRW_NEW_*_PROG_DATA to texture/renderbuffer atoms. This was probably missed when moving from a fixed binding table layout to a dynamic one that changes based on the shader. Fixes newly proposed Piglit test fbo-mrt-new-bind. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87619 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Mike Stroyan <mike@LunarG.com> Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>	2014-12-24 00:15:40 -08:00
Kenneth Graunke	b7f14e03e3	i965: Cache register write capability checks. Our ability to perform register writes depends on the hardware and kernel version. It shouldn't ever change on a per-context basis, so we only need to check once. Checking introduces a synchronization point between the CPU and GPU: even though we submit very few GPU commands, the GPU might be busy doing other work, which could cause us to stall for a while. On an idle i7 4750HQ, this improves performance in OglDrvCtx (a context creation microbenchmark) by 6.14748% +/- 1.6837% (n=20). With Unigine Valley running in the background (to keep the GPU busy), it improves performance in OglDrvCtx by 2290.92% +/- 29.5274% (n=5). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2014-12-24 00:15:40 -08:00
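The caching idea in `b7f14e03e3` amounts to running the expensive probe once and remembering the answer. The sketch below assumes the result is stored in a structure shared by all contexts; `struct screen_caps`, `probe_fn`, `can_write_registers()`, and `counting_probe()` are hypothetical names for illustration and do not come from the driver.

```c
#include <stdbool.h>

/* Hypothetical storage shared by all contexts. */
struct screen_caps {
    bool checked;             /* has the probe already been run? */
    bool can_write_registers; /* cached probe result */
};

/* Hypothetical probe; in a driver this would submit GPU work and wait. */
typedef bool (*probe_fn)(void *data);

/* Example probe that counts how often it is invoked. */
static bool
counting_probe(void *data)
{
    ++*(int *)data;
    return true;
}

/* Runs the probe at most once; later calls return the cached answer. */
static bool
can_write_registers(struct screen_caps *caps, probe_fn probe, void *data)
{
    if (!caps->checked) {
        caps->can_write_registers = probe(data);
        caps->checked = true;
    }
    return caps->can_write_registers;
}
```

Only the first call pays for the CPU/GPU synchronization point; every later context creation reads the cached boolean.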
Rob Clark	f332cf92b6	freedreno/ir3: split out legalize pass Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-23 19:53:01 -05:00
Rob Clark	4097ef6ee8	freedreno/ir3: ra debug Some compile time RA debug Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-23 19:53:01 -05:00
Alexander von Gluck IV	402c808372	egl/haiku: Clean up SConscript whitespace	2014-12-23 09:07:58 -05:00
Alexander von Gluck IV	49ce07878d	egl/dri2: Fix build of dri2 egl driver with SCons * egl/dri2 was missing a SConscript * Problem caught by Adrián Arroyo Calle	2014-12-23 09:07:58 -05:00
Alexander von Gluck IV	e7ac21202d	egl: Clean up Haiku visual creation * Only create one struct * 'final' also is a language conflict * Some style cleanup	2014-12-23 09:07:58 -05:00
Alexander von Gluck IV	400b833592	egl: Add Haiku code and support * This is the cleaned up work of the Haiku GCI student Adrián Arroyo Calle adrian.arroyocalle@gmail.com * Several patches were consolidated to prevent unnecessary touching of non-related code	2014-12-23 09:07:57 -05:00
Timothy Arceri	da4fb3e7a1	glsl: check if implicitly sized arrays match explicitly sized arrays across the same stage V2: Improve error message. Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-12-23 19:32:56 +11:00
Chad Versace	414be86c96	i965: Use safer pointer arithmetic in gather_oa_results() This patch reduces the likelihood of pointer arithmetic overflow bugs in gather_oa_results(), like the one fixed by `b69c7c5dac`. I haven't yet encountered any overflow bugs in the wild along this patch's codepath. But I get nervous when I see code patterns like this: (void*) + (int) * (int) I smell 32-bit overflow all over this code. This patch retypes 'snapshot_size' to 'ptrdiff_t', which should fix any potential overflow. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-12-22 15:47:14 -06:00
Chad Versace	225a09790d	i965: Use safer pointer arithmetic in intel_texsubimage_tiled_memcpy() This patch reduces the likelihood of pointer arithmetic overflow bugs in intel_texsubimage_tiled_memcpy(), like the one fixed by `b69c7c5dac`. I haven't yet encountered any overflow bugs in the wild along this patch's codepath. But I recently solved, in commit `b69c7c5dac`, an overflow bug in a line of code that looks very similar to pointer arithmetic in this function. This patch conceptually applies the same fix as in `b69c7c5dac`. Instead of retyping the variables, though, this patch adds some casts. (I tried to retype the variables as ptrdiff_t, but it quickly got very messy. The casts are cleaner). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-12-22 15:47:11 -06:00
Chad Versace	aebcf26d82	i965: Fix intel_miptree_map() signature to be more 64-bit safe This patch should diminish the likelihood of pointer arithmetic overflow bugs, like the one fixed by `b69c7c5dac`. Change the type of parameter 'out_stride' from int to ptrdiff_t. The logic is that if you call intel_miptree_map() and use the value of 'out_stride', then you must be doing pointer arithmetic on 'out_ptr'. Using ptrdiff_t instead of int should make it a little bit harder to hit overflow bugs. As a side-effect, some function-scope variables needed to be retyped to avoid compilation errors. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-12-22 15:47:07 -06:00
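The overflow concern behind `414be86c96`, `225a09790d`, and `aebcf26d82` can be shown with a small sketch. `row_offset()` and `row_pointer()` are hypothetical helpers, not functions from the driver; they only illustrate widening to `ptrdiff_t` before the multiplication, so that the product is not computed in 32-bit `int`.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Byte offset of row y in an image with the given stride.
 * Casting one operand first makes the multiplication happen in ptrdiff_t;
 * "y * stride" on two ints could overflow before any widening takes place.
 */
static ptrdiff_t
row_offset(int y, int stride)
{
    return (ptrdiff_t)y * stride;
}

/* Pointer to the start of row y, using the widened offset. */
static uint8_t *
row_pointer(void *base, int y, int stride)
{
    return (uint8_t *)base + row_offset(y, stride);
}
```

On a platform with a 64-bit `ptrdiff_t`, `row_offset(70000, 70000)` yields 4900000000, a value that does not fit in a 32-bit `int`.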
Chad Versace	d11bc9fe8d	i965: Remove spurious casts in copy_image_with_memcpy() If a pointer points to raw, untyped memory and is never dereferenced, then declare it as 'void*' instead of casting it to 'void*'. Signed-off-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-22 15:46:54 -06:00
Marek Olšák	2150db4d5d	radeonsi: force NaNs to 0 This fixes incorrect rendering in Unreal Engine demos. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83510 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-21 20:34:38 +01:00
David Heidelberg	4fb1d00f4e	st/nine: fix DBG typo (trivial) Signed-off-by: David Heidelberg <david@ixit.cz> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-12-21 20:34:19 +01:00
David Heidelberg	fbfe2918f4	r300g: implement ARR opcode Same as ARL, just has extra rounding. Useful for st/nine. Tested-by: Pavel Ondračka <pavel.ondracka@email.cz> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: David Heidelberg <david@ixit.cz> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-12-21 20:34:19 +01:00
Rob Clark	aa6415b485	freedreno/a4xx: blend-color Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-20 12:08:37 -05:00
Rob Clark	10d81a03b3	freedreno/a4xx: alpha-test Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-20 12:08:37 -05:00
Rob Clark	097d760aac	freedreno: update generated headers	2014-12-20 12:08:37 -05:00
Rob Clark	f20a0acd43	freedreno/ir3: trans_kill cleanup trans_kill() only handles the single opcode. Drop the remnant of a time when both KILL and KILL_IF were handled by the same fxn. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-20 12:08:37 -05:00
Rob Clark	4ee545646d	freedreno/ir3: hack for standalone compiler Standalone compiler doesn't have screen or context. We need to come up with a better way to control the target arch (ie. something that we can control from cmdline w/ standalone compiler) but for now this hack keeps it from segfault'ing. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-20 12:08:37 -05:00
Matt Turner	a5481d6fbb	i965/fs: Add missing const qualifier.	2014-12-19 12:55:13 -08:00
Eric Anholt	e06b0778f5	vc4: Coalesce MOVs into VPM with the instructions generating the values. total instructions in shared programs: 41168 -> 40976 (-0.47%) instructions in affected programs: 18156 -> 17964 (-1.06%)	2014-12-18 15:00:56 -08:00
Eric Anholt	a871eff16c	vc4: Redefine VPM writes as a (destination) QIR register file. This will let me coalesce the VPM writes into the instructions generating the values.	2014-12-17 22:35:08 -08:00
Timothy Arceri	a9e77896a7	docs: note change in minimum GCC version to 4.2.0 Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-12-18 16:08:27 +11:00
Timothy Arceri	743a684512	gallium: remove support for GCC older than 4.2.0 Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-18 16:08:19 +11:00
Timothy Arceri	6852dce591	mesa: bump required GCC version to 4.2.0 It turns out Mesa hasn't compiled on less than 4.2 for a while so update conf to reflect this. Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-18 16:08:11 +11:00
Eric Anholt	e473fbe469	vc4: Add support for turning constant uniforms into small immediates. Small immediates have the downside of taking over the raddr B field, so you might have less chance to pack instructions together thanks to raddr B conflicts. However, it also reduces some register pressure since it lets you load 2 "uniform" values in one instruction (avoiding a previous load of the constant value to a register), and increases some pairing for the same reason. total uniforms in shared programs: 16231 -> 13374 (-17.60%) uniforms in affected programs: 10280 -> 7423 (-27.79%) total instructions in shared programs: 40795 -> 41168 (0.91%) instructions in affected programs: 25551 -> 25924 (1.46%) In a previous version of this patch I had a reduction in instruction count by forcing the other args alongside a SMALL_IMM to be in the A file or accumulators, but that increases register pressure and had a bug in handling FRAG_Z. In this patch I just use raddr conflict resolution, which is more expensive. I think I'd rather tweak allocation to have some way to slightly prefer good choices for files in general, rather than risk failing to register allocate by forcing things into register classes.	2014-12-17 19:35:13 -08:00
Eric Anholt	ff266483fb	vc4: Move follow_movs() to common QIR code. I want this from other passes.	2014-12-17 19:05:52 -08:00
Eric Anholt	8d22e8907f	vc4: Fix missing newline for load immediate instruction disasm.	2014-12-17 19:05:52 -08:00
Matt Turner	18ebf9e251	mesa: Remove unnecessary -f from $(RM). $(RM) includes -f.	2014-12-17 17:54:33 -08:00
Matt Turner	b2b6cf2437	mesa: Remove tarballs/checksum rules.	2014-12-17 17:54:33 -08:00
Matt Turner	4cc8d66f74	gallium: Add egl and gbm to distribution.	2014-12-17 17:54:33 -08:00
Matt Turner	baedd68ca9	mesa: Set DISTCHECK_CONFIGURE_FLAGS. Enable some non-default options that distros are likely to use.	2014-12-17 17:54:33 -08:00
Matt Turner	ce48ce425a	targets/xvmc: Add uninstall hooks to handle megadriver hardlinks.	2014-12-17 17:54:33 -08:00
Matt Turner	ed1ac1d574	targets/vdpau: Add uninstall hooks to handle megadriver hardlinks.	2014-12-17 17:54:33 -08:00
Matt Turner	adc2922f9c	targets/vdpau: Add clean-local rule to remove .lib links.	2014-12-17 17:54:33 -08:00
Eric Anholt	06890c444a	vc4: Add a userspace BO cache. Since our kernel BOs require CMA allocation, and the use of them requires new mmaps, it's pretty expensive and we should avoid it if possible. Copying my original design for Intel, make a userspace cache that reuses BOs that haven't been shared to other processes but frees BOs that have sat in the cache for over a second. Improves glxgears framerate on RPi by around 30%.	2014-12-17 16:07:01 -08:00
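The cache policy described in `06890c444a` (reuse unshared BOs, free those idle for over a second) might look roughly like the sketch below. Everything here is hypothetical: `struct bo`, `struct bo_cache`, the fixed slot count, exact-size matching, and `malloc()` standing in for a kernel allocation are assumptions made for illustration, and timestamps are passed in explicitly and assumed to be monotonic.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_SLOTS 8
#define MAX_IDLE_NS 1000000000ull /* one second */

struct bo {
    size_t size;
    uint64_t released_at_ns; /* when the BO entered the cache */
};

struct bo_cache {
    struct bo *slots[CACHE_SLOTS];
};

/* Free cached BOs that have been idle for longer than MAX_IDLE_NS. */
static void
cache_expire(struct bo_cache *c, uint64_t now_ns)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        struct bo *bo = c->slots[i];
        if (bo && now_ns - bo->released_at_ns > MAX_IDLE_NS) {
            free(bo);
            c->slots[i] = NULL;
        }
    }
}

/* Reuse a cached BO of the same size if one exists, else allocate. */
static struct bo *
bo_alloc(struct bo_cache *c, size_t size, uint64_t now_ns)
{
    cache_expire(c, now_ns);
    for (int i = 0; i < CACHE_SLOTS; i++) {
        struct bo *bo = c->slots[i];
        if (bo && bo->size == size) {
            c->slots[i] = NULL;
            return bo;
        }
    }
    struct bo *bo = malloc(sizeof(*bo)); /* stand-in for a kernel allocation */
    if (bo) {
        bo->size = size;
        bo->released_at_ns = 0;
    }
    return bo;
}

/* Cache the BO unless it was shared with another process or the cache is full. */
static void
bo_release(struct bo_cache *c, struct bo *bo, bool shared, uint64_t now_ns)
{
    if (!shared) {
        for (int i = 0; i < CACHE_SLOTS; i++) {
            if (!c->slots[i]) {
                bo->released_at_ns = now_ns;
                c->slots[i] = bo;
                return;
            }
        }
    }
    free(bo);
}
```

Shared BOs bypass the cache entirely because another process may still hold a reference to them.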
Eric Anholt	39bc936011	vc4: Add dmabuf support. This gets DRI3 working on modesetting with glamor. It's not enabled under simulation, because it looks like handing our dumb-allocated buffers off to the server doesn't actually work for the server's rendering.	2014-12-17 16:07:01 -08:00
Eric Anholt	113044e1b9	vc4: Drop a weird argument in the BOs-from-handles API.	2014-12-17 16:06:17 -08:00
Roland Scheidegger	f97b731c82	draw: revert using correct order for prim decomposition. This reverts `db3dfcfe90`. The commit was correct but we've got some precision problems later in llvmpipe (or possibly in draw clip) due to the vertices coming in in different order, causing some internal test failures. So revert for now. (Will only affect drivers which actually support constant-interpolated attributes and not just flatshading.)	2014-12-17 20:17:42 +01:00
Jan Vesely	bc18b48924	util: Silence signed-unsigned comparison warnings Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-17 17:15:36 +00:00
Cody Northrop	83e8bb5b1a	i965: Require pixel alignment for GPU copy blit The blitter will start at a pixel's natural alignment. For PBOs, if the provided offset is not aligned, bits will get dropped. This change adds offset alignment check for src and dst, kicking back if the requirements are not met. The change is based on following verbiage from BSPEC: Color pixel sizes supported are 8, 16, and 32 bits per pixel (bpp). All pixels are naturally aligned. Found in the following locations: page 35 of intel-gfx-prm-osrc-hsw-blitter.pdf page 29 of ivb_ihd_os_vol1_part4.pdf page 29 of snb_ihd_os_vol1_part5.pdf This behavior was observed with Steam Big Picture rendering incorrect icon colors. The fix has been tested on Ubuntu and SteamOS on Haswell. Signed-off-by: Cody Northrop <cody@lunarg.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83908 Reviewed-by: Neil Roberts <neil@linux.intel.com>	2014-12-16 16:04:14 -08:00
Mark Janes	fc016bc0f3	i965: remove includes of sampler.h from extern "C" blocks C linkage was removed from functions in program/sampler.cpp. However, some cpp files include program/sampler.h within extern "C" blocks, causing link errors for test_vec4_copy_propagation. Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-16 15:39:55 -08:00
Kenneth Graunke	3eb6258db7	i965/query: Cache whether the batch references the query BO. Chris Wilson noted that repeated calls to CheckQuery() would call drm_intel_bo_references(brw->batch.bo, query->bo) on each invocation, which is expensive. Once we've flushed, we know that future batches won't reference query->bo, so there's no point in asking more than once. This patch adds a brw_query_object::flushed flag, which is a conservative estimate of whether the batch has been flushed. On the first call to CheckQuery() or WaitQuery(), we check if the batch references query->bo. If not, it must have been flushed for some reason (such as being full). We record that it was flushed. If it does reference query->bo, we explicitly flush, and record that we did so. Any subsequent checks will simply see that query->flushed is set, and skip the drm_intel_bo_references() call. Inspired by a patch from Chris Wilson. According to Eero, this does not affect the performance of Witcher 2 on Haswell, but approximately halves the userspace CPU usage. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86969 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-16 15:39:54 -08:00
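The `flushed` flag described in `3eb6258db7` can be sketched as follows. The structures are simplified stand-ins: `struct query`, `struct batch`, and `flush_if_needed()` are hypothetical, and a boolean plus a counter replaces the real (expensive) check of whether the batch references the query BO.

```c
#include <stdbool.h>

/* Simplified query object carrying the conservative "flushed" flag. */
struct query {
    bool flushed;
};

/* Simplified batch; counters exist only to observe the behavior. */
struct batch {
    bool references_query_bo;  /* stand-in for the expensive reference check */
    unsigned reference_checks; /* how often the expensive check ran */
    unsigned flushes;          /* how often we flushed explicitly */
};

/* Called from both the check and wait paths before reading results. */
static void
flush_if_needed(struct batch *b, struct query *q)
{
    if (q->flushed)
        return; /* later calls skip the expensive check entirely */

    b->reference_checks++;
    if (b->references_query_bo) {
        /* The batch still references the query BO: flush explicitly. */
        b->flushes++;
        b->references_query_bo = false;
    }

    /* Either it was already flushed for some other reason, or we just did it. */
    q->flushed = true;
}
```

Repeated polling of the same query then performs the reference check once, however many times the application asks.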
Kenneth Graunke	cb5cfb8361	i965/query: Use brw_bo_map to handle stall warnings. This is less code and also measures the duration of the stall for us. Our old code predates the existence of brw_bo_map(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-16 15:39:54 -08:00
Kenneth Graunke	9c47653d32	i965/query: Remove redundant drm_intel_bo_references call in CheckQuery. CheckQuery calls drm_intel_bo_references to see if the batch references the query BO, and if so, flushes. It then checks if the query BO is busy, and if not, calls gen6_queryobj_get_results(). Stupidly, gen6_queryobj_get_results() immediately did a second redundant drm_intel_bo_references check, even though we know the buffer is not referenced and in fact idle. This patch moves the batch-flush check out of gen6_queryobj_get_results and into WaitQuery() (the other caller). That way, both callers do a single batch-flush check. This should only be a minor improvement, since it would only affect the first CheckQuery call where the result is actually available. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86969 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-16 15:39:53 -08:00
Kenneth Graunke	12c16f4f27	i965/query: Add query->bo == NULL early return in CheckQuery hook. If query->bo == NULL, this is a redundant CheckQuery call, and we should simply return. We didn't do anything anyway - we skipped the batch flushing block, and although we called get_results(), it has an early return and does nothing. Why bother? Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-16 15:39:53 -08:00
Kenneth Graunke	ed8edd7175	i965/query: Set Ready flag in gen6_queryobj_get_results(). q->Ready means that the results are in, and core Mesa is free to return them to the application. gen6_queryobj_get_results() is a natural place to set that flag; doing so means callers don't have to. The older non-hardware-context aware code couldn't do this, because we had to call brw_queryobj_get_results() to gather intermediate results when we ran out of space for snapshots in the query buffer. We only gather complete results in the Gen6+ code, however. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-16 15:39:50 -08:00
Eric Anholt	1f0e106050	vc4: Add support for turning add-based MOVs to muls for pairing. total instructions in shared programs: 43053 -> 40795 (-5.24%) instructions in affected programs: 37996 -> 35738 (-5.94%)	2014-12-16 13:45:41 -08:00
Eric Anholt	f96bd9673e	vc4: Add a helper for changing a field in an instruction.	2014-12-16 13:45:41 -08:00
Eric Anholt	8e18adea61	vc4: Fix the name of qpu_waddr_ignores_ws(). We're deciding about the WS bit, not PM.	2014-12-16 13:45:41 -08:00
Timothy Arceri	54cc3be436	docs: note change in minimum GCC version to 4.1.0 Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-17 08:37:55 +11:00
Timothy Arceri	e801fbb813	util: remove support for GCC older than 4.1.0 Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-By: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-17 08:37:42 +11:00
Timothy Arceri	0936d42d52	mesa: remove support for GCC older than 4.1.0 Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-By: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-17 08:37:35 +11:00
Timothy Arceri	bf37433f8c	gbm: remove support for GCC older than 4.1.0 Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-By: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-17 08:37:29 +11:00
Timothy Arceri	13675a4907	gallium: remove support for GCC older than 4.1.0 Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-By: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-17 08:37:23 +11:00
Timothy Arceri	8d0c641603	egl: remove support for GCC older than 4.1.0 Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-By: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-17 08:37:17 +11:00
Timothy Arceri	78e1246bec	mesa: bump required GCC version to 4.1.0 Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-By: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-17 08:37:11 +11:00
Timothy Arceri	5eec7c8ab8	mesa: remove support for GCC older than 3.3.0 GCC >=3.3 has been required since `9aa3aa7138` Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-By: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-17 08:37:05 +11:00
Matt Turner	2308b3bef2	i965/fs: Add a comment explaining what saturate propagation does.	2014-12-16 11:30:44 -08:00
Eric Anholt	3f6b008168	vc4: Add support for enabling early Z discards. This is the same basic logic from the original Broadcom driver.	2014-12-16 10:37:34 -08:00
Brian Paul	c6e8d2c659	st/mesa: remove extern "C" around #includes in st_glsl_to_tgsi.cpp Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-16 07:52:41 -07:00
Brian Paul	6dac455e6a	program: remove extern "C" usage in sampler.cpp Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-16 07:52:41 -07:00
Brian Paul	6d2f59fd94	program: remove extern "C" around #includes Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-16 07:52:41 -07:00
Brian Paul	241c599cb1	glsl: remove extern "C" around #includes Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-16 07:52:41 -07:00
Brian Paul	44c8957cfe	st/mesa: add extern "C" to st_context.h Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-16 07:52:41 -07:00
Brian Paul	d260348130	st/mesa: add extern "C" to st_program.h Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-16 07:52:41 -07:00
Brian Paul	de42431a9d	main: remove extern C around #includes in ff_fragment_shader.cpp Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-16 07:52:41 -07:00
Brian Paul	7b0aefaf74	mesa: move #include of mtypes.h outside __cplusplus check Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-16 07:52:41 -07:00
Brian Paul	04addcc6a3	program: add #ifndef SAMPLER_H wrapper Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-16 07:52:41 -07:00
Brian Paul	641314eff3	mesa: put extern "C" in src/mesa/program/*h header files Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-16 07:52:41 -07:00
Brian Paul	3ebc135b4e	mesa: put extern "C" in header files Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-16 07:52:41 -07:00
Juha-Pekka Heikkila	4b342fbbb7	mapi: add glapi-test and shared-glapi-test to .gitignore While at it, remove src/mapi/shared-glapi/tests/.gitignore and src/mapi/glapi/tests/.gitignore, since they are useless. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-16 13:51:09 +02:00
Juha-Pekka Heikkila	ebbf0a250a	util: add u_atomic_test to .gitignore Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-16 13:50:59 +02:00
Juha-Pekka Heikkila	5d431ffd61	glx: remove __glXstrdup() I didn't find this being used anywhere Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-16 13:50:53 +02:00
Juha-Pekka Heikkila	096b48b3e1	i965: add test_vf_float_conversions to .gitignore Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-16 13:50:45 +02:00
Juha-Pekka Heikkila	430fbd8ad8	i965: Make validate_reg tables constant Declare local tables constant. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2014-12-16 13:50:38 +02:00
Timothy Arceri	873d7351c5	glsl: remove commented out code MaxGeometryOutputComponents is used as the value for gl_MaxGeometryVaryingComponents Acked-by: Matt Turner <mattst88@gmail.com>	2014-12-16 15:57:30 +11:00
Timothy Arceri	965cfbc85e	i965: remove commented out code Acked-by: Matt Turner <mattst88@gmail.com>	2014-12-16 15:57:25 +11:00
Ilia Mirkin	1402f689f1	nvc0: add missed PIPE_CAP_VERTEXID_NOBASE Commit `ade8b26bf` missed adding this cap to nvc0. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-12-15 23:18:07 -05:00
Roland Scheidegger	fef58979e1	st/mesa: use vertex id lowering according to pipe cap bit. Tested with llvmpipe by setting the cap bit temporarily, seems to work, though no driver requests it for now.	2014-12-16 04:23:00 +01:00
Roland Scheidegger	97dc3d826e	draw: implement support for the VERTEXID_NOBASE and BASEVERTEX semantics. This fixes 4 vertexid related piglit tests with llvmpipe due to switching behavior of vertexid to the one gl expects. (Won't fix non-llvm draw path since we don't get the basevertex currently.)	2014-12-16 04:23:00 +01:00
Roland Scheidegger	ade8b26bf5	gallium: add TGSI_SEMANTIC_VERTEXID_NOBASE and TGSI_SEMANTIC_BASEVERTEX Plus a new PIPE_CAP_VERTEXID_NOBASE query. The idea is that drivers not supporting vertex ids with base vertex offset applied (so, only support d3d10-style vertex ids) will get such a d3d10-style vertex id instead - with the caveat they'll also need to handle the basevertex system value too (this follows what core mesa already does). Additionally, this is also useful for other state trackers (for instance llvmpipe / draw right now implement the d3d10 behavior on purpose, but with different semantics it can just do both). Doesn't do anything yet. And fix up the docs wrt similar values. v2: incorporate feedback from Brian and others, better names, better docs. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-12-16 04:23:00 +01:00
Dave Airlie	3c8ef3a74b	r600g/sb: implement r600 gpr index workaround. (v3.1) r600, rv610 and rv630 all have a bug in their GPR indexing and how the hw inserts access to PV. If the base index for the src is the same as the dst gpr in a previous group, then it will use PV instead of using the indexed gpr correctly. The workaround is to insert a NOP when you detect this. v2: add second part of fix detecting DST rel writes followed by same src base index reads. v3: forget adding stuff to structs, just iterate over the previous node group again, makes it more obvious. v3.1: drop local_nop. Fixes ~200 piglit regressions on rv635 since SB was introduced. Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-12-16 12:44:45 +10:00
Vadim Girlin	de0fd375f6	r600g/sb: fix issues with loops created for switch Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-12-16 12:43:31 +10:00
Dave Airlie	34e512d9ea	Revert "r600g/sb: fix issues cause by GLSL switching to loops for switch" This reverts commit `7b0067d23a`. Vadim's patch fixes this a lot better.	2014-12-16 12:43:23 +10:00
Eric Anholt	1b486b52ac	vc4: Add support for 32-bit signed norm/scaled vertex attrs. 32-bit unsigned would require some adjustments to handle values >= 0x80000000.	2014-12-15 14:33:05 -08:00
Eric Anholt	48a2154520	vc4: Add support for 16-bit signed/unsigned norm/scaled vertex attrs.	2014-12-15 14:33:01 -08:00
Eric Anholt	9ca32d6c19	vc4: Rename the 16-bit unpack #define. It's only an f16 conversion if you're doing a float operation, otherwise it's 16 bit signed to 32-bit signed.	2014-12-15 14:33:01 -08:00
Eric Anholt	2142fd1f6f	vc4: Add support for 8-bit unnormalized vertex attrs.	2014-12-15 14:33:00 -08:00
Eric Anholt	214a169b32	vc4: Refactor vertex attribute conversions a bit. There was just way too much indentation.	2014-12-15 14:28:23 -08:00
Eric Anholt	1fa1ee56a0	vc4: Fix use of r3 as a temp in 8-bit unpacking. We're actually allocating out of r3 now, and I missed it because I'd typed this one as qpu_rn(3) instead of qpu_r3().	2014-12-15 14:28:23 -08:00
Eric Anholt	8e678de761	vc4: Rename UNPACK_8* to UNPACK_8*_F. There is an equivalent unpack function without conversion to float if you use an integer operation instead.	2014-12-15 14:28:23 -08:00
Eric Anholt	ade7704685	vc4: Add support for UMAD.	2014-12-15 14:28:23 -08:00
Eric Anholt	440075fb50	vc4: 0-initialize the screen again. I typoed this when rebasing the memory leak fixes.	2014-12-15 14:28:22 -08:00
Maxence Le Doré	19e05d6898	glsl: Add gl_MaxViewports to available builtin constants It seems to have been forgotten during viewports array implementation time. Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-15 12:20:00 -08:00
Andres Gomez	8517e665bc	i965/brw_reg: struct constructor now needs explicit negate and abs values. We were assuming, when constructing a new brw_reg struct, that the negate and abs register modifiers would not be present by default in the new register. Now, we force explicitly setting these values when constructing a new register. This will avoid problems like forgetting to properly set them when we are using a previous register to generate this new register, as it was happening in the dFdx and dFdy generation functions. Fixes piglit test shaders/glsl-deriv-varyings Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82991 Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-15 11:40:22 -08:00
Eric Anholt	e108442bb1	vc4: Fix leaks of the compiled shaders' keys.	2014-12-14 23:12:11 -08:00
Eric Anholt	667719fcb2	vc4: Fix leaks of the CL contents.	2014-12-14 23:12:11 -08:00
Eric Anholt	1f1ca8b2ea	vc4: Fix leak of vc4_bos stashed in the context.	2014-12-14 23:12:11 -08:00
Eric Anholt	80ed075e60	vc4: Fix leak of the compiled shader programs in the cache.	2014-12-14 23:12:11 -08:00
Eric Anholt	4da9e3d805	vc4: Fix leak of a copy of the scheduled QPU instructions. They're copied into a vc4_bo after compiling is done.	2014-12-14 23:12:11 -08:00
Eric Anholt	5c9b8eace2	vc4: Switch to using the util/ hash table. No performance difference on a microbenchmark with norast that should hit it enough to have mattered, n=220.	2014-12-14 23:12:11 -08:00
Eric Anholt	c84306fdc2	vc4: Fix leak of simulator memory on screen cleanup.	2014-12-14 23:11:59 -08:00
Eric Anholt	f519c3bff1	vc4: Fix a leak of the simulator's exec BO's actual vc4_bo.	2014-12-14 23:10:35 -08:00
Eric Anholt	6c3115af85	hash_table: Fix compiler warnings from the renaming. Not sure how we both missed this. None of the callers were using the return value, though.	2014-12-14 20:22:07 -08:00
Jason Ekstrand	94303a0750	util/hash_table: Rework the API to know about hashing Previously, the hash_table API required the user to do all of the hashing of keys as it passed them in. Since the hashing function is intrinsically tied to the comparison function, it makes sense for the hash table to know about it. Also, it makes for a somewhat clumsy API as the user is constantly calling hashing functions many of which have long names. This is especially bad when the standard call looks something like _mesa_hash_table_insert(ht, _mesa_pointer_hash(key), key, data); In the above case, there is no reason why the hash table shouldn't do the hashing for you. We leave the option for you to do your own hashing if it's more efficient, but it's no longer needed. Also, if you do do your own hashing, the hash table will assert that your hash matches what it expects out of the hashing function. This should make it harder to mess up your hashing. v2: change to call the old entrypoint "pre_hashed" rather than "with_hash", like cworth's equivalent change upstream (change by anholt, acked-in-general by Jason). Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-12-14 19:32:53 -08:00
Mario Kleiner	0d7f4c8658	glx/dri3: Don't fail on glXSwapBuffersMscOML(dpy, window, 0, 0, 0) (v2) glXSwapBuffersMscOML() with target_msc=divisor=remainder=0 gets translated into target_msc=divisor=0 but remainder=1 by the mesa api. This is done for server DRI2 where there needs to be a way to tell the server-side DRI2ScheduleSwap implementation if a call to glXSwapBuffers() or glXSwapBuffersMscOML(dpy,window,0,0,0) was done. remainder = 1 was (ab)used as a flag to tell the server to select proper semantic. The DRI3/Present backend ignored this signalling, treated any target_msc=0 as glXSwapBuffers() request, and called xcb_present_pixmap with invalid divisor=0, remainder=1 combo. The present extension responded kindly to this with a BadValue error and dropped the request, but mesa's DRI3/Present backend doesn't check for error codes. From there on stuff went downhill quickly for the calling OpenGL client... This patch fixes the problem. v2: Change comments to be more clear, with reference to relevant spec, as suggested by Eric Anholt. Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-12-14 15:09:49 +00:00
Mario Kleiner	455d3036fa	glx/dri3: Request non-vsynced Present for swapinterval zero. (v3) Restores proper immediate tearing swap behaviour for OpenGL bufferswap under DRI3/Present. Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> v2: Add Frank Binns' Signed-off-by for his original earlier patch from April 2014, which is identical to this one, and Chris Wilson's Reviewed-by tag from May 2014 for that patch, ergo also for this one. v3: Incorporate comment about triple buffering as suggested by Axel Davy, and reference to relevant spec provided by Eric Anholt. Signed-off-by: Frank Binns <frank.binns@imgtec.com> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-12-14 15:09:49 +00:00
Mario Kleiner	ad8b0e8bf6	glx/dri3: Track separate (ust, msc) for PresentPixmap vs. PresentNotifyMsc (v2) Prevent calls to glXGetSyncValuesOML() and glXWaitForMscOML() from overwriting the (ust,msc) values of the last successful swapbuffers call (PresentPixmapCompleteNotify event), as glXWaitForSbcOML() relies on those values corresponding to the most recent completed swap, not to whatever was last returned from the server. Problematic call sequence without this patch would have been, e.g., glXSwapBuffers() ... wait ... swap completes -> PresentPixmapComplete event -> (ust,msc) updated to reflect swap completion time and count. ... wait for at least 1 video refresh cycle/vblank increment. glXGetSyncValuesOML() -> PresentNotifyMsc event overwrites (ust,msc) of swap completion with (ust,msc) of most recent vblank glXWaitForSbcOML() -> Returns sbc of last completed swap but (ust,msc) of last completed vblank, not of last completed swap. -> Client is confused. Do this by tracking a separate set of (ust, msc) for the dri3_wait_for_msc() call than for the dri3_wait_for_sbc() call. This makes the glXWaitForSbcOML() call robust again and restores consistent behaviour with the DRI2 implementation. Fixes applications originally written and tested against DRI2 which also rely on this not regressing under DRI3/Present, e.g., Neuro-Science software like Psychtoolbox-3. This patch fixes the problem. v2: Rename vblank_msc/ust to notify_msc/ust as suggested by Axel Davy for better clarity. Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2014-12-14 15:09:49 +00:00
Mario Kleiner	8cab54de16	glx/dri3: Fix glXWaitForSbcOML() to handle targetSBC==0 correctly. (v2) targetSBC == 0 is a special case, which asks the function to block until all pending OpenGL bufferswap requests have completed. Currently the function just falls through for targetSBC == 0, returning bogus results. This breaks applications originally written and tested against DRI2 which also rely on this not regressing under DRI3/Present, e.g., Neuro-Science software like Psychtoolbox-3. This patch fixes the problem. v2: Simplify as suggested by Axel Davy. Add comments proposed by Eric Anholt. Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-12-14 15:09:49 +00:00
Emil Velikov	ac0940224b	docs: Add 10.4 sha256 sums, news item and link release notes Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `af0c82099b`) Conflicts: docs/index.html docs/relnotes.html	2014-12-14 14:10:34 +00:00
Emil Velikov	1faac11778	docs: Update 10.4.0 release notes Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `5fe79b0b12`)	2014-12-14 14:10:34 +00:00
Rob Clark	0ebd623f60	freedreno/a4xx: mipmaps Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-13 15:09:37 -05:00
Rob Clark	cf80694df5	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-13 15:09:37 -05:00
Rob Clark	f24e910da4	freedreno: add is_a3xx()/is_a4xx() helpers A bunch of open-coded 'gpu_id > 300's seems like it will eventually cause problems with future generations. There were already a few minor problems with caps for features that still need additional work on a4xx. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-13 15:09:37 -05:00
Rob Clark	7474de2235	freedreno: helper to calc layer/level offset Rather than duplicating this everywhere. Especially as on a4xx the layout of layers and levels differs based on texture type. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-13 15:09:37 -05:00
Kenneth Graunke	23caba862a	i965/vec4: Drop writemasks on scratch reads. This code is complete nonsense and has apparently existed since I first implemented register spilling in the VS two years ago. Scratch reads are SEND messages, which ignore the destination writemask. The comment about "data that may not have been written to scratch" is also confusing - we always spill whole 4x2 registers, so such data simply does not exist. We can safely ignore the writemask. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-12 23:21:27 -08:00
Timothy Arceri	a3218e65d1	mesa: remove long dead 3Dnow optimisation This code has been turned off for the last decade. Considering 3Dnow is obsolete it seems the bug will never be fixed so just remove it. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-13 12:15:25 +11:00
Brian Paul	64bd1ac2b1	ir_to_mesa: remove unused 'target' variable Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-12 16:45:33 -07:00
Brian Paul	7dccc1a57a	util: add missing closing brace for __cplusplus	2014-12-12 16:45:33 -07:00
Brian Paul	0dcc7de205	mesa: remove obsolete comment on _mesa_ClearColor()	2014-12-12 16:45:33 -07:00
Brian Paul	caa13c59ef	mesa: whitespace fixes, 80-column wrapping in texobj.c	2014-12-12 16:45:33 -07:00
Brian Paul	e725dc0a74	mesa: whitespace, line wrap fixes in clear.c	2014-12-12 16:45:33 -07:00
Matt Turner	3f3aeb5333	mapi: Move rules for generating glapi_mapi_tmp.h out of the conditional. Allows distcheck to succeed, regardless of how Mesa has been configured.	2014-12-12 12:11:50 -08:00
Matt Turner	5ea4b25fba	glsl: Add dist-hook to delete glcpp test *.out files.	2014-12-12 12:11:50 -08:00
Matt Turner	a29ae0b3dd	glcpp: Make tests write .out files to builddir.	2014-12-12 12:11:50 -08:00
Matt Turner	75c7a7114f	gallium: Remove Android files from distribution. Android builds Mesa from git, so they don't need to be in the tarball.	2014-12-12 12:11:50 -08:00
Matt Turner	00eadb77e6	osmesa: Add osmesa.def to distribution.	2014-12-12 12:11:50 -08:00
Matt Turner	92f89f0c0c	x86-64: Remove calling_convention.txt. It just details the x86-64 calling convention. No need for this in Mesa.	2014-12-12 12:11:50 -08:00
Matt Turner	9e191e8829	drivers/x11: Add headers to distribution.	2014-12-12 12:11:50 -08:00
Matt Turner	dd6a43f07c	drivers/windows: Add to distribution.	2014-12-12 12:11:50 -08:00
Matt Turner	d51150a98a	mesa: Add autogen.sh to distribution.	2014-12-12 12:11:50 -08:00
Matt Turner	4401e2b219	mapi: Add ABI-check tests to distribution.	2014-12-12 12:11:50 -08:00
Matt Turner	43ac31dff0	mesa: Add notes/readme files to distribution.	2014-12-12 12:11:50 -08:00
Matt Turner	a208e9b520	util: Wire up u_atomic_test.	2014-12-12 12:11:50 -08:00
Matt Turner	952b324b23	mesa: Add scons files to distribution.	2014-12-12 12:11:50 -08:00
Matt Turner	f6502aaa58	haiku: Add files to distribution.	2014-12-12 12:11:50 -08:00
Matt Turner	fe2c72e6ec	egl: Add files to distribution.	2014-12-12 12:11:49 -08:00
Matt Turner	feb741dc7c	egl+gbm: Add symbols-check tests to distribution.	2014-12-12 12:11:49 -08:00
Matt Turner	0ac98e7296	docs: Add to distribution.	2014-12-12 12:11:49 -08:00
Matt Turner	55983a1eaa	glapi/gen: Add gl_and_glX_API.xml to distribution.	2014-12-12 12:11:49 -08:00
Matt Turner	7a26c82489	glx/apple: Add headers to distribution.	2014-12-12 12:11:49 -08:00
Matt Turner	a267212a4d	mesa: Add a dist hook to remove .gitignore files from distribution.	2014-12-12 12:11:49 -08:00
Matt Turner	b662d5282f	mesa: Add clean-local rule to remove .lib links.	2014-12-12 12:11:49 -08:00
Matt Turner	8e2577f2a9	glsl: Add clean-local rule to delete glcpp test output.	2014-12-12 12:11:49 -08:00
Matt Turner	e643fd3b4a	util: List hash_table tests as check_PROGRAMS. EXTRA_PROGRAMS is not what you want for binaries listed in TEST.	2014-12-12 12:11:49 -08:00
Matt Turner	216248730a	xmlpool: Add $(MOS) and options.h to CLEANFILES.	2014-12-12 12:11:49 -08:00
Matt Turner	3b7bcb5d04	dri: Add uninstall hooks to handle megadriver hardlinks.	2014-12-12 12:11:49 -08:00
Matt Turner	65155c208d	targets/dri: Remove unnecessary variables in install-data-hook.	2014-12-12 12:11:49 -08:00
Matt Turner	d27379d016	glx/tests: Add headers to distribution.	2014-12-12 12:11:49 -08:00
Matt Turner	3d357d030f	gallium/targets: Add *.sym files to distribution. And add d3dadapter9's extra dependency.	2014-12-12 12:11:49 -08:00
Matt Turner	00ab151ad1	egl/dri2: Add headers to distribution.	2014-12-12 12:11:49 -08:00
Matt Turner	7a08a1e61b	egl: Drop unnecessary Makefile.am.	2014-12-12 12:11:48 -08:00
Matt Turner	d1c1d6d9b6	glx: Add headers to distribution.	2014-12-12 12:11:48 -08:00
Matt Turner	82b7da3de7	glx: Alphabetize source lists. And remove absurd tab-space-space indentation.	2014-12-12 12:11:48 -08:00
Matt Turner	4f90f341a7	swrast: Add headers to distribution.	2014-12-12 12:11:48 -08:00
Matt Turner	c9b5c4d407	r200: Add headers to distribution.	2014-12-12 12:11:48 -08:00
Matt Turner	7162219450	r200: Alphabetize source list.	2014-12-12 12:11:48 -08:00
Matt Turner	5fd472507b	radeon: Add headers to distribution.	2014-12-12 12:11:48 -08:00
Matt Turner	b53fbe2552	radeon: Alphabetize source list.	2014-12-12 12:11:48 -08:00
Matt Turner	10259d8614	nouveau: Add headers to distribution.	2014-12-12 12:11:48 -08:00
Matt Turner	6b0207552f	nouveau: Alphabetize source list.	2014-12-12 12:11:48 -08:00
Matt Turner	e81ec49b56	i965: Add headers to distribution.	2014-12-12 12:11:48 -08:00
Matt Turner	976b3f4cfa	i965: Alphabetize source list.	2014-12-12 12:11:48 -08:00
Matt Turner	d8e28537e3	i915: Add headers to distribution.	2014-12-12 12:11:48 -08:00
Matt Turner	0698f5de4a	i915: Alphabetize source list.	2014-12-12 12:11:48 -08:00
Matt Turner	9f565f5f8a	loader: Add headers to distribution.	2014-12-12 12:11:47 -08:00
Matt Turner	929bcfb756	program: Add lex and yacc sources to distribution. Since we have manual build rules and list the .c/.cpp files in SOURCES, we need to explicitly list these for distribution.	2014-12-12 12:11:47 -08:00
Matt Turner	e3ea939988	glsl: Add parser headers to distribution.	2014-12-12 12:11:47 -08:00
Matt Turner	4af1905e73	drivers/common: Add headers to distribution.	2014-12-12 12:11:47 -08:00
Matt Turner	942e646941	vbo: Add headers to distribution.	2014-12-12 12:11:47 -08:00
Matt Turner	b8205d4db7	vbo: Alphabetize VBO_FILES.	2014-12-12 12:11:47 -08:00
Matt Turner	009bf242d3	tnl: Add headers to distribution.	2014-12-12 12:11:47 -08:00
Matt Turner	e15cd6dd9f	tnl: Alphabetize TNL_FILES.	2014-12-12 12:11:47 -08:00
Matt Turner	d1127e29dd	tnl_dd: Add headers to distribution.	2014-12-12 12:11:47 -08:00
Matt Turner	d36113e000	tnl_dd: Remove dead t_dd_vb.c. Dead since `e4344161` ("dri: Remove all DRI1 drivers").	2014-12-12 12:11:47 -08:00
Matt Turner	e88ed739f0	swrast: Add headers to distribution.	2014-12-12 12:11:47 -08:00
Matt Turner	58a3ec427f	state_trackers: Add headers to distribution.	2014-12-12 12:11:47 -08:00
Matt Turner	4194f9c1ad	x86: Add headers to distribution.	2014-12-12 12:11:47 -08:00
Matt Turner	0557d54847	x86-64: Add headers to distribution.	2014-12-12 12:11:47 -08:00
Matt Turner	d5fba58f85	sparc: Add headers to distribution.	2014-12-12 12:11:47 -08:00
Matt Turner	1abf4e2f45	math: Add headers to distribution.	2014-12-12 12:11:47 -08:00
Matt Turner	152e967063	program: Add headers to distribution.	2014-12-12 12:11:46 -08:00
Matt Turner	e475ad70c8	program: Alphabetize PROGRAM_FILES.	2014-12-12 12:11:46 -08:00
Matt Turner	67abb4910a	mesa: Remove moved texcompress_rgtc_tmp.h from source list. Missed in commit `ebcb2ee9`.	2014-12-12 12:11:46 -08:00
Matt Turner	9a742eef53	mesa: Add headers to distribution.	2014-12-12 12:11:46 -08:00
Matt Turner	19999c3114	mesa: Alphabetize MAIN_FILES.	2014-12-12 12:11:46 -08:00
Matt Turner	3125cd1f6b	glsl: Add lex and yacc sources to distribution. Since we have manual build rules and list the .c/.cpp files in SOURCES, we need to explicitly list these for distribution.	2014-12-12 12:11:46 -08:00
Matt Turner	55afbcc661	include: Add remaining headers to distribution.	2014-12-12 12:11:46 -08:00
Matt Turner	2a5b012171	configure.ac: Ship .xz compressed tarballs, in addition to .gz. 11 MiB -> 6.5 MiB.	2014-12-12 12:11:46 -08:00
Matt Turner	dd439e494e	configure.ac: Use tar-ustar archive format. The default tar-v7 archive format doesn't support filenames longer than 99 characters, of which we have a few (in src/glsl/tests/lower_jumps/).	2014-12-12 12:11:46 -08:00
Matt Turner	8280358cf1	gtest: Add headers to distribution.	2014-12-12 12:11:46 -08:00
Matt Turner	838ac978f4	glsl: Add headers to distribution.	2014-12-12 12:11:46 -08:00
Matt Turner	69386ddfa6	glsl: Distribute tests/, TODO, and README	2014-12-12 12:11:46 -08:00
Matt Turner	b245009173	mesa: Add python scripts to distribution.	2014-12-12 12:11:46 -08:00
Matt Turner	cceeea0c4c	dri/common: Add files to distribution.	2014-12-12 12:11:46 -08:00
Matt Turner	748d0b04a0	vgapi: Add vgapi.csv to distribution.	2014-12-12 12:11:46 -08:00
Matt Turner	72cf4baeb3	mapi: Add mapi_abi.py to EXTRA_DIST	2014-12-12 12:11:45 -08:00
Matt Turner	f6357a993b	dri/common: Drop unused mmio.h. Unused since commit `7550a24f`.	2014-12-12 12:11:45 -08:00
Matt Turner	547faf1dec	glapi/gen: Add KHR_context_flush_control.xml to distribution.	2014-12-12 12:11:45 -08:00
Matt Turner	2de8da637e	configure.ac: Drop generating egl-static and gbm Makefiles.	2014-12-12 12:11:45 -08:00
Matt Turner	1cd2b9177e	util: Add headers and python scripts for distribution.	2014-12-12 12:11:45 -08:00
Matt Turner	7808344271	glapi: Make mapi/glapi/gen before mapi to avoid distcheck problem.	2014-12-12 12:11:45 -08:00
Matt Turner	2eef9c0b16	r200: Avoid out of bounds array access. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-12 12:11:31 -08:00
Eric Anholt	e5eaf8ec60	vc4: Fix referencing of sync objects. While the pipe_reference_* helpers set the pointer, a bare pipe_reference doesn't. Fixes 5 ARB_sync tests.	2014-12-12 09:30:35 -08:00
José Fonseca	e75e677d28	util: Unbreak usage of assert()/debug_assert() inside expressions. `f0ba7d897d` made debug_assert()/assert() unsafe for expressions, but it only became an issue now that u_atomic.h has started to rely on them for Windows. This fixes non-debug builds with MSVC. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-12-12 14:19:53 +00:00
Eric Anholt	92b85fba89	vc4: Consider FS backface color loads as color inputs as well. This fixes flatshading of backface color in 4 of the piglit interpolation tests.	2014-12-11 23:52:34 -08:00
Eric Anholt	5b3c0d999c	vc4: Drop redundant index size setting. This is already done at set_index_buffer() time.	2014-12-11 23:52:34 -08:00
Eric Anholt	d78eb57528	vc4: Don't throw out the index offset in the shadow index buffer path. When we upload shadow indices at draw time, we need the source offset. Fixes the piglit draw-elements test.	2014-12-11 23:52:25 -08:00
Eric Anholt	0ae5e002e0	vc4: Fix triangle-guardband-viewport piglit test. The original Broadcom driver also did this with the viewport.	2014-12-11 21:31:27 -08:00
Eric Anholt	87db578268	vc4: Fix a memory leak in setting up QPU instructions for scheduling.	2014-12-11 21:31:27 -08:00
Ben Widawsky	5069e4bd40	i965/gen8+: Remove false perf debug message about MOCS We support MOCS on both gen8 and gen9, so the message seems meaningless. Remove it to avoid confusion. Trivial. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-11 18:59:38 -08:00
Ben Widawsky	9cd4f90242	i965/gen8: Check correct number of blitter dwords The odds of having this patch make a difference on Gen8+ are probably very low. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-but-not-tested-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-12-11 18:59:36 -08:00
Alexander von Gluck IV	ad2ffd3bc6	mesa/drivers: Add missing mesautil lib to Haiku swrast * Resolves missing util_format_linear_to_srgb_8unorm_table symbol.	2014-12-11 03:34:15 +00:00
Roland Scheidegger	ff96537759	draw: simplify prim id insertion in prim assembler Because all topologies are reduced to basic primitives (i.e. no strips, fans) and the vertices involved are all copied, there's no need for any elaborate decisions where to insert the prim id. The logic employed was correct for first provoking vertex, but didn't account at all for the last provoking vertex case. And since we now will get the right constant value even if the primitive type is later changed (for unfilled etc.) this is no longer required to pass certain tests (which were checking for prim_id == some const interpolated value so passing because both were wrong in the end). This is a bit overkill (3x4 values assigned in total even though it's really one scalar per prim...) but the code is now much easier and I don't need to add more cases for last provoking vertex. This fixes piglit primitive-id-no-gs-strip test. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-10 22:11:16 +01:00
Roland Scheidegger	db3dfcfe90	draw: fix another decompose bug affecting constant interpolated attributes Previously the first provoking vertex convention would only be used if flatshading were enabled. No matter how I look at it that cannot be possibly correct. Maybe the code getting used was somewhat simpler that way at a time where there weren't constant interpolated attributes, only flatshading... (Note that all other places including the decomposition macros already do the same.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-10 22:11:16 +01:00
Roland Scheidegger	2b23149206	draw: fix flatshade stage for constant interpolated values This stage only worked for traditional old-school flatshading; it ignored constant interpolated values and only handled colors. The code probably predates the use of constant interpolated values in gallium. So fix this - the clip stage apparently did this a long time ago already. Unfortunately this also means the stage needs to be invoked when flatshading isn't enabled but some other prim changing stages are - for instance with fill mode line each of the 3 lines in a tri should get the same attribute value from the leading vertex in the original tri if interpolation is constant, which did not happen before. Due to that, the stage is now run in more cases, even unnecessary ones. Could in theory skip it completely if there aren't any constant interpolated attributes (and rast->flatshade isn't set), but not sure it's worth bothering, as it looks kinda complicated getting this information in advance. No piglit change (doesn't really cover this directly). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-10 22:11:16 +01:00
Roland Scheidegger	fb61f75bf6	draw: copy over prim id header in flatshade stage when emitting lines Just like we do for tris (det shouldn't matter at this point, however can have flags for things like line stipple reset). No piglit change, it would fail line stippling tests if the flatshade stage were run, which will happen with the next commit. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-10 22:11:16 +01:00
Roland Scheidegger	fe7e6b248f	gallium/docs: clarify fragment shader position input w component. The previous language was a bit misleading, since it sounded like w was interpolated then the reciprocal calculated which isn't what should be happening. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-10 22:11:16 +01:00
Marek Olšák	ac319d94d3	docs/relnotes: document the removal of GALLIUM_MSAA Cc: 10.2.10.3 10.4 <mesa-stable@lists.freedesktop.org>	2014-12-10 21:59:37 +01:00
Marek Olšák	15186607bb	radeonsi: take into account NULL colorbuffers when computing CB_TARGET_MASK Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	3291eedfe6	radeonsi: only emit line stippling and provoking vertex state when it changes Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	acda2e113a	radeonsi: fix SPI state dependency on sprite_coord_enable Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	7991d602f3	radeonsi: fix line stippling and provoking vertex state for GS primitives I'm not sure if GS hw outputs line lists or line strips. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	834bee42ed	radeonsi: emit DRAW_PREAMBLE only if it changes Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	c466093512	radeonsi: remove setting of VGT_DISPATCH_DRAW_INDEX It's used only if VGT_SHADER_STAGES_EN.DISPATCH_DRAW_EN is 1, which we don't set. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	6fde194910	radeonsi: emit GS_OUT_PRIM_TYPE only if it changes Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	34350131de	radeonsi: emit primitive restart only if it changes Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	3382036946	radeonsi: emit base vertex and start instance only if they change v2: added a helper function for invalidation of the sh constants Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	b472709090	radeonsi: emit clip registers only if VS, GS, or rasterizer is changed Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	161534737c	radeonsi: get info about VS outputs from tgsi_shader_info Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	20e570d115	radeonsi: move all shader-related functions to a new file si_state_shaders.c This huge amount of code deserves its own file. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	ca7f1cf8b5	radeonsi: generate derived and draw-related registers directly in the CS The big function is split into 3 smaller functions. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	508c1ca6af	radeonsi: si_conv_pipe_prim shouldn't fail An assertion should suffice. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	c6546cfb03	radeonsi: remove useless variable si_context::pm4_dirty_cdwords Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	e90bae4376	radeonsi: remove unused draw packet functions Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	384213cb51	radeonsi: emit draw packets directly into the CS Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	feedd8f700	radeonsi: add emit util functions for SH registers Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	2b76bb3ba7	tgsi: add tgsi_shader_info::writes_clipvertex Reviewed-by: Brian Paul <brianp@vmware.com>	2014-12-10 21:59:37 +01:00
Marek Olšák	8115797801	tgsi: add clip and cull distance writemasks into tgsi_shader_info Reviewed-by: Brian Paul <brianp@vmware.com>	2014-12-10 21:59:36 +01:00
Marek Olšák	946eb08e6a	tgsi: add tgsi_shader_info::writes_psize Reviewed-by: Brian Paul <brianp@vmware.com>	2014-12-10 21:59:36 +01:00
Marek Olšák	0a60ebe30c	cso: put cso_release_all into cso_destroy_context Reviewed-by: Brian Paul <brianp@vmware.com>	2014-12-10 21:59:36 +01:00
Kristian Høgsberg	ee5fb8d1ba	i965: Generate vs code using scalar backend for BDW+ With everything in place, we can now use the scalar backend compiler for vertex shaders on BDW+. We make scalar vertex shaders the default on BDW+ but add a new vec4vs debug option to force the vec4 backend. No piglit regressions. Performance impact is minimal, I see a ~1.5 improvement on the T-Rex GLBenchmark case, but in general it's in the noise. Some of our internal synthetic, vs bounded benchmarks show great improvement, 20%-40% in some cases, but real-world cases are mostly unaffected. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-10 12:29:27 -08:00
Kristian Høgsberg	7ff457b930	i965: Clean up fs_visitor::run and rename to run_fs Now that fs_visitor::run is back to being only fragment shader compilation, we can clean up a few stage == MESA_SHADER_FRAGMENT conditions and rename it to run_fs. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-10 12:29:23 -08:00
Kristian Høgsberg	8b6a797d74	i965: Add fs_visitor::run_vs() to generate scalar vertex shader code This patch uses the previous refactoring to add a new run_vs() method that generates vertex shader code using the scalar visitor and optimizer. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-10 12:29:19 -08:00
Kristian Høgsberg	bf23079379	i965: Rename brw_vec4_prog_data/key to brw_vue_prog_data/key These structs aren't vec4 specific, they are shared by shader stages operating on Vertex URB Entries (VUEs). VUEs are the data structures in the URB that hold vertex data between the pipeline geometry stages. Using vue in the name instead of vec4 makes a lot more sense, especially when we add scalar vertex shader support. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-10 12:29:16 -08:00
Kristian Høgsberg	3d10f0a98c	i965: Prepare for using the ATTR register file in the fs backend The scalar vertex shader will use the ATTR register file for vertex attributes. This patch adds support for the ATTR file to fs_visitor. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-10 12:29:11 -08:00
Kristian Høgsberg	df0966fb1a	i965: Consolidate code to get struct brw_sampler_prog_key_data This chunk of code is repeated in a few places, and we're going to add a MESA_SHADER_VERTEX case to it soon. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-10 12:29:08 -08:00
Kristian Høgsberg	c5b3878714	i965: Add new SIMD8 VS prog data flag This flag signals that we have a SIMD8 VS shader so we can set up the corresponding state accordingly. This boils down to setting the BDW+ SIMD8 enable bit in 3DSTATE_VS and making UBO and pull constant buffers use dword pitch. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-10 12:29:04 -08:00
Kristian Høgsberg	d9e29f5d88	i965: Add SIMD8 URB write low-level IR instruction This is all we need from the generator for SIMD8 vertex shaders. This opcode is just the send instruction, all the hard work will happen in the visitor using LOAD_PAYLOAD. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-10 12:29:00 -08:00
Kristian Høgsberg	686ef091a4	i965: Remove shader program argument and member from fs_generator Now that the caller passes in the shader debug name, we don't need this anymore. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-10 12:28:55 -08:00
Kristian Høgsberg	9a1af7b318	i965: Set shader name for generator from call site fs_generator no longer knows what stage it's generating code for, so we have to set the debug name of the shader from the call site. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-10 12:28:51 -08:00
Kristian Høgsberg	7bb9d33b8d	i965: Generalize fs_generator further This removes all stage specific data from the generator, and lets us create a generator for any stage. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-10 12:28:48 -08:00
Kristian Høgsberg	840e8fc920	i965: Don't copy propagate constants from sources with saturate We don't propagate the saturate bit and some instructions can't saturate at all. If the source has saturate set, just skip propagation. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-10 12:28:32 -08:00
Matt Turner	47aaabda47	i965: Replace 'noann' debug flag with 'ann'. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-12-10 10:19:16 -08:00
Matt Turner	1a2de7dce8	i965: Disable unlit-centroid workaround on Gen < 6. Back to the original commit (`8313f444`) adding the workaround, we were enabling it on gens <= 7, even though gens <= 5 can't do multisampling. I cannot find documentation that says that Sandybridge needs this workaround but in practice disabling it causes these piglit tests to fail: EXT_framebuffer_multisample/interpolation {2,4} centroid-deriv{,-disabled} On Ironlake: total instructions in shared programs: 4358478 -> 4349671 (-0.20%) instructions in affected programs: 117680 -> 108873 (-7.48%) A bunch of shaders in TF2, Portal 2, and L4D2 are cut by 25~30%. Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-12-10 10:18:39 -08:00
Adrien Destugues	13e42fc025	hgl: traverse add-on entries * Allow using symlinks to add-ons when developing.	2014-12-10 14:01:01 +00:00
Alexander von Gluck IV	03e237e9f2	gallium/target: Haiku softpipe * Use print macro to fix warning on 64-bit systems	2014-12-10 14:01:01 +00:00
Alexander von Gluck IV	63d3f621e3	gallium/aux: Avoid redefining MAX * Can be redefined on some platforms through u_debug.h	2014-12-10 14:01:00 +00:00
Jan Vesely	3a18fc6058	clover: Use switch when creating kernel arguments. This way we get a warning if an enum value is not handled. v2: codestyle Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-12-10 15:48:20 +02:00
Dave Airlie	7f21cf7198	r600g: only init GS_VERT_ITEMSIZE on r600 On evergreen there are 4 regs, on r600/700 there is only one. Don't initialise regs and trash someone else's state. Not sure this fixes anything, but hey one less stupid. Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com> Cc: "10.3 10.4" mesa-stable@lists.freedesktop.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-12-10 16:34:40 +10:00
Eric Anholt	8812dc503e	vc4: Do QPU scheduling across uniform loads. This means another pass of reordering the uniform data store, but it lets us pair up a lot more instructions. total instructions in shared programs: 44639 -> 43176 (-3.28%) instructions in affected programs: 36938 -> 35475 (-3.96%)	2014-12-09 21:19:11 -08:00
Eric Anholt	c5b544403f	vc4: Populate the delay field better, and schedule high delay first. This is a standard scheduling heuristic, and clearly helps. total instructions in shared programs: 46418 -> 44467 (-4.20%) instructions in affected programs: 42531 -> 40580 (-4.59%)	2014-12-09 18:32:36 -08:00
Eric Anholt	45a8923771	vc4: Skip raddr dependencies for 32-bit immediate loads. These don't have raddr fields.	2014-12-09 18:32:36 -08:00
Eric Anholt	f431b4f110	vc4: Mark VPM read setup as impacting VPM reads, not writes. Fixes assertion failures if we adjust scheduling priorities to emphasize VPM reads more.	2014-12-09 18:32:36 -08:00
Eric Anholt	cff8c96a0d	vc4: Refuse to merge instructions involving 32-bit immediate loads. An immediate load overwrites the mul and add operations, so you can't merge with them.	2014-12-09 18:32:36 -08:00
Aaron Watry	25db8729dc	clover: Fix build after llvm r223802 Signed-off-by: Aaron Watry <awatry at gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-12-09 19:28:50 -06:00
Rob Clark	69d23809d0	freedreno/a4xx: frag-coord / face fixes Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-09 18:03:55 -05:00
Rob Clark	3dbcd25022	freedreno/a4xx: fix rendering to layer != 0 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-09 18:03:40 -05:00
Rob Clark	6a5ba23fa6	freedreno/a4xx: temp hack for FLAT varyings Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-09 18:03:09 -05:00
Rob Clark	eb6fd3b8eb	freedreno/ir3: lower TXP as needed On a3xx, lower TXP for 3D textures, on a4xx lower all TXP. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-09 18:03:01 -05:00
Rob Clark	5b38a1740b	freedreno/a4xx: XA gpu hang at startup Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-09 18:02:45 -05:00
Rob Clark	1e3a732603	freedreno/a4xx: texture fixes Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-09 18:01:49 -05:00
Rob Clark	5d7c9c9160	freedreno: cleanup slice alignment/setup Collapse things back into a setup_slices() which takes the desired alignment as a param. This gets things ready for a4xx which has some slightly different requirements. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-09 18:01:21 -05:00
Rob Clark	8ecbcbf0aa	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-09 18:01:10 -05:00
Rob Clark	219440ddeb	tgsi/lowering: add support to lower TXP (v2) v2: actually do perspective divide for RECT/SHADOWRECT Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-12-09 17:47:44 -05:00
Timothy Arceri	f1b5f2b157	mesa: use build flag to ensure stack is realigned on x86 Nowadays GCC assumes stack pointer is 16-byte aligned even on 32-bits, but that is an assumption OpenGL drivers (or any dynamic library for that matter) can't afford to make as there are many closed- and open- source application binaries out there that only assume 4-byte stack alignment. V4: fix comment and indentation V3: move all sse4.1 build flag config to the same location and add comment as to why we need to do the realign V2: use $target_cpu rather than $host_cpu and setup build flags in config rather than makefile https://bugs.freedesktop.org/show_bug.cgi?id=86788 Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Matt Turner <mattst88@gmail.com> CC: "10.4" <mesa-stable@lists.freedesktop.org>	2014-12-10 07:35:38 +11:00
Marek Olšák	65ef78e861	draw: implement TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION Required by Nine. Tested with util_run_tests. It's added to softpipe, llvmpipe, and r300g/swtcl. Tested-by: David Heidelberg <david@ixit.cz>	2014-12-09 12:27:10 +01:00
Samuel Iglesias Gonsalvez	6cc7251185	main: return two minor digits for ES shading language version For OpenGL ES 3.0 spec, the minor number for SHADING_LANGUAGE_VERSION is always two digits, matching the OpenGL ES Shading Language Specification release number. For example, this query might return the string "3.00". This patch fixes the following dEQP test: dEQP-GLES3.functional.state_query.string.shading_language_version No piglit regression observed. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-09 11:40:00 +01:00
Samuel Iglesias Gonsalvez	426a50e208	glsl: invariant qualifier is not valid for shader inputs in GLSL ES 3.00 GLSL ES 3.00 spec, chapter 4.6.1 "The Invariant Qualifier", Only variables output from a shader can be candidates for invariance. This includes user-defined output variables and the built-in output variables. As only outputs can be declared as invariant, an invariant output from one shader stage will still match an input of a subsequent stage without the input being declared as invariant. This patch fixes the following dEQP tests: dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_interp_storage_precision dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_interp_storage dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_storage_precision dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_storage dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_interp_storage_precision_invariant_input dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_interp_storage_invariant_input dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_storage_precision_invariant_input dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_storage_invariant_input No piglit regressions observed. v2: - Add spec content in the code Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-09 11:40:00 +01:00
Iago Toral Quiroga	e1ed4f2532	mesa: Recompute LegalTypesMask if the GL API has changed The current code computes ctx->Array.LegalTypesMask just once, however, computing this needs to consider ctx->API so we need to make sure that the API for that context has not changed if we intend to reuse the result. The context API can change, at least, if we go through _mesa_meta_begin, since that will always force API_OPENGL_COMPAT until we call _mesa_meta_end. If any operation in between these two calls triggers a call to update_array_format, then we might be caching a value for LegalTypesMask that will not be right once we have called _mesa_meta_end and restored the context API. Fixes the following 179 dEQP tests in i965: dEQP-GLES3.functional.vertex_arrays.single_attribute.strides.fixed.* dEQP-GLES3.functional.vertex_arrays.single_attribute.normalize.fixed.* dEQP-GLES3.functional.vertex_arrays.single_attribute.output_types.fixed.* dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.static_draw.fixed dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.stream_draw.fixed dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.dynamic_draw.fixed dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.static_copy.fixed dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.stream_copy.fixed dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.dynamic_copy.fixed dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.static_read.fixed dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.stream_read.fixed dEQP-GLES3.functional.vertex_arrays.single_attribute.usages.dynamic_read.fixed dEQP-GLES3.functional.vertex_arrays.multiple_attributes.input_types.3_fixed2 dEQP-GLES3.functional.draw.random.{2,18,28,68,83,106,109,156,181,191} Reviewed-by: Brian Paul <brianp@vmware.com>	2014-12-09 11:40:00 +01:00
Eduardo Lima Mitev	09cb149ba7	mesa: Returns zero samples when querying GL_NUM_SAMPLE_COUNTS when internal format is integer From GL ES 3.0 specification, section 6.1.15 Internal Format Queries (page 236), multisampling is not supported for signed and unsigned integer internal formats. Fixes 19 dEQP tests under 'dEQP-GLES3.functional.state_query.internal_format.*'. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-09 11:40:00 +01:00
Eduardo Lima Mitev	7894278717	mesa: Enables GL_RGB and GL_RGBA unsized internal formats for OpenGL ES 3.0 GL_RGB and GL_RGBA are valid internal formats on a GLES3 profile. See "Table 1. Unsized Internal Formats" at https://www.khronos.org/opengles/sdk/docs/man3/html/glTexImage2D.xhtml. Fixes 2 dEQP tests: - dEQP-GLES3.functional.state_query.internal_format.rgb_samples - dEQP-GLES3.functional.state_query.internal_format.rgba_samples Reviewed-by: Brian Paul <brianp@vmware.com>	2014-12-09 11:40:00 +01:00
Eduardo Lima Mitev	242ad32655	mesa: Considers GL_DEPTH_STENCIL_ATTACHMENT a valid argument for FBO invalidation under GLES3 In OpenGL and OpenGL-ES 3+, GL_DEPTH_STENCIL_ATTACHMENT is a valid attachment point for the family of functions that invalidate a framebuffer object (e.g, glInvalidateFramebuffer, glInvalidateSubFramebuffer, etc). Currently, a GL_INVALID_ENUM error is emitted for this attachment point. Fixes 21 dEQP test failures under 'dEQP-GLES3.functional.fbo.invalidate.*'. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-09 11:40:00 +01:00
Eric Anholt	8420a95692	vc4: Reserve rb31 instead of r3 for raddr conflict spills. This increases the cost of a raddr b conflict spill (save r3 to rb31, move src1 to r3, move rb31 back to r3 when done, instead of just move src1 to r3), but on average thanks to instruction pairing it's more worthwhile to have another accumulator. total instructions in shared programs: 46428 -> 46171 (-0.55%) instructions in affected programs: 38030 -> 37773 (-0.68%)	2014-12-09 01:04:46 -08:00
Eric Anholt	ab1b1fa6fb	vc4: Prioritize allocating accumulators to short-lived values. The register allocator walks from the end of the nodes array looking for trivially-allocatable things to put on the stack, meaning (assuming everything is trivially colorable and gets put on the stack in a single pass) the low node numbers get allocated first. The things allocated first happen to get the lower-numbered registers, which is to say the fast accumulators that can be paired more easily. When we previously made the nodes match the temporary register numbers, we'd end up putting the shader inputs (VS or FS) in the accumulators, which are often long-lived values. By prioritizing the shortest-lived values for allocation, we can get a lot more instructions that involve accumulators, and thus fewer conflicts for raddr and WS. total instructions in shared programs: 52870 -> 46428 (-12.18%) instructions in affected programs: 52260 -> 45818 (-12.33%)	2014-12-09 00:55:14 -08:00
Dave Airlie	0d4272cd8e	r600g: fix regression since UCMP change Since `d8da6decea`, where the state tracker started using UCMP on cayman, a number of tests regressed. This seems to be because r600g is doing CNDGE_INT for UCMP, which is >= 0; we should be doing CNDE_INT with reversed arguments. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-12-09 11:54:46 +10:00
Matt Turner	2a0bef91ca	program: Delete dead _mesa_realloc_instructions. Dead since 2010 (commit `284ce209`). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-08 17:02:19 -08:00
Matt Turner	811a1836c8	swrast: Remove 'inline' from tex filter functions. Reduces .text size of mesa_dri_drivers.so (i965-only) by 62k, or 1.4%. Note that we don't remove inline from lerp_2d(), which has a comment above it saying it definitely should be inlined. Though, removing the inline keyword from it doesn't actually change the compiled code for me. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-12-08 17:02:19 -08:00
Matt Turner	8af4aaf351	Don't cast the return value of malloc/realloc See commit `2b7a972e` for the Coccinelle script. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-08 17:02:19 -08:00
Matt Turner	f0a8bcd84e	Use calloc instead of malloc/memset-0 See commit `6bda027e` for the Coccinelle script. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-08 17:02:19 -08:00
Matt Turner	9019e5e195	Remove useless checks for NULL before freeing See commits `5067506e` and `b6109de3` for the Coccinelle script. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-08 17:02:19 -08:00
Kristian Høgsberg	cae7a2a031	i965/skl: Add Skylake PCI IDs Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2014-12-08 16:33:59 -08:00
Damien Lespiau	5bad948fa8	i965/skl: Emit depth stall workaround for gen9 as well The docs say that we shouldn't need this workaround for gen8+, but just removing it causes GPU hangs. We'll revisit this, but for now, just extend the workaround to gen9. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-12-08 16:33:59 -08:00
Ben Widawsky	9404494b9b	i965/skl: Fix GS thread count location SKL moves the GS threadcount to dw8 from dw7, and no longer does the divide by 2 thing. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Tested-by: Kristian Høgsberg <krh@bitplanet.net>	2014-12-08 16:33:59 -08:00
Vinson Lee	d20235f79a	i965: Fix union usage for G++ <= 4.6. This patch fixes this build error with G++ <= 4.6. CXX test_vf_float_conversions.o test_vf_float_conversions.cpp: In function ‘unsigned int f2u(float)’: test_vf_float_conversions.cpp:63:20: error: expected primary-expression before ‘.’ token Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86939 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-08 16:25:16 -08:00
Eric Anholt	70dd3df344	vc4: Interleave register allocation from regfile A and B. The register allocator prefers low-index registers from vc4_regs[] in the configuration we're using, which is good because it means we prioritize allocating the accumulators (which are faster). On the other hand, it was causing raddr conflicts because everything beyond r0-r2 ended up in regfile A until you got massive register pressure. By interleaving, we end up getting more instruction pairing from getting non-conflicting raddrs and QPU_WSes. total instructions in shared programs: 55957 -> 52719 (-5.79%) instructions in affected programs: 46855 -> 43617 (-6.91%)	2014-12-08 16:08:13 -08:00
Eric Anholt	46741c1b87	vc4: Fix decision for whether the MIN operation writes to the B regfile.	2014-12-08 16:08:13 -08:00
Eric Anholt	24c5ab7bbb	vc4: Drop dependency on r3 for color packing. We can avoid it by carefully ordering the packing. This is important as a step in giving r3 to the register allocator. total instructions in shared programs: 56087 -> 55957 (-0.23%) instructions in affected programs: 18368 -> 18238 (-0.71%)	2014-12-08 16:08:13 -08:00
Eric Anholt	dfbf58c439	vc4: Add support for GL 1.0 logic ops.	2014-12-08 16:08:13 -08:00
Eric Anholt	5045d8ca42	vc4: Add support for TGSI_OPCODE_UCMP. This is being emitted now from st_glsl_to_tgsi.cpp.	2014-12-08 16:08:13 -08:00
Tom Stellard	c16436149c	radeonsi/compute: Clamp COMPUTE_TMPRING_SIZE.WAVES to: num_cu * 32 This is the maximum value allowed for this field.	2014-12-08 17:20:50 -05:00
Tom Stellard	0e1c085f17	winsys/radeon: Always report at least 1 compute unit All uses of this require that the value be at least one, so it's easier to report at least one than having to wrap all uses in MAX2(max_compute_units, 1). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-12-08 17:20:50 -05:00
Tom Stellard	67dcbcd92c	radeonsi: Program RASTER_CONFIG for harvested GPUs v5 Harvested GPUs have some of their render backends disabled, so in order to prevent the hardware from trying to render things with these disabled backends we need to correctly program the PA_SC_RASTER_CONFIG register. v2: - Write RASTER_CONFIG for all SEs. v3: - Set GRBM_GFX_INDEX.INSTANCE_BROADCAST_WRITES bit. - Set GRBM_GFX_INDEX.SH_BROADCAST_WRITES bit when done setting PA_SC_RASTER_CONFIG. - Get num_se and num_sh_per_se from kernel. v4: - Get correct value for num_se - Remove loop for setting PA_SC_RASTER_CONFIG - Only compute raster config when a backend has been disabled. v5: Michel Dänzer - Fix computation for chips with multiple SEs https://bugs.freedesktop.org/show_bug.cgi?id=60879 CC: "10.4 10.3" <mesa-stable@lists.freedesktop.org>	2014-12-08 17:20:50 -05:00
Roland Scheidegger	fea5c2640b	draw: (trivial): remove double semicolon	2014-12-09 00:10:41 +01:00
Abdiel Janulgue	49e0431211	st/mesa: For vertex shaders, don't emit saturate when SM 3.0 is unsupported There is a bug in the current lowering pass implementation where we lower saturate to clamp only for vertex shaders on drivers supporting SM 3.0. The correct behavior is to lower to clamp only when we don't support saturate, which happens on drivers that don't support SM 3.0. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-12-08 20:14:26 +02:00
Abdiel Janulgue	4ea8c8d56c	glsl: Don't optimize min/max into saturate when EmitNoSat is set v3: Fix multi-line comment format (Ian) Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-12-08 20:14:17 +02:00
Abdiel Janulgue	39f7b72428	ir_to_mesa: Remove sat to clamp lowering pass Fixes an infinite loop in swrast where the lowering pass unpacks saturate into clamp but the opt_algebraic pass tries to do the opposite. v3 (Ian): This is a revert of commit `cfa8c1cb` "ir_to_mesa: lower ir_unop_saturate" on the ir_to_mesa.cpp portion. prog_execute.c can handle saturates in vertex shaders, so classic swrast shouldn't need this lowering pass. Cc: "10.4" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83463 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-12-08 20:14:10 +02:00
Michael Forney	5d64da401c	loader: Add missing EXPAT_CFLAGS to libloader.la CPPFLAGS Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-08 08:50:27 -08:00
Matt Turner	f65200ccc9	i965: Remove default from brw_instruction_name switch to catch missing names. The case-range extension is available in clang and gcc at least back to 3.4.0. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-12-08 08:50:26 -08:00
Matt Turner	b6a71cbb64	i965: Add missing opcode names. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-12-08 08:50:26 -08:00
Matt Turner	6383e206c0	i965: Add opcode names for set_omask and set_sample_id. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-12-08 08:50:26 -08:00
Chad Versace	7e8ba77c49	egl: Expose EGL_KHR_get_all_proc_addresses and its client extension Mesa already implements the behavior of EGL_KHR_get_all_proc_addresses and EGL_KHR_client_get_all_proc_addresses. This patch just exposes the extension strings. See: https://www.khronos.org/registry/egl/extensions/KHR/EGL_KHR_get_all_proc_addresses.txt Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-12-07 20:58:25 -08:00
Emil Velikov	0b6e0aa5ae	docs: add news item and link release notes for mesa 10.3.5 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-12-07 19:22:11 +00:00
Emil Velikov	7409ad5147	docs: Add sha256 sums for the 10.3.5 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `1ba2029184`)	2014-12-07 19:22:11 +00:00
Emil Velikov	8d235e0c70	Add release notes for the 10.3.5 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `c90b0db1ae`)	2014-12-07 19:22:11 +00:00
Ilia Mirkin	043b79461f	freedreno/a2xx: silence warning about missing DEPTH32X Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-06 18:18:53 -05:00
Ilia Mirkin	c416f49ebe	freedreno/a3xx: handle index_bias (i.e. base_vertex) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-06 18:18:50 -05:00
Ilia Mirkin	b38b40d7bb	freedreno/a3xx: add bgr565 texturing and rendering Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-06 18:18:47 -05:00
Ilia Mirkin	e02ed16cb5	freedreno/a3xx: add support for SRGB render targets Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-06 18:18:43 -05:00
Ilia Mirkin	39a7c049d3	freedreno/a3xx: output RGBA16_FLOAT from fs for certain outputs Fixes R11G11B10F rendering, and is required for SRGB format support. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-06 18:18:40 -05:00
Ilia Mirkin	3674c76edf	freedreno/a3xx: re-enable rgb10_a2 render targets There were previously regressions regarding border colors, which the updated swizzle logic resolves. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-06 18:18:37 -05:00
Ilia Mirkin	fc94b2c2a0	freedreno/a3xx: fix border color swizzle to match texture format desc This is a hack since it uses the texture information together with the sampler, but I don't see a better way to do it. In OpenGL, there is a 1:1 correspondence. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-06 18:18:33 -05:00
Ilia Mirkin	97fef2db5c	freedreno/a3xx: fix alpha-blending on RGBX formats Expert debugging assistance provided by Chris Forbes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-06 18:18:20 -05:00
Chris Forbes	6b01969345	glcpp: Fix `can not` to `cannot` in error message Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-12-07 11:49:28 +13:00
Chris Forbes	b49a069bd3	glcpp: Disallow undefining GL_* builtin macros. Fixes the piglit test: spec/glsl-es-3.00/compiler/undef-GL_ES.vert Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-07 11:47:45 +13:00
Chris Forbes	ed56c16820	i965/Gen6-7: Fix point sprites with PolygonMode(GL_POINT) This was an oversight in the original patch. When PolygonMode is used, then front faces, back faces, or both may be rendered as points and are affected by point sprite state. Note that SNB/IVB can't actually be fully conformant here, for a legacy context -- we don't have separate sets of pointsprite enables for front and back faces. Haswell ignores pointsprite state correctly in hardware for non-point rasterization, so can do this correctly, but it doesn't seem worth it. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.4" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86764 Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-07 11:46:42 +13:00
Chris Forbes	092c73a7c3	i965: Fix regs read for FS_OPCODE_INTERP_PER_SLOT_OFFSET Dead code elimination was eating the Y offset. Fixes the piglit test: spec/ARB_gpu_shader5/arb_gpu_shader5-interpolateAtOffset-nonconst Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-12-07 10:29:26 +13:00
Chris Forbes	680f72d6f2	i965: Add opcode names for FS interpolation opcodes Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-12-07 10:29:20 +13:00
Roland Scheidegger	d8da6decea	mesa/st: don't use CMP / I2F for conditional assignments with native integers The original idea was to optimize away the condition by integrating it directly into the CMP instruction. However, with native integers this requires an extra I2F instruction. It is also fishy because the negation used didn't really honor ieee754 float comparison rules, not to mention the CMP instruction itself (being pretty much a legacy instruction) doesn't really have defined special float value behavior in any case. So, use UCMP and adjust the code trying to optimize the condition away accordingly (I have absolutely no idea if such conditions are actually hit or would be translated away somewhere else already). v2: cosmetic changes No piglit regressions on llvmpipe. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-12-06 18:03:25 +01:00
Roland Scheidegger	6f2cf5f3d0	llvmpipe: decrease MAX_SCENES from 2 to 1 Multiple scenes per context are meant to be used so a new scene can be built while another one is processed in rasterization. However, quite surprisingly, this does not actually work (and according to git log, possibly never did, though maybe it did at some point further back (5 years+) but was buggy) because we always wait immediately on the rasterizer to finish the scene when contexts (and hence setup/scene) is flushed. This means when we try to get an empty scene later, any old one is already empty again. Thus using multiple scenes is just a waste of memory (not too bad, since the additional scenes are guaranteed to be empty, which means their size ought to be one data block (64kB) plus the size of some structs), without actually really doing anything. (There is also quite some code for the whole concept of multiple scenes which doesn't really do much in practice, but keep it hoping the wait-on-scene-flush can be fixed some day.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-06 18:03:18 +01:00
Roland Scheidegger	1b6db3593e	draw: use the prim type from prim_info not emit in passthrough emit The prim assembler may change the prim type when injecting prim ids now, which isn't reflected by what's stored in emit. This looks brittle and potentially dangerous (it is not obvious if such prim type changes are really supported by pt emit, the prim type is actually also set in prepare which would then be different). This fixes piglit primitive-id-no-gs-first-vertex.shader_test. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-06 18:03:11 +01:00
Roland Scheidegger	fe86415beb	draw: use correct output prim for non-adjacent topologies in prim assembler. The decomposition done in the prim assembler will turn tri fans into tris, but this wasn't reflected in the output prim type. Meaning with a tri fan with 6 verts input, the output was a tri fan with 12 vertices instead of a tri list with 12 vertices (not as bad as it sounds, since the additional tris created would all be degenerate since they'd all have two times vertex zero but still bogus). This is because the prim assembler is used if either the input topology is something with adjacency, or if prim id needs to be injected, and for the latter case topologies without adjacency can be converted to basic ones. Unfortunately decomposition here for inserting prim ids is necessary, at least for the indexed case where we can't just insert the prim id at the right place depending on provoking vertex. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-06 18:03:05 +01:00
Roland Scheidegger	3fdbad1142	draw: kill off unneeded prim assembler code for handling adjacency verts The default macros when the adjacency macros aren't defined will already exactly do that (that is, drop the adjacent vertices and call the non-adjacent macro). Reviewed-by: Jose Fonseca <jfonseca@vmwarec.com>	2014-12-06 18:02:59 +01:00
Roland Scheidegger	ec30c66b46	gallium/docs: (trivial) remove STR opcode description. The opcode was removed alongside SFL by commit `ecfe9e2ad2`.	2014-12-06 17:56:46 +01:00
Matt Turner	a28ad9d4c0	i965/fs: Perform CSE on MOV ..., VF instructions. Safe from causing optimization loops, since we don't constant propagate VF arguments. (for this and the previous patch): total instructions in shared programs: 4289075 -> 4271932 (-0.40%) instructions in affected programs: 1616779 -> 1599636 (-1.06%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-05 16:43:31 -08:00
Matt Turner	963a3c7f90	i965/fs: Try to emit LINE instructions on Gen <= 5. The LINE instruction performs a multiply-add instruction (a * b + c) where b and c are scalar arguments. It reads b and c from offsets in src0 such that you can load them (if they're representable) as a vector-float immediate with a single instruction. Hurts some programs, but that'll all get better once we CSE the vector-float MOVs in the next patch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77544 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-05 16:43:31 -08:00
Matt Turner	6be863af0e	i965/fs: Add support for generating the LINE instruction. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-05 16:43:31 -08:00
Matt Turner	92346db057	i965: Set the region of LINE's src0 to <0,1,0>. The PRMs say that <src0> region must be a replicated scalar (with HorzStride = VertStride = 0), but apparently that doesn't actually apply to all generations. I did notice when implementing the optimization later in this series that G45 and ILK needed this regioning. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-05 16:43:31 -08:00
Matt Turner	9ed8d00ab5	i965: Give compile stats through KHR_debug. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-05 16:43:31 -08:00
Matt Turner	5b1e51bfbe	mesa: Add a source parameter to _mesa_gl_debug. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-05 16:43:31 -08:00
Eric Anholt	befdff8142	vc4: Try swapping the regfile A to B to pair instructions. total instructions in shared programs: 56995 -> 56087 (-1.59%) instructions in affected programs: 40503 -> 39595 (-2.24%)	2014-12-05 16:27:58 -08:00
Eric Anholt	7d8b79f398	vc4: Allow pairing of some instructions that disagree about the WS bit. No difference on shader-db because we tend to have a lot of other conflicts going on as well (like RADDR_A disagreements)	2014-12-05 16:27:06 -08:00
Matt Turner	e36c6513ce	configure.ac: Replace contraction to fix syntax highlighting.	2014-12-05 13:22:56 -08:00
Ben Widawsky	f13870db09	i965/gs: Avoid DW * DW mul The GS has an interesting use for mul. Because the GS can emit multiple vertices per input vertex, and it also has a unique count at the top of the URB payload, the GS unit needs to be able to dynamically specify URB write offsets (relative to the global offset). The documentation in the function has a very good explanation from Paul on the mechanics. This fixes around 2000 piglit tests on BSW. v2: Reworded commit message (Ben) no mention of CHV (Matt) Change SHRT_MAX to USHRT_MAX (Ken, and Matt) Update comment in code to reflect the use of UW (Ben) Add Gen7+ assertion for the relevant GS code, since it won't work on Gen6- (Ken) Drop the bogus hunk in emit_control_data_bits() (Ken) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84777 (with many dupes) Cc: "10.4 10.3 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-05 12:12:46 -08:00
Eric Anholt	6f32deb538	vc4: Add separate write-after-read dependency tracking for pairing. If an operation is the last one to read a register, the instruction containing it can also include the op that has the next write to that register. total instructions in shared programs: 57486 -> 56995 (-0.85%) instructions in affected programs: 43004 -> 42513 (-1.14%)	2014-12-05 10:53:53 -08:00
Eric Anholt	042962df2d	vc4: Fix inverted priority of instructions for QPU scheduling. We were scheduling TLB operations as early as possible, and texture setup as late as possible. When I introduced prioritization, I visually inspected that an independent operation got moved above texture results collection, which tricked me into thinking it was working (but it was just because texture setup was being pushed late). total instructions in shared programs: 57651 -> 57486 (-0.29%) instructions in affected programs: 18532 -> 18367 (-0.89%)	2014-12-05 10:43:14 -08:00
Eric Anholt	bd4057a5d7	vc4: Refuse to merge two ops that both access shared functions. Avoids assertion failures in vc4_qpu_validate.c if we happen to find the right set of operations available.	2014-12-05 10:43:14 -08:00
Eric Anholt	dadc32ac80	vc4: Allow dead code elimination of color reads. This might happen if the blending functions are set up to not actually use the destination color/alpha, for example.	2014-12-05 10:43:14 -08:00
Eric Anholt	34cf86bdc4	vc4: Add a debug flag for waiting for sync on submit. This is nice when you're tracking down which command list is hanging the GPU.	2014-12-05 10:43:14 -08:00
Matt Turner	c0e26c5d27	i965/fs: Move brw_file_from_reg() higher in the file. This was supposed to be part of the previous commit.	2014-12-05 09:53:35 -08:00
Matt Turner	db186f2a38	i965/fs: Make brw_reg_from_fs_reg static and remove prototype. And move it above its first use in brw_fs_generator.cpp. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-12-05 09:49:42 -08:00
Matt Turner	2881b123d0	i965: Use ~0 to represent true on all generations. Jason realized that we could fix the result of the CMP instruction on Gen <= 5 by doing -(result & 1). Also do the resolves in the vec4 backend before use, rather than when the bool was created. The FS does this and it saves some unnecessary resolves. On Ironlake: total instructions in shared programs: 4289762 -> 4287277 (-0.06%) instructions in affected programs: 619430 -> 616945 (-0.40%) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-12-05 09:49:42 -08:00
Matt Turner	05e2578cac	i965: Change the type of booleans to D. This is a revert of commit `4656c14e` ("i965/fs: Change the type of booleans to UD and emit correct immediates") plus some small additional fixes, like casting ctx->Const.UniformBooleanTrue to int and changing UD to D in the ir_unop_b2f cases. Note that it's safe to leave 0x3f800000 as UD and as a literal it's more recognizable than 1065353216. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-12-05 09:49:42 -08:00
Matt Turner	66cc8de042	i965/fs: Add a negate() function. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-12-05 09:49:42 -08:00
Matt Turner	15f6118b77	i965/vec4: Don't DCE flag-writing insts because dest was unused. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-12-05 09:49:42 -08:00
Matt Turner	0d3cc01b0b	i965/vec4: Allow CSE on uniform-vec4 expansion MOVs. Three source instructions cannot directly source a packed vec4 (<0,4,1> regioning) like vec4 uniforms, so we emit a MOV that expands the vec4 to both halves of a register. If these uniform values are used by multiple three-source instructions, we'll emit multiple expansion moves, which we cannot combine in CSE (because CSE emits moves itself). So emit a virtual instruction that we can CSE. Sometimes we demote a uniform to a pull constant after emitting an expansion move for it. In that case, recognize in opt_algebraic that if the .file of the new instruction is GRF then it's just a real move that we can copy propagate and such. total instructions in shared programs: 5822418 -> 5812335 (-0.17%) instructions in affected programs: 351841 -> 341758 (-2.87%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-05 09:49:42 -08:00
Matt Turner	be80f69ecd	glsl: Optimize scalar all_equal/any_nequal into equal/nequal. Cuts an instruction from two shaders in Tesseract, by allowing the (x+y) cmp 0 -> x cmp -y optimization to take place. instructions in affected programs: 1198 -> 1194 (-0.33%) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-12-05 09:49:42 -08:00
José Fonseca	a1fc6a91e5	mesa: Ensure stack is realigned on x86. Nowadays GCC assumes stack pointer is 16-byte aligned even on 32-bits, but that is an assumption OpenGL drivers (or any dynamic library for that matter) can't afford to make as there are many closed- and open- source application binaries out there that only assume 4-byte stack alignment. This fix uses force_align_arg_pointer GCC attribute, and is only a stop-gap measure. The right fix would be to pass -mstackrealign or -mincoming-stack-boundary=2 to all source files that use any -msse* option, as there is no way to guarantee if/when GCC will decide to spill SSE registers to the stack. https://bugs.freedesktop.org/show_bug.cgi?id=86788 Reviewed-by: Brian Paul <brianp@vmware.com>	2014-12-05 15:17:37 +00:00
José Fonseca	f9098f0972	util/primconvert: Avoid pointer arithmetic; apply offset on all cases. Matches what u_vbuf_get_minmax_index() does. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-12-05 14:44:16 +00:00
Ilia Mirkin	c3bed13604	util/primconvert: take ib offset into account Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-05 07:23:48 -05:00
Ilia Mirkin	fb434e675f	util/primconvert: support instanced rendering Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-05 07:23:48 -05:00
Ilia Mirkin	1dfa039168	util/primconvert: pass index bias through The index_bias (aka base_vertex) applies to the downstream draw just as much, since the actual index values are never modified. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-12-05 07:23:48 -05:00
Kenneth Graunke	ae45a5a28d	i965: Compute VS attribute WA bits earlier and check if they changed. BRW_NEW_VERTICES is flagged every time we draw a primitive. Having the brw_vs_prog atom depend on BRW_NEW_VERTICES meant that we had to compute the VS program key and do a program cache lookup for every single primitive. This is painfully expensive. The workaround bit computation is almost entirely based on the vertex attribute arrays (brw->vb.inputs[i]), which are set by brw_merge_inputs. The only thing it uses the VS program for is to see which VS inputs are actually read. brw_merge_inputs() happens once per primitive, and can safely look at the currently bound vertex program, as it doesn't change in the middle of a draw. This patch moves the workaround bit computation to brw_merge_inputs(), right after assigning brw->vb.inputs[i], and stores the previous WA bit values in the context. If they've actually changed from the last draw (which is uncommon), we signal that we need a new vertex program, causing brw_vs_prog to compute a new key. Improves performance in Gl32Batch7 by 13.6123% +/- 0.739652% (n=166) on Haswell GT3e. I'm told Baytrail shows similar gains. v2: Introduce a new BRW_NEW_VS_ATTRIB_WORKAROUNDS dirty bit, rather than reusing BRW_NEW_VERTEX_PROGRAM (suggested by Chris Forbes). This prevents unnecessary re-emission of surface/sampler related atoms (and an SOL atom on Sandybridge). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-12-04 17:50:52 -08:00
Matt Turner	0b4a688691	egl/dri2: Log a warning if no platforms are enabled. If you hit this, you didn't compile with --with-egl-platforms=... Recompile with something like --with-egl-platforms=x11,drm and make clean and make again. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-12-04 15:13:51 -08:00
Kenneth Graunke	ca19e89d6e	i965: Drop BRW_NEW_VERTEX_PROGRAM and _NEW_TRANSFORM from Gen4 VS state. These stopped being necessary in commit `ab973403e4`. v2: Update commit message with a better explanation (thanks to Eric Anholt for doing the git archaeology). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-04 15:04:35 -08:00
Kenneth Graunke	a2dd8ea59a	i965: Drop BRW_NEW_VERTEX_PROGRAM from Gen7+ 3DSTATE_VS atoms. We don't access brw->vertex_program or ctx->_Shader since the previous commit, so we don't need this dirty bit. I think it's still necessary on Gen6 because it still conflates constant uploading with unit state uploading. We can fix that later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-04 15:04:35 -08:00
Kenneth Graunke	7b6620faf5	i965: Store floating point mode choice in brw_stage_prog_data. We use IEEE mode for GLSL programs, but need to use ALT mode for ARB programs so that 0^0 == 1. The choice is based entirely on the shader source language. Previously, our code to determine which mode we wanted was duplicated in 8 different places (VS and FS for Gen4-5, Gen6, Gen7, and Gen8). The ctx->_Shader->CurrentProgram[stage] == NULL check was confusing as well - we use CurrentProgram (non-derived state), but _Shader (derived state). It also relies on knowing that ARB programs don't use gl_shader_program structures today. The compiler already makes this assumption in a few places, but I'd rather keep that assumption out of the state upload code. With this patch, we select the mode at compile time, and store that choice in prog_data. The state upload code simply uses that decision. This eliminates a BRW_NEW_*_PROGRAM dependency in the state upload code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-04 15:04:35 -08:00
Kenneth Graunke	d300e58db0	i965: Make Gen4-5 and Gen8+ ALT checks use ctx->_Shader too. Commit `c0347705` changed the Gen6-7 code to use ctx->_Shader rather than ctx->Shader, but neglected to change the Gen4-5 or Gen8+ code. This might fix SSO related bugs, but ALT mode is only used for ARB programs, so if there's an actual problem, it's likely no one would run into it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-04 15:04:35 -08:00
Kenneth Graunke	8daf3c53c7	i965: Move PSCDEPTH calculations from draw time to compile time. The "Pixel Shader Computed Depth Mode" value is entirely based on the shader program, so we can easily do it at compile time. This avoids the if+switch on every 3DSTATE_WM (Gen7)/3DSTATE_PS_EXTRA (Gen8+) upload, and shares a bit more code. This also simplifies the PMA stall code, making it match the formula more closely, and drops a BRW_NEW_FRAGMENT_PROGRAM dependency. (Note that the previous comment was wrong - the code and the documentation have != PSCDEPTH_OFF, not ==.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-04 15:04:35 -08:00
Rob Clark	4265148ac6	freedreno/a4xx: unify vertex/texture formats into a single table Similar to the scheme that Ilia put in place for a3xx. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-04 16:01:37 -05:00
Rob Clark	e9589a8fcf	freedreno/a4xx: fd4_util -> fd4_format Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-04 16:01:37 -05:00
Rob Clark	8bf69a29bb	freedreno: update generated headers / a4xx fmt rename Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-04 16:01:37 -05:00
Kenneth Graunke	bcc7eb115e	i965: Add var->location != -1 assertions. We shouldn't receive variables with invalid locations set - adding these assertions should help catch problems before they cause crashes later. Inspired by similar code in st_glsl_to_tgsi. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-03 17:45:16 -08:00
Matt Turner	b5b18e4687	i965/fs: Don't offset uniform registers in half(). Half gives you the second half of a SIMD16 register, but if the register is a uniform it would incorrectly give you the next register. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-12-03 16:47:45 -08:00
Rob Clark	c74f2db0a5	freedreno/a4xx: frag-depth fixes Also seems to fix kill/discard. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-03 16:38:26 -05:00
Ian Romanick	a909b995d9	linker: Assign varying locations to geometry shader inputs for SSO Previously only geometry shader outputs would be assigned locations if the geometry shader was the only stage in the linked program. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: pavol@klacansky.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82585 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-12-03 11:33:49 -08:00
Ian Romanick	5eca78a00a	linker: Wrap access of producer_var with a NULL check producer_var could be NULL if consumer_var is not NULL and consumer_is_fs is false. This will occur when the producer is NULL and the consumer is the geometry shader for a program that contains only a geometry shader. This will occur starting with the next patch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: pavol@klacansky.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82585 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-12-03 11:33:49 -08:00
Jan Vesely	a2f2eebfdf	st/xvmc: Fix compiler warnings Mostly signed/unsigned comparison Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-12-03 17:07:08 +01:00
Axel Davy	712a4c5438	st/nine: Fix vertex declarations for non-standard (usage/index) Nine code to match vertex declaration to vs inputs was limiting the number of possible combinations. Some sm3 games have issues with that, because arbitrary (usage/index) can be used. This patch does the following changes to fix the problem: . Change the numbers given to (usage/index) combinations to uint16 . Do not put limits on the indices when it doesn't make sense . change the conversion rule (usage/index) -> number to fit all combinations . Instead of having a table usage_map mapping a (usage/index) number to an input index, usage_map maps input indices to their (usage/index) Cc: "10.4" <mesa-stable@lists.freedesktop.org> Tested-by: Yaroslav Andrusyak <pontostroy@gmail.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-12-03 16:39:24 +01:00
Axel Davy	5d6d260833	st/nine: sm1_declusage_to_tgsi, do not restrict indices with TGSI_SEMANTIC_GENERIC With sm3, you can declare an input/output with a usage and a usage index. Nine code hardcodes the translation usage/index to a corresponding TGSI code. The translation was limited to a few usage/index combinations that were corresponding to most of the needs of games, but some games did not work. This patch rewrites that Nine code to map all possible usage/index combinations to TGSI code. The index associated to TGSI_SEMANTIC_GENERIC doesn't need to be low for good performance, as the old code was supposing, and is not particularly bounded (it's UINT16). Given the index is BYTE, we can map all combinations. Cc: "10.4" <mesa-stable@lists.freedesktop.org> Tested-by: Yaroslav Andrusyak <pontostroy@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-12-03 16:39:24 +01:00
Axel Davy	3e1f731d3e	st/nine: Queries: Always return D3D_OK when issuing with D3DISSUE_BEGIN This is the behaviour that Wine tests. Reviewed-by: David Heidelberg <david@ixit.cz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-12-03 16:39:24 +01:00
Axel Davy	2f78259c11	st/nine: Queries: always succeed for D3DQUERYTYPE_TIMESTAMP when flushing This is the behaviour that Wine tests Tested-by: David Heidelberg <david@ixit.cz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-12-03 16:39:24 +01:00
Axel Davy	225d7f8e0e	st/nine: Queries: allow app to call GetData without Issuing first Nine was allowing that behaviour, but was not filling the result. Tested-by: David Heidelberg <david@ixit.cz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-12-03 16:39:24 +01:00
Axel Davy	eac0b9b68a	st/nine: Queries: Fix D3DISSUE_END behaviour. Issuing D3DISSUE_END should: . reset previous queries if possible . end the query Previous behaviour wasn't calling end_query for queries not needing D3DISSUE_BEGIN, nor resetting previous queries. This fixes several applications not launching properly. Cc: "10.4" <mesa-stable@lists.freedesktop.org> Tested-by: David Heidelberg <david@ixit.cz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-12-03 16:39:24 +01:00
Axel Davy	ca0588d1a1	st/nine: Queries: return S_FALSE instead of INVALIDCALL when in building query state It is the same behaviour as wine has. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-12-03 16:39:24 +01:00
Axel Davy	b0302a95ec	st/nine: Queries: Use gallium caps to get if queries are supported. (v2) Some queries need the driver to advertise a cap to be supported. For example r300 doesn't support them. v2 (David): check also for PIPE_CAP_QUERY_PIPELINE_STATISTICS, fix wine tests on r300g Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-12-03 16:39:24 +01:00
Axel Davy	6b35662e30	st/nine: Queries: Remove flush logic get_query_result flushes automatically, we don't need to flush. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-12-03 16:39:24 +01:00
Axel Davy	3e48791aea	st/nine: Queries: remove dummy queries Applications are supposed to call CreateQuery with a NULL ppQuery to know if the query is supported. We supported that. However when ppQuery was not NULL, we were accepting to create the query and were creating a dummy query even when the query is not supported. Wine has different behaviour. This patch drops the dummy queries support and matches wine behaviour. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-12-03 16:39:23 +01:00
Ilia Mirkin	79f9a106b9	freedreno/a3xx: implement anisotropic filtering Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-12-03 09:23:46 -05:00
Rob Clark	b491d1ca6e	freedreno/a4xx: rect textures Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-03 09:22:05 -05:00
Rob Clark	fbba633f2f	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-03 09:22:05 -05:00
Rob Clark	4cfe905a9b	freedreno: fix signed vs unsigned lols Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-03 09:22:05 -05:00
José Fonseca	ef7e0b39a2	gallivm: Update for RTDyldMemoryManager becoming an unique_ptr. Trivial. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=86958	2014-12-03 07:49:47 +00:00
Tapani Pälli	636db35c35	glsl: throw error when using invariant(all) in a fragment shader Note that some of the GLSL specifications explicitly state this as compile error, some simply state that 'it is an error'. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-12-03 08:56:19 +02:00
Ben Widawsky	c914247dcb	i965/skl: Fix SBE state upload code. The state upload code was incorrectly shifting the attribute swizzles. The effect of this is we're likely to get the default swizzle values, which disables the component. This doesn't technically fix any bugs since Skylake support is still disabled by default (no PCI IDs). While here, since VARYING_SLOT_MAX can be greater than the number of attributes we have available, add a warning to the code to make sure we never do the wrong thing (and hopefully prevent further static analysis from finding this). Admittedly I am a bit confused. It seems to me like the moment a user has greater than 8 varyings we will hit this condition. CC Ken to clarify. v2: Forgot to git add the warning message in v1 v3: Change the > 31 varyings to an assertion (Ken) Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> (via Coverity) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-02 22:11:09 -08:00
Jan Vesely	02cc9e9f9e	r600, llvm: Don't leak global symbol offsets Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-12-02 22:32:05 -05:00
Matt Turner	bc3ca485ae	i965: Avoid union literal, for old gcc compatibility. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86939 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-12-02 17:20:16 -08:00
Matt Turner	f0fa6a5e86	i965: Remove tabs from instruction scheduler. Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2014-12-02 17:20:16 -08:00
Kenneth Graunke	51f7f613f9	i965/vs: Set brw_vs_prog_key::clamp_vertex_color to 0 when irrelevant. Vertex color clamping is only relevant if the shader writes to the built-in gl_[Secondary]{Front,Back}Color varyings. Otherwise, brw_vs_prog_key::clamp_vertex_color is never used, so we can simply leave it set to 0. This enables us to correctly predict the clamp_vertex_color key value in the precompile for shaders which don't use those varyings. Eliminates virtually all VS recompiles in Serious Sam 3's intro. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-12-02 17:00:26 -08:00
Kenneth Graunke	afd605f346	i965: Make vertex color clamp handling code VS specific. Vertex color clamping only applies to gl_[Secondary]{Front,Back}Color, which are compatibility-only built-in varyings. We only support GS in core profile, so they can't exist in geometry shaders. We can drop several dirty bits from the GS program key - they're unnecessary for a core profile implementation. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-12-02 17:00:26 -08:00
Kenneth Graunke	169b6c1955	i965/vs: Handle vertex color clamping in emit_urb_slot(). Vertex color clamping only applies to a few specific built-ins: COL0/1 and BFC0/1 (aka gl_[Secondary]{Front,Back}Color). It seems weird to handle special cases in a function called emit_generic_urb_slot(). emit_urb_slot() is all about handling special cases, so it makes more sense to handle this there. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-12-02 17:00:26 -08:00
Kenneth Graunke	793ac67d3d	i965: Use the enum type for gen6_gather_wa sampler key field. Requested by Matt Turner. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-02 17:00:26 -08:00
Kenneth Graunke	e5e466c954	i965: Drop use of GL types in program keys. This is really far removed from the API; we should just use C types. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-12-02 17:00:26 -08:00
Kenneth Graunke	a64f3ba3d1	i965: Move program key structures to brw_program.h. With fs_visitor/fs_generator being reused for SIMD8 VS/GS programs, we're running into weird #include patterns, where scalar code #includes brw_vec4.h and such. Program keys aren't really related to SIMD4X2/SIMD8 execution - they mostly capture NOS for a particular shader stage. Consolidating them all in one place that's vec4/scalar neutral should help avoid problems. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-12-02 17:00:26 -08:00
Kenneth Graunke	5f34a18f96	i965: Delete brw_state_flags::cache and related code. It's been merged into brw_state_flags::brw for simplicity and efficiency. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-02 17:00:26 -08:00
Kenneth Graunke	4f24c168c8	i965: Move BRW_NEW_*_PROG_DATA flags to .brw (not .cache). I put the BRW_NEW_*_PROG_DATA flags at the beginning so that brw_state_cache.c can still continue using 1 << brw_cache_id. I also added a comment explaining the difference between BRW_NEW_*_PROG_DATA and BRW_NEW_*_PROGRAM, as it took me a long time to remember it. Non-mechanical changes: - brw_state_cache.c and brw_ff_gs.c now signal .brw, not .cache. - brw_state_upload.c - INTEL_DEBUG=state changes. - brw_context.h - bit definition merging. v2: Correct the explanation of BRW_NEW_*_PROG_DATA to mention state-based recompiles, and nix the "proper subset" claim, as it's false. (Caught by Kristian Høgsberg). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-02 17:00:26 -08:00
Kenneth Graunke	ce44b2061c	i965: Rename CACHE_NEW_*_PROG to BRW_NEW_*_PROG_DATA. Now that we've moved a bunch of CACHE_NEW_* bits to BRW_NEW_, the only ones that are left are legitimately related to the program cache. Yet, it seems a bit wasteful to have an entire bitfield for only 7 bits. State upload is one of the hottest paths in the driver. For each atom in the list, we call check_state() to see if it needs to be emitted. Currently, this involves comparing three separate bitfields (mesa, brw, and cache). Consolidating the brw and cache bitfields would save a small amount of CPU overhead per atom. Broadwell, for example, has 57 state atoms, so this small savings can add up. CACHE_NEW_*_PROG covers the brw_*_prog_data structures, as well as the offset into the program cache BO (prog_offset). Since most uses refer to brw_*_prog_data, I decided to use BRW_NEW_*_PROG_DATA as the name. Removing "cache" completely is a bit painful, so I decided to do it in several patches for easier review, and to separate mechanical changes from manual ones. This one simply renames things, and was made via: $ for file in *.[ch]; do sed -i -e 's/CACHE_NEW_\([A-Z_\*]*\)_PROG/BRW_NEW_\1_PROG_DATA/g' \ -e 's/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/g' $file done Note that BRW_NEW_*_PROG_DATA is still in .cache, not .brw! The next patch will remedy this flaw. It will also fix the alphabetization issues. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Matt Turner <mattst88@gmail.com>	2014-12-02 17:00:26 -08:00
Kenneth Graunke	2a4f5728ad	i965: Remove "disable_derivative_optimization" driconf option. This was added in September 2013 when we first implemented the fast (but lower quality) derivatives. A quick Google search didn't turn up anyone using or recommending the option, so I suspect no one does. Applications that want to control the quality of their derivatives can use the new GL_ARB_derivative_control extension, or use the glHint mechanism. The driconf option seems superfluous. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-02 17:00:26 -08:00
Ian Romanick	0391d1bbea	i965: Just return void from brw_try_draw_prims Note from Ken: "We used to use the return value to indicate whether software fallbacks were necessary, but we haven't in years." Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-02 12:16:28 -08:00
Ian Romanick	9fd398215d	mesa: Use current Mesa coding style in check_valid_to_render This makes some other patches (still in my local tree) a bit cleaner. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-02 12:16:28 -08:00
Ian Romanick	331b0120d1	mesa: Use unreachable instead of assert in check_valid_to_render This is generally the preferred style these days. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-02 12:16:28 -08:00
Ian Romanick	304c466bd8	mesa: Silence unused parameter warnings in _mesa_validate_Draw functions ../../src/mesa/main/api_validate.c: In function '_mesa_validate_DrawElements': ../../src/mesa/main/api_validate.c:376:37: warning: unused parameter 'basevertex' [-Wunused-parameter] ../../src/mesa/main/api_validate.c: In function '_mesa_validate_MultiDrawElements': ../../src/mesa/main/api_validate.c:394:65: warning: unused parameter 'basevertex' [-Wunused-parameter] ../../src/mesa/main/api_validate.c: In function '_mesa_validate_DrawRangeElements': ../../src/mesa/main/api_validate.c:452:35: warning: unused parameter 'basevertex' [-Wunused-parameter] ../../src/mesa/main/api_validate.c: In function '_mesa_validate_DrawArrays': ../../src/mesa/main/api_validate.c:473:25: warning: unused parameter 'start' [-Wunused-parameter] ../../src/mesa/main/api_validate.c: In function '_mesa_validate_DrawElementsInstanced': ../../src/mesa/main/api_validate.c:590:44: warning: unused parameter 'basevertex' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-02 12:16:28 -08:00
Ian Romanick	5e72886db0	mesa: Refactor common validation code to validate_DrawElements_common Most of the code in _mesa_validate_DrawElements, _mesa_validate_DrawRangeElements, and _mesa_validate_DrawElementsInstanced was the same. Refactor this out to common code. As a side-effect, a bug in _mesa_validate_DrawElementsInstanced was fixed. Previously this function would not generate an error when check_valid_to_render failed if numInstances was 0. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-02 12:16:28 -08:00
Ian Romanick	b93dcb0e71	mesa: Generate GL_INVALID_OPERATION when drawing w/o a VAO in core profile GL 3-ish versions of the spec are less clear that an error should be generated here, so Ken (and I during review) just missed it in `1afe335`. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-02 12:16:28 -08:00
Brian Paul	4e6244e80f	mesa: fix height error check for 1D array textures height=0 is legal for 1D array textures (as depth=0 is legal for 2D arrays). Fixes new piglit ext_texture_array-errors test. Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-12-02 10:00:03 -07:00
Jan Vesely	ca0616f17e	r600, llvm: Fix mem leak Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-12-02 11:30:13 -05:00
EdB	745b1f5503	clover: clCompileProgram CL_INVALID_COMPILER_OPTIONS clCompileProgram should return CL_INVALID_COMPILER_OPTIONS instead of CL_INVALID_BUILD_OPTIONS Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-12-02 11:05:03 -05:00
Eric Anholt	29c7cf2b2b	vc4: Pair up QPU instructions when scheduling. We've got two mostly-independent operations in each QPU instruction, so try to pack two operations together. This is fairly naive (doesn't track read and write separately in instructions, doesn't convert ADD-based MOVs into MUL-based movs, doesn't reorder across uniform loads), but does show a decent improvement on shader-db-2. total instructions in shared programs: 59583 -> 57651 (-3.24%) instructions in affected programs: 47361 -> 45429 (-4.08%)	2014-12-01 22:29:42 -08:00
Dave Airlie	7b0067d23a	r600g/sb: fix issues caused by GLSL switching to loops for switch Since `73dd50acf6` glsl: implement switch flow control using a loop The SB backend was falling over in an assert or crashing. Tracked this down to the loops having no repeats, but requiring a working break, initial code just called the loop handler for all non-if statements, but this caused a regression in tests/shaders/dead-code-break-interaction.shader_test. So I had to add further code to detect if all the departure nodes are empty and avoid generating an empty loop for that case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86089 Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-By: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-12-02 13:57:27 +10:00
Rob Clark	036f434ac2	freedreno/a4xx: alpha blend fixes Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-01 20:31:23 -05:00
Rob Clark	a7d91c33c2	freedreno/a4xx: fix DRAW initiator encoding of index size Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-01 20:31:23 -05:00
Rob Clark	81194ac767	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-01 20:31:23 -05:00
Matt Turner	5df88c2096	i965/vec4: Rewrite dead code elimination to use live in/out. Improves 359 shaders by >=10%, 114 shaders by >=20%, 91 shaders by >=30%, 82 shaders by >=40%, 22 shaders by >=50%, 4 shaders by >=60%, 2 shaders by >=80%. total instructions in shared programs: 5845346 -> 5822422 (-0.39%) instructions in affected programs: 364979 -> 342055 (-6.28%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-01 16:42:13 -08:00
Matt Turner	7a5cc789de	i965/vec4: Track liveness of the flag register. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-01 16:42:13 -08:00
Matt Turner	b449366587	i965/fs: Remove opt_drop_redundant_mov_to_flags(). Dead code elimination now handles this. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-01 16:42:13 -08:00
Matt Turner	b37273b924	i965/fs: Use const fs_reg & rather than a copy or pointer. Also while we're touching var_from_reg, just make it an inline function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-01 16:42:13 -08:00
Matt Turner	60d507c3c5	i965/fs: Dead code eliminate instructions writing the flag. Most prominently helps Natural Selection 2, which has a surprising number of shaders that do very complicated things before drawing black. instructions in affected programs: 21052 -> 16978 (-19.35%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-01 16:42:13 -08:00
Matt Turner	bf8deb5514	i965/fs: Track liveness of the flag register. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-01 16:42:13 -08:00
Matt Turner	13f6601585	i965: Use local pointer to block_data in live intervals. The next patch will be simplified because of this, and makes reading the code a lot easier. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-01 16:42:13 -08:00
Matt Turner	a50915984f	i965/vec4: Make live_intervals part of the vec4_visitor class. Like in fs_visitor. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-01 16:42:13 -08:00
Matt Turner	e4d0299089	i965/fs: Treat the FB_WRITE as predicated if we're discarding. Pre-Haswell hardware couldn't actually predicate it, but it's easier to pretend as if it's predicated in the visitor since it will generate a MOV from f0.1. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-01 16:42:13 -08:00
Matt Turner	f1e5418f40	i965: Don't treat IF or WHILE with cmod as writing the flag. Sandybridge's IF and WHILE instructions can do an embedded comparison with conditional mod. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-01 16:42:12 -08:00
Matt Turner	937ddb419d	i965/disasm: Disassemble tdr and tm registers properly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-01 16:42:12 -08:00
Jordan Justen	cd1b0f04be	main, glsl: Bump max known desktop glsl version to 4.50 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-01 16:20:21 -08:00
Jordan Justen	307d22abb0	glsl/cs: Change gl_WorkGroupSize from ivec3 to uvec3 As documented in: https://www.opengl.org/registry/specs/ARB/compute_shader.txt const uvec3 gl_WorkGroupSize; Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-01 16:20:21 -08:00
Jonathan Gray	31a46fb7a5	i965: avoid anonymous struct in float <-> VF conversions Anonymous structures are only supported with newer versions of GCC. They will not work with GCC 4.2.1 used by OpenBSD or GCC 4.4.7 shipped with RHEL6 going by a commit to fix a similar problem in radeonsi earlier in the year (`74388dd24b`). Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jonathan Gray <jsg@jsg.id.au>	2014-12-01 16:13:08 -08:00
Brian Paul	991d5cf8ce	mesa: fix arithmetic error in _mesa_compute_compressed_pixelstore() We need parentheses around the expression which computes the number of blocks per row. Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>	2014-12-01 16:30:55 -07:00
Brian Paul	691170b9c7	vbo: also print buffer object pointer in vbo_print_vertex_list() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-01 16:30:39 -07:00
Brian Paul	1e14aaa8f9	mesa: some improvements for print_list() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-01 16:30:17 -07:00
Brian Paul	c407c6d588	mesa: inline/remove _mesa_polygon_stipple() Was not called from any other place. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-01 16:30:12 -07:00
Brian Paul	f54162857c	svga: fix comment typo	2014-12-01 16:30:12 -07:00
Brian Paul	953847e5a8	mesa: remove unused functions in prog_execute.c Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-12-01 16:29:55 -07:00
Brian Paul	cd8a7258b8	mesa: update glext.h to version 20141118	2014-12-01 15:22:20 -07:00
Brian Paul	ded14afa42	gallium: add include path to fix building of pipe-loader code The pipe-loader code wasn't finding util/u_atomic.h Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-01 15:22:08 -07:00
José Fonseca	0806bf8815	graw: Avoid 'near'/'far' variables. They are defined by windows.h, which got included slightly more frequently than before with u_atomic.h	2014-12-01 20:24:51 +00:00
Matt Turner	120426b13d	i965/fs: Clean up some whitespace in reg_allocate. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-12-01 11:32:56 -08:00
Matt Turner	2e007fd621	ra: Don't use regs as the ralloc context. The i965 backends pass something out of 'screen', which is allocated per-process, making using this as a ralloc context not thread-safe. All callers of ra_alloc_interference_graph() already ralloc_free() its return value. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-12-01 11:32:54 -08:00
Matt Turner	933c678776	i965: Initialize INTEL_DEBUG once per process. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-12-01 11:32:52 -08:00
Matt Turner	82811ff176	i965: Initialize compaction tables once per process. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-12-01 11:32:51 -08:00
Matt Turner	9db278d0e2	glsl: Initialize static temporaries_allocate_names once per process. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-12-01 11:32:48 -08:00
José Fonseca	a5299e9e1c	util/u_atomic: Fix the unlocked implementation. It was totally broken: - p_atomic_dec_zero() was returning the negation of the expected value - p_atomic_inc_return()/p_atomic_dec_return() was post-incrementing/decrementing, hence returning the old value instead of the new - p_atomic_cmpxchg() was returning the new value on success, instead of the old It is clear this was never used in the past. I wonder if it wouldn't be better to yank it altogether. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-01 11:28:45 -08:00
José Fonseca	ff80b92a58	util/u_atomic: Add a simple test. It was much easier for me to verify things build and run as expected with this simple test, than building and testing whole Mesa. With scons the test can be built and run merely by doing: scons u_atomic_test Building the test with autotools is left as a future exercise. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-12-01 11:28:45 -08:00
Matt Turner	6df72e970c	util: Make u_atomic.h typeless. like how C11's stdatomic.h provides generic functions. GCC's __sync_* builtins already take a variety of types, so that's simple. MSVC and Sun Studio don't, but we can implement it with something that looks a little crazy but is actually quite readable. Thanks to Jose for some MSVC fixes! Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-01 11:28:45 -08:00
Matt Turner	41b5858a2f	util: Use stdbool.h's bool rather than "boolean". Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-01 11:28:45 -08:00
Matt Turner	2879a77a37	util: Remove u_atomic.h's GCC inline assembly. GCC >= 4.1 supports the __sync_* intrinsics. That seems like a sufficiently old baseline. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-01 11:28:45 -08:00
Matt Turner	972f8458f1	util: Remove u_atomic.h's MSVC inline assembly. There was already an intrinsics path that implemented all of the same functions, plus more. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-01 11:28:45 -08:00
Matt Turner	504062be2a	util: Remove u_atomic.h's Gallium dependence. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-01 11:28:45 -08:00
Matt Turner	4abd20e261	util: s/INLINE/inline/ in u_atomic.h. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-01 11:28:44 -08:00
Matt Turner	ccad3829e3	util: Move u_atomic.h to src/util. To be shared outside of Gallium. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-12-01 11:28:44 -08:00
Eric Anholt	3fe4d8e1e3	vc4: Introduce scheduling of QPU instructions. This doesn't reschedule much currently, just tries to fit things into the regfile A/B write-versus-read slots (the cause of the improvements in shader-db), and hide texture fetch latency by scheduling setup early and results collection late (haven't performance tested it). This infrastructure will be important for doing instruction pairing, though. shader-db2 results: total instructions in shared programs: 61874 -> 59583 (-3.70%) instructions in affected programs: 50677 -> 48386 (-4.52%)	2014-12-01 11:00:23 -08:00
Eric Anholt	6958c404ca	vc4: Drop the explicit scoreboard wait. This is actually implicitly handled by the TLB operations.	2014-12-01 11:00:23 -08:00
Eric Anholt	334036fb64	vc4: Also deal with VPM reads at thread end. Prevents a regression with QPU scheduling, which happens to put the no-op reads for unused VPM contents at the end of the program.	2014-12-01 11:00:23 -08:00
Eric Anholt	a7b1a93137	vc4: Fix assertion about SFU versus texturing. We're supposed to be checking that nothing else writes r4, which is done by the TMU result collection signal, not the coordinate setup. Avoids a regression when QPU instruction scheduling is introduced.	2014-12-01 11:00:23 -08:00
Eric Anholt	2d5784c825	vc4: Add another check for invalid TLB scoreboard handling. This was caught by an assertion in the simulator.	2014-12-01 11:00:23 -08:00
Rob Clark	bb19f2c3c4	freedreno/a4xx: invalidate cache when vbo's change Otherwise vertex shader can see stale cache data. This in particular happens when the same vbo is updated and reused. Not sure yet if vbo's at differing addresses but bound to same vertex buffer slot could have issues, but seems safest to flush whenever new vertex buffers are bound. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-12-01 12:02:25 -05:00
Ilia Mirkin	ebbd34a468	st/mesa: avoid exposing EXT_texture_integer for pre-GLSL 1.30 For drivers building up to GL(ES)3, only expose the actual extension if the API will let it be used (e.g. via overrides/debug flags that enable higher versions). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-11-30 13:04:29 -05:00
Ilia Mirkin	4907c31385	freedreno/a3xx: add missing integer formats and enable rendering The mesa state tracker doesn't fall back on similar integer formats, so they must all be provided. Remove the restriction against integer color rendering. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-30 13:04:28 -05:00
Ilia Mirkin	82104c19f3	freedreno/a3xx: enable sampling from integer textures We need to produce a u32 destination type on integer sampling instructions, so keep that in a shader key set based on the currently-bound textures. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-30 13:04:28 -05:00
Ilia Mirkin	8e336ef55b	freedreno: allow each generation to hook into sampler view setting Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-30 13:04:28 -05:00
Ilia Mirkin	618ff11457	freedreno/a3xx: don't use half precision shaders for int/float32 Integer outputs end up getting mangled due to cov.f32f16, and float32 loses precision. Use full precision shaders in both of those cases. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-30 13:04:28 -05:00
Ilia Mirkin	f866446e8c	freedreno/a3xx: disable blending for integer formats Also add support for the BLENDABLE bind flag, similarly predicated on non-int formats. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-30 13:04:28 -05:00
Ilia Mirkin	8e147e9ec8	freedreno/a3xx: remove blend clamp enables from gmem/clears Just pass the data through unmolested. This probably has no effect since blending isn't actually enabled. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-30 13:00:41 -05:00
Ilia Mirkin	d63afe3b58	freedreno/a3xx: add format to emit info, use to set sint/uint flags Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-30 13:00:41 -05:00
Ilia Mirkin	5d95e99622	freedreno/a3xx: add 16-bit unorm/snorm texture formats Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-30 13:00:41 -05:00
Ilia Mirkin	547182977f	freedreno/ir3: remove unused arg parameter Leaving it around in the struct in case we want to use it later. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-30 13:00:22 -05:00
Ilia Mirkin	de83ef677f	freedreno/ir3: fix UMAD Looks like none of the mad variants do u16 * u16 + u32, so just add in the extra value "by hand". Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>	2014-11-30 13:00:22 -05:00
Rob Clark	66f694b16c	freedreno/a4xx: stencil fixes Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-11-30 10:44:09 -05:00
Rob Clark	5b46670487	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-11-30 10:44:03 -05:00
Rob Clark	3e698ebf44	freedreno/a4xx: add render target format to fd4_emit This lets us move emitting SP_FS_MRT_REG back to fd4_program_emit. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-11-30 10:43:50 -05:00
Ilia Mirkin	4aec928ca4	freedreno/a3xx: unify vertex/texture formats into a single table The table contains all the relevant information about each format. The helper functions now just do lookups in the table. Note that this adds support for a lot of formats that were previously unsupported. Additionally it adds disabled support for integer render buffers, which will require more work to actually enable. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-11-29 12:15:43 -05:00
Ilia Mirkin	20fbf99595	freedreno/a3xx: rename vertex/texture format enums to be more consistent Switch both of them from independently inconsistent conventions to having UINT/SINT/UNORM/SNORM/FLOAT/FIXED suffixes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-11-29 12:15:43 -05:00
Ilia Mirkin	3338bfcf49	freedreno/a3xx: fd3_util -> fd3_format All the "util" helpers are actually format-related Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-11-29 12:15:43 -05:00
Ilia Mirkin	3de9fa8ff4	freedreno/a3xx: only enable blend clamp for non-float formats This fixes arb_color_buffer_float-render GL_RGBA16F. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-11-29 12:15:43 -05:00
Kenneth Graunke	67c498086d	i965: Add _CACHE_ in brw_cache_id enum names. BRW_CACHE_VS_PROG is more easily associated with program caches than plain BRW_VS_PROG. While we're at it, rename BRW_WM_PROG to BRW_CACHE_FS_PROG, to move away from the outdated Windowizer/Masker name. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-29 02:18:47 -08:00
Kenneth Graunke	e563c33d57	i965: Move CACHE_NEW_SAMPLER to BRW_NEW_SAMPLER_STATE_TABLE. This flag signifies that we've emitted a new SAMPLER_STATE table. Given that we haven't cached those in years, CACHE_NEW_SAMPLER isn't a great name. Putting it in the BRW_NEW_* hierarchy would make more sense; BRW_NEW_SAMPLER_STATE_TABLE better reflects its actual purpose. When this flag is raised, the pointer to the SAMPLER_STATE table has changed, so we need to re-issue any packets which point to it (unit state on Gen4-5, 3DSTATE_SAMPLER_STATE_POINTERS on Gen6, and the per-stage variants on Gen7+). Saves 2 * sizeof(void *) bytes per context, as we remove useless aux_compare/aux_free function pointers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-29 02:18:46 -08:00
Kenneth Graunke	324368b500	i965: Move some /* CACHE_NEW_SAMPLER */ comments. Marking brw_stage_state::sampler_count as CACHE_NEW_SAMPLER is wrong. The number of samplers used by each program is actually computed at draw time (brw_try_draw_prims), based purely on the currently bound shader programs (gl_program::SamplersUsed). CACHE_NEW_SAMPLER means that we've emitted a new SAMPLER_STATE table. Although this could indicate that the number of samplers has changed, it could also simply mean that the contents of the table have changed (i.e. we've bound different textures). The real reason these atoms depend on CACHE_NEW_SAMPLER is because they include a pointer to the SAMPLER_STATE table. This was not commented. So, move the comments to the appropriate place. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-29 02:18:44 -08:00
Kenneth Graunke	66ebfad3cd	i965: Move CACHE_NEW_*_VP flags to BRW_NEW_*_VP. We've been streaming these out for ages, so they basically have nothing to do with brw_state_cache.c. Saves 6 * sizeof(void *) bytes per context, as we won't have useless aux_compare/aux_free functions for them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-29 02:18:42 -08:00
Kenneth Graunke	4d67b6ab9a	i965: Fold the gen7_cc_viewport_state_pointer atom into brw_cc_vp. These always happen together; the extra atom just means another item to iterate through, flags to check, and a call through a function pointer. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-29 02:18:40 -08:00
Kenneth Graunke	f421db70ba	i965: Combine CACHE_NEW_*_UNIT into BRW_NEW_GEN4_UNIT_STATE. On Gen4-5, unit state is specified as indirect state, rather than commands. If any unit state changes, we upload it via brw_state_batch and arrange for 3DSTATE_PIPELINED_POINTERS to be re-emitted, which updates pointers to all unit state at once. Since there's only one command and state atom (brw_psp_urb_cs) that needs to know about this, there's no benefit to having six separate flags. We can combine CACHE_NEW_*_UNIT into a single flag. We also haven't cached these in a long time, so it doesn't make sense to use the "CACHE_NEW_" prefix. Instead, use the "BRW_NEW_" prefix. This also saves 12 * sizeof(void *) bytes of memory per context, as we remove useless aux_compare/aux_free functions for each CACHE bit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-29 02:18:38 -08:00
Kenneth Graunke	bea9b8e306	i965: Alphabetize brw_tracked_state flags and use a consistent style. Most of the dirty flags were listed in some arbitrary order. Some used bonus parenthesis. Some put multiple flags on one line, others put one per line. Some used tabs instead of spaces...but only on some lines. This patch settles on one flag per line, in alphabetical order, using spaces instead of tabs, and sheds the unnecessary parentheses. Sorting was mostly done with vim's visual block feature and !sort, although I alphabetized short lists by hand; it was pretty manual. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-29 02:18:36 -08:00
Christoph Bumiller	f3b4b263c2	nv50/ir/tgsi: handle TGSI_OPCODE_ARR This instruction is used by st/nine. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2014-11-28 19:17:52 -05:00
Kenneth Graunke	133280120b	i965: Set prog_data->uses_kill if simulating alpha test via discards. When using MRT on Gen4-5, we have to simulate GL's alpha test feature by emitting discards in the fragment shader. In this case, it makes sense to set prog_data->uses_kill, which means the fragment shader may kill pixels via the discard mechanism. This saves us from having to look at an extra key value in a couple of places, including in the generator. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-27 20:25:24 -08:00
Kenneth Graunke	06372c3fa9	i965: Use brw_wm_prog_data::uses_kill, not gl_fragment_program::UsesKill Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-27 20:25:23 -08:00
Kenneth Graunke	a0f8b363c0	i965/fs: Pass key->render_to_fbo via src1 of FS_OPCODE_DDY_*. This means the generator doesn't have to look at the key, which is a little nicer - we're pretty close to no key dependencies at all. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-27 20:25:19 -08:00
Kenneth Graunke	cea37f0911	i965/fs: Handle derivative quality decisions in the front-end. Kristian noted that there's very little use of brw_wm_prog_key in the generator, and that it basically just generates what it's told, without caring about what stage it's handling. One exception to this is derivative handling. When handling dFdxCoarse and dFdxFine, we packed an enum value in a second source register, explicitly telling the generator what to do. For dFdx, we specified an enum value of "please use the hint", then checked the program key in the generator level code. A natural method is to define separate FS_OPCODE_DD[XY]_{COARSE,FINE} opcodes, and have the front-end (which already decides what IR to generate based on the program key) decide which dPdx/dPdy should correspond to. This consolidates the decision making in one place. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-27 20:25:14 -08:00
Kenneth Graunke	2315ae6653	i965: Create prog_data temporary variables in PS state upload code. prog_data->foo is a bit more readable than brw->wm.prog_data->foo. The local variable definition is also a great location to put the obligatory /* CACHE_NEW_WM_PROG */ comment. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-11-27 20:24:24 -08:00
Kenneth Graunke	6a1c1fd503	i965: Fix missing CACHE_NEW_WM_PROG in 3DSTATE_PS_EXTRA. brw->wm.prog_data is covered by CACHE_NEW_WM_PROG, not BRW_NEW_FRAGMENT_PROGRAM. So, we should listen to it. However, I believe that BRW_NEW_FRAGMENT_PROGRAM is sufficient to cover all the necessary cases - CACHE_NEW_WM_PROG happens in a subset of cases. So, the code being wrong shouldn't have triggered bugs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-11-27 20:24:15 -08:00
Ilia Mirkin	e928b1e65b	nv50: remove ancient map of rt formats Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-27 16:51:31 -05:00
Ilia Mirkin	37fe347542	freedreno/ir3: don't pass consts to madsh.m16 in MOD logic madsh.m16 can't handle a const in src1, make sure to unconst it Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@gmail.com> Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>	2014-11-27 14:25:36 -05:00
Romain Failliot	b340469f33	docs: Set llvmpipe and softpipe note only for MSAA. Right now, in mesamatrix.net, the footnote is set so that it seems to be for all the features, while actually it only applies to MSAA. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-11-27 18:03:26 +01:00
Neil Roberts	c97cbd7e3d	glsl: Use | action in the lexer source to avoid duplicating the float action Flex and lex have a special action ‘|’ which means to use the same action as the next rule. We can use this to reduce a bit of code duplication in the rules for the various float literal formats. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-27 11:43:59 +00:00
Neil Roberts	9d8aa88693	glsl: Disallow float literals with the 'f' suffix but no point or exponent According to the GLSL spec float literals like ‘1f’ shouldn't be allowed without adding a decimal point or an exponent. Apparently the AMD driver also disallows this so it seems unlikely that anything would be relying on it. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-27 11:43:17 +00:00
Dave Airlie	91a827624c	r600g: make llvm code compile this time Actually compiling the code helps make it compile. Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-11-27 14:11:23 +10:00
Dave Airlie	b10ddf962f	r600g: fix fallout from last patch I accidentally rebased from the wrong machine and missed some fixes that were on my r600 box. doh. this fixes a bunch of geom shader textureSize tests on rv635 from gpu reset to pass. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86760 Reported-by: wolput@onsneteindhoven.nl Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-11-27 13:12:41 +10:00
Dave Airlie	07ae69753c	r600g: merge the TXQ and BUFFER constant buffers (v1.1) We are using 1 more buffer than we have, although in the future the driver should just end up using one buffer in total probably, this is a good first step, it merges the txq cube array and buffer info constants on r600 and evergreen. This should in theory fix geom shader tests on r600. v1.1: fix comments from Glenn. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-11-27 10:31:38 +10:00
Matt Turner	bc5f5424e3	glapi: Remove dead mesadef.py. Dead since commit `4e120c97`, in which apiparser (which mesadef.py imports) was removed. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-11-26 20:31:15 +00:00
José Fonseca	37b2a29d3b	mesa/gdi: Don't pretend mesa.def is auto generated. Just use the same entrypoints we use for st/wgl's opengl32.dll. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-11-26 20:31:14 +00:00
José Fonseca	cb009bdd44	st/wgl: Don't export wglGetExtensionsStringARB. It's not exported by the official opengl32.dll either. Applications are supposed to get it via wglGetProcAddress(), not GetProcAddress(). Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-11-26 20:31:11 +00:00
José Fonseca	5fdb6d6839	mapi/glapi: Fix dll linkage of GLES1 symbols. This fixes several MSVC warnings like: warning C4273: 'glClearColorx' : inconsistent dll linkage In fact, we should avoid using `__declspec(dllexport)` altogether, and use exclusively the .DEF instead, which gives more precise control of which symbols must be exported, but all the public GL/GLES headers practically force us to pick between `__declspec(dllexport)` or `__declspec(dllimport)`. Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-11-26 20:31:07 +00:00
José Fonseca	4b6e93650c	util/u_snprintf: Don't redefine HAVE_STDINT_H as 0. We now always guarantee availability of stdint.h on MSVC -- if MSVC doesn't supply one we use our own. Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-11-26 20:30:58 +00:00
José Fonseca	29557a1fa8	gallivm: Removed unused variable. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-11-26 20:25:12 +00:00
José Fonseca	a0ddc54777	draw,gallivm,llvmpipe: Avoid implicit casts of 32-bit shifts to 64-bits. Addresses MSVC warnings "result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)", which can often be symptom of bugs, but in these cases were all benign. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-11-26 20:25:12 +00:00
José Fonseca	aef3a01d57	scons: Generate SSE2 floating-point arithmetic. - SSE2 is available on all x86 processors we care about. - It's recommended by Intel: https://software.intel.com/en-us/blogs/2012/09/26/gcc-x86-performance-hints - And has been the default since MSVC 2012: http://msdn.microsoft.com/en-us/library/7t5yh4fd(v=vs.110).aspx Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-11-26 20:25:12 +00:00
José Fonseca	0473577f91	scons: Remove dead code/comments. - Remove no-op if-clause. - -mstackrealign has been enabled again on MinGW for quite some time and appears to work alright nowadays. - Drop -mmmx option as it is implied by -msse, and we don't use MMX intrinsics anyway. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-11-26 20:25:12 +00:00
Axel Davy	a10bf5c10c	st/nine: fix formatting in query9 (cosmetic) Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-11-26 20:09:12 +00:00
Axel Davy	d52328fc39	st/nine: Fix setting of the shift modifier in nine_shader It is an sint_4, but it was stored in a uint_8... The code using it was acting as if it was signed. Problem found thanks to Coverity Cc: "10.4" <mesa-stable@lists.freedesktop.org> Tested-by: David Heidelberg <david@ixit.cz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-11-26 20:09:12 +00:00
David Heidelberg	90fea6b3e0	st/nine: remove unused pipe_viewport_state::translate[3] and scale[3] `2efabd9f5a` removed them as unused. This caused random memory overwrites (reported by Coverity). Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: David Heidelberg <david@ixit.cz>	2014-11-26 20:09:12 +00:00
Axel Davy	614d9387c7	st/nine: fix wrong variable reset Error detected by Coverity (COPY_PASTE_ERROR) Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: David Heidelberg <david@ixit.cz>	2014-11-26 20:09:12 +00:00
David Heidelberg	a99f31bced	st/nine: return GetAvailableTextureMem in bytes as expected (v2) PIPE_CAP_VIDEO_MEMORY returns the amount of video memory in megabytes, so it needs to be converted to bytes. Fixed Warframe memory detection. v2: also prepare for cards with more than 4GB memory Cc: "10.4" <mesa-stable@lists.freedesktop.org> Tested-by: Yaroslav Andrusyak <pontostroy@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: David Heidelberg <david@ixit.cz>	2014-11-26 20:09:11 +00:00
Axel Davy	4eea2496bc	st/nine: Add pool check to SetTexture (v2) D3DPOOL_SCRATCH is disallowed according to spec. D3DPOOL_SYSTEMMEM should be allowed but we don't handle it right for now. v2: Fixes segfault in SetTexture when unsetting the texture Cc: "10.4" <mesa-stable@lists.freedesktop.org> Tested-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-11-26 20:09:11 +00:00
Axel Davy	890f963d64	st/nine: properly declare constants (v2) Fixes "Error : CONST[20]: Undeclared source register" when running dx9_alpha_blending_material. Also artifacts on ilo. v2: also remove unused MISC_CONST Cc: "10.4" <mesa-stable@lists.freedesktop.org> Tested-by: David Heidelberg <david@ixit.cz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-11-26 20:09:11 +00:00
Stanislaw Halik	7f74b9d479	st/nine: call DBG() at more external entry points Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: David Heidelberg <david@ixit.cz> Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Stanislaw Halik <sthalik@misaki.pl>	2014-11-26 20:09:11 +00:00
Axel Davy	6aeae7442d	st/nine: rework the way D3DPOOL_SYSTEMMEM is handled This patch moves the data field from Resource9 to Surface9 and cleans D3DPOOL_SYSTEMMEM handling in Texture9. This fixes HL2 lost coast. It also removes in Texture9 some code written to support importing and exporting non D3DPOOL_SYSTEMMEM shared buffers. This code hadn't the design required to support the feature and wasn't used. Cc: "10.4" <mesa-stable@lists.freedesktop.org> Tested-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-11-26 20:09:10 +00:00
Axel Davy	133b2087c5	st/nine: Rework Basetexture9 and Resource9. Instead of having parts of the structures initialised by the parents, have them initialised by the children. Cc: "10.4" <mesa-stable@lists.freedesktop.org> Tested-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-11-26 20:09:10 +00:00
Axel Davy	104b5a8193	st/nine: clean device9ex. Pass ex specific parameters as arguments to device9 ctor instead of passing them by filling the structure. Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-11-26 20:09:10 +00:00
Emil Velikov	9b7037a369	nine: the .pc file should not follow mesa version The version provided by it should be the same as the one provided/handled by the module. Add the missing tiny version. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: David Heidelberg <david@ixit.cz>	2014-11-26 20:09:10 +00:00
Emil Velikov	c642e87d9f	auxiliary/vl: rework the build of the VL code Rather than shoving all the VL code for non-VL targets, increasing their size, just split it out and use it when needed. This gives us the side effect of building vl_winsys_dri.c once, dropping a few automake warnings, and reducing the size of the dri modules as below text data bss dec hex filename 5850573 187549 1977928 8016050 7a50b2 before/nouveau_dri.so 5508486 187100 391240 6086826 5ce0aa after/nouveau_dri.so The above data is for a nouveau + swrast + kms_swrast 'megadriver'. v2: Do not include the vl sources in the auxiliary library. v3: Rebase. Add nine. Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-26 20:09:09 +00:00
Emil Velikov	86a51eb861	auxiliary/vl: split the vl sources list into VL_SOURCES With follow up commit we'll split vl static lib from the auxiliary one, and choose the appropriate vl (galliumvl or galliumvl_stub) for the respective targets to link against. v2: Rebase. Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-26 20:09:09 +00:00
Emil Velikov	f093c1c8ec	auxiliary/vl: add galliumvl_stub.la Will be used by the non-VL targets, to stub out the functions called by the drivers. The entry point to those are within the VL state-trackers, yet the compiler cannot determine that at link time. Thus we'll need to stub them out to prevent unresolved symbols in the dri, egl, gbm and pipe-loader targets. v2: Rebase. Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-26 20:09:09 +00:00
Emil Velikov	2dbaedaf10	automake: rework VL dependency tracking Set a single VL_{CFLAG,LIBS} for xcb and friends, and let each target check for its relevant library alone. Required as with follow up commits we'll build aux/vl into a separate module, which needs VL_CFLAGS. Cleanup: add a couple of explicit LIBDRM_LIBS linking, as aux/vl itself requires libdrm, despite that LIBDRM_{RADEON,NOUVEAU...} may provide it as well. v2: Rebase. Make sure st/xvmc programs work. Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-26 20:08:40 +00:00
Emil Velikov	303bc3609a	configure: check the package version when auto-detecting the VL targets Otherwise we might end up automatically enabling the build, only to error out a couple of lines after that. Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-26 20:08:39 +00:00
Siavash Eliasi	8dc8c496e1	mesa: Permanently enable features supported by target CPU at compile time. This will remove the need for unnecessary runtime checks for CPU features if already supported by target CPU, resulting in smaller and less branchy code. V2: - Removed the SSSE3 related part for the not yet merged patch. - Avoiding redefinition of macros. Tested-by: David Heidelberg <david@ixit.cz>	2014-11-26 20:08:38 +00:00
Emil Velikov	752c2e9690	docs: add relnotes template for 10.5.0 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-26 18:00:17 +00:00
Timothy Arceri	b3721cd230	util: update hash type comments Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-11-26 20:04:13 +11:00
Matt Turner	531feec9dc	i965/vec4: Handle destination writemasks in VEC4_OPCODE_PACK_BYTES. Since pack_bytes expands to two mov(4) align1 instructions, we can't use swizzles directly. For an instruction like pack_bytes m4.y:UD, vgrf13.xyzw:UD we can write into the .y component by setting the offset based on the swizzle. Also while we're doing this, we can set the dependency control hints properly, so that a series of pack_bytes writing into separate components of a register can issue without blocking.	2014-11-25 17:29:02 -08:00
Matt Turner	70fcd56538	i965/vec4: Optimize packSnorm4x8(). Reduces the number of instructions needed to implement packSnorm4x8() from 13 -> 7.	2014-11-25 17:29:02 -08:00
Matt Turner	3532be7680	i965/vec4: Optimize packUnorm4x8(). Reduces the number of instructions needed to implement packUnorm4x8() from 11 -> 6.	2014-11-25 17:29:02 -08:00
Matt Turner	e14c7c7faf	i965/vec4: Add VEC4_OPCODE_PACK_4_BYTES. Will be used by emit_pack_{s,u}norm_4x8().	2014-11-25 17:29:02 -08:00
Matt Turner	94a30bbd4f	i965/vec4: Optimize unpackSnorm4x8(). Reduces the number of instructions needed to implement unpackSnorm4x8() from 16 -> 6. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-25 17:29:02 -08:00
Matt Turner	bf686b2785	i965/vec4: Optimize unpackUnorm4x8(). Reduces the number of instructions needed to implement unpackUnorm4x8() from 11 -> 4. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-25 17:29:02 -08:00
Matt Turner	cb0ba848d4	i965/vec4: Add vector float immediate infrastructure. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-25 17:29:02 -08:00
Matt Turner	5d23721c1d	i965/fs: Add vector float immediate infrastructure. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-25 17:29:02 -08:00
Matt Turner	276075f864	i965: Disassemble vector float immediates properly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-11-25 17:29:02 -08:00
Matt Turner	b2abf033e0	i965: Add unit test for float <-> VF conversions. Using Eric's original VF -> float conversion code to initialize the table.	2014-11-25 17:29:02 -08:00
Matt Turner	c37d798e78	i965: Add functions to convert float <-> VF. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-25 17:29:01 -08:00
Chris Forbes	0008d0e59e	i965/Gen6-7: Do not replace texcoords with point coord if not drawing points Fixes broken rendering in Windows-based QtQuick2 apps run through Wine. This library sets all texture units' GL_COORD_REPLACE, leaves point sprite mode enabled, and then draws a triangle fan. Will need a slightly different fix for Gen4-5, but I don't have my old machines in a usable state currently. V2: - Simplify patch -- the real changes are no longer duplicated across the Gen6 and Gen7 atoms. - Also don't clobber attr overrides -- which matters on Haswell too, and fixes the other half of the problem - Fix newly-introduced warnings V3: - Use BRW_NEW_GEOMETRY_PROGRAM and brw->geometry_program rather than core flag and state; keep the state flags in order. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.4" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84651 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-25 22:38:32 +13:00
Kenneth Graunke	60f011af1a	glsl: Make lower_constant_arrays_to_uniforms require dereferences. Ilia noticed that my lowering pass was converting the constant array used by textureGatherOffsets' offsets parameter to a uniform. This broke textureGather for Nouveau, and is generally a horrible plan, since it violates the GLSL constraint that offsets must be an immediate constant. When I wrote this pass, I neglected to consider whole array assignment. I figured opt_array_splitting would handle constant indexing, so this pass was really about fixing variable indexing. textureGatherOffsets is an example of whole array access that we really don't want to touch. Whole array copies don't appear to benefit from this either - they're most likely initializers for temporary arrays which are going to be mutated anyway. Since you're copying, you may as well copy from immediates, not uniforms. This patch makes the pass look for ir_dereference_arrays of ir_constants, rather than looking for any ir_constant directly. This way, it ignores whole array assignment. No shader-db changes or Piglit regressions on Haswell. Some Piglit tests generate different code (fixing textureGatherOffsets on Nouveau). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2014-11-24 15:30:09 -08:00
Kenneth Graunke	f0c91f32c0	i965: Precompile ARB programs. We already precompile GLSL programs; it seems logical to precompile ARB programs as well. We just never hooked it up. This also makes the programs compile even if no drawing occurs, which is useful for shader-db. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-24 15:30:09 -08:00
Kenneth Graunke	b55777f39d	i965: Make precompile functions accessible from C. Previously, the prototypes for brw_vs/gs/fs_precompile were scattered between brw_vs.h (C), brw_gs.h (C), and brw_fs.h (C++ only). Also, brw_fs_precompile had C++ linkage, while the others were C. This patch moves all the prototypes to a central location (brw_shader.h) and makes brw_fs_precompile have C linkage. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-24 15:30:09 -08:00
Kenneth Graunke	62b425448c	i965: Pass gl_program pointers into precompile functions. We'd like to do precompiling for ARB vertex and fragment programs, which only have gl_program structures - gl_shader_program is NULL. This patch makes the various precompile functions take a gl_program parameter directly, rather than accessing it via gl_shader_program. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-24 15:30:09 -08:00
Kenneth Graunke	d54925df9c	i965: Move brw->precompile checks out a level. brw_shader_precompile should just do a precompile; it makes more sense for the caller to decide whether we should do one. Simpler. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-24 15:30:09 -08:00
Roland Scheidegger	880424b8ad	llvmpipe: (trivial) remove redundant util_cpu_detect() call in lp_test_main Already called earlier.	2014-11-25 00:29:29 +01:00
Roland Scheidegger	8148a06b8f	llvmpipe: fix lp_test_arit denorm handling llvmpipe disables denorms on purpose (on x86/sse only), because denorms are generally neither required nor desired for graphic apis (and in case of d3d10, they are forbidden). However, this caused some arithmetic tests using denorms to fail on some systems, because the reference did not generate the same results anymore. (It did not fail on all systems - behavior of these math functions is sort of undefined when called with non-standard floating point mode, hence the result differing depending on implementation and in particular the sse capabilities.) So, for the reference, simply flush all (input/output) denorms manually to zero in this case. This fixes https://bugs.freedesktop.org/show_bug.cgi?id=67672. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-11-25 00:29:29 +01:00
Eric Anholt	93d30ff5d6	nouveau: Fix build after STR/BRA opcode dropping. I missed these while git grepping for users of the dead opcodes. Sigh, macros.	2014-11-24 15:22:25 -08:00
Eric Anholt	a3688d686f	mesa: Drop unused NV_fragment_program opcodes. The extension itself was deleted 2 years ago. There are still some prog_instruction opcodes from NV_fp that exist because they're used by ir_to_mesa.cpp, though. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-11-24 14:56:22 -08:00
Eric Anholt	868f95f1da	mesa: Drop unused SFL/STR opcodes. They're part of NV_vertex_program2, which I'm pretty sure we're never going to support. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-11-24 14:56:22 -08:00
Eric Anholt	365a4a3f9a	gallium: Drop the unused CND opcode. Nothing in the tree generates it. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-11-24 14:56:22 -08:00
Eric Anholt	00f7002c5c	gallium: Drop unused BRA opcode. Never generated, and implemented in only nvfx vertprog. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-11-24 14:56:22 -08:00
Eric Anholt	ecfe9e2ad2	gallium: Drop the unused SFL/STR opcodes. Nothing generated them. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-11-24 14:56:22 -08:00
Eric Anholt	dc00b382b5	gallium: Drop the unused RFL opcode. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-11-24 14:56:22 -08:00
Eric Anholt	8c822b1e91	gallium: Drop unused X2D opcode. Nothing in the tree generates it. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-11-24 14:56:22 -08:00
Eric Anholt	ff886c4955	gallium: Drop the unused ARA opcode. Nothing in the tree generated it. v2: Only drop ARA, not ARR as well. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> (v2)	2014-11-24 14:56:22 -08:00
Eric Anholt	de2f8d75db	gallium: Drop the unused RCC opcode. Nothing in the tree generated it. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-11-24 14:56:22 -08:00
Eric Anholt	d4864cdf15	gallium: Drop the NRM and NRM4 opcodes. They weren't generated in tree, and as far as I know all hardware had to lower it to a DP, RSQ, MUL. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-11-24 14:56:22 -08:00
Eric Anholt	7361d5ba63	ilo: Drop the explicit initialization of gaps in TGSI opcodes. The nice thing about the good way of initializing arrays like this is that you don't need to initialize everything in order, or even everything at all. Taking advantage of that only needs a tiny fixup to deal with the default NULL value of the pointers. I haven't dropped the initialization of opcodes that exist and are unsupported.	2014-11-24 14:56:22 -08:00
Eric Anholt	386c3fcb14	r300: Drop the "/* gap */" notes. This switch statement's code structure isn't dependent on the numbers of the opcodes at all.	2014-11-24 14:56:22 -08:00
Eric Anholt	2f01cc8417	r600: Drop the "/* gap */" notes. These are obviously the gaps already, due to the bare numbers with unsupported implementations. This makes inserting new gaps less irritating.	2014-11-24 14:56:22 -08:00
Jose Fonseca	925cb75f89	nine: Drop use of TGSI_OPCODE_CND. This was the only state tracker emitting it, and hardware was just having to lower it anyway (or failing to lower it at all). v2: Extracted from a larger patch by Jose (which also dropped DP2A), fixed to actually not reference TGSI_OPCODE_CND. Change by anholt. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: David Heidelberg <david@ixit.cz>	2014-11-24 14:56:22 -08:00
Jose Fonseca	56fd7c6361	nine: Don't reference the dead TGSI_OPCODE_NRM. The translation is lowering it to not using TGSI_OPCODE_NRM, anyway. v2: Extracted from a larger patch by Jose that also dropped DP2A usage. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: David Heidelberg <david@ixit.cz>	2014-11-24 14:56:22 -08:00
Eric Anholt	7c0acd8535	nine: Don't use the otherwise-dead SFL opcode in an unreachable path. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: David Heidelberg <david@ixit.cz>	2014-11-24 14:56:21 -08:00
Matt Turner	057e6e5251	i965/gen6/gs: Don't declare a src_reg with struct. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-24 14:09:23 -08:00
Matt Turner	ff966aff99	i965/disasm: Fix all32h/any32h predicate disassembly. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-11-24 14:09:23 -08:00
Matt Turner	b754e52532	glsl: Fix tautological comparison. Caught by clang. warning: comparison of constant -1 with expression of type 'ir_texture_opcode' is always false [-Wtautological-constant-out-of-range-compare] if (op == -1) ~~ ^ ~~ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-24 14:09:23 -08:00
Matt Turner	024db256d4	util: Prefer atomic intrinsics to inline assembly. Cuts a little more than 1k of .text size from i915g. This was previously done in commit `5f66b340` and subsequently reverted in commit `3661f757` after bug 30514 was filed. I believe the cause of bug 30514 wasn't anything related to cross compiling, but rather that the toolchain used defaulted to -march=i386, and i386 doesn't have the CMPXCHG or XADD instructions used to implement the intrinsics. So we reverted a patch that improved things so that we didn't break compilation for a platform that never could have worked anyway.	2014-11-24 14:09:23 -08:00
Matt Turner	99cebffda9	util: Implement assume() for clang. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-11-24 14:09:23 -08:00
Matt Turner	56ac25918a	i965: Don't overwrite the math function with conditional mod. Ben was asking about the undocumented restriction that the math instruction cannot use the dependency control hints. I went to reconfirm and disabled the is_math() check in opt_set_dependency_control() and saw that the disassembled math instructions with dependency hints had a bogus math function. We were mistakenly overwriting it by setting an empty conditional mod. Unfortunately, this wasn't the cause of the aforementioned problem (I reproduced it). This bug is benign, since we don't set dependency hints on math instructions -- but maybe some day. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-24 14:07:32 -08:00
Matt Turner	f5bef2d2e5	i965: Assert that math instructions don't have conditional mod. The math function field is at the same location as conditional mod. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-24 14:06:39 -08:00
Matt Turner	803a744507	glsl: Remove unused ast copy constructors. These were added in commits `a760c738` and `43757135` to be used in implementing C-style aggregate initializers (commit `1b0d6aef`). Paul rewrote that code in commit `0da1a2cc` to use GLSL types, rather than AST types, leaving these copy constructors unused. Tested by making them private and providing no definition.	2014-11-24 14:06:39 -08:00
Matt Turner	baff470823	glapi: Remove dead gl_offsets.py. Dead since commit `07b85457`.	2014-11-24 14:02:54 -08:00
Matt Turner	76ef547be7	glapi: Remove dead extension_helper.py. Dead since commit `3d16088f`.	2014-11-24 14:02:54 -08:00
Eric Anholt	52a7cb2ec4	vc4: Fix some inconsistent indentation.	2014-11-24 12:37:33 -08:00
Eric Anholt	6f4adb7483	vc4: Don't forget to actually connect the fence code. I thought I'd tested this.	2014-11-24 12:37:33 -08:00
Eric Anholt	fa74ec7e98	vc4: Add a note about a piece of errata I've learned about. Right now in my environment I've only got a small CMA area, so this constraint ends up holding.	2014-11-24 12:37:33 -08:00
Chris Forbes	2b4fe85f0e	mesa: Fix Get(GL_TRANSPOSE_CURRENT_MATRIX_ARB) to transpose This was just returning the same value as GL_CURRENT_MATRIX_ARB. Spotted while investigating something else in apitrace. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-24 21:55:47 +13:00
Chris Forbes	129178893b	glsl: Generate unique names for each const array lowered to uniforms Uniform names (even for hidden uniforms) are required to be unique; some parts of the compiler assume they can be looked up by name. Fixes the piglit test: tests/spec/glsl-1.20/linker/array-initializers-1 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-24 21:07:56 +13:00
Chris Forbes	adefccd12a	i965: Handle nested uniform array indexing When converting a uniform array reference to a pull constant load, the `reladdr` expression itself may have its own `reladdr`, arbitrarily deeply. This arises from expressions like: a[b[x]] where a, b are uniform arrays (or lowered const arrays), and x is not a constant. Just iterate the lowering to pull constants until we stop seeing these nested. For most shaders, there will be only one pass through this loop. Fixes the piglit test: tests/spec/glsl-1.20/linker/double-indirect-1.shader_test Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-24 21:07:29 +13:00
Dave Airlie	c88385603a	r600g: do all CUBE ALU operations before gradient texture operations (v2.1) This moves all the CUBE section above the gradients section, so that the gradient emission happens on one block which is what sb/hardware expect. v2: avoid changes to bytecode by using spare temps v2.1: shame gcc, oh the shame. (uninit var warnings) Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-11-24 13:33:07 +10:00
Dave Airlie	38ec184419	r600: fix texture gradients instruction emission (v2) The piglit tests were failing, and it appeared to be SB optimising out things, but Glenn pointed out the gradients are meant to be clause local, so we should emit the texture instructions in the same clause. This moves things around to always copy to a temp and then emit the texture clauses for H/V. v2: Glenn pointed out we could get another ALU fetch in the wrong place, so load the src gpr earlier as well. Fixes at least: ./bin/tex-miplevel-selection textureGrad 2D Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-11-24 10:41:30 +10:00
Ilia Mirkin	fecae4625c	nv50,nvc0: buffer resources can be bound as other things down the line res->bind is not an indicator of how the resource is currently bound. buffers can be rebound across different binding points without changing underlying storage. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>	2014-11-23 15:43:28 -05:00
Ilia Mirkin	e80a0a7d9a	nv50,nvc0: actually check constbufs for invalidation The number of vertex buffers has nothing to do with the number of bound constbufs. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>	2014-11-23 15:43:27 -05:00
Ilia Mirkin	7d07083cfd	nv50/ir: set neg modifiers on min/max args Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=86618 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>	2014-11-23 15:43:27 -05:00
Chris Forbes	89b9ef937c	mesa: Fix function name in GetActiveUniformName error Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-11-23 15:04:15 +13:00
Stéphane Marchesin	3d9c1a9dd6	i915g: Fallback copy_render for ZS formats These don't work out of the box, need more work, maybe with a proxy format? Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>	2014-11-22 00:13:41 -08:00
Stéphane Marchesin	90207340c7	i915g: Add back 4444 and 5551 formats Now that we have the transfers working, we can re-add those formats. Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>	2014-11-22 00:13:40 -08:00
Stéphane Marchesin	1e47510df7	i915g: Don't limit blitter to POT textures Now that we have NPOT support for u_blitter, there is no reason to limit this any longer. Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>	2014-11-22 00:13:40 -08:00
Stéphane Marchesin	e30c799da9	i915g: Align all texture dimensions to the next POT This creates a usable layout for all NPOT textures. Of course these still have lots of limitations, but at least we can render to a level. Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>	2014-11-22 00:13:40 -08:00
Stéphane Marchesin	675019584c	i915g: Fix typos Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>	2014-11-22 00:13:40 -08:00
Stéphane Marchesin	2ed24b2c31	i915g: Fix maxlod computation. Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>	2014-11-22 00:13:39 -08:00
Stéphane Marchesin	0220a428d7	i915g: Fix offset for level != 0 For NPOT texture layouts, we want to be able to access texture levels other than 0 directly. Since the hw doesn't support that, we do it by adding the offset directly. Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>	2014-11-22 00:13:39 -08:00
Stéphane Marchesin	a9b0787076	i915g: Don't write constants past I915_MAX_CONSTANT This happens with glsl-convolution-1, where we have 64 constants. This doesn't make the test pass (we don't have 64 constants anyway, only 32) but this prevents it from crashing. Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>	2014-11-22 00:13:39 -08:00
Stéphane Marchesin	5f61744adb	i915g: Don't hardcode array size for phase count This is an array of temp registers, so use I915_MAX_TEMPORARY for the size. Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>	2014-11-22 00:13:39 -08:00
David Heidelberg	25b00f4617	draw: allow LLVM use on non-SSE2 X86 cpus This patch removes a workaround related to an LLVM < 3.2 bug. The original bug was closed as fixed in 2011. At this moment gallium requires LLVM 3.3 (2013). LLVM has been tested without SSE2 support in commit `ca70de9bd2` and removed after requiring LLVM 3.3 in commit `013ff2fae1` Original LLVM bug: http://llvm.org/bugs/show_bug.cgi?id=6960 Signed-off-by: David Heidelberg <david@ixit.cz> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-11-22 04:29:00 +00:00
Emil Velikov	7d854c9771	docs: add news item and link release notes for mesa 10.3.4 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-22 04:26:06 +00:00
Emil Velikov	34616bc922	docs: Add sha256 sums for the 10.3.4 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `72c27d7a3a`)	2014-11-22 04:24:32 +00:00
Emil Velikov	9e168ad903	Add release notes for the 10.3.4 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `26c8ecd85d`)	2014-11-22 04:24:29 +00:00
Kenneth Graunke	a746be259d	i965: Make Gen4-5 push constants call _mesa_load_state_parameters too. In commit `5e37a2a4a8`, I made the pull constant code stop calling _mesa_load_state_parameters() when there were no pull parameters. This worked fine on Gen6+ because the push constant code also called it if there were any push constants. However, the Gen4-5 push constant code wasn't doing this. This patch makes it do so, like the Gen6+ code. A better long term solution would be to make core Mesa just handle this for us when necessary. Fixes around 8766 Piglit tests on Ironlake, and probably Gen4 as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Mark Janes <mark.a.janes@intel.com>	2014-11-21 16:25:17 -08:00
Ben Widawsky	88fea85f09	i965/vec4/gen8: Handle the MUL dest hazard exception Fix one of the few cases where we can't reliably touch the destination hazard bits. I am explicitly doing this patch individually so it is easy to backport. I was tempted to do this patch before the previous patch which reorganized the code, but I believe even doing that first, this is still easy to backport. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84212 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-21 12:08:46 -08:00
Ben Widawsky	156f565f9e	i965/vec4: Extract depctrl hazards Move this to a separate function so that we can begin to add other little caveats without making too big a mess. NOTE: There is some desire to improve this function eventually, but we need to fix a bug first. v2: Use const for the inst for the hazard check (Matt) Invert safe logic to get rid of the double negative (Matt) Add PRM reference for predicates (Matt) Add note about empirical evidence for math (Matt) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-21 12:08:46 -08:00
Matt Turner	40c0d79d29	i965/fs: Remove is_valid_3src(). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-21 10:26:44 -08:00
Matt Turner	0777775274	i965/fs: Remove is_valid_3src() checks from emit_lrp. The visitor emits MOVs to temporary registers for immediates, so these never trigger. For further proof, check case ir_triop_fma. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-21 10:26:44 -08:00
Matt Turner	1fdc75fde4	i965/fs: Remove unused apply_stride(). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-21 10:26:44 -08:00
Matt Turner	279c1c80b6	i965/fs: Move ip_record class to its one use. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-21 10:26:44 -08:00
Matt Turner	d9432af45a	i965: Move common fields into backend_instruction. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-11-21 10:26:42 -08:00
Matt Turner	bd50213929	i965: Combine offset/texture_offset fields. texture_offset was only used by some texturing operations, and offset was only used by spill/unspill and some URB operations. These fields are never used at the same time. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-11-21 10:26:38 -08:00
Marek Olšák	645b471d61	radeonsi: use minnum and maxnum LLVM intrinsics for MIN and MAX opcodes So far it has been compiled into pretty ugly code (8 instructions or so for either opcode). Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-11-21 12:15:58 +01:00
Eric Anholt	21577571b3	vc4: Update for new kernel ABI with async execution and waits. Our submits now return immediately and you have to manually wait for things to complete if you want to (like a normal driver).	2014-11-20 13:07:07 -08:00
Ville Syrjälä	390799c496	i915: Only use TEXCOORDTYPE_VECTOR with cube maps on gen2 Check that the target is GL_TEXTURE_CUBE_MAP before emitting TEXCOORDTYPE_VECTOR texture coordinates. I'm not sure if the hardware would like CARTESIAN coordinates with cube maps, and as I'm too lazy to find out just emit the VECTOR coordinates for cube maps always. For other targets use CARTESIAN or HOMOGENOUS depending on the number of texture coordinates provided. Fixes rendering of the "electric" background texture in chromium-bsu main menu. We appear to be provided with three texture coordinates there (I'm guessing due to the funky texture matrix rotation it does). So the code would decide to use TEXCOORDTYPE_VECTOR instead of TEXCOORDTYPE_CARTESIAN even though we're dealing with a 2D texture. The results weren't what one might expect. demos/cubemap still works, which hopefully indicates that this doesn't break things. Also tested with: bin/glean -o -v -v -v -t +texCube --quick bin/cubemap -auto from piglit. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2014-11-20 21:58:57 +02:00
Ben Widawsky	ca39c46c3b	i965/disasm: Properly decode branch_ctrl (gen8+) Add support for decoding the new branch control bit. I saw two things wrong with the existing code. 1. It didn't bother trying to decode the bit. - While we do not intentionally emit this bit today, I think it's interesting to see if we somehow ended up with the bit set. It may also be useful in the future. 2. It seemed to be the wrong bit. - The docs are pretty poor wrt which bit this actually occupies. To me, it /looks/ like it should be bit 28. I am not sure where Ken got 30 from. I verified it should be 28 by looking at the simulator code. I also added the most basic support for GOTO simply so we don't need to remember to change the function in the future. v2: Move the branch_ctrl check out of the if gen >= 6 check to make it more readable. (Matt) ENDIF doesn't have branch_ctrl (Matt + Ken) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-20 09:45:23 -08:00
José Fonseca	56bf948e11	rtasm,translate: Re-enable SSE on Mingw64. This reverts `f4dd099171`. The src/gallium/tests/unit/translate_test.c gives the same results on MinGW 64-bits as on Linux 64-bits. And since MinGW is often used for development/testing due to its convenience, it's better not to have this sort of differences relative to MSVC. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-11-20 14:11:36 +00:00
Kenneth Graunke	5e37a2a4a8	i965: Skip _mesa_load_state_parameters when there are zero parameters. Saves a tiny bit of CPU overhead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Eric Anholt <eric@anholt.net>	2014-11-20 01:56:54 -08:00
Marek Olšák	6f7371619c	radeonsi: remove unused variable si_state_dsa::db_render_control	2014-11-19 21:42:14 +01:00
Roland Scheidegger	763fc526c7	llvmpipe: enable PIPE_CAP_TGSI_VS_LAYER_VIEWPORT No changes required in the driver itself, all handled by draw. piglit results in a quick run: skip->pass 7 skip->fail 2 (The new failures in the ARB_fragment_layer_viewport group are expected, we fail the same if gs doesn't write these outputs regardless of the vs.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-11-19 18:35:30 +01:00
Roland Scheidegger	4b6d6642d2	draw: fixes for vertex shaders outputting layer or viewport index Mostly add a couple cases so we don't just check gs for this. There's only one gotcha, the built-in vp transform in the llvm vs can't handle it (this would be fixable though non-trivial due to vp index being non-constant for the SoA outputs, but we don't use it if there's a gs either - the whole clip/vp transform integration there is suboptimal). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-11-19 18:35:30 +01:00
Michael Varga	9460cd39e8	st/va: surface: render subpicture Signed-off-by: Michael Varga <Michael.Varga@amd.com>	2014-11-19 09:29:11 -05:00
Michael Varga	7523db174e	st/va: subpicture implementation added BGRA format create/destroy set image associate/deassociate Signed-off-by: Michael Varga <Michael.Varga@amd.com>	2014-11-19 09:29:11 -05:00
Michael Varga	05e225b558	st/va: added internal storage for VAImage and BGRA format When calling vaCreateImage() an internal copy of VAImage is maintained since the allocation of "image" may not be guaranteed to live long enough. Signed-off-by: Michael Varga <Michael.Varga@amd.com>	2014-11-19 09:29:11 -05:00
Michael Varga	7b4f233c1f	st/va: added some calls to handle_table_remove() In a few locations handles were being added but not removed. Signed-off-by: Michael Varga <Michael.Varga@amd.com>	2014-11-19 09:29:10 -05:00
Chad Versace	b69c7c5dac	i965: Fix segfault in WebGL Conformance on Ivybridge Fixes regression of WebGL Conformance test texture-size-limit [1] on Ivybridge Mobile GT2 0x0166 with Google Chrome R38. Regression introduced by commit `6c04423153` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Sun Feb 2 02:58:42 2014 -0800 i965: Bump GL_MAX_CUBE_MAP_TEXTURE_SIZE to 8192. The test regressed because the pointer offset arithmetic in intel_miptree_map_gtt() overflows for large textures. The pointer arithmetic is not 64-bit safe. [1] `52f0dc240f/sdk/tests/conformance/textures/texture-size-limit.html` Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=78770 Fixes: Intel CHRMOS-1377 Reported-by: Lu Hua <huax.lu@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-11-18 19:16:45 -08:00
Siavash Eliasi	80bffde0a2	mesa/main: Fix tmp_row memory leak in texstore_rgba_integer. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-11-18 14:55:39 -08:00
Jason Ekstrand	d76be6bd60	docs/GL3: Mark GL_ARB_direct_state_access as being started by Laura	2014-11-18 14:54:12 -08:00
Dave Airlie	1830138cc0	r600g: limit texture offset application to specific types (v2) For 1D and 2D arrays we don't want the other coordinates being offset and affecting where we sample. I wrote this patch 6 months ago but lost it. Fixes: ./bin/tex-miplevel-selection textureLodOffset 1DArray ./bin/tex-miplevel-selection textureLodOffset 2DArray ./bin/tex-miplevel-selection textureOffset 1DArray ./bin/tex-miplevel-selection textureOffset 1DArrayShadow ./bin/tex-miplevel-selection textureOffset 2DArray ./bin/tex-miplevel-selection textureOffset(bias) 1DArray ./bin/tex-miplevel-selection textureOffset(bias) 2DArray v2: rewrite to handle more cases and be consistent with code above. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-11-19 08:22:13 +10:00
Dave Airlie	d4c342f67e	r600g: geom shaders: always load texture src regs from inputs Otherwise we seem to lose the split_gs_inputs and try and pull from an uninitialised register. fixes 9 texelFetch geom shader tests. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-11-19 08:21:40 +10:00
Eric Anholt	82e919d33b	vc4: Emit semaphore instructions for new kernel ABI. Previously, the kernel would dispatch thread 0, wait, then dispatch thread 1. By insisting that the thread contents use semaphores in the right place, the kernel can sleep for longer by dispatching both threads at once.	2014-11-18 12:46:55 -08:00
Eric Anholt	05f165b62d	vc4: Mark a big array as const. Drops 1kb of code from this inner loop, in exchange for 2.5k of data.	2014-11-18 12:42:52 -08:00
Andres Gomez	1398ed724a	glsl_compiler: Add binding hash tables to avoid SIGSEGVs on linking stage When using the stand alone compiler, if we try to link a shader with vertex attributes it will segfault on linking as the binding hash tables are not included in the shader program. Obviously, we cannot make the linking stage succeed without the bound attributes but we can prevent the crash and just let the linker spit its own error. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-11-18 08:47:04 -07:00
Andres Gomez	f9fc3ae89b	linker: Add carriage returns on several linker errors Reviewed-by: Brian Paul <brianp@vmware.com>	2014-11-18 08:47:04 -07:00
Andres Gomez	2d5af04bae	draw: Fixed inline comments Reviewed-by: Brian Paul <brianp@vmware.com>	2014-11-18 08:47:03 -07:00
Roland Scheidegger	74f505fa73	gallivm: fix alignment issue for vertex data fetch We cannot guarantee that vertex buffers have the necessary alignment for fetching all AoS members at once (for instance 4x32bit XYZW data). We can however guarantee that for textures. This did not cause errors for older llvm versions but it now matters and will cause segfaults if the data happens to not be aligned. Thus we need to set alignment manually. (Note that we can't actually really guarantee data to be even element aligned due to offsets in vertex buffers being bytes and OpenGL allowing this, but it does not matter for x86 as alignment is only required for sse vectors - not sure what happens on other archs, however.) This fixes https://bugs.freedesktop.org/show_bug.cgi?id=85467.	2014-11-18 15:26:59 +01:00
Marek Olšák	3958378abb	radeonsi: support gl_FragCoord at integer pixel center No known benefit for OpenGL, but it doesn't hurt. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-11-18 14:27:54 +01:00
Marek Olšák	da2dea3843	radeonsi: support per-sample gl_FragCoord Cc: 10.4 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-11-18 14:27:54 +01:00
Ilia Mirkin	68db29c434	st/mesa: add a fallback for clear_with_quad when no vs_layer Not all drivers can set gl_Layer from VS. Add a fallback that passes the instance id from VS to GS, and then uses the GS to set the layer. Tested by adding quad_buffers |= clear_buffers; clear_buffers = 0; to the st_Clear logic, and forcing set_vertex_shader_layered in all cases. No piglit regressions (on piglits with 'clear' in the name). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "10.4 10.3" <mesa-stable@lists.freedesktop.org>	2014-11-17 22:17:49 -05:00
Vinson Lee	7b8e04b3f0	mesa: Bump version to 10.5.0-devel. Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-11-18 02:02:54 +00:00
Axel Davy	7f565845a1	nine: Implement threadpool DRI_PRIME setups have different issues due to the lack of dma-buf fence support in the drivers. For DRI3 DRI_PRIME, a race can appear, making tearing visible or, worse, showing older content than expected. Until dma-buf fences are well supported (and by all drivers), an alternative is to send the buffers to the server only when rendering has finished. Since waiting in the main thread for rendering to finish has a performance impact, this patch uses an additional thread to offload the wait and the sending of the buffers to the server. Acked-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-11-18 02:02:54 +00:00
Axel Davy	948e6c5228	nine: Add drirc options (v2) Implements vblank_mode and throttling, which allows us to change the default ratio between framerate and input lag. Acked-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: David Heidelberg <david@ixit.cz> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-11-18 02:02:54 +00:00
Joakim Sindholt	fdd96578ef	nine: Add state tracker nine for Direct3D9 (v3) Work of Joakim Sindholt (zhasha) and Christoph Bumiller (chrisbmr). DRI3 port done by Axel Davy (mannerov). v2: - nine_debug.c: klass extended from 32 chars to 96 (for sure) by glennk - Nine improvements by Axel Davy (which also fixed some wine tests) - by Emil Velikov: - convert to static/shared drivers - Sort and cleanup the includes - Use AM_CPPFLAGS for the defines - Add the linker garbage collector - Restrict the exported symbols (think llvm) v3: - small nine fixes - build system improvements by Emil Velikov v4: [Emil Velikov] - Do not link against libudev. No longer needed. Acked-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: David Heidelberg <david@ixit.cz>	2014-11-18 02:02:54 +00:00
Christoph Bumiller	7d2573b537	gallium/auxiliary: add contained and rect checks (v6) v3: thanks to Brian, improved coding style, also glennk helped spot few things (unsigned -> int, two constify) v4: thanks Ilia improved function, dropped u_box_clip_3d v5: incorporated rest of Gregor proposed changes,clean ups v6: u_box_clip_2d simplify proposed by Ilia Mirkin Acked-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: David Heidelberg <david@ixit.cz>	2014-11-18 02:02:54 +00:00
Christoph Bumiller	cb49132166	gallium/auxiliary: add inc and dec alternative with return (v4) At this moment we use only zero or positive values. v2: Implement it also for Solaris, MSVC assembly and enable for other combinations. v3: Replace MSVC assembly by assert + warning during compilation v4: remove inc and dec with return for MSVC assembly Acked-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: David Heidelberg <david@ixit.cz>	2014-11-18 02:02:53 +00:00
Christoph Bumiller	e23d63cffd	gallium/auxiliary: implement sw_probe_wrapped (v2) Implement pipe_loader_sw_probe_wrapped which allows to use the wrapped software renderer backend when using the pipe loader. v2: - remove unneeded ifdef - use GALLIUM_PIPE_LOADER_WINSYS_LIBS - check for CALLOC_STRUCT thanks to Emil Velikov Acked-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: David Heidelberg <david@ixit.cz>	2014-11-18 02:02:53 +00:00
Christoph Bumiller	8314315dff	winsys/sw/wrapper: implement is_displaytarget_format_supported for swrast Acked-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: David Heidelberg <david@ixit.cz>	2014-11-18 02:02:53 +00:00
Christoph Bumiller	259ec77db9	tgsi/ureg: add ureg_UARL shortcut (v2) v2: moved in the same order as in p_shader_tokens (thanks Brian) Acked-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: David Heidelberg <david@ixit.cz>	2014-11-18 02:02:53 +00:00
Dave Airlie	4e520101e6	r600g/cayman: handle empty vertex shaders Some of the geom shader tests produce an empty vertex shader, on cayman we'd crash in the finaliser because last_cf was NULL. cayman doesn't need the NOP workaround, so if the code arrives here with no last_cf, just emit an END. fixes crashes in a bunch of piglit geom shader tests. Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-11-18 11:59:47 +10:00
Dave Airlie	27e1e0e710	r600g/cayman: fix texture gather tests It appears on cayman the TG4 outputs were reordered. This fixes a lot of piglit tests. Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-11-18 11:59:30 +10:00
Dave Airlie	70dac5fa44	r600g: cayman umad assigns dst pointlessly There is no need to assign dst here, just use the chan from j Pointed out by glennk. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-11-18 11:59:30 +10:00
Dave Airlie	4a128d5a16	r600g/cayman: fix integer multiplication output overwrite (v2) This fixes tests/spec/glsl-1.10/execution/fs-op-assign-mult-ivec2-ivec2-overwrite.shader_test. hopeful fix for fd.o bug 85376 Reported-by: ghallberg Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-11-18 11:58:16 +10:00
Brian Paul	11abd7b2bc	st/mesa: copy sampler_array_size field when copying instructions The sampler_array_size field was added by "mesa/st: add support for dynamic sampler offsets". But the field wasn't getting copied in the get_pixel_transfer_visitor() or get_bitmap_visitor() functions. The count_resources() function then didn't properly compute the glsl_to_tgsi_visitor::samplers_used bitmask. Then, we didn't declare all the sampler registers in st_translate_program(). Finally, we asserted when we tried to emit a tgsi ureg src register with File = TGSI_FILE_UNDEFINED. Add the missing assignments and some new assertions to catch the invalid register sooner. Cc: "10.3, 10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-17 15:07:54 -07:00
Brian Paul	920f875132	gallium/tests: add missing arg to util_make_vertex_passthrough_shader() Fix oversights from the "add a window_space option to the passthrough vertex shader" patch. Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-11-17 10:20:24 -07:00
Michel Dänzer	ae4536b4f7	radeonsi: Disable asynchronous DMA except for PIPE_BUFFER Using the asynchronous DMA engine for multi-dimensional operations seems to cause random GPU lockups for various people. While the root cause for this might need to be fixed in the kernel, let's disable it for now. Before re-enabling this, please make sure you can hit all newly enabled paths in your testing, preferably with both piglit and real world apps, and get in touch with people on the bug reports below for stability testing. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85647 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83500 Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>	2014-11-17 16:17:52 +09:00
Vinson Lee	876c53375e	scons: Require glproto >= 1.4.13 for X11. GLXBadProfileARB and X_GLXCreateContextAtrribsARB require glproto >= 1.4.13. These symbols were added in commit `d5d41112cb` "st/xlib: Generate errors as specified." Signed-off-by: Vinson Lee <vlee@freedesktop.org> Cc: "10.4" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-11-16 13:26:26 -08:00
José Fonseca	aafbebe8ab	draw: Make it more clear that *_jit_context points to pipe_viewport_state structures. No change in behavior.	2014-11-16 11:33:21 +00:00
José Fonseca	2a3e140ff4	draw: Fix breakage due to removal pipe_viewport_state::translate[3] and scale[3]. Unfortunately no LLVM type was generated for pipe_viewport_state -- it was being treated as a single floating point array --, so llvmpipe (and any driver that relies on draw/llvm) got totally busted.	2014-11-16 11:31:23 +00:00
José Fonseca	d2dbeed006	gallium/auxiliary: Fix build without LLVM. Trivial.	2014-11-16 10:22:46 +00:00
José Fonseca	4784623b3e	gallium/auxiliary: Remove GALLIVM_CPP_SOURCES Redundant. Should fix https://bugs.freedesktop.org/show_bug.cgi?id=86330	2014-11-16 10:16:47 +00:00
Emil Velikov	45e2ba1b8c	freedreno: add missing headers in Makefile.sources ... or autotools will fail to pick them up for the distribution tarball. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-16 01:16:30 +00:00
Emil Velikov	c3bb38c4cb	targets: bundle all files in the tarball We were missing a few files - The version scripts - Android & scons build scripts - A few headers. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-16 01:16:30 +00:00
Emil Velikov	d936ef3fb7	auxiliary: ship all files in the distribution tarball - Add all headers into Makefile.sources - Don't forget the target-helpers - Add the python scripts & the formats table/list (csv) - Temporarily add vl/vl_winsys_dri.c to EXTRA_DIST until we rework the way VL is built. - Add the following to EXTRA_DIST - they are included via the generated u_indices_gen.c thus we should not add them to *SOURCES. indices/u_indices.c indices/u_unfilled_indices.c XXX: Should we nuke gallivm/f.cpp ? It seems that no-one is using it. v2: Rebase Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-16 01:07:32 +00:00
Emil Velikov	ded56e4674	gallium: ship the gallium API headers Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-16 01:03:42 +00:00
Emil Velikov	dfa61dc37e	pipe-loader: consolidate sources into Makefile.sources Drop the unneeded subdir-objects. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-16 01:03:42 +00:00
Thierry Reding	631090e155	dri/kms: Always zero out struct drm_mode_create_dumb The DRM_IOCTL_MODE_CREATE_DUMB (and others) IOCTL isn't very rigorously specified, which has the effect that some kernel drivers do not consider the .pitch and .size fields of struct drm_mode_create_dumb outputs only. Instead they will use these as lower bounds and overwrite them only if the values that they compute are larger than what userspace provided. This works if and only if userspace initializes the fields explicitly to either 0 or some meaningful value. However, if userspace just leaves the values uninitialized and the struct drm_mode_create_dumb is allocated on the stack for example, the driver may try to overallocate buffers. Fortunately most userspace does zero out the structure before passing it to the IOCTL, but there are rare exceptions. Mesa is one of them. In an attempt to rectify this situation, kernel drivers are being updated to not use the .pitch and .size fields as inputs. However in order to fix the issue with older kernels, make sure that Mesa always zeros out the structure as well. Future IOCTLs should be more rigorously defined so that structures can be validated and IOCTLs rejected if output fields aren't set to zero. Signed-off-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-16 01:03:40 +00:00
Marek Olšák	2efabd9f5a	gallium: remove unused pipe_viewport_state::translate[3] and scale[3] Almost all drivers ignore them.	2014-11-16 01:28:28 +01:00
Marek Olšák	ff8042270f	radeonsi: implement TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION Required by Nine. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Nick Sarnie <commendsarnex@gmail.com>	2014-11-16 01:28:28 +01:00
Marek Olšák	48f1409c3b	tgsi/ureg: simplify code for declaring properties Tested-by: Nick Sarnie <commendsarnex@gmail.com>	2014-11-16 01:28:26 +01:00
Marek Olšák	e6a2d3f7b6	gallium/util: add a test for TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION Not testable by OpenGL. Required by Nine. This is an example of how to implement a piglit-like test using gallium only.	2014-11-16 01:28:26 +01:00
Marek Olšák	717f2dd69f	gallium/util: add a window_space option to the passthrough vertex shader Tested-by: Nick Sarnie <commendsarnex@gmail.com>	2014-11-16 01:28:24 +01:00
Marek Olšák	ad54b01896	tgsi: fixup the string of VS_WINDOW_SPACE_POSITION Tested-by: Nick Sarnie <commendsarnex@gmail.com>	2014-11-16 01:28:09 +01:00
Rob Clark	7c5707bd4a	freedreno/a4xx: implement mem->gmem (restore) Support to restore gmem (tile buffer) (in case it wasn't glClear'd). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-11-15 18:16:36 -05:00
Rob Clark	0c6275300e	freedreno/a4xx: move where SP_FS_MRT_REGn is emitted Addition of color fmt bitfield to this register (compared to a3xx) means we need to re-emit if either prog or framebuffer state is dirty. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-11-15 18:16:36 -05:00
Emil Velikov	e07c9a288c	Revert "mesa: Wrap SSE4.1 code in #ifdef __SSE4_1__." This reverts commit `8d3f739383`. In the last commit we've updated our check to determine if the actual code is buildable, rather than if the compiler acknowledges the option. I.e. did anyone provide -mno-sse4.1 vs is my compiler too old. Now this code will never be attempted to be built, in both cases. Confirmed by building mesa with export CFLAGS='-march=native -mno-sse4.1' ./configure && make Tested-by: David Heidelberg <david@ixit.cz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-15 20:34:36 +00:00
Emil Velikov	1a6ae84041	configure.ac: roll up a program for the sse4.1 check So when checking/building sse code we have three possibilities: 1 Old compiler, throws an error when using -msse* 2 New compiler, user disables sse* (-mno-sse*) 3 New compiler, user doesn't disable sse The original code added code for #1 but not #2. Later on we patched around the lack of handling #2 by wrapping the code in __SSE4_1__. Yet it led to a missing/undefined symbol in case of #1 or #2, which might cause an issue for #2 when using the i965 driver. A bit later we "fixed" the undefined symbol by using #1, rather than updating it to handle #2. With this commit we set things straight :) To top it all up, conventions state that in case of conflicting (-enable-foo -disable-foo) options, the latter one takes precedence. Thus we need to make sure to prepend -msse4.1 to CFLAGS in our test. v2: Clean the #includes. Suggested by Ilia, Matt & Siavash. Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org> Tested-by: David Heidelberg <david@ixit.cz> Tested-by: Siavash Eliasi <siavashserver@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-15 20:34:34 +00:00
Ilia Mirkin	3bc42a09e2	nv50,nvc0: use clip_halfz setting when creating rasterizer state This enables the ARB_clip_control extension. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.4" <mesa-stable@lists.freedesktop.org>	2014-11-15 14:14:51 -05:00
Rob Clark	61c68b69d7	freedreno: add adreno 420 support Very initial support. Basic stuff working (es2gears, es2tri, and maybe about half of glmark2). Expect broken stuff. Still missing: mem->gmem (restore), queries, mipmaps (blob segfaults!), hw binning, etc. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-11-15 08:30:31 -05:00
Rob Clark	4b1dfcb2c1	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-11-15 08:30:31 -05:00
Kristian Høgsberg	a4ffc2a445	i965: Move fs_visitor ra pass to new fs_visitor::allocate_registers() This will be reused for the scalar VS pass. v2 (Ken): Rebase on master. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-14 19:38:08 -08:00
Kristian Høgsberg	c50f2dadc5	i965: Move fs_visitor optimization pass into new method fs_visitor::optimize() We'll reuse this toplevel optimization driver for the scalar VS. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-14 19:38:06 -08:00
Kristian Høgsberg	5c4efc644e	i965: Move more code into codegen-branch of the fs_visitor::run() if statement These last few operations all only apply when we've actually generated code, optimized and allocated registers. The dummy and the repclear shaders don't need the gen4 send workaround, and don't spill. This means we can move these lines into the else-branch, which will make the following refactoring easier. v2 (Ken): Rebase on master, which removed the uncompressed stack. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-14 19:38:05 -08:00
Kristian Høgsberg	f2bb655ac7	i965: Refactor fs_generator API We split out SIMD8 and SIMD16 generation into separate calls to new method generate_code(), which returns the start offset for the generated code. A new get_assembly() method returns the generated code. This avoids asserting MESA_SHADER_FRAGMENT and accessing wm_prog_data in the generator. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-14 19:38:03 -08:00
José Fonseca	13849f327c	st/wgl: Implement WGL_EXT_create_context_es/es2_profile. Derived from st/glx's GLX_EXT_create_context_es/es2_profile implementation. Tested with an OpenGL ES 2.0 ApiTrace. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-11-14 23:29:59 +00:00
José Fonseca	d5d41112cb	st/xlib: Generate errors as specified. Tested with piglit glx tests. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-11-14 23:29:59 +00:00
Rob Clark	82103206fe	freedreno/ir3: move some helpers Split out a few helpers from fd3_program so we don't have to duplicate for fd4_program. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-11-14 13:59:54 -05:00
Rob Clark	e091c08089	freedreno: rename draw->draw_vbo Gets rid of a namespace conflict w/ a4xx which wants an fd4_draw() version of fd_draw().. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-11-14 13:59:31 -05:00
Rob Clark	2f024d2b10	freedreno/a3xx: missing u_upload_destroy Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-11-14 12:07:57 -05:00
Rob Clark	28b2269ee0	freedreno: fix borked check for a320.0 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-11-14 12:07:39 -05:00
Rob Clark	8b898c1174	freedreno/ir3: half vs full reg in standalone compiler output Handle hrN.c in printing outputs/inputs. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-11-14 12:02:43 -05:00
José Fonseca	7037793f6b	st/dri: Support EGL_CONTEXT_OPENGL_DEBUG_BIT_KHR/GLX_CONTEXT_DEBUG_BIT_ARB on ES contexts. The latest version of the specs explicitly allow it, and given that Mesa universally supports KHR_debug we should definitely support it. Totally untested. (Just happened to notice this while implementing GLX_EXT_create_context_es2_profile for st/xlib.) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-11-14 16:10:22 +00:00
Marek Olšák	363b53f000	egl: remove egl_gallium from the loader Acked-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Eric Anholt <eric@anholt.net> Informally acked-by: Jose Fonseca	2014-11-14 16:16:12 +01:00
Marek Olšák	c46c551c56	configure.ac: remove enable flags for EGL and GBM Gallium state trackers Acked-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Eric Anholt <eric@anholt.net> Informally acked-by: Jose Fonseca	2014-11-14 16:16:12 +01:00
Kenneth Graunke	bd20fad316	i965/vec4: Combine all the math emitters. 17 insertions(+), 102 deletions(-). Works just as well. v2: Make emit_math take const references (suggested by Matt), drop redundant WRITEMASK_XYZW setting (Matt and Curro). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-11-13 20:55:41 -08:00
Kenneth Graunke	dba683cf16	i965/vec4: Use const references in emit() functions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-11-13 20:55:41 -08:00
Kenneth Graunke	0efc53a96c	i965: Use macros to create prototypes for emitter helpers. We do this almost everywhere else; this should make it easier to modify. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-11-13 20:55:41 -08:00
Ben Widawsky	f14a35f9dc	i965: Always enable VF statistics Every other unit in the geometry pipeline automatically enables statistics gathering. This part of the pipe has been controlled by the DEBUG_STATS variable, but this is asymmetric. This dates back to the original implementation, and I am not sure if there is a reason for it. I need access to these stats to implement ARB_pipeline_statistics_query. Eric wrote it, and Ken touched it last. Do you have any opposition? Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86145 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2014-11-13 10:48:24 -08:00
Ville Syrjälä	0d924738d9	i915: Emit 3DSTATE_SCISSOR_RECTANGLE_0 before 3DSTATE_SCISSOR_ENABLE According to gen2 BSpec the pipeline must be flushed at least up to the windower before changing the scissor rect enable field. Emitting the 3DSTATE_SCISSOR_RECTANGLE_0 before 3DSTATE_SCISSOR_ENABLE is sufficient to do that. gen3 BSpec no longer has that piece of text, but let's make the same change there too for symmetry. The spec does still say that the scissor rectangle must be defined before enabling it, so the new order does seem more in line with the spec. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2014-11-13 19:13:27 +02:00
Ville Syrjälä	81c31e560f	i915: Don't call _mesa_meta_glsl_Clear() on gen2 Gen2 doesn't have fragment shaders so we shouldn't be calling _mesa_meta_glsl_Clear() on gen2. Restore the appropriate ARB_fragment_shader check to the clear path which was lost in: commit `94f22fbe78` Author: Tapani Pälli <tapani.palli@intel.com> Date: Wed Aug 8 20:46:45 2012 +0300 intel: use _mesa_meta_Clear with OpenGL ES 1.1 v2 v2: Fix spelling in commit message Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2014-11-13 19:13:27 +02:00
Ville Syrjälä	4747b2638c	i915: Protect macro argument for TEXTURE_SET() TEXTURE_SET() is the only register macro that forgets to wrap the argument evaluation in parens. Only simple integers are passed to this macro so there's no bug but still it seems prudent to add the parens. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2014-11-13 19:13:27 +02:00
Ville Syrjälä	3746ff89bc	i915: Kill intel_context::hw_stencil ctx.hw_stencil is not used anywhere so kill it. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2014-11-13 19:13:27 +02:00
Ville Syrjälä	dafae910d4	i915: Accept GL_DEPTH_STENCIL GL_DEPTH_COMPONENT formats for renderbuffers Gen2 doesn't support depth/stencil textures, and since commit `c1d4d49993` Author: Ville Syrjälä <ville.syrjala@linux.intel.com> Date: Thu Apr 24 14:11:43 2014 +0300 i915: Don't advertise Z formats in TextureFormatSupported on gen2 depth/stencil formats are no longer accepted as texture formats. However we still want depth/stencil renderbuffers, so add explicit format checks to intel_alloc_renderbuffer_storage() to allow such things. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2014-11-13 19:13:27 +02:00
Ville Syrjälä	a071425817	i915: Override mip filter to nearest with aniso gen2 doesn't support linear mip filter with anisotropic min/mag filtering. The hardware would automagically downgrade the min/mag filters to linear in such cases, which IMO looks worse than forcing the mip filter to nearest. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2014-11-13 19:13:27 +02:00
Ville Syrjälä	40a08e0d6a	i915: Use L8A8 instead of I8 to simulate A8 on gen2 Gen2 doesn't support the A8 texture format. Currently the driver substitutes it with I8, but that results in incorrect RGB values. Use A8L8 instead. We end up wasting a bit of memory, but at least we should get the correct results. v2: Handle the fallback in _mesa_choose_tex_format() and also do it for all alpha formats that currently accept A8 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72819 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80050 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38873 Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2014-11-13 19:13:27 +02:00
Ville Syrjälä	7988ff2fd1	i915: Fix GL_DOT3_RGBA a bit The spec says using DOT4 for alpha is undefined unless DOT4 is also used for color. It seems to do the right thing anyway, but better safe than sorry. Also override numAlphaArgs to 2 for DOT4 since that's what it wants. This might fix something in case the specified alpha mode has only one argument. Also avoids emitting a needless 3DSTATE_MAP_BLEND_ARG if the specified alpha mode has three arguments. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2014-11-13 19:13:27 +02:00
Neil Roberts	352f8f2d13	linker: Add a missing space in an error message Reviewed-by: Brian Paul <brianp@vmware.com>	2014-11-13 16:46:12 +00:00
José Fonseca	d5b1731178	llvmpipe: Call pipe_thread_wait() on Linux. To address http://lists.freedesktop.org/archives/mesa-dev/2014-November/070569.html In short, revert `706ad3b649` for non-Windows OSes.	2014-11-13 15:01:19 +00:00
Kenneth Graunke	2b6e703863	i915g: we also have more than 0 viewports! See `546d6c8d` for the corresponding fix in freedreno. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Stephane Marchesin <stephane.marchesin@gmail.com> Cc: "10.3" <mesa-stable@lists.freedesktop.org>	2014-11-12 20:59:28 -08:00
Eric Anholt	b3d269f5ae	vc4: Avoid reusing a pointer from c->outputs[] after add_output(). add_output() can resize the qreg array, so we might use a stale pointer.	2014-11-12 18:24:10 -08:00
Eric Anholt	acc1cca7ae	vc4: Fix assumption of TGSI OUT[0] being POSITION in the VS. All the shaders we've received so far had this be the case, but with nir-to-tgsi that changed. I might decide to make nir-to-tgsi keep the outputs in the same order, for debugging sanity, but I'm not sure.	2014-11-12 18:23:40 -08:00
Ilia Mirkin	22543dd8a1	nvc0: remove unused mm_VRAM_fe0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-11-12 15:24:15 -05:00
José Fonseca	9247509a8d	st/glx: Implement GLX_EXT_create_context_es2_profile. apitrace now supports it, and it makes it much easier to test tracing/replaying on OpenGL ES contexts since GLX_EXT_create_context_{es2,es}_profile are widely available. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-11-12 19:03:50 +00:00
Tom Stellard	0cae7ea271	Revert "clover: Fix build after llvm r221375" This reverts commit `cd93d82ba9`. llvm r221375 was reverted, so this commit needs to be too.	2014-11-12 12:30:08 -05:00
José Fonseca	977b18e486	gallivm: Fix build with LLVM 3.6 (r221751). Tested with LLVM 3.3, 3.4, 3.5, and 3.6. Trivial.	2014-11-12 11:08:07 +00:00
Matt Turner	7a82961b71	i965/cfg: Remove if_block/else_block. I used these in the SEL peephole, but they require extra tracking and fix ups. The SEL peephole can pretty easily find the blocks it needs without these. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-11-11 09:41:06 -08:00
Matt Turner	4001181ba3	i965/fs: Don't use if_block/else_block in SEL peephole. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-11-11 09:41:06 -08:00
Chia-I Wu	20a061d2b4	ilo: clean up gen6_3DSTATE_SF() Make the helpers fill out valid Gen7 3DSTATE_SF and 3DSTATE_SBE. This prevents the helpers from having to do dw[0] = GEN7_SBE_DW1_x; // setting DW1 value to dw[0]!? and simplifies gen7_3DSTATE_{SF,SBE}(). Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-11 16:04:18 +08:00
Chia-I Wu	239dca78b1	ilo: clean up gen7_3DSTATE_STREAMOUT() Render stream and render enable are independent from so enable. Having a single return point makes it easier to see that. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-11 13:52:26 +08:00
Chia-I Wu	eab595d573	ilo: rework gen7_3DSTATE_SO_DECL_LIST() Started to make pipe_stream_output_info mandatory, but ended up adding support for stream id and making a workaround Gen7-specific. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-11 13:52:26 +08:00
Chia-I Wu	c637075ea2	ilo: add 3DSTATE_SO_BUFFER variants Add gen7_disable_3DSTATE_SO_BUFFER() to disable SO buffers. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-11 13:52:25 +08:00
Chia-I Wu	2ff88ce4be	ilo: add gen6_3dstate_constant() It replaces gen6_fill_3dstate_constant(). gen6_3DSTATE_CONSTANT_{VS,GS,PS} are made wrappers of the new function. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-11 13:52:25 +08:00
Chia-I Wu	31372f2d2c	ilo: add variants of 3DSTATE_{HS,DS} Rename them to gen7_disable_3DSTATE_{HS,DS}() to reflect the fact. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-11 13:52:25 +08:00
Chia-I Wu	421b565b3b	ilo: add variants of 3DSTATE_GS Add gen6_so_3DSTATE_GS(), gen6_disable_3DSTATE_GS(), and gen7_disable_3DSTATE_GS() to do SO on GEN6 or to disable GS. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-11 13:52:22 +08:00
Chia-I Wu	63ded78e1c	ilo: add variants of 3DSTATE_VS Add gen6_disable_3DSTATE_VS() to disable VS. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-11 13:51:36 +08:00
Chia-I Wu	9087239df8	ilo: add variants of 3DSTATE_PS Add gen7_disable_3DSTATE_PS() to disable PS. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-11 13:51:31 +08:00
Chia-I Wu	8ebb86325b	ilo: add variants of 3DSTATE_WM Add gen6_hiz_3DSTATE_WM() and gen7_hiz_3DSTATE_WM() for HiZ ops without dispatching. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-11 13:51:28 +08:00
Chia-I Wu	703ae84ac2	ilo: add variants of 3DSTATE_CLIP Add gen6_disable_3DSTATE_CLIP to disable clipping. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-11 13:51:21 +08:00
Chia-I Wu	8abf4976c6	ilo: prefix 3DSTATE_VF with gen75 3DSTATE_VF is Gen7.5+ only. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-11 09:11:56 +08:00
Michael Varga	9d6253cf82	st/va: MPEG4 call vlVaDecoderFixMPEG4Startcode() If the VOP and GOV headers were truncated they will be regenerated. Signed-off-by: Michael Varga <Michael.Varga@amd.com>	2014-11-10 10:24:07 -05:00
Michael Varga	d335f5ffa6	st/va: MPEG4 generate GOV and VOP header Also, Implemented a small locally used interface for writing bits to a buffer. Signed-off-by: Michael Varga <Michael.Varga@amd.com>	2014-11-10 10:24:07 -05:00
Michael Varga	fa9e461967	st/va: MPEG4 populate the SPS structure Signed-off-by: Michael Varga <Michael.Varga@amd.com>	2014-11-10 10:24:07 -05:00
Michael Varga	92350a65c4	st/va: MPEG4 populate the iq matrix buffers Signed-off-by: Michael Varga <Michael.Varga@amd.com>	2014-11-10 10:24:07 -05:00
Michael Varga	9f1ee1b5c9	st/va: MPEG4 populate the PPS structure Signed-off-by: Michael Varga <Michael.Varga@amd.com>	2014-11-10 10:24:07 -05:00
Michael Varga	c24ee2cf43	st/va: refactored handleVASliceDataBufferType This patch cleans the function handleVASliceDataBufferType() for better readability. Signed-off-by: Michael Varga <Michael.Varga@amd.com>	2014-11-10 10:24:07 -05:00
Ian Romanick	46a2323c3f	mesa: Remove _mesa_max_buffer_index It appears to be completely unused since `f9be8543` (February 2012). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-10 05:38:03 -08:00
Ian Romanick	8e4a6481e8	mesa: Uniform logging is very, very unlikely Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-11-10 04:25:40 -08:00
Ian Romanick	9cdf66657a	glsl: Swap the order of glsl_type::name and ::length On x86-64 this saves 8 bytes of padding in the structure, and this reduces the size of the structure to 32 bytes. v2: Fix constructor so that GCC won't warn about the order of initialization. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-11-10 04:25:40 -08:00
Ian Romanick	3711abd780	glsl: Store glsl_type::vector_elements and ::matrix_columns as uint8_t Due to the total number of bits used in the bitfield, this does not increase the size of the structure. It does, however, reduce the number of instructions required each time one of these fields is accessed. To access ::matrix_columns with the bitfield, three instructions were required: movzbl 0x9(%rdx),%eax shr %al and $0x7,%eax As a uint8_t, only one instruction is required. movzbl 0xa(%rdx),%eax These fields are accessed a lot. Valgrind callgrind results for a trace of Tesseract: _mesa_Uniform4fv _mesa_Uniform4f _mesa_Uniform1i Before (64-bit): 48,103,497 16,556,096 676,447 After (64-bit): 45,722,616 15,737,964 670,607 _mesa_Uniform4fv _mesa_Uniform4f _mesa_Uniform1i Before (32-bit): 61,472,611 21,051,222 821,361 After (32-bit): 57,987,421 19,872,226 811,609 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-11-10 04:25:40 -08:00
Ian Romanick	378d92c74e	mesa: Don't check for API_OPENGLES in _mesa_uniform_matrix There are no uniforms in OpenGL ES 1.x, so we can't even get to this code in that API. Also, reorder the checks. First check that transpose is true, then check whether or not that is legal in the current API. transpose should never be true in an ES2 context, so this gets one check (the more expensive one) out of the main path. Valgrind callgrind results for a trace of Tesseract: _mesa_UniformMatrix4fv _mesa_UniformMatrix3fv Before (64-bit): 96,119,025 24,240,510 After (64-bit): 90,726,569 22,926,662 _mesa_UniformMatrix4fv _mesa_UniformMatrix3fv Before (32-bit): 132,434,452 29,051,808 After (32-bit): 126,658,112 27,989,316 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-11-10 04:25:40 -08:00
Ian Romanick	91a2fa1490	mesa: Rework array error checks in validate_uniform_parameters Before ARB_explicit_uniform_location, Mesa's location encoding allowed locations for non-array types that had non-zero array indices. Basically, part of the location was the uniform and part was the array index. This meant that some checks had to occur for arrays and non-arrays. This is no longer possible, so the checks can be split up. Valgrind callgrind results for a trace of Tesseract: _mesa_Uniform4fv _mesa_Uniform4f _mesa_Uniform1i Before (64-bit): 50,499,557 17,487,316 686,227 After (64-bit): 50,023,791 17,274,432 684,293 _mesa_Uniform4fv _mesa_Uniform4f _mesa_Uniform1i Before (32-bit): 62,968,039 21,732,380 828,147 After (32-bit): 62,373,967 21,490,756 826,223 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-11-10 04:25:40 -08:00
Ian Romanick	366540e9af	mesa: Get some gl_shader_program::LinkStatus checking out of the main path I really wanted to remove 'shProg != NULL' as well, but that would have required adding a dummy program as the default program. That seemed like more churn than removing one test was worth. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-11-10 04:25:39 -08:00
Ian Romanick	3f5ebb98b7	mesa: Rework location == -1 error checking Only one caller wanted to generate an error when location == -1, so move the error generation to that caller. There will be more callers in the future that do not want to generate errors. Move the location == -1 check later in validate_uniform_parameters. As currently implemented, glUniform1iv(-1, -1, data) would not generate an error, but it should due to count being < 0. The location that I have moved it to will make more sense with the next commit. Valgrind callgrind results for a trace of Tesseract: _mesa_Uniform4fv _mesa_Uniform4f _mesa_Uniform1i Before (64-bit): 51,241,217 17,740,162 689,181 After (64-bit): 50,499,557 17,487,316 686,227 _mesa_Uniform4fv _mesa_Uniform4f _mesa_Uniform1i Before (32-bit): 63,940,605 21,987,918 831,065 After (32-bit): 62,968,039 21,732,380 828,147 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-11-10 04:25:39 -08:00
Ian Romanick	23dcbf623f	mesa: Minor clean ups in _mesa_uniform Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-11-10 04:25:39 -08:00
Ian Romanick	9c38d4db52	mesa: Remove GLSL_TYPE_SAMPLER check Noting the assertion just a few lines earlier, returnType cannot be GLSL_TYPE_SAMPLER. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-11-10 04:25:39 -08:00
Ian Romanick	5b9cf337b4	mesa/main: Pass the data that _mesa_uniform actually wants The GL_ enums were previously used because glsl_types.h couldn't be used in C code. That was fixed some time ago (and uniforms.c already includes glsl_types.h), so this is no longer necessary. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-11-10 04:25:39 -08:00
Chia-I Wu	d388d8576f	ilo: derive fb blending caps at bind time Derive whether a RT supports blending, logicop, and the like when set_framebuffer_state() is called. This enables us to simplify gen6_BLEND_STATE(). Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-10 15:46:31 +08:00
Chia-I Wu	55d70e0669	ilo: remove inlined state functions We had some inlined state functions for dispatching. They were not needed with the new top/bottom split. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-10 15:46:19 +08:00
Chia-I Wu	c88c49baf4	ilo: use top/bottom split for state functions Follow the builder and split state functions into top (vertex processing) and bottom (pixel processing). Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-10 13:14:04 +08:00
Kenneth Graunke	f3b709c0ac	i965: Advertise a line width of 40.0 on Cherryview and Skylake. According to the documentation, line widths higher than 40.0 may have quality problems. That's already 20 times larger than we've been exposing, so it seems totally sufficient. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-11-08 22:24:08 -08:00
Kenneth Graunke	6dab04d7e3	i965: Advertise larger line widths. We've artificially been limiting this to 5 for no particular reason. On Gen4-5, the limit is [0, 7.5] with a granularity of 0.5 (U3.1). On Gen6+, the limit is [0, 7.9921875]. Since it's a U3.7, the granularity should be 0.125 (1/8). This patch conservatively advertises one granularity smaller than the hardware's maximum value, just in case there's a problem using the largest possible value. On Gen4-5, this is 7.5 - 0.5 = 7.0. On Gen6+, this is 8.0 - 0.125 = 7.875. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-11-08 22:22:54 -08:00
Kenneth Graunke	61838fd9ad	i965: Use ctx->Const.MaxLineWidth when clamping ctx->Line.Width. Rather than hardcoding platform values in every code path, just use the maximum value we set. Currently, ctx->Const.MaxLineWidth == 5, which is smaller than the hardware limit. But applications shouldn't be using a value larger than we support anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-11-08 22:22:53 -08:00
Kenneth Graunke	87927ed1f0	i965: Set Line Width correctly on Cherryview and Skylake. Line Width moved to DW1 bits 29:12. It's actually now a U11.7. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-11-08 22:22:18 -08:00
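A minimal sketch of how a U11.7 line width could be placed in bits 29:12 of a dword, as the commit above describes. The function name, the truncating conversion, and the clamping are assumptions made for illustration; the driver may round or clamp differently.

```python
LINE_WIDTH_SHIFT = 12  # field starts at bit 12 of DW1 (from the commit message)
LINE_WIDTH_BITS = 18   # bits 29:12, i.e. 11 integer + 7 fractional bits
FRAC_BITS = 7


def pack_line_width(width):
    """Encode width as U11.7 and shift it into bits 29:12 (truncation is an assumption)."""
    raw = int(width * (1 << FRAC_BITS))
    # Clamp to what the 18-bit field can hold.
    raw = max(0, min(raw, (1 << LINE_WIDTH_BITS) - 1))
    return raw << LINE_WIDTH_SHIFT


print(hex(pack_line_width(1.0)))
print(hex(pack_line_width(40.0)))
```

A width of 1.0 becomes the raw value 128, so the field holds 128 << 12; values too large for the field are clamped to its maximum in this sketch.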
Emil Velikov	a6d8413d7c	docs: add news item and link release notes for mesa 10.3.3 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-08 17:22:15 +00:00
Emil Velikov	caa0fb4709	docs: Add sha256 sums for the 10.3.3 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `9cc26056ee`)	2014-11-08 17:22:15 +00:00
Emil Velikov	0d5da6d9a8	Add release notes for the 10.3.3 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `1a9cc5f50d`)	2014-11-08 17:22:15 +00:00
José Fonseca	b238c756da	util/format: Fix clamping to 32bit integers. Use clamping constants that guarantee no integer overflows. As spotted by Chris Forbes. This causes the code to change as: - value \|= (uint32_t)CLAMP(src[0], 0.0f, 4294967295.0f); + value \|= (uint32_t)CLAMP(src[0], 0.0f, 4294967040.0f); - value \|= (uint32_t)((int32_t)CLAMP(src[0], -2147483648.0f, 2147483647.0f)); + value \|= (uint32_t)((int32_t)CLAMP(src[0], -2147483648.0f, 2147483520.0f)); Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-11-08 10:32:39 +00:00
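To see why the clamping constants in the commit above changed, it can help to check how the old and new bounds round in single precision. The sketch below uses Python's `struct` module to round a double to the nearest 32-bit float; `to_float32` is a hypothetical helper name.

```python
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest IEEE-754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


for bound in (4294967295.0, 4294967040.0, 2147483647.0, 2147483520.0):
    print(bound, "->", to_float32(bound))
```

The old upper bounds are not representable as 32-bit floats and round up to 2^32 and 2^31, which lie just outside the uint32_t and int32_t ranges. The new bounds are exactly representable and are the largest single-precision values below those powers of two, so the cast after clamping stays in range.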
José Fonseca	d268eac3a9	util/format: Generate floating point constants for clamping. This commit causes the generated C code to change as union util_format_r32g32b32a32_sscaled pixel; - pixel.chan.r = (int32_t)CLAMP(src[0], -2147483648, 2147483647); - pixel.chan.g = (int32_t)CLAMP(src[1], -2147483648, 2147483647); - pixel.chan.b = (int32_t)CLAMP(src[2], -2147483648, 2147483647); - pixel.chan.a = (int32_t)CLAMP(src[3], -2147483648, 2147483647); + pixel.chan.r = (int32_t)CLAMP(src[0], -2147483648.0f, 2147483647.0f); + pixel.chan.g = (int32_t)CLAMP(src[1], -2147483648.0f, 2147483647.0f); + pixel.chan.b = (int32_t)CLAMP(src[2], -2147483648.0f, 2147483647.0f); + pixel.chan.a = (int32_t)CLAMP(src[3], -2147483648.0f, 2147483647.0f); memcpy(dst, &pixel, sizeof pixel); which surprisingly makes a difference for MSVC. Thanks to Juraj Svec for diagnosing this and drafting a fix. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=29661	2014-11-08 10:32:39 +00:00
Vinson Lee	42443339f1	glsl/list: Revert unintentional file mode change in previous commit. Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-11-07 21:04:08 -08:00
Vinson Lee	f9fc3949e1	glsl/list: Move declaration before code. Fixes MSVC build error. shaderapi.c src\glsl\list.h(535) : error C2143: syntax error : missing ';' before 'type' src\glsl\list.h(535) : error C2143: syntax error : missing ')' before 'type' src\glsl\list.h(536) : error C2065: 'node' : undeclared identifier Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86025 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-11-07 15:36:26 -08:00
Jason Ekstrand	0c36aac832	glsl/list: Add an exec_list_validate function This can be very useful for trying to debug list corruptions. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-07 14:53:40 -08:00
José Fonseca	706ad3b649	llvmpipe: Avoid deadlock when unloading opengl32.dll On Windows, DllMain calls and thread creation/destruction are serialized, so when llvmpipe is destroyed from DllMain waiting for the rasterizer threads to finish will deadlock. So, instead of waiting for rasterizer threads to have finished, simply wait for the rasterizer threads to notify they are just about to finish. Verified with this very simple program: #include <windows.h> int main() { HMODULE hModule = LoadLibraryA("opengl32.dll"); FreeLibrary(hModule); } Fixes https://bugs.freedesktop.org/show_bug.cgi?id=76252 Reviewed-by: Roland Scheidegger <sroland@vmware.com> Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>	2014-11-07 21:00:06 +00:00
José Fonseca	edb7b1c566	docs: Update minimum required LLVM version.	2014-11-07 21:00:06 +00:00
Emil Velikov	21925ec3fc	i965: drop the custom gen8_instruction CFLAG No longer needed as the file was removed with commit `8c229d306b` Author: Kenneth Graunke <kenneth@whitecape.org> Date: Mon Aug 11 10:07:07 2014 -0700 i965: Delete the Gen8 code generators. We now use the brw_eu_emit.c code instead. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-07 18:32:17 +00:00
Emil Velikov	f6432c4d72	gbm/dri: cleanup memory leak on teardown During teardown we free the driver_configs list pointer, but we forget to deallocate each config in that list. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-07 18:32:07 +00:00
Emil Velikov	8ed08e69bc	egl_dri2: add a note about dri2_create_screen The function is not called by platform_drm. As such one needs to pay special attention at teardown. v2: Fix the comment block. Spotted by Ken. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2014-11-07 18:31:23 +00:00
Emil Velikov	38cec0303b	egl_dri2: fix double free on drm platforms An earlier commit failed to account for the fact that drm platforms do not call dri2_create_screen, and thus do not create the screen and driver_configs but inherit them from the "display" - gbm. As such, wrap the cleanup in Platform != _EGL_PLATFORM_DRM to prevent the issue while still cleaning up correctly for non-drm platforms. v2: - Drop the ifdef HAVE_DRM_PLATFORM, reindent the code and fix the comment block. Suggested by Ken. Reported-by: Kenneth Graunke <kenneth@whitecape.org> Reported-by: Mark Janes <mark.a.janes@intel.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2014-11-07 18:29:08 +00:00
Chia-I Wu	9a0a4d67a9	ilo: tidy up message descriptor decoding Move opcode to string mappings to functions of their own. Produce consistent output for similar opcodes. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-07 23:34:56 +08:00
Chia-I Wu	d3c5976a3b	ilo: decode INTERFACE_DESCRIPTOR_DATA This is at least much better than decoding as blobs. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-07 23:33:21 +08:00
Matt Turner	58a54091a9	i965/fs: Wire up control flow correctly in predicated break pass. When the earlier block ended with control flow, we'd mistakenly remove some of its links to its children. The same happened with the later block. Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-11-06 16:37:56 -08:00
Matt Turner	f0cfc4fca0	i965/cfg: Add functions to get first and last non-CF instructions. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-11-06 16:37:56 -08:00
Kenneth Graunke	a16ca4ac6a	glsl: Skip loop-too-large heuristic if indexing arrays of a certain size A pattern in certain shaders is: uniform vec4 colors[NUM_LIGHTS]; for (int i = 0; i < NUM_LIGHTS; i++) { ...use colors[i]... } In this case, the application author expects the shader compiler to unroll the loop. By doing so, it replaces variable indexing of the array with constant indexing, which is more efficient. This patch extends the heuristic to see if arrays accessed within the loop are indexed by an induction variable, and if the array size exactly matches the number of loop iterations. If so, the application author probably intended us to unroll it. If not, we rely on the existing loop-too-large heuristic. Improves performance in a phong shading microbenchmark by 2.88x, and a shadow mapping microbenchmark by 1.63x. Without variable indexing, we can upload the small uniform arrays as push constants instead of pull constants, avoiding shader memory access. Affects several games, but doesn't appear to impact their performance. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2014-11-06 16:30:47 -08:00
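The decision described in the commit above can be sketched as a small predicate. This is only an illustration of the stated rule, not the GLSL compiler's code: the `ArrayAccess` type, the function name, and the instruction budget used for the fallback "loop too large" check are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ArrayAccess:
    array_size: int                 # declared number of array elements
    indexed_by_induction_var: bool  # True for accesses like colors[i]


def should_unroll(iterations, body_instructions, accesses, max_instructions=32):
    """Return True if the loop should be unrolled under the sketched heuristic."""
    # New rule: an array indexed by the induction variable whose size matches
    # the trip count suggests the author expects unrolling.
    for access in accesses:
        if access.indexed_by_induction_var and access.array_size == iterations:
            return True
    # Otherwise fall back to a size limit (the threshold here is made up).
    return iterations * body_instructions <= max_instructions


print(should_unroll(64, 10, [ArrayAccess(64, True)]))
print(should_unroll(64, 10, [ArrayAccess(32, True)]))
```

In this sketch a 64-iteration loop over a 64-element array is unrolled even though it exceeds the size budget, while the same loop over a 32-element array is left to the size check.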
Kenneth Graunke	4f22db5fbb	glsl: Lower constant arrays to uniform arrays. Consider GLSL code such as: const ivec2 offsets[] = ivec2[](ivec2(-1, -1), ivec2(-1, 0), ivec2(-1, 1), ivec2(0, -1), ivec2(0, 0), ivec2(0, 1), ivec2(1, -1), ivec2(1, 0), ivec2(1, 1)); ivec2 offset = offsets[<non-constant expression>]; Both i965 and nv50 currently handle this very poorly. On i965, this becomes a pile of MOVs to load the immediate constants into registers, a pile of scratch writes to move the whole array to memory, and one scratch read to actually access the value - effectively the same as if it were a non-constant array. We'd much rather upload large blocks of constant data as uniform data, so drivers can simply upload the data via constbufs, and not have to populate it via shader instructions. This is currently non-optional because both i965 and nouveau benefit from it, and according to Marek radeonsi would benefit today as well. (According to Tom, radeonsi may want to handle this itself in the long term, but we can always add a flag when it becomes useful.) Improves performance in a terrain rendering microbenchmark by about 2x, and cuts the number of instructions in about half. Helps a lot of "Natural Selection 2" shaders, as well as one "HOARD" shader. total instructions in shared programs: 5473459 -> 5471765 (-0.03%) instructions in affected programs: 5880 -> 4186 (-28.81%) v2: Use ir_var_hidden to avoid exposing the new uniform via the GL uniform introspection API. v3: Alphabetize Makefile.sources properly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77957 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-06 16:20:01 -08:00
Kenneth Graunke	0c0bfb2ead	glsl: Add infrastructure for "hidden" uniforms. In the compiler, we'd like to generate implicit uniforms for internal use. These should not be visible via the GL uniform introspection API. To support that, we add a new ir_variable::how_declared value of ir_var_hidden, and plumb that through to gl_uniform_storage. v2 (idr): Fix some memory management issues in move_hidden_uniforms_to_end. The comment block on the function has more details. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-11-06 16:20:01 -08:00
Timothy Arceri	1378617218	mesa: Add SSE 4.1 optimisation for glDrawElements. Makes use of SSE 4.1 to speed up compute of min and max elements. Callgrind cpu usage results from pts benchmarks: Openarena 0.8.8: 3.67% -> 1.03% UrbanTerror: 2.36% -> 0.81% V5: - actually make use of the optimisation in android (Emil Velikov) - set a better array size limit for using SSE and added TODO V4: - fixed bugs with incrementing pointer and updating counters V3: - Removed sse_minmax.c from Makefile.sources - handle the first few values without SSE until the pointer is aligned and use _mm_load_si128 rather than _mm_loadu_si128 - guard the call to the SSE code better at build time V2: - removed GL* types - use _mm_store_si128() rather than _mm_store_ps() - add runtime check for SSE - use aligned attribute for local min/max - bunch of tidyups Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>	2014-11-06 11:39:59 -08:00
Matt Turner	9557cf7d0d	i965: Remove non-existent vertical strides from array. These never existed, as far as I can tell. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-06 11:11:37 -08:00
Matt Turner	cc3b028a4f	i965: Convert stride/width/execution size macros into enums. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-06 11:11:34 -08:00
Matt Turner	497122a338	i965/fs: Remove force uncompressed stack. Last use was in shader_time. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-11-06 11:09:46 -08:00
Matt Turner	7e19e6c877	i965/fs: Use execution size of 1 for some shader_time operations. The ADDs depended on dispatch_width, which really isn't what we wanted. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-11-06 11:09:46 -08:00
Matt Turner	ee7e6009a9	i965/fs: Use mov(4) instructions to read timestamp. We only want fields 0-2.	2014-11-06 11:09:45 -08:00
Jan Vesely	cd93d82ba9	clover: Fix build after llvm r221375 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2014-11-06 11:39:36 -05:00
Emil Velikov	ba0bb4227e	egl_dri2: do not leak dri2_dpy->driver_configs Walk through the list and free each config, and finally free the list itself. Freeing approx 20KiB of memory, according to valgrind. Inspired by a similar patch by enpeng xu. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-06 13:23:51 +00:00
Emil Velikov	54a065d9a6	ilo: add two missing headers to the sources list Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-11-06 13:19:08 +00:00
Alexandros Frantzis	f53b6d0134	Releasing a surfaceless EGL context doesn't release underlying DRI context. driUnbindContext() checks for valid drawables before calling the driver unbind function. In the case of surfaceless contexts, the drawables are always NULL and we end up not releasing the underlying DRI context. Moving the call to the driver function before the drawable validity checks fixes things. Steps to trigger this bug: - create surfaceless context and make it current - make some other context current - {another thread} destroy surfaceless context - make another context current Signed-off-by: Alexandros Frantzis <Alexandros.Frantzis@canonical.com> Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74563	2014-11-06 13:40:39 +02:00
Chia-I Wu	cd745d46ce	ilo: let ilo_shader_compile_cs() return a dummy shader The dummy shader sends an EOT message to end itself. Much more work needs to be done on the compiler side before we can advertise PIPE_CAP_COMPUTE. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-06 10:45:20 +08:00
Chia-I Wu	ce40fa3a4a	ilo: hook up launch_grid() All we need to do is to upload the input data and call ilo_render_emit_launch_grid() with space checking. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-06 10:43:53 +08:00
Chia-I Wu	a1a701877a	ilo: add ilo_render_emit_launch_grid() ilo_render_emit_launch_grid() emits all the hardware states needed for a launch_grid() call. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-06 10:43:53 +08:00
Chia-I Wu	9dd596c99f	ilo: improve media command helpers They were written for Gen6 but mostly untested. Make them work for Gen7+. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-06 10:43:53 +08:00
Chia-I Wu	a2054af85c	ilo: disassemble DP DC messages Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-06 10:43:53 +08:00
Chia-I Wu	58099ed0a1	ilo: disassemble TS messages Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-06 10:43:53 +08:00
Chia-I Wu	bfaed536dd	ilo: update genhw headers for media pipeline Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-06 10:43:45 +08:00
Chia-I Wu	207eccc5bf	ilo: add ilo_finalize_compute_states() It updates the handles of the global bindings. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-06 10:43:31 +08:00
Chia-I Wu	9feb637cd0	ilo: use a dynamic array for global bindings Use util_dynarray in ilo_set_global_binding() to allow for unlimited number of global bindings. Add a comment for global bindings. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-06 10:43:31 +08:00
Chia-I Wu	1d51947693	ilo: add kernel queries for compute shaders We need to know the local/input/private sizes and others. This is not complete. We need many others for CURBE setup. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-06 10:42:19 +08:00
Chia-I Wu	99742998fc	ilo: fix compute params Based on beignet, hardware capabilities, and OpenCL requirements. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-06 10:26:34 +08:00
Chia-I Wu	510a1a9012	ilo: add eu_count and thread_count to ilo_dev_info They will be used to report compute params or program compute states. thread_count can also be used for 3DSTATE_VS. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-06 10:26:34 +08:00
Chia-I Wu	29253f44d0	ilo: fix intel_bo_wait() on kernel 3.17 drm_intel_gem_bo_wait() with negative timeout is broken on kernel 3.17. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-11-06 10:26:34 +08:00
Ian Romanick	93a92d2c69	mesa: Silence unused parameter warning in check_context_limits in non-debug builds ../../src/mesa/main/context.c: In function 'check_context_limits': ../../src/mesa/main/context.c:733:41: warning: unused parameter 'ctx' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-05 09:36:04 -08:00
Ian Romanick	6f3b8bb747	util: Implement unreachable for MSVC using __assume Based on the description of __assume at: http://msdn.microsoft.com/en-us/library/1b3fsfxw.aspx Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-11-05 09:36:04 -08:00
Chris Forbes	1ca88aa582	i965: Fix sampler state pointer adjustment for nonconst samplers This started hitting an assertion recently. Only affects Haswell (Ivybridge doesn't support this meddling with the sampler state pointer, and ARB_gpu_shader5 is not enabled yet on Broadwell) 14 Piglits crash->pass. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-05 23:32:42 +13:00
Nick Sarnie	9e2473763d	ilo: add drm_configuration for the pipe-target Allows the driver to advertise DMA-BUF and throttling.	2014-11-04 21:22:52 +00:00
Kenneth Graunke	6107557f8f	i965: Re-enable Z16 on Gen8+. Improves performance in GLBenchmark 2.7 TRex by 3.88889% +/- 0.336383% (n=80) at 1280x720 on Broadwell GT3. Together with the previous patch, it improves performance by 5.42738% +/- 0.541971% (n=10) at 1920x1080. Note that without the PMA stall fix, this would instead decrease performance by 22%. v2: Update comment (noticed by Kristian Høgsberg). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-04 11:39:18 -08:00
Kenneth Graunke	7423cc891b	i965: Implement the PMA stall fix. Certain non-promoted depth cases typically incur stalls. In very specific cases, we can enable a workaround which improves performance. Improves performance in GLBenchmark 2.7 TRex by 1.17762% +/- 0.448765% (n=75) at 1280x720 on Broadwell GT3. Haswell has this feature as well, but we can't currently write registers from userspace batches (and we'd incur additional software batch scanning overhead as well), so we haven't enabled it. Broadwell allows us to write CACHE_MODE_1. Backporters beware: the formula and flushing incantation differs between Haswell and Broadwell. v2: Move pma_stall_bits from brw->state to brw itself (requested by Kristian Høgsberg). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-04 11:38:01 -08:00
Kenneth Graunke	8ccf54ab09	i965: Add #defines for Broadwell HiZ workarounds in CACHE_MODE_1. This patch adds macros needed for the HiZ PMA stall optimization. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-04 11:35:11 -08:00
Kenneth Graunke	b5ad8a5d72	i965: Update compaction code to handle Skylake like Cherryview. Matt requested this in review feedback on the original patch, which I completely missed when pushing this series. Kristian also made this change, but I grabbed the wrong version of the patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-03 22:37:11 -08:00
Kenneth Graunke	8ca8dd123a	mesa: Don't call _mesa_ClipControl from glPopAttrib when unsupported. Otherwise, calling glPopAttrib on drivers that don't support ARB_clip_control gives you a GL error, which is surprising at best. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-03 18:26:08 -08:00
Kenneth Graunke	f781965097	i965: Disable fast color clears on Skylake for now. We're not programming the clear values yet, so this won't work. This patch should be (effectively) reverted eventually. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-03 15:35:25 -08:00
Kristian Høgsberg	c31ce2c40c	i965/skl: Use new MOCS for SKL On Skylake, the MOCS bits are an index into a table of 63 different, configurable cache configurations. As for previous GENs, we only care about WB and WT, which are available in the documented default set. Define SKL_MOCS_WB and SKL_MOCS_WT to the indices for those configurations and use those for the Skylake MOCS values. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-03 15:33:12 -08:00
Jordan Justen	5745aaf15c	i965/skl: Implement workaround for VF Invalidate issue Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-03 15:33:09 -08:00
Kenneth Graunke	35bbe177ec	i965/skl: Update Viewport Z Clip Test Enable bits for Skylake. Skylake has separate controls for enabling the Z Clip Test for the near and far planes. For now, maintain the legacy behavior by setting both. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-03 15:33:07 -08:00
Kenneth Graunke	77f584c7f9	i965/skl: Emit extra zeros in 3DSTATE_DS on Skylake. Skylake's 3DSTATE_DS packet has a few more fields; we don't support domain shaders yet though. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-03 15:33:05 -08:00
Kristian Høgsberg	0bb072b42b	i965/skl: Init instructions compaction tables for SKL They are the same as for BDW, so just add a case for SKL to the init switch. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-03 15:32:59 -08:00
Kristian Høgsberg	d235c5afde	i965/skl: Add fast clear resolve rect multipliers for SKL SKL updates the resolve rectangle scaling factors again. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-03 15:32:55 -08:00
Kenneth Graunke	051bfe4d52	i965/skl: Always emit 3DSTATE_BINDING_TABLE_POINTERS_* on Skylake. On SKL, a 3DSTATE_CONSTANT_* command is not committed until we give the corresponding 3DSTATE_BINDING_TABLE_POINTERS_* command. If we fail to do so, the constant buffers won't be read and push constants will be wrong. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-03 15:32:53 -08:00
Kenneth Graunke	1df496edb9	i965/skl: Allocate 16 DWords for SURFACE_STATE on Skylake. Otherwise they overlap and horrible things happen. All the new DWords are for fast color clear values, which we don't do yet. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-03 15:32:51 -08:00
Kenneth Graunke	d18949ad82	i965/skl: Refactor surface state allocation. We will need to allocate more DWords on Skylake. v2: Don't mark brw_context parameter const. It's modified. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-03 15:32:49 -08:00
Kenneth Graunke	263b584d5e	i965/skl: Emit extra zeros in STATE_BASE_ADDRESS on Skylake. Skylake introduces a new base address for a feature we don't yet expose. Setting these to 0 should be safe. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-03 15:32:47 -08:00
Kenneth Graunke	eaf12022d2	i965/skl: Update stencil reference handling for Skylake. Skylake uploads the stencil reference values in DW3 of the 3DSTATE_WM_DEPTH_STENCIL packet, rather than in COLOR_CALC_STATE. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-03 15:32:45 -08:00
Kenneth Graunke	822e791321	i965/skl: Set mask bits in PIPELINE_SELECT on Skylake. Skylake has some extra bits in PIPELINE_SELECT, none of which are interesting for a 3D driver. In order to selectively change them, it also introduces new "mask bits" in 15:8. We care about the "Pipeline Selection" bits (1:0), so set the mask to 0x3. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-03 15:32:43 -08:00
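A minimal sketch of the bit layout described in the PIPELINE_SELECT commit above: the selection value sits in bits 1:0 and a mask of 0x3 is placed in bits 15:8. The function and constant names are hypothetical, and the command opcode bits are deliberately left out because the commit message does not give them.

```python
PIPELINE_SELECTION_MASK = 0x3  # selection field occupies bits 1:0
MASK_BITS_SHIFT = 8            # mask bits occupy bits 15:8 on Skylake


def pipeline_select_low_bits(selection):
    """Return the selection bits together with the mask bits that cover them."""
    selection &= PIPELINE_SELECTION_MASK
    return (PIPELINE_SELECTION_MASK << MASK_BITS_SHIFT) | selection


print(hex(pipeline_select_low_bits(0)))
print(hex(pipeline_select_low_bits(1)))
```

Setting only the two mask bits that correspond to the selection field leaves the other new Skylake fields untouched, which is the stated reason for choosing a mask of 0x3.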
Jordan Justen	e813728b2b	i965/skl: Set max OpenGL version the same as gen7/8 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-03 15:32:41 -08:00
Damien Lespiau	48157b904a	i965/skl: Update 3DSTATE_SBE for Skylake. This command has gained 2 dwords that specify which channels of which attributes need to be forwarded to the fragment shader. v2: Rebase forward a year (done by Ken). Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-11-03 15:32:34 -08:00
Kenneth Graunke	2b7f73af9c	glsl: Improve the CSE pass debugging output. The CSE pass now prints out why it thinks a value is not a candidate for adding to the AE set. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-11-03 15:16:50 -08:00
Matt Turner	799106d387	i965/fs: Don't compute_to_mrf() on Gen >= 7. No differences in shader-db on Haswell (Gen 7.5). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-03 11:27:52 -08:00
Matt Turner	5fbcb1b41d	glsl: Remove now useless dot optimization on basis vect The optimization in commit `d056863b` covers these cases, which were the first optimizations I added to the GLSL compiler. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-11-03 11:27:50 -08:00
Matt Turner	336e76c143	glsl: Emit mul instead of dot if only one component left. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85683 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85691 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-11-03 11:27:38 -08:00
Tom Stellard	263eb7fa39	clover: Fix clBuildProgram piglit regression Should trigger CL_INVALID_VALUE if device_list is NULL and num_devices is greater than zero. Introduced by `e5468dfa52` Reported by: EdB Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-11-03 10:35:07 -05:00
José Fonseca	bfd453f942	gallivm: Disable frame-pointer-omission on x86 to ensure right stack alignment. Between releases 3.2 and 3.3, LLVM stopped aligning the stack properly under certain conditions (no allocas, but a large number of vectors causing spills to the stack, and frame pointer omission enabled). We were already disabling frame-pointer-omission on several build types, but we now disable it on all build types. It's not clear whether this affects 32-bit x86 processes only, or if it can also affect 64-bit x86_64 processes when AVX registers are available and used. So disable frame-pointer-omission on both x86/x86_64 to be on the safe side. See also: - http://llvm.org/PR21435 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-11-03 14:47:00 +00:00
José Fonseca	b7e447d323	gallivm: When disassembling a function, start by printing out its name. This helps recognize what the function is supposed to do. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-11-03 14:47:00 +00:00
Ben Widawsky	5695303563	i965/chv: Increase VS and GS thread counts AFAICT the number of threads is 80, not 70. I am not sure if Ken knows something I do not. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-11-02 21:18:08 -08:00
Brian Paul	52576dcb88	gallium/docs: fix NRM, NRM4 docs Need to do a sqrt(). FWIW, the html that Sphinx 1.1.3 generates for the math expressions looks completely broken. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-11-01 09:00:07 -06:00
Brian Paul	afdc4309dc	softpipe: use the tgsi_free_tokens() function Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-10-31 15:30:00 -06:00
Brian Paul	e6ee85ec61	tgsi: add a tgsi_free_tokens() function To match tgsi_alloc_tokens(). Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-10-31 15:29:59 -06:00
Brian Paul	c996b22329	util: simplify u_pstipple.c code Use the new helper functions in the tgsi_transform.h file to emit declarations and instructions. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-10-31 15:29:59 -06:00
Brian Paul	55008ef697	util: simplify temp register selection in u_pstipple.c Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-10-31 15:29:59 -06:00
Brian Paul	ccd1ea9d52	util: simplify util_pstipple_create_fragment_shader() params Pass and return tgsi_token buffers instead of pipe_shader_state. And update softpipe driver (the only user of this function). Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-10-31 15:29:59 -06:00
Brian Paul	e3ecb8206a	softpipe: remove unused softpipe_create_fs_variant_exec() parameter Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-10-31 15:29:59 -06:00
Brian Paul	2b9e63823f	softpipe: check for SP_NEW_STIPPLE when building quad pipeline Fixes polygon stipple if both DO_PSTIPPLE_IN_DRAW_MODULE and DO_PSTIPPLE_IN_HELPER_MODULE are zero/off. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-10-31 15:29:59 -06:00
Tom Stellard	b9e41b587f	r600g: Fix build with opencl and radeonsi disabled	2014-10-31 16:26:52 -04:00
Tom Stellard	64b0fac5e2	clover: Fix bug when binary programs are passed to clBuildProgram() v2 This was a regression introduced by `611d66fe45` Passing a binary program to clBuildProgram() is legal, but passing one to clCompileProgram() is not. v2: - Code cleanups. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-31 15:24:00 -04:00
Tom Stellard	e5468dfa52	clover: Factor input validation of clCompileProgram into a new function v2 This factors out the validation that is common with clBuildProgram(). v2: - Code cleanups. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-31 15:24:00 -04:00
Tom Stellard	1f4e48d5b5	radeonsi/compute: Enable PIPE_SHADER_IR_NATIVE for compute shaders v2 v2: - Drop dependency on LLVM >= 3.5.1 - Rename si_create_shader() to si_shader_binary_read()	2014-10-31 15:24:00 -04:00
Tom Stellard	fa07f4b68a	r600g/compute: Enable PIPE_SHADER_IR_NATIVE for compute shaders v2 v2: - Drop dependency on LLVM >= 3.5.1	2014-10-31 15:24:00 -04:00
Tom Stellard	e91735a641	gallium/radeon: Add query for symbol specific config information This adds a query which allows drivers to access the config information of a specific function within the LLVM generated ELF binary. This makes it possible for the driver to handle ELF binaries with multiple kernels / global functions.	2014-10-31 15:24:00 -04:00
Marek Olšák	f058c6bbd1	r300g: remove enabled/disabled hyperz and AA compression messages It's annoying with octave. Reported by Michael Burian. Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>	2014-10-30 22:24:18 +01:00
Dieter Nützel	068b9f4f7a	r600g: Delete unused variable 'max_global_size' in 'r600_get_compute_param' Signed-off-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2014-10-30 22:24:18 +01:00
Chia-I Wu	4ded2ef5e8	mesa: protect the debug state with a mutex We are about to change mesa to spawn threads for deferred glCompileShader and glLinkProgram, and we need to make sure those threads can send compiler warnings/errors to the debug output safely. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-10-30 02:26:19 -07:00
Chia-I Wu	2d64e4ffba	glsl: protect glsl_type with a mutex glsl_type has several static hash tables and a static ralloc context. They need to be protected by a mutex as they are not thread-safe. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69200 Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-10-30 02:26:19 -07:00
Chia-I Wu	a6706163cb	glsl: protect anonymous struct id with a mutex There may be two contexts compiling shaders at the same time, and we want the anonymous struct id to be globally unique. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-10-30 02:26:19 -07:00
Chia-I Wu	61c3d49388	util: initialize locale_t with a static object _mesa_strtod and _mesa_strtof may be called from multiple threads. They need to be thread-safe. v2: platform checks are now done in configure.ac Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-30 02:26:19 -07:00
Chia-I Wu	b039dbfffd	configure: check for xlocale.h and strtof With the assumptions that xlocale.h implies newlocale and strtof_l. SCons is updated to define HAVE_XLOCALE_H on linux and darwin. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-30 02:26:19 -07:00
Chia-I Wu	e3f2029479	util: add _mesa_strtod and _mesa_strtof Both core mesa and glsl have their own wrappers for strtof_l. Merge and move them to util/. They are compiled with a C++ compiler so that we can make them thread-safe in a following commit. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-30 02:26:19 -07:00
Mathias Fröhlich	2c2ada6720	mesa/gallium: Signal _NEW_TRANSFORM from glClipControl. This removes the need for the gallium rasterizer state to listen to viewport changes. Thanks to Marek Olšák <maraeo@gmail.com>. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2014-10-30 07:52:00 +01:00
Matt Turner	600066af93	Revert "i965/compaction: Disable compaction on SNB temporarily." This reverts commit `cabc93c5ad`. Mark thinks the failures on the SNB GT2 in the lab are actually because of faulty hardware, not instruction compaction. The GT1 didn't see any problems after changes to the compaction code.	2014-10-29 21:38:39 -07:00
Matt Turner	601a134180	i965/vec4: Perform CSE on MAD instructions with final arguments switched. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-29 21:35:46 -07:00
Matt Turner	b65bd9583b	i965/fs: Perform CSE on MAD instructions with final arguments switched. Multiplication is commutative. instructions in affected programs: 48314 -> 47954 (-0.75%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-29 21:35:46 -07:00
Matt Turner	d056863b3c	glsl: Drop constant 0.0 components from dot products. Helps a small number of vertex shaders in the games Dungeon Defenders and Shank, as well as an internal benchmark. instructions in affected programs: 2801 -> 2719 (-2.93%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-29 21:35:46 -07:00
Kenneth Graunke	26122e09a3	glx/dri3: Implement LIBGL_SHOW_FPS=1 for DRI3/Present. v2: Use the UST value provided in the PRESENT_COMPLETE_NOTIFY event rather than gettimeofday(), which gives us the presentation time instead of the time when SwapBuffers was called. Suggested by Keith Packard. This relies on the fact that the X DRI3/Present implementations use microseconds for UST. v3: Properly ignore PresentCompleteKindMSCNotify; multiply in 64 bits (caught by Keith Packard). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Keith Packard <keithp@keithp.com> [v3] Reviewed-by: Marek Olšák <marek.olsak@amd.com> [v1]	2014-10-29 15:13:58 -07:00
Kenneth Graunke	62b07b934e	i965: Rename brw_vec4_gs.[ch] to brw_gs.[ch]. These source files support actual geometry shaders, so using "gs" for the name makes a lot of sense. We're going to be adding SIMD8 geometry shader support as well, at which point "vec4_gs" will be a misnomer. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com> Acked-by: Iago Toral Quiroga <itoral@igalia.com>	2014-10-29 12:38:56 -07:00
Kenneth Graunke	02f8f90cc2	i965: Rename brw_gs{,_emit}.[ch] to brw_ff_gs{,_emit}.[ch]. The brw_gs.[ch] and brw_gs_emit.c source files contain code for emulating fixed-function unit functionality (VF primitive decomposition or SOL) using the GS unit. They do not contain code to support proper geometry shaders. We've taken to calling that code "ff_gs" (see brw_ff_gs_prog_key, brw_ff_gs_prog_data, brw_context::ff_gs, brw_ff_gs_compile, brw_ff_gs_prog). So it makes sense to make the filenames match. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com> Acked-by: Iago Toral Quiroga <itoral@igalia.com>	2014-10-29 12:38:42 -07:00
Kenneth Graunke	1480814173	i965: Rename intel_bufferobj_* functions to match GL and DD hooks. The GL functions and driver hooks use corresponding names---for example, glMapBufferRange and Driver.MapBufferRange. But our implementation was called "intel_bufferobj_map_range," which has the words "map" and "buffer" swapped, as well as randomly adding "obj." FlushMappedBufferRange was even trickier: it ordered the words 3, "obj", 1, 2, 4: intel_bufferobj_flush_mapped_range. Even though the old names were consistent, I always had trouble rearranging the jumble of words when searching for a function, and it took a few tries to eventually land there. The new names match the word order of GL and the driver hooks; FlushMappedBufferRange is simply brw_flush_mapped_buffer_range. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-10-29 12:38:28 -07:00
Jan Vesely	993e2922c9	configure: fix typos Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2014-10-29 19:10:48 +00:00
Jan Vesely	af9551e68c	configure: include llvm systemlibs when using static llvm v2: drop -WL,--exclude-libs, it's not necessary fix tabs/spaces Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70410 Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2014-10-29 18:52:46 +00:00
Michel Dänzer	402ab50bed	radeon/llvm: Dynamically allocate branch/loop stack arrays This prevents us from silently overflowing the stack arrays, and allows arbitrary stack depths. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85454 Cc: mesa-stable@lists.freedesktop.org Reported-and-Tested-by: Nick Sarnie <commendsarnex@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-10-29 19:01:25 +09:00
Chris Forbes	0d5f4960a4	mesa: Fix order of errors for glDrawTransformFeedbackStream The OpenGL 4.0 core profile specification, section 2.17.3 Transform Feedback Draw Operations says: "The error INVALID_VALUE is generated if <stream> is greater than or equal to the value of MAX_VERTEX_STREAMS. ... The error INVALID_OPERATION is generated if EndTransformFeedback has never been called while the object named by id was bound." Fixes the piglit test: ARB_transform_feedback3/arb_transform_feedback3-draw_using_invalid_stream_index (with the test itself fixed to eliminate an unrelated failure) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-29 21:25:20 +13:00
Eric Anholt	f87c700895	vc4: Add support for ARL and indirect register access on TGSI_FILE_CONSTANT. Fixes 14 ARB_vp tests (which had no lowering done), and should improve performance of indirect uniform array access in GLSL.	2014-10-28 17:16:05 -07:00
Eric Anholt	5539a5b685	vc4: Fix mixup of return type in reloc_tex().	2014-10-28 17:15:36 -07:00
Eric Anholt	926ab7dfa5	vc4: Drop redundant check for is_tmu_write(). This function is only called when it would return true.	2014-10-28 17:15:36 -07:00
Eric Anholt	8911879dec	vc4: Don't forget to validate code that's got PROG_END on it. This signal doesn't terminate the program now, it terminates the program soon. So you have to actually validate the code in the instruction.	2014-10-28 17:15:36 -07:00
Eric Anholt	fc1eb614a7	vc4: Add .dir-locals.el for kernel style in the kernel code.	2014-10-28 17:15:36 -07:00
Eric Anholt	6576dc1e92	vc4: Fix a couple missing '\n's in error output.	2014-10-28 17:15:36 -07:00
Brian Paul	6ad1c1eec1	st/mesa: use PIPE_BIND_DISPLAY_TARGET when checking for sRGB capability When we're checking if the framebuffer is sRGB capable, call is_format_supported() with the PIPE_BIND_DISPLAY_TARGET flag. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-10-28 18:07:54 -06:00
Marek Olšák	6fcb5520b7	Revert "st/mesa: set MaxUnrollIterations = 255" This reverts commit `20836c8185`. 255 is a huge number. If you have a loop with 255 iterations, unrolling it will exceed the SM3 instruction limit. Let's use the default again. The comment about a SM3 limit doesn't make sense. For SM3, we generally want 32 (default) or a lower number due to the SM3 instruction limit, which is 512 instructions. For SM4, we can try higher numbers if needed, but some shaders can end up being pretty huge and shader compilation can take more time. This fixes a shader compile failure on R500/SM3. Reported on IRC. Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-10-28 23:20:51 +01:00
David Heidelberger	b7186ebea9	r300g/vdpau: enable again Signed-off-by: David Heidelberger <david.heidelberger@ixit.cz> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-10-28 23:20:51 +01:00
Marek Olšák	3fc499a1dd	r300g: only set clip_halfz for chips with HW TCL I forgot that we cannot emit vertex shader state on a chip without VS. In such a case, clip_halfz is handled by the Draw module.	2014-10-28 23:20:45 +01:00
Marek Olšák	e05259b637	radeonsi: fix incorrect index buffer max size for lowered 8-bit indices Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-28 23:20:45 +01:00
Marek Olšák	72424061e0	radeonsi: fix polygon mode for points and lines and point/line fill modes Fixes piglit/polygon-mode-offset. Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-28 23:20:45 +01:00
Marek Olšák	dab177ea99	r600g: fix polygon mode for points and lines and point/line fill modes Fixes piglit/polygon-mode-offset. Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-28 23:20:45 +01:00
Glenn Kennard	7b1c0cbc90	r600g: Implement sm5 UBO/sampler indexing Caveat: Shaders using UBO/sampler indexing will not be optimized by SB, due to SB not currently supporting the necessary CF_INDEX_[01] index registers. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>	2014-10-28 23:20:45 +01:00
Glenn Kennard	444c8c2f28	r600g: Implement sm5 interpolation functions Requires evergreen/cayman Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>	2014-10-28 23:20:44 +01:00
Neil Roberts	3b83a5c35c	docs: Update GL3.txt and relnotes for GL_KHR_context_flush_control	2014-10-28 16:51:12 +00:00
Neil Roberts	60ec95fa1e	mesa: Add support for the GL_KHR_context_flush_control extension The GL side of this extension just provides an accessor via glGetIntegerv for the value of GL_CONTEXT_RELEASE_BEHAVIOR so it is trivial to implement. There is a constant on the context for the value of the enum which is initialised to GL_CONTEXT_RELEASE_BEHAVIOR_FLUSH. The extension is always enabled because it doesn't need any driver interaction to retrieve the value. If the value of the enum is anything but FLUSH then _mesa_make_current will now refrain from calling _mesa_flush. This should only affect drivers that explicitly change the enum to a non-default value. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-10-28 16:40:18 +00:00
Neil Roberts	1ecf6e1595	gles2: Update gl2ext.h to revision 28335 The main incentive to do this is to get the defines for the GL_KHR_context_flush_control extension. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-10-28 16:40:18 +00:00
Jason Ekstrand	17d98ae254	i965/fs: Don't set dependency hints on instructions with spilled destinations Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-27 17:54:10 -07:00
Jason Ekstrand	547a7fb458	i965/fs: Make scratch write instructions use the correct execution size Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-10-27 13:35:57 -07:00
Jason Ekstrand	9d1f72ebde	i965/fs: Use correct spill offsets Different platforms require the offset to be in different units. However, the generator fixes all of this up for us and only requires an offset in bytes. Previously, we were getting this wrong all over the place. Some computed/used it correctly as bytes while others treated the offset as whole registers or computed it as bytes or bytes*2 in SIMD16 mode. This commit cleans all this up and makes us properly treat it as bytes everywhere. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-10-27 13:35:57 -07:00
Jason Ekstrand	4242eb14c1	i965: Use the spill destination for the message header on GEN >= 7 Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-10-27 13:35:57 -07:00
Jason Ekstrand	76bb695f09	i965/fs: Don't [un]spill multiple registers at a time in SIMD8 mode I thought this would be a clever way to make spilling less expensive. However, it appears that the oword read/write messages we are using for spilling ignore the execution size and assume SIMD16 whenever working with more than one register. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-10-27 13:35:57 -07:00
Jason Ekstrand	3a5df8b612	i965/fs: Use instruction execution sizes when generating scratch reads/writes Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-10-27 13:35:57 -07:00
Lionel Landwerlin	d175e7c16b	egl/drm: do not crash when swapping buffers without any rendering Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-10-27 10:36:21 -07:00
Tobias Klausmann	1a170980a0	nv50: handle inverted render conditions This enables ARB_conditional_render_inverted. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-10-26 07:33:16 -04:00
Rob Clark	13862812dc	freedreno/ir3: consider instruction neighbors in cp Fanin (merge) nodes require their srcs to be "adjacent" in consecutive scalar registers. Keep track of instruction neighbors in copy-propagation step and avoid eliminating mov's which would cause an instruction to need multiple distinct left and/or right neighbors. This lets us not fall on our face when we encounter things like: 1: MOV TEMP[2], IN[0].xyzw 2: TEX OUT[0].xy, TEMP[2], SAMP[0], SHADOW2D 3: MOV TEMP[2].xy, IN[0].yxzz 4: TEX OUT[0].zw, TEMP[2], SAMP[0], SHADOW2D 5: END Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-25 12:07:43 -04:00
Rob Clark	4dff2a6429	freedreno/ir3: always mov tex coords Always insert extra mov's for the tex coord into the fanin. This simplifies things a bit, and avoids a scenario where multiple sam instructions can have mutually exclusive inputs to their fanin, for example: 1: TEX OUT[0].xy, IN[0].xyxx, SAMP[0], 2D 2: TEX OUT[0].zw, IN[0].yxxx, SAMP[0], 2D The CP pass can always remove the mov's that are not actually needed, so better to start out with too many mov's in the front end, than not enough. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-25 12:07:34 -04:00
Rob Clark	33193540fc	freedreno: rename a couple debug flags dscis -> noscis dbypass -> nobypass a bit more consistent w/ nobin, etc. And IMO a bit more sensible names. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-25 12:07:21 -04:00
Rob Clark	ded5013c4c	freedreno/ir3: skip virtual outputs in standalone compiler Kills get added to the outputs list, to ensure they get scheduled. But they aren't really outputs so skip them in the header comment block. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-25 10:25:15 -04:00
Mathias Fröhlich	a9c634dded	glx: Fix make check. This fixes https://bugs.freedesktop.org/show_bug.cgi?id=85429. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2014-10-25 15:14:24 +02:00
Mathias Fröhlich	ce61559413	mesa: Add ARB_clip_control.xml to automake. Adding this makes 'make check' catch failures introduced from within ARB_clip_control.xml earlier. Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2014-10-25 15:14:24 +02:00
Rob Clark	d6252d0f63	freedreno/ir3: standalone compiler updates for ir3test In order to test compiler changes more easily, spit out the assembled shader with some header information so that we can know about inputs/outputs more easily. See: git://people.freedesktop.org/~robclark/ir3test In ir3test we have a big collection of tgsi shaders and reference ir3_compiler outputs. When making compiler changes, regenerate the compiler outputs and feed to ir3test to compare the new vs reference shader. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-25 09:08:15 -04:00
Chia-I Wu	762c68b879	ilo: improve blob decoding The last few dwords were skipped if the total number of dwords was not a multiple of 4. Change the formatting for better readability. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-10-25 14:28:08 +08:00
Eric Anholt	08599f668c	i965: Skip recalculating URB allocations if the entry size didn't change. We only get here if the VS/GS compiled programs change, but we can even skip it if the VS/GS size didn't change. Affects cairo runtime on glamor by -1.26471% +/- 0.674335% (n=234) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-24 23:17:14 -07:00
Andres Gomez	b0e0c26f02	glsl: Standardize names and fix typos Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-24 23:14:04 -07:00
Ian Romanick	7d560a3861	i965: Silence unused parameter warning in brw_dump_ir Just remove the parameter. Silences: brw_program.c: In function 'brw_dump_ir': brw_program.c:566:33: warning: unused parameter 'brw' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-24 19:54:39 -07:00
Ian Romanick	4939c2eced	i965: Remove brwIsProgramNative Originally I just fixed some unused parameter warnings in this function. However, Ken pointed out: "You could instead remove this driver hook. If the dd pointer is NULL, arbprogram.c will return true. I think I'd prefer that." Way, way back in time, I think _mesa_GetProgramivARB had the opposite behavior. Given that it works the way it now works, I also prefer removing the driver hook. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-24 19:54:39 -07:00
Ian Romanick	66d950464c	mesa: Silence unused parameter warning in _mesa_init_shader_program Just remove the parameter. Silences: ../../src/mesa/main/uniform_query.cpp:1062:1: warning: unused parameter 'ctx' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-24 19:54:39 -07:00
Ian Romanick	99e8a3973f	mesa: Remove context parameter from dd_function_table::NewShaderProgram This fixes some unused parameter warnings introduced by the previous commit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-24 19:54:39 -07:00
Ian Romanick	c76cc7bab0	mesa: Make _mesa_init_shader_program static Since a couple commits ago, there is only one caller, and that caller is in the same file. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-24 19:54:39 -07:00
Ian Romanick	cfe195f901	mesa: Remove context parameter from _mesa_init_shader_program Silences: ../../src/mesa/main/shaderobj.c: In function '_mesa_init_shader_program': ../../src/mesa/main/shaderobj.c:239:46: warning: unused parameter 'ctx' [-Wunused-parameter] For now, this adds a couple other unused parameter warnings, but future patches will clean those up. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-24 19:54:39 -07:00
Ian Romanick	edcba62655	glsl_to_tgsi: Remove st_new_shader It was identical to the default implementation in _mesa_new_shader. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: Dave Airlie <airlied@redhat.com>	2014-10-24 19:54:39 -07:00
Ian Romanick	deee3b0f9e	glsl_to_tgsi: Remove st_new_shader_program It was identical to the default implementation in _mesa_new_shader_program. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: Dave Airlie <airlied@redhat.com>	2014-10-24 19:54:39 -07:00
Ian Romanick	a2dc16ed81	i965: Remove brw_new_shader_program It was identical to the default implementation in _mesa_new_shader_program. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-24 19:54:39 -07:00
Ian Romanick	9cdf2f78fc	mesa: Silence unused parameter warning in _mesa_clear_shader_program_data Just remove the parameter. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-24 19:54:38 -07:00
Ian Romanick	fefead3b63	linker: Rely on _mesa_clear_shader_program_data to clear link information _mesa_link_shader_program already calls _mesa_clear_shader_program_data before calling link_shaders, so this is already done. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-24 19:54:38 -07:00
Ian Romanick	7cbcff0606	mesa: Add some missing clean-up to _mesa_clear_shader_program_data All of this is already done in link_shaders. More clean-ups coming. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-24 19:54:38 -07:00
Ian Romanick	a3bfc7d313	mesa: Remove prototypes for nonexistent functions _mesa_UseShaderProgramEXT, _mesa_ActiveProgramEXT, and _mesa_CreateShaderProgramEXT were all removed when support for GL_EXT_separate_shader_objects was removed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-24 19:54:38 -07:00
Ian Romanick	1ac924a77d	ff_fragment_shader: Silence unused parameter warning in smear Just remove the parameter. Silences: ../../src/mesa/main/ff_fragment_shader.cpp:668:1: warning: unused parameter 'p' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-24 19:54:38 -07:00
Ian Romanick	3e462d9221	meta: Only use _mesa_ClipControl if the extension is supported Fixes many piglit failures on IVB since `85edaa8`. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85425 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Mathias Fröhlich <Mathias.Froehlich@gmx.net>	2014-10-24 19:24:54 -07:00
Emil Velikov	f9a9054b61	docs: add news item and link release notes Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-10-25 01:13:11 +00:00
Emil Velikov	95d00f6640	docs: Add sha256 sums for the 10.3.2 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `9599470642`)	2014-10-25 01:11:02 +00:00
Emil Velikov	95d31ab54c	Add release notes for the 10.3.2 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `3b6a4758fa`)	2014-10-25 01:09:55 +00:00
Jason Ekstrand	5d1046291a	i965/fs: Compute q-values for register allocation manually Previously, we were allowing the register allocation code to do the computation for us in ra_set_finalize. However, the runtime for this computation is O(c^4 * g) where c is the number of classes and g is the number of GRF registers. However, these q-values are directly computable based on the way we lay out our register classes so there is no need for the awful runtime algorithm. We were doing ok until commit `7210583eb` where we bumped the number of register classes from 11 to 16. While startup times don't normally matter, this caused piglit to take 4 times as long to run on Bay Trail. This patch should make generating the ra_set much faster and melt the piglit run times. v2: Fixed a couple of bugs. I have now verified that the same q-values are generated both ways. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-24 16:25:31 -07:00
Jason Ekstrand	2ec161b239	i965/fs: Don't interfere with too many base registers On older GENs in SIMD16 mode, we were accidentally building too much interference into our register classes. Since everything is divided by 2, the register allocator thinks we have 64 base registers instead of 128. The actual GRF mapping still needs to be doubled, but as far as the ra_set is concerned, we only have 64. We were accidentally adding way too much interference. Signed-off-by: Jason Ekstrand <jason.ekstrand@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-24 16:24:05 -07:00
Jason Ekstrand	ee65f2b50d	i965/fs: Properly precolor payload registers on GEN5 in SIMD16 For GEN6 SIMD16 mode, we have to 2-align all the registers, so we only have the even-numbered ones. This means that we have to divide the register number by 2 when we precolor. This wasn't a problem before because we were setting up the interference between ra_node registers wrong. This will be fixed in the next commit. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-24 16:23:54 -07:00
Jason Ekstrand	1988b71655	i965/fs: Add another use of MAX_VGRF_SIZE Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-24 16:23:24 -07:00
Jason Ekstrand	f84adb8481	util: Use reg_belongs_to_class instead of BITSET_TEST This shouldn't be a functional change since reg_belongs_to_class is just a wrapper around BITSET_TEST. It just makes the code a little easier to read. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-24 16:23:08 -07:00
José Fonseca	701f739d7f	llvmpipe: Ensure the packed input of the lp_test_format is aligned. Fixes: - https://bugs.freedesktop.org/show_bug.cgi?id=85377 - http://llvm.org/bugs/show_bug.cgi?id=21365 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-10-24 21:35:23 +01:00
José Fonseca	1ef6d439ba	llvmpipe: Flush stdout on lp_test_* unit tests. So that the order of test messages and gallivm/llvmpipe debug output is preserved. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-10-24 21:35:09 +01:00
Mathias Fröhlich	5fc0e11053	gallium: Enable ARB_clip_control for gallium drivers. Gallium should be prepared fine for ARB_clip_control. So enable this and mention it in the release notes. v2: Only enable for drivers announcing the freshly introduced PIPE_CAP_CLIP_HALFZ capability. v3: Use extension enable infrastructure to connect PIPE_CAP_CLIP_HALFZ with ARB_clip_control. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2014-10-24 19:21:21 +02:00
Mathias Fröhlich	56088131d0	gallium: introduce PIPE_CAP_CLIP_HALFZ. In preparation of ARB_clip_control. Let the driver decide if it supports pipe_rasterizer_state::clip_halfz being set to true. v3: Initially enable on ilo. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2014-10-24 19:21:21 +02:00
Mathias Fröhlich	85edaa8b72	mesa: Handle clip control in meta operations. Restore clip control to the default state if MESA_META_VIEWPORT or MESA_META_DEPTH_TEST is requested. v3: Handle clip control state with MESA_META_TRANSFORM. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2014-10-24 19:21:21 +02:00
Mathias Fröhlich	34a3c97fe6	mesa: Implement ARB_clip_control. Implement the mesa parts of ARB_clip_control. So far no driver enables this. v3: Restrict getting clip control state to the availability of ARB_clip_control. Move to transformation state. Handle clip control state with the GL_TRANSFORM_BIT. Move _FrontBit update into state.c. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2014-10-24 19:21:21 +02:00
Mathias Fröhlich	6340e609a3	mesa: Refactor viewport transform computation. This is for preparation of ARB_clip_control. v3: Add comments. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2014-10-24 19:21:20 +02:00
Eric Anholt	8c7ac377b7	vc4: Reuse uniform_data/contents indices when making uniforms. This allows vc4_opt_cse.c to CSE-away operations involving the same uniform values. total instructions in shared programs: 37341 -> 36906 (-1.16%) instructions in affected programs: 10233 -> 9798 (-4.25%) total uniforms in shared programs: 10523 -> 10320 (-1.93%) uniforms in affected programs: 2467 -> 2264 (-8.23%)	2014-10-24 18:04:26 +01:00
Eric Anholt	18ccda7b86	vc4: When asked to discard-map a whole resource, discard it. This saves a bunch of extra flushes when texsubimaging a whole texture that's been used for rendering, or subdataing a whole BO. In particular, this massively reduces the runtime of piglit texture-packed-formats (when the probes have been moved out of the inner loop).	2014-10-24 18:04:26 +01:00
Eric Anholt	a71c3b885a	vc4: Refactor flushing before mapping a BO. I'm going to want to make some other decisions here before flushing.	2014-10-24 18:04:26 +01:00
Eric Anholt	52824811b9	vc4: Allow dead code elimination of unused varyings. total instructions in shared programs: 39022 -> 37341 (-4.31%) instructions in affected programs: 26979 -> 25298 (-6.23%) total uniforms in shared programs: 11242 -> 10523 (-6.40%) uniforms in affected programs: 5836 -> 5117 (-12.32%)	2014-10-24 18:04:26 +01:00
Eric Anholt	5d32e26335	vc4: Add debug output to match shaderdb info to program dumps. I'm going to be using VC4_DEBUG=shaderdb,norast to do shaderdb stats, but when debugging regressions, I want to match shaderdb output to shader disassembly.	2014-10-24 18:04:26 +01:00
Andreas Boll	14bdcc6ff9	radeon: enable Hyper-Z on r600g and radeonsi by default This reverts commit `01e6371149`. Since then many Hyper-Z issues have been fixed or worked around. Enable Hyper-Z by default so that we get enough feedback for the upcoming mesa 10.4 release. If you have issues with Hyper-Z try to disable Hyper-Z using the environment variable R600_DEBUG=nohyperz and please report the issue on the bugtracker. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75011 See also: https://bugs.freedesktop.org/show_bug.cgi?id=75112 Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-10-24 09:11:51 +02:00
Matt Turner	76f27a6b03	i965: Silence unused variable warning.	2014-10-23 16:20:07 -07:00
Matt Turner	40492be2a4	i965/fs: Silence uninitialized variable warning. The compiler isn't privy to the knowledge that we're doing at least one framebuffer write. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-10-23 16:20:07 -07:00
Matt Turner	2695891088	util: Add assume() macro. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-10-23 16:20:07 -07:00
Jan Vesely	bbe93161e7	glapi: Fix compiler warning and script name Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-10-23 16:03:16 +01:00
Rob Clark	4f1fec6060	Revert "freedreno/a3xx: only emit dirty consts" This reverts commit `94bb33617d`. Which somehow broke gnome-shell.. and needs more investigation. For now, revert..	2014-10-23 10:46:51 -04:00
Rob Clark	6eabc11936	freedreno: fix PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE fd_bo_cpu_prep() doesn't realize the bo is already referenced in unflushed cmdstream. It could be made to do so (but would have to be implemented twice, ie. both for msm and kgsl). But we still can't do the expected thing if the caller isn't using _NOSYNC. Because of the way the tiling works, we need to build quite a bit of cmdstream at flush time, which is not possible to do at the libdrm level. So rather than trying to make fd_bo_cpu_prep() smarter than it can possibly be, just always discard and reallocate if the PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE flag is set. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-23 10:46:51 -04:00
Jan Vesely	ab53830b95	clover: Require libelf v2: test for libelf once, check in both radeon and clover CC: Tom Stellard <tom@stellard.net> CC: Emil Velikov <emil.l.velikov@gmail.com> CC: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-10-23 15:19:00 +01:00
Emil Velikov	b4039cf15a	clover: use correct typenames for compat::pair's first/second Seems to be a typo judging from the overall declaration of the template. Cc: EdB <edb+mesa@sigluy.net> Cc: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-23 15:18:12 +01:00
Emil Velikov	c63eb5dd5e	auxiliary/os: get the mmap/munmap wrappers working with android - Use macro for munmap under Android - the STATIC_ASSERT uses an off_t which is not used under Android for mmap. As loff_t size does not vary as does off_t just ignore the assert. - Wrap the long lines to improve readability. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-10-23 15:18:11 +01:00
Mauro Rossi	417b17378a	gallium/nouveau: fully build the driver under android Fix the trivial typo in the variable name. Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-10-23 15:18:11 +01:00
Alon Levy	d897e7c34a	mesa/shaderimage.c: fix inconsistent sign warning Signed-off-by: Alon Levy <alevy@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-10-23 14:45:41 +01:00
Alon Levy	501baa6bbb	wgl: stw_pixelformat_get_info: correct type for index variable Signed-off-by: Alon Levy <alevy@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-10-23 14:45:40 +01:00
Alon Levy	23080e49c4	u_math.h: fix 64 to 32 bit truncation warning Signed-off-by: Alon Levy <alevy@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-10-23 14:45:40 +01:00
José Fonseca	75ad4fe78e	gallivm: Fix build with LLVM 3.3. The setMCJITMemoryManager method doesn't exist in LLVM 3.3. I thought I had tested the latest version of my earlier change with LLVM 3.3, but it looks like I missed it. Trivial.	2014-10-23 10:42:12 +01:00
José Fonseca	065256dfc4	gallivm: Properly update for removal of JITMemoryManager in LLVM 3.6. JITMemoryManager was removed in LLVM 3.6, and replaced by its base class RTDyldMemoryManager. This change fixes our JIT memory managers specializations to derive from RTDyldMemoryManager in LLVM 3.6 instead of JITMemoryManager. This enables llvmpipe to run with LLVM 3.6. However, lp_free_generated_code is basically a no-op because there are not enough hook points in RTDyldMemoryManager to track and free the code of a module. In other words, with MCJIT, code once created, stays forever allocated until process destruction. This is not specific to LLVM 3.6 -- it will happen whenever MCJIT is used regardless of version. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-10-23 10:19:33 +01:00
José Fonseca	3fd220e2eb	gallivm: Fix white-space. Replace tabs with spaces. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-10-23 10:19:33 +01:00
José Fonseca	013ff2fae1	gallivm,llvmpipe,clover: Bump required LLVM version to 3.3. We'll need to update gallivm for the interface changes in LLVM 3.6, and the fewer the number of older LLVM versions we support the less hairy that will be. As consequence HAVE_AVX define can disappear. (Note HAVE_AVX meant whether LLVM version supports AVX or not. Runtime support for AVX is always checked and enforced independently.) Verified llvmpipe builds and runs with LLVM 3.3, 3.4, and 3.5. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-10-23 10:18:56 +01:00
Ilia Mirkin	9ad80d1d18	mesa: remove conditional render and rgtc from ES3 requirements The functionality exposed by those extensions does not appear in ES3 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-23 00:45:08 -04:00
Brian Paul	c9a6ec1978	u_blitter: put a comment on util_blitter_cache_all_shaders() Trivial.	2014-10-22 17:33:40 -06:00
Brian Paul	f82a84c097	u_blitter: use ctx->bind_fs_state(), not pipe->bind_fs_state() Consistently use the function pointer we saved earlier. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-10-22 17:33:40 -06:00
Brian Paul	0bcd9f5469	u_blitter: create basic fs shaders in util_blitter_cache_all_shaders() We need to create all fs shaders in this function. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-10-22 17:33:40 -06:00
Brian Paul	27de89d266	u_blitter: do error checking assertions for shader caching If the user calls util_blitter_cache_all_shaders() set a flag and assert that we never try to create any new fragment shaders after that point. If the assertion fails, it means we missed generating some shader in util_blitter_cache_all_shaders(). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-10-22 17:33:40 -06:00
Anuj Phogat	7a652c41b4	glsl: Use signed array index in update_max_array_access() Avoids a crash in case a negative array index is used in a shader program. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-22 16:13:37 -07:00
Anuj Phogat	6f0089e92e	glsl: Fix crash due to negative array index Currently Mesa crashes with a shader like this: [fragment shader] float[5] array; int idx = -2; void main() { gl_FragColor = vec4(0.0, 1.0, 0.0, array[idx]); } Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-22 16:13:37 -07:00
Marek Olšák	8ec40adf7e	radeonsi: implement pipe_rasterizer_state::clip_halfz Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-10-22 21:05:00 +02:00
Marek Olšák	a3591da1a0	r600g: implement pipe_rasterizer_state::clip_halfz Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-10-22 21:04:58 +02:00
Marek Olšák	8ddd2f7aee	r300g: implement pipe_rasterizer_state::clip_halfz Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-10-22 21:04:56 +02:00
Michel Dänzer	ae879718c4	r600g: Drop references to destroyed blend state Fixes use-after-free when the currently bound blend state is destroyed. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85267 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84140 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Cc: mesa-stable@lists.freedesktop.org	2014-10-22 17:09:43 +09:00
Kenneth Graunke	6dc6e6e0d9	i965/vec4: Generate better code for ir_triop_csel. Previously, we generated an extra CMP instruction: cmp.ge.f0(8) g6<1>D g1<0,4,1>F 0F cmp.nz.f0(8) null g6<4,4,1>D 0D (+f0) sel(8) g5<1>F g1.4<0,4,1>F g2<0,4,1>F The first operand is always a boolean, and we want to predicate the SEL on that. Rather than producing a boolean value and comparing it against zero, we can just produce a condition code in the flag register. Now we generate: cmp.ge.f0(8) null g1<0,4,1>F 0F (+f0) sel(8) g5<1>F g1.4<0,4,1>F g2<0,4,1>F No difference in shader-db. v2: Remember to delete the old code (thanks Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-21 21:14:03 -07:00
Kenneth Graunke	f5c3f095b9	i965/vec4: Simplify visit(ir_expression *)'s result_src/dst setup. Using dst_reg(this, ir->type) automatically sets the writemask to the proper size for the type; src_reg(dst_reg) preserves that. This should be equivalent, but less code. Note that src_reg(dst_reg) either uses SWIZZLE_XXXX or SWIZZLE_XYZW, so the old code did need the manual writemask adjustment, since it constructed the registers the other way around. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-21 21:14:00 -07:00
Kenneth Graunke	cb36e79f96	i965/vec4: Delete some dead code in visit(ir_expression *). Nothing uses the vector_elements temporary variable. Setting this->result.file is dead because we overwrite this->result a few lines later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-21 21:13:37 -07:00
Kenneth Graunke	4d34c4b582	i965/fs: Generate better code for ir_triop_csel. Previously, we generated an extra CMP instruction: cmp.ge.f0(8) g4<1>D g2<0,1,0>F 0F cmp.nz.f0(8) null g4<8,8,1>D 0D (+f0) sel(8) g120<1>F g2.4<0,1,0>F g3<0,1,0>F The first operand is always a boolean, and we want to predicate the SEL on that. Rather than producing a boolean value and comparing it against zero, we can just produce a condition code in the flag register. Now we generate: cmp.ge.f0(8) null g2<0,1,0>F 0F (+f0) sel(8) g124<1>F g2.4<0,1,0>F g3<0,1,0>F total instructions in shared programs: 5473459 -> 5473253 (-0.00%) instructions in affected programs: 6219 -> 6013 (-3.31%) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-21 21:13:37 -07:00
Kenneth Graunke	32364a1fe5	glsl: Delete unused gl_uniform_driver_format enum values. A while back, Matt made the uniform upload functions simply upload ctx->Const.UniformBooleanTrue for boolean values instead of 0/1, which removed the need to convert it later. We also set UniformBooleanTrue to 1.0f for drivers which want to treat booleans as 0.0/1.0f. Nothing ever sets these, so they are dead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-10-21 18:53:13 -07:00
Rob Clark	36310d9d56	freedreno/a3xx: fix depth/stencil restore format Also fix z16 restore format which was completely wrong. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-21 20:08:49 -04:00
Rob Clark	2bc2ab66d9	freedreno/a3xx: fix viewport state during clear Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-21 20:08:49 -04:00
Rob Clark	3eb8289aa4	freedreno: mark scissor state dirty when enable bit changes We don't have a scissor enable bit in hw, so when a raster state change results in scissor enable bit changing, we need to also mark scissor state as dirty. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-21 20:08:49 -04:00
Rob Clark	01b757e2b0	freedreno: clear vs scissor The optimization of avoiding restore (mem2gmem) if there was a clear falls down a bit if you don't have a fullscreen scissor. We need to make the decision logic a bit more clever to keep track of what was cleared, so that we can (a) completely skip mem2gmem if entire buffer was cleared, or (b) skip mem2gmem on a per-tile basis for tiles that were completely cleared. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-21 20:08:49 -04:00
Vinson Lee	1ab6543431	clover: Fix build error with LLVM 3.4. DataLayoutPass was added in LLVM 3.5 r202168, commit 57edc9d4ff1648568a5dd7e9958649065b260dca "Make DataLayout a plain object, not a pass.". This patch fixes this build error with LLVM 3.4. CXX llvm/libclllvm_la-invocation.lo llvm/invocation.cpp: In function 'void {anonymous}::optimize(llvm::Module, unsigned int, const std::vector<llvm::Function>&)': llvm/invocation.cpp:324:18: error: expected type-specifier PM.add(new llvm::DataLayoutPass(mod)); ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85189 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-10-21 15:40:47 -07:00
Marek Olšák	43b2432368	r600g,radeonsi: convert TGSI shader type to LLVM shader type The values are hardcoded in the LLVM backend, but the TGSI definitions are going to be changed with tessellation, e.g. TGSI_PROCESSOR_COMPUTE will be increased by 2. We'll use VS for LS and HS, because there's nothing special about them from the LLVM backend point of view, even though the hardware side is different. We do the same for ES. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-21 22:39:50 +02:00
Marek Olšák	c5a44cf3f8	radeonsi: add some missing register definitions Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-21 22:39:50 +02:00
Marek Olšák	fc3b3354d7	radeonsi: load ring resource descriptors only once v2: document the new functions Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-21 22:39:35 +02:00
Marek Olšák	d787608957	radeonsi: clarify shader constant load functions I'll need indexed loads without the meta data flag for tessellation later. Also rename load_const to buffer_load_const to distinguish it from indexed const loads. v2: add comments Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-21 22:35:44 +02:00
Marek Olšák	55a9b778c8	radeonsi: statically declare resource and sampler arrays Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-21 22:17:48 +02:00
Marek Olšák	e827bb6fe7	radeonsi: remove conversion of DX9 FACE input to GL st/mesa and gallium expect the DX9 format, so this is useless. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-21 22:17:41 +02:00
Marek Olšák	a18f803a86	radeonsi: revert hack for random failures in glsl-max-varyings This reverts commit `032e5548b3`. I've run glsl-max-varyings 30 times and it always passed. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-21 22:17:29 +02:00
Marek Olšák	b9b0973db2	radeonsi: generate shader pm4 states right after shader compilation Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-21 22:17:26 +02:00
Marek Olšák	c94af8f0d7	radeonsi: make pm4 state generation for shaders independent of the context The si_pm4_delete_state calls became useless, because the pm4 state is always generated only once. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-21 22:17:22 +02:00
Marek Olšák	139bde061a	radeonsi: inline si_pm4_alloc_state It seemed like the function needed a context pointer. Let's remove it to make it less confusing. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-21 22:17:15 +02:00
Marek Olšák	22c5886f3f	r300g: replace r300_get_num_samples with a util variant	2014-10-21 22:03:55 +02:00
Marek Olšák	013850a1b7	glsl_to_tgsi: use _mesa_copy_linked_program_data This deduplicates some code.	2014-10-21 22:01:16 +02:00
Marek Olšák	9ec305ead7	glsl_to_tgsi: fix the value of gl_FrontFacing with native integers We must convert it to boolean from the DX9 float encoding that Gallium specifies. Later, we should probably define that FACE should be 0 or ~0 if native integers are supported. Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-10-21 22:01:16 +02:00
Marek Olšák	e8764a4673	st/mesa: add ST_DEBUG=wf option which enables wireframe rendering Useful for tessellation.	2014-10-21 22:01:16 +02:00
Marek Olšák	5f5b83cbba	gallium: add PIPE_SHADER_CAP_MAX_OUTPUTS and use it in st/mesa With 5 shader stages and various combinations of enabled and disabled shaders, the maximum number of outputs in one shader doesn't have to be equal to the maximum number of inputs in the following shader. v2: return 32 for softpipe and llvmpipe	2014-10-21 21:59:02 +02:00
Eric Anholt	ef280c95f2	vc4: Fix SRC_ALPHA_SATURATE blending. Fixes glean blendFunc.	2014-10-21 15:46:48 +01:00
Eric Anholt	cc298023c9	vc4: Fix stencil writemask handling. If the writemask doesn't compress, then we want to put in the uncompressed writemask, not the compressed writemask failure value (all-on). Fixes glean's stencil2 and fbo-clear-formats on stencil.	2014-10-21 15:16:41 +01:00
Eric Anholt	48f6351940	vc4: Don't look at back stencil state unless two-sided stencil is enabled. Fixes regressions in the next bugfix, because gallium util stuff leaves the back stencil state as 0 if !back->enabled.	2014-10-21 15:16:41 +01:00
Rob Clark	4f17e026bb	freedreno/ir3: add debug flag to disable cp FD_MESA_DEBUG=nocp will disable copy propagation pass. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-20 21:42:44 -04:00
Ilia Mirkin	f0ca26725e	freedreno: positions come out as integers, not half-integers Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-20 21:42:44 -04:00
Rob Clark	3fcb021201	freedreno/a3xx: disable early-z when we have kill's Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-20 21:42:44 -04:00
Rob Clark	8a0ffedd8d	freedreno/ir3: fix potential gpu lockup with kill It seems like the hardware is unhappy if we execute a kill instruction prior to last input (ei). Probably the shader thread stops executing and the end-input flag is never set. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-20 21:42:44 -04:00
Rob Clark	ab33a24089	freedreno/ir3: comment + better fxn name Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-20 21:42:44 -04:00
Rob Clark	94bb33617d	freedreno/a3xx: only emit dirty consts If app only updates (for example) vertex uniforms, it would be nice to only re-emit those and not also frag uniforms. Means we need to mark the first frag shader const buffer dirty after a clear. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-20 21:42:44 -04:00
Rob Clark	74069e324e	freedreno/a3xx: more layer/level fixes Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-20 21:42:44 -04:00
Brian Paul	aafbd89c5e	mesa: fix 'feeedback' typo in comment Trivial.	2014-10-20 11:53:34 -06:00
Brian Paul	4676c6c25b	mesa: fix 'misalgned' typos in error messages Trivial.	2014-10-20 11:50:49 -06:00
Brian Paul	14379a0644	glsl: fix several use-after-free bugs The get_variable_being_redeclared() function can free the 'var' argument. Thereafter, we cannot assume that 'var' is a valid pointer. This patch replaces 'var->name' with 'earlier->name' in two places and calls is_gl_identifier(var->name) before 'var' might get freed. This fixes several piglit GLSL crashes, including: spec/glsl-1.50/execution/geometry/clip-distance-in-param spec/glsl-1.50/execution/geometry/clip-distance-bulk-copy spec/glsl-1.50/compiler/gs-redeclares-pervertex-out-before-global-redeclaration.geom I'm not sure why these were not spotted sooner. A similar bug was previously fixed by `f9cecca7a`. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-20 08:59:32 -06:00
Tapani Pälli	953a0af8e3	mesa: validate sampler uniforms during gluniform calls Patch fixes 'glsl-2types-of-textures-on-same-unit' in WebGL conformance test suite. No Piglit regressions, fixes gl-2.0-active-sampler-conflict. To avoid adding potentially heavy check during draw (valid_to_render), check is done during uniform updates by inspecting TexturesUsed mask. A new boolean variable is introduced to cache validation state. v2: take into account case where 2 uniforms use same unit (curro) also do the check only when SSO is not in use, SSO has own path for sampler validation. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-20 11:07:12 +03:00
EdB	01d94193ac	clover: Don't return CL_INVALID_VALUE if there is no header. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-20 10:35:10 +03:00
EdB	aa93af809f	clover: Add allow_empty_tag. To allow empty objs() list checks. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-20 10:35:10 +03:00
EdB	611d66fe45	clover: Add initial implementation of clCompileProgram for CL 1.2. [ Francisco Jerez: General clean-up. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-20 10:34:51 +03:00
EdB	fead2b0463	clover: Add a simple compat::pair. std::pair is not c++98/c++11 safe. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-20 10:33:02 +03:00
Francisco Jerez	5583459655	clover/util: Allow using key_equals with pair-like objects other than std::pair.	2014-10-20 10:33:02 +03:00
Francisco Jerez	e987fd5dc6	clover/util: Define equality operators for a couple of compat classes.	2014-10-20 10:33:01 +03:00
Francisco Jerez	1441a3c1bb	clover/util: Fix construction of compat::vector with a general container as argument.	2014-10-20 10:33:01 +03:00
Tapani Pälli	73dd50acf6	glsl: implement switch flow control using a loop Patch removes old variable based logic for handling a break inside switch. Switch is put inside a loop so that existing infrastructure for loop flow control can be used for the switch, now also dead code elimination works properly. Possible 'continue' call inside a switch needs now special handling which is taken care of by detecting continue, breaking out and calling continue for the outside loop. v2: remove one unnecessary ir_expression (Curro) Fixes following Piglit tests: fs-exec-after-break.shader_test fs-conditional-break.shader_test No Piglit or es3conform regressions. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-20 07:55:58 +03:00
Eric Anholt	6212d2402d	vc4: Translate 4-byte index buffers to 2 bytes. Fixes assertion failures in 14 piglit tests (half of which now pass).	2014-10-19 08:44:56 +01:00
Eric Anholt	572fba95e4	vc4: Add support for rebasing texture levels so firstlevel == 0. GLES2 doesn't have GL_TEXTURE_BASE_LEVEL, so the hardware doesn't. Fixes piglit levelclamp, tex-miplevel-selection, and texture-storage/2D mipmap rendering.	2014-10-19 08:42:33 +01:00
Eric Anholt	15eb4c59f6	vc4: Apply a Newton-Raphson step to improve RSQ Fixes all the piglit built-in-functions/*sqrt tests, among others.	2014-10-18 10:08:59 +01:00
Eric Anholt	1fc124b80f	vc4: Apply a Newton-Raphson step to improve RCP. Fixes all the piglit floating-point *-op-div tests, among others.	2014-10-18 10:08:59 +01:00
Eric Anholt	0fdc5111b4	vc4: Add a little bit more packet parsing to make dump reading easier. Probably should have done this before staring at all those render lists today.	2014-10-18 10:08:59 +01:00
Chris Forbes	81041c4a4a	meta/msaa-blit: consider weird sample count case unreachable Suppresses a bunch of warning noise about sample_map possibly being used uninitialized. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-10-18 19:09:28 +13:00
Jason Ekstrand	4656c14e57	i965/fs: Change the type of booleans to UD and emit correct immediates Before, we used a signed d-word for booleans and the immediates we emitted varied between signed and unsigned. This commit changes the type to unsigned (I think that makes more sense) and makes immediates more consistent. This allows copy propagation to work better and cleans up some instructions. total instructions in shared programs: 5473519 -> 5465864 (-0.14%) instructions in affected programs: 432849 -> 425194 (-1.77%) GAINED: 27 LOST: 0 Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-17 13:36:48 -07:00
Kenneth Graunke	ffe582aa20	i965/fs: Don't pass ir_variable * to emit_sampleid_setup(). gl_SampleID is a built-in variable that always is of type "int". Suggested by Connor Abbott. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2014-10-17 13:03:18 -07:00
Eric Anholt	9ebfb3014e	vc4: Make some assertions about how many flushes/EOFs the simulator sees. This caught the previous commit's bug in the kernel validator.	2014-10-17 13:13:43 +01:00
Eric Anholt	1f7048419e	vc4: Fix accidental dropping of the low bits of the store tilebuffer packet. Notably this included the EOF flag (the other bits are the full buffer dump selection, but we don't do full dumps), which caused the kernel checking for frame completion to trigger.	2014-10-17 13:09:29 +01:00
Eric Anholt	afc3aa373d	vc4: Set the primitive list format at the start of rendering. The other driver does this manually before calling into each tile, but we can just let it get binned into the tiles (saving repeated kernel validation on the packet). Fixes simulator assertion failures on polygon-mode and non-auto texwrap.	2014-10-17 13:09:28 +01:00
Eric Anholt	895c904103	vc4: Replace the FLUSH_ALL with FLUSH. We don't need to emit all of our current state at the end of each bin list. We're going to be smashing it all at the start of the next tile's bin list, anyway.	2014-10-17 13:09:28 +01:00
Eric Anholt	000976ed99	vc4: Add some comments about state management.	2014-10-17 13:09:28 +01:00
Eric Anholt	135287db17	vc4: Make sure there's exactly 1 tile store per tile coords packet. It's not documented that I can see, but the other driver does it (check vg_hw_4.c), and one of the HW guys confirmed that you really do need to do it.	2014-10-17 13:09:25 +01:00
Michel Dänzer	c4db733fac	winsys/radeon: Use a single buffer cache manager again The trick is to generate a unique buffer usage value for each possible combination of domains and flags, with only one bit set each for the domains and flags. This ensures pb_check_usage() only returns TRUE when the domains and flags the cached buffer was created for exactly match the requested ones. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-10-17 17:09:49 +09:00
Tom Stellard	e1d363b3ff	clover: Add environment variables for dumping kernel code v2 There are two debug variables: CLOVER_DEBUG which you can set to any combination of llvm,clc,asm (separated by commas) to dump llvm IR, OpenCL C, and native assembly. CLOVER_DEBUG_FILE which you can set to a file name for dumping output instead of stderr. If you set this variable, the output will be split into three separate files with different suffixes: .cl for OpenCL C, .ll for LLVM IR, and .asm for native assembly. Note that when data is written, it is always appended to the files. v2: - Code cleanups - Add CLOVER_DEBUG_FILE environment variable for dumping to a file. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-16 19:42:52 -04:00
Tom Stellard	76136c29bb	clover: Register an llvm diagnostic handler v3 This will allow us to handle internal compiler errors. v2: - Code cleanups. v3: - More cleanups. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-16 19:42:41 -04:00
Tom Stellard	8e7df519bd	clover: Add support for compiling to native object code v3 v2: - Split build_module_native() into three separate functions. - Code cleanups. v3: - More cleanups. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-16 19:42:30 -04:00
Tom Stellard	8b7cc90cef	gallium: Add PIPE_SHADER_IR_NATIVE to enum pipe_shader_ir Drivers can return this value for PIPE_COMPUTE_CAP_IR_TARGET if they want clover to give them native object code. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-16 19:42:22 -04:00
Tom Stellard	dc39b32c9b	clover: Factor kernel argument parsing into its own function v2 v2: - Code cleanups. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-16 19:42:14 -04:00
Marek Olšák	833d698ad5	st/mesa: use pipe_sampler_view_release for releasing sampler views This fixes a crash when exiting Firefox. I have really no idea how Firefox does it. It seems to involve multiple contexts and multithreading. v2: added an XXX comment Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81680 Acked by Christian König. Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org> Tested-by: Benjamin Bellec <b.bellec@gmail.com>	2014-10-16 23:31:20 +02:00
Kenneth Graunke	63c6509ad2	mesa: Drop the "target" parameter from NewBufferObject(). NewBufferObject took a "target" parameter, which it blindly passed to _mesa_initialize_buffer_object(), which ignored it. Not much point in passing it around. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-10-16 10:56:19 -07:00
Andres Gomez	af31f930ab	glsl: Update and fix typos in README.	2014-10-16 09:38:36 -07:00
Chris Forbes	2883aff3be	i965: Flag BRW_ATOMIC_COUNTER_BUFFER when a possible ABO is respecified Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-16 22:31:44 +13:00
Chris Forbes	7bd6dfe934	mesa: Mark buffer objects that are used as atomic counter buffers Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-16 22:31:44 +13:00
Chris Forbes	f1261db1ee	i965/disasm: Add missing message type for Gen7 DP untyped surface read This is used to implement GLSL's atomicCounter() intrinsic. Previously it worked, but the disassembly was bogus. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-16 22:31:43 +13:00
Chris Forbes	0dc56600aa	i965: Correctly use ABO count to trigger flagging of new surfaces. This would have almost never actually been an issue, since other state tends to get flagged at the same time as new ABOs -- but still bogus. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-16 22:31:43 +13:00
Chris Forbes	25189c72ce	i965: No longer reemit textures on BRW_NEW_UNIFORM_BUFFER This didn't make any sense, but papered over the missing TexBO flagging we've just fixed, in a bunch of cases. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-10-16 22:31:43 +13:00
Chris Forbes	1655f6fc61	i965: Dirty state in BO reallocation based on usage history Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-10-16 22:31:43 +13:00
Chris Forbes	c442745981	i965: Have mesa flag BRW_NEW_TEXTURE_BUFFER when a TexBO binding changes Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-10-16 22:31:43 +13:00
Chris Forbes	be5df28941	i965: Add new dirty flag for new TexBOs. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-10-16 22:31:43 +13:00
Chris Forbes	8db38ba4d2	mesa: Mark buffer objects that are used as TexBOs Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-10-16 22:31:43 +13:00
Chris Forbes	fe3133fe78	mesa: Mark buffer objects which are bound as UBOs When a buffer object is bound to one of the indexed uniform buffer binding points, assume that from that point on it may be used as a uniform buffer. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-10-16 22:31:43 +13:00
Chris Forbes	3d989467f1	mesa: Add usage history bitfield to buffer objects In the drivers, we occasionally want to reallocate the backing store for a buffer object; often to avoid waiting for the GPU to be finished with the previous contents. At the point that happens, we don't have a good way of determining where else the buffer object may be bound, and so no good way of determining which dirty flags need to be raised -- it's fairly expensive to go looking at all the possible binding points. Until now, we've considered any BO to be possibly bound as a UBO or TexBO, and flagged all that state to be reemitted. Instead, remember what kinds of binding point this buffer has ever been used with, so that the drivers can flag only what they need. I don't expect these bits to ever be reset, but that doesn't matter for reasonable apps. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-10-16 22:31:43 +13:00
Emil Velikov	79d09a4b12	vc4: correctly include the source files The kernel files are built into a separate static library and all the functions that require it are already wrapped in ifdef USE_VC4_SIMULATOR. Don't forget the header file :) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-10-16 10:00:14 +01:00
Connor Abbott	70fa53be5e	i965/fs: don't make a fake ir_texture in the Mesa IR frontend Now that we've made all the texture emit code mostly independent of GLSL IR, this isn't necessary any more. Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:05:25 -07:00
Kenneth Graunke	b17f571945	i965/fs: Refactor the texture emission logic into a single function. Before, we had 3 different emit functions for various different gen's, as well as some ancillary work that was the same across all gen's which was either contained in functions or duplicated across the GLSL IR and Mesa IR backends. Now, we have a single method, emit_texture(), that takes all the information needed to make a texture instruction and handles all the setup, and all we have to do to emit a texture instruction while converting from GLSL IR, Mesa IR, or any new backend is to extract the information emit_texture() needs and then call it. v2: Significant rebasing (by Ken). Signed-off-by: Connor Abbott <connor.abbott@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:05:22 -07:00
Connor Abbott	9e95d8ebf8	i965/fs: Make gather_channel() not use ir_texture. Our new IR won't have ir_texture objects. Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:05:20 -07:00
Connor Abbott	12d9a8cd86	i965/fs: Make swizzle_result() not use ir_texture. Our new IR won't have ir_texture objects. Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:05:19 -07:00
Connor Abbott	cf94dfdb96	i965/fs: fix integer textures with swizzles This happened to work before, but it would convert the output to a float and then back to an integer which seems bad. Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:05:16 -07:00
Connor Abbott	7c8f0b7cd9	i965/fs: don't pass in ir_texture to emit_texture_* At this point, the only thing it's used for is the opcode. Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:05:14 -07:00
Connor Abbott	4bffcb7e8e	i965/fs: don't use ir->type in emit_texture_gen4() We already have the type from the original destination. Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:05:05 -07:00
Connor Abbott	eaadc43192	i965/fs: Don't use ir->lod_info.grad.dPd<x,y> in emit_texture_*. This drops a dependency on ir_texture objects. v2 (Ken): Rename lod_components to grad_components, as it only has a meaningful value for ir_txd. We could set it to 1 for TXL, but there's no real need. Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:05:00 -07:00
Connor Abbott	cbde5407c9	i965/fs: Don't use ir->coordinate in emit_texture_*. This drops a dependency on ir_texture objects. Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:04:58 -07:00
Connor Abbott	a8905e8c09	i965/fs: make rescale_texcoord() not use ir_texture. Our new IR won't have ir_texture objects, but using glsl_type is fine. v2 (Ken): Drop redundant ir->coordinate NULL check; rebase. Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:04:56 -07:00
Connor Abbott	e599837fed	i965/fs: Make emit_mcs_fetch() not use ir_texture. Our new IR won't have ir_texture objects. Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:04:54 -07:00
Kenneth Graunke	465373535e	i965/fs: Rename "length" to "components" in emit_mcs_fetch(). This is slightly clearer. Based on a patch by Connor Abbott. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:04:52 -07:00
Connor Abbott	fa212c6b98	i965: Make brw_texture_offset() not use ir_texture. Our new IR won't have ir_texture objects. Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:04:50 -07:00
Connor Abbott	a71455bc99	i965/fs: don't use ir->offset in emit_texture_gen5. v2 (Ken): Refactor the Gen7 code separately; rebase. Signed-off-by: Connor Abbott <connor.abbott@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:04:47 -07:00
Kenneth Graunke	1f76fcf231	i965/fs: Move texel offset handling to visit(ir_texture *). This moves the handling of non-constant texel offset subexpression trees to the place where we visit other such subtrees. It also removes some uses of ir->offset in emit_texture_gen7, which will be useful when we write the backend for our new upcoming IR. Based on a patch by Connor Abbott. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:04:45 -07:00
Kenneth Graunke	cee2027574	i965: Drop ir->op != ir_txf condition in offset checking. brw_lower_unnormalized_offset sets ir->offset to NULL if it applies the texelFetchOffset workarounds, so there's no need to special case it here---there won't be an offset for ir_txf. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:04:43 -07:00
Kenneth Graunke	a2c3cfbb4d	i965: Restore a lost comment about TXF offset bugs. Eric's original code to work around TXF offset bugs contained a comment explaining the problem, which was lost when Chris generalized it to an IR transformation (in commit `598ca510b8`). This commit adds the original comment to the newer code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-15 17:04:27 -07:00
Rob Clark	652b8fbbbb	freedreno/ir3: large const support Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-15 15:49:49 -04:00
Rob Clark	e71a3f80fb	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-15 15:49:48 -04:00
Rob Clark	dd332fe641	freedreno: fix layer_stride Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-15 15:49:48 -04:00
Rob Clark	8233b36a17	freedreno: inline fd_draw_emit() Manual LTO Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-15 15:49:48 -04:00
Rob Clark	368466b7b7	freedreno/ir3: optimize shader key comparison Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-15 15:49:48 -04:00
Rob Clark	d595987ea3	freedreno/a3xx: refactor/optimize emit Because we reuse various bits of emit code (for state/vertex/prog/etc) for both regular draws and internal draws (gmem<->mem, clear, etc), the number of parameters getting passed around has been growing. Refactor to group these into fd3_emit. This simplifies fxn signatures, avoids passing around shader key on the stack, etc. It also gives us a nice place to cache shader-variant lookup to avoid looking up shader variants multiple times per draw (without having to also pass them around as fxn args everywhere). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-15 15:49:48 -04:00
Rob Clark	d5d80b3739	freedreno/a3xx: refactor vertex state emit Get rid of fd3_vertex_buf and use fd_vertex_state directly for all draws. Removes a tiny bit of CPU overhead for munging around the vertex state every time it is emitted, but more importantly it cleans things up for later optimizations, so the emit paths don't have to special case internal draws (gmem<->mem, clears, etc) with regular draws. Instead of constructing fd3_vertex_buf array each time for internal draws, at context init time pre-create solid_vbuf_state and blit_vbuf_state. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-15 15:49:48 -04:00
Eric Anholt	57de9bbb63	vc4: Fix the uniform debug output. I dropped the shader index when moving to the compiled shader struct, but didn't update the format string here.	2014-10-15 18:12:03 +01:00
Eric Anholt	201d4c0b2a	vc4: Add support for user clip plane and gl_ClipVertex. Fixes about 15 piglit tests about interpolation and clipping.	2014-10-15 18:11:46 +01:00
Eric Anholt	6a0bf67048	vc4: Move the output semantics setup to a helper. I want to reuse it elsewhere to set up outputs that aren't in the TGSI.	2014-10-15 18:11:46 +01:00
Kenneth Graunke	39a5a60b57	i965: Allow CSE on Gen4-5 unary math. Due to the implicit move-from-GRF, unary math looks a lot like the Gen6+ math instruction: it's a single instruction (SEND) with a GRF source. The difference is that it also implicitly clobbers a message register. The only visible effect is that CSE will remove the MRF-clobbering from later math operations. This should be fine; compute_to_mrf and remove_redundant_mrf_writes don't look at the values populated by implied writes, so they can't rely on those values being present. Less interference may actually help those passes make more progress. Binary math is still problematic, since it involves a separate MOV instruction to load the second operand. We continue disallowing CSE for binary math operations. total instructions in shared programs: 3340303 -> 3340100 (-0.01%) instructions in affected programs: 26927 -> 26724 (-0.75%) Nothing hurt, gained, or lost. ~6% reduction on a few shaders. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-15 08:44:54 -07:00
Michel Dänzer	159f93cf39	r600g,radeonsi: Only set use_staging_texture = TRUE once No need to check for setting the flag after we set it already. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-10-15 16:26:30 +09:00
Michel Dänzer	87da286755	r600g,radeonsi: Use staging texture for transfers if any miplevel is tiled We set the NO_CPU_ACCESS flag for BO allocation in that case, so direct CPU access may not work. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-10-15 16:26:14 +09:00
Michel Dänzer	3ede67a4c6	winsys/radeon: Use separate caching buffer manager for each set of flags Otherwise the caching buffer manager may return a buffer which was created with a different set of flags, which can cause trouble. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-10-15 16:11:40 +09:00
Andres Gomez	657764c21c	configure.ac: check for libexpat when no pkg-config is available Previously, when no pkg-config was available for libexpat we would just add the needed linking flags without any extra check. Now, we check that the library and the headers are also installed in the building environment. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-10-15 08:59:12 +02:00
Tom Stellard	8cf6482c3d	clover: Fix regression in module serialization We need to serialize semantic information for arguments, which was added in `06139c56fa`. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-14 17:58:06 -04:00
Jason Ekstrand	3435aa49f4	i965/fs: Use the correct regs_written on unspill instructions Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-14 12:39:45 -07:00
Ilia Mirkin	742158b51e	st/gbm: fix order of arguments passed to is_format_supported Reported by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Cc: mesa-stable@lists.freedesktop.org	2014-10-14 12:33:38 -04:00
Ilia Mirkin	5524af8136	nouveau: 3d textures are unsupported, limit 3d levels to 1 Ideally there would be a swrast fallback, but the driver isn't ready for that. This should avoid crashes if someone tries to use 3d textures though. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: mesa-stable@lists.freedesktop.org	2014-10-14 12:33:38 -04:00
Rob Clark	abe3b3d1e0	freedreno: use tgsi_lowering Now that the freedreno_lowering code is moved to tgsi_lowering, remove our private copy and switch over to using the common version. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-14 12:30:08 -04:00
David Heidelberger	d2c1d9693f	r300/compiler: remove useless check This code is already in if (!variable->C->is_r500) so no need to check twice. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: David Heidelberger <david.heidelberger@ixit.cz>	2014-10-14 12:18:32 -04:00
Nick Sarnie	e5bf8d38db	ilo: Build pipe-loader for ilo Trivial patch to create the pipe loader for ilo. All the code was already there. Signed-off-by: Nick Sarnie <commendsarnex@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-10-14 16:16:08 +01:00
Emil Velikov	af897df508	automake: explicitly set TARGET_RADEON_{WINSYS,COMMON} Originally the variables were set only once via the ?= operator but that causes issues when doing incremental builds. They appear to be undefined and missing from the dependency list despite their addition to LIBADD. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84807 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-10-14 16:16:08 +01:00
Eric Anholt	a2d8b6dbd5	vc4: Fix render target NPOT alignment at small miplevels. The texturing hardware takes the POT level 0 width/height and minifies those. This is different from what we were doing, for example, for 273-wide's level 5: POT(273>>5) == 8, while POT(273)>>5 == 16. Fixes piglit-depthstencil-render-miplevels 273.	2014-10-14 14:57:50 +01:00
Eric Anholt	b5fc9d5664	vc4: Add support for having 0 vertex elements used. You have to load at least 1, according to the simulator. Fixes 4 piglit tests and even more ES2 conformance tests.	2014-10-14 11:29:48 +01:00
Vinson Lee	a2fd55cfb6	auxiliary/os: Add DragonFly BSD support in os_get_total_physical_memory. This patch fixes this build error on DragonFly BSD. CC os/os_misc.lo os/os_misc.c: In function 'os_get_total_physical_memory': os/os_misc.c:132:2: error: #error Unsupported *BSD Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-10-13 23:40:46 -07:00
Daniel Manjarres	291be28476	glx: Fix glxUseXFont for glxWindow and glxPixmaps The current implementation of glxUseXFont requires creating a temporary pixmap and graphics context, which requires a real old-school X11 Window, not a glxDrawable. This patch changes things so that glxUseXFont will also accept a glxWindow or glxPixmap, and lookup the underlying X11 Drawable. Without this patch glxUseXFont generates a giant stream of Xerrors about bad drawables and bad graphics contexts. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54372 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-10-13 22:05:58 -06:00
Chia-I Wu	4e2cf84b1f	ilo: clear writer pointer after unmapping It does not look like an issue now but it is good to be future proof. Spotted by Courtney Goeltzenleuchter. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-10-14 08:54:25 +08:00
Eric Anholt	615bbf0ca6	vc4: Write the VPM read setup multiple times to queue all the inputs. There's a 4-element fifo, and the size (number of dwords per vertex) field is just 4 bits. Fixes glsl-routing on sim.	2014-10-13 17:16:05 +01:00
Eric Anholt	e1d1c39626	vc4: Add support for the TXL opcode. There's a bit at the bottom of cube map stride (which has some formatting bugs in the docs) which flips the bias coordinate to being an absolute LOD.	2014-10-13 17:15:47 +01:00
Eric Anholt	5bc91b6e32	vc4: Improve the accuracy of SIN and COS. This gets them to pass glsl-sin/cos. There was an obvious problem that I was using the FRC code on the scaled input value, which means that we had a range in [0, 1], while our taylor is most accurate across [-0.5, 0.5]. We can just slide things over, but that means flipping the sign of the coefficients. After that, it was just a matter of stuffing more coefficients in.	2014-10-13 17:15:47 +01:00
Kenneth Graunke	02ca66fbc3	i965: Use unsynchronized maps for the program cache on LLC platforms. There's no reason to stall on pwrite - the CPU always appends to the buffer and never modifies existing contents, and the GPU never writes it. Further, the CPU always appends new data before submitting a batch that requires it. This code predates the unsynchronized mapping feature, so we simply didn't have the option when it was written. Ideally, we would do this for non-LLC platforms too, but unsynchronized mapping support only exists for LLC systems. Saves a bunch of stall avoidance copies when uploading shaders. v2: Rebase on changes to previous patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> [v1]	2014-10-13 06:55:08 -07:00
Kenneth Graunke	4c766c7959	i965: Issue performance warnings when copying the program cache BO. We don't really want unnecessary buffer copying, so it'd be nice to know when it's happening. v2: Drop stall warnings when doing a read-only CPU mapping of the cache BO. The GPU also uses it in a read-only fashion, so there won't be any stalls, even though the buffer is busy. (Thanks to Chris Wilson for catching this mistake.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> [v1]	2014-10-13 06:53:54 -07:00
Kenneth Graunke	a4fe727a5d	i965: Issue performance warnings on MapBufferRange stalls. This is easy: we just need to use brw_map_bo instead of mapping it directly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-10-13 06:44:05 -07:00
Eric Anholt	5d72a1c956	vc4: Match VS outputs to FS inputs. If the VS doesn't output a value that the FS needs, we still need to read the right contents for the remaining FS inputs, by emitting padding. And if the VS outputs something the FS doesn't need, we shouldn't put it in the VPM at all (so the code producing it can get DCEed). Fixes 77 piglit tests.	2014-10-13 13:23:48 +01:00
Christian König	d561a42bc1	configure: use $libdir/dri as default for VA-API Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-10-13 12:43:59 +02:00
Christian König	966ae170b0	configure: remove superfluous VA-API line from configure.ac We don't have GALLIUM_STATE_TRACKERS_DIRS any more. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-10-13 12:43:54 +02:00
Christian König	d3004a267a	configure: respect $libdir for the OMX installation dir Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-10-13 12:43:49 +02:00
Christian König	5ce06d12ff	configure: Revert "ask vdpau.pc for the default location of the vdpau drivers" This reverts commit `bbe6f7f865`. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-10-13 12:43:05 +02:00
Eric Anholt	83365a5b57	vc4: Add support for the CEIL opcode. Not as big of a deal as SSG, but still +9 piglit tests.	2014-10-13 08:06:48 +01:00
Eric Anholt	926eaa9af4	vc4: Add support for the SSG opcode.	2014-10-13 08:06:48 +01:00
Emil Velikov	b86f814afd	docs: add news item and link release notes Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-10-13 02:14:02 +01:00
Emil Velikov	fc6345a916	docs: Add sha256 sums for the 10.3.1 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `fa98c74692`)	2014-10-13 02:06:29 +01:00
Emil Velikov	04fae07f0e	Add release notes for the 10.3.1 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `088d350178`)	2014-10-13 02:06:20 +01:00
Emil Velikov	66ea8a581d	docs: Add sha256 sums for the 10.2.9 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `52bd154980`)	2014-10-13 02:05:53 +01:00
Emil Velikov	f5e61295cd	Add release notes for the 10.2.9 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `9f1149876f`)	2014-10-13 02:05:22 +01:00
Glenn Kennard	a327fa3a06	r600g: Implement GL_ARB_sample_shading Also fixes two sided lighting which was broken at least on pre-evergreen by commit b1eb00. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-10-12 23:53:57 +02:00
Marek Olšák	75e97e2e3f	radeonsi: use tgsi_shader_info in si_llvm_emit_fs_epilogue This is the last use of tgsi_parse_token in radeonsi. It looks ugly because the code was re-indented, but there is really no change in behavior. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-12 23:53:57 +02:00
Marek Olšák	558f7770a7	radeonsi: remove si_shader_output_values::index It's redundant now. It led to a simplification in si_llvm_emit_streamout, because outidx == reg. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-12 23:53:57 +02:00
Marek Olšák	ec0d16872b	radeonsi: use tgsi_shader_info in si_llvm_emit_vs_epilogue That code was really ugly. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-12 23:53:57 +02:00
Marek Olšák	8067732740	radeonsi: remove shader->input[] and output[] arrays and dependencies They were reinventing tgsi_shader_info. They are unused now. radeon_llvm_context::load_input can be NULL if input fetching is implemented in some other way. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-12 23:53:57 +02:00
Marek Olšák	8b057ddaea	radeonsi: move param_offset out of shader->input[] and output[] Those are going away. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-12 23:53:57 +02:00
Marek Olšák	02134cfaae	radeonsi: use tgsi_shader_info to get a list of GS outputs Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-12 23:53:54 +02:00
Marek Olšák	101905d3f7	radeonsi: use tgsi_shader_info in si_update_spi_map Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-12 23:53:54 +02:00
Marek Olšák	6f04cf7fac	radeonsi: simplify dereferences in si_update_spi_map Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-12 23:53:54 +02:00
Marek Olšák	639f6b41d2	radeonsi: use tgsi_shader_info in si_shader_vs Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-12 23:53:54 +02:00
Marek Olšák	fa933438a2	radeonsi: use tgsi_shader_info in si_shader_ps Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-12 23:53:54 +02:00
Marek Olšák	e23fec1445	radeonsi: use tgsi_shader_info in fetch_input_gs Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-12 23:53:51 +02:00
Marek Olšák	7a645c5366	radeonsi: don't rely on shader->output in si_llvm_emit_fs_epilogue Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-12 23:52:16 +02:00
Marek Olšák	216cf86ec4	radeonsi: use tgsi_shader_info in si_llvm_emit_es_epilogue tgsi_shader_info contains everything we need. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-12 23:52:13 +02:00
Marek Olšák	34e8200599	radeonsi: don't recompile shaders when changing nr_cbufs from 0 to 1 Both cases are equivalent. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-12 23:52:07 +02:00
Marek Olšák	5e0fbe1b63	radeonsi: remove vs.ucps_enabled from the shader key Written CLIPDIST outputs are simply disabled in PA_CL_VS_OUT_CNTL. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-12 23:52:02 +02:00
Marek Olšák	a9592cd3ac	radeonsi: assume ClipDistance usage mask is always 0xf No code in Mesa sets the usage mask to any other value. The final mask is AND'ed with enable bits from the rasterizer state anyway. If somebody implements setting usage masks in st/mesa, we can use tgsi_shader_info to get it more easily. This is a prerequisite for the following commit. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-12 23:51:44 +02:00
Francisco Jerez	2286edce16	clover: Fix unintended fall-through in kernel::argument::bind.	2014-10-12 11:44:05 +03:00
Jan Vesely	5bffc5e262	clover: Append implicit arguments to the kernel argument list. [ Francisco Jerez: Split off from a larger patch, and take a slightly different approach for passing the implicit arguments around. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-12 01:50:13 +03:00
Francisco Jerez	bf89a97748	clover: Pass execution dimensions and offset to the kernel as implicit arguments. Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>	2014-10-12 01:44:19 +03:00
Francisco Jerez	06139c56fa	clover: Add semantic information to module::argument for implicit parameter passing. Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>	2014-10-12 01:39:21 +03:00
Francisco Jerez	27c51b5f58	clover: Use unreachable() from util/macros.h instead of assert(0). Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-10-11 12:44:09 +03:00
Vinson Lee	5480d6b13f	gallium: Add tokens for DragonFly BSD. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Brian Paul <brianp@vmware.com>	2014-10-10 21:32:35 -07:00
Chia-I Wu	566d1889ea	ilo: disassemble compacted instructions Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-10-11 11:55:50 +08:00
Erik Faye-Lund	326e303175	glsl: improve accuracy of atan() Our current atan()-approximation is pretty inaccurate at 1.0, so let's try to improve the situation by doing a direct approximation without going through atan. This new implementation uses an 11th degree polynomial to approximate atan in the [-1..1] range, and the following identity to reduce the entire range to [-1..1]: atan(x) = 0.5 * pi * sign(x) - atan(1.0 / x) This range-reduction idea is taken from the paper "Fast computation of Arctangent Functions for Embedded Applications: A Comparative Analysis" (Ukil et al. 2011). The polynomial that approximates atan(x) is: x * 0.9999793128310355 - x^3 * 0.3326756418091246 + x^5 * 0.1938924977115610 - x^7 * 0.1173503194786851 + x^9 * 0.0536813784310406 - x^11 * 0.0121323213173444 This polynomial was found with the following GNU Octave script: x = linspace(0, 1); y = atan(x); n = [1, 3, 5, 7, 9, 11]; format long; polyfitc(x, y, n) The polyfitc function is not built-in, but too long to include here. It can be downloaded from the following URL: http://www.mathworks.com/matlabcentral/fileexchange/47851-constraint-polynomial-fit/content/polyfitc.m This fixes the following piglit test: shaders/glsl-const-folding-01 Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-10-10 20:44:27 +02:00
Eric Anholt	070b2c2efc	vc4: Use the fnv1 hash function instead of gallium util's crc32. Improves simulated norast performance on a little benchmark by 13.4012% +/- 2.08459% (n=13).	2014-10-10 15:49:34 +02:00
Eric Anholt	d09509da2a	vc4: Don't look up the compiled shaders unless state has changed. Improves simulated norast performance on a little benchmark by 38.0965% +/- 3.27534% (n=11).	2014-10-10 15:49:22 +02:00
Eric Anholt	c6f50c4086	vc4: Actually clear the context's dirty flags. I was trying to skip state updates when !dirty, and suspiciously everything was always dirty.	2014-10-10 15:03:13 +02:00
Eric Anholt	7c474f9f2e	vc4: Optimize the other case of SEL_X_Y with a 0 -> SEL_X_0(a). Cleans up some output to be more obvious in a piglit test I'm looking at.	2014-10-10 15:03:12 +02:00
Tapani Pälli	ac557b4c12	mesa: fix error reported on glTexSubImage2D when level not valid Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-10-10 15:01:51 +03:00
Kenneth Graunke	94841b6d5d	i965: Fix register write checks. When mapping the buffer a second time, we need to use the new pointer, not the one from the previous mapping. Otherwise, we will most likely crash. Apparently, we've just been getting lucky and getting the same bo->virtual pointer in both cases. libdrm probably has a hand in that. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2014-10-10 00:04:39 +02:00
Eric Anholt	7e67ea994c	vc4: Optimize out adds of 0.	2014-10-09 21:47:06 +02:00
Eric Anholt	0401f55fff	vc4: Optimize fmul(x, 0) and fmul(x, 1). This was being generated frequently by matrix multiplies of 2 and 3-channel vertex attributes (which have the 0 or 1 loaded in the shader).	2014-10-09 21:47:06 +02:00
Eric Anholt	1cd8c1aab0	vc4: Factor out the turn-it-into-a-mov in opt_algebraic. This will be used more in the next commits.	2014-10-09 21:47:06 +02:00
Eric Anholt	40748cf8d9	vc4: Eliminate unused texture instructions.	2014-10-09 21:47:06 +02:00
Eric Anholt	b73cab6826	vc4: Dead code eliminate unused SF instructions.	2014-10-09 21:47:06 +02:00
Eric Anholt	93cac2637b	vc4: Prevent copy propagating out the MOVs from r4. Copy propagating these might result in reading the r4 after some other instruction has written r4. Just prevent all copy propagation of this for now. Fixes bad rendering with upcoming indirect register access support, where the copy propagation was consistently happening across another read.	2014-10-09 21:47:06 +02:00
Eric Anholt	c4b0dd5356	vc4: Split the coordinate shader to its own vc4_compiled_shader. Merging VS and CS into the same struct wasn't winning us anything except for not allocating a separate BO (but if we want to pack programs into BOs, we should pack not just those 2 programs together). What it was getting us was a bunch of code duplication about hash table lookups and propagating vc4_compile contents into a vc4_compiled_shader. I was about to make the situation worse with indirect uniform buffer access.	2014-10-09 21:47:06 +02:00
Eric Anholt	5c72d7706c	vc4: Add #defines for the texture uniform fields. I wanted to make another set of texture uploads for handling reladdr constants, and duplicating all the bitshifting looked like a terrible idea. In the process, this fixes a swap of the s/t texture wrap modes.	2014-10-09 21:47:06 +02:00
Eric Anholt	5cfab07639	vc4: Initialize undefined temporaries to 0. Under the simulator, reading registers before writing them triggers an assertion failure. c->undef gets treated as r0, which will usually be written, but not if it's used in the first instruction. We should definitely not be aborting in this case, and return some sort of undefined value instead. Fixes glsl-user-varying-ff.	2014-10-09 21:47:06 +02:00
Kenneth Graunke	4ce11de4ae	i965: Skip uploading border color when unnecessary. The border color is only needed when using the GL_CLAMP_TO_BORDER or (deprecated) GL_CLAMP wrap modes; all others ignore it, including the common GL_CLAMP_TO_EDGE and GL_REPEAT wrap modes. In those cases, we can skip uploading it entirely, saving a bit of space in the batchbuffer. Instead, we just point it at the start of the batch (offset 0); we have to program something, and that address is safe to read. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-09 15:43:18 +02:00
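The border-color decision described in the commit above reduces to a check on the wrap modes. A minimal sketch, assuming wrap modes are passed as enum-name strings and that `upload` is a hypothetical callback returning a batch offset:

```python
# Wrap modes that read the border color, per the commit message.
BORDER_WRAP_MODES = {"GL_CLAMP_TO_BORDER", "GL_CLAMP"}


def needs_border_color(wrap_s, wrap_t, wrap_r):
    """True if any coordinate uses a wrap mode that samples the border."""
    return any(m in BORDER_WRAP_MODES for m in (wrap_s, wrap_t, wrap_r))


def border_color_offset(wrap_modes, upload):
    """Return the offset to program into the sampler state."""
    if needs_border_color(*wrap_modes):
        return upload()  # hypothetical: uploads the color, returns its offset
    return 0  # start of the batch: safe to read, and nothing was uploaded
```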
Kenneth Graunke	b7844d1248	i965: Use BDW_MOCS_PTE for renderbuffers. Write-back caching cannot be used for buffers being scanned out by the display engine; surfaces used for scan-out must be write-through or uncached. I originally chose WT for render targets because it works in all cases. However, we really want to use write-back caching where possible, as it is more efficient. Most renderbuffers are not used for scanout - off-screen FBOs certainly are fine, and non-pageflipped backbuffers should be fine as well. So in most cases WB will work. However, we don't know what will be used for scan-out, so we instead simply use the PTE value specified by the kernel, as it knows these things. This matches our MOCS choice on Haswell. Fixes performance regressions since commit `ee4484be3d` in a microbenchmark (spotted by Eero Tamminen). Improves performance in GLBenchmark 2.7/EgyptHD by 7.44362% +/- 0.496939% (n=55) on a Broadwell GT2. Improves performance in a bunch of other microbenchmarks by ~15% or so. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reported-by: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Cc: mesa-stable@lists.freedesktop.org	2014-10-09 15:03:32 +02:00
Kenneth Graunke	d489a8a73d	i965: Add a BRW_MOCS_PTE #define. Like BDW_MOCS_WB and BDW_MOCS_WT, this specifies that we want to use all three caches (L3, LLC, and eLLC where available), but leaves the LLC caching mode up to the kernel's page table entry. This allows the kernel to pick WB/WT/UC based on whether it's using a buffer for scanout. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Cc: mesa-stable@lists.freedesktop.org	2014-10-09 15:02:55 +02:00
Kenneth Graunke	8892a6b79c	mesa: Make _mesa_print_arrays use stderr. These days, most driver debug output happens via stderr, not stdout. Some applications (such as Xephyr) also appear to close stdout which makes these messages go nowhere. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-10-09 15:02:42 +02:00
Michel Dänzer	7b4276d7ac	r600g,radeonsi: Always use GTT again for PIPE_USAGE_STREAM buffers Putting those in VRAM can cause long pauses due to buffers being moved into / out of VRAM. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84662 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-10-09 18:11:44 +09:00
Eric Anholt	5a13522898	vc4: Optimize SF(ITOF(x)) -> SF(x). This is a common production of st_glsl_to_tgsi, because CMP takes a float argument.	2014-10-09 11:01:18 +02:00
Eric Anholt	00a9aebfe0	vc4: Add some optimization of FADD(FSUB(0, x)). This is a common production of st_glsl_to_tgsi, which uses negate flags on source arguments to handle subtraction.	2014-10-09 11:01:18 +02:00
Eric Anholt	67aea92964	vc4: Mostly fix offset calculation for NPOT mipmap levels. The non-base NPOT levels are stored as POT-aligned images. We get that POT alignment by minifying the POT-aligned base level. This means that level strides are also POT aligned, so we have to tell the rendering mode config that our resource is larger than the actual requested area. Fixes the fbo-generatemipmap-formats NPOT cases. Regresses depthstencil-render-miplevels 273 * -- the texture presentation now works (where it was completely broken before), it looks like there's some overflow of image bounds happening at the lower miplevels.	2014-10-09 11:01:09 +02:00
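The NPOT mipmap sizing rule in the commit above (non-base levels are obtained by minifying the POT-aligned base level) can be sketched like this. The helper names are hypothetical stand-ins, and the sketch covers only the dimensions, not the offset or stride calculation in the driver.

```python
def next_power_of_two(n):
    """Smallest power of two >= n, for n >= 1."""
    p = 1
    while p < n:
        p *= 2
    return p


def minify(size, level):
    """Halve `size` once per mip level, never going below 1."""
    return max(1, size >> level)


def level_dims(width, height, level):
    """Storage dimensions of a mip level under the rule described above."""
    if level == 0:
        return width, height  # the base level keeps its requested size
    return (minify(next_power_of_two(width), level),
            minify(next_power_of_two(height), level))
```

For a 100x60 texture, level 1 is stored as 64x32 rather than 50x30, which is why the commit says the resource must be described as larger than the requested area.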
Eric Anholt	0b96a086cb	vc4: Move the mirrored kernel code to a kernel/ directory. Now this whole setup matches the kernel's file layout much more closely.	2014-10-09 09:46:39 +02:00
Eric Anholt	ef9914aa74	vc4: Enable LIT lowering in TGSI instead of our own code. This brings us the -128/128 clamping on the w component.	2014-10-08 22:47:39 +02:00
Eric Anholt	9773d45908	vc4: Fix scalar math opcodes to replicate their result from the X channel. Thanks to robclark for pointing out that I was probably failing to do this when I reported a "bug" in his lowering code.	2014-10-08 22:47:39 +02:00
Chia-I Wu	4e50a32be6	ilo: fix rectlist on GEN7+ It was broken by `343b014b57`. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-10-09 03:37:04 +08:00
Eric Anholt	581418585e	vc4: Add support for two-sided color. It's fairly easy, thanks to Rob Clark's lowering code. Fixes two-sided-lighting and 4 vertex-program-two-side testcases, while regressing 8 testcases that involve enabling two-sided color while only initializing one of the two colors in the VS. If you're enabling two sided color, it's of course expected that you really do set up both colors, so this is still an improvement (and when we set up a linker for TGSI, we'll hopefully fix those 8 fails).	2014-10-08 17:45:16 +02:00
Eric Anholt	4dccdbf5cb	vc4: Enable POW lowering in TGSI instead of our own code.	2014-10-08 17:42:59 +02:00
Eric Anholt	1aef5a337f	vc4: Enable DP lowering in TGSI instead of our own code.	2014-10-08 17:42:59 +02:00
Eric Anholt	4f6e4c7370	vc4: Start using tgsi_lowering for opcodes we haven't supported before.	2014-10-08 17:42:59 +02:00
Eric Anholt	f9854e169f	gallium: Rename freedreno parts of tgsi_lowering.[ch]. Acked-by: Rob Clark <robclark@freedesktop.org>	2014-10-08 17:42:59 +02:00
Eric Anholt	19df602b39	gallium: Reformat tgsi_lowering.c for the normal style. Acked-by: Rob Clark <robclark@freedesktop.org>	2014-10-08 17:42:59 +02:00
Eric Anholt	3141dc8e87	gallium: Copy fd_lowering.[ch] to tgsi_lowering.[ch] for code sharing. Lots of drivers need to transform the weird instructions in TGSI into reasonable scalar ops, and this code can make those translations canonical. Acked-by: Rob Clark <robclark@freedesktop.org>	2014-10-08 17:42:59 +02:00
Eric Anholt	84caf5a861	vc4: Set unused raddr fields to QPU_R_NOP. The simulator assertion fails if you have a write to a reg and then a read (for example, in the NOP side of an instruction), even if the read isn't used for anything. By setting unused raddrs to NOP, we avoid the problem (since only the physical registers are tracked).	2014-10-08 17:42:59 +02:00
Eric Anholt	48af7426f2	vc4: Abstract out the field-merging logic for instructions. I'm going to be doing the same logic for some more fields next.	2014-10-08 17:42:59 +02:00
Niels Ole Salscheider	acdcef6788	r600: Use DMA transfers in r600_copy_global_buffer v2: Do not demote items that are already in the pool Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>	2014-10-07 15:59:43 -04:00
Iago Toral Quiroga	fd31628c49	glsl: Optimize min/max expression trees Original patch by Petri Latvala <petri.latvala@intel.com>: Add an optimization pass that drops min/max expression operands that can be proven to not contribute to the final result. The algorithm is similar to alpha-beta pruning on a minimax search, from the field of AI. This optimization pass can optimize min/max expressions where operands are min/max expressions. Such code can appear in shaders by itself, or as the result of clamp() or AMD_shader_trinary_minmax functions. This optimization pass improves the generated code for piglit's AMD_shader_trinary_minmax tests as follows: total instructions in shared programs: 75 -> 67 (-10.67%) instructions in affected programs: 60 -> 52 (-13.33%) GAINED: 0 LOST: 0 All tests (max3, min3, mid3) improved. A full shader-db run: total instructions in shared programs: 4293603 -> 4293575 (-0.00%) instructions in affected programs: 1188 -> 1160 (-2.36%) GAINED: 0 LOST: 0 Improvements happen in Guacamelee and Serious Sam 3. One shader from Dungeon Defenders is hurt by shader-db metrics (26 -> 28), because of dropping of a (constant float (0.00000)) operand, which was compiled to a saturate modifier. Version 2 by Iago Toral Quiroga <itoral@igalia.com>: Changes from review feedback: - Squashed various cosmetic changes sent by Matt Turner. - Make less_all_components return an enum rather than setting a class member. (Suggested by Matt Turner). Also, renamed it to compare_components. - Make less_all_components, smaller_constant and larger_constant static. (Suggested by Matt Turner) - Change minmax_range to call its limits "low" and "high" instead of "range[0]" and "range[1]". (Suggested by Connor Abbott). - Use ir_builder swizzle helpers in swizzle_if_required(). (Suggested by Connor Abbott). - Make the logic clearer by rearranging the code and commenting. (Suggested by Connor Abbott). - Added comment to explain why we need to recurse twice. (Suggested by Connor Abbott). - If we cannot prune an expression, do not return early. Instead, attempt to prune its children. (Suggested by Connor Abbott). Other changes: - Instead of having a global "valid" visitor member, let the various functions that can determine this status return a boolean and check for its value to decide what to do in each case. This is more flexible and allows recursing into children of parents that could not be pruned due to invalid ranges (so related to the last bullet in the review feedback). - Make sure we always check if a range is valid before working with it. Since any use of get_range, combine_range or range_intersection can invalidate a range we should check for this situation every time we use any of these functions. Version 3 by Iago Toral Quiroga <itoral@igalia.com>: Changes from review feedback: - Now we can make get_range, combine_range and range_intersection static too (suggested by Connor Abbott). - Do not return NULL when looking for the larger or greater constant into mixed vector constants. Instead, produce a new constant by doing a component-wise minmax. With this we can also remove the validations when we call into these functions (suggested by Connor Abbott). - Add a comment explaining the meaning of the baserange argument in prune_expression (suggested by Connor Abbott). Other changes: - Eliminate minmax expressions operating on constant vectors with mixed values by resolving them. No piglit regressions observed with Version 3. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76861 Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2014-10-07 12:37:51 +02:00
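The idea behind the min/max pruning pass above (drop operands that provably cannot contribute to the result) can be illustrated with a much simpler, scalar-only sketch. It checks only the immediate operands of each node, whereas the commit describes an alpha-beta style traversal that also carries a base range down the tree. The tuple representation is an assumption for illustration; the function names `get_range` and `prune` borrow from the commit message but this is not the GLSL IR pass itself, and NaN handling is ignored.

```python
import math


def get_range(expr):
    """Return (low, high) bounds for a toy expression tree.

    Nodes: ("const", value), ("var", name), ("min", a, b), ("max", a, b).
    """
    kind = expr[0]
    if kind == "const":
        return expr[1], expr[1]
    if kind == "var":
        return -math.inf, math.inf  # nothing is known about a variable
    lo_a, hi_a = get_range(expr[1])
    lo_b, hi_b = get_range(expr[2])
    if kind == "min":
        return min(lo_a, lo_b), min(hi_a, hi_b)
    return max(lo_a, lo_b), max(hi_a, hi_b)  # kind == "max"


def prune(expr):
    """Drop an operand when the other one always wins."""
    kind = expr[0]
    if kind not in ("min", "max"):
        return expr
    a, b = prune(expr[1]), prune(expr[2])
    lo_a, hi_a = get_range(a)
    lo_b, hi_b = get_range(b)
    if kind == "min":
        if hi_a <= lo_b:
            return a  # a is never larger than b
        if hi_b <= lo_a:
            return b
    else:
        if lo_a >= hi_b:
            return a  # a is never smaller than b
        if lo_b >= hi_a:
            return b
    return (kind, a, b)
```

For example, `max(min(x, 0.5), 1.0)` collapses to the constant `1.0`, while a genuine clamp such as `min(max(x, 0.0), 1.0)` is left untouched.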
Tapani Pälli	16b53005a7	glsl: do not emit error for non written varyings on OpenGL ES Patch fixes following test case from 'shaders-with-varyings' WebGL conformance suite: "vertex shader with unused varying and fragment shader with used varying must succeed" v2: emit still a warning if the condition happens (Ian) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-10-07 08:28:51 +03:00
Michel Dänzer	be0a994fb8	radeonsi: Use dummy pixel shader if compilation of the real shader failed Instead of crashing. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79155#c5 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-10-07 12:07:13 +09:00
Chia-I Wu	f358462640	ilo: let shaders determine surface counts When a shader needs N surfaces, we should upload N surfaces and not depend on how many are bound. This commit is larger than it should be because we did not export how many surfaces a surface uses before. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-10-06 15:10:30 +08:00
Chia-I Wu	ca824e6940	ilo: let shaders determine sampler counts When a shader needs N samplers, we should upload N samplers and not depend on how many are bound. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-10-04 23:18:51 +08:00
Marek Olšák	0c4bc1e292	tgsi: change tgsi_shader_info::properties to a one-dimensional array Reviewed-by: Roland Scheidegger <sroland@vmware.com> v2: fix svga too	2014-10-04 15:36:39 +02:00
Marek Olšák	1f6c0b55df	radeonsi: set number of userdata SGPRs of GS copy shader to 4 It only needs the constant buffer with clip planes and read-write resources for the GS->VS ring and streamout. That's 2 pointers. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-04 15:16:15 +02:00
Marek Olšák	68d36c0bb5	radeonsi: pass the GS shader directly to si_generate_gs_copy_shader Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-04 15:16:15 +02:00
Marek Olšák	aeb05f011e	radeonsi: set LLVMByValAttribute for all descriptor arrays I hope this is correct. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-04 15:16:15 +02:00
Marek Olšák	91f1a79f78	radeonsi: make the vertex shader key smaller We only support 16 vertex attribs, not 32. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-04 15:16:14 +02:00
Marek Olšák	90611297fa	radeonsi: don't flush shader caches when building PM4 shader states This is a wrong place to flush caches to say the least. I don't think we need to flush the instruction caches if we don't patch shaders with DMA. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-04 15:16:14 +02:00
Marek Olšák	10e386f4aa	radeonsi: remove interp_at_sample from the key, use TGSI_INTERPOLATE_LOC_SAMPLE st/mesa has the same flag in its shader key, we don't need to do it in the driver anymore. Instead, use TGSI_INTERPOLATE_LOC_SAMPLE, which is what st/mesa sets. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-04 15:16:14 +02:00
Marek Olšák	0a2d6f0c4e	radeonsi: move geometry shader properties from si_shader to si_shader_selector Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-04 15:16:14 +02:00
Marek Olšák	54de709911	radeonsi: always compile shaders on demand The first compiled shader is sometimes useless, because the key doesn't match the key for the draw call where it's used. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-04 15:16:14 +02:00
Marek Olšák	6c9f61c97e	radeonsi: remove unused variable si_shader::gs_input_prim Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-04 15:16:14 +02:00
Marek Olšák	7dc0164192	tgsi: remove some not so useful variables from tgsi_shader_info	2014-10-04 15:16:14 +02:00
Marek Olšák	8860584045	radeonsi: get fs_write_all from tgsi_shader_info directly Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-04 15:16:14 +02:00
Marek Olšák	8908fae243	tgsi: simplify shader properties in tgsi_shader_info Use an array of properties indexed by TGSI_PROPERTY_* definitions.	2014-10-04 15:16:14 +02:00
Marek Olšák	5233568861	radeonsi: get tgsi_shader_info only once before compilation Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-04 15:16:14 +02:00
Marek Olšák	af4f5a7c97	gallium/util: add util_bitcount64 I'll need this in radeonsi. v2: use __builtin_popcountll if available Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-10-04 15:16:14 +02:00
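A portable population count of the kind a util_bitcount64 fallback might use can be sketched as below. This is an assumption about the general technique (clearing the lowest set bit per iteration), not the actual gallium implementation, which per the commit message uses __builtin_popcountll when available.

```python
def bitcount64(n):
    """Count set bits in the low 64 bits of n."""
    n &= 0xFFFFFFFFFFFFFFFF  # treat the input as a 64-bit unsigned value
    count = 0
    while n:
        n &= n - 1  # clears the lowest set bit
        count += 1
    return count
```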
Marek Olšák	837907b8b3	radeonsi: fix CS tracing and remove excessive CS dumping	2014-10-04 15:16:14 +02:00
Ilia Mirkin	c74be01e80	gk110/ir: add dnz flag emission for fmul/fmad Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-10-03 20:37:59 -04:00
Ilia Mirkin	d58037ccf5	gm107/ir: add dnz emission for fmul Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.3" <mesa-stable@lists.freedesktop.org>	2014-10-03 20:37:59 -04:00
Brian Paul	90dc71b454	st/wgl: add WINAPI qualifiers on wgl function typedefs Fixes a release build segfault when wglCreateContextAttribsARB() calls the wglCreateContext() function. Cc: "10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matthew McClure <mcclurem@vmware.com>	2014-10-03 13:45:52 -06:00
Rob Clark	7297bdbd50	freedreno: query fixes Fixes a few issues, including a potential empty-IB (which triggers gpu hangs in piglit occlusion_query_meta_no_fragments) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-03 14:19:52 -04:00
Rob Clark	a262c601d3	freedreno/a3xx: handle VS only outputting BCOLOR Possibly we should map the front color to black (zeroes). But not sure there is a way to do that without generating a shader variant. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-03 14:19:52 -04:00
Rob Clark	af4d088395	freedreno/ir3: fix lockups with lame FRAG shaders Shaders like: FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL TEMP[0], LOCAL IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[0].xyyy, SAMP[0], 2D 1: MOV OUT[0], IMM[0].xyxx 2: END cause unhappiness. They have an IN[], but once this is compiled the useless TEX instruction goes away, leaving a varying that is never fetched, which makes the hw unhappy. In the process fix a signed vs unsigned compare. If the vertex shader has max_reg=-1, MAX2() vs an unsigned would not give the desired result. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-03 14:19:52 -04:00
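The signed versus unsigned MAX2() problem mentioned in the commit above can be emulated in Python with `ctypes`, assuming 32-bit `int` and `unsigned` as on typical C targets. The function names are hypothetical; the point is only that -1 converted to unsigned becomes the largest representable value.

```python
import ctypes


def max2_as_unsigned(a, b):
    """Emulate MAX2(a, b) after C converts both operands to 32-bit unsigned."""
    ua = ctypes.c_uint32(a).value
    ub = ctypes.c_uint32(b).value
    return ua if ua > ub else ub


def max2_as_signed(a, b):
    """MAX2(a, b) when both operands stay signed."""
    return a if a > b else b
```

With `max_reg = -1` on one side, the unsigned comparison yields 4294967295 instead of the other operand, which is the undesired result the commit describes.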
Matt Turner	cabc93c5ad	i965/compaction: Disable compaction on SNB temporarily. Will investigate after XDC.	2014-10-03 10:41:57 -07:00
Matt Turner	0d5c9bf1e4	Revert "i965: Emit ELSE/ENDIF JIP with type D on Gen 7." This reverts commit `54e30dbf4d`. Will investigate after XDC. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84557	2014-10-03 10:02:24 -07:00
Matt Turner	b59db8e0f0	i965/fs: Remove dead generate_rep_fb_write prototype. Added in commit `f9dc7aab`.	2014-10-03 10:02:24 -07:00
Brian Paul	c7f0755caa	mesa: fix spurious wglGetProcAddress / GL_INVALID_OPERATION error On Windows, the Piglit primitive-restart test was failing a glGetError()==0 assertion when it was run w/out any command line arguments. Piglit's all.py script only runs primitive-restart with arguments so this case isn't normally hit during a full piglit run. The basic problem is Microsoft's opengl32.dll calls glFlush from wglGetProcAddress() and Piglit uses wglGetProcAddress() to resolve glPrimitiveRestartNV() which is called inside glBegin/End. See comments in the code for more info. Plus, improve the comments for _mesa_alloc_dispatch_table(). Cc: <mesa-stable@lists.freedesktop.org> Acked-by: Sinclair Yeh <syeh@vmware.com>	2014-10-03 10:04:48 -06:00
Ilia Mirkin	33c9ad97bf	freedreno/ir3: add TXF support Still failing a bunch of the fairly picky texelFetch tests, but the 1D(Array) ones are full passes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-10-02 23:30:47 -04:00
Ilia Mirkin	e6acf3ac24	freedreno/ir3: add TXD support and expose ARB_shader_texture_lod Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-10-02 23:30:47 -04:00
Ilia Mirkin	c49107c889	freedreno/ir3: add texture offset support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-10-02 23:30:47 -04:00
Ilia Mirkin	5bba74c64b	freedreno/ir3: shadow comes before array Experimentally, this makes *ArrayShadow tex-miplevel-selection tests pass. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-10-02 23:30:47 -04:00
Ilia Mirkin	81b34e4461	freedreno/ir3: make TXQ return integers, not floats We're still doing something wrong for array textures. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-10-02 23:30:47 -04:00
Ilia Mirkin	c4e2a196c3	freedreno/ir3: add UMAD support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-10-02 23:30:47 -04:00
Ilia Mirkin	347bc197a6	freedreno/ir3: add ISSG support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-10-02 23:30:47 -04:00
Ilia Mirkin	ad5db64e7e	freedreno/ir3: add MOD support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-10-02 23:30:47 -04:00
Ilia Mirkin	cab3cb1d71	freedreno/ir3: add UMOD support, based on UDIV Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-10-02 23:30:47 -04:00
Ilia Mirkin	8f7d01c2cb	freedreno/ir3: add IDIV/UDIV support Logic shamelessly copied from nv50 lowering pass. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-10-02 23:30:47 -04:00
Michel Dänzer	ed03747e6a	radeonsi: Clear sampler view flags when binding a buffer Fixes assertion failure while running the Unreal Engine 4 Elemental demo: .../si_blit.c:322:si_decompress_color_textures: Assertion `tex->cmask.size || tex->fmask.size' failed. Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-10-03 11:15:38 +09:00
Eric Anholt	ca00070259	vc4: Add support for framebuffer sRGB encoding.	2014-10-02 18:29:18 -07:00
Eric Anholt	24d9980562	vc4: Add support for sampling from sRGB. This isn't perfect -- the filtering is happening on the srgb values, and we're decoding afterwards, which is not what you want. I think that's the cause of some additional texwrap(GL_CLAMP, LINEAR) failures, though many other texwrap tests on srgb start to pass since unfiltered values come out correct.	2014-10-02 18:28:45 -07:00
Ilia Mirkin	3dd9a0d6fd	freedreno/ir3: avoid fan-in sources referring to same instruction Since the RA has to be done s.t. each one gets its own (adjacent) register, it would complicate matters if instructions were allowed to be repeated. This enables copy-propagation use in situations where previously that might have happened. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-02 21:05:50 -04:00
Rob Clark	f5eeb8a6dc	freedreno/a3xx: emit all immediates in one shot Makes the command stream a bit tighter when there are lots of immediates. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-02 21:05:50 -04:00
Ilia Mirkin	be00852bae	freedreno: instanced drawing/compute not yet supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-02 21:05:50 -04:00
Dave Airlie	8df3c02cdc	mesa: fix GetTexImage for 1D array depth textures While running piglit in virgl, I hit an assert in the intel driver. "qemu-system-x86_64: intel_tex.c:219: intel_map_texture_image: Assertion `tex_image->TexObject->Target != 0x8C18 || h == 1' failed." Thanks to Eric and Ken for pointing me in the right direction. Fix the get_tex_depth to do the same fixup as get_tex_rgba does for 1D array textures. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-10-03 10:37:55 +10:00
Tomasz Figa	b4ffd19e6c	st/mesa: Fix paths used in Android builds With current makefiles the build fails because source and build paths are generated incorrectly. With Android build system the top_srcdir and top_builddir variables are undefined and all paths are relative to where Android.mk is located. This ends up with paths like external/mesa/src/mesa/src/mesa/ for both source and build paths, which are obviously wrong. This patch fixes this by overriding resulting SRCDIR and BUILDDIR variables with empty string, so that paths end up being relative to Android.mk file again. Appending correct build path to generated files is already done in Android.gen.mk. Signed-off-by: Tomasz Figa <tomasz.figa@gmail.com> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-10-03 01:25:35 +01:00
Tomasz Figa	98445fd25e	st/mesa: Generate format_info.c in Android builds Current Android makefiles lack generation of format_info.c, which is a dependency of main/format.c. This patch adds necessary code to Android.gen.mk. Signed-off-by: Tomasz Figa <tomasz.figa@gmail.com> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-10-03 01:25:32 +01:00
Tomasz Figa	d703abf735	util: Include in Android builds This patch fixes Android build failures by including src/util directory in compilation. Files inside of this directory are compiled into libmesa_util static library and linked with resulting libGLES_mesa. Signed-off-by: Tomasz Figa <tomasz.figa@gmail.com> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-10-03 01:25:28 +01:00
Jason Ekstrand	493bfa54a5	i965/fs: Use the correct base_mrf for spilling pairs in SIMD8 Before, we were hard-coding the base_mrf based on dispatch width not number of registers spilled at a time. This caused us to emit instructions with a base_mrf of 14 and a mlen of 3 so we used the magical non-existent m16 register. This fixes the problem. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-02 16:38:25 -07:00
Jason Ekstrand	50d0e2e118	i965/fs: Add a MAX_GRF_SIZE define and use it various places Previously, we had a MAX_SAMPLER_MESSAGE_SIZE which we used instead. However, some FB write messages can validly be longer than this so we need something different. Since MAX_SAMPLER_MESSAGE_SIZE is validly useful on its own, we leave it alone and add a new MAX_GRF_SIZE that's big enough for FB writes. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84539 Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-02 14:14:25 -07:00
Jason Ekstrand	b33e5465a7	i965/fs: Use the actual register width in brw_reg_from_fs_reg This fixes a bug where 1-wide operations don't properly translate down to 1-wide instructions. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-02 13:17:03 -07:00
Jason Ekstrand	75986830b4	i965/fs_fp: Use null_reg from fs_visitor instead of rolling our own Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84529 Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-10-02 13:17:03 -07:00
Rob Clark	7309c6126f	freedreno/a3xx: handle large shader program sizes Above a certain limit use CACHE mode instead of BUFFER mode. This should solve gpu hangs with large shader programs. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-02 13:57:07 -04:00
Rob Clark	d01ee5923d	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-02 13:57:07 -04:00
Ilia Mirkin	3dc47c5960	freedreno: dual-source render targets are not supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-02 13:57:07 -04:00
Ilia Mirkin	786f01c492	gallium/hud: use u_sampler_view_default_template helper The existing code was not setting several fields, most importantly the target, which is required on nv50/nvc0. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-10-02 12:18:21 -04:00
Iago Toral Quiroga	db8cd4d519	glsl: Fix memory leak in builtin_builder::_image_prototype. in_var calls the ir_variable constructor, which dups the variable name. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-10-02 15:39:05 +02:00
Tapani Pälli	f4b4ae8c24	mesa: relax draw api validation on ES2 Patch fixes failing test in WebGL conformance test 'point-no-attributes' when running Chrome on OpenGL ES. (Shader program may draw points using constant data in shader.) No Piglit regressions. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-02 11:55:13 +03:00
Ilia Mirkin	3914dc579e	glsl: make consistent use of DECLARE_RALLOC_CXX_OPERATORS Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-10-02 00:59:35 -04:00
Eric Anholt	4111b1d54b	vc4: Fix the mapping of the minification filter to HW values. They're actually as documented in the HW specs and the GL mipmapping enums order. Fixes fbo-generatemipmap-filtering, and some other tests where we were off by a few bits due to unexpected linear filtering.	2014-10-01 17:03:36 -07:00
Eric Anholt	75f8e0bc2a	vc4: Make the last static array in vc4_program.c dynamically sized.	2014-10-01 17:03:35 -07:00
Eric Anholt	ebff93ac19	vc4: Fix some broken indentation.	2014-10-01 17:03:35 -07:00
Eric Anholt	d7a0502a54	vc4: Add support for the FACE semantic. Fixes glsl-fs-frontfacing.	2014-10-01 17:03:35 -07:00
Eric Anholt	1bf2d17a60	vc4: Add support for TGSI_OPCODE_CLAMP. This will be used by the shared LIT lowering code.	2014-10-01 17:03:35 -07:00
Eric Anholt	0c8c7d32f0	vc4: Fix compiler warning	2014-10-01 17:03:35 -07:00
Anuj Phogat	25266b2c11	meta: Fix make check failures in setup_glsl_msaa_blit_scaled_shader() introduced by commit `68ee950`. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reported-by: Mark Janes <mark.a.janes@intel.com>	2014-10-01 15:27:31 -07:00
Brian Paul	44b500f5f2	mesa: fix _mesa_alloc_dispatch_table() declaration Insert 'void' parameter to match declaration in api_exec.h. Trivial.	2014-10-01 15:17:47 -06:00
Roland Scheidegger	dea0fcf4e6	meta: (trivial) remove accidental double semicolon	2014-10-01 23:14:46 +02:00
Anuj Phogat	4330fa970b	i965: Enable EXT_framebuffer_multisample_blit_scaled for gen8 Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-10-01 12:04:15 -07:00
Anuj Phogat	68ee950c78	meta: Implement ext_framebuffer_multisample_blit_scaled extension Extension enables doing a multisample buffer resolve and buffer scaling using a single glBlitFramebuffer() call. Currently, we have this extension implemented in BLORP which is only used by SNB and IVB. This patch implements the extension in meta path which makes it available to Broadwell. Implementation features: - Supports scaled resolves of 2X, 4X and 8X multisample buffers. - Avoids unnecessary shader compilations by storing the pre compiled shaders for each supported sample count. - Uses bilinear filtering for both GL_SCALED_RESOLVE_FASTEST_EXT and GL_SCALED_RESOLVE_NICEST_EXT filter options. This is an allowed behavior in the extension's spec. - I tried doing bicubic filtering for GL_SCALED_RESOLVE_NICEST_EXT filter. It made the edges in the image look a little smoother but the image gets blurred causing no overall quality improvement. For now I have dropped the idea of doing different filtering for nicest filter. V2: - Minor changes to simplify the fragment shader. - Refactor the code to move i965 specific sample_map computation out of Meta. We now use ctx->Const.SampleMap{2,4,8}x variables initialized by the driver. - Use a simple msaa resolve shader for scaled resolves with scaling factor = 1.0. V3: - Make changes to create a string out of ctx->Const.SampleMap{2,4,8}x variables and use it in fragment shader. V4: - Make changes to use uint8_t type ctx->Const.SampleMap{2,4,8}x variables. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-10-01 12:04:15 -07:00
Anuj Phogat	7a4790148c	i965: Initialize the SampleMap{2,4,8}x variables with values specific to Intel hardware. V2: Define and use gen6_get_sample_map() function to initialize the variables. V3: Change the function name to gen6_set_sample_maps() and use memcpy() to fill in the data. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-10-01 12:04:15 -07:00
Anuj Phogat	38cd40faab	mesa: Add new variables in gl_context to store sample layout SampleMap{2,4,8}x variables are used in later patches to implement EXT_framebuffer_multisample_blit_scaled extension. V2: Use integer array instead of a string. Bump up the comment. V3: Use uint8_t type array. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-10-01 12:04:15 -07:00
Leo Liu	4f7916ab4f	st/va: implement vlVa(Query\|Create\|Get\|Put\|Destroy)Image This patch implements functions for images support, which basically supports copy data between video surface and user buffers, in this case supports SW decode, and other video output v2: fix buffer size for odd-sized image case expose I420 format as well v3: fix YUV 4:2:2 format data buffer size cleanup I420 format exposure Signed-off-by: Leo Liu <leo.liu@amd.com>	2014-10-01 13:21:36 -04:00
Christian König	7913c8943a	st/va: implement Picture functions for mpeg2 h264 and vc1 This patch implements codec for mpeg2 h264 and vc1, populates codec parameters and pass them to HW driver. Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com>	2014-10-01 13:21:36 -04:00
Christian König	1be5515838	st/va: implement Context Surface and Buffer This patch implements context management and relates it to the HW driver, functions for video surface management, and functions for application data memory buffer management. implemented functions: vlVa(Create\|Destroy)Context vlVa(Create\|Destroy\|Put)Surfaces vlVa(Create\|Destroy)Buffer Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com>	2014-10-01 13:21:36 -04:00
Christian König	2825ef3abf	st/va: implement vlVa(Create\|Destroy\|Query\|Get)Config This patch is for application to query configuration, such as profiles, entrypoints, and attributes v2: fix missing profile with query Signed-off-by: Michael Varga <michael.varga@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com>	2014-10-01 13:21:36 -04:00
Christian König	3867933ecb	st/va: skeleton VAAPI state tracker This patch adds a skeleton VA-API state tracker, which is filled with life in the subsequent patches. v2: fixes in configure.ac and va state_tracker Makefile.am v3: do not link against libva. detect libva version, and correctly set driver entrypoint name. rebase(cleanup) targets/va/Makefile.am v4: cleanup va version auto detection add back targets/va/va.sym Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-10-01 13:21:36 -04:00
Leo Liu	0eb8f89981	st/vdpau: move common functions to util Break out these functions so that they can be shared with other state trackers. They will be used in subsequent patches for the new VA-API state tracker. Signed-off-by: Leo Liu <leo.liu@amd.com>	2014-10-01 13:21:36 -04:00
Rob Clark	204dd73c99	freedreno: max-texture-lod-bias should be 15.0f Fixes piglit lodbias test. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-10-01 07:28:06 -04:00
Kenneth Graunke	95073a2dca	mesa: Avoid flagging _NEW_VIEWPORT on redundant viewport updates. Cuts the number of i965 color calculator viewport uploads by 100x (11017983 -> 113385) in 'x11perf -gc' with Glamor in Xephyr. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-10-01 01:08:26 -07:00
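The early-out described in the viewport commit above can be sketched roughly as follows. This is an illustration only, not Mesa's code: `struct context`, `set_viewport()` and `NEW_VIEWPORT` are hypothetical stand-ins for the real context, entry point and `_NEW_VIEWPORT` flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Mesa's _NEW_VIEWPORT dirty bit. */
#define NEW_VIEWPORT (1u << 0)

struct viewport {
   float x, y, width, height;
};

/* Hypothetical, heavily reduced context. */
struct context {
   struct viewport vp;
   uint32_t new_state;   /* accumulated dirty bits */
};

/* Update the viewport, flagging it dirty only if a value changed.
 * Returns true when state was actually modified. */
static bool
set_viewport(struct context *ctx, float x, float y, float w, float h)
{
   if (ctx->vp.x == x && ctx->vp.y == y &&
       ctx->vp.width == w && ctx->vp.height == h)
      return false;   /* redundant update: leave the dirty bits alone */

   ctx->vp.x = x;
   ctx->vp.y = y;
   ctx->vp.width = w;
   ctx->vp.height = h;
   ctx->new_state |= NEW_VIEWPORT;
   return true;
}
```

Because the dirty bit stays clear on a redundant call, any state upload keyed on that bit is skipped, which is where the reduction in uploads quoted in the commit would come from.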
Kenneth Graunke	0a1730200e	i965: Drop CACHE_NEW_VS_PROG from the gen7_sf_state atom. I believe when I wrote this code, gen6_sf_state used CACHE_NEW_VS_PROG, which has since been replaced by BRW_NEW_VUE_MAP_GEOM_OUT. It's not needed here anyway - only SBE needs it. Just a copy and paste mistake. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-10-01 01:08:07 -07:00
Kenneth Graunke	106e0db769	i965: Drop brwBindProgram driver hook. This function flagged BRW_NEW__PROGRAM When ctx->{Vertex,Geometry,Fragment}Program._Current changes, core Mesa calls the BindProgram driver hook, which flagged BRW_NEW__PROGRAM. However, brw_upload_state also checks for that changing, sets the same flags, and also updates brw->fragment_program and so on. So, this looks to be entirely redundant. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-01 01:05:41 -07:00
Kenneth Graunke	e25a453b7f	i965: Add missing /* BRW_NEW_FRAGMENT_PROGRAM */ comments. I had to dig a bit to figure out why this was necessary. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-01 01:05:39 -07:00
Kenneth Graunke	3d31ed0d93	i965: Use "1ull" instead of "1" in BRW_NEW_* defines. Now that the bitfield is a uint64_t, we should use 1ull. Currently, we only have 32 entries, so 1 works fine, but it's not future-proof. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-01 01:05:38 -07:00
Kenneth Graunke	a114f452ae	i965: Use ~0ull when flagging all BRW_NEW_* dirty flags. ~0 is 0xFFFFFFFF, which only covers the first 32 bits. We need all 64. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-01 01:05:36 -07:00
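A small sketch of why the `ull` suffix matters once the dirty-bit field is a `uint64_t`. The `DEMO_NEW_*` names are hypothetical, not the driver's real `BRW_NEW_*` list. The 32-bit case uses `~0u` (an unsigned constant) as an assumption, so that widening to 64 bits is a plain zero-extension on platforms with a 32-bit `int`.

```c
#include <stdint.h>

/* Hypothetical dirty bits; the real BRW_NEW_* list is not reproduced here. */
#define DEMO_NEW_LOW   (1ull << 3)
#define DEMO_NEW_HIGH  (1ull << 40)   /* a plain (1 << 40) does not fit in a 32-bit int */

/* Mask built from a 32-bit unsigned constant: the upper 32 bits stay clear. */
static uint64_t
all_dirty_32(void)
{
   return ~0u;
}

/* Mask built from a 64-bit constant: every bit is set. */
static uint64_t
all_dirty_64(void)
{
   return ~0ull;
}
```

With the 32-bit mask, a flag such as `DEMO_NEW_HIGH` would never be reported dirty; with the 64-bit mask it is.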
Kenneth Graunke	5105f9a7ae	i965: Fix INTEL_DEBUG=state to work with 64-bit dirty bits. This will keep INTEL_DEBUG=state working when we add BRW_NEW_* bits beyond 1 << 31. We missed doing this when widening the driver flags from uint32_t to uint64_t. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-01 01:05:35 -07:00
Kenneth Graunke	fbebd5e4a5	i965: Delete CACHE_NEW_BLORP_CONST_COLOR_PROG. Unused since krh rewrote fast clears to use meta. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-01 01:05:24 -07:00
Chris Forbes	e4e3b0fc0d	i965: Fix typo in comment Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-01 18:37:06 +13:00
Chris Forbes	d8c5c4f3e4	i965: Fix spelling of GEN7_SAMPLER_EWA_ANISOTROPIC_ALGORITHM Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-10-01 18:37:06 +13:00
Vinson Lee	6a238ac0b7	llvmpipe: Add missing LLVMGetGlobalContext() arg in lp_test_format.c. Fix build error introduced with commit `eedbce9c63`. lp_test_format.c: In function ‘test_format_unorm8’: lp_test_format.c:226:4: error: too few arguments to function ‘gallivm_create’ gallivm = gallivm_create("test_module_unorm8"); ^ In file included from ../../../../src/gallium/auxiliary/gallivm/lp_bld_format.h:38:0, from lp_test_format.c:42: ../../../../src/gallium/auxiliary/gallivm/lp_bld_init.h:58:1: note: declared here gallivm_create(const char *name, LLVMContextRef context); ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84538 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-09-30 21:52:13 -07:00
Keith Packard	3202926746	glx/dri3: Provide error diagnostics when DRI3 allocation fails Instead of just segfaulting in the driver when a buffer allocation fails, report error messages indicating what went wrong so that we can debug things. As a simple example, chromium wraps Mesa in a sandbox which doesn't allow access to most syscalls, including the ability to create shared memory segments for fences. Before, you'd get a simple segfault in mesa and your 3D acceleration would fail. Now you get: $ chromium --disable-gpu-blacklist [10618:10643:0930/200525:ERROR:nss_util.cc(856)] After loading Root Certs, loaded==false: NSS error code: -8018 libGL: pci id for fd 12: 8086:0a16, driver i965 libGL: OpenDriver: trying /local-miki/src/mesa/mesa/lib/i965_dri.so libGL: Can't open configuration file /home/keithp/.drirc: Operation not permitted. libGL: Can't open configuration file /home/keithp/.drirc: Operation not permitted. libGL error: DRI3 Fence object allocation failure Operation not permitted [10618:10618:0930/200525:ERROR:command_buffer_proxy_impl.cc(153)] Could not send GpuCommandBufferMsg_Initialize. [10618:10618:0930/200525:ERROR:webgraphicscontext3d_command_buffer_impl.cc(236)] CommandBufferProxy::Initialize failed. [10618:10618:0930/200525:ERROR:webgraphicscontext3d_command_buffer_impl.cc(256)] Failed to initialize command buffer. This made it pretty easy to diagnose the problem in the referenced bug report. Bugzilla: https://code.google.com/p/chromium/issues/detail?id=415681 Signed-off-by: Keith Packard <keithp@keithp.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 21:23:04 -07:00
Keith Packard	f7a355556e	glx/dri3: Use four buffers until X driver supports async flips A driver which doesn't have async flip support will queue up flips without any way to replace them afterwards. This means we've got a scanout buffer pinned as soon as we schedule a flip and so we need another buffer to keep from stalling. When vblank_mode=0, if there are only three buffers we do: current scanout buffer = 0 at MSC 0 Render frame 1 to buffer 1 PresentPixmap for buffer 1 at MSC 1 This is sitting down in the kernel waiting for vblank to become the next scanout buffer Render frame 2 to buffer 2 PresentPixmap for buffer 2 at MSC 1 This cannot be displayed at MSC 1 because the kernel doesn't have any way to replace buffer 1 as the pending scanout buffer. So, best case this will get displayed at MSC 2. Now we block after this, waiting for one of the three buffers to become idle. We can't use buffer 0 because it is the scanout buffer. We can't use buffer 1 because it's sitting in the kernel waiting to become the next scanout buffer and we can't use buffer 2 because that's the most recent frame which will become the next scanout buffer if the application doesn't manage to generate another complete frame by MSC 2. With four buffers, we get: current scanout buffer = 0 at MSC 0 Render frame 1 to buffer 1 PresentPixmap for buffer 1 at MSC 1 This is sitting down in the kernel waiting for vblank to become the next scanout buffer Render frame 2 to buffer 2 PresentPixmap for buffer 2 at MSC 1 This cannot be displayed at MSC 1 because the kernel doesn't have any way to replace buffer 1 as the pending scanout buffer. So, best case this will get displayed at MSC 2. The X server will queue this swap until buffer 1 becomes the scanout buffer. 
Render frame 3 to buffer 3 PresentPixmap for buffer 3 at MSC 1 As soon as the X server sees this, it will replace the pending buffer 2 swap with this swap and release buffer 2 back to the application Render frame 4 to buffer 2 PresentPixmap for buffer 2 at MSC 1 Now we're in a steady state, flipping between buffer 2 and 3 waiting for one of them to be queued to the kernel. ... current scanout buffer = 1 at MSC 1 Now buffer 0 is free and (e.g.) buffer 2 is queued in the kernel to be the scanout buffer at MSC 2 Render frames, flipping between buffer 0 and 3 When the system can replace a queued buffer, and we update Present to take advantage of that, we can use three buffers and get: current scanout buffer = 0 at MSC 0 Render frame 1 to buffer 1 PresentPixmap for buffer 1 at MSC 1 This is sitting waiting for vblank to become the next scanout buffer Render frame 2 to buffer 2 PresentPixmap for buffer 2 at MSC 1 Queue this for display at MSC 1 1. There are three possible results: 1) We're still before MSC 1. Buffer 1 is released, buffer 2 is queued waiting for MSC 1. 2) We're now after MSC 1. Buffer 0 was released at MSC 1. Buffer 1 is the current scanout buffer. a) If the user asked for a tearing update, we swap scanout from buffer 1 to buffer 2 and release buffer 1. b) If the user asked for non-tearing update, we queue buffer 2 for the MSC 2. In all three cases, we have a buffer released (call it 'n'), ready to receive the next frame. Render frame 3 to buffer n PresentPixmap for buffer n If we're still before MSC 1, then we'll ask to present at MSC 1. Otherwise, we'll ask to present at MSC 2. Present already does this if the driver offers async flips, however it does this by waiting for the right vblank event and sending an async flip right at that point. I've hacked the intel driver to offer this, but I get tearing at the top of the screen. 
I think this is because flips are always done from within the ring, and so the latency between the vblank event and the async flip happening can cause tearing at the top of the screen. That's why I'm keying the need for the extra buffer on the lack of 2D driver support for async flips. Signed-off-by: Keith Packard <keithp@keithp.com> Acked-by: Jason Ekstrand <jason.ekstrand@intel.com> Tested-by: Dylan Baker <baker.dylan.c@gmail.com>	2014-09-30 20:08:28 -07:00
Jason Ekstrand	eedbce9c63	i965/fs: Fix the build	2014-09-30 17:27:33 -07:00
Jason Ekstrand	83669fac9d	i965/fs: Fix an uninitialized value warning Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 17:26:05 -07:00
Roland Scheidegger	9750ae8ca9	galahad: fix indirect draw Need to unwrap the indirect resource otherwise bad things will happen. Fixes random crashes and timeouts with piglit's arb_indirect_draw tests. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-10-01 02:17:24 +02:00
Roland Scheidegger	e3da8c110c	galahad: (trivial) handle cubemap arrays Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-10-01 02:16:57 +02:00
Matt Turner	3e7f8005db	i965/fs: Emit compressed BFI2 instructions on Gen > 7. IVB had a restriction that prevented us from emitting compressed three-source instructions, and although that was lifted on Haswell, Haswell had a new restriction that said BFI instructions specifically couldn't be compressed.	2014-09-30 17:09:34 -07:00
Matt Turner	9f5e5bd34d	i965/fs: Allow SIMD16 borrow/carry/64-bit multiply on Gen > 7. These checks were intended for Gen 7 only. None of these restrictions apply to Gen 8. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-30 17:09:34 -07:00
Matt Turner	05586f9bc1	i965/fs: Set MUL source type to W/UW in 64-bit mul macro on Gen8. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-30 17:09:34 -07:00
Matt Turner	94b68109fb	i965/fs: Optimize sqrt+inv into rsq. Transform sqrt a, b rcp c, a into sqrt a, b rsq c, b The improvement here is that we've broken a dependency between these instructions. Leads to 330 fewer INV instructions and 330 more RSQ. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-30 17:09:34 -07:00
Matt Turner	b52126b44f	i965/vec4: Optimize sqrt+inv into rsq. Transform sqrt a, b rcp c, a into sqrt a, b rsq c, b In most cases the sqrt's result is still used, so the improvement here is that we've broken a dependency between these instructions. Leads to 80 fewer INV instructions and 80 more RSQ. Occasionally the sqrt's result is no longer used, leading to: instructions in affected programs: 5005 -> 4949 (-1.12%) Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-30 17:09:34 -07:00
Matt Turner	189ac07764	i965/vec4: Call opt_algebraic after opt_cse. The next patch adds an algebraic optimization for the pattern sqrt a, b rcp c, a and turns it into sqrt a, b rsq c, b but many vertex shaders do a = sqrt(b); var1 /= a; var2 /= a; which generates sqrt a, b rcp c, a rcp d, a If we apply the algebraic optimization before CSE, we'll end up with sqrt a, b rsq c, b rcp d, a Applying CSE combines the RCP instructions, preventing this from happening. No shader-db changes. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-30 17:09:34 -07:00
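The sqrt+rcp to rsq rewrite described in the three commits above can be illustrated on a toy IR. This is a sketch under stated assumptions, not the i965 pass: the `struct inst` layout, the opcodes and `opt_rcp_of_sqrt()` are hypothetical, each instruction has a single source, and there is no control flow. Unlike the ordering issue discussed in the opt_cse commit, this toy version rewrites every eligible RCP.

```c
#include <stddef.h>

/* Toy three-address IR; opcodes and layout are hypothetical. */
enum opcode { OP_SQRT, OP_RCP, OP_RSQ, OP_OTHER };

struct inst {
   enum opcode op;
   int dst;   /* destination register number */
   int src;   /* single source register number */
};

/* Rewrite "sqrt a, b; rcp c, a" into "sqrt a, b; rsq c, b".
 * Returns the number of RCP instructions converted. */
static int
opt_rcp_of_sqrt(struct inst *insts, size_t count)
{
   int progress = 0;

   for (size_t i = 0; i < count; i++) {
      if (insts[i].op != OP_RCP)
         continue;

      /* Walk backwards to the instruction that last wrote our source. */
      for (size_t j = i; j-- > 0; ) {
         if (insts[j].dst != insts[i].src)
            continue;

         if (insts[j].op == OP_SQRT && insts[j].dst != insts[j].src) {
            /* Only safe if the sqrt's input is untouched in between. */
            int clobbered = 0;
            for (size_t k = j + 1; k < i; k++) {
               if (insts[k].dst == insts[j].src)
                  clobbered = 1;
            }
            if (!clobbered) {
               insts[i].op = OP_RSQ;
               insts[i].src = insts[j].src;
               progress++;
            }
         }
         break;   /* stop at the nearest writer, whatever it was */
      }
   }
   return progress;
}
```

After the rewrite the RSQ reads `b` directly, so it no longer has to wait for the SQRT result, which is the dependency break the commit messages refer to.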
Matt Turner	d13bcdb3a9	i965/fs: Extend predicated break pass to predicate WHILE. Helps a handful of programs in Serious Sam 3 that use do-while loops. instructions in affected programs: 16114 -> 16075 (-0.24%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-30 17:09:34 -07:00
Mathias Fröhlich	6e7d36fd2c	gallivm: Fix build for LLVM 3.2 Do not rely on LLVMMCJITMemoryManagerRef being available. The c binding to the memory manager objects only appeared on llvm-3.4. The change is based on an initial patch of Brian Paul. Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2014-10-01 00:29:31 +02:00
Rob Clark	cc355f1c06	freedreno: destroy transfer pool after blitter Blitter can still have transfers hanging around which it frees in util_blitter_destroy(). So let it clean up before we yank the transfer_pool from under it. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-30 16:56:15 -04:00
Rob Clark	01ff0b28b3	freedreno/lowering: fix token calculation for lowering Indirect registers consume an additional token. Try to clean up the token calculation math a bit, and fix it at the same time. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-30 16:56:15 -04:00
Ian Romanick	408aa46ca8	i965/fs: Don't make a name for a vector splitting temporary If the name is just going to get dropped, don't bother making it. If the name is made, release it sooner (rather than later). No change Valgrind massif results for a trimmed apitrace of dota2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 13:34:43 -07:00
Ian Romanick	0b47252999	glsl: Don't make a name for the function return variable If the name is just going to get dropped, don't bother making it. If the name is made, release it sooner (rather than later). No change Valgrind massif results for a trimmed apitrace of dota2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 13:34:43 -07:00
Ian Romanick	c87d09d7f0	glsl: Don't allocate a name for ir_var_temporary variables Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 74 40,578,719,715 67,762,208 62,263,404 5,498,804 0 After (32-bit): 52 40,565,579,466 66,359,800 61,187,818 5,171,982 0 Before (64-bit): 74 37,129,541,061 95,195,160 87,369,671 7,825,489 0 After (64-bit): 76 37,134,691,404 93,271,352 85,900,223 7,371,129 0 A real savings of 1.0MiB on 32-bit and 1.4MiB on 64-bit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 13:34:43 -07:00
Ian Romanick	eaa0c74142	glsl: Use ir_var_temporary for compiler generated temporaries These few places were using ir_var_auto for seemingly no reason. The names were not added to the symbol table. No change Valgrind massif results for a trimmed apitrace of dota2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 13:34:43 -07:00
Ian Romanick	04e1357d97	glsl: Add context-level controls for whether temporaries have real names No change Valgrind massif results for a trimmed apitrace of dota2. v2: Minor rebase on _mesa_init_constants changes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 13:34:42 -07:00
Ian Romanick	a99482482d	glsl: Never put ir_var_temporary variables in the symbol table Later patches will give every ir_var_temporary the same name in release builds. Adding a bunch of variables named "compiler_temp" to the symbol table can only cause problems. No change Valgrind massif results for a trimmed apitrace of dota2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 13:34:42 -07:00
Ian Romanick	7625babfae	glsl: Add the possibility for ir_variable to have a non-ralloced name Specifically, ir_var_temporary variables constructed with a NULL name will all have the name "compiler_temp" in static storage. No change Valgrind massif results for a trimmed apitrace of dota2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 13:34:42 -07:00
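The idea of letting a variable's name point at static storage can be sketched as below. This is an illustration, not the GLSL compiler's code: it swaps in plain `malloc()`/`free()` for ralloc, and `struct variable`, `variable_init()` and `variable_finish()` are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Shared name for anonymous temporaries; lives in static storage. */
static const char tmp_name[] = "compiler_temp";

/* Hypothetical, heavily reduced variable record. */
struct variable {
   const char *name;
};

/* Give the variable a heap copy of `name`, or the shared static
 * string when `name` is NULL. Returns 0 on success, -1 on OOM. */
static int
variable_init(struct variable *var, const char *name)
{
   if (name == NULL) {
      var->name = tmp_name;
      return 0;
   }

   char *copy = malloc(strlen(name) + 1);
   if (copy == NULL)
      return -1;
   strcpy(copy, name);
   var->name = copy;
   return 0;
}

/* Release the name only if it was heap-allocated. */
static void
variable_finish(struct variable *var)
{
   if (var->name != tmp_name)
      free((char *)var->name);
   var->name = NULL;
}
```

Every unnamed temporary then shares one string, so no per-variable name allocation is needed; the pointer comparison in `variable_finish()` is what keeps the static string from being freed.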
Ian Romanick	0e654ab1b9	glsl: Store ir_variable_data::_num_state_slots and ::binding in 16-bits each Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 44 40,577,049,140 68,118,608 62,441,063 5,677,545 0 After (32-bit): 71 40,583,408,411 67,761,528 62,263,519 5,498,009 0 Before (64-bit): 63 37,122,829,194 95,153,008 87,333,600 7,819,408 0 After (64-bit): 67 37,123,303,706 95,150,544 87,333,600 7,816,944 0 A real savings of 173KiB on 32-bit and no change on 64-bit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-30 13:34:42 -07:00
Ian Romanick	a32ac726ee	glsl: Squish ir_variable::max_ifc_array_access and ::state_slots together At least one of these pointers must be NULL, and we can determine which will be NULL by looking at other fields. Use this information to store both pointers in the same location. If anyone can think of a better name for the union than "u", I'm all ears. Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 63 40,574,239,515 68,117,280 62,618,607 5,498,673 0 After (32-bit): 44 40,577,049,140 68,118,608 62,441,063 5,677,545 0 Before (64-bit): 53 37,126,451,468 95,150,256 87,711,304 7,438,952 0 After (64-bit): 63 37,122,829,194 95,153,008 87,333,600 7,819,408 0 A real savings of 173KiB on 32-bit and 368KiB on 64-bit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-30 13:34:42 -07:00
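Storing two mutually exclusive pointers in one slot, as the commit above describes, might look like the following in C. The field names and the `is_interface_array` discriminant are assumptions for illustration; the real `ir_variable` decides which member is valid from other fields that are not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical variable record; field names are illustrative only. */
struct variable {
   bool is_interface_array;   /* existing field that picks the union member */
   union {
      unsigned *max_array_access;   /* valid when is_interface_array */
      const int *state_slots;       /* valid otherwise */
   } u;
};

/* Accessors hide the union so callers never read the wrong member. */
static unsigned *
get_max_array_access(const struct variable *var)
{
   return var->is_interface_array ? var->u.max_array_access : NULL;
}

static const int *
get_state_slots(const struct variable *var)
{
   return var->is_interface_array ? NULL : var->u.state_slots;
}
```

Routing all access through accessors is what makes the overlap safe, which matches the earlier commits in this series that first made the two fields private.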
Ian Romanick	5aa8d8194c	glsl: Make ir_variable::num_state_slots and ir_variable::state_slots private Also move num_state_slots inside ir_variable_data for better packing. The payoff for this will come in a few more patches. No change Valgrind massif results for a trimmed apitrace of dota2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-30 13:34:42 -07:00
Ian Romanick	21df016902	glsl: Make ir_variable::max_ifc_array_access private The payoff for this will come in a few more patches. No change Valgrind massif results for a trimmed apitrace of dota2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-30 13:34:42 -07:00
Ian Romanick	8afe6efa21	glsl: Store ir_variable::depth_layout using 3 bits warn_extension_index was moved to improve packing. Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 73 40,580,476,304 68,488,400 62,796,151 5,692,249 0 After (32-bit): 73 40,575,751,558 68,116,528 62,618,607 5,497,921 0 Before (64-bit): 71 37,124,890,613 95,889,584 88,089,008 7,800,576 0 After (64-bit): 62 37,123,578,526 95,150,784 87,711,304 7,439,480 0 A real savings of 173KiB on 32-bit and 368KiB on 64-bit. v2: Use the enum name with the bit-field and remove the extra casts. Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Tapani Pälli <tapani.palli@intel.com> [v1]	2014-09-30 13:34:42 -07:00
Ian Romanick	ab51179f1f	glsl: Replace ir_variable::warn_extension pointer with an 8-bit index Also move the new warn_extension_index into ir_variable::data. This enables slightly better packing. Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 82 40,580,040,531 68,488,992 62,973,695 5,515,297 0 After (32-bit): 73 40,580,476,304 68,488,400 62,796,151 5,692,249 0 Before (64-bit): 65 37,124,013,542 95,892,768 88,466,712 7,426,056 0 After (64-bit): 71 37,124,890,613 95,889,584 88,089,008 7,800,576 0 A real savings of 173KiB on 32-bit and 368KiB on 64-bit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-30 13:34:41 -07:00
Ian Romanick	baf5a75664	glsl: Use accessors for ir_variable::warn_extension The payoff for this will come in the next patch. No change Valgrind massif results for a trimmed apitrace of dota2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-30 13:34:41 -07:00
Ian Romanick	1012e95a40	glsl: Eliminate unused built-in variables after compilation After compilation (and before linking) we can eliminate quite a few built-in variables. Basically, any uniform or constant (e.g., gl_MaxVertexTextureImageUnits) that isn't used (with one exception) can be eliminated. System values, vertex shader inputs (with one exception), and fragment shader outputs that are not used and not re-declared in the shader text can also be removed. gl_ModelViewProjectMatrix and gl_Vertex are used by the built-in function ftransform. There are some complications with eliminating these variables (see the comment in the patch), so they are not eliminated. Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 46 40,661,487,174 75,116,800 68,854,065 6,262,735 0 After (32-bit): 50 40,564,927,443 69,185,408 63,683,871 5,501,537 0 Before (64-bit): 64 37,200,329,700 104,872,672 96,514,546 8,358,126 0 After (64-bit): 59 36,822,048,449 96,526,888 89,113,000 7,413,888 0 A real savings of 4.9MiB on 32-bit and 7.0MiB on 64-bit. v2: Don't remove any built-in with Transpose in the name. v3: Fix comment typo noticed by Anuj. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: Eric Anholt <eric@anholt.net>	2014-09-30 13:34:41 -07:00
Ian Romanick	77005cfabd	glsl: Validate that built-in uniforms have backing state All built-in uniforms are supposed to be backed by some GL state. The state_slots field describes this backing state. This helped me track down a bug in a later patch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-30 13:34:41 -07:00
Eric Anholt	8786544b3e	vc4: Don't forget to store stencil along with depth when storing either. Otherwise, we'd replace the stencil in our packed depth/stencil with 0s. Fixes about 50 piglit tests.	2014-09-30 12:55:28 -07:00
Mathias Fröhlich	43e2109326	llvmpipe: Reuse llvmpipe's LLVMContext in the draw context. Reuse the LLVMContext already allocated in llvmpipe_context for draw_llvm if possible. This should decrease the memory footprint of an llvmpipe context. v2: Fix compile with llvm disabled. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2014-09-30 20:51:02 +02:00
Mathias Fröhlich	d90ff351f3	llvmpipe: Make a llvmpipe OpenGL context thread safe. This fixes the remaining problem with the recently introduced global jit memory manager. This change again uses a memory manager that is local to gallivm_state. This implementation still frees the majority of the memory immediately after compilation. Only the generated code is deferred until this code is no longer used. With this change and the previous one using private LLVMContext instances, I can now safely run several independent OpenGL contexts driven by llvmpipe from different threads. v3: Rebase on llvm-3.6 compile fixes. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2014-09-30 20:51:02 +02:00
Mathias Fröhlich	83c62597fc	llvmpipe: Use two LLVMContexts per OpenGL context instead of a global one. This is one step to make llvmpipe thread safe as mandated by the OpenGL standard. Using the global LLVMContext is obviously a problem for that kind of use pattern. The patch introduces two LLVMContext instances that are private to an OpenGL context and used for all compiles. One is put into struct draw_llvm and the other one into struct llvmpipe_context. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>	2014-09-30 20:45:19 +02:00
Jason Ekstrand	98d00d6640	i965/brw_reg: Make the accumulator register take an explicit width. The big pile of patches I just pushed regresses about 25 piglit tests on SNB. This fixes the regressions. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-30 11:42:34 -07:00
Brian Paul	6b65847835	llvmpipe: move lp_jit_screen_init() call after allocation of screen object The screen argument isn't actually used by lp_jit_screen_init() at this time, but let's move the call so that we pass a valid pointer. v2: don't leak screen if lp_jit_screen_init() fails. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-09-30 12:09:14 -06:00
Brian Paul	b12899d752	tgsi: fix Semantic.Name assignment in tgsi_transform_input_decl() Assign the sem_name parameter, not TGSI_SEMANTIC_GENERIC. Fixes polygon stipple regression. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-09-30 12:08:49 -06:00
Brian Paul	0fb1e6b7b4	util: simplify PIPE_TEXTURE_CUBE case in util_max_layer() For cube resources, the array_size value should be 6. So handle that case as we do for array texture resources. But assert that array_size==6 just to be safe. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-30 12:08:49 -06:00
Brian Paul	59562e9ba5	softpipe: don't special case PIPE_TEXTURE_CUBE in softpipe_resource_layout() As with the previous patch for llvmpipe. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-30 12:08:49 -06:00
Brian Paul	3d77b80d80	llvmpipe: remove special case for PIPE_TEXTURE_CUBE in llvmpipe_texture_layout() layers (aka array_size) should be 6 for cube textures so we don't need to special-case it. But add an assertion just to be safe. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-30 12:08:49 -06:00
Brian Paul	8269bfdb83	gallium: add doc note about cube textures and can_create_resource() Just to be clear, and echo the description for resource_create(). Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-30 12:08:49 -06:00
Brian Paul	3bfc9a73ad	st/mesa: remove unneeded PIPE_TEXTURE_CUBE check in st_texture_create() Earlier in the function we assert layers==6 for PIPE_TEXTURE_CUBE so there's no reason to special-case the pt.array_size = layers assignment. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-30 12:08:49 -06:00
Eric Anholt	2b76ee9031	mesa: Drop the always-software-primitive-restart paths. The core sw primitive restart code is still around, because i965 uses it in some cases, but there are no drivers that want it on all the time. Reviewed-by: Rob Clark <robdclark@gmail.com>	2014-09-30 11:00:52 -07:00
Eric Anholt	bcb722d830	gallium: Drop software-only primitive restart support. The drivers not flagging primitive restart support are r300 swtcl, svga, nv30, and vc4. The point of primitive restart is to slightly reduce draw call overhead for apps by batching multiple draws. If we do an extra pass to read the index buffer and split back into multiple draws, we've entirely missed the point. This is particularly bad for drivers that otherwise have hardware IB reads, where the readback is probably uncached. Reviewed-by: Rob Clark <robdclark@gmail.com>	2014-09-30 10:59:58 -07:00
Jason Ekstrand	4ddc25a8d4	i965/fs: Properly calculate the number of instructions in calculate_register_pressure Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:15 -07:00
Jason Ekstrand	514fd1c55e	i965/fs: Use the GRF for FB writes on gen >= 7 On gen 7, the MRF was removed and we gained the ability to do send instructions directly from the GRF. This commit enables that functionality for FB writes. v2: Make handling of components more sane. i965/fs: Force a high register for the final FB write v2: Renamed the array for the range mappings and added a comment Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:15 -07:00
Jason Ekstrand	1dd9b90ecd	i965/fs: Handle COMPR4 in LOAD_PAYLOAD Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:15 -07:00
Jason Ekstrand	29f4c5b5d5	i965/fs: Constant propagate into LOAD_PAYLOAD Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:15 -07:00
Jason Ekstrand	6d770ce93a	i965/fs: Add split_virtual_grfs and compute_to_mrf after lower_load_payload If we are going to use LOAD_PAYLOAD operations to fill MRF registers, then we will need this. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:15 -07:00
Jason Ekstrand	8b0e4b387a	i965/fs: Add an optional source to the FS_OPCODE_FB_WRITE instruction Previously, we were using the base_mrf parameter of fs_inst to store the MRF location. In preparation for doing FB writes from the GRF, we now also allow you to set inst->base_mrf to -1 and provide a source register. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:15 -07:00
Jason Ekstrand	9e1f52a6e2	i965/fs: Use the GRF for UNTYPED_SURFACE_READ instructions Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:15 -07:00
Jason Ekstrand	d25aaf1cb1	i965/fs: Use the GRF for UNTYPED_ATOMIC instructions Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:15 -07:00
Jason Ekstrand	65ddf6f404	i965/fs: Add a function for getting a component of a 8 or 16-wide register Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:15 -07:00
Jason Ekstrand	30d718c2fb	i965/fs: Use the instruction execution size directly for texture generation Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:15 -07:00
Jason Ekstrand	48ddd2889e	i965/fs: Use exec_size instead of force_uncompressed in dump_instruction Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:15 -07:00
Jason Ekstrand	b18fd234da	i965/fs: Use instruction execution sizes instead of heuristics Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:15 -07:00
Jason Ekstrand	894ec5a1d8	i965/fs: Use instruction execution sizes to set compression state Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	8f1adb5965	i965/fs: Remove unneeded uses of force_uncompressed Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	2999f83bd9	i965/fs: Derive force_uncompressed from instruction exec_size Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	5f41d052bf	i965/fs: Make fs_reg::effective_width take fs_inst* instead of fs_visitor* Now that we have execution sizes, we can use that instead of the dispatch width. This way it also works for 8-wide instructions in SIMD16. i965/fs: Make effective_width a variable instead of a function i965/fs: Preserve effective width in constant propagation Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	6ba31cc000	i965/fs: Better guess the width of LOAD_PAYLOAD Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	071ac3a467	i965/fs: Add an exec_size field to fs_inst This will, eventually, allow us to manage execution sizes of instructions in a much more natural way from the fs_visitor level. i965/fs: Explicitly set instruction execute size a couple of places i965/blorp: Explicitly set instruction execute sizes Since blorp is all 16-wide and nothing is, in general, very careful about register width, we'll just set it all explicitly. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	fbc0a798ee	i965/fs: Determine partial writes based on the destination width Now that we track both halves of a 16-wide vgrf, we no longer need to worry about force_sechalf or force_uncompressed. The only real issue is if the destination is too small. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	27d7ef094a	i965/fs: Fix a bug in register coalesce This commit fixes a bug in register coalesce that happens when one register is moved to another the proper number of times but the channels are re-arranged. When this happens, the previous code would happily coalesce the registers regardless of the fact that the channel mappings were wrong. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	16819b48ab	i965/fs: Rework GEN5 texturing code to use fs_reg and offset() Now that offset() can properly handle MRF registers, we can use an MRF fs_reg and let offset() handle incrementing it correctly for different dispatch widths. While this doesn't have any noticeable effect currently, it does ensure that the destination register is 16-wide which will be necessary later when we start detecting execution sizes based on source and destination registers. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	7210583eb8	i965/fs_reg: Allocate double the number of vgrfs in SIMD16 mode This is actually the squash of a bunch of different changes. Individual commit titles follow: i965/fs: Always 2-align registers SIMD16 for gen <= 5 i965/fs: Use the register width when applying offsets This reworks both byte_offset() and offset() to be more intelligent. The byte_offset() function now supports offsets bigger than 32. The offset() function uses the byte_offset() function together with the register width and the type size to offset the register by the correct amount. i965/fs: Change regs_read to be in hardware registers i965/fs: Change regs_written to be actual hardware registers i965/fs: Properly handle register widths in LOAD_PAYLOAD The LOAD_PAYLOAD instruction is a bit special because it collects a bunch of registers (with possibly different widths) into a single payload block. Once the payload is constructed, it's treated as a single block of data and most of the information such as register widths doesn't matter anymore. In particular, the offset of any particular source register is the accumulation of the sizes of the previous source registers. i965/fs: Properly set writemasks in LOAD_PAYLOAD i965/fs: Handle register widths in demote_pull_constants i965/fs: Get rid of implicit register doubling in the allocator i965/fs: Reserve enough registers for PLN instructions i965/fs: Make sources and destinations interfere in 16-wide i965/fs: Properly handle register widths in CSE i965/fs: Properly handle register widths in register_coalesce i965/fs: Properly handle widths in copy propagation i965/fs: Properly handle register widths in VARYING_PULL_CONSTANT_LOAD i965/fs: Properly handle register widths and odd register sizes in spilling i965/fs: Don't waste a register on texture lookups for gen >= 7 Previously, we were wasting a register in SIMD16 mode because we could only allocate registers in pairs. Now that we can allocate and address odd-sized registers, let's get rid of this special-case. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	4232a776a6	i965/fs: Handle printing of registers better. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	5390ca8ce9	i965: Explicitly set widths on gen5 math instruction destinations. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	004fbd5375	i965/fs: Make half() divide the register width by 2 and use it more Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	24d023b9fe	i965/fs: Add a concept of a width to fs_reg Every register in i965 assembly implicitly has a concept of a "width". Usually, this is derived from the execution size of the instruction. However, when writing a compiler it turns out that it is frequently useful to have the width explicitly in the register and derive the execution size of the instruction from the widths of the registers used in it. This commit adds a width field to fs_reg along with an effective_width() helper function. The effective_width() function tells you how wide the register effectively is when used in an instruction. For example, uniform values have width 1 since the data is not actually repeated, but when used in an instruction they take on the width of the instruction. However, for some instructions (LOAD_PAYLOAD being the notable exception), the width is not the same. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	1030ee6e9b	i965/fs: A little harmless refactoring of register_coalesce Just pass the visitor into is_copy_payload() and is_coalesce_candidate() instead of a register size and the virtual_grf_sizes array. Among other things, this makes the code more obvious because you don't have to figure out where src_size came from. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	f91b566f55	i965/brw_reg: Add a firsthalf function and use it in the generator Right now, this function is a no-op but it indicates that we intend to only use the first half of the 16-wide register. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	1728e74957	i965/fs: Copy propagate partial reads. This commit reworks copy propagation a bit to support propagating the copying of partial registers. This comes up every time we have pull constants because we do a pull constant read immediately followed by a move to splat the one component of the out to 8 or 16-wide. This allows us to eliminate the copy and simply use the one component of the register. Shader DB results: total instructions in shared programs: 5044937 -> 5044428 (-0.01%) instructions in affected programs: 66112 -> 65603 (-0.77%) GAINED: 0 LOST: 0 Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	4d5f0eb048	i965/fs: Refactor fs_inst::is_send_from_grf() A switch statement is much easier to read/edit than a big giant or statement. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	54688cd03b	i965/fs: Clean up emit_fb_writes This splits emit_fb_writes into two functions: emit_fb_writes and emit_single_fb_write. This reduces the amount of duplicated code in emit_fb_writes and makes the register number fiddling less arcane. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	72a3780f26	i965/fs: Print BAD_FILE registers in dump_instruction Sometimes these show up in LOAD_PAYLOAD instructions and it's nice to be able to see them. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:14 -07:00
Jason Ekstrand	2af4b0aeaf	i965/fs: Make compact_virtual_grfs an optimization pass Previously we disabled compact_virtual_grfs when dumping optimizations. The idea here was to make it easier to diff the dumped shader because you didn't have a sudden renaming. However, sometimes a bug is affected by compact_virtual_grfs and, when this happens, you want to keep dumping instructions with compact_virtual_grfs enabled. By turning it into an optimization pass and dumping it along with the others, we retain the ability to diff because you can just diff against the compact_virtual_grf output. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	a25db10c12	i965/fs: Make immediate fs_reg constructors explicit Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	1c89e098e8	i965/fs: Make null_reg_* const members of fs_visitor instead of globals We also set the register width equal to the dispatch width. Right now, this is effectively a no-op since we don't do anything with it. However, it will be important once we add an actual width field to fs_reg. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	ab7234c852	i965/fs: Use the var_from_vgrf helper function instead of doing it manually Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	c24dd54f97	i965/fs: Fix a bug with dead_code_eliminate on large writes Previously, if an instruction wrote to more than one register, we implicitly assumed that it filled the entire register. We never hit this before because the only time we did multi-register writes was things like texturing which always wrote to all of the registers. However, with the upcoming ability to do 16-wide instructions in SIMD8 and things of that nature, we can have multi-register writes at offsets and we'll hit this. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	1385a4b706	i965/fs: Use the UW type for the destination of VARYING_PULL_CONSTANT_LOAD instructions Using a floating-point type doesn't usually cause hangs on my HSW, but the simulator complains about it quite a bit. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	f0d43c09b2	i965/fs: Use offset a lot more places We have this wonderful offset() function for advancing registers, but we're not using it. Using offset() allows us to do some sanity checking and avoid manually touching fs_reg::reg_offset. In a few commits, we will make offset do even more nifty things for us. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	0089d025aa	i965/fs: fix a comment in compact_virtual_grfs Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	3dc3fccb75	i965/fs: Rewrite fs_visitor::split_virtual_grfs The original vgrf splitting code was written with the assumption that vgrfs came in two types: those that can be split into single registers and those that can't be split at all. It was very conservative and bailed as soon as more than one element of a register was read or written. This won't work once we start allowing a regular MOV or ADD operation to operate on multiple registers. This rewrite allows for the case where a vgrf of size 5 may appropriately be split into one register of size 1 and two registers of size 2. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	f9da0740e2	i965/fs_live_variables: Use var_from_vgrf instead of repeating the calculation Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-30 10:29:13 -07:00
Jason Ekstrand	75afe17b79	i965/fs: Manually generate the meta fast-clear shader Previously, we were generating the fast-clear shader from GLSL. The problem is that fast clears require that we use a replicated write rather than a regular write instruction. In order to get this we had a complicated and somewhat fragile optimization pass that looked for places where we can use a replicated write and used it. Since replicated writes have a lot of restrictions, we only ever use them for fast-clear operations. This commit replaces the optimization pass with a function that just generates the shader we want. This is a) less code, b) less fragile than the optimization pass, and c) generates a more efficient shader. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-30 10:29:13 -07:00
Michel Dänzer	61128d7507	radeonsi: Pass the slice size to si_dma_copy_buffer Otherwise some parts of tiled slices can be missed. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-30 18:55:48 +09:00
Michel Dänzer	74aeccd701	radeonsi: Catch more cases that can't be handled by si_dma_copy_buffer/tile Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-30 18:55:48 +09:00
Michel Dänzer	d17b85524d	radeonsi: Fix si_dma_copy(_tile) for compressed formats Fixes GPUVM faults when running the piglit test "getteximage-formats init-by-rendering" with R600_DEBUG=forcedma on SI. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-30 18:55:48 +09:00
Michel Dänzer	761d80ddab	radeonsi: Fix tiling mode index for stencil resources We are currently only dealing with depth-only or stencil-only resources here, not with resources having both depth and stencil[0]. In both cases, the tiling mode index is in the tile_mode field, not in the stencil_tile_mode field. [0] Add an assertion for that. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-30 18:55:48 +09:00
Chia-I Wu	594e1a2f4b	ilo: fix format of edge flag pointer The VE format of edge flag pointers was changed in `780ce576bb`. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-30 16:41:32 +08:00
Chia-I Wu	2d13b5ac81	ilo: add a pass to finalize ilo_ve_state Add finalize_vertex_elements() to finalize ilo_ve_state. This fixes a potential issue with URB entry allocation for VS and moves the complexity of gen6_3DSTATE_VERTEX_ELEMENTS() to the new function. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-30 16:41:32 +08:00
Chia-I Wu	2b4c8ffc30	ilo: precalculate aligned depth buffer size To replace the hacky zs_align_surface(). Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-30 16:41:31 +08:00
Chia-I Wu	343b014b57	ilo: use dynamic bo for rectlist vertices The size is always 24 bytes. We can upload them to the dynamic buffer. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-30 16:41:31 +08:00
Thomas Hellstrom	46537f1d03	st/xa: Fix regression in xa_yuv_planar_blit() Commit "st/xa: scissor to help tilers" broke xa_yuv_planar_blit() and vmwgfx textured video. Fix this by implementing scissors also in the yuv draw path. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Cc: Rob Clark <robclark@freedesktop.org> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-30 08:31:33 +02:00
Kenneth Graunke	68627235f2	i965: Delete intel_chipset.h. Unused; it was replaced by include/pci_ids/i965_pci_ids.h long ago. Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-29 20:10:00 -07:00
Alex Henrie	3bea907797	driconf: Correct and update Catalan translation Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-09-29 17:45:41 -07:00
Alex Henrie	33a7d0d040	driconf: Update Spanish translation Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-09-29 17:45:26 -07:00
Alex Henrie	3b34b876f4	driconf: Synchronize po files Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-09-29 17:45:10 -07:00
Eric Anholt	4ceaad14ff	vc4: Don't try to do stores to buffers that aren't bound. The code was kind of mixed up what buffers were getting stored in the case that a resolve bit was unset (which are set based on the GL state at draw time) and the buffer wasn't actually bound. In particular, depth-only rendering would store the color buffer contents, which happen to be pointing at the depth buffer. Thanks to clearing out the resolve bits for things we really can't resolve, now I can drop the safety checks for buffer presence around the actual stores. Fixes 42 piglit tests.	2014-09-29 17:44:15 -07:00
Eric Anholt	1d42aa8358	vc4: Shove some depth comparison bits down to where they're used.	2014-09-29 17:44:15 -07:00
Matt Turner	66ab9c22fe	i965: Use BRW_MATH_DATA_SCALAR when source regioning is scalar. Notice the mistaken (but harmless) argument swapping in brw_math_invert(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-29 15:59:19 -07:00
Matt Turner	a0df258f89	i965/compaction: Move variable declarations to their uses. Tested-by: Mark Janes <mark.a.janes@intel.com>	2014-09-29 15:59:16 -07:00
Matt Turner	a36631b74c	i965/compaction: Simplify jump target code. My attempts to clarify the code with _compacted/_uncompacted prefixed variables apparently failed. Hopefully this is clearer. In any case, the previous code wasn't clear enough to gcc to let it optimize division by a power of two into a shift. No problems now. Also, the previous code (in the ADD case) didn't work on 32-bit x86, due to a complicated set of interactions best summed up as unsigned division and compiler optimizations. Tested-by: Mark Janes <mark.a.janes@intel.com>	2014-09-29 15:58:57 -07:00
Rob Clark	dce96f6da2	freedreno/a3xx: re-emit shaders on variant change We need to keep track if a state change other than frag/vert shader state will trigger us to need a different shader variant, and if necessary mark the appropriate shader state as dirty. Otherwise we will forget to re-emit the shader state. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-29 18:30:43 -04:00
Rob Clark	3aaab87563	freedreno/ir3: add some cmdline args Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-29 18:30:43 -04:00
Rob Clark	7cdd467994	freedreno/a3xx: add support to emulate GL_CLAMP Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-29 18:30:43 -04:00
Rob Clark	3541705816	freedreno: add texcoord clamp support to lowering This is for hw that needs to emulate some texture wrap modes (like CLAMP) with some help from the shader. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-29 18:30:43 -04:00
Rob Clark	a6746d1124	freedreno: move bind_sampler_states to per-generation Keep the existing function as a common helper. But this lets us move an a2xx specific hack out of common code. And the PIPE_TEX_WRAP_CLAMP emulation will require an a3xx specific hack. So rather than piling on hacks, split this out. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-29 18:30:42 -04:00
Rob Clark	7e20c09d4a	freedreno/a3xx: fix border color order Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-29 18:30:42 -04:00
Rob Clark	c61133046e	freedreno/a3xx: add 32bit integer vtx formats Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-29 18:30:42 -04:00
Eric Anholt	fc4b5b85ce	vc4: Add support for GL 1.1's stupid CLAMP mode. We just clamp the incoming texture coordinates. This breaks the lambda calculation, but it gets the piglit tests to pass. This is the same behavior as in i965.	2014-09-29 14:12:33 -07:00
Eric Anholt	ae22f5aa14	vc4: Add support for texture border color. One spot in the docs says that it's stored at a miplevel just beyond the last miplevel, which was scary. But really, you just load it as the R coordinate (which conflicts with cubemaps, but you don't do border clamping on cubes).	2014-09-29 13:48:08 -07:00
Eric Anholt	b65761f764	vc4: Add the necessary stubs for occlusion queries. We have to expose them for GL 2.0, but we just always return a value of 0. We should be advertising 0 query bits instead of 64, but gallium doesn't have plumbing for that yet. At least this stops the segfaults.	2014-09-29 11:51:09 -07:00
Eric Anholt	76cd9955d9	vc4: Optimize out silly SUBs of 0. Drops instructions on vs-temp-array-mat4-index-col-row-wr.shader_test, which I was looking at because it's failing to register allocate.	2014-09-29 11:33:34 -07:00
Eric Anholt	64122b16ce	vc4: Dump constant uniform values in VC4_DEBUG=qir. Definitely helps when trying to understand and optimize a program.	2014-09-29 11:33:34 -07:00
Eric Anholt	3311513041	vc4: Turn a SEL_X_Y(x, 0) into SEL_X_0(x). This may reduce register pressure and uniform counts. Drops a bunch of 0 uniform loads on vs-temp-array-mat4-index-col-row-wr.shader_test, which is failing to register allocate.	2014-09-29 11:33:34 -07:00
Eric Anholt	730267eb23	vc4: Add support for texture cube maps. It's not passing some of the piglit tests, because it looks like at small miplevels some contents from surrounding faces are getting filtered in at the corners. It does get 7 new tests passing.	2014-09-29 11:29:28 -07:00
Eric Anholt	c4245d8b2e	vc4: Rename the slice's size0. In the other related fields, "0" refers to the size of the first miplevel, while this is a field in a slice. The other implicit slices we have (cubemap layers) don't vary in size compared to the first one.	2014-09-29 11:26:43 -07:00
Eric Anholt	7a85ebf6e2	vc4: Stop trying to reuse temporaries that store uniform values. Almost always, the MOV will get copy propagated out. Even if it doesn't, it's probably better to just reload the uniform at next use (to reduce register pressure) rather than try to save instruction count. I was looking at this because in the presence of texturing (which calls add_uniform() directly to get the uniform load forced into the instruction) the c->uniform_contents indices don't match 1:1 with the temporary qregs.	2014-09-29 10:07:24 -07:00
Tapani Pälli	3386e95994	egl: setup screen iterator before using it commit `4ed23fd` broke creation of pbuffer surfaces, patch fixes the failure, noticed when running chrome with '--use-gl=egl'. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-09-29 15:12:11 +03:00
Chia-I Wu	8c7c0f7114	ilo: fix a missing 'else' An 'else' is missing in the disassembler. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-29 16:58:36 +08:00
Kalyan Kondapally	66a2fe4cf9	glsl: Allow texture2DProjLod and textureCubeLod in GL ES According to GLES (i.e. 1.0 and above) spec textureCubeLod and texture2DProjLod are built in functions. We seem to disable support for these functions with GLES. This patch enables the support. Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84355	2014-09-29 11:10:38 +03:00
Rob Clark	40aabc0e80	configure.ac: bump libdrm_freedreno requirement We need 2.4.57 for fd_bo_dmabuf() / fd_bo_from_dmabuf(). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-28 12:46:17 -04:00
Matt Turner	5ccdc23a86	glsl: Recognize open-coded pow(x, y). pow(x, y) is equivalent to exp(log(x) * y). instructions in affected programs: 578 -> 458 (-20.76%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-27 12:18:37 -07:00
Matt Turner	e9aee2572a	i965/fs: Don't invalidate live intervals in saturate propagation. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-27 12:18:37 -07:00
Matt Turner	b9689c6bda	i965/fs: Ignore mov.sat instructions in interference check in sat prop. When an instruction's result was consumed by multiple mov.sat instructions, we would decide that we couldn't move the saturate modifier because something else was using the result, even though it was just another mov.sat! total instructions in shared programs: 4275598 -> 4274842 (-0.02%) instructions in affected programs: 75634 -> 74878 (-1.00%) Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-27 12:18:37 -07:00
Matt Turner	82bdb559a1	i965/fs: Walk instructions in reverse in saturate propagation. When we find a mov.sat, we search backwards. We might as well search everything else backwards as well and potentially look at fewer instructions. This change enables the next patch. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-27 12:18:37 -07:00
Rob Clark	ed48f91275	freedreno/a3xx: add flat interpolation mode Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-27 13:34:07 -04:00
Rob Clark	df2f0c6d55	freedreno/a3xx: add LOD_BIAS Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-27 13:34:07 -04:00
Rob Clark	f7259949da	freedreno: turn missing caps into compile warnings Get rid of the 'default' case (as suggested by imirkin) so compiler warns us about missing caps. Also add some caps that were missing until now. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-27 13:34:07 -04:00
Rob Clark	546d6c8dc9	freedreno: we have more than 0 viewports! `4155d1c7` 'st/mesa: drop dependence on API profile in st_init_extensions' broke freedreno because somehow 'PIPE_CAP_MAX_VIEWPORTS' fell through the cracks. As a result, we reported zero viewports, so the state tracker never bothered to give us any valid viewport! Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-27 13:34:07 -04:00
Rob Clark	24cd746e4b	freedreno: update generated headers Among other things, fixes a bug for fixed point registers/bitfields. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-27 13:34:07 -04:00
Rob Clark	5c72672cdc	freedreno: don't advertise mirror-clamp support At least on a3xx, we cannot do it without some emulation in shader. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-27 13:34:07 -04:00
Rob Clark	e4c678c164	freedreno: fix compiler warning Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-27 13:34:07 -04:00
Tom Stellard	ec566e0f16	configure.ac: Compute LLVM_VERSION_PATCH using llvm-config This is the only guaranteed way to get the patch level for llvm, since the define cannot always be found in config.h depending on the version of llvm or the build system used. CC: 10.2 10.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Jonathan Gray <jsg@jsg.id.au>	2014-09-27 17:46:39 +01:00
Emil Velikov	5ef6eb4654	Remove Bluegene/L wrappers Added back in 2009, with osmesa/GLU in mind. Unlikely to be working any more since the removal of the static makefiles. Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-27 15:21:22 +01:00
Emil Velikov	343795e445	mesa: remove last DJGPP remains Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-27 15:20:49 +01:00
Emil Velikov	a662fa94c1	configure: use explicit enabled/disabled in config switch description Rather than having double negatives -> disable-opencl, default=no simply use enabled/disabled. It makes things a bit easier for the reader and consistent throughout the file. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-27 15:20:42 +01:00
Emil Velikov	bbe6f7f865	configure: ask vdpau.pc for the default location of the vdpau drivers Rather than using hardcoded values honor the value set at libvdpau build time - i.e. the moduledir variable from vdpau.pc Update the omx description to match reality while we're here. Cc: Christian König <deathsimple@vodafone.de> Cc: Alexandre Demers <alexandre.f.demers@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80615 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-27 15:20:26 +01:00
Emil Velikov	407450eb84	configure: drop --with-egl-driver-dir switch The location of the egl driver(s) is a matter that we should have never exposed to the user. Currently the dri2 driver is built into the libEGL loader, with the gallium based one soon to follow. v2: Fold EGL_DRIVER_INSTALL_DIR within the makefiles. Suggested by Matt. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80615 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-27 15:20:14 +01:00
Emil Velikov	2e6fc0647a	configure: remove non-functional --with-opencl-libdir The parameter used to control where the gallium pipe-drivers were installed, but was broken since commit `45270fb0fd` Author: Matt Turner <mattst88@gmail.com> Date: Thu Sep 13 10:45:01 2012 -0700 targets/pipe-loader: Convert to automake Considering that nowadays the pipe-drivers can be used by more than just the opencl target, even fixing this up will not be the best idea. Cc: Matt Turner <mattst88@gmail.com> Cc: Francisco Jerez <currojerez@riseup.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61415 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-27 15:15:58 +01:00
Ian Romanick	c3f17bb18f	glsl: Strip arrayness from ir_type_dereference_variable too If the thing being dereferenced is a record or an array of records, it should be treated as row-major. The ir_type_dereference_record path already does this, and I think I intended to do the same for this path in `b17a4d5d`. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83741 Cc: mesa-stable@lists.freedesktop.org	2014-09-26 07:59:53 -07:00
Ian Romanick	2ab71e1486	glsl: Round struct size up to at least 16 bytes Per rule #9, the size of the structure is vec4 aligned. The MAX2 in the loop ensures that sizes >= 16 bytes are vec4 aligned. The new MAX2 after the loop ensures that sizes < 16 bytes are vec4 aligned. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82932 Cc: mesa-stable@lists.freedesktop.org	2014-09-26 07:59:50 -07:00
Ian Romanick	5c75270c34	glsl: Make sure row-major array-of-structure get correct layout Whether or not the field is row-major (because it might be a bvec2 or something) does not affect the array itself. We need to know whether an array element in its entirety is row-major. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83506 Cc: mesa-stable@lists.freedesktop.org	2014-09-26 07:59:47 -07:00
Ian Romanick	8e01c66da6	glsl: Make sure fields after small structs have correct padding Previously the linker would correctly calculate the layout, but the lower_ubo_reference pass would not apply correct alignment to fields following small (less than 16-byte) nested structures. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83533 Cc: mesa-stable@lists.freedesktop.org	2014-09-26 07:59:25 -07:00
Chia-I Wu	24653bcd7d	ilo: give gen6_draw_session a better prefix gen6_draw_session is not GEN dependent. Rename it to ilo_render_draw_session. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Chia-I Wu	4be7b7ee85	ilo: make ilo_render opaque It is not used outside the render code. There are also too many details in it that we do not want other components to access directly. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Chia-I Wu	8f284343e0	ilo: make ilo_render_emit_draw() direct Remove emit_draw() and ILO_RENDER_DRAW indirections. With all emit functions being direct now, ilo_render_estimate_size() and more can also be removed. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Chia-I Wu	a05ce904aa	ilo: make ilo_render_emit_rectlist() direct Remove emit_rectlist() and ILO_RENDER_RECTLIST indirections. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Chia-I Wu	362d2fb982	ilo: clean up draw and rectlist state emission Add these new high-level functions ilo_render_get_draw_dynamic_states_len() ilo_render_emit_draw_dynamic_states() ilo_render_get_rectlist_dynamic_states_len() ilo_render_emit_rectlist_dynamic_states() ilo_render_get_draw_surface_states_len() ilo_render_emit_draw_surface_states() for draw and rectlist state emission. They are implemented in the new ilo_render_dynamic.c and ilo_render_surface.c. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Chia-I Wu	f1662e3670	ilo: sanity check ilo_render_get_*_len() Assert that we never write more than what ilo_render_get_*_len() returns. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Chia-I Wu	7fc7415316	ilo: simplify ilo_render_get_query_len() For all supported query types, we always emit a PIPE_CONTROL. Call ilo_render_get_flush_len() for simplicity and clarity. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Chia-I Wu	0afc17ea49	ilo: make ilo_render_emit_query() direct Remove emit_query() and ILO_RENDER_QUERY indirections. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Chia-I Wu	18cbd3cc34	ilo: make ilo_render_emit_flush() direct Remove emit_flush() and ILO_RENDER_FLUSH indirections. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Chia-I Wu	e3451552d2	ilo: simplify ilo_render invalidation ilo_render is based on ilo_builder. We should only care if the builder buffers are invalidated, or if the hardware context is invalidated. Replace ilo_render_invalidate() with flags by ilo_render_invalidate_builder() and ilo_render_invalidate_hw(). Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Chia-I Wu	ce2bda300d	ilo: add ilo_builder_{dynamic,surface}_used() Return how many DWords are used in dynamic and surface buffers respectively. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Chia-I Wu	2df2f60e8d	ilo: rename state buffer to dynamic buffer Both dynamic buffer and surface buffer are state buffers. We should not use state buffer to refer to the former. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Chia-I Wu	a7f2ab668c	ilo: constify ilo_render in ilo_render_get_sample_position() It is a getter and is not supposed to modify ilo_render. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Chia-I Wu	23d66a42a3	ilo: rename 3d_pipeline to render Follow the file renaming. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Chia-I Wu	3afe30e64b	ilo: remove struct ilo_3d Move members of ilo_3d that still make sense to ilo_context. With ilo_3d gone, rename functions whose names begin with ilo_3d to something more appropriate. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Chia-I Wu	b6443ae969	ilo: rename ilo_3d_pipeline.[ch] to ilo_render.[ch] They are used to build render engine commands, which can be more than 3D. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Chia-I Wu	392890d5de	ilo: rename ilo_3d.[ch] to ilo_draw.[ch] There is not much left in struct ilo_3d. We want to kill it and ilo_3d.[ch] will be bad names. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-26 21:15:55 +08:00
Michel Dänzer	7e55c3b352	st/mesa: Use PIPE_USAGE_STAGING for GL_STATIC/DYNAMIC/STREAM_READ buffers Such buffers can only be useful by reading from them with the CPU, so we need to make sure CPU reads are fast. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84178 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: mesa-stable@lists.freedesktop.org	2014-09-26 16:53:13 +09:00
Tapani Pälli	9caa5c3b13	glsl: remove unused link_assign_uniform_block_offsets ubo offsets are assigned by link_uniform_blocks since `514f8c7e` Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-26 08:29:10 +03:00
Kalyan Kondapally	e018ea81bf	glsl: Structures must have same name to be considered same type. According to GLSL(4.2) and GLSL-ES (1.0, 3.0) spec, Structures must have the same name to be considered same type. We currently ignore the name check while checking if two records are same. This patch fixes this. Patch fixes failing tests in WebGL conformance test 'shaders-with-uniform-structs' when running Chrome on OpenGL ES. v2: Do not force name comparison with unnamed types (Tapani) v3: Cleanups (Matt) Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83934	2014-09-26 08:29:10 +03:00
Tapani Pälli	1cb81d3a9b	glsl: fix uniform location count used for glsl types Patch fixes the slot count used by vector types and adds 1 slot to be used by image and sampler types. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82921	2014-09-26 08:29:10 +03:00
Ilia Mirkin	a5bbfeda97	gm107/ir: take relative pfetch offset into account There is no dedicated instruction for this, so just combine it with the constant offset. Acked-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.3" <mesa-stable@lists.freedesktop.org>	2014-09-26 01:13:06 -04:00
Michel Dänzer	4a38b154fd	gallivm: More fallout from disabling with LLVM 3.6 The draw module would still try to use gallivm, causing many piglit tests to fail with an assertion failure. llvmpipe might have been similarly affected. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-09-26 11:35:52 +09:00
Ilia Mirkin	cdc4de1215	gm107/ir: add support for indirect const buffer selection This was missed in the commit that enabled it for fermi/kepler as part of ARB_gpu_shader5 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.3" <mesa-stable@lists.freedesktop.org>	2014-09-25 22:15:50 -04:00
Ilia Mirkin	0532a5fd00	gm107/ir: fix texture argument order Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.3" <mesa-stable@lists.freedesktop.org>	2014-09-25 22:15:50 -04:00
Ilia Mirkin	d3c3bba6d0	gm107/ir: fix manual TXD for array targets This parallels the fixes in commit `afea9bae`. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.3" <mesa-stable@lists.freedesktop.org>	2014-09-25 22:15:49 -04:00
Ilia Mirkin	d78b533c29	nouveau: fix glCompressedTexImage mesa_texstore expects pixel data, not compressed data. For compressed textures, we want to just copy the bits in without any conversion. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Francisco Jerez <currojerez@riseup.net>	2014-09-25 22:15:49 -04:00
Ilia Mirkin	0147c10c5f	nv50/ir: avoid deleting pseudo instructions too early What happens is that a SPLIT operation is part of the spill node, and as a pseudo op, the instruction gets erased after processing its first def. However the later defs still need to refer to it, so instead delay deleting until after that whole RA node is done processing. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79462 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-25 22:15:49 -04:00
Ilia Mirkin	9d2e298dd4	mesa/st: NumLayers is only valid for array textures For 3d textures, NumLayers is set to 1, which is not what we want. This fixes the newly added gl-layer-render-storage test (which constructs immutable 3d textures). Fixes regression introduced in `d82bd7eb06`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84145 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Chris Forbes <chrisf@ijw.co.nz>	2014-09-25 22:15:49 -04:00
Ilia Mirkin	fca2216ced	nv50/ir: add some comments on edge classification Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-25 22:15:49 -04:00
Ilia Mirkin	1ae32e24ca	nv50,nvc0: fix 3d blit logic for odd depth/stencil formats Reported-by: David Heidelberger <david.heidelberger@ixit.cz> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-25 22:15:49 -04:00
Ilia Mirkin	b49dfb68ed	nv50,nvc0: add missing depth/stencil formats to tile flag selection Reported-by: David Heidelberger <david.heidelberger@ixit.cz> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-25 22:15:49 -04:00
Eric Anholt	db11eb92cf	vc4: Switch from errx() to fprintf() and abort(). These are pretty catastrophic, "should never happen" failure paths (though 4 tests in piglit hit them currently, due to a single bug). An abort() that you can gdb on easily is probably more useful than a clean exit, particularly since a bug in piglit framework right now is causing early exit(1)s to simply not be recorded in the results at all.	2014-09-25 16:41:25 -07:00
Eric Anholt	45962fbeee	vc4: Fix miplevel validation for raster textures. We were using the un-minified value, meaning we'd reject correctly laid out textures.	2014-09-25 16:41:25 -07:00
Matt Turner	43267a325f	mesa: Replace IS_NEGATIVE(x) with x < 0.0f. I only made IS_NEGATIVE(x) use signbit in commit `0f3ba405` in an attempt to fix 54805, but it didn't help. We didn't use signbit on some platforms and instead defined it to x < 0.0f. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-25 13:57:29 -07:00
Matt Turner	50e2f70093	radeon: Use PRINTLIKE macro.	2014-09-25 13:57:29 -07:00
Matt Turner	b66791d47f	configure.ac: Replace gallium_check_st with gallium_require_drm.	2014-09-25 13:57:29 -07:00
Matt Turner	28e84c93bb	configure.ac: Drop gallium directory tracking. Was only tracked to be printed at the end of configure, but configure quits if it can't build something we requested, rather than silently dropping it, so printing these directories has little use.	2014-09-25 13:57:29 -07:00
Matt Turner	691bd9b9df	configure.ac: Use autoconf macro for GNU make.	2014-09-25 13:57:28 -07:00
Matt Turner	e4be17fd04	ralloc: Mark ralloc functions with gcc's malloc attribute. Cuts a few hundred bytes from the DRI drivers, so it must give gcc some extra information. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-25 13:52:55 -07:00
Matt Turner	976464c210	mesa: Replace a priori knowledge of gcc attributes with configure tests. Note that I had to add support for testing the packed attribute to m4/ax_gcc_func_attribute.m4. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [C bits] Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-25 13:52:55 -07:00
Matt Turner	4a96df73e7	mesa: Replace a priori knowledge of gcc builtins with configure tests. Presumably this will let clang and other compilers use the built-ins as well. Notice two changes specifically: - in _mesa_next_pow_two_64(), always use __builtin_clzll and add a static assertion that this is safe. - in macros.h, remove the clang-specific definition since it should be able to detect __builtin_unreachable in configure. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [C bits] Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-25 13:52:55 -07:00
Matt Turner	3e00822619	i965/compaction: Document instruction compaction capabilities. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-25 11:02:37 -07:00
Matt Turner	54e30dbf4d	i965: Emit ELSE/ENDIF JIP with type D on Gen 7. The spec says the type must be W (JIP is 16-bits after all), but we've been emitting it with a UD type all along and have experienced no adverse effects. Changing the type to D allows ELSE and ENDIF instructions to be compacted. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-25 11:02:36 -07:00
Matt Turner	6a4e84edfa	i965/compaction: Support compaction of control flow instructions. We're currently emitting compactable control flow instructions with the wrong types, preventing their compaction. The next patch will fix this and actually enable compaction. On chips that cannot compact control flow instructions, attempts to find a match in the datatype table will fail. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-25 11:02:36 -07:00
Matt Turner	14e44f896f	i965/compaction: Add support for G45. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-25 11:02:36 -07:00
Matt Turner	5a559557e6	i965: Add BRW_OPCODE_NENOP for G45. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-25 11:02:36 -07:00
Matt Turner	64c0f62018	i965/compaction: Add support for Gen5. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-25 11:02:36 -07:00
Matt Turner	bb05b530ab	i965/compaction: Reduce size of compacted_counts[] array. The array was previously indexed in units of brw_compact_inst (8-bytes), but before compaction all instructions are uncompacted, so every odd element was unused. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-25 11:02:36 -07:00
Matt Turner	90c982a8a8	i965/compaction: Use sizeof brw_inst/brw_compact_inst. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-25 11:02:36 -07:00
Matt Turner	b92a1e2174	i965/compaction: Increment offset in for loop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-25 11:02:36 -07:00
Matt Turner	eebf1f5441	i965/compaction: Make src_offset local to the for loop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-25 11:02:36 -07:00
Matt Turner	cde887ccb1	i965/compaction: Remove unnecessary is-compacted? check. Used to pass over previously compacted instructions in this loop, but no longer. No point in checking. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-25 11:02:36 -07:00
Matt Turner	118021f929	i965/compaction: Don't set UIP on ELSE on Gen < 8. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-25 11:02:36 -07:00
Matt Turner	45c3ece266	i965/compaction: Rework 3-src compaction logic. It may be possible to create a contrived example in which a 3-src instruction would have been compacted on Gen < 8. I'd rather not discover it in the wild. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-25 11:02:36 -07:00
Matt Turner	1fce6fcac1	i965/sf: Disable instruction compaction. Currently a no-op, since instruction compaction isn't implemented for the generations that have a programmable strips-and-fans unit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-25 11:02:36 -07:00
Matt Turner	b5466707d6	i965: Set JumpCount, not JIP, on ENDIF on Gen 6. Despite what the Sandybridge PRM says, ENDIF has Jump Count in <dst>, not JIP in <src1>. (The same mistake appears about WHILE as well). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-25 11:02:36 -07:00
Kenneth Graunke	23247e8059	mesa: Use VertexArray, not _VertexArray, in array size expressions. Both sizes are VERT_ATTRIB_MAX, so this has no effect. But it drops a few trivial uses of the derived state. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-09-24 23:14:26 -07:00
Kenneth Graunke	f81052dc9b	mesa: Set correct array element in vbo_exec_vtx_init. I'm not familiar with this code, but this sure appears to be a typo. It looks like the intent is to set each array element, not arrays[0] each time. Notably, the loop just below uses "array", not "arrays". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: mesa-stable@lists.freedesktop.org	2014-09-24 23:14:23 -07:00
Kenneth Graunke	d0ec6e8509	mesa: Use proper structure for glGet(GL_TEXTURE_COORD_ARRAY). The code in get.c that handles this uses ctx->Array.VAO->VertexAttrib, which is a gl_vertex_attrib_array structure, not a gl_client_array. The offsets of all fields happened to be the same in both structures, at least on x86_64. "Size," "Type," and "Stride" are obviously the same: both structures start with the same fields, in the same order. "Enabled" is dicier: there are different fields before it in both structures, including pointer sized values which might need special alignment. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: mesa-stable@lists.freedesktop.org	2014-09-24 23:14:16 -07:00
Kenneth Graunke	8c16a0d7ba	mesa: Remove some dead helper functions. Dead since the _MaxElement removal, but these functions seemed generally applicable, so I decided to remove them in a separate patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-09-24 23:13:56 -07:00
Eric Anholt	a04605a8ca	vc4: Compute max_index instead of trusting the rest of userspace. max_index was coming from either the user telling us as part of glDrawRangeElements, or from an incidental calculation as part of some sort of primitive conversion fallback. Sometimes, it was just set to the default "I don't know" ~0 value. If it wasn't set to the actual max index, then the kernel would reject the draw call for allowing out-of-bounds VBO reads. So, compute the max index from the sizes of the VBOs, which isn't too expensive (unlike mapping and reading the index buffer) and is reliable. Fixes piglit vao-element-array-buffer.	2014-09-24 20:51:15 -07:00
Eric Anholt	61cb08ab4f	vc4: Move shader record setup before the draw call. The flush only happens after both are written, so we can do them in either order. This will let me compute max_index during the shader record setup.	2014-09-24 20:49:08 -07:00
Matt Turner	ba0c0a186d	i965/vec4: Call calculate_cfg() in test programs to avoid crashing. Reported-by: Mark Janes <mark.a.janes@intel.com>	2014-09-24 16:06:41 -07:00
Eric Anholt	52476b35c1	vc4: Add support for gl_PointCoord. Fixes piglit glsl-fs-pointcoord, point-sprite, and fbo-gl_pointcoord.	2014-09-24 15:59:03 -07:00
Eric Anholt	66b7bd60e0	vc4: Add support for point size setting. This is the support for both the global and per-vertex modes.	2014-09-24 15:56:39 -07:00
Eric Anholt	f24588d64e	vc4: Add support for line width setting. I don't see piglit tests for it, but this should be better than not emitting it at all.	2014-09-24 15:56:39 -07:00
Eric Anholt	7fa399f93a	vc4: Actually add support for polygon offset. Setting the bit without setting the offset values is kind of useless. Fixes piglit polygon-offset (but not polygon-mode-offset).	2014-09-24 15:56:39 -07:00
Eric Anholt	6abbdfe3db	vc4: Fix swapped 565 dithering versus no-dithering render configs. Fixes many 565 piglit tests (like fbo-generatemipmap-formats) that weren't expecting dithering.	2014-09-24 15:56:39 -07:00
Eric Anholt	8cd165051b	vc4: Add support for alpha test. Fixes most of piglit fbo-alphatest-formats (but not RGB565/332).	2014-09-24 15:56:39 -07:00
Rob Clark	a87e44da3a	freedreno/a3xx: initial texture border-color Still some open questions.. and at any rate, no additional piglit passes due to various wrap modes that we need to emulate in at least some cases :-( But it does fix some mystery page-faults.. So add some comments in the code where there are things that we need to emulate or do more r/e, and push as-is. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-24 18:52:58 -04:00
Brian Paul	9f47220450	util: use linear formats in util_blit_pixels() Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-09-24 15:35:11 -06:00
Brian Paul	b6947e02de	util: simplify writemask parameters for util_blit_pixels() Instead of separate color and Z/S writemasks, just have one writemask parameter that takes a mask of the PIPE_MASK_[RGBAZS] flags. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-09-24 15:35:11 -06:00
Brian Paul	b32f05e153	util: s/PIPE_TEX_MIPFILTER/PIPE_TEX_FILTER/ in u_blit code PIPE_TEX_MIPFILTER_x is not legal for the pipe_sampler_state:: min/mag_img_filter fields. But PIPE_TEX_MIPFILTER_x == PIPE_TEX_FILTER_x so we were getting lucky. This also makes the code consistent with u_blitter.c. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-09-24 15:35:10 -06:00
Brian Paul	f5e8b30472	mesa: remove EXT suffix from FBO error messages And use pass caller="" for _mesa_FramebufferTexture(). Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-24 15:35:10 -06:00
Matt Turner	5980fc35c9	mesa: Drop _mesa_getenv() wrapper. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-24 09:58:43 -07:00
Matt Turner	209eba42eb	mesa: Drop _mesa_bsearch() wrapper. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-24 09:58:43 -07:00
Matt Turner	9499d6e358	mesa: Unifdef _WIN32_WCE. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-24 09:58:43 -07:00
Matt Turner	d20015a576	mesa: Unifdef _XBOX. Inexplicably added in commit `36940429`. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-24 09:58:43 -07:00
Matt Turner	b133b84733	configure.ac: Remove duplicate -DHAVE_PTHREAD. It's also defined by the AX_PTHREAD macro. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-24 09:58:43 -07:00
Matt Turner	d1022529fe	configure.ac: Stop checking for perl. Added by commit `a75c6163`, but no longer used. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-24 09:58:43 -07:00
Matt Turner	585e250dd2	configure.ac: Use test -a, rather than another test. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-24 09:58:43 -07:00
Matt Turner	452926a5ec	mesa: Use realloc() instead of _mesa_realloc() and remove the latter. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-24 09:58:42 -07:00
Matt Turner	e5162defc8	mesa: Remove duplicate _mesa_{init,free}_shader_state prototypes. Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-24 09:58:42 -07:00
Tom Stellard	180b152b24	gallivm: Wrap deleted include in if HAVE_LLVM < 0x0306 This was missed in `8f4ee56`.	2014-09-24 11:54:44 -04:00
Matt Turner	ef75f60822	i965: Add and use functions to get next/prev blocks. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-24 09:42:46 -07:00
Matt Turner	444fc0b4a8	i965: Call insert and remove functions from exec_node directly. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-24 09:42:46 -07:00
Matt Turner	49374fab5d	i965: Make instruction lists local to the bblocks. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-24 09:42:46 -07:00
Matt Turner	3fe1a84bbe	i965/cfg: Add note about double-loop macros and break behavior. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-24 09:42:46 -07:00
Matt Turner	153d148e9e	i965: Replace initialization loops with memset(). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-24 09:42:46 -07:00
Matt Turner	72bb3f81c6	i965/vec4: Don't iterate between blocks with inst->next/prev. The register coalescing portion of this patch hurts three shaders in Guacamelee by one instruction each, but examining the diff makes me believe that what we were generating was (perhaps harmlessly) incorrect.	2014-09-24 09:42:46 -07:00
Matt Turner	f0598d413b	i965/fs: Don't iterate between blocks with inst->next/prev. When instruction lists are per-basic block, this won't work. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-24 09:42:46 -07:00
Matt Turner	7119712f45	i965/cfg: Add macros to iterate through a block given a starting point. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-24 09:42:46 -07:00
Matt Turner	235f451f7a	i965/fs: Make count_to_loop_end() use basic blocks. When the instructions aren't in a flat list, this wouldn't have worked. Also, this should be faster. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-24 09:42:46 -07:00
Matt Turner	90bfeb2244	i965/vec4: Don't use instruction list after calculating the cfg. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-24 09:42:46 -07:00
Matt Turner	2ff0ff880c	i965/fs: Don't use instruction list after calculating the cfg. The only trick is changing a break into a return true in register coalescing, since the macro is actually a double loop, and break will do something different than you expect. (Wish I'd realized that earlier!) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-24 09:42:46 -07:00
Matt Turner	a4fb8897a2	i965: Remove now unneeded calls to calculate_cfg(). Now that nothing invalidates the CFG, we can calculate_cfg() immediately after emit_fb_writes()/emit_thread_end() and never again. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-24 09:42:46 -07:00
Matt Turner	072ea414d0	i965: Remove cfg-invalidating parameter from invalidate_live_intervals. Everything has been converted to preserve the CFG. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-24 09:42:46 -07:00
Matt Turner	9e28bb863c	i965: Preserve the CFG in instruction scheduling. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-24 09:42:46 -07:00
Matt Turner	269b6e24d6	i965/vec4: Preserve CFG in spill_reg(). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-24 09:42:46 -07:00
Matt Turner	b0b64c85e4	i965/vec4: Preserve the CFG in a few more places. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-24 09:42:46 -07:00
Matt Turner	a9f8296dbb	i965/fs: Preserve the CFG in a few more places. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-24 09:42:46 -07:00
Kristian Høgsberg	9b75663866	i965: Restructure debug flags This cleans up the debug flags to be consistently indented, use bit shifting instead of hex-values and fixes a bug where the new DEBUG_NO8 flag used the same value as the DEBUG_VUE flag. This was hidden by the numbers not being aligned. Also removes gaps in the range where DEBUG_IOCTL (0x4) and DEBUG_REGION (0x400) used to be. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-09-24 09:15:09 -07:00
Tom Stellard	8f4ee56e49	gallivm: Disable gallivm to fix build with LLVM 3.6 LLVM commit r218316 removes the JITMemoryManager class, which is the parent for a seemingly important class in gallivm. In order to fix the build, I've wrapped most of lp_bld_misc.cpp in if HAVE_LLVM < 0x0306 and modified the lp_build_create_jit_compiler_for_module() function to return false for 3.6 and newer, which effectively disables the gallivm functionality. I realize this is overkill, but I could not come up with a simple solution to fix the build. Also, since 3.6 will be the first release without the old JIT, it would be really great if we could move gallivm to use the C API only for accessing MCJIT. There is still time before the 3.6 release to extend the C API in case it is missing some functionality that is required by gallivm.	2014-09-24 10:34:19 -04:00
Marek Olšák	2f7714e071	gallium/rbug: correctly unreference a sampler view This fixes heap corruption. The sampler view can be bound in the context, so we cannot call destroy directly. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	91ddf49c87	gallium/rbug: unlock a mutex in rbug_create_query Reviewed-by: Brian Paul <brianp@vmware.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	c944866708	radeonsi: remove old cache flushing code Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	dd53d53dc6	radeonsi/compute: do CS partial flush with si_emit_cache_flush Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	604b58b554	radeonsi/compute: flush caches with si_emit_cache_flush Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	628f8ee1d9	radeonsi/compute: directly emit CONTEXT_CONTROL Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	dc05a9e4e0	radeonsi: properly destroy the GS copy shader and scratch_bo for compute Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	711623f7c8	radeonsi: release GS rings at context destruction Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	2833dc4e45	radeonsi: don't use pipe_constant_buffer for GS rings Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	1abb1a97b0	radeonsi: don't pass the context to the shader translator This should prevent accessing context state there. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	e29353ff20	radeonsi: don't snoop currently-bound GS shader when compiling ES Instead, pass the layout of GS inputs in memory to the ES using the shader key. Only 64 bits are needed to represent the layout in the key. Mixing and matching different VS and GS shaders should now always work. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	2774abd4ce	radeonsi: shorten si_pipe_* prefixes to si_* This was the original naming convention in r600g and it somehow crept into radeonsi. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	8c37c16cbc	radeonsi: merge si_pipe_shader into si_shader One is part of the other anyway. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	07c0b4d9b7	radeonsi: disable gl_SampleMask fragment shader output if MSAA is disabled This fixes piglit: arb_sample_shading-builtin-gl-sample-mask 0 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	b53b1ceb3e	radeonsi: only update MSAA-specific framebuffer state if nr_samples is changed Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	dba4c5baf4	radeonsi: move DB_SHADER_CONTROL into db_render_state I will need this for fixing sample shading with 1 sample. The good news is that all shader pm4 states no longer use the current context state, so we can generate the pm4 states outside of draw_vbo if needed. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	adc5797f54	radeonsi: set KILL_ENABLE during shader compilation, remove uses_kill flag Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	a34c9f70b1	radeonsi: remove shader.ps_conservative_z, set db_shader_control instead Also set the field on SI too. It's not just specific to CIK. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	884f1654e2	radeonsi: move DB registers from draw_vbo into new db_render_state It's called db_misc_state in r600g. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	a768b43bc3	radeonsi: remove unused variable si_pipe_shader::sprite_coord_enable Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	fd076259ff	radeonsi: document what si_descriptors.c does Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	4ace4190ac	r300g: implement MSAA copies by resolving and upsampling There's no other way. It will use hw resolve + blit.	2014-09-24 14:48:02 +02:00
Marek Olšák	6cfedf8797	st/mesa: redefine mapping from VARYING_SLOT_TEXi/PNTC/VARi to TGSI GENERIC[i] Generic varyings in TGSI were based on the value of VARYING_SLOT_TEX0, so VAR0 was always GENERIC[22] (with tessellation patches). Some drivers might not be able to cope with that. This commit defines a proper mapping, so that PNTC is GENERIC[8] and VAR0 is GENERIC[9]. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	77038cd35a	st/mesa: don't set coord_enable for gl_PointCoord if using TGSI_SEMANTIC_PCOORD This was missed when Christoph Bumiller added PIPE_CAP_TGSI_TEXCOORD. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	ffbcee8a57	st/mesa: use UniformBooleanTrue in glsl_to_tgsi Just for consistency. This doesn't fix anything as the original code was already pretty good. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	4155d1c7b0	st/mesa: drop dependence on API profile in st_init_extensions The extensions and limits being set in the conditional block are core-only anyway and don't have any effect on other profiles. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-09-24 14:48:02 +02:00
Marek Olšák	2599b92eb9	mesa: allow forcing >=3.1 compatibility contexts with MESA_GL_VERSION_OVERRIDE E.g. the 4.0 compatibility profile can be forced with: MESA_GL_VERSION_OVERRIDE=4.0COMPAT Some tests that I have require 4.0 compatibility. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-09-24 14:48:01 +02:00
Marek Olšák	10ffd98c34	mesa: don't set ES versions to GLSLVersion in _mesa_init_constants No place in Mesa expects an ES version there. Drivers don't even set it like this. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-09-24 14:48:01 +02:00
Emil Velikov	a3e9582f09	targets/vl: don't forget to set GALLIUM_STATIC_TARGETS git rebase failure while dropping out a patch that reworks the way we build aux/vl. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-24 11:54:28 +01:00
Emil Velikov	5a68432f04	targets/egl: fold in target LDFLAGS variables Both variables are identical thus we can fold them into AM_LDFLAGS. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-24 10:44:01 +01:00
Emil Velikov	a37b9bb555	targets: drop the old MEGADRIVERS & STATIC_TARGET... variables No longer used/needed as of last commit. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-24 10:44:01 +01:00
Emil Velikov	0f3c0ff17b	gallium/softpipe,llvmpipe: add automake target 'templates' Rather than duplicating the libdeps, extra define... all over the targets, define them only once and use when applicable. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-24 10:44:01 +01:00
Emil Velikov	29c4ae0ebf	configure: remove NEED_{SOFT,LLVM}PIPE_DRIVER variables The respective HAVE_{SOFT,LLVM}PIPE are already descriptive enough. Additionally, the svga module does not really use either one, but rather the auxiliary draw & gallivm modules. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-24 10:44:01 +01:00
Emil Velikov	3d909864c8	gallium/vc4: add automake target 'templates' Rather than duplicating the libdeps, extra define... all over the targets, define them only once and use when applicable. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-24 10:44:01 +01:00
Emil Velikov	c2b5d7024e	gallium/r300,r600,radeonsi: add automake target 'templates' Rather than duplicating the libdeps, extra define... all over the targets, define them only once and use when applicable. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-24 10:44:01 +01:00
Emil Velikov	fd4cd8e20a	gallium/svga: add automake target 'template' Rather than duplicating the libdeps, extra define... all over the targets, define them only once and use when applicable. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-24 10:44:00 +01:00
Emil Velikov	ca32ce40b1	gallium/ilo: add automake target 'template' Rather than duplicating the libdeps, extra define... all over the targets, define them only once and use when applicable. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-24 10:44:00 +01:00
Emil Velikov	defd48c6c5	gallium/i915: add automake target 'template' Rather than duplicating the libdeps, extra define... all over the targets, define them only once and use when applicable. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-24 10:44:00 +01:00
Emil Velikov	97bec98ac9	gallium/freedreno: add automake target 'template' Rather than duplicating the libdeps, extra define... all over the targets, define them only once and use when applicable. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-24 10:44:00 +01:00
Emil Velikov	0e59153229	gallium/nouveau: add automake target 'template' Rather than duplicating the libdeps, extra define... all over the targets, define them only once and use when applicable. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-24 10:44:00 +01:00
Emil Velikov	6e1f846ce0	targets/pipe-loader: drop unused authentication The dri, vdpau, omx, xvmc and gbm targets don't need any authentication; even the VL ones never used it. Either the respective loader or the library itself (vl) is doing its auth prior to calling create_screen(). Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-24 10:43:44 +01:00
Emil Velikov	18137c5fe0	targets/vl: fix hard-links when building shared pipe-drivers Make sure that MEGADRIVERS is set in order to create the hardlinks. The variable name is not the most appropriate and will be sorted out in upcoming commits. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-24 10:38:43 +01:00
Emil Velikov	1cb8bba499	configure: remove unused variable OSMESA_MESA_DEPS Leftover from the static Makefiles Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-24 10:38:43 +01:00
Emil Velikov	523fa2f1ce	gallium/freedreno: remove unused draw header Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-24 10:38:43 +01:00
Emil Velikov	e8053bb65e	gallium/r300: remove obsolete declaration The definition of rc_pair_regalloc_inputs_only() is no longer around so drop the declaration. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-24 10:38:43 +01:00
Eric Anholt	bf4aecfb2a	vc4: Drop maximum number of varyings down to 8. There are only 32 bits in the flatshade flags (which are 1 bit per component), the simulator crashes when you use more than about this many varyings, and the original Broadcom code drop only exposed 8 as well. Fixes 26 piglit tests in the varying-packing group, and makes many others go from crash to fail (due to not checking their varying counts and treating link failures as failures). Regresses ARB_fp/minmax (due to 8 varyings instead of 10).	2014-09-24 00:25:07 -07:00
Eric Anholt	45b104e0a2	vc4: Add support for flat shading. This is just the GL 1.1 flat shading of colors -- we don't need to support TGSI constant interpolation bits, because we don't do GLSL 1.30. Fixes 7 piglit tests.	2014-09-23 17:23:29 -07:00
Eric Anholt	0e7bc3088b	vc4: Drop stale comment. This should have been in `001247d230`.	2014-09-23 17:23:29 -07:00
Brian Paul	e8ea783d79	util: fix SCons build after register_allocate.c was moved to util/ directory.	2014-09-23 16:33:17 -06:00
Eric Anholt	9dbfca10a3	vc4: Put dead writes into the NOP register when generating code. They still provide register pressure since I haven't made a special class for them, but since they're only live for one instruction it probably doesn't matter. This improves the readability of QPU assembly.	2014-09-23 13:51:42 -07:00
Eric Anholt	d2b58240b4	vc4: When possible, resolve raddr conflicts by swapping files on specials. Cleans up a bunch of ugliness in perspective interpolation.	2014-09-23 13:51:41 -07:00
Eric Anholt	3e5325e8c9	vc4: Fix overzealous raddr conflict resolution. We only need to do the fixup when both args are in the same file, not just when both are in physical registers.	2014-09-23 13:51:29 -07:00
Eric Anholt	2e48b286bf	vc4: Add support for 8-bit unorm/snorm vertex inputs.	2014-09-23 13:40:10 -07:00
Eric Anholt	b7edf30191	vc4: Add disasm for A-file unpack operations. The A-file unpack is just like R4 unpack, except that if you don't do a floating-point operation it won't do float conversion (so int16 gets scaled up to int32).	2014-09-23 13:40:10 -07:00
Eric Anholt	71e5ba9c01	vc4: Switch to using Mesa's register allocator. This will let me more reliably allocate a-file registers, which are going to be even more in demand when I start using a-file unpacks. Also fixes a bug where the reservation of payload registers (FRAG_Z/W) was off by one but just caused failure to register allocate at all if the off-by-one was fixed.	2014-09-23 13:40:10 -07:00
Eric Anholt	0148690ac7	vc4: Make a static list of all the registers.	2014-09-23 13:40:10 -07:00
Eric Anholt	e157837282	vc4: Switch the context struct to use ralloc. I wanted to hang the ra_regs off it so I didn't have to free, but it turned out it wasn't ralloced yet.	2014-09-23 13:40:10 -07:00
Eric Anholt	517e01b5c3	mesa: Move register_allocate.c to util. The r300 gallium driver is using it outside of the Mesa tree, and I wanted to do so for vc4 as well. Rather than make the multiple-definitions problem even more complicated, just move it to more-shared code. v2: Don't forget to delete the symlink in r300 (review by Matt). Delete more r300-helper references (review by Emil) Don't prefix util/ header inclusion with "util/" (review by Emil) Reviewed-by: Matt Turner <mattst88@gmail.com> (v1) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> (v1)	2014-09-23 13:40:10 -07:00
Roland Scheidegger	5e1fcc6258	gallivm: fix idiv `ffeb77c7b0` had a typo which turned all signed integer divisions into unsigned ones. Oops. This gets us back the 51 little piglits (all from glsl built-in-functions, fs/vs/gs-op-div-int-ivec2 and similar). Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-09-23 21:46:00 +02:00
Juha-Pekka Heikkila	4ed23fd590	egl: extra null checks for get_xcb_screen() return values verify get_xcb_screen() returned pointer before using it. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila	b9463813ee	meta: Fix error paths in meta_copy_image.c If _mesa_get_tex_image() returns NULL, there is already an error set in the context. Other error paths free the allocated texture. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila	e13a8dc37d	meta: Avoid null access on setup_glsl_msaa_blit_shader() On default fallback path there was null access on src_rb Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila	ba089cfa82	i965: Add extra null check in intel_bufferobj_alloc() Check calloc returned requested memory. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila	51aa221480	mesa/main: Check allocations success in _mesa_one_time_init_extension_overrides() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila	a3d6146e3a	glsl: Check realloc return value in ir_function::matching_signature() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila	261120daef	loader: Check dlsym() did not fail in libudev_get_device_name_for_fd() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila	d2f0442bf6	glsl: Check calloc return value in link_intrastage_shaders() Check calloc return value while adding built-in functions. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila	808b8e59c0	i965: Avoid null access in intelMakeCurrent() separate two null checks connected with && to their own if branches. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila	36f8042e8c	mesa: add null checks in symbol_table.c Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila	6e56eaf7b7	glsl: add missing null check in tfeedback_decl::init() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila	a82b29d526	i965: in set_read_rb_tex_image() check _mesa_meta_bind_rb_as_tex_image() did succeed Check if _mesa_meta_bind_rb_as_tex_image() did give the texture. If no texture was given there is already either GL_INVALID_VALUE or GL_OUT_OF_MEMORY error set in context. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-23 10:25:02 +03:00
Juha-Pekka Heikkila	5a6ec26aec	glsl: Fix memory leak in glsl_lexer.ll Running fast clear glClear with SNB caused Valgrind to complain about this. v2: line 237 fixed glClear from leaking memory, other strdups are also now changed to ralloc_strdups but I don't know what effect those have. At least no changes in my Piglit quick run. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-23 10:25:02 +03:00
Chia-I Wu	6c9d67118a	ilo: rework pipeline workarounds Add current_pipe_control_dw1 and deferred_pipe_control_dw1 to track what has been done since the last 3DPRIMITIVE and what needs to be done before the next 3DPRIMITIVE. Based on them, we can emit WAs more smartly. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-23 10:08:05 +08:00
Chia-I Wu	34e807817f	ilo: remove handle_invalid_batch_bo() It was used to set has_gen6_wa_pipe_control to false when the batch buffer changed. When called from emit_flush() and others, it also unset ILO_3D_PIPELINE_INVALIDATE_BATCH_BO so that the following emit_draw() will not set has_gen6_wa_pipe_control to false again. It sounded error-prone and was just ugly. We should be able to achieve the same goal by resetting has_gen6_wa_pipe_control in ilo_3d_pipeline_invalidate(). With handle_invalid_batch_bo() gone, the emit functions can also be inlined. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-23 10:08:05 +08:00
Chia-I Wu	2c1f978d6c	ilo: make gen6_pipeline_update_max_svbi() static We do not need to call it from GEN7 pipeline anymore since software PIPE_QUERY_PRIMITIVES_EMITTED is gone. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-23 10:08:05 +08:00
Ilia Mirkin	f6ff4cd517	freedreno/ir3: add TXB2 support Handles texture(samplerCubeShadow, bias), part of GLES3 and GL3 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-22 22:06:34 -04:00
Ilia Mirkin	9b7961f9a3	freedreno/ir3: add TXQ support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-22 22:06:34 -04:00
Ilia Mirkin	9a3dcf21d7	freedreno/ir3: fix TXB/TXL to actually pull the bias/lod argument Previously we would get a potentially computed post-swizzle coord based on the texture target info, which would not include the bias/lod in the last argument. The second argument does not have to be adjacent, so adjusting the order array did not make sense. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-22 22:06:34 -04:00
Ilia Mirkin	53678f5e6b	freedreno/ir3: make texture instruction construction more dynamic This will make life a lot easier as we add support for additional instructions. v2: shadow reference value is always .z or .w Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-22 22:06:34 -04:00
Andreas Pokorny	df341320c9	i915: Fix black buffers when importing prime fds Width and Height of the imported image was never initialized from the imported bo. Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org> Signed-off-by: Andreas Pokorny <andreas.pokorny@canonical.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2014-09-23 00:26:17 +01:00
Andreas Pokorny	53b614bfd3	egl/drm: expose KHR_image_pixmap extension This change enables EGL_KHR_image_pixmap in the egl drm platform, which is implemented there but has not been advertised yet. Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org> Signed-off-by: Andreas Pokorny <andreas.pokorny@canonical.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2014-09-23 00:25:45 +01:00
Brian Paul	6addb7f42b	gallium: update comment for enum pipe_format Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-09-22 16:59:48 -06:00
Brian Paul	e7a614c60c	gallium: replace pipe_type enum with tgsi_return_type enum The only place the enum pipe_type was used is for the TGSI sampler view return type. So make it a TGSI type. Note: it appears this part of TGSI isn't used by anyone so it may be removed in the future. v2: the new name is tgsi_return_type, not tgsi_type. This means we can drop the previously posted tgsi_type -> tgsi_opcode_type patch. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-09-22 16:56:24 -06:00
Brian Paul	9ce72ac1fa	draw: use new tgsi_transform inst/decl helpers in pstipple code Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-09-22 16:56:24 -06:00
Brian Paul	493ab77551	draw: use new tgsi_transform inst/decl helpers in aapoint code Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-09-22 16:56:24 -06:00
Brian Paul	d7e5b7138a	draw: use new tgsi_transform inst/decl helpers in aaline code Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-09-22 16:56:24 -06:00
Brian Paul	e9d076e6d0	tgsi: add inst/decl helpers for tgsi_transform utility Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-09-22 16:56:24 -06:00
Brian Paul	16ff2fdd70	draw: use tgsi transform prolog callback in polygon stipple code Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-09-22 16:56:24 -06:00
Brian Paul	6581aa441e	draw: use tgsi transform prolog/epilog callbacks in AA line code Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-09-22 16:56:24 -06:00
Brian Paul	d77c0a2b52	draw: use tgsi transform prolog/epilog callbacks in AA point code This simplifies the code and makes it a little easier to understand. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-09-22 16:56:24 -06:00
Brian Paul	9e0160fc58	tgsi: fix tgsi transform's epilog callback We want to call the caller's epilog callback when we find the TGSI END instruction, not after it. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-09-22 16:56:23 -06:00
Brian Paul	b16bb3f50f	tgsi: add prolog() method to tgsi_transform_context Called when the user can insert new decls, instructions. This could be used in a few places in the 'draw' module. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-09-22 16:56:23 -06:00
Brian Paul	2826212dc7	glsl: use ptrdiff_t cast to silence g++ sign warning Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-22 16:56:23 -06:00
Jordan Justen	19b08e1bb3	i965/fs: Remove direct fs_visitor brw_wm_prog_key dependence Instead we store a void pointer to the key, and cast it to brw_wm_prog_key for fragment shader specific code paths. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-22 11:11:33 -07:00
Jordan Justen	e9be6a7833	i965/fs: Use brw_sampler_prog_key_data instead of brw_wm_prog_key::tex This helps: 1. Reduce the need to have fs_visitor::key's type be brw_wm_prog_key* 2. Align the code to allow brw_sampler_prog_key_data to be pulled out of other prog_key types for different stages. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-22 11:11:33 -07:00
Jordan Justen	49e5f76a65	i965/fs: Remove direct fs_visitor brw_wm_prog_data dependence Instead we store a brw_stage_prog_data pointer, and cast it to brw_wm_prog_data for fragment shader specific code paths. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-22 11:11:33 -07:00
Tom Stellard	c6d9801409	clover: Add support to mem objects for multiple destructor callbacks v2 The spec says that mem objects should maintain a stack of callbacks not just one. v2: - Remove stray printf. Reviewed-by: Francisco Jerez <currojerez@riseup.net> CC: "10.3" <mesa-stable@lists.freedesktop.org>	2014-09-22 12:32:34 -04:00
Brian Paul	cc71457b48	st/xa: silence unused variable warning Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-22 08:04:34 -06:00
Brian Paul	0100d45b7e	target-helpers: add inline qualifier on configuration_query() To silence unused function warnings. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-22 08:04:34 -06:00
Chia-I Wu	a68f421d73	ilo: clean up fallback path for primitive restart We should be able to draw with the index buffer mapped. That simplifies things a lot. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-22 14:22:37 +08:00
Chia-I Wu	d69faf851f	ilo: handle conditional rendering in the context Conditional rendering is not limited to draw_vbo(). Move the support to ilo_context, and replace ilo_3d_pass_render_condition() by ilo_skip_rendering().	2014-09-22 12:51:42 +08:00
Chia-I Wu	295a3a3ff0	ilo: create the pipeline from the builder The pipeline needs just the builder to build commands. It does not need CP.	2014-09-22 11:47:33 +08:00
Chia-I Wu	61c6a294dd	ilo: move aperture checks out of pipeline They can be done outside of the pipeline. Move them and let the pipeline focus on building commands.	2014-09-22 11:45:38 +08:00
Chia-I Wu	672592de7e	ilo: flush before setting SOL_RESET SOL_RESET happens before bo execution. It should not be observed by the commands that are already in the bo. Move the code out of the pipeline now that it submits.	2014-09-22 10:41:13 +08:00
Chia-I Wu	17e7582465	ilo: move size estimation check out of pipeline It can be done outside of the pipeline. Let's move it.	2014-09-22 10:36:27 +08:00
Rob Clark	49b8fb937f	freedreno/a3xx: more texture array fixes Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-21 15:36:26 -04:00
Rob Clark	18291ee17a	freedreno: add DRM_CONF_SHARE_FD Add config query and DRM_CONF_SHARE_FD to both mega-driver and traditional build configs, so that EGL_EXT_image_dma_buf_import works. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-21 15:35:53 -04:00
Chia-I Wu	41f072a4f8	ilo: use a single list for queries We used different lists for different types of queries because we wanted to update software queries quickly. Now that there are no software queries, we are fine with a single list. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-21 23:36:00 +08:00
Chia-I Wu	6b79d894d7	ilo: replace software queries by hardware ones Read PIPE_QUERY_PRIMITIVES_GENERATED and PIPE_QUERY_PRIMITIVES_EMITTED from hardware registers. Because all queries now have a bo, remove unnecessary checks for q->bo. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-21 23:31:41 +08:00
Chia-I Wu	154972700d	ilo: support prim queries in ilo_3d_pipeline_emit_query() Add support for PIPE_QUERY_PRIMITIVES_GENERATED and PIPE_QUERY_PRIMITIVES_EMITTED in ilo_3d_pipeline_emit_query(). Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-21 23:31:31 +08:00
Chia-I Wu	900d8136e1	ilo: add ilo_3d_pipeline_emit_query() It replaces ilo_3d_pipeline_emit_write_timestamp(), ilo_3d_pipeline_emit_write_depth_count(), and ilo_3d_pipeline_emit_write_statistics(). Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-21 23:31:20 +08:00
Chia-I Wu	9c873816a8	ilo: rework query support This fixes some corner cases, but more importantly, the new code should be easier to reason about. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-21 23:31:10 +08:00
Chia-I Wu	26fefae9a7	ilo: clarify cp owning/releasing Make it own()'s responsibility to make room for release() and itself. To be able to do that, allow ilo_cp_submit() in own(). Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-21 23:30:51 +08:00
Chia-I Wu	4eb2bbefd2	ilo: add a pointer to builder in ilo_3d_pipeline It saves quite a bit of typing.	2014-09-20 11:45:31 +08:00
Chia-I Wu	8b4726d32e	ilo: add a helper for RECTLIST blitter Add ilo_3d_draw_rectlist() for use by RECTLIST blitter.	2014-09-20 11:29:40 +08:00
Chia-I Wu	bca549691e	ilo: no direct ilo_context access in BLT blitter We need ilo_builder for command building and ilo_cp for size check. ilo_context is not used.	2014-09-20 11:06:08 +08:00
Chia-I Wu	c1165c8ea0	ilo: fix headers in Makefile.sources	2014-09-20 11:01:35 +08:00
Chia-I Wu	6c0de4b979	ilo: add a new struct for context states Move pipe states in ilo_context to the new ilo_state_vector. The motivation is that ilo_context consists of several loosely related things. When we need an ilo_context somewhere, we usually need only one or two of the things in it. This change makes ilo_state_vector one such thing. An immediate result is that we no longer need ilo_context in 3D pipelines, something we have planned for since early days.	2014-09-20 10:13:53 +08:00
Chia-I Wu	284d767be0	ilo: merge ilo_gpe.h to ilo_state*.h Move the #define's and struct's to ilo_state.h. Move the inline functions and function declarations to ilo_state_gen.h.	2014-09-20 10:13:53 +08:00
Chia-I Wu	4a8a6ce154	ilo: rename ilo_gpe_gen.[ch] Rename them to ilo_state_gen.[ch].	2014-09-20 10:13:53 +08:00
Chia-I Wu	3cb383c1c9	ilo: make ilo_fence opaque It is manipulated only in ilo_screen.c.	2014-09-20 10:13:53 +08:00
Chris Forbes	c4ed6c730f	i965/gen6: Enable GL 3.3 and GLSL 3.30 Tested on my snb-gt2: 4 tests skip->pass in spec/EXT_texture_array 51 tests skip->pass in spec.glsl-3.30 4 tests skip->pass in spec/!OpenGL 3.3 No regressions; no skip->fail changes. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-20 13:59:36 +12:00
Roland Scheidegger	7ede5a1a7b	gallivm: add information about different sampler/view units if analyzing shader Useful to know in some cases. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-09-20 02:19:02 +02:00
Emil Velikov	4824eecc0c	docs: Add 10.3 sha256 sums, news item and link release notes Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `18571edea8`) Conflicts: docs/index.html docs/relnotes.html	2014-09-19 20:18:43 +01:00
Emil Velikov	991242ece1	docs: Update 10.3 release notes Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `1b12af300d`)	2014-09-19 20:16:37 +01:00
Emil Velikov	878e8a89f4	docs: Add sha256 sums for the 10.2.8 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `f95fcb1716`)	2014-09-19 20:16:25 +01:00
Emil Velikov	4e8d1c7899	Add release notes for the 10.2.8 release Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> (cherry picked from commit `1e2b4120f7`)	2014-09-19 20:16:14 +01:00
Marek Olšák	8449121971	st/dri: remove GALLIUM_MSAA and __GL_FSAA_MODE environment variables Some users don't understand that these variables can break OpenGL. The general rule is that if an app supports MSAA, you mustn't use GALLIUM_MSAA. For example, if an app has an 8xMSAA FBO and GALLIUM_MSAA=4 is set, resolving the FBO to the back buffer will be rejected which will look like this on all gallium drivers: http://www.phoronix.com/scan.php?page=article&item=amd_radeonsi_msaa The environment variables also have no effect on modern apps like TF2, but there is still a performance hit due to wasted bandwidth and VRAM. In a nutshell, it does more harm than good. Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-19 20:38:06 +02:00
Eric Anholt	001247d230	vc4: Fix perspective interpolation. Fixes the mesa reflect demo and 6 tests under interpolation/	2014-09-19 11:25:02 -07:00
Eric Anholt	dcd03e7476	vc4: Use the same method as for FRAG_Z to handle fragcoord W. I need to get the non-reciprocal version of W for interpolation, anyway.	2014-09-19 11:09:04 -07:00
Roland Scheidegger	f2c39dd0e1	util: don't try to emit half-float intrinsics if avx isn't available These instructions only have vex encodings, thus they can't be used without avx. (Technically, one can still use avx-128 if avx isn't available because the environment doesn't store the ymm registers, however I don't think llvm can.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-09-19 16:58:28 +02:00
Samuel Iglesias Gonsalvez	74d7ff2efd	i965/gen6: enable GLSL 1.50, OpenGL 3.2 and GL_AMD_vertex_shader_layered Geometry shaders was the only thing we needed to enable GLSL 1.50 and OpenGL 3.2 in gen6. v2: Layered clears do not work properly in gen6 with OpenGL 3.2. Kenneth and Jordan realized that for this to work we also need GL_AMD_vertex_shader_layered (which requires OpenGL 3.2, so it could not be enabled before this patch), so we agreed to enable this together with OpenGL 3.2 in this patch. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-19 15:01:16 +02:00
Iago Toral Quiroga	d2c2ca9ee8	i965/gen6/gs: Use a specific implementation of geometry shaders for gen6. In gen6 we will use the geometry shader implementation from gen6_gs_visitor.cpp and keep the implementation in brw_vec4_gs_visitor.cpp for gen7+. Notice that gen6_gs_visitor inherits from brw_vec4_gs_visitor so it is not a completely separate implementation of geometry shaders. Also, gen6 does not support multiple dispatch modes, its default operation mode is equivalent to gen7's SINGLE mode, so select that in gen6 for consistency. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:16 +02:00
Iago Toral Quiroga	3a4aee34a2	i965/gen6/gs: upload ubo and pull constants surfaces. Uniforms declared as uniform blocks are stored in ubo surfaces and need to be pulled from the geometry shader program so make sure we upload them first and do the same for pull constants. This fixes all piglit tests that use uniform blocks: bin/shader_runner tests/spec/glsl-1.50/uniform_buffer/gs-* Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:16 +02:00
Samuel Iglesias Gonsalvez	6947a8a593	i965/gen6/gs: Enable transform feedback support in geometry shaders Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:16 +02:00
Iago Toral Quiroga	c66165ab2b	i965/gen6/gs: Fix binding table clash between TF surfaces and textures. For gen6 geometry shaders we use the first BRW_MAX_SOL_BINDINGS entries of the binding table for transform feedback surfaces. However, vec4_visitor will set up the binding table so that textures use the same space in the binding table. This is done when calling assign_common_binding_table_offsets(0) as part of its run() method. To fix this clash we add a virtual method to the vec4_visitor hierarchy to assign the binding table offsets, so that we can change this behavior specifically for gen6 geometry shaders by mapping textures right after the first BRW_MAX_SOL_BINDINGS entries. Also, when there is no user-provided geometry shader, we only need to upload the binding table if we have transform feedback, however, in the case of a user-provided geometry shader, we can't only look into transform feedback to make that decision. This fixes multiple piglit tests for textureSize() and texelFetch() when these functions are called from a geometry shader in gen6, like these: bin/textureSize gs sampler2D -fbo -auto bin/texelFetch gs usampler2D -fbo -auto Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:16 +02:00
Iago Toral Quiroga	2614cde998	i965/gen6/gs: Avoid buffering transform feedback varyings twice. Currently we buffer transform feedback varyings separately. This patch makes it so that we reuse the values we have already buffered for all the output varyings of the geometry shader instead. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:16 +02:00
Samuel Iglesias Gonsalvez	2120443484	i965/gen6/gs: Buffer PSIZ/flags vertex data in gen6_gs_visitor Since geometry shaders can alter the value of varyings packed in the first output VUE slot (PSIZ), we need to buffer it together with all the other vertex data so we can emit the right value for each vertex when we do the URB writes. This fixes the following piglit test in gen6: tests/spec/glsl-1.50/execution/redeclare-pervertex-out-subset-gs.shader_test Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:16 +02:00
Samuel Iglesias Gonsalvez	28a7da612b	i965/gen6/gs: Setup SOL surfaces for user-provided geometry shaders Update gen6_gs_binding_table and gen6_sol_surface to use user-provided geometry program information when present. This is necessary to implement transform feedback support. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:16 +02:00
Samuel Iglesias Gonsalvez	fda4470944	i965/gen6/gs: implement transform feedback support in gen6_gs_visitor This takes care of generating code required to handle transform feedback. Notice that transform feedback isn't enabled yet, since that requires additional setups in other parts of the code that will come in later patches. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:16 +02:00
Samuel Iglesias Gonsalvez	1f77bfce7d	i965/gen6/gs: Add an additional parameter to the FF_SYNC opcode. We will use this parameter in later patches to provide information relevant to transform feedback that needs to be set as part of the FF_SYNC message. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:16 +02:00
Samuel Iglesias Gonsalvez	3ea410972a	i965/gen6/gs: implement GS_OPCODE_FF_SYNC_SET_PRIMITIVES opcode This opcode will be used when filling FF_SYNC header before emitting vertices and their data. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:16 +02:00
Samuel Iglesias Gonsalvez	5933a08bd9	i965/gen6/gs: implement GS_OPCODE_SVB_SET_DST_INDEX opcode This opcode generates code to copy the specified destination index into subregister 5 of the MRF message header. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:16 +02:00
Samuel Iglesias Gonsalvez	e86ae1b0a3	i965/gen6/gs: implement GS_OPCODE_SVB_WRITE opcode This opcode will be used when sending SVB WRITE messages to save transform feedback outputs into Streamed Vertex Buffers. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:16 +02:00
Iago Toral Quiroga	66ec61c49f	i965/gen6/gs: Enable texture units and upload sampler state. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:16 +02:00
Iago Toral Quiroga	6669fd0818	i965/gen6/gs: Assign geometry shader VUE map properly. So far in gen6 we only used geometry shaders to implement transform feedback in vertex shaders, so we assumed that the VUE map for the geometry shader stage was always the same as for the vertex shader stage. This is no longer true now that we support user provided geometry shaders in gen6 too. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:15 +02:00
Iago Toral Quiroga	524ad6b901	i965/gen6/gs: Implement support for gl_PrimitiveIdIn. For this we will need to move PrimitiveID information, delivered in the thread payload in r0.1, to a separate register (we use GS_OPCODE_SET_PRIMITIVE_ID for this), then map the corresponding varying slot to that register in the setup_payload() method. Notice that we cannot use a virtual register as the destination for the PrimitiveID because we need to map all input attributes to hardware registers in setup_payload(), which happens before virtual registers are mapped to hardware registers. We could work around that issue if we were able to compute the first non-payload register in emit_prolog() and move the PrimitiveID information to that register, but we can't because at that point we still don't know the final number of uniforms that will be included in the payload. So, what we do is to place PrimitiveID information in r1, which is always delivered as part of the payload but it's only populated with data relevant for transform feedback when we set GEN6_GS_SVBI_PAYLOAD_ENABLE in the 3DSTATE_GS state packet. When we implement transform feedback, we will make sure to move the value of r1 to another register before we overwrite it with the PrimitiveID. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:15 +02:00
Iago Toral Quiroga	024b7c0f33	i965/gen6/gs: Implement GS_OPCODE_SET_PRIMITIVE_ID. In gen6 the geometry shader payload includes the PrimitiveID information in r0.1. When the shader code uses gl_PrimitiveIDIn we will have to move this to a separate hardware register where we can map this attribute. This opcode takes the selected destination register and moves r0.1 there. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:15 +02:00
Iago Toral Quiroga	c091804f4c	i965/gen6/gs: Handle the case where a geometry shader emits no output. In gen6 we need to end the thread differently depending on whether we have emitted at least one vertex or not. In case we did, the EOT message must always include the COMPLETE flag or else the GPU hangs. If we have not produced any output, however, we can't use the COMPLETE flag. This would lead us to end the program with an ENDIF opcode, which we want to avoid (and actually is not permitted since it hits an assertion), so instead what we do is that we always request a new VUE handle every time we do an URB WRITE, even for the last vertex we emit. With this we make sure that whether we have emitted at least one vertex or none at all we have to finish the thread without writing to the URB, which works for both cases by setting the COMPLETE and UNUSED flags in the EOT message. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:15 +02:00
Iago Toral Quiroga	c1b8a5155b	i965/gen6/gs: Make sure we complete the last primitive. Just in case the GS algorithm does not call EndPrimitive() for the last primitive produced. This is relevant only for non point outputs, since for this we are already setting the PrimEnd flag on each vertex we emit. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:15 +02:00
Iago Toral Quiroga	d93ca68666	i965/gen6/gs: Implement geometry shaders for outputs other than points. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:15 +02:00
Iago Toral Quiroga	8411bf2c69	i965/gen6/gs: Add initial implementation for a gen6 geometry shader visitor. Geometry shaders in gen6 are significantly different from gen7+ so it is better to have them implemented in a different file rather than adding gen6 branching paths all over brw_vec4_gs_visitor.cpp. This commit adds an initial implementation that only handles point output, which is the simplest case. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:15 +02:00
Iago Toral Quiroga	5c30da1845	i965: Generalize emit_urb_slot() to emit to any dst_reg. In gen7+ we emit vertices as they come, however in gen6 geometry shaders we have to buffer vertex data for all vertices and then emit it all in one go at the end. To achieve this we need to generalize emit_urb_slot() to store vertex data in general purpose registers and not only MRF registers. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:15 +02:00
Iago Toral Quiroga	9b32fd0f70	i965: Provide means to create registers of a given size. Implemented by Ilia Mirkin <imirkin@alum.mit.edu>. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:15 +02:00
Iago Toral Quiroga	f373b7ed82	i965/gen6/gs: Implement GS_OPCODE_SET_DWORD_2. We had GS_OPCODE_SET_DWORD_2_IMMED but this required its source argument to be an immediate. In gen6 we need to set dword 2 of the URB write message header from values stored in separate register, so we need something more flexible. This change replaces GS_OPCODE_SET_DWORD_2_IMMED with GS_OPCODE_SET_DWORD_2. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:15 +02:00
Iago Toral Quiroga	7ccd47d644	i965/gen6/gs: Upload binding table for user-provided geometry shaders. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:15 +02:00
Iago Toral Quiroga	5ac8294f9b	i965/gen6/gs: Enable URB space for user-provided geometry shaders. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:15 +02:00
Iago Toral Quiroga	c09ddf82ff	i965/gen6/gs: Compute URB entry size for user-provided geometry shaders. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:15 +02:00
Iago Toral Quiroga	621685ad4c	i965/gen6/gs: Add instruction URB flags to geometry shaders EOT message. Gen6 seems to require that EOT messages include the complete flag too or else the GPU hangs. We will add this flag to the instruction when we emit the thread end opcode. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:15 +02:00
Iago Toral Quiroga	2c85132e51	i965/gen6/gs: Implement GS_OPCODE_URB_WRITE_ALLOCATE. Gen6 geometry shaders need to allocate URB handles for each new vertex they emit after the first (the URB handle for the first vertex is obtained via the FF_SYNC message). This opcode adds the URB allocation mechanism to regular URB writes. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:15 +02:00
Iago Toral Quiroga	d0bdd4ce98	i965/gen6/gs: Implement GS_OPCODE_FF_SYNC. This implements the FF_SYNC message required in gen6 geometry shaders to get the initial URB handle. Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-19 15:01:15 +02:00
Samuel Iglesias Gonsalvez	406e04113f	i965/gs: Reuse gen6 constant push buffers setup code in gen7+. The code required for gen6 and gen7+ is almost the same, so reuse it. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-19 15:01:15 +02:00
Iago Toral Quiroga	96012dfe80	i965/gen6/gs: Setup constant push buffers for gen6 geometry shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-19 15:01:15 +02:00
Samuel Iglesias Gonsalvez	cf06136b63	i965/gen6/gs: Set brw->gs.enabled to FALSE in gen6_blorp_emit_gs_disable() See `7dfb4b2d00` for more details. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-19 15:01:14 +02:00
Samuel Iglesias Gonsalvez	bc383cb55b	i965/gen6/gs: use brw_gs_prog atom instead of brw_ff_gs_prog This is needed to support user-provided geometry shaders, since the brw_ff_gs_prog atom in gen6 only takes care of implementing transform feedback for vertex shaders. If there is no user-provided geometry shader the implementation falls back to the original code. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-19 15:01:14 +02:00
Samuel Iglesias Gonsalvez	dd376bdb25	i965/gen6/gs: Skeleton for user GS program support Currently, gen6 only uses geometry shaders for transform feedback so the state we emit is not suitable to accommodate general purpose, user-provided geometry shaders. This patch paves the way to add this support and the needed 3DSTATE_GS packet modifications for it. Previous code that emitted state to implement transform feedback in gen6 goes to upload_gs_state_adhoc_tf(). Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-19 15:01:14 +02:00
Iago Toral Quiroga	03164f6285	i965/gs: Use single dispatch mode as fallback to dual object mode when possible. Currently, when a geometry shader can't use dual object mode we fall back to dual instance mode, however, when invocations == 1, single dispatch mode is more performant and equally efficient in terms of register pressure. Single dispatch mode requires that the driver can handle interleaving of input registers, but this is already supported (dual instance mode has the same requirement). However, to take full advantage of single dispatch mode to reduce register pressure we would also need the ability to store two separate vec4 output values into vec8 registers, which would approximately double our capacity to store temporary values, but currently the vec4 visitor and generator classes do not support this, so at the moment register pressure in single and dual instance modes is the same. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-19 15:01:14 +02:00
Chia-I Wu	45cbc9267e	ilo: rename ILO_DEBUG=3d It has been a bad name since we added the builder. Rename it to ILO_DEBUG=batch to match i965, and call ilo_builder_decode() from ilo_cp_submit_internal().	2014-09-19 16:02:11 +08:00
Chia-I Wu	8a2352262e	ilo: rename ilo_cp_flush() "Flush" is used for too many things already: pipe resource flush, pipe context flush, pipe transfer region flush, and hardware pipeline flush. Rename it to ilo_cp_submit(). As such, ILO_DEBUG=flush is renamed to ILO_DEBUG=submit.	2014-09-19 16:02:11 +08:00
Chia-I Wu	1887d15eed	ilo: remove ilo_cp_empty() Call ilo_builder_batch_used() directly.	2014-09-19 16:02:11 +08:00
Chia-I Wu	270667472f	ilo: simplify ilo_cp_set_owner() The simplification allows us to get rid of ilo_cp_set_ring() and ilo_cp_implicit_flush(). The 3D query code is refactored for the simplification.	2014-09-19 16:02:11 +08:00
Kenneth Graunke	26ee6f23a9	mesa: Delete VAO _MaxElement code and index buffer bounds checking. Fredrik's implementation of ARB_vertex_attrib_binding introduced new gl_vertex_attrib_array and gl_vertex_buffer_binding structures, and converted Mesa's older gl_client_array to be derived state. Ultimately, we'd like to drop gl_client_array and use those structures directly. One hitch is that gl_client_array::_MaxElement doesn't correspond to either structure (unlike every other field), so we'd have to figure out where to store it. The _MaxElement computation uses values from both structures, so it doesn't really belong in either place. We could put it in the VAO, but we'd have to pass it around everywhere. It turns out that it's only used when ctx->Const.CheckArrayBounds is set, which is only set by the (rarely used) classic swrast driver. It appears that drivers/x11 used to set it as well, which was intended to avoid segmentation faults on out-of-bounds memory access in the X server (probably for indirect GLX clients). However, ajax deleted that code in 2010 (commit `1ccef926be`). The bounds checking apparently doesn't actually work, either. Non-VBO attributes arbitrarily set _MaxElement to 2 * 1000 * 1000 * 1000. vbo_save_draw and vbo_exec_draw remark /* ??? */ when setting it, and the i965 code contains a comment noting that _MaxElement is often bogus. Given that the code is complex, rarely used, and dubiously functional, it doesn't seem worth maintaining going forward. This patch drops it. This will probably mean the classic swrast driver may begin crashing on out of bounds vertex buffer access in some cases, but I believe that is allowed by OpenGL (and probably happened for non-VBO accesses anyway). There do not appear to be any Piglit regressions, either. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Roland Scheidegger <sroland@vmware.com>	2014-09-19 00:43:01 -07:00
Eric Anholt	19589147ef	vc4: Add support for stencil operations. While depth test state is passed through the fragment shader as sideband data, the stencil test state has to be set by the fragment shader itself. Many tests are still failing, but this gets most of hiz/ passing.	2014-09-18 17:46:43 -07:00
Eric Anholt	6e39854e23	vc4: Actually implement VC4_DEBUG=cl.	2014-09-18 11:46:50 -07:00
Roland Scheidegger	019ca99bee	draw: (trivial) remove duplicated lines	2014-09-18 16:13:24 +02:00
Brian Paul	7b2c703244	mesa: fix prog_optimize.c assertions triggered by SWZ opcode The SWZ instruction can have swizzle terms >4 (SWIZZLE_ZERO, SWIZZLE_ONE). These swizzle terms caused a few assertions to fail. This started happening after the commit "mesa: Actually use the Mesa IR optimizer for ARB programs." when replaying some apitrace files. A new piglit test (tests/asmparsertest/shaders/ARBfp1.0/swz-08.txt) exercises this. Cc: "10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-09-18 07:28:36 -06:00
Eric Anholt	71d4fc88d6	vc4: Allow copy propagation of uniforms. Fixes 12 piglit tests (and 8 more crash -> fail) from reducing register pressure.	2014-09-17 14:21:24 -07:00
Eric Anholt	79be2cc383	vc4: Make sure thread end doesn't have a uniform read. Prevents regression when I start doing copy propagation on uniforms.	2014-09-17 14:21:24 -07:00
Eric Anholt	44b8eb743d	vc4: Allow dead code elimination of instructions that read uniforms.	2014-09-17 14:21:24 -07:00
Eric Anholt	5e90ed79f6	vc4: Add support for reordering the uniform stream after optimization. This allows for introducing dead code elimination of uniforms, copy propagation of uniforms, and instruction rescheduling between instructions that both read uniforms.	2014-09-17 14:21:24 -07:00
Eric Anholt	b0256fb75f	vc4: Initialize the various qreg arrays when allocating them. This is particularly important for outputs, where we try to MOV the whole vec4 to the VPM, even if only 1-3 components had been set up. It might also be important for temporaries, if the shader reads components before writing them.	2014-09-17 14:21:24 -07:00
Eric Anholt	b44a7a3223	vc4: Fix stray disable of the CSE pass. Somehow I slipped this in with the original commit of CSE.	2014-09-17 14:21:24 -07:00
rconde	ffeb77c7b0	gallivm,tgsi: fix idiv by zero crash While the result of signed integer division by zero is undefined by glsl (and doesn't exist with d3d10), we must not crash, so need to make sure we don't get sigfpe much like udiv already does. Unlike udiv where we return 0xffffffff (as required by d3d10) there is no requirement right now to return anything specific so we use zero.	2014-09-17 18:31:54 +02:00
Roland Scheidegger	4d996877ca	gallivm: add texture target information for sample opcodes to tgsi info sample opcodes don't have valid texture target information (and I don't think this should be changed), however it would be nice if we had that information ready elsewhere, so stuff that information into the tgsi info when analyzing a shader. v2: Ilia Mirkin spotted some bugs wrt not handling msaa resources. So add them and while there also add them to the tex opcode analysis this was cloned from as well (plus get rid of some bug not detecting indirect textures there in some cases too). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-09-17 18:31:54 +02:00
Richard Sandiford	2e49559c77	st/mesa: Fix handling of 8888 SNORM and SRGB formats for big-endian MESA_FORMAT_x8y8z8w8 puts the x channel in the least significant part of the containing 32-bit integer, which is equivalent to PIPE_FORMAT_xyzw8888. PIPE_FORMAT_x8y8z8w8 puts the x channel first in memory. This patch fixes up the mesa<->gallium mapping accordingly. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-17 13:20:08 +10:00
Richard Sandiford	ccdbcd9586	st/mesa: Fix handling of LA and RG formats for big-endian MESA_FORMAT_LnAn puts the luminance in the least significant part of the containing integer, which is equivalent to PIPE_FORMAT_LAnn. PIPE_FORMAT_LnAn puts the luminance first in memory. This patch fixes up the mesa<->gallium mapping accordingly. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-17 13:20:08 +10:00
Richard Sandiford	be6ef203aa	mesa: Add MESA_FORMAT_{A8R8G8B8, X8R8G8B8, X8B8G8R8}_SRGB (v2) This means that each 8888 SRGB format has a reversed counterpart, which is necessary for handling big-endian mesa<->gallium mappings. v2: fix missing i965 additions. (Jason) fix 127->255 max alpha for SRGB formats. (Jason) v1: Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-17 13:19:45 +10:00
Richard Sandiford	df14091c58	mesa: Add MESA_FORMAT_A8L8_{SNORM,SRGB} The associated UNORM format already existed. This means that each LnAn format has a reversed counterpart, which is necessary for handling big-endian mesa<->gallium mappings. [airlied: rebased onto current master] Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-17 13:17:47 +10:00
Richard Sandiford	234d194b49	gallium: Define PIPE_FORMAT_xyzw8888_{SNORM, SRGB} aliases ...i.e. formats in which the first listed component is in the least significant byte of the integer. The corresponding UNORM aliases already exist. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-17 13:17:46 +10:00
Richard Sandiford	f9d8574b5e	gallium: Add PIPE_FORMAT_x8B8G8R8_SNORM formats This means that each RnGnBnxn format has a reversed counterpart, which is necessary for handling big-endian mesa<->gallium mappings. The associated UNORM and SRGB formats already exist. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-17 13:17:46 +10:00
Richard Sandiford	9b4c13995c	gallium: Define PIPE_FORMAT_{LA, AL, RG, GR}nn aliases ...i.e. formats in which the first listed component is in the least significant half of the integer. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-17 13:17:46 +10:00
Richard Sandiford	f14b40ab32	gallium: Add PIPE_FORMAT_AnLn and PIPE_FORMAT_GnRn formats ...i.e. formats in which the alpha or green channel is first in memory. This means that each LnAn and RnGn format has a reversed counterpart, which is necessary for handling big-endian mesa<->gallium mappings. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-17 13:17:46 +10:00
Dave Airlie	9ea045e85e	mesa: fix SRGB alpha channel value in pack_float_R8G8B8X8_SRGB Jason pointed out the bug on review adding new formats, but the existing format also appears to have the bug, so use 255 as the max; these are SRGB, not SNORM. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-17 13:28:18 +10:00
Richard Sandiford	ecc48f83c8	swrast: Fix handling of MESA_FORMAT_L8A8_SRGB for big-endian Luminance is the least-significant byte of the uint16, rather than the lowest byte in memory. Other parts of mesa already handle this correctly for big-endian, and swrast already handles other MESA_FORMAT_x8y8 formats correctly. This case was just an odd-one-out. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-17 12:07:13 +10:00
Richard Sandiford	3e3b1db5f7	mesa: Tweak unpack name for MESA_FORMAT_R8G8B8X8_SNORM MESA_FORMAT_R8G8B8X8_SNORM used a function called unpack_X8B8G8R8_SNORM while MESA_FORMAT_R8G8B8X8_SRGB used a function called unpack_R8G8B8X8_SRGB. This patch renames the SNORM function to have the same order as the MESA_FORMAT name, like the SRGB function does. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-17 12:05:08 +10:00
Richard Sandiford	3ff5c6a6c4	mesa: Fix alpha component in unpack_R8G8B8X8_SRGB. The function was using the "X" component as the alpha channel, rather than setting alpha to 1.0. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-17 11:56:23 +10:00
Dave Airlie	ebcb2ee989	util: move shared rgtc code to util (v2) This was being shared using a ../../ get out of gallium into mesa, and I swore when I did it I'd fix things when we got a util dir, we did, so I have. v2: move RGTC_DEBUG define Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-17 11:27:25 +10:00
Eric Anholt	2b6711cc5f	vc4: Claim ARB_fbo. This gets a ton of piglit working that crashes in waffle context management stuff otherwise. Actually supporting mismatched FB sizes is at best going to require some more load/store generals for color buffers, but if I can't manage to do that I'll want to just have state_tracker reject those FBOs as unsupported, rather than deny GL 2.1.	2014-09-16 15:14:52 -07:00
Eric Anholt	3c6d85e725	vc4: Fix memory leaks in register allocation.	2014-09-16 15:14:52 -07:00
Eric Anholt	ad02ba42f0	vc4: Move register allocation to a separate file. I'm going to be rewriting it all, and having it mixed up with the QIR-to-QPU opcode translation was messy.	2014-09-16 15:14:52 -07:00
Chris Forbes	b84c02f9cd	glsl: fix error message for redeclaring gl_PerVertex as output Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-17 08:33:16 +12:00
Chris Forbes	667f758788	i965/vec4: slightly improve insn dumping with no srcs Previously, we would get a trailing ', ' which looked strange. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-17 08:32:46 +12:00
Eric Anholt	2264925f85	vc4: Add support for computed depth writes. Fixes piglit glsl-1.10-fragdepth and early-z.	2014-09-16 13:03:41 -07:00
Eric Anholt	aae4223fbd	vc4: Restructure depth input/output in fragment shaders. The goal here is to have an argument for the depth write opcode so that I can do computed depth. In the process, this makes the calculations that will be emitted more obvious in the QIR.	2014-09-16 13:03:32 -07:00
Ilia Mirkin	a420aa1b41	freedreno: add a standalone ir3_compiler binary for building TGSI Compiler taken from the combo old/new compiler comparer + simulator. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-16 12:13:22 -04:00
Ilia Mirkin	5b1d316c51	freedreno: add default .dir-locals.el for emacs settings Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-16 12:13:22 -04:00
Gwenole Beauchesne	e1c50abf8a	i965: add support for RGBA dma_buf imports. This allows for importing foreign buffers in RGB32 native endian byte order, i.e. DRM_FORMAT_XBGR8888, and DRM_FORMAT_ABGR8888. Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.3" <mesa-stable@lists.freedesktop.org>	2014-09-16 01:11:06 -07:00
Kenneth Graunke	78bd126194	i965: Mark delta_x/y as BAD_FILE if remapped away completely. Commit `afe3d1556f` (i965: Stop doing remapping of "special" regs.) stopped remapping delta_x/delta_y, and additionally stopped considering them always-live. We later realized delta_x was used in register allocation, so we actually needed to remap it, which was fixed in commit `23d782067a` (i965/fs: Keep track of the register that hold delta_x/delta_y.). However, that commit didn't restore the "always consider it live" part. If all the code using delta_x was eliminated, fs_visitor::delta_x would be left pointing at its old register number. Later code in register allocation would handle that register number specially...even though it wasn't actually delta_x. To combat this, set delta_x/y to BAD_FILE if they're eliminated, and check for that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83127 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.3" <mesa-stable@lists.freedesktop.org>	2014-09-16 00:46:46 -07:00
Dave Airlie	7f6872d012	st_glsl_to_tgsi: init have_sqrt field. Coverity reported this. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-16 15:13:05 +10:00
Dave Airlie	8de5522d93	llvmpipe: fix rast debugging output The triangle_32_ rast functions never made it into the debug output, confused me for a few seconds. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-16 15:12:54 +10:00
Richard Sandiford	f93b6d8cc5	util: Add big-endian layout for a number of formats. This patch builds on `6c8f547f66` and previous patches by allowing u_format.csv to specify separate big-endian and little-endian layouts. It then uses this to specify the correct layouts for various depth/stencil formats. Later patches handle other formats. To recap, the idea is that u_format.csv lists the channels for an N-byte value as though it were an N-byte integer. For little-endian targets the channels are listed starting at the least-significant bit of the integer while for big-endian targets the channels are listed starting at the most-significant bit. This means that for something like PIPE_FORMAT_B8G8R8A8_UNORM (blue in first byte of memory, alpha in last byte of memory) the orders are the same for both endiannesses. But for something like PIPE_FORMAT_S8_UINT_Z24_UNORM, where the stencil is in the least significant byte of a 32-bit integer, there need to be separate channel definitions for each endianness. The effect of this patch is to make the affected PIPE_FORMAT_s have the same layout as the associated MESA_FORMAT_s for big-endian. The MESA_FORMAT_*s are already handled correctly. Fixes various piglit tests on z. No regressions on x86_64. [airlied: squash subsequent patches] util: Add big-endian layout for 5551 and 565 formats util: Add big-endian layout for 10/10/10/2 formats util: Add big-endian layout for 4444 formats util: Add big-endian layout for 233 format util: Add big-endian layout for 44 formats Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-16 14:02:56 +10:00
Richard Sandiford	9cd4dced06	llvmpipe: Fix PIPE_FORMAT_Z32_FLOAT_S8X24_UINT handling for big-endian. llvmpipe treats PIPE_FORMAT_Z32_FLOAT_S8X24_UINT as a bit of a special case, handling it as two 32-bit pieces rather than a single 64-bit block: /* 64bit d/s format is special already extracted 32 bits */ total_bits = format_desc->block.bits > 32 ? 32 : format_desc->block.bits; The format_desc describes the whole 64-bit block, so the z shift will be 32 for big-endian. But since we're accessing the z channel as a 32-bit value rather than a 64-bit value, we need to mask the shift with 31. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-16 14:02:55 +10:00
Richard Sandiford	1a65629ccc	gallivm: Fix uses of 2^24 Fallback cases in lp_bld_arit.c used 2^24 to mean "2 to the power 24", but in C it's "2 xor 24", i.e. 26. Fixed by using 1<< instead. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-16 14:02:55 +10:00
Richard Sandiford	0a7f9fe42b	gallivm: Add SNORM clamping to lp_build_{add, sub} ...fixing the associated TODO. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>	2014-09-16 14:02:54 +10:00
Rafael Ávila de Espíndola	f6e71ff9eb	gallivm: attach DataLayout to module too, not just pass manager. It looks like it was possible to attach it to both for a long time, however since llvm r217548 attaching it to just the pass manager is no longer sufficient and causes bugs (see http://llvm.org/bugs/show_bug.cgi?id=20903). Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-09-16 03:50:32 +02:00
Roland Scheidegger	145fef9636	gallivm: handle SAMPLE opcode in aos sampling This is just a very limited version, in particular sampler and sampler view index must be the same. It cannot handle any modifiers either. Works much the same as soa version otherwise, to figure out the target we need to store the sampler view dcls. While here, also handle (no-op) RET and get rid of a couple bogus deprecated comments. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-09-16 03:50:31 +02:00
Roland Scheidegger	02595c55b0	tgsi: accept offsets for sample opcodes too in the text parser sample opcodes are a little oddly represented in the opcode_info, since they don't count as texture instructions - they don't have valid target information, but they may have offsets (unlike "ordinary" texture instructions, the texture token may be optional for them). So just make sure with these opcodes the optional offsets are accepted. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-09-16 03:50:31 +02:00
Roland Scheidegger	3a9eb40ee1	tgsi: don't print texture target for sample opcodes sample opcodes don't encode a texture target, it would thus always print UNKNOWN, which is not helpful (and wouldn't parse when giving back the shader text to tgsi). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-09-16 03:50:31 +02:00
Eric Anholt	84ca6bc113	vc4: Bump maximum ARB program temporaries to match Intel/AMD. This query has always been useless, but we could potentially reject well-formed, runnable programs if we expose a value that's too low.	2014-09-15 15:05:06 -07:00
Eric Anholt	33443f506f	vc4: Bump maximum uniforms count to match other drivers. We don't have any specific limits in the hardware, just like the other GPUs, so match their behavior. Fixes minmax_gles2 and several other piglit tests relying on the specced uniform minmax values.	2014-09-15 15:04:38 -07:00
Eric Anholt	5638b87d4c	vc4: Dynamically allocate the TGSI-to-qreg arrays. Fixes buffer overflows in some piglit tests (which are still failing to register allocate anyway).	2014-09-15 13:12:27 -07:00
Eric Anholt	2147dd9681	vc4: Fix memory leaks of struct qinst.	2014-09-15 13:12:27 -07:00
Eric Anholt	f78ee1b280	vc4: Fix memory leaks of some vc4_compile contents.	2014-09-15 13:12:27 -07:00
Eric Anholt	50292d76c5	vc4: Reuse the util header instead of defining our own ARRAY_SIZE. Fixes redefinition warnings if you end up including this header before util stuff.	2014-09-15 13:12:27 -07:00
Brian Paul	418da97905	mesa: move i, j var decls into SWIZZLE_CONVERT_LOOP() macro Put macro code in do {} while loop and put semicolons on macro calls so auto indentation works properly. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-15 09:52:44 -06:00
Brian Paul	cfeb394224	mesa: break up _mesa_swizzle_and_convert() to reduce compile time This reduces gcc -O3 compile time to 1/4 of what it was on my system. Reduces MSVC release build time too. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-09-15 09:52:44 -06:00
Kalyan Kondapally	dbc2d81d2b	Generate a warning when not writing gl_Position with GLES. With GLES we don't give any kind of warning in case we don't write to gl_Position. This patch makes changes so that we generate a warning in case of GLES (VER < 300) and an error in case of GL. Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-15 08:14:33 +03:00
Tapani Pälli	9bd139e451	mesa: check that uniform exists in glUniform* functions Remap table for uniforms may contain empty entries when using explicit uniform locations. If no active/inactive variable exists with given location, remap table contains NULL. v2: move remap table bounds check before existence check (Ian Romanick) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Erik Faye-Lund <kusmabite@gmail.com> (v1) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83574	2014-09-15 07:33:12 +03:00
Chia-I Wu	ce50a61d36	ilo: clean up 3D/media functions Mostly style changes to set dw[0] directly.	2014-09-15 10:25:35 +08:00
Chia-I Wu	c39377d3fc	ilo: fix gen6_3DSTATE_MULTISAMPLE() There was a typo introduced by `90f4b131fc`.	2014-09-15 09:00:54 +08:00
Rob Clark	ca29c4c3b0	freedreno/a3xx: 3d/array textures Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-13 15:31:58 -04:00
Rob Clark	eea1cdf687	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-13 15:31:58 -04:00
Chia-I Wu	a32f48361a	ilo: trust vertex element count more We might run into ve->count == 0 and last_velement_edgeflag == true in gen6_3DSTATE_VERTEX_ELEMENTS() when the state tracker sets an invalid combination of VS and VE (does not seem to happen with st/mesa). Do not assume ve->count is positive when last_velement_edgeflag is true. Reported by Coverity. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-14 00:30:33 +08:00
Chia-I Wu	8fcf1b1f90	ilo: simplify src operand gathering in disassembler Always initialize the operand array to point to src0, src1, and src2. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-14 00:30:33 +08:00
Chia-I Wu	5341001b94	ilo: derive 3-src instructions from the opcode table One less switch statement to maintain. Signed-off-by: Chia-I Wu <olvaffe@gmail.com>	2014-09-14 00:30:33 +08:00
Ilia Mirkin	1d7b0d832c	nouveau: check for mesa context init failure Reported by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-13 11:29:23 -04:00
Ilia Mirkin	2e86432cc1	nouveau: avoid leaking screen on initialization fail Reported by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-13 11:17:26 -04:00
Ilia Mirkin	b13a4ca3f7	nouveau: change internal variables to avoid conflicts with macro args Reported by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-13 10:55:16 -04:00
Chia-I Wu	9133784a46	ilo: clean up 3DPRIMITIVE functions Add ILO_PRIM_RECTANGLES to replace the rectlist bool.	2014-09-13 09:33:20 +08:00
Chia-I Wu	eca98153e9	ilo: clean up 3D/media common functions Rename ilo_builder_batch_state_base_address() to gen6_state_base_address() for consistency and remove unused gen6_STATE_BASE_ADDRESS(). Reorder the code in gen6_PIPE_CONTROL() a bit. Finally, some mostly cosmetic changes.	2014-09-13 09:31:08 +08:00
Chia-I Wu	ea8e7a8d4a	ilo: move 3D functions to ilo_builder_3d*.h Move functions for the 3D pipeline to the new headers. We artificially split the functions into top (vertex processing) and bottom (pixel processing), to keep the headers at reasonable sizes.	2014-09-13 09:31:08 +08:00
Chia-I Wu	aec8521166	ilo: move media functions to ilo_builder_media.h Move functions for the media pipeline to the new header.	2014-09-13 08:32:25 +08:00
Chia-I Wu	45023db7a9	ilo: move GPE common functions to ilo_builder_render.h Move 3D/media common functions to the new header.	2014-09-13 08:30:32 +08:00
Kenneth Graunke	84a40ce86b	glsl: Speed up constant folding for swizzles. ir_rvalue::constant_expression_value() recursively walks down an IR tree, attempting to reduce it to a single constant value. This is useful when you want to know whether a variable has a constant expression value at all, and if so, what it is. The constant folding optimization pass attempts to replace rvalues with their constant expression value from the bottom up. That way, we can optimize subexpressions, and ideally stop as soon as we find a non-constant subexpression. In order to obtain the actual value of an expression, the optimization pass calls constant_expression_value(). But it should only do so if it knows the value can be combined into a constant. Otherwise, at each step of walking back up the tree, it will walk down the tree again, only to discover what it already knew: it isn't constant. We properly avoided this call for ir_expression nodes, but not for ir_swizzle nodes. This patch fixes that, drastically reducing compile times on certain shaders where tree grafting has given us huge expression trees. It also fixes SuperTuxKart. Thanks to Iago and Mike for help in tracking this down. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78468 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org	2014-09-12 16:35:39 -07:00
Kenneth Graunke	7865026c04	i965/vec4: Make type_size() return 0 for samplers. The FS backend has always used 0, and the VS backend has always used 1. I think 1 is just working around other problems, and is incorrect. Samplers are baked in; nothing uses the UNIFORM register we would create, and we shouldn't upload any constant values for them. Fixes ES3-CTS.shaders.struct.uniform.sampler_array_vertex. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-12 16:35:39 -07:00
Kenneth Graunke	2408f166db	i965: Skip allocating UNIFORM file storage for uniforms of size 0. Samplers take up zero slots and therefore don't exist in the params array, nor are they included in stage_prog_data->nr_params. There's no need to store their size in param_size, as it's only used for dealing with arrays of "real" uniforms (ones uploaded as shader constants). We run into all kinds of problems trying to refer to the uniform storage for variables that don't have uniform storage. For one, we may use some other variable's index, or access out of bounds in arrays. In the FS backend, our extra 2 * MaxSamplerImageUnits params for texture rectangle rescaling paper over a lot of problems. In the VS backend, we claim samplers take up a slot, which also papers over problems. Instead, just skip allocating storage for variables that don't have any. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-12 16:35:39 -07:00
Kenneth Graunke	6b6145204d	i965: Separate gl_InstanceID and gl_VertexID uploading. We always uploaded them together, mostly out of laziness - both required an additional vertex element. However, gl_VertexID now also requires an additional vertex buffer for storing gl_BaseVertex; for non-indirect draws this also means uploading (a small amount of) data. This is extra overhead we don't need if the shader only uses gl_InstanceID. In particular, our clear shaders currently use gl_InstanceID for doing layered clears, but don't need gl_VertexID. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-12 16:35:35 -07:00
Kenneth Graunke	e980fe6071	i965: Fix reference counting in new basevertex upload code. In the non-indirect draw case, we call intel_upload_data to upload gl_BaseVertex. It makes brw->draw.draw_params_bo point to the upload buffer, and increments the upload BO reference count. So, we need to unreference it when making brw->draw.draw_params_bo point at something else, or else we'll retain a reference to stale upload buffers and hold on to them forever. This also means that the indirect case should increment the reference count on the indirect draw buffer when making brw->draw.draw_params_bo point at it. That way, both paths increment the reference count, so we can safely unreference it every time. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-12 16:23:02 -07:00
Rob Clark	9b6281a7da	freedreno: "fix" problems with excessive flushes `4f338c9b` introduced logic to trigger a flush rather than overflowing cmdstream buffer. But the threshold was too low, triggering flushes where they were not needed. This caused problems with games like xonotic. Part of the problem is that we need to mark all state dirty between cmdstream submit ioctls, because we cannot rely on state being preserved across ioctls. But even with that, there are still some problems that are still being debugged. For now: 1) correctly mark all state dirty 2) introduce FD_MESA_DEBUG flush flag to force rendering to be flushed between each draw, to trigger problems (so that I can debug) 3) use a more reasonable threshold so for normal usecases we don't trigger the problems This at least corrects the regression, but there is still more debugging to do. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-12 18:35:39 -04:00
Marek Olšák	d13d2fd161	r600g,radeonsi: add debug option which forces DMA for copy_region and blit	2014-09-12 22:51:28 +02:00
Ilia Mirkin	d7ec3db349	freedreno/ir3: implement UMUL correctly Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-12 16:26:21 -04:00
Ilia Mirkin	436dd1e2f8	freedreno/ir3: fix UCMP handling UCMP does not require a compare, only a select. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-12 16:26:15 -04:00
Ilia Mirkin	9f5bd154d7	freedreno/ir3: add TXL support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-12 16:26:11 -04:00
Rob Clark	459f8f3d66	freedreno/ir3: add missing put_dst Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-12 16:26:09 -04:00
Rob Clark	59ff81663a	freedreno/ir3: catch incorrect usage of tmp-dst Each get_dst() should have a matching put_dst(). Add a bit of checking to catch mistakes. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-12 16:26:09 -04:00
Ilia Mirkin	db1a94b1cc	freedreno/ir3: use unsigned comparison for UIF Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-12 16:26:05 -04:00
Ilia Mirkin	11d72553c5	freedreno/ir3: negate result of USLT/etc Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-12 16:26:01 -04:00
Ilia Mirkin	8edf83b377	freedreno/ir3: add UARL support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-12 16:25:57 -04:00
Ilia Mirkin	10273f84c2	freedreno/ir3: INEG operates on src0, not src1 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-12 16:25:52 -04:00
Ilia Mirkin	572ffca050	freedreno/ir3: fix FSLT/etc handling to return 0/-1 instead of 0/1.0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-12 16:25:47 -04:00
Rob Clark	80058c0f08	freedreno/a3xx: alpha render-target shenanigans We need the .w component to end up in .x, since the hw appears to fetch gl_FragColor starting with the .x coordinate regardless of MRT format. As long as we are doing this, we might as well throw out the remaining unneeded components. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-12 16:23:52 -04:00
Rob Clark	3e0a82b52e	util/u_format: add _is_alpha() Because of render-to-alpha (000x) shenanigans, freedreno needs to do some special handling when rendering to alpha-only formats. And I noticed that while we had _is_luminance(), _is_intensity(), etc, an _is_alpha() helper was missing. So fix that. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-12 16:23:52 -04:00
Rob Clark	480fe244dd	freedreno/a3xx: format fixes Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-12 16:23:52 -04:00
Rob Clark	1fba490569	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-12 16:23:52 -04:00
Rob Clark	2ed7640eec	freedreno/a3xx: handle rendering to layer != 0 Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-12 16:23:52 -04:00
Brian Paul	0d73ac6b02	mesa: fix _mesa_free_pipeline_data() use-after-free bug Unreference the ctx->_Shader object before we delete all the pipeline objects in the hash table. Before, ctx->_Shader could point to freed memory when _mesa_reference_pipeline_object(ctx, &ctx->_Shader, NULL) was called. Fixes crash when exiting the piglit rendezvous_by_location test on Windows. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-12 09:17:31 -06:00
Connor Abbott	2828680e39	ra: assert against unsigned underflow in q_total q_total should never go below 0 (which is why it's defined as unsigned), and if it does, then something is seriously wrong. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-12 16:07:47 +02:00
Connor Abbott	ec046bc08e	ra: note a restriction in the interference graph API As noted in the previous commit, this was introduced in `567e2769b8` ("ra: make the p, q test more efficient"), but I forgot to mention it. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-12 16:07:47 +02:00
Connor Abbott	afd82dcad1	r300g: set register classes before interferences In commit `567e2769b8` ("ra: make the p, q test more efficient") I unknowingly introduced a new requirement to the register allocator API: the user must set the register class of all nodes before setting up their interferences, because ra_add_conflict_list() now uses the classes of the two interfering nodes. i965 already did this, but r300g was setting up register classes interleaved with setting up the interference graph. This led to us calculating the wrong q total, and in certain cases `e78a01d5e6` ("ra: optimistically color only one node at a time") made it so that this bug caused a segfault. In particular, the error occurred if the q total was decremented to 1 below 0 for the last node to be pushed onto the stack. Since q_total is an unsigned integer, it overflowed to 0xffffffff, which is what lowest_q_total happens to be initialized to. This means that we would fail the "new_q_total < lowest_q_total" check on line 476 of register_allocate.c, and so the node would never be pushed onto the stack, which led to segfaults in ra_select() when we failed to ever give it a register. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82828 Cc: "10.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Tested-by: Pavel Ondračka <pavel.ondracka@email.cz> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-09-12 16:07:07 +02:00
Andreas Boll	2a13ff954d	gallium/util: add missing u_debug include Needed for assert. Fixes build on BE archs with -Werror=implicit-function-declaration. In file included from ../../../../../src/gallium/auxiliary/draw/draw_fs.c:30:0: ../../../../../src/gallium/auxiliary/util/u_math.h: In function 'util_memcpy_cpu_to_le32': ../../../../../src/gallium/auxiliary/util/u_math.h:810:4: error: implicit declaration of function 'assert' [-Werror=implicit-function-declaration] assert(n % 4 == 0); ^ Cc: "10.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-12 15:55:12 +02:00
Chia-I Wu	802018df5f	ilo: fix builder size checks for BLT buffer clear/copy In buf_clear_region() and buf_copy_region(), max_cmd_size was set to 0. If either of the functions is called and there is not enough space in the builder, the next ilo_cp_flush() will fail silently in a release build. Replace magic numbers by size defines in tex_clear_region()/tex_copy_region() for consistency and readability.	2014-09-12 16:58:31 +08:00
Chia-I Wu	07e0923203	ilo: reduce BLT function parameters Introduce gen6_blt_bo and gen6_blt_xy_bo to describe BOs. In the extreme case of gen6_XY_SRC_COPY_BLT(), the number of parameters goes down from 18 to 8.	2014-09-12 16:58:30 +08:00
Chia-I Wu	8fa62a9982	ilo: clean up BLT functions Follow the changes for MI functions, but for BLT this time.	2014-09-12 16:58:30 +08:00
Chia-I Wu	a77aaf4363	ilo: clean up MI functions With ilo_builder in place, some conventions we had to build commands are no longer needed.	2014-09-12 16:58:30 +08:00
Chia-I Wu	0c6a9cde94	ilo: move BLT functions to ilo_builder_blt.h Follow the changes for MI functions, but for BLT this time.	2014-09-12 16:58:30 +08:00
Chia-I Wu	50d2d9a69d	ilo: move MI functions to ilo_builder_mi.h Have a centralized place for MI functions, and remove the duplicated gen6_MI_LOAD_REGISTER_IMM().	2014-09-12 16:58:30 +08:00
Chia-I Wu	521887f9fd	ilo: add ILO_DEV_ASSERT() It replaces ILO_GPE_VALID_GEN().	2014-09-12 16:58:30 +08:00
Chia-I Wu	56d2ebb019	ilo: use an accessor for dev->gen It should enable us to do specialized builds by making the accessor return a constant.	2014-09-12 16:58:30 +08:00
Chia-I Wu	ea5de3e0bd	ilo: add GEN_EXTRACT() and GEN_SHIFT32() They replace READ() and SET_FIELD() that we have been using.	2014-09-12 16:58:29 +08:00
Chia-I Wu	e8f4dd70ab	ilo: remove ILO_GEN_GET_MAJOR() The last user has gone away.	2014-09-12 16:58:29 +08:00
Chia-I Wu	611f09890e	ilo: careful with empty fb state in ilo_gpe_set_fb() We cannot pass 0 as the width or height to ilo_gpe_init_view_surface_null().	2014-09-12 16:58:29 +08:00
Ilia Mirkin	95058bdec3	nv50,nvc0: enable ARB_texture_view Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-12 00:57:45 -04:00
Ilia Mirkin	d82bd7eb06	mesa/st: add ARB_texture_view support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-09-12 00:55:26 -04:00
Ilia Mirkin	c113095acd	gallium: add a texture target to sampler view and a CAP to use it This allows a sampler view to have a different texture target than the underlying resource. This will be used to implement the type casting between 2d arrays and cube maps as specified in ARB_texture_view. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-12 00:54:55 -04:00
Ilia Mirkin	3c81de5851	nouveau: only enable stencil func if the visual has stencil bits The _Enabled property already has the relevant information. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-12 00:51:20 -04:00
Ilia Mirkin	79959e5de5	nouveau: only enable the depth test if there actually is a depth buffer Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-12 00:50:56 -04:00
Maarten Lankhorst	8ab85bfcd5	nouveau: remove unneeded assert No idea why it was added, but the code runs fine even on videos where it triggers. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-11 23:18:07 -04:00
Maarten Lankhorst	a41aad8431	nouveau: rework reference frame handling Fixes a regression from "nouveau/vdec: small fixes to h264 handling" New picking order for frames: 1. Vidbuf pointer matches. 2. Take the first kicked ref. 3. If that fails, take a ref that has a different last_used. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-11 23:18:05 -04:00
Maarten Lankhorst	121ceb38f4	nouveau: fix MPEG4 hw decoding Reorder some fields to make I-frame decoding work correctly. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-11 23:18:03 -04:00
Maarten Lankhorst	f6afed7076	nouveau: re-allocate bo's on overflow The BSP bo might be too small to contain all of the bsp data, bump its size on overflow. Also bump inter_bo when this happens, it might be too small otherwise. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-11 23:17:52 -04:00
Chia-I Wu	1187dbdd10	ilo: fix a compile error with -Werror=format-security Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83777	2014-09-12 09:45:42 +08:00
Ian Romanick	7aeb853c90	i965/vec4: Only examine virtual_grf_end for GRF sources If the source is not a GRF, it could have a register >= virtual_grf_count. Accessing virtual_grf_end with such a register would lead to out-of-bounds access. Make sure the source is a GRF before accessing virtual_grf_end. Fixes Valgrind complaints while compiling some shaders. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2014-09-11 11:18:36 -07:00
Brian Paul	a46d7579e9	st/mesa: handle failed context creation for core profile If the glx/wgl state tracker requested a core profile but the gallium driver did not support some feature of GL 3.1 or later, we were setting ctx->Version=0 and then failing the assertion in _mesa_initialize_exec_table(). With this change we check for ctx->Version=0 and tear down the context and return NULL from st_create_context(). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-11 08:22:55 -06:00
Iago Toral Quiroga	f976b4c1bf	i965: Implement GL_PRIMITIVES_GENERATED with non-zero streams. So far we have been using CL_INVOCATION_COUNT to resolve this query but this is no good with streams, as only stream 0 reaches the clipping stage. From ARB_transform_feedback3: "When a generated primitive query for a vertex stream is active, the primitives-generated count is incremented every time a primitive emitted to that stream reaches the Discarding Rasterization stage (see Section 3.x) right before rasterization. This counter is incremented whether or not transform feedback is active." Unfortunately, we don't have any registers that provide the number of primitives written to a specific stream other than the ones that track the number of primitives written to transform feedback in the SOL stage, so we can't implement this exactly as specified. In the past we implemented this feature by activating the SOL unit even if transform feedback was disabled, but making it so that all buffers were disabled and it only recorded statistics, which gave us the right semantics (see `3178d2474a`). Unfortunately, this came with a significant performance impact and had to be reverted. This new take does not intend to implement the exact semantics required by the spec, but improves what we have now, since now we return the primitive count for stream 0 in all cases. With this patch we use GEN7_SO_PRIM_STORAGE_NEEDED to resolve GL_PRIMITIVES_GENERATED queries for non-zero streams. This would return the number of primitives written to transform feedback for each stream instead. Since non-zero streams are only useful in combination with transform feedback this should not be too bad, and the only case that I think we would not be supporting would be the one in which we want to use both GL_PRIMITIVES_GENERATED and GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN on the same non-zero stream to detect buffer overflow. 
This patch also fixes the following piglit test: arb_gpu_shader5-xfb-streams-without-invocations This test uses both GL_PRIMITIVES_GENERATED and GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN queries on non-zero streams, but it never hits the overflow case, so both queries are always expected to return the same value. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.3" <mesa-stable@lists.freedesktop.org>	2014-09-11 15:17:22 +02:00
Christian König	6327b58415	radeon/uvd: use PIPE_USAGE_STAGING for msg&fb buffers That better matches the actual userspace use case, the kernel will force it to VRAM if the hardware requires it. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-11 15:06:09 +02:00
Christian König	4dfdcdb4b3	radeon/video: use the hw to initial clear the buffers Less CPU overhead and avoids contention over CPU accessible memory on startup. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-11 15:06:03 +02:00
Christian König	4bc0059229	radeon/video: use more of the common buffer code v2 In preparation to using buffers clears with the hw engine(s). v2: split out flipping to using hw buffer clears. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-11 15:05:55 +02:00
José Fonseca	771ab951a8	scons: add /dynamicbase and /nxcompat to MinGW linkflags Just like b26503b196d51dc46c815e241343e42ab30e8d66 for MSVC.	2014-09-11 11:59:28 +01:00
Brian Paul	4860e98972	scons: add /dynamicbase and /nxcompat to MSVC linkflags This builds the opengl DLLs with address space layout randomization (ASLR) and data execution prevention (DEP) for better security. Reviewed-by: Kurt Daverman <krd@vmware.com>	2014-09-11 11:59:28 +01:00
Chia-I Wu	6816d853db	ilo: add a new disassembler The old disassembler was modified from i965's. It is as much work as doing a new one to keep it up-to-date, which also requires copying more headers over. The outputs of this new disassembler should match i965's as closely as possible.	2014-09-11 16:29:38 +08:00
Chia-I Wu	b51b349942	ilo: update genhw headers Add some new registers and some tweaks. The changes that affect ilo are GEN6_REG_HS_INVOCATION_COUNT -> GEN7_REG_HS_INVOCATION_COUNT GEN6_REG_DS_INVOCATION_COUNT -> GEN7_REG_DS_INVOCATION_COUNT GEN6_COND_NORMAL -> GEN6_COND_NONE	2014-09-11 16:29:38 +08:00
Frank Henigman	9c707d065a	glsl: allow precision qualifier on sampler arrays If a precision qualifier is allowed on type T, it should be allowed on an array of T. Refactor the check to ensure this is the case. (Fixes failures in WebGL conformance test 'gl-min-textures') Signed-off-by: Frank Henigman <fjhenigman@google.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-11 10:41:00 +03:00
Tapani Pälli	096ee4c3b0	glsl: mark variable as loop constant when it is set read only Patch modifies is_loop_constant() to take advantage of 'read_only' bit in ir_variable to detect a loop constant. Variables marked read-only are loop constant like mentioned by a comment in the function. v2: remove unnecessary comment (Francisco) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82537 Tested-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-09-11 10:09:12 +03:00
Michel Dänzer	82edcb918b	radeonsi: Simplify si_dma_copy_tile function No functional change intended. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-11 12:36:03 +09:00
Brian Paul	5cf8d9f54b	u_vbuf: simple whitespace fix	2014-09-10 16:37:54 -06:00
Brian Paul	9608193cbc	mesa: fix UNCLAMPED_FLOAT_TO_UBYTE() macro for MSVC MSVC replaces the "F" in "255.0F" with the macro argument which leads to an error. s/F/FLT/ to avoid that. It turns out we weren't using this macro at all on MSVC until the recent "mesa: Drop USE_IEEE define." change. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-09-10 16:37:54 -06:00
Brian Paul	56d8cfd7a5	mesa: trim down some #includes	2014-09-10 13:16:00 -06:00
Vinson Lee	cc20c45a36	pipe-loader: Include unistd.h in pipe_loader_drm.c for close function. This patch fixes a build error on DragonFly. CC libpipe_loader_la-pipe_loader_drm.lo pipe_loader_drm.c: In function 'pipe_loader_drm_probe': pipe_loader_drm.c:207:10: error: implicit declaration of function 'close' [-Werror=implicit-function-declaration] Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-09-10 11:59:38 -07:00
Kenneth Graunke	0bac2551e4	i965: Disable guardband clipping in the smaller-than-viewport case. Apparently guardband clipping doesn't work like we thought: objects entirely outside the guardband are trivially rejected, regardless of their relation to the viewport. Normally, the guardband is larger than the viewport, so this is not a problem. However, when the viewport is larger than the guardband, this means that we would discard primitives which were wholly outside of the guardband, but still visible. We always program the guardband to 8K x 8K to enforce the restriction that the screenspace bounding box of a single triangle must be no more than 8K x 8K. So, if the viewport is larger than that, we need to disable guardband clipping. Fixes ES3 conformance tests: - framebuffer_blit_functionality_negative_height_blit - framebuffer_blit_functionality_negative_width_blit - framebuffer_blit_functionality_negative_dimensions_blit - framebuffer_blit_functionality_magnifying_blit - framebuffer_blit_functionality_multisampled_to_singlesampled_blit v2: Mention the acronym expansion for TA/TR/MC in the comments. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-09-10 11:40:30 -07:00
Ian Romanick	927f5db461	i965: Request lowering gl_VertexID Fixes the (new) piglit tests gles-3.0-drawarrays-vertexid, gl-3.0-multidrawarrays-vertexid, and gl-3.2-basevertex-vertexid. Fixes gles3conform failure in: ES3-CTS.gtf.GL3Tests.transform_feedback.transform_feedback_vertex_id Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80247 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-10 11:05:09 -07:00
Kenneth Graunke	fbb353bc13	i965: Expose gl_BaseVertex via a vertex attribute. Now that we have the data available, we need to expose it to the shaders. We can reuse the same vertex element that we use for gl_VertexID, but we need to back it by an actual vertex buffer. A hardware restriction requires that vertex attributes coming from a buffer (STORE_SRC) must come before any other types (i.e. STORE_0). So, we have to make gl_BaseVertex be the .x component of the vertex attribute. This means moving gl_VertexID to a different component. I chose to move gl_VertexID and gl_InstanceID to the .z and .w components, respectively, to make room for gl_BaseInstance in the .y component (which would also come from a buffer, and therefore be STORE_SRC). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-10 11:05:08 -07:00
Kenneth Graunke	87b10c4a71	i965: Refactor Gen4-7 VERTEX_BUFFER_STATE emission into a helper. We'll need to emit another VERTEX_BUFFER_STATE for gl_BaseVertex; pulling this into a helper function will save us from having to deal with cross-generation differences in that code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-10 11:05:08 -07:00
Kenneth Graunke	fdbabf22e1	i965: Make gl_BaseVertex available in a buffer object. This will be used for GL_ARB_shader_draw_parameters, as well as fixing gl_VertexID, which is supposed to include gl_BaseVertex's value. For indirect draws, we simply point at the indirect buffer; for normal draws, we upload the value via the upload buffer. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-10 11:05:08 -07:00
Kenneth Graunke	c89306983c	i965: Calculate start/base_vertex_location after preparing vertices. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-10 11:05:08 -07:00
Ian Romanick	9975792abd	i965: Handle SYSTEM_VALUE_VERTEX_ID_ZERO_BASE Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-10 11:05:08 -07:00
Kenneth Graunke	26e949b26e	mesa: Fix glGetActiveAttribute for gl_VertexID when lowered. The lower_vertex_id pass converts uses of the gl_VertexID system value to the gl_BaseVertex and gl_VertexIDMESA system values. Since gl_VertexID is no longer accessed, it would not be considered active. Of course, it should be, since the shader uses gl_VertexID. v2: Move the var->name dereference past the var != NULL check. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-10 11:05:08 -07:00
Kenneth Graunke	26c9514155	mesa: Replace string comparisons with SYSTEM_VALUE enum checks. This is more efficient. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-09-10 11:05:08 -07:00
Ian Romanick	ec08b5e768	glsl: Add a lowering pass for gl_VertexID Converts gl_VertexID to (gl_VertexIDMESA + gl_BaseVertex). gl_VertexIDMESA is backed by SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, and gl_BaseVertex is backed by SYSTEM_VALUE_BASE_VERTEX. v2: Put the enum in struct gl_constants and properly resolve the scope in C++ code. Fix suggested by Marek. v3: Rebase on Matt's foreach_in_list changes (was using foreach_list). v4 (Ken): Use a systemvalue instead of a uniform because STATE_BASE_VERTEX has been removed. v5: Use a boolean to select lowering, and only allow one lowering method. Suggested by Ken. v6 (Ken): Replace strcmp against literal "gl_BaseVertex"/"gl_VertexID" with SYSTEM_VALUE enum checks, for efficiency. v7: Rebase on context constant initialization work. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-10 11:05:08 -07:00
Ian Romanick	04d3323d4b	glsl/linker: Make get_main_function_signature public The next patch will use this function in a different file. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-10 11:05:05 -07:00
Ian Romanick	1e87fbd78f	mesa: Add SYSTEM_VALUE_BASE_VERTEX This system value represents the basevertex value passed to glDrawElementsBaseVertex and related functions. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-10 11:04:50 -07:00
Ian Romanick	5964a4f344	mesa: Add SYSTEM_VALUE_VERTEX_ID_ZERO_BASE There exists hardware, such as i965, that does not implement the OpenGL semantic for gl_VertexID. Instead, that hardware does not include the value of basevertex in the gl_VertexID value. SYSTEM_VALUE_VERTEX_ID_ZERO_BASE is the system value that represents this semantic. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-10 11:04:48 -07:00
Ian Romanick	9afb5ae8ca	mesa: Document SYSTEM_VALUE_VERTEX_ID and SYSTEM_VALUE_INSTANCE_ID v2: Additions to the documentation for SYSTEM_VALUE_VERTEX_ID. Quote the GL_ARB_shader_draw_parameters spec and mention DirectX SV_VertexID. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-09-10 11:04:44 -07:00
Jonathan Gray	cdb353539c	configure.ac: unbreak the build with non gnu grep `181581280b` changed the way the llvm-config version is read from sed to grep and introduced a requirement for gnu grep extension that treats BREs as EREs. Avoid this by calling egrep instead of grep which should be able to handle EREs everywhere. This allows Mesa to build on OpenBSD again. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Jonathan Gray <jsg@jsg.id.au>	2014-09-10 08:35:11 -07:00
Eric Anholt	d64ca0a765	vc4: Add support for shadow samplers. This doesn't quite make depth-tex-compare work, presumably because we're not hitting equality with itof(sample) * 1.0/0xffffff in the 0xffffff case. arb_fragment_program_shadow tests pass, though, as well as a bunch of other shadow-related stuff.	2014-09-09 20:41:43 -07:00
Eric Anholt	7d5c57f8e9	vc4: Add support for texture swizzles. Fixes depth-tex-modes.	2014-09-09 20:39:29 -07:00
Eric Anholt	1e77c93340	vc4: Move the texture format into a struct. I'm going to be putting some bitfields into the struct as well.	2014-09-09 20:38:39 -07:00
Eric Anholt	e7a6c54473	vc4: Add support for depth texturing.	2014-09-09 20:38:39 -07:00
Eric Anholt	d952a98c53	vc4: Expose r4 to register allocation. We potentially need to be careful that use of a value stored in r4 isn't copy-propagated (or something) across another r4 write. That doesn't appear to happen currently, and this makes the dataflow more obvious. It also opens up not unpacking the r4 value, which will be useful for depth textures.	2014-09-09 20:38:39 -07:00
Eric Anholt	be1fcd2cd3	vc4: Drop pointless raddr conflict handling on SF. SF doesn't have a src[1].	2014-09-09 20:38:39 -07:00
Eric Anholt	04faeff28a	vc4: The r4_count is supposed to be how many writes, not reads. It's part of the key so that you can tell which r4 value is being read.	2014-09-09 20:38:38 -07:00
Michel Dänzer	5679ccfcaf	r600g,radeonsi: Set RADEON_GEM_NO_CPU_ACCESS flag for tiled BOs This lets the kernel know that such BOs can be pinned outside of the CPU accessible part of VRAM. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-10 12:01:10 +09:00
Rob Clark	720cfb6fe9	freedreno/a3xx: enable hw primitive-restart Since software primitive-restart emulation is going to be removed (and anyways, mostly seemed to be crash prone in combination with u_primconvert and oddball scenarios (like PIPE_PRIM_POLYGON with only a single vertex)), might as well do it in hardware (which fortunately didn't turn out to be too hard to figure out). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-09 19:42:18 -04:00
Rob Clark	564183f39c	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-09 19:42:18 -04:00
Rob Clark	a2c22d80d4	freedreno/ir3: fix potential segfault in RA Triggered by shaders like: FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL OUT[0], COLOR DCL CONST[0] DCL TEMP[0..2], LOCAL 0: IF CONST[0].xxxx :0 1: MOV TEMP[0], TEMP[1] 2: ELSE :0 3: MOV TEMP[0], TEMP[2] 4: ENDIF 5: MOV OUT[0], TEMP[0] 6: END not really a sane shader, although driver segfaulting is probably not the appropriate response. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-09 19:42:18 -04:00
Rob Clark	4f338c9bbf	freedreno: don't overflow cmdstream buffer so much We currently aren't too clever about dealing with running out of cmdstream buffer space. Since we use a single buffer for both drawing and tiling commands, we need to ensure there is enough space at the tail of the cmdstream buffer to fit the tiling commands. Until we get more clever, the easy solution is a threshold to trigger flushing rendering even if the application does not trigger flush (swap, changing render target, etc). This way we at least don't crash for apps that do several thousand draw calls (like some piglit tests do). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-09 19:42:18 -04:00
Rob Clark	fd4884e929	freedreno/ir3: add no-copy-propagate fallback step Most of the things the new compiler still has trouble with basically amount to cp stage removing too many copies. But without the cp stage, the shaders the new compiler produces are still better (perf and correctness) than the old compiler. So a simple thing to do until I have more time to work on it is to first try falling back to the new compiler without cp, before finally falling back to the old compiler. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-09 19:42:18 -04:00
Emil Velikov	e387fdd235	ilo: add ilo_builder.h to the sources list Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-09 22:17:39 +01:00
Kenneth Graunke	e36bbff0e6	ir_to_mesa: Stop converting uniform booleans. Excess conversions considered harmful. Recently Matt reworked the boolean uniform handling to use the value of UniformBooleanTrue, rather than integer 1, when uploading uniforms: mesa: Upload boolean uniforms using UniformBooleanTrue. glsl: Use UniformBooleanTrue value for uniform initializers. Marek then set the default to 1.0f for drivers without native integer support: mesa: set UniformBooleanTrue = 1.0f by default However, ir_to_mesa was assuming a value of integer 1, and arranging for it to be converted to 1.0f on upload. Since Marek's commit, we were uploading 1.0f = 0x3f800000 which was being interpreted as the integer value 1065353216 and converted to float as 1.06535322E9, which broke assumptions in ir_to_mesa that "true" was exactly 1.0f. +13 Piglits on classic swrast (fs-bool-less-compare-true, {vs,fs}-op-not-bool-using-if, glsl-1.20/execution/uniform-initializer). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83573 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-09 13:19:44 -07:00
Jonathan Gray	c68073e65f	configure.ac: strip _GNU_SOURCE from llvm-config output Mesa already defines _GNU_SOURCE for glibc based systems and defining _GNU_SOURCE will break the Mesa build on other systems such as OpenBSD. _GNU_SOURCE only seems to be included in llvm-config output when LLVM is built via autoconf and not when it is built by cmake. Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jonathan Gray <jsg@jsg.id.au>	2014-09-09 20:04:45 +01:00
Stefan Dirsch	49022a9713	xmlconfig: suppress libGL warnings when LIBGL_DEBUG == "quiet" Let's handle LIBGL_DEBUG env. variable in Mesa in a consistent way. Fixes: https://bugzilla.novell.com/show_bug.cgi?id=895730 Signed-off-by: Stefan Dirsch <sndirsch@suse.de> Reviewed-by: Courtney Goeltzenleuchter <courtney@lunarg.com>	2014-09-09 19:46:57 +01:00
Emil Velikov	3d8b53ffb4	automake: remove obsolete NEED_GALLIUM_LOADER Superseded by HAVE_LOADER_GALLIUM. The latter has a DRM brethren, making the choice of which one to keep easier. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-09 19:45:24 +01:00
Emil Velikov	44ec468e80	configure: enable the gallium loader only when needed With the gallium megadrivers we've converted most ST to optionally use either statically linked in or shared pipe-drivers. The hardcoded switch forgot to conditionally enable the build of the shared pipe-drivers which resulted in them being constantly built. Cc: "10.3" <mesa-stable@lists.freedesktop.org> Cc: James Ausmus <james.ausmus@intel.com> Reported-by: James Ausmus <james.ausmus@intel.com> Tested-by: James Ausmus <james.ausmus@intel.com> Bugzilla: https://code.google.com/p/chromium/issues/detail?id=412089 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-09 19:45:10 +01:00
Emil Velikov	6dcd5ae725	configure: inform the user when we're building sw/kms-dri Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-09 19:39:37 +01:00
Emil Velikov	2903289706	configure: kill off NEED_WINSYS_WRAPPER Just drop the conditional and simplify our build. This means that it'll build every time, but it does not require any dependencies nor does it take that long to compile 200 lines of boilerplate code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-09 19:39:37 +01:00
Emil Velikov	0d0313ce9b	configure: kill off NEED_NONNULL_WINSYS The variable was unused and gave false information. The need for nonnull winsys currently does not relate as it used to. Nowadays one can mix and match more freely with plenty of winsys' to make your head spin. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-09 19:39:36 +01:00
Emil Velikov	40bb6f9313	configure: bail out if building svga without libdrm With recent commit we removed the NEED_NONNULL_WINSYS checks when selecting the hardware (inc svga) winsys. svga has only one winsys that explicitly requires libdrm (via its bundled version of vmwgfx_drm.h) but configure.ac never really checks for it. Add the check early to prevent people from shooting themselves when they select the driver but lack libdrm. $ ./autogen.sh --disable-dri --disable-egl --disable-gallium-llvm --with-dri-drivers=swrast --with-gallium-drivers=svga,swrast Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82539 Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-09 19:39:36 +01:00
Eric Anholt	2220692330	vc4: Fix segfaults when rendering with no color render target.	2014-09-09 07:29:16 -07:00
Eric Anholt	5774f16453	vc4: Fill out the stencil clear field. The rest of stencil handling isn't done yet, but it documents an extra cl_u8(0) and helps make it obvious why we don't need to format clear_depth the same way the depth/stencil buffer is formatted.	2014-09-09 07:29:16 -07:00
Eric Anholt	fd6e4fccad	vc4: Flip around the depth/stencil fields. After implementing depth stores, it looks like this is the way things actually are, according to hiz-depth-read-fbo-d24-s0's probes.	2014-09-09 07:29:16 -07:00
Eric Anholt	2cbecee4b7	vc4: Add support for loading/storing the depth buffer. For now it still requires the color buffer to be present -- we're relying on the store of color buffer contents to end the frame, and we have to do something with color buffers in the rendering config packet.	2014-09-09 07:29:16 -07:00
Eric Anholt	1663a89374	vc4: Don't forget to do initial tile clearing for depth/stencil.	2014-09-09 07:29:16 -07:00
Eric Anholt	2cbdbeb4fa	vc4: Ignore non-address bits of the offset for load/store. These only get used for full buffer dumps, which we don't support yet anyway.	2014-09-09 07:29:16 -07:00
Eric Anholt	a894898255	vc4: Add a debug flag for flushing after every draw. It was useful on i965, but it's even more useful for debugging tiled renderers.	2014-09-09 07:29:12 -07:00
Eric Anholt	840f381120	vc4: Add missing null terminator to the debug options list. So far, apparently there's been some NULL laying at the address just after the options anyway, but the next commit changed that.	2014-09-09 07:28:12 -07:00
Tom Stellard	181581280b	configure.ac: Fix build with git-svn llvm version string Reviewed-and-tested-by: Jan Vesely <jan.vesely@rutgers.edu>	2014-09-09 09:47:25 -04:00
Kalyan Kondapally	78c9201a5b	Linking fails when not writing gl_Position. According to GLSL-ES Spec(i.e. 1.0, 3.0), gl_Position value is undefined after the vertex processing stage if we don't write gl_Position. However, GLSL 1.10 Spec mentions that writing to gl_Position is mandatory. In case of GLSL-ES, it's not an error and at least the linking should pass. Currently, Mesa throws a linker error in case we don't write to gl_Position and Version is less than 140(GLSL) and 300(GLSL-ES). This patch changes it so that we don't report an error in case of GLSL-ES. Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83380	2014-09-09 10:39:39 +03:00
Chia-I Wu	2a49a94079	ilo: remove unused ilo_cp functions Remove ilo_cp_begin() ilo_cp_steal() ilo_cp_write() ilo_cp_write_multi() ilo_cp_write_bo() ilo_cp_end() ilo_cp_steal_ptr() ilo_cp_assert_no_implicit_flush()	2014-09-09 13:31:37 +08:00
Chia-I Wu	90f4b131fc	ilo: convert GPE GEN6 command functions to use ilo_builder Similar to the changes to GEN7 command functions, but to GEN6 this time. As every GPE function has been converted, remove ilo_cp_assert_no_implicit_flush() calls.	2014-09-09 13:31:37 +08:00
Chia-I Wu	80e29ae42c	ilo: convert GPE GEN7 command functions to use ilo_builder Make these changes ilo_cp_begin() -> ilo_builder_batch_pointer() ilo_cp_write() -> direct memory set ilo_cp_write_bo() -> ilo_builder_batch_reloc() and use this chance to drop the "_emit_" infix.	2014-09-09 13:31:37 +08:00
Chia-I Wu	fff9869164	ilo: convert GPE state functions to use ilo_builder Make these changes ilo_cp_steal_ptr() and memcpy() -> ilo_builder_state_write() ilo_cp_steal_ptr() -> ilo_builder_state_pointer() and use this chance to drop the "_emit_" infix.	2014-09-09 13:31:37 +08:00
Chia-I Wu	c81a973e04	ilo: convert GPE surface functions to use ilo_builder Make these changes ilo_cp_steal_ptr() and memcpy() -> ilo_builder_surface_write() ilo_cp_steal() and ilo_cp_write() -> ilo_builder_surface_write() ilo_cp_write_bo() -> ilo_builder_surface_reloc() and use this chance to drop the "_emit_" infix.	2014-09-09 13:31:37 +08:00
Chia-I Wu	6cbd1f4bd3	ilo: convert BLT to use ilo_builder Make these changes ilo_cp_begin() -> ilo_builder_batch_pointer() ilo_cp_write() -> direct memory set ilo_cp_write_bo() -> ilo_builder_batch_reloc() and make sure there is no implicit flush. Use this chance to drop the "_emit_" infix.	2014-09-09 13:31:37 +08:00
Chia-I Wu	d2acd67313	ilo: use ilo_builder for kernels and STATE_BASE_ADDRESS Remove instruction buffer management from ilo_3d and adapt ilo_shader_cache to upload kernels to ilo_builder. To be able to do that, we also let ilo_builder manage STATE_BASE_ADDRESS.	2014-09-09 13:31:37 +08:00
Chia-I Wu	55f80a3290	ilo: make ilo_cp based on ilo_builder This makes ilo_cp use the builder to manage batch buffers, and use ilo_builder_decode() to replace ilo_3d_pipeline_dump().	2014-09-09 13:31:36 +08:00
Chia-I Wu	dab4a676f7	ilo: add a builder for building BOs for submission Compared to how we manage batch and instruction buffers, the new builder - does not flush - manages both types of buffers - manages STATE_BASE_ADDRESS - uploads kernels using unsynchronized mapping - has its own decoder for the buffers - provides more helpers	2014-09-09 13:31:36 +08:00
Chia-I Wu	43bf14eaeb	ilo: make toy_compiler_disassemble() more useful Do not require a toy_compiler so that it can be used in other places, such as state dumping. Add a bool to control whether the raw instruction words are shown.	2014-09-09 13:31:30 +08:00
Ilia Mirkin	4ea1565bbc	nv50/ir: accommodate all file types, there are now more than 8 Reported by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-08 20:06:12 -04:00
Ilia Mirkin	5966903c28	nvc0/ir: uses was always null at that point in the code Reported by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-08 20:06:12 -04:00
Ilia Mirkin	874a9396c5	nv50/ir: avoid array overrun when checking for supported mods Reported by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-08 20:06:12 -04:00
Ilia Mirkin	64c5aeaa94	nouveau: buffer can never be null Reported by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-08 20:06:11 -04:00
Ilia Mirkin	1792d60900	nvc0/ir: insn can never be null Reported by Coverity. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-08 20:06:11 -04:00
Ilia Mirkin	9ced42b1aa	nvc0: size is a uint16_t, remove unnecessary assertion Reported by Coverity. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-08 20:06:11 -04:00
Ilia Mirkin	564e305094	nvc0: avoid null deref of screen when collecting stats Reported by Coverity Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-08 20:06:11 -04:00
Ilia Mirkin	c02ac40837	nvc0: use 64-bit math when scaling the query results Reported by Coverity. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-08 20:06:11 -04:00
Roland Scheidegger	08f13ff439	gallivm: (trivial) don't try to use rcp when the division 1/x is integer This would just crash. Noticed by accident while checking int divisions by zero with a quickly hacked piglit test. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-09-09 01:44:08 +02:00
Roland Scheidegger	51b52ea013	docs: (trivial) mark softpipe, llvmpipe as done for GL_ARB_base_instance Forgot to add it when I fixed up the start instance handling in (llvm) draw. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-09-09 01:44:07 +02:00
Roland Scheidegger	9405e15f51	gallivm: (trivial) fix min / max variable names Calling the variable min when it's really max and vice versa seems a bit confusing. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-09-09 01:44:05 +02:00
Kenneth Graunke	a20cc2796f	i965: Handle ir_binop_ubo_load in boolean expression code. UBO loads can be boolean-valued expressions, too, so we need to handle them in emit_bool_to_cond_code() and emit_if_gen6(). However, unlike most expressions, it doesn't make sense to evaluate their operands, then do something with the results. We just want to evaluate the UBO load as a whole---which performs the read from memory---then load the boolean result into the flag register. Instead of adding code to handle it, we can simply bypass the ir_expression handling, and fall through to the default code, which will do exactly that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83468 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2014-09-08 15:43:52 -07:00
Kenneth Graunke	b9699e09bc	i965/fs: Make emit_if_gen6 never fall back to emit_bool_to_cond_code. Matt and I believe that Sandybridge actually uses 0xFFFFFFFF for a "true" comparison result, similar to Ivybridge. This matches the internal documentation, and empirical results, but contradicts the PRM. So, the comment is inaccurate, and we can actually just handle these directly without ever needing to fall through to the condition code path. Also, the vec4 backend has always done it this way, and has apparently been working fine. This patch makes the FS backend match the vec4 backend's behavior. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-08 15:43:51 -07:00
Kenneth Graunke	6272e60ca3	i965: Handle ir_triop_csel in emit_if_gen6(). ir_triop_csel can return a boolean expression, so we need to handle it here; we simply forgot when we added ir_triop_csel, and forgot again when adding it to emit_bool_to_cond_code. Fixes Piglit's EXT_shader_integer_mix/{vs,fs}-mix-if-bool on Sandybridge. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2014-09-08 15:43:49 -07:00
Christian König	12fb74fe89	mesa/st: don't advertise NV_vdpau_interop if it doesn't work. As long as we don't have a workaround for frame based decoding in VDPAU we should not advertise NV_vdpau_interop. v2: fix commit message, check if get_video_param is present Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2014-09-08 16:53:39 +02:00
Brian Paul	a3306f028e	docs: add news link to 10.2.7 release notes	2014-09-08 08:08:46 -06:00
Jordan Justen	dc0bd799ca	i965/fs: Remove direct fs_visitor gl_fragment_program dependence Instead we cast backend_visitor::prog for fragment shader specific code paths. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-06 11:17:53 -07:00
Ulrich Weigand	0feb977bbf	gallivm: Fix Altivec pack intrinsics for little-endian This patch fixes use of Altivec pack intrinsics on little-endian PowerPC systems. Since little-endian operation only affects the load and store instructions, the semantics of pack (and other) instructions that take two input vectors implicitly change: the pack instructions still fill a register placing values from the first operand into the "high" parts of the register, and values from the second operand into the "low" parts of the register, but since vector loads and stores perform an endian swap, the high parts end up at high memory addresses. To still achieve the desired effect, we have to swap the two inputs to the pack instruction on little-endian systems. This is done automatically by the back-end for instructions generated by LLVM, but needs to be done manually when emitting intrinsics (which still result in that instruction being emitted directly). Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com> Signed-off-by: Maarten Lankhorst <dev@mblankhorst.nl>	2014-09-06 15:51:58 +02:00
Jordan Justen	1f184bc114	i965/fs: Remove direct fs_generator brw_wm_prog_key dependence Instead we store a void pointer to the key, and cast it to brw_wm_prog_key for fragment shader specific code paths. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-05 22:15:06 -07:00
Jordan Justen	c43ae405aa	i965/fs: Remove direct fs_generator brw_wm_prog_data dependence Instead we store a brw_stage_prog_data pointer, and cast it to brw_wm_prog_data for fragment shader specific code paths. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-05 22:15:06 -07:00
Jordan Justen	f96a02c7ca	i965/fs: Don't store gl_fragment_program* in fs_generator gl_program* is named prog similar to backend_visitor. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-05 22:15:06 -07:00
Jordan Justen	936ca6f3cf	i965: Add uses_kill to brw_wm_prog_data Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-05 22:15:06 -07:00
Jordan Justen	d0e166752a	i965/fs: Rename fs_generator::prog to shader_prog This matches backend_visitor, and will allow gl_program to be named prog. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-05 22:15:06 -07:00
Jordan Justen	000a9ee1ba	i965/fs: Add stage variable to fs_generator This will allow for stage specific code paths. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-05 22:15:06 -07:00
Kristian Høgsberg	2d6d3461d3	i965: Adjust fast-clear resolve rect for BDW The scale factors for the resolve rectangle change for BDW and we have to look at brw->gen now to figure out how big it should be. Fixes: https://bugs.freedesktop.org/attachment.cgi?id=105777 Cc: "10.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-05 20:47:03 -07:00
Christoph Bumiller	ca9ab05d45	nvc0/ir: clarify recursion fix to finding first tex uses This is a simple shader for reproducing the case mentioned: FRAG DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0] DCL TEMP[0..1], LOCAL IMM[0] FLT32 { 0.0000, -1.0000, 1.0000, 0.0000} 0: MOV TEMP[0].x, CONST[0].wwww 1: MOV TEMP[1].x, CONST[0].wwww 2: BGNLOOP 3: IF TEMP[0].xxxx 4: BRK 5: ENDIF 6: ADD TEMP[0].x, TEMP[0], IMM[0].zzzz 7: IF CONST[0].xxxx 8: TEX TEMP[1].x, CONST[0], SAMP[0], 2D 9: ENDIF 10: IF CONST[0].zzzz 11: MOV TEMP[1].x, CONST[0].zzzz 12: ENDIF 13: ENDLOOP 14: MOV OUT[0], TEMP[1].xxxx 15: END Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-05 23:08:24 -04:00
Christoph Bumiller	b9f9e3ce03	nv50/ir/util: fix BitSet issues BitSet::allocate() is being used with the expectation that it would leave the bitfield untouched if its size hasn't changed, however, the function always zeroed the last word, which led to obscure bugs with live set computation. This also fixes BitSet::resize(), which was broken, but luckily not being used. Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-05 23:05:42 -04:00
Ilia Mirkin	a71380040c	nvc0: remove nvc0_push, replaced with nvc0_vbo_translate Fixes build. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-05 23:00:27 -04:00
Ilia Mirkin	12311c7c52	nv50,nvc0: get rid of draw module support This hasn't been enabled in a long time and is completely stale and unnecessary. Remove, esp since it doesn't build. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-05 23:00:27 -04:00
Jason Ekstrand	ecf6c26757	i965/fs: Don't look at virtual_grf_sizes for uniforms Uniform values are in the UNIFORM register file, not the GRF register file. Looking in virtual_grf_sizes makes no sense and only makes the output of dump_instructions confusing. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-05 17:33:17 -07:00
Dave Airlie	291ae622fd	loader: fds can be 0 Possible resource leak reported by coverity. Reported-by: Coverity scanner. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-06 10:24:25 +10:00
Emil Velikov	196e949cf7	docs: Import 10.2.7 release notes, add news item. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-06 01:18:45 +01:00
Emil Velikov	2c69c9fdcb	gallium/vc4: ship all files in the tarball - include all headers in Makefile.sources Cc: Eric Anholt <eric@anholt.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:27 +01:00
Emil Velikov	ec9d8060e4	gallium/trace: ship all files in the tarball - include all headers in Makefile.sources - bundle the scons buildscript, README and trace.xsl Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:27 +01:00
Emil Velikov	7134043837	gallium/svga: ship all files in the tarball - include all headers in Makefile.sources - sort the list(s) - bundle the android & scons buildscript - include the headers' README & svga_dump.py Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:27 +01:00
Emil Velikov	f7008a6c5e	gallium/softpipe: ship all files in the tarball - include all headers in Makefile.sources - sort the list(s) - bundle the android & scons buildscript Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:27 +01:00
Emil Velikov	858d932d6a	gallium/rbug: ship all files in the tarball - include all headers in Makefile.sources - sort the list(s) - bundle the android buildscript & README Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:26 +01:00
Emil Velikov	36b5012a8d	gallium/radeonsi: ship all files in the tarball - include all headers in Makefile.sources - bundle the android buildscript Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:26 +01:00
Emil Velikov	8b48e14a48	gallium/radeon: ship all files in the tarball - include all headers in Makefile.sources - sort the list(s) - bundle the android buildscript & LLVM note Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:26 +01:00
Emil Velikov	27d4f2eae3	gallium/r600: ship all files in the tarball - include all headers in Makefile.sources - sort the list(s) - bundle the android buildscript & custom include Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:26 +01:00
Emil Velikov	cdd3a34096	gallium/r300: ship all files in the tarball - include all headers in Makefile.sources - sort the list(s) - bundle the android buildscript & the tests Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:26 +01:00
Emil Velikov	2ba31a5185	gallium/nouveau: ship all files in the tarball - include all headers in Makefile.sources - sort the list(s) - bundle the android buildscript v2: Don't double-include the compiler sources. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:26 +01:00
Emil Velikov	0cba104921	gallium/noop: ship all files in the tarball - include all headers in Makefile.sources - bundle the scons buildscript Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:26 +01:00
Emil Velikov	48d251cebb	gallium/llvmpipe: ship all files in the tarball - include all headers in Makefile.sources - sort the list(s) - bundle the scons buildscript v2: Don't double include the test sources. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:25 +01:00
Emil Velikov	a408b75849	gallium/identity: ship all files in the tarball - include all headers in Makefile.sources - sort the list(s) - bundle the scons buildscript Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:25 +01:00
Emil Velikov	930afeaa54	gallium/ilo: ship all files in the tarball - include all headers in Makefile.sources - bundle the android buildscript Cc: Chia-I Wu <olvaffe@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:25 +01:00
Emil Velikov	38719795a6	gallium/i915: ship all files in the tarball - include all headers in Makefile.sources - sort the list(s) - bundle the android buildscript & TODO Cc: Stephane Marchesin <stephane.marchesin@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:25 +01:00
Emil Velikov	8928788d58	gallium/galahad: ship all files in the tarball - include all headers in Makefile.sources - sort the list(s) - bundle the scons buildscript Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:25 +01:00
Emil Velikov	0ea9569d8f	gallium/freedreno: ship all files in the tarball - include all headers in Makefile.sources - sort the list(s) - bundle the android build Cc: freedreno@lists.freedesktop.org Cc: Rob Clark <robclark@freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:25 +01:00
Emil Velikov	525c48a316	gallium/tools: pick up the tools for distribution Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:25 +01:00
Emil Velikov	c6948da666	gallium/tests: ship all the tests in the release tarball Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:24 +01:00
Emil Velikov	13a5adc1b7	st/vega: ship the final headers Commit 60d772cd9d1(st/vega: add headers and SConscript in the tarball) meant to pick all the headers to be included in the release tarball yet it missed a few. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:24 +01:00
Emil Velikov	cd2e62a2f3	st/egl: include the remaining files in the tarball A few files were missing, namely: - a few of the common headers - the android + gdi sources Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:24 +01:00
Emil Velikov	96fb492583	st/glx/xlib: ship the SConscript in the release tarball Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:24 +01:00
Emil Velikov	fc69d1141b	st/dri: ship the scons buildscript in the release tarball Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:24 +01:00
Emil Velikov	3d3d9c3617	st/clover: ship Doxyfile in the release tarball Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:24 +01:00
Emil Velikov	cf0c4d6d63	gallium: ship state-tracker/README in the release tarball Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:23 +01:00
Emil Velikov	c553b6e2df	gallium: ship the non-automaked state-trackers & targets Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:23 +01:00
Emil Velikov	0fd45d3079	winsys/intel: drop intel_winsys.h from makefile.sources With the last revisions of commit 664c2d76947(gallium/ilo: cleanup intel_winsys.h) we moved the header from winsys to drivers, but we forgot to update the makefile.sources to reflect this. Cc: Chia-I Wu <olvaffe@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-09-05 23:46:23 +01:00
Anuj Phogat	d09167a39f	meta: Store precompiled msaa shaders for all supported sample counts Currently, BLIT_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE* and BLIT_MSAA_SHADER_2D_MULTISAMPLE_ARRAY_RESOLVE* shaders in setup_glsl_msaa_blit_shader() are not recompiled when the source buffer sample count changes. For example, the implementation continued using a 4X msaa shader even if the source buffer changed from 4X msaa to 8X msaa. This causes incorrect rendering. This patch adds new enums in blit_msaa_shader, one for each supported sample count, and uses them to store msaa shaders. Fixes following piglit tests on Broadwell: ext_framebuffer_multisample-accuracy all_samples color ext_framebuffer_multisample-accuracy all_samples depth_draw ext_framebuffer_multisample-accuracy all_samples depth_resolve ext_framebuffer_multisample-accuracy all_samples stencil_draw ext_framebuffer_multisample-accuracy all_samples stencil_resolve ext_framebuffer_multisample-formats all_samples Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2014-09-05 15:40:37 -07:00
Emil Velikov	0b76c51728	configure: check for core xcb and link the VL targets against it Make sure to check the presence of the module in order to pick the correct libs flag and before feeding them to the compiler/linker. Currently libXvMC, libvdpau and libomx_mesa depend unconditionally upon xcb, due to their usage of the aux/vl gallium module. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-09-05 23:18:00 +01:00
Emil Velikov	17798bfb47	configure: check for core xcb and link libEGL against it Make sure to check the presence of the module in order to pick the correct libs flag and before feeding them to the compiler/linker. Currently libEGL depends conditionally (when building with x11 platform) upon xcb. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-05 23:17:59 +01:00
Emil Velikov	da029f8081	configure: check for core xcb and link libGL against it Make sure to check the presence of the module in order to pick the correct libs flag and before feeding them to the compiler/linker. Currently libGL depends conditionally (when building with dri3) upon xcb 1.9.3 and unconditionally on ancient xcb functions - xcb_generate_id and xcb_request_check amongst others. v2: Use PKG_CHECK_EXISTS() when checking for dri3 xcb. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80848 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-05 23:17:49 +01:00
Jason Ekstrand	7599886b26	i965/blorp: Pass image formats separately from the miptree When a texture is wrapped in a texture view, we can't trust the format in the miptree itself. This patch allows us to pass the format separately through blorp so we can properly handle wrapped textures. It's worth noting here that we can use the miptree format directly for depth/stencil formats because they cannot be reinterpreted by a texture view. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> CC: "10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-09-05 10:45:27 -07:00
Matt Turner	87472ae58c	i965/fs: Brown bag fix.	2014-09-05 10:29:38 -07:00
Matt Turner	e8df6a6b32	i965/vec4: Add ability to reswizzle arbitrary swizzles. Before commit `04895f5c` we would only reswizzle dot product instructions (since they wrote the same value into all channels, and we didn't have to think about anything else). That commit extended reswizzling to cases when the swizzle was single valued -- i.e., writing the same result into all channels. But allowing reswizzling of arbitrary things is actually really easy and is even less code. (Why didn't we do this in the first place?!) total instructions in shared programs: 4266079 -> 4261000 (-0.12%) instructions in affected programs: 351933 -> 346854 (-1.44%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-05 10:22:06 -07:00
Matt Turner	1ee1d8ab46	i965/vec4: Reswizzle sources when necessary. Despite the comment above the function claiming otherwise, the function did not reswizzle sources, which would lead to bad code generation since commit `04895f5c`, which began claiming we could do such swizzling when we could not. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82932 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-05 10:22:06 -07:00
Jason Ekstrand	e49cfe9bfc	i965/fs: Clean up emitting of untyped atomic and surface reads Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-05 10:04:06 -07:00
Matt Turner	ef8477cddf	i965/fs: Fix basic block tracking in try_rep_send(). The 'start' instruction is always in the current block, except for the case of shader time, which emits code in a pattern seen nowhere else. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-05 09:53:21 -07:00
Matt Turner	248eaff63d	i965/fs: Pass block to insert and remove functions missed earlier. Otherwise, the basic block start/end IPs don't get updated properly, leading to a broken CFG. This usually results in the following assertion failure: brw_fs_live_variables.cpp:141: void brw::fs_live_variables::setup_def_use(): Assertion `ip == block->start_ip' failed. Fixes KWin, WebGL demos, and a score of Piglit tests on Sandybridge and earlier hardware. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-05 09:52:50 -07:00
Kenneth Graunke	6ff5bb2465	i965: Mark cfg dumping functions const. The dump() methods don't alter the CFG or basic blocks, so we should mark them as const. This lets you call them even if you have a const cfg_t - which is the case in certain portions of the code (such as live interval handling). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-05 09:52:38 -07:00
Matt Turner	88d673bde6	i965: Update if_block/else_block in the dead control flow pass. I think this bug crept in only recently. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-05 09:52:29 -07:00
Matt Turner	3e248e0418	i965/fs: Connect cfg properly in predicated break peephole. If the ENDIF instruction was the only instruction in its block, we'd leave the successors of the merged if+jump block in a bad state. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83080 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-05 09:08:59 -07:00
Marek Olšák	1a00f24751	st/mesa: use 1.0f as boolean true on drivers without integer support Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82882 Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-05 15:41:47 +02:00
Marek Olšák	d67db73458	mesa: set UniformBooleanTrue = 1.0f by default because NativeIntegers is 0 by default. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82882 Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-05 15:41:47 +02:00
Jonathan Gray	635477dc4b	automake: check if the linker supports --dynamic-list As older versions of GNU ld did not support --dynamic-list, check to see if it is supported before using it. Non-GNU linkers such as the Apple one likely lack this option as well. Fixes the build on OpenBSD which has binutils 2.15 and 2.17. The --dynamic-list option seems to have been introduced sometime after binutils 2.17 was released as it is present in 2.18. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-05 14:20:42 +01:00
Jonathan Gray	d3dee3df97	st/xvmc/tests: avoid non portable error.h functions To improve compatibility with OpenBSD. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-05 14:17:34 +01:00
Andreas Pokorny	8bcd57a46c	kms-swrast: Support Prime fd handling Allows using prime fds as display target and from display target. Test for PRIME capability after initializing kms_swrast screen. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Andreas Pokorny <andreas.pokorny@canonical.com>	2014-09-05 14:14:37 +01:00
Michel Dänzer	76b906c9f6	configure.ac: Add AC_SYS_LARGEFILE Making sure large file support is enabled across the tree even on 32-bit systems. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-05 18:08:59 +09:00
Francisco Jerez	b4539274b6	clover/util: Null-terminate the result of compat::string::c_str(). Reported-by: EdB <edb+mesa@sigluy.net>	2014-09-05 09:27:20 +03:00
Francisco Jerez	923c72982e	clover/util: Implement compat::string using aggregation instead of inheritance.	2014-09-05 09:27:20 +03:00
Francisco Jerez	7c1e6d582c	clover/util: Have compat::vector track separate size and capacity. In order to make the behaviour of resize() and reserve() closer to the standard. Reported-by: EdB <edb+mesa@sigluy.net>	2014-09-05 09:27:20 +03:00
Francisco Jerez	995f7b37da	clover: Use conversion operator to initialize build log from compat::string. Fixes binary garbage in the compilation logs caused by compat::string::c_str() not being null-terminated (which is a bug on its own that will be fixed in another commit). Reported-by: EdB <edb+mesa@sigluy.net>	2014-09-05 09:27:20 +03:00
Jordan Justen	864c463485	Revert 5 i965 patches: `8e27a4d2`, `373143ed`, `c5bdf9be`, `6f56e142`, `88e3d404` Reverts * "i965: Modify state upload to allow 2 different sets of state atoms." `8e27a4d2b3` * "i965: Modify dirty bit handling to support 2 pipelines." `373143ed91` * "i965: Create a macro for checking a dirty bit." `c5bdf9be1e` Conflicts: src/mesa/drivers/dri/i965/brw_context.h * "i965: Create a macro for setting all dirty bits." `6f56e1424d` Conflicts: src/mesa/drivers/dri/i965/brw_blorp.cpp src/mesa/drivers/dri/i965/brw_state_cache.c src/mesa/drivers/dri/i965/brw_state_upload.c * "i965: Create a macro for setting a dirty bit." `88e3d404da` Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-04 23:06:27 -07:00
Rob Clark	5d8f40a53a	freedreno/ir3: fix constlen with relative addressing We can't rely on the value from the assembler if relative addressing is used. So instead use the max of declared-consts (which does not include compiler immediates) and what we get from the assembler (which does). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-04 22:28:50 -04:00
Rob Clark	73ff4c5f70	freedreno/ir3: fix error in bail logic all_delayed will also be true if we didn't attempt to schedule anything due to no more instructions using current addr/pred. We rely on coming in to block_sched_undelayed() to detect and clean up when there are no more uses of the current addr/pred, which isn't necessarily an error. This fixes a regression introduced in `b823abed`. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-04 22:28:50 -04:00
Rob Clark	08ee0488e6	freedreno/ir3: bit of debug Make it easier to figure out which compiler stage failed. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-04 22:28:50 -04:00
Eric Anholt	4bca922878	vc4: Merge qcompile and tgsi_to_qir The split between these two didn't make much sense. I'm going to want the chance to look at uniform contents in optimization passes, and the QPU emit I think is going to end up rewriting the uniforms stream.	2014-09-04 17:00:54 -07:00
Jordan Justen	23e20f4687	i965/fs: Use prog rather than fp->Base in fs_visitor Reduce fs_visitor's dependence on gl_fragment_program. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-04 11:46:42 -07:00
Jordan Justen	a346870ba8	i965/fs: Use stage_prog_data instead of prog_data->base in fs_visitor Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-04 11:46:42 -07:00
Jordan Justen	246211d366	i965/fs: Add init function to fs_visitor This common init routine can be used by constructors for multiple program types. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-04 11:46:42 -07:00
Eric Anholt	55d2a16262	vc4: Add a CSE optimization pass. Debugging a regression in discard support was just too full of duplicate instructions, so I decided to remove them instead of re-analyzing each of them as I dumped their outputs in simulation.	2014-09-04 11:39:51 -07:00
Eric Anholt	80b27ca2cd	vc4: Switch to using native integers. There were troubles with bools without using native integers (st_glsl_to_tgsi seemed to think bool true was 1.0f sometimes, when as a uniform it's stored as ~0), and since I've got native integers other than divide, I might as well just support them.	2014-09-04 11:39:51 -07:00
Eric Anholt	874dfa8b2e	vc4: Expose compares at a lower level in QIR. Before, we had some special opcodes like CMP and SNE that emitted multiple instructions. Now, we reduce those operations significantly, giving optimization more to look at for reducing redundant operations. The downside is that QOP_SF is pretty special -- we're going to have to track it separately when we're doing instruction scheduling, and we want to peephole it into the instruction generating the destination write in most cases (and not allocate the destination reg, probably. Unless it's used for some other purpose, as well).	2014-09-04 11:39:51 -07:00
Eric Anholt	3972a6f057	vc4: Stop being so clever in CMP handling. This kind of cleverness should be in a general merging-of-ADD-and-MUL instruction scheduler, rather than individual opcodes.	2014-09-04 11:39:51 -07:00
Eric Anholt	511d2f9a13	state_tracker: Fix bug in conditional discards with native ints. A bool is 0 or ~0, and KILL_IF takes a float arg that's <0 for discard or >= 0 for not. By negating it, we ended up doing a floating point subtract of (0 - ~0), which ended up as an inf. To make this actually work, we need to convert the bool to a float. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-09-04 11:39:50 -07:00
Brian Paul	e69b4abc43	swrast: s/INLINE/inline/ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-04 12:17:44 -06:00
Brian Paul	0f255fd26b	osmesa: s/INLINE/inline/ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-04 12:17:44 -06:00
Brian Paul	27727b8479	xlib: s/INLINE/inline/ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-04 12:17:44 -06:00
Brian Paul	c4a0be73ea	meta: s/INLINE/inline/ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-04 12:17:43 -06:00
Brian Paul	44df6df05b	mesa: s/INLINE/inline/ Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-04 12:17:40 -06:00
Marek Olšák	3dbf55c1be	r600g,radeonsi: make sure there's enough CS space before resuming queries Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83432 Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-04 16:15:21 +02:00
Marek Olšák	374f3e9e19	mesa: invalidate draw state in glPopClientAttrib Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82538 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-09-04 16:09:56 +02:00
Marek Olšák	8bd6723179	Revert "r600g,radeonsi: initialize HTILE to fully-expanded state" This reverts commit `f05fe294e7`. Apparently the hw doesn't like this. Revert to the "cleared" state. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83418	2014-09-04 15:48:38 +02:00
Thomas Hellstrom	2d6206140a	winsys/svga: Fix incorrect type usage in IOCTL v2 While similar in layout, the size of the SVGA3dSize type may be smaller than the struct drm_vmw_size type that is part of the ioctl interface. The kernel driver could accordingly overwrite a memory area following the size variable on the stack. Typically that would be another local variable, causing breakage in, for example, ubuntu 12.04.5 where the handle local variable becomes overwritten. v2: Fix whitespace errors Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Cc: "10.1 10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-04 14:31:52 +02:00
Timothy Arceri	504f5f9d1a	glapi: Add KHR_debug functions to check_table test Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>	2014-09-04 12:29:14 +10:00
Carl Worth	ecc89e4e42	egl: Restrict multiplication in calloc arguments to use compile-time constants As explained in the previous commit, we want to avoid the possibility of integer-multiplication overflow while allocating buffers. In these two cases, the final allocation size is the product of three values: one variable and two that are fixed constants at compile time. In this commit, we move the explicit multiplication to involve only the compile-time constants, preventing any overflow from that multiplication (and allowing calloc to catch any potential overflow from the remaining implicit multiplication). Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-03 18:37:02 -07:00
Carl Worth	c35f14f368	Eliminate several cases of multiplication in arguments to calloc In commit `32f2fd1c5d`, several calls to _mesa_calloc(x) were replaced with calls to calloc(1, x). This is strictly equivalent to what the code was doing previously. But for cases where "x" involves multiplication, now that we are explicitly using the two-argument calloc, we can do one step better and replace: calloc(1, A * B); with: calloc(A, B); The advantage of the latter is that calloc will detect any overflow that would have resulted from the multiplication and will fail the allocation, (whereas the former would return a small allocation). So this fix can change potentially exploitable buffer overruns into segmentation faults. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-03 18:37:02 -07:00
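The calloc change described in the two commits above can be illustrated with a minimal sketch. The helper names `alloc_array` and `mul_would_overflow` are hypothetical and do not appear in Mesa; the sketch only assumes the standard two-argument `calloc`, which is given the element count and element size separately so the allocator can reject a product that does not fit in `size_t`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not from Mesa): returns 1 if count * size would
 * wrap around in size_t, which is the case calloc(1, count * size)
 * cannot detect because the product is computed before the call. */
static int mul_would_overflow(size_t count, size_t size)
{
    return size != 0 && count > SIZE_MAX / size;
}

/* Hypothetical helper (not from Mesa): allocate `count` zeroed elements
 * of `size` bytes each. Passing the two factors separately leaves the
 * overflow check to calloc instead of multiplying in the caller. */
static void *alloc_array(size_t count, size_t size)
{
    return calloc(count, size);
}
```

With the single-argument style, a wrapped product yields a small but successful allocation that later code overruns; with the two-argument style the request is expected to fail instead.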
Kenneth Graunke	96ce065db4	glsl: Report progress from opt_copy_propagation_elements(). It's been altering the tree and reporting "false" since January 2011. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-03 17:26:06 -07:00
Kenneth Graunke	702b6ea051	glsl: Skip rewriting instructions in opt_cpe when unnecessary. Previously, opt_copy_propagation_elements would always rewrite the instruction stream, even if it was the same thing as before. In order to report progress correctly, we'll need to bail if the suggested replacement is identical (or equivalent) to the original code. This also introduced unnecessary noop swizzles, as far as I can tell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-03 17:26:04 -07:00
Kenneth Graunke	5ced83ee15	glsl: Initialize source_chan in opt_copy_propagation_elements. Previously, if chans < 4, we passed uninitialized stack garbage to the ir_swizzle constructor for the excess components. Thankfully, it ignores that data, as it's unnecessary, so no harm actually comes of it. However, it's obviously better to initialize it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-03 17:25:56 -07:00
Kenneth Graunke	8270b048cf	i965: Handle ir_triop_csel in emit_bool_to_cond_code(). ir_triop_csel can return a boolean expression, so we need to handle it here; we simply forgot when we added it. Fixes Piglit's EXT_shader_integer_mix/{vs,fs}-mix-if-bool. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2014-09-03 17:12:03 -07:00
Kenneth Graunke	f92fbd554f	i965: Move curb_read_length/total_scratch to brw_stage_prog_data. All shader stages have these fields, so it makes sense to store them in the common base structure, rather than duplicating them in each. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-03 17:11:33 -07:00
Carl Worth	7528f6fd17	build: Rename md5 to checksums as part of .PHONY target In commit `46d03d37bf` I renamed a Makefile target from md5 to checksums, (as we switched from MD5 checksums to SHA-256 checksums, so the more general name is more future proof). But that commit missed one mention of "md5" as a dependency of the .PHONY target. Rename that here as well.	2014-09-03 16:08:20 -07:00
tiffany	cfc42db592	glsl: fix assertion which fails for unsigned array indices. According to the GLSL 1.40 spec, section 5.7 Structure and Array Operations: "Array elements are accessed using an expression whose type is int or uint." Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-09-03 13:52:39 -06:00
Jason Ekstrand	11ee9a4d99	i965/copy_image: Divide the x offsets by block width when using the blitter Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: "10.3" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82804 Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-03 12:27:19 -07:00
Jason Ekstrand	499acf6e4a	i965/copy_image: Use the correct block dimension Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: "10.3" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82804 Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-03 12:27:19 -07:00
Jason Ekstrand	b608cd7fbf	meta/copy_image: Use the correct texture level when creating views Previously, we were accidentally assuming that the level of both textures was 0. Now we actually use the correct level in our hacked texture view. This doesn't 100% fix the meta path because the texture type is getting lost somewhere in the pipeline. However, it actually copies to/from the correct layer now. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: "10.3" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82804 Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-03 12:27:19 -07:00
Jason Ekstrand	fcb6d5b9ef	i965/copy_image: Use the correct texture level Previously, we were using the source image's level for both source and destination. Also, we weren't taking the MinLevel from a potential texture view into account. This commit fixes both problems. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Cc: "10.3" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82804 Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-09-03 12:27:19 -07:00
Michel Dänzer	58b386dce4	gallivm: Fix build against LLVM SVN >= r216982 Only MCJIT is available anymore. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-03 09:15:01 -07:00
Marek Olšák	8abdc3c4a9	r600g: fix alpha-test with HyperZ enabled, fixing L4D2 tree corruption *_update_db_shader_control depends on the alpha test state. The problem was it was in a block which is only entered if the pixel shader is changed. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74863 Cc: mesa-stable@lists.freedesktop.org Tested-by: Benjamin Bellec <b.bellec@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-03 11:50:21 +02:00
Michel Dänzer	2adf7ee92e	r600g,radeonsi: Preserve existing buffer flags The default case was accidentally clearing RADEON_FLAG_CPU_ACCESS from the previous fall-through cases. Reported-by: Mathias Fröhlich <Mathias.Froehlich@gmx.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-03 12:49:59 +09:00
Jason Ekstrand	454aab45ef	main: Don't leak temporary texture rows Reviewed-by: Dave Airlie <airlied@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-09-02 15:50:27 -07:00
Dave Airlie	8380b894ad	r300g: pointless assignment of info.indexed Did this code mean to do something else, you tell me! Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-03 07:59:09 +10:00
Dave Airlie	2b24e58310	omx/h264: remove stray semicolon after if Coverity reported this, looks wrong to me. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-03 07:58:58 +10:00
Dave Airlie	f4ccf687a6	vdpau: unlock the mutex on error paths in attribute setting. Coverity pointed out we never dropped the lock here, so fix it by using a common exit path. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-03 07:58:50 +10:00
Eric Anholt	2da9118852	u_primconvert: Use u_upload_mgr for our little IB allocations. tex-miplevel-selection was hammering my memory manager with primconverts on individual quads. This gets all those converted IBs packed into larger IBs. Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-09-02 13:55:15 -07:00
Eric Anholt	6720d1573a	u_primconvert: Shut up compiler warning. gcc isn't detecting that src is set before used, since both are under if (info->indexed). Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-09-02 13:55:15 -07:00
Eric Anholt	1718ba30e5	gbm: Fix gallium build when X11 is in a non-system directory pipe-loader.h will include Xlib.h when HAVE_PIPE_LOADER_XLIB is set in the build. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-02 13:48:54 -07:00
Eric Anholt	d71a9b7d9d	vc4: Handle a couple of the transfer map flags. This is part of fixing extremely long runtimes on some piglit tests that involve streaming vertex reuploads due to format conversions, and will similarly be important for X performance, which relies on these flags.	2014-09-02 12:10:56 -07:00
Kristian Høgsberg	8f55174fbd	meta: Make MESA_META_DRAW_BUFFERS restore properly A meta begin/end pair with MESA_META_DRAW_BUFFERS will change visible GL state. We recreate the draw buffer enums from the buffer bitfield, which changes GL_BACK to GL_BACK_LEFT (and GL_FRONT to GL_FRONT_LEFT). This commit modifies the save/restore logic to instead copy the buffer enums from the gl_framebuffer and then set them on restore using _mesa_drawbuffers(). It's not clear how this breaks the benchmark in 82796, but fixing meta to not leak the state change fixes the regression. No piglit regressions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=82796 Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Cc: mesa-stable@lists.freedesktop.org	2014-09-02 10:33:13 -07:00
Emil Velikov	5a4e0f3873	Revert "mesa: fix make tarballs" This reverts commit `0fbb9a599d`. Rather than adding hacks around the issue drop the sources from the final tarball, and re-add them back with 'make dist'. This fixes a problem when running parallel 'make install' fails as it recreates sources and triggers partial recompilation. Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83355 Reported-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>	2014-09-02 11:39:29 +01:00
Dave Airlie	021e84f292	mesa/program_cache: calloc the correct size for the cache. Coverity reported this, and I think this is the right solution, since cache->items is struct cache_item ** not struct cache_item *, we also realloc it using struct cache_item * at some point. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-02 16:42:24 +10:00
Michel Dänzer	a75fee78c6	radeonsi: Compile dummy pixel shader on demand It's never used under normal circumstances. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-02 15:24:07 +09:00
Michel Dänzer	b84b9eae20	u_blitter: Create all shaders on demand Not all of these are used in every context, so this can make a significant difference for short-lived contexts such as in piglit tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-02 15:24:07 +09:00
Michel Dänzer	51131c423c	r600g,radeonsi: Inform the kernel if a BO will likely be accessed by the CPU This allows the kernel to prevent such BOs from ever being stored in the CPU inaccessible part of VRAM. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-02 15:24:07 +09:00
Dave Airlie	2d5d1f5598	glsl: free uniform_map on failure path. If we fail in reserve_explicit_locations, we leak uniform_map. Reported-by: coverity scanner. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-02 16:05:52 +10:00
Paul Berry	9f20503658	main/cs: Add gl_context::ComputeProgram Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-01 19:38:27 -07:00
Jordan Justen	d035d50e05	mesa: Convert NewDriverState to 64-bits i965 will have more than 32 bits when BRW_STATE_COMPUTE_PROGRAM is added. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-09-01 19:38:27 -07:00
Paul Berry	8e27a4d2b3	i965: Modify state upload to allow 2 different sets of state atoms. The set of state atoms for compute shaders is currently empty; it will be filled in by future patches. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-01 19:38:27 -07:00
Paul Berry	373143ed91	i965: Modify dirty bit handling to support 2 pipelines. The hardware state for compute shaders is almost entirely orthogonal to the hardware state for 3D rendering. To avoid sending unnecessary state to the hardware, we'll need to have a separate set of state atoms for the compute pipeline and the 3D pipeline. That means we need to maintain two separate sets of dirty bits to determine which state atoms need to be run. But the dirty bits are not completely independent; for example, if BRW_NEW_SURFACES is flagged while doing 3D rendering, then not only do we need to re-run 3D state atoms that depend on BRW_NEW_SURFACES, but we also need to re-run compute state atoms that depend on BRW_NEW_SURFACES. But we'll also need to re-run those state atoms the next time the compute pipeline is run. To accomplish this, we record two sets of dirty bits, one for each pipeline. When bits are dirtied (via SET_DIRTY_BIT() or SET_DIRTY_ALL()) we set them to the dirty state in both pipelines. When brw_state_upload() is run, we clear the dirty bits just for the pipeline that was run. Note that since the number of pipelines is known at compile time to be 2, the compiler should unroll the loops in SET_DIRTY_BIT() and SET_DIRTY_ALL(). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-01 19:38:27 -07:00
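The two-pipeline dirty-bit scheme described in the commit above could look roughly like the following sketch. All names here (`struct dirty_state`, `set_dirty_bit`, `check_dirty_bit`, `clear_dirty_bits`, the `PIPELINE_*` enumerators, and the `NEW_SURFACES` flag) are hypothetical stand-ins, not the actual i965 macros or fields; the sketch only assumes what the message states: setting a bit dirties it in both pipelines, and an upload clears the bits of the pipeline that ran.

```c
#include <stdint.h>

/* Hypothetical pipeline identifiers; the count is a compile-time constant,
 * so a compiler can unroll the loop in set_dirty_bit(). */
enum pipeline {
    PIPELINE_RENDER,
    PIPELINE_COMPUTE,
    NUM_PIPELINES
};

/* Hypothetical example flag, standing in for a state bit such as
 * BRW_NEW_SURFACES. */
#define NEW_SURFACES (UINT64_C(1) << 0)

/* One set of dirty bits per pipeline. */
struct dirty_state {
    uint64_t bits[NUM_PIPELINES];
};

/* Dirtying a bit marks it in every pipeline. */
static void set_dirty_bit(struct dirty_state *s, uint64_t bit)
{
    for (int p = 0; p < NUM_PIPELINES; p++)
        s->bits[p] |= bit;
}

/* Query a bit for one specific pipeline. */
static int check_dirty_bit(const struct dirty_state *s, enum pipeline p,
                           uint64_t bit)
{
    return (s->bits[p] & bit) != 0;
}

/* After uploading state for one pipeline, clear only that pipeline's bits. */
static void clear_dirty_bits(struct dirty_state *s, enum pipeline p)
{
    s->bits[p] = 0;
}
```

The point of keeping separate sets is that a flag raised during 3D rendering stays pending for the compute pipeline until that pipeline next uploads its own state.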
Paul Berry	c5bdf9be1e	i965: Create a macro for checking a dirty bit. This will make it easier to extend dirty bit handling to support compute shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-01 19:38:27 -07:00
Paul Berry	6f56e1424d	i965: Create a macro for setting all dirty bits. This will make it easier to extend dirty bit handling to support compute shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-01 19:38:27 -07:00
Paul Berry	88e3d404da	i965: Create a macro for setting a dirty bit. This will make it easier to extend dirty bit handling to support compute shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-01 19:38:27 -07:00
Dave Airlie	94a909ec2d	i965: add missing parens in vec4 visitor Coverity reported this; Matt said it looks like missing parens, not bad indenting, so let's try that. Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-02 11:07:11 +10:00
Dave Airlie	19f6e80a1e	nouveau: don't leak dec struct on error This one path doesn't goto fail, so it seems to leak dec. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-02 10:08:58 +10:00
Dave Airlie	32a8b2cf54	xvmc/tests: %C isn't a valid printf specifier. Reported-by: Coverity scanner. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-02 10:07:54 +10:00
Dave Airlie	ea88b1de2f	nouveau/nv40: quieten coverity warning in unused vertex texture code. This fixes the code, but we never run it anyways, so silence coverity. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-09-02 10:04:29 +10:00
Ilia Mirkin	d0cd86686d	nv50: remove unused variables Recent code changes have caused these to no longer be used. Remove them. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-01 18:47:42 -04:00
Ilia Mirkin	0c38006b55	mesa: force height of 1D textures to be 1 in texture views Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-09-01 18:38:02 -04:00
Ilia Mirkin	2c44043313	nv50: attach the buffer bo to the miptree structures The current code... makes no sense. Use nouveau_bo_ref to attach the bo to the exposed resource so as to have the proper lifetime guarantees. Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-01 18:38:02 -04:00
Ilia Mirkin	9d52e551a5	nv50: mt address may not be the underlying bo's start address With VP2, nv50_miptree is faked because the underlying bo's have to be laid out in a certain way. This is done by adjusting the address. Make sure that blits (and everything else for consistency) use the mt address rather than the bo address as a base. This fixes retrieving chroma plane with VDPAU. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82255 Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-01 18:38:02 -04:00
Ilia Mirkin	2528d402b9	nv50: set the miptree address when clearing bo's in vp2 init The mt address is about to be used more, make sure it's set appropriately. Reported-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-01 18:38:02 -04:00
Ilia Mirkin	6c2b079231	nv50/ir: avoid creating instructions that can't be emitted When constant folding a MAD operation, we first fold the multiply and generate an ADD. However we do so without making sure that the immediate can be handled in the saturate case. If it can't, load the immediate in a separate instruction. Reported-by: Tiziano Bacocco <tizbac2@gmail.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-01 18:38:02 -04:00
Ilia Mirkin	115d9a5525	nvc0: don't make 1d staging textures linear Experimentally, the sampler doesn't appear to like these, neither as buffer nor as rect textures. So remove 1D from the list of texture types to make linear when used for staging. This fixes the OSD in mplayer for VDPAU. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-01 18:38:02 -04:00
Ilia Mirkin	362cd26960	nv50: zero out unbound samplers Samplers are only defined up to num_samplers, so set all samplers above nr to NULL so that we don't try to read them again later. Tested-by: Christian Ruppert <idl0r@qasl.de> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-01 18:38:02 -04:00
Ilia Mirkin	c4bb436f76	nvc0/ir: avoid infinite recursion when finding first uses of tex In certain circumstances, findFirstUses could end up doubling back on instructions it had already processed, resulting in an infinite recursion. Avoid this by keeping track of already-visited instructions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83079 Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-09-01 18:38:02 -04:00
Rob Clark	ef858ac770	freedreno/ir3: add DDX/DDY Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-01 18:08:21 -04:00
Rob Clark	5e5604cc28	freedreno/ir3: don't keep IR around Once we've assembled the shader, no need to keep the intermediate around. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-09-01 18:08:21 -04:00
Jason Ekstrand	e8f83538dd	i965/fs: Don't segfault when debug-logging a null program Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-01 12:33:13 -07:00
Jason Ekstrand	1c573c9adb	i965/vec4: Don't segfault when debug-logging a null program Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-01 12:31:56 -07:00
Marek Olšák	a10c8db715	radeonsi: implement EXPCLEAR optimization for depth Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-01 21:18:52 +02:00
Marek Olšák	f05fe294e7	r600g,radeonsi: initialize HTILE to fully-expanded state Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-01 21:18:52 +02:00
Marek Olšák	573313c94e	radeonsi: implement fast depth clear Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-01 21:18:51 +02:00
Marek Olšák	63cb4077e6	radeonsi: move DB_RENDER_CONTROL into draw_vbo So that I can add fast depth clear. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-01 21:18:51 +02:00
Marek Olšák	78aa717601	radeonsi: disable occlusion queries if they are not needed We always left them enabled, which turned off HiZ in some cases. This should improve performance with Hyper-Z. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-01 21:18:51 +02:00
Marek Olšák	ab9ad91779	r600g,radeonsi: force fast stencil and HTILE stencil off, fixing a Hyper-Z hang This should be as fast as no HTILE for stencil. I think we can still get full performance with depth-only rendering even if stencil is present in the buffer but not used, but I'm not 100% sure. This may be revisited when HiS and fast stencil clear are implemented. This fixes a hang in Brutal Legend. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64471 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-01 21:18:51 +02:00
Marek Olšák	ba14d4910c	r600g: set VGT_ENHANCE=4 on R7xx This is a golden setting on RV740, but there is a hw bug which recommends setting it on all R7xx chipsets. Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-01 21:18:49 +02:00
Marek Olšák	13b93596da	r600g: expose AMD_vertex_shader_layer and *_viewport_index on R600-R700 already implemented Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-01 21:18:45 +02:00
Marek Olšák	d159c5e3e0	r600g: fix layered clear Cc: mesa-stable@lists.freedesktop.org Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-01 21:18:42 +02:00
Marek Olšák	e6d191bb6f	r600g: some DB bug workarounds for R6xx DB flushing Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-01 21:18:40 +02:00
Marek Olšák	0ccc653c70	r600g: enable fast depth clear for array textures and cubemaps I have a piglit test that hits this. Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-01 21:18:37 +02:00
Marek Olšák	6d751065cc	r600g: use HTILE allocator from SI It's almost the same. This enables tiling for HTILE. It also enables Hyper-Z for other texture targets (1D, 1D_ARRAY, 2D_ARRAY, CUBE, CUBE_ARRAY, 3D, RECT). 2D array depth textures are tested by Unigine Sanctuary and my new piglit test. Acked-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-01 21:18:33 +02:00
Marek Olšák	ee1b30eaff	r600g: set DB_DEPTH_SIZE.HEIGHT_TILE_MAX for EG/CM, inline other fields This fixes rendering to non-zero layer/face/slice with HTILE. v2: added the assertion Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-01 21:17:40 +02:00
Marek Olšák	91050ff215	radeonsi: set DB_DEPTH_SIZE.HEIGHT_TILE_MAX, inline other fields This fixes rendering to a non-zero layer/face/slice with HTILE. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72685 v2: added the assertion Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-09-01 21:15:36 +02:00
Glenn Kennard	8d0f6ff810	r600g: Implement sm5 geometry shader instancing Requires Evergreen or later hardware. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>	2014-09-01 21:12:03 +02:00
Marek Olšák	482def592f	glsl_to_tgsi: allocate and enlarge arrays for temporaries on demand This fixes crashes if the number of temporaries is greater than 4096. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66184 v2: added fail paths for realloc failures Cc: 10.2 10.3 mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-09-01 21:03:58 +02:00
Marek Olšák	b419c651fb	gallium/pb_bufmgr_cache: limit the size of cache This should make a machine which is running piglit more responsive at times. e.g. streaming-texture-leak can easily eat 600 MB because of how fast it creates new textures.	2014-09-01 20:17:48 +02:00
Marek Olšák	bba7d29a86	pipe-loader: use the correct screen index	2014-09-01 20:09:19 +02:00
Marek Olšák	0b56e23e7f	egl/dri2: use the correct screen index Required for multi-GPU configuration where each GPU has its own X screen.	2014-09-01 20:09:19 +02:00
Jordan Justen	1a428a5256	docs: Mark ARB_compute_shader as work in progress Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2014-09-01 10:45:37 -07:00
Connor Abbott	d571f2b15d	i965/fs: don't use ir->shadow_comparitor in emit_texture_* Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-01 00:55:14 -07:00
Connor Abbott	cbfcb1b069	i965/fs: don't pass ir_variable * to emit_samplepos_setup() We were only using it to get at its type, which we already know because it's a builtin variable. Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-01 00:12:15 -07:00
Connor Abbott	ec3d06f591	i965/fs: don't pass ir_variable * to emit_frontfacing_interpolation() We were only using it to get at its type, which we already know because it's a builtin variable. v2 (Ken): Rebase on Matt's optimized gl_FrontFacing calculations. Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-09-01 00:11:16 -07:00
Kenneth Graunke	70691f0c28	i965: Fix GPU hangs when INTEL_DEBUG=no16 is set. The replicated data clear shader needs to be SIMD16, or else the GPU will hang. So, compile it even if INTEL_DEBUG=no16 is set. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-31 17:03:31 -07:00
Emil Velikov	88cbe3908f	mesa: fix make tarballs Current method of generating distribution tar-balls involves manually invoking make + target name in the appropriate places. This temporary solution is used until we get 'make dist' working. Currently it does not work, as in order to have the target (which is also a filename) available in the final Makefile we need to add a PHONY target + use the correct target name. Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-09-01 00:22:20 +01:00
Abdiel Janulgue	5598458e69	i965/vec4: Remove try_emit_saturate Now that saturate is implemented natively as an instruction, we can cut down on unneeded functionality. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-08-31 21:04:09 +03:00
Abdiel Janulgue	cbd225057a	i965/fs: Refactor try_emit_saturate v3: Since the fs backend can emit saturate as a separate instruction, there is no need to detect for min/max instructions and to rewrite the instruction tree accordingly. On the other hand, we don't need to emit a separate saturated mov either when the expression generating src can do saturate directly. v4: Add can_do_saturate() check before enabling saturate modifer (Ken) Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-08-31 21:04:09 +03:00
Abdiel Janulgue	b2c0c35907	ir_to_mesa, glsl_to_tgsi: Remove try_emit_saturate Now that saturate is implemented natively as an instruction, we can cut down on unneeded functionality. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-08-31 21:04:09 +03:00
Abdiel Janulgue	7841a246b9	i965/vec4: Allow propagation of instructions with saturate flag to sel When sel condition is bounded within 0 and 1.0. This allows code such as: mov.sat a b sel.ge dst a 0.25F To be propagated as: sel.ge.sat dst b 0.25F v3: - Syntax clarifications in inst->saturate assignment - Remove extra parenthesis when assigning src_reg value from copy_entry (Matt Turner) v4: - Take channels into consideration when propagating saturated instructions. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-08-31 21:04:09 +03:00
Abdiel Janulgue	40aeb558ce	i965/fs: Allow propagation of instructions with saturate flag to sel When sel condition is bounded within 0 and 1.0. This allows code such as: mov.sat a b sel.ge dst a 0.25F To be propagated as: sel.ge.sat dst b 0.25F v3: Syntax clarifications in inst->saturate assignment (Matt Turner) Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-08-31 21:04:09 +03:00
Abdiel Janulgue	0e2ba3ee82	glsl: Optimize clamp(x, b, 1.0), where b > 0.0 as max(saturate(x),b) v2: - Output max(saturate(x),b) instead of saturate(max(x,b)) - Make sure we do component-wise comparison for vectors (Ian Romanick) v3: - Add missing condition where the outer constant value is > 0.0 and inner constant is 1.0. - Fix comments to show that the optimization is a commutative operation (Matt Turner) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-08-31 21:04:08 +03:00
Abdiel Janulgue	d92394c5d8	glsl: Optimize clamp(x, 0.0, b), where b < 1.0 as min(saturate(x),b) v2: - Output min(saturate(x),b) instead of saturate(min(x,b)) suggested by Ilia Mirkin - Make sure we do component-wise comparison for vectors (Ian Romanick) v3: - Add missing condition where the outer constant value is zero and inner constant is < 1 - Fix comments to reflect we are doing a commutative operation (Matt Turner) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-08-31 21:04:08 +03:00
Abdiel Janulgue	8f890b119e	glsl: Optimize clamp(x, 0, 1) as saturate(x) v2: - Check that the base type is float (Ian Romanick) v3: - Make sure comments reflect that we are doing a commutative operation - Add missing condition where the inner constant is 1.0 and outer constant is 0.0 - Make indexing of operands easier to read (Matt Turner) Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-08-31 21:04:08 +03:00
Abdiel Janulgue	cbd0d643a3	glsl: Implement saturate as ir_unop_saturate Now that we have the ir_unop_saturate implemented as a single instruction, generate the correct simplified expression. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-08-31 21:04:08 +03:00
Abdiel Janulgue	cb621166dc	i965/vec4: Add support for ir_unop_saturate Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-08-31 21:04:08 +03:00
Abdiel Janulgue	4bfe8a1e61	i965/fs: Add support for ir_unop_saturate Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-08-31 21:04:08 +03:00
Abdiel Janulgue	909fa50f5b	ir_to_mesa, glsl_to_tgsi: Add support for ir_unop_saturate Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-08-31 21:04:08 +03:00
Abdiel Janulgue	cfa8c1cb39	ir_to_mesa, glsl_to_tgsi: lower ir_unop_saturate Needed when vertex programs don't allow saturate Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-08-31 21:04:08 +03:00
Abdiel Janulgue	8935c12937	glsl: Add a pass to lower ir_unop_saturate to clamp(x, 0, 1) Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-31 21:04:08 +03:00
Abdiel Janulgue	4c0ccfc5b3	glsl: Add constant evaluation of ir_unop_saturate v2: Use CLAMP macro (Ian Romanick) Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-31 21:04:08 +03:00
Abdiel Janulgue	a5f02b6696	glsl: Add ir_unop_saturate Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-31 21:04:08 +03:00
Abdiel Janulgue	f340145107	i965/vec4/fs: Count loops in shader debug Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-08-31 21:04:03 +03:00
Abdiel Janulgue	ddc1d297bc	i965/vec4: inline generate_vec4_instruction() within generate_code() Suggested by Matt. This patch combines and moves back the code-generation functions from generate_vec4_instruction() into generate_code(). Makes generate_code() a bit larger, but helps us to count loops in a straightforward manner. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-08-31 21:03:49 +03:00
Kenneth Graunke	e34a363a78	i965: Add 2x MSAA support to Broadwell fast clear code. According to the cited documentation section (but in the newer docs), x_scaledown is the same for 2x and 4x MSAA. +47 piglits. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83081 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.3" <mesa-stable@lists.freedesktop.org>	2014-08-31 01:48:10 -07:00
Matt Turner	8b5ac1df17	i965/vec4: Update register coalescing test. In commit `04895f5c` I added support for reswizzling writemasks. This test was checking that we didn't support this. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82881	2014-08-30 21:00:28 -07:00
Matt Turner	0492275038	i965: Use unreachable() to silence warning. brw_meta_fast_clear.c:211:17: warning: 'x_scaledown' may be used uninitialized in this function [-Wmaybe-uninitialized] unsigned int x_scaledown, y_scaledown; Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-30 21:00:28 -07:00
Chia-I Wu	a14c23735e	ilo: set INTEL_RELOC_GGTT only on GEN6 We asked MI commands to use GGTT only on GEN6.	2014-08-31 10:34:39 +08:00
Chia-I Wu	255b274d75	ilo: fix bound check for 3DSTATE_URB_VS Fix max/min entries on GEN7.5 GT2/GT3.	2014-08-31 10:34:39 +08:00
Chia-I Wu	5f4b13f5fa	ilo: replace cmd by dw0 in GPE With `e3c251071b`, the magic values are gone. We no longer need "cmd" to hide them. Replace it by dw0.	2014-08-31 10:34:39 +08:00
Alexander von Gluck IV	7b6ea6ab8c	st/hgl: Move st_visual create/destroy into hgl state_tracker	2014-08-30 19:35:24 -04:00
Alexander von Gluck IV	15da8d0761	st/hgl: Move st_manager create/destroy into hgl state_tracker	2014-08-30 19:35:24 -04:00
Rob Clark	c06afcede2	freedreno/ir3: fix potential null ptr deref Fix potential segfault in debug code. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-08-30 18:02:51 -04:00
Rob Clark	c99f09f4be	freedreno/ir3: add TXB Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-08-30 18:02:51 -04:00
Rob Clark	b823abedf8	freedreno/ir3: detect scheduler fail There are some cases where the scheduler can get itself into impossible situations, by scheduling the wrong write to pred or addr register first. (Ie. it could end up being unable to schedule any instruction if some instruction which depends on the current addr/reg value also depends on another addr/reg value.) To solve this we'd need to be able to insert extra mov instructions (which would also help when register assignment gets into impossible situations). To do that, we'd need to move the nop padding from sched into legalize. But to start with, just detect when we get into an impossible situation and bail, rather than sitting forever in an infinite loop. This way it will at least fall back to the old compiler, which might even work if you are lucky. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-08-30 18:02:50 -04:00
Ian Romanick	932b0ef1ce	glsl: Use bit-flags image attributes and uint16_t for the image format All of the GL image enums fit in 16-bits. Also move the fields from the anonymous "image" structure to the next higher structure. This will enable packing the bits with the other bitfield. Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 76 40,572,916,873 68,831,248 63,328,783 5,502,465 0 After (32-bit): 70 40,577,421,777 68,487,584 62,973,695 5,513,889 0 Before (64-bit): 60 36,822,640,058 96,526,824 88,735,296 7,791,528 0 After (64-bit): 74 37,124,603,758 95,891,808 88,466,712 7,425,096 0 A real savings of 346KiB on 32-bit and 262KiB on 64-bit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-29 23:29:19 -07:00
Ian Romanick	8eeca7a56c	glsl: Use a single bit for the dual-source blend index The only values allowed are 0 and 1, and the value is checked before assigning. Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 74 40,580,119,657 69,186,544 63,506,327 5,680,217 0 After (32-bit): 76 40,572,916,873 68,831,248 63,328,783 5,502,465 0 Before (64-bit): 89 36,822,971,897 96,526,616 88,735,296 7,791,320 0 After (64-bit): 60 36,822,640,058 96,526,824 88,735,296 7,791,528 0 A real savings of 173KiB on 32-bit and no change on 64-bit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-29 23:28:26 -07:00
Ian Romanick	c0cd5bedf6	glsl: Eliminate ir_variable::data.atomic.buffer_index Just use ir_variable::data.binding... because that's where the binding is stored for everything else that can use layout(binding=). Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 50 40,564,927,443 69,185,408 63,683,871 5,501,537 0 After (32-bit): 74 40,580,119,657 69,186,544 63,506,327 5,680,217 0 Before (64-bit): 59 36,822,048,449 96,526,888 89,113,000 7,413,888 0 After (64-bit): 89 36,822,971,897 96,526,616 88,735,296 7,791,320 0 A real savings of 173KiB on 32-bit and 368KiB on 64-bit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-29 23:27:59 -07:00
Kenneth Graunke	941269f89c	mesa: Delete ctx->GeometryProgram.Cache. The VertexProgram and FragmentProgram have a Cache member for dealing with fixed function programs. There are no fixed function geometry programs, so this should never have existed, and was just copy and pasted. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-08-29 22:13:37 -07:00
Roland Scheidegger	ca4f0baca2	gallivm: fix somewhat broken NaN behavior for exp2 I actually screwed that up in `754319490f`, mistakenly thinking the code actually wanted the non-nan result before. So, introduce that missing nan behavior case and use that instead. For sse, there's no actual change in the resulting code at all, the fallback code wouldn't have done the right thing though. Of course, the actual issue I saw with pow() was completely unrelated... Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-08-30 01:34:41 +02:00
Roland Scheidegger	3d29e75a5f	softpipe: handle vertex texture sampling when using llvm for draw Pretty trivial, just fill in the offsets and such. The implementation is near 100% copy and paste from llvmpipe. Should be useful for debugging. No piglit change when not using SOFTPIPE_USE_LLVM=1. Now that it can do the same tests with and without using llvm for vs/gs, with llvm more pass, the only things failing only with llvm seems to be edgeflags tests and vs/gs-pow-float-float (and for the latter I'm not convinced the zero tolerance it requires is somehow mandated by glsl). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-08-30 01:34:16 +02:00
Roland Scheidegger	62fd871984	llvmpipe: (trivial) enable cube map arrays The code is all in place now so enable it. Seems to pass all relevant piglit tests (just like cube maps, some of the cube map array tests need GALLIVM_DEBUG=no_quad_lod,no_rho_approx) Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-08-30 01:33:40 +02:00
Roland Scheidegger	9da75f96bc	gallivm: handle cube map arrays for texture sampling Pretty easy, just make sure that all paths testing for PIPE_TEXTURE_CUBE also recognize PIPE_TEXTURE_CUBE_ARRAY, and add the layer * 6 calculation to the calculated face. Also handle it for texture size query, looks like OpenGL wants the number of cubes, not layers (so need division by 6). No piglit regressions. v2: fix up adding cube layer to face for seamless filtering (needs to happen after calculating per-sample face). Undetected by piglit unfortunately. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> (v1)	2014-08-30 01:33:02 +02:00
Roland Scheidegger	26a5156de7	draw: kill off bogus assertion in tgsi_fetch_gs_outputs Not sure why it was there but it is definitely not an error if gs outputs are infs/nans. Besides, the outputs can be ints, in which case any small negative number asserted. This fixes piglit's texelFetch gs isamplerXX crashes with softpipe (down from 14 to 2). Bug https://bugs.freedesktop.org/show_bug.cgi?id=80012 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-08-30 01:17:47 +02:00
Roland Scheidegger	c9ae5038d5	softpipe: don't assert on illegal wrap mode for rect textures piglit tex-miplevel-selection nowadays doesn't use repeat wrap mode due to sampler objects any longer, however at the time of the clear the wrap mode is still illegal and at this point we get to verify the state, including samplers (even though they won't get used), and because mesa doesn't treat it as an incomplete texture as the spec says it should, we hit the assertion. Just warn about this for now instead. Gets crashes down from 44 to 14 in a piglit run (all were in various tests of tex-miplevel-selection with texture rectangles). Though just about all tex-miplevel-selection tests fail anyway for other reasons. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-08-30 01:17:47 +02:00
Roland Scheidegger	032fe4ed23	tgsi: (trivial) fix handling msaa resources on TXF Just handle as ordinary 2d / 2d array resources. Prevents an assertion failure with softpipe and piglit glsl-resource-not-bound 2DMS/2DMSArray tests. While here also fix TXD shadowCube similarly, which fixes the crash with piglit tex-miplevel-selection textureGrad CubeShadow (the test will still fail due to softpipe being broken). This fixes https://bugs.freedesktop.org/show_bug.cgi?id=80011 Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-08-30 01:17:47 +02:00
Roland Scheidegger	99105454b0	draw: remove fishy num_samplers/num_sampler_views check in llvm path This was meant for softpipe to not crash at some point if vertex texturing was used. It is, however, fishy because it uses values from draw_set_samplers/draw_set_sampler_views and not from the shader key. Albeit we should still in all cases actually generate a new shader if this changes (because the samplers and views themselves are in the key) I don't want to think again wondering if that's really correct in the future. Besides, at least today, it does not actually work for softpipe, as this was relying on softpipe not actually calling draw_set_samplers/sampler_views at all - I've verified it crashes regardless (if there were a tex instruction in the vs, which normally should not happen anyway). For drivers which do indeed not call these functions because they don't support vertex texturing at all (r300), this should still not crash because the static texture data is all zero, which causes the sampling functions to take an early out (same as is done if no texture is bound at the slot used for sampling - verified with hacked up softpipe). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-08-30 01:17:46 +02:00
Roland Scheidegger	85d4cc4790	mesa: fix fallback texture for cube map array mesa was creating a cube map array texture with just one layer, which is not legal. This caused an assertion failure when using that texture later in llvmpipe (when enabling cube map arrays) since it verifies the number of layers in the view is divisible by 6 (the sampling code might well crash randomly otherwise) with piglit glsl-resource-not-bound CubeArray -fbo -auto. v2: use appropriately sized texel array... Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)	2014-08-30 01:17:46 +02:00
Aaron Watry	7c73ee677f	r600/compute: Don't leak compute pool item_list/unallocated_list v3: Fix multi-line comment format v2: Change to C-style comments and fix indentation Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Bruno Jiménez <brunojimen@gmail.com>	2014-08-29 17:38:24 -05:00
Michel Dänzer	6cd0dbc415	u_vbuf: Make sure all caps are initialized Pointed out by valgrind. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83148 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-08-29 12:15:10 +09:00
Michel Dänzer	2a99b6e40f	r600g: Reinstate include path to common radeon source directory Fixes build failure since commit `a131263a2f` ('gallium/radeon: cleanup header inclusion'): ../../../../../src/gallium/drivers/r600/evergreen_compute.c:50:30: fatal error: radeon_llvm_util.h: No such file or directory #include "radeon_llvm_util.h" ^ compilation terminated. Trivial.	2014-08-29 12:09:16 +09:00
Matt Turner	2cab62a68d	i965: Mark BRW_CONDITIONAL_R as Gen <= 5.	2014-08-28 19:06:45 -07:00
Matt Turner	4fcefac753	i965/disasm: Show jump count for if/iff/halt. These instructions don't have pop count. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-28 19:06:27 -07:00
Matt Turner	fb2fddefce	i965/disasm: Disassemble JMPI's source properly. The source can be a register as well as an immediate, and disassembling a register as an immediate can have some strange results. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-28 19:06:27 -07:00
Matt Turner	bef7a025eb	i965/disasm: Add break/cont/halt to list of has_uip(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-28 19:06:27 -07:00
Matt Turner	383eccb77e	i965/disasm: Disassemble Z/NZ conditional modifiers as .z/.nz. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-28 19:06:27 -07:00
Ilia Mirkin	b4418cd4ce	nouveau: allow more tokens by default to avoid parse failures Also print a note saying that parsing failed to help isolate issues. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-28 21:53:55 -04:00
Emil Velikov	76e5406e58	targets/haiku-softpipe: explicitly prefix sw/hgl header Cc: Alexander von Gluck IV <kallisti5@unixzen.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:41:51 -04:00
Emil Velikov	f5fb9c556b	sw/hgl: struct haiku_displaytarget is not public struct It is meant to be private within the actual winsys. Remove it from the exported header, and fold it into its only user. Cc: Alexander von Gluck IV <kallisti5@unixzen.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:41:46 -04:00
Emil Velikov	3b36ba4c39	include/haiku: fix comment typo Cc: Alexander von Gluck IV <kallisti5@unixzen.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:41:29 -04:00
Emil Velikov	5b8900ded3	hgl: trivial bits Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:34:43 -04:00
Alexander von Gluck IV	311b59495c	gallium/targets: Break haiku state_tracker out to own directory Ack'ed by Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:27:29 -04:00
Alexander von Gluck IV	86d1aa8531	gallium/targets: Haiku softpipe, perform better framebuffer validation * Check for back left attachment as well * Set and act on pipe format none Ack'ed by Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:27:26 -04:00
Emil Velikov	96b45e67d5	st/egl: ship all the files in the tarball Namely we were missing the headers and the Android/SCons buildscripts. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:24:42 +01:00
Emil Velikov	da1d324909	st/clover: sort the sources list Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:24:42 +01:00
Emil Velikov	010fa9074e	st/gbm: include the header in the sources list Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:24:42 +01:00
Emil Velikov	27be19aa45	st/xlib: Include the headers in the sources list. Yet another step towards a working 'make dist'. Cc: José Fonseca <jfonseca@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: José Fonseca <jfonseca@vmware.com>	2014-08-28 21:24:42 +01:00
Emil Velikov	526a9d9c5e	st/omx: use makefile.sources to handle sources lists ... and add the headers so that 'make check' is happy. Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-08-28 21:24:41 +01:00
Emil Velikov	f6507d2357	st/vdpau: pickup/ship the private header Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-08-28 21:24:41 +01:00
Emil Velikov	e3fd703e85	st/vdpau: remove obsolete define VL_HANDLES This define is always set and it had no real purpose according to git log. Seems like it is a leftover from the vl/vdpau prototype stage. Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-08-28 21:24:41 +01:00
Emil Velikov	60d772cd9d	st/vega: add headers and SConscript in the tarball Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:24:41 +01:00
Emil Velikov	bcdb47d838	st/xa: add remaining files in the tarball Namely - the private header (xa_priv.h) - README and - xa-indent Sort the sources list while we're here. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:24:40 +01:00
Emil Velikov	398f6eefee	st/xvmc: pick up the headers for distribution - autotools/make will pick them up in the tarball. - Sort the list alphabetically. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:24:40 +01:00
Emil Velikov	c6e5801b40	Revert "configure: Disable xvmc by default" This reverts commit `6a19bb56e0`. The above commit disabled the default build of xvmc as the xvmc tests were failing. As pointed out by Ilia, the tests are "broken by design" as they do not test the object that is built but the one that is installed and setup on the workstation. With previous commit we moved the programs from the 'make check' to noinst automake target. This way they won't be run but will be around for people to use them. Cc: Tom Stellard <thomas.stellard@amd.com>	2014-08-28 21:24:40 +01:00
Emil Velikov	91f49befd0	st/xvmc: automake: move tests to noinst All the tests require an installed and setup XvMC, thus they are not good candidates for 'make check'. Keep them around as the user might want to actually test the implementation post installation/setup. Cc: Ilia Mirkin <imirkin@alum.mit.edu> Cc: Tom Stellard <thomas.stellard@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:24:40 +01:00
Emil Velikov	015792fb02	winsys/sw: add the final files to the tarball Add the final remaining files into the tarball (make dist), namely: - SConscripts - Non-autotooled winsys' - android, gdi and hgl. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:24:39 +01:00
Emil Velikov	95603e259b	winsys/sw: automake: consistently use Makefile.sources - Include the headers within. - Update scons to use them. - Drop useless include (gallium/drivers) from scons. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:24:39 +01:00
Emil Velikov	f0ae81cc13	winsys/$(hw): ship the Android/SCons scripts in the tarball Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:24:38 +01:00
Emil Velikov	63e9831756	winsys/$(hw): include headers in Makefile.sources Otherwise 'make dist' will not pick them up :'( Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:24:38 +01:00
Emil Velikov	afdc44deca	st/egl: cleanup sw winsys header inclusions - Drop duplicate include compiler directives. - Leave the sw/ prefix for all the software winsys headers. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:24:37 +01:00
Emil Velikov	30f3df4e53	winsys/radeon: move radeon_cs_dump.h to drm ... to ease packaging (make dist). Update it to fetch libdrm's include/libs via pkg-config. Cc: Marek Olšák <marek.olsak@amd.com> Cc: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-28 21:24:37 +01:00
Emil Velikov	a131263a2f	gallium/radeon: cleanup header inclusion - Add top_srcdir/src/gallium/winsys to GALLIUM_DRIVER_C{XXFLAGS}. - Remove top_srcdir/src/gallium/drivers/radeon from the includes. As a result: - Common radeon headers are prefixed with 'radeon/' - Winsys header inclusion is prefixed 'radeon/drm' Cc: Marek Olšák <marek.olsak@amd.com> Cc: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-28 21:24:37 +01:00
Emil Velikov	22a13f5b09	winsys/svga: build: cleanup the includes gallium/drivers is already part of GALLIUM_WINSYS_CFLAGS. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-28 21:24:36 +01:00
Emil Velikov	7dc2f9f919	winsys/i915: remove the software winsys We stopped building it recently as it was unused and not tested. Good bye, it's been nice knowing you :) Cc: Stephane Marchesin <stephane.marchesin@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Stephane Marchesin <stephane.marchesin@gmail.com>	2014-08-28 21:24:36 +01:00
Emil Velikov	664c2d7694	gallium/ilo: cleanup intel_winsys.h Make the header location, inclusion and contents more common with its i915,r* and nouveau counterparts: - Move the header within drivers/ilo. - Separate out intel_winsys_create_for_fd into 'drm_public' header. - Cleanup the compiler includes. v2: Move the header to drivers/ilo. Suggested by Chia-I. v3: Correct intel_winsys.h inclusion. Spotted by Chia-I. Cc: Chia-I Wu <olvaffe@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2014-08-28 21:24:16 +01:00
Timothy Arceri	4ca203f6a1	docs: mark GL_MAX_VERTEX_ATTRIB_STRIDE as done Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-27 20:36:03 -10:00
Timothy Arceri	89e6806dea	gallium: add cap for MAX_VERTEX_ATTRIB_STRIDE Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-27 20:35:59 -10:00
Timothy Arceri	3246e11d33	mesa: implement GL_MAX_VERTEX_ATTRIB_STRIDE V2: moved test for the VertexAttrib*Pointer() functions to update_array(), and made constant available for drivers to set Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-27 20:35:56 -10:00
Michel Dänzer	eae9da879f	st/clover: Fix build against LLVM SVN >= r216583 Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-08-28 12:05:21 +09:00
Roland Scheidegger	eee9f6ae8a	draw: fix base instance handling in llvm path The base instance needs to be passed to the jited function, otherwise the instanced data fetch will only work with the same start instance when the jit function was created (and baking that into the key instead is not a viable option). This fixes piglit arb_base_instance-drawarrays (modulo some unrelated core/compat context trouble I get for the test). And fix the pipe cap bit in llvmpipe for it now that it actually works (it already worked for softpipe). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-08-28 03:03:23 +02:00
Roland Scheidegger	17eabfeccf	docs: fix up status of softpipe, llvmpipe The docs were never really up to date for them, missing just about everything. So mark them off as all done for GL 3.3 (though softpipe is in fact quite broken for some newer things especially wrt texturing, and both don't have compliant, real msaa support). And add the extensions missing too (no guarantee of completeness). Reviewed-by: Dave Airlie <airlied@gmail.com>	2014-08-28 03:01:16 +02:00
Alexander von Gluck IV	0348429586	glsl: Add strings.h on non-MSC platforms * IEEE Std 1003.1-2001 placed strcasecmp() in strings.h. * ISO C99 doesn't mention strcase* in string.h * On all platforms I could find, strcasecmp is in strings.h and string.h as a compatibility layer for software written pre-2001 POSIX * Technically strcasecmp should be only in strings.h and the man pages back this up. * Tested build on CentOS and Haiku Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-27 20:20:58 -04:00
Alex Deucher	6b48c18b03	radeon/uvd: remove comment about RV770 It doesn't seem to support field based decode after testing. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-08-27 10:04:13 -04:00
Christian König	80771e47b6	radeon/uvd: fix field handling on R6XX style UVD The first UVD generation can only do frame based output. Signed-off-by: Christian König <christian.koenig@amd.com>	2014-08-26 17:56:57 +02:00
Christian König	03a99ba9e4	vl/compositor: set the scissor before clearing the render target Otherwise we clear areas that shouldn't be cleared. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2014-08-26 17:56:57 +02:00
Christian König	b73c20759f	st/vdpau: fix vlVdpOutputSurfaceRender(Output\|Bitmap)Surface Correctly handle that the source_surface is only optional. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80561 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2014-08-26 17:56:57 +02:00
Chia-I Wu	e3c251071b	ilo: use genhw command opcodes Replace ILO_GPE_MI and ILO_GPE_CMD with magic values by descriptive genhw macros.	2014-08-26 14:11:02 +08:00
Chia-I Wu	6c73478223	ilo: rename intel_bo_map_unsynchronized() Rename it to intel_bo_map_gtt_async().	2014-08-26 14:10:50 +08:00
Chia-I Wu	354d84b629	ilo: remove max_batch_size It is used to derive an artificial limit on max relocs per bo. We choose not to export it anymore.	2014-08-26 14:10:50 +08:00
Chia-I Wu	fbb869c1aa	ilo: replace domains by reloc flags It is simpler and is supported by the kernel. It cannot be used with libdrm_intel yet though.	2014-08-26 14:10:50 +08:00
Chris Forbes	01887593a4	docs: Update who is working on tessellation Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-26 07:51:11 +12:00
Chris Forbes	38a3490368	glsl: Remove bogus "OUPTUT" token This is never used. There is another token "OUTPUT" which the lexer can generate, though. This has been around since the dawn of time; is most likely a typo. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-26 07:50:43 +12:00
Marek Olšák	83503f9e68	radeonsi: handle PIPE_BIND_BLENDABLE Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-25 13:12:24 +02:00
Marek Olšák	770719eb82	r600g: only set PIPE_BIND_BLENDABLE if colorbuffer rendering is supported Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-25 13:12:24 +02:00
Marek Olšák	bc0ae40616	r300g: handle PIPE_BIND_BLENDABLE Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-25 13:12:23 +02:00
Eric Anholt	7317f11859	vc4: Stop doing qpu_inst(add, NOP) or qpu_inst(NOP, mul). Now that the extra WADDR is set, we can knock this off. Saves a lot of typing, and makes this code much more legible.	2014-08-24 22:13:26 -07:00
Eric Anholt	78d144f7de	vc4: Set the other WADDR in the qpu instruction helpers. Now you don't need to qpu_inst() your instruction with a NOP to get the other waddr set.	2014-08-24 22:13:26 -07:00
Eric Anholt	54499a85ff	vc4: Merge qpu_a_NOP() and qpu_m_NOP to a single qpu_NOP() helper. Now that qpu_inst() ignores the WADDR from the other half of the instruction, we can set both the ADD and MUL WADDRs in the NOP helper. Thanks to that, we also no longer need to qpu_inst(NOP, NOP).	2014-08-24 22:13:25 -07:00
Eric Anholt	1a7035f386	vc4: Ignore WADDRs from the other half of the instruction when merging. This allows setting the opposite-side WADDR to NOP (a non-zero value) in qpu_* helpers, so that we don't need to qpu_inst() merge them with NOPs all the time just to get the waddr set.	2014-08-24 22:13:25 -07:00
Eric Anholt	3212bafc28	vc4: Fix LT/GE set-0-or-1 compares. We were using the integer sub, which worked for the common case of EQ and NE. Fixes fs-lessThan-ivec2-ivec2 and other tests.	2014-08-24 22:13:25 -07:00
Eric Anholt	e2f66315cb	u_vbuf: Add a few more format fallbacks. Fixes piglit draw-vertices and gl-2.0-vertexattribpointer on vc4, where I'm only advertising R32F to RGBA32F support so far. Note: regresses gl-1.5-normal3b3s-invariance due to introduced flushes and missing depth buffer load/store support in the driver. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-24 22:13:25 -07:00
Eric Anholt	bbbe3b65ad	u_vbuf: Simplify the format fallback translation. Individual caps made supporting new fallbacks more complicated than it needed to be. Instead, just make a table of fallbacks at context init time. v2: Fix inverted "do we need to install vbuf?" flagging caught by Marek. Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)	2014-08-24 22:13:25 -07:00
Rob Clark	306e421887	freedreno/a2xx: fix segfault Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-08-24 13:09:47 -04:00
Rob Clark	bd3b096467	freedreno/a3xx: handle first/last level properly Fixes some assumptions about first_level being zero. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-08-24 13:09:23 -04:00
Rob Clark	b40a6c2b17	freedreno: implement pipe_flush_resource() Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-08-24 13:09:00 -04:00
Rob Clark	478a08ebd2	freedreno: don't ignore src/dst level Don't ignore src/dst_level in pipe_copy_region. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-08-24 13:08:14 -04:00
Eric Anholt	8d8a5eb792	vc4: Fix save/restore of the VS/FS in the blitter. When I made the shader cache take the .fs member and moved the binding point to .bind_fs, I failed to update these. Fixes crashes in copyteximage-related tests.	2014-08-23 16:30:42 -07:00
Eric Anholt	9542e68207	vc4: Clear padding of ioctl arguments. Fixes valgrind complaints from valgrind being unaware of our ioctls.	2014-08-23 16:30:34 -07:00
Vinson Lee	c2867f5b36	auxiliary/os: Add Solaris support in os_get_total_physical_memory. The patch fixes the build on Oracle Solaris. CC os/os_misc.lo "os/os_misc.c", line 59: #error: unexpected platform in os_sysinfo.c Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-22 18:24:34 -07:00
Alexander von Gluck IV	12a679a6f6	gallium/targets: Haiku, Fix some improper type warnings	2014-08-22 19:37:19 -04:00
Alexander von Gluck IV	31406d978d	gallium/targets: Clean up Haiku softpipe renderer visual * Drop creating gl_config first as it's only really used to create the state tracker visual.	2014-08-22 19:37:19 -04:00
Carl Worth	23163df24c	glcpp: Don't use alternation in the lookahead for empty pragmas. We've found that there's a buffer overrun bug in flex that's triggered by using alternation in a lookahead pattern. Fortunately, we don't need to match the exact {NEWLINE} expression to detect an empty pragma. It suffices to verify that there are no non-space characters before any newline character. So we can use a simple [\r\n] to get the desired behavior while avoiding the flex bug. Fixes the regression of piglit's 17000-consecutive-chars-identifier test, (which has been crashing since commit `04e40fd337` ). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82472 Signed-off-by: Carl Worth <cworth@cworth.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> CC: <mesa-stable@lists.freedesktop.org>	2014-08-22 15:14:59 -07:00
Kenneth Graunke	97d03b9366	i965: Disable try_emit_b2f_of_compare on Gen4-6. The optimization relies on CMP setting the destination to 0, which is equivalent to 0.0f. However, early platforms only set the least significant byte, leaving the other bits undefined. So, we must disable the optimization on those platforms. Oddly, Sandybridge wasn't reported as broken. The PRM states that it only sets the LSB, but the internal documentation says that it follows the IVB behavior. Since it wasn't reported as broken, we believe it really does follow the IVB behavior. v2: Allow the optimization on Sandybridge (requested by Matt). +32 piglits on Ironlake. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79963 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-22 11:40:32 -07:00
Matt Turner	b8aa1005c8	i965/fs: Preserve CFG in predicated break pass. Operating on this code, B0: ... cmp.ne.f0(8) (+f0) if(8) B1: break(8) B2: endif(8) We can delete B2 without attempting to merge any blocks, since the break/continue instruction necessarily ends the previous block. After deleting the if instruction, we attempt to merge blocks B0 and B1. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-22 10:23:34 -07:00
Matt Turner	3c4c2a6e30	i965/fs: Rename variable in predicated break pass. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-22 10:23:34 -07:00
Matt Turner	1db74a423f	i965/fs: Preserve CFG in the SEL peephole. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-22 10:23:34 -07:00
Matt Turner	81755bc67b	i965: Preserve CFG when deleting dead control flow. This pass deletes an IF/ELSE/ENDIF or IF/ENDIF sequence, or the ELSE in an ELSE/ENDIF sequence. In the typical case (where IF and ENDIF aren't the only instructions in their basic blocks), we can simply remove the instructions (implicitly deleting the block containing only the ELSE), and attempt to merge blocks B0 and B2 together. B0: ... (+f0) if(8) B1: else(8) B2: endif(8) ... If the IF or ENDIF instructions are the only instructions in their respective basic blocks (which are deleted by the removal of the instructions), we'll want to instead merge the next blocks. Both B0 and B2 are possibly removed by the removal of if & endif. Same situation for if/endif. E.g., in the following example we'd remove blocks B1 and B2, and then attempt to combine B0 and B3. B0: ... B1: (+f0) if(8) B2: endif(8) B3: ... Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-22 10:23:34 -07:00
Matt Turner	9cf06e27e1	i965/cfg: Add functions to combine basic blocks. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-22 10:23:34 -07:00
Matt Turner	5e6ead5e8b	i965/cfg: Point to bblock_t containing associated control flow ... rather than pointing directly to the associated instruction. This will let us set the block containing the IF statement's else-pointer to NULL, when we delete a useless ELSE instruction, as in the case (+f0) if(8) ... else(8) endif(8) Also, remove the pointer to the ENDIF, since it's unused, and it was also potentially wrong, in the case of a basic block containing both an ENDIF and an IF instruction: endif(8) cmp.ne.f0(8) ... (+f0) if(8) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-22 10:23:34 -07:00
Matt Turner	2a98ebd42b	i965/fs: Preserve CFG in register allocation. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-22 10:23:34 -07:00
Matt Turner	20a849b4aa	i965: Use basic-block aware insertion/removal functions. To avoid invalidating and recreating the control flow graph. Also stop invalidating the CFG in places we didn't add or remove an instruction. cfg calculations: 202951 -> 80307 (-60.43%) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-22 10:23:34 -07:00
Matt Turner	e0aa45768c	i965: Add invalidate_cfg parameter to invalidate_live_intervals(). Will let us avoid invalidating the CFG if the optimization pass has removed instructions using the new basic block methods. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-22 10:23:34 -07:00
Matt Turner	3d6d4dc6f7	i965: Add basic-block aware backend_instruction::insert_* methods. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-22 10:23:34 -07:00
Matt Turner	dc527fbf7d	i965: Add a basic-block aware backend_instruction::remove method. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-22 10:23:33 -07:00
Matt Turner	240adc1346	i965/cfg: Add a function to remove a block from the cfg. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-22 10:23:33 -07:00
Matt Turner	b7d50beea4	i965/cfg: Add functions to test if a block is a successor/predecessor. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-22 10:23:33 -07:00
Eric Anholt	e51e20c35e	vc4: Add support for fragment discards. Fixes piglit glsl-fs-discard-01 and -03, and allows a lot of mesa demos to start running. glsl-fs-discard-02 has a problem where the first tile is not getting stored on the first render.	2014-08-22 10:16:58 -07:00
Eric Anholt	0f894b2795	vc4: Make some helpers for setting condition codes in instructions.	2014-08-22 10:16:58 -07:00
Eric Anholt	cc68be2620	vc4: Avoid using undefined values when there's no color write. The simulator assertion fails when you read-before-write a temporary value, and there's no point in doing the packing if there was no color written.	2014-08-22 10:16:58 -07:00
Eric Anholt	ae83955b1d	vc4: Emit the scoreboard wait just when it's needed. This should improve performance on real hardware by allowing more shader instances to run in parallel. It also fixes assertion failures in tests that don't emit a fragment color, since otherwise we didn't have enough instructions to fit our signals in.	2014-08-22 10:16:58 -07:00
Eric Anholt	c3c922289b	vc4: Fix FLR for integer values less than 0. If we didn't truncate at all, then we don't need to fix for truncation happening in the wrong direction. Fixes piglit builtin-functions/-floor-	2014-08-22 10:16:57 -07:00
Eric Anholt	2ab4e48f94	vc4: Fix totally broken assertions about inter-instruction reg conflicts. The spec citation talked about A and B, and I proceeded to pay no attention to whether the waddrs were for A or B. As a result, this pair of instructions would claim to conflict: mov ra4, ra4 ; nop nop, r0, r0 mov.ns ra4, rb4 ; nop nop, r0, r0	2014-08-22 10:16:57 -07:00
Eric Anholt	b064c9103d	vc4: Add support for all the texture and FBO formats we can. Now that tiling is in place, we can expose the other formats. Depth is still broken (need to make changes in the shader), but if you don't expose it things crash all over. SNORM is dropped, but we could re-add it later with some shader fixes to handle converting between [0,1] and [-1,1].	2014-08-22 10:16:57 -07:00
Eric Anholt	3a1efcc7f9	vc4: Add support for texture tiling. This still treats everything as RGBA8888 for the most part, same as before. This is a prerequisite for handling other texture formats, since only RGBA8888 has a raster-layout mode.	2014-08-22 10:16:57 -07:00
Eric Anholt	1b6dcaf40c	vc4: Fix a typo in the validation for miplevels. It meant that LUMALPHA was being marked as many miplevels, and unsurprisingly wouldn't validate. On the other hand, some miplevel counts wouldn't get the small mips validated at all.	2014-08-22 10:16:57 -07:00
Eric Anholt	74ea87cde4	vc4: Convert to using an enum for texture data types	2014-08-22 10:16:57 -07:00
Eric Anholt	1cb5cfba85	vc4: Stop complaining about unknown texture channel types. It doesn't matter to this code -- the sampler always returns 8-bit unorm rgba.	2014-08-22 10:16:57 -07:00
Eric Anholt	b0a1e401a9	vc4: Include stdio/stdlib in headers so I don't have to include it per file. There are a few tools I want to have always available, and fprintf() and abort() are among them.	2014-08-22 10:16:57 -07:00
Matt Turner	d77f5603a5	i965: Fix JIP/UIP calculations. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82846 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82929	2014-08-22 09:30:03 -07:00
Aaron Watry	2a553e4dc9	st/clover: Change platform name from Default to Clover Signed-off-by: Aaron Watry <awatry at gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-08-22 10:02:31 -05:00
Emil Velikov	e7f2f2dea5	dri/radeon: nuke the remaining references to sarea Remainder of the dri1 times. Cc: Marek Olšák <marek.olsak@amd.com> Cc: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-21 21:47:44 +01:00
Emil Velikov	515ffb6c93	dri/radeon: cleanup the radeon_context vtbl Remove the set-but-unused, and set-but-empty vtable entries. Most likely a leftover from the dri1 days. Cc: Marek Olšák <marek.olsak@amd.com> Cc: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-21 21:47:40 +01:00
Emil Velikov	dd46f0926d	include: move sarea.h next to its only user The header is used by DRI1 drivers, which we've removed a while back. Now only the dri1 loader in libGL is using it, so let's move it in src/glx, and prefix it accordingly. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-21 21:47:37 +01:00
Emil Velikov	7550a24fa6	dri/radeon: drop obsolete radeon_{dri,macros}.h headers Both have been unused for at least a couple of years. For example the last user of radeon_macros.h was removed with commit `8c11f0a883` Author: Eric Anholt <eric@anholt.net> Date: Fri Oct 14 13:27:02 2011 -0700 radeon: Drop the legacy BO manager code. Cc: Marek Olšák <marek.olsak@amd.com> Cc: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-21 21:47:22 +01:00
Vinson Lee	1748ea8b2b	SCons: Rename dri2_query_renderer.c to dri_common_query_renderer.c. Fix SCons build error introduced with commit `3fe7daec14`. Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-08-21 12:22:18 -07:00
Connor Abbott	06ef631573	glsl/linker: pass through the is_intrinsic flag This flag was set to true for the atomic counter intrinsics, but it never got plumbed through the linker, so by the time it got to the backends it would always be set to false. The current i965 backend code doesn't use is_intrinsic, so this should not change any existing code, but it's useful for codepaths that want to distinguish between intrinsics and non-intrinsics without using strcmp. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Connor Abbott <connor.abbott@intel.com>	2014-08-21 11:46:13 -07:00
Carl Worth	619505ac7c	docs: Update instructions for creating a release This captures all of the steps I have been following in making releases for the past year or so. This way, the instructions should be sound for anyone who would like to take over the release process going forward.	2014-08-21 10:46:02 -07:00
Roland Scheidegger	eb4541ebaf	llvmpipe: change LP_MAX_SHADER_INSTRUCTIONS definition This change will double cache size for branches which have a lower LP_MAX_SHADER_VARIANTS limit (it will not do anything on master). The reason is that nowadays shaders tend to be quite a bit larger than they were (they were big when llvmpipe didn't have a fs loop, got much smaller with that loop, and since then have gradually increased quite a bit though still smaller than without the fs loop for various reasons - among them being d3d10 compliance, usage of 8-wide vectors, non-swizzled blend code). Thus effectively fewer shaders would be cached (unless they were very small and the variant limit was hit first). Also, since we're getting rid of the IR nowadays, the cached shaders shouldn't need all that much memory actually.	2014-08-21 19:00:29 +02:00
Carl Worth	399b4e2227	docs: Add my notes on stable-branch patch criteria This captures the set of rules I have been using for stable-branch management, (starting with a discussion on the mesa-dev mailing list on July 2013, and then refined through my own experience of performing stable-branch releases since then).	2014-08-21 09:46:57 -07:00
Carl Worth	46d03d37bf	Makefile: Switch from md5sums to sha256sums We switched to these several stable releases ago, (since the MD5 algorithm has been broken for some time), but only now did I get around to fixing this in the Makefile rather than just performing this step manually. CC: "10.2 10.3" <mesa-stable@lists.freedesktop.org>	2014-08-21 09:05:01 -07:00
Jon TURNEY	3fe7daec14	glx: Fix build since `679c2ef` "glx/drisw: add support for DRI2rendererQueryExtension", when only building drisw renderer v2: - Move dri_query_renderer_ into their respective dri*_priv.h headers - Drop then unneeded include of dri2.h from dri2_query_renderer.c - Rename dri2_query_renderer.c as dri_common_query_renderer.c, as its contents now are used for more than dri[23] Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-21 16:59:48 +01:00
Carl Worth	ea565108ae	Increment version to 10.4.0-devel Now that the 10.3 branch has been created	2014-08-21 08:38:24 -07:00
Alex Deucher	153df68834	radeonsi: add new SI pci ids Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: mesa-stable@lists.freedesktop.org	2014-08-21 11:16:15 -04:00
Alex Deucher	f50b6b4895	radeonsi: add new CIK pci ids Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: mesa-stable@lists.freedesktop.org	2014-08-21 11:13:17 -04:00
Glenn Kennard	0fb221065e	r600g: Fix flat/smooth shade state toggle If only the flat/smooth shade state changed between two render calls the prior code would miss updating the hardware state. Also add check for sprite coord, potentially same type of issue otherwise for it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81967 Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-08-21 16:20:38 +02:00
Tom Stellard	bf7a60f41d	r600g/compute: Don't initialize vertex_buffer_state masks to 0x2 cs_vertex_buffer_state.enabled_mask and cs_vertex_buffer_state.dirty_mask are both updated when r600_set_constant_buffer() is called, so we don't need to manually update these values. This fixes a crash with OpenCL programs that have a kernel with no arguments. https://bugs.freedesktop.org/show_bug.cgi?id=82671 CC: "10.2" <mesa-stable@lists.freedesktop.org>	2014-08-21 06:12:12 -07:00
Tom Stellard	a9f0b08bac	r600g/compute: Use the first parameter in evergreen_set_global_binding()	2014-08-21 06:12:12 -07:00
Tom Stellard	43d954342e	pipe-loader: Fix memory leak v2 v2: - Change driver_name to char* Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> CC: "10.2" <mesa-stable@lists.freedesktop.org>	2014-08-21 06:12:12 -07:00
Tom Stellard	8109664ded	radeon: Add work-around for missing Hainan support in clang < 3.6 v2 v2: - Add missing break. https://bugs.freedesktop.org/show_bug.cgi?id=82709 CC: "10.2" <mesa-stable@lists.freedesktop.org>	2014-08-21 06:12:11 -07:00
Michel Dänzer	3ba225c1ab	st/clover: Fix build against LLVM SVN >= r215967 v2 v2: Tom Stellard - Properly destroy the Module Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-08-21 07:55:23 -04:00
Kenneth Graunke	d682ebec0b	i965,meta: Stop unlocking the texture to try and prevent deadlocks. Unlocking the texture is not safe: another thread could come in and grab it. Now that we use a recursive mutex, this should work. This also fixes texture lock deadlocks in the new meta fast clear path. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Tested-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-20 17:31:19 -07:00
Kenneth Graunke	0871028188	mesa: Use a recursive mutex for the texture lock. This avoids problems with things like meta operations calling functions that want to take the lock while the lock is already held. Basically, the point is to guard against API reentrancy across threads...not to guard against ourselves. Dave Airlie opposed this change, but it makes master usable again and no one proposed a better solution. We can revert this if/when someone does. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Tested-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-20 17:31:19 -07:00
Carl Worth	f90b7e0f2b	glcpp: Fix glcpp-test-cr-lf "make check" test for Mac OS X There were two problems with the way this script used sed on OS X: 1. The OS X sed doesn't interpret "\r" in a replacement list as a carriage-return character, (instead it was inserting a literal 'r' character). We fix this by putting an actual ^M character into the source of the script, (rather than a two-character escape sequence hoping for sed to do the right thing). 2. When generating the test files with LF-CR ("\n\r") newlines, the OS X sed was adding an undesired final newline ("\n") at the end of the file. We avoid this by first using sed to add the ^M before the newlines, then using tr to swap the \r and \n characters. This way, sed never sees any lines ending with anything but \n, so it doesn't get confused and doesn't add any bogus extra newlines. Tested-by: Vinson Lee <vlee@freedesktop.org> Vinson's testing confirmed that this patch fixes FreeBSD as well.	2014-08-20 16:42:46 -07:00
Carl Worth	c09a8b0e3b	glcpp: Use printf instead of "echo -n" in glcpp-test I noticed that with /bin/sh on Mac OS X, "echo -n" does not work as desired, (it actually prints "-n" rather than suppressing the final newline). There is a /bin/echo that could be used (it actually works) instead of the builtin echo. But I decided it's more robust to just use printf rather than hardcoding /bin/echo into the script.	2014-08-20 16:41:38 -07:00
Matt Turner	04895f5c60	i965/vec4: Allow reswizzling writemasks when swizzle is single-valued. total instructions in shared programs: 4288033 -> 4266151 (-0.51%) instructions in affected programs: 930915 -> 909033 (-2.35%)	2014-08-20 13:01:18 -07:00
Jon TURNEY	bde2a62af7	Teach os_get_total_physical_memory about Cygwin Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-20 17:18:39 +01:00
Michel Dänzer	cd765cf7ee	r300g: Fix path to test programs for out-of-tree builds Fixes make check in that case. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-08-20 16:12:51 +09:00
Vinson Lee	c04a6d5c29	gallivm: Fix build with LLVM >= 3.6 r215967. This LLVM 3.6 commit changed EngineBuilder constructor. commit 3f4ed32b4398eaf4fe0080d8001ba01e6c2f43c8 Author: Rafael Espindola <rafael.espindola@gmail.com> Date: Tue Aug 19 04:04:25 2014 +0000 Make it explicit that ExecutionEngine takes ownership of the modules. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215967 91177308-0d34-0410-b5e6-96231b3b80d8 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-20 15:24:44 +09:00
Timothy Arceri	a1853eaea7	glsl: Use the without_array predicate in some more places Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>	2014-08-19 17:44:06 -07:00
Kristian Høgsberg	e6a53533b7	i965: Flush the RC and TC before doing a fast clear resolve The docs say "When performing a render target resolve, PIPE_CONTROL with end of pipe sync must be delivered.", which doesn't actually tell us whether we need to do it before or after. Blorp did it before and after, and doing it before certainly makes sense. The resolve operation needs to read from the MCS and if we don't flush the render cache it won't get up-to-date data. On the other hand, doing it after should not be necessary, since we call brw_render_cache_set_check_flush() after the resolve. Fixes rendering corruption in kwin's cover switch effect and various steam games. Missing flush spotted by Ken. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2014-08-19 17:21:39 -07:00
Carl Worth	8791cfedde	docs: Import 10.2.6 release notes, add news item.	2014-08-19 15:21:09 -07:00
Chris Forbes	1c4f141a54	docs: Mark off ARB_conditional_render_inverted for i965 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-20 07:49:17 +12:00
Chris Forbes	06ca96daad	i965: Enable ARB_conditional_render_inverted on Gen6+. The extension requires GL 3.0, so enable on just the generations exposing that. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-20 07:49:17 +12:00
Chris Forbes	3f8ad32627	mesa: Add support for inverted s/w conditional rendering Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-20 07:49:17 +12:00
Matt Turner	9a071e3339	i965/vec4: Add a pass to reduce swizzles. total instructions in shared programs: 4344280 -> 4288033 (-1.29%) instructions in affected programs: 397468 -> 341221 (-14.15%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-19 12:37:11 -07:00
Eric Anholt	5833680e7a	vc4: Plumb the texture index from TGSI through to the sampler uniforms. This commit and the last one fix ARB_fragment_program/sparse-samplers and 6 other tests.	2014-08-19 08:47:13 -07:00
Eric Anholt	c8097afe29	vc4: Avoid a null-deref if a sampler index isn't used. Part of fixing ARB_fragment_program/sparse-samplers	2014-08-19 08:47:13 -07:00
Brian Paul	31ce84a81f	mesa: fix NULL pointer deref bug in _mesa_drawbuffers() This is a follow-on fix to commit `39b40ad144`. Fixes a crash if the user calls glDrawBuffers(0, NULL). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82814 Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-19 09:29:16 -06:00
Glenn Kennard	dfa10ed264	r600g: Fix missing SET_TEXTURE_OFFSETS SB needs a bit of special handling to handle instructions without obvious side effects, to avoid it deleting them. Fixes failing non-const ARB_gpu_shader5 textureOffsets piglits with sb enabled. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-08-19 16:30:13 +02:00
Alexander von Gluck IV	ef1cf69cd3	gallium/target: Add needed mesautil lib to haiku-softpipe Acked-by: Brian Paul <brianp@vmware.com>	2014-08-19 10:03:05 -04:00
Alexander von Gluck IV	8cbf01f12a	gallium/aux: Fill in Haiku get process name code Acked-by: Brian Paul <brianp@vmware.com>	2014-08-19 10:03:05 -04:00
Alexander von Gluck IV	82c23dd962	haiku/swrast: Add missing src include search path for missing util/macros.h Acked-by: Brian Paul <brianp@vmware.com>	2014-08-19 10:03:05 -04:00
Tobias Klausmann	eed8b19aac	docs: Update status of ARB_conditional_render_inverted Done for: nvc0, softpipe and llvmpipe Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2014-08-19 09:02:29 -04:00
Tobias Klausmann	544c54114a	llvmpipe/softpipe: enable ARB_conditional_render_inverted Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2014-08-19 09:02:29 -04:00
Tobias Klausmann	a2fc85f5d0	nvc0: Handle ARB_conditional_render_inverted and enable it Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2014-08-19 09:02:29 -04:00
Tobias Klausmann	7a48858fcb	mesa/st: Support ARB_conditional_render_inverted modes Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2014-08-19 09:02:29 -04:00
Tobias Klausmann	fd5edee700	gallium: Add and handle PIPE_CAP_CONDITIONAL_RENDER_INVERTED Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2014-08-19 08:54:35 -04:00
Tobias Klausmann	64cc1876fa	mesa: add ARB_conditional_render_inverted flags Also add an extension bit so we can safely enable Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2014-08-19 08:54:35 -04:00
Tobias Klausmann	1a51751e93	glapi: add GL_ARB_conditional_render_inverted Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2014-08-19 08:54:35 -04:00
Chia-I Wu	58511b62c4	ilo: fix PIPE_CAP_VIDEO_MEMORY I changed Emil's patch in `f921131a5c` to report raw values in the winsys, but forgot to convert the values to megabytes in the pipe driver.	2014-08-19 19:56:07 +08:00
Chia-I Wu	17401896dd	ilo: enable HiZ in more cases on GEN6 With layer offsetting killed, we no longer need to restrict HiZ to non-mipmapped and non-array depth buffers.	2014-08-19 19:53:37 +08:00
Chia-I Wu	5b4fc5f156	ilo: remove layer offsetting Follow i965 to kill layer offsetting for GEN6.	2014-08-19 19:53:37 +08:00
Chia-I Wu	fb3d506431	ilo: migrate to ilo_layout Embed an ilo_layout in ilo_texture, and remove now duplicated members.	2014-08-19 19:53:37 +08:00
Chia-I Wu	925359bc78	ilo: add new resource layout code Based on the old code, the new layout code describes the layout with the new, well-documented, ilo_layout. It also gains new features such as MCS support and extended ARYSPC_LOD0 that i965 comes up with (see `6345a94a9b`).	2014-08-19 19:53:37 +08:00
Niels Ole Salscheider	5ae9bdafd4	gallium/radeon: Do not use u_upload_mgr for buffer downloads Instead create a staging texture with pipe_buffer_create and PIPE_USAGE_STAGING. u_upload_mgr sets the usage of its staging buffer to PIPE_USAGE_STREAM. But since `150ac07b85` CPU -> GPU streaming buffers are created in VRAM. Therefore the staging texture (in VRAM) does not offer any performance improvements for buffer downloads. Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-08-19 12:56:04 +02:00
Marek Olšák	498dc676ea	r600g: copy IA_MULTI_VGT_PARAM programming from radeonsi for Cayman Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-19 12:20:18 +02:00
Marek Olšák	f62f88274a	radeonsi: bump PRIMGROUP_SIZE for some cases Recommended by hw people. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-19 12:20:18 +02:00
Marek Olšák	4be7ff5567	radeonsi: set PARTIAL_VS_WAVE(0) when appropriate Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-19 12:20:18 +02:00
Marek Olšák	94e474f3c3	radeonsi: set IA_MULTI_VGT_PARAM on SI the same as on CIK (v2) Nothing's changed for CIK here. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-19 12:20:18 +02:00
Marek Olšák	a333309979	radeonsi: simplify si_num_banks function This makes it easier to use. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-19 12:20:18 +02:00
Marek Olšák	db51ab6d6a	radeonsi: use r600_draw_rectangle from r600g Rectangles are easier than triangles for the rasterizer. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-19 12:20:18 +02:00
Marek Olšák	7792f9858b	radeonsi: save scissor state and sample mask for u_blitter Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-19 12:20:18 +02:00
Marek Olšák	25633c85e1	radeonsi: don't set CB_SHADER_MASK=1 if there are no color outputs This hack isn't needed anymore because of the previous u_blitter commit. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-19 12:20:18 +02:00
Marek Olšák	a6fcdbf560	gallium/u_blitter: don't use an empty fragment shader if there's a colorbuffer This is custom code used by some drivers. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-19 12:20:18 +02:00
Marek Olšák	406ab1662c	gallium/util: handle PIPE_BUFFER in util_pipe_tex_to_tgsi_tex Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-19 12:20:18 +02:00
Marek Olšák	8db7dacf29	rbug: only add textures to the list rbug-gui cannot display buffers, so it's pointless to add them. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-19 12:20:18 +02:00
Marek Olšák	ddcbe9c526	rbug: fix a crash in sampler_view_destroy caused by incorrect context Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-19 12:20:18 +02:00
Marek Olšák	ba81a3784b	rbug: send the actual number of layers to the client This sends the correct value for array textures. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-19 12:20:18 +02:00
Marek Olšák	90d10f467f	rbug: implement streamout context functions Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-19 12:20:18 +02:00
Marek Olšák	b7b1ad9c6c	rbug: fix crash in set_vertex_buffers Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-19 12:20:17 +02:00
Marek Olšák	4a3f156dd1	rbug: remove contexts from the list properly Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-19 12:20:17 +02:00
Emil Velikov	f921131a5c	ilo: fold drm_intel_get_aperture_sizes() within probe_winsys() ... and store the value in intel_winsys_info/ilo_dev_info. Suggested-by: Chia-I Wu <olvaffe@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> olv: check for errors and report raw values	2014-08-19 17:45:00 +08:00
Matt Turner	a4359bcaa5	i965/cfg: Add a foreach_block_and_inst_safe macro. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-18 19:08:53 -07:00
Matt Turner	26624b85e7	i965/cfg: Add a foreach_inst_in_block_safe macro. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-18 19:05:59 -07:00
Matt Turner	c51b0861e4	i965/cfg: Add a foreach_block_safe macro. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-18 19:05:59 -07:00
Matt Turner	a3d0ccb037	i965: Pass a cfg pointer to generate_{code,assembly}. The loop over all instructions is now two-fold, over all of the blocks and all of the instructions in each block. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-18 19:05:59 -07:00
Matt Turner	596990d91e	i965: Add and use foreach_block macro. Use this as an opportunity to rename 'block_num' to 'num'. block->num is clear, and block->block_num has always been redundant.	2014-08-18 18:56:30 -07:00
Matt Turner	d688667c7f	i965/cfg: Embed link in bblock_t for main block list. The next patch adds a foreach_block (block, cfg) macro, which works better if it provides a direct bblock_t pointer, rather than a bblock_link pointer that you have to use to find the actual block. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-18 18:56:30 -07:00
Matt Turner	19c6617adf	i965/fs: Optimize gl_FrontFacing calculation on Gen4/5. Doesn't use fewer instructions, but it does avoid writing the flag register and if we want to switch the representation of true for Gen4/5 in the future, we can just delete the AND instruction.	2014-08-18 18:35:56 -07:00
Matt Turner	d1c43ed487	i965/fs: Optimize gl_FrontFacing calculation on Gen6+. total instructions in shared programs: 4288650 -> 4282838 (-0.14%) instructions in affected programs: 595018 -> 589206 (-0.98%) Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-08-18 18:35:54 -07:00
Matt Turner	2e51dc838b	i965: Use ~0 to represent true on Gen >= 6. total instructions in shared programs: 4292303 -> 4288650 (-0.09%) instructions in affected programs: 299670 -> 296017 (-1.22%) Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-08-18 18:35:53 -07:00
Matt Turner	cc60a487d1	i965/fs: Optimize emit_bool_to_cond_code for logical exprs. AND, OR, and XOR can generate the conditional code directly. total instructions in shared programs: 4293335 -> 4292303 (-0.02%) instructions in affected programs: 121408 -> 120376 (-0.85%) Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-08-18 18:35:53 -07:00
Matt Turner	2a6b6621d8	i965: Use UniformBooleanTrue value for boolean literal true. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-08-18 18:35:49 -07:00
Matt Turner	9e2e7c7dc0	glsl: Use UniformBooleanTrue value for uniform initializers. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-08-18 18:35:48 -07:00
Matt Turner	6df0fd8fe9	mesa: Upload boolean uniforms using UniformBooleanTrue. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-08-18 18:35:47 -07:00
Matt Turner	e0f955abd3	i965: Remove dead call to _mesa_associate_uniform_storage(). Dead since the call to _mesa_generate_parameters_list_for_uniforms was removed in commit `12751ef2`. So this was why all of that code that was supposed to fix up the value of a uniform bool wasn't happening. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-08-18 18:35:41 -07:00
Matt Turner	e87106d153	mapi: Inline shared-glapi/tests/Makefile. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-08-18 18:27:20 -07:00
Matt Turner	7172f02d7c	mapi: Inline glapi/tests/Makefile. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-08-18 18:27:16 -07:00
Matt Turner	9dbb0f49b6	mapi: Inline glapi/Makefile. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-08-18 18:25:52 -07:00
Matt Turner	dff5a219d0	mapi: Inline es2api/Makefile. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-08-18 18:25:29 -07:00
Matt Turner	18ef5136b6	mapi: Inline es1api/Makefile. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-08-18 18:25:02 -07:00
Matt Turner	c3ce1a942f	mapi: Inline shared-glapi/Makefile.	2014-08-18 18:24:09 -07:00
Matt Turner	4ccd2a9f9b	build: Let install-lib-links.mk handle .la files in subdirectories. The next patches are going to combine some of the mapi subdirectories' Makefiles into a single Makefile, giving better build parallelism. lib_LTLIBRARIES will be set to something like lib_LTLIBRARIES = shared-glapi/libglapi.la es2api/libGLESv2.la and the current code in install-lib-links.mk simply prepends .libs/ and replaces the .la in order to create the filenames that it needs to ln/cp into the LIBDIR. This doesn't work when the .la file is actually in a subdirectory. This patch fixes this and puts .libs/ in the right place. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-18 18:22:40 -07:00
Matt Turner	45eb065668	i965: Enable instruction compaction on Gen8+. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-18 18:18:50 -07:00
Matt Turner	31eed95b22	i965: Add support for compacting 3-src instructions on Gen8. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-18 18:18:50 -07:00
Matt Turner	fb1db6753f	i965: Add support for compacting 1- and 2-src instructions on Gen8. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-18 18:18:50 -07:00
Matt Turner	3904d404a3	i965/gen8: Add 3-src instruction compaction tables. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-18 18:18:50 -07:00
Matt Turner	190ce6b093	i965/gen8: Add instruction compaction tables. Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-18 18:18:50 -07:00
Matt Turner	2faa1a414c	i965: Update JIP/UIP compaction code to operate on bytes. JIP/UIP were previously in units of compacted instructions. On Gen8 they're in units of bytes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-18 18:18:50 -07:00
Matt Turner	23ab55cb6c	i965: Reverse condition ordering to let us support other gens. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-18 18:18:50 -07:00
Matt Turner	6cc6c3b647	i965/disasm: Add CSEL.	2014-08-18 18:18:50 -07:00
Timothy Arceri	39a920c0cb	mesa: fix copy and paste errors in glBindVertexBuffers Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-08-19 10:19:18 +10:00
Tobias Klausmann	9100c359ac	nv50/ir: (trivial) initialize pointer to silence warning Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>	2014-08-18 19:41:26 -04:00
Eric Anholt	76f687d5a5	vc4: Add support for swizzling of texture colors. Fixes swapped colors on the copypix demo and some piglit tests like pbo-teximage-tiling.	2014-08-18 15:27:43 -07:00
Eric Anholt	489350e570	vc4: Fix handling of non-XYZW swizzles in color outputs. The SWIZZLE_1 of the winsys destination was dereffing off the end of the array, which surprisingly often worked out (since nobody reads the rendered value anyway, so whatever junk was referenced in the QIR didn't matter), but shader dumping would sometimes segfault.	2014-08-18 15:27:43 -07:00
Eric Anholt	37992a4e39	vc4: Extract the swizzle handling from vertex fetch. I want to reuse this elsewhere, and NONE debug output hasn't been useful so I don't miss it being as detailed as it was before.	2014-08-18 15:27:43 -07:00
Eric Anholt	c1db622215	vc4: Add support for color masking. This gets fbo-colormask-formats working for core formats, which increases my confidence in some of the swizzle and blend handling.	2014-08-18 15:27:43 -07:00
Eric Anholt	50b4293eb3	vc4: Add a helper for QOP_R4_UNPACK_[ABCD].	2014-08-18 15:27:43 -07:00
Eric Anholt	8795341e2c	vc4: Don't forget to set up the offset for render targets. This almost fixes fbo-generatemipmap rendering, except that the 1x1 level isn't getting rendered.	2014-08-18 15:27:43 -07:00
Eric Anholt	63fe494877	vc4: Fix multi-level texture setup. We weren't accounting for the level 0 offset in the texture setup (so it only worked if it happened to be a single-level texture), and doing so required that we get the level 0 offset page aligned so that the offset bits don't get interpreted as the texture format and such.	2014-08-18 15:27:43 -07:00
Eric Anholt	a538bab065	vc4: Fix viewport handling in the uniforms upload. I had the right viewports in vc4_emit.c, but grabbed the wrong values in the uniform setup, so primitives would claim to be in the wrong parts of the screen. (The vc4_emit.c state looks like it just decides how big the clipping guardband is). This gets fbo-viewport closer to working (which still has the problem that the HW is always guard-band clipping), and fixes inverted FBO rendering in general.	2014-08-18 15:27:43 -07:00
Marek Olšák	082d8c54c1	docs/relnotes: document GLX_MESA_query_renderer	2014-08-19 00:26:41 +02:00
Francisco Jerez	e9a4e74926	clover: Refuse to build a program if there are kernel objects attached to it. Fixes piglit cl-api-build-program. Tested-by: EdB <edb+mesa@sigluy.net>	2014-08-18 09:32:24 +03:00
Francisco Jerez	c6817f19f6	clover/util: Pass initial count value to ref_counter constructor. And mark the ref_count() method as const. Tested-by: EdB <edb+mesa@sigluy.net>	2014-08-18 09:32:24 +03:00
Francisco Jerez	37e4d22e95	clover/util: Implement minimalist reference to clover::ref_counter object. Tested-by: EdB <edb+mesa@sigluy.net>	2014-08-18 09:32:24 +03:00
EdB	ce4d3f3104	clover: clGetProgramInfo support for OpenCL 1.2. [ Francisco Jerez: Rework using fold() for conciseness. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net> Tested-by: EdB <edb+mesa@sigluy.net>	2014-08-17 23:34:11 +03:00
Ilia Mirkin	ef130b6050	nouveau: don't keep stale pointer to free'd data If ->sys is non-null, we might decide that it's where the data is stored. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2014-08-16 17:52:54 -04:00
Ilia Mirkin	1f4bc0c95e	egl: don't exit process on initialization failure Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-08-16 17:52:54 -04:00
Brian Paul	9d9879abed	mesa: fix compressed_subtexture_error_check() return value The function should return GLboolean, not GLenum. If we detect invalid compressed pixel storage parameters, we should return GL_TRUE, not GL_FALSE so that the function is no-op'd. An update to the piglit s3tc-errors test will check this. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-16 06:48:45 -06:00
Brian Paul	cf8b680f40	mesa: move _mesa_compressed_texture_pixel_storage_error_check() to pixelstore.c, add const qualifier to the 'packing' parameter. Add comments. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-16 06:48:44 -06:00
Brian Paul	9b4c6da7f0	mesa: minor improvements to _mesa_compute_compressed_pixelstore() Replace the gl_texture_image parameter with mesa_format since we only used the image's format. Add some comments. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-16 06:48:44 -06:00
Brian Paul	1e594d4f5c	util: whitespace and formatting fixes in u_math.h Trivial.	2014-08-16 06:48:44 -06:00
Ilia Mirkin	8867ffbf95	nouveau: make sure to invalidate any vbo state as well Reviewed-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2014-08-16 02:33:12 -04:00
Jordan Justen	a1dca7069b	i965/gen6: Force ALL_SLICES_AT_EACH_LOD for separate stencil/hiz For gen6 we will use the ALL_SLICES_AT_EACH_LOD miptree layout for separate stencil/hiz. This is needed because gen6 hiz and separate stencil only support a single miplevel. When accessing the other LODs, we will program a tile aligned offset for the bo. PRM Volume 1, Part 1, 7.18.3.7.2 For separate stencil buffer [DevILK] to [DevSNB]: "The separate stencil buffer does not support mip mapping, thus the storage for LODs other than LOD 0 is not needed." We still allocate storage for the other stencil mip-levels within a single texture, but each mip-level will use non-mip-array spacing. PRM Volume 2, Part 1, 7.5.3 Hierarchical Depth Buffer "[DevSNB]: The hierarchical depth buffer does not support the LOD field, it is assumed by hardware to be zero. A separate hierarchical depth buffer is required for each LOD used, and the corresponding buffer’s state delivered to hardware each time a new depth buffer state with modified LOD is delivered." We allocate storage for the other hiz mip-levels within a single texture, but each mip-level will use non-mip-array spacing. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:42 -07:00
Jordan Justen	31e1beec89	i965/gen6: Stencil/hiz needs an offset for LOD > 0 Since gen6 separate stencil & hiz only supports LOD0, we need to program an offset to the LOD when emitting the separate stencil/hiz. v3: * Use new array_layout enum Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:42 -07:00
Jordan Justen	b3d68d5a30	i965/gen6: Force tile alignment for each stencil/hiz LOD Gen6 doesn't support multiple miplevels for hiz and stencil. Therefore, we must point to the LOD directly during rendering. But, we also have removed the tile offsets from normal depth surfaces, so we need to align each LOD to a tile boundary for hiz and stencil. v3: * Use new array_layout enum Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:42 -07:00
Jordan Justen	6345a94a9b	i965: Support array_layout == ALL_SLICES_AT_EACH_LOD for multiple LODs Previously array_layout ALL_SLICES_AT_EACH_LOD was only used for array spacing lod0 on gen7+ and therefore was only used with a single mip level. gen6 separate stencil & hiz only support LOD0, so we need to allocate the miptree similar to gen7+ array spacing lod0, except we also need space for multiple mip levels. (Since OpenGL stencil and depth support multiple LODs.) The miptree is allocated with tightly packed array slice spacing, but we still also pack the miplevels into the region similar to a normal multi mip level packing. A 2D Array texture with 2 slices and multiple LODs would look somewhat like this: +----------+ \| \| \| \| +----------+ \| \| \| \| +----------+ +---+ +-+ \| \| +-+ +---+ +-+ \| \| : +---+ v3: * Use new array_layout enum * ASCII art! Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:41 -07:00
Jordan Justen	27f5fa7a37	i965: Allow forcing miptree->array_layout = ALL_SLICES_AT_EACH_LOD gen6 does not support multiple miplevels with separate stencil/hiz. Therefore we need to lay out its miptree with no mipmap spacing between the slices of each miplevel. v3: * Use new array_layout enum Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:41 -07:00
Jordan Justen	7e856d0b18	i965: Change mipmap array_spacing_lod0 to array_layout (enum) We will want to set up gen6 separate stencil and hiz miptrees in a layout that is similar to array_spacing_lod0. This is needed because gen6 hiz and stencil only support a single mip-level. In both use cases (gen7+ LOD0 spacing & gen6 separate stencil/hiz), the array slices will be packed at each LOD without reserving extra space for LODs within each array slice. So, we generalize the name of this field and add comments to indicate the old and new uses. Motivation for the gen6 change comes from the PRM: PRM Volume 1, Part 1, 7.18.3.7.2 For separate stencil buffer [DevILK] to [DevSNB]: "The separate stencil buffer does not support mip mapping, thus the storage for LODs other than LOD 0 is not needed." PRM Volume 2, Part 1, 7.5.3 Hierarchical Depth Buffer "[DevSNB]: The hierarchical depth buffer does not support the LOD field, it is assumed by hardware to be zero. A separate hierarchical depth buffer is required for each LOD used, and the corresponding buffer’s state delivered to hardware each time a new depth buffer state with modified LOD is delivered." v2: * Rename array_spacing_lod0 to non_mip_arrays v3: * Instead, replace array_spacing_lod0 with array_layout enum Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:41 -07:00
Jordan Justen	56cdb55e38	i965/gen6 depth surface: program 3DSTATE_DEPTH_BUFFER to top of surface (`bf25ee2` for gen6) Previously we would always find the 2D sub-surface of interest, and then program the surface to this location. Now we always program the 3DSTATE_DEPTH_BUFFER at the start of the surface. To select the lod/slice, we utilize the lod & minimum array element fields. We also must disable brw_workaround_depthstencil_alignment for gen >= 6. Now the hardware will handle alignment when rendering to additional slices/LODs. v3: * Set depth_mt bo RELOC offset to 0, as was done in `bf25ee2` Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56127 Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:41 -07:00
Jordan Justen	3da13aef01	i965/gen6 fbo: make unmatched depth/stencil configs return unsupported (`f3c886b` for gen6) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:41 -07:00
Jordan Justen	96306a6cbb	i965/gen6 blorp depth: calculate base surface width/height (`e3a49e1` for gen6) This will be used in 3DSTATE_DEPTH_BUFFER in a later patch. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:41 -07:00
Jordan Justen	039eb81abf	i965/gen6 depth surface: calculate minimum array element being rendered (`a23cfb8` for gen6) In layered rendering this will be 0. Otherwise it will be the selected slice. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:41 -07:00
Jordan Justen	cfa19af966	i965/gen6 depth surface: calculate LOD being rendered to (`08ef1dd` for gen6) This will be used in 3DSTATE_DEPTH_BUFFER in a later patch. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:41 -07:00
Jordan Justen	51b38106d7	i965/gen6 depth surface: calculate depth (array size) for depth surface (`bc1acaa` for gen6) This will be used in 3DSTATE_DEPTH_BUFFER in a later patch. Note: Cube maps are treated as 2D arrays with 6 times as many array elements as the cube map array would have. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:41 -07:00
Jordan Justen	48acf19d23	i965/gen6 depth surface: calculate more specific surface type (`171e633` for gen6) This will be used in 3DSTATE_DEPTH_BUFFER in a later patch. Note: Cube maps are treated as 2D arrays with 6 times as many array elements as the cube map array would have. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:41 -07:00
Jordan Justen	11663050eb	i965/gen6_depth_state.c: Remove (gen != 6) code paths Since this code was branched from brw_misc_state.c, it had support for gen != 6. We can now remove this. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:41 -07:00
Jordan Justen	39a5b69985	i965: Split gen6 depth hiz state out from brw We will program the gen6 hiz depth state differently to enable layered rendering on gen6. v2: * Remove unneeded gen6_emit_depthbuffer as suggested by Topi Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:40 -07:00
Jordan Justen	341995e4b5	i965/gen6: Adjust render height in errata case for MSAA In the gen6 PRM Volume 1 Part 1: Graphics Core, Section 7.18.3.7.1 (Surface Arrays For all surfaces other than separate stencil buffer): "[DevSNB] Errata: Sampler MSAA Qpitch will be 4 greater than the value calculated in the equation above, for every other odd Surface Height starting from 1 i.e. 1,5,9,13" Since this Qpitch errata only impacts the sampler, we have to adjust the input for the rendering surface to achieve the same qpitch. For the affected heights, we increment the height by 1 for the rendering surface. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:40 -07:00
Jordan Justen	f063712373	i965/gen6: Add support for layered renderbuffers Rather than pointing the surface_state directly at a single sub-image of the texture for rendering, we now point the surface_state at the top level of the texture, and configure the surface_state as needed based on this. v2: * Use SET_FIELD as suggested by Topi * Simplify min_array_element assignment as suggested by Topi v3: * Use irb->layer_count for depth instead of rb->Depth * Make gl_target const * depth - 1, not depth v4: * Merge in `dd43900b` & `b875f39e` fixes to prevent 3D texture piglit regressions Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 20:11:04 -07:00
Jordan Justen	89b1f5d6ac	i965/gen6_surface_state.c: Remove (gen < 6) code path Since this code was branched from brw_wm_surface_state.c, it had support for gen < 6. We can now remove this. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 17:19:20 -07:00
Jordan Justen	1f8e0fbd38	i965: Split gen6 renderbuffer surface state from gen5 and older We will program the gen6 renderbuffer surface state differently to enable layered rendering on gen6. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 17:19:20 -07:00
Kenneth Graunke	2d1735187d	meta: Use instanced rendering for layered clears. Layered rendering is part of OpenGL 3.2; GL_ARB_draw_instanced is part of OpenGL 3.1. As such, all drivers supporting layered rendering already support gl_InstanceID. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-08-15 16:53:48 -07:00
Kenneth Graunke	ed6a4d6a7d	mesa: Expose vbo_exec_DrawArraysInstanced as _mesa_DrawArraysInstanced. So we can use it in meta.c. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-08-15 16:53:48 -07:00
Dave Airlie	e2594ee882	Revert "hud: don't overrun malloced arrays" This reverts commit `1cfcd0164e`. This seems to cause r600g lockups, https://bugs.freedesktop.org/show_bug.cgi?id=82628 Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-08-16 09:15:19 +10:00
Kristian Høgsberg	14c1a2a94c	i965: Guard access to gl_Layer by extension #ifdef Only assign gl_Layer if we have GL_AMD_vertex_shader_layer. Gen6 doesn't (currently) have that extension, but it also doesn't support layered rendering. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Jordan Justen <jordan.l.justen@intel.com>	2014-08-15 16:09:11 -07:00
Emil Velikov	1e1d285701	gallium/vc4: PIPE_CAP_VIDEO_MEMORY return the amount of system ram Suggested-by: Eric Anholt <eric@anholt.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-08-15 23:42:10 +01:00
Eric Anholt	7c65b714ed	vc4: Add support for blending. Passes blendminmax and blendsquare. glean's more serious blendFunc fails in simulation due to binner memory overflow (I really need to work around that), and fbo-blending-formats fails due to Mesa refusing one of the getter requests, even before it could fail due to the driver not actually supporting different formats yet.	2014-08-15 12:01:32 -07:00
Eric Anholt	f663102585	vc4: Drop incorrect attempt to incorrectly invert the primconvert hw_mask. The hw_mask is the set of primitives you actually support, so this attempt to provide the set of formats that's unsupported was wrong in two ways (it was intended to be '~' not '!'). However, we only call this code when prim isn't one of the actually supported hw_mask bits, so missing out on the memcpy didn't matter anyway.	2014-08-15 12:01:32 -07:00
Eric Anholt	a8f16054ca	vc4: Use cl_f() instead of cl_u32(fui())	2014-08-15 12:01:32 -07:00
Eric Anholt	e6fe6d0694	vc4: Consistently use qir_uniform_f().	2014-08-15 12:01:32 -07:00
Eric Anholt	ba875b3a0d	vc4: Consume the implicit varyings for points and lines. We were triggering simulator assertion failures for not consuming these, and presumably we want to actually make use of them some day (for things like point/line antialiasing). Note that this has the qreg index as 0, which is the same index as the first GL varyings read. This doesn't matter currently, since that number isn't used for anything except dumping.	2014-08-15 12:00:32 -07:00
Eric Anholt	64ad96a9f4	vc4: Move the deref of the color buffer for simulator into the simulator. At some point I'm going to want to move the information necessary for the host buffer upload/download into the BO so that it's independent of the current vc4->framebuffer, but for now this fixes pointless derefs on non-simulator in vc4_context.c since the dump_fbo() removal	2014-08-15 11:52:18 -07:00
Kristian Høgsberg	2f28a0dc23	i965: Implement fast color clears using meta operations This patch uses the infrastructure put in place by previous patches to implement fast color clears and replicated color clears in terms of meta operations. This works all the way back to gen7 where fast clear was introduced and adds support for fast clear on gen8. It replaces the blorp path completely and improves on a few cases. Layered clears are now done using instanced rendering and multiple render-target clears use a MRT shader with rep16 writes. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 11:25:47 -07:00
Kristian Høgsberg	f9dc7aabb3	i965: Add optimization pass to let us use the replicate data message The data port has a SIMD16 'replicate data' message, which lets us write the same color for all 16 pixels by sending the four floats in the lower half of a register instead of sending 4 times 16 identical component values in 8 registers. The message comes with a lot of restrictions and could be made generally useful by recognizing when those restrictions are satisfied. For now, this lets us enable the optimization when we know it's safe, but we don't enable it by default. The optimization works for simple color clear shaders only, but does recognize and support multiple render targets. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 11:25:47 -07:00
Kristian Høgsberg	ba4507576c	meta: Export _mesa_meta_drawbuffers_from_bitfield() We'll use this in the i965 fast clear implementation. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2014-08-15 11:25:47 -07:00
Kristian Høgsberg	5fad83bdf8	mesa: Use _mesa_lock_context_textures in _mesa_GetTexParameterfv() GetTexParameterfv() doesn't change texture state, so instead of _mesa_lock_texture() we can use _mesa_lock_context_textures(), which doesn't increase the texture stamp. With this change, _mesa_update_state_locked() is now only called from under _mesa_lock_context_textures(), which is the right thing to do. Right now it's the same mutex, but if we made texture locking more fine-grained one day, just locking one texture here would be wrong. This all ignores the fact that texture locking seems a bit flaky and broken, but we're trying to not blatantly make it worse. This change allows us to reliably unlock the context textures in the dd::UpdateState callback as is necessary for meta color resolves. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 11:25:25 -07:00
Kristian Høgsberg	388f02729b	i965: Move pre-draw resolve buffers to dd::UpdateState No functional change except for glBegin/glEnd style rendering, where we now do the resolves at glBegin time instead of FLUSH_VERTICES time. This is also the reason for this change, so that when we later switch fast clear resolve to use meta, we won't be doing meta operations in the middle of a begin/end sequence. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 10:33:41 -07:00
Kristian Høgsberg	cf89b29d2f	i965: Provide a context flag to let us enable fast clear GEN7+ has the fast clear functionality, which lets us clear the color buffers using the MCS and a scaled down rectangle. To enable this we have to set the appropriate bits in the 3DSTATE_PS packet. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 10:33:41 -07:00
Kristian Høgsberg	1a05dcb349	i965: Disable clipping when rendering 3DPRIM_RECTLIST primitives The clipper doesn't support clipping 3DPRIM_RECTLIST primitives and must be turned off when we use them. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 10:33:41 -07:00
Kristian Høgsberg	3f0f2c7f7d	i965: Add a mechanism for sending native primitives into the driver The brw_draw_prims() function is the draw entry point into the driver, and takes struct _mesa_prim for input. We want to be able to feed native primitives into the driver, and to that end we introduce BRW_PRIM_OFFSET, which lets us describe geometry using the native GEN primitive types. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 10:33:41 -07:00
Kristian Høgsberg	ff7a2fc322	i965: Add context flag to disable the viewport transform This lets us disable the viewport transform, which will be useful for emitting 3DPRIM_RECTLIST. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 10:33:41 -07:00
Kristian Høgsberg	1effbf6898	i965: Add an option to not generate the SIMD8 fragment shader For now, this can only be triggered with a new 'no8' INTEL_DEBUG option and a new context flag. We'll use the context flag later, but introducing it now lets us bisect to this commit if it breaks something. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 10:33:41 -07:00
Emil Velikov	0267c6d7ee	docs/autoconf: explicitly mention PKG_CONFIG_PATH for cross/multilib builds ... and squash a couple of typos. Suggested-by: Eero Tamminen <eero.t.tamminen@intel.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-15 18:00:37 +01:00
Emil Velikov	5fe400d82a	st/dri: Add __DRI2rendererQueryExtension support The final step to get GLX_MESA_query_renderer working with gallium drivers. v2: Remove __DRI2_RENDERER_PREFERRED_PROFILE handling. It's already handled in dri/common. Spotted by Marek. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-15 17:42:47 +01:00
Emil Velikov	89f80c2185	gallium/softpipe/llvmpipe: handle query_renderer caps Both report 0xffffffff as both vendor and device id, and the maximum amount of system memory as video memory. v2: Use aux helper os_get_total_physical_memory(). Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-15 17:42:47 +01:00
Emil Velikov	3a6b68b113	gallium/svga: handle query_renderer caps All the values are currently hardcoded. One could use some heuristics to determine the amount of video memory if a callback to the host is not available. Do we want to advertise the driver as hardware accelerated? Cc: Brian Paul <brianp@vmware.com> Cc: José Fonseca <jose.r.fonseca@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-15 17:42:47 +01:00
Emil Velikov	2b5f3956be	gallium/nouveau: handle query_renderer caps Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-15 17:42:47 +01:00
Emil Velikov	0b67d5d4ce	gallium/vc4: handle query_renderer caps Cc: Eric Anholt <eric@anholt.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-15 17:42:47 +01:00
Emil Velikov	de01443753	gallium/r300/r600/radeonsi: handle query_renderer caps Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-15 17:42:47 +01:00
Emil Velikov	cc313b3ffe	gallium/ilo: handle query_renderer caps Implementation based on the classic driver with the following changes: - Use auxiliary function os_get_total_physical_memory to get the total amount of memory. - Move the libdrm_intel specific get_aperture_size to the winsys. Cc: Chia-I Wu <olvaffe@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-15 17:42:47 +01:00
Emil Velikov	5b9cb13295	gallium/i915: handle query_renderer caps Implementation based on the classic driver with the following changes: - Use auxiliary function os_get_total_physical_memory to get the total amount of memory. - Move the libdrm_intel specific get_aperture_size to the winsys. Cc: Stephane Marchesin <stephane.marchesin@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-15 17:42:46 +01:00
Emil Velikov	e9c43b1f01	gallium/freedreno: handle query_renderer caps Provide the real vendor and hardcode the device id as 0xffffffff as the devices currently using freedreno are non-pci. The device features UMA. Cc: Rob Clark <robclark@freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2014-08-15 17:42:43 +01:00
Emil Velikov	8d2745703c	auxiliary/os: introduce os_get_total_physical_memory helper function Cc: Alexander von Gluck IV <kallisti5@unixzen.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-15 17:41:57 +01:00
Emil Velikov	139751403c	gallium: add GLX_MESA_query_renderer caps Namely vendor/device id, accelerated and UMA, which will be used to describe the underlying renderer. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-15 17:41:34 +01:00
Emil Velikov	64b1dc4449	dri/swrast: add GLX_MESA_query_renderer support v2: - Drop __DRI2_RENDERER_PREFERRED_PROFILE case. - Cleanup return statements. Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-15 17:35:38 +01:00
Emil Velikov	9c65361457	dri/radeon: add GLX_MESA_query_renderer support - Create radeon{Vendor,GetRenderer}String helpers. - Drop __DRI2_RENDERER_PREFERRED_PROFILE case. - Cleanup return statements. To be used by the upcoming GLX_MESA_query_renderer implementation. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-15 17:35:38 +01:00
Emil Velikov	55d1251d41	dri/radeon: don't print TCL status on glGetString(GL_RENDERER) Printing the TCL status requires that a context is available at the time of the query. The GLX_MESA_query_renderer spec states that glGetString(GL_RENDERER) and glXQueryRendererStringMESA(GLX_RENDERER_DEVICE_ID_MESA) will have the same format, thus removing the context dependency will help us achieve that. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-15 17:35:37 +01:00
Emil Velikov	76f07362ea	dri/nouveau: add GLX_MESA_query_renderer support - Create nouveau_{vendor,get_renderer}_string helpers. - Set correct max_gl*version. - Query the device PCIID via libdrm_nouveau/nouveau_getparam. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-15 17:35:37 +01:00
Emil Velikov	87d3ae0b45	dri/common: Move __DRI2_RENDERER_PREFERRED_PROFILE handling to driQueryRendererIntegerCommon Essentially all drivers would like to use the OpenGL core profile if available, so avoid duplication by moving the code to a common fallback within driQueryRendererIntegerCommon. If a driver uses a different approach they can handle it separately. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-15 17:35:37 +01:00
Emil Velikov	679c2ef8a0	glx/drisw: add support for DRI2rendererQueryExtension The extension is used by GLX_MESA_query_renderer, which can be provided for by hardware and software drivers. v2: Use designated initializers. v3: Move drisw_query_renderer_*() to dri2_query_renderer.c Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-15 17:35:37 +01:00
Emil Velikov	1bccf99c30	glx/dri2: use mapping table for dri2_convert_glx_query_renderer_attribs() Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-15 17:35:37 +01:00
Emil Velikov	d10ba8b7c0	glx/drisw: Move private structure declarations to a header file v2: Reference the correct file wrt copyright, spotted by Chia-I Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-15 17:35:36 +01:00
Brian Paul	ffb8e884f7	mesa: check if GL_ARB_copy_image is enabled in _mesa_CopyImageSubData() Generate a GL error and return rather than crashing on a null ctx->Driver.CopyImageSubData pointer (gallium). This allows apitraces with glCopyImageSubData() calls to continue rather than crash. Plus, fix a comment typo. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-15 08:35:17 -06:00
Neil Roberts	aa9d4f9d1a	i965/blorp_clear: Use memcpy instead of assignment to copy clear value Similar to the problem described in `2c50212b14`, if we copy the clear value through a regular assignment via a floating point value, then if an integer clear value is being used that happens to contain a signalling NaN value then it would get converted to a quiet NaN when stored via the x87 floating-point registers. This would corrupt the integer value. Instead we should use a memcpy to ensure the exact bit representation is preserved. This bug can be triggered on 32-bit builds with optimisations by using an integer clear color with a value like 0x7f817f81. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-15 12:35:40 +01:00
Glenn Kennard	afa7df9b78	r600g: Implement ARB_derivative_control Requires Evergreen/Cayman marek: update release notes Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-08-15 12:23:06 +02:00
Chris Forbes	f1370fed2c	docs: Update relnotes for ARB_gpu_shader5 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-15 19:25:10 +12:00
Chris Forbes	139f127aac	docs: Mark off ARB_gpu_shader5 for i965 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-15 19:25:07 +12:00
Chris Forbes	4a3667993e	i965: Enable ARB_gpu_shader5 on Gen7 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-15 19:24:56 +12:00
Chris Forbes	abedd05bcd	i965/fs: Add support for nonconst sampler indexing in FS visitor Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 19:13:33 +12:00
Chris Forbes	fbfcd671a1	i965/fs: Add support for non-const sampler indices in generator Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 19:13:32 +12:00
Chris Forbes	4ba5171f30	i965/fs: Refactor generate_tex in prep for nonconst sampler indexing Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 19:13:32 +12:00
Chris Forbes	2b1204aa96	i965/fs: Use brw_adjust_sampler_state_pointer in fs generator too Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 19:13:32 +12:00
Chris Forbes	2cd6169e92	i965/vec4: Add support for nonconst sampler indexing in VS visitor V2: Set force_writemask_all on ADD; this is necessary in the VS case too. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 19:12:45 +12:00
Chris Forbes	301b71557b	i965/vec4: Add support for non-const sampler indices in generator Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 19:10:32 +12:00
Chris Forbes	86dc34a0b0	i965: Generalize sampler state pointer mangling for non-const For now, assume that the addressed sampler can be in any of the 16-sampler banks. If we preserved range information this far, we could avoid emitting these instructions if the sampler were known to be contained within one bank. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 19:10:29 +12:00
Chris Forbes	f7146d1a94	i965/vec4: Refactor generate_tex in prep for non-const samplers Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 19:10:28 +12:00
Chris Forbes	8ce3fa8e91	i965: Extract helper function for surface state pointer adjustment Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-15 19:10:19 +12:00
Chris Forbes	ceaf823e23	docs: Mark off ARB_gpu_shader5 UBO array indexing for i965 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-15 18:53:48 +12:00
Chris Forbes	70354ca668	i965/vec4: Add visitor support for nonconst ubo block indexing Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-15 18:53:48 +12:00
Chris Forbes	a55eae9b6d	i965/vec4: Generate indirect sends for nonconstant UBO array access Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-15 18:53:48 +12:00
Chris Forbes	ad9fce6811	i965/fs: Add visitor support for nonconstant UBO indices Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-15 18:53:48 +12:00
Chris Forbes	3fd359b10d	i965/fs: Generate indirect sends for nonconstant UBO array accesses Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-15 18:53:47 +12:00
Chris Forbes	17e0fa9a06	i965: Adjust set_message_descriptor to handle non-sends We're about to be using this infrastructure to build descriptors in src1 of non-send instructions, when preparing to do an indirect send. Don't accidentally clobber the conditionalmod field of those instructions with SFID bits, which aren't part of the descriptor. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-15 18:53:47 +12:00
Chris Forbes	3512c79789	i965: Add low-level support for indirect sends This provides a reasonable place to enforce the hardware restriction that indirect descriptors must be in a0.0 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-15 18:53:47 +12:00
Kenneth Graunke	35ca288165	i965/fs: Add pass to rename registers to break live ranges. The pass breaks live ranges of virtual registers by allocating new registers when it sees an assignment to a virtual GRF it's already seen written. total instructions in shared programs: 4337879 -> 4335014 (-0.07%) instructions in affected programs: 343865 -> 341000 (-0.83%) GAINED: 46 LOST: 1 [mattst88]: Make pass not break in presence of control flow. invalidate_live_intervals() only if progress. Fix up delta_x/delta_y. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2014-08-14 23:50:12 -07:00
Kenneth Graunke	650c331378	i965: Fix INTDIV math assertions on Broadwell. Commit `c66d928f2c` ("i965: Enable INTDIV in SIMD16 mode.") began using generate_math_gen6 to break SIMD16 INTDIV into two SIMD8 operations. generate_math_gen6 takes two registers - for unary operations, we pass ARF null for the second operand. Prior to Broadwell, real operands were always GRF. But now they can be IMM as well. So, check for != ARF instead of == GRF. +12 piglits. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-14 23:21:34 -07:00
Kenneth Graunke	e84e074248	Revert "i965/vec4: Use MOV, not OR, to set URB write channel mask bits." This reverts commit `af13cf609f`, which appears to cause huge performance problems on Ivybridge. I'd missed that the FFTID bits are in the low byte. The documentation doesn't indicate that the URB write message header actually wants FFTID - it just labels those bits as "Reserved." But it appears necessary. This does slightly more than revert the original change: originally, Broadwell had separate code generation, which used MOV, and this patch only changed it for Gen4-7. Now that both are unified, reverting this also makes Broadwell use OR. Which should be fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-14 23:21:28 -07:00
Chris Forbes	417cc8b2c8	docs: Mark off ARB_derivative_control for i965. Also update 10.3 relnotes to match. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-15 18:04:09 +12:00
Chris Forbes	654b7788eb	i965: Enable ARB_derivative_control on Gen7+. The extension says GL 4.0 is required. We'll meet the spirit of that restriction by enabling on just those generations which will soon support GL 4.0 (Gen7+), although it's technically supportable on all generations. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-15 18:04:06 +12:00
Chris Forbes	a396224520	i965/fs: Support fine/coarse derivative opcodes The quality level (fine/coarse/dont-care) is plumbed through to the generator as a constant in src1. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-15 18:04:04 +12:00
Chris Forbes	587e6e7898	i965/vec4: Assert that fine/coarse derivative ops don't appear Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-15 18:04:03 +12:00
Chris Forbes	eba0c54f62	glsl: Mark program as using dFdy if coarse/fine variant is used Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-15 18:03:53 +12:00
Ilia Mirkin	f08d7b8fe1	nv50,nvc0: add support for fine derivatives The quadop-based method we currently use on all chipsets already provides the fine version of the derivatives. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-14 20:25:33 -04:00
Ilia Mirkin	88b0c6403f	mesa/st: add support for emitting fine derivative opcodes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-14 20:25:32 -04:00
Ilia Mirkin	8ee74ce50f	gallium: add opcodes/cap for fine derivative support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1) v2: Reuse opcode gaps as suggested by Marek	2014-08-14 20:25:32 -04:00
Ilia Mirkin	3fa384db0c	mesa/program: add new derivative unops to the unexpected list Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-14 20:25:32 -04:00
Ilia Mirkin	f80c6847e9	glsl: add ARB_derivative_control support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-14 20:25:32 -04:00
Ilia Mirkin	4a9c36c985	mesa: add ARB_derivative_control extension bit Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-14 20:25:32 -04:00
Ilia Mirkin	e474cb4027	mesa: add ARB_texture_barrier support This extension is identical to NV_texture_barrier. Alias glTextureBarrier to the existing glTextureBarrierNV and use the existing NV_texture_barrier extension bit. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-14 20:25:32 -04:00
Marek Olšák	c3bd130784	docs: document radeonsi BPTC support, sort extensions in 10.3 release notes	2014-08-15 02:05:05 +02:00
Glenn Kennard	f23ee74791	r600g: Implement BPTC texture support Requires Evergreen/Cayman Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-08-15 01:55:13 +02:00
Kristian Høgsberg	221d9c3e9c	i965: Rename intelValidateState to intel_update_state This matches the name of the dd hook. Also convert a couple of nearby dd implementations to lowercase + underscore as is now the standard. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-14 13:57:26 -07:00
Kristian Høgsberg	416dd873e8	i965: Assign PS kernel start pointers when we decide which kernels to use Right now we decide which kernels to use and the GRF start offsets in one place and emit the kernel pointers later. The logic of how to map 8, 16 and 32 kernels to kernel start pointers follows the same logic as which GRF start offsets to use, so let's figure out these two things in one place. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2014-08-14 13:57:26 -07:00
Grigori Goronzy	d7d8260f70	radeonsi: implement BPTC texture support Passes all piglit tests. v2: rebased Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-08-14 20:45:03 +02:00
Marek Olšák	87a8ed9389	radeonsi: fix buffer invalidation of unbound texture buffer objects This maintains a list of all TBOs in a pipe_context. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-14 20:45:03 +02:00
Marek Olšák	79f28cdb98	r600g: implement invalidation of texture buffer objects This fixes piglit spec/ARB_texture_buffer_object/data-sync. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-08-14 20:45:03 +02:00
Marek Olšák	da9c3ed304	r600g: fix constant buffer fetches Somebody forgot to do this. It was uncovered by recent st/mesa changes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82139 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>	2014-08-14 20:45:03 +02:00
Marek Olšák	d52202141e	r600g: clear constant buffer sizes at the beginning of CS Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-08-14 20:45:03 +02:00
Pekka Paalanen	08264e5dad	egl_dri2: fix EXT_image_dma_buf_import fds The EGL_EXT_image_dma_buf_import specification was revised (according to its revision history) on Dec 5th, 2013, for EGL to not take ownership of the file descriptors. Do not close the file descriptors passed in to eglCreateImageKHR with EGL_LINUX_DMA_BUF_EXT target. It is assumed, that the drivers, which ultimately process the file descriptors, do not close or modify them in any way either. This avoids the need to dup(), as it seems we would only need to just close the dup'd file descriptors right after. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76188 Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-14 21:30:57 +03:00
Pekka Paalanen	972e87ca30	i965: fix compiler error in union initializer gcc 4.6.3 chokes with the following error: brw_vec4.cpp: In member function 'int brw::vec4_visitor::setup_uniforms(int)': brw_vec4.cpp:1496:37: error: expected primary-expression before '.' token Apparently C++ does not do named initializers for unions, except maybe as a gcc extension, which is not present here. As .f is the first element of the union, just drop it. Fixes the build error. Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-14 21:30:57 +03:00
Anuj Phogat	9b9dd22f44	i965: Bail on FS copy propagation for scratch writes with source modifiers Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-14 11:03:00 -07:00
Anuj Phogat	7c1ea00eaf	i965: Bail on vec4 copy propagation for scratch writes with source modifiers Fixes Khronos GLES3 CTS test: dynamic_expression_array_access_vertex Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-14 11:03:00 -07:00
Aras Pranckevicius	2b837576eb	glsl: Fixed vectorize pass vs. texture lookups. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82574 Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-14 09:40:33 -07:00
Brian Paul	088106fa79	ra: move declarations before code to fix MSVC build Trivial.	2014-08-14 08:53:45 -06:00
Brian Paul	bfb6b76665	svga: remove some unneeded INLINE qualifiers Trivial.	2014-08-14 08:53:45 -06:00
Emil Velikov	478f82737c	docs/autoconf: update to better reflect reality * --enable-{32,64}-bit is done. Use --build and --host instead. * Configure does not add "-g -O2" to C{,XX}FLAGS. * Pkg-config has been mandatory for a while now. * Avoid using LDFLAGS, refer to pkg-config. * --with-expat is deprecated. Use pkg-config. v2: * Note that CC/CXX will need to be set for multilib builds. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)	2014-08-14 15:45:23 +01:00
Jose Fonseca	d4a1f3fd27	scons: do not include headers from the sources lists The SCons documentation is not explicit on the topic yet building mesa with SCons and MSVC is known to have problems when headers are listed. So be safe just drop them for now. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82534 Tested-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-14 15:38:04 +01:00
Emil Velikov	395ce0b0fa	configure.ac: remove enable 32/64 bit hacks These two were added ages ago, with an explicit comment "Hacks ..." They have been insufficient for years and maintainers needed to explicitly handle the build themselves. Rather than lying and pretending that it works, just kill this hack and let maintainers build things the way it should be done for their distribution. Document the removal in the release notes. Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-14 15:37:33 +01:00
Emil Velikov	957a28e63c	Revert "configure: Fix --enable-XX-bit flags by moving LT_INIT where it should" This reverts commit `2af28040d6`. The commit was resolving an issue where libtool will not setup the environment correctly when one explicitly provides --enable-{32,64}-bit at configure time. It was caused due to the "-m32,64" C{,XX}FLAGS being set too late relative to LT_INIT. At the same time this causes the enable_static to be incorrectly set, amongst others leading to build issues. Rather than being smart and trying to handle 32/64 bit build ourselves it may be better to delegate it to the builder/maintainer. The latter should now know better which is the correct (most appropriate) method. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82536 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82546 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2014-08-14 15:36:49 +01:00
Neil Roberts	2c50212b14	i965: Store uniform constant values in a gl_constant_value instead of float The brw_stage_prog_data struct previously contained an array of float pointers to the values of parameters. These were then copied into a batch buffer to upload the values using a regular assignment. However the float values were also being overloaded to store integer values for integer uniforms. This can break if x87 floating-point registers are used to do the assignment because the fst instruction tries to fix up invalid float values. If an integer constant happened to look like an invalid float value then it would get altered when it was copied into the batch buffer. This patch changes the pointers to be gl_constant_value instead so that the assignment should end up copying without any alteration. This also makes it more obvious that the values being stored here are overloaded for multiple types. There are some static asserts where the values are uploaded to ensure that the size of gl_constant_value is the same as a float. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81150 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-14 11:54:48 +01:00
Christian König	6fb42ee7a6	st/vdpau: add device reference counting This fixes an issue with flash where it tries to destroy a decoder after already destroying the device associated with the decoder. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=82517 Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-14 11:57:07 +02:00
Chris Forbes	c1df492d03	mesa: Make ARB_gpu_shader5 core-profile-only Requires GLSL 1.50 or higher, which we only support in the core profile. V2: Fix broken alignment Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-14 21:32:33 +12:00
Ilia Mirkin	a89353381a	nouveau: force luminance clear colors to have the same g/b values as r Fixes the LUMINANCE_ALPHA formats of fbo-clear-formats piglit test. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-08-14 02:05:06 -04:00
Kenneth Graunke	c66d928f2c	i965: Enable INTDIV in SIMD16 mode. All we need to do is decompose this to two SIMD8 instructions, like we do in many other cases. We even already have code for that. I apparently just botched this last time I tried, and it was easy. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-13 21:19:07 -07:00
Kenneth Graunke	24878f31c4	i965/fs: Drop "do dual source blending" generator parameter. When dual source blending, the visitor already stores a flag in brw_wm_prog_data (dual_src_blend) for the state upload code to use. The generator also receives this, so there's no need to pass an additional flag. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-13 21:19:07 -07:00
Jason Ekstrand	a8379a405a	mesa/texstore: Don't use the _mesa_swizzle_and_convert if we need transfer ops The _mesa_swizzle_and_convert path can't do transfer ops, so we should bail if they're needed. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-13 19:43:33 -07:00
Dave Airlie	f1ef4be4be	docs: update ARB_vertex_attrib_64bit status I started this as well on top of my fp64 stuff. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-08-14 10:49:55 +10:00
Dave Airlie	c63233424b	docs/GL3.txt: add GLES 3.1 section This just cherry-picks the extensions into a list for GLES 3.1. I'm not actually sure if this list is complete or correct, maybe someone else can tell me what I missed, and I'm not 100% sure on multi_draw_indirect. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-08-14 10:49:15 +10:00
Dave Airlie	1cfcd0164e	hud: don't overrun malloced arrays ==17630== Invalid read of size 4 ==17630== at 0x400AE10: memcpy (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so) ==17630== by 0x49024A2: u_upload_data (u_upload_mgr.c:253) ==17630== by 0x49050E1: u_vbuf_draw_vbo (u_vbuf.c:980) ==17630== by 0x487DE29: cso_draw_vbo (cso_context.c:1425) ==17630== by 0x487DEA0: cso_draw_arrays (cso_context.c:1445) ==17630== by 0x48A3B0E: hud_draw_colored_prims.constprop.6 (hud_context.c:123) ==17630== by 0x48A4810: hud_draw (hud_context.c:266) ==17630== by 0x48763F7: dri_flush (dri_drawable.c:483) ==17630== by 0x4057510: dri2Flush.constprop.4 (dri2_glx.c:559) ==17630== by 0x405789E: dri2SwapBuffers (dri2_glx.c:851) ==17630== by 0x402C531: glXSwapBuffers (glxcmds.c:842) ==17630== by 0x8049716: ??? (in /usr/bin/glxgears) ==17630== Address 0x4426b2c is 4 bytes after a block of size 1,008 alloc'd ==17630== at 0x4006B11: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so) ==17630== by 0x48A4CE7: hud_pane_add_graph (hud_context.c:625) ==17630== by 0x48A68F0: hud_pipe_query_install (hud_driver_query.c:175) ==17630== by 0x48A6A30: hud_driver_query_install (hud_driver_query.c:207) ==17630== by 0x48A5835: hud_create (hud_context.c:791) ==17630== by 0x48756CB: dri_create_context (dri_context.c:165) ==17630== by 0x4871CD4: driCreateContextAttribs (dri_util.c:435) ==17630== by 0x4871E06: driCreateNewContext (dri_util.c:464) ==17630== by 0x4056A22: dri2_create_context (dri2_glx.c:223) ==17630== by 0x402CF68: CreateContext (glxcmds.c:299) ==17630== by 0x402D265: glXCreateContext (glxcmds.c:430) ==17630== by 0x804B136: ??? (in /usr/bin/glxgears) This is due to second vertex element being specified, and the upload tries to fetch over the end. However the pane rendering only requires a single vertex element, so specify only one. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-08-14 10:46:32 +10:00
Roland Scheidegger	b6d29de2c4	mesa: fix texstore with GL_COLOR_INDEX data This got broken by `3dbf5bf657`. GL_COLOR_INDEX data is still supported (in legacy contexts), but the new texstore_swizzle path cannot handle it (and didn't detect this). Unfortunately there's no piglit test trying to specify textures with a GL_COLOR_INDEX source format, and I don't really understand how all the color map stuff which is used by this works, but this caused conform failures (with a reported mesa implementation error when trying to figure out the color mapping). Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-14 02:16:23 +02:00
Andreas Boll	64c379a3a8	winsys/radeon: fix hawaii accel_working2 comment accel_working2 returns 3 if the new firmware is used. The comment wasn't updated in v3 of commit: `36771dc` winsys/radeon: fix nop packet padding for hawaii Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-08-13 23:28:23 +02:00
Tom Stellard	866dae85c8	r300g: Fix bug in build_loop_info()/compiler v2 Fixes piglit glean "do-loop with continue and break" on RS690 It's based on Tom Stellard's patch and improved to handle the CMP instruction. [v2] handle CMP instruction Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: David Heidelberger <david.heidelberger@ixit.cz>	2014-08-13 14:37:03 -04:00
Tom Stellard	ed3f7eadad	clover: Flush the command queue in clReleaseCommandQueue() This is required by the spec. Reviewed-by: Francisco Jerez <currojerez@riseup.net> CC: "10.2" <mesa-stable@lists.freedesktop.org>	2014-08-13 14:20:22 -04:00
Tom Stellard	a15088338e	radeonsi/compute: Stop leaking the input buffer We were leaking the input buffer used for kernel arguments and since we were allocating it using si_upload_const_buffer() we were leaking 1 MB per kernel invocation. CC: "10.2" <mesa-stable@lists.freedesktop.org>	2014-08-13 14:18:35 -04:00
Tom Stellard	38fccc37c1	radeonsi/compute: Whitespace fixes CC: "10.2" <mesa-stable@lists.freedesktop.org>	2014-08-13 14:17:02 -04:00
Tom Stellard	1e2e550671	radeonsi/compute: Call si_pm4_free_state() after emitting compute state This will decrement the reference count for buffers referenced in the command stream, which will prevent us from leaking them. CC: "10.2" <mesa-stable@lists.freedesktop.org>	2014-08-13 14:17:02 -04:00
Tom Stellard	05e9681d55	radeonsi/compute: Update reference counts for buffers in si_set_global_binding() CC: "10.2" <mesa-stable@lists.freedesktop.org>	2014-08-13 14:17:02 -04:00
Tom Stellard	72969e0efb	radeon/compute: Report a value for PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE CC: "10.2" <mesa-stable@lists.freedesktop.org>	2014-08-13 14:11:44 -04:00
Tom Stellard	77ea58ca81	radeon/compute: Fix reported values for MAX_GLOBAL_SIZE and MAX_MEM_ALLOC_SIZE There is a hard limit in older kernels of 256 MB for buffer allocations, so report this value as MAX_MEM_ALLOC_SIZE and adjust MAX_GLOBAL_SIZE to satisfy the requirements of OpenCL. CC: "10.2" <mesa-stable@lists.freedesktop.org>	2014-08-13 14:11:43 -04:00
Connor Abbott	e78a01d5e6	ra: optimistically color only one node at a time Before, when we encountered a situation where we had to optimistically color a node, we would immediately give up and push all the remaining nodes on the stack in the order of their index - which is a random, and potentially not optimal, order. Instead, choose one node to optimistically color in ra_select(), and then once we've optimistically colored it, keep on going as normal in the hopes that we've opened up more avenues for the normal select phase to make progress. In cases with high register pressure, this helps make the order we push things on the stack much better, and therefore increase the chance that we can allocate successfully. total instructions in shared programs: 4545447 -> 4545401 (-0.00%) instructions in affected programs: 1353 -> 1307 (-3.40%) GAINED: 124 LOST: 6 Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-08-13 11:43:37 -07:00
Connor Abbott	03f4084d28	ra: don't consider nodes for spilling we don't need to Previously, we would consider any optimistically colored nodes for spilling. However, spilling any optimistically colored nodes below the node that we failed to color on the stack wouldn't help us make progress, since it wouldn't help with allowing us to find a color for the node currently failing to get colored. Only consider nodes which were above the failing node on the stack for spilling, which simplifies the logic, and comment the code better so people know what's going on here. No shader-db changes with BRW_MAX_GRF reduced to 90 (or with the normal number of GRF's). Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-08-13 11:43:30 -07:00
Connor Abbott	567e2769b8	ra: make the p, q test more efficient We can store the q total that pq_test() would've calculated in the node itself, updating it when we add a node to the stack. This way, we only have to walk the adjacency list when we push a node on the stack (i.e. when the p, q test succeeds) instead of every time we do the p, q test. No difference in shader-db run times, but I'm keeping this in because the q total that it calculates will also be used in the next few commits. Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-08-13 11:43:22 -07:00
Connor Abbott	9a0b52e7c1	ra: cleanup the public API Previously, there were 3 entrypoints into parts of the actual allocator, and an API called ra_allocate_no_spills() that called all 3. Nobody would ever want to call any of the 3 entrypoints by themselves, so everybody just used ra_allocate_no_spills(). So just make them static functions, and while we're at it rename ra_allocate_no_spills() to ra_allocate() since there's no equivalent "with spills," because the backend is supposed to handle spilling. Signed-off-by: Connor Abbott <connor.abbott@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-13 11:43:05 -07:00
Ilia Mirkin	d72d67832b	nouveau: only try to get new storage if there are any levels This would try to allocate 0-sized bo's when the max level was below the base level. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-08-13 10:09:01 -04:00
Ilia Mirkin	ddcbea91f1	nouveau: add emacs dir-locals file for tabs/8-space indents Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-08-13 09:59:04 -04:00
Ilia Mirkin	8049e5a1f6	nvc0: increase GLSL level to 400 to enable ARB_gpu_shader5 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-13 09:59:04 -04:00
Ilia Mirkin	6f1edf3cbf	mesa/st: enable ARB_gpu_shader5 if the reported GLSL version >= 400 The ARB_gpu_shader5 extension is made up of a lot of small sub-parts. Instead of adding PIPE_CAP's for each of these, just rely on the GLSL version reported by the pipe driver. The remaining extensions lend themselves naturally to being checked through a single CAP. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-13 09:59:04 -04:00
Emil Velikov	52901ec261	android: add CleanSpec.mk The file contains rules that are executed on incremental builds. This way one can avoid doing a full clean and ensure that the new object (library) is correctly built. Inspired by the work of Chih-Wei Huang, from the Android-x86 project. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:57 +01:00
Emil Velikov	38df9f8a06	android: megadriver_stub: prefix static libraries with libmesa_ Will make it easier on us as CleanSpec.mk comes along and improves consistency across the Android build. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:57 +01:00
Emil Velikov	73121a34d4	android: loader: prefix static libraries with libmesa_* Will make it easier on us as CleanSpec.mk comes along and improves consistency across the Android build. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:57 +01:00
Emil Velikov	db4d7229bc	android: dri/i9*5: remove unused _INCLUDES variable No longer needed as of last commit. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:57 +01:00
Emil Velikov	725373275c	android: drivers/dri: add $(mesa_top)/src to the includes list Will allow us to nuke an include or two from the drivers. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:56 +01:00
Emil Velikov	48307eb813	android: dri: use the installed libdrm headers Saves us a few lines and brings us closer to the automake build. Drop DRM_TOP as it's no longer used. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:56 +01:00
Emil Velikov	c1cc3f2f19	android: gallium: use the installed libdrm headers Saves us a few lines and brings us closer to the automake build. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:56 +01:00
Emil Velikov	5f3022e97f	android: loader: use the installed libdrm headers One step closer to the way we handle automake builds. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:56 +01:00
Emil Velikov	db064b7054	android: egl/dri2: use the installed libdrm headers Trying to get rid of the hardcoded dependency on DRM_TOP, which expects that mesa is located in /external/drm. Will Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:56 +01:00
Emil Velikov	5facd003a0	android: dri/i915: do not build an 'empty' driver The variable i915_C_FILES changed to i915_FILES with commit `34d4216e64` back in mesa 9.1/9.2. Yet we failed to update the android build, essentially creating a dummy/empty driver that can never work. Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:56 +01:00
Emil Velikov	fa4aeb3c65	automake: mesa: whitespace fixes Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-13 00:46:56 +01:00
Emil Velikov	b3121bfd41	mesa: guard better when building with sse4.1 optimisations When the compiler is not capable/does not accept -msse4.1 while the target has the instruction set we'll blow up as _mesa_streaming_load_memcpy is going to be undefined. To make sure that never happens, wrap the runtime cpu check+caller in an ifdef thus do not compile that hunk of the code. Fix the android build by enabling the optimisation and adding the define where applicable. v2: autoconf conditionals end with "fi" rather than endif. v3: Wrap the definition and call to intel_miptree_{un,}map_movntdqa in if defined(USE_SSE41). Spotted by Matt. Cc: Matt Turner <mattst88@gmail.com> Cc: Adrian Negreanu <adrian.m.negreanu@intel.com> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-13 00:46:55 +01:00
Emil Velikov	07f583186d	android: glsl: use the stlport over the limited Android STL The latter lacks various functionality used by mesa/glsl. Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:55 +01:00
Emil Velikov	dfa6dc5eb8	android: drop HAL_PIXEL_FORMAT_RGBA_{5551,4444} Upstream Android (system/core) has dropped these formats with commit 6bac41f1bf9(get rid of HAL pixelformats 5551 and 4444) yet does not mention why. These formats never really worked so we're safe to drop them as well. Identical commit is available in the android-x86 external/mesa repo commit `06a2d36edc` Author: Chih-Wei Huang <cwhuang@linux.org.tw> Date: Wed Sep 25 01:16:57 2013 +0800 android: get rid of HAL pixelformats 5551 and 4444 Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:55 +01:00
Emil Velikov	51a9a09ba8	android: gallium/auxiliary: drop log2/log2f redefinitions Recent versions of bionic have picked up support for these functions, leading to build issues due to the redefinition of the symbols. Note: wrapping things in #ifdef does not cut it :\ Identical patch is available in chromium, android-x86 and perhaps other projects. commit 66c1c789ce3407472de9ed620c9f815639058835 Author: rmcilroy@chromium.org Date: Wed Apr 02 10:59:34 2014 +0000 Porting to x64 Android. Remove redefinitions of log2 and log2f. BUG= R=kbr@chromium.org Review URL: https://codereview.chromium.org/216773005 commit `9cc0a0d2b0` Author: Chih-Wei Huang <cwhuang@linux.org.tw> Date: Sun Jul 21 23:04:19 2013 +0800 android: remove log2, log2f The functions are already defined in the latest bionic. Cc: Chia-I Wu <olvaffe@gmail.com> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Chia-I Wu <olvaffe@gmail.com>	2014-08-13 00:46:55 +01:00
Emil Velikov	2e74818374	android: targets/egl-static: add correct include for DRM headers Android build never really installs the headers, as such we need to explicitly add their location in the source tree otherwise it will fail to find them. v2: Android now installs the headers, so let's use that ;) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:54 +01:00
Emil Velikov	b72b826ef8	scons: group state-trackers' and targets' scons Both share the identical dependencies, as such we can simplify the scons script. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:54 +01:00
Emil Velikov	ec668cbf8b	android: reorder gallium SUBDIRS To be closer to its automake counterpart. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:54 +01:00
Emil Velikov	b75e0d7e25	automake: handle gallium SUBDIRs in gallium/Makefile Considering the way we've been consolidating things it makes sense to add the final two (aux and tests) in here. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-13 00:46:54 +01:00
Emil Velikov	7af25d17a5	automake: compact gallium/target/Makefile into gallium/Makefile Yet another makefile less to worry about. v2: Add state_trackers and targets on a single SUBDIRS line. Requested by Matt. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-13 00:46:54 +01:00
Emil Velikov	eeb56b6b43	automake: merge gallium/state_trackers/Makefile into gallium/Makefile One makefile less, with the potential of further compacting the automake build. v2: Rebase on top of vc4 changes. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-13 00:46:53 +01:00
Emil Velikov	fd7da27a43	automake: compact gallium/drivers and gallium/winsys makefiles Rather than having two separate almost empty and identical makefiles, compact them thus improving the configure and build time. Additionally this makes the automake build symmetrical to the scons and android one. v2: Rebase on top of vc4, compact drivers + winsys on a single line. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-13 00:46:53 +01:00
Emil Velikov	792041ebe5	android: egl/main: add/enable freedreno Everyone willing to give the freedreno driver a go can now build it under Android. Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Cc: Rob Clark <robclark@freedesktop.org> Cc: freedreno@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:53 +01:00
Emil Velikov	bf05e06757	android: gallium/freedreno: add preliminary build For all the people interested in testing the freedreno driver on their Android devices. The next commit will hook these up within the libEGL driver (via the gallium-egl backend). There may be some rough edges but those can be sorted when a willing builder/tester comes along. v2: - s/freefreno/freedreno/. Spotted by Matt Turner. - Use the installed libdrm headers. Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Cc: Rob Clark <robclark@freedesktop.org> Cc: freedreno@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:52 +01:00
Emil Velikov	458d03a4a4	automake: gallium/freedreno: drop spurious include dirs Rather than including two extra folders only for two headers, just prefix the headers and be done with it. Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Cc: Rob Clark <robclark@freedesktop.org> Cc: freedreno@lists.freedesktop.org Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2014-08-13 00:46:52 +01:00
Paulo Sergio Travaglia	aae453afe8	android: egl/main: resolve radeon linking issues - link against libdrm_radeon - link the r600 driver against libstlport - link in the newly added libmesa_pipe_radeon library required by r600 and radeonsi drivers v2: Include pipe_radeon after pipe_r600/radeonsi. Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> [Emil Velikov] Split up and add commit message. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:52 +01:00
Paulo Sergio Travaglia	5bbfa308c9	android: gallium/radeon: attempt to fix the android build - include the correct folders - add a new buildscript for the common radeon folder v2: Use the installed libdrm headers over the DRM_TOP ones. Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> [Emil Velikov] Split up and add commit message. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:52 +01:00
Emil Velikov	825fa2873f	android: egl/main: fixup the nouveau build For a while the nouveau pipe driver has been a static library and it has been using STL for even longer. Correctly add the link and clean up the gallium_DRIVERS. Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:52 +01:00
Emil Velikov	6b510c6338	android: gallium/nouveau: fix include folders, link against libstlport nouveau uses STL for a while now thus we need to include external/stlport/libstlport.mk in order to get the build at least partially working. v2: Use the installed libdrm headers over the DRM_TOP ones. Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-13 00:46:52 +01:00
Emil Velikov	b26017fad8	egl/main: Bring in the Makefile.sources Rather than having the sources list duplicated across all three build systems, define it once and use it whenever needed. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-13 00:46:52 +01:00
Ilia Mirkin	2787bff8dd	nvc0: add BPTC format support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-12 19:21:04 -04:00
Ilia Mirkin	ffd706dac0	mesa/st: add BPTC formats, expose ARB_texture_compression_bptc Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-12 19:21:04 -04:00
Ilia Mirkin	19563f0880	softpipe,llvmpipe: mark BPTC formats as unsupported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-12 19:21:04 -04:00
Ilia Mirkin	43c038f4a6	gallium: add basic support for BPTC formats Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-12 19:21:04 -04:00
Ilia Mirkin	82903acf5e	docs: add GL4.5 section Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-12 18:56:22 -04:00
Emil Velikov	5e5f754f5b	configure.ac: drop enable_dri check in gallium_gbm A while back we've mandated that gbm requires enable_dri, thus this check is no longer required. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-12 23:36:06 +01:00
Emil Velikov	1d1ec76bdf	configure.ac: bail out if building gallium_gbm without gallium_egl The former is the only user of the latter. As such building gbm without egl makes little to no sense. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-12 23:36:06 +01:00
Emil Velikov	16873a6e62	st/dri: define GALLIUM_SOFTPIPE when building kms_swrast To avoid unresolved symbols in the DRI modules, with an earlier commit we wrapped the innards of dri_kms_init_screen() in a DRI_TARGET/GALLIUM_SOFTPIPE ifdef. At the same time we forgot to add the defines to the st/dri build systems, breaking kms_swrast and gnome-continuous. Drop the DRI_TARGET define, we're already in st/DRI. Reported-by: Jasper St. Pierre <jstpierre@mecheye.net> Reported-by: Vadim Rutkovsky <vrutkovs@redhat.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-12 23:36:06 +01:00
Alexandre Demers	2af28040d6	configure: Fix --enable-XX-bit flags by moving LT_INIT where it should Moving LT_INIT after setting completely (AM_)C(XX)FLAGS and LDFLAGS. LT_INIT needs them as they are expected to be used all along the compilation when the macro runs its tests to determine among other things the host type. For info, see http://www.gnu.org/software/libtool/manual/html_node/LT_005fINIT.html Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50754 Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Tested-by: Tapani Palli <lemody@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-12 23:36:06 +01:00
Emil Velikov	469416f988	c11/threads: correct assertion We should assert when either the function or the flag pointer is null or we'll end up with a null reference a few lines later. Currently unused by mesa thus it has gone unnoticed. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 23:36:05 +01:00
Brian Paul	07109cfd99	docs: now distributing the GL/glcorearb.h header Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 15:55:41 -06:00
Brian Paul	25774859f8	mesa: pull Khronos glcorearb.h header into include/GL/ Apps that only want to use core functionality should #include this header. This version covers everything up to OpenGL 4.5. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 15:55:41 -06:00
Eric Anholt	c8e0dd2a2c	vc4: Drop the dump_fbo() routine. Now that eglkms is working, and some tests are working under PIGLIT_PLATFORM=gbm, I don't think I need this any more.	2014-08-12 14:21:56 -07:00
Eric Anholt	8106722bbc	vc4: Claim the GL 2.1 minimum for 3D textures. We don't actually do them (or even fake them) currently, but it does get us a bunch of unrelated glean glsl1 tests passing, which previously would error out due to glean assuming the minimums on a 3D texture that 2 of the subtests use.	2014-08-12 14:19:49 -07:00
Eric Anholt	e1ce610899	vc4: Declare what vertex formats we actually support. We will support more than this eventually, but for now this makes u_vbuf format-convert a few things (32-bit snorm and scaled, doubles) for us.	2014-08-12 14:19:49 -07:00
Eric Anholt	8e504ce420	vc4: Stash some debug code for format support checks. This can be useful for looking at context init setup and texture format choices, and there's no reason for the silly retval computation we do if you're not going to have this code (mostly from freedreno) around.	2014-08-12 14:03:35 -07:00
Eric Anholt	af35afed06	vc4: Texture format support has nothing to do with VBO format support. This was inherited from freedreno, but doesn't apply to us.	2014-08-12 14:03:35 -07:00
Eric Anholt	3e9a09415e	vc4: Fix off-by-one in texture maximum levels. It's 2048x2048 that's the max, not 1024x1024.	2014-08-12 14:03:34 -07:00
Eric Anholt	b9eb3d4bee	vc4: Add support for the FLR opcode.	2014-08-12 14:03:34 -07:00
Kenneth Graunke	8c229d306b	i965: Delete the Gen8 code generators. We now use the brw_eu_emit.c code instead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	f17bfc9ba9	i965: Never use the Gen8 code generators. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	074d472398	i965: Switch to the EU emit layer for code generation on Broadwell. Everything should be in place to unify code generation between Gen4-7 and Gen8+. We should be able to drop the Gen8 generators at this point. However, leave them hooked up for a brief moment, for testing and comparison purposes. Set GEN8=1 to use the old Gen8+ code generator paths. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	db6ffa29c8	i965: Retype atomics to UD in Gen8 code generation. Kind of a moot point since we're deleting Gen8 code generation, but this at least helps make it match the Gen4-7 code. It's probably more reasonable than using float. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	04f5b2f4e4	i965/vp: Use the sampler for pull constant loads on Gen7/7.5. This improves performance in Trine 2 at 1280x720 (windowed) on "Very High" settings by 30% (in the interactive menu) to 45% (in the forest by the giant frog) on Haswell GT3e. It also now generates the same assembly on Gen7 as it does on Gen8, which always used the sampler for both types. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	f7e9756201	i965/vec4: Drop gen <= 7 assertion in pull constant load handling. I don't see any reason for this to exist. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	ce90fd9676	i965/eu: Set src0 file to IMM on Gen8+ flow control instructions. According to the documentation, we need to set the source 0 register type to IMM for flow control instructions that have both JIP and UIP. Out of paranoia, just make all flow control instructions use IMM; there's no benefit to using ARF anyway, and it could cause trouble that's difficult to diagnose. See commit `9584959123`, which did the analogous change in the gen8_generator code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	d8ef0eab5a	i965/eu: Refactor brw_WHILE to share a bit more code on Gen6+. We're going to add a Gen8+ case shortly, which would need to duplicate this code again. Instead, share it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	aafdf9eef4	i965/eu: Emulate F32TO16 and F16TO32 on Broadwell. When we combine the Gen4-7 and Gen8+ generators, we'll need to handle half float packing/unpacking functions somehow. The Gen8+ generator code today just emulates the behavior of the Gen7 F32TO16/F16TO32 instructions, including the align16 mode bugs. Rather than messing with fs_generator/vec4_generator, I decided to just emulate the instructions at the brw_eu_emit.c layer. v2: Change gen >= 7 asserts to gen == 7 (suggested by Chris Forbes). Fix regressions on Haswell in VS tests due to type assertions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	849046b842	i965/vec4: Port Gen8 SET_VERTEX_COUNT handling to vec4_generator. Broadwell requires the number of vertices written by the geometry shader to be specified in a separate register, as part of the terminating message's payload. This also means GS_OPCODE_THREAD_END needs to increment mlen. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	17c17b87f9	i965/vec4: Switch to MOV, not OR, for GS_OPCODE_THREAD_END on Gen8. Either should work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	af13cf609f	i965/vec4: Use MOV, not OR, to set URB write channel mask bits. g0.5 has nothing of value to contribute to m0.5. In both the VS and GS payload, g0.5 contains the scratch space pointer - which is definitely not of any use. The GS payload also contains FFTID, but the URB write message header doesn't want FFTID. The only reason I used OR was because Eric originally requested it. On Broadwell, I used MOV, and that's worked out fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	efc818e3a4	i965/fs: Don't set flag_subreg_nr = 1 on predicated FB write setup. On Haswell, we implement "discard" via predicated SEND messages, using f0.1 instead of f0.0. To accomplish this, we set inst->flag_subreg to 1 on the FS_OPCODE_FB_WRITE. Most instructions using fs_inst::flag_subreg expand to a single assembly instruction. However, FS_OPCODE_FB_WRITE can generate several MOVs for setting up header information. We don't want to set flag_subreg on those, so override the default state back to 0. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	2e180e4c09	i965/vec4: Respect ir->force_writemask_all in Gen8 code generation. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-08-12 13:39:25 -07:00
Kenneth Graunke	7b6b61ba83	i965/vec4: Set NoMask for GS_OPCODE_SET_VERTEX_COUNT on Gen8+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-08-12 13:39:24 -07:00
Jason Ekstrand	97d57f1142	gallium/r300: Fix a link error in the tests The link error occurs because the static libraries are linked in the wrong order. This fixes it. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82483 Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-08-12 11:35:07 -07:00
Matt Turner	e005c1148d	i965: Return NONE from brw_swap_cmod on unknown input. Comparing ~0u with a packed enum (i.e., 1 byte) always evaluates to false. Shouldn't gcc warn about this? Reported-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-12 11:09:45 -07:00
Neil Roberts	ab66b19669	docs: Update release notes and GL3.txt for GL_ARB_texture_compression_bptc Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:23:50 +01:00
Neil Roberts	a018a3f3f5	mesa/meta: Support decompressing floating-point formats Previously the Meta implementation of glGetTexImage would fall back to _mesa_get_teximage if the texturing is not using an unsigned normalised format. However in order to support the half-float formats of BPTC textures we can make it render to a floating-point renderbuffer instead. This patch makes decompression_state have two FBOs, one for the GL_RGBA format and one for GL_RGBA32F. If a floating-point texture is encountered it will try setting up a floating-point FBO. It will now also check the status of the FBO and fall back to _mesa_get_teximage if the FBO is not complete. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:23:50 +01:00
Neil Roberts	817051ab5b	swrast: Enable GL_ARB_texture_compression_bptc Enables BPTC texture compression on the software rasterizer. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:23:50 +01:00
Neil Roberts	9782b8a80c	i965: Enable the GL_ARB_texture_compression_bptc extension Enables the BPTC extension on Gen>=7 and adds the necessary format mappings to get the right surface type value. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:23:50 +01:00
Neil Roberts	88a8830390	mesa/main: Modify generate_mipmap_compressed to cope with float textures Once we add BPTC texture support we will need to generate mipmaps for compressed floating point textures too. Most of the code seems to already be there but it just needs a few extra lines to get it to use GL_FLOAT instead of GL_UNSIGNED_BYTE as the type for the temporary buffers. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:23:50 +01:00
Neil Roberts	17cde55c53	mesa: Add texstore functions for BPTC-compressed textures This adds compressors for all four of the BPTC compressed-texture formats. The compressor is written from scratch and takes a very simple approach. It always uses a single mode of the BPTC format (4 for unorm and 3 for half-floats) and picks the two endpoints by dividing the texels into those which have more or less than the average luminance of the block and then calculating an average color of the texels within each division. It's probably not really sensible to try to use BPTC compression at runtime because for example with the Nvidia offline compression tool it can take in the order of an hour to compress a full-screen image. With that in mind I don't think it's worth having a proper compressor in Mesa and this approach gives reasonable results for a usage that is basically a corner case. v2: Always use the custom compressor, even for the unorm formats. Fix the quantization step for the half-float format compressor. Fixed a typo which was breaking the right-hand edge of half-float textures with a width that isn't a multiple of four. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:23:50 +01:00
Neil Roberts	442bcd7fd3	mesa: Add texel fetch functions for BPTC-compressed textures Adds functions to fetch from any of the four BPTC-compressed formats. v2: Set the alpha component to 1.0 when fetching from the half-float formats instead of leaving it uninitialised. Don't linearize the alpha component when fetching from sRGB. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:23:50 +01:00
Neil Roberts	7e78033c11	mesa: Add the format enums for BPTC-compressed images This adds the following four Mesa image format enums which correspond to the four BPTC compressed texture formats: MESA_FORMAT_BPTC_RGBA_UNORM MESA_FORMAT_BPTC_SRGB_ALPHA_UNORM MESA_FORMAT_BPTC_RGB_SIGNED_FLOAT MESA_FORMAT_BPTC_RGB_UNSIGNED_FLOAT It also updates the format information functions to handle these and the corresponding GL enums. v2: Also modify _mesa_get_format_color_encoding, _mesa_get_srgb_format_linear and _mesa_get_uncompressed_format Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:23:50 +01:00
Neil Roberts	cc9c30b8a7	mesa/format_info: Add support for the BPTC layout Adds the ‘bptc’ layout to get_channel_bits. The channel bits for BPTC depend on the mode but as it only has to be an approximation this sets it to 8 for the two UNORM formats and 16 for the two half-float formats. These represent the minimum number of bits of variation that can be generated by the interpolation of the two formats. This doesn't quite match what we do for S3TC which only returns 4 even though it can similarly generate 8 bits from the interpolation. However it does match what we return for ETC2. For reference, NVidia seems to return 8 bits for the UNORM formats and 32 bits for the half-float formats. v2: Change the number of bits to 8/8/8/8 for the UNORM formats and 16/16/16 for the half-float formats. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-12 18:23:38 +01:00
Neil Roberts	84218b598f	mesa/format_info: Add support for compressed floating-point formats If the name of a compressed texture format has ‘FLOAT’ in it it will now set the data type of the format to GL_FLOAT. This will be needed for the BPTC half-float formats. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:00:26 +01:00
Neil Roberts	0c6e230eb1	mesa: Fix the base format for GL_COMPRESSED_RGB_BPTC_*_FLOAT_ARB The signed and unsigned half-float BPTC-compressed formats were being reported as having a base format of GL_RGBA but they don't store an alpha channel so it should be GL_RGB. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:00:26 +01:00
Neil Roberts	5ceb4bff33	mesa: Add the GL_ARB_texture_compression_bptc extension This adds a boolean in the gl_extensions struct for GL_ARB_texture_compression_bptc as well as an entry in extension_table. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-12 18:00:26 +01:00
Andreas Boll	36771dc60f	winsys/radeon: fix nop packet padding for hawaii The initial firmware for hawaii does not support type3 nop packet. Detect the new hawaii firmware with query RADEON_INFO_ACCEL_WORKING2. If the returned value is 3, then the new firmware is used. This patch uses type2 for the old firmware and type3 for the new firmware. It fixes the cases when the old firmware is used and the user wants to manually enable acceleration. The two possible scenarios are: - the kernel has no support for the new firmware. - the kernel has support for the new firmware but only the old firmware is available. Additionally, this patch disables GPU acceleration on hawaii if the kernel returns a value < 2. In this case the kernel lacks the required fixes for proper acceleration. v2: - Fix indentation - Use private struct radeon_drm_winsys instead of public struct radeon_info - Rename r600_accel_working2 to accel_working2 v3: - Use type2 nop packet for returned value < 3 v4: - Fail to initialize winsys for returned value < 2 Cc: mesa-stable@lists.freedesktop.org Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Marek Olšák <marek.olsak@amd.com> Cc: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2014-08-12 12:16:06 -04:00
Brian Paul	fa5b76e3a2	mesa: regenerate gl_mangle.h Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-12 08:09:45 -06:00
Brian Paul	0a96e7adaa	mesa: update wglext.h to version 20140810 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-12 08:09:45 -06:00
Brian Paul	eeb7fc8b7d	mesa: update glxext.h to version 20140810 Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-12 08:09:45 -06:00
Brian Paul	b7d36efe93	mesa: update glext.h to version 20140810 This brings in the new OpenGL 4.5 features. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-12 08:09:44 -06:00
Charmaine Lee	0c065270c0	svga: Add a limit to the maximum surface size This patch adds a limit to the maximum surface size which is based on the maximum size of a single mob. If this value is not available, the maximum surface size is by default set to 128 MB. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-12 08:03:24 -06:00
José Fonseca	d839be24b3	mesa/st: Move declaration to top of block. To fix MSVC build failure. Trivial.	2014-08-12 14:25:37 +01:00
Ilia Mirkin	6174f49170	mesa/st: add support for dynamic sampler offsets Replace the plain sampler index with a register reference to a sampler. We also need to keep track of the sampler array size when there is a relative reference so that we can mark the whole array used. To facilitate implementation, we add a separate ADDR register that exclusively handles the sampler relative address. Other approaches would be more invasive. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-12 08:52:14 -04:00
Christian König	83012b5085	radeon/uvd: fix gpu_address for video surfaces We need to get the new gpu_address as well when reallocating the cs buffer. Bug: https://bugs.freedesktop.org/show_bug.cgi?id=82428 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>	2014-08-12 11:53:52 +02:00
Chris Forbes	3b48f6a4c0	mesa: Add a new function for getting the nonconst sampler array index If the array index is not a constant expression, the existing support will assume a zero offset (giving us the sampler index of the base of the array). For dynamically uniform indexing of sampler arrays, we need both that and the indexing expression. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-12 19:18:55 +12:00
Chris Forbes	1b4761bc27	glsl: Allow dynamically uniform sampler array indexing with 4.0/gs5 V2: Expand comment to explain what dynamically uniform expressions are about. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-12 19:17:56 +12:00
Ilia Mirkin	f525bd01d1	nvc0/ir: describe the tex arguments for fermi/kepler Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 19:07:34 -04:00
Ilia Mirkin	b3cbd86224	nvc0/ir: add kepler+ support for indirect texture references Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 19:07:34 -04:00
Ilia Mirkin	af3619e880	nvc0/ir: add base tex offset for fermi indirect tex case Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 19:07:34 -04:00
Kenneth Graunke	f73594778b	i965: Revert part of `f5cc3fdcf1`. Fixes non-termination in various Piglit tests. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-11 15:07:17 -07:00
Eric Anholt	602a3f92d4	vc4: Flip which primitives are considered front-facing. This mostly fixes glxgears rendering.	2014-08-11 14:47:54 -07:00
Eric Anholt	f097516505	vc4: Don't forget to set the depth clear value in the packet. This gets glxgears partially rendering again.	2014-08-11 14:47:54 -07:00
Eric Anholt	e63598aecb	vc4: Add support for gl_FragCoord. This isn't passing all tests (glsl-fs-fragcoord-zw-ortho, for example), but it does get a bunch more tests passing. v2: Rebase on helpers change.	2014-08-11 14:47:54 -07:00
Eric Anholt	d34fbdda12	vc4: Refactor shader input setup again. This makes some space for handling special inputs like fragcoords.	2014-08-11 14:47:54 -07:00
Eric Anholt	a7faca5d27	vc4: Clean up the tile alloc buffer size. This prevents some simulator assertion failures, but it does mean (since I've dropped the "* 16" padding) that on real hardware you need a kernel that does overflow memory management (currently, "drm/vc4: Add support for binner overflow memory allocation." in my kernel tree).	2014-08-11 14:47:51 -07:00
Eric Anholt	7050ab510d	vc4: Clarify some values implicitly chosen for binning config. These #defines are 0, but it should help make math above make more sense.	2014-08-11 14:45:32 -07:00
Eric Anholt	ed5cb5d7d5	vc4: Improve simulator memory allocation. This should reduce a bunch of spurious failures in sim.	2014-08-11 14:45:32 -07:00
Eric Anholt	f5f8dd29c3	vc4: Handle stride==0 in VBO validation	2014-08-11 14:45:32 -07:00
Eric Anholt	0f034055f9	vc4: Stash some debug code for looking at what BOs are at what hindex. When you're debugging validation, it's nice to know what the BOs are for.	2014-08-11 14:45:32 -07:00
Eric Anholt	8ebfa8fdb2	vc4: Use GEM under simulation even for non-winsys BOs. In addition to reducing sim-specific code, it also avoids our local handle allocation conflicting with the host GEM's handle numbering, which was causing vc4_gem_hindex() to not distinguish between winsys BOs and the same-numbered non-winsys bo.	2014-08-11 14:45:32 -07:00
Eric Anholt	cdc208bdaf	vc4: Don't forget to unmap the GEM BO when freeing. Otherwise it'll stick around forever.	2014-08-11 14:45:32 -07:00
Eric Anholt	d2cc7f97df	vc4: Add validation of raster-format textures. ... and reject everything else, for now. v2: Rebase on v2 of the rendering config validation change.	2014-08-11 14:45:32 -07:00
Eric Anholt	b384d16733	vc4: Drop VC4_PACKET_PRIMITIVE_LIST_FORMAT. It's not relevant to our command streams any more. v2: Fix indentation and a typo in the comment.	2014-08-11 14:45:32 -07:00
Eric Anholt	3aba1b124f	vc4: Add validation that vertex indices don't overflow VBO bounds.	2014-08-11 14:45:32 -07:00
Eric Anholt	5692122147	vc4: Fix the shader record size for extended strides. It turns out they aren't packed when attributes are missing, according to both docs and simulation.	2014-08-11 14:45:32 -07:00
Eric Anholt	aaff32ded0	vc4: Fix the shader record size for extended strides. It turns out they aren't packed when attributes are missing, according to both docs and simulation. v2: Drop unused variable.	2014-08-11 14:45:31 -07:00
Eric Anholt	9f24e4e6ed	vc4: Add a bunch of validation of render mode configuration. v2: Fix a build break after some previous rebase.	2014-08-11 14:45:31 -07:00
Eric Anholt	ff4748491b	vc4: Store the (currently always linear) tiling format in the resource.	2014-08-11 14:45:31 -07:00
Eric Anholt	0bc2aed90f	vc4: Add a bunch of validation of the binning mode config.	2014-08-11 14:45:31 -07:00
Eric Anholt	b6caa9556c	vc4: Validate that the same BO doesn't get reused for different purposes. We don't care if things like vertex data get smashed by render target data, but we do need to make sure that shader code doesn't get rendered to. v2: Fix overflowing read of gl_relocs[] that incorrectly flagged some VBOs as shader code.	2014-08-11 14:45:31 -07:00
Eric Anholt	fa26d334cb	vc4: Use the packet #defines in the kernel validation code.	2014-08-11 14:45:31 -07:00
Eric Anholt	5969f9b79c	vc4: Rename GEM_HANDLES to be in a namespace. It's not a real VC4 hardware packet, but I've put in a comment to explain it.	2014-08-11 14:45:31 -07:00
Eric Anholt	27b8a0a025	vc4: Clean up TMU write validation. The comment conflicted with the support in the code, so I moved the TMU write validation to where the comment was, and dropped some dead arguments from the functions while changing their signatures.	2014-08-11 14:45:31 -07:00
Eric Anholt	7969a15325	vc4: Update a comment about shader validation	2014-08-11 14:45:31 -07:00
Eric Anholt	99070c6daa	vc4: Add proper translation from Zc to Zs for vertex output. This fixes the remaining failure in depthfunc.	2014-08-11 14:45:31 -07:00
Eric Anholt	4160ac5ee4	vc4: Add support for depth clears and tests within a tile. This doesn't load/store the Z contents across submits yet. It also disables early Z, since it's going to require tracking of Z functions across multiple state updates to track the early Z direction and whether it can be used. v2: Move the key setup to before the search for the key.	2014-08-11 14:45:31 -07:00
Eric Anholt	2259cc5aeb	vc4: Avoid flushing when mapping buffers that aren't in the batch. This should prevent a bunch of unnecessary flushes for things like updating immediate vertex data.	2014-08-11 14:45:31 -07:00
Eric Anholt	6b2583412f	vc4: Drop the flush at the end of the draw Now we actually get multiple draw calls per submit.	2014-08-11 14:45:31 -07:00
Eric Anholt	c047f13603	vc4: Align following shader recs to 16 bytes. Otherwise, the low address bits will end up being interpreted as attribute counts.	2014-08-11 14:45:31 -07:00
Eric Anholt	766ca5c7a5	vc4: Fix a potential src buffer overflow in shader rec validation.	2014-08-11 14:45:31 -07:00
Eric Anholt	027d730aff	vc4: Keep a reference to BOs queued for rendering. Otherwise, once we're not flushing at the end of every draw, we'll free things like gallium resources, and free the backing GEM object, before we've flushed the rendering using it to the kernel.	2014-08-11 14:45:30 -07:00
Eric Anholt	771d86abd6	vc4: Compute the proper end address of the relocated command lists. render_cl_size/bin_cl_size includes relocations, while the hardware buffer doesn't. If you don't emit a HALT packet, the command parser continues until the end register's value. We can't allow executing unvalidated buffer contents (and it's actually harmful in the render lists Mesa is emitting, since VC4_PACKET_STORE_MS_TILE_BUFFER_AND_EOF doesn't trigger a halt).	2014-08-11 14:45:30 -07:00
Eric Anholt	c58f35393e	vc4: Walk tiles horizontally, then vertically. I was confused looking at my addresses in dumps because I was seeing the tile branch offsets jumping all over.	2014-08-11 14:45:30 -07:00
Eric Anholt	165ca6b5ad	vc4: Track clears versus uncleared draws, and the clear color. This is a step toward queueing more than one draw per frame. Fixes piglit attribute0 test, since we get a working clear color now.	2014-08-11 14:45:30 -07:00
Eric Anholt	9c631f30c9	vc4: Move the rest of RCL setup to flush time. We only want to set up render target config and clear colors once per frame.	2014-08-11 14:45:30 -07:00
Eric Anholt	100e5679c7	vc4: Move render command list calls to vc4_flush()	2014-08-11 14:45:30 -07:00
Eric Anholt	fbaac8407a	vc4: Move bin command list ending commands to vc4_flush()	2014-08-11 14:45:29 -07:00
Eric Anholt	5e062cb2b4	vc4: Rename fields in the kernel interface. I decided I didn't like "len" compared to "size", and I keep typing shader_rec instead of shader_record[s] elsewhere, so make it consistent.	2014-08-11 14:45:28 -07:00
Eric Anholt	2b16b3d75f	vc4: Fix things to validate more than one shader state in a submit.	2014-08-11 14:45:28 -07:00
Eric Anholt	a8f2bf0f51	vc4: Rewrite the kernel ABI to support texture uniform relocation. This required building a shader parser that would walk the program to find where the texturing-related uniforms are in the uniforms stream. Note that as of this commit, a new kernel is required for rendering on actual VC4 hardware (currently that commit is named "drm/vc4: Introduce shader validation and better command stream validation.", but is likely to be squashed as part of an eventual merge of the kernel driver).	2014-08-11 14:45:28 -07:00
Eric Anholt	6a5ece12aa	vc4: Add docs for the drm interface	2014-08-11 14:45:28 -07:00
Eric Anholt	11fbee3201	vc4: Add load/store to the validator	2014-08-11 14:40:45 -07:00
Eric Anholt	a3cd3c0d19	vc4: Switch simulator to using kernel validator This ensures that when I'm using the simulator, I get a closer match to what behavior on real hardware will be. It lets me rapidly iterate on the kernel validation code (which otherwise has a several-minute turnaround time), and helps catch buffer overflow bugs in the userspace driver faster.	2014-08-11 14:40:45 -07:00
Eric Anholt	a02c658908	vc4: Drop pointless shader state struct	2014-08-11 14:40:45 -07:00
Eric Anholt	857dcc09fa	vc4: Add support for texture rectangles v2: Rebase on helpers change.	2014-08-11 14:40:45 -07:00
Eric Anholt	66c6c40127	vc4: Add support for texturing (under simulation) Only rgba8888 works, and only a single texture unit, and it's only under simulation because I haven't built the kernel interface yet. v2: Rebase on helpers. v3: Fold in the don't-break-the-arm-build fix.	2014-08-11 14:40:45 -07:00
Eric Anholt	d5a6e3dd9b	vc4: Drop PIPE_SHADER_CAP_MAX_ADDRS Fixes the build since `c10332bbb8`	2014-08-11 14:40:42 -07:00
Marek Olšák	c10332bbb8	gallium: remove PIPE_SHADER_CAP_MAX_ADDRS This limit is fixed in Mesa core and cannot be changed. It only affects ARB_vertex_program and ARB_fragment_program. The minimum value for ARB_vertex_program is 1 according to the spec. The maximum value for ARB_vertex_program is limited to 1 by Mesa core. The value should be zero for ARB_fragment_program, because it doesn't support ARL. Finally, drivers shouldn't mess with these values arbitrarily. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 21:53:57 +02:00
Marek Olšák	718d4b97ef	st/mesa: compute supported GL versions at DRIscreen creation This computes all GL versions before any context is created. It's a requirement for GLX_MESA_query_renderer. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 21:53:57 +02:00
Marek Olšák	fceadfe7ef	gallium: pass st_config_options to query_versions So move it from dri_context to dri_screen. This will be needed for version computations. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 21:53:57 +02:00
Marek Olšák	f1f5366629	mesa: return version 0 if the computed core profile version is too low Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 21:53:57 +02:00
Marek Olšák	7207830047	mesa: add _mesa_get_version, a ctx-independent variant of _mesa_compute_version Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 21:53:57 +02:00
Marek Olšák	537cbb7e1a	mesa: add a context-independent variant of _mesa_override_gl_version v2: changed GLboolean -> bool Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 21:53:57 +02:00
Marek Olšák	ee9a2b1ae9	mesa: make _mesa_init_constants context-independent and public Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 21:53:57 +02:00
Marek Olšák	858452e542	mesa: make _mesa_init_extensions context-independent Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 21:53:57 +02:00
Marek Olšák	35e755faa7	st/mesa: make st_init_limits context-independent Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 21:53:57 +02:00
Marek Olšák	002211f9ee	mesa: move ShaderCompilerOptions into gl_constants Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 21:53:57 +02:00
Marek Olšák	5c69173907	st/mesa: make st_init_extensions context-independent Setting Const.MaxSamples needed a rework, so that it doesn't call st_choose_format, which depends on st_context. Other than that, there is no change in functionality. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 21:53:57 +02:00
Marek Olšák	d9a6f4360a	mesa: make _mesa_override_glsl_version context-independent Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 21:53:57 +02:00
Marek Olšák	c6cbde5008	gallium/stapi: move setting GL versions to the state tracker All flags are set for st/mesa, so the state tracker doesn't have to check them. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-11 21:53:57 +02:00
Marek Olšák	0127d26e6c	st/mesa: convert the ETC1 format to an uncompressed one if unsupported I don't know of any hardware which supports it. With this, GL_OES_compressed_ETC1_RGB8_texture is supported if RGBA8 is supported. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2014-08-11 21:53:57 +02:00
Marek Olšák	547e2880bc	st/mesa: add st_context parameter to st_mesa_format_to_pipe_format This will be used by the next commit. Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>	2014-08-11 21:53:57 +02:00
Marek Olšák	3d56732c1f	st/mesa: advertise ARB_ES3_compatibility if GLSL 3.30 and ETC2 are supported	2014-08-11 21:53:57 +02:00
Marek Olšák	463b0ea1f6	st/mesa: add support for ETC2 formats The formats are emulated by translating them into plain uncompressed formats, because I don't know of any hardware which supports them. This is required for GLES 3.0 and ARB_ES3_compatibility (GL 4.3).	2014-08-11 21:53:57 +02:00
Marek Olšák	ddc8003c61	mesa: add helper _mesa_is_format_etc2 v2: renamed GLboolean -> bool Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-11 21:53:57 +02:00
Brian Paul	f24be73401	mesa: add missing GLAPIENTRY in copyimage.c Fixes MinGW build. Trivial.	2014-08-11 12:59:47 -06:00
Jason Ekstrand	f5cc3fdcf1	i965/cse: Don't eliminate instructions with side-effects This causes problems when converting atomics to use the GRF. Sometimes the atomic operation would get eaten by CSE when it shouldn't. v2: Roll the has_side_effects check into is_expression Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-11 11:40:32 -07:00
Jason Ekstrand	34ee3f5a34	docs/GL3: Mark ARB_copy_image as implemented on i965	2014-08-11 11:26:14 -07:00
Jason Ekstrand	410fea8dd9	i965: Add support for ARB_copy_image This, together with the meta path, provides a complete implementation of ARB_copy_image. v2: Add a fallback memcpy path for when the texture is too big for the blitter v3: Properly support copying between two places on the same texture in the memcpy fallback v4: Properly handle blit between the same two images in the fallback path v5: Properly handle blit between the same two compressed images in the fallback path v6: Fix a typo in a comment Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2014-08-11 11:26:14 -07:00
Jason Ekstrand	8ad7c1903d	mesa/meta: Add a partial implementation of CopyImageSubData This provides an implementation of CopyImageSubData that works if both textures are uncompressed. This implementation works by using a combination of texture views and BlitFramebuffer. If one of the textures is compressed, it returns false and the driver is expected to provide a fallback. v2: Don't leak fbo's Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Neil Roberts <neil@linux.intel.com> v3: Change glGen/DeleteTextures to _mesa_Gen/DeleteTextures	2014-08-11 11:26:00 -07:00
Jason Ekstrand	80a8b020c0	mesa/meta: Make _mesa_meta_bind_fbo_image also take a framebuffer target Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2014-08-11 11:20:23 -07:00
Jason Ekstrand	41b6460e08	mesa: Add GL API support for ARB_copy_image This adds the API entrypoint, error checking logic, and a driver hook for the ARB_copy_image extension. v2: Fix a typo in ARB_copy_image.xml and add it to the makefile v3: Put ARB_copy_image.xml in the right place alphabetically in the makefile and properly prefix the commit message v4: Fixed some line wrapping and added a check for null v5: Check for incomplete renderbuffers Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Neil Roberts <neil@linux.intel.com> v6: Update dispatch_sanity for the addition of CopyImageSubData	2014-08-11 11:20:23 -07:00
Matt Turner	23d782067a	i965/fs: Keep track of the registers that hold delta_x/delta_y. They're needed in register allocation. Fixes a regression since `afe3d155`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78875 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-11 10:40:01 -07:00
Matt Turner	41bdad59ab	i965: Mark branch unreachable in sampler state code. Silences some uninitialized variable warnings. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-11 10:40:01 -07:00
Brian Paul	904ed3b315	mesa: simplify _mesa_update_draw_buffers() There's no need to copy the array of DrawBuffer enums to a temp array. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-11 09:44:51 -06:00
Brian Paul	39b40ad144	mesa: fix assertion in _mesa_drawbuffers() Fixes failed assertion when _mesa_update_draw_buffers() was called with GL_DRAW_BUFFER == GL_FRONT_AND_BACK. The piglit gl30basic hit this. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-11 09:44:51 -06:00
Brian Paul	dd8f15a553	mesa: whitespace, 80-column wrapping in program.c Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-11 09:44:50 -06:00
Brian Paul	d8f7577d5f	mesa: simplify/rename _mesa_init_program_struct() No need to return a value. Remove unused ctx parameter. Remove _mesa_ prefix since it's static. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-11 09:44:50 -06:00
Brian Paul	53b13b2ead	st/mesa: use PRId64 for printing 64-bit ints v2: use signed types/formats Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-11 09:44:50 -06:00
Brian Paul	80fa7fd23e	mesa: use PRId64 for printing 64-bit ints Silences MinGW warnings: warning: unknown conversion type character ‘l’ in format [-Wformat] warning: too many arguments for format [-Wformat-extra-args] v2: use signed types/formats Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-11 09:44:44 -06:00
Brian Paul	a5743fdf7d	mesa: define and use ALL_TYPE_BITS in varray.c code Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-11 09:37:50 -06:00
Brian Paul	288f887622	mesa: add comment that GL_CLIP_DISTANCE0 == GL_CLIP_PLANE0 in enable.c Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-11 09:37:50 -06:00
Maarten Lankhorst	4c16e6a8e0	configure.ac: Do not require llvm on x32 Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Maarten Lankhorst <dev@mblankhorst.nl>	2014-08-11 13:16:11 +02:00
Neil Roberts	1b417ea784	i965: Don't check for format differences when using the blorp blitter Previously the blorp blitter wouldn't be used if the source and destination buffer had a different format other than swizzling between RGB and BGR and adding or removing a dummy alpha channel. However there's no reason why the blorp code path can't be used to do almost all format conversions so this patch just removes the checks. However it does explicitly disable converting to/from MESA_FORMAT_Z24_UNORM_X8_UINT because there is a similar check in brw_blorp_copytexsubimage. This doesn't cause any Piglit test regressions at least on Ivybridge. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-11 11:24:46 +01:00
Kenneth Graunke	9276ef6f41	i965/eu: Allow math on immediates on Broadwell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-10 19:32:34 -07:00
Kenneth Graunke	db64c2eee2	i965/eu: Update jump distance scaling for Broadwell. Broadwell measures jump distances in bytes, so we need to scale by 16. v2: Update the function in brw_eu.h, not in brw_eu_emit.c. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-10 19:32:34 -07:00
Kenneth Graunke	82ddd517af	i965/eu: Refactor jump distance scaling to use a helper function. Different generations of hardware measure jump distances in different units. Previously, every function that needed to set a jump target open coded this scaling, or made a hardcoded assumption (i.e. just used 2). Most functions start with the number of instructions to jump, and scale up to the hardware-specific value. So, I made the function match that. Others start with a byte offset, and divide by a constant (8) to obtain the jump distance. This is actually 16 / 2 (the jump scale for Gen5-7). v2: Make the helper a static inline defined in brw_eu.h, instead of an actual function in brw_eu_emit.c (as suggested by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-10 19:32:34 -07:00
Kenneth Graunke	a1c899c718	i965/eu: Set UIP on ELSE instructions on Broadwell. Broadwell adds UIP on ELSE instructions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-10 19:32:34 -07:00
Kenneth Graunke	7d41170b62	i965/eu: Make it clear that brw_patch_break_count only runs on Gen4-5. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-10 19:32:34 -07:00
Kenneth Graunke	0457464c33	i965/eu: Make it clear that brw_find_loop_end only runs on Gen6+. It has Gen6+ knowledge baked in, and indeed is only called for Gen6+, but it wasn't immediately obvious that this was the case. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-10 19:32:34 -07:00
Kenneth Graunke	0d6adce469	i965/eu: Port Broadwell CMP destination type hack to brw_eu_emit.c. See gen8_generator::CMP(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-10 19:32:34 -07:00
Kenneth Graunke	49519a1b63	i965/eu: Explicitly disable instruction compaction on Broadwell for now. Until now, it's been off implicitly: we never call the compactor function. When we merge the generators, we'll start calling it, so we should make it do nothing. Matt will enable instruction compaction properly later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-10 19:32:33 -07:00
Kenneth Graunke	8609df97a0	i965/eu: Use Haswell atomic messages on Broadwell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-10 19:03:45 -07:00
Kenneth Graunke	e1bd2ca28a	i965/eu: Change gen == 7 to gen >= 7 in a couple brw_eu_emit.c cases. Broadwell is going to use the brw_eu_emit.c code soon. We want to get the fake MRF handling and URB HWord channel mask handling. We don't need the CMP thread switch workaround, though. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-10 19:01:52 -07:00
Ben Widawsky	38e181bad2	i965/clip: Removing scissor atom Now that we no longer use ctx->DrawBuffer->_Xmin and related fields to program the screen-space viewport extents, we don't depend on any scissoring state. So we can drop the +_NEW_SCISSOR dependency. On GEN8, a change in scissor state does not affect anything for the clipper/sf hardware state. The hardware will always do the right thing once the viewport extents are programmed. We can therefore remove the unnecessary state emission. Ken originally spotted this. v2: Reword the commit message. Remove spurious hunk. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-10 17:13:36 -07:00
Ben Widawsky	f6725d627c	i965/guardband: Enable for all viewport dimensions (GEN8+) The goal of guardband clipping is to try to avoid 3d clipping because it is an expensive operation. When guardband clipping is disabled, all geometry that intersects the viewport is sent to the FF 3d clipper. Objects which are entirely enclosed within the viewport are said to be "trivially accepted" while those entirely outside of the viewport are "trivially rejected". When guardband clipping is turned on the above behavior is changed such that if the geometry is within the guardband, and intersects the viewport, it skips the 3d clipper. Prior to GEN8, this was problematic if the viewport was smaller than the screen as it could allow for rendering to occur outside of the viewport. That could be mitigated if the programmer specified a scissor region which was less than or equal to the viewport - but this is not required for correctness in OpenGL. In theory you could be clever with the guardband so as not to invoke this problem. We do not do this, and have no data that suggests we should bother (nor the converse data). With viewport extents in place on GEN8, it should be safe to turn on guardband clipping for all cases. While here, add a comment to the code which confused me thoroughly. v2: Update grammar in commit message. Reword comments based on Ken's suggestion. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-10 17:13:36 -07:00
Ben Widawsky	1a20e38ccf	i965: Simplify viewport extents programming on GEN8 Viewport extents are a 3rd rectangle that defines which pixels get discarded as part of the rasterization process. It permits the use of guardband clipping in all cases (see later patch). The actual pixels drawn to the screen are an intersection of the drawing rectangle, the viewport extents, and the scissor rectangle. Scissor rectangle is not super important for this discussion as it should always help do the right thing provided the programmer uses it. switch (viewport dimensions, drawrect dimension) { case viewport > drawing rectangle: no effects; break; case viewport == drawing rectangle: no effects; break; case viewport < drawing rectangle: Pixels (after the viewport transformation but before expensive rasterizing and shading operations) which are outside of the viewport are discarded. } I am unable to find a test case where this improves performance, but in all my testing it doesn't hurt performance, and intuitively, it should not ever hurt performance. It also permits us to use the guardband more freely (see upcoming patch). v2: Updating commit message. v3: Commit message updates requested by Ken Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-10 17:13:36 -07:00
Ben Widawsky	109d420f42	i965/guardband: Improve comments for guardband clipping While working in this part of the code I had a great deal of trouble understanding what it was trying to do, and matching it with the spec (mostly due to bad wording in the PRM). To help future people, I've cleaned up the wording and provided some ascii art. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-10 17:13:36 -07:00
Kenneth Graunke	31f1cbc24d	i965: Support the allow_glsl_extension_directive_midshader option. This adds support for Marek's new driconf parameter, which avoids totally white rendering in Unigine Valley (which attempts to enable the GL_ARB_sample_shading extension in an illegal place). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75664 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-10 16:22:31 -07:00
Connor Abbott	b6df68ba56	i965/fs: set virtual_grf_count in assign_regs() This lets us call dump_instructions() after register allocation without failing an assertion. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Connor Abbott <connor.abbott@intel.com>	2014-08-10 15:00:53 -07:00
Connor Abbott	58007aec41	i965/fs: don't read from uninitialized memory while assigning registers Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Connor Abbott <connor.abbott@intel.com>	2014-08-10 15:00:52 -07:00
Matt Turner	59a26a0554	i965/fs: Fix bad whitespace.	2014-08-10 15:00:52 -07:00
Niels Ole Salscheider	3d5e247de6	gallium/radeon: Set gpu_address to 0 if r600_virtual_address is false Without this patch I get the following during DMA transfers: [drm:radeon_cs_ib_chunk] ERROR Invalid command stream ! radeon 0000:01:00.0: CP DMA dst buffer too small (21475829792 4096) This is a fixup for `e878e154cd`. Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-08-10 12:54:34 +02:00
Marek Olšák	a65611f70a	radeonsi: simplify constant buffer upload for big endian Point util_memcpy_cpu_to_le32 to a buffer storage directly. v2: simplify more Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-10 12:52:13 +02:00
Marek Olšák	b1843a2d2a	winsys/radeon: fix compile warnings	2014-08-09 23:48:41 +02:00
Marek Olšák	b5f877ef7e	r600g/compute: fix compile warnings Trivial.	2014-08-09 23:41:16 +02:00
Marek Olšák	3d06952d9e	r300g: handle new shader caps Trivial.	2014-08-09 23:41:16 +02:00
Marek Olšák	955505f6ff	radeonsi: fix CMASK and HTILE allocation on Tahiti Tahiti has 12 tile pipes, but P8 pipe config. It looks like there is no way to get the pipe config except for reading GB_TILE_MODE. The TILING_CONFIG ioctl doesn't return more than 8 pipes, so we can't use that for Hawaii. This fixes a regression caused by `9b046474c9` on Tahiti. v2: add an assertion and print an error on failure Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-09 23:41:16 +02:00
Marek Olšák	00ddf7a016	gallium/radeon: remove r600_resource_va Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-08-09 23:41:16 +02:00
Marek Olšák	8c235465cd	gallium/radeon: use gpu_address from r600_resource Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-08-09 23:41:16 +02:00
Marek Olšák	f6c392a270	r600g: use gpu_address from r600_resource Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-08-09 23:41:15 +02:00
Marek Olšák	1c03a690bf	radeonsi: use gpu_address from r600_resource Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-08-09 23:41:15 +02:00
Marek Olšák	e878e154cd	gallium/radeon: store VM address in r600_resource This will help to get rid of the buffer_get_virtual_address calls. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-08-09 23:41:15 +02:00
Marek Olšák	43b5c34cc3	r600g: remove useless r600_resource_va calls R600-R700 don't support virtual memory. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-08-09 23:41:15 +02:00
Marek Olšák	0e229b8c5a	radeonsi: always prefer SWITCH_ON_EOP(0) on CIK The code is rewritten to take known constraints into account, while always using 0 by default. This should improve performance for multi-SE parts in theory. A debug option is also added for easier debugging. (If there are hangs, use the option. If the hangs go away, you have found the problem.) Reviewed-by: Alex Deucher <alexander.deucher@amd.com> v2: fix a typo, set max_se for evergreen GPUs according to the kernel driver	2014-08-09 23:41:15 +02:00
Marek Olšák	515269b3a7	radeonsi: fix a hang with instancing in Unigine Heaven/Valley on Hawaii This isn't documented anywhere, but it's the only thing that works for this case. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-08-09 23:41:15 +02:00
Marek Olšák	085a861545	radeon,r200: fix buffer validation after CS flush This validates all bound buffers (CB, ZB, textures, DMA) at the beginning of CS. This fixes "bo->space_accouned" assertion failures. Tested by: Jochen Rollwagen <joro-2013@t-online.de> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-08-09 23:41:15 +02:00
Marek Olšák	0b5d88a518	st/mesa: fix blit-based partial TexSubImage for 1D arrays This fixes piglit spec/EXT_texture_array/render-1darray. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-09 23:41:15 +02:00
Marek Olšák	56286834b8	st/mesa: fix DrawPixels(GL_STENCIL_INDEX) This is a bug which was probably uncovered recently by Jason's commits and broke this. The problem is _mesa_base_tex_format(GL_STENCIL_INDEX) returns -1. Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-09 23:41:15 +02:00
Marek Olšák	88e0a2f88b	st/mesa: dump TGSI before calling into the driver If the driver crashes in create_xx_shader, you want to see the shader. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-08-09 23:41:15 +02:00
Jon TURNEY	a2e1dc0cce	configure.ac: Use LIBS rather than LDFLAGS to add -ldl to dladdr check `ec8ebff` "Check for dladdr()" erroneously uses LDFLAGS rather than LIBS to add -ldl to the dladdr check. Replace the workaround in `39a4cc4` of explicitly checking in libdl, with a more correct approach of using LIBS. Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Pali Rohár <pali.rohar@gmail.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-08-09 11:18:31 +01:00
Eric Anholt	7b4b60b7e5	vc4: Add support for the COS instruction.	2014-08-08 18:59:47 -07:00
Eric Anholt	663ffff0e7	vc4: Add support for the SIN instruction. v2: Rebase on helpers.	2014-08-08 18:59:47 -07:00
Eric Anholt	d815b2490b	vc4: Fix register aliasing for packing of scaled coordinates. Fixes glean fragProg1's "ADD test" and likely many others.	2014-08-08 18:59:47 -07:00
Eric Anholt	9492eb588d	vc4: Add some debug code for forcing fragment shader output color.	2014-08-08 18:59:47 -07:00
Eric Anholt	961715eab2	u_primconvert: Copy min/max_index from the original primitive. These values are supposed to be the minimum/maximum index values used to read from the vertex buffers. This code either copies index values out of the old IB (so, same min/max as the original draw call), or generates a new IB (using index values between the start and the start + count of the old array draw info, which just happens to be what min/max_index are set to by st_draw.c). We were incorrectly setting the max_index in the converting-from-glDrawArrays case to the start vertex plus the number of vertices generated in the new IB, which broke QUADS primitive conversion on VC4 (where max_index really has to be correct, or the kernel might reject your draw call due to buffer overflow). Reviewed-by: Rob Clark <robclark@freedesktop.org> (from verbal description of the patch)	2014-08-08 18:59:47 -07:00
Eric Anholt	1d03692f78	vc4: Fix using and emitting the 1/W from the vertex/coord shaders. v2: Rebase on helpers change.	2014-08-08 18:59:47 -07:00
Eric Anholt	88bc5baa00	vc4: Add support for swizzles of 32 bit float vertex attributes. Some tests start working (useprogram-flushverts, for example) due to getting the right vertices now. Some that used to pass start failing with memory overflow during binning, which is weird (glsl-fs-texture2drect). And a couple stop rendering correctly (glsl-fs-bug25902). v2: Move the attribute format setup in the key from after search time to before the search. v3: Fix reading of attributes other than position (I forgot to respect attr and stored everything in inputs 0-3, i.e. position).	2014-08-08 18:59:47 -07:00
Eric Anholt	f069367f39	vc4: Add support for the TGSI FRC opcode. v2: Rebase on helpers.	2014-08-08 18:59:47 -07:00
Eric Anholt	bf542cd372	vc4: Add support for the TGSI TRUNC opcode. v2: Rebase on helpers.	2014-08-08 18:59:47 -07:00
Eric Anholt	399285403a	vc4: Crank up the tile allocation BO size This avoids a simulator assertion failure with glamor. I need to actually support resize, though.	2014-08-08 18:59:47 -07:00
Eric Anholt	75afa64ef8	vc4: Add support for multiple attributes	2014-08-08 18:59:47 -07:00
Eric Anholt	32948ca768	vc4: Add more useful debug for the undefined-source case We could get undefined sources in real programs from the wild, so we'll need to turn off this debug eventually. But for now, using undefined sources is typically me just mistyping something.	2014-08-08 18:59:47 -07:00
Eric Anholt	6ff2129d58	vc4: Add support for the lit opcode. v2: Fix how it was using the X channel for the real work of the opcode, instead of Y. Fixes glean's LIT test. v3: Rebase on the helpers.	2014-08-08 18:59:47 -07:00
Eric Anholt	63e49da0a5	vc4: Add support for the POW opcode v2: Rebase on helpers.	2014-08-08 18:59:47 -07:00
Eric Anholt	0e182e7d8f	vc4: Refactor uniform handling. I wanted an easy way to set up new uniforms every time, so I could handle texture-sampler-related uniforms. v2: Rebase on helpers change.	2014-08-08 18:59:47 -07:00
Eric Anholt	6c185bd263	vc4: Add support for the LRP opcode. v2: Rebase on helpers, cutting out most of the code in this change.	2014-08-08 18:59:47 -07:00
Eric Anholt	ec9da314ba	vc4: Add copy propagation between temps. We put in a bunch of extra MOVs for program outputs, and this can clean those up. We should do uniforms, too, though. v2: Fix missing flagging of progress when we actually optimize. Caught by Aaron Watry.	2014-08-08 18:59:47 -07:00
Eric Anholt	d9d1c14430	vc4: Add dead code elimination. This cleans up a bunch of noise in the compiled coordinate shaders (since we don't need the varying outputs), and also from writemasked instructions with negated src operands.	2014-08-08 18:59:47 -07:00
Eric Anholt	1d23d55ae9	vc4: Add an initial pass of algebraic optimization. There was a lot of extra noise in my piglit shader dumps because of silly CMPs.	2014-08-08 18:59:47 -07:00
Eric Anholt	4c53087c67	vc4: Add support for CMP. This took a couple of tries, and this is the squash of those attempts. v2: Fix register file conflicts on the args in the destination-is-accumulator case. v3: Rebase on helper change and qir_inst4 change.	2014-08-08 18:59:47 -07:00
Eric Anholt	eea1d36915	vc4: Make scheduling of NOPs a separate step from QIR -> QPU translation. This should also be used as a way to pair QIR instructions into QPU instructions later.	2014-08-08 18:59:46 -07:00
Eric Anholt	c293927511	vc4: Add WIP support for varyings. It doesn't do all the interpolation yet, but more tests can run now. v2: Rebase on helpers.	2014-08-08 18:59:46 -07:00
Eric Anholt	db9f41ea88	vc4: Use r3 instead of r5 for temps, since r5 only has 32 bits of storage Reserving a whole accumulator for temps is awful in the first place, but I'll fix that later.	2014-08-08 18:59:46 -07:00
Eric Anholt	23b2bad991	vc4: Fix emit of ABS v2: Rebase on qir helpers.	2014-08-08 18:59:46 -07:00
Eric Anholt	cf2d777fbe	vc4: Add shader variant caching to handle FS output swizzle.	2014-08-08 18:59:46 -07:00
Eric Anholt	6cf86dd487	vc4: Load the tile buffer before incrementally drawing. We will want to occasionally disable this again when we do clear support. v2: Squash with the previous commit (I accidentally committed at two stages of writing the change)	2014-08-08 18:59:46 -07:00
Eric Anholt	c3f96060a8	vc4: Don't reallocate the tile alloc/state bos every frame. This was a problem for the simulator since we don't free memory back to it, and it would soon just run out.	2014-08-08 18:59:46 -07:00
Eric Anholt	21db430210	vc4: Add VC4_DEBUG env option v2: Fix an accidental deletion of some characters from the copyright message (caught by Ilia Mirkin)	2014-08-08 18:59:46 -07:00
Eric Anholt	2e35981d4d	vc4: Add support for SNE/SEQ/SGE/SLT.	2014-08-08 18:59:46 -07:00
Eric Anholt	7108c24fd0	vc4: Use the user's actual first vertex attribute. This is hardcoded to read it as RGBA32F so far, but starts to get more tests working.	2014-08-08 18:59:46 -07:00
Eric Anholt	427f934f9e	vc4: Fix UBO allocation when no uniforms are used. We do rely on a real BO getting allocated, so make sure we ask for a non-zero size.	2014-08-08 18:59:46 -07:00
Eric Anholt	db8712bcbc	vc4: Add initial support for math opcodes	2014-08-08 18:59:46 -07:00
Eric Anholt	792d1c92df	vc4: Switch to actually generating vertex and fragment shader code from TGSI. This introduces an IR (QIR, for QPU IR) to do optimization on. It's a scalar, SSA IR in general. It looks like optimization is pretty easy this way, though I haven't figured out if it's going to be good for our weird register allocation or not (or if I want to reduce to basically QPU instructions first), and I've got some problems with it having some multi-QPU-instruction opcodes (SEQ and CMP, for example) which I probably want to break down. Of course, this commit mostly doesn't work, since many other things are still hardwired, like the VBO data. v2: Rewrite to use a bunch of helpers (qir_OPCODE) for emitting QIR instructions into temporary values, and make qir_inst4 take the 4 args separately instead of an array (all later callers wanted individual args).	2014-08-08 18:59:46 -07:00
Eric Anholt	e59890aebb	vc4: Start converting the driver to use vertex shaders. Note: This is the cutoff point where I switched from developing primarily on the Pi to developing on the simulator. As a result, from this point on the code is untested on the Pi (the kernel code I have currently wasn't rendering anything at this commit, though the simulator renders successfully, suggesting kernel bugs).	2014-08-08 18:59:46 -07:00
Eric Anholt	1850d0a1cb	vc4: Initial skeleton driver import. This mostly just takes every draw call and turns it into a sequence of commands that clear the FBO and draw a single shaded triangle to it, regardless of the actual input vertices or shaders. I copied the initial driver skeleton mostly from freedreno, and I've preserved Rob Clark's copyright for those. I also based my initial hardcoded shaders and command lists on Scott Mansell (phire)'s "hackdriver" project, though the bit patterns of the shaders emitted end up being different. v2: Rebase on gallium megadrivers changes. v3: Rebase on PIPE_SHADER_CAP_MAX_CONSTS change. v4: Rely on simpenrose actually being installed when building for simulation. v5: Add more header duplicate-include guards. v6: Apply Emil's review (protection against vc4 sim and ilo at the same time, and dropping the dricommon drm bits) and fix a copyright header (thanks, Roland)	2014-08-08 18:59:46 -07:00
Roland Scheidegger	f017e32c0a	draw: (trivial) use information about gs being present from variant key This is a purely cosmetic change. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-09 03:52:58 +02:00
Roland Scheidegger	6d2ecdb4a6	draw: don't use clipvertex output if user plane clipping is disabled The non-llvm path made sure that both clip and pre_clip_pos point to the data output by position, not clipvertex, if user based clipping is disabled. However, the llvm path did not, which apparently led to failures if gl_ClipVertex was written but user plane clipping not enabled (bug 80183). Why I have no idea really, but just make it match the non-llvm behavior... Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-09 03:52:58 +02:00
Chris Forbes	0f4c5a70c6	i965: Get rid of backend_instruction::sampler The generators no longer use this. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-09 13:12:35 +12:00
Chris Forbes	298da9fa2a	i965/vec4/Gen8: Use src1 for sampler_index instead of ->sampler field Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-09 13:12:33 +12:00
Chris Forbes	6be68767b9	i965/vec4/Gen4-7: Use src1 for sampler_index instead of ->sampler field Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-09 13:12:31 +12:00
Chris Forbes	1a3fd11aef	i965/vec4: Pass sampler index in src1 for texture ops Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-09 13:12:29 +12:00
Chris Forbes	2f4e12a835	i965/vec4: Collect all emits of texture ops into one place Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-09 13:12:27 +12:00
Chris Forbes	db09fd5957	i965/fs/Gen8: Pass sampler_index to generate_tex Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-09 13:12:25 +12:00
Chris Forbes	ba5f7a361a	i965/fs/Gen4-7: Pass sampler_index to generate_tex Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-09 13:12:23 +12:00
Chris Forbes	191bc64f82	i965/blorp: Put sampler index in src1 of texture ops Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-09 13:12:21 +12:00
Chris Forbes	a578592fd2	i965/fs: pass sampler as src1 of texture op Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-09 13:12:17 +12:00
Chris Forbes	f6a0192f7d	i965/fs: Collect all emits of texture ops for Gen5/6 into one place Reduces duplication, and will do so even more when we change the sampler plumbing. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-09 13:12:13 +12:00
Chris Forbes	d1b136fdd0	i965/fs: Collect all emits of texture ops for Gen4 into one place Reduces duplication, and will do so even more when we change the sampler plumbing. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-09 13:11:33 +12:00
Pali Rohár	39a4cc45a4	configure: check for dladdr via AC_CHECK_FUNC/AC_CHECK_LIB Use both macros as in some cases using AC_CHECK_FUNCS alone may fail. Thus HAVE_DLADDR will not be defined, and as a result most of the code in megadriver_stub.c will not be compiled, breaking the backwards compatibility between older libGL/xserver(s) and DRI megadrivers. Cc: Jon TURNEY <jon.turney@dronecode.org.uk> Cc: "10.2" <mesa-stable@lists.freedesktop.org> [Emil Velikov] Commit message. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-08 19:26:39 +01:00
Emil Velikov	16826a36ef	util: remove ralloc_test The test is an empty stub, which we're currently building twice. If anyone is interested in expanding it (adding actual tests) they can always bring it back. Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-08 19:23:25 +01:00
Darius Goad	5492296318	gallivm: Handle MSAA textures in emit_fetch_texels This support is preliminary due to the fact that MSAA is not actually implemented. However, this patch does fix the piglit test: spec/!OpenGL 3.2/glsl-resource-not-bound 2DMS (bug #79740). (v2 RS: don't emit 4th coord as explicit lod) Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-08 18:54:08 +02:00
Roland Scheidegger	394ea139c7	draw: hack around weird primitive id input in gs The distinction between system values and ordinary inputs is not very obvious in gallium - further fueled by the fact that they use the same semantic names. Still, if there's any value which imho really is a system value, it's the primitive id input into the gs (while earlier (tessellation) stages could read it, it is _always_ generated by the system). For some odd reason though (which I'd classify as a bug but seems too complicated to fix) the glsl compiler in mesa treats this as an ordinary varying, and everything else after that (including the state tracker and other drivers) just goes along with that. But input fetching in gs for llvm based draw was definitely limited to the ordinary (2-dimensional) inputs so only worked with other state trackers, the code was also additionally relying on tgsi_scan_shader filling uses_primid correctly which did not happen either (would set it only for all stages if it was a system value, but only set it for the fragment shader if it was an input value). This fixes piglit glsl-1.50-geometry-primitive-id-restart and primitive-id-in in llvmpipe. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-08 18:54:08 +02:00
Roland Scheidegger	92a059d294	draw: fix prim id float cast for non-llvm path These values are always uints, casting them to floats does no good. Fixes piglit glsl-1.50-geometry-primitive-id-restart tests for softpipe. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-08 18:54:07 +02:00
Bruno Jiménez	ec73778f1f	clover: Add support for CL_MAP_WRITE_INVALIDATE_REGION OpenCL 1.2 CL_MAP_WRITE_INVALIDATE_REGION sounds a lot like PIPE_TRANSFER_DISCARD_RANGE: From OpenCL 1.2 spec: The contents of the region being mapped are to be discarded. From p_defines.h: Discards the memory within the mapped region. v2: Move the code for validating flags to the front-end as suggested by Francisco Jerez Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-08-08 18:06:14 +03:00
Chia-I Wu	8d853468bd	ilo: break down the format table The PRMs no longer have a single table for format capabilities. Multiple tables take up less space, and are easier to maintain. Encode typed write information while at it.	2014-08-08 20:23:56 +08:00
Kenneth Graunke	ae95b9dd9b	i965: Emit a performance warning on conditional rendering. We have a CPU-side implementation of conditional rendering; it really should be done on the GPU. It's not necessarily that hard, but nobody has gotten to fixing it yet. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-08 00:52:10 -07:00
Kenneth Graunke	e9a9d441f0	i965: Set ExecSize to 16 for loop instructions in SIMD16 shaders. Previously, we explicitly set the execution size to BRW_EXECUTE_8 and disabled compression for loop instructions. I can't imagine how this could be correct in SIMD16 mode. Looking at the history, it appears that this code has used BRW_EXECUTE_8 since 2007, when we had a SIMD8 backend that supported control flow and a separate SIMD16 backend that didn't. Presumably, when we added SIMD16 support for shaders with control flow, we simply neglected to update it. Note that Gen4-5 don't support SIMD16 on shaders with control flow. This might be a candidate for stable, but would need to be rewritten completely due to the brw_inst API changes in master. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-08 00:51:50 -07:00
Kenneth Graunke	e64dbd050d	i965/eu: Merge brw_CONT and gen6_CONT. The only difference is setting PopCount on Gen4-5. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-08 00:51:44 -07:00
Kenneth Graunke	e7a7b3317c	i965/eu: Drop redundant brw_set_src0/brw_set_dest from gen6_CONT. We shouldn't need to set them, then set them differently. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-08 00:51:34 -07:00
Juha-Pekka Heikkila	d64be94294	util: add src/util/format_srgb.c to .gitignore format_srgb.c is generated by format_srgb.py python script, having format_srgb.c in git ignore list will silence git complaints about untracked file. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-08 09:49:52 +03:00
Ian Romanick	89d92fc00e	mesa: Fold _mesa_uniform_merge_location_offset into its only caller Also delete the comment before that function. Everything in that comment was either stale, wrong, or captured elsewhere. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-07 16:17:55 -07:00
Ian Romanick	1c759e32d8	mesa: Fold _mesa_uniform_split_location_offset into its only caller Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-07 16:17:53 -07:00
Ian Romanick	e0c867372a	glsl_to_tgsi: Delete unused function set_uniform_initializer Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-07 16:17:50 -07:00
Ian Romanick	8f81f4e185	mesa: Use MAX2 to calculate maximum uniform element Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-07 16:17:48 -07:00
Ian Romanick	411abcb237	mesa: Have validate_uniform_parameters return the gl_uniform_storage pointer This simplifies all the callers, and it enables the removal of one of the function parameters. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-07 16:17:45 -07:00
Carl Worth	f28a105868	glsl/glcpp: Rename one test to avoid a duplicate test number With two tests both numbered 118, there was a confusing off-by-two difference between the last test number and the total number of tests (as reported by glcpp-test). With this rename, there's only an off-by-one difference left, (which is easy to understand given the zero-based test numbering). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:29 -07:00
Carl Worth	41540997fb	glsl/glcpp: Fix handling of commas that result from macro expansion Here is some additional stress testing of nested macros where the expansion of macros involves commas, (and whether those commas are interpreted as argument separators or not in subsequent function-like macro calls). Credit to the GCC documentation that directed my attention toward this issue: https://gcc.gnu.org/onlinedocs/gcc-3.2/cpp/Argument-Prescan.html Fixing the bug required only removing code from glcpp. When first testing the details of expansions involving commas, I had come to the mistaken conclusion that an expanded comma should never be treated as an argument separator, (so had introduced the rather ugly COMMA_FINAL token to represent this). In fact, an expanded comma should be treated as a separator, (as tested here), and this treatment can be avoided by judicious use of parentheses (as also tested here). With this simple removal of the COMMA_FINAL token, the behavior of glcpp matches that of gcc's preprocessor for all of these hairy cases. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:29 -07:00
Carl Worth	318369aceb	glsl/glcpp: Integrate recent glcpp-test-cr-lf test into "make check" Beyond just listing this in the TESTS variable in Makefile.am, only minor changes were needed to make this work. The primary issue is that the build system runs the test script from a different directory than the script itself. So we have to use the $srcdir variable to find the test input files. Using $srcdir in this way also ensures that this test works when using an out-of-tree build. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:29 -07:00
Carl Worth	7ba74c65a7	glsl/glcpp: Fix glcpp-test to correctly extract test-specific arguments The (optional) test-specific command-line arguments to be passed to glcpp are embedded within the source files of some tests, and glcpp-test uses grep to extract them. Of course, grep is line-based and looks for the native line-separator to determine line boundaries. So, for files using non-native line separators, grep was getting quite confused and passing bogus arguments to glcpp. Fix this by canonical-izing the line separators in the source file prior to using grep. With this commit, the glcpp-test-cr-lf tests pass entirely: \r: 143/143 tests pass \r\n: 143/143 tests pass \n\r: 143/143 tests pass Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:29 -07:00
Carl Worth	f1340745c0	glsl/glcpp: Fix line-continuation code to handle multiple newline flavors Sometimes the newline separator is a single character, and sometimes it is two characters. Before we can fold away any line-continuation backslashes, we identify the flavor of line separator that is in use. With this identified, we then correctly search for backslashes followed immediately by the first character of the line separator. Also, when re-inserting newlines to replace collapsed newlines, we carefully insert newlines of the same flavor. With this commit, almost all remaining tests are fixed as tested by glcpp-test-cr-lf: \r: 142/143 tests pass \r\n: 142/143 tests pass \n\r: 143/143 tests pass (The only remaining failures have nothing to do with the actual pre-processor code, but are due to a bug in the way the test suite uses grep to try to extract test-specific command-line options from the source files.) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:29 -07:00
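The folding step described in the commit above can be sketched in Python. This is an illustration only, not the glcpp implementation: the helper names `detect_newline` and `fold_continuations` are hypothetical, and placing the re-inserted newlines directly after the joined logical line is an assumption.

```python
import re

def detect_newline(source):
    """Return the first line separator found in the text (hypothetical helper)."""
    match = re.search(r"\r\n|\n\r|\r|\n", source)
    return match.group(0) if match else "\n"

def fold_continuations(source):
    """Join backslash-continued lines, keeping the total line count unchanged."""
    newline = detect_newline(source)
    out, buffered, folded = [], "", 0
    for line in source.split(newline):
        if line.endswith("\\"):
            # Drop the backslash and remember that one newline was collapsed.
            buffered += line[:-1]
            folded += 1
        else:
            out.append(buffered + line)
            # Re-insert the collapsed newlines, in the same flavor, as empty lines.
            out.extend([""] * folded)
            buffered, folded = "", 0
    if folded:
        # The input ended on a continued line; flush what was buffered.
        out.append(buffered)
        out.extend([""] * (folded - 1))
    return newline.join(out)
```

Re-inserting the collapsed newlines keeps later line numbers (and therefore `__LINE__` and error locations) aligned with the original source.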
Carl Worth	ec69e00843	glsl/glcpp: Don't include any newline characters in #error token Some tests were failing because the message printed by #error was including a '\r' character from the source file in its output. This is easily avoided by fixing the regular expression for #error to never include any of the possible newline characters, (neither '\r' nor '\n'). With this commit 2 tests are fixed for each of the '\r' and '\r\n' cases. Current results after the commit are: \r: 137/143 tests pass \r\n: 142/143 tests pass \n\r: 139/143 tests pass Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:29 -07:00
Carl Worth	04e40fd337	glsl/glcpp: Treat CR+LF pair as a single newline The GLSL specification says that either carriage-return, line-feed, or both together can be used to terminate lines. Further, it says that when used together, the pair of terminators shall be interpreted as a single line. This final requirement has not been respected by glcpp up until now, (it has been emitting two newlines for every CR+LF pair). Here, we fix the lexer by using a regular expression for NEWLINE that eats up both "\r\n" (or even "\n\r") if possible before also considering a single '\n' or a single '\r' as a line terminator. Before this commit, the test results are as follows: \r: 135/143 tests pass \r\n: 4/143 tests pass \n\r: 4/143 tests pass After this commit, the test results are as follows: \r: 135/143 tests pass \r\n: 140/143 tests pass \n\r: 139/143 tests pass So, obviously, a dramatic improvement. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:29 -07:00
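The NEWLINE rule described in the commit above can be illustrated with a short Python sketch. It is not the flex rule from glcpp; the pattern and the helper name `split_lines` are assumptions made for illustration.

```python
import re

# Try the two-character terminators first so that "\r\n" (or "\n\r") is
# consumed as one newline; only then fall back to a lone "\r" or "\n".
NEWLINE = re.compile(r"\r\n|\n\r|\r|\n")

def split_lines(source):
    """Split text on any of the four terminator flavors (hypothetical helper)."""
    return NEWLINE.split(source)
```

Ordering the alternatives from longest to shortest is what prevents a CR+LF pair from being counted as two line breaks.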
Carl Worth	f4ddd026c6	glsl/glcpp: Add test script for testing various line-termination characters The GLSL specification has a very broad definition of what is a newline. Namely, it can be the carriage-return character, '\r', the newline character, '\n', or any combination of the two, (though in combination, the two are treated as a single newline). Here, we add a new test-runner, glcpp-test-cr-lf, that, for each possible line-termination combination, runs through the existing test suite with all source files modified to use those line-termination characters. Instead of using the .expected files for this, this script assumes that the regular test suite has been run already and expects the output to match the .out files. This avoids getting 4 test failures for any one bug, and instead will hopefully only report bugs actually related to the line-termination characters. The new testing is not yet integrated into "make check". For that, some munging of the testdir option will be necessary, (to support "make check" with out-of-tree builds). For now, the scripts can just be run directly by hand. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:29 -07:00
Carl Worth	218e878b54	glsl/glcpp: Fix for macros that expand to include "defined" operators Prior to this commit, the following snippet would trigger an error in glcpp: #define FOO defined BAR #if FOO #endif The problem was that support for the "defined" operator was implemented within the grammar, (where the parser was parsing the tokens of the condition itself). But what is required is to interpret the "defined" operator that results after macro expansion is performed. I could not find any fix for this case by modifying the grammar alone. The difficulty is that outside of the grammar we already have a recursive function that performs macro expansion (_glcpp_parser_expand_token_list) and that function itself must be augmented to be made aware of the semantics of the "defined" operator. The reason we can't simply handle "defined" outside of the recursive expansion function is that not only must we scan for any "defined" operators in the original condition (before any macro expansion occurs); but at each level of the recursive expansion, we must again scan the list of tokens resulting from expansion and handle "defined" before entering the next level of recursion to further expand macros. And of course, all of this is context dependent. The evaluation of "defined" operators must only happen when we are handling preprocessor conditionals, (#if and #elif) and not when performing any other expansion, (such as in the main body). To implement this, we add a new "mode" parameter to all of the expansion functions to specify whether resulting DEFINED tokens should be evaluated or ignored. One side benefit of this change is that an ugly wart in the grammar is removed. We previously had "conditional_token" and "conditional_tokens" productions that were basically copies of "pp_token" and "pp_tokens" but with added productions for the various forms of DEFINED operators. With the new code here, those ugly copy-and-paste productions are eliminated from the grammar. 
A new "make check" test is added to stress-test the code here. This commit fixes the following Khronos GLES3 CTS tests: conditional_inclusion.basic_2_vertex conditional_inclusion.basic_2_fragment Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:29 -07:00
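The interaction between macro expansion and the "defined" operator described above can be sketched in Python. This is a simplified model, not `_glcpp_parser_expand_token_list`: it handles object-like macros only, has no guard against self-referential macros, and the names `eval_defined`, `expand` and `evaluate_defined` are hypothetical.

```python
def eval_defined(tokens, macros):
    """Replace `defined NAME` and `defined ( NAME )` with "1" or "0"."""
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "defined" and i + 1 < len(tokens):
            if (tokens[i + 1] == "(" and i + 3 < len(tokens)
                    and tokens[i + 3] == ")"):
                name, i = tokens[i + 2], i + 4
            else:
                name, i = tokens[i + 1], i + 2
            out.append("1" if name in macros else "0")
        else:
            out.append(tokens[i])
            i += 1
    return out

def expand(tokens, macros, evaluate_defined):
    """Recursively expand object-like macros.

    When evaluate_defined is true (the #if / #elif case), `defined` is
    resolved at every level of recursion before further expansion.
    """
    if evaluate_defined:
        tokens = eval_defined(tokens, macros)
    out = []
    for tok in tokens:
        if tok in macros:
            out.extend(expand(macros[tok], macros, evaluate_defined))
        else:
            out.append(tok)
    return out
```

With `macros = {"FOO": ["defined", "BAR"]}`, expanding `["FOO"]` in conditional mode resolves the `defined BAR` produced by the expansion, while the same call with the mode disabled leaves the tokens untouched.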
Carl Worth	a48ff781c1	glsl/glcpp: Swallow empty #pragma directives. Previously, we were passing these through, just like any other pragma. But the downstream compiler was tripping up on them. It seems easier to swallow these in the preprocessor and not pass them on at all rather than fixing the downstream compiler. This fixes the following Khronos GLES3 CTS tests: preprocessor.pragmas.pragma_vertex preprocessor.pragmas.pragma_fragment Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:29 -07:00
Carl Worth	bf9bce5bea	glsl/glcpp: Fix #pragma to not over-increment the line-number count Previously, the #pragma directive was swallowing an entire line, (including the final newline). At that time it was appropriate for it to increment the line count. More recently, our handling of #pragma changed to not include the newline. But the code to increment yylineno stuck around. This was causing __LINE__ to be increased by one more than desired for every #pragma. Remove the bogus, extra increment, and add a test for this case. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:29 -07:00
Carl Worth	9a54b07651	glsl/glcpp: Add testing for null directives with spaces and comments This new "make check" test stresses out the support from the last two commits, (to ensure that '#' is correctly interpreted as the null directive, regardless of any whitespace or comments on the same line). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:29 -07:00
Carl Worth	c0127c30dd	glsl/glcpp: Fix NULL directives when followed by a single-line comment This is the fix for the following line: # // comment to ignore here According to the translation-phase rules, the comment should be removed before the preprocessor looks to interpret the null directive. So in our implementation we must explicitly look for single-line comments in the <HASH> start condition as well. This commit fixes the following Khronos GLES3 CTS tests: null_directive_vertex null_directive_fragment Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:29 -07:00
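The translation-phase ordering mentioned in the commit above (comments are removed before the directive is interpreted) can be sketched as follows. This is a Python illustration, not the lexer's `<HASH>` start condition; `is_null_directive` is a hypothetical helper and it does not handle comment markers inside string literals.

```python
import re

def is_null_directive(line):
    """Return True if the line is a lone '#' once comments are removed."""
    # Each comment is replaced by a single space before the line is examined.
    without_comments = re.sub(r"//.*|/\*.*?\*/", " ", line)
    return without_comments.strip() == "#"
```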
Carl Worth	e84e159caa	glsl/glcpp: Add tests for #define followed by comments This simply tests the previous commit, (that #define followed by a comment will still generate the expected "#define without macro name" error message). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:29 -07:00
Carl Worth	b4b2a5c3f3	glsl/glcpp: Allow single-line comments immediately after #define We were already correctly supporting single-line comments in cases like: #define FOO bar // comment here... The new support added here is simply for the none-too-useful: #define // comment instead of macro name With this commit, this line will now give the expected "#define without macro name" error message instead of the lexer just going off into the weeds. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:29 -07:00
Carl Worth	b76482e731	glsl/glcpp: Add test for "#define without macro name" This ensures that the previous commit indeed generates the expected error message when a "#define" directive is not followed by anything except for a newline. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:28 -07:00
Carl Worth	a196ab1f8a	glsl/glcpp: Add explicit error for "#define without macro name" Previously, glcpp would emit an error like this if <EOF> happened to occur immediately after the "#define", but in general would just get confused, (leading to un-helpful error messages). To fix things to generate a clean error message, we do a few things: 1. Don't require horizontal whitespace immediately after #define 2. Add a production for the error case, (DEFINE_TOKEN followed immediately by a NEWLINE token). 3. Make the lexer reset to the <INITIAL> state after every NEWLINE. This 3rd point prevents the lexer from getting so confused and generating further spurious errors in the file because it was stuck in the <DEFINE> start condition. We also drop the similar error message from the <EOF> rule since the newly-added rule will have already printed the error message. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-07 16:08:28 -07:00
Matt Turner	b6ab52b7f9	docs: List GL+GLSL versions as parts of a whole. Listing the GLSL version as an individual component of a GL version, separate from the extensions isn't really right. The GLSL changes are (almost?) entirely comprised of changes listed in the extensions. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-08-07 16:00:24 -07:00
Matt Turner	bbd5dd5226	i965/vec4: Remove unused emit_bool_comparison method. Apparently unused since it was added in commit `af3c9803`. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-08-07 16:00:24 -07:00
Matt Turner	50d5fc192b	mesa: Drop USE_IEEE define. I think OpenVMS was the only platform that Mesa ran on that used a non-IEEE representation for floats. We removed OpenVMS support a while back, and this should alleviate the need to continue updating the this-platform-uses-IEEE list. The one bit of this patch that needs review is the IS_INF_OR_NAN, because I'm not sure if MSVC supports isfinite. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82268 Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-07 16:00:24 -07:00
Ian Romanick	4837b130a7	mesa: Group gl_system_value values by the stage where they exist Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-07 15:19:59 -07:00
Ian Romanick	5d7275c350	glsl_to_tgsi: Assert that the _mesa_sysval_to_semantic mapping is correct Future patches will rearrange the values in gl_system_value, and I want to catch errors. Designated initializers would make all of this unnecessary. v2: Don't use STATIC_ASSERT. Not only does it not work, but GCC doesn't tell you that it's not going to work. Thanks for nothing! Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-07 15:19:57 -07:00
Ian Romanick	21ef7f58e3	mesa/st: Only one copy of mesa_sysval_to_semantic Future patches will necessitate changes to the table, and I only want to update one. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-07 15:19:55 -07:00
Ian Romanick	1c887ae6e2	glsl_to_tgsi: Constify mesa_sysval_to_semantic Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-07 15:19:53 -07:00
Kenneth Graunke	b7679639bc	i965/clip: Fix brw_clip_unfilled.c/compute_offset's assembly. Due to the destination register width of 1 or 2, these instructions get ExecSize 1 or 2. But dir and offset (used as src0) are both registers of width 4, violating the execsize >= width assertion. I honestly don't think this could have ever worked. Fixes Piglit's polygon-offset and polygon-mode-offset tests on Gen4-5. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70441 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-07 13:22:52 -07:00
Tapani Pälli	151fb1e808	glsl: support unsigned increment in ir_loop controls Current version can create ir_expression where operands have different base type, patch adds support for unsigned type. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> https://bugs.freedesktop.org/show_bug.cgi?id=80880	2014-08-07 07:31:49 +03:00
Jason Ekstrand	787bac3808	mesa/formats: Fix the size of ETC2_SRGB8_PUNCHTHROUGH_ALPHA1 Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-08-06 15:15:53 -07:00
Jason Ekstrand	bb89d82ac4	mesa/formats: Use the correct swizzle parameter for the 11-bit EAC formats Red-only formats should be x001 and RG formats should be xy01. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-08-06 15:15:44 -07:00
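The swizzle strings named in the commit above (x001 for red-only, xy01 for RG) can be read as follows. The Python sketch is illustrative; `apply_swizzle` is a hypothetical helper, not a Mesa function.

```python
def apply_swizzle(swizzle, channels):
    """Build an RGBA tuple from stored channels using a 4-character swizzle.

    'x', 'y', 'z', 'w' select the 1st to 4th stored channel;
    '0' and '1' fill the slot with a constant.
    """
    lookup = {"x": 0, "y": 1, "z": 2, "w": 3}
    out = []
    for selector in swizzle:
        if selector == "0":
            out.append(0.0)
        elif selector == "1":
            out.append(1.0)
        else:
            out.append(channels[lookup[selector]])
    return tuple(out)
```

A red-only texel `(0.25,)` read through `x001` yields `(0.25, 0.0, 0.0, 1.0)`: missing color channels read as zero and alpha reads as one.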
Roland Scheidegger	6e9005e8b0	draw: fix clipvertex trouble if position comes from gs If the vertex shader has no position but the gs has, the clipvertex output was -1 (because it's the same as vs position in this case if there's no explicit clipvertex output). This caused crashes (or assertion failures) in clipping since in the end position (which came from gs) was different from cv (-1) and we then tried to use the bogus cv input. Rather than just test for -1 cv value in clipping, make it explicitly return the position output of the gs instead which seems cleaner (since we really don't want to use the clipvertex value from the vs (it could be a valid value in the (unsupported) case of vs writing clipvertex but still using a gs)). This fixes piglit shader_runner clip-distance-out-values.shader_test. Reviewed-by: Zack Rusin <zackr@vmware.com>	2014-08-06 18:01:33 +02:00
Roland Scheidegger	11bd6f0e9b	draw: don't run pipeline stages when gs has no position output The clip stage may crash if there's no position output, for this reason code was added to avoid running the pipeline stages in this case (`c7c7186045`). However, this failed to actually work when there was a geometry shader, since unlike the vertex shader it did not initialize the position output to -1, hence the code trying to detect this didn't trigger. So simply initialize the position output to -1 just like the vs does. This fixes piglit glsl-1.50-transform-feedback-type-and-size (segfault->pass). clip-distance-out-values.shader_test goes from segfault to assertion failure, suggesting more fixes are needed, no other piglit changes. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2014-08-06 18:01:33 +02:00
Vinson Lee	c40d7d6d94	dri/xmlconfig: s/uint/unsigned int/ This patch fixes this build error on Mac OS X. ./xmlconfig.h:61:5: error: unknown type name 'uint'; did you mean 'int'? uint nRanges; /**< \brief Number of ranges */ ^~~~ int ./xmlconfig.h:79:5: error: unknown type name 'uint'; did you mean 'int'? uint tableSize; ^~~~ int Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 16:52:42 -07:00
Brian Paul	1125d021de	mesa: include stdint.h in formats.h To get uint8_t type, to fix MSVC build. Trivial.	2014-08-05 13:07:46 -06:00
Jason Ekstrand	fc2b2d337e	mesa/texstore: Add a generic rgba integer texture upload path Again, we delete a lot of functions that aren't really doing anything interesting anymore. v2: Comment the texstore_rgba_integer function Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 10:56:17 -07:00
Jason Ekstrand	d267b75715	mesa/texstore: Add a generic float/normalized rgba texture upload path This commit also removes a bunch of functions which aren't doing anything more interesting than the general path does. v2: Better comment the texstore_via_float function Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 10:56:17 -07:00
Jason Ekstrand	3dbf5bf657	mesa/texstore: Use _mesa_swizzle_and_convert when possible This should be both faster and more accurate than our general slow-path of converting everything to float. v2: Add a comment to top of the texstore_swizzle function Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 10:56:17 -07:00
Jason Ekstrand	4c8fc26835	main/texstore: Split texture storage into three functions This commit splits the texture storage into three functions: texstore_depth_stencil, texstore_compressed, and texstore_rgba. Right now this split seems artificial since we just have one function pointer per format and there is no difference between these three categories. However, this split makes it much easier to write a more general function upload path for one of these categories than the current function pointers. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 10:56:17 -07:00
Jason Ekstrand	6b912dc129	mesa/format_utils: Add a function to convert a mesa_format to an array format This commit adds the _mesa_format_to_array function that determines if the given format can be represented as an array format and computes the array format parameters. This is a direct helper function for using _mesa_swizzle_and_convert. v2: Better documentation and commit message v3: Fixed a potential segfault from an invalid endianness swizzle Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 10:56:16 -07:00
Jason Ekstrand	d55f77b503	mesa/format_utils: Add a general format conversion function Most format conversion operations required by GL can be performed by converting one channel at a time, shuffling the channels around, and optionally filling missing channels with zeros and ones. This adds a function to do just that in a general, yet efficient, way. v2: * Add better comments including full docs for functions * Don't use __typeof__ * Use inline helpers instead of writing out conversions by hand, * Force full loop unrolling for better performance v3: Add another set of parens around the MAX_INT macro Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 10:56:16 -07:00
Jason Ekstrand	452d64986b	mesa/imports: Add a _mesa_half_is_negative helper function Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 10:56:16 -07:00
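The log does not show how `_mesa_half_is_negative` is implemented. One plausible reading, sketched in Python under that assumption, is a test of the sign bit (bit 15) of the IEEE 754 half-precision encoding. Both helper names below are hypothetical.

```python
import struct

def half_is_negative(h):
    """Return True if the 16-bit half-float bit pattern has its sign bit set."""
    return (h & 0x8000) != 0

def float_to_half_bits(value):
    """Encode a Python float as an IEEE 754 half and return its 16-bit pattern."""
    return struct.unpack("<H", struct.pack("<e", value))[0]
```

Under a pure sign-bit test, negative zero (`0x8000`) counts as negative.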
Jason Ekstrand	850fb0d1dc	mesa/formats: Add layout and swizzle information v2: Move the MESA_FORMAT_SWIZZLE enum to the top of the file Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 10:56:16 -07:00
Jason Ekstrand	55a929955f	mesa/formats: Remove IndexBits Mesa hasn't supported color-indexed textures for some time. This is 0 for all texture formats, so we don't need to store it. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 10:56:15 -07:00
Jason Ekstrand	12610ffcf7	mesa/formats: Autogenerate the format_info structure from a CSV file Instead of having all of the format metadata in a gigantic hard-to-edit array of type struct format_info, we now have a human-readable CSV file. The CSV file also contains more format information than the format_info struct contained so we can potentially make format_info more detailed later. The python to generate the format information was added in the previous commit. This commit turns it on in both automake and scons builds. v2: Split into two commits and stuff to generate format_info.c from scons Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 10:56:15 -07:00
Jason Ekstrand	3420565310	mesa/main: Add python code to generate the format_info structure This adds a python script called format_info.py that is used to generate a single format_info.c file that contains the filled-out format_info array. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 10:56:15 -07:00
Jason Ekstrand	d4c780e052	mesa: Add python to parse the formats CSV file The basic concept for the format parser was taken from the format CSV parser in gallium/auxiliary/util. However, this one has been altered in a number of ways: * Removed big endian vs. little endian stuff (mesa doesn't need it) * Better documentation: Almost every method has a full docstring * An actual Swizzle class with methods for composition and inverses * Over-all cleaner (in my opinion) implementation and class interactions * A few bug fixes Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 10:56:15 -07:00
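The composition and inverse operations mentioned for the Swizzle class can be illustrated with plain index tuples. This Python sketch is not the class from the commit: the function names are hypothetical, and constant selectors (0 and 1) are left out so that every swizzle is a permutation and therefore invertible.

```python
def apply(swizzle, vec):
    """Output slot k reads input slot swizzle[k]."""
    return tuple(vec[i] for i in swizzle)

def compose(outer, inner):
    """Swizzle equivalent to applying `inner` first, then `outer`."""
    return tuple(inner[i] for i in outer)

def inverse(swizzle):
    """Swizzle that undoes `swizzle` (permutations only)."""
    inv = [0] * len(swizzle)
    for k, src in enumerate(swizzle):
        inv[src] = k
    return tuple(inv)
```

For example, with `bgra = (2, 1, 0, 3)`, applying `inverse(bgra)` to `apply(bgra, v)` returns `v` unchanged.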
Jason Ekstrand	056cc47e12	mesa: Add a format description CSV file Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 10:56:15 -07:00
Jason Ekstrand	1d47f67455	util/tests/hash_table: Link against libmesautil instead of libmesa Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82159 Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-05 10:52:48 -07:00
Brian Paul	36de884ffd	st/mesa: adjust Z coordinates for quad clearing Specify the quad's Z position in clip coordinate space, not normalized Z space. Use viewport scale, translation = 0.5, 0.5. Before, we were specifying the quad's Z position in [0,1] and using viewport scale=1.0, translate=0.0. That works fine, unless your driver needs to work in clip coordinate space and needs to reconstruct viewport near/far values from the scale/translation factors. The VMware svga driver falls into that category. When we did that reconstruction we wound up with near=-1 and far=1 which are outside the limits of [0,1]. In some cases, this caused the quad to be drawn at the wrong depth. In other cases it was clipped away. Fixes some scissored depth clears with VMware driver. This should have no effect on other drivers. We're already using these values for the glBitmap and glDraw/CopyPixels code. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-08-05 10:21:18 -06:00
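The arithmetic behind the commit above can be checked with a small Python sketch. It assumes the usual viewport mapping `z_window = z_clip * scale + translate` with clip-space Z in [-1, 1]; the function names are hypothetical and this is not the svga driver code.

```python
def window_z(z_clip, scale, translate):
    """Map a clip-space Z value to window depth."""
    return z_clip * scale + translate

def reconstruct_near_far(scale, translate):
    """Recover depth-range near/far from the viewport Z scale and translation.

    Clip Z = -1 maps to near and clip Z = +1 maps to far.
    """
    return (translate - scale, translate + scale)

def depth_to_clip_z(depth):
    """Convert a depth value in [0, 1] to clip-space Z in [-1, 1]."""
    return 2.0 * depth - 1.0
```

With the old setup (scale 1.0, translation 0.0) the reconstruction yields near = -1 and far = 1, outside [0, 1]. With scale and translation both 0.5 it yields near = 0 and far = 1, and a clear depth of 0.25 still lands at window depth 0.25 once it is expressed in clip space.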
Brian Paul	6719914f98	mesa: make vertex array type error checking a little more efficient Compute the bitmask of supported array types once instead of every time we call a GL vertex array function. Reviewed-by: Matthew McClure <mcclurem@vmware.com>	2014-08-05 10:18:34 -06:00
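The optimization described in the commit above, computing the bitmask of supported array types once and reusing it, can be sketched in Python. The bit names, the `Context` class and the `has_half_float` flag are all hypothetical stand-ins; the actual Mesa structures are not shown in this log.

```python
# One bit per vertex array data type (hypothetical names and values).
BYTE_BIT, SHORT_BIT, INT_BIT, FLOAT_BIT, HALF_FLOAT_BIT = (1 << i for i in range(5))

class Context:
    def __init__(self, has_half_float):
        self.has_half_float = has_half_float
        self.legal_types_mask = None   # filled in on first use
        self.mask_computations = 0     # only here to show the work happens once

    def _compute_legal_types_mask(self):
        self.mask_computations += 1
        mask = BYTE_BIT | SHORT_BIT | INT_BIT | FLOAT_BIT
        if self.has_half_float:
            mask |= HALF_FLOAT_BIT
        return mask

    def is_legal_type(self, type_bit):
        """Check a type against the cached mask instead of rebuilding it per call."""
        if self.legal_types_mask is None:
            self.legal_types_mask = self._compute_legal_types_mask()
        return bool(type_bit & self.legal_types_mask)
```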
Michel Dänzer	3347c634d0	glsl_to_tgsi: Fix typo shader_program -> shader This was a regression introduced by commit `f4b0ab7afd` ('st/mesa: fix incorrect size of UBO declarations') which caused an assertion failure while compiling shaders of e.g. UE4 demos. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81834 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 23:34:00 +09:00
Brian Paul	8563335b65	mesa: update wglext.h to version 20140630 Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-05 08:19:02 -06:00
Brian Paul	c344f45333	mesa: update glxext.h to version 20140725 Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-05 08:18:58 -06:00
Brian Paul	d96607970b	mesa: update glext.h to version 20140725 Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-05 08:18:50 -06:00
Neil Roberts	816dbdb106	meta: Disable dithering during glBlitFramebuffer According to the GL spec the only fragment operations that should affect glBlitFramebuffer are “the pixel ownership test, the scissor test, and sRGB conversion”. That implies that dithering should not be performed so we need to disable it when implementing the blit with a render. Before commit `05b52efbc9` the dithering state would be left as whatever the application picks (the default being GL_TRUE) and after that commit it was explicitly enabled. Neither of these were correct. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81828 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-05 14:48:15 +01:00
Emil Velikov	afcf5d33cf	libgl-xlib: drop duplicate mesautil from scons build Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-05 13:56:35 +01:00
Emil Velikov	4f0f75deba	llvmpipe/tests: automake: link against libmesautil.la Or the build will fail due to unresolved symbols. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-05 13:56:32 +01:00
Emil Velikov	07a275991e	gallium/tests: automake: link against libmesautil.la Or the build will fail due to unresolved symbols. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-05 13:56:30 +01:00
Emil Velikov	692009cab1	targets/omx: automake: link against libmesautil.la Or the build will fail due to unresolved symbols. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-05 13:56:27 +01:00
Emil Velikov	807b5467a3	targets/xvmc: automake: link against libmesautil.la Or the build will fail due to unresolved symbols. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-05 13:53:43 +01:00
Jan Vesely	d0b4ac642b	targets/clover: link against libmesautil.la Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-05 12:54:08 +09:00
Jan Vesely	e28136343b	gallivm: Fix build with latest LLVM Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2014-08-05 12:52:56 +09:00
Roland Scheidegger	6b834af77e	targets/dri: link with mesautil Similar to other recent build fixes. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-05 04:13:17 +02:00
Roland Scheidegger	9042e8863a	gallium/docs: Document TEX2/TXL2/TXB2 instructions and fix up other tex doc Add documentation for TEX2/TXL2/TXB2 tgsi opcodes. Also, the texture opcode documentation wasn't very accurate so fix this up a bit. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-05 04:13:17 +02:00
Roland Scheidegger	c3c33756ff	gallivm: fix cube map array (and cube map shadow with bias) handling In particular need to handle TEX2/TXB2/TXL2 opcodes. cube map shadow with bias already used TXB2 which didn't work before at all, despite that there's by default no piglit change (but using no_quad_lod and no_rho_opt indeed passes some more tex-miplevel-selection tests). The actual sampling code still won't handle cube map arrays. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 04:13:17 +02:00
Roland Scheidegger	ea05cfaaca	llvmpipe: implement support for cube map arrays This just covers the resource side of things, not the actual sampling. Here things are trivial as cube map arrays are identical to 2d arrays in all respects. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-05 04:13:17 +02:00
Anuj Phogat	d308f57fe7	egl: Fix OpenGL ES version checks in _eglParseContextAttribList() We would generate EGL_BAD_CONFIG because _eglGetContextAPIBit returns zero for the combination of EGL_OPENGL_ES_API and a major version > 3. By just returning zero, the caller can't tell the difference between a bad version (which should generate EGL_BAD_MATCH) and a bad API (which should generate EGL_BAD_CONFIG). This patch causes us to filter out major versions > 3 at a point where we can generate the correct error. Fixes gles3 Khronos CTS test: egl_create_context.egl_create_context V2: Fix commit message as suggested by Ian. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-04 18:31:26 -07:00
Anuj Phogat	338fef61f8	meta: Fix datatype computation in get_temp_image_type() Changes in the patch will cause datatype to be computed correctly for 8 and 16 bit integer formats. For example: GL_RG8I, GL_RG16I etc. Fixes many failures in gles3 Khronos CTS test: copy_tex_image_conversions_required copy_tex_image_conversions_forbidden Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-04 17:19:42 -07:00
Anuj Phogat	4bab55c874	meta: Move the call to _mesa_get_format_datatype() out of switch Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-04 17:19:41 -07:00
Anuj Phogat	7de90890c6	meta: Use _mesa_get_format_bits() to get the GL_RED_BITS We currently get red bits from ctx->DrawBuffer->Visual.redBits by making a false assumption that the texture we're writing to (in glCopyTexImage2D()) is used as a DrawBuffer. Fixes many failures in gles3 Khronos CTS test: copy_tex_image_conversions_required Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-04 17:19:41 -07:00
Anuj Phogat	9796a17265	meta: Initialize the variable in declaration statement Saves one line of code :) Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-04 17:19:20 -07:00
Anuj Phogat	c7def2257a	mesa: Allow GL_TEXTURE_CUBE_MAP target with compressed internal formats GL_TEXTURE_CUBE_MAP is an allowed texture target in glTexStorage2D() and is allowed to be used (like GL_TEXTURE_2D) with compressed internal formats. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-04 17:12:29 -07:00
Anuj Phogat	2fc4205461	mesa: Add gles3 condition for normalized internal formats in glCopyTexImage*() Fixes many failures in gles3 Khronos CTS test: packed_pixels Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 17:12:23 -07:00
Anuj Phogat	938b3d0034	mesa: Add utility function _mesa_is_enum_format_unorm() V2: Add missing formats. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 17:12:14 -07:00
Anuj Phogat	6df48ff27a	mesa: Add gles3 error condition for GL_RGBA10_A2 buffer format in glCopyTexImage*() Fixes many failures in gles3 Khronos CTS test: packed_pixels Khronos bug# 9807 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 17:12:05 -07:00
Anuj Phogat	5c0d2a12f3	mesa: Add a gles3 error condition for sized internalformat in glCopyTexImage*() Fixes many failures in gles3 Khronos CTS test: packed_pixels V2: Add the check for alpha bits to avoid confusion. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 17:11:52 -07:00
Anuj Phogat	e0fe00eeac	mesa: Add a helper function _mesa_is_enum_format_unsized() Function is utilized by next patch in the series. V2: Add missing formats. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 17:11:44 -07:00
Anuj Phogat	2d362a6aee	mesa: Don't allow snorm internal formats in glCopyTexImage*() in GLES3 Fixes a few failures in gles3 Khronos CTS test: packed_pixels Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 17:11:34 -07:00
Anuj Phogat	845b5ec89f	mesa: Add utility function _mesa_is_enum_format_snorm() Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 17:11:25 -07:00
Anuj Phogat	3c7a0c690a	mesa: Fix condition for using compressed internalformat in glCompressedTexImage3D() Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 17:11:18 -07:00
Anuj Phogat	e27c9f3a02	mesa: Add error condition for using compressed internalformat in glTexStorage3D() Fixes gles3 Khronos CTS test: texture_storage_texture_internal_formats Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 17:11:10 -07:00
Anuj Phogat	ac2adf66c1	mesa: Turn target_can_be_compressed() into a utility function V2: Declare the function in teximage.h Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 17:11:00 -07:00
Anuj Phogat	a94d78438d	mesa: Fix error condition for valid texture targets in glTexStorage* functions Fixes gles3 Khronos CTS test: texture_storage_texture_targets Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-08-04 17:10:48 -07:00
Ian Romanick	7b18983147	glsl: Rebuild the symbol table without unreachable symbols Previously we had to keep unreachable global symbols in the symbol table because the symbol table is used during linking. Having the symbol table retain pointers to freed memory... what could possibly go wrong? At the same time, this meant that we kept live references to tons of memory that was no longer needed. New strategy: destroy the old symbol table, and make a new one from the reachable symbols. Valgrind massif results for a trimmed apitrace of dota2: n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B) Before (32-bit): 59 40,642,425,451 76,337,968 69,720,886 6,617,082 0 After (32-bit): 46 40,661,487,174 75,116,800 68,854,065 6,262,735 0 Before (64-bit): 79 37,179,441,771 106,986,512 98,112,095 8,874,417 0 After (64-bit): 64 37,200,329,700 104,872,672 96,514,546 8,358,126 0 A real savings of 846KiB on 32-bit and 1.5MiB on 64-bit. v2: (by Kenneth Graunke) Just add the ir_function from the IR stream, rather than looking it up in the symbol table; they're now identical. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-04 15:48:09 -07:00
Kenneth Graunke	3d051772c8	glsl: Only create one ir_function for a given name. Piglit's spec/glsl-1.10/linker/override-builtin-{const,uniform}-05 tests do the following: 1. Call abs(float) - a built-in function. 2. Create a user-defined replacement for abs(float). 3. Call abs(float) again - now the user function. At step 1, we created an ir_function which included the built-in signature, added it to the symbol table, and emitted it into the IR stream. Then, when processing the function definition at step 2, we'd see that there was already an ir_function. But, since there were no user-defined functions, we skipped over a bunch of code, and ended up creating a second one. This new ir_function shadowed the original in the symbol table, but both ended up in the IR stream. This results in an awkward situation where searching for an ir_function via the symbol table, a forward linked list walk, and a reverse linked list walk may return different ir_functions. This seems undesirable. This patch instead re-uses the existing ir_function, putting both built-in and user-defined signatures in the same one. The previous patch's additional filtering ensures everything continues working. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-04 15:48:06 -07:00
Kenneth Graunke	21129d4de3	glsl: Make it possible to ignore built-ins when matching signatures. Historically, we've implemented the rules for overriding built-in functions by creating multiple ir_functions and relying on the symbol table to hide the one containing built-in functions. That works, but has a few drawbacks, so the next patch will change it. Instead, we'll have a single ir_function for a particular name, which will contain both built-in and user-defined signatures. Passing an extra parameter to matching_signature makes it easy to ignore built-ins when they're supposed to be hidden. I didn't add the parameter to exact_matching_signature since it wasn't necessary. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-08-04 15:47:06 -07:00
Kenneth Graunke	f82f2fb3dc	mesa: Actually use the Mesa IR optimizer for ARB programs. On Haswell, this cuts 1-3 instructions from 183 vertex shaders in "Shadowrun Returns", "Shatter", and "Trine 2." It adds 2 instructions to a single fragment shader in "Closure." total instructions in shared programs: 278803 -> 278546 (-0.09%) instructions in affected programs: 41930 -> 41673 (-0.61%) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-08-04 15:43:56 -07:00
Ian Romanick	b48621c348	glsl: Do not add extra padding to structures This code was attempting to align the base of the structure to the required alignment of the structure. However, it had two problems: 1. It was aligning the target structure member, not the base of the structure. 2. It was calculating the alignment based on the members previous to the target member instead of all the members of the structure. Fixes gles3conform failures in: ES3-CTS.shaders.uniform_block.random.nested_structs.6 ES3-CTS.shaders.uniform_block.random.nested_structs_arrays_instance_arrays.2 ES3-CTS.shaders.uniform_block.random.nested_structs_arrays_instance_arrays.6 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.5 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.19 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.0 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.2 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.6 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.12 v2: Fix rebase failure noticed by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-04 14:40:07 -07:00
Ian Romanick	b17a4d5dab	glsl: Correctly determine when the field of a UBO is row-major Previously if a field of a block with an instance name was marked row-major (but the block itself was not), we would think the field (and its sub-fields) were column-major. Fixes gles3conform failures in: ES3-CTS.shaders.uniform_block.random.basic_types.7 ES3-CTS.shaders.uniform_block.random.basic_types.9 ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.1 ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.3 ES3-CTS.shaders.uniform_block.random.nested_structs.3 ES3-CTS.shaders.uniform_block.random.nested_structs.5 ES3-CTS.shaders.uniform_block.random.nested_structs.8 ES3-CTS.shaders.uniform_block.random.nested_structs_arrays.3 ES3-CTS.shaders.uniform_block.random.nested_structs_arrays.6 ES3-CTS.shaders.uniform_block.random.nested_structs_arrays.7 ES3-CTS.shaders.uniform_block.random.nested_structs_arrays.8 ES3-CTS.shaders.uniform_block.random.nested_structs_arrays.9 ES3-CTS.shaders.uniform_block.random.nested_structs_instance_arrays.0 ES3-CTS.shaders.uniform_block.random.nested_structs_instance_arrays.1 ES3-CTS.shaders.uniform_block.random.nested_structs_instance_arrays.2 ES3-CTS.shaders.uniform_block.random.nested_structs_instance_arrays.3 ES3-CTS.shaders.uniform_block.random.nested_structs_instance_arrays.4 ES3-CTS.shaders.uniform_block.random.nested_structs_instance_arrays.6 ES3-CTS.shaders.uniform_block.random.nested_structs_arrays_instance_arrays.0 ES3-CTS.shaders.uniform_block.random.nested_structs_arrays_instance_arrays.1 ES3-CTS.shaders.uniform_block.random.nested_structs_arrays_instance_arrays.5 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.0 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.4 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.7 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.8 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.12 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.14 
ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.15 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.16 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.1 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.8 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.9 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.10 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.11 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.13 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.14 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.15 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.16 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.17 Fixes gles3conform failures (caused by previous commits) in: ES3-CTS.shaders.uniform_block.random.basic_types.8 ES3-CTS.shaders.uniform_block.random.basic_arrays.3 ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.0 ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.2 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.9 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.13 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.18 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.4 v2: Fix rebase failure noticed by Matt. v3: Use without_array() instead of older predicates. v4: s/GLSL_MATRIX_LAYOUT_DEFAULT/GLSL_MATRIX_LAYOUT_INHERITED/g Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> [v2]	2014-08-04 14:40:07 -07:00
Ian Romanick	b71f149a44	linker: Use the matrix layout information in ir_variable and glsl_type for UBO layout Use the data that is stored in the ir_variable and the glsl_type to determine whether or not a UBO member is row-major. Fixes gles3conform failures in: ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat2 ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat3 ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat4 ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat2x3 ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat2x4 ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat3x2 ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat3x4 ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat4x2 ES3-CTS.shaders.uniform_block.instance_array_basic_type.shared.row_major_mat4x3 ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat2 ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat3 ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat4 ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat2x3 ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat2x4 ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat3x2 ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat3x4 ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat4x2 ES3-CTS.shaders.uniform_block.instance_array_basic_type.packed.row_major_mat4x3 ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat2 ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat3 ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat4 
ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat2x3 ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat2x4 ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat3x2 ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat3x4 ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat4x2 ES3-CTS.shaders.uniform_block.instance_array_basic_type.std140.row_major_mat4x3 ES3-CTS.shaders.uniform_block.random.nested_structs_arrays.2 ES3-CTS.shaders.uniform_block.random.nested_structs_instance_arrays.5 ES3-CTS.shaders.uniform_block.random.nested_structs_instance_arrays.9 Causes gles3conform failures in: ES3-CTS.shaders.uniform_block.random.basic_types.8 ES3-CTS.shaders.uniform_block.random.basic_arrays.3 ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.0 ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.2 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.13 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.18 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.4 These failures will be fixed shortly. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-04 14:40:07 -07:00
Ian Romanick	d561e79a67	glsl: Track matrix layout of variables using two bits Fixes gles3conform failures in: ES3-CTS.shaders.uniform_block.random.nested_structs_arrays_instance_arrays.3 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.13 Causes gles3conform failures in: ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.9 This failure will be fixed shortly. v2: Use without_array() instead of older predicates. v3: s/GLSL_MATRIX_LAYOUT_DEFAULT/GLSL_MATRIX_LAYOUT_INHERITED/g Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]	2014-08-04 14:40:07 -07:00
Ian Romanick	68fa4cab1a	glsl: Also track matrix layout information into structures Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-04 14:40:07 -07:00
Ian Romanick	814d694160	glsl: Track matrix layout of structure fields using two bits v2: Rename GLSL_MATRIX_LAYOUT_DEFAULT to GLSL_MATRIX_LAYOUT_INHERITED. Add comments in glsl_types.h explaining the layouts. Suggested by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-04 14:40:07 -07:00
Ian Romanick	ab7098c8df	glsl: Correctly load columns of a row-major matrix For a row-major matrix, the next column starts at the next element. Fixes gles3conform failures in: ES3-CTS.shaders.uniform_block.single_basic_array.shared.row_major_mat2 ES3-CTS.shaders.uniform_block.single_basic_array.shared.row_major_mat3 ES3-CTS.shaders.uniform_block.single_basic_array.shared.row_major_mat4 ES3-CTS.shaders.uniform_block.single_basic_array.shared.row_major_mat2x3 ES3-CTS.shaders.uniform_block.single_basic_array.shared.row_major_mat2x4 ES3-CTS.shaders.uniform_block.single_basic_array.shared.row_major_mat3x2 ES3-CTS.shaders.uniform_block.single_basic_array.shared.row_major_mat3x4 ES3-CTS.shaders.uniform_block.single_basic_array.shared.row_major_mat4x2 ES3-CTS.shaders.uniform_block.single_basic_array.shared.row_major_mat4x3 ES3-CTS.shaders.uniform_block.single_basic_array.packed.row_major_mat2 ES3-CTS.shaders.uniform_block.single_basic_array.packed.row_major_mat3 ES3-CTS.shaders.uniform_block.single_basic_array.packed.row_major_mat4 ES3-CTS.shaders.uniform_block.single_basic_array.packed.row_major_mat2x3 ES3-CTS.shaders.uniform_block.single_basic_array.packed.row_major_mat2x4 ES3-CTS.shaders.uniform_block.single_basic_array.packed.row_major_mat3x2 ES3-CTS.shaders.uniform_block.single_basic_array.packed.row_major_mat3x4 ES3-CTS.shaders.uniform_block.single_basic_array.packed.row_major_mat4x2 ES3-CTS.shaders.uniform_block.single_basic_array.packed.row_major_mat4x3 ES3-CTS.shaders.uniform_block.single_basic_array.std140.row_major_mat2 ES3-CTS.shaders.uniform_block.single_basic_array.std140.row_major_mat3 ES3-CTS.shaders.uniform_block.single_basic_array.std140.row_major_mat4 ES3-CTS.shaders.uniform_block.single_basic_array.std140.row_major_mat2x3 ES3-CTS.shaders.uniform_block.single_basic_array.std140.row_major_mat2x4 ES3-CTS.shaders.uniform_block.single_basic_array.std140.row_major_mat3x2 ES3-CTS.shaders.uniform_block.single_basic_array.std140.row_major_mat3x4 
ES3-CTS.shaders.uniform_block.single_basic_array.std140.row_major_mat4x2 ES3-CTS.shaders.uniform_block.single_basic_array.std140.row_major_mat4x3 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.9 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-04 14:40:07 -07:00
Ian Romanick	7f731340d2	linker: Add padding after the last field of a structure This causes the thing following the structure to be vec4-aligned. Fixes gles3conform failures in: ES3-CTS.shaders.uniform_block.random.nested_structs.2 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.5 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-04 14:40:07 -07:00
Ian Romanick	47c6fc5b04	linker: Add a last_field parameter to various program_resource_visitor methods I also considered renaming visit_field(const glsl_struct_field *) to entry_record and adding an exit_record method. This would be more similar to the hierarchical visitor. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-04 14:40:06 -07:00
Ian Romanick	46356c46ea	mesa: Do not list inactive block members as active Fixes gles3conform failures in: ES3-CTS.shaders.uniform_block.single_nested_struct.per_block_buffer_packed ES3-CTS.shaders.uniform_block.single_nested_struct_array.per_block_buffer_packed ES3-CTS.shaders.uniform_block.random.scalar_types.7 ES3-CTS.shaders.uniform_block.random.basic_arrays.4 ES3-CTS.shaders.uniform_block.random.basic_arrays.6 ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.2 ES3-CTS.shaders.uniform_block.random.nested_structs.9 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.3 Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-04 14:40:06 -07:00
Ian Romanick	1ca25abe25	glsl: Do not eliminate 'shared' or 'std140' blocks or block members Commit `32f32292` (glsl: Allow elimination of uniform block members) enabled elimination of unused uniform block members to fix a gles3 conformance test failure. This went too far the other way. Section 2.11.6 (Uniform Variables) of the OpenGL ES 3.0.3 spec says: "All members of a named uniform block declared with a shared or std140 layout qualifier are considered active, even if they are not referenced in any shader in the program. The uniform block itself is also considered active, even if no member of the block is referenced." Fixes gles3conform failures in: ES3-CTS.shaders.uniform_block.single_nested_struct.per_block_buffer_shared ES3-CTS.shaders.uniform_block.single_nested_struct.per_block_buffer_std140 ES3-CTS.shaders.uniform_block.single_nested_struct_array.per_block_buffer_shared ES3-CTS.shaders.uniform_block.single_nested_struct_array.per_block_buffer_std140 ES3-CTS.shaders.uniform_block.random.scalar_types.2 ES3-CTS.shaders.uniform_block.random.scalar_types.9 ES3-CTS.shaders.uniform_block.random.vector_types.1 ES3-CTS.shaders.uniform_block.random.vector_types.3 ES3-CTS.shaders.uniform_block.random.vector_types.7 ES3-CTS.shaders.uniform_block.random.vector_types.9 ES3-CTS.shaders.uniform_block.random.basic_types.5 ES3-CTS.shaders.uniform_block.random.basic_types.6 ES3-CTS.shaders.uniform_block.random.basic_arrays.0 ES3-CTS.shaders.uniform_block.random.basic_arrays.2 ES3-CTS.shaders.uniform_block.random.basic_arrays.5 ES3-CTS.shaders.uniform_block.random.basic_arrays.8 ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.0 ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.4 ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.5 ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.6 ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.9 ES3-CTS.shaders.uniform_block.random.nested_structs.0 
ES3-CTS.shaders.uniform_block.random.nested_structs.1 ES3-CTS.shaders.uniform_block.random.nested_structs_arrays.4 ES3-CTS.shaders.uniform_block.random.nested_structs_instance_arrays.8 ES3-CTS.shaders.uniform_block.random.nested_structs_arrays_instance_arrays.7 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.3 ES3-CTS.shaders.uniform_block.random.all_per_block_buffers.6 ES3-CTS.shaders.uniform_block.random.all_shared_buffer.18 v2: Whitespace and other minor fixes suggested by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-04 14:40:06 -07:00
Ian Romanick	6305caea52	glsl: Use the without_array predicate to simplify some code Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> [v1] Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2014-08-04 14:40:06 -07:00
Ian Romanick	22f7a46d74	glsl: Add without_array type predicate Returns the type without any arrays. This will be used in later patches in this series. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-04 14:40:06 -07:00
Ian Romanick	146be3ddbe	glsl: Use constant_expression_value instead of as_constant Just a few lines earlier we may have wrapped the index expression with an ir_unop_i2u expression. Whenever that happens, as_constant will return NULL, and that almost always happens. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2014-08-04 14:40:06 -07:00
Brian Paul	b249712643	targets/graw-gdi: link with mesautil, not mesautils Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 15:22:48 -06:00
Brian Paul	a3bdbef020	wmesa: link with mesautil Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 15:22:48 -06:00
Brian Paul	d6a7ff6d3b	osmesa: link with mesautil Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 15:22:48 -06:00
Brian Paul	c4e23f039e	targets/libgl-gdi: link with mesautil Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 15:22:48 -06:00
Brian Paul	0ba5d8010d	targets/egl-static: link with libmesautil.la Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 15:22:48 -06:00
Brian Paul	b0b9871f69	mesa/x86: put code in braces to silence declarations after code warning Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-08-04 15:22:48 -06:00
Jason Ekstrand	ea705a4537	src/Makefile.am: Move gtest before util Since the ralloc test in util/tests needs gtest, we need to make sure that the gtest subdir is loaded first. This fixes bug #82148. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-04 13:21:08 -07:00
Brian Paul	9b10bc5589	util: include c99_compat.h in format_srgb.h to get 'inline' definition Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 14:06:13 -06:00
Brian Paul	04764f3bd9	util: include c99_compat.h in hash_table.h to get 'inline' definition Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 14:06:13 -06:00
Brian Paul	b035869ff8	targets/vdpau: link with libmesautil.la to fix build breakage Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 14:06:13 -06:00
Brian Paul	9f88893829	xlib: fix missing mesautil build breakage Fixes the non-DRI build. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-08-04 14:06:13 -06:00
Matthew McClure	ff0cbfb3db	svga: SVGA_3D_CMD_BIND_GB_SHADER needs to reserve two relocations. With this patch, the SVGA_3D_CMD_BIND_GB_SHADER functionality will reserve two relocations, one for the shader ID and the second for the MOB ID. Verified with the WDDM winsys path that the number of relocations and patch locations required is two. Fixes Bug 1277406 Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-08-04 14:06:13 -06:00
Jason Ekstrand	0236e75b2a	gallium: Add libmesautil dependency to gbm and xa targets Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-04 12:38:01 -07:00
Jason Ekstrand	e97498ef81	mesa/main: Use the RGB <-> sRGB conversion functions in libmesautil Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-04 11:07:20 -07:00
Jason Ekstrand	992e1ea8e4	gallium: Move sRGB <-> RGB handling to libmesautil Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-04 11:07:15 -07:00
Jason Ekstrand	efa0aa8ffc	util: Gather some common macros This gathers macros that have been included across components into util so that the include chain can be more vertical. In particular, this makes util stand on its own without any dependence whatsoever on the rest of mesa. Signed-off-by: "Jason Ekstrand" <jason.ekstrand@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-04 11:07:10 -07:00
Kenneth Graunke	72e55bb688	util: Move the open-addressing linear-probing hash_table to src/util. This hash table is used in core Mesa, the GLSL compiler, and the i965 driver, which makes it a good candidate for the new src/util module. It's much faster than program/hash_table.[ch] (see commit `6991c2922f` for data), and José's u_hash_table.c has a comment saying Gallium should probably consider switching to a linear probing hash table at some point. So this seems like the best candidate for a shared data structure. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> v2 (Jason Ekstrand): Pick up another hash_table use and patch up scons Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-04 11:07:05 -07:00
Kenneth Graunke	1e0da6233b	util: Move ralloc to a new src/util directory. For a long time, we've wanted a place to put utility code which isn't directly tied to Mesa or Gallium internals. This patch creates a new src/util directory for exactly that purpose, and builds the contents as libmesautil.la. ralloc seemed like a good first candidate. These days, it's directly used by mesa/main, i965, i915, and r300g, so keeping it in src/glsl didn't make much sense. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> v2 (Jason Ekstrand): More ralloc uses and some scons fixes Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-04 11:06:58 -07:00
Jason Ekstrand	dcc29c18b4	mesa/SConscript: Use Makefile.sources instead of duplicating the file lists Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-04 11:06:52 -07:00
Emil Velikov	87e719ae98	targets/dri: resolve the scons build With an earlier commit we conditionally enabled/added the kms_dri target for automake builds. Unfortunately we forgot to add the appropriate define in the scons build, resulting in a broken library due to the undefined symbol 'kms_swrast_create_screen'. Reported-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Roland Scheidegger <sroland@vmware.com>	2014-08-04 18:26:35 +01:00
Jan Vesely	cf3c73cf20	mesa/st: Fix compiler warnings both array and index are unsigned types Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-04 09:51:44 -06:00
Jan Vesely	6614def764	gallium: Fix compiler warning. warning: type qualifiers ignored on function return type Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-04 09:51:38 -06:00
Tapani Pälli	d66acc7077	glsl: fix switch statement default case regressions This patch fixes regressions caused by commit `48deb4d`. Regressions happened because 'run_default' var did not get initialized when default case was the last one. Now all the switch tests in es3conform suite are passing. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81857	2014-08-04 12:32:59 +03:00
Aaron Watry	47e5039680	st/dri: Fix driver loading if swrast isn't built If building hardware drivers only, then kms_swrast_create_screen won't be defined in inline_drm_helper.h and hardware drivers will fail to dlopen as a result. Copy the #if guards from inline_drm_helper.h to dri_kms_init_screen to make the definition/use of the function match. Fixes radeonsi_dri.so dlopen with the following configure: ./configure --with-dri-drivers= --with-dri-driverdir=/usr/local/lib/dri/ \ --enable-gbm --enable-gallium-gbm --enable-debug --enable-opencl \ --enable-opencl-icd --with-gallium-drivers=radeonsi \ --with-egl-platforms=drm --enable-glx-tls --enable-texture-float \ --enable-omx Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-03 12:13:47 -05:00
Ilia Mirkin	7b3d0a9a1e	mesa/st: only convert AND(a, NOT(b)) into MAD when not using native integers Native integers imply a somewhat different handling of booleans. Instead of being 1.0/0.0 floats, they are -1 (true) / 0 (false) integers. As such the original optimization no longer applies. Reported-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-08-03 10:05:53 -04:00
Marek Olšák	152006e149	Remove XA state tracker support for Radeon We don't support this type of X acceleration and we never did. Other drivers might want to do the same thing. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-03 14:00:25 +02:00
Carl Worth	179c5d4e6d	docs: Import 10.2.5 release notes, add news item.	2014-08-02 22:54:26 -07:00
Ilia Mirkin	47b064fd8a	mesa/st: add support for dynamic ubo selection Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com> (v1) v2: fix src register, use index2D for base of 1 Acked-by: Marek Olšák <marek.olsak@amd.com>	2014-08-02 23:51:40 -04:00
Kenneth Graunke	5d90926052	i965: Delete stale "pre-gen4" comment in texture validation code. In commit `16060c5adc`, Eric changed the code to not relayout just for baselevel changes - only if the range of miplevels actually increases. So this comment is now wrong. Notably, the i915 version of the code actually does what the comment says. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-02 05:25:06 -07:00
Kenneth Graunke	8ccae4fe28	i965: Delete sampler state structures. We've moved to using bitshifts (like we did for surface state); nothing uses the structures anymore. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:16:41 -07:00
Kenneth Graunke	b8c2538e17	i965: Replace sizeof(struct gen7_sampler_state) with the size itself. These are the last users of struct gen7_sampler_state. v2: Use a local sampler_state_size variable, to help distinguish the various 16s (suggested by Topi Pohjolainen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:16:41 -07:00
Kenneth Graunke	7da612e8d0	i965: Drop sizeof(struct brw_sampler_state) from estimated prim size. This is the last user of the structure. v2: Use a local variable with a sensible name so people know what 16 is. (Suggested by Topi Pohjolainen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:16:41 -07:00
Kenneth Graunke	3d1a4d1f5b	i965: Make BLORP use brw_emit_sampler_state(). This simplifies the code, removes use of the old structures, and also allows us to combine the Gen6 and Gen7+ code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:16:41 -07:00
Kenneth Graunke	6b5b78b518	i965: Delete redundant sampler state dumping code. Although the Gen4-6 and Gen7+ variants used different structure types, they didn't use any of the fields - only the size, which is identical. So both decoders did exactly the same thing. Someday we should implement useful decoders for SAMPLER_STATE. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:16:40 -07:00
Kenneth Graunke	3f3e0be666	i965: Make some brw_sampler_state.c functions static again. Now that gen7_sampler_state.c is gone, everything is once again in a single file. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:16:40 -07:00
Kenneth Graunke	2fe2fe1fce	i965: Stop using gen7_update_sampler_state; rm gen7_sampler_state.c. The code in brw_sampler_state.c now handles all generations; we don't need the extra Gen7+ only code anymore. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:16:40 -07:00
Kenneth Graunke	7679393f56	i965: Make brw_update_sampler_state use 8 bits for LOD fields on Gen7+. This was the only actual difference between Gen4-6 and Gen7+ in terms of the values we program. The rest was just mechanical structure rearrangement. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:16:40 -07:00
Kenneth Graunke	a50b640dfe	i965: Make brw_update_sampler_state() use brw_emit_sampler_state(). Instead of stuffing bits directly into the brw_sampler_state structure, we now store them in local variables, then use brw_emit_sampler_state() to assemble the packet. This separates the decision about what values to use from the actual packet emission, which makes the code more reusable across generations. v2: Put const on a bunch of local variables and move declarations, as suggested by Topi Pohjolainen. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:16:40 -07:00
Kenneth Graunke	05f0796eb6	i965: Introduce a function to emit a SAMPLER_STATE structure. This simply assembles all the SAMPLER_STATE fields into their proper bit locations. Making it work on all generations was easy enough; some of the fields are even in the same place. Not used by anything yet, but will be soon. I made it non-static so BLORP can use it too. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:16:40 -07:00
Kenneth Graunke	7cdb0a30fa	i965: Add const to upload_default_color's sampler parameter. It doesn't edit the value, and this lets us use const in more places. Needed to implement Topi's review comments for the next patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-08-02 05:16:18 -07:00
Kenneth Graunke	b590a1237c	i965: Add #defines for SAMPLER_STATE fields. We'll use these to replace the existing structures. I've adopted the convention that "BRW" applies to all hardware, and "GENX" applies starting with generation X, but might be replaced by some later generation. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:14:42 -07:00
Kenneth Graunke	eee8196782	i965: Convert wrap mode #defines to an enum. This makes it easy to tell that they're grouped together, and also improves gdb printing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:14:42 -07:00
Kenneth Graunke	6afe21da62	i965: Delete gen7_upload_sampler_state_table and vtable mechanism. brw_upload_sampler_state_table now handles all generations, so we don't need the vtable mechanism either. There's still a lot of code duplication; the next patches will address that. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:14:42 -07:00
Kenneth Graunke	c2f231e181	i965: Make brw_upload_sampler_state_table handle Gen7+ as well. This copies a few changes from gen7_upload_sampler_state_table; the next patch will delete that function. Gen7+ has per-stage sampler state pointer update packets, so we emit them as soon as we emit a new table for a stage. On Gen6 and earlier, we have a single packet, so we delay until we've changed everything that's going to be changed. v2: Split 3DSTATE_SAMPLER_STATE_POINTERS_XS packet emission into a helper function (suggested by Topi Pohjolainen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:14:42 -07:00
Kenneth Graunke	8fbc96ca74	i965: Shift brw_upload_sampler_state_table away from structures. The Gen4-6 and Gen7+ code is virtually identical, but both use different structure types. Switching to use a uint32_t pointer and operate on the number of DWords will make it possible to share code. It turns out that SAMPLER_STATE is the same number of DWords on every platform currently; it will be easy to handle a change there, though. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:14:42 -07:00
Kenneth Graunke	378eea9708	i965: Push computation for sampler state batch offsets up a level. Other than this, brw_update_sampler_state only deals with a single SAMPLER_STATE structure, and doesn't need to know which position it is in the table. The caller takes care of dealing with multiple sampler states. Pushing this up a level allows us to drop the ss_index parameter. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:14:42 -07:00
Kenneth Graunke	7efa183e8f	i965: Drop unused 'ss_index' parameter from gen7_update_sampler_state. This was copied from the Gen4-6 code, but is unused. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:14:42 -07:00
Kenneth Graunke	a381592a8e	i965: Stop storing sdc_offset in brw_stage_state. sdc_offset is produced and consumed in the same function, so there's no need to store it in the context, nor pass pointers to it through various call chains. Saves 128 bytes per brw_stage_state structure, and makes the code clearer as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:14:42 -07:00
Kenneth Graunke	9a1a8cb84d	i965: Drop the degenerate brw_sampler_default_color structure. It's just an array of four floats, and we have an array of four floats, so this is literally just a memcpy...but with custom structs and strange macros to give the appearance of doing something more. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:14:42 -07:00
Kenneth Graunke	c8e2549785	i965: Write a better file comment for brw_sampler_state.c. The old one has been inaccurate for years. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:14:42 -07:00
Kenneth Graunke	3f67fb4dc3	i965: Rename brw_wm_sampler_state.c to brw_sampler_state.c. When the driver was originally written, it only supported texturing in the pixel shader backend; vertex and geometry shader texturing came much later. Originally, the pixel shader was referred to as "WM" (the Windowizer/Masker unit). So, this code happened to only be relevant for the WM stage, at the time. However, sampler state really applies to all stages, so putting "wm" in the filename doesn't make sense. I dropped it in gen7_sampler_state.c; at this point the asymmetry just trips people up. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:14:42 -07:00
Kenneth Graunke	6e12035834	i965/blorp: Don't set min_mag_neq bit in Gen6 SAMPLER_STATE. The "Min/Mag State Not Equal" bit is supposed to be set when the min/mag filters or address rounding modes differ. BLORP uses identical min/mag settings, so the bit should be unset. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-08-02 05:14:42 -07:00
Kevin Rogovin	e41cc45361	define GL_OES_standard_derivatives if extension is supported Define the macro GL_OES_standard_derivatives as 1 if the extension GL_OES_standard_derivatives is supported. V2 [Chris]: Correct trailing whitespace Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-02 11:52:41 +12:00
Roland Scheidegger	3b69347efc	llvmpipe: don't store number of layers per level This could be recalculated, though it turns out the only use of it after resource allocation is for calculating whole resource size (for scene size accounting though that isn't quite ideal either). Thus, instead just store the whole resource size and drop it (saving a couple bytes of storage per resource). It makes things simpler too. Note that for the accounting winsys resources always come back with size 0 but this is unchanged (we don't actually know the size in any case). Also reformat llvmpipe_texture_layout (drop unneeded indentation). v2: adapt to previous changes. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-01 23:31:11 +02:00
Roland Scheidegger	7e7aebbbd0	llvmpipe: integrate memory allocation into llvmpipe_texture_layout Seems pointless to just duplicate some of the calculations (the calculation of actual memory used compared to what was predicted in llvmpipe_texture_layout actually could have differed slightly in some cases due to different alignment rules used though this should have been of no consequence). v2: keep the previous mip alignment of MAX2(64, cacheline). This was added for ARB_map_buffer_alignment - I'm not convinced it's needed for textures, but it was supposed to be cleanup without functional change. Also replace div with 64bit mul / comparison. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-08-01 23:31:11 +02:00
Roland Scheidegger	47096fbb5d	llvmpipe: get rid of impossible code in alloc_image_data Only used for non display target resources. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-08-01 23:31:11 +02:00
Jordan Justen	c860a379d2	i965/miptree: Layout 1D Array as 2D Array with height of 1 1D array miptrees were being laid out as a 2D texture with 1 slice. This happened due to the mesa core storing the 1D array slice count in the height field. On Intel hardware, we want to create a 2D array with a height of 1 for the 1D array case. Fixes assertion failure in piglit (gen6, gen8): spec/glsl-1.30/execution/tex-miplevel-selection textureOffset 1DArrayShadow In release builds of Mesa, this test was observed to cause a GPU hang on gen8. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81450 Tested-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-08-01 11:13:07 -07:00
Glenn Kennard	3a9278b92c	r600g: Implement gpu_shader5 textureGather Adds 0-3 textureGather component selection and non-constant offsets Caveat: 0 and 1 texture swizzles only work if textureGather component select is 3 or a component that does not exist in the sampler texture format. This is a hardware limitation, any other value returns 128/255=0.501961 for both 0 and 1. Passes all textureGather piglit tests on radeon 6670, except for those using 0/1 texture swizzles due to aforementioned reason. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-08-01 16:19:47 +02:00
Aditya Atluri	f455f34ab9	mesa: Add missing atomic buffer bindings and unbindings Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-01 15:50:26 +02:00
Michel Dänzer	150ac07b85	r600g/radeonsi: Prefer VRAM for CPU -> GPU streaming buffers Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-01 11:25:27 +09:00
Michel Dänzer	8898fff46c	r600g/radeonsi: Reduce or even drop special treatment of persistent mappings Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-08-01 11:04:16 +09:00
Jon TURNEY	095c37e472	target-helpers: Do not build kms_dri on libdrm-less platforms. Fix build since `3b176c441b` for dri_platform=none hosts. Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-08-01 01:31:58 +01:00
Glenn Kennard	b1eb00cd40	r600g: gpu_shader5 gl_SampleMaskIn support Map TGSI_SEMANTIC_SAMPLEMASK to register/component. Enable face register when sample mask is needed by shader. Requires Evergreen/Cayman Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-07-31 11:51:05 +02:00
Glenn Kennard	2768a56f58	r600g: Implement gpu_shader5 integer ops Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-07-31 11:51:04 +02:00
Glenn Kennard	2133a1aedf	r600g: Add IMUL_HI/UMUL_HI support Fixes fs-imulExtended, fs-imulExtended-only-msb, fs-umulExtended, fs-umulExtended-only-msb piglit tests. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-07-31 11:51:04 +02:00
Glenn Kennard	a48b615006	r600g: Implement GL_ARB_texture_query_lod Requires Evergreen or later v2 (Andreas): Update relnotes/10.3 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)	2014-07-31 11:51:04 +02:00
Eric Anholt	1da4bb5b97	gbm: Log at least one dlerror() when we fail to open any drivers. We don't want to log every single error (such as all the ones where the file wasn't even present in our list of search paths), but if you didn't find any driver, then seeing at least one error is useful (since the common case as a developer is a single DEFAULT_DRIVER_DIR or GBM_DRIVERS_PATH entry). v2: Rebase on swrast changes. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-07-30 22:31:30 -07:00
Eric Anholt	ef81ce9909	gbm: Fix a debug log message Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-07-30 22:30:19 -07:00
Eric Anholt	bfb0da9fa7	gallium: Add a uif() helper function to complement fui() I found myself often wanting this when I'm printing out a uint32_t mapping of some GPU data, and I want to put in an interpretation of that value as a float. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-07-30 22:30:19 -07:00
Vinson Lee	bf3a26266d	glapi: Do not use backtrace on DragonFly. execinfo.h is not available on DragonFly. Fixes this build error. CC glapi_gentable.lo glapi_gentable.c:44:22: fatal error: execinfo.h: No such file or directory Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-07-30 21:48:09 -07:00
Roland Scheidegger	5a12155503	gallivm: fix up out-of-bounds level when using conformant out-of-bound behavior When using (d3d10) conformant out-of-bound behavior for texel fetching (currently always enabled) the level still needs to be set to a safe value even though the offset in the end won't get used because the level is used to look up the mip offset itself and the actual strides, which might otherwise crash. For simplicity, we'll use level 0 in this case (this ought to be safe, llvmpipe does not actually fill in level 0 information if first_level is larger, but some random strides / offsets shouldn't hurt as ultimately we always use offset 0 in this case). Fixes a crash in some in-house test where random huge levels appear in lp_build_fetch_texel() (the test actually uses level 0 always but if the fetching happens in a block with an execution mask random values may appear). CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-07-31 01:31:06 +02:00
Giovanni Campagna	e57ad3d38c	dri: Add a new capability for drivers that can't share buffers The kms-dri swrast driver cannot share buffers using the GEM, so it must tell the loader to disable extensions relying on that, without disabling the image DRI extension altogether (which would prevent the loader from working at all). This requires a new gallium capability (which is queried on the pipe_screen and for swrast drivers it's forwarded to the winsys), and requires a new version of the DRI image extension. [Emil Velikov] - Rebased on top of gallium-dri megadrivers. - Drop PIPE_CAP_BUFFER_SHARE and sw_winsys::get_param hook. The can_share_buffer cap is set at InitScreen. We use a different InitScreen (and thus value for the cap) function for kms_dri, due to deeper differences originating from dri megadrivers. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-07-30 16:43:41 +01:00
Giovanni Campagna	3b176c441b	gallium: Add a dumb drm/kms winsys backed swrast provider Add a new winsys and target that can be used with a dri2 state tracker and loader instead of drisw. This allows using gbm as a dri2/image loader and avoids the extra copy from the backbuffer to the shadow frontbuffer. The new driver is called "kms_swrast", and is loaded by gbm as a fallback, because it is only useful with the gbm platform (as no buffer sharing is possible). To force select the driver, set the environment variable GBM_ALWAYS_SOFTWARE [Emil Velikov] - Rebase on top of gallium megadriver. - s/text/test/ in configure.ac (Spotted by Andreas Pokorny). - Add scons support for winsys/sw/kms-dri and fix the build. - Provide separate DriverAPI, due to different InitScreen hook. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-07-30 16:33:09 +01:00
Giovanni Campagna	8430af5ebe	Add support for swrast to the DRM EGL platform Turn GBM into a swrast loader (providing putimage/getimage backed by a dumb KMS buffer). This allows running KMS+DRM GL applications (such as weston or mutter-wayland) unmodified on cards that don't have any client side HW acceleration component but that can do modeset (examples include simpledrm and qxl) [Emil Velikov] - Fix make check. - Split dri_open_driver() from dri_load_driver(). - Don't try to bind the swrast extensions when using dri. - Handle swrast->CreateNewScreen() failure. - strdup the driver_name, as it's free'd at destruction. - s/LIBGL_ALWAYS_SOFTWARE/GBM_ALWAYS_SOFTWARE/ - Move gbm_dri_bo_map/unmap to gbm_driint.h. - Correct swrast fallback logic. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-07-30 16:33:09 +01:00
Emil Velikov	e3a3dbe940	st/gbm: don't segfault if we fail to create the screen Whenever dd_create_screen/pipe_loader_* fails, gdrm->dev may be NULL. Thus peeking inside the struct will lead to a crash. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-07-30 16:33:09 +01:00
Emil Velikov	d93ae21939	st/gbm: retrieve the driver-name via dd_driver_name() ... on static targets. Otherwise we'll crash badly as gdrm->dev is NULL when we try to copy the string driver_name. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-07-30 16:33:09 +01:00
Brian Paul	85109bc507	glsl/glcpp: rename ERROR to ERROR_TOKEN to fix MSVC build ERROR is a #define in the MSVC WinGDI.h header file. Add the _TOKEN suffix as we do for a few other lexer tokens. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-30 08:12:03 -06:00
Ian Romanick	66decc7efa	configure: Don't override user -g or -O options for debug builds Principle of least surprise: --enable-debug should enable debugging. Ages ago, Mesa's build system only added -g in dri-debug builds (yay for the static Makefiles). If you forgot to change it (or wrap the build with custom scripts), you would often be disappointed when trying to gdb Mesa bugs. New developers, that may not yet have custom scripts, will have this same issue. I think we should enable experienced developers to do what they want, and make things easier for new developers. I already pass '-ggdb3 -O1' or '-ggdb3 -Og' for CFLAGS, and I don't want configure to change them for me. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-29 15:49:27 -07:00
Carl Worth	a62354a987	glsl: Add flex options to eliminate the default rule We've had bugs in the past where we have been inadvertently matching the default rule. Just as we did in the pre-processor in the previous commit, we can use: %option warn nodefault in the compiler to instruct flex to not generate the default rule, and further to warn if our set of rules could let any characters go unmatched. With this warning active, flex actually warns that the catch-all rule we recently added to the compiler could never be matched. Since that is all safely determined at compile time now, we can safely drop this run-time compiler error message, (as we do in this commit). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-07-29 15:11:51 -07:00
Carl Worth	bc8721f16f	glsl/glcpp: Add flex options to eliminate the default rule. We've had multiple bugs in the past where we have been inadvertently matching the default rule, (which we never want to do). We recently added a catch-all rule to avoid this, (and made this rule robust for future start conditions). Kristian pointed out that flex allows us to go one step better. This syntax: %option warn nodefault instructs flex to not generate the default rule at all. Further, flex will generate a warning at compile time if the set of rules we provide are inadequate, (such that it would be possible for the default rule to be matched). With this warning in place, I found that the catch-all rule was in fact missing something. The catch-all rule uses a pattern of "." which doesn't match newlines. So here we extend the newline-matching rule to all start conditions. That is enough to convince flex that it really doesn't need any default rule. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-07-29 15:11:51 -07:00
Carl Worth	4ebff9bca6	glsl/glcpp: Combine the two rules matching any character Using a single rule here means that we can use the <*> syntax to match all start conditions. This makes the catch-all rule more robust against the addition of future start conditions, (no need to maintain an ever-growing list of start conditions for this rule). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-07-29 15:11:51 -07:00
Carl Worth	80e9301d9b	glsl/glcpp: Alphabetize lists of start conditions There is no behavioral change here. It's just easier to verify that lists of start conditions include all expected conditions when they appear in a consistent order. The <INITIAL> state is special, so it appears first in all lists. All others appear in alphabetical order. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-07-29 15:11:51 -07:00
Carl Worth	f9c99aefea	glsl/glcpp: Add a catch-all rule for unexpected characters. In some of the recent glcpp bug-fixing, we found that glcpp was emitting unrecognized characters from the input source file to stdout, and dropping them from the source passed onto the compiler proper. This was obviously confusing, and totally undesired. The bogus behavior comes from an implicit default rule in flex, which is that any unmatched character is implicitly matched and printed to stdout. To avoid this implicit matching and printing, here we add an explicit catch-all rule. If this rule ever matches it prints an internal compiler error. The correct response for any such error is fixing glcpp to handle the unexpected character in the correct way. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:51 -07:00
Carl Worth	4757c74c84	glsl/glcpp: Treat carriage return as equivalent to line feed. Previously, the '\r' character was not explicitly matched by any lexer rule. This means that glcpp would have been using the default flex rule to match '\r' characters, (where they would have been printed to stdout rather than actually correctly handled). With this commit, we treat '\r' as equivalent to '\n'. This is clearly an improvement over the bogus printing to stdout. The resulting behavior is compliant with the GLSL specification for any source file that uses exclusively '\r' or '\n' to separate lines. For shaders that use a multiple-character line separator, (such as "\r\n"), glcpp won't be precisely compliant with the specification, (treating these as two newline characters rather than one), but this should not introduce any semantic changes to the shader programs. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:51 -07:00
Carl Worth	12d583b21a	glsl/glcpp: Add test for a multi-line comment within an #if 0 block This test is written to exercise a bug which I recently wrote, (but fortunately caught and fixed before ever committing it). For the curious: The bug happened when the NEWLINE_CATCHUP code didn't actually return the NEWLINE token (due to the skipping). This resulted in the lexer continuing on through all the subsequent rules while still in the NEWLINE_CATCHUP start condition, (which then triggered the internal-compiler-error catch-all rule). What is intended is for the return of the NEWLINE token to start a new iteration of the lexer loop, at which time the NEWLINE_CATCHUP-handling code will reset from the <NEWLINE_CATCHUP> to the <INITIAL> start condition. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:50 -07:00
Carl Worth	346d712e87	glsl/glcpp: Test that macro parameters substitute immediately after periods At one point while rewriting the lexing rule for pre-processing numbers, I made it a bit too aggressive and within a replacement list sucked up a parameter name that appeared immediately after a period. This caused the parameter name to be unreplaced when the macro was expanded. It was in some piglit tests that I originally found this issue. Here, I'm adding a test to "make check" to ensure that this behavior remains correct. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:50 -07:00
Carl Worth	285c9392ad	glsl/glcpp: Add (non)-support for ++ and -- operators These operators aren't defined for preprocessor expressions, so we never implemented them. This led them to be misinterpreted as strings of unary '+' or '-' operators. In fact, what is actually desired is to generate an error if these operators appear in any preprocessor condition. So this commit looks like it is strictly adding support for these operators. And it is supporting them as far as passing them through to the subsequent compiler, (which was already happening anyway). What's less apparent in the commit is that with these tokens now being lexed, but with no change to the grammar for preprocessor expressions, these operators will now trigger errors there. A new "make check" test is added to verify the desired behavior. This commit fixes the following Khronos GLES3 CTS test: invalid_op_1_vertex invalid_op_1_fragment invalid_op_2_vertex invalid_op_2_fragment Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:50 -07:00
Carl Worth	34cd293c8a	glsl/glcpp: Emit error for duplicate parameter name in function-like macro This will emit an error for something like: #define FOO(x,x) ... Obviously, it's not a legal thing to do, and it's easy to check. Add a "make check" test for this as well. This fixes the following Khronos GLES3 CTS tests: invalid_function_definitions.unique_param_name_vertex invalid_function_definitions.unique_param_name_fragment Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:50 -07:00
Carl Worth	fe1e0ac852	glsl/glcpp: Add an explanatory comment for "loc != NULL" check Just reading the code, it looked like a bug that _define_object_macro had this check, but _define_function_macro did not. Upon further reading, that's because the check is to allow for our builtins to be defined, (and there are no builtin function-like macros). Add my new understanding as a comment to help the next reader. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:50 -07:00
Carl Worth	18c589d20e	glsl/glcpp: Drop the HASH_ prefix from token names like HASH_IF Previously, we had a single token for "#if" but now that we have two separate tokens, it looks much better to see: HASH_TOKEN IF than: HASH_TOKEN HASH_IF (Note, that for the same reason we use HASH_TOKEN instead of HASH, we also use DEFINE_TOKEN instead of DEFINE to avoid a conflict with the <DEFINE> start condition in the lexer.) There should be no behavioral change from this commit. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:50 -07:00
Kenneth Graunke	de0b4b6607	glsl: Properly lex extra tokens when handling # directives. Without this, in the <PP> state, we would hit Flex's default rule, which prints tokens to stdout, rather than returning them as tokens. (Or, after the previous commit, we would hit the new catch-all rule and generate an internal compiler error.) With this commit in place, we generate the desired syntax error. This manifested as a weird bug where shaders with semicolons after extension directives, such as: #extension GL_foo_bar : enable; would print semicolons to the screen, but otherwise compile just fine (even though this is illegal). Fixes Piglit's extension-semicolon.frag test. This also fixes the following Khronos GLES3 conformance tests, (and for real this time): invalid_char_in_name_vertex invalid_char_in_name_fragment Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:50 -07:00
Carl Worth	f196eb2d39	glsl: Add an internal-error catch-all rule This is to avoid the default, silent flex rule which simply prints the character to stdout. For the following Khronos GLES3 conformance tests: invalid_char_in_name_vertex invalid_char_in_name_fragment With this commit, these tests now report Pass where they previously reported Fail, but Mesa isn't behaving correctly yet. It's now reporting the internal error where what is really desired is a syntax error. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:50 -07:00
Carl Worth	f062f0506a	glsl/glcpp: Correctly parse directives with intervening comments It's legal (though highly bizarre) for a pre-processor directive to look like this: # /* why? */ define FOO bar This behavior comes about since the specification defines separate logical phases in a precise order, and comment-removal occurs in a phase before the identification of directives. Our implementation does not use an actual separate phase for comment removal, so some extra care is necessary to correctly parse this. What we want is for '#' to introduce a directive iff it is the first token on a line, (ignoring whitespace and comments). Previously, we had a lexical rule that worked only for whitespace (not comments) with the following regular expression to find a directive-introducing '#' at the beginning of a line: HASH ^{HSPACE}#{HSPACE}* In this commit, we switch to instead use a simple literal match of '#' to return a HASH_TOKEN token and add a new <HASH> start condition for whenever the HASH_TOKEN is the first non-space token of a line. This requires the addition of the new bit of state: first_non_space_token_this_line. This approach has a couple of implications on the glcpp parser: 1. The parser now sees two separate tokens, (such as HASH_TOKEN and HASH_DEFINE) where it previously saw one token (HASH_DEFINE) for the sequence "#define". This is a straightforward change throughout the grammar. 2. The parser may now see a SPACE token before the HASH_TOKEN token of a directive. Previously the lexical regular expression for {HASH} would eat up the space and there would be no SPACE token. This second implication is a bit of a nuisance for the parser. It causes a SPACE token to appear in a production of the grammar with the following two definitions of a control_line: control_line SPACE control_line This is really ugly, since normally a space would simply be a token separator, so it wouldn't appear in the tokens of a production.
This leads to a further problem with interleaved spaces and comments: /* ... */ /* ... */ #define /* ..*/ For this, we must not return several consecutive SPACE tokens, or else we would need an arbitrary number of new productions: SPACE SPACE control_line SPACE SPACE SPACE control_line ad nauseam To avoid this problem, in this commit we also change the lexer to emit only a single SPACE token for any series of consecutive spaces, (whether from actual whitespace or comments). For this compression, we add a new bit of parser state: last_token_was_space. And we also update the expected results of all necessary test cases for the new compression of space tokens. Fortunately, the compression of spaces should not lead to any semantic changes in terms of what the eventual GLSL compiler sees. So there's a lot happening in this commit, (particularly for such a tiny feature). But fortunately, the lexer itself is looking cleaner than ever. The only ugly bit is all the state updating, but it is at least isolated to a single shared function. Of course, a new "make check" test is added for the new feature, (directives with comments and whitespace interleaved in many combinations). And this commit fixes the following Khronos GLES3 CTS tests: function_definition_with_comments_vertex function_definition_with_comments_fragment Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:50 -07:00
Carl Worth	dfdf9dc082	glsl/glcpp: Rename HASH token to HASH_TOKEN This is in preparation for the planned addition of a new <HASH> start condition to the lexer. Both start conditions and token types are, of course, in the same default C namespace, so a start condition and a token type with the same name will collide. (And unfortunately, they are both apparently implemented as equivalent numeric types so the collision is undetected at compile time and simply leads to unpredictable behavior at run time.) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:49 -07:00
Carl Worth	0d5f5d127b	glsl/glcpp: Don't use start-condition stack when switching to/from <DEFINE> This commit does not cause any behavioral change for any valid program. Prior to entering the <DEFINE> start condition, the only valid start condition is <INITIAL>, so whether pushing/popping <DEFINE> onto the stack or explicit returning to <INITIAL> is equivalent. The reason for this change is that we are planning to soon add a start condition for <HASH> with the following semantics: <HASH>: We just saw a directive-introducing '#' <DEFINE>: We just saw "#define" starting a directive With these two start conditions in place, the only correct behavior is to leave <DEFINE> by returning to <INITIAL>. But the old push/pop code would have returned to the <HASH> start condition which would then cause an error when the next directive-introducing '#' would be encountered. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:49 -07:00
Carl Worth	2fdc1f50c4	glsl/glcpp: Add a -d/--debug option to the standalone glcpp program The verbose debug output from the parser is quite useful when debugging, and having this available as a command-line option is much more convenient than manually forcing this into the code when needed, (which is what I had been doing for too long previously). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:49 -07:00
Carl Worth	8e8f8ff1b2	glsl/glcpp: Fix off-by-one error in column in first-line error messages For the first line we were initializing the column to 1, but for all subsequent lines we were initializing the column to 0. The column number is advanced for each token read before any error message is printed. So the 0 value is the correct initialization, (so that the first column is reported as column 1). With this extremely minor change, many of the .expected files are updated such that error messages for the first line now have the correct column number in them. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:49 -07:00
Carl Worth	0742e0acd3	glsl/glcpp: Minor tweak to wording of error message It makes more sense to print the directive name with the preceding '#'. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:49 -07:00
Carl Worth	f583f214d5	glsl/glcpp: Stop using a lexer start condition (<SKIP>) for token skipping. Here, "skipping" refers to the lexer not emitting any tokens for portions of the file within an #if condition (or similar) that evaluates to false. Previously, the lexer had a special <SKIP> start condition used to control this skipping. This start condition was not handled like a normal start condition. Instead, there was a particularly ugly block of code set to be included at the top of the generated lexing loop that would change from <INITIAL> to <SKIP> or from <SKIP> to <INITIAL> depending on various pieces of parser state, (such as parser->skip_state and parser->lexing_directive). Not only was that an ugly approach, but the <SKIP> start condition was complicating several glcpp bug fixes I attempted recently that want to use start conditions for other purposes, (such as a new <HASH> start condition). The recently added RETURN_TOKEN macro gives us a convenient way to implement skipping without using a lexer start condition. Now, at the top of the generated lexer, we examine all the necessary parser state and set a new parser->skipping bit. Then, in RETURN_TOKEN, we examine parser->skipping to determine whether to actually emit the token or not. Besides this, there are only a couple of other places where we need to examine the skipping bit (other than when returning a token): * To avoid emitting an error for #error if skipped. * To avoid entering the <DEFINE> start condition for a #define that is skipped. With all of this in place in the present commit, there are hopefully no behavioral changes with this patch, ("make check" still passes all of the glcpp tests at least). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:49 -07:00
Carl Worth	09b4e12900	glsl/glcpp: Abstract a bit of common code for returning string tokens Now that we have a common macro for returning tokens, it makes sense to perform some of the common work there, (such as copying string values). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:49 -07:00
Carl Worth	828686d4eb	glsl/glcpp: Drop extra, final newline from most output The glcpp parser is line-based, so it needs to see a NEWLINE token at the end of each line. This is tricky for files that end without a final newline. Previously, the lexer for glcpp punted in this case by unconditionally returning a NEWLINE token at end-of-file, (causing most files to have an extra blank line at the end). Here, we refine this by lexing end-of-file as a NEWLINE token only if the immediately preceding token was not a NEWLINE token. The patch is a minor change that only looks huge for two reasons: 1. Almost all glcpp test result ".expected" files are updated to drop the extra newline. 2. All return statements from the lexer are adjusted to use a new RETURN_TOKEN macro that tracks the last-token-was-a-newline state. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:49 -07:00
Carl Worth	5dbdc341e8	glsl/glcpp: Add testing for EOF sans newline (and fix for <DEFINE>, <COMMENT>) The glcpp implementation has long had code to support a file that ends without a final newline. But we didn't have a "make check" test for this. Additionally, the <EOF> action was restricted only to the <INITIAL> state so it would fail to get invoked if the EOF was encountered in the <COMMENT> or the <DEFINE> case. Neither of these was a bug, per se, since EOF in either of these cases is an error anyway, (either "unterminated comment" or "missing macro name for #define"). But with the new explicit support for these cases, we now generate clean error messages in these cases, (rather than "unexpected $end" from before). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:48 -07:00
Carl Worth	21dda50549	glsl/glcpp: Remove some un-needed calls to NEWLINE_CATCHUP The NEWLINE_CATCHUP code is only intended to be invoked after we lex an actual newline character ('\n'). The two extra calls here were apparently added accidentally because the pattern happened to contain a (negated) '\n', (see commit `6005e9cb28`). I don't think either case could have caused any actual bug. (In the first case, the pattern matched right up to the next newline, so the NEWLINE_CATCHUP code was just about to be called. In the second case, I don't think it's possible to actually enter the <SKIP> start condition after commented newlines without any intervening newline.) But, if nothing else, the code is cleaner without these extra calls. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:48 -07:00
Carl Worth	cc335c0e57	glsl/glcpp: Add support for comments between #define and macro identifier The recent addition of an error for "#define followed by a non-identifier" was a bit too aggressive since it used a regular expression in the lexer to flag any character that's not legal as the first character of an identifier. But we need to allow comments to appear here, (since we aren't removing comments in a preliminary pass). So we refine the error here to only flag characters that could not be an identifier, nor a comment, nor whitespace. We also augment the existing comment support to be active in the <DEFINE> state as well. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:48 -07:00
Carl Worth	ea2e9300ec	glsl/glcpp: Emit proper error for #define with a non-identifier Previously, if the preprocessor encountered a #define with a non-identifier, such as: #define 123 456 The lexer had no explicit rules to match non-identifiers in the <DEFINE> start state. Because of this, flex's default rule was being invoked, (printing characters to stdout), and all text was being discarded by the compiler until the next identifier. As one can imagine, this led to all sorts of interesting and surprising results. Fix this by adding an explicit rule complementing the existing identifier-based rules that should catch all non-identifiers after #define and reliably give a well-formatted error message. A new test is added to "make check" to ensure this bug stays fixed. This commit also fixes the following Khronos GLES3 CTS test: define_non_identifier_vertex (The "fragment" variant was passing earlier only because the preprocessor was behaving so randomly and causing the compilation to fail. It's lucky, in fact, that the "vertex" version successfully compiled so we could find and fix this bug.) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-29 15:11:48 -07:00
Carl Worth	9e45fb6f51	glsl/glcpp: Add testing for directives preceded by a space This test simply has one of each directive, all of which are preceded by a single space character.	2014-07-29 15:11:48 -07:00
Carl Worth	da7f226a27	glsl/glcpp: Fix to emit spaces following directives The glcpp lexer and parser use the space_tokens state bit to avoid emitting tokens for spaces while parsing a directive. Previously, this bit was only being set again by the first non-space token following a directive. This led to a bug where a space, (or a comment that should emit a space), immediately following a directive, (optionally separated by newlines), would be omitted from the output. Here we fix the bug by also setting the space_tokens bit whenever we lex a newline in the standard start conditions.	2014-07-29 15:11:48 -07:00
Marek Olšák	49e2275d0d	configure.ac: require libdrm_radeon 2.4.56 because of the Hawaii fix there	2014-07-29 23:25:42 +02:00
Jason Ekstrand	3ea922dd7c	main/get_hash_params: Add GL_SAMPLE_SHADING_ARB GL_SAMPLE_SHADING is specified as a valid pname for glGet in the GL_ARB_sample_shading extension. It seems as if we forgot to add it to the table of pnames. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org	2014-07-29 10:50:38 -07:00
Yaakov Selkowitz	b12d5f0d00	os_process.c: Add cygwin as an expected platform mesa/mesa/src/gallium/auxiliary/os/os_process.c:40:2: warning: #warning unexpected platform in os_process.c [-Wcpp] #warning unexpected platform in os_process.c mesa/mesa/src/gallium/auxiliary/os/os_process.c:77:2: warning: #warning unexpected platform in os_process.c [-Wcpp] #warning unexpected platform in os_process.c Signed-off-by: Yaakov Selkowitz <yselkowitz@users.sourceforge.net> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-07-29 17:53:08 +01:00
Yaakov Selkowitz	d05f72d4c3	xmlconfig: Use program_invocation_short_name when building for cygwin mesa/mesa/src/mesa/drivers/dri/common/xmlconfig.c:104:10: warning: #warning "Per application configuration won't work with your OS version." [-Wcpp] # warning "Per application configuration won't work with your OS version." Signed-off-by: Yaakov Selkowitz <yselkowitz@users.sourceforge.net> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-07-29 17:52:57 +01:00
Brian Paul	448f14918c	docs: fix date typo: July 78 -> 18	2014-07-29 09:16:23 -06:00
Brian Paul	7844263f07	svga: remove unneeded depth==1 assertion in svga_texture_view_surface() We can create 3D texture views. Avoids an assertion in piglit fbo-generatemipmap-3d test and allows it to pass. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-07-29 09:16:23 -06:00
José Fonseca	66a1b3a1da	st/wgl: Clamp wglChoosePixelFormatARB's output nNumFormats to nMaxFormats. While running https://github.com/nvMcJohn/apitest with apitrace I noticed that Mesa was producing bogus results: wglChoosePixelFormatARB(hdc, piAttribIList = {...}, pfAttribFList = &0, nMaxFormats = 1, piFormats = {19, 65576, 37, 198656, 131075, 0, 402653184, 0, 0, 0, 0, -573575710}, nNumFormats = &12) = TRUE However https://www.opengl.org/registry/specs/ARB/wgl_pixel_format.txt states <nNumFormats> returns the number of matching formats. The returned value is guaranteed to be no larger than <nMaxFormats>. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-07-29 15:41:32 +01:00
Michel Dänzer	8d0a1a6bc0	gallium/radeon: Add some Emacs .dir-locals.el files Based on the toplevel one but adapted to the driver/winsys coding styles. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-29 17:59:13 +09:00
Chia-I Wu	9a53f941c7	ilo: fix fb height of HiZ ops It was set to aligned width. It appears to be fine on GEN7+, but causes random hangs on GEN6.	2014-07-29 10:24:59 +08:00
Tapani Pälli	76b11d15d3	glapi: add indexed blend functions (GL 4.0) This makes some of the UE4 engine demos (Stylized, Mobile Temple) render correctly, tested on Intel Haswell machine. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78716	2014-07-28 16:26:27 -07:00
Marek Olšák	a9528cef6b	r600g,radeonsi: switch all occurrences of array_size to util_max_layer This fixes 3D texture support in all these cases, because array_size is 1 with 3D textures and depth0 actually contains the "array size". util_max_layer is universal and returns the last layer index for any texture target. A lot of the cases below can't actually be hit with 3D textures, but let's be consistent. This fixes a failure in: piglit layered-rendering/clear-color-all-types 3d single_level for r600g and radeonsi, which was caused by an incorrect CMASK size calculation. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-07-28 23:57:08 +02:00
Marek Olšák	71ce92200e	radeonsi: fix occlusion queries on Hawaii This was just a guess - and it worked! Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-07-28 23:57:08 +02:00
Marek Olšák	156b7e244c	winsys/radeon: fix vram_size overflow with Hawaii This fixes piglit spec/!OpenGL 3.1/minmax. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-07-28 23:57:08 +02:00
Marek Olšák	0e7f56313d	radeonsi: fix a hang with streamout on Hawaii I actually couldn't reproduce this one, but internal docs recommend this workaround. Better safe than sorry. Also, the number of dwords for the sync packets is increased by 4 instead of 2, because it wasn't bumped last time when a new packet was added there. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-07-28 23:57:08 +02:00
Marek Olšák	3d9e87406c	radeonsi: fix a hang with instancing on Hawaii This fixes "piglit/bin/arb_transform_feedback2-draw-auto instanced". Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-07-28 23:57:08 +02:00
Marek Olšák	c7407b94a8	gallium/util: add a helper for calculating primitive count from vertex count This is needed by the following commit which is a candidate for stable too. Cc: mesa-stable@lists.freedesktop.org	2014-07-28 23:57:08 +02:00
Marek Olšák	9b046474c9	radeonsi: fix CMASK and HTILE calculations for Hawaii This fixes the checkerboard pattern in glxgears and anything that triggers fast color clear. num_channels is always <= 8, but Hawaii has 16 pipes. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-07-28 23:57:08 +02:00
Marek Olšák	ecbd3a545a	r600g,radeonsi: add debug flags which disable tiling Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-07-28 23:57:08 +02:00
Marek Olšák	04f2c88f45	gallium: rename shader cap MAX_CONSTS to MAX_CONST_BUFFER_SIZE This new name isn't so confusing. I also changed the gallivm limit, because it looked wrong. Reviewed-by: Brian Paul <brianp@vmware.com> v2: use sizeof(float[4])	2014-07-28 23:57:08 +02:00
Marek Olšák	d5bcb5e8de	r600g: switch SNORM conversion to DX and GLES behavior it also matches GL 4.2 further discussion: http://lists.freedesktop.org/archives/mesa-dev/2013-August/042680.html Cc: mesa-stable@lists.freedesktop.org	2014-07-28 23:57:08 +02:00
Tom Stellard	5fe20592d4	util: Fix typo Spotted by okias on IRC.	2014-07-28 16:40:05 -04:00
Chia-I Wu	cc1e1da24a	ilo: correctly propagate resource renames to hardware Not only should we mark states dirty when the underlying resource is renamed, we should also update the CSO bo when available.	2014-07-28 23:55:55 +08:00
Chia-I Wu	fb1820355b	ilo: add ilo_resource_get_bo() helper We will need it in the following commit.	2014-07-28 23:55:55 +08:00
Tom Stellard	6f0c1f2b5f	radeonsi: Use util_memcpy_cpu_to_le32() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-28 10:14:28 -04:00
Tom Stellard	f0e0737922	util: Add util_memcpy_cpu_to_le32() v3 v2: - Preserve word boundaries. v3: - Use const and restrict. - Fix indentation. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-07-28 10:10:43 -04:00
Tom Stellard	3d636b4785	clover: Add checks for image support to the image functions v2 Most image functions are required to return a CL_INVALID_OPERATION error when used on devices without image support. v2: - Simplified the code Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-07-28 10:10:30 -04:00
Bruno Jiménez	7f96bea5bc	r600g/compute: Add debug information to promote and demote functions v2: Add information about the item's starting point and size v3: Rebased on top of master Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-28 10:10:20 -04:00
Bruno Jiménez	e7715126f7	r600g/compute: Add documentation to compute_memory_pool v2: Rebased on top of master Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-28 10:09:46 -04:00
Chia-I Wu	717e3b1ca1	ilo: unblock an inline write with a staging bo This should allow a deeper pipeline.	2014-07-28 22:57:22 +08:00
Chia-I Wu	7395432f2e	ilo: try unblocking a transfer with a staging bo When mapping a busy resource with PIPE_TRANSFER_DISCARD_RANGE or PIPE_TRANSFER_FLUSH_EXPLICIT, we can avoid blocking by allocating and mapping a staging bo, and emit pipelined copies at proper places. Since the staging bo is never bound to GPU, we give it packed layout to save space.	2014-07-28 22:57:22 +08:00
Chia-I Wu	0a0e57b070	ilo: enable persistent and coherent transfers Enable PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT and reorder caps a bit.	2014-07-28 22:57:22 +08:00
Chia-I Wu	b02e993d8c	ilo: drop ptr from ilo_transfer With the recent clean-ups, we can pass the mapped pointer around between functions cleanly. Drop it to make ilo_transfer smaller.	2014-07-28 22:57:22 +08:00
Chia-I Wu	b1dd54d9fe	ilo: s/TRANSFER_MAP_UNSYNC/TRANSFER_MAP_GTT_UNSYNC/ It maps to drm_intel_gem_bo_map_unsynchronized(), which results in unsynchronized GTT mapping.	2014-07-28 22:57:22 +08:00
Chia-I Wu	2a82bb30e8	ilo: drop unused context param from transfer functions Many of the transfer functions do not need an ilo_context. Drop it.	2014-07-28 22:57:22 +08:00
Chia-I Wu	8abf6c06e8	ilo: tidy up transfer mapping/unmapping Add xfer_map() to replace map_bo_for_transfer(). Add xfer_unmap() and xfer_alloc_staging_sys() to simplify texture and buffer mapping/unmapping, and enable more code sharing between them.	2014-07-28 22:57:22 +08:00
Chia-I Wu	2f4bed0405	ilo: tidy up choose_transfer_method() Add a bunch of helper functions and a big comment for choose_transfer_method(). This also fixes handling of PIPE_TRANSFER_MAP_DIRECTLY to not ignore tiling.	2014-07-28 22:57:22 +08:00
Chia-I Wu	91656eb375	ilo: free transfers with util_slab_free() We used FREE() in one of the error paths.	2014-07-28 22:57:22 +08:00
EdB	1d3e06c216	clover: Add clUnloadPlatformCompiler. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-07-28 14:46:44 +02:00
EdB	39869423cb	clover: Add clCreateProgramWithBuiltInKernels. [ Francisco Jerez: Check for devices not associated with the specified context. Style fix. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-07-28 14:45:29 +02:00
Jordan Justen	be8bc588b9	glsl/cs: Add several GLSL compute shader variables With MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader, this fixes piglit: built-in-constants tests/spec/arb_compute_shader/minimum-maximums.txt Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-07-27 17:59:28 -07:00
Jordan Justen	12029046a2	main/cs: Add additional compute shader constant values With MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader, this fixes piglit: * arb_compute_shader-minmax Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-07-27 17:58:58 -07:00
Chris Forbes	74e100affc	glsl: No longer require ubo block index to be constant in ir_validate Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-26 16:46:03 +12:00
Chris Forbes	be237a6129	glsl: Accept nonconstant array references in lower_ubo_reference Instead of falling back to just the block name (which we won't find), look for the first element of the block array. We'll deal with the rest in the backend by arranging for the blocks to be laid out contiguously. V2: Squashed together patches 3, 5 of V1, plus a naming tweak. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-26 16:46:03 +12:00
Chris Forbes	c59802d3a1	glsl: Convert uniform_block in lower_ubo_reference to ir_rvalue. Previously this was a block index with special semantics for -1. With ARB_gpu_shader5, this need not be a compile-time constant, so allow any rvalue here and convert the -1 to a NULL pointer. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-26 16:46:03 +12:00
Chris Forbes	9c90a63378	glsl: Mark entire UBO array active if indexed with non-constant. Without doing a lot more work, we have no idea which indices may be used at runtime, so just mark them all. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-26 16:46:03 +12:00
Chris Forbes	8eae5ceb99	glsl: Allow non-constant UBO array indexing with GLSL4/ARB_gpu_shader5. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-26 16:46:03 +12:00
Chia-I Wu	4714c4ec48	ilo: simplify ilo_flush() Move fence creation to the new ilo_fence_create().	2014-07-26 12:30:39 +08:00
Bruno Jiménez	654fd3e33f	r600g/compute: Defrag the pool at the same time as we grow it This gives us two things: we now need fewer item copies when we have to defrag+grow the pool (just one copy per item) and, even in the case where we don't need to defrag the pool, we reduce the data copied to just the useful data that the items use. Note: The fallback path is a bit ugly now, but hopefully we won't need it much. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-25 17:51:57 -04:00
Bruno Jiménez	4ca04f3112	r600g/compute: Try to use a temporary resource when growing the pool Now, before moving everything to host memory, we try to create a new resource to use as a pool. If we succeed we just use this resource and delete the previous one. If we fail we fall back to using the shadow. This should make growing the pool faster, and we can also save the 64KB of memory that was allocated for the 'shadow', even if it wasn't used. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-25 17:51:57 -04:00
Rob Clark	5eb11eb192	freedreno: fix typo in gpu version check Oops, I should use larger fonts, I guess. Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-07-25 14:29:02 -04:00
Rob Clark	db193e5ad0	freedreno/ir3: split out shader compiler from a3xx Move the bits we want to share between generations from fd3_program to ir3_shader. So overall structure is: fdN_shader_stateobj -> ir3_shader -> ir3_shader_variant -> ir3 \|- ... \- ir3_shader_variant -> ir3 So the ir3_shader becomes the topmost generation-neutral object, which manages the set of variants, each of which generates, compiles, and assembles its own IR. There is a bit of additional renaming to s/fd3_compiler/ir3_compiler/, etc. Keep the split between the gallium-level stateobj and the shader helper object because it might be a good idea to pre-compute some generation-specific register values (i.e. anything that is independent of linking). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-07-25 13:29:28 -04:00
Rob Clark	7d7e6ae9c3	freedreno/a3xx/compiler: rename ir3_shader to ir3 First step of the reorganization to split out the compiler (so it can be shared between a3xx and a4xx). Rename ir3_shader -> ir3 (since we'll want the name ir3_shader for a higher level object). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-07-25 13:29:28 -04:00
Rob Clark	faaeddb55e	freedreno/a3xx/compiler: scheduler vs pred reg The scheduler also needs to be aware of predicate register (p0) in addition to address register (a0). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-07-25 13:29:28 -04:00
Rob Clark	9f391322a0	freedreno/a3xx/compiler: little cleanups Remove some obsolete comments, rename deref->addr. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-07-25 13:29:28 -04:00
Rob Clark	d48faad3c2	freedreno/a3xx: enable/disable wa's based on patch-level It seems like for the most part, different behaviors, workarounds, etc, should be conditional on GPU patch revision (ie. a320.0 vs a320.2) rather than GPU id (a320 vs a330). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-07-25 13:29:28 -04:00
Rob Clark	9613ca569f	freedreno/a3xx/compiler: make IR heap dynamic The fixed size heap is a remnant of the fdre-a3xx assembler. Yet it is convenient for being able to free the entire data structure in one shot without worrying about leaking nodes. Change it to dynamically grow the heap size (adding chunks) as needed so we don't have an artificial upper limit on shader size (other than hw limits) and don't always have to allocate worst-case size. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-07-25 13:29:28 -04:00
Jan Vesely	0bc1fa22d8	r600g/compute: Fix signed/unsigned comparison compiler warnings. The iteration variables go from 0 anyway. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-25 12:55:05 -04:00
Tom Stellard	0ec8587642	clover: Query the device to see if images are supported Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-07-25 12:49:45 -04:00
Tom Stellard	1607a8efc1	gallium: Add PIPE_CAP_COMPUTE_IMAGES_SUPPORTED Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-07-25 12:49:20 -04:00
Bruno Jiménez	d6b89aef26	r600g/compute: Allow compute_memory_defrag to defragment between resources This will be used in the following patch to avoid duplicated code Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-25 12:38:42 -04:00
Bruno Jiménez	5cf108078c	r600g/compute: Allow compute_memory_move_item to move items between resources v2: Remove unnecessary variables Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-25 12:38:28 -04:00
Dylan Baker	bf1247936a	gbm: Search LIBGL_DRIVERS_PATH if GBM_DRIVERS_PATH is not set The GBM_DRIVERS_PATH environment variable is not documented, and only used to set the location of gbm drivers, while LIBGL_DRIVERS_PATH is used for everything else, and is documented. Generally this split leads to confusion as to why gbm doesn't work. This patch will read LIBGL_DRIVERS_PATH as a fallback if GBM_DRIVERS_PATH is not set. The comments clearly indicate that using LIBGL_DRIVERS_PATH is preferred over GBM_DRIVERS_PATH. v2: - Use GBM_DRIVERS_PATH as a fallback v3: [jordan.l.justen@intel.com] - Make LIBGL_DRIVERS_PATH the fallback Signed-off-by: Dylan Baker <baker.dylan.c@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-07-24 23:15:06 -07:00
Jerome Glisse	cce58147eb	winsys/radeon: fix indentation Can we please keep it clean and avoid ending up in messy situation like ddx. Signed-off-by: Jérôme Glisse <jglisse@redhat.com>	2014-07-24 17:30:31 -04:00
Jason Ekstrand	989d2e3709	Add an accelerated version of F_TO_I for x86_64 According to a quick micro-benchmark, this new version is 20% faster on my Haswell laptop. v2: Removed the XXX note about x86_64 from the comment v3: Use an intrinsic instead of an __asm__ block. This should give us MSVC support for free. v4: Enable it for all x86_64 builds, not just with USE_X86_64_ASM Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-24 12:44:56 -07:00
Matt Turner	2a33510f16	i965/fs: Decide predicate/predicate_inverse outside of the for loop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-24 11:27:44 -07:00
Matt Turner	96128d134b	i965/fs: Swap if/else conditions in SEL peephole. Will make the next commit easier to read. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-24 11:27:44 -07:00
Matt Turner	ac2acf04f7	i965: Improve dead control flow elimination. ... to eliminate an ELSE instruction followed immediately by an ENDIF. instructions in affected programs: 704 -> 700 (-0.57%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-24 11:27:43 -07:00
Ilia Mirkin	0ddc28b026	nvc0/ir: support 2d constbuf indexing Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-24 08:26:42 -04:00
Ilia Mirkin	4eef537960	gm107/ir: emit LDC subops Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-24 08:26:42 -04:00
Ilia Mirkin	fc3d5fe01d	gk110/ir: emit load constant subop Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-24 08:26:41 -04:00
Ilia Mirkin	9c4959d0df	mesa/st: add support for interpolate_at_* ops Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-07-24 08:26:41 -04:00
Ilia Mirkin	dfb0ca1606	nv50/ir: fix phi/union sources when their def has been merged In a situation where double-register values are used, the phi nodes can still end up being u32 values. They all get merged into one RA node though. When fixing up the merge (which comes after the phi node), the phi node's def would get fixed, but not its sources which would remain at the low register value. This maintains the invariant that a phi node's defs and sources are allocated the same register. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-24 08:26:41 -04:00
Ilia Mirkin	32702cceed	nv50/ir: fix hard-coded TYPE_U32 sized register Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-24 08:26:41 -04:00
Ilia Mirkin	3f6b34bacc	nvc0: mark shader header if fp64 is used Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-24 08:26:41 -04:00
Ilia Mirkin	b21a28797c	nv50/ir: keep track of whether the program uses fp64 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-24 08:26:41 -04:00
Ilia Mirkin	47e5a8d7a2	nvc0: make sure that the local memory allocation is aligned to 0x10 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2014-07-24 08:26:41 -04:00
Ilia Mirkin	637b6c2478	mesa: add ARB_clear_texture.xml to file list, remove duplicate decls Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-07-24 08:26:41 -04:00
Chia-I Wu	9d6166880d	ilo: check the tilings of imported handles Just to be cautious.	2014-07-24 13:38:51 +08:00
Chia-I Wu	cbc943c43e	ilo: clean up resource bo renaming s/alloc_bo/rename_bo/ as that is what the functions do. Simplify bo allocation and move the complexity to bo renaming.	2014-07-24 13:21:35 +08:00
Chia-I Wu	cf8c9947a8	ilo: share some code between {tex,buf}_create_bo Add resource_get_bo_name() and resource_get_bo_initial_domain() for use by both functions.	2014-07-24 10:49:02 +08:00
Chia-I Wu	c1a1a627c4	ilo: use native 3-component vertex formats on GEN7.5+ GEN7.5 gains support for those formats natively.	2014-07-24 09:54:20 +08:00
Chia-I Wu	2126541b0b	ilo: allow for device-dependent format translation Pass ilo_dev_info to all format translation functions.	2014-07-24 09:33:33 +08:00
Jason Ekstrand	6bac86cd85	i965: Accelerate uploads of RGBA and BGRA GL_UNSIGNED_INT_8_8_8_8_REV textures Since intel is always going to be little-endian, GL_UNSIGNED_INT_8_8_8_8_REV is the same as GL_UNSIGNED_BYTE for RGBA and BGRA textures, so the same acceleration code will work. We might as well use it. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-23 16:48:35 -07:00
Ian Romanick	5072d0e7fc	mesa: Fix the name in the error message Obvious copy-and-paste bug. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-23 16:42:47 -07:00
Ian Romanick	3f04a1532e	glsl: Fix some bad indentation Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-23 16:42:47 -07:00
Kenneth Graunke	d4d886a0bc	i965/fs: Set LastRT on the final FB write on Broadwell. In Piglit's EXT_framebuffer_multisample/alpha-to-coverage-dual-src-blend test, key->nr_color_regions == 2, but the dual source blend FB write has ir->target set to 0. So we failed to set "Last Render Target Select" on any FB write message. We only emit one FB write per render target, so my comment about setting LastRT on every FB write directed at the last color region is a bit... misinformed. According to the documentation, depth buffer writes and scoreboard updates happen on the FB write with LastRT set, so I believe we want to set it only once. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-07-23 15:44:37 -07:00
Kenneth Graunke	36a4a6bbdc	i965: Port INTEL_DEBUG=optimizer to the vec4 backend. Largely via copy and paste. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-23 15:44:16 -07:00
Kenneth Graunke	8d2e95bd4b	i965: Save the gl_shader_stage enum in backend_visitor. This will be useful for INTEL_DEBUG=optimizer in the vec4 backend, which needs to know whether it's currently processing a VS or GS. It isn't worth adding virtual methods for this case. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-23 15:44:14 -07:00
Kenneth Graunke	d6d3e6027d	i965: Don't print WE_normal in disassembly. Dropping this helps most lines fit in an 80 column terminal. The absence of WE_normal also helps call attention to WE_all, where something unusual is going on. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-23 15:44:08 -07:00
Rob Clark	2f181bc391	freedreno/a3xx/compiler: fix p0 (kill, etc) Don't assert (debug builds) or assign random uninitialized value for predicate register (p0).. that screws up kill, etc. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-07-23 15:10:53 -04:00
Tom Stellard	fb237ba746	Revert "r600g/compute: Fix warnings" This reverts commit `467f1585e2`. This breaks the build on some systems.	2014-07-23 11:52:05 -04:00
Grigori Goronzy	2a766b0b64	radeon/llvm: fix formatting Use K&R and same indent as most other code. No functional change intended. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-23 10:40:41 -04:00
Grigori Goronzy	0e9cdedd2e	radeon/llvm: enable unsafe math for graphics shaders Accuracy of some operations was recently improved in the R600 backend, at the cost of slower code. This is required for compute shaders, but not for graphics shaders. Add unsafe-fp-math hint to make LLVM generate faster but possibly less accurate code. Piglit didn't indicate any regressions. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-23 10:40:33 -04:00
Tom Stellard	467f1585e2	r600g/compute: Fix warnings	2014-07-23 10:29:17 -04:00
Glenn Kennard	2fa6d659c3	r600g: Use hardware sqrt instruction Piglit quick tests including sqrt pass, no other regressions, tested on radeon 6670. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-07-23 10:29:17 -04:00
Bruno Jiménez	dbaf0bc388	r600g/compute: Remove unneeded code from compute_memory_promote_item Now that we know that the pool is defragmented, we positively know that allocated + unallocated will be the total size of the current pool plus all the items that will be promoted. So we only need to grow the pool once. This will allow us to just add the new items to the end of the item_list without the need of looking for a place for the new item. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-23 10:29:17 -04:00
Bruno Jiménez	e7bda844e6	r600g/compute: Quick exit if there's nothing to add to the pool This way we can avoid defragmenting the pool, even if it is needed to defragment it, and looping again through the list of unallocated items. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-23 10:29:17 -04:00
Bruno Jiménez	90d7b09ed2	r600g/compute: Defrag the pool if it's necessary This patch adds a new member to the pool to track its status. For now it is used only for the 'fragmented' status, but if needed it could be used for more statuses. The pool will be considered fragmented if: An item that isn't the last is freed or demoted. This 'strategy' has a problem, although it shouldn't cause any bug. If for example we have two items, A and B. We choose to free A first, now the pool will have the 'fragmented' status. If we now free B, the pool will retain its 'fragmented' status even if it isn't fragmented. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-23 10:29:17 -04:00
Bruno Jiménez	d8b6f0dacb	r600g/compute: Add a function for defragmenting the pool This new function will move items forward in the pool, so that there's no gap between them, effectively defragmenting the pool. For now this function is a bit dumb as it just moves items forward without trying to see if other items in the pool could fit in the gaps. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-23 10:29:17 -04:00
Bruno Jiménez	1f705b2bee	r600g/compute: Add a function for moving items in the pool This function will be used in the future by compute_memory_defrag to move items forward in the pool. It does so by first checking for overlapping ranges, if the ranges don't overlap it will copy the contents directly. If they overlap it will try first to make a temporary buffer, if this buffer fails to allocate, it will finally fall back to a mapping. Note that it will only be needed to move items forward, it only checks for overlapping ranges in that case. If needed, it can easily be added by changing the first if. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-23 10:29:17 -04:00
Rob Clark	23ae2db854	freedreno/a3xx: more vtx formats Actually what we currently handle is just the SCALED versions, and not the int versions. The difference probably matters more when we actually support integer in the compiler. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-07-23 09:03:10 -04:00
Rob Clark	a5ac36a75f	freedreno/a3xx/compiler: const file relative addressing Teach new compiler scheduling and register assignment how to deal with relative addressing. This gets us what we need to avoid falling back to old compiler for CONST[ADDR[0].x+n]. It is also a prerequisite for temp file relative addressing, although that is going to also need some cleverness in register assignment to keep arrays grouped together. NOTE: doing address calculation in full precision and then narrowing to s16 in the mov to addr reg seems to sometimes cause lockups (and sometimes work?!). It seems more reliable to do the address calculation in s16, like the blob does. Which means teaching RA how to deal with mixed half and full precision allocation. Fortunately that didn't turn out to be too hard, so that is a nice bonus which we could probably take better advantage of elsewhere. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-07-23 09:03:10 -04:00
Rob Clark	c18ae9c293	freedreno/a3xx/compiler: move function Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-07-23 09:03:09 -04:00
Rob Clark	3a7da7f5ec	freedreno/a3xx: add back a few stalls Technically we should not need these. CP_LOAD_STATE can be pipelined. But removing them broke a few piglit tests, like fbo-depth-GL_DEPTH_COMPONENT24-readpixels. I expect these are just masking a problem elsewhere, or perhaps they are only needed under some more specific circumstances. But until that is understood properly, give back a bit of the perf boost we got from `c63450e8`. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-07-23 09:03:09 -04:00
Rob Clark	9f6dfd16e3	targets/dri: fix freedreno targets The kernel driver name is either "kgsl" (downstream/android) or "msm" (upstream). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-07-23 09:03:09 -04:00
Rob Clark	c357e8475a	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-07-23 09:03:09 -04:00
Neil Roberts	c6398a38af	docs: Update GL3.txt and relnotes for GL_ARB_clear_texture	2014-07-23 12:10:37 +01:00
Neil Roberts	0779f37e15	meta: Add a meta implementation of GL_ARB_clear_texture Adds an implementation of the ClearTexSubImage driver entry point that tries to set up an FBO to render to the texture and then calls glClearBuffer with a scissor to perform the actual clear. If an FBO can't be created for the texture then it will fall back to using _mesa_store_ClearTexSubImage. When used in combination with _mesa_store_ClearTexSubImage this should provide an implementation that works for all DRI-based drivers. However as this has only been tested with the i965 driver it is currently only enabled there. v2: Only enable the extension for the i965 driver instead of all DRI drivers. Remove an unnecessary goto. Don't require GL_ARB_framebuffer_object. Add some more comments. v3: Use glClearBuffer* to avoid having to modify glClearColor and friends. Handle sRGB textures. Explicitly disable dithering. Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com>	2014-07-23 11:50:38 +01:00
Neil Roberts	05b52efbc9	meta: Add a state flag for the GL_DITHER The Meta implementation of glClearTexSubImage is going to want to ensure that dithering is disabled so that it can get a consistent color across the whole texture when clearing. This adds a state flag to easily save it and set it to the default value when performing meta operations. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-23 11:50:38 +01:00
Neil Roberts	df9945ca26	texstore: Add a generic implementation of GL_ARB_clear_texture Adds an implementation of the ClearTexSubImage driver entry point that just maps the texture and writes the values in. The extension is not yet enabled by default because it doesn't work with multisample textures as they don't have a simple linear layout. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-07-23 11:50:38 +01:00
Neil Roberts	fbbbf7529c	mesa/main: Add generic bits of ARB_clear_texture implementation This adds the driver entry point for glClearTexSubImage and fills in the _mesa_ClearTexImage and _mesa_ClearTexSubImage functions that call it. v2: Don't clear some of the images if only one of them makes an error Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2014-07-23 11:50:38 +01:00
Neil Roberts	2e63f91e60	teximage: Add utility func for format/internalFormat compatibility check In texture_error_check() there was a snippet of code to check whether the given format and internal format are basically compatible. This has been split out into its own static helper function so that it can be used by an implementation of glClearTexImage too.	2014-07-23 11:50:38 +01:00
Ilia Mirkin	c4067acd90	mesa/main: add ARB_clear_texture entrypoints Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Neil Roberts <neil@linux.intel.com>	2014-07-23 11:50:37 +01:00
Michel Dänzer	07c65b85ea	r600g/radeonsi: Use write-combined CPU mappings of some BOs in GTT Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-23 18:55:50 +09:00
Michel Dänzer	37d43ebb28	winsys/radeon: Use separate caching buffer managers for VRAM and GTT Should reduce overhead because the caching buffer manager doesn't need to consider buffers of the wrong type. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-23 15:43:04 +09:00
Dave Airlie	2c947760ed	docs/GL3.txt: update status for ARB_compute_shader since some bits are done in tree, but nobody is working on it anymore. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-07-23 11:06:15 +10:00
Anuj Phogat	9548ba6e7b	mesa: Don't use memcpy() in _mesa_texstore() for float depth texture data because float depth texture data needs clamping to [0.0, 1.0]. Let _mesa_texstore() fall back to the slower path. Fixes Khronos GLES3 CTS tests: shadow_execution_vert shadow_execution_frag V2: Move the check to _mesa_texstore_can_use_memcpy() function. Add check for floating point data types. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-21 18:33:29 -07:00
Kenneth Graunke	29af97f280	i965/fs: Fix gl_SampleMask handling for SIMD16 on Gen8+. We actually want to use mov(16), not mov(8). Fixes 7 Piglit tests: ARB_sample_shading/builtin-gl-sample-mask [2468] and ARB_sample_shading/builtin-gl-sample-mask-simple [468]. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80991 Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-07-21 14:59:13 -07:00
Kenneth Graunke	38ffef7840	i965/fs: Fix gl_SampleID for 2x MSAA and SIMD16 mode. We might be able to do this without an extra program key field, but this is non-invasive and fixes the bug, for now. This fixes the following Piglit tests on Broadwell: - ARB_sample_shading/builtin-gl-sample-id 2 - ARB_sample_shading/builtin-gl-sample-position 2 - EXT_framebuffer_multisample/multisample-blit 2 color - EXT_framebuffer_multisample/multisample-blit 2 color linear - EXT_framebuffer_multisample/multisample-blit 2 depth - EXT_framebuffer_multisample/no-color 2 depth combined - EXT_framebuffer_multisample/no-color 2 depth separate - EXT_framebuffer_multisample/no-color 2 depth single - EXT_framebuffer_multisample/no-color 2 depth-computed combined - EXT_framebuffer_multisample/no-color 2 depth-computed separate - EXT_framebuffer_multisample/no-color 2 depth-computed single - EXT_framebuffer_multisample/unaligned-blit 2 color msaa - EXT_framebuffer_multisample/unaligned-blit 2 depth msaa Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80991 Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-07-21 14:59:12 -07:00
Kenneth Graunke	4cf47c80fc	i965: Add missing persample_shading field to brw_wm_debug_recompile. Otherwise, the performance warning for shader recompiles will just say "something else". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-21 11:19:44 -07:00
Kenneth Graunke	caf8c07dd4	i965/disasm: Don't disassemble the URB complete field on Broadwell. It doesn't exist, so attempting to read it will trigger generation assertions in the brw_inst API. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-21 11:19:17 -07:00
Kenneth Graunke	662f1ccc24	i965: Disable hex offset printing in disassembly. Printing the hex offsets makes it basically impossible to diff assembly: if you add even a single instruction, the entire shader shows up as a difference. So, every time I want to compare assembly, I have to strip this out. The hex offsets might be useful when debugging compaction, or when inspecting the program cache buffer. Since it's occasionally useful, but uncommon, this patch disables it by default, but makes it easy to re-enable it temporarily when the need arises. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-21 11:19:08 -07:00
Matt Turner	3e9105f7ee	i965/vec4: Use foreach_inst_in_block a couple more places. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-21 10:35:41 -07:00
Matt Turner	1761671b06	i965: Replace cfg instances with calls to calculate_cfg(). Avoids regenerating it unnecessarily. Every program in shader-db improved, none by an amount less than a 1/3 reduction. One Dota2 shader decreased from 62 -> 24. cfg calculations: 429492 -> 193197 (-55.02%) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-21 10:35:39 -07:00
Matt Turner	dd65a6d9ad	i965/cfg: Add a foreach_block_and_inst macro. Will let us abstract how the instructions are stored. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-21 10:35:38 -07:00
Matt Turner	680fe0acb3	i965: Add cfg to backend_visitor. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-21 10:35:34 -07:00
Tom Stellard	b0f780345e	radeonsi/compute: Add scratch buffer support v2 The scratch buffer will be used for private memory and also register spilling. v2: - Code cleanups	2014-07-21 10:00:09 -04:00
Tom Stellard	6cc5334e42	radeonsi/compute: Bump number of user sgprs for LLVM 3.5 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-21 10:00:09 -04:00
Tom Stellard	81385f7596	winsys/radeon: Query the kernel for the number of SEs and SHs per SE Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-21 10:00:09 -04:00
Tom Stellard	245e86168a	radeonsi/compute: Share COMPUTE_DBG macro with r600g Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-21 10:00:09 -04:00
Tom Stellard	9ba3105e0a	radeonsi: Read rodata from ELF and append it to the end of shaders This is used for programs that have arrays of constants that are accessed using dynamic indices. The shader will compute the base address of the constants and then access them using SMRD instructions.	2014-07-21 10:00:09 -04:00
Ian Romanick	01c21c459f	glsl: Fix bad indentation Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-19 15:04:04 -07:00
Ian Romanick	47e2a74a5a	i965: Silence unused parameter warning brw_fs_visitor.cpp:2400:1: warning: unused parameter 'ir' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-19 15:04:01 -07:00
Ian Romanick	22b9641edf	i965: Silence 'comparison is always true' warning The parameter is an int16_t, and we're checking that its value will fit in 16-bits. Yes, the value that is stored in 16-bits will surely fit in 16-bits. brw_inst.h: In function 'brw_inst_set_gen6_jump_count': brw_inst.h:321:66: warning: comparison is always true due to limited range of data type [-Wtype-limits] brw_inst.h:321:66: warning: comparison is always true due to limited range of data type [-Wtype-limits] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-19 15:03:57 -07:00
Ian Romanick	1946612b7d	i965: Silence many unused parameter warnings brw_inst.h: In function 'brw_inst_set_src1_vstride': brw_inst.h:118:76: warning: unused parameter 'brw' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-19 15:03:49 -07:00
Vinson Lee	f6fc807345	configure.ac: Add LLVM patch version to error message. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-18 21:33:38 -07:00
Jason Ekstrand	ecd3e89b32	main/format_pack: Fix a wrong datatype in pack_ubyte_R8G8_UNORM Before it was only storing one of the color components due to truncation. With this patch it now properly stores all of them. Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-07-18 18:34:36 -07:00
Carl Worth	8ed24543f8	docs: Import 10.2.4 release notes And add a news item.	2014-07-18 16:50:05 -07:00
Jason Ekstrand	f14d217f5c	Add support for RGBA8 and RGBX8 textures in intel_texsubimage_tiled_memcpy Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-07-17 18:20:09 -07:00
Jason Ekstrand	765f4b8c04	i965: Improve debug output in intelTexImage and intelTexSubimage Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-07-17 18:20:09 -07:00
Marek Olšák	d808de31bd	radeonsi: only update vertex buffers when they need updating Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-07-18 01:58:59 +02:00
Marek Olšák	6210d6fdc2	radeonsi: remove nr_vertex_buffers Unused. Also inline util_set_vertex_buffers_count and simplify it. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-07-18 01:58:59 +02:00
Marek Olšák	0ed0bf0696	radeonsi: move vertex buffer descriptors from IB to memory This removes the intermediate storage (pm4 state) and generates descriptors directly in a staging buffer. It also reduces the number of flushes, because the descriptors no longer take CS space. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-07-18 01:58:59 +02:00
Marek Olšák	1635ded828	radeonsi: add support for fine-grained sampler view updates Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-07-18 01:58:59 +02:00
Marek Olšák	bea8f2f46d	radeonsi: move si_set_sampler_views to si_descriptors.c Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-07-18 01:58:59 +02:00
Marek Olšák	dd46841bc9	radeonsi: move sampler descriptors from IB to memory Sampler descriptors are now represented by si_descriptors. This also adds support for fine-grained sampler state updates and the border color update is now isolated in a separate function. Border colors have been broken if texturing from multiple shader stages is used. This patch doesn't change that. BTW, blitting already makes use of fine-grained state updates. u_blitter uses 2 textures at most, so we only have to save 2. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-07-18 01:58:59 +02:00
Marek Olšák	2a7b57ad42	radeonsi: implement ARB_draw_indirect Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-07-18 01:58:58 +02:00
Marek Olšák	887b69a233	radeonsi: don't add info->start to the index buffer offset info->start will be invalid once info->indirect isn't NULL, so it shouldn't be added to ib.offset. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-07-18 01:58:58 +02:00
Marek Olšák	09056b352d	radeonsi: use an SGPR instead of VGT_INDX_OFFSET The draw indirect packets cannot set VGT_INDX_OFFSET, they can only set user data SGPRs. This is the only way to support start/index_bias with indirect drawing. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-07-18 01:58:58 +02:00
Marek Olšák	a66d934139	radeonsi: assume LLVM 3.4.2 is always present Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-07-18 01:58:58 +02:00
Marek Olšák	4ad682461e	configure.ac: require LLVM 3.4.2 for radeon Needed by ARB_draw_indirect. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-07-18 01:58:58 +02:00
Marek Olšák	3a86ca54df	st/mesa,gallium: add a workaround for Unigine Heaven 4.0 and Valley 1.0 Most (all?) Unigine shaders fail to compile without this if sample shading is advertised. This is, of course, Unigine developers' fault. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-07-18 01:58:58 +02:00
Marek Olšák	b0ff18bd34	glsl: add a mechanism to allow #extension directives in the middle of shaders This is needed to make Unigine Heaven 4.0 and Unigine Valley 1.0 work with sample shading. Also, if this is disabled, the error message at least makes sense now. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-18 01:58:58 +02:00
Glenn Kennard	392c9f8dfe	r600g: Implement GL_ARB_texture_gather Only supported on evergreen and later. Currently limited to single component textures as the hardware GATHER4 instruction ignores texture swizzles. Piglit quick run passes on radeon 6670 with all applicable textureGather tests, no regressions. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-07-18 01:58:58 +02:00
Anuj Phogat	984a02ba55	i965: Fix z_offset computation in intel_miptree_unmap_depthstencil() The bug is triggered by using glTexSubImage2d() with GL_DEPTH_STENCIL as base internal format and non-zero x, y offsets. Currently x, y offsets are ignored while updating the texture image. Fixes Khronos GLES3 CTS tests: npot_tex_sub_image_2d npot_tex_sub_image_3d npot_pbo_tex_sub_image_2d npot_pbo_tex_sub_image_2d Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-17 15:52:27 -07:00
Anuj Phogat	5d9f5cd35b	Revert "i965: Extend compute-to-mrf pass to understand blocks of MOVs" This reverts commit `bbefb15e01`. Fixes the 11 regressions caused in framebuffer_blit tests in Khronos GLES3 CTS tests: Original patch reduced the instruction count but had no performance benefits. So, it's safe to revert it without causing any performance regressions. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Acked-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-17 15:49:46 -07:00
Adel Gadllah	b656e3c603	i915: Fix up intelInitScreen2 for DRI3 Commit `442442026e` updated both i915 and i965 for DRI3 support, but one check in intelInitScreen2 was missed for i915 causing crashes when trying to use i915 with DRI3. So fix that up. Reported-by: Igor Gnatenko <i.gnatenko.brain@gmail.com> References: https://bugzilla.redhat.com/show_bug.cgi?id=1115323 References: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=754297 Tested-by: František Zatloukal <Zatloukal.Frantisek@gmail.com> Tested-by: Dirk Griesbach <spamthis@freenet.de> Signed-off-by: Adel Gadllah <adel.gadllah@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-07-17 14:42:35 -07:00
Pavel Popov	4ceb612a10	mesa: Fix regression introduced by commit "mesa: fix packing of float texels to GL_SHORT/GL_BYTE". The commit "mesa: fix packing of float texels to GL_SHORT/GL_BYTE" replaced `*_TO_BYTE` with `*_TO_BYTE_TEX` because `*_TO_FLOAT_TEX` is used to unpack the texels to floats. For the same reason, `*_TO_FLOATZ` in the function extract_float_rgba should also be replaced with `*_TO_FLOAT_TEX`. Note that these macros automatically preserve zero when converting. The regression was observed on 3 oglconform tests: snorm-textures basic.getTexImage snorm-textures advanced.mipmap.manual.getTex snorm-textures advanced.mipmap.upload.getTex Signed-off-by: Pavel Popov <pavel.e.popov@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-07-18 08:01:07 +12:00
Thorsten Glaser	3cfe6bc9cc	nv50: fix build failure on m68k due to invalid struct alignment assumptions Make alignment assumptions explicit by inserting correct padding with unknown struct members. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2014-07-17 10:31:30 -04:00
Tom Stellard	74dfd86ed6	clover: Call end_query before getting timestamp result v2 v2: - Move the end_query() call into the timestamp constructor. - Still pass false as the wait parameter to get_query_result(). Reviewed-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Reviewed-by: Francisco Jerez <currojerez@riseup.net> CC: "10.2" <mesa-stable@lists.freedesktop.org>	2014-07-17 09:33:37 -04:00
Tapani Pälli	48deb4dbf2	glsl: handle a switch where default is in the middle of cases This fixes the following tests in es3conform: shaders.switch.default_not_last_dynamic_vertex shaders.switch.default_not_last_dynamic_fragment and makes the following tests in Piglit pass: glsl-1.30/execution/switch/fs-default-notlast-fallthrough glsl-1.30/execution/switch/fs-default_notlast No Piglit regressions. v2: take away unnecessary ir_if, just use conditional assignment v3: use foreach_in_list instead of foreach_list Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v2) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v3)	2014-07-17 07:39:12 +03:00
Kenneth Graunke	9e47ed2f77	glsl: Make the tree rebalancer use vector_elements, not components(). components() includes matrix columns, so if this code encountered a matrix, it would ask for something like a vec9 or vec16. This is clearly not what you want. Earlier code now prevents this from seeing matrices, but we should still use vector_elements, for clarity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-16 15:43:13 -07:00
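The difference described in the commit above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct type_desc` and this `components()` helper are simplified, hypothetical stand-ins for the GLSL type descriptor, not the actual Mesa code.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for a GLSL type descriptor. */
struct type_desc {
    unsigned vector_elements; /* 1 for scalars, 2-4 for vectors and matrix columns */
    unsigned matrix_columns;  /* 1 for scalars and vectors, 2-4 for matrices */
};

/* Total number of scalar components; this counts every matrix column. */
static unsigned components(const struct type_desc *t)
{
    return t->vector_elements * t->matrix_columns;
}
```

For a 3x3 matrix this helper yields 9 while `vector_elements` stays at 3, which is how a request for a "vec9" could arise when the component count is used to size a vector.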
Kenneth Graunke	7db75927ca	glsl: Guard against error_type in the tree rebalancer. This helped me track down the bug fixed in the previous commit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-16 15:43:13 -07:00
Kenneth Graunke	9697f8088f	glsl: Make the tree rebalancer bail on matrix operands. It doesn't handle things like (vector * matrix) correctly, and apparently Matt's intention was to bail. Fixes shader compilation in Natural Selection 2. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-16 15:43:13 -07:00
Kenneth Graunke	99f8ea295f	Revert "i965: Implement GL_PRIMITIVES_GENERATED with non-zero streams." This reverts commit `3178d2474a`. This caused GPU hangs on Ivybridge for some users and huge (80%) performance regressions across the board on multiple platforms. We need to find a better solution. I've made several attempts, but none of them have worked yet. In the meantime, we should revert this. Reverting it breaks GL_PRIMITIVES_GENERATED for non-zero streams, but that's okay, since we don't expose GL_ARB_gpu_shader5 yet. Fixes Piglit's EXT_transform_feedback/generatemipmap prims_generated test case on Haswell.	2014-07-16 14:19:29 -07:00
Chia-I Wu	1661f7559b	ilo: add some missing formats Map more pipe formats to hardware formats. Enable more VB formats on Haswell.	2014-07-16 14:31:59 +08:00
Chia-I Wu	69cd3ebd6f	ilo: update and tailor the surface format table Recreate the table from scratch with the help of a pdf-table-to-csv converter. Switch to a form that is more suitable for ilo.	2014-07-16 14:31:59 +08:00
Kenneth Graunke	a2de656278	i965: Don't copy propagate abs into Broadwell logic instructions. It's not clear what abs on logical instructions means on Broadwell, and it doesn't appear to do anything sensible. Fixes 270 Piglit tests (the bitand/bitor/bitxor tests with abs). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81157 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-07-15 22:12:15 -07:00
Kenneth Graunke	cf1b5eee7f	i965/fs: Use WE_all for gl_SampleID header register munging. This code should execute without regard to the currently executing channels. Asking for gl_SampleID inside control flow might break in strange ways. It appears to break even at the top of the program in SIMD16 mode occasionally as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Cc: mesa-stable@lists.freedesktop.org	2014-07-15 22:10:10 -07:00
Kenneth Graunke	e5adc560cc	i965/fs: Set force_uncompressed and force_sechalf on samplepos setup. gen8_fs_generator uses these to decide whether to set the execution size to 8 or 16, so we incorrectly made both of these MOVs the full width in SIMD16 shaders. (It happened to work out on Gen4-7.) Setting them should also help inform optimization passes what's really going on, which could help avoid bugs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Cc: mesa-stable@lists.freedesktop.org	2014-07-15 22:10:06 -07:00
Kenneth Graunke	2eaf3f670f	i965: Set execution size to 8 for instructions with force_sechalf set. Both inst->force_uncompressed and inst->force_sechalf mean that the generated instruction should be uncompressed and have an execution size of 8. We don't require the visitor to set both flags - setting inst->force_sechalf by itself is supposed to be enough. On Gen4-7, guess_execution_size() demoted instructions to 8-wide based on the default compression state. On Gen8+, we instead set a default execution size, which worked great...except that we forgot to check inst->force_sechalf when deciding whether to use 8 or 16. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Cc: mesa-stable@lists.freedesktop.org	2014-07-15 22:09:49 -07:00
Christoph Bumiller	4198711006	nvc0: fix translate path for PRIM_RESTART_WITH_DRAW_ARRAYS Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-15 17:57:45 -04:00
Christoph Bumiller	a284a0afa2	nvc0: add support for indirect drawing Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-15 17:57:45 -04:00
Ilia Mirkin	bbc4a7bd31	nouveau: check if a fence has already been signalled nouveau_fence_update does real work unconditionally. Avoid doing that if the fence we're checking on has already been signalled. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-15 17:57:45 -04:00
Matt Turner	c11096c749	glsl: Don't declare variables in for-loop declaration. Reported-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-07-15 12:17:48 -07:00
Connor Abbott	58270c2fac	exec_list: Make various places use the new length() method. Instead of hand-rolling it. v2 [mattst88]: Rename get_size to length. Expand comment in ir_reader. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1] Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Connor Abbott <connor.abbott@intel.com>	2014-07-15 11:16:16 -07:00
Connor Abbott	7b0f69225a	exec_list: Add a function to give the length of a list. v2 [mattst88]: Remove trailing whitespace. Rename get_size to length. Mark as const. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1] Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Connor Abbott <connor.abbott@intel.com>	2014-07-15 11:16:16 -07:00
Connor Abbott	28c4fd4bc6	exec_list: Add a prepend function. This complements the existing append function. It's implemented in a rather simple way right now; it could be changed if performance is a concern. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Connor Abbott <connor.abbott@intel.com>	2014-07-15 11:16:16 -07:00
Ian Romanick	9a723b970e	mesa: Don't allow GL_TEXTURE_{LUMINANCE,INTENSITY}_* queries outside compat profile There are no queries for GL_TEXTURE_LUMINANCE_SIZE, GL_TEXTURE_INTENSITY_SIZE, GL_TEXTURE_LUMINANCE_TYPE, or GL_TEXTURE_INTENSITY_TYPE in any version of OpenGL ES or desktop OpenGL core profile. NOTE: Without changes to piglit, this regresses required-sized-texture-formats. v2: Rebase on different initial change. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-07-15 10:46:33 -07:00
Ian Romanick	750286600b	mesa: Don't allow GL_TEXTURE_BORDER queries outside compat profile There are no texture borders in any version of OpenGL ES or desktop OpenGL core profile. Fixes piglit's gl-3.2-texture-border-deprecated. v2: Rebase on different initial change. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-07-15 10:46:33 -07:00
Ian Romanick	ee58c71a65	mesa: Handle uninitialized textures like other textures in get_tex_level_parameter_image Instead of catching the special case early, handle it by constructing a fake gl_texture_image that will cause the values required by the OpenGL 4.0 spec to be returned. Previously, calling glGenTextures(1, &t); glBindTexture(GL_TEXTURE_2D, t); glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, 0xDEADBEEF, &value); would not generate an error. Anuj: Can you verify this does not regress proxy_textures_invalid_size? Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Suggested-by: Brian Paul <brianp@vmware.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org> Cc: Anuj Phogat <anuj.phogat@gmail.com>	2014-07-15 10:46:33 -07:00
Matt Turner	83214edf8a	i965/fs: Relax interference check in register coalescing. A similar attempt was made in commit `5ff1e446` and was reverted in commit `a39428cf` after causing a regression in an ES 3 conformance test. The test still passes after this commit. total instructions in shared programs: 1994827 -> 1992858 (-0.10%) instructions in affected programs: 128247 -> 126278 (-1.54%) GAINED: 0 LOST: 1 Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-15 10:12:29 -07:00
Matt Turner	1d97212007	i965/fs: Perform CSE on sends-from-GRF rather than textures. Should potentially allow a few more cases, while avoiding doing CSE on texture operations on Gen <= 6 with the MRF. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80211 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: lu hua <huax.lu@intel.com>	2014-07-15 10:12:29 -07:00
Matt Turner	103716a862	glsl: Update expression types after rebalancing the tree. If we saw a left-leaning tree such as vec3(vec3(vec3(vec3, float), float), float), writing each node as type(left, right), we would see that all of the expression types were vec3, and then rebalance to vec3(vec3(vec3, float), vec3(float, float)), where the right-hand vec3 child should be float. This patch adds code to visit the rebalanced tree and update the expression types from the bottom up. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80880 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-15 10:12:29 -07:00
Matt Turner	7b962a4e6b	glsl: Add callback_leave to ir_hierarchical_visitor.	2014-07-15 10:12:29 -07:00
Matt Turner	76caaedd7e	i965: Initialize new chunks of realloc'd memory. Otherwise we'd compare uninitialized pointers with NULL and dereference, leading to crashes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-15 10:12:29 -07:00
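The failure mode named in the commit above (comparing uninitialized pointers with NULL after a realloc) can be illustrated with a small C sketch. `grow_ptr_array` is a hypothetical helper written for this example and is not taken from the i965 driver; it assumes the array holds pointers and that callers treat NULL as "slot empty".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: grow an array of pointers from old_count to
 * new_count slots and set every newly added slot to NULL.
 * Returns NULL on allocation failure; the old block is then still valid. */
static void **grow_ptr_array(void **arr, size_t old_count, size_t new_count)
{
    void **grown = realloc(arr, new_count * sizeof(*grown));
    if (grown == NULL)
        return NULL;

    /* realloc leaves the added tail indeterminate, so initialize it. */
    for (size_t i = old_count; i < new_count; i++)
        grown[i] = NULL;

    return grown;
}
```

With the tail initialized, a later check such as `if (arr[i] != NULL)` behaves predictably for slots that were never filled in.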
Tom Stellard	0d711e719e	radeon/llvm: Fix LLVM diagnostic error reporting We were trying to print the error message after disposing the message object. Tested-by and Reviewed-by: Aaron Watry <awatry@gmail.com>	2014-07-15 11:55:26 -04:00
José Fonseca	20b431fd9e	util/tgsi: Fix ureg_EMIT/ENDPRIM prototype. `0cbefc1bea` added a source argument to EMIT/ENDPRIM, but it did not update tgsi_ureg accordingly, causing all users of ureg_EMIT/ENDPRIM to fail at runtime with an assertion failure. Trivial.	2014-07-15 14:56:31 +01:00
Vinson Lee	e945a19b35	glapi: Use GetProcAddress instead of dlsym on Windows. This patch fixes this MinGW build error. glapi_gentable.c: In function '_glapi_create_table_from_handle': glapi_gentable.c:123:9: error: implicit declaration of function 'dlsym' [-Werror=implicit-function-declaration] *procp = dlsym(handle, symboln); ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Brian Paul <brianp@vmware.com>	2014-07-14 22:21:10 -07:00
Chia-I Wu	c25fe88ebf	ilo: raise texture size limits Report the hardware limits now that max-texture-size piglit test has been fixed.	2014-07-15 12:00:15 +08:00
Chia-I Wu	81d7f33e30	ilo: move away from drm_intel_bo_alloc_tiled We want to know the exact sizes of the BOs, and the driver has the knowledge to do so. Refactoring of the resource allocation code is needed though.	2014-07-15 12:00:10 +08:00
Marek Olšák	d859bdb4b5	radeonsi: partially revert "switch descriptors to i32 vectors" It indeed breaks LLVM 3.4.2.	2014-07-14 21:40:19 +02:00
Matt Turner	130c99ca15	i965/vec4: Invalidate live intervals in opt_cse, not _local. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-14 11:27:52 -07:00
Matt Turner	aba15d93a6	i965/vec4: Move aeb list into opt_cse_local. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-14 11:27:52 -07:00
Matt Turner	1ca6b5d2e8	i965/fs: Invalidate live intervals in opt_cse, not _local. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-14 11:27:52 -07:00
Matt Turner	bdbaa9ab5b	i965/fs: Move aeb list into opt_cse_local. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-14 11:27:52 -07:00
Cody Northrop	0f679f0ab5	glsl: Fix aggregates with dynamic initializers. Vectors are falling in to the ir_dereference_array() path. Without this change, the following glsl aborts the debug driver, or gets the wrong answer in release: mat2x2 a = mat2( vec2( 1.0, vertex.x ), vec2( 0.0, 1.0 ) ); Also submitting piglit tests, will reference in bug. v2: Rebase on Mesa master. v3: Remove unneeded check for arrays, which are covered by process_array_constructor(), recommended by Timothy Arceri. Signed-off-by: Cody Northrop <cody@lunarg.com> Reviewed-by: Courtney Goeltzenleuchter <courtney@lunarg.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79373	2014-07-14 08:36:36 -07:00
Jon TURNEY	923f78440c	Avoid mesa_dri_drivers import lib being installed On Cygwin and MinGW, linking a shared library also generates an import library. Use a wildcard which also matches the name of the megadriver import lib, mesa_dri_drivers.dll.a, so that it is also removed after the megadriver symlinks are created. (This then matches src/gallium/targets/dri/Makefile.am, which already does things this way.) Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-07-13 16:06:46 +01:00
Chris Forbes	5899a45a5b	i965/vec4: Silence warnings about unhandled interpolation ops Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-07-13 11:13:23 +12:00
Chris Forbes	1e4068ca45	docs: Mark off ARB_gpu_shader5 interpolation functions for i965 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-07-13 10:04:25 +12:00
Chris Forbes	9c0bddf735	i965/fs: add support for `ir_*_interpolate_at_*` expressions SIMD8-only for now. V5: - Fix style complaints - Move prototype to be with other oddball emit functions - Use unreachable() instead of assert() where possible V6: - Describe what is happening with the clamping - Add reg_width to make some expressions clearer Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-13 10:01:24 +12:00
Chris Forbes	5ed147c26f	i965/fs: Skip channel expressions splitting for interpolation The backend will have to do a message send, so we want to keep these in one piece, just like texture ops. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-13 10:01:22 +12:00
Chris Forbes	6e91f2df95	i965/fs: add generator support for pixel interpolator query V5: - Split into separate opcodes - Pass message data in src1 immediate - Put noperspective bit in fs_inst rather than adding any junk to backend_instruction Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-13 10:01:18 +12:00
Chris Forbes	d732598b63	i965: add low-level support for send to pixel interpolator Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-13 10:01:17 +12:00
Chris Forbes	0b0572a2ad	i965/disasm: add support for pixel interpolator messages V3: Rework for brw_inst changes Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-13 10:01:16 +12:00
Chris Forbes	1b6163bdf5	i965: Add message descriptor bit definitions for pixel interpolator These got lost in the big brw_inst shakeup. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-13 10:01:13 +12:00
Chris Forbes	f55e9a7c75	i965/disasm: Disassemble indirect sends more properly - Don't try to disassemble send's src1 as a descriptor if it's not an immediate. - In the same case, show src1 as an operand (makes it easier to see bogus register regions, etc -- the hardware is very fussy) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-12 11:29:17 +12:00
Chris Forbes	1854ead64c	i965: Avoid crashing while dumping vec4 insn operands We'd otherwise go looking into virtual_grf_sizes for things that aren't in there at all. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-12 11:29:17 +12:00
Chris Forbes	1499619fe6	i965: Fix two broken asserts in brw_eu_emit These were looking in the wrong field. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-12 11:29:09 +12:00
Chris Forbes	b45d417108	glsl: add new interpolateAt* builtin functions V2: - Don't assume everyone wants interpolateAtSample() lowered to interpolateAtOffset. It turns out this isn't what we want most of the time for i965. Lowering can be added later in an ir pass which drivers opt into, rather than bolting it straight into the builtin definition. - Only expose the interpolateAt* builtins in the fragment language. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-12 11:20:02 +12:00
Chris Forbes	1d5b06664f	glsl: add new expression types for interpolateAt* Will be used to implement interpolateAt*() from ARB_gpu_shader5 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-12 11:20:00 +12:00
Chris Forbes	8b7a323596	allow builtin functions to require parameters to be shader inputs The new interpolateAt* builtins have strange restrictions on the <interpolant> parameter. - It must be a shader input, or an element of a shader input array. - It must not include a swizzle. V2: Don't abuse ir_var_mode_shader_in for this; make a new flag. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-12 11:19:50 +12:00
Marek Olšák	ee2a818d33	radeonsi: rename definitions of shader limits Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-11 19:36:29 +02:00
Marek Olšák	4f3f0435bf	radeonsi: switch descriptors to i32 vectors This is a follow-up to the commit which adds texture fetches with offsets. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-11 19:36:29 +02:00
Marek Olšák	877bb52dc9	radeonsi: properly implement texture opcodes that take an offset Instead of using intr_name in lp_build_tgsi_action, this selects the names with a switch statement in the emit function. This allows emitting llvm.SI.sample for instructions without offsets and llvm.SI.image.sample.*.o otherwise. This depends on my LLVM changes. When LLVM 3.5 is released, I'll switch all texture instructions to the new intrinsics.	2014-07-11 19:36:29 +02:00
Marek Olšák	04aa2bd724	radeonsi: fix texture fetches with derivatives for 1DArray and 3D textures	2014-07-11 19:36:29 +02:00
Marek Olšák	b279f0143f	radeonsi: fix samplerCubeShadow with bias Pack the depth value before overwriting it with cube coordinates. Cc: mesa-stable@lists.freedesktop.org	2014-07-11 19:36:29 +02:00
Marek Olšák	a11fff329e	st/mesa: fix samplerCubeShadow with bias It has 5 coordinates: (x,y,z,depth,lodbias) Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-11 19:36:29 +02:00
Marek Olšák	734e4946f5	mesa: fix crash in st/mesa after deleting a VAO This happens when glGetMultisamplefv (or any other non-draw function) is called, which doesn't invoke the VBO module to update _DrawArrays and the pointer is invalid at that point. However st/mesa still dereferences it to setup vertex buffers ==> crash. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-07-11 19:36:29 +02:00
Jon TURNEY	f381c27c54	configure: Cygwin requires _XOPEN_SOURCE >= 700 to prototype strndup() Adjust definition of _XOPEN_SOURCE appropriately for use of strndup() added with commit `da3a47d6` Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-07-11 15:26:02 +01:00
Brian Paul	da46b9de9f	gallium/docs: minor clarification for TXQ instruction Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-10 11:30:04 -06:00
Brian Paul	c45b9b5721	softpipe: fix sp_get_dims() for PIPE_BUFFER Before, we were checking the level against view->u.tex.last_level but level is not valid for buffers. Plus, the aliasing of the view->u.tex and view->u.buf members (a union) caused the level checking arithmetic to be totally wrong. The net effect is we always returned early for PIPE_BUFFER size queries. This fixes the piglit "textureSize 140 fs samplerBuffer" test. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-07-10 10:59:40 -06:00
Brian Paul	faa6b0cdc3	glsl/glcpp: move macro declaration before code to fix MSVC build Reviewed-by: Carl Worth <cworth@cworth.org>	2014-07-10 08:08:10 -06:00
Ilia Mirkin	acaed8f41d	nvc0/ir: add support for interpolating with non-default settings Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-09 22:32:13 -04:00
Ilia Mirkin	7c9161521a	gallium: add INTERP_* opcodes to support interpolateAt* Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-07-09 22:32:13 -04:00
Ilia Mirkin	ca5e15f40f	r600g: remove unused base_vector_chan variable Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-07-09 22:32:13 -04:00
Ilia Mirkin	b8db6db8b0	i965: forward-declare struct brw_context in brw_reg.h Commit `54e91e7420` introduced a function declaration that uses brw_context. While brw_context tends to get included in most files, it is not when compiling intel_asm_annotation.c resulting in the following warning: In file included from brw_shader.h:25:0, from brw_cfg.h:32, from intel_asm_annotation.c:24: brw_reg.h:122:39: warning: 'struct brw_context' declared inside parameter list [enabled by default] brw_reg.h:122:39: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default] Add a forward-declaration for struct brw_context to avoid the issue. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-09 22:31:51 -04:00
Ilia Mirkin	a432079400	nvc0/ir: fix encoding of offset register into interpolation instruction Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-09 21:10:24 -04:00
Ilia Mirkin	7f937875c0	nvc0/ir: account for indirect textures on fermi for txd Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-09 21:10:24 -04:00
Ilia Mirkin	9807a8ddaf	nvc0/ir: unset s/r indirect sources before moving everything With the current logic, it's very likely that s/r indirect sources are right after the "regular" ones. Unset them before moving the texture arguments over rather than after, as one of those arguments would likely have assumed one of the s/r positions. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-09 21:10:24 -04:00
Emil Velikov	0bdc3e1afd	targets/dri-swrast: Convert to static/shared pipe-driver Convert the final dri target to the single DRI (megadriver) library. Cleanup all the automake leftovers from the conversion stage and update the scons build. v2: Link in llvmpipe, when applicable. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-10 01:06:49 +01:00
Emil Velikov	29ca7d2c94	st/dri: merge dri/drm and dri/sw backends Move the driver_name to dri2/drisw and remove all the SPLIT_TARGETS mayhem. In the next step we'll unify the dri and dri-swrast targets, completing the gallium DRI megadriver. v2: Remove leftover st/dri Makefiles from CONFIG_FILES. Spotted by Thomas Helland. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-10 01:06:49 +01:00
Emil Velikov	f6483aa694	targets/dri-swrast: convert to gallium megadrivers :) Export the appropriate new symbol, and keep backwards compat via the megadriver_stub helper library. Our next step would be to unify dri/drm and dri/sw, leading to a complete megadrivers solution, and having a single library that provides dri across all targets. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-10 01:06:48 +01:00
Emil Velikov	dab5d16f0e	scons: build and use a single dri_common library Rather than building two identical ones for dri-vmwgfx and dri-swrast build a single library, and drop some duplication in the build. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-10 01:06:48 +01:00
Emil Velikov	0e357234f3	st/dri/drm: remove __driDriverExtensions and driDriverAPI ... and use libmegadriver_stub as their provider. Teach scons how to build the library archive and use it. v2: scons: fix build on a drm-less system. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-10 01:06:48 +01:00
Emil Velikov	3b7c120be3	targets/dri: cleanup conversion leftovers With all the users converted to __driGetExtensions_* we can have only a single inclusion of the required header + define. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-10 01:06:48 +01:00
Emil Velikov	f6898aa264	targets/dri: update scons build to handle __driDriverGetExtensions_vmwgfx Cc: Jose Fonseca <jfonseca@vmware.com> Cc: Brian Paul <brianp@vmware.com> Cc: Jakob Bornecrantz <jakob@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-10 01:06:48 +01:00
Emil Velikov	5c68a1dc0b	targets/dri: Add __driDriverGetExtensions_vmwgfx Identical to previous commits - will bring us a step closer to megadrivers. Cc: Jose Fonseca <jfonseca@vmware.com> Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-10 01:06:48 +01:00
Emil Velikov	ff0e25f3a6	targets/dri: Add __driDriverGetExtensions_i965 symbol Identical to previous commits - will bring us a step closer to megadrivers. Cc: Chia-I Wu <olv@lunarg.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-10 01:06:48 +01:00
Emil Velikov	3591acacf9	targets/dri: Add __driDriverGetExtensions_i915 symbol Identical to previous commits - will bring us a step closer to megadrivers. Cc: Stephane Marchesin <stephane.marchesin@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-10 01:06:48 +01:00
Emil Velikov	f48b06f89d	targets/dri: Add __driDriverGetExtensions_freedreno symbol Identical to previous two commits - will bring us a step closer to megadrivers. Cc: Rob Clark <robclark@freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-10 01:06:47 +01:00
Emil Velikov	4cd1bb6a91	targets/dri: Add __driDriverGetExtensions_(r300\|r600\|radeonsi) symbols The symbol is introduced by the mesa megadrivers, and adding gallium support for it will allow us to merge st/dri/drm and st/dri/sw, resulting in a single dri library across all of gallium. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-10 01:06:47 +01:00
Emil Velikov	5b7e43aea8	targets/dri: Add __driDriverGetExtensions_nouveau symbol The symbol is introduced by the mesa megadrivers, and adding gallium support for it will allow us to merge st/dri/drm and st/dri/sw, resulting in a single dri library across gallium. v2: Rebase on top of gallium dri3. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-10 01:06:47 +01:00
Ilia Mirkin	532eb72be3	tgsi: add interpolation location modifier support to text parser Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-09 19:26:32 -04:00
Ilia Mirkin	6b92a06ea3	mesa/st: add per sample shading state to fp key and set interpolation This enables a gallium driver not to care about the semantics of ARB_sample_shading vs ARB_gpu_shader5 sample attributes. When ARB_sample_shading-style sample shading is enabled, all of the fp inputs are marked for interpolation at the sample location. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-09 19:26:32 -04:00
Ilia Mirkin	4c97ed4411	gallium: switch dedicated centroid field to interpolation location The new location field can be either center, centroid, or sample, which indicates the location that the shader should interpolate at. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-07-09 19:26:32 -04:00
Kenneth Graunke	e3b16294cb	meta: Call glObjectLabel before linking. i965 precompiles shaders at link time, and prints a disassembly if INTEL_DEBUG=vs,gs,fs, including the shader name. However, blit shaders were showing up as "unnamed" since we hadn't set a name prior to linking. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-09 16:04:52 -07:00
Kenneth Graunke	272e36e229	ff_fragment_shader: Access glsl_types directly. Originally, we didn't have direct accessors for all of the GLSL types, so the only way to get at them was to use the symbol table. Now, we can just get at them directly, which is simpler and faster. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2014-07-09 15:48:24 -07:00
Brian Paul	c03c6e0168	st/mesa: add PIPE_FORMAT_R10G10B10A2_UNORM to format_map table as a candidate for the GL_RGB10_A2 internal texture format. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-09 15:06:46 -06:00
Brian Paul	282b783a15	st/mesa: add some missing MESA/PIPE_FORMAT_R10G10B10A2_UNORM switch cases Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-09 15:06:46 -06:00
Carl Worth	0e12cd7954	glsl/glcpp: Don't choke on an empty pragma The lexer was insisting that there be at least one character after "#pragma" and before the end of the line. This caused an error for a line consisting only of "#pragma" which violates at least the following sentence from the GLSL ES Specification 3.00.4: The scope as well as the effect of the optimize and debug pragmas is implementation-dependent except that their use must not generate an error. [Page 12 (Page 28 of PDF)] and likely the following sentence from that specification and also in GLSLangSpec 4.30.6: If an implementation does not recognize the tokens following #pragma, then it will ignore that pragma. Add a "make check" test to ensure no future regressions. This change fixes at least part of the following Khronos GLES3 CTS test: preprocessor.pragmas.pragma_vertex Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-09 12:05:14 -07:00
Carl Worth	43047384c3	glsl/glcpp: Promote "extra token at end of directive" from warning to error We've always warned about this case, but a recent conformance test expects this to be an error that causes compilation to fail. Make it so. Also add a "make check" test to ensure these errors are generated. This fixes the following Khronos GLES3 conformance tests: invalid_conditionals.tokens_after_ifdef_vertex invalid_conditionals.tokens_after_ifdef_fragment invalid_conditionals.tokens_after_ifndef_vertex invalid_conditionals.tokens_after_ifndef_fragment Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-09 12:05:14 -07:00
Carl Worth	dac3c986c5	glsl/glcpp: Once again report undefined macro name in error message. While writing the previous commit message, I just felt bad documenting the shortcoming of the change, (that undefined macro names would not be reported in error messages). Fix this by preserving the first-encountered undefined macro name and reporting that in any resulting error message. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-09 12:05:13 -07:00
Carl Worth	ec6222ef01	glsl/glcpp: Add short-circuiting for || and && in #if/#elif for OpenGL ES. The GLSL ES Specification 3.00.4 says: #if, #ifdef, #ifndef, #else, #elif, and #endif are defined to operate as for C++ except for the following: ... • Undefined identifiers not consumed by the defined operator do not default to '0'. Use of such identifiers causes an error. [Page 11 (page 127 of the PDF file)] as well as: The semantics of applying operators in the preprocessor match those standard in the C++ preprocessor with the following exceptions: • The 2nd operand in a logical and ('&&') operation is evaluated if and only if the 1st operand evaluates to non-zero. • The 2nd operand in a logical or ('||') operation is evaluated if and only if the 1st operand evaluates to zero. If an operand is not evaluated, the presence of undefined identifiers in the operand will not cause an error. (Note that neither of these deviations from C++ preprocessor behavior apply to non-ES GLSL, at least as of specification version 4.30.6). The first portion of this, (generating an error for an undefined macro in an (short-circuiting to squelch errors), was not implemented previously, but is implemented in this commit. A test is added for "make check" to ensure this behavior. Note: The change as implemented does make the error message a bit less precise, (it just states that an undefined macro was encountered, but not the name of the macro). This commit fixes the following Khronos GLES3 conformance test: undefined_identifiers.valid_undefined_identifier_1_vertex undefined_identifiers.valid_undefined_identifier_1_fragment undefined_identifiers.valid_undefined_identifier_2_vertex undefined_identifiers.valid_undefined_identifier_2_fragment Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-09 12:05:13 -07:00
Carl Worth	9794f8f245	glsl/glcpp: Fix glcpp to properly lex entire "preprocessing numbers" The preprocessor defines a notion of a "preprocessing number" that starts with either a digit or a decimal point, and continues with zero or more of digits, decimal points, identifier characters, or the sign symbols, ('-' and '+'). Prior to this change, preprocessing numbers were lexed as some combination of OTHER and IDENTIFIER tokens. This had the problem of causing undesired macro expansion in some cases. We add tests to ensure that the undesired macro expansion does not happen in cases such as: #define e +1 #define xyz -2 int n = 1e; int p = 1xyz; In either case these macro definitions have no effect after this change, so that the numeric literals, (whether valid or not), will be passed on as-is from the preprocessor to the compiler proper. This fixes the following Khronos GLES3 CTS tests: preprocessor.basic.correct_phases_vertex preprocessor.basic.correct_phases_fragment v2. Thanks to Anuj Phogat for improving the original regular expression, (which accepted a '+' or '-', where these are only allowed after one of [eEpP]. I also expanded the test to exercise this. v3. Also fixed regular expression to require at least one digit at the beginning (after an optional period). Otherwise, a string such as ".xyz" was getting sucked up as a preprocessing number, (where obviously this should be a field access). Again, I expanded the test to exercise this. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-07-09 12:05:13 -07:00
Carl Worth	98c0e3c783	glsl/glcpp: Fix glcpp to catch garbage after #if 1 ... #else Previously, a line such as: #else garbage would flag an error if it followed "#if 0", but not if it followed "#if 1". We fix this by setting a new bit of state (lexing_else) that allows the lexer to defer switching to the <SKIP> start state until after the NEWLINE following the #else directive. A new test case is added for: #if 1 #else garbage #endif which was untested before, (and did not generate the desired error). This fixes the following Khronos GLES3 CTS tests: tokens_after_else_vertex tokens_after_else_fragment Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-07-09 12:05:13 -07:00
Carl Worth	1d862a0b39	glsl/glcpp: Fixup glcpp tests for redefining a macro with whitespace changes. Previously, the test suite was expecting the compiler to allow a redefinition of a macro with whitespace added, but gcc is more strict and allows only for changes in the amounts of whitespace, (but insists that whitespace exist or not in exactly the same places). See: https://gcc.gnu.org/onlinedocs/cpp/Undefining-and-Redefining-Macros.html: These definitions are effectively the same: #define FOUR (2 + 2) #define FOUR         (2    +    2) #define FOUR (2 /* two */ + 2) but these are not: #define FOUR (2 + 2) #define FOUR ( 2+2 ) #define FOUR (2 * 2) #define FOUR(score,and,seven,years,ago) (2 + 2) This change adjusts the existing "redefine-macro-legitimate" test to work with the more strict understanding, and adds a new "redefine-whitespace" test to verify that changes in the position of whitespace are flagged as errors. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-07-09 12:05:13 -07:00
Anuj Phogat	a6e9cd14ca	glsl/glcpp: Fix preprocessor error condition for macro redefinition This patch specifically fixes redefinition condition for white space changes. #define and #undef functionality in GLSL follows the standard for C++ preprocessors for macro definitions. From https://gcc.gnu.org/onlinedocs/cpp/Undefining-and-Redefining-Macros.html: These definitions are effectively the same: #define FOUR (2 + 2) #define FOUR         (2    +    2) #define FOUR (2 /* two */ + 2) but these are not: #define FOUR (2 + 2) #define FOUR ( 2+2 ) #define FOUR (2 * 2) #define FOUR(score,and,seven,years,ago) (2 + 2) Fixes Khronos GLES3 CTS tests; invalid_object_whitespace_vertex invalid_object_whitespace_fragment Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Carl Worth <cworth@cworth.org>	2014-07-09 12:05:13 -07:00
Carl Worth	1a46dd6edd	glsl/glcpp: Add test to ensure compiler won't allow #undef for some builtins Currently verifying that an #undef of __FILE__, __LINE__, or __VERSION__ will generate an error. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-07-09 12:05:13 -07:00
Anuj Phogat	64b7fc2dd1	glsl/glcpp: Do not allow undefining the built-in macros Fixes piglit tests in spec/glsl-es-3.00/compile: undef-__FILE__.vert undef-GL_ES.vert undef-__LINE__.vert undef-__VERSION__.vert Also, fixes Khronos GLES3 CTS tests: undefine_invalid_object_1_vertex undefine_invalid_object_1_fragment undefine_invalid_object_2_vertex undefine_invalid_object_2_fragment Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Carl Worth <cworth@cworth.org>	2014-07-09 12:05:13 -07:00
Brian Paul	378fa34c7b	gallium/u_blitter: fix some shader memory leaks The _msaa shaders weren't getting freed. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-09 12:15:35 -06:00
Ilia Mirkin	e924bb32f4	tgsi: properly parse indirect dimension references (e.g. for UBOs) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-07-09 12:40:07 -04:00
Christian König	c8011c1885	radeonsi: fix order of r600_need_dma_space and r600_context_bo_reloc Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-09 15:08:22 +02:00
Brian Paul	d10204930f	st/mesa: fix geometry shader memory leak Spotted by Charmaine Lee. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-07-09 06:43:26 -06:00
Brian Paul	176b64b811	mesa: fix geometry shader memory leaks Spotted by Charmaine Lee. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-09 06:43:26 -06:00
Brian Paul	971122a9c0	st/mesa: minor simplification of some state atom assignments	2014-07-09 06:43:25 -06:00
Brian Paul	301ffe7b26	st/mesa: minor fix-up in st_GetSamplePosition() If the driver doesn't implement get_sample_position(), let's return some non-garbage values.	2014-07-09 06:43:25 -06:00
Brian Paul	91affc8b32	mesa: use float to silence MSVC warning in _mesa_GetMultisamplefv()	2014-07-09 06:43:25 -06:00
Samuel Pitoiset	50bbe49c33	nvc0: allocate more space before a counter is configured On nvc0, a counter can have up to 6 sources instead of only one for nve4+. This fixes a crash when a counter uses more than one source. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-08 19:41:00 -04:00
Tobias Klausmann	a9b21015f5	nv50/ir: use unordered_set instead of list to keep track of var uses The set of variable uses does not need to be ordered in any way, and removing/adding elements is a fairly common operation in various optimization passes. This shortens runtime of piglit test fp-long-alu to ~22s from ~4h Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-08 19:41:00 -04:00
Kenneth Graunke	503391b46f	i965/disasm: Fix disassembly of the any16h/all16h predicates. BRW_PREDICATE_ALIGN1_ANY16H was incorrectly being disassembled as "all16h", and ALL16H would probably print as "(null)". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-08 12:31:01 -07:00
Kenneth Graunke	e13a6406c3	glsl: Fix the foreach_in_list_reverse macro. We clearly don't want to start at the head and walk backwards; we want to start at the last real element before the tail sentinel. If the list is empty, tail_pred will be the head sentinel, and we'll stop. Nothing uses this function, so I guess nobody noticed it was broken. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-07-08 12:31:01 -07:00
Marek Olšák	be536efe20	radeonsi: mark MSAA config state as dirty at the beginning of CS Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81020 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-07-08 20:46:23 +02:00
Marek Olšák	fe6be9926f	gallium: fix u_default_transfer_inline_write for textures This doesn't fix any known issue. In fact, radeon drivers ignore all the discard flags for textures and implicitly do "discard range" for any write transfer. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-07-08 20:46:23 +02:00
Matt Turner	cf430408c4	i965: Remove artificial dependency between math instructions. ... on Gen6+. I'm not actually sure which class Gen6 fits into. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-08 11:12:02 -07:00
Matt Turner	099cbc1477	i965/fs: Track dependencies in instruction scheduling per reg offset. Previously instruction scheduling tracked dependencies on a per-register basis. This meant that there was an artificial dependency between interpolation instructions writing into the same virtual register. Instruction scheduling would insert a number of instructions between the two instructions in this example, when they are actually independent. linterp vgrf8+0.0:F, hw_reg2:F, hw_reg3:F, hw_reg6:F linterp vgrf8+1.0:F, hw_reg2:F, hw_reg3:F, hw_reg6+16:F This led to cases where the first texture coordinate is interpolated at the beginning of the shader, but the second is done immediately before the texture operation that uses it as a source. After this change, the artificial dependency is removed and the interpolation instructions are scheduled together. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-08 11:12:02 -07:00
Jon TURNEY	7a641dd58d	configure: Don't special case Cygwin to use gnu99, define _XOPEN_SOURCE instead Revert "build: Build on Cygwin with gnu99 instead of c99." and define _XOPEN_SOURCE appropriately. This reverts commit `53e36d333c`. Since Cygwin 1.7.18 (April 2013), its headers correctly prototype strtoll() when using -std=c99, and correctly prototype strdup() when _XOPEN_SOURCE is defined appropriately, so this workaround is no longer needed. Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Cc: Vinson Lee <vlee@freedesktop.org>	2014-07-08 14:25:21 +01:00
Chia-I Wu	8ff16111ee	ilo: fix fence reference counting The old code was complicated, and was wrong when *ptr is NULL.	2014-07-08 15:00:36 +08:00
Kristian Høgsberg	bbefb15e01	i965: Extend compute-to-mrf pass to understand blocks of MOVs The current compute-to-mrf pass doesn't handle blocks of MOVs. Shaders that end with a texture fetch followed by an fb write are left like this: 0x00000000: pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000008: pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000010: send(8) g2<1>UW g6<8,8,1>F sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 WE_normal 1Q }; 0x00000020: mov(8) g113<1>F g2<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000028: mov(8) g114<1>F g3<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000030: mov(8) g115<1>F g4<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000038: mov(8) g116<1>F g5<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000040: sendc(8) null g113<8,8,1>F render ( RT write, 0, 4, 12) mlen 4 rlen 0 { align1 WE_normal 1Q EOT }; This patch lets compute-to-mrf recognize blocks of MOVs and match them to instructions (typically SEND) that write multiple registers. With this, the above shader becomes: 0x00000000: pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000008: pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000010: send(8) g113<1>UW g6<8,8,1>F sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 WE_normal 1Q }; 0x00000020: sendc(8) null g113<8,8,1>F render ( RT write, 0, 20, 12) mlen 4 rlen 0 { align1 WE_normal 1Q EOT }; which is the bulk of the shader db results: total instructions in shared programs: 987040 -> 986720 (-0.03%) instructions in affected programs: 844 -> 524 (-37.91%) GAINED: 0 LOST: 0 The optimization also applies to MRT shaders that write the same color value to multiple RTs, in which case we can eliminate four MOVs in a similar fashion. See fbo-drawbuffers2-blend in piglit for an example. No measurable performance impact. No piglit regressions. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2014-07-07 23:39:40 -07:00
Ilia Mirkin	8aa34dc9cb	nvc0/ir: fill offset in properly for TXD Apparently TXD wants its offset differently than TEX, accepting it in the upper bits of the layer index. Unclear what happens when this is combined with indirect sampler indexing. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-08 00:14:33 -04:00
Ilia Mirkin	114d46829d	nvc0/ir: use manual TXD when offsets are involved Something about how we're implementing offsets for TXD is wrong, just flip to the generic quadop-based implementation in that case. This is the minimal fix appropriate for backporting. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2014-07-08 00:14:33 -04:00
Ilia Mirkin	afea9bae67	nvc0/ir: do quadops on the right texture coordinates for TXD handleTEX moves the layer as the first argument. This makes sure that the quadops deal with the texture coordinates. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2014-07-08 00:14:33 -04:00
Ilia Mirkin	1065aa92f4	nv50/ir: ignore bias for samplerCubeShadow on nv50 Unfortunately there's no good way to do this on the nv50 shader isa. Dropping the bias seems preferable to doing the compare post-filtering. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2014-07-08 00:14:33 -04:00
Ilia Mirkin	30d91e0eec	nv50/ir: retrieve shadow compare from first arg This can only happen with texture(samplerCubeShadow, bias), where the compare will be in the first argument. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>	2014-07-08 00:14:33 -04:00
Carl Worth	9007c4f9f4	docs: Import 10.2.3 release notes And add a news item.	2014-07-07 16:28:37 -07:00
Matt Turner	f6db414f3c	i965/fs: Disable unlit_centroid_workaround on Haswell. Although the HSW PRM shows it, the BSpec lists this workaround as being for Ivybridge only. total instructions in shared programs: 1994951 -> 1993675 (-0.06%) instructions in affected programs: 27325 -> 26049 (-4.67%)	2014-07-06 18:19:17 -07:00
Matt Turner	6f7c4a8d05	i965/vec4: Perform CSE on CMP(N) instructions. Port of commit `b16b3c87` to the vec4 code. No shader-db improvements, but might as well. The fs backend saw an improvement because it's scalar and multiple identical CMP instructions were generated by the SEL peepholes.	2014-07-06 18:19:15 -07:00
Matt Turner	7921bf0062	i965/vec4: Don't emit null MOVs in CSE. Port of commit `219b43c6` to the vec4 code.	2014-07-06 18:18:52 -07:00
Matt Turner	949991cc99	i965/vec4: Improve CSE performance by expiring some available expressions. Port of commit `5daf867f` to the vec4 code.	2014-07-06 18:18:52 -07:00
Kenneth Graunke	3c8dc48ad1	i965/vec4: Add basic common subexpression elimination. [mattst88]: Modified to perform CSE on instructions with the same writemask. Offered no improvement before. total instructions in shared programs: 1995633 -> 1995185 (-0.02%) instructions in affected programs: 14410 -> 13962 (-3.11%) Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-06 18:18:51 -07:00
Matt Turner	848fc7f710	i965: Fix warnings introduced in commit `e24ef5ab`. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-06 18:15:36 -07:00
Christian König	042b061fef	gallium/radeon: use PRIX64 instead of PRIu64 We want hex values here, not decimals. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-06 13:28:04 +02:00
Matt Turner	1580865a8c	i965: Move assembly annotation functions to intel_asm_annotation.c. It's C. Compile it as such. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	423932791d	i965: Rename intel_asm_printer -> intel_asm_annotation. The #ifndef include guards already said the right thing :) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	6d3e24a5c2	i965: Make backend_instruction usable from C. With a hack to place an exec_node in the struct in C to be at the same location as the inherited exec_node in C++. Acked-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	0db30fcf89	i965/cfg: Make cfg_t usable from C. Acked-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	857c06236c	i965: Repack backend_instruction struct. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	ce706b4a9b	i965: Make a brw_predicate enum. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	46e5b2a497	i965: Make a brw_conditional_mod enum. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	ab74a42eef	i965: Move common fields into backend_instruction. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	3de11cacf0	i965: Use enum brw_reg_type for register types. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	34ef6a7651	i965: Move is_zero/one/null/accumulator into backend_reg. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	c019105f37	i965: Make a common backend_reg class. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:30 -07:00
Matt Turner	9377b189f7	i965: Drop imm union from visitor register classes. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:29 -07:00
Matt Turner	53992a102f	i965: Use immediate storage in brw_reg for visitor regs. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-07-05 22:42:29 -07:00
Andreas Boll	45446efc30	docs: add news item for mesa-demos 8.2.0 release	2014-07-05 11:32:54 +02:00
Chris Forbes	4087d9ec0b	glsl: Fix merging of layout(invocations) with other qualifiers If another layout qualifier appeared to the left of `invocations` in the GS input layout declaration, the invocation count would be dropped on the floor. Fixes the piglit tests: spec/ARB_transform_feedback3/arb_transform_feedback3-ext_interleaved_two_bufs_gs_max spec/ARB_gpu_shader5/arb_gpu_shader5-invocation-id spec/ARB_gpu_shader5/compiler/correct-multiple-layout-qualifier-invocations.geom spec/ARB_gpu_shader5/execution/invocations-conflicting Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-05 09:42:17 +12:00
Ilia Mirkin	9a37eb8adb	nvc0: add a memory barrier when there are persistent UBOs Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-07-03 20:08:41 -04:00
Ilia Mirkin	5d4f5218bb	nv50: do an explicit flush on draw when there are persistent buffers Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-07-03 20:01:07 -04:00
Ilia Mirkin	b2b7c65122	nv50: disable dedicated ubo upload method The hardware allows multiple simultaneous renders with the same memory-backed constbufs but with each invocation having different values. However in order for that to work, the data has to be streamed in via the right constbuf slot. We weren't doing that for UBOs. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2 10.1" <mesa-stable@lists.freedesktop.org>	2014-07-03 20:01:06 -04:00
Ilia Mirkin	32b71246e7	gallium: rename PIPE_CAP_TGSI_VS_LAYER to also have _VIEWPORT Now that this cap is used to determine the availability of both, adjust its name to reflect the new reality. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-07-03 19:39:25 -04:00
Ilia Mirkin	0fb6f1bf1d	mesa/st: enable AMD_vertex_shader_viewport_index The assumption is that any driver capable of emitting layer from the vertex shader and supporting viewports should be able to also handle emitting viewport index from the vertex shader. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tobias Droste <tdroste@gmx.de>	2014-07-03 19:39:25 -04:00
Ilia Mirkin	313acb3ffa	r600g: allow vs to write to gl_ViewportIndex Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Tobias Droste <tdroste@gmx.de>	2014-07-03 19:39:25 -04:00
Thomas Hellstrom	556a415033	svga: Don't unnecessarily reemit BindGBShader commands v2 The Linux winsys can no longer relocate shader code, so avoid reemitting BindGBShader commands. They are costly. v2: Correctly handle errors from SVGA3D_BindGBShader() Reported-by: Michael Banack <banackm@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Tested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-07-03 22:26:00 +02:00
Aaron Watry	824197efd5	radeon/llvm: Allocate space for kernel metadata operands Previously, we were assuming that kernel metadata nodes only had 1 operand. Kernels which have attributes can have more than 1, e.g.: !0 = metadata !{void (i32 addrspace(1)*)* @testKernel, metadata !1} !1 = metadata !{metadata !"work_group_size_hint", i32 4, i32 1, i32 1} Attempting to get the kernel without the correct number of attributes led to memory corruption and luxrays crashing out. Fixes the cl/program/execute/attributes.cl piglit test. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76223 CC: "10.2" <mesa-stable@lists.freedesktop.org>	2014-07-03 15:18:03 -05:00
Samuel Iglesias Gonsalvez	7f0420700c	glsl: fix duplicated layout qualifier detection for GS This patch fixes the duplicated layout qualifier detection for geometry shader's layout qualifiers. Also it makes the detection code more legible by defining allowed_duplicates_mask variable. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80778 Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-07-03 10:34:12 -07:00
Brian Paul	986adb9057	svga: add switch cases for PIPE_SHADER_CAP_DOUBLES Signed-off-by: Brian Paul <brianp@vmware.com>	2014-07-03 08:25:50 -06:00
Thomas Hellstrom	35cf3831d7	st/xa: Don't close the drm fd on failure v2 If XA fails to initialize with pipe_loader enabled, the pipe_loader's cleanup function will close the drm file descriptor. That's pretty bad because the file descriptor will probably be the X server driver's only connection to drm. Temporarily solve this by dup()'ing the file descriptor before handing it over to the pipe loader. This fixes freedesktop.org bugzilla bug #80645. v2: Fix CC addresses. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-07-03 02:55:00 -07:00
Michel Dänzer	370184e813	Revert "radeonsi: Use dma_copy when possible for si_blit." This reverts commit `5d5c20920e`. Caused visual corruption, see e.g. https://bugs.freedesktop.org/show_bug.cgi?id=80827#c1	2014-07-03 11:17:38 +09:00
Ilia Mirkin	7666a9f4ae	i965: expose AMD_vertex_shader_viewport_index on gen7+ Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-07-02 21:59:41 -04:00
Ilia Mirkin	df61553070	glsl: add support for AMD_vertex_shader_viewport_index Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Tested-by: Tobias Droste <tdroste@gmx.de>	2014-07-02 21:59:38 -04:00
Ilia Mirkin	e593953b50	mesa: add support for AMD_vertex_shader_viewport_index Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Tested-by: Tobias Droste <tdroste@gmx.de>	2014-07-02 21:59:05 -04:00
Ilia Mirkin	6c544e5413	mesa/st: enable ARB_fragment_layer_viewport If multiple viewports are supported, that implies the presence of a GS and layered rendering, so we can enable ARB_fragment_layer_viewport as well. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-07-02 20:20:53 -04:00
Eric Anholt	6ded75ed08	i965/gen6: Add a spec citation about push constant packet requirements. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-02 12:45:59 -07:00
Eric Anholt	e874274d08	i965: Add a comment about null renderbuffer surfaces and why they exist. I noticed this when trying to find comments about pull constant buffers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-02 12:45:59 -07:00
Eric Anholt	489ec68554	i965: Update a ton of comments about constant buffers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-02 12:45:59 -07:00
Eric Anholt	e24ef5ab18	i965: Merge VS/GS and WM pull constant buffer upload paths. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-02 12:45:59 -07:00
Eric Anholt	34f4e614dd	i965/gen6+: Merge VS/GS and WM push constant buffer upload paths. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-02 12:45:59 -07:00
Eric Anholt	c0f1929dd2	i965: Move dispatch_grf_start_reg and first_curbe_grf into stage_prog_data. I wanted to access this value from stage-generic code, so stop storing it under two different names. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-02 12:45:59 -07:00
Eric Anholt	5ba31c34d8	i965: Fix state flags for gen4/5 CURBE. If we had some NOS affecting VS compilation that resulted in optimization changing the set of constants to be uploaded, we might not have reuploaded the constants. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-02 12:45:59 -07:00
Eric Anholt	a8330c343c	i965: Remove a dead define. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-02 12:45:59 -07:00
Eric Anholt	c00d3bd59d	i965: Reuse libdrm's header for AUB definitions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-02 12:45:59 -07:00
Eric Anholt	a6af5602af	i965: Fix stale comments about the state cache. This changed in the state streaming work years ago. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-02 12:45:59 -07:00
Eric Anholt	ccf7878126	i965: Fix stale binding table comment. I recently moved the code from the mentioned location right into this file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-02 12:45:59 -07:00
Eric Anholt	ccda1b9ba9	i965: Drop the memcmp for finding duplicated CURBE uploads. At this point, the extra copy of the data and memcmp are as expensive as just re-uploading. Note: now that we'll always upload, and brw_constant_buffer watches BRW_NEW_BATCH anyway, we don't need to explicitly unref the old curbe_bo at batch reset time. No significant performance difference on glamor copywinwin10 (n=55), despite that test having a 98% hit rate on the cache. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-02 12:45:58 -07:00
Eric Anholt	44c63bdd40	i965: Reuse intel_upload.c for gen4/5 constant buffers. No performance difference on glamor with copywinwin10 (n=40) on my gm45. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-07-02 12:45:58 -07:00
Tom Stellard	fea996c2aa	gallium: Add PIPE_SHADER_CAP_DOUBLES This is for reporting whether or not double precision floating-point operations are supported. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-07-02 15:31:52 -04:00
Matt Arsenault	2ab44f657e	clover: Fix not setting build log if the build succeeds v2 If there were only warnings, they would not be added to the log. v2: - Use compat::string. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-07-02 15:15:13 -04:00
Francisco Jerez	d2504ead2f	clover: Have compat::string allocate its own memory.	2014-07-02 15:15:13 -04:00
Tom Stellard	9e5beac236	gallium/radeon: Only print a message for LLVM diagnostic errors We were printing messages for all diagnostic types, which was spamming the console for some OpenCL programs.	2014-07-02 15:15:13 -04:00
Tom Stellard	b9f501bc6b	radeon/llvm: Use the llvm.rsq.clamped intrinsic for RSQ Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Laurent Carlier <lordheavym@gmail.com> https://bugs.freedesktop.org/show_bug.cgi?id=80015 CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-07-02 14:59:29 -04:00
Ilia Mirkin	141f8fe1d1	r600g: allow viewport index/layer to be sent to ps In order to support ARB_fragment_layer_viewport, we need to explicitly send these along to the pixel shader, since it has no other way to retrieve them. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Tobias Droste <tdroste@gmx.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-07-02 10:53:34 -04:00
Emil Velikov	7414552b18	targets/dri: allow duplicated symbols With the inclusion of xmlconfig in the loader we're providing dri* symbols which are already available in libdricommon.la. This leads to a build break due to the multiple definitions. Temporarily allow multiple definitions, until we come up with a better solution. Reported-by: Laurent Carlier <lordheavym@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-07-02 12:25:05 +01:00
Emil Velikov	bd322dfd0e	st/dri: Remove the old libdridrm library With all the hw drivers converted, we can go back to having a single libdridrm provider. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-02 10:52:53 +01:00
Emil Velikov	37b7a76266	targets/dri-vmwgfx: Convert to static/shared pipe-drivers Convert the final hardware driver to a single dri provider which includes all the pipe-drivers. Update the scons build and drop the unused vmw_powf.c. Cc: José Fonseca <jfonseca@vmware.com> Cc: Brian Paul <brianp@vmware.com> Cc: Jakob Bornecrantz <jakob@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-02 10:52:53 +01:00
Emil Velikov	100e654b25	targets/dri-ilo: Convert to static/shared pipe-driver Cc: Chia-I Wu <olv@lunarg.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-02 10:52:53 +01:00
Emil Velikov	0a4be815f4	targets/dri-i915: Convert to static/shared pipe-drivers v2: - Drop inclusion of the winsys wrapper and softpipe/llvmpipe. - Remove old Makefile.am, target.c. - Correctly append i915 to the megadrivers list. Cc: Stephane Marchesin <stephane.marchesin@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-02 10:52:53 +01:00
Emil Velikov	231063b032	targets/dri-freedreno: Convert to static/shared pipe-drivers Now we don't need a second dri module when using kgsl :) Cc: Rob Clark <robclark@freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-02 10:52:53 +01:00
Emil Velikov	495e3e7bed	targets/(r300|r600|radeonsi)/dri: Convert to static/shared pipe-drivers Related to previous commit, merge the separate dri targets to a single one. This is essentially all the buildsystem mayhem required for megaradeon. Cc: Marek Olšák <marek.olsak@amd.com> Cc: Michel Dänzer <michel.daenzer@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2014-07-02 10:52:53 +01:00
Emil Velikov	6eabddd531	targets/dri-nouveau: Convert to static/shared pipe-drivers Similar to other targets, we'd like to convert all the separate targets into a single one, thus we'll minimize the duplication and overall size of mesa. The conversion is done on a per-API basis, with the drivers available either statically or shared. Currently the former is the default. v2: Correctly append the version script to the linker flags. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-02 10:52:53 +01:00
Emil Velikov	9a7fd2954f	st/dri/drm: Add a second libdridrm library Will be used to create the single dri target library, on our way to convert all the dri targets during the conversion to static/shared pipe-drivers. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-02 10:52:52 +01:00
Emil Velikov	a66dd60547	st/dri: Allow separate dri-targets With this commit we add a couple of DEFINES making the ST code conditional, in a way that we can use it to gradually convert the dri-targets from separate libraries into a single one. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-02 10:52:52 +01:00
Emil Velikov	98204ea7d0	targets/dri-swrast: use drm aware dricommon when building more than swrast Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-07-02 10:52:52 +01:00
Ilia Mirkin	e1432489c0	docs: update hw-dependent bits of ARB_gpu_shader5 Some of the features are completely implemented by core, while others have hardware dependencies. Create a list of drivers supporting each sub-feature that must have hw support. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-07-01 20:10:09 -04:00
Ilia Mirkin	27ee7df8ad	nvc0: add missed PIPE_CAP_DRAW_INDIRECT Real support will be forthcoming. For now, avoid the unknown cap error and compiler warning. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-01 20:08:36 -04:00
Roland Scheidegger	a7ee842acd	llvmpipe: get rid of llvmpipe_get_texture_tile_linear Because the layout is always linear this didn't really do much any longer - at some point this triggered per-tile swizzled->linear conversion. The x/y coords were ignored too. Apart from triggering conversion, this also invoked alloc_image_data(), which could only actually trigger mapping of display target resources. So, instead just call resource_map in the callers (which also gives the ability to unmap again). Note that mapping/unmapping of display target resources still isn't really all that clean (map/unmap may be unmatched, and all such mappings use the same pointer thus usage flags are a lie). Reviewed-by: Brian Paul <brianp@vmware.com>	2014-07-02 01:55:59 +02:00
Roland Scheidegger	a4d0758d9d	llvmpipe: get rid of llvmpipe_get_texture_image The only caller left used it only for non display target textures, hence it was really the same as llvmpipe_get_texture_image_address - it also had a usage flag but this was ignored anyway. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-07-02 01:55:59 +02:00
Roland Scheidegger	aa1ab8173d	llvmpipe: get rid of llvmpipe_get_texture_image_all Once used for invoking swizzled->linear conversion for all needed images. But we now have a single allocation for all images in a resource, thus looping through all slices is rather pointless, conversion doesn't happen either. Also simplify the sampling setup code to use the mip_offsets array in the resource directly - if the (non display target) resource exists its memory will already be allocated as well. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-07-02 01:55:59 +02:00
Roland Scheidegger	90abdc1541	llvmpipe: allocate regular texture memory upfront The deferred allocation doesn't really make much sense anymore, since we no longer allocate swizzled/linear memory in chunks and not per level / slice either. This means we could fail resource creation a bit more (could already fail in theory anyway) but should not fail maps later (right now, callers can't really deal with either). Reviewed-by: Brian Paul <brianp@vmware.com>	2014-07-02 01:55:59 +02:00
Roland Scheidegger	7e1521f191	llvmpipe: get rid of linear_img struct Just use a tex_data pointer directly - the description was no longer correct either. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-07-02 01:55:59 +02:00
Roland Scheidegger	b4c3246e7b	llvmpipe: (trivial) rename linear_mip_offsets to mip_offsets Since switching to non-swizzled rendering we only have "normal", aka linear, offsets. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-07-02 01:55:59 +02:00
Roland Scheidegger	188ba1d6ec	target-helpers: don't use designated initializers It looks like since `ce1a137228` they are now included in more places, in particular even for things buildable with msvc, and hence those break the build. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-07-02 01:55:59 +02:00
Christoph Bumiller	b97b87940b	st/mesa: add support for indirect drawing	2014-07-02 00:47:10 +02:00
Marek Olšák	59330f13b0	gallium/u_vbuf: get draw info from an indirect buffer if there's any This is required for fallbacks to work with ARB_draw_indirect.	2014-07-02 00:47:10 +02:00
Christoph Bumiller	bc198f8e63	gallium: add facilities for indirect drawing v2: Added comments to util_draw_indirect, clarified and fixed map size. Removed unlikely().	2014-07-02 00:47:09 +02:00
Christoph Bumiller	a27b3582a6	gallium: add PIPE_BIND_COMMAND_ARGS_BUFFER Intended for use with GL_ARB_draw_indirect's DRAW_INDIRECT_BUFFER target or for D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS.	2014-07-02 00:47:09 +02:00
Dave Airlie	8392179fcc	xmlconfig/dri: bool -> unsigned char Drop stdbool, due to the X server being a pain and having struct members called bool, although I've sent a patch to fix that we should retain stupidity here. Use unsigned char which is what GLboolean is anyways. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-07-02 08:24:05 +10:00
Cody Northrop	78121e4b8d	i965/fs: Update discard jump to preserve uniform loads via sampler. Commit `17c7ead7` exposed a bug in how uniform loading happens in the presence of discard. It manifested itself in an application as randomly incorrect pixels on the borders of conditional areas. This is due to how discards jump to the end of the shader incorrectly for some channels. The current implementation checks each 2x2 subspan to preserve derivatives. When uniform loading via samplers was turned on, it uses a full execution mask, as stated in lower_uniform_pull_constant_loads(), and only populates four channels of the destination (see generate_uniform_pull_constant_load_gen7()). It happens incorrectly when the first subspan has been jumped over. The series that implemented this optimization was done before the changes to use samplers for uniform loads. Uniform sampler loads use special execution masks and only populate four channels, so we can't jump over those or corruption ensues. This fix only jumps to the end of the shader if all relevant channels are disabled, i.e. all 8 or 16, depending on dispatch. This preserves the original GLbenchmark 2.7 speedup noted in commit `beafced2`. It changes the shader assembly accordingly: before : (-f0.1.any4h) halt(8) 17 2 null { align1 WE_all 1Q }; after(8) : (-f0.1.any8h) halt(8) 17 2 null { align1 WE_all 1Q }; after(16): (-f0.1.any16h) halt(16) 17 2 null { align1 WE_all 1H }; v2: Cleaned up comments and conditional ordering. v3: Fix typo. Signed-off-by: Cody Northrop <cody@lunarg.com> Reviewed-by: Mike Stroyan <mike@lunarg.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79948	2014-07-01 13:22:28 -07:00
Matt Turner	fcac7020cf	i965/fs: Mark case unreachable to silence warning. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:52 -07:00
Matt Turner	3d826729da	i965: Use unreachable() instead of unconditional assert(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:52 -07:00
Matt Turner	a3d10c2c30	mesa: Make unreachable macro take a string argument. To aid in debugging. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:52 -07:00
Matt Turner	e658440234	i965/vec4: Remove useless conditionals. Setting a couple of bits is the same cost or less as conditionally setting a couple of bits.	2014-07-01 08:55:52 -07:00
Matt Turner	2e90d1fb62	i965/fs: Pass cfg to calculate_live_intervals(). We've often created the CFG immediately before, so use it when available. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:52 -07:00
Matt Turner	ec1b2d6aa0	i965: Mark fields in the live interval classes protected. cfg, for instance, is a pointer to a local variable in calculate_live_intervals, certainly not valid after that function has returned. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:52 -07:00
Matt Turner	021094481c	glsl: Remove now unused foreach_list* macros. foreach_list_typed_const was never used as far as I can tell. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:52 -07:00
Matt Turner	266109736a	i965: Use typed foreach_in_list_safe instead of foreach_list_safe. Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:51 -07:00
Matt Turner	c5030ac0ac	i965: Use typed foreach_in_list instead of foreach_list. Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:51 -07:00
Matt Turner	bc2fbbafd2	i965: Add and use foreach_inst_in_block macros. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:51 -07:00
Matt Turner	e8e5f0a342	i965/fs: Use is_head_sentinel() instead of ->prev == NULL. Makes it more clear what we're doing and requires less knowledge of exec_list. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:51 -07:00
Matt Turner	d6bb8bb7ce	mesa: Add and use foreach_list_typed_safe. Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:51 -07:00
Matt Turner	22cd917329	mesa: Add and use foreach_in_list_use_after. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:51 -07:00
Matt Turner	d49173a97b	glsl: Replace uses of foreach_list_const. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:51 -07:00
Matt Turner	fd8f65498a	glsl: Replace another couple uses of foreach_list. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:51 -07:00
Matt Turner	6e217ad1d7	glsl: Use foreach_list_typed when possible. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:51 -07:00
Matt Turner	373824d769	mesa: Use typed foreach_in_list_safe instead of foreach_list_safe. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:51 -07:00
Matt Turner	c6a16f6d0e	glsl: Use typed foreach_in_list_safe instead of foreach_list_safe. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:51 -07:00
Matt Turner	e0cb82d0c4	mesa: Use typed foreach_in_list instead of foreach_list. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:51 -07:00
Matt Turner	4d78446d78	glsl: Use typed foreach_in_list instead of foreach_list. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:51 -07:00
Matt Turner	da9f0316e6	glsl: Add typed foreach_in_list_safe macro. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:51 -07:00
Matt Turner	3597681040	glsl: Add typed foreach_in_list/_reverse macros. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-07-01 08:55:51 -07:00
Axel Davy	4d6c9352f3	mesa: fix the condition in src/loader/Makefile.am We want to have the dri common files compiled to define USE_DRICONF. We need to check both NEED_OPENGL_COMMON and HAVE_DRICOMMON Signed-off-by: Axel Davy <axel.davy@ens.fr> Tested-by: Brian Paul <brianp@vmware.com>	2014-07-01 09:42:44 -06:00
Brian Paul	ad6e1e12cc	mesa: update comment for UniformBufferSize to indicate size is in bytes Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-07-01 09:42:44 -06:00
Brian Paul	f4b0ab7afd	st/mesa: fix incorrect size of UBO declarations UniformBufferSize is in bytes so we need to divide by 16 to get the number of constant buffer slots. Also, the ureg_DECL_constant2D() function takes first..last parameters so we need to subtract one for the last value. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-07-01 09:42:44 -06:00
Brian Paul	01bf8bb875	st/mesa: don't use address register for constant-indexed ir_binop_ubo_load Before, we were always using the address register and indirect addressing to index into a UBO constant buffer. With this change we only do that when necessary. Using the piglit bin/arb_uniform_buffer_object-rendering test as an example: Shader code: uniform ub_rot {float rotation; }; ... m[1][1] = cos(rotation); Before: IMM[1] INT32 {0, 1, 0, 0} 1: UARL ADDR[0].x, IMM[1].xxxx 2: MOV TEMP[0].x, CONST[3][ADDR[0].x].xxxx 3: COS TEMP[1].x, TEMP[0].xxxx After: 0: COS TEMP[0].x, CONST[3][0].xxxx Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-07-01 09:42:44 -06:00
Brian Paul	dfca35f807	st/mesa: allow 2D indexing for all shader types in translate_src() Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-07-01 09:42:44 -06:00
Brian Paul	f11e3dc122	st/mesa: don't ignore const buf index in src_register() Otherwise, if we were creating a const buffer src register for a UBO the index into the UBO was always zero. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-07-01 09:42:44 -06:00
Ilia Mirkin	5e04526399	nvc0: expose 4 vertex streams, use stream ids in xfb Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-01 11:34:40 -04:00
Ilia Mirkin	2f2467cb23	nvc0/ir: only merge emit/restart for identical streams Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-01 11:34:40 -04:00
Ilia Mirkin	e5cdbdecd2	nvc0/ir: avoid creating restarts with non-0 stream Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-01 11:34:40 -04:00
Ilia Mirkin	40b8aec251	nvc0/ir: fix emitting vertex stream Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-07-01 11:34:40 -04:00
Ilia Mirkin	1d16dbf416	mesa/st: add vertex stream support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-07-01 11:34:37 -04:00
Ilia Mirkin	746e5260f6	gallium: add a cap for max vertex streams Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-07-01 11:34:35 -04:00
Ilia Mirkin	43e4b3e311	gallium: add an index argument to create_query Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-07-01 11:34:31 -04:00
Ilia Mirkin	7f1b365f65	gallium: add support for stream in so info Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-07-01 11:34:28 -04:00
Ilia Mirkin	0cbefc1bea	gallium: add vertex stream argument to EMIT/ENDPRIM Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-07-01 11:34:24 -04:00
Matt Turner	1bfc0a1102	i965/fs: Mark predicated PLN instructions with dependency hints. To implement the unlit_centroid_workaround, previously we emitted (+f0) pln(8) g20<1>F g16.4<0,1,0>F g4<8,8,1>F { align1 1Q }; (-f0) pln(8) g20<1>F g16.4<0,1,0>F g2<8,8,1>F { align1 1Q }; where the flag register contains the channel enable bits from g0. Since the predicates are complementary, the pair of pln instructions write to non-overlapping components of the destination, which is the case that the dependency control hints are designed for. Typically setting dependency control hints on predicated instructions isn't safe (if an instruction doesn't execute due to the predicate, it won't update the scoreboard, leaving it in a bad state) but since we must have at least one channel executing (i.e., +f0 is true for some channel) by virtue of the fact that the thread is running, we can put the +f0 pln instruction last and set the hints: (-f0) pln(8) g20<1>F g16.4<0,1,0>F g2<8,8,1>F { align1 NoDDClr 1Q }; (+f0) pln(8) g20<1>F g16.4<0,1,0>F g4<8,8,1>F { align1 NoDDChk 1Q }; Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 22:31:06 -07:00
Matt Turner	4fe53ee5d7	i965/fs: Predicate PLN instructions used in unlit centroid WA. Maybe lets us skip some PLN instructions if whole subspans are disabled? Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 22:31:05 -07:00
Matt Turner	6d2536395d	i965/fs: Add no_dd_{clear,check} fields to fs_inst. And plumb them through. Also make the assert in the generator look like the vec4 one. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 22:31:05 -07:00
Matt Turner	bcbb7c41b7	i965/fs: Let sat-prop ignore live ranges if producer already has sat. This sequence (where both x and w are used afterwards) wasn't handled. mul.sat x, y, z ... mov.sat w, x We assumed that if x was used after the mov.sat, that we couldn't propagate the saturate modifier, but in fact x was already saturated. So ignore the live range check if the producing instruction already saturates its result. Cuts one instruction from hundreds of TF2 shaders. total instructions in shared programs: 1995631 -> 1994951 (-0.03%) instructions in affected programs: 155248 -> 154568 (-0.44%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-30 22:31:05 -07:00
Matt Turner	e58992aedd	i965/fs: Pass const references to emit functions. Cuts 10k of .text and saves a bunch of useless struct copies.	2014-06-30 22:31:05 -07:00
Matt Turner	35b741c8e7	i965/vec4: Pass const references to instruction functions. text data bss dec hex filename 4231165 123200 39648 4394013 430c1d i965_dri.so 4186277 123200 39648 4349125 425cc5 i965_dri.so Cuts 43k of .text and saves a bunch of useless struct copies. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-30 22:31:05 -07:00
Matt Turner	d35f34cea9	i965/vec4: Pass const references to vec4_instruction(). text data bss dec hex filename 4244821 123200 39648 4407669 434175 i965_dri.so 4231165 123200 39648 4394013 430c1d i965_dri.so Cuts 13k of .text and saves a bunch of useless struct copies. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-30 22:31:05 -07:00
Matt Turner	e4b05af5d4	i965/fs: Pass const references to instruction functions. text data bss dec hex filename 4270747 123200 39648 4433595 43a6bb i965_dri.so 4244821 123200 39648 4407669 434175 i965_dri.so Cuts 25k of .text and saves a bunch of useless struct copies. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-30 22:31:05 -07:00
Axel Davy	5d5c20920e	radeonsi: Use dma_copy when possible for si_blit. This improves GLX DRI3 GPU offloading significantly on CPU bound benchmarks particularly. No performance impact for DRI2 GPU offloading. v2: Add missing tests Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-07-01 13:10:01 +10:00
Axel Davy	9320c8fea9	glx/dri3: add GPU offloading support. The differences with DRI2 GPU offloading are: a) There's no logic for GPU offloading needed in the Xserver b) for DRI2, the card would render to a back buffer, and the content would be copied to the front buffer (the same buffers every time). Here we can potentially use several back buffers and copy to buffers with no tiling to share with X. We send them with the Present extension. That means that the DRI2 solution is forced to have tearing with GPU offloading. In the ideal scenario, this DRI3 solution doesn't have this problem. However without dma-buf fences, a race can appear (if the card is slow and the rendering hasn't finished before the server card reads the buffer), and then old content is displayed. If a user hits this, he should probably revert to the DRI2 solution (LIBGL_DRI3_DISABLE). Users with cards fast enough seem to not hit this in practice (I have an Amd hd 7730m, and I don't hit this, except if I force a low dpm mode) c) for non-fullscreen apps, the DRI2 GPU offloading solution requires compositing. This DRI3 solution doesn't have this requirement. Rendering to a pixmap also works. d) There is no need to have a DDX loaded for the secondary card. V4: Fixes some piglit tests Signed-off-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-07-01 13:07:52 +10:00
Axel Davy	3ecd9e1a93	loader: Use drirc device_id parameter in complement to DRI_PRIME DRI_PRIME is not very handy, because you have to launch the executable with it set, which is not always easy to do. By using drirc, the user specifies the target executable and the device to use. After that the program will be launched every time on the target device. For example if .drirc contains: <driconf> <device driver="loader"> <application name="Glmark2" executable="glmark2"> <option name="device_id" value="pci-0000_01_00_0" /> </application> </device> </driconf> Then glmark2 will use if possible the render-node of ID_PATH_TAG pci-0000_01_00_0. v2: Fix compilation issue v3: Add "-lm" and rebase. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-07-01 13:07:40 +10:00
Axel Davy	7ab925a6aa	loader: add gpu selection code via DRI_PRIME. v2: Fix the leak of device_name v3: Rebased It enables using the DRI_PRIME env var to specify which gpu to use. Two syntaxes are supported: If DRI_PRIME is 1 it means: take any other gpu than the default one. If DRI_PRIME is the ID_PATH_TAG of a device: choose this device if possible. The ID_PATH_TAG is a tag filled by udev. You can check it with 'udevadm info' on the device node. For example it can be "pci-0000_01_00_0". Render-nodes need to be enabled to choose another gpu, and they need to have the ID_PATH_TAG advertised. It is possible with older versions of udev that the tag is not advertised for render-nodes; then one needs to add a file containing: SUBSYSTEM=="drm", IMPORT{builtin}="path_id" in /etc/udev/rules.d/ Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-07-01 13:07:30 +10:00
Axel Davy	da3a47d682	drirc: Add string support Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-07-01 13:06:51 +10:00
Dave Airlie	29800e6a3e	dri: remove GL types from config queries This in theory changes ABI for the boolean->bool I think, but nothing in the tree uses configQueryb AFAICS. Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-07-01 13:06:29 +10:00
Dave Airlie	a513daec29	dri/xmlconfig: remove GL types. This just drops all the GL types from the xmlconfig and use std C types from stdint and stdbool. v2: drop further double and header include. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-07-01 13:03:06 +10:00
Dave Airlie	b94dc944df	dri3: cache pointer to back instead of looking up. This is just prep work for the dri3 prime patches. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-07-01 13:00:14 +10:00
Alexandre Demers	11a879f260	configure.ac: (trivial) Fixing a typo Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-30 22:50:00 +01:00
Emil Velikov	ce1a137228	targets/egl-static: use inline_drm_helper and Automake.inc helpers Update all three build systems, and add freedreno to the android build. Pending future work on the ST we can convert egl-static to provide either static or dynamic access to the pipe-drivers. There is no functional change with this patch. v2: Don't add freedreno to android build, drop the wrapper winsys. Cc: Chia-I Wu <olv@lunarg.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-30 22:27:12 +01:00
Emil Velikov	7689aa28cd	targets/gbm: convert to static/shared pipe-driver Move the gbm "target" code to the state-tracker, similar to other - dri, omx, vdpau... ST. v2: Drop inclusion of the wrapper winsys and softpipe/llvmpipe. Cc: Chia-I Wu <olv@lunarg.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-30 22:27:11 +01:00
Emil Velikov	37e640a073	targets/xa: provide alternative(static) xa target Now we can build the xa target (libxatracker) with either static pipe-drivers or shared ones. Currently we default to static. - Remove the unused CFLAGS/CPPFLAGS. - Use GALLIUM_TARGET_CFLAGS where applicable. v2: Update the printout messages at configure. v3: Drop inclusion of the wrapper winsys and softpipe/llvmpipe. Cc: Jakob Bornecrantz <jakob@vmware.com> Cc: Rob Clark <robclark@freedesktop.org> Cc: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-30 22:27:11 +01:00
Kenneth Graunke	c60a4ba7e3	i965/disasm: Fix INTEL_DEBUG=fs on Broadwell for ARB_fp applications. Apparently INTEL_DEBUG=fs has crashed on Broadwell for anything using ARB_fragment_program since commit `9cee3ff5`. We need to NULL-check the right field. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-30 14:06:51 -07:00
Kenneth Graunke	5dfbfd17e0	i965/disasm: Delete gen8_disasm.c. The functionality has been merged into brw_disasm.c; use that instead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:28 -07:00
Kenneth Graunke	e59a9ecc98	i965/disasm: Stop using gen8_disassemble in favor of brw_disassemble. At this point, brw_disassemble can do everything gen8_disassemble can do - and, thanks to the new brw_inst API, it supports all generations. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:28 -07:00
Kenneth Graunke	7b7f95b952	i965/disasm: Improve render target write message disassembly. Previously, we decoded render target write messages as: render ( RT write, 0, 16, 12, 0) mlen 8 rlen 0 which made you remember (or look up) what the numbers meant: 1. The binding table index 2. The raw message control, undecoded: - Last Render Target Select - Slot Group Select - Message Type (SIMD8, normal SIMD16, SIMD16 replicate data, ...) 3. The dataport message type, again (already decoded as "RT write") 4. The write commit bit (0 or 1) Needless to say, having to decipher that yourself is annoying. Now, we do: render RT write SIMD16 LastRT Surface = 0 mlen 8 rlen 0 with optional "Hi" and "WriteCommit" for slot group/write commit. Thanks to the new brw_inst API, we can also stop duplicating code on a per-generation basis. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:28 -07:00
Kenneth Graunke	0e5b52e35d	i965/disasm: Rename msg_target to SFID. We haven't used the name "message target" in a while - there are a lot of things called "target", and it gets confusing. SFID ("Shared Function ID") is the term commonly used in the modern documentation. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:28 -07:00
Kenneth Graunke	c4cf088f43	i965/disasm: Fix typo in RT UNORM write message. The name of this message is "Render Target UNORM Write" (Sandybridge PRM, Volume 4 Part 1, Page 210). Drop the bogus 'c'. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:28 -07:00
Kenneth Graunke	3603dfff6f	i965/disasm: Use Gen6+ SFID case labels. Most developers will recognize the Gen6+ SFID names more quickly than the Gen4-5 ones. Given that they're the same values, just use the new names. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:28 -07:00
Kenneth Graunke	4fe78f4cc2	i965/disasm: "Handle" Gen8+ HF/DF immediate cases. We should print something properly, but I'm not sure how to properly print an HF, and we don't have any DFs today to test with. This is at least better than the current Gen8 disassembler, which would simply assert fail. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-30 14:05:28 -07:00
Kenneth Graunke	f36bebcd5c	i965/disasm: Cut piles of duplicate swizzle printing. Making a helper function saves us from cut and pasting this four times. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:28 -07:00
Kenneth Graunke	bdcbcc73dd	i965/disasm: Properly decode negate source modifiers on Broadwell. This is a port of Abdiel's `6f9f916b9b` to brw_disasm.c. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:28 -07:00
Kenneth Graunke	00b72bbab5	i965/disasm: Improve disassembly of atomic messages on Haswell+. This backports the atomic message disassembly support from gen8_disasm.c, which additionally offers support for decoding atomic surface read/write messages, and showing SIMD modes and other details. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:28 -07:00
Kenneth Graunke	eb3185f686	i965/disasm: Actually disassemble Gen7+ URB opcodes. I never bothered implementing the disassembler for Gen7+ URB opcodes, so we were just disassembling them as Ironlake/Sandybridge ones. This looked pretty bad when running Paul's GS EndPrimitive tests, as the "write OWord" message was decoded as ff_sync, which doesn't exist. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:28 -07:00
Kenneth Graunke	aa9e23dbe8	i965/disasm: Decode Broadwell's invm/rsqrtm math functions. We don't use these yet, but we may as well disassemble them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:28 -07:00
Kenneth Graunke	9a91f92596	i965/disasm: Properly disassemble the "atomic" ThreadCtrl value. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:28 -07:00
Kenneth Graunke	156c73a899	i965/disasm: Properly disassemble all32h/any32h align1 predicates. While we're adding things, use symbolic constants rather than magic numbers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:28 -07:00
Kenneth Graunke	03084453d7	i965: Add #defines for any32h/all32h predication. These have existed since Ivybridge. We don't use them today, but the Gen8+ disassembler supports them, and I'd like to use symbolic names rather than magic numbers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:27 -07:00
Kenneth Graunke	707c42cb96	i965/disasm: Mark ELSE as having UIP on Gen8+. This makes brw_disasm.c able to disassemble ELSE instructions correctly on Broadwell. (gen8_disasm.c already handles this correctly.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:27 -07:00
Kenneth Graunke	df4eeed0be	i965/disasm: Properly disassemble jump targets on Gen4-5. Previously, our disassembly for flow control instructions looked like: 0x00000040: else(8) ip 65540D { align16 switch }; It didn't print InstCount properly for ELSE/ENDIF, and didn't even attempt to disassemble PopCount. Now it looks like: 0x00000040: else(8) Jump: 4 Pop: 1 { align16 switch }; which is much more readable. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:27 -07:00
Kenneth Graunke	6928959d8e	i965/disasm: Improve disassembly of jump targets on Gen6+. Previously, flow control instructions generated output like: (+f0) if(8) 12 8 null 0x000c0008UD { align16 WE_normal 1Q }; which included a disassembly of the register fields, even though those are meaningless for flow control instructions---those bits are reused for another purpose. It also wasn't immediately obvious which number was UIP and which was JIP. With this patch, we instead output: (+f0) if(8) JIP: 8 UIP: 12 { align16 WE_normal 1Q }; which is much clearer. The patch also introduces has_uip/has_jip helper functions which clear up some generation/opcode checking mess. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:27 -07:00
Kenneth Graunke	6497890bf4	i965/disasm: Add support for new Gen8+ register types. While we're at it, use proper names rather than magic numbers for the existing fields. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:27 -07:00
Kenneth Graunke	5f106b03a9	i965: Restyle brw_disasm.c. brw_disasm.c basically wasn't following the Mesa coding style at all. It used 4-space indent instead of 3-space, didn't cuddle braces, didn't put function return types on a separate line, put extra spaces in function calls (between the name and parenthesis), and a number of other things. This made it fairly obnoxious to work on, since my editor is configured to follow Mesa style in the Mesa source repository. Fixing it to follow a consistent style now should save time dealing with it later. These modifications were originally generated by: $ indent -br -i3 -npcs -ce -cs -l80 --no-tabs with some manual changes afterwards to fit our style better. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:27 -07:00
Kenneth Graunke	5e20e9a830	i965/disasm: Create an "opcode" temporary. This saves typing brw_inst_opcode(brw, inst) everywhere. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:27 -07:00
Kenneth Graunke	3d1992754f	i965/disasm: Eliminate opcode pointer. opcode is just a pointer to opcode_descs; we may as well use that directly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-30 14:05:27 -07:00
Jason Ekstrand	4000c0112a	Remove the ATI_envmap_bumpmap extension As far as I can tell, the Intel mesa driver is the only driver in the world still supporting this legacy extension. If someone wants to do bump mapping, they can use shaders. Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1] Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v2] Reviewed-by: Ian Romanick <idr@freedesktop.org> [v3]	2014-06-30 12:02:25 -07:00
Kenneth Graunke	7577cdd830	meta: Use AMD_vertex_shader_layer instead of a GS for layered clears. On i965, enabling and disabling the GS is not free: you have to do a full pipeline stall, reconfigure the URB and push constant space, and emit a bunch of state. Most clears aren't layered, so the GS isn't needed in the common case. But we turned it on universally. Using AMD_vertex_shader_layer allows us to skip setting up the GS altogether, while achieving the same effect. According to Ilia, current nVidia GPUs can't do AMD_vertex_shader_layer. However, since nouveau is Gallium-based, they're unlikely to ever care about this path. Intel and AMD GPUs both support the extension. Since i965 is the only driver using this path which does layered rendering, we may as well target it at that. v2: Improve commit message. No code changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-30 00:08:54 -07:00
Samuel Iglesias Gonsalvez	f3c5b2f7d0	docs: mark "Geometry shader multiple streams" as done for i965 Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-30 08:08:50 +02:00
Iago Toral Quiroga	5b3492fa3f	i965: Enable vertex streams up to MAX_VERTEX_STREAMS. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-30 08:08:50 +02:00
Iago Toral Quiroga	0b84fa2c52	mesa: Enable simultaneous queries on different streams. It should be possible to query the number of primitives written to each individual stream by a geometry shader in a single draw call. For that we need to have up to MAX_VERTEX_STREAM separate query objects. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-30 08:08:50 +02:00
Iago Toral Quiroga	3178d2474a	i965: Implement GL_PRIMITIVES_GENERATED with non-zero streams. So far we have been using CL_INVOCATION_COUNT to resolve this query, but this is no good with streams, as only stream 0 reaches the clipping stage. Instead we will use SO_PRIM_STORAGE_NEEDED, which can keep track of the primitives sent to each individual stream. Since SO_PRIM_STORAGE_NEEDED is related to the SOL stage, and according to ARB_transform_feedback3 we need to be able to query primitives generated in each stream whether transform feedback is active or not, we enable the SOL unit even if transform feedback is not active, but disable all output buffers in that case. This effectively disables transform feedback but permits activation of statistics enabling SO_PRIM_STORAGE_NEEDED even when transform feedback is not active. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-30 08:08:50 +02:00
Iago Toral Quiroga	a374685f09	i965: Implement GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN with non-zero streams. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-30 08:08:50 +02:00
Iago Toral Quiroga	ecd9960430	mesa: Include stream information in indexed queries. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-30 08:08:50 +02:00
Samuel Iglesias Gonsalvez	0e58a3ef2a	glsl: include streamId when reading/printing ir_variable IR. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-30 08:08:50 +02:00
Iago Toral Quiroga	a16043ba57	glsl: include streamId when reading/printing emit-vertex and end-primitive IR. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-30 08:08:50 +02:00
Iago Toral Quiroga	5d562588a5	i965/gs: Set control data bits for vertices emitted in stream mode. In stream mode we have to set control data bits with the StreamID information for every vertex. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-30 08:08:50 +02:00
Iago Toral Quiroga	7589683c97	glsl: Validate vertex emission in geometry shaders. Check if non-zero streams are used. Fail to link if emitting to unsupported streams or emitting to non-zero streams with output type other than GL_POINTS. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-30 08:08:50 +02:00
Iago Toral Quiroga	e877aadde0	glsl: Add support for EmitStreamVertex() and EndStreamPrimitive(). Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-30 08:08:50 +02:00
Iago Toral Quiroga	4b3fc21032	glsl: Modify ir_end_primitive to have a stream. This will be necessary to implement EndStreamPrimitive(). EndPrimitive() will produce an ir_end_primitive with the default stream 0. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-30 08:08:50 +02:00
Iago Toral Quiroga	8639effefe	glsl: Modify ir_emit_vertex to have a stream. This will be necessary to implement EmitStreamVertex(). EmitVertex() will produce an ir_emit_vertex with the default stream 0. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-30 08:08:50 +02:00
Iago Toral Quiroga	9650293b51	i965/gs: Set number of control data bits for stream mode. If the geometry shader is indeed using streams then we need 2 control data bits per vertex for the StreamID. If the shader is not using streams then we don't need control data bits. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-30 08:08:50 +02:00
Iago Toral Quiroga	6d3632c9c9	glsl: Store info about geometry shaders that emit vertices to non-zero streams. On Intel hardware when a geometry shader outputs GL_POINTS primitives we only need to emit vertex control bits if it emits vertices to non-zero streams, so use a flag to track this. This flag will be set to TRUE when a geometry shader calls EmitStreamVertex() or EndStreamPrimitive() with a non-zero stream parameter in a later patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-30 08:08:50 +02:00
Iago Toral Quiroga	598c2e2c83	glsl: Only geometry shader outputs can be associated with non-zero streams. This should be ensured by the parser, so assert on that. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-30 08:08:50 +02:00
Iago Toral Quiroga	e2dd717616	glsl: Two varyings can't write to the same buffer from different streams. If this is detected, fail to link. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-30 08:08:50 +02:00
Iago Toral Quiroga	1e1f071d25	glsl: Add methods to retrieve a varying's name and streamId. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-30 08:08:49 +02:00
Iago Toral Quiroga	02fd80e160	glsl: Fail to link if inter-stage input/outputs are not assigned to stream 0 Outputs that are linked to inputs in the next stage must be output to stream 0, otherwise we should fail to link. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-30 08:08:49 +02:00
Iago Toral Quiroga	b908e85ed3	glsl: Assign GLSL StreamIds to transform feedback outputs. Inter-shader outputs must be on stream 0, which is the default. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-30 08:08:49 +02:00
Iago Toral Quiroga	37d795317e	i965: Enable transform feedback for streams > 0 Configure hardware to read vertex data for all streams and have all streams write their varyings to the corresponding output buffers. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-30 08:08:49 +02:00
Iago Toral Quiroga	f20c723039	mesa: add StreamId information to transform feedback outputs. For now initialized to the default stream 0. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-30 08:08:49 +02:00
Samuel Iglesias Gonsalvez	a7e6ec6898	glsl: Add parsing support for multi-stream output in geometry shaders. This implements parsing requirements for multi-stream support in geometry shaders as defined in ARB_gpu_shader5. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-30 08:08:49 +02:00
Emil Velikov	15b5e663b0	st/omx: strcpy the string into the allocated buffer This fixes commit a001ca98e15(st/omx: keep the name, (name\|role)_specific strings dynamically allocated) in which we dynamically allocated the buffers for name and (name\|role)_specific yet forgot to copy the encoder strings into them. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80614 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-28 15:24:45 +01:00
Ilia Mirkin	f230015206	mesa: expose ARB_seamless_cubemap_per_texture when supported All of the bits appear to already be in place to support this in the sampler (which the original AMD version didn't allow). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-28 00:51:43 -04:00
Emil Velikov	a001ca98e1	st/omx: keep the name, (name\|role)_specific strings dynamically allocated ... as its caller (the external program omxregister-bellagio) is the one who frees all of the allocated memory. Reported-by: Pedretti Fabio <pedretti.fabio@gmail.com> Tested-by: Fabio Pedretti <pedretti.fabio@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-06-27 19:13:31 +01:00
Chris Forbes	ed66312426	docs: Update the status of a few things in GL3.txt Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-27 22:13:21 +12:00
Axel Davy	c58486516f	nv50: fix dri3 prime buffer creation This is the same fix as "nvc0: fix dri3 prime buffer creation" Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-27 13:38:20 +10:00
Dave Airlie	13eddf3bf2	nvc0: fix dri3 prime buffer creation We need to place shared buffers into GART. Reviewed-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-27 13:38:14 +10:00
Axel Davy	df282ce1bf	gallium/dri2: implement blitImage V3: call flush_resource before flush V4: Add new flags Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-27 11:39:34 +10:00
Axel Davy	8a66a5de83	dri/image: add blitImage to the specification It allows blitting between two __DRIimages. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-27 11:39:34 +10:00
Axel Davy	27c686309e	gallium: Add __DRIimageDriverExtension support to gallium __DRIimageDriverExtension is used by GLX DRI3 and Wayland. This patch is a rewrite of http://lists.freedesktop.org/archives/mesa-dev/2014-May/060318.html and http://lists.freedesktop.org/archives/mesa-dev/2014-May/060317.html Previous patches were: Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-27 11:39:34 +10:00
Axel Davy	e40cf256f4	dri3: use invalidate. This doesn't change anything to the intel DRI3 implementation, but enables the gallium implementation to use dri2.stamp instead of relying on the stamp shared with the st backend. Signed-off-by: Axel Davy <axel.davy@ens.fr> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-27 11:39:34 +10:00
Dave Airlie	e4419913bf	dri3: fix image extension checking. Move the image extension setup in with all the others in bind_extensions, and improve the check to both version and function pointer. Reviewed-by: Axel Davy <axel.davy@ens.fr> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-27 11:39:34 +10:00
Jasper St. Pierre	b4dcf87f34	glxext: Send the Drawable's ID in the GLX_BufferSwapComplete event While the official INTEL_swap_event specification says that the drawable field should contain the GLXDrawable, not the Drawable, the existing DRI2 code in dri2.c that translates from DRI2_BufferSwapComplete sends out GLX_BufferSwapComplete with the Drawable's ID, so existing codebases like Clutter/Cogl rely on getting the Drawable. Match DRI2's error here and stuff the event with the X Drawable, not the GLX drawable. This fixes apps seeing wrong drawables through an indirect GLX context or with DRI3, which uses the GLX_BufferSwapComplete event directly on the wire instead of translating Present in mesa. At the same time, also modify the structure for the event to make sure that clients don't make the same mistake. This is not an API or ABI break, as GLXDrawable and Drawable are both typedefs for XID. Signed-off-by: Jasper St. Pierre <jstpierre@mecheye.net> Reviewed-by: Axel Davy <axel.davy@ens.fr> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-27 09:44:56 +10:00
Kenneth Graunke	8cf289c3ef	i965: Enable compressed multisample support (CMS) on Broadwell. Everything is in place and appears to be working. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-26 11:50:35 -07:00
Kenneth Graunke	db184d43b0	i965: Add 2x MSAA support to the MCS allocation function. 2x MSAA also uses 8 bits, just like 4x. More bits are unused. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-26 11:50:34 -07:00
Kenneth Graunke	a248b2a4eb	i965: Hook up the MCS buffers in SURFACE_STATE on Broadwell. MCS buffers are never allocated on Broadwell, so this does nothing for now, but puts the infrastructure in place for when they do exist. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2014-06-26 11:50:34 -07:00
Kenneth Graunke	e10311be9f	i965: Drop SINT workaround for CMS layout on Broadwell. According to the documentation, we don't need this SINT workaround on Broadwell. (Or at least, it doesn't mention that we need it.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2014-06-26 11:50:34 -07:00
Kenneth Graunke	fd77187689	i965: Add plumbing for Broadwell's auxiliary surface support. Broadwell generalizes the MCS fields to allow for multiple kinds of auxiliary surfaces. This patch adds the plumbing to set those values, but doesn't yet hook any up. v2: (by Jordan Justen) Use mt for qpitch; pitch is tiles - 1. v3: Don't forget to subtract 1 from aux_mt->pitch. v4: Drop unnecessary aux_mt->offset (caught by Jordan Justen). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2014-06-26 11:50:34 -07:00
Jordan Justen	a46cb6a971	i965: Add auxiliary surface field #defines for Broadwell. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2014-06-26 11:50:34 -07:00
Kenneth Graunke	7c2946fc23	i965: Disassemble all of DP write message control bits on Gen6. Prior to the new brw_inst API, the brw_instruction structure split off bits 4 and 5 of msg_control for specific fields, and we failed to disassemble them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-26 11:46:26 -07:00
Matt Turner	40a9754953	i965: Pass brw to brw_try_compact_instruction(). Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 11:46:25 -07:00
Matt Turner	fa1a3b2e3c	i965: Add is_cherryview flag to brw_context. Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 11:46:24 -07:00
Matt Turner	a25401bc9a	i965: Add CSEL opcode definition for Gen8. Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 11:46:23 -07:00
Matt Turner	e1b477238d	i965: Document which instructions are generation specific. Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 11:46:21 -07:00
Matt Turner	a382b4cb7a	i965: Don't set UIP for ENDIF/WHILE. They don't have a UIP. We used UIP in an array dereference, which never caused problems on Gen < 8, since UIP was a small integer (number of instructions). On Gen 8 UIP is in bytes, so it's large enough that it caused us to read out of bounds of the array. Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 11:46:19 -07:00
Matt Turner	92233aee47	i965: Replace struct brw_compact_instruction with brw_compact_inst. Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 11:46:17 -07:00
Matt Turner	eaf78e56af	i965: Convert brw_eu_compact.c to the new brw_compact_inst API. Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 11:46:16 -07:00
Matt Turner	395c759712	i965: Introduce a new brw_compact_inst API. For now nothing uses this, but we can incrementally convert. Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 11:46:14 -07:00
Matt Turner	7c79608b5b	i965: Replace 'struct brw_instruction' with 'brw_inst'. Use this as an opportunity to clean up the formatting of some old code (brw_ADD, for instance). Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 11:46:12 -07:00
Matt Turner	290daad497	i965: Throw out guts of struct brw_instruction. Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 11:46:11 -07:00
Matt Turner	a375092f5c	i965: Convert brw_gs_emit.c to the new brw_inst API. Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 11:46:05 -07:00
Matt Turner	bfbe6a7210	i965: Convert brw_disasm.c to the new brw_inst API. v2: (by Kenneth Graunke) - Fix disassembly of Gen4-5 SEND messages to print base MRF correctly. - Only print URB opcode on Gen5+, to match previous output (besides, there is only one opcode AFAICT.) - Only print the low 3 bits of msg_control, to match previous output. (We probably should decode all the fields, but hadn't previously due to the brw_instruction structure definition splitting out bits 4/5 for last_render_target and slot_group_select.) - Fix 3-source MRF/GRF file decoding on Sandybridge. - Fix compression code to use qtr_control rather than cmpt_control (which is compaction, not compression). Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [v2] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 11:46:01 -07:00
Matt Turner	1149eedffc	i965: Pass brw rather than gen to brw_disassemble_inst(). We will need it in order to use the new brw_inst API. Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 11:45:58 -07:00
Matt Turner	9cbf899a7d	i965: Convert brw_eu_compact.c to the new brw_inst API. v2: Use brw_inst_bits rather than pulling out individual fields and reassembling them. Signed-off-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 11:45:50 -07:00
Kenneth Graunke	5e6818faa5	i965: Extend is_haswell checks to gen >= 8 in Gen4-7 generators. We're going to use fs_generator/vec4_generator for Gen8+ code soon, thanks to the new brw_instruction API. When we do, we'll generally want to take the Haswell paths on Gen8+ as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-26 11:45:47 -07:00
Kenneth Graunke	45cc9ddcc1	i965: Convert test_eu_compact.c to the new brw_inst API. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-26 11:45:46 -07:00
Kenneth Graunke	4362631d7b	i965: Convert vec4_generator to the new brw_inst API. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-26 11:45:44 -07:00
Kenneth Graunke	a041eb4030	i965: Convert fs_generator to the new brw_inst API. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-26 11:45:42 -07:00
Kenneth Graunke	eedc5bbc69	i965: Convert Gen4-5 clipping code to the new brw_inst API. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-26 11:45:40 -07:00
Kenneth Graunke	7213e1ddc7	i965: Convert brw_sf_emit.c to the new brw_inst API. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-26 11:45:38 -07:00
Kenneth Graunke	829aac4b67	i965: Convert brw_eu_emit.c to the new brw_inst API. v2: - Fix IF -> ELSE patching on Sandybridge. - Don't set base_mrf on Gen6+ in OWord Block Read functions. (Although the old code did this universally, it shouldn't have - the field doesn't exist on Gen6+ and just got overwritten by the SFID anyway.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-26 11:44:51 -07:00
Kenneth Graunke	607f5eb381	i965: Convert brw_eu.[ch] to use the new brw_inst API. v2: Don't set flag_reg_nr prior to Gen7 (as it doesn't exist). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-26 11:44:43 -07:00
Kenneth Graunke	d49a9ca8c2	i965: Introduce a new brw_inst API. This is similar to gen8_instruction, and will eventually replace it. For now nothing uses this, but we can incrementally convert. The new API takes the existing brw_instruction pointers to ease conversion; when done, we can simply drop the old structure and rename struct brw_instruction -> brw_inst. v2: (by Matt Turner) Make JIP/UIP functions take a signed argument. v3: (by Kenneth Graunke) - Make Gen4-6 jump target functions take a signed argument. - Fix indirect align1 AddrImm bits on Gen4-7. - Fix SFID on Sandybridge to use bits 27:24. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [v1, v3+] Signed-off-by: Matt Turner <mattst88@gmail.com> [v2] Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-26 11:44:24 -07:00
Kenneth Graunke	05040d6f8f	i965: Pass brw into next_offset(). The new brw_inst API is going to require a brw pointer in order to access fields (so it can do generation checks). Plumb it in now. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-26 11:44:22 -07:00
Greg Hunt	890287b96b	i965: Remove unneeded VS workaround stalls on Baytrail. According to the workarounds list, these stalls aren't needed on production Baytrail systems. Piglit confirms that as well. These cause a small slowdown when we are sending a large number of small batches to the GPU. Removing these improves performance by up to 5% on some CPU bound SynMark tests (Batch[4-7], DrvState1, HdrBloom, Multithread, ShMapPcf). Signed-off-by: Gregory Hunt <greg.hunt@mobica.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 11:31:28 -07:00
Kenneth Graunke	05126b9bb5	i965: Include marketing names for Broadwell GPUs. Intel would like us to include the marketing names. Developers additionally want "Broadwell GT1/2/3" because it makes it easier to identify what hardware users have when they request assistance or report issues. Including both makes it easy for everyone to map between the names. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-26 11:31:27 -07:00
Roland Scheidegger	b1c1c7d31b	softpipe: use last_level from sampler view, not from the resource The last_level from the sampler view may be limited by the state tracker to a value lower than what the base texture provides. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=80541. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-06-26 16:57:12 +02:00
Emil Velikov	f3a97c0381	targets/automake.inc: s/GALLIUM_VIDEO_CFLAGS/GALLIUM_TARGET_CFLAGS/ The flags are not specific to the video targets plus we can reuse them for targets/xa and targets/gbm. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-26 14:05:13 +01:00
Emil Velikov	f6723392e3	auxiliary/vl: Remove no longer used SPLIT_TARGETS Required for the conversion stage of all VL targets to a single library per API (static/shared pipe-drivers). No longer required as per last commit. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-26 14:05:13 +01:00
Emil Velikov	11bce6a94e	targets/radeonsi/omx: convert to static/shared pipe-drivers The radeonsi counterpart of previous commit - now libomx-radeonsi is built into the libomx-mesa library. Providing a single library per API. v2: Include the radeon winsys only when there is a user for it. v3: Correctly include the winsys. Now with extra brown bag :\ Note: Make sure to rebuild the .omxregister file, by executing $ omxregister-bellagio This patch concludes the unification. Now libomx-mesa will be used for all hardware - r600, radeonsi and nouveau. Cc: Leo Liu <leo.liu@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-06-26 14:05:13 +01:00
Emil Velikov	d23497c256	targets/r600/omx: convert to static/shared pipe-drivers The r600 counterpart of previous commit - now the libomx-r600 is built into the libomx-mesa library. Providing a single library per API. v2: Include the radeon winsys only when there is a user for it. v3: Correctly include the winsys. Now with extra brown bag :\ Note: Make sure to rebuild the .omxregister file, by executing $ omxregister-bellagio If you have more than one omx library (libomx-radeonsi, libomx-r600), make sure to temporarily move the unused one. By the end of the series there will be only one library that will be used for all hardware - r600, radeonsi and nouveau. Cc: Leo Liu <leo.liu@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-06-26 14:05:13 +01:00
Emil Velikov	b1f4a9681f	targets/omx-nouveau: convert to static/shared pipe-drivers Similar to the vdpau/xvmc targets, we're going to convert the multiple target libraries into a single one. The library can be built with the relevant pipe-drivers statically linked in, or loaded as shared modules. Currently we default to static. Note: Make sure to rebuild the .omxregister file, by executing $ omxregister-bellagio If you have more than one omx library (libomx-radeonsi, libomx-r600), make sure to temporarily move the unused one. By the end of the series there will be only one library that will be used for all hardware - r600, radeonsi and nouveau. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-26 14:05:13 +01:00
Emil Velikov	c35cf3400f	st/omx: avoid using dynamic vid_(enc|dec)_base and avc_(name|role) Strictly speaking we should not have done this in the first place, as all of the above should be static across the system. Currently this may cause some minor issues, which will be resolved in the following patches, by providing a single library for the OMX api. Cleanup a few unneeded strcpy cases while we're around. Note: Make sure to rebuild the .omxregister file, by executing $ omxregister-bellagio If you have more than one omx library (libomx-radeonsi, libomx-r600), make sure to temporarily move the unused one. By the end of the series there will be only one library that will be used for all hardware - r600, radeonsi and nouveau. Cc: Leo Liu <leo.liu@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-26 14:05:12 +01:00
Emil Velikov	9a9742f92c	st/omx: provide constant number of components The number of components and their names/roles should be kept constant as all of that information is cached. Note: Make sure to rebuild the .omxregister file, by executing $ omxregister-bellagio. Cc: Leo Liu <leo.liu@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-26 14:05:12 +01:00
Juha-Pekka Heikkila	2670d0f91d	glx: Added missing null check in GetDrawableAttribute() Add an extra null check for the GLX_BACK_BUFFER_AGE_EXT query. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-26 15:37:14 +03:00
Juha-Pekka Heikkila	0f7958aac2	mesa/main: In register_surface() verify gl_texture_object was found Verify _mesa_lookup_texture() returned valid pointer before using it. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-26 15:37:14 +03:00
Juha-Pekka Heikkila	cc5abf0460	mesa/main: Verify calloc return value in register_surface() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-26 15:37:14 +03:00
Juha-Pekka Heikkila	568c545b7e	glsl: Add missing null check in push_back() Report memory error on realloc failure and don't leak any memory. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-26 15:37:14 +03:00
Juha-Pekka Heikkila	088da3720f	glsl: check _mesa_hash_table_create return value in link_uniform_blocks Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-26 15:37:14 +03:00
Juha-Pekka Heikkila	db081b497e	i965/fs: Check variable_storage return value in fs_visitor::visit Check that variable_storage() found the requested fs_reg. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-26 15:37:14 +03:00
Juha-Pekka Heikkila	78a89d6fa0	i965: Handle miptree creation failure in intel_alloc_texture_storage() Check intel_miptree_create() return value before using it as a pointer. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-26 15:37:14 +03:00
Juha-Pekka Heikkila	375943bc0a	i965: Check calloc return value in gather_statistics_results() Check calloc return value and report on error, also later skip results handling if there was no memory to store results to. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-26 15:37:14 +03:00
Matt Turner	9a8acafa47	i965/vec4: Try constant propagate after copy propagate made progress. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-25 13:00:57 -07:00
Matt Turner	d5432e3f45	i965/vec4: Make try_copy_propagate() static. Now that can_do_source_mods() isn't part of the visitor, this doesn't need to be either. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-25 13:00:56 -07:00
Matt Turner	7526df70ea	i965/vec4: Rename try_copy/constant_propagat{ion,e} to match the fs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-25 13:00:55 -07:00
Matt Turner	7192207de1	i965/vec4: Constant propagate into 2-src math instructions on Gen8. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-25 13:00:54 -07:00
Matt Turner	038eb649b3	i965/fs: Constant propagate into 2-src math instructions on Gen8. total instructions in shared programs: 1878133 -> 1876986 (-0.06%) instructions in affected programs: 153007 -> 151860 (-0.75%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-25 13:00:53 -07:00
Matt Turner	aca4a951ea	i965/fs: Make try_constant_propagate() static. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-25 13:00:51 -07:00
Matt Turner	46659d46a8	i965: Make can_do_source_mods() a member of the instruction classes. Pretty nonsensical to have it as a method of the visitor just for access to brw. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-25 13:00:48 -07:00
Chris Forbes	b4ef7c596b	glsl: Treat an interface block specifier as a level of struct nesting Fixes the piglit test: spec/glsl-1.50/compiler/interface-blocks-structs-defined-within-block-instanced.vert Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 07:58:32 +12:00
Chris Forbes	91b8ecbe1c	glsl: Disallow primitive type layout qualifier on variables. This only makes any sense on the GS input or output layout declaration, nowhere else. Fixes the piglit tests: * spec/glsl-1.50/compiler/incorrect-in-layout-qualifiers-with-variable-declarations.geom * spec/glsl-1.50/compiler/incorrect-out-layout-qualifiers-with-variable-declarations.geom * spec/glsl-1.50/compiler/layout-fs-no-output.frag * spec/glsl-1.50/compiler/layout-vs-no-input.vert * spec/glsl-1.50/compiler/layout-vs-no-output.vert Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 07:58:25 +12:00
Chris Forbes	d4703f9446	glsl: Relax combinations of layout qualifiers with other qualifiers. Previously we disallowed any combination of layout with interpolation, invariant, or precise qualifiers. There is very little spec guidance on exactly which combinations should be allowed, but with ARB_sso it's useful to allow these qualifiers with rendezvous-by-location. Since it's unclear exactly where the layout qualifier should appear when combined with other qualifiers, we will allow it anywhere before the auxiliary storage qualifier. This allows enough flexibility for all examples I've seen, while keeping the auxiliary-storage-qualifier / storage-qualifier pair together (as they are a single qualifier in the spec prior to ARB_shading_language_420pack) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-26 07:58:01 +12:00
Ian Romanick	316dafa27d	glsl: Don't convert reductions of ivec to a dot-product Mesa has an optimization that converts expressions like "v.x + v.y + v.z + v.w" into dot(v, 1.0). And therein lies the rub: the other operand to the dot-product is always a float... even if the vector is an ivec or uvec. This results in an assertion failure in ir_builder. If the base type of the operand is not float, don't try the optimization. Dot-product is not valid on integer data. Fixes piglit vs-integer-reduction.shader_test and OpenGL ES conformance test ES2-CTS.gtf.GL2Tests.glGetUniform.glGetUniform. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Christoph Brill <egore911@gmail.com>	2014-06-25 10:56:32 -07:00
Carl Worth	4ccbbbdd74	docs: Import 10.2.2 release notes, add news item	2014-06-24 21:49:38 -07:00
Carl Worth	4076cbceaf	docs: Import 10.1.6 release notes, add news item	2014-06-24 21:40:15 -07:00
Takashi Iwai	6b8b17153a	llvmpipe: Fix zero-division in llvmpipe_texture_layout() Fix the crash of "gnome-control-center info" invocation on QEMU where zero height is passed at init. (sroland: simplify logic by eliminating the div altogether, using 64bit mul.) Fixes: https://bugzilla.novell.com/show_bug.cgi?id=879462 Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-25 02:15:49 +02:00
Matt Turner	48f1143c64	i965/fs: Don't fix_math_operand() on Gen >= 8. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-24 11:51:54 -07:00
Matt Turner	b24e1cc604	i965/vec4: Don't fix_math_operand() on Gen >= 8. The emit_math?_gen? functions serve to implement workarounds for the math instruction, none of which exist on Gen8+. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-24 11:51:54 -07:00
Matt Turner	0e800dfe75	i965/vec4: Don't return void from a void function. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-24 11:51:54 -07:00
Bruno Jiménez	c997007f66	r600g/compute: Defer the creation of the temporary resource For the first use of a buffer, we will only need the temporary resource in the case that a user wants to write/map to this buffer. But in the cases where the user creates a buffer to act as an output of a kernel, then we were creating an unneeded resource, because it will contain garbage, and would be copied to the pool, and destroyed when promoting. This patch avoids the creation and copies of resources in this case. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-24 12:37:36 -04:00
Jan Vesely	fec2a08eae	r600g/compute: Handle failures in compute_memory_pool_finalize Reviewed-by: Bruno Jiménez <brunojimen@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2014-06-24 12:37:30 -04:00
Jan Vesely	9575225e12	r600g/compute: Fix possible endless loop in compute_memory_pool allocations. The important part is the change of the condition to <= 0. Otherwise the loop gets stuck never actually growing the pool. The change in the aux-need calculation guarantees max 2 iterations, and avoids wasting memory in case a smaller item can't fit into a relatively larger pool. Reviewed-by: Bruno Jiménez <brunojimen@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2014-06-24 12:36:55 -04:00
Jan Vesely	0c181cdc6c	r600: Fix use after free in compute_memory_promote_item. The dst pointer needs to be initialized after any calls to compute_memory_grow_pool, as the function might change the pool->vbo pointer. This fixes crashes and assertion failures in two gegl tests. Reviewed-by: Bruno Jiménez <brunojimen@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>	2014-06-24 12:04:54 -04:00
Ilia Mirkin	a59f2bb17b	nouveau: dup fd before passing it to device nouveau screens are reused for the same device node. However in the scenario where we create screen 1, screen 2, and then delete screen 1, the surrounding code might also close the original device node. To protect against this, dup the fd and use the dup'd fd in the nouveau_device. Also tell the nouveau_device that it is the owner of the fd so that it will be closed on destruction. Also make sure to free the nouveau_device in case of any failure. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79823 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@ubuntu.com>	2014-06-24 09:30:25 -04:00
Fredrik Höglund	41d759d076	mesa: Don't use derived vertex state in api_arrayelt.c Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-06-24 07:15:30 +02:00
Ilia Mirkin	ea91d629df	nvc0: allow VIEWPORT_INDEX and LAYER to be used as input semantics Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-06-23 19:23:16 -04:00
Ilia Mirkin	a91a556c81	mesa/st: handle gl_Layer input semantic Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-06-23 19:23:16 -04:00
Tobias Klausmann	98a86f61a8	nv50/ir: allow gl_ViewportIndex to work on non-provoking vertices Previously, if we had something like: gl_ViewportIndex = idx; for(int i = 0; i < gl_in.length(); i++) { gl_Position = gl_in[i].gl_Position; EmitVertex(); } EndPrimitive(); The right viewport index would not be set on the primitive because the last vertex is the provoking one. However blob drivers appear to move the gl_ViewportIndex write into the for loop, allowing the application to be ignorant of this detail. While the application is technically wrong here, because the blob does it and other drivers appear to implicitly work this way as well, we add a buffer register that viewport index writes go into, which is then exported before every EmitVertex() call. This fixes the remaining piglit tests in ARB_viewport_array for nv50/nvc0. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-06-23 19:23:16 -04:00
Roland Scheidegger	604e54de78	draw: (trivial) fix clamping of viewport index The old logic would let all negative values go through unclamped, with potentially disastrous results (probably trying to fetch viewport values from random memory locations). GL has undefined rendering for vp indices outside valid range but that's a bit too undefined... (The logic is now the same as in llvmpipe.) CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-06-24 00:37:52 +02:00
Kenneth Graunke	f6a99d1167	i965: Don't emit SURFACE_STATEs for gather workarounds on Broadwell. As far as I can tell, Broadwell doesn't need any of the SURFACE_STATE workarounds for textureGather() bugs, so there's no need to emit a second set of identical copies. To keep things simple, just point the gather surface index base to the same place as the texture surface index base. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-23 13:29:39 -07:00
Emil Velikov	2442d3553f	targets/(vdpau|xvmc): hardlink against the installed library With commit `11e46a32ae` and `f9ebb1ea77` we resolved the symlink generation required by the versioning of the library. Although they incorrectly changed the way hardlinks are created by linking to the ones from the build tree. If the device used for building differs from the one set as destination linking will fail. Reported-by: Andy Furniss <adf.lists@gmail.com> Tested-by: Andy Furniss <adf.lists@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-23 20:57:01 +01:00
Neil Roberts	5f11b10f2c	i965: Allow the blorp blit between BGR and RGB Previously the blorp blitter would only be used if the format is identical or there is only a difference between whether there is an alpha component or not. This patch makes it also allow the blorp blitter if the only difference is the ordering of the RGB components (ie, RGB or BGR). This is particularly useful since commit `61e264f4fc` because Mesa now prefers RGB ordering for textures but the window system buffers are still created as BGR. That means that the blorp blitter won't be used for the (probably) common case of blitting from a texture to the window system buffer. This doesn't cause any regressions in the FBO piglit tests on Haswell. On Sandybridge it causes the fbo-blit-stretch test to fail but that is only because it was failing anyway before the above commit and that commit hid the problem. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68365 Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-23 19:59:40 +01:00
Ian Romanick	3552aa7c1c	glsl: Silence many unused parameter warnings In file included from ../../src/glsl/builtin_functions.cpp:61:0: ../../src/glsl/glsl_parser_extras.h:154:9: warning: unused parameter 'var' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-06-23 11:24:25 -07:00
Emil Velikov	f9ebb1ea77	targets/xvmc: correctly generate the symlinks Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-23 15:54:36 +01:00
Emil Velikov	11e46a32ae	targets/vdpau: correctly generate the symlinks Reported-by: David Heidelberger <david.heidelberger@ixit.cz> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-23 15:53:26 +01:00
Ville Syrjälä	ca55a1aaa7	i915: Fix gen2 texblend setup Fix an off by one in the texture unit walk during texblend setup on gen2. This caused the last enabled texunit to be skipped resulting in totally messed up texturing. This is a regression introduced here: commit `1ad443ecdd` Author: Eric Anholt <eric@anholt.net> Date: Wed Apr 23 15:35:27 2014 -0700 i915: Redo texture unit walking on i830. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2014-06-23 12:42:00 +03:00
Iago Toral Quiroga	c822db6a05	mesa: Make Geom.UsesEndPrimitive a bool instead of a GLboolean	2014-06-23 07:55:51 +02:00
Emil Velikov	df71b39f5c	targets/r600/xvmc: convert to static/shared pipe-drivers The r600 equivalent of previous commit. v2: Correctly include the radeon winsys/radeon_common. Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Thomas Helland <thomashelland90 at gmail.com>	2014-06-22 23:06:07 +01:00
Emil Velikov	dc01ca44a7	targets/xvmc-nouveau: convert to static/shared pipe-drivers Similar to vdpau targets, we're going to convert the individual target libraries into a single one. The library can be built with the relevant pipe-drivers statically linked in, or loaded as shared modules. Currently we default to static. Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Thomas Helland <thomashelland90 at gmail.com>	2014-06-22 23:06:04 +01:00
Emil Velikov	291d70210d	targets/radeonsi/vdpau: convert to static/shared pipe-drivers Similar to previous commits, this allows us to minimise some of the duplication by compacting all vdpau targets into a single library. v2: Include the radeon winsys only when there is a user for it. v3: Correctly include the winsys. Now with extra brown bag :\ Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Thomas Helland <thomashelland90 at gmail.com>	2014-06-22 23:06:01 +01:00
Emil Velikov	f85e7ce057	targets/r600/vdpau: convert to static/shared pipe-drivers Similar to previous commit, this allows us to minimise some of the duplication by compacting all vdpau targets into a single library. v2: Include the radeon winsys only when there is a user for it. v3: Correctly include the winsys. Now with extra brown bag :\ Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Tested-by: Thomas Helland <thomashelland90 at gmail.com>	2014-06-22 23:05:58 +01:00
Emil Velikov	9df2c4956b	targets/vdpau-nouveau: convert to static/shared pipe-drivers Create a single library (for the vdpau api) thus reducing the overall size of mesa. Current commit converts vdpau-nouveau, with upcoming commits handling the rest. The library can be built with the relevant pipe-drivers statically linked in, or loaded as shared modules. Currently we default to static. Add SPLIT_TARGETS to guard the other VL targets. Note: symlink handling is rather ugly and will need an update to work with BSD and other non-linux platforms. v2: Split the conversion into per-target basis. Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Thomas Helland <thomashelland90 at gmail.com>	2014-06-22 23:05:49 +01:00
Chris Forbes	8b2e0ddf8a	Partially revert "glsl: Add builtin define for ARB_fragment_layer_viewport" This partially reverts commit `cc18b1ec21`, which dropped some unrelated code due to a fumbled rebase.	2014-06-22 23:54:21 +12:00
Rob Clark	1f3ca56b76	freedreno: use util_copy_framebuffer_state() Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-06-22 07:28:17 -04:00
Rob Clark	c63450e829	freedreno/a3xx: WFI fixes/cleanup Blob driver seems to need WFI in some cases after CP_EVENT_WRITE, implying that this is asynchronous and should reset needs_wfi. Also, CP_INVALIDATE_STATE seems to need WFI. But CP_LOAD_STATE does not. The blob driver also puts WFIs before writing GRAS_CL_VPORT registers. The latter may be a work-around, as these registers should be banked/context registers. I haven't yet found a lockup that this averts, but I expect viewport to change infrequently so out of paranoia I will keep these for now. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-06-22 07:25:43 -04:00
Chris Forbes	b2c1f3a019	glsl: Add gl_Layer and gl_ViewportIndex builtins to fragment shader Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-22 16:52:19 +12:00
Chris Forbes	cc18b1ec21	glsl: Add builtin define for ARB_fragment_layer_viewport The spec doesn't actually mention adding this, but this is the usual pattern so I'm assuming it's a spec bug. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-22 16:52:17 +12:00
Chris Forbes	fcc9b4c15e	glsl: Add extension plumbing for ARB_fragment_layer_viewport Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-22 16:52:15 +12:00
Chris Forbes	51c82bddef	mesa: Add extension plumbing for ARB_fragment_layer_viewport Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-22 16:52:13 +12:00
Chris Forbes	22448c819d	glapi: Add (empty) api section for ARB_fragment_layer_viewport This extension is purely GLSL -- there are no new GL API elements. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-22 16:51:29 +12:00
Kenneth Graunke	a20994d616	i965: Save meta stencil blit programs in the context. When the last context in a share group is destroyed, the hash table containing all of the shader programs (ctx->Shared->ShaderObjects) is destroyed, throwing away all of the shader programs. Using a static variable to store program IDs ends up holding on to them after this, so we think we still have a compiled program, when it actually got destroyed. _mesa_UseProgram then hits GL errors, since no program by that ID exists. Instead, store the program IDs in the context, so we know to recompile if our context gets destroyed and the application creates another one. Fixes es3conform tests when run without -minfmt (where it creates separate contexts for testing each visual). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77865 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-21 10:47:47 -07:00
Emil Velikov	dfaf6116c9	scons: avoid building any piece of i915 Leftover from commit `c21fca8bf2`. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <wallbraker@gmail.com>	2014-06-21 16:43:10 +01:00
Aaron Watry	564821c917	gallivm: Fix build after LLVM commit 211259 Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-20 19:49:18 -05:00
Daniel Manjarres	86bd2196b4	glx: Don't crash on swap event for a Window (non-GLXWindow) Prior to GLX 1.3 there was the glxMakeCurrent() function that took a single drawable handle. The Drawable could be either a bare XID for a Window or an XID for a glxpixmap. GLX 1.3 added glxMakeContextCurrent that takes 2 handles: one for reading, one for writing. Nowadays the old glxMakeCurrent call is implemented as a call to glxMakeContextCurrent with the single handle duplicated. Because of this it is allowed to use a plain-old Window ID as an argument to glxMakeContextCurrent, although nobody really documents this sort of thing. The manpage for the NEW call specifies the arguments as GLXPixmaps, but the actual code accepts Window XIDs too, and handles them correctly. Similarly, the glxSelectEvents function can also take a bare Window XID. The "piglit" tests all use GLXWindows and/or GLXPixmaps. You never tested swap events with a bare Window XID. That is what my app was doing. The swap_events code worked with Window XIDs in mesa 7.x.y. The new code added in versions 8, 9, and 10 assumes that all buffer swap events have a GLXPixmap associated with them. Because of the historical quirks above, this is not true. Swap events for bare Window XIDs do NOT have a glxpixmap resulting in a segfault. Any app that uses the old school glxMakeCurrent call with a Window XID while trying to use swap_events will crash when the libs try to lookup the nonexistent GLXPixmap associated with the incoming swap event. I believe that the people who wrote the spec overlooked this, because the "sbc" field comes from the OML_sync extension that is defined in terms of glxpixmaps only. v2 (idr): Formatting changes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54372 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-06-20 11:04:04 -07:00
Bruno Jiménez	2d2af4cd2c	r600g/compute: Use gallium util functions for double lists Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-20 13:44:12 -04:00
Bruno Jiménez	257d697fb9	r600g/compute: Map only against intermediate buffers With this we can ensure that mapped buffers will never change their position when relocating the pool. This patch should finally solve the mapping bug. v2: Use the new is_item_in_pool util function, as suggested by Tom Stellard Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-20 13:44:08 -04:00
Bruno Jiménez	9b933b73a9	r600g/compute: Implement compute_memory_demote_item This function will be used when we want to map an item that is already in the pool. v2: Use temporary variables to avoid so many castings in functions, as suggested by Tom Stellard Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-20 13:44:04 -04:00
Bruno Jiménez	0b8c29915b	r600g/compute: Avoid problems when promoting items mapped for reading According to the OpenCL spec, it is possible to have a buffer mapped for reading and read from it using commands or buffers. With this we can keep the mapping (that exists against the temporary item) and read with a kernel (from the item we have just added to the pool) without problems. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-20 13:44:00 -04:00
Bruno Jiménez	3da1b17555	r600g/compute: Only move to the pool the buffers marked for promoting Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-20 13:43:57 -04:00
Bruno Jiménez	4d1e4429e6	r600g/compute: divide the item list in two Now we will have a list with the items that are in the pool (item_list) and the items that are outside it (unallocated_list) Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-20 13:43:54 -04:00
Bruno Jiménez	e3dfe3f7b2	r600g/compute: Add statuses to the compute_memory_items These statuses will help track whether the items are mapped or if they should be promoted to or demoted from the pool v2: Use the new is_item_in_pool util function, as suggested by Tom Stellard Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-20 13:43:50 -04:00
Bruno Jiménez	9e491eb5d7	r600g/compute: Add a util function to know if an item is in the pool Every item that has been placed in the pool must have start_in_dw different from -1. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-20 13:43:46 -04:00
Bruno Jiménez	0038402753	r600g/compute: Add an intermediate resource for OpenCL buffers This patch completely changes the way buffers are added to the compute_memory_pool. Before this, whenever we were going to map a buffer or write to or read from it, it would get placed into the pool. Now, every unallocated buffer has its own r600_resource until it is allocated in the pool. NOTE: This patch also increases the GPU memory usage at the moment of putting every buffer in its place. More or less, the memory usage is ~2x(sum of every buffer size) v2: Cleanup v3: Use temporary variables to avoid so many castings in functions, as suggested by Tom Stellard Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-20 13:43:28 -04:00
Iago Toral Quiroga	96a95f48ea	mesa: Copy Geom.UsesEndPrimitive when cloning a geometry program. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-06-20 09:50:54 +02:00
Iago Toral Quiroga	ec712bf469	mesa: Init Geom.UsesEndPrimitive in shader programs. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-20 09:50:54 +02:00
Matt Turner	e974781301	glsl: Optimize (v.x + v.y) + (v.z + v.w) into dot(v, 1.0). Cuts five instructions out of SynMark's Gl32VSInstancing benchmark.	2014-06-19 16:11:52 -07:00
Matt Turner	f043971097	glsl: Pass in options to do_algebraic(). Will be used in the next commit. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-06-19 16:11:51 -07:00
Matt Turner	1d9f74eda7	glsl: Rebalance expression trees that are reduction operations. The intention of this pass was to give us better instruction scheduling opportunities, but it unexpectedly reduced some instruction counts as well: total instructions in shared programs: 1666639 -> 1666073 (-0.03%) instructions in affected programs: 54612 -> 54046 (-1.04%) (and trades 4 SIMD16 programs in SS3)	2014-06-19 16:11:51 -07:00
Emil Velikov	d300f3f51a	automake: include the libdeps in the correct order Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80254 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-19 22:53:56 +01:00
Francisco Jerez	4a39e5073a	clover: Calculate the serialized size of a module efficiently. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-19 20:17:19 +02:00
Francisco Jerez	ab023c27a3	clover: Optimize module serialization for vectors of fundamental types. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-19 20:17:08 +02:00
Roland Scheidegger	cad60420d5	gallivm: set mcpu when initializing llvm execution engine Previously llvm detected cpu features automatically when the execution engine was created (based on host cpu). This is no longer the case, which meant llvm was then not able to emit some of the intrinsics we used as we didn't specify any sse attributes (only on avx supporting systems this was not a problem since despite at least some llvm versions enabling it anyway we always set this manually). So, instead of trying to figure out which MAttrs to set just set MCPU. This fixes https://bugs.freedesktop.org/show_bug.cgi?id=77493. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Tested-by: Vinson Lee <vlee@freedesktop.org>	2014-06-19 16:58:00 +02:00
Tom Stellard	4aa128a123	clover: Don't use llvm's global context An LLVMContext should only be accessed by a single thread, and using the global context was causing crashes in multi-threaded environments. Now we use a separate context for each compile. Reviewed-by: Francisco Jerez <currojerez@riseup.net> CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-06-19 10:41:10 -04:00
Tom Stellard	0cc391f013	clover: Prevent Clang from printing number of errors and warnings to stderr. https://bugs.freedesktop.org/show_bug.cgi?id=78581 CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-06-19 10:18:26 -04:00
Michel Dänzer	93b6b1fa83	radeon/llvm: Adapt to AMDGPU.rsq intrinsic change in LLVM 3.5 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>	2014-06-19 09:58:03 -04:00
Emil Velikov	949beb0b84	configure: add HAVE_GALLIUM_STATIC_TARGETS Will be used to control the linking mode of pipe-drivers in gallium targets. Keep this hardcoded to static, as the pipe-drivers bear an unstable interface which we do not want to expose to the normal user. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-19 12:46:19 +01:00
Emil Velikov	d22b39e4db	targets: use GALLIUM_PIPE_LOADER_WINSYS_LIB_DEPS Drop ~50 lines of buildsystem mayhem. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-19 12:40:01 +01:00
Emil Velikov	571b2467ca	automake: introduce helper variable - gallium_pipe_loader_winsys_libs Will be used in upcoming commits to reduce duplication in the build. v2: Drop the megadriver/static_target variables. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-19 12:40:01 +01:00
Emil Velikov	86c30c6c5b	target-helpers: add dd_configuration(), dd_driver_name() Add a couple of helpers to be used by the dri targets when built with static pipe-drivers. Both functions provide functionality required by the dri state-tracker. With this patch ilo, nouveau and r300 gain support for throttle dri configuration. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-19 12:40:01 +01:00
Emil Velikov	573b55e302	target-helpers: add dd_create_screen() helper Will be used by gallium targets that statically link the pipe-drivers in the final library. Provides identical functionality to device_descriptor.create_screen. v2: - Don't sw_screen_wrap the i915/svga screen. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-19 12:39:50 +01:00
Emil Velikov	1e414faa5e	target-helpers: add a note about debug wrappers If memory serves me right, at least one debug wrapper does not return the base screen on failure. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-19 12:37:15 +01:00
Emil Velikov	665a4d9d9b	targets/pipe-loader: add driver specific drm_configuration Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-19 12:37:14 +01:00
Emil Velikov	36ff20027c	pipe-loader: add pipe_loader_ops::configuration() Required for the dri state-tracker. Will be used to retrieve driver specific configuration parameters: - share_fd (dmabuf) capability - throttle Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-19 12:37:14 +01:00
Emil Velikov	7f00611d78	pipe-loader: note that we leak pipe_loader_drm_device->base->driver_name The string is malloc'd (strdup) in loader_get_driver_for_fd(). Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-19 12:37:14 +01:00
Emil Velikov	6984e8db91	automake: stop building i915-sw and drop explicit linking to softpipe Unused and possibly broken. Will be completely removed in upcoming commits. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-19 12:37:14 +01:00
Ilia Mirkin	25182e249e	nv30: hack to avoid errors on unexpected color/zeta combinations This is just a hack, it should be possible to create a temporary zeta surface and render to that instead. However that's more complicated and this avoids the render being entirely broken and errors being reported by the card. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-19 01:05:52 -04:00
Ilia Mirkin	e1fe1435b1	nv30: tidy screen caps, add missing ones Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-06-19 01:05:52 -04:00
Ilia Mirkin	c092c46b27	nv30: avoid dangling references to deleted contexts Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-19 01:05:52 -04:00
Ilia Mirkin	5af80f6268	nv30: plug some memory leaks on screen destroy and shader compile Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-19 01:05:52 -04:00
Ilia Mirkin	22e9551af0	nv50: organize screen caps Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-06-19 01:05:52 -04:00
Ilia Mirkin	b03be4b0ee	nvc0: organize screen caps Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-06-19 01:05:52 -04:00
Ilia Mirkin	7e7097a4f4	nvc0: remove vport_int hack and instead use the usual state validation Commit `ad4dc772` fixed an issue with the viewport not being restored correctly. However it's rather hackish and confusing. Instead just mark the viewport dirty and let the viewport validation take care of it. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-06-19 01:05:52 -04:00
David Heidelberger	8658fe3e4c	r300g: don't advertize PIPE_FORMAT_B10G10R10X2_UNORM on < r500 Signed-off-by: David Heidelberger <david.heidelberger@ixit.cz> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-06-19 01:43:09 +02:00
Marek Olšák	57f3da997a	radeonsi: implement ARB_texture_query_lod Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-06-19 00:18:17 +02:00
Marek Olšák	6a2b38381e	radeonsi: pass ARB_conservative_depth parameters to the hardware Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-06-19 00:17:36 +02:00
Marek Olšák	1df7199fc9	gallium: implement ARB_texture_query_levels The extension is always supported if GLSL 1.30 is supported. Softpipe and llvmpipe support is also added (trivial). Radeon and nouveau support is already done. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-06-19 00:17:36 +02:00
Marek Olšák	552c70a837	st/mesa: set sampler_view::last_level correctly It was set to pipe_resource::last_level and _MaxLevel was embedded in max_lod, that's why it worked for ordinary texturing. However, min_lod doesn't have any effect on texelFetch and textureQueryLevels, so we must still set last_level correctly. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-06-19 00:17:09 +02:00
Dave Airlie	c530282bbc	st/mesa: handle array textures in st_texture_image_copy Marek: also handle cube arrays Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-06-19 00:17:09 +02:00
Marek Olšák	6818e117ce	radeonsi: cosmetic changes in si_shader.c reviewed by Michel Dänzer	2014-06-19 00:17:09 +02:00
Marek Olšák	c7b5a5c4a3	radeonsi: implement ARB_texture_gather and Gather functions from GLSL 4.00 All ARB_texture_gather and gather-related ARB_gpu_shader5 piglit tests pass. reviewed by Michel Dänzer	2014-06-19 00:17:09 +02:00
Marek Olšák	0df3551bf4	st/mesa: fix geometry shader max texture limit in state validation Reviewed-by: Brian Paul <brianp@vmware.com>	2014-06-19 00:14:00 +02:00
Marek Olšák	bb867e2f2f	r600g: fix the max vertex shader input limit	2014-06-19 00:14:00 +02:00
Ian Romanick	cc219d1d65	meta: Respect the driver's maximum number of draw buffers Commit `c1c1cf5f9` added infrastructure for saving and restoring draw buffer state. However, it universally used MAX_DRAW_BUFFERS, but many drivers support far fewer than that limit. For example, the radeon and i915 drivers only support 1. Using MAX_DRAW_BUFFERS causes meta to generate GL errors. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80115 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Kenneth Graunke <kenneth@whitecape.org> [on Broadwell] Tested-by: jpsinthemix@verizon.net Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-18 14:45:25 -07:00
Roland Scheidegger	56335b4441	gallivm: fix SCALED -> NORM conversions Such conversions (which are most likely rather pointless in practice) were resulting in shifts with negative shift counts and shifts with counts the same as the bit width. This was always undefined in llvm, the code generated was rather horrendous but happened to work. So make sure such shifts are filtered out and replaced with something that works (the generated code is still just as horrendous as before). This fixes lp_test_format, https://bugs.freedesktop.org/show_bug.cgi?id=73846. v2: prettify by using build context shift helpers. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-06-18 19:52:57 +02:00
Kristian Høgsberg	7928b946ad	mesa: Remove glClear optimization based on drawable size A drawable size of 0x0 means that we don't have buffers for a drawable yet, not that we have a zero-sized buffer. Core mesa shouldn't be optimizing out drawing based on buffer size, since the draw call could be what triggers the driver to go and get buffers. As discussed in the referenced bug report, the optimization was added as part of a scatter-shot attempt to fix a different problem. There's no other example in mesa core of using the buffer size in this way. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74005 Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-18 10:15:03 -07:00
Juha-Pekka Heikkila	fe5224b16a	mesa: In emit_texenv() type mismatch was forced with typecast Type mismatch caused random memory to be copied when the cast memory area was smaller than the expected type. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-18 16:15:51 +03:00
Grigori Goronzy	6cd30f5d73	radeon/uvd: disable VC-1 simple/main on UVD 2.x It's about as broken as on later UVD revisions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66452 Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-06-18 13:58:52 +02:00
Grigori Goronzy	cf05f9bf01	radeonsi: add sampling of 4:2:2 subsampled textures This makes 4:2:2 video surfaces work in VDPAU. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-06-18 13:58:37 +02:00
Grigori Goronzy	f5dafc156a	util/u_format: move utility function from r600g We need this for radeonsi, and it might be useful for other drivers, too.	2014-06-18 13:58:19 +02:00
Leo Liu	700100d94b	radeon/vce: set number of cpbs based on level v2: add error check for cpb size 0 Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-06-18 13:53:27 +02:00
Leo Liu	0796483282	radeon/vce: implement h264 level support Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-06-18 13:53:23 +02:00
Leo Liu	e2db7c10d6	st/omx/enc: implement h264 level support Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-06-18 13:53:20 +02:00
Leo Liu	4fca06a902	vl: add level interface Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-06-18 13:53:17 +02:00
Leo Liu	cb9fcc5c44	st/st/omx: fix switch-case indentation in vid_enc.c Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-06-18 13:52:54 +02:00
Jon TURNEY	83821ece79	glx: Add an error message when a direct renderer's createScreen() routine fails because no matching fbConfigs or visuals could be found. Nearly all the error cases in *createScreen() issue an error message to diagnose the failure to initialize before branching to handle_error. The few remaining error cases which don't should probably do the same. (At the moment, it seems this can be triggered in drisw with an X server which reports definite values for MAX_PBUFFER_(WIDTH|HEIGHT|SIZE), because those attributes are checked for an exact match against 0.) Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-18 09:55:45 +01:00
Chia-I Wu	88b887faa9	i965/vec4: unit test for copy propagation and writemask This unit test demonstrates a subtle bug fixed by `4ddf51db6a`. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-06-18 13:43:05 +08:00
Matt Turner	6c2d815d64	i965/vec4/gs: Silence warning about unused 'success' in release build. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-17 10:18:41 -07:00
Matt Turner	17f2dd7274	i965/disasm: Mark three_source_reg_encoding[] static. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-17 10:18:39 -07:00
Matt Turner	9f7b5fa2c8	i965/blorp: Remove unused 'brw' member. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-17 10:18:38 -07:00
Matt Turner	73ab06f9c5	i965/blorp: Mark branch unreachable to silence uninitialized var warning. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-17 10:18:36 -07:00
Matt Turner	f3aecefa99	i965: Silence warning about unused brw in release builds. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-17 10:18:34 -07:00
Matt Turner	836f4299e8	i965: Mark backend_instruction and bblock_t as structs. They have to be marked as structs for C code elsewhere. bblock_t is already defined as a struct, and all of backend_instruction's fields are public anyway. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-17 10:18:33 -07:00
Matt Turner	83649587c6	i965: Use standard SSE intrinsics instead of gcc built-ins. Lets this file compile with clang. Reviewed-by: Frank Henigman <fjhenigman@google.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-17 10:18:20 -07:00
Matt Turner	52a4065493	mesa: Remove unused functions from performance query code. Perhaps useful for debugging? Never used otherwise. Added by commit `8cf5bdad`. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-17 10:18:18 -07:00
Matt Turner	7f3f9b1a68	mesa: Remove unused extra_EXT_texture_integer. Unused since commit `b6475f94`. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-17 10:18:17 -07:00
Matt Turner	9f4e776433	mesa: Mark default case unreachable to silence warning. Warned about 'coord' being undefined in the default case, which is unreachable. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-17 10:18:14 -07:00
Matt Turner	6ac5adce63	egl: Remove unused variable dri_driver_path. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-17 10:18:12 -07:00
Matt Turner	d2458a4710	swrast: Remove unused solve_plane_recip(). Unused since commit `9e8a961d`. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-17 10:18:11 -07:00
Matt Turner	db650d9ec1	glsl: Remove 'struct' from ir_variable declaration. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-17 10:18:06 -07:00
Matt Turner	ebc7524503	Revert "i965: Add 'wait' instruction support" This reverts commit `20be3ff576`. No evidence of ever being used.	2014-06-17 10:16:23 -07:00
Matt Turner	fab92fa1cb	i965/fs: Optimize SEL with the same sources into a MOV. instructions in affected programs: 474 -> 462 (-2.53%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-17 09:40:31 -07:00
Matt Turner	35bc02dee8	i965/fs: Perform CSE on texture operations. Helps Unigine Tropics and some (old) gstreamer shaders in shader-db. instructions in affected programs: 792 -> 744 (-6.06%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-17 09:40:31 -07:00
Matt Turner	18372a7100	i965/fs: Copy propagate from load_payload. But only into non-load_payload instructions. Otherwise we would prevent register coalescing from combining identical payloads.	2014-06-17 09:40:30 -07:00
Matt Turner	31ae9c25ff	i965/fs: Perform CSE on load_payload instructions if it's not a copy. Since CSE creates instructions, if we let CSE generate things register coalescing can't remove, bad things will happen. Only let CSE combine non-copy load_payloads. E.g., allow CSE to handle this load_payload vgrf4+0, vgrf5, vgrf6 but not this load_payload vgrf4+0, vgrf5+0, vgrf5+1	2014-06-17 09:40:30 -07:00
Matt Turner	8f4e324be2	i965/fs: Support register coalescing on LOAD_PAYLOAD operands.	2014-06-17 09:40:07 -07:00
Matt Turner	4b7bca8979	i965/fs: Emit load_payload instead of multiple MOVs for large VGRFs.	2014-06-17 09:40:07 -07:00
Matt Turner	68b7b03429	i965/fs: Only consider real sources when comparing instructions.	2014-06-17 09:38:06 -07:00
Matt Turner	856860db4a	i965/fs: Apply cube map array fixup and restore the payload. So that we don't have partial writes to a large VGRF. Will be cleaned up by register coalescing.	2014-06-17 09:38:06 -07:00
Matt Turner	15b6ab04e2	i965/fs: Use LOAD_PAYLOAD in emit_texture_gen7().	2014-06-17 09:38:06 -07:00
Matt Turner	138905d728	i965/fs: Lower LOAD_PAYLOAD and clean up. Clean up with register_coalesce()/dead_code_eliminate().	2014-06-17 09:38:05 -07:00
Matt Turner	b996216384	i965/fs: Add SHADER_OPCODE_LOAD_PAYLOAD. Will be used to simplify the handling of large virtual GRFs in SSA form. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-06-17 09:38:05 -07:00
Tapani Pälli	39cdf1621e	glsl: type check between switch init-expression and case Patch adds a type check between switch init-expression and case label and performs an implicit signed->unsigned type conversion when possible. v2: add GLSL spec reference, do implicit conversion if possible (Matt) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79724 Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-17 08:13:28 +03:00
Tobias Klausmann	5357c14da4	nv50/ir: Remove NV50_SEMANTIC_VIEWPORTINDEX Use TGSI_SEMANTIC_VIEWPORT_INDEX for the last consumer. Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-06-16 23:08:32 -04:00
Tobias Klausmann	cd01e1667a	docs: update GL3.txt, relnotes: mark GL_ARB_viewport_array as done for nvc0 Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-06-16 23:08:32 -04:00
Tobias Klausmann	a2cb3a4a4f	nvc0: implement multiple viewports/scissors, enable ARB_viewport_array Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> [imirkin: mark things dirty on ctx switch, 3d blit] Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-06-16 23:08:03 -04:00
Ilia Mirkin	af05270ccf	nv50: make sure to mark first scissor dirty after blit Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-16 23:08:03 -04:00
Kenneth Graunke	49659ad90c	i965: Use 8x4 aligned rectangles for HiZ operations on Broadwell. Like on Haswell, we need to use 8x4 aligned rectangle primitives for hierarchical depth buffer resolves and depth clears. See the comments in brw_blorp.cpp's brw_hiz_op_params() constructor. (The Broadwell documentation confirms that this is still necessary.) This patch makes the Broadwell code follow the same behavior as Chad and Jordan's Gen7 BLORP code. Based on a patch by Topi Pohjolainen. This fixes es3conform's framebuffer_blit_functionality_scissor_blit test, with no Piglit regressions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-16 17:23:21 -07:00
Kenneth Graunke	fa35b272a0	i965: Make INTEL_DEBUG=mip print out whether HiZ is enabled. We only enable HiZ for miplevels which are aligned on 8x4 blocks. When debugging HiZ failures, it's useful to know whether a particular miplevel is using HiZ or not. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-16 17:22:29 -07:00
Jordan Justen	380dd3be02	glsl/cs: Fix local_size_y and local_size_z flags.q.local_size has 3 bits. One each for x, y and z. Fixes piglit's: * spec/ARB_compute_shader/linker/mismatched_local_work_sizes * spec/ARB_compute_shader/compiler/default_local_size.comp * spec/ARB_compute_shader/compiler/work_group_size_too_large * spec/ARB_compute_shader/compiler/gl_WorkGroupSize_matches_layout.comp This was regressed in `738c9c3c`. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-16 09:54:52 -07:00
Jordan Justen	539cd92476	main/extensions: Only parse MESA_EXTENSION_OVERRIDE once Previously, we would parse MESA_EXTENSION_OVERRIDE each time a context was created. Now we will save the results of that parsing and use it during context initialization. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-06-16 09:54:52 -07:00
Jordan Justen	ac3e2efeff	main/extensions: Build list of extensions that can't be disabled This will allow us to utilize the early MESA_EXTENSION_OVERRIDE parsing at the later extension string initialization step. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-06-16 09:54:52 -07:00
Jordan Justen	863f57ee1b	main/extensions: Create extra extensions override string This will allow us to utilize the early MESA_EXTENSION_OVERRIDE parsing at the later extension string initialization step. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-06-16 09:54:52 -07:00
Jordan Justen	10e03b4401	i965/cs: Use override structure rather than separate env var In `25268b93`, we added a new environment variable (INTEL_COMPUTE_SHADER) to allow some constant values to be upgraded for the ARB_compute_shader extension. Now, we can look to see if the extension was enabled via the MESA_EXTENSION_OVERRIDE environment variable. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-06-16 09:54:52 -07:00
Jordan Justen	f5ca8c1972	main/extensions: Add early extension override structures During the early one_time_init phase of context creation, we initialize two global gl_extensions structures. We read the MESA_EXTENSION_OVERRIDE environment variable, and store positive and negative overrides in two structures: * struct gl_extensions _mesa_extension_override_enables * struct gl_extensions _mesa_extension_override_disables These are filled before the driver initializes extensions and constants, therefore the driver can make adjustments based on the desired overrides. This can be useful during development of a new extension where the extension is only partially ready. The driver can't actually advertise support for the extension, but if it sees that the override is set for the extension, then it can expose more supported parts of the extension, such as upgrading context constants. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-06-16 09:54:52 -07:00
Jordan Justen	8be64fb570	main/extensions: Create a context-less set_extensions function We will add new gl_extensions structures that capture the environment variable extension overrides and are available early in context creation. This will allow a driver to take actions during its initialization based on the extension overrides. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-06-16 09:54:52 -07:00
Jordan Justen	f2280eeba5	main/extensions: Don't advertise unknown extensions overrides with (-) Previously setting: MESA_EXTENSION_OVERRIDE=-GL_MESA_ham_sandwich Would cause Mesa to advertise support for the GL_MESA_ham_sandwich extension, even though the override specifically asked for it to be disabled. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-06-16 09:54:52 -07:00
Marek Olšák	41060a6095	radeonsi: fixup sizes of shader resource and sampler arrays This was wrong for a very long time. I wonder if the array size has any effect on anything. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-06-16 16:55:57 +02:00
José Fonseca	7889469663	scons: Link libGL.so against xcb-dri2. Fixing undefined xcb_dri2_* symbols. Trivial.	2014-06-16 11:24:21 +01:00
Michel Dänzer	d6fd8a9771	r600g/radeonsi: Remove default case from PIPE_COMPUTE_CAP_* switch This way, the compiler warns about unhandled caps. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-06-16 15:56:29 +09:00
Tapani Pälli	5cb8fdb397	docs: update ARB_explicit_uniform_location status + modify release notes for 10.3 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>	2014-06-16 06:49:59 +03:00
Tapani Pälli	f3750a2c86	Enable GL_ARB_explicit_uniform_location in the drivers. v2: enable also for i915 (Ian) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>	2014-06-16 06:49:59 +03:00
Tapani Pälli	e8fb8b1bb3	glsl: parser changes for GL_ARB_explicit_uniform_location Patch adds a preprocessor define for the extension and stores explicit location data for uniforms during AST->HIR conversion. It also sets layout token to be available when having the extension in place. v2: change parser check to require GLSL 330 or enabling GL_ARB_explicit_attrib_location (Ian) v3: fix the check and comment in AST->HIR (Petri) Signed-off-by: Tapani Pälli <tapani.palli@intel.com>	2014-06-16 06:49:59 +03:00
Tapani Pälli	8381f0f0c3	glsl: add enable bit for ARB_explicit_uniform_location Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-16 06:49:59 +03:00
Tapani Pälli	73f7c8636d	mesa: support inactive uniforms in glUniform* functions Support inactive uniforms that have explicit location set in glUniform* functions. v2: remove unnecessary extension check, use new define (Ian) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-16 06:49:59 +03:00
Tapani Pälli	dd2a6519b9	glsl/linker: assign explicit uniform locations Patch refactors the existing uniform processing so explicit locations are taken into account during variable processing. These locations are temporarily stored in gl_uniform_storage before actual locations are set. UNMAPPED_UNIFORM_LOC marks unset location so that we can use 0 as a valid explicit location. When locations are set, UniformRemapTable is first populated with uniforms that have explicit location set (inactive and active ones), rest are put after explicit location slots. v2: introduce define for locations that have not been set yet (Ian) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-16 06:49:59 +03:00
Tapani Pälli	eca9d16048	glsl/linker: initialize explicit uniform locations Patch initializes the UniformRemapTable for explicit locations. This needs to happen before optimizations to make sure all inactive uniforms get their explicit locations correctly. v2: fix initialization bug, introduce define for inactive uniforms (Ian) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-16 06:49:59 +03:00
Tapani Pälli	dadc3d04f0	glsl: add glsl_type::uniform_locations() helper function This function calculates the number of unique values from glGetUniformLocation for the elements of the type. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-16 06:49:59 +03:00
Tapani Pälli	bfe42ddd99	mesa: add new enum MAX_UNIFORM_LOCATIONS Patch adds new implementation dependent value required by the GL_ARB_explicit_uniform_location extension. Default value for user assignable locations is calculated as sum of MaxUniformComponents for each stage. v2: fix descriptor in get_hash_params.py (Petri) v3: simpler formula for calculating initial value (Ian) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-16 06:49:59 +03:00
Tapani Pälli	d1a64aad16	mesa: add enable bit for ARB_explicit_uniform_location Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-16 06:49:59 +03:00
Tapani Pälli	bd5f1202fb	glapi: add GL_ARB_explicit_uniform_location Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-16 06:49:59 +03:00
Kenneth Graunke	5d8e246ac8	i965/vec4: Use the sampler for pull constant loads on Broadwell. We've used the LD sampler message for pull constant loads on earlier hardware for some time, and also were already using it for the FS on Broadwell. This patch makes us use it for Broadwell VS/GS as well. I believe that when I wrote this code in 2012, we still used the data port in some cases, and I somehow neglected to convert it while rebasing. Improves performance in GLBenchmark 2.7 Egypt by 416.978% +/- 2.25821% (n = 17). Many other applications should benefit similarly: this speeds up uniform array access in the VS, which is commonly used for skinning shaders, among other things. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Ben Widawsky <ben@bwidawsk.net> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-15 16:51:05 -07:00
Kenneth Graunke	847abaccc0	i965: Add missing newlines to a few perf_debug messages. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-15 16:51:05 -07:00
Kenneth Graunke	d053a05ef3	i965: Drop Broadwell perf_debugs about missing MOCS that aren't missing. I actually added MOCS support for these things, but forgot to delete the corresponding perf_debug() warnings. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-15 16:51:05 -07:00
Kenneth Graunke	7f256c1c70	i965: Add missing MOCS setup for 3DSTATE_INDEX_BUFFER on Broadwell. Somehow I missed this when adding all of the other MOCS values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-15 16:51:05 -07:00
Kenneth Graunke	d0575d98fc	i965/vec4: Fix dead code elimination for VGRFs of size > 1. When faced with code such as: mov vgrf31.0:UD, 960D mov vgrf31.1:UD, vgrf30.xxxx:UD The dead code eliminator didn't consider reg_offsets, so it decided that the second instruction was writing to the same register as the first one, and eliminated the first one. But they're actually different registers. This fixes INTEL_DEBUG=shader_time for vertex shaders. In the above code, vgrf31.0 represents the offset into the shader_time buffer where the data should be written, and vgrf31.1 represents the actual time data. With a completely undefined offset, results were...unexpected. I think this is probably one of the few cases (maybe only case) where we generate multiple MOVs to a large VGRF. Normally, we just use them as texturing results; the other SEND-from-GRF uses a size 1 VGRF. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79029 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2014-06-15 16:51:05 -07:00
Kenneth Graunke	d6a7a2606e	i965: Add SHADER_OPCODE_SHADER_TIME_ADD to dump_instructions() decode. "shader_time_add" is a lot more informative than "op152". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-15 16:51:04 -07:00
Vinson Lee	4133c7126c	glsl: Fix clang mismatched-tags warnings with glsl_type. Fix clang mismatched-tags warnings introduced with commit `4f5445a45d`. ./glsl_symbol_table.h:37:1: warning: class 'glsl_type' was previously declared as a struct [-Wmismatched-tags] class glsl_type; ^ ./glsl_types.h:86:8: note: previous use is here struct glsl_type { ^ ./glsl_symbol_table.h:37:1: note: did you mean struct here? class glsl_type; ^~~~~ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-15 13:34:18 -07:00
Vinson Lee	32c5544860	mesa/drivers: Fix clang constant-logical-operand warnings. This patch fixes several clang constant-logical-operand warnings such as the following. ../../../../../src/mesa/tnl_dd/t_dd_tritmp.h:130:32: warning: use of logical '||' with constant operand [-Wconstant-logical-operand] if (DO_TWOSIDE || DO_OFFSET || DO_UNFILLED || DO_TWOSTENCIL) ^ ~~~~~~~~~~~ ../../../../../src/mesa/tnl_dd/t_dd_tritmp.h:130:32: note: use '|' for a bitwise operation if (DO_TWOSIDE || DO_OFFSET || DO_UNFILLED || DO_TWOSTENCIL) ^~ | Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-14 23:21:43 -07:00
Chris Forbes	4191cc4861	glsl: Correct more typos Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-15 12:55:16 +12:00
Tom Stellard	ac26a562ed	radeon/compute: Always report at least 1 compute unit Some apps will abort if they detect 0 compute units. This fixes crashes in some OpenCV tests.	2014-06-13 21:32:34 -04:00
Jason Ekstrand	ffe609cc69	meta_blit: properly compute texture width for the CopyTexSubImage fallback Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-13 13:09:21 -07:00
Rob Clark	06e9536e5f	freedreno/a3xx: vtx formats Add support for more vertex buffer formats. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-06-13 15:20:34 -04:00
Rob Clark	ba6a490bbc	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-06-13 15:20:34 -04:00
Rob Clark	3394900dd3	freedreno: try for more squarish tile dimensions Worth about ~0.5fps in xonotic, for example. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-06-13 15:20:34 -04:00
Rob Clark	6aeeb706d2	freedreno: fix for null textures Some apps seem to give us a null sampler/view for texture slots which come before the last used texture slot. In particular 0ad triggers this. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-06-13 15:20:34 -04:00
Roland Scheidegger	2ea8e2fccf	llvmpipe: increase number of queries which can be binned simultaneously to 64 Gallium (but not OpenGL) does allow nesting of queries, but there's no limit specified (d3d10 has no limit either). Nevertheless, for practical purposes we need some limit in llvmpipe, otherwise we'd need more complex handling of queries as we need to keep track of all binned queries (this only affects queries which gather data past setup). A limit of 16 is too small though, while 64 would suffice. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-06-13 20:08:39 +02:00
Bruno Jiménez	03aab2af16	radeon/compute: Implement PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS v2: Add RADEON_INFO_ACTIVE_CU_COUNT as a define, as suggested by Tom Stellard Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-13 10:59:30 -04:00
Neil Roberts	b8d15ca5e8	Remove _mesa_is_type_integer and _mesa_is_enum_format_or_type_integer The comment for _mesa_is_type_integer is confusing because it says that it returns whether the type is an “integer (non-normalized)” format. I don't think it makes sense to say whether a type is normalized or not because it depends on what format it is used with. For example, GL_RGBA+GL_UNSIGNED_BYTE is normalized but GL_RGBA_INTEGER+GL_UNSIGNED_BYTE isn't. If the normalized comment is just a mistake then it still doesn't make much sense because it is missing the packed-pixel types such as GL_UNSIGNED_INT_5_6_5. If those were added then it effectively just returns type != GL_FLOAT. That function was only used in _mesa_is_enum_format_or_type_integer. This function effectively checks whether the format is non-normalized or the type is an integer. I can't think of any situation where that check would make sense. As far as I can tell neither of these functions have ever been used anywhere so we should just remove them to avoid confusion. These functions were added in `9ad8f431b2`. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-06-13 15:54:46 +01:00
Bruno Jiménez	2a0dffa0c9	clover: query driver for the max number of compute units Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-06-12 19:09:32 -04:00
Bruno Jiménez	8f4d37889c	gallium: Add PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-06-12 19:08:06 -04:00
Bruno Jiménez	4f70d83089	r600g/compute: solve a bug introduced by `2e01b8b440` That commit made it possible for items to be placed one just after the other when their size was a multiple of ITEM_ALIGNMENT. But compute_memory_prealloc_chunk still tried to leave a gap between items. As a result, we got an infinite loop when trying to add an item which would leave no space between itself and the next item. Fixes piglit test: cl-custom-r600-create-release-buffer-bug And the test for alignment I have just sent: http://lists.freedesktop.org/archives/piglit/2014-June/011135.html Sorry about this. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-12 15:52:08 -04:00
Niels Ole Salscheider	607bc89970	egl/gallium: Set defines for supported APIs when using automake This fixes automake builds which are broken since `b52a530ce2`. v2: This patch also adds the FEATURE_* defines back to targets/egl-static for Android and Scons that have been removed in the mentioned commit. Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79885 Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-12 18:07:20 +01:00
Emil Velikov	816d392b58	configure: correctly autodetect xvmc/vdpau/omx Commit `e62b7d38a1` (configure: autodetect video state-trackers when non swrast driver is present) added a check that caused the autodetection to be omitted when we have the swrast gallium driver. Whereas it should have skipped the VL targets when only swrast was selected. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79907 Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-06-12 18:07:20 +01:00
Courtney Goeltzenleuchter	0406f59eeb	mesa: glx: Reduce error log level The code that parses LIBGL_DRIVERS_PATH was printing an error for every attempted dlopen. It's not an error to have to check multiple items in the path, only an error if no suitable library is found. Reduced the load error to a warning to match behavior of dynamic linker. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-12 10:19:00 -06:00
Brian Paul	33f273778b	cso: fix stream-out clean up in cso_release_all() Use the has_streamout flag as we do elsewhere to check if we need to call pipe->set_stream_output_targets(). The driver might implement the set_stream_output_targets() function, but not for all hardware configurations. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-06-12 13:23:56 +01:00
Neil Roberts	765efeef88	i965: Set the fast clear color value for texture surfaces When a multisampled texture is used for sampling the fast clear color value needs to be programmed into the surface state. This was being left as all zeroes so if the surface was cleared to a value other than black then it wouldn't work properly. This doesn't matter for single-sample textures because in that case the MCS buffer is resolved before it is used as a texture source. https://bugs.freedesktop.org/show_bug.cgi?id=79729 Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-06-12 11:24:04 +01:00
Chris Forbes	2c79aa8272	glsl: Fix typo in comment. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-12 21:19:24 +12:00
Kenneth Graunke	3e71258023	i965: Fix disassembly of BLORP clear programs. Too many levels of indirection. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-12 00:56:08 -07:00
Kenneth Graunke	b207caf9bc	i965/fs: Move FB write default state mashing in a level. We only need to alter the default state if we're emitting MOVs for header related fields. So, we can simply move the push/pop of state in to the if (header_present) block, bypassing it in the common case. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903	2014-06-12 00:56:08 -07:00
Kenneth Graunke	a2ad771671	i965: Fix Haswell discard regressions since Gen4-5 line AA fix. In commit `dc2d3a7f5c`, Iago accidentally moved fire_fb_write() above the brw_pop_insn_state(), which caused the SEND to lose its predication and change from WE_normal to WE_all. Haswell uses predicated SENDs for discards, so this broke Piglit's tests for discards. We want the Gen4-5 MOV to be uncompressed, unpredicated, and unmasked, but the actual FB write itself should respect those. So, pop state first, and force it again around the single MOV. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903	2014-06-12 00:56:08 -07:00
Michel Dänzer	be5e5b6c93	gbm: Remove 64x64 restriction from GBM_BO_USE_CURSOR GBM_BO_USE_CURSOR_64X64 is kept so that existing users of GBM continue to build, but it no longer rejects widths or heights other than 64. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79809 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-12 16:13:39 +09:00
Matt Turner	2c8520c03d	i965: Use brw->gen in some generation checks. Will simplify the automated conversion if we want to allow compiling the driver for a single generation. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-06-11 20:57:10 -07:00
Matt Turner	f51a7e00da	i965/fs: Clean up tabs in brw_fs_cse.cpp. I'm adding vec4 CSE, and I want to diff the files.	2014-06-11 20:09:22 -07:00
Matt Turner	4bb9d16fd3	configure.ac: Simplify DUSE_EXTERNAL_DXTN_LIB logic. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-11 20:09:22 -07:00
Matt Turner	026d1fe986	configure.ac: Alphabetize AC_CONFIG_FILES. This isn't supposed to be difficult. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-11 20:09:22 -07:00
Matt Turner	180e60df65	configure.ac: Remove single quotes to fix syntax highlighting. Please stop adding them. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-11 20:09:22 -07:00
Robert Bragg	c6f118484c	meta: save and restore swizzle for _GenerateMipmap This makes sure to use a no-op swizzle while iteratively rendering each level of a mipmap otherwise we may lose components and effectively apply the swizzle twice by the time these levels are sampled. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-11 21:38:01 +01:00
Ian Romanick	63117ac329	i965/vec4: Emit smarter code for b2f of a comparison Previously we would emit the comparison, emit an AND to mask off extra bits from the comparison result, then convert the result to float. Now, do the comparison, then use a cleverly constructed SEL to pick either 0.0f or 1.0f. No piglit regressions on Ivybridge. total instructions in shared programs: 1642311 -> 1639449 (-0.17%) instructions in affected programs: 136533 -> 133671 (-2.10%) GAINED: 0 LOST: 0 Programs that are affected appear to save between 1 and 5 instructions (just by skimming the output from shader-db report.py). v2: s/b2i/b2f/ in commit subject (noticed by Chris Forbes). Remove extraneous fix_3src_operand (suggested by Matt). The latter change required swapping the order of the operands and using predicate_inverse. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-11 12:00:24 -07:00
Ian Romanick	be0452b049	i965/vec4: Silence a couple unused parameter warnings brw_vec4_visitor.cpp:2717:1: warning: unused parameter 'ir' [-Wunused-parameter] brw_vec4_visitor.cpp:2723:1: warning: unused parameter 'ir' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-11 12:00:20 -07:00
Ian Romanick	014d45f137	glsl: Store gl_uniform_driver_storage::format as the actual type And delete the incorrect comment. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com>	2014-06-11 11:26:05 -07:00
Dave Airlie	0d89448662	softpipe: fix pt->resource assert placement oops meant to move this. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-11 14:03:11 +10:00
Dave Airlie	9bc12ef241	softpipe: enable AMD_vertex_shader_layer. This passes tests now on softpipe. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-11 12:21:21 +10:00
Dave Airlie	8dede2fa6c	softpipe: enable GLSL 3.30 support. This enables GL3.3 on softpipe. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-11 12:21:17 +10:00
Dave Airlie	c82d227edd	softpipe: bump the softpipe geometry limits This just aligns the limits with llvmpipe. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-11 12:21:08 +10:00
Dave Airlie	7ea04f089b	tgsi_exec: use defines for max inputs/outputs This fixes the limits for GL 3.2, and subsequently fixes some segfaults in some varying packing tests and max varying tests after the limits bumped. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-11 12:21:04 +10:00
Dave Airlie	740d5bed77	softpipe: add layered rendering support. This adds support for GL 3.2 layered rendering to softpipe. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-11 12:20:30 +10:00
Dave Airlie	dc8fc39ada	softpipe: add layering to the surface tile cache. This adds the layer info to the tile cache. This changes clear_flags to be dynamically allocated as MAX_LAYERS seems like a too big step. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-11 12:20:30 +10:00
Dave Airlie	5a57248541	softpipe: add depth clamping support. (v2) This passes the piglit depth clamp tests. this is required for GL 3.2. v2: move min/max up one level, could go further, thanks to Roland for suggestion. v1: Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-11 12:20:07 +10:00
Dave Airlie	a4670de0a0	tgsi/gs: bound max output vertices in shader This limits the number of emitted vertices to the shader's max output vertices, and avoids us writing things into memory that isn't big enough for it. Reviewed-by: Zack Rusin <zackr@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-06-11 12:19:37 +10:00
Jon Ashburn	10e8d55799	i965: Add GPU BLIT of texture image to PBO in Intel driver Add Intel driver hook for glGetTexImage to accelerate the case of reading texture image into a PBO. This case gets huge performance gains by using GPU BLIT directly to PBO rather than GPU BLIT to temporary texture followed by memcpy. No regressions on Piglit tests with Intel driver. Performance gain (1280 x 800 FBO, Ivybridge): glGetTexImage + glMapBufferRange with patch 1.45 msec glGetTexImage + glMapBufferRange without patch 4.68 msec v3: (by Kenneth Graunke) - Fix compile after Eric's change to drop the tiling argument to intel_miptree_create_for_bo. - Add GL_TEXTURE_3D to blacklisted texture targets to prevent Piglit regressions. - Squash in several whitespace and coding style fixes.	2014-06-10 18:36:44 -07:00
Kenneth Graunke	237aac39b1	i965: Invalidate live intervals when inserting Gen4 SEND workarounds. We need to invalidate the live intervals when inserting new instructions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2014-06-10 16:38:27 -07:00
Kenneth Graunke	ecc78eab11	i965: Don't use the head sentinel as an fs_inst in Gen4 workaround code. When walking backwards, we want to stop at the head sentinel, which is where scan_inst->prev->prev == NULL, not scan_inst->prev == NULL. Fixes random crashes, as well as valgrind errors. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2014-06-10 16:38:27 -07:00
Kenneth Graunke	fc19c4aaf1	meta: Label the meta GLSL clear program. Giving the meta clear program a meaningful name makes it easier to find in output such as INTEL_DEBUG=fs or INTEL_DEBUG=shader_time. We already did so for integer programs, but neglected to label the primary program. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-10 16:38:27 -07:00
Kenneth Graunke	2bcd24c9f0	i965/fs: Combine generate_math[12]_gen6 methods. These used to call different math emitters (brw_math vs. brw_math2). Now that they both call gen6_math, they're virtually identical. When unrolling SIMD16 to multiple SIMD8 operations, we should take care not to apply sechalf to brw_null_reg for src1. Otherwise, we'd end up with BRW_ARF_NULL + 1 as the register number, and I'm not sure if that's valid. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-10 16:38:27 -07:00
Kenneth Graunke	35e48bd618	i965/fs: Drop the generate_math[12]_gen7 methods. These functions are basically identical, so we should combine them. However, they're so trivial, we may as well just fold them into their only call sites. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-10 16:38:26 -07:00
Kenneth Graunke	f3ddd71f28	i965/vec4: Combine generate_math[12]_gen6 methods. These are trivial to combine: we should just avoid checking the second operand if it's brw_null_reg. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-10 16:38:26 -07:00
Kenneth Graunke	5260a26e92	i965/vec4: Drop the generate_math2_gen7() method. It's now a single line of code, so we may as well fold it into the caller. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-10 16:38:26 -07:00
Kenneth Graunke	b003fc265f	i965: Rename brw_math to gen4_math. Usually, I try to use "brw" for functions that apply to all generations, and "gen4" for dead end/legacy code that is only used on Gen4-5. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-10 16:38:26 -07:00
Kenneth Graunke	de65ec2fde	i965: Split Gen4-5 and Gen6+ MATH instruction emitters. Our existing functions, brw_math and brw_math2, had unclear roles: Gen4-5 used brw_math for both unary and binary math functions; it never used brw_math2. Since operands are already in message registers, this is reasonable. Gen6+ used brw_math for unary math functions, and brw_math2 for binary math functions, duplicating a lot of code. The only real difference was that brw_math used brw_null_reg() for src1. This patch improves brw_math2's assertions to allow both unary and binary operations, renames it to gen6_math(), and drops the Gen6+ code out of brw_math(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-10 16:38:26 -07:00
Kenneth Graunke	7b9cf79790	i965: Make src_reg::equals() take a constant reference, not a pointer. This is more typical C++ style. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-06-10 16:38:26 -07:00
Kenneth Graunke	000f4a33c0	i965: Don't set the "switch" flag on control flow instructions on Gen6+. Thread switching on control flow instructions is a documented workaround for Gen4-5 errata. As far as I can tell, it hasn't been needed since Sandybridge. Thread switching is not free, so in theory this may help performance slightly. Flow control instructions with the "switch" flag cannot be compacted, so removing it will make these instructions compactable. (Of course, we still have to implement compaction for flow control instructions...) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-10 16:38:26 -07:00
Kenneth Graunke	3a439534de	i965/fs: Allow CSE on math opcodes on Gen6+. total instructions in shared programs: 2081469 -> 2081248 (-0.01%) instructions in affected programs: 22606 -> 22385 (-0.98%) No programs were hurt by this patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-10 16:38:25 -07:00
Thomas Helland	2c9a1518a1	glsl: Remove unused include in expr.flatt. Found with IWYU. Compile-tested on my Ivy-bridge system. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Thomas Helland <thomashelland90@gmail.com>	2014-06-10 13:05:52 -07:00
Thomas Helland	10e00611c2	glsl: Remove unused include in ir.cpp Found with IWYU. Compile-tested on my Ivy-bridge system. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Thomas Helland <thomashelland90@gmail.com>	2014-06-10 13:05:51 -07:00
Thomas Helland	8e1e68119c	glsl: Remove unused include from ir_constant_expression.cpp Found with IWYU. Compile-tested on my Ivy-bridge system. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Thomas Helland <thomashelland90@gmail.com>	2014-06-10 13:05:51 -07:00
Thomas Helland	068d30655c	glsl: Remove unused include from ir_basic_block.cpp Found with IWYU. Compile-tested on my Ivy-bridge system. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Thomas Helland <thomashelland90@gmail.com>	2014-06-10 13:05:51 -07:00
Thomas Helland	b6e68fc9fb	glsl: Remove unused include from hir_field_selection.cpp Found with IWYU. Compile-tested on my Ivy-bridge system Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Thomas Helland <thomashelland90@gmail.com>	2014-06-10 13:05:51 -07:00
Thomas Helland	4f5445a45d	glsl: Remove unused include from glsl_symbol_table.h Only function-defs use glsl_type so forward declare instead. Compile-tested on my Ivy-bridge system. IWYU also suggests removing #include <new>, and this compiles fine. I'm not familiar enough with memory management in C/C++ that I feel comfortable removing this. Insights would be appreciated. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Thomas Helland <thomashelland90@gmail.com>	2014-06-10 13:05:51 -07:00
Thomas Helland	38ffbf459b	glsl: Remove unused include from glsl_types.cpp Found with IWYU. Compile-tested on my Ivy-bridge system. Added comment about core.h being used for MAX2. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Thomas Helland <thomashelland90@gmail.com>	2014-06-10 13:05:51 -07:00
Thomas Helland	22f5a0b277	glsl: Remove unused include from builtin_variables.cpp Found with IWYU. Compile-tested on my Ivy-bridge system. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Thomas Helland <thomashelland90@gmail.com>	2014-06-10 13:05:51 -07:00
Thomas Helland	6f385d9371	glsl: Remove unused include in ast_to_hir.cpp Found with IWYU. Comment says it's for struct gl_extensions. Grepping for gl_extensions shows no uses. Tested by compiling on my Ivy-bridge system. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Thomas Helland <thomashelland90@gmail.com>	2014-06-10 13:05:51 -07:00
Thomas Helland	5b83d5e2f9	glsl: Remove unused includes in link_uniform_block_active_visitor.h Found with IWYU, compile-tested on my Ivy-bridge system. This is not used in the header, and is included in the source. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Thomas Helland <thomashelland90@gmail.com>	2014-06-10 13:05:51 -07:00
Thomas Helland	eac09a4e1d	glsl: Remove unused includes in link_uniform_init. Found with IWYU, confirmed with grepping for "hash" and "symbol". No negative effects on compilation. IWYU also reported core.h and linker.h could be removed, but I'm unsure if those are false positives. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Thomas Helland <thomashelland90@gmail.com>	2014-06-10 13:05:51 -07:00
Matt Turner	4787c25a60	i965: Replace open-coded linked list with exec_list. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 13:05:51 -07:00
Matt Turner	1951418038	glsl: Add an exec_node_init() function, usable from C. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 13:05:51 -07:00
Matt Turner	b123c6e96d	glsl: Make foreach macros usable from C by adding struct keyword. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 13:05:51 -07:00
Matt Turner	d4ce0109de	glsl: Make exec_list members just wrap the C API. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 13:05:51 -07:00
Matt Turner	b10ad648a1	glsl: Make exec_node members just wrap the C API. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 13:05:51 -07:00
Matt Turner	d691f0de72	glsl: Add C API for exec_list. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 13:05:50 -07:00
Matt Turner	47a77ba839	glsl: Add C API for exec_node. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 13:05:50 -07:00
Matt Turner	5f90f2ee59	glsl: Move definition of exec_list member functions out of the struct. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 13:05:50 -07:00
Matt Turner	cb5a0e59cf	glsl: Move definition of exec_node member functions out of the struct. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 13:05:50 -07:00
Bruno Jiménez	112c1b14ed	r600g/compute: Use %u as the unsigned format This fixes an issue when running cl-program-bitcoin-phatk piglit test where some of the inputs have negative values Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-10 15:29:57 -04:00
Bruno Jiménez	2e01b8b440	r600g/compute: align items correctly Now, items whose size is a multiple of 1024 dw won't leave 1024 dw between itself and the following item The rest of the cases is left as it was Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-10 15:29:57 -04:00
Bruno Jiménez	df1dd8bf22	r600g/compute: Cleanup of compute_memory_pool.h Removed compute_memory_defrag declaration because it seems to be unimplemented. I think that this function would have been the one that solves the problem with fragmentation that compute_memory_finalize_pending has. Also removed comments that are already at compute_memory_pool.c Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-10 15:29:57 -04:00
Bruno Jiménez	1d6384318e	r600g/compute: Tidy a bit compute_memory_finalize_pending Explanation of the changes, as requested by Tom Stellard: Let's take need after is calculated as item->size_in_dw+2048 - (pool->size_in_dw - allocated) BEFORE: If need is positive or 0: we calculate need += 1024 - (need % 1024), which is like ceiling to the nearest multiple of 1024, for example 0 goes to 1024, 512 goes to 1024 as well, 1025 goes to 2048 and so on. So now need is always positive, we do compute_memory_grow_pool, check its output and continue. If need is negative: we calculate need += 1024 - (need % 1024), in this case we will have negative numbers, and if need is [-1024:-1] 0, so now we take the else, recalculate need as need = pool->size_in_dw / 10 and need += 1024 - (need % 1024), we do compute_memory_grow_pool, check its output and continue. AFTER: If need is positive or 0: we jump the if, calculate need += 1024 - (need % 1024) compute_memory_grow_pool, check its output and continue. If need is negative: we enter the if, and need is now pool->size_in_dw / 10. Now we calculate need += 1024 - (need % 1024) compute_memory_grow_pool, check its output and continue. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-10 15:29:57 -04:00
Bruno Jiménez	39bd08efdd	r600g/compute: Add more NULL checks In this case, NULL checks are added to compute_memory_grow_pool, so it returns -1 when it fails. This makes it necessary to handle such cases in compute_memory_finalize_pending when it is needed to grow the pool Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-10 15:29:57 -04:00
Bruno Jiménez	833b550773	r600g/compute: Adding checks for NULL after CALLOC Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-10 15:29:57 -04:00
Bruno Jiménez	fd943fa6c2	r600g/compute: Fixing a typo and some indentation Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-10 15:29:56 -04:00
Cody Northrop	3eef571cbc	mesa: Fix substitution of large shaders Signed-off-by: Cody Northrop <cody@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-06-10 10:45:31 -06:00
Michel Dänzer	2d399bb183	configure: Only check for OpenCL without LLVM when the latter is certain LLVM is enabled by default for some architectures, but the test was failing before that. Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-06-10 10:56:58 -04:00
David Heidelberger	b0fd54900c	r600g,radeonsi: implement PIPE_QUERY_TIMESTAMP_DISJOINT v2 Marek: set the query result correctly Signed-off-by: David Heidelberger <david.heidelberger@ixit.cz> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-06-10 13:20:13 +02:00
Jon TURNEY	bd526ec9e1	configure: Always default to --enable-driglx-direct Always default to --enable-driglx-direct, now that will build driswrast, but won't try to use dri[123] on platforms which don't have that. Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-10 10:32:56 +01:00
Jon TURNEY	f647a722da	glx: Fix build in GLX_DIRECT_RENDERING !GLX_USE_APPLEGL !GLX_USE_DRM case Some untangling to fix building in the dri_platform=none, --enable-driglx-direct case, where only driswrast can be used. Turn the test for including the glXGetScreenDriver()/glXGetScreenDriver() interface used by xdriinfo from !GLX_USE_APPLEGL into a positive form, as it is only useful when dri_platform=drm Add additional GLX_USE_DRM tests so DRI[123] renderers are only used when dri_platform=drm Note that swrast and indirect must still be disabled in the APPLEGL case at the moment, which makes things more complex than they need to be. More untangling is needed to allow that Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-10 10:32:22 +01:00
Kristian Høgsberg	7a45274477	i965: Make gen7_pi field of brw_instruction use unsigned instead of GLuint Nothing else uses GL-types here. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-09 21:17:19 -07:00
Kristian Høgsberg	cefa265761	i965: Don't include mtypes.h in brw_disasm.c It's not used. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-09 21:17:19 -07:00
Matt Turner	8e115b03cf	i965/fs: initialize src as reg_undef for texture opcodes on Gen4. Untested.	2014-06-09 21:08:05 -07:00
Tapani Pälli	198204c9c5	i965/fs: initialize src as reg_undef for texture opcodes on Gen5/6. Commit `07af0ab` changed fs_inst to have 0 sources for texture opcodes in emit_texture_gen5 (Ironlake, Sandybridge) while fs_generator still uses a single source from brw_reg struct. Patch sets src as reg_undef which matches the behavior before the constructor got changed. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79534	2014-06-09 21:08:05 -07:00
Emil Velikov	5cb1cad0ae	egl/dri2: do not leak dri2_dpy->driver_name Originally all hardware drivers duplicate the driver_name string from an external source, while for the software rasterizer we set it to "swrast". Follow the example set by hw drivers this way we can free the string at dri2_terminate(). v2: Use strdup over strndup. Suggested by Ilia Mirkin. v3: Handle platform_drm in a similar manner. Cleanup swrast driver_name in error path. Cc: Chia-I Wu <olv@lunarg.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-09 22:56:00 +01:00
Emil Velikov	c153b1f39b	egl/dri2/x11: use standard strndup function Using a custom version of the function brings no benefit. Cc: Chad Versace <chad.versace@linux.intel.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-06-09 22:55:51 +01:00
Adrian Negreanu	357a8b6f33	android, dricore: undefined reference to _mesa_streaming_load_memcpy _mesa_streaming_load_memcpy is defined in main/streaming-load-memcpy.c I'm adding it to the dricore lib Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-06-09 22:51:44 +01:00
Adrian Negreanu	6eb3888c86	android, mesa_gen_matypes: pull in timespec POSIX definition This fixes: include/c11/threads_posix.h: In function 'cnd_timedwait': include/c11/threads_posix.h:140:21: error: storage size of 'abs_time' isn't known Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-06-09 22:51:34 +01:00
Adrian Negreanu	6980cae6ae	android, egl: typo dri2_fallback_pixmap_surface -> dri2_fallback_create_pixmap_surface I used commit `bc8b07a6` as reference, and only the droid_display_vtbl had this issue. This fixes: src/egl/drivers/dri2/platform_android.c:641:29: error: 'dri2_fallback_pixmap_surface' undeclared here (not in a function) Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-06-09 22:51:17 +01:00
Adrian Negreanu	4dc5545eff	android, egl: add correct drm include for libmesa_egl_dri2 Fixes: src/egl/drivers/dri2/platform_android.c:38: include/GL/internal/dri_interface.h:51:17: fatal error: drm.h: No such file or directory Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-06-09 22:51:10 +01:00
Adrian Negreanu	0048483f73	android: add src/gallium/auxiliary as include path for libmesa_dricore This fixes: In file included from /home/adrian/workspace/mesa/mesa-master.git/src/mesa/vbo/vbo_exec_api.c:445:0: /home/adrian/workspace/mesa/mesa-master.git/src/mesa/vbo/vbo_attrib_tmp.h:28:38: fatal error: util/u_format_r11g11b10f.h: No such file or directory Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-06-09 22:51:02 +01:00
Adrian Negreanu	a49ebfab1d	android: add libloader to libGLES_mesa and libmesa_egl_dri2 This fixes src/egl/drivers/dri2/platform_android.c:664: error: undefined reference to 'loader_set_logger' src/egl/drivers/dri2/platform_android.c:678: error: undefined reference to 'loader_get_driver_for_fd' Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-06-09 22:50:53 +01:00
Adrian Negreanu	aba0f152be	android: adapt to the megadriver mechanism Fixes linker error: ld: .../libmesa_dri_common_intermediates/libmesa_dri_common.a(dri_util.o): in function globalDriverAPI:dri_util.c(.data.rel+0x0): error: undefined reference to 'driDriverAPI' As an example, you can see that mesa_dri_drivers also uses common/libmegadriver_stub (src/mesa/drivers/dri/Makefile.am) The _stub part might be confusing, but it actually provides the dri-driver shared lib constructor, megadriver_stub_init, which will later on load the real platform dependent part and call __driDriverGetExtensions_<platform> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-06-09 22:50:41 +01:00
Adrian Negreanu	eb3f80dbba	add megadriver_stub_FILES So that android part can also use $(megadriver_stub_FILES) Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Adrian Negreanu <adrian.m.negreanu@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-06-09 22:49:54 +01:00
Emil Velikov	c21fca8bf2	scons: remove dri-i915 build target Unmaintained and broken. Cc: Jakob Bornecrantz <jakob@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Jakob Bornecrantz <jakob@vmware.com>	2014-06-09 22:46:17 +01:00
Emil Velikov	93257a56b5	configure: error out when building opencl without LLVM Cc: Tom Stellard <thomas.stellard@amd.com> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-06-09 22:45:05 +01:00
Abdiel Janulgue	6f9f916b9b	i965/disasm: Properly debug negate source modifier for logical instructions Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-06-09 11:19:50 -07:00
Abdiel Janulgue	c17db7537f	i965/vec4: skip copy-propagate for logical instructions with negated src entries The negation source modifier on src registers has changed meaning in Broadwell when used with logical operations. Don't copy propagate when negate src modifier is set and when the destination instruction is a logical op. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-06-09 11:19:48 -07:00
Abdiel Janulgue	609d00e13e	i965/fs: skip copy-propagate for logical instructions with negated src entries The negation source modifier on src registers has changed meaning in Broadwell when used with logical operations. Don't copy propagate when negate src modifier is set and when the destination instruction is a logical op. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-06-09 11:19:45 -07:00
Abdiel Janulgue	a66660d2b7	i965/fs: Refactor check for potential copy propagated instructions. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>	2014-06-09 11:19:39 -07:00
Brian Paul	1e150ca696	docs: add link to 10.1.5 on news page	2014-06-09 06:13:41 -07:00
Brian Paul	c53550586e	docs: fix version number in 10.2.1 release notes	2014-06-09 06:10:35 -07:00
Brian Paul	bedeb5433b	docs: import the 10.1.5 release notes	2014-06-09 06:10:18 -07:00
Chris Forbes	5bbb028ef3	glsl: Validate aux storage qualifier combination with other qualifiers. We've been allowing `centroid` and `sample` in all kinds of weird places where they're not valid. Insist that `sample` is combined with `in` or `out`; and that `centroid` is combined with `in`, `out`, or the deprecated `varying`. V2: Validate this in a more sensible place. This does require an extra case for uniform blocks members and struct members, though, since they don't go through the normal path. V3: Improve error message wording; eliminate redundant error generation for inputs in VS or outputs in FS. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-10 10:09:31 +12:00
Iago Toral Quiroga	c75f827f12	i965: Ensure that we end instruction streams properly. Threads must terminate with a SEND message to a particular shared function, such as a URB write or FB write, so the instruction stream really shouldn't ever end in an IF/ELSE/ENDIF or similar block structure. However, if the instruction stream (incorrectly) ends in a block structure the last block's end pointer will not be set, leading to a crash later on in fs_live_variables::setup_def_use(). It is better to detect this earlier, so assert on that. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-09 12:00:04 +02:00
Iago Toral Quiroga	dc2d3a7f5c	i965/fs: Add Gen < 6 runtime checks for line antialiasing. In Gen < 6 the hardware generates a runtime bit that indicates whether AA data has to be sent as part of the framebuffer write SEND message. This affects the specific case where we have setup antialiased line rendering and we render polygons which have one face setup in GL_LINE mode (line antialiasing will be used) and the other one in GL_FILL mode (no line antialiasing needed). Currently we are not doing this runtime test and instead we always send AA data, which produces incorrect rendering of the GL_FILL face of the polygon in the aforementioned scenario (verified in ironlake and gm45). In Gen4 this is, likely, a regression introduced with commit `098acf6c84`. In Gen5 this has never worked properly. Gen > 5 are not affected by this. The patch fixes the problem by adding the appropriate runtime check and adjusting the framebuffer write message accordingly in the conflictive scenario. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78679 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-09 11:48:49 +02:00
Iago Toral Quiroga	6e61892aea	i965/fs: Let the gen < 8 generator know about runtime_check_aads_emit In gen < 6 we need to produce conditional code based on this flag when doing framebuffer writes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-09 11:47:38 +02:00
Chris Forbes	be1b5724ab	docs: Mark off ARB_compressed_texture_pixel_storage .. and add to release notes for 10.3 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 07:42:45 +12:00
Chris Forbes	8a1a4855cf	mesa: Add extension enable for ARB_compressed_texture_pixel_storage Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 07:42:45 +12:00
Chris Forbes	b57138b57a	mesa: Add pixel storage support for GetCompressedTexImage Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 07:42:45 +12:00
Chris Forbes	be30766f56	mesa: Compute proper strides for compressed texture pixel storage. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 07:42:45 +12:00
Chris Forbes	8d29569c25	mesa: Extract computation of compressed pixel store params This logic is reusable across CompressedTexImage and GetCompressedTexImage; the strides calculated will also be needed in the PBO validation functions to ensure that the referenced range of bytes is valid. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 07:42:44 +12:00
Chris Forbes	d6e60cb504	mesa: Emit errors for inconsistent compressed pixel store state V2: Use bool rather than GLboolean for internal function Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 07:42:44 +12:00
Chris Forbes	75a5823749	mesa: Add new pixel pack/unpack state for ARB_compressed_texture_pixel_storage Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 07:38:42 +12:00
Chris Forbes	1fca84e7a0	tests: Add new enum strings for ARB_compressed_texture_pixel_storage Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 07:38:40 +12:00
Chris Forbes	cef3f9b909	glapi: Add XML infrastructure for ARB_compressed_texture_pixel_storage Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 07:38:38 +12:00
Chris Forbes	8f63559c93	mesa: Make CompressedTexSubImage errors more consistent Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 07:38:36 +12:00
Chris Forbes	4119b0eaee	mesa: Trim down PixelStorei implementation Move _mesa_error call for INVALID_VALUE to one place. Remove checks for previous value matching -- this was important when we were flushing vertices before the update, but that hasn't happened for a long time now. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-10 07:38:27 +12:00
José Fonseca	eb58aa9cf0	mesa/main: Prevent segfault on glGetIntegerv(GL_ATOMIC_COUNTER_BUFFER_BINDING). A recent ApiTrace change, that tries to dump more buffer state causes Mesa from my distro (10.1.4) to segfault here. I haven't actually confirmed this fixes it (I can't repro on master), but it seems a good idea to be defensive here anyway. Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-08 09:43:14 +01:00
Iago Toral Quiroga	8873120f9f	Revert "i965: Move brw_land_fwd_jump() to compilation unit of its use." This reverts commit `f3cb2e6ed7`. brw_land_fwd_jump() is convenient wherever we produce JMPI instructions and we will use JMPI to implement framebuffer writes that involve line antialiasing in gen < 6. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-07 21:32:35 -07:00
Kenneth Graunke	220e208329	i965: Fix else and brace placement in brw_eu_emit.c. I'm making a lot of changes to this area, and I figured I may as well not conflate these trivial changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-07 21:30:03 -07:00
Kenneth Graunke	1f3735bff0	i965: Drop the remaining default predication whacking. With my earlier cleaning in place (see git log brw_eu_emit.c), nothing relies on the instruction emitters for IF/WHILE/JMPI disabling predication. Drop it in favor of making callers do the right thing explicitly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-07 21:30:03 -07:00
Kenneth Graunke	8a314a784c	i965/sf: Use brw_set_default_predicate_control(). This is a bit tidier than poking at p->current directly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-07 21:30:03 -07:00
Ilia Mirkin	bd7dd3ed06	gk110/ir: fix bfind emission There is a short-immediate version as well, but it should never end up getting used since it would have gotten folded earlier. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-07 16:39:19 -04:00
Ian Romanick	40500ebb20	docs: Add MD5 checksum, etc. for 10.2.1 release Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit `70ce1031e7`)	2014-06-06 23:30:58 -07:00
Ian Romanick	3581e5ef89	docs: Add initial 10.2.1 release notes Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit `8c4845d29b`)	2014-06-06 23:30:17 -07:00
Vinson Lee	82c577acfa	configure.ac: Do not use Pthreads with MinGW. Match the behavior of the SCons MinGW build. This patch also fixes these build errors. CC glapi_entrypoint.lo glapi_entrypoint.c: In function 'init_glapi_relocs_once': glapi_entrypoint.c:341:4: error: unknown type name 'pthread_once_t' static pthread_once_t once_control = PTHREAD_ONCE_INIT; ^ glapi_entrypoint.c:341:41: error: 'PTHREAD_ONCE_INIT' undeclared (first use in this function) static pthread_once_t once_control = PTHREAD_ONCE_INIT; ^ glapi_entrypoint.c:341:41: note: each undeclared identifier is reported only once for each function it appears in glapi_entrypoint.c:342:4: error: implicit declaration of function 'pthread_once' [-Werror=implicit-function-declaration] pthread_once( & once_control, init_glapi_relocs ); ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-06 22:25:13 -07:00
Ilia Mirkin	7a67318794	gk110/ir: fix emitting constbuf file index Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-07 00:30:22 -04:00
Ian Romanick	637132645a	docs: Add MD5 checksum, etc. for 10.1 release Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit `28d41e409d`)	2014-06-06 21:19:27 -07:00
Ilia Mirkin	4a3a71a183	gk110/ir: emit saturate flag on fadd when needed Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-06 23:32:29 -04:00
Ilia Mirkin	9fef8b3d81	gk110/ir: fix slct emission Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-06 22:54:21 -04:00
Timothy Arceri	1454f894ff	st/mesa: remove extra calculation of sampler count This code was originally introduced to fix https://bugs.freedesktop.org/show_bug.cgi?id=53617. The comment says you need to pass NULL in order to unref old views however cso_set_sampler_views() already takes care of old views with the second for loop. Also as of `2355a64414` cso_set_sampler_views() passes the max of the old and new views to the driver for all state trackers making this code obsolete. Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-06-07 12:21:19 +10:00
Ilia Mirkin	d588a4919b	gk110/ir: fix interp mode emission Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-06 20:33:06 -04:00
Ilia Mirkin	ed1b9e5721	gk110/ir: fix ISAD emission with register args Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-06 19:52:49 -04:00
Ilia Mirkin	6e046508a1	gk110/ir: fix quadon opcode emission Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-06 19:27:28 -04:00
Ilia Mirkin	ca65fc418f	nvc0: don't bother trying to set up compute for gk110+ The nouveau fw currently prints a bunch of errors. No point in seeing those all the time, esp since compute doesn't really work in the first place. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-06 18:25:35 -04:00
Ilia Mirkin	b9ec766bd0	gk110: add in forgotten code for gk110 isa Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-06 18:25:32 -04:00
Ilia Mirkin	73eec47ef8	gk110/ir: emit texbar the same way that the blob does Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-06 18:25:16 -04:00
José Fonseca	b6956aef74	scons: Search only for mingw-w64 cross-compilers. Some distros still ship the non-mingw-w64 cross-compilers, but they can't build Mesa properly, as Jakob pointed out.	2014-06-06 13:15:37 +01:00
Stéphane Marchesin	1751a9ba26	i915g: Remove 4444 and 5551 formats They don't seem to work 100%, I need to investigate but in the meantime let's remove them.	2014-06-05 21:44:35 -07:00
Tobias Klausmann	4f4e9ba166	nvc0/ir: Handle OP_POPCNT when folding constant expressions Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> [imirkin: make sure to only fold 1-arg popcnt in opnd] Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-06-06 00:05:11 -04:00
Tobias Klausmann	fdc1d96b0f	nvc0/ir: Handle OP_BFIND when folding constant expressions Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-06-06 00:00:26 -04:00
Tobias Klausmann	4674343e8f	nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressions Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-06-06 00:00:26 -04:00
Tobias Klausmann	3164bfc734	nv50/ir: clear subop when folding constant expressions Some operations (e.g. OP_MUL/OP_MAD/OP_EXTBF) might have a subop set. After folding, make sure that it is cleared Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-06-06 00:00:26 -04:00
Kenneth Graunke	221169693b	i965: Support GL_CLAMP natively on Broadwell. The new hardware actually supports this OpenGL 1.x feature natively, so we can finally drop our shader workarounds. Not many applications use GL_CLAMP, and most use it unintentionally, but it's trivial to do right, so we should. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-05 01:26:05 -07:00
Kenneth Graunke	7f3d64a77b	i965: Pass brw to translate_wrap_mode(). This lets us do generation checks. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-05 01:25:56 -07:00
Tapani Pälli	cf29913aa1	i965: use _mesa_align_malloc in intel_miptree_map_movntdqa This fixes case where we have 1x1 size buffer and misalignment is 0. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79616	2014-06-05 09:00:17 +03:00
Chris Forbes	3c77d2a113	i965/fs: Allow array dereference of HW_REG. When dereferencing an element of gl_SampleMaskIn[], the source register here will be a HW_REG rather than a VGRF because the payload slot is now exposed directly. Fixes an assertion failure in the Piglit test: tests/spec/arb_gpu_shader5/execution/samplemaskin-basic Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-05 06:53:43 +12:00
Leo Liu	3642ee846a	st/omx/enc: enable b frames Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-06-04 17:24:42 +02:00
Leo Liu	e074f8200e	radeon/vce: implement h264 profile support Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-06-04 17:24:42 +02:00
Leo Liu	f588b80bba	st/omx/enc: implement h264 profile support Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-06-04 17:24:41 +02:00
Leo Liu	4722c326ce	vl: add more avc profiles Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-06-04 17:24:41 +02:00
José Fonseca	122e232495	wgl: Disable CRT message boxes when Windows system error messages boxes are disabled. At least on MSVC we statically link against the CRT, so we must disable the CRT message boxes if we want unattended testing. The messages are convenient when running manually, so let them be if the system error message boxes are not disabled.	2014-06-04 10:25:08 +01:00
Chris Forbes	7e0dd80f11	glapi: Note apparent gap in numbering from ARB_multi_draw_indirect This is defined in the same included file as ARB_draw_indirect. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-04 20:25:39 +12:00
Chris Forbes	7bf768b484	docs: Mark off gs5/overload resolution Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-04 20:12:58 +12:00
Chris Forbes	b18b4c7d74	glsl: Implement overload resolution for ARB_gpu_shader5 V3: Move spec citation into the code. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-04 20:10:27 +12:00
Chris Forbes	c1ceadfc32	glsl: Add support for comparing function parameter conversions The ARB_gpu_shader5 spec says: "To determine whether the conversion for a single argument in one match is better than that for another match, the following rules are applied, in order: 1. An exact match is better than a match involving any implicit conversion. 2. A match involving an implicit conversion from float to double is better than a match involving any other implicit conversion. 3. A match involving an implicit conversion from either int or uint to float is better than a match involving an implicit conversion from either int or uint to double. If none of the rules above apply to a particular pair of conversions, neither conversion is considered better than the other." V3: Add spec citation, including oddball difference between gs5 and GLSL 4.0; comment a bit better as per Jordan's suggestions. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-04 20:03:08 +12:00
Chris Forbes	59dd444cac	glsl: Build a list of inexact function matches This will facilitate GLSL 4.0 / ARB_gpu_shader5's enhanced overload resolution rules, and also possibly better error reporting for ambiguous function calls. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-04 19:49:34 +12:00
Chris Forbes	4312e973f2	docs: Mark off gs5/implicit conversions Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-06-04 19:36:02 +12:00
Chris Forbes	6ae787584d	glsl: Allow int -> uint implicit conversions on function parameters V2: Fix crashes during linking, where the parse state is NULL. In this case, all required checks have already been done, so we assume the extension is enabled. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-04 19:35:59 +12:00
Chris Forbes	f17428a276	glsl: Pass parse state to can_implicitly_convert_to() Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-04 19:35:57 +12:00
Chris Forbes	a78c663c22	glsl: Pass parse state to parameter_lists_match() The available implicit conversions depend on the GLSL version we're compiling. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-04 19:35:54 +12:00
Chris Forbes	240974e93f	glsl: Add support for int -> uint implicit conversions This is required for ARB_gpu_shader5. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-04 19:35:51 +12:00
Chris Forbes	1ace51f091	glsl: Clean up apply_implicit_conversion We're about to add new implicit conversions, first for ARB_gpu_shader5, and then later for ARB_gpu_shader_fp64. Pull out the opcode determination into its own function, and get rid of the bool -> float case that could never be hit anyway [since it fails the is_numeric() check]. V2: Retain the vector width mangling. It turns out this is necessary for the conversions done (and then thrown away) when determining the return type of arithmetic operators. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-04 19:35:47 +12:00
Chris Forbes	9578bb21d0	docs: Update `precise` qualifier status in GL3.txt Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-04 18:56:11 +12:00
Chris Forbes	345034869e	glsl: Allow `precise` as a parameter qualifier Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-04 18:56:09 +12:00
Chris Forbes	d0495c6db8	glsl: Disallow `precise` redeclarations of vars from outer scopes Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-04 18:56:08 +12:00
Chris Forbes	5ecffe5a3a	glsl: Add support for `precise` redeclarations This works like glsl-1.20+'s invariant redeclarations, but with fewer restrictions, since `precise` is allowed on pretty much anything. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-04 18:56:05 +12:00
Chris Forbes	4b756b20c4	glsl: add support for `precise` in type_qualifier Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-04 18:56:03 +12:00
Chris Forbes	37ab3ddbf8	glsl: remove outdated comment, move sample to correct block Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-06-04 18:55:49 +12:00
Kenneth Graunke	7913b4b97b	i965: Fix copy and pasted values in Broadwell code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-06-03 18:19:54 -07:00
Matt Turner	ac25cf55af	glsl: Make most ir_instruction::as_subclass() functions non-virtual. There are several common ways to check whether an object is a particular subclass: dynamic_cast<>, the as_subclass() pattern, or explicit enum tags. We originally used the virtual as_subclass methods, but later added enum tags as they are much nicer for debugging. Since we have the enum tags, we don't necessarily need to use virtual functions to implement the as_subclass() methods. We can just check the tag and return the pointer or NULL. This saves 18 entries in the vtable, and instead of two pointer dereferences per as_subclass() call most are only three inline instructions. Compile time of sam3/112.frag (the longest compile in a recent shader-db run) is reduced by 5% from 348 to 329 ms (n=500). perf stat of this workload shows: 24.14% reduction in iTLB-loads: 285,543 -> 216,606 42.55% reduction in iTLB-load-misses: 18,785 -> 10,792 Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-06-03 17:58:34 -07:00
Matt Turner	773544f0e9	glsl: Move ir_type_unset to end of enumeration. Now that the constructors set a type, ir_type_unset is not very useful. Move it to the end of the enum (specifically out of position 0) so that enum checks for dereferences and rvalues can save an instruction. Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-06-03 17:58:34 -07:00
Matt Turner	943cc7ff17	glsl: Reorder ir_type_* enum for easier comparisons. Makes checking whether an object is an ir_dereference, an ir_rvalue, or an ir_jump simpler. Since ir_dereference is a subclass of ir_rvalue, list its subtypes first so that they can both generate nice code. Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-06-03 17:58:34 -07:00
Matt Turner	3540b5eb55	glsl: Remove useless call to as_rvalue(). The type returned by hir() is already an ir_rvalue pointer. Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-06-03 17:58:34 -07:00
Ian Romanick	963bd99f03	glsl: Set ir_instruction::ir_type in the base class constructor This has the added perk that if you forget to set ir_type in the constructor of a new subclass (or a new constructor of an existing subclass) the compiler will tell you... instead of relying on ir_validate or similar run-time detection. Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-06-03 17:58:34 -07:00
Sinclair Yeh	91ff0d4c65	egl: Check for NULL native_window in eglCreateWindowSurface We have customers using NULL as a way to test the robustness of the API. Without this check, EGL will segfault trying to dereference dri2_surf->wl_win->private because wl_win is NULL. This fix adds a check and sets EGL_BAD_NATIVE_WINDOW v2: Incorporated feedback from idr - moved the check to a higher level function. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-06-03 17:28:30 -07:00
Marek Olšák	d226191820	r600g,radeonsi: don't use hardware MSAA resolve if dst is fast-cleared It doesn't work and our docs say so too. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-06-03 13:33:14 +02:00
Marek Olšák	0423513c61	radeonsi: BlitFramebuffer should follow render condition Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-06-03 13:33:14 +02:00
Marek Olšák	3a92fc1bdd	r600g: BlitFramebuffer should follow render condition	2014-06-03 13:33:14 +02:00
Marek Olšák	d929a30e9a	r300g: BlitFramebuffer should follow render condition	2014-06-03 13:33:14 +02:00
Marek Olšák	bf701a84eb	r600g,radeonsi: disable fast clear if render condition is on For some reason, CP DMA doesn't follow the predicate bit if I enable it, so this is the only option. This fixes piglit: spec/NV_conditional_render/clear Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-06-03 13:33:14 +02:00
José Fonseca	e3e13d6b85	mesa: Make glGetIntegerv(GL_*_ARRAY_SIZE) return GL_BGRA. Same as `b026b6bbfe`, but COLOR_ARRAY_SIZE/SECONDARY_COLOR_ARRAY_SIZE. Ideally we wouldn't munge the incoming state, so that we wouldn't need to unmunge it back on glGet. But the array size state is copied and referred to in many places, many of which couldn't take a GLenum like GL_BGRA instead of a plain integer. So just hack around on glGet*, to ensure there is no risk of introducing regressions elsewhere. This bug causes problems for Apitrace, resulting in wrong traces. See https://github.com/apitrace/apitrace/issues/261 for details. Tested with piglit arb_vertex_array_bgra-get, which was created for this purpose. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-06-03 12:20:53 +01:00
José Fonseca	53468dee03	mesa/main: Make get_hash.c values constant. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-06-03 12:20:50 +01:00
Vinson Lee	dad22cc590	i965: Add _default_ name changes to test_eu_compact.c. These were missed in commit `e374809819`. Fixes 'make check'. CC test_eu_compact.o test_eu_compact.c: In function ‘gen_f0_0_MOV_GRF_GRF’: test_eu_compact.c:222:4: error: implicit declaration of function ‘brw_set_predicate_control’ [-Werror=implicit-function-declaration] brw_set_predicate_control(p, true); ^ test_eu_compact.c: In function ‘run_tests’: test_eu_compact.c:270:6: error: implicit declaration of function ‘brw_set_access_mode’ [-Werror=implicit-function-declaration] brw_set_access_mode(p, BRW_ALIGN_16); ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-06-02 23:44:21 -07:00
Matt Turner	328e959317	i965/gen8: Print number of instructions directly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-02 15:17:30 -07:00
Matt Turner	757d7ddf01	i965: Emit compaction stats without walking the assembly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-02 15:17:29 -07:00
Matt Turner	6fdfe3f2dc	i965: Move program header printing to end of generate_code(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-02 15:17:29 -07:00
Matt Turner	92b055625d	i965: Move annotation info into generate code. Suggested by Ken as a way to cut down lines of code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-02 15:17:29 -07:00
Kenneth Graunke	e374809819	i965: Put '_default_' in the name of functions that set default state. Eventually we're going to use functions to set bits on an instruction. Putting 'default' in the name of functions that alter default state will help distinguish them. This patch was generated entirely mechanically, by the following: for file in brw*.{cpp,c,h}; do sed -i \ -e 's/brw_set_mask_control/brw_set_default_mask_control/g' \ -e 's/brw_set_saturate/brw_set_default_saturate/g' \ -e 's/brw_set_access_mode/brw_set_default_access_mode/g' \ -e 's/brw_set_compression_control/brw_set_default_compression_control/g' \ -e 's/brw_set_predicate_control/brw_set_default_predicate_control/g' \ -e 's/brw_set_predicate_inverse/brw_set_default_predicate_inverse/g' \ -e 's/brw_set_flag_reg/brw_set_default_flag_reg/g' \ -e 's/brw_set_acc_write_control/brw_set_default_acc_write_control/g' \ $file; done No manual changes were done after running that command. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-02 15:09:36 -07:00
Kenneth Graunke	76d7160c6c	i965: Delete brw_set_conditionalmod. This removes the ability to set the default conditional modifier on all future instructions. Nothing uses it, and it's not really a sensible thing to do anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-02 15:09:35 -07:00
Kenneth Graunke	fea7b97742	i965: Eliminate brw_set_conditionalmod from the Gen4-5 compilers. With the predication changes eliminated, all this does is set the conditional modifier on a single instruction. Doing that directly is easy, and avoids mucking about with default state. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-02 15:09:33 -07:00
Kenneth Graunke	776ad51165	i965: Don't use brw_set_conditionalmod in the FS and vec4 compilers. brw_set_conditionalmod and brw_next_insn work together to set the conditional modifier for the next instruction, then turn it off. The Gen8+ generators don't implement this: we just set it for all future instructions, and whack it for each fs_inst/vec4_instruction. Both approaches work out because we only set conditional_mod on IR instructions like CMP, AND, and so on, which correspond to exactly one assembly instruction. The Gen8 generators would break if we had an IR instruction that generated multiple instructions, and the Gen4-7 EU emit layer would do...something. To safeguard against this, assert that we only generated one instruction if conditional_mod is set, and just set the flag directly on that instruction rather than altering default state. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-02 15:09:30 -07:00
Kenneth Graunke	ff340ce3c3	i965: Stop setting predication from brw_set_conditionalmod. brw_set_conditionalmod has traditionally been complex: it causes conditionalmod to be set for the next instruction, and then predication to be set on all future instructions after that. We may want to generate a flag condition and not use it immediately, due to instruction scheduling or the like. Even if not, it's easy to set things explicitly, and that's clearer. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-02 15:09:29 -07:00
Kenneth Graunke	0985da5423	i965: Drop unnecessary brw_set_conditionalmod() before brw_CMP(). brw_CMP already takes a conditional modifier as a parameter, and sets it accordingly. brw_set_conditionalmod() also makes everything after the next instruction predicated, but we don't need that: we always emit an IF instruction after load_clip_distance(), and that's already predicated. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-02 15:09:26 -07:00
Kenneth Graunke	0bfac24caf	i965/clip: Use the new brw_last_inst macro instead of temporaries. It wasn't too bad before, but the macro is going to be nicer once I start modifying a lot more instructions in this pattern. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-02 15:09:25 -07:00
Kenneth Graunke	42c292006c	i965: Create a "brw_last_inst" convenience macro. Often times, we want to emit an instruction, then set one field on it, such as predication or a conditional modifier. Normally, we'd have to declare "struct brw_instruction *inst;" and then use "inst = brw_FOO(...)" to emit the instruction, which can hurt readability. The new "brw_last_inst" macro refers to the most recently emitted instruction, so you can just do: brw_ADD(...) brw_last_inst->header.predicate_control = BRW_PREDICATE_NORMAL; Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-02 15:09:23 -07:00
Kenneth Graunke	8deb91b2e7	i965: Make brw_JMPI set predicate_control based on a parameter. We use both predicated and unconditional JMPI instructions. But in each case, it's clear which we want. It's simpler to just specify it as a parameter, rather than relying on default state. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-02 15:09:21 -07:00
Kenneth Graunke	3769a2d51f	i965: Remove the dst and src0 parameters from brw_JMPI. In all cases, we set both dst and src0 to brw_ip_reg(). This is no accident: according to the ISA reference, both are required to be the IP register. So, we may as well drop the parameters. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-06-02 15:09:12 -07:00
Beren Minor	0ca0d5743f	egl/main: Fix eglMakeCurrent when releasing context from current thread. EGL 1.4 Specification says that eglMakeCurrent(display, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT) can be used to release the current thread's ownership on the surfaces and context. MESA's egl implementation was only accepting the parameters when the KHR_surfaceless_context extension is supported. [chadv] Add quote from the EGL 1.4 spec. Cc: "10,1, 10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-06-02 12:16:50 -07:00
Marek Olšák	f98a7d89be	radeonsi: enable ARB_sample_shading	2014-06-02 13:01:27 +02:00
Marek Olšák	d0e8b65aed	radeonsi: implement SAMPLEMASK fragment shader output	2014-06-02 12:58:22 +02:00
Marek Olšák	99df120e00	radeonsi: interpolate varyings at sample when full sample shading is enabled	2014-06-02 12:58:22 +02:00
Marek Olšák	99d9d7c0d6	radeonsi: implement SAMPLEPOS fragment shader input The sample positions are read from a constant buffer.	2014-06-02 12:58:22 +02:00
Marek Olšák	5b06fc376d	radeonsi: implement SAMPLEID fragment shader input	2014-06-02 12:58:22 +02:00
Marek Olšák	501fee2511	radeonsi: implement set_min_samples This is how per-sample shading is enabled.	2014-06-02 12:58:22 +02:00
Marek Olšák	fe98bfb261	radeon: add basic register setup for per-sample shading Only for Cayman, SI, CIK.	2014-06-02 12:58:22 +02:00
Marek Olšák	3aed75c859	radeon: split cayman_emit_msaa_state into 2 functions The other function will be split up from the framebuffer state.	2014-06-02 12:58:22 +02:00
Marek Olšák	0d5ec2c615	Revert "glx: load dri driver with RTLD_LOCAL so dlclose never fails to unload" This reverts commit `e3cc0d90e1`. It breaks too many apps and completely breaks my desktop too. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79469 We'll probably need to re-release all stable versions after this is committed. Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-06-02 12:56:12 +02:00
Christoph Bumiller	b206f5951c	r600g: use TGSI_PROPERTY to disable viewport and clipping v2 get rid of magic value, use DEFINES v3 update clip_disable together with vs_position_window_space Big thanks to Marek Olšák! Signed-off-by: David Heidelberger <david.heidelberger@ixit.cz> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-06-02 12:49:03 +02:00
Christoph Bumiller	4b586a26c8	gallium: create TGSI_PROPERTY to disable viewport and clipping Marek v2: add a cap Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-06-02 12:49:03 +02:00
Christoph Bumiller	304f64bb50	r600g: remove assert on draw with count == 0 Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-06-02 12:49:03 +02:00
Christoph Bumiller	476aaf8b8e	r600g: HW bug workaround for TGSI_OPCODE_BREAKC Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-06-02 12:49:03 +02:00
Christoph Bumiller	6544a4a342	r600g: implement TGSI_OPCODE_BREAKC Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-06-02 12:49:03 +02:00
Christoph Bumiller	822ac96802	r600g: support all channels of TGSI_FILE_ADDRESS It's allowed in SM3. v2: fix multi-component tgsi_r600_arl (FLT_TO_INT is trans-only) Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-06-02 12:49:03 +02:00
Christoph Bumiller	04eb8b85ea	r600g: check for PIPE_BIND_BLENDABLE in is_format_supported v2: added !util_format_is_depth_or_stencil(format) Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-06-02 12:49:03 +02:00
Christoph Bumiller	04de3234ee	r600g: handle PIPE_QUERY_GPU_FINISHED Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-06-02 12:49:02 +02:00
Matt Turner	84e0a5c406	i965/fs: Add fs_inst constructor that takes a list of sources. Also add an emit() function that calls it. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-01 13:29:24 -07:00
Matt Turner	521f9b9a48	i965/fs: Add a function to resize fs_inst's sources array. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-01 13:29:24 -07:00
Matt Turner	07af0abef0	i965/fs: Clean up fs_inst constructors. In a fashion suggested by Ken. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-01 13:29:24 -07:00
Matt Turner	b1dcdcde2e	i965/fs: Loop from 0 to inst->sources, not 0 to 3. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-01 13:29:24 -07:00
Matt Turner	27e12a8ea9	i965/fs: Store the number of sources an fs_inst has. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-01 13:29:23 -07:00
Matt Turner	1b60391ed4	i965/fs: ralloc fs_inst's fs_reg sources. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-01 13:29:23 -07:00
Matt Turner	a391e99b23	i965/fs: Disable fs_inst assignment operator. The fs_reg src array is going to turn into a pointer and we'd rather not consider the implications of shallow copying fs_insts. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-01 13:29:23 -07:00
Matt Turner	6d3a15223a	i965/fs: Add and use an fs_inst copy constructor. Will get more complicated when fs_reg src becomes a pointer. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-01 13:29:23 -07:00
Matt Turner	bfcf6a665b	i965: Skip IR annotations with INTEL_DEBUG=noann. Running shader-db with INTEL_DEBUG=noann reduces the runtime from ~90 to ~80 seconds on my machine. It also reduces the disk space consumed by the .out files from 660 MB (676 on disk) to 343 MB (358 on disk). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-01 13:18:52 -07:00
Matt Turner	55bd8b8b66	i965/fs: Debug the optimization passes by dumping instr to file. With INTEL_DEBUG=optimizer, write the output of dump_instructions() to a file each time an optimization pass makes progress. This lets you easily diff successive files to see what an optimization pass did. Example filenames written when running glxgears: fs8-0000-00-start fs8-0000-01-04-opt_copy_propagate fs8-0000-01-06-dead_code_eliminate fs8-0000-01-12-compute_to_mrf fs8-0000-02-06-dead_code_eliminate \| \| \| \| \| \| \| `-- optimization pass name \| \| \| \| \| `-- optimization pass number in the loop \| \| \| `-- optimization loop iteration \| `-- shader program number Note that with INTEL_DEBUG=optimizer, we disable compact_virtual_grfs, so that we can diff instruction lists across loop iterations without the register numbers changing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-01 13:18:52 -07:00
Matt Turner	e9bf1662b0	i965: Give dump_instructions() a filename argument. This will allow debugging code to dump the IR after an optimization pass makes progress (the next patch). Only let it open and write to a file if the effective user isn't root. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-01 13:18:52 -07:00
Matt Turner	56d6dcf4f7	i965: Give dump_instruction() a FILE* argument. Use function overloading rather than default arguments, since gdb doesn't know about default arguments. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-01 13:18:52 -07:00
Matt Turner	08c2acd8d9	i965: Add envvar to debug the optimization passes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-06-01 13:18:52 -07:00
Roland Scheidegger	3fc72f2ec6	llvmpipe: (trivial) drop "unswizzled" from some function names This made sense when swizzled storage layout was used for rendering to tiles. But nowadays the name just adds confusion (and makes for long lines). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-05-31 22:05:14 +02:00
Roland Scheidegger	576868140b	llvmpipe: fix crash when not all attachments are populated in a fb Framebuffers have been able to have NULL attachments for a while. llvmpipe handled that properly for lp_rast_shade_quads_mask but it seems the change didn't make it to lp_rast_shade_tile. This fixes piglit fbo-drawbuffers-none test (though I need to increase the FB_SIZE from 32 to 256 so the tris cover some tiles fully). https://bugs.freedesktop.org/show_bug.cgi?id=79421 Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-05-31 22:05:14 +02:00
Roland Scheidegger	98d8ba2776	softpipe: honor the render_condition_enable bit in blits. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-05-31 22:05:14 +02:00
Roland Scheidegger	c90b5884bd	llvmpipe: honor the render_condition_enable bit in blits. This fixes piglit nv_conditional_render-blitframebuffer. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-05-31 22:05:14 +02:00
Roland Scheidegger	f49e201df9	gallium/docs: improve documentation of render condition wrt blits. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-05-31 22:05:14 +02:00
Brian Paul	3b66029dd3	svga: use svga_shader_too_large() in compile_vs() And rework the dummy shader code to match the fragment shader case. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-05-31 06:25:36 -06:00
Brian Paul	3bb18eab72	svga: use svga_shader_too_large() in compile_fs() Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-05-31 06:25:35 -06:00
Brian Paul	7b2ff54417	svga: added svga_shader_too_large() helper To check if a shader's bytecode exceeds the device limit. There's no limit when using GBS. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-05-31 06:25:35 -06:00
Jeremy Huddleston Sequoia	b4f34241ec	darwin: Remove extra kCGLPFAColorSize attribute when requesting an offscreen context https://xquartz.macosforge.org/trac/ticket/650 Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2014-05-31 03:44:51 -07:00
Vinson Lee	83bba8f146	util: Do not use __builtin_clrsb with Intel C++ Compiler. This patch fixes this build error with icc 14.0.2. In file included from state_tracker/st_glsl_to_tgsi.cpp(63): ../../src/gallium/auxiliary/util/u_math.h(583): error: identifier "__builtin_clrsb" is undefined return 31 - __builtin_clrsb(i); ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-05-30 19:47:35 -07:00
Lubomir Rintel	90b5747856	i915: add a missing NULL pointer check mesaVisual can be NULL with configless context since this commit: commit `551d459af4` Author: Neil Roberts <neil@linux.intel.com> Date: Fri Mar 7 18:05:47 2014 +0000 Add the EGL_MESA_configless_context extension ... Previously the i965 and i915 drivers were explicitly creating a zeroed visual whenever 0 is passed for the EGLConfig. We attempt to dereference the visual in i915 and now that we don't create a zeroed-out one, it crashes, breaking at least weston on an i915. There's no point in doing so as it would be zero anyway. v2: Fixed a typo in commit message. Added some tags. Signed-off-by: Lubomir Rintel <lkundrak@v3.sk> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1100967 Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-30 17:10:08 -07:00
Ian Romanick	7b1aeec9cd	glapi: Duplicate GLES1 prototypes in glapi_dispatch.c These prototypes are necessary because GLES1 library builds will create dispatch functions for them. We can't directly include GLES/gl.h because it would conflict with the previously-included GL/gl.h. Since the GLES1 ABI is not expected to ever add more functions, the path of least resistance is to just duplicate the prototypes for the functions that aren't already in desktop OpenGL. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79294 Acked-by: Matt Turner <mattst88@gmail.com> Tested-by: Andreas Boll <andreas.boll.dev@gmail.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-30 16:33:34 -07:00
Matt Turner	65bccff800	i965/vec4: Allow writemasking on math instructions on Gen7+. The math instruction was Align1-only on Gen6 and we never updated this to let it use Align16 features like writemasking on newer platforms. total instructions in shared programs: 1686120 -> 1685507 (-0.04%) instructions in affected programs: 48593 -> 47980 (-1.26%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-30 12:20:45 -07:00
Pavel Popov	d292d40207	i965: Fix Line Stipple enable bit in 3DSTATE_SF for Haswell. Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Pavel Popov <pavel.e.popov@intel.com>	2014-05-30 12:20:18 -07:00
Brian Paul	ebf229a436	st/wgl: use _debug_printf() instead of fprintf() This should print output both for debug and release builds. Suggested by Jose. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-05-30 18:52:39 +01:00
Brian Paul	4b05e3cb0f	st/wgl: formatting fixes in stw_framebuffer.c And remove some unneeded #includes and INLINE qualifiers. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-05-30 18:52:39 +01:00
Brian Paul	f9595e21bc	st/wgl: make stw_lookup_context_locked() an inline function Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-05-30 18:52:39 +01:00
Brian Paul	bd36cbfa5a	st/wgl: fix implementation of wglCreateContextAttribsARB() wglCreateContextAttribsARB() didn't work previously since it returned a context ID that wasn't allocated by OPENGL32.DLL. So if that context ID was later passed to wglMakeCurrent(), etc. it was rejected. Now when wglCreateContextAttribsARB() is called we actually call wglCreateContext() in order to get a valid context ID. Then we replace the context data which was created with new context data which reflects the arguments passed to wglCreateContextAttribsARB(). If there were a DrvCreateContextAttribs() function in the ICD this work-around wouldn't be necessary. Reviewed-by: Charmaine Lee <charmainel@vmware.com> Conflicts: src/gallium/state_trackers/wgl/stw_ext_extensionsstring.c src/gallium/state_trackers/wgl/stw_getprocaddress.c	2014-05-30 18:52:39 +01:00
Brian Paul	fa55c2402c	st/wgl: add debug code to check that pixel format initialization worked If the assertion fails, it means something is really broken. Before, if this happened we reverted to the GDI renderer without any warning. Reviewed-by: Matthew McClure <mcclurem@vmware.com>	2014-05-30 18:52:39 +01:00
Brian Paul	e4a5165562	st/wgl: change PFD_SWAP_COPY to PFD_SWAP_EXCHANGE. To reflect our actual SwapBuffers implementation. See stw_st_swap_framebuffer_locked(). This fixes various rendering issues with SolidEdge. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-05-30 18:52:39 +01:00
José Fonseca	76bf4bd3c5	docs: Document how to replace Windows built-in OpenGL software rasterizer with llvmpipe. Just happened to stumble across this registry key while debugging something else. This technique is much neater than trying to override opengl32.dll. Also a few minors cleanups.	2014-05-30 18:52:39 +01:00
Tapani Pälli	56bdffe8c1	scons: add common.c as part of glcpp build to have _mesa_error_no_memory function available Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79440 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-05-30 10:11:44 +03:00
Juha-Pekka Heikkila	fb7baafbbf	mesa: Add missing null checks into prog_hash_table.c Check calloc return values in hash_table_insert() and hash_table_replace() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-30 09:22:34 +03:00
Tapani Pälli	c692581ae8	glcpp: link with tests/common.c So that prog_hash_table can use _mesa_error_no_memory function. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-05-30 09:22:24 +03:00
Juha-Pekka Heikkila	7bfe94694c	mesa/main: Add missing null check in _mesa_CreatePerfQueryINTEL() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Petri Latvala <petri.latvala@intel.com>	2014-05-30 07:22:01 +03:00
Juha-Pekka Heikkila	5c9056d37f	mesa/drivers: Add extra null check in blitframebuffer_texture() If texObj == NULL here it means there is already a GL_INVALID_VALUE or GL_OUT_OF_MEMORY error set on the context. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-30 07:21:39 +03:00
Juha-Pekka Heikkila	19f1d137f8	glsl: Add null check in loop_analysis.cpp Check return value from hash_table_find before using it as a pointer Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-30 07:21:12 +03:00
Juha-Pekka Heikkila	77a00c71bb	mesa: add missing null check in _mesa_NewHashTable() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-30 07:20:53 +03:00
Gary Wong	85b6f36ca5	loader: add optional /sys filesystem method for PCI identification. Introduce a simple PCI identification method of looking up the answer in the /sys filesystem (available on Linux). Attempted after libudev, but before DRM. Disabled by default (available only when the --enable-sysfs configure option is specified). Signed-off-by: Gary Wong <gtw@gnu.org> Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-29 20:25:37 -06:00
Gary Wong	090c772b8a	loader: allow attempting more than one method of PCI identification. loader_get_pci_id_for_fd() and loader_get_device_name_for_fd() now attempt all available strategies to identify the hardware, instead of conditionally compiling in a single test. The existing libudev and DRM approaches have been retained, attempting first libudev (if available) and then DRM (if necessary). Signed-off-by: Gary Wong <gtw@gnu.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-29 20:25:37 -06:00
Emil Velikov	febec73147	st/egl: do not link against libloader Move the link to the final targets, like any other place in mesa/gallium. This allows better visibility and will prevent us from including the library archive twice. Resolves multiple definition of `loader_get_pci_id_for_fd' multiple definition of `loader_get_pci_id_for_fd' Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79263 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79382 Cc: Chia-I Wu <olv@lunarg.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Chia-I Wu <olv@lunarg.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2014-05-29 20:01:33 +01:00
Emil Velikov	6638c55838	egl_dri2: fix wayland_platform when drm_platform is not set The build fails with implicit declaration of drmGetCap (xf86drm.h), since we're including the header only when building the DRM_PLATFORM. Wayland backend can operate without DRM_PLATFORM so replace the guard, and fold in drmGetCap() usage to silence compiler warnings. Cc: Chad Versace <chad.versace@linux.intel.com> Cc: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-29 20:01:03 +01:00
Matt Turner	dfd117b857	i965/fs: Set correct number of regs_written for MCS fetches. regs_written is in units of virtual GRFs. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-29 10:42:25 -07:00
Jerome Glisse	e3cc0d90e1	glx: load dri driver with RTLD_LOCAL so dlclose never fails to unload There is no reason anymore to load with RTLD_GLOBAL and for some drivers this even results in dlclose failing to unload, leading to catastrophic failure with swrast fallback. Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>	2014-05-29 13:32:21 -04:00
Stéphane Marchesin	c0bd206a14	i915g: Support B5G5R5A1 render targets and textures	2014-05-28 19:53:58 -07:00
Stéphane Marchesin	569c026520	i915g: Support R4G4B4A4 render targets and textures	2014-05-28 19:53:55 -07:00
Stéphane Marchesin	9e59c91a73	i915g: Fix copy region code This fixes a few issues with it, also cleans up the code.	2014-05-28 19:53:51 -07:00
Connor Abbott	fc7e7cfabc	glsl/tests: remove generated tests from the repo They were made unnecessary by the last commit. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-28 15:07:07 -07:00
Connor Abbott	a1d8322fbb	glsl/tests: call create_test_cases.py in optimization-test This way, when someone modifies create_test_cases.py and forgets to commit their changes again, people will notice. v2: make sure we parse the right directories and check for existence the right way. v3 (Ken): Use $PYTHON2 instead of calling python directly. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-28 15:06:45 -07:00
Connor Abbott	6e24111b9c	glsl/tests/lower_jumps: fix generated sexpr's for loops In `088494aa` (as well as other commits in the series) Paul Berry modified the tests for lower_jumps to account for the fact that the s-expression for the loop IR instruction changed from (loop () () () () (statements...)) to (loop (statements...)), but he forgot to update create_test_cases.py which he used to create the tests. Fix that, so that now create_test_cases.py is synced with the generated tests. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-28 15:06:16 -07:00
Connor Abbott	bbaec0f76c	glsl: be more consistent about printing constants Make sure that we print the same number of digits when printing 0.0 as any other floating-point number. This will make generating expected output files for tests easier. To avoid breaking "make check," update the generated tests for lower_jumps before the next commit which will bring create_test_cases.py in line with them. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-28 15:05:59 -07:00
Brian Paul	a7aca3919b	glsl: replace strncmp("gl_") calls with new is_gl_identifier() helper Makes things a little easier to read. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-28 15:06:07 -06:00
Brian Paul	f9cecca7a6	glsl: fix use-after free bug/crash in ast_declarator_list::hir() The call to get_variable_being_redeclared() may delete 'var' so we can't reference var->name afterward. We fix that by examining the var's name before making that call. Fixes valgrind warnings and possible crash when running the piglit tests/spec/glsl-1.30/execution/clipping/vs-clip-distance-in-param.shader_test test (and probably others). Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-28 15:06:07 -06:00
Kenneth Graunke	bb9623a1a8	i965: Fix repeated usage of rectangle texture coordinate scaling. Previously, we set up new entries in the params[] array on every access of a rectangle texture. Unfortunately, we only reserve space for (2 * MaxTextureImageUnits) extra entries, so programs which accessed rectangle textures more times than that would write off the end of the array and likely crash. We don't really have a decent mapping between the index returned by _mesa_add_state_reference and our index into the params array, so we have to manually search for it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78691 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: mesa-stable@lists.freedesktop.org	2014-05-28 13:12:10 -07:00
José Fonseca	9ec7cb8aa0	egl-static: Fix undefined reference to `loader_*' Trivial. Better than a broken build.	2014-05-28 10:33:33 +01:00
Topi Pohjolainen	a6022e5405	meta/blit: Use gl_FragColor also in the msaa blit shader Fixes framebuffer_blit_functionality_multisampled_to_singlesampled_blit es3 cts test on bdw. Also fixes this on ivb when ivb is forced to use the meta path. No piglit regressions on IVB. Further input from Ken: "Unfortunately, this doesn't fix MRT for integer data. In the single-sampled case, since we're directly copying data, we were read/copy/write data as "float" values, which actually contained the integer bits. Here, we can't do that since we need to process the actual integer data. I do wonder if we could use intBitsToFloat/uintBitsToFloat to stuff the integer bits in the float gl_FragColor output. Just a crazy idea. In the long term (post 10.2), I think we should draft an extension that allows you to do "layout(location = all)" on user-defined fragment shader outputs. (Or some similar syntax.)" Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-05-28 10:32:29 +03:00
Alexandre Courbot	ecee4c4229	nvc0/ir: use SM35 ISA with GK20A GK20A is mostly compatible with GK104, but uses the SM35 ISA. Use the GK110 path when this chip is detected. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-27 22:12:40 -04:00
Alexandre Courbot	1973d79e27	nvc0: add GK20A 3D class GK20A is mostly compatible with GK104, but features a new 3D class. Add it to the relevant header and use it when GK20A is detected. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-27 22:12:40 -04:00
Kenneth Graunke	4b846e231e	i965/sf: Replace push/pop in brw_emit_anyprim_setup. Each of the subroutine emitters alter the predication state, but otherwise don't change anything (or put it back when they do). Resetting predication at the end makes these functions idempotent with regard to the default instruction state - which is a nice property. With that in place, push/pop is no longer necessary. v2: Improve whitespace (requested by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-27 13:46:02 -07:00
Kenneth Graunke	471bff4c62	i965/sf: Drop unnecessary push/pop in copy_z_inv_w. brw_MOV doesn't alter the default instruction state, so this does nothing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-27 13:46:02 -07:00
Kenneth Graunke	0f9eeae878	i965/sf: Drop unnecessary push/pop in flatshading code. brw_JMPI sets predicate_control to BRW_PREDICATE_NONE, but that's already the value coming in. Otherwise, nothing changes state. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-27 13:46:02 -07:00
Kenneth Graunke	d9cac44a14	i965/sf: Move brw_compile::flag_value to brw_sf_compile. This field is only used to track the current value of the flag register during the SF compile. It has no place in the common compiler code. While we're changing every call, drop the 'brw' prefix from the function since it's static. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-27 13:46:01 -07:00
Kenneth Graunke	e287f5937f	i965/sf: Move brw_set_predicate_control_flag_value to brw_sf_emit.c. Only the Gen4-5 SF program compiler actually uses this function; move it there. Soon the fields will be moved out of brw_compile. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-27 13:46:01 -07:00
Kenneth Graunke	41afb3ade4	i965/sf: Drop useless push/pop state from flag register mashing code. There's no point in pushing and popping the default state; the code between the two stack operations doesn't alter anything. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-27 13:46:01 -07:00
Kenneth Graunke	2747f6a1f9	i965/sf: Drop unnecessary push/pop in do_twoside_color. None of the assembly emitters called between push and pop actually change the state. So, we can drop these. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-27 13:46:01 -07:00
Kenneth Graunke	09655bb81b	i965: Don't implicitly set predicate default state in brw_CMP. Previously, brw_CMP with a null destination implicitly set the default state to make future instructions predicated. This is messy and confusing - emitting a CMP that populates the flag register and later using it to predicate instructions are logically separate. With the main compiler, we may even schedule instructions between the CMP and the user of the flag value. This patch simplifies brw_CMP to just emit a CMP instruction, and not mess with predication. It also updates all necessary callers. These mostly fell into two patterns: 1. brw_CMP followed by brw_IF. We don't need to do anything special here; brw_IF already sets up predication appropriately. 2. brw_CMP followed by a single predicated instruction. The old model was to call brw_CMP, emit the next (predicated) instruction, then disable predication for any instructions beyond that. Instead, just explicitly set predicate_control on the single instruction we want to predicate. It's no more code, and requires less cross-module knowledge. This drops setting flag_value to 0xff as well, which is a field only used by the SF compile. There is only one brw_CMP call in the SF code, which is in do_twoside_caller, and called at the start of brw_emit_tri_setup, where flag_value is already 0xff. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-27 13:46:01 -07:00
Kenneth Graunke	b07c4b1d9d	i965: Drop unnecessary predication default state resets in clip code. Presumably, this was to reset the default state of predication_control from brw_CMP. But brw_CMP only sets that if dst is ARF null, which it isn't here. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-27 13:46:01 -07:00
Kenneth Graunke	a5bb24d769	i965/sf: Reset flag_value to 0xff before emitting SF subroutines. When compiling any of the SF program variants, flag_value starts off as 0xff and will be modified when generating code. brw_emit_anyprim_setup emits several subroutines, saving and restoring flag_value across each of them. Since it starts out as 0xff, this is equivalent to simply setting it to 0xff at the start of each subroutine. Resetting the value makes more logical sense; each subroutine doesn't know whether one of the others even executed, much less what it did to the flag register. This also lets us drop the brw_set_predicate_control_flag_value call from brw_init_compile: predicate is already initialized to BRW_PREDICATE_NONE by the memset, and the value of flag_value is irrelevant (as it's only used by the SF compiler). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-27 13:46:00 -07:00
Leo Liu	b3ad853a2c	st/omx/enc: implement restricted b frames pattern Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-05-27 16:56:55 +02:00
Leo Liu	cc6c76e8f6	radeon/vce: implement non-referenced frames Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-05-27 16:56:52 +02:00
Leo Liu	8e0eae4c3d	vl: add interface for non-referenced frames Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-05-27 16:56:32 +02:00
Topi Pohjolainen	57730d67f6	i965/meta: Store stencil texturing mode Meta path needs to keep the current texture object's state. Fixes the following gles3 cts tests on bdw: framebuffer_blit_functionality_negative_width_blit.test: fail framebuffer_blit_functionality_all_buffer_blit.test: fail framebuffer_blit_functionality_negative_height_blit.test: fail framebuffer_blit_functionality_missing_buffers_blit.test: fail framebuffer_blit_functionality_negative_dimensions_blit.test: fail framebuffer_blit_functionality_minifying_blit.test: fail framebuffer_blit_functionality_magnifying_blit.test: fail Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-27 09:31:27 +03:00
Topi Pohjolainen	c246828c4d	meta/blit: Add stencil texturing mode save and restore v2 (Ken): Only restore the mode if it has changed. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-27 09:31:07 +03:00
Stéphane Marchesin	328e7e7742	i915g: Fix shader disasm code This broke when I separated declarations/shader.	2014-05-26 23:08:49 -07:00
Stéphane Marchesin	82a76e61e7	i915g: Fallback to sw for npot copies i915g's npot support is incomplete, so let's not use it for copies. This fixes a bunch of piglit tests.	2014-05-26 23:08:49 -07:00
Stéphane Marchesin	b419ca937a	i915g: handle more formats in copy We can handle depth, luminance,... copies by simply replacing the format with a known format of the same bpp.	2014-05-26 23:08:49 -07:00
Tobias Klausmann	a26e2bc2e3	nvc0: implement clear_buffer Provide an accelerated path for ARB_clear_buffer_object Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-26 21:17:14 -04:00
Matt Turner	4c7bf8a704	i965: Switch types D->UD when possible to allow compaction. Number of compacted instructions: 827404 -> 833045 (0.68%) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-26 13:58:58 -07:00
Matt Turner	0d3f83f4ad	Revert "i965: Don't make instructions with a null dest a barrier to scheduling." This reverts commit `42a26cb5e4`. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78648	2014-05-26 11:47:15 -07:00
Matt Turner	a39428cf5c	Revert "i965/fs: Simplify interference scan in register coalescing." This reverts commit `5ff1e446d4`. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77704	2014-05-26 11:47:13 -07:00
Matt Turner	fc025a6719	Revert "i965/fs: Give up in interference check if we see a WHILE." This reverts commit `55de1c035c`. Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-26 11:47:04 -07:00
Matt Turner	ccb1ea8a15	Revert "i965/fs: Reduce restrictions on interference in register coalescing." This reverts commit `f770123f58`. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78692	2014-05-26 11:46:52 -07:00
Ilia Mirkin	0d699530ff	nvc0: revert mistaken logic to collapse color outputs to the beginning In commit `af38ef907`, I added a "fix" to color outputs not being assigned correctly when sample mask was being output. This was totally wrong -- the color indices (i.e. "si" values) were the ones that were wrong. Undo that hunk. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-26 14:53:26 -04:00
Ilia Mirkin	ab7bd7093d	mesa/st: fix color outputs in presence of sample mask output Commit `c5d822dad9` added support for sample mask incorrectly. It became treated as a color output, and messed up the color output indices. Revert the hunk that did that, and add explicit support just like for depth/stencil writes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Marek Olšák <marek.olsak@amd.com>	2014-05-26 14:00:11 -04:00
Rob Clark	aa78c4586d	freedreno/a3xx: texture fixes Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-26 09:03:09 -04:00
Rob Clark	2456be63e9	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-26 08:58:17 -04:00
Rob Clark	286863939f	freedreno: few caps fixes Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-26 08:56:27 -04:00
Vinson Lee	f0748b5014	mesa/x86: Fix build with clang <= 3.3. clang <= 3.3 cpuid.h does not define constants for feature bits. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79095 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-05-25 21:39:30 -07:00
Matt Turner	6148e94e26	i965: Don't treat HW_REGs as barriers if they're immediates. We had a handful of cases where we'd used brw_imm_*() to generate an immediate, rather than fs_reg(). We shouldn't do that but we shouldn't limit scheduling flexibility on account of immediate arguments either. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-25 20:16:46 -07:00
Matt Turner	c938be8ad2	i965/fs: Don't use brw_imm_* unnecessarily. Using brw_imm_* creates a source with file=HW_REG, and the scheduler inserts barrier dependencies when it sees HW_REG. None of these are hardware-registers in the sense that they're special and scheduling shouldn't touch them. A few of the modified cases already have HW_REGs for other sources, so it won't allow extra flexibility in some cases. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-25 20:16:41 -07:00
Emil Velikov	7a63bd960c	automake: correctly append the version-script Turns out that the AC conditional did not include the version-scripts as expected. Rather it truncated the remaining linker flags. Cc: Jon TURNEY <jon.turney@dronecode.org.uk> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-05-25 23:21:47 +01:00
Emil Velikov	239df5b654	targets/libgl-xlib: hide all the exported symbol mayhem Leave only the gl/glx and mangled gl symbols. XMesa* was never an official interface and the only user of it was mesa-demos, while they were still in the same repo as mesa. v2: Conditionally use the version-script. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-25 23:21:47 +01:00
Emil Velikov	7e613f4683	targets/osmesa: include mangled gl symbols Missed out with commit `d4c3968c25` Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-25 23:21:46 +01:00
Emil Velikov	a75baba2f1	targets/xa: limit the amount of exported symbols In the presence of LLVM the final library exports every symbol from the llvm namespace. Resolve this by using a version script (w/o the version/name tag). Considering that there are only ~35 symbols, explicitly list them to minimize the chances of rogue symbols sneaking in. v2: Conditionally include the version-script. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> (v1) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-25 23:21:46 +01:00
Emil Velikov	ce12bbd107	dri_util: keep __dri2ConfigOptions symbol private The symbol was added with commit 45e2b51c853(DRI2/GLX: check for vblank_mode in DRI2 GLX code) but was never used as such according to git log. Possibly it was marked as public due to confusion with __driConfigOptions which was used for dri1 drivers. Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-25 23:21:46 +01:00
Kai Wasserbäch	97aa256b19	targets/opencl: Fix (static) linking with LLVM (v2) Without this, I get linking failures (static linking). The static linking is sort of required for me, because otherwise Steam and applications using the Steam runtime regularly fail because my LLVM was compiled and linked against a newer libgcc_s, libstdc++, etc. and uses features from those newer versions. And instead of Steam just not starting, my X starts crashing, whenever libGL fails to load a (32 bit) driver. Since I hate crashes of X and I don't think Valve/Steam will behave like a proper distribution soon (rebuilds versus current Debian Testing, since they base their Steam OS off that), I need a radeonsi which carries its own LLVM within and doesn't care about what the runtime sets. This means linking Mesa statically. v1 → v2: Move logic to configure.ac Acked-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>	2014-05-25 23:21:46 +01:00
Emil Velikov	eb2241f8a9	glx: do not leak dri3Display v2: Do not wrap the code in ifdef HAVE_DRI3 (suggested by Keith) Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Cc: Keith Packard <keithp@keithp.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-25 23:21:46 +01:00
Emil Velikov	b52a530ce2	gallium/egl: st_profiles are build time decision, treat them as such The profiles are present depending on the defines at build time. Drop the extra functions and feed the defines directly into the state-tracker at build time. v2: Drop unused variable i. Acked-by: Chia-I Wu <olvaffe@gmail.com> (v1) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-25 23:21:46 +01:00
Emil Velikov	a9afdcc3a1	dri_util: set implemented version of the DRI_CORE extension ... rather than the one defined in our internal interface (dri_interface.h) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-05-25 23:21:45 +01:00
Matt Turner	c9fd68408b	i965/fs: Don't modify ann_count if not debugging. If we make ann_count non-zero, annotation_finalize() won't bail. Not modifying it seems to make the code more clear than would modifying annotation_finalize().	2014-05-25 10:32:35 -07:00
Matt Turner	c2c639ecf6	Revert "i965/fs: Change fs_visitor::emit_lrp to use MAC for gen<6" This reverts commit `a6860100b8`. Why this code didn't work in all circumstances is unknown and without a working Ironlake simulator (which uses a different AUB format) we'll probably never know, short of a lot of experimentation, and spending a bunch of time to try to optimize a few instructions on Ironlake is not time well spent. Moreover, for mix(vec4, vec4, vec4) using the accumulator introduces a dependence between the otherwise independent per-component calculations. Not using the accumulator, even if it means an extra instruction per component might be preferable. We don't know, we don't have data, and we don't have the necessary register on Ironlake for shader_time to tell us. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77707 Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-24 23:03:24 -07:00
Matt Turner	db42dd8952	Revert "i965/vec4: Change vec4_visitor::emit_lrp to use MAC for gen<6" This reverts commit `2dfbbeca50` with the comment about MAC and implicit accumulator removed. Why this code didn't work in all circumstances is unknown and without a working Ironlake simulator (which uses a different AUB format) we'll probably never know, short of a lot of experimentation, and spending a bunch of time to try to optimize a few instructions on Ironlake is not time well spent. Moreover, for mix(vec4, vec4, vec4) using the accumulator introduces a dependence between the otherwise independent per-component calculations. Not using the accumulator, even if it means an extra instruction per component might be preferable. We don't know, we don't have data, and we don't have the necessary register on Ironlake for shader_time to tell us. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77703 Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-24 23:03:24 -07:00
Matt Turner	492af22fb4	i965: Remove useless typo'd debugging messages. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-24 23:03:24 -07:00
Matt Turner	f3cb2e6ed7	i965: Move brw_land_fwd_jump() to compilation unit of its use. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-24 23:03:24 -07:00
Matt Turner	424303db7f	i965/fs: Use next_insn_offset rather than nr_insn. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-24 23:03:24 -07:00
Matt Turner	99af02fb17	i965: Emit 0.0:F sources with type VF instead. Number of compacted instructions: 817752 -> 827404 (1.18%) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-24 23:03:24 -07:00
Matt Turner	fb977c90d1	i965: Emit ARF:UD for non-present src1 on Gen6+. Enables the next commits to compact more instructions. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-24 23:03:24 -07:00
Matt Turner	1acb3a290e	i965: Support compacted instructions with immediate sources. Note the weirdness with src1 subregs. The compacted immediate fields are uncompacted to bits [127:96] and the high five bits of the subreg mapping maps to bits [100:96]. Number of compacted instructions: 790085 -> 817752 (3.50%) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-24 23:03:24 -07:00
Matt Turner	8942f44c8d	i965: Use next_offset() in instruction compaction code. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-24 23:03:23 -07:00
Matt Turner	392cbc2f93	i965: Move next_offset() to brw_eu.h for use elsewhere. Also perform arithmetic on char* rather than void* since the latter is a GNU C extension not available in C++. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-24 23:03:23 -07:00
Matt Turner	e32e69cc27	i965: Rename next_ip() -> next_offset(). That we were comparing its return value with offsets should have been a clue. :) Make it take a void *store in preparation for making the function useful elsewhere. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-24 23:03:23 -07:00
Matt Turner	f0f7fb181f	i965: Print disassembly after compaction. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-24 23:03:23 -07:00
Matt Turner	b5fd762474	i965/fs: Make patch_discard_jumps_to_fb_writes return bool. ... to tell us whether it emitted any code. Will be used to determine whether we need to skip an annotation for it. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-05-24 23:03:23 -07:00
Matt Turner	a35b9cb625	i965: Add annotation data structure and support code. Will be used to print disassembly after jump targets are set and instructions are compacted, while still retaining higher-level IR annotations and basic block information. An array of 'struct annotation' will live along side the generated assembly. The generators will populate the array with their IR annotations, and basic block pointers if the instructions began or ended a basic block pointer. We'll then update the instruction offset when we compact instructions and then using the annotations print the disassembly. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-24 23:03:23 -07:00
Matt Turner	59f4e80d53	i965/fs+blorp: Remove left over dump_file arguments. Were used by the blorp unit test programs. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-05-24 23:03:23 -07:00
Matt Turner	cd1c1d302b	i965/fs: Don't hardcode DEBUG_WM in generic fs code. Similar to Paul's commit `e9fa3a944` except brw_fs_generator's debug_flag is for DEBUG_WM and DEBUG_BLORP. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-24 23:03:23 -07:00
Matt Turner	9976294e86	i965: Pass in start_offset to brw_compact_instructions(). Lets us avoid recompacting the SIMD8 instructions when we compact the SIMD16 program. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-24 23:03:22 -07:00
Matt Turner	2afdd2f40b	i965: Delete unused brw_blorp_blit_test_compile().	2014-05-24 23:03:22 -07:00
Matt Turner	dd0e1c3aff	i965/cfg: Make DO instruction begin a basic block. The DO instruction doesn't exist on Gen6+. Since before this commit, DO always ended a basic block, if it also happened to start one (e.g., a while loop inside an if statement) the block containing only the DO would actually contain no hardware instructions. Pre-Gen6's WHILE instruction jumps to the instruction following the DO, so strictly speaking we won't be modeling that properly, but I claim there is actually no functional difference. This will simplify an upcoming change where we want to mark the first hardware instruction in the loop as beginning a block, and the last instruction before the loop as ending one. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-24 23:03:22 -07:00
Jeremy Huddleston Sequoia	04ce3be401	darwin: Guard Core Profile usage behind a testing envvar Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2014-05-24 20:41:38 -07:00
Jeremy Huddleston Sequoia	9eb1d36c97	darwin: Write errors in choosing the pixel format to the crash log Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2014-05-24 20:41:35 -07:00
Joakim Sindholt	404387ecd7	nv50: count wrapped textures towards the tex_obj count But don't count their size towards the allocated memory, since that belongs to whoever created it. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-23 12:34:39 -04:00
Christoph Bumiller	caa34a7a64	nvc0: assert that we have vertex elements state Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-23 12:34:39 -04:00
Christoph Bumiller	2595682689	nvc0: use PRIxPTR for sizeof() Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-23 12:34:39 -04:00
Christoph Bumiller	7669e362ab	nv50,nvc0: allow 15,16,30 bpp display formats Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-23 12:34:39 -04:00
Christoph Bumiller	b9142c246d	nv50,nvc0: handle guard band defines [imirkin: moved default case out of switch] Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-23 12:34:39 -04:00
Christoph Bumiller	d479713d25	nv50/ir/tgsi: optimize KIL Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-23 12:34:39 -04:00
Christoph Bumiller	452a4151aa	nv50/ir: fix lowering of predicated instructions (without defs) Note that predicated instructions with defs are still not supported because transformation to SSA doesn't handle them yet. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-23 12:34:38 -04:00
Christoph Bumiller	3b0867f35b	nv50/ir/opt: fix constant folding with saturate modifier Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-23 12:34:38 -04:00
Christoph Bumiller	2f2d1b3d9b	nv50/ir/tgsi: TGSI_OPCODE_POW replicates its result Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-23 12:34:38 -04:00
Christoph Bumiller	49eccef06b	nv50,nvc0: set constbufs dirty on pipe context switch Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-23 12:34:38 -04:00
Christoph Bumiller	200382be85	nv50: setup scissors on clear_render_target/depth_stencil [imirkin: add logic to also clear the "regular" scissors] Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-23 12:34:38 -04:00
Christoph Bumiller	7d11b761f2	nv50,nvc0: always pull out bufctx on context destruction Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-23 12:34:38 -04:00
Pavel Popov	8dc4a98c44	i965: Properly return RESET status in glGetGraphicsResetStatusARB The glGetGraphicsResetStatusARB from ARB_robustness extension always returns GUILTY_CONTEXT_RESET_ARB and never returns NO_ERROR for guilty context with LOSE_CONTEXT_ON_RESET_ARB strategy. This is because Mesa returns GUILTY_CONTEXT_RESET_ARB if batch_active !=0 whereas kernel driver never reset batch_active and this variable always > 0 for guilty context. The same behaviour also can be observed for batch_pending and INNOCENT_CONTEXT_RESET_ARB. But ARB_robustness spec says: If a reset status other than NO_ERROR is returned and subsequent calls return NO_ERROR, the context reset was encountered and completed. If a reset status is repeatedly returned, the context may be in the process of resetting. 8. How should the application react to a reset context event? RESOLVED: For this extension, the application is expected to query the reset status until NO_ERROR is returned. If a reset is encountered, at least one RESET status will be returned. Once NO_ERROR is encountered, the application can safely destroy the old context and create a new one. The main problem is the context may be in the process of resetting and in this case a reset status should be repeatedly returned. But looks like the kernel driver returns nonzero active/pending only if the context reset has already been encountered and completed. For this reason the RESET status cannot be repeatedly returned and should be returned only once. The reset_count and brw->reset_count variables can be used to control that glGetGraphicsResetStatusARB returns RESET status only once for each context. Note the i915 triggers reset_count twice which allows to return correct reset count immediately after active/pending have been incremented. v2 (idr): Trivial reformatting of comments. Signed-off-by: Pavel Popov <pavel.e.popov@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-05-23 09:25:04 -07:00
Jon TURNEY	002a3a7427	appleglx: Improve error reporting if CGLChoosePixelFormat() didn't find any matching pixel formats. Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2014-05-23 15:24:09 +01:00
Jon TURNEY	5a459a036e	Fix build of appleglx Define GLX_USE_APPLEGL, as config/darwin used to, to turn on specific code to use the applegl direct renderer Convert src/glx/apple/Makefile to automake Since the applegl libGL is now built by linking libappleglx into libGL, rather than by linking selected files into a special libGL: - Remove duplicate code in apple/glxreply.c and apple/apple_glx.c. This makes apple/glxreply.c empty, so remove it - Some indirect rendering code is already guarded by !GLX_USE_APPLEGL, but we need to add those guards to indirect_glx.c, indirect_init.c (via its generator), render2.c and vertarr.c so they don't generate anything Fix and update various includes glapi_gentable.c (which is only used on darwin), should be included in shared glapi as well, to provide _glapi_create_table_from_handle() Note that neither swrast nor indirect is supported in the APPLEGL path at the moment, which makes things more complex than they need to be. More untangling is needed to allow that v2: Correct apple/Makefile.am for srcdir != builddir Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-23 15:24:07 +01:00
Jon TURNEY	45f9aae004	Make DRI dependencies and build depend on the target - Don't require xcb-dri[23] etc. if we aren't building for a target with DRM, as we won't be using dri[23] - Enable a more fine-grained control of what DRI code is built, so that a libGL using direct swrast can be built on targets which don't have DRM. The HAVE_DRI automake conditional is retired in favour of a number of other conditionals: HAVE_DRI2 enables building of code using the DRI2 interface (and possibly DRI3 with HAVE_DRI3) HAVE_DRISW enables building of DRI swrast HAVE_DRICOMMON enables building of target-independent DRI code, and also enables some makefile cases where a more detailed decision is made at a lower level. HAVE_APPLEDRI enables building of an Apple-specific direct rendering interface, which still requires additional fixing up to build properly. v2: Place xfont.c and drisw_glx.c into correct categories. Update 'make check' as well Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-23 15:24:04 +01:00
Jon TURNEY	ff90a8784c	Fix build for darwin Fix build for darwin, when ./configured --disable-driglx-direct - darwin ld doesn't support -Bsymbolic or --version-script, so check if ld supports those options before using them - define GLX_ALIAS_UNSUPPORTED as config/darwin used to, as aliasing of non-weak symbols isn't supported - default to -with-dri-drivers=swrast v2: Use -Wl,-Bsymbolic, as before, not -Bsymbolic Test that ld --version-script works, rather than just looking for it in ld --help Don't use -Wl,--no-undefined on darwin, either Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-23 15:24:01 +01:00
Emil Velikov	e0372239a5	targets/egl-static: add missing line break in ldflags Accidentally omitted by commit `7b7944ee1c`. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-05-23 15:23:59 +01:00
James Legg	846c715abb	mesa: Fix unbinding GL_DEPTH_STENCIL_ATTACHMENT glFramebufferRender(..., GL_DEPTH_STENCIL_ATTACHMENT, ..., 0) only detached the depth buffer and not the stencil buffer. Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=79115 Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-05-23 08:06:02 -06:00
Emil Velikov	d4c3968c25	targets/osmesa: limit the amount of exported symbols src/gallium/targets/osmesa/Makefile.am \| 1 + src/gallium/targets/osmesa/osmesa.sym \| 18 ++++++++++++++++++ 2 files changed, 19 insertions(+) create mode 100644 src/gallium/targets/osmesa/osmesa.sym	2014-05-23 07:40:24 -06:00
José Fonseca	172ef0c5a5	gallivm: Disable workaround for PR12833 on LLVM 3.2+. Fixed upstream.	2014-05-23 11:37:47 +01:00
José Fonseca	2c02f34fcc	gallivm: Support MCJIT on Windows. It works fine, though it requires using ELF objects. With this change there is nothing preventing us to switch exclusively to MCJIT, everywhere. It's still off though.	2014-05-23 11:37:47 +01:00
José Fonseca	94dbc16dc4	mesa/x86: Fix build with clang 3.4. It defines bit_SSE41 instead of bit_SSE4_1. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=79095 Trivial.	2014-05-23 11:37:47 +01:00
José Fonseca	c98b704128	mesa: Move declaration to top of block. To fix MSVC build. Trivial.	2014-05-23 11:37:47 +01:00
Jordan Justen	57876fee38	meta blit: Set Z texcoord during meta blit to sample the correct layer If the source renderbuffer has a depth > 0, then send a Z texcoord which is set to the source attachment Z offset. This fixes piglit's gl-3.2-layered-rendering-gl-layer-render with the GL_TEXTURE_2D_MULTISAMPLE_ARRAY case test on i965/gen8. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-23 00:56:01 -07:00
Kenneth Graunke	746921cbb4	i965: Listen to BRW_NEW_FRAGMENT_PROGRAM for 3DSTATE_PS_BLEND. brw_color_buffer_write_enabled depends on brw->fragment_program, which means we have to listen to BRW_NEW_FRAGMENT_PROGRAM. On most generations, this was only called from a function that already subscribed. However, on Broadwell, we failed to listen to the necessary event in the atom that emits 3DSTATE_PS_BLEND. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-23 00:42:06 -07:00
Kenneth Graunke	7d3985ca6c	i965: Use WE_all for FB write header setup on Broadwell. I forgot to disable writemasking on the OR and MOV which set the render target index and "source 0 alpha present to render target" bit. Using get_element_ud is equivalent and avoids a line-wrap. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-23 00:42:06 -07:00
Tobias Klausmann	f50361cce7	mesa/x86: fix typos in SSE4.1 detection Commit `a2fb71e23` introduced 32-bit code for SSE4.1. Fix compilation, and make sure to check ecx for the SSE4.1 bit. [imirkin: switch sse4.1 to look at ecx] Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-22 21:10:08 -04:00
José Fonseca	cfec135de7	mesa: Rely on USE_X86_64_ASM. This fixes MinGW x64 builds. We don't use assembly on any of the Windows builds, to avoid divergence between MSVC and MinGW when testing. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-22 22:39:46 +01:00
José Fonseca	c59c8f0363	scons: Fix x86_64 build. x86/common_x86.c is required also for x86_64 builds. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-22 22:39:42 +01:00
Carl Worth	03a0471832	docs: Import 10.1.4 release notes, add news item.	2014-05-22 11:29:49 -07:00
Matt Turner	a9bc85f3b2	mesa/x86: Brown bag fix for undeclared variable.	2014-05-22 11:02:36 -07:00
Matt Atwood	f935dfc022	i965: Use SSE4.1 runtime detection for intel_miptree_map. Previously it was a compile-time decision. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-22 10:17:16 -07:00
Matt Atwood	a2fb71e23b	mesa/x86: add SSE4.1 runtime detection. Add a bit to _mesa_x86_features for SSE 4.1, along with macros to query. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-22 10:17:16 -07:00
Matt Turner	8b9302f2b4	mesa/x86: Support SSE 4.1 detection on x86-64. Uses the cpuid.h header provided by gcc and clang. Other platforms are encouraged to switch.	2014-05-22 10:17:16 -07:00
Matt Turner	1a31657a9b	mesa: Add uninitialized_vars macro from the Linux kernel.	2014-05-22 10:17:16 -07:00
Vinson Lee	5dd927bbfc	configure.ac: Do not enable -Wl,--no-undefined on Mac OS X. This patch fixes this build error on Mac OS X. CCLD libglapi.la clang: warning: argument unused during compilation: '-pthread' clang: warning: argument unused during compilation: '-pthread' ld: unknown option: --no-undefined clang: error: linker command failed with exit code 1 (use -v to see invocation) Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-05-21 22:13:13 -07:00
Alexander von Gluck IV	d4225f803b	haiku: Add missing u_memory.h for FREE() Acked-by: Brian Paul <brianp@vmware.com>	2014-05-21 20:58:06 -04:00
Vinson Lee	8479edf3d7	configure.ac: Remove -fstack-protector-strong from LLVM flags. -fstack-protector-strong is not supported by clang. This patch fixes this build error on Fedora 20 with clang. CXX gallivm/lp_bld_debug.lo clang: error: unknown argument: '-fstack-protector-strong' Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75010 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-05-21 16:07:00 -07:00
Rob Clark	a4d229b099	freedreno/a3xx: fix blend opcode Seems the opcodes are slightly different from a2xx. Resync headers and move blend_func() helper into hw generation specific code. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-21 17:29:13 -04:00
Timothy Arceri	5a40a00089	mesa: check constant before null check For most drivers this if statement is always going to fail so check the constant value first. Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-05-22 06:52:03 +10:00
Rob Clark	b81de5352d	freedreno/a3xx: fix depth/stencil gmem restore We already multiply by bytes per pixel for this, so `f3ba7611` broke mem2gmem for depth/stencil. Drop the now-redundant multiply by cpp. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-21 16:11:46 -04:00
Eric Anholt	b11d345ab0	i965: Ask the VBO module to actually use VBOs. Note that this covers the Begin/End rendering path, but not user vertex arrays (so we can't drop copy_array_to_vbo_array() code). Improves performance of isosurf GLVERTEX\|TRIANGLES by 16.7506% +/- 4.98934% (n=20). No difference on openarena (n=10), which was why this was reverted back in `cbde276580`. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-21 11:38:55 -07:00
Rob Clark	f3ba761129	freedreno/a3xx: fix depth/stencil GMEM positioning In cases where there was no color buf bound, there were inconsistencies in register settings related to position of depth/stencil inside GMEM. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-21 12:06:38 -04:00
Rob Clark	4da8267c36	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-21 12:06:38 -04:00
Rob Clark	0d54904c04	freedreno: use OUT_RELOCW when buffer is written These aren't buffers we ever read back from CPU, so using incorrect reloc fxn wasn't really harming anything. But might as well be correct. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-21 12:06:38 -04:00
Rob Clark	cb9ed57072	rbug: add missing pipe->blit() entrypoint Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-05-21 12:06:38 -04:00
Anuj Phogat	46737cebd3	meta: Use gl_FragColor to output color values to all the draw buffers _mesa_meta_setup_blit_shader() currently generates a fragment shader which, irrespective of the number of draw buffers, writes the color to only one 'out' variable. The current shader relies on undefined behavior and possibly works by chance. From OpenGL 4.0 spec, page 256: "If a fragment shader writes to gl_FragColor, DrawBuffers specifies a set of draw buffers into which the single fragment color defined by gl_FragColor is written. If a fragment shader writes to gl_FragData, or a user-defined varying out variable, DrawBuffers specifies a set of draw buffers into which each of the multiple output colors defined by these variables are separately written. If a fragment shader writes to none of gl_FragColor, gl_FragData, nor any user defined varying out variables, the values of the fragment colors following shader execution are undefined, and may differ for each fragment color." OpenGL 4.4 spec, page 463, added an additional line in this section: "If some, but not all user-defined output variables are written, the values of fragment colors corresponding to unwritten variables are similarly undefined." V2: Write color output to gl_FragColor instead of writing to multiple 'out' variables. This'll avoid recompiling the shader every time draw buffers count is updated. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-21 08:43:14 -07:00
Anuj Phogat	bee2915210	meta: Refactor _mesa_meta_setup_blit_shader() to avoid duplicate shader code Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-21 08:43:13 -07:00
Ilia Mirkin	cdeb7004e0	tgsi: add GS_INVOCATIONS to property names array In commit `4be146b1`, I neglected to add the new property to the strings array. This leads to the string '(null)' being printed instead when converting a GS shader to text. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-21 09:31:16 -04:00
Ilia Mirkin	28360fcad7	nv50,nvc0: fix 3d blits with mipmap levels Make sure to normalize the z coordinates as well as the x/y ones when there are mipmaps present. Fixes 3d mipmap generation, which now uses the blit path. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ben Skeggs <bskeggs@redhat.com>	2014-05-21 09:31:16 -04:00
Ilia Mirkin	d2a3de19c6	nv50/ir: fix constant folding for OP_MUL subop HIGH These instructions can come in either through IMUL_HI/UMUL_HI TGSI opcodes, or from OP_DIV constant folding. Also make sure that the constant foldings which delete the original instruction still get counted as having done something. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ben Skeggs <bskeggs@redhat.com>	2014-05-21 09:31:16 -04:00
Ilia Mirkin	d3a5cf052c	nv50/ir: fix s32 x s32 -> high s32 multiply logic Retrieving the high 32 bits of a signed multiply is rather annoying. It appears that the simplest way to do this is to compute the absolute value of the arguments, and perform a u32 x u32 -> u64 operation. If the arguments' signs differ, then negate the result. Since there is no u64 support in the cvt instruction, we have to perform the 2's complement negation "by hand". This logic can come into use by the IMUL_HI instruction (very unlikely to be seen), as well as from constant folding of division by a constant. Fixes dolphin's divisions by 255. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ben Skeggs <bskeggs@redhat.com>	2014-05-21 09:31:16 -04:00
Kenneth Graunke	1472584397	i965/fs: Assume fragment color clamping is off when precompiling. Modern applications frequently use both UNORM buffers and FLOAT buffers with color clamping disabled. (FLOAT with clamping explicitly enabled and SNORM buffers appear to be less common.) We don't need to emit saturates in the fragment shader in either of the common cases. Mesa sets ctx->Color._ClampFragmentColor to false if all the color buffers are UNORM. Also, for GL_FIXED_ONLY mode (the default in legacy OpenGL), it will be false if any FLOAT buffers are bound. Since the common case is false, that should be our default. Thanks to Roland Scheidegger for pointing out some faulty logic in v1 of this patch (unnecessary code and incorrect explanations). v2: Drop superfluous code and reword commit message. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-21 00:29:30 -07:00
Sarah Sharp	c524f3ef91	egl: Add EGL_CHROMIUM_sync_control extension. Chromium defined a new GL extension (that isn't registered with Khronos). We need to add an EGL extension for it, so we can migrate ChromeOS on Intel systems to use EGL instead of GLX. http://git.chromium.org/gitweb/?p=chromium/src/third_party/khronos.git;a=commitdiff;h=27cbfdab35c601f70aa150581ad1448d0401f447 The EGL_CHROMIUM_sync_control extension is similar to the GLX extension OML_sync_control, but only defines one function, eglGetSyncValuesCHROMIUM, which is equivalent to glXGetSyncValuesOML. http://www.opengl.org/registry/specs/OML/glx_sync_control.txt Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: Jamey Sharp <jamey@minilop.net> Cc: Ian Romanick <idr@freedesktop.org> Cc: Stéphane Marchesin <stephane.marchesin@gmail.com>	2014-05-20 15:19:48 -07:00
Sarah Sharp	f6e50994e1	Import eglextchromium.h from Chromium. In order to support the (currently unregistered) Chromium-specific EGL extension eglGetSyncValuesCHROMIUM on Intel systems, we need to import the Chromium header that defines it. The file was downloaded from https://chromium.googlesource.com/chromium/chromium/+/trunk/ui/gl/EGL/eglextchromium.h It is subject to the license found at https://chromium.googlesource.com/chromium/chromium/+/trunk/LICENSE I have imported the header file and added the license text to the top. The only change was to fix the include guard on the Chromium header to change the last line from a #define to a #endif, which makes the header actually compile. Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Cc: Jamey Sharp <jamey@minilop.net> Cc: Ian Romanick <idr@freedesktop.org> Cc: Stéphane Marchesin <stephane.marchesin@gmail.com>	2014-05-20 11:31:32 -07:00
Jeremy Huddleston Sequoia	7a109268ab	darwin: Fix test for kCGLPFAOpenGLProfile support at runtime Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>	2014-05-20 10:53:43 -07:00
Rob Clark	57e68a91f5	freedreno: don't advertise texture arrays for now I think a3xx and later should support (it is part of GLES3), but this isn't needed for the time being and still needs to be reversed. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-20 10:52:56 -04:00
Jeremy Huddleston Sequoia	ff5456d1ac	glapi: Avoid heap corruption in _glapi_table Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com> Reviewed-by: Chia-I Wu <olv@lunarg.com>	2014-05-20 01:37:58 -07:00
Rob Clark	52381a7ffb	freedreno/a3xx: shadow sampler support Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-19 21:17:25 -04:00
Rob Clark	08b9180819	freedreno/a3xx/compiler: refactor trans_samp() Split it up into some smaller fxns so it doesn't grow into a huge monster as we add things. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-19 21:17:25 -04:00
Rob Clark	1686a0edc0	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-19 21:17:25 -04:00
Kenneth Graunke	2ecc7268ba	meta: Avoid _swrast_BlitFramebuffer in the meta CopyTexSubImage code. This is a replacement for `bd44ac8b5c` that should actually work. Fixes Piglit's copyteximage-border on swrast, as well as one of es3conform's packed_pixels_pixelstore test. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78546 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-19 17:18:55 -07:00
Kenneth Graunke	54540ea691	meta: Split _swrast_BlitFramebuffer out of the meta blit path. Separating the software fallbacks from the rest of the meta path (which is usually hardware accelerated) gives callers better control over their blitting options. For example, i965 might want to try meta blit, hardware blits, then swrast as a last resort. Splitting it makes that possible. This updates all callers to maintain the existing behavior (even in the few cases where it isn't desirable behavior - later patches can change that). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-19 17:18:55 -07:00
Kenneth Graunke	d89ce333cc	meta: Drop unnecessary early returns in _mesa_meta_BlitFramebuffer. These aren't necessary - all of the following code is predicated on mask being non-zero, so no code will get executed anyway. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Courtney Goeltzenleuchter <courtney@lunarg.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-19 17:18:54 -07:00
Kenneth Graunke	2fa3796bc1	Revert "i965: Don't _swrast_BlitFramebuffer when doing CopyTexSubImage." This reverts commit `bd44ac8b5c`. Fixes: Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78842 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78843 Re-breaks: Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705 but that will be fixed properly in a few commits. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-19 17:18:54 -07:00
Brian Paul	75688254d7	docs: update the prerequisites section SCons is required for Windows. Add links to flex/bison for Windows. Reorder items and improve formatting. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-19 16:15:16 -06:00
Topi Pohjolainen	21dddb22c1	i965/fbo: Only try stencil meta blits on gen >= 8 I don't have an ILK at hand but the fix should be trivial. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78872 Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-19 11:22:31 -07:00
Kenneth Graunke	0b96d362bf	mesa: Disable GL_EXT_framebuffer_multisample_blit_scaled on Broadwell. It's not properly implemented in the meta code, and we don't have time to fix it for 10.2. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-19 11:12:30 -07:00
Roland Scheidegger	1e9cbbb1c4	llvmpipe: do IR counting for shader cache management after optimization. `2ea923cf57` had the side effect of IR counting now being done after IR optimization instead of before. Some quick analysis shows that there's roughly 1.5 times more IR instructions before optimization than after, hence the effective shader cache size got quite a bit smaller. Could counter this with an increase of the instruction limit but it probably makes more sense to count them after optimizations, so move that code. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-05-19 17:07:41 +02:00
Vinson Lee	9e74de884a	i965: Rename brw_disasm to brw_disassemble_inst. Fixes build error introduced with commit `4b04152db0`. CC test_eu_compact.o test_eu_compact.c: In function ‘test_compact_instruction’: test_eu_compact.c:54:3: error: implicit declaration of function ‘brw_disasm’ [-Werror=implicit-function-declaration] brw_disasm(stderr, &src, brw->gen, false); ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78888 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-05-19 00:42:18 -07:00
Kenneth Graunke	13edd5f616	i965: Fix a "discards 'const' qualifier" warning. Trivial.	2014-05-18 23:36:48 -07:00
Kenneth Graunke	09b4f260a7	i965/fs: Finally kill struct brw_wm_compile (better known as 'c'). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:20 -07:00
Kenneth Graunke	8b994d0f3b	i965/fs: Stop copying the program key. We already have a perfectly good copy of the program key, and nobody is going to modify it. The only reason we copied it was because the brw_wm_compile structure embedded the key rather than pointing to it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:20 -07:00
Kenneth Graunke	cca6dc9f0f	i965/fs: Rip struct brw_wm_compile out of the visitors and generators. Instead, just pass the key and prog_data as separate parameters. This moves it up a level - one step further toward getting rid of it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:20 -07:00
Kenneth Graunke	2d4ac9b5b8	i965/fs: Plumb a mem_ctx all the way through the FS compile. 'c' is going away, but we still need a memory context that lives for the duration of the compile. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:20 -07:00
Kenneth Graunke	25f8fbbf2f	i965/fs: Use 'c' as the mem_ctx in fs_visitor. Previously, the memory context situation was a bit of a mess: fs_visitor allocated its own memory context, and freed it in the destructor. However, some data produced by fs_visitor (such as the list of instructions) needs to live beyond when fs_visitor is "done", so the caller can pass it to fs_generator. Everything worked out because brw_wm_fs_emit's fs_visitor variables happen to not go out of scope until the end of the function. But that meant that moving the declaration of, say, the SIMD16 fs_visitor instance, could cause everything to explode. Using a memory context that exists for the duration of the compile is clearer, and should be equivalent. Ultimately, we don't want to use 'c', but this matches the behavior of fs_generator and gen8_fs_generator, so it'll be simple to change later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:20 -07:00
Kenneth Graunke	81b11bf093	i965/fs: Actually free program data on the error path. We throw away the data generated during compilation on the success path, so we really ought to on the failure path as well. The caller has no access to it anyway, so it's purely leaked. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:20 -07:00
Kenneth Graunke	c96fdeb723	i965/fs: Replace c->key with a direct reference in the generators. 'c' is going away. This is also a bit shorter. Marking the key pointer as const will also deter people from changing it in these classes, as that's absolutely not OK. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:19 -07:00
Kenneth Graunke	65b2df3ec8	i965/fs: Replace c->key with a direct reference in fs_visitor. 'c' is going away. This is also shorter. Marking the key pointer as const will also deter people from changing it in fs_visitor, as it's absolutely not OK to modify it there. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:19 -07:00
Kenneth Graunke	b61d055d66	i965/fs: Replace c->prog_data with a direct reference in the generators. 'c' is going away. This is also a bit shorter. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:19 -07:00
Kenneth Graunke	8a04e0de8b	i965/fs: Replace c->prog_data with a direct reference in fs_visitor. 'c' is going away. This is also a bit shorter. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:19 -07:00
Kenneth Graunke	55f4e3a06b	i965/fs: Move some flags that affect code generation to fs_visitor. runtime_check_aads_emit isn't actually used currently, but I believe we should be using it on Gen4-5, so I haven't eliminated it. See https://bugs.freedesktop.org/show_bug.cgi?id=78679 for details. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:19 -07:00
Kenneth Graunke	8ef78828fa	i965/fs: Move payload register info from brw_wm_compile to fs_visitor. This data is created by fs_visitor and only used when emitting code, so keeping it in fs_visitor makes sense. I decided it would be reasonable to group these all together in a struct, since they're highly related. v2: s/nr_payload_regs/payload.num_regs/ in some comments (chrisf). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:19 -07:00
Kenneth Graunke	c76e6db05f	i965/fs: Simplify gl_SampleMaskIn handling. As far as I can tell, there's no point in allocating an extra register and generating a MOV---we can just use the copy provided as part of our thread payload directly. It's already in the right format. Of course, there are zero Piglit tests for this. We don't actually ship the extension (GL_ARB_gpu_shader5) that exposes this functionality either. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:18 -07:00
Kenneth Graunke	5cd7cf58e6	i965/fs: Rename c->sample_mask_reg to sample_mask_in_reg. This is actually for gl_SampleMaskIn, which is quite different than gl_SampleMask. Renaming should help avoid confusion. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:18 -07:00
Kenneth Graunke	db9c915abc	i965/fs: Move c->last_scratch into fs_visitor. Nothing outside of fs_visitor uses it, so we may as well keep it internal. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:18 -07:00
Kenneth Graunke	7e28bd797d	i965/fs: Move total_scratch calculation into fs_visitor::run(). With this one use gone, c->last_scratch is now only used inside fs_visitor. The rest of the driver uses prog_data->total_scratch. We already compute similar prog_data fields in fs_visitor, so this seems reasonable. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:18 -07:00
Kenneth Graunke	c51163b0cf	i965/fs: Move perf_debug about register spilling to a more obvious spot. The if (!allocated_without_spills) block is an obvious spot for this performance warning message. In the Vec4 backend, scratch is also used for indirect access of temporary arrays. The FS backend doesn't implement that yet, but if it did, this message would be inaccurate, since scratch access wouldn't necessarily mean spilling. Moving it preemptively fixes that. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-18 23:35:18 -07:00
Kenneth Graunke	db1449b700	i965: Rename brw/gen8_dump_compile to brw/gen8_disassemble. "Disassemble" is an accurate description of what this function does. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-18 23:35:18 -07:00
Kenneth Graunke	4b04152db0	i965: Rename brw_disasm/gen8_disassemble to brw/gen8_disassemble_inst. We're going to use "disassemble" for the function that disassembles the whole program. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-18 23:35:17 -07:00
Kenneth Graunke	4a2f0e305c	i965: Fix dump_prog_cache to handle compacted instructions. dump_prog_cache has interpreted compacted instructions as full size instructions, decoding garbage and complaining about invalid values. We can just use brw_dump_compile to handle this correctly in less code. The output format changes slightly, but it's still perfectly acceptable. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-18 23:35:17 -07:00
Kenneth Graunke	3285bc97ef	i965: Use brw_dump_compile for clip, SF, and old GS programs. Looping over the instructions and calling brw_disasm doesn't handle compacted instructions. In most cases, this hasn't been a problem since we don't compact prior to Sandybridge. However, Sandybridge's transform feedback GS program should already be compacted, and so this ought to fix decoding of that. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-18 23:35:17 -07:00
Ilia Mirkin	5b8f1a0f7c	nv50/ir: fix integer mul lowering for u32 x u32 -> high u32 UNION appears to expect that all of its sources are conditionally defined. Otherwise it inserts an unpredicated mov instruction which overwrites the desired result. This fixes tests that use UMUL_HI, and much less directly, unsigned integer division by a constant, which uses this functionality in a peephole pass. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ben Skeggs <bskeggs@redhat.com>	2014-05-18 17:59:16 -04:00
Ilia Mirkin	4ebaabcccb	nv50/ir: make sure that texprep/texquerylod's args get coalesced Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ben Skeggs <bskeggs@redhat.com>	2014-05-18 17:59:16 -04:00
Rob Clark	acc1651711	freedreno/a3xx: use util_format_compose_swizzles() Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-18 16:05:06 -04:00
Rob Clark	88ba9de917	freedreno/a3xx/compiler: 1D textures Gallium already gives us height==1 for these, so the texture state is already setup correctly to emulate 1D textures as a Nx1 2D texture. We just need to supply the .y coord. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-18 15:23:53 -04:00
Rob Clark	6f84f64643	freedreno: fix caps In particular, we want mesa to emulate primitive restart for us. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-18 15:22:55 -04:00
Rob Clark	f7debd4a3e	freedreno: fix index buffer offset Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-18 15:22:25 -04:00
Rob Clark	5646319f25	freedreno/a3xx: add sRGB texture support That was easy. Turns out it is just a matter of setting one bit. Enable sampling from sRGB texture, and therefore enable GL 2.1 :-) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-16 20:48:40 -04:00
Rob Clark	9227e6c98c	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-16 20:08:09 -04:00
Roland Scheidegger	3bf2d86c09	gallivm: (trivial) fix compilation with llvm 3.1, 3.2 I actually checked the getModuleIdentifier() function exists with 3.1 but missed that the file moved... This fixes https://bugs.freedesktop.org/show_bug.cgi?id=78803	2014-05-17 02:03:35 +02:00
Roland Scheidegger	3a1da0abee	gallivm: print out how long it takes to optimize shader IR. Enabled with GALLIVM_DEBUG=perf (which up to now was only used to print warnings for unoptimized code). While some unexpectedly long shader compile times for some shaders were fixed with `8a9f5ecdb1` this should help recognize such problems in the future. For now though only available in debug builds (which are not always suitable for such analysis). And since this uses system time, it might not be all that accurate (even llvmpipe's own rasterization threads might be running at the same time, or just other tasks). (llvmpipe also has LP_DEBUG=counters but this only gives an average per shader and the total time for all shaders.) This prints information like this: optimizing module fs17_variant0 took 1 msec optimizing module setup_variant_0 took 0 msec optimizing module draw_llvm_vs_variant0 took 9 msec optimizing module draw_llvm_vs_variant0 took 12 msec optimizing module fs17_variant1 took 2 msec v2: rebase for recent gallivm compilation changes, and print time for whole modules instead of functions (otherwise it would be very spammy since it would include all trivial inline sse2 functions), using the shiny new module names, prying them off LLVM using new helper (not available through C bindings). Per function timings, while possibly giving more information (if there'd be a problem only in for instance the partial not the whole function), don't seem all that useful for now. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-05-16 22:50:14 +02:00
Roland Scheidegger	26cac02c51	gallivm: give more verbose names to modules When we had just one module "gallivm" was an appropriate name. But now we have modules containing all functions for a particular variant, so give it a corresponding name (this is really just for helping debugging). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-05-16 22:50:14 +02:00
Brian Paul	ef6b6658f9	mesa: fix double-freeing of dispatch tables inside glBegin/End. We allocate dispatch tables for BeginEnd and OutsideBeginEnd. But when we destroy the context we were freeing the BeginEnd and Exec tables. If Exec==BeginEnd we did a double-free. This would happen if the context was destroyed while inside a glBegin/End pair. Now free the BeginEnd and OutsideBeginEnd pointers. Cc: "10.1", "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-05-16 07:14:57 -06:00
Matt Turner	730bc124c3	i965: Use binary literals counter select. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-15 23:31:27 -07:00
Michel Dänzer	2bab95973d	glsl_to_tgsi: Make sure the 'shader' member is always initialized Fixes the valgrind report below and random crashes with piglit on radeonsi. ==30005== Conditional jump or move depends on uninitialised value(s) ==30005== at 0xB13584E: st_translate_program (st_glsl_to_tgsi.cpp:5100) ==30005== by 0xB14698B: st_translate_fragment_program (st_program.c:747) ==30005== by 0xB14777D: st_get_fp_variant (st_program.c:824) ==30005== by 0xB11219C: get_color_fp_variant (st_cb_drawpixels.c:1042) ==30005== by 0xB1131AE: st_DrawPixels (st_cb_drawpixels.c:1154) ==30005== by 0xAFF8806: _mesa_DrawPixels (drawpix.c:162) ==30005== by 0x4EB86DB: stub_glDrawPixels (generated_dispatch.c:6640) ==30005== by 0x4F1DF08: piglit_visualize_image (piglit-util-gl.c:1574) ==30005== by 0x40691D: draw_image_to_window_system_fb(int, bool) (draw-buffers-common.cpp:733) ==30005== by 0x406C8B: draw_reference_image(bool, bool) (draw-buffers-common.cpp:854) ==30005== by 0x40722A: piglit_display (alpha-to-coverage-dual-src-blend.cpp:117) ==30005== by 0x4EA7168: run_test (piglit_fbo_framework.c:52) Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-16 11:12:45 +09:00
Roland Scheidegger	b416645387	gallivm: remove optimization workaround when not having sse 4.1 This workaround doesn't list any llvm version, but it was introduced 2010-06-10 (`e277d5c1f6`). It is unlikely this bug is still present in llvm versions we support (3.1+). There's no specific test listed, but I ran lp_test_arit (which uses the mentioned functions) on llvm 3.1 and 3.3 with sse41 disabled and this pass enabled without issues. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-05-16 01:09:34 +02:00
Roland Scheidegger	93731fbeec	gallivm: remove workaround for reversing optimization pass order. 32bit code generation and llvm >= 2.7 used a different optimization pass order - this code was initially introduced (2010-07-23) by `815e79e72c`, apparently due to buggy code being generated with then brand new llvm versions (which was llvm 2.7 plus pre 2.8 devel). It seems very highly likely that whatever this bug was it has been fixed in newer llvm versions, though there's no easy way to test this - the mentioned piglit test has been removed years ago, and even if you'd build it I'm sceptical the glsl compiler would still produce the required code to trigger it. I have no idea what a good order of passes is, but just remove the workaround and use the same order everywhere. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-05-16 01:09:34 +02:00
Matt Turner	8a6f7dfc19	i965/gen8: Make disassembly function match brw's signature. gen8_dump_compile will be called indirectly by code common used by generations before and after the gen8 instruction format change. Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-15 15:45:40 -07:00
Matt Turner	1ef52d6ab3	i965: Pass brw_context and assembly separately to brw_dump_compile. brw_dump_compile will be called indirectly by code common used by generations before and after the gen8 instruction format change. Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-15 15:45:40 -07:00
Matt Turner	74b252d270	i965: Pull brw_compact_instructions() out of brw_get_program(). Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-15 15:45:40 -07:00
Matt Turner	cce3bea2a7	i965/disasm: Align send instruction meta-information with dst. Has been misaligned since we added instruction offset prefixes. Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-15 15:45:40 -07:00
Matt Turner	e00fe451b8	i965/disasm: Disassemble the compaction control bit. brw_disasm doesn't disassemble compacted instructions, so we uncompact before disassembling them which would unset the compaction control bit. Instead pass it as a separate argument. Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-15 15:45:40 -07:00
Matt Turner	58bcf5996d	i965/cfg: Embed exec_node in bblock_link. In order to remove bblock_link's inheritance of exec_node. Also makes linked list walk code much nicer. Acked-by: Eric Anholt <eric@anholt.net>	2014-05-15 15:45:40 -07:00
Matt Turner	a77023c992	i965/cfg: Make brw_cfg.h closer to C-includable. Only bblock_link's inheritance left. Acked-by: Eric Anholt <eric@anholt.net>	2014-05-15 15:45:40 -07:00
Matt Turner	d4d843e02f	i965/cfg: Protect brw_cfg.h from multiple inclusion. Acked-by: Eric Anholt <eric@anholt.net>	2014-05-15 15:45:39 -07:00
Matt Turner	9b0108ddc1	glsl: Add C-callable fprint_ir function. Acked-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-15 15:45:39 -07:00
Topi Pohjolainen	d45fadf11a	i965/fb: Use meta path for stencil up/downsampling Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-05-15 21:39:33 +03:00
Topi Pohjolainen	475216a4f0	i965/meta: Stencil blit for miptree updownsampling Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-15 21:39:33 +03:00
Topi Pohjolainen	b18f6b9b86	i965/fb: Use meta path for stencil blits This is effective only on gen8 for now as previous generations still go through blorp. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-15 21:39:33 +03:00
Topi Pohjolainen	d1829badf5	i965/meta: Stencil blits v2: Create the intel renderbuffer with level hardcoded to zero instead of overriding it in the surface state configuration. Also moved the dimension adjustments for tiling, mip level, msaa into the render buffer creation. Finally prepares for another blit path needed for miptree updownsampling. v3 (Ken): Dropped unnecessary memory context for "ralloc_asprintf()" Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-05-15 21:39:33 +03:00
Topi Pohjolainen	9d752c098c	i965: Extend brw_get_rb_for_first_slice() for specified level/layer v2: Configure stencil directly for final dimensions instead of adjusting bit by bit for tiling, mip level and msaa. v3 (Ken): Used non-static constant for horizontal alignment Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-15 21:39:33 +03:00
Topi Pohjolainen	36caae48b2	i965/gen8: Surface state overriding for stencil v2: Allow hardware to offset accesses to individual layers. Also leave the mip-level overriding for the creator of the intel renderbuffer to handle. Merged with "i965/gen8: Allow stencil buffers to be configured as single sampled" Ken: I left the "_mesa_problem()" still in place. I think it is clearer to remove it in a separate patch. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-15 21:39:32 +03:00
Topi Pohjolainen	6aefaa4eb2	i965/wm: Surface state overrides for configuring w-tiled as y-tiled v2: Use intel_mipmap_tree::total_width in order to get correct alignment automatically. Also use "mt->total_height / mt->physical_depth0" as surface height allowing hardware to offset to correct slice. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-15 21:39:32 +03:00
Jordan Justen	103057b2b7	i965 meta up/downsample: Fix renderbuffer _BaseFormat mt->format is of type mesa_format, and therefore can't be used with _mesa_base_fbo_format which requires a GLenum input. On gen8, this fixes various piglit fbo-depthstencil tests with samples > 1. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-15 10:49:05 -07:00
Matt Turner	255357f79b	i965: Delete current_insn() function.	2014-05-15 10:35:55 -07:00
Matt Turner	006232bcde	i965: Remove blorp unit tests. They've served their purpose (in transitioning blorp to using fs_generator) and now they just necessitate large amounts of manual labor to regenerate if the disassembler changes. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-15 10:35:55 -07:00
Emil Velikov	39ae284a69	egl-static: include libradeonwinsys.la only once With this and the previous patch, we no longer have multiple definitions in the final egl_gallium.so. v2: Drop duplicate libloader link. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Chia-I Wu <olv@lunarg.com> (v1) Reviewed-by: Tom Stellard <thomas.stellard@amd.com> (v1)	2014-05-15 17:32:31 +01:00
Emil Velikov	d812c74582	gallium/radeon: link in libradeon.la at target level It makes more sense to link the core and common parts of the driver as the target is built. Additionally this will help us drop duplicating symbols for targets that static link multiple pipe-drivers. Only egl-static needs that currently with more to come. To simplify things a bit add HAVE_GALLIUM_RADEON_COMMON variable. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-05-15 17:32:30 +01:00
Emil Velikov	6fcc0b0ba5	gallium/radeon: build only a single common library libradeon Just fold libllvmradeon in libradeon. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-05-15 17:32:30 +01:00
Rob Clark	670418740f	freedreno/a3xx: fix write to bogus register The loops for updating the multiple packed fields in SP_VS_OUT[] and SP_VS_VPC_DST[] will zero out one register beyond the last one required. Which is normally not a problem (and is kinda convenient when looking at cmdstream dumps) unless we have maximum (16) varyings. Fix loop termination condition so that this does not happen. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-14 21:26:35 -04:00
Rob Clark	c37889b5ac	freedreno/a3xx: account for special inputs/outputs We need to size input/output tables big enough for special inputs/ outputs (gl_Position, gl_FrontFacing, etc) which, while they don't count towards the hw limit of 16 attributes or 16 varyings, we do still need to track them all the same. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-14 21:26:35 -04:00
Rob Clark	5dcf59e142	freedreno/a3xx: fix MAX_INPUTS shader cap Hardware only supports 16. Which fd3_shader_variant properly reflected, but the pipe cap did not, leading to array overflow (and shaders that could not possibly work). Also a bunch of asserts to make problems like this easier to see. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-14 21:25:53 -04:00
Rob Clark	e1896948da	freedreno/a3xx: add debug flag to expose glsl130 We are starting to add integer support to the compiler, which does not get exercised with glsl feature level 120 and without advertising integer support. But doing so breaks too many things right now. So for now use a debug flag to conditionally expose the functionality while it is in development. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-14 21:20:29 -04:00
Ryan Houdek	ac2a8e3c9d	freedreno/a3xx/compiler: add KILL_IF The KILL_IF opcode could potentially be merged in to the regular KILL opcode function. It was a pain to do so, so I've left it separated for cleanliness. Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-14 21:19:43 -04:00
Ryan Houdek	a889049400	freedreno/a3xx/compiler: start adding integer support Adds a large sum of TGSI opcodes to the a3xx compiler. For integer opcodes we have 28 opcodes added. Adds 4 floating point compare opcodes. If GLSL 1.30 is enabled, this allows the GLSL 1.30 piglits to have a completion amount of 432/641. Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com> Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-14 21:19:21 -04:00
Roland Scheidegger	8620730f8a	draw: better llvm names for shaders for debugging. All shaders had the same name. We could probably use some identifier per shader too, but for now only use the variant number. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-05-15 02:35:35 +02:00
Roland Scheidegger	65ad90bd1b	llvmpipe: improve setup shader names (for debugging) The setup shaders were composed of both a fs shader number and a variant number. But since they aren't tied to a particular fragment shader, the former was a fixed zero while the latter was also always zero because it was never assigned. So, similar to what the fs code does, use an ever-increasing number to give it a more catchy name (unlike fragment shaders though where this number is for each explicitly created shader, we just use it for the implicitly created variants). And while here, fix whitespace a bit. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-05-15 02:35:29 +02:00
Roland Scheidegger	1d28650b55	llvmpipe: kill off llvmpipe_variant_count Unused except it was increased for both fs and setup shader variants created. Probably some leftover from ages ago. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-05-15 02:35:26 +02:00
Roland Scheidegger	3e817e7e56	mesa/st: fix number of ubos being declared in a shader Previously the code used the total number of ubos being declared in the linked program (so the ubos of all shaders combined), use the number from the particular shader instead. This fixes an assertion failure with piglit arb_uniform_buffer_object-maxblocks seen in llvmpipe since `8a9f5ecdb1` as it now emits code for each declared buffer, not just the ones actually used. CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-05-15 02:35:25 +02:00
Ben Skeggs	9c64cb80d2	nvc0: enable support for maxwell boards Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-15 09:54:54 +10:00
Ben Skeggs	d548d47edf	nvc0: add maxwell (sm50) compiler backend The big missing part here is proper sched data calculations, but hopefully the chosen placeholder will be sufficient for now. Passes piglit as well as GK107 does. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-15 09:54:49 +10:00
Ben Skeggs	7b9475fa65	nvc0: maxwell isa has no per-instruction join modifier Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-15 09:54:46 +10:00
Ben Skeggs	07d3972b49	nvc0: replace immd 0 with $rLASTGPR for emit/restart opcodes Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-15 09:54:42 +10:00
Ben Skeggs	3723ff5223	nvc0: move nvc0 lowering pass class definitions into header Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-15 09:54:39 +10:00
Ben Skeggs	bede1bdb48	nvc0: bump sched data member to 32-bits SM50 backend requires 21 bits per instruction, not 8. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-15 09:54:34 +10:00
Ben Skeggs	c42d7556d3	nvc0: use vertex arrays for eng3d blit Maxwell doesn't have immediate-mode. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-15 09:54:29 +10:00
Ben Skeggs	edb1020ea5	nvc0: restrict "constant vbo" logic to fermi/kepler classes Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-15 09:54:25 +10:00
Ben Skeggs	322460fdbc	nvc0: replace some vb->stride checks with constant_vbo instead Maxwell no longer has the methods to set constant attributes, and we'll want to be treating stride 0 vtxbufs the same as for stride > 0. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-15 09:54:21 +10:00
Ben Skeggs	9306c3470f	nvc0: add maxwell class Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-15 09:54:16 +10:00
Ben Skeggs	0079a375a5	nvc0: allow for easier modification of compiler library routines Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-15 09:54:12 +10:00
Ben Skeggs	737477dac3	nvc0: properly distribute macros in source form Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-15 09:53:56 +10:00
Emil Velikov	e48054d036	docs: Add a note about llvm-shared-libs and libxatracker Both changes landed in 10.2, and for people not following the development cycle these will come as a surprise. Note that the pipe_* interface is not stable. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Rob Clark <robclark@freedesktop.org>	2014-05-14 23:44:08 +01:00
Brad King	6aac2637a6	automake: Honor GL_LIB for gallium libgl-xlib Use "@GL_LIB@" in src/gallium/targets/libgl-xlib/Makefile.am to produce the library name specified by the configure --with-gl-lib-name option. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-14 23:44:08 +01:00
Emil Velikov	f57d092199	configure: correctly set LD_NO_UNDEFINED Commit `11623be934` was meant to have this hunk, which I accidentally dropped during git rebase. Cc: 10.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Julien Cristau <jcristau@debian.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jonathan Gray <jsg@jsg.id.au>	2014-05-14 23:44:08 +01:00
Roland Scheidegger	8a9f5ecdb1	gallivm: only fetch pointers to constant buffers once In `1d35f77228` support for multiple constant buffers was introduced. This meant we had another indirection, and we did resolve the indirection for each constant buffer access. This looks very reasonable since llvm can figure out if it's the same pointer, however it turns out that this can cause llvm compilation time to go through the roof and beyond (I've seen cases in excess of factor 100, e.g. from 50 ms to more than 10 seconds (!)), with all the additional time spent in IR optimization passes (and in the end all of it in DominatorTree::dominate()). I've been unable to narrow it down a bit more (only some shaders seem affected, seemingly without much correlation to overall shader complexity or constant usage) but it is easily avoidable by doing the buffer lookups themselves just once (at constant buffer declaration time). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-05-14 16:23:33 +02:00
Roland Scheidegger	18c6454ad1	gallivm: fix output stream flushing in error case for disassembly. When there's an error, also need to flush the stream, otherwise an assertion is hit (meaning you don't actually see the error either).	2014-05-14 16:23:33 +02:00
Michel Dänzer	c5828b0599	radeonsi: Fix anisotropic filtering state setup Bring it back in line with r600g. I broke this in the original radeonsi bringup. :( Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78537 Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-05-14 22:53:30 +09:00
Ilia Mirkin	12d97fb7c1	tgsi: support parsing texture offsets from text tgsi shaders Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-14 09:40:37 -04:00
Ilia Mirkin	04b7e65814	mesa/st: provide native integers implementation of ir_unop_any Previously, ir_unop_any was implemented via a dot-product call, which uses floating point multiplication and addition. The multiplication was completely pointless, and the addition can just as well be done with an or. Since we know that the inputs are booleans, they must already be in canonical 0/~0 format, and the final SNE can also be avoided. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-14 09:40:37 -04:00
Rob Clark	209522070e	gallium/docs: clarify when query results are reset It wasn't completely clear from the docs, so I had to figure out by looking at piglit results. Hopefully this saves the next driver writer implementing queries some time. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-14 07:54:02 -04:00
José Fonseca	b18b7781b2	gallivm: Remove lp_func_delete_body. Not necessary, now that we will free the whole module (hence all function bodies) immediately after compiling. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-14 11:05:00 +01:00
José Fonseca	a6f5cc66db	gallivm: Remove gallivm_free_function. Unused. Deprecated by gallivm_free_ir(). Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-14 11:05:00 +01:00
José Fonseca	0b239d9ed9	llvmpipe: Delete unneeded LLVM stuff earlier. Same as Frank's change to draw module but for llvmpipe module. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-14 11:05:00 +01:00
Frank Henigman	ef14f0d59f	draw: Delete unneeded LLVM stuff earlier. Free up unneeded LLVM stuff immediately after generating vertex shader code. Saves about 500K per shader. v2: Don't bother calling gallivm_free_function (Jose) Signed-off-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-14 11:05:00 +01:00
Frank Henigman	865d0312c0	gallivm: Separate freeing LLVM intermediate data from freeing final code. Split free_gallivm_state() into two steps. First step is gallivm_free_ir() which cleans up the LLVM scaffolding used to generate code while preserving the code itself. Second step is gallivm_free_code() to free the memory occupied by the code. v2: s/gallivm_teardown/gallivm_free_ir/ (Jose) Signed-off-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-14 11:05:00 +01:00
Frank Henigman	2c73102dc3	gallivm: One code memory pool with deferred free. Provide a JITMemoryManager derivative which puts all generated code into one memory pool instead of creating a new one each time code is generated. This saves significant memory per shader as the pool size is 512K and a small shader occupies just several K. This memory manager also defers freeing generated code until you tell it to do so, making it possible to destroy the LLVM engine while keeping the code, thus enabling future memory savings. v2: Fix compilation errors with LLVM 3.4 (Jose) Signed-off-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-14 11:05:00 +01:00
José Fonseca	2ea923cf57	gallivm: Run passes per module, not per function. This is how it is meant to be done nowadays. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-14 11:05:00 +01:00
José Fonseca	920933e09e	gallivm: Use LLVM global context. I saw that LLVM internally uses its global context for some things, even when we use our own. Given ours is also global, might as well use LLVM's. However, separate contexts can still be enabled with a simple source code modification, for when the need/benefit arises. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-14 11:05:00 +01:00
José Fonseca	69f0835ff1	gallivm: Stop using module providers. Nowadays LLVMModuleProviderRef is just an alias for LLVMModuleRef, so its use just causes unnecessary confusion. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-14 11:05:00 +01:00
José Fonseca	9cf67e51b0	gallivm,draw,llvmpipe: Remove support for versions of LLVM prior to 3.1. Older versions haven't been tested and probably don't work anyway. But more importantly, code supporting it is hindering further work. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-14 11:04:59 +01:00
José Fonseca	ecef2da0b2	configure: Require LLVM 3.1. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-14 11:04:59 +01:00
José Fonseca	c0ef9a67d3	scons: Require LLVM 3.1 Support for prior versions will be removed in the following change. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-14 11:04:59 +01:00
Matt Turner	2012599abb	i965: Reformat brw_set_src1 so it can be easily found with grep.	2014-05-13 22:40:01 -07:00
Samuel Iglesias Gonsalvez	e0dc018fd5	i965: fix size assert for gen7 in brw_init_compaction_tables() It should compare with its own size. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>	2014-05-13 22:35:42 -07:00
Iago Toral Quiroga	520dfa4b5c	i965: Relax accumulator dependency scheduling on Gen < 6 Many instructions implicitly update the accumulator on Gen < 6. The instruction scheduling code just calls add_barrier_deps() for each accumulator access on these platforms, but a large class of operations don't actually update the accumulator -- mostly move and logical instructions. Teaching the scheduling code about this would allow more flexibility to schedule instructions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77740 Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-13 22:33:59 -07:00
Jonathan Gray	0c0bbe77d0	glsl: simplify the M_PIf macros, fixes build on OpenBSD The M_PIf macros used a preprocessor paste to append 'f' to M_PI defines, which works if the values are only numbers but breaks on OpenBSD where M_PI definitions have casts and brackets to meet requirements of a future version of POSIX, http://austingroupbugs.net/view.php?id=801 http://austingroupbugs.net/view.php?id=828 Simplify the M_PI*f macros by using casts directly in the defines as suggested by Kenneth Graunke. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78665 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jonathan Gray <jsg@jsg.id.au>	2014-05-13 22:30:22 -07:00
Carl Worth	a5769ad373	docs: Really add the 10.1.3 release notes this time Commit `a96c3bccf6` intended to add these, but I forgot to add the file.	2014-05-13 17:30:17 -07:00
Rob Clark	f999c13176	freedreno/a3xx: occlusion query support Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-13 18:33:19 -04:00
Rob Clark	b8f78e1890	freedreno: add support for hw queries Real GPU queries need some infrastructure to track samples per tile and accumulate the results. But fortunately this can be shared across GPU generation. See: https://github.com/freedreno/freedreno/wiki/Queries#hardware-queries Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-13 18:33:19 -04:00
Rob Clark	13a0cf4480	freedreno/query: allow multiple query implementations Split out fd_query into an abstract base class, to allow multiple implementations. The current sw based queries are moved into fd_sw_query. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-13 18:33:19 -04:00
Kenneth Graunke	2265bda513	mesa: Dump ARB_vp/fp source and IR when MESA_GLSL=dump. As far as I can tell, Mesa hasn't had a convenient way to dump ARB_vp/fp source until now. Using MESA_GLSL=dump is convenient, since it means you can use a single environment variable to dump a program's shaders, no matter which language they're written in. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-13 15:32:16 -07:00
Kenneth Graunke	bd44ac8b5c	i965: Don't _swrast_BlitFramebuffer when doing CopyTexSubImage. The point of copytexsubimage_using_blit_framebuffer is to use a hardware accelerated BlitFramebuffer path. If that fails, we shouldn't do a swrast blit---we should try our CTSI fallback code. This is especially important for i965 and GLES, where we don't even create a swrast context. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-13 15:32:16 -07:00
Jordan Justen	c51c192891	i965/gen8: Set depth extent field The depth extent field is used to limit the allowed slice range that can be rendered to. With the previous setting, only slice 0 could be rendered. This fixes piglit amd_vertex_shader_layer-layered-depth-texture-render. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-13 14:26:41 -07:00
Jordan Justen	294ada2fef	i965/gen8 depth: Set depth size based on LOD0 for 3D textures Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-13 14:25:58 -07:00
Jordan Justen	e6d6ed55ab	i965/gen7 depth: Set depth size based on LOD0 for 3D textures Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-13 14:25:58 -07:00
Jordan Justen	e47d08adef	i965/gen8 renderbuffer: Set depth size based on LOD0 for 3D textures Fixes piglit's 'gl-3.2-layered-rendering-clear-color-all-types 3d mipmapped' Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-13 14:25:58 -07:00
Jordan Justen	b875f39e29	i965/gen7 renderbuffer: Set depth size based on LOD0 for 3D textures If blorp is disabled for color clears, then piglit's 'gl-3.2-layered-rendering-clear-color-all-types 3d mipmapped' will fail. Currently, gen8 fails similarly on this test because gen8 does not use blorp. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-13 14:25:57 -07:00
Rob Clark	521ee86db7	freedreno/a3xx: add point-size Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-13 16:54:37 -04:00
Rob Clark	a13a798926	freedreno: update generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-13 16:54:20 -04:00
Bryan Cain	4e974a9cf3	glsl_to_tgsi: remove unnecessary dead code elimination pass With the more advanced dead code elimination pass already being run, eliminate_dead_code was making no difference in instruction count, and had an undesirable O(n^2) runtime. So remove it and rename eliminate_dead_code_advanced to eliminate_dead_code. Reviewed-by: Marek Olšák <marek.olsak at amd.com>	2014-05-13 14:57:55 -05:00
José Fonseca	1646f4d0fb	ralloc: Omit detailed license information about talloc. That information misleads source code auditing tools to think that ralloc itself is released under LGPL v3. Instead, simply state talloc is not licensed under a permissive license. v2: Use wording suggested by Kenneth. Reviewed-by: Brian Paul <brianp@vmware.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-13 12:48:38 +01:00
Iago Toral Quiroga	5421617325	i965: Avoid redundant call to brw_merge_inputs() in brw_try_draw_prims() We always call brw_merge_inputs() right before looping over the primitives but this can be called inside the loop for each primitive too. In the case we do it for the first primitive the call is redundant and can be skipped. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-13 10:09:35 +02:00
Iago Toral Quiroga	a143fbb322	glsl: Do not call lhs->variable_referenced() multiple times Instead take the result from the first call and use it where needed. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-13 10:01:02 +02:00
Topi Pohjolainen	2a549c43a8	meta: Refactor state save/restore for framebuffer texture blits Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-13 10:04:25 +03:00
Kristian Høgsberg	06842d436e	wayland: Move version 2 request to end of interface specification We're moving towards requiring interface additions to be appended to the end of the interface block. No functional change, opcodes are assigned as before, but version 2 additions are now grouped together, which prevents a scanner warning. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2014-05-12 15:55:21 -07:00
Timothy Arceri	9c9dd8ca93	glsl: the number of samplers is already calculated so use it Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-13 07:40:08 +10:00
Eric Anholt	afe3d1556f	i965: Stop doing remapping of "special" regs. Now that we aren't using pixel_[xy] in live variables, nothing is looking at these regs after the visitor stage. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-12 09:50:32 -07:00
Eric Anholt	66f5c8df06	i965: Generalize the pixel_x/y workaround for all UW types. This is the only case where a fs_reg in brw_fs_visitor is used during optimization/code generation, and it meant that optimizations had to be careful to not move pixel_x/y's register number without updating it. Additionally, it turns out we had a couple of other UW values that weren't getting this treatment (like gl_SampleID), so this more general fix is probably a good idea (though I wasn't able to replicate problems with either pixel_[xy]'s values or gl_SampleID, even when telling the register allocator to reuse registers immediately) Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-12 09:49:27 -07:00
Eric Anholt	11bef60d09	i965: Move has_hiz from the slice to the level. The value depends only on the level, so no need to store the bool per slice. Shrinks intel_mipmap_slice from 24 bytes to 16, while slotting into an existing hole in intel_mipmap_level. Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-12 09:49:18 -07:00
Topi Pohjolainen	4dc9c314c8	meta: Refactor configuration of renderbuffer sampling Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-12 17:48:45 +03:00
Topi Pohjolainen	a2952315ac	meta: Refactor binding of renderbuffer as texture image Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-12 17:48:45 +03:00
Topi Pohjolainen	ac4db0aa55	meta: Merge compiling and linking of blit program Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-12 17:48:45 +03:00
Topi Pohjolainen	3a43cd0c3e	i965/blorp: Expose coordinate scissoring and mirroring Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-12 17:48:45 +03:00
Topi Pohjolainen	4a92ad5531	i965/gen8: Use helper variables for surface parameters Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-12 17:48:45 +03:00
Ilia Mirkin	8baed87212	nv50,nvc0: fix blit 3d path for 1d array textures Need to adjust coordinates since the shader receives the array index as depth in z, but the TEX instruction expects it to be the second coordinate for a 1D array texture. This fixes fbo-generatemipmap-array. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-11 19:26:31 -04:00
Ilia Mirkin	4467c0c9fb	nv50,nvc0: leave queries on during blit, turn them on for 2d engine Fixes the new logic of the conditional rendering piglit test. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-11 19:26:31 -04:00
Ilia Mirkin	64a7ddf40d	mesa/st: leave current query enabled during glBlitFramebuffer Also make sure that pipe_blit_info gets zero'd out so that query isn't accidentally left enabled. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-05-11 19:26:31 -04:00
Ilia Mirkin	752ce0affb	gallium: add bit to pipe_blit_info to leave current query enabled Previously the implication was that queries should be disabled during blits. However glBlitFramebuffer() is supposed to obey the current query, and this new bit will indicate that to the driver. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-05-11 19:26:31 -04:00
Ilia Mirkin	863573b9cb	nv50: fix setting of texture ms info to be per-stage Different textures may be bound to each slot for each stage. So we need to be able to upload ms parameters for each one without stages overwriting each other. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-05-11 19:26:31 -04:00
Ilia Mirkin	68f47cad0d	nv50/ir: make sure to reverse cond codes on all the OP_SET variants Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Cc: "10.2 10.1" <mesa-stable@lists.freedesktop.org>	2014-05-11 19:26:31 -04:00
Rob Clark	83b4ec03e7	freedreno/a2xx: fix compiler warning Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-11 08:58:20 -04:00
Marek Olšák	d9e102b220	radeonsi: prepare depth export registers at compile time Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-05-10 13:58:46 +02:00
Marek Olšák	9baaa5dd4f	radeonsi: simplify depth/stencil export code Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-05-10 13:58:46 +02:00
Marek Olšák	bd2df40a84	radeon/llvm: add support for non-scalar system values The sample position is one of them. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-05-10 13:58:46 +02:00
Marek Olšák	250aa93e23	radeonsi: add and use a helper function for loading constants Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-05-10 13:58:46 +02:00
Marek Olšák	86035cd88d	radeonsi: only count CS space for state atoms if we're going to draw Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-05-10 13:58:46 +02:00
Marek Olšák	023d367ae6	radeonsi: remove unused variable exports_ps in si_pipe_shader_ps Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-05-10 13:58:46 +02:00
Marek Olšák	315f3c171d	radeonsi: use DRAW_PREAMBLE on CIK It's the same as setting the 3 regs separately, but shorter, and it also seems to be required on GFX7.2 and later. This doesn't fix Hawaii. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-05-10 13:58:46 +02:00
Marek Olšák	58c659703b	r600g: simplify framebuffer state size computation Take the upper bound. The number doesn't have to be absolutely correct, only safe. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-05-10 13:58:46 +02:00
Kenneth Graunke	155f98d49f	Revert "i965: Fix depth (array slices) computation for 1D_ARRAY render targets." This reverts commit `e6967270c7`. Chris Forbes pointed out that this is broken for texture views which restrict the number of slices. He committed a better fix which makes this unnecessary. Cc: "10.2" <mesa-stable@lists.freedesktop.org>	2014-05-09 20:08:38 -07:00
Emil Velikov	a3e78bab7f	egl_dri2: cleanup memory leak in dri2_create_context() Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-10 02:09:02 +01:00
Emil Velikov	42770ff94e	ilo: destroy the mutex, if winsys creation fails Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2014-05-10 02:09:02 +01:00
Emil Velikov	326b8e253e	glx/tests: Partially revert commit `51e3569573` C++ does not support designated initializers, thus compilation is not guaranteed to succeed. Surprisingly gcc 4.6.3 fails to build the code, while version 4.9.0 compiles it without a hitch. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78403 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Vinson Lee <vlee@freedesktop.org>	2014-05-10 02:08:36 +01:00
Emil Velikov	e477d12c33	configure: error out if building GBM without dri Both backends require --enable-dri, and building an empty libgbm makes little to no sense. Error out at configure to prevent the user from shooting themselves in the foot. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78225 Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-10 02:08:36 +01:00
Chia-I Wu	510465016b	mesa: propagate FragDepthLayout to gl_program The information was lost during linking, causing the layout to be treated as FRAG_DEPTH_LAYOUT_NONE. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-09 17:21:53 -07:00
Chris Forbes	417f5ea00d	glsl: Rename linker's is_varying_var Both the ast->IR and linker have functions with this name, but different behavior. Rename the linker's version to var_counts_against_varying_limit to be closer to what it is actually used for. Suggested by Ian a while back. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-05-10 09:29:13 +12:00
Kenneth Graunke	9584959123	i965: Fix GPU hangs on Broadwell in shaders with some control flow. According to the documentation, we need to set the source 0 register type to IMM for flow control instructions that have both JIP and UIP. Fixes GPU hangs in approximately 10 Piglit tests, 5 es3conform tests, Unigine Crypt, a WebGL raytracer demo, and several Steam titles. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75478 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75878 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76939 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Tested-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-09 14:18:13 -07:00
Tom Stellard	93c2ebbd83	radeonsi: Enable geometry shaders with LLVM 3.4.1 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-05-09 12:16:05 -04:00
Tom Stellard	c5d0008325	configure.ac: Add LLVM_VERSION_PATCH to DEFINES Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> CC: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-05-09 12:16:05 -04:00
Carl Worth	a96c3bccf6	docs: Import 10.1.3 release notes, and news item.	2014-05-09 07:52:26 -07:00
Thomas Hellstrom	9306b7c171	st/xa: Fix performance regression introduced by commit "Cache render target surface" The mentioned commit has the nasty side-effect of turning off accelerated copies. Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2014-05-09 08:40:12 +02:00
Tom Stellard	c5f0c98c49	clover: Destroy pipe_screen when device does not support compute v2 v2: - Make sure screen was successfully created before destroying it. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-05-09 04:33:03 -04:00
Tom Stellard	c650033b86	pipe-loader: Don't destroy the winsys in the sw loader The screen takes ownership of the winsys, and is responsible for destroying it. Users of pipe-loader should make sure they destroy any screens they've created to avoid memory leaks. This fixes a crash in clover introduced by `ce6c17c083` where the pipe-loader was destroying the winsys while a screen was still using it. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-09 04:32:54 -04:00
Chris Forbes	23e9f06569	i965/Gen8: Set up layer constraints properly for depth buffers Same issues as the previous commit fixed for Gen7: - Bogus physical->logical layer conversion; depth/stencil surfaces are still IMS layout on Gen8. - mt_layer ignored in layered rendering case, which breaks handling of views with MinLayer. - Render target array extent not set correctly for arrays. I'm not able to test this one since I can't get a Broadwell yet, but it's the same set of fixes as for Gen7. V2: Restore the MAX2() to account for zero depth/layer_count. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-09 09:46:20 +12:00
Chris Forbes	77d55ef481	i965/Gen7: Set up layer constraints properly for depth buffers Again, a few problems: - Layered attachments did not honor MinLayer. - Non-layered MSAA attachments rendered to the wrong layer due to dividing by the layer count. All depth buffers use the IMS layout, so the physical layer count == logical layer count. - Layered attachments were not limited to irb->layer_count, so we could render off the end of the texture. V2: Restore the MAX2() to account for zero depth/layer_count. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-09 09:46:19 +12:00
Chris Forbes	9269ea599c	i965/Gen8: Set up layer constraints properly for renderbuffers Fixing the same issues the previous commit does for Gen7. Note that I can't test this one, since I don't have a Broadwell. V2: Restore the MAX2() to account for zero depth/layer_count. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-09 09:46:19 +12:00
Chris Forbes	dd43900b7b	i965/Gen7: Set up layer constraints properly for renderbuffers There were a few problems here, which mostly just broke layered rendering into a view: - Render target view extent was always set to be == depth. This is benign for non-layered-rendering, but allows writes off the end of the render target for layered rendering, which ends badly. - Layered rendering did not honor the mt_layer setting, so would not properly handle MinLayer being set on a view. V2: Restore the MAX2() to account for zero depth/layer_count. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-09 09:46:19 +12:00
Chris Forbes	cc8c00da88	i965: Fix typo in assert message Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-09 09:46:19 +12:00
Adam Jackson	74388dd24b	radeonsi: Don't use anonymous struct trick in atom tracking I'm somewhat impressed that current gccs will let you do this, but sufficiently old ones (including 4.4.7 in RHEL6) won't. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Adam Jackson <ajax@redhat.com>	2014-05-08 12:05:58 -04:00
Roland Scheidegger	cf93f86957	llvmpipe: change LP_MAX_SHADER_INSTRUCTIONS limit definition. When the limit was changed to be defined in terms of LP_MAX_SHADER_VARIANTS (`75f1fea14f`) when it was increased, this inadvertently lowered the limit in some branches (that have a lower LP_MAX_SHADER_VARIANTS number) when merged. So, make sure the limit is always at least the number it once was. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-05-08 16:26:49 +02:00
Roland Scheidegger	9af68e9b1d	draw: do not use draw_get_option_use_llvm() inside draw execution paths `1c73e919a4` made it possible to not allocate the tgsi machine if llvm was used. However, draw_get_option_use_llvm() is not reliable after draw context creation, since drivers can explicitly request a non-llvm draw context even if draw_get_option_use_llvm() would return true (and softpipe does just that) which leads to crashes. Thus use draw->llvm to determine if we're using llvm or not instead (and make draw->llvm available even if HAVE_LLVM is false so we don't have to put even more ifdefs). Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-05-08 16:26:49 +02:00
Kenneth Graunke	e6967270c7	i965: Fix depth (array slices) computation for 1D_ARRAY render targets. 1D array targets store the number of slices in the Height field. Fixes Piglit's spec/!OpenGL 3.2/layered-rendering/clear-color-all-types 1d_array single_level, at least when used with Meta clears. Cc: "10.2 10.1 10.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-05-07 23:42:11 -07:00
Kenneth Graunke	5c399ca8e4	mesa: Fix MaxNumLayers for 1D array textures. 1D array targets store the number of slices in the Height field. Cc: "10.2 10.1 10.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-05-07 23:42:11 -07:00
Kenneth Graunke	ecfc418b68	i965: Enable GL_ARB_texture_view on Broadwell. This is a port of commit `c9c08867ed`. A tiny bit of extra work was necessary to not break stencil texturing. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-07 23:42:11 -07:00
Ilia Mirkin	9d95d64be0	mesa: pass target through to driver when choosing texture format This only matters for TextureView where the texObj's target has not been set yet, in all other instances, texObj->target should be the same as the passed-in target parameter. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-05-07 20:40:46 -04:00
Ilia Mirkin	e7047f2917	nv50/ir/gk110: fix set with f32 dest Should fix comparison opcodes like SGE/SLT/etc which expected a float to be returned. These were previously getting integer 0/-1 values. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Cc: 10.2 <mesa-stable@lists.freedesktop.org>	2014-05-07 20:40:46 -04:00
Ilia Mirkin	5a40fe03f7	nv50/ir: allow load propagation when flags are defined The old condition disallowed load propagation any time flags were defined, even with e.g. set and a constbuf reference. The new condition disallows it only with immediate propagation. (There are no opcodes that set the condition flag and have an immediate argument.) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-07 20:40:46 -04:00
Ilia Mirkin	83b900fd0a	mesa/st: pass 4-offset TG4 without lowering if supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-07 20:40:46 -04:00
Ilia Mirkin	d95df4f4e4	gallium: add a cap for supporting 4-offset TG4 opcodes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-07 20:40:46 -04:00
Brian Paul	9ced3fc649	svga: add switch case for PIPE_SHADER_CAP_PREFERRED_IR, remove default case Remove default switch case so we're warned of missing cases at compile time. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-05-07 11:32:11 -06:00
Brian Paul	9b1ae44ae1	tgsi: add missing switch cases in tgsi_exec_get_shader_param() Add cases for PIPE_SHADER_CAP_MAX_SAMPLER_VIEWS and PIPE_SHADER_CAP_PREFERRED_IR. Remove default switch case so we learn of missing cases at compile time. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-05-07 11:32:11 -06:00
Brian Paul	baec25635d	gallivm: add PIPE_SHADER_CAP_PREFERRED_IR switch case, remove default Return PIPE_SHADER_IR_TGSI for the PIPE_SHADER_CAP_PREFERRED_IR query. Remove default switch case so we learn of missing switch cases at compile time. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-05-07 11:32:11 -06:00
Brian Paul	ed8bfaba52	gallium: remove enum numbers from shader cap queries The enum numbers were just cruft. Reviewed-by: Michel Dänzer <michel@daenzer.net>	2014-05-07 11:32:11 -06:00
Ian Romanick	f7bf37cb13	linker: Fix consumer_inputs_with_locations indexing In an earlier incarnation of populate_consumer_input_sets and get_matching_input, the consumer_inputs_with_locations array was indexed using the user-specified location. In that version, only user-defined varyings were included in the array. In the current incarnation, the Mesa location is used to index the array, and built-in varyings are included. This change fixes the unit test to expect gl_ClipDistance in the array, and it resizes the arrays to actually be big enough. It's just dumb luck that the existing piglit tests use small enough locations to not stomp the stack. :( Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78258 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.2" <mesa-stable@lists.freedesktop.org> Cc: Vinson Lee <vlee@freedesktop.org>	2014-05-07 09:50:14 -07:00
José Fonseca	98934f4aba	st/wgl: Advertise WGL_ARB_create_context(_profile). We added wglCreateContextAttribsARB but not the extension strings. This allows creation of GL 3.x contexts. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-05-07 16:15:45 +01:00
José Fonseca	aee501060b	st/wgl: Honour request of 3.1 contexts through core profile where available. Port `5f493eed69` from GLX. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-05-07 16:15:45 +01:00
Kenneth Graunke	9701c6984d	meta: Only clear the requested color buffers. This path is used to implement both glClear and glClearBuffer; the latter is only supposed to clear particular buffers. Core Mesa provides us that information in the buffers bitmask; we must only clear buffers mentioned there. To accomplish this, we save/restore the color draw buffers state, and use glDrawBuffers to restrict drawing to the relevant buffers. Fixes Piglit's spec/!OpenGL 3.0/clearbuffer-mixed-formats and spec/ARB_framebuffer_object/fbo-drawbuffers-none glClearBuffer tests for drivers using meta clears (such as Broadwell). Cc: "10.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77852 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77856 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-06 11:31:31 -07:00
Kenneth Graunke	c1c1cf5f92	meta: Add infrastructure for saving/restoring the DrawBuffers state. Sometimes we need to configure what draw buffers we render to, without creating a new FBO. This path will make that possible. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-06 11:31:29 -07:00
Kenneth Graunke	e526ebf35c	meta: Add a new MESA_META_DRAW_BUFFERS bit. This will be used for saving/restoring the glDrawBuffers state. For now, make sure that existing users of MESA_META_ALL don't get the new bit, since they probably won't want it. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-06 11:31:27 -07:00
Kenneth Graunke	7c8df60f31	meta: Unify the GLSL and fixed-function clear paths. The majority of _mesa_meta_Clear and _mesa_meta_glsl_Clear was the same; adding a boolean for whether to use GLSL allows us to share most of it without polluting either path too much. Tested for regressions by hacking i965 to always use the non-GLSL path. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-06 11:31:21 -07:00
Kenneth Graunke	cde8bad1c9	i965: Always intel_prepare_render() after invalidating front buffers. Fixes glean/texture_srgb, which hit recursive-flush prevention assertions in vbo_exec_FlushVertices. This probably hurts the performance of front buffer rendering, but very few people in their right mind do front buffer rendering. Fixes Glean's texture_srgb test. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-06 11:30:54 -07:00
Marek Olšák	2484daa4fd	radeonsi: implement ARB_texture_cube_map_array No LLVM changes needed. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> v2: updated GL3.txt and relnotes	2014-05-06 17:18:17 +02:00
Marek Olšák	cc71df5652	configure.ac: radeonsi requires EGL_DRM and GBM Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-05-06 16:59:35 +02:00
Tapani Pälli	e65917f94e	glsl: fix bogus layout qualifier warnings Print out GL_ARB_explicit_attrib_location warnings only when parsing attribute that uses "location" qualifier. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77245 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>	2014-05-06 08:36:40 +03:00
Carl Worth	6dd907c80d	docs: Import 10.1.2 release notes, add news item.	2014-05-05 13:25:44 -07:00
Paulo Sergio Travaglia	97a70f26f2	st/egl: Flush resources before presentation (android - bug 77966) [olv: Use the real name provided by the patch author. Ideally this could be moved to somewhere higher level so that we would not need to create a pipe context to flush resources. Plus, it is not clear if flushing resources for another context is valid.] Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2014-05-05 08:33:14 +08:00
Ilia Mirkin	5cfd45fbc3	docs: mark ARB_stencil_texturing as done for nv50+/r600+ Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-04 20:17:25 -04:00
Ilia Mirkin	833f870d9b	mesa/st: implement ARB_stencil_texturing If StencilSampling is enabled on the texture object, pass in an equivalent stencil-only format. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-05-04 20:10:14 -04:00
Ilia Mirkin	cee22a0b48	nv50,nvc0: add X8Z24_UNORM, fix stencil-only formats S8_UINT will become useful when ARB_texture_stencil8 becomes supported by mesa. The other stencil formats are needed for ARB_stencil_texturing. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-04 20:05:44 -04:00
Rob Clark	b7e7ae9f60	xa: fix segfault Fixes: Program received signal SIGSEGV, Segmentation fault. bind_samplers (comp=0x21b054, comp=0x21b054, ctx=0x211430) at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:445 445 mask_pic->srf->tex->format); (gdb) bt #0 bind_samplers (comp=0x21b054, comp=0x21b054, ctx=0x211430) at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:445 #1 xa_composite_prepare (ctx=0x211430, comp=comp@entry=0x21b054) at ../../../../../src/gallium/state_trackers/xa/xa_composite.c:488 #2 0xb6f454b4 in XAPrepareComposite (op=<optimized out>, pSrcPicture=<optimized out>, pMaskPicture=<optimized out>, pDstPicture=<optimized out>, pSrc=0x5b3ad8, pMask=0x0, pDst=0x5923b8) at msm-exa-xa.c:533 We can't yet handle solid fill mask, so explicitly reject that, rather than segfaulting. Otherwise DDX would need to check XA version to see if solid fill mask were supported. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-05-04 11:08:10 -04:00
Kenneth Graunke	829cb0423d	i965: Set miptree target field when creating from a BO. Prior to commit `8435b60a35`, the region equivalent of this function called intel_miptree_create_layout, which set mt->target to target. With that commit, it no longer copied target. Piglit's ext_image_dma_buf_import-sample_[xa]rgb8888 tests would then hit an assertion failure, where image->TexObject->Target was GL_TEXTURE_EXTERNAL_OES, and mt->target was GL_TEXTURE_2D. Copying the target fixes this assertion failure. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 23:05:37 -07:00
Ian Romanick	64c4670dd6	mesa: Bump version to 10.3-devel Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 21:43:48 -07:00
Ian Romanick	a06c9791d1	docs: Add missing release notes for ARB_separate_shader_objects Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-02 17:25:19 -07:00
Eric Anholt	20404e45c7	i965: Move push constant state packets to push constant update time. -0.553779% +/- 0.423394% effect on cairo-perf-trace runtime on glamor (n=612) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	113037148d	i965: Merge gen8_upload_constant_state into gen7_upload_constant_state. The two paths are really similar, and the extra conditionals will be dwarfed by the cost of the actual upload. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	51b79a6571	i965: Refactor gen7_upload_constant_state to look more like gen8. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	1515ceb8fd	i965: Drop unnecessary state flag for units on NEW_BINDING_TABLE. Commit `30259856a8` moved the state packets to table generation time, but forgot to make this change. Apparently the performance win there was about not reemitting the table pointers on unrelated state changes. No performance difference on cairo on glamor (n=118). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	f9a2679db5	i965/gen7+: Move sampler state packets to the stage sampler state table update. Now that we have the stage state coming into our setup of sampler states, it's easy to drop an identifier into it of which stage the stage_state is, and then look up which packet to emit in a little table. No performance difference on cairo on glamor (n=492). v2: Don't forget to do the workaround flush on IVB. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	680d202d49	i965/gen6: Don't update unit state when samplers change. There's no remaining dependency between these two packets that I can find. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	02a3449758	i965: Drop a NEW_SAMPLER annotation for use of sampler_count. The sampler count is set up from the gl_program at draw time, not at sampler change time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	57ad5a3103	i965: Simplify sampler setup by passing the stage state. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	9e363f0262	i965: Make batch dumping go to stderr, too. All our other debug goes there. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:40 -07:00
Eric Anholt	55a049b9ae	i965: Fix a stale comment reference Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 17:01:39 -07:00
Armin K	0b307afd57	glx: Conditionally compile GLX_MESA_query_renderer DRI3 support Missed out with commit `625bdd64e5`. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-02 23:20:34 +01:00
Samuel Li	7f8f6790e4	radeonsi: add Mullins pci ids. Signed-off-by: Samuel Li <samuel.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-05-02 17:30:31 -04:00
Samuel Li	aad669b1e9	radeonsi: add support for Mullins asics. v2: name defaults to kabini for older llvm v3: fix llvm version check Signed-off-by: Samuel Li <samuel.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-05-02 17:30:27 -04:00
Alex Deucher	b26175b6c3	configure: bump up libdrm_radeon requirement to 2.4.54 Required for Mullins. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2014-05-02 17:29:56 -04:00
Ian Romanick	625bdd64e5	dri3: Enable GLX_MESA_query_renderer on DRI3 too This should have happened around the time of commit `4680d23`, but Keith's DRI3 patches and my GLX_MESA_query_renderer patches crossed in the mail. I don't have a working DRI3 setup, so I haven't been able to actually verify this. I'm hoping that someone can piglit this for me on DRI3... It's also unfortunate that DRI2 and DRI3 can't share more code. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: Keith Packard <keithp@keithp.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-02 22:13:58 +01:00
José Fonseca	7ebdc9e48c	util: Don't attempt to redefine INFINITY/NAN on VS 2013. These are now provided by VS. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:47 +01:00
José Fonseca	8c879ac197	mesa: VS 2013 does not provide strcasecmp. A define is necessary, like for earlier VS versions. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:47 +01:00
José Fonseca	ade79b21e9	egl: Don't attempt to redefine stdint.h types with VS 2010. Just include stdint.h. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:47 +01:00
José Fonseca	979692a52a	scons: Don't use bundled C99 headers for VS 2013. Use the ones provided by the compiler instead. NOTE: External trees should be updated to not include '#include/c99' directory directly, but rather rely on scons/gallium.py to do the right thing. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:46 +01:00
José Fonseca	0582800dd6	scons: Don't restrict MSVC_VERSION values. Saves the trouble of continuously needing to update. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:46 +01:00
José Fonseca	d69fd5d940	draw: Prevent signed/unsigned comparisons. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:46 +01:00
José Fonseca	605ef195aa	st/vega: Prevent signed/unsigned comparisons. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:46 +01:00
José Fonseca	42b9f8590d	scons: Adjust the warnings for VS. Silence insignificant warnings so significant warnings have a chance to stand out. The only abundant warning that's not silenced here is "C4018: signed/unsigned mismatch", as it could hide security issues, so it's better to actually fix the code. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:46 +01:00
José Fonseca	5bd3b91784	util/u_debug_flush: Use util_snprintf. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-05-02 22:04:46 +01:00
Emil Velikov	1c6154c9b4	targets/omx: add nouveau target Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-02 21:48:27 +01:00
Emil Velikov	be1b5feaa0	targets/omx: use GALLIUM_VIDEO_CFLAGS Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-05-02 21:48:27 +01:00
Emil Velikov	ce6c17c083	targets/pipe-loader: cleanup version-script Drop the version/name tag from the script as it was never meant to be there. Add swrast_create_screen as it is used when loading swrast. Rename the file to pipe.sym. v2: Rebase on top of the LD_NO_UNDEFINED changes. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-05-02 21:48:27 +01:00
Emil Velikov	f743670b9a	targets/opencl: hide all the exported llvm/clang mayhem... hopefully Both llvm and clang pollute the exported symbol table, as soon as we try to link with either one. Other than those two everything else looks good (clean). Cc: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-05-02 21:48:27 +01:00
Emil Velikov	7b7944ee1c	targets/egl-static: freshen up the version script Namely drop the version/name tag of the exported symbol, and rename the filename to egl.sym. v2: Rebase on top of the LD_NO_UNDEFINED changes. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-02 21:48:26 +01:00
Emil Velikov	4eaa3c9b60	targets/gbm: add version-script to limit exported symbols Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-02 21:48:26 +01:00
Emil Velikov	69d790da9f	targets/vdpau: use version script to limit the exported symbols Using export-symbols-regex is the least desirable method of restricting the exported symbols, as it completely messes up the symbol table. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-05-02 21:48:26 +01:00
Emil Velikov	53dd2e45f4	targets/omx: drop the version from the omx targets Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-02 21:48:26 +01:00
Emil Velikov	bea9e8dca0	targets/omx: use version script to limit amount of exported symbols Using export-symbols-regex is the least desirable method of restricting the exported symbols, as it completely messes up the symbol table. radeon_drm_winsys_create is not needed, avoid exporting it. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-05-02 21:48:26 +01:00
Emil Velikov	6239d42fdb	targets/dri: use a single version script to restrict exported symbols Rather than having multiple (almost) identical version scripts use a single one. Cc: Christian König <christian.koenig@amd.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-02 21:48:25 +01:00
Emil Velikov	b8f31dfc22	targets/xvmc: limit the amount of exported symbols In the presence of LLVM the final library exports every symbol from the llvm namespace. Resolve this by using a version script (w/o the version/name tag). Considering that there are only ~25 symbols, explicitly list them to minimize the chances of rogue symbols sneaking in. Drop the *winsys_create functions as they were only meant for gl-vdpau interop. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-05-02 21:48:25 +01:00
Emil Velikov	9bcb3698db	targets/osmesa: hide osmesa_create_screen The symbol is not meant to be exported, and its presence was only a side effect due to the missing visibility flags. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-05-02 21:48:25 +01:00
Emil Velikov	658b36ff78	targets/pipe-loader: drop driver_descriptor symbol from swrast The symbol is used for hardware only drivers. For swrast the loader uses swrast_create_screen. Add VISIBILITY_CFLAGS while we're here. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-05-02 21:48:25 +01:00
Juha-Pekka Heikkila	a50b02783b	mesa: add extra null checks in vbo_rebase_prims() v2 [idr]: Move declarations before code to prevent MSVC build breaks. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 12:00:30 -07:00
Juha-Pekka Heikkila	dc675919d3	mesa: add missing null checks in _tnl_register_fastpath() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 11:58:36 -07:00
Ian Romanick	59ad2e6696	mesa: Add _mesa_error_no_memory for logging out-of-memory messages This can be called from locations that don't have a context pointer handy. This patch also adds enough infrastructure so that the unit tests for the GLSL compiler and the stand-alone compiler will build and function. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-05-02 11:58:36 -07:00
Chia-I Wu	267e28bb62	glsl: make static constant variables "static const" This allows them to be moved to .rodata, and allow us to be sure that they will not be modified. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>	2014-05-02 10:50:14 -07:00
Petri Latvala	6a2d28599f	docs: update 10.2 release notes Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 10:07:05 -07:00
Petri Latvala	b4363c8ea4	i965: Enable INTEL_performance_query for Gen5+. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 10:07:04 -07:00
Petri Latvala	8cf5bdad3c	mesa: Implement INTEL_performance_query. Using the existing driver hooks made for AMD_performance_monitor, implement INTEL_performance_query functions. v2: Whitespace changes. v3: Whitespace changes, add a _mesa_warning() Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 10:07:04 -07:00
Petri Latvala	dac82ceac5	mesa: Add core support for the GL_INTEL_performance_query extension. Like AMD_performance_monitor, this extension provides an interface for applications (and OpenGL-based tools) to access GPU performance counters. Since the exact performance counters available vary between vendors and hardware generations, the extension provides an API the application can use to get the names, types, and minimum/maximum values of all available counters. Applications create performance queries based on available query types, and begin/end measurement collection. Multiple queries can be measuring simultaneously. v2: Whitespace changes v3: src/mapi/glapi/gen/gl_API.xml: Also expose the functions to GLES2. v4: Whitespace changes, static_dispatch="false" for all functions, fix dispatch_sanity test for GLES2 functions Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 10:07:04 -07:00
Petri Latvala	6ccb98e88c	mesa: Add INTEL_performance_query enums to tests/enum_strings.cpp Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 10:07:04 -07:00
Petri Latvala	927c3c9704	Regenerate gl_mangle.h. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 10:07:04 -07:00
Ilia Mirkin	cf6c9dbc33	docs: update ARB_buffer_storage for nouveau	2014-05-02 12:16:25 -04:00
Ilia Mirkin	3df4d692f3	nouveau: add ARB_buffer_storage support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-02 12:16:25 -04:00
Ilia Mirkin	b0d02db7e0	nouveau: remove cb_dirty, it's never used Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-02 12:01:35 -04:00
Ilia Mirkin	1baf77dbe8	nvc0: treat non-linear 2DRect textures the same as 2D This fixes textureGather(2DRect) piglit tests, and does not appear to have any adverse effects. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-02 12:01:35 -04:00
Ilia Mirkin	cd064c6a25	mesa/st: enable carry/borrow lowering pass This handles the last of the ARB_gs5 instructions currently present in mesa. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-05-02 12:01:35 -04:00
Ilia Mirkin	31b92aa2fc	glsl: add lowering passes for carry/borrow Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-05-02 12:01:35 -04:00
Ian Romanick	f64bfb2e39	mesa: Eliminate gl_shader_program::InternalSeparateShader This was a work-around to allow linking a program with only a fragment shader in a GLES context. Now that we have GL_EXT_separate_shader_objects in GLES contexts, we can just use that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:20:11 -07:00
Ian Romanick	7d9adef340	mesa: Enable GL_EXT_separate_shader_objects for OpenGL ES Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:20:10 -07:00
Ian Romanick	507b875cf5	glsl: Sort the list of extensions ARB, OES, then everything else. If there's ever a KHR shading language extension, it should go between ARB and OES. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:20:10 -07:00
Ian Romanick	fb615feafb	mesa: Remove support for desktop OpenGL GL_EXT_separate_shader_objects I don't know of any applications that actually use it. Now that Mesa supports GL_ARB_separate_shader_objects in all drivers, this extension is just cruft. The entrypoints for the extension remain in the XML. This is done so that a new libGL will continue to provide dispatch support for old drivers that try to expose this extension. Future patches will add OpenGL ES GL_EXT_separate_shader_objects, but that's a different thing. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:20:10 -07:00
Ian Romanick	e608449d3e	mesa/sso: Enable GL_ARB_separate_shader_objects by default Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:20:08 -07:00
Ian Romanick	0939d3d097	sso: Add display list support for ARB_separate_shader_objects new functions With this patch, the piglit arb_separate_shader_object-dlist test passes. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:19:40 -07:00
Ian Romanick	7ff937e579	linker: Modify cross_validate_outputs_to_inputs to match using explicit locations This will be used for GL_ARB_separate_shader_objects. That extension not only allows separable shaders to rendezvous by location, but it also allows traditionally linked shaders to rendezvous by location. The spec says: 36. How does the behavior of input/output interface matching differ between separable programs and non-separable programs? RESOLVED: The rules for matching individual variables or block members between stages are identical for separable and non-separable programs, with one exception -- matching variables of different type with the same location, as discussed in issue 34, applies only to separable programs. However, the ability to enforce matching requirements differs between program types. In non-separable programs, both sides of an interface are contained in the same linked program. In this case, if the linker detects a mismatch, it will generate a link error. v2: Make sure consumer_inputs_with_locations is initialized when consumer is NULL. Noticed by Chia-I. v3: Rebase on removal of ir_variable::user_location. v4: Replace a (stale) FINISHME with some good explanation comments from Eric. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:19:40 -07:00
Ian Romanick	d030a3404c	linker: Sort shader I/O variables into a canonical order v2: Rebase on removal of ir_variable::user_location. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:19:40 -07:00
Ian Romanick	c557eb7722	linker: Allow geometry shader without vertex shader for separable programs Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:19:40 -07:00
Ian Romanick	1ff5a2b1ba	linker: Assign varying locations for separable programs Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 07:19:40 -07:00
Ian Romanick	7d73c3e99e	linker: Allow consumer stage or producer stage to be NULL When linking a separable program that contains only a fragment shader, the producer will be NULL. Similar cases will exist with geometry shaders and, eventually, tessellation shaders. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:19:40 -07:00
Ian Romanick	fe37cb0ac6	linker: Refactor code that gets an input matching an output Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 07:19:40 -07:00
Ian Romanick	5699220cd5	glsl: Exit when the shader IR contains an interface block instance While writing the link_varyings::single_interface_input test, I discovered that populate_consumer_input_sets assumes that all shader interface blocks have been lowered to discrete variables. Since there is a pass that does this, it is a reasonable assumption. It was, however, non-obvious. Make the code fail when it encounters such a thing, and add a test to verify that behavior. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 07:19:40 -07:00
Ian Romanick	ba7195d126	glsl/tests: Add first simple tests of populate_consumer_input_sets Four initial tests: * Create an IR list with a single input variable and verify that variable is the only thing in the hash tables. * Same as the previous test, but use a built-in variable (gl_ClipDistance) with an explicit location set. * Create an IR list with a single input variable from an interface block and verify that variable is the only thing in the hash tables. * Create an IR list with a single input variable and a single input variable from an interface block. Verify that each is the only thing in the proper hash tables. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 07:19:39 -07:00
Ian Romanick	8f5852bd2b	linker: Refactor code that builds hash tables of varyings during linking I want to make some changes to this code, but first I want to make some unit tests for it... so that I can capture the pre- and post-invariants. Pulling the code out into its own function in a non-anonymous namespace enables that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 07:19:39 -07:00
Ian Romanick	ca21cffebd	meta: Fix saving the program pipeline state This code was broken in some odd ways before. Too much state was being saved, it was being restored in the wrong order, and in the wrong way. The biggest problem was that the pipeline object was restored before restoring the programs attached to the default pipeline. Fixes a regression in the glean texgen test. v3: Fairly significant re-write. I think it's much cleaner now, and it avoids a bug with some meta ops that use shaders (reported by Chia-I). v4: Check Pipeline.Current against NULL instead of Pipeline.Default. Suggested by Chia-I. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chia-I Wu <olv@lunarg.com>	2014-05-02 07:17:34 -07:00
Ian Romanick	4a868a984d	mesa/sso: Refactor new function _mesa_bind_pipeline Pull most of the guts out of _mesa_BindPipeline into a new utility function that can be used elsewhere (e.g., meta). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:16:55 -07:00
Ian Romanick	5998fd536a	linker: Make lower_packed_varyings work with explicit locations Don't do anything with variables that have explicitly assigned locations. This is also how built-in varyings are handled. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:16:54 -07:00
Ian Romanick	7016afe25d	glsl: Remove varying "base" parameters In February 2013 Paul unified the values used for shader stage outputs and shader stage inputs. See commits 8a076c5f0^..eed6baf76. Since that time, the location_base parameters are always VARYING_SLOT_VAR0. Instead of passing that around, just hard code it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-02 07:16:54 -07:00
Ian Romanick	03488cd3b9	glsl: Constify parameter to a couple varying_matches methods Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-05-02 07:16:54 -07:00
Tom Stellard	e05cebafd8	clover: Add a stub implementation of clCreateImage() v3 Now that we are using the OpenCL 1.2 headers, applications expect all the OpenCL 1.2 functions to be implemented. This fixes linking errors with the piglit CL tests. v2: - Use c++ features - Fix error code handling v3: - Move <iostream> into api/util.hpp - Fix indentation Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-05-02 06:48:17 -07:00
Chris Forbes	11f92fd9f9	docs: Add missing ARB_gpu_shader5 subfeature to GL3.txt Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-05-02 17:09:13 +12:00
Fredrik Höglund	e6ff557d15	docs: Mark ARB_multi_bind as done ...and update relnotes. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 03:00:42 +02:00
Fredrik Höglund	68f3b31a0f	mesa: Enable ARB_multi_bind Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 03:00:42 +02:00
Fredrik Höglund	2a25570456	mesa: Implement glBindImageTextures Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 03:00:41 +02:00
Fredrik Höglund	63995b902a	mesa: Implement glBindVertexBuffers v2: Use the user provided offset and stride when the buffer ID is zero. Reviewed-by: Brian Paul <brianp@vmware.com> (v1) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2)	2014-05-02 03:00:41 +02:00
Fredrik Höglund	f0c36cf4fa	mesa: Implement glBindBuffersRange Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 03:00:41 +02:00
Fredrik Höglund	533cfa03ac	mesa: Implement glBindBuffersBase Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 03:00:41 +02:00
Fredrik Höglund	835abfaba4	mesa: Add _mesa_set_transform_feedback_binding() Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 03:00:41 +02:00
Fredrik Höglund	f65a0c19a5	mesa: Refactor set_ubo_binding() Make set_ubo_binding() just update the binding, and move the code that does validation, flushes the vertices etc. into a new bind_uniform_buffer() function. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 03:00:41 +02:00
Fredrik Höglund	28d7335810	mesa: Add helper functions for looking up multiple buffers v2: Document the difference between _mesa_lookup_bufferobj() and _mesa_multi_bind_lookup_bufferobj(). v3: Don't create the buffer objects when they don't exist. Reviewed-by: Brian Paul <brianp@vmware.com> (v2) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v2)	2014-05-02 02:53:26 +02:00
Fredrik Höglund	19f7eeb6fb	mesa: Refactor set_atomic_buffer_binding() Make set_atomic_buffer_binding() just update the binding, and move the code that does validation, flushes the vertices etc. into a new bind_atomic_buffer() function. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:26 +02:00
Fredrik Höglund	4f30c0ba80	mesa: Implement glBindTextures Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Fredrik Höglund	659d94b256	mesa: Add a texUnit parameter to dd_function_table::BindTexture This is for glBindTextures(), since it doesn't change the active texture unit. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Fredrik Höglund	b8ee235e72	mesa: Add helper functions for looking up multiple textures Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Fredrik Höglund	b16e2ada4c	mesa: Implement glBindSamplers Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Fredrik Höglund	6655e70f99	glapi: Add infrastructure for ARB_multi_bind Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Fredrik Höglund	82291f64e3	mesa: Add functions for doing unlocked hash table lookups This patch adds functions for locking/unlocking the mutex, along with _mesa_HashLookupLocked() and _mesa_HashInsertLocked() that do lookups and insertions without locking the mutex. These functions will be used by the ARB_multi_bind entry points to avoid locking/unlocking the mutex for each binding point. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Fredrik Höglund	30af8ce3f8	mesa: Optimize unbind_texobj_from_texunits() The texture can only be bound to the index that corresponds to its target, so there is no need to loop over all possible indices for every unit and check whether the texture is bound to it. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Fredrik Höglund	4bd8272088	mesa: Add a _BoundTextures field in gl_texture_unit This will be used by glBindTextures() when unbinding textures, to avoid having to loop over all the targets. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Fredrik Höglund	6bf8ac846a	mesa: Store the target index in gl_texture_object This will be used by glBindTextures() so we don't have to look it up for each texture. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-02 02:53:25 +02:00
Eric Anholt	d55e5a323b	i965: Fix the file comment for intel_image.h Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:28 -07:00
Eric Anholt	5566747296	i965: Rename intel_regions.h to something more appropriate now. We had the EGLimage structure laying around in intel_regions.h, but now it's the only thing left in the file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:27 -07:00
Eric Anholt	e7f65655cb	i965: Delete the intel_regions.c code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:27 -07:00
Eric Anholt	3278f96a52	i965: Drop region usage from DRI2 winsys-allocated buffers. v2: Fix bad pointer on unreference (caught by Chad) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-05-01 15:12:27 -07:00
Eric Anholt	835f90692f	i965: Drop a funny assert about mt pitch. I slipped this in in the region->pitch change from pixels to bytes, but I don't see any reason for it any more -- the libdrm code doesn't appear to divide pitch by a cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:27 -07:00
Eric Anholt	b49982de6a	i965: Fix intel_bufferobj_buffer range for blit drawpixels. If the stride wasn't width*cpp, we wouldn't track how much of the src is busy, and allow a subdata into the end to proceed unsynchronized. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:27 -07:00
Eric Anholt	e16c5c9063	i965: Drop use of intel_region from miptrees. Note: region->width/height used to reflect the total_width/height padding of separate stencil, though mt->total_width didn't. region->width/height was being used in EGL images, where the padded value would have been the wrong one, so I converted them to use rb->Width/Height. v2: Drop debug printf that slipped in (caught by Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:27 -07:00
Eric Anholt	e3a9ca4563	i965: Replace the region in DRIimage with just a BO pointer and stride. Regions aren't refcounted safely for multithreaded applications, and they're not terribly useful wrappers of a BO, so I'm trying to remove them. Even the stride I added here could probably be reduced to use of an existing field in the __DRIimageRec, but I want this to be as mechanical of a change as possible. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:27 -07:00
Eric Anholt	8435b60a35	i965: Make intel_set_texture_region just take a BO and pitch. I want to do this to get the region removed from DRI images. However, it does mean that we won't share the intel_region between the rb and the texture for texture_from_pixmap. I think that's fine. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:27 -07:00
Eric Anholt	c0bf5a7eff	i965: Stop making a pointless region for DRI2 to just throw it away. I noticed that we were doing this while changing the DRI3 path to not use regions, which involved changing the signature of intel_update_winsys_renderbuffer_miptree() this way. v2: Replace my comment with Chad's version. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> (v1) Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:26 -07:00
Eric Anholt	3a7a20752f	i965: Drop the global GEM name from regions. Once a buffer has been named, drm_intel_bo_flink() is just a getter. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:26 -07:00
Eric Anholt	76932c0ded	i965: Drop the tiling argument to intel_miptree_create_for_bo. The drm function to get the tiling is just a getter storing the two pointers, so we don't need to go out of our way to avoid it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:26 -07:00
Eric Anholt	522fb01275	i965: Drop pointless cast of texObj to intelObj. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:26 -07:00
Eric Anholt	3033f80af5	i965: Move intel_region_get_aligned_offset() to be a miptree function. All the consumers are doing it on a miptree. v2: fix a silly duplicated dereference (review by Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> (v1) Reviewed-by: Chad Versace <chad.versace@linux.intel.com> (v1)	2014-05-01 15:12:26 -07:00
Eric Anholt	9791eb4280	i965: Move intel_region_get_tile_masks() to be a miptree function. All the consumers are doing it on a miptree. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:26 -07:00
Eric Anholt	ea2cac01e8	i965: Fix another broken offset-aligned-to-tile test. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:26 -07:00
Eric Anholt	65e025f99c	i965: Fix offset-aligned-to-tile test in dma_buf import. v1 of the patch got pushed, instead of the v2 that I had reviewed. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:26 -07:00
Eric Anholt	6db640da22	i965: Reuse intel_miptree_get_tile_offsets(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-05-01 15:12:26 -07:00
Brian Paul	5ec1adeb10	mesa: move declarations before code in texstore.c To fix MSVC build. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-05-01 16:01:06 -06:00
Ville Syrjälä	eb502c31a0	i965: Fix format of private renderbuffers intel_alloc_renderbuffer_storage() will clobber rb->Format which was already set up by intel_create_renderbuffer(). This causes the driver to potentially create the depth buffer in the wrong format. In practice this makes the depth buffer Z24 even if the visual has depthBits==16. The incorrect depth buffer format doesn't seem to cause any actual problems in i965, but it seems like we should fix it anyway. I see Z16 has been more or less deprecated in the driver except for the depthBits==16 case. But if we want to use Z24 even in that case (not sure it's really legal?) it would look better if the code made that decision explicitly rather than relying on the format to get magically overwritten by the renderbuffer code. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2014-05-01 23:56:34 +03:00
Ville Syrjälä	c1d4d49993	i915: Don't advertise Z formats in TextureFormatSupported on gen2 Gen2 doesn't support texturing from Z formats, so state as much. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2014-05-01 23:56:25 +03:00
Ville Syrjälä	d3edc31810	i915: Fix format of private renderbuffers intel_alloc_renderbuffer_storage() will clobber rb->Format which was already set up by intel_create_renderbuffer(). This causes the driver to potentially create the depth buffer in the wrong format. A long time ago things worked by accident because _mesa_choose_tex_format() checked for ARB_depth_texture and thus returned MESA_FORMAT_NONE on gen2 hardware. Somehow that ended up working when depthBits==16 because the driver would then pick DEPTH_FRMT_16_FIXED. Not sure how, but things also seemed to work with depthBits==24. Things started to go more sideways at: commit `6ae473221a` Author: Eric Anholt <eric@anholt.net> Date: Mon Apr 22 16:04:25 2013 -0700 intel: Fold the one last function intel_tex_format.c into the caller. since that caused intel_miptree_create_layout() to divide by zero when encountering MESA_FORMAT_NONE (bw==0). So after this commit things were broken enough that many applications wouldn't even run. Things got a bit better at: commit `c245efe7e8` Author: Eric Anholt <eric@anholt.net> Date: Thu Mar 21 09:50:45 2013 -0700 mesa: Remove extension checking from ChooseTexFormat. since now _mesa_choose_tex_format() would return MESA_FORMAT_X8_Z24 for GL_DEPTH_COMPONENT due to i915 erroneously claiming that MESA_FORMAT_X8_S24 (and others) are supported texture formats even on gen2 hardware. So now the div-by-zero was gone, but now the driver would pick DEPTH_FRMT_24_FIXED_8_OTHER even when depthBits==16 which caused rendering problems. If we prevent rb->Format from getting clobbered for the depth buffer things work much better. This makes the spinning title text visible again in chromium-bsu at 16bpp, for example. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2014-05-01 23:56:09 +03:00
Anuj Phogat	c1743707a1	mesa: Allow FLOAT_32_UNSIGNED_INT_24_8_REV in get_tex_depth_stencil() Fixes a crash in Khronos OpenGL CTS packed_pixels tests. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:40 -07:00
Anuj Phogat	29b8e894d1	mesa: Add support to unpack depth-stencil texture into FLOAT_32_UNSIGNED_INT_24_8_REV V2: Follow the new naming convention for unpack functions. Use double precision for converting Z24 to a float. V3: Unpack stencil value to most significant byte. Use 'struct z32f_x24s8' type. V4: Unpack stencil value to least significant byte. Add a comment to clarify stencil packing. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:40 -07:00
Anuj Phogat	7a8045d2f7	mesa: Add new helper function _mesa_unpack_depth_stencil_row() This patch makes non-functional changes in the code. New helper function added here will make it easier to support more data types in the following patches. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:40 -07:00
Anuj Phogat	ef924f0de9	mesa: Remove redundant if checks in _mesa_texstore_xx_xx() functions This patch contains non-functional changes. Assertion checks made earlier in the functions make the if checks redundant. So, remove the if checks and unindent the code in if block. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:40 -07:00
Anuj Phogat	1a8f9ba9b3	mesa: Allow srcFormat=GL_DEPTH_STENCIL in _mesa_texstore_xx_xx() functions _mesa_texstore_z24_s8() and _mesa_texstore_z32f_x24s8() are capable of handling GL_DEPTH_STENCIL format. So, allow it in both the functions. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:40 -07:00
Anuj Phogat	aeb9d4495d	mesa: Add missing types in _mesa_texstore_xx_xx() functions Depth-stencil texture targets are allowed to use source data of type GL_UNSIGNED_INT_24_8_EXT and GL_FLOAT_32_UNSIGNED_INT_24_8_REV. Fixes a few crashes in Khronos OpenGL CTS packed_pixels tests. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:40 -07:00
Anuj Phogat	d714b20eb4	i965: Fix crash in do_blit_readpixels() Fixes a crash in Khronos CTS packed_pixels tests. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:40 -07:00
Anuj Phogat	5388fc157e	mesa: Add error condition for format=STENCIL_INDEX in glGetTexImage() From OpenGL 4.0 spec, page 306: "Calling GetTexImage with a format of STENCIL_INDEX causes the error INVALID_ENUM." Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Anuj Phogat	340658e44f	mesa: Add entry for extension ARB_texture_stencil8 V2: Alphabetize the new entry Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Anuj Phogat	9bcb0a8532	glsl: Apply the link error conditions to GL_ARB_fragment_coord_conventions Link error conditions added in previous patch are equally applicable to GL_ARB_fragment_coord_conventions implementation. Extension's spec says: "If gl_FragCoord is redeclared in any fragment shader in a program, it must be redeclared in all the fragment shaders in that program that have a static use of gl_FragCoord. All redeclarations of gl_FragCoord in all fragment shaders in a single program must have the same set of qualifiers." Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Anuj Phogat	35f11e85cb	glsl: Link error if fs defines conflicting qualifiers for gl_FragCoord GLSL 1.50 spec says: "If gl_FragCoord is redeclared in any fragment shader in a program, it must be redeclared in all the fragment shaders in that program that have a static use gl_FragCoord. All redeclarations of gl_FragCoord in all fragment shaders in a single program must have the same set of qualifiers." This patch causes the shader link to fail if we have multiple fragment shaders with conflicting layout qualifiers for gl_FragCoord. V2: Restructure the code and add conditions to correctly handle the following case: fragment shader 1: layout(origin_upper_left) in vec4 gl_FragCoord; void main() { foo(); gl_FragColor = gl_FragData; } fragment shader 2: layout(pixel_center_integer) in vec4 gl_FragCoord; void foo() { } V3: Allow linking in the following case: fragment shader 1: void main() { foo(); gl_FragColor = gl_FragCoord; } fragment shader 2: in vec4 gl_FragCoord; void foo() { ... } Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Anuj Phogat	a751adf071	glsl: Compile error if fs uses gl_FragCoord before first redeclaration Section 4.3.8.1, page 39 of GLSL 1.50 spec says: "Within any shader, the first redeclarations of gl_FragCoord must appear before any use of gl_FragCoord." GLSL compiler should generate an error in following case: vec4 p = gl_FragCoord; layout(origin_upper_left) in vec4 gl_FragCoord; void main() { } Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Anuj Phogat	581e4acb0d	glsl: Compile error if fs defines conflicting qualifiers for gl_FragCoord GLSL 1.50 spec says: "If gl_FragCoord is redeclared in any fragment shader in a program, it must be redeclared in all the fragment shaders in that program that have a static use gl_FragCoord. All redeclarations of gl_FragCoord in all fragment shaders in a single program must have the same set of qualifiers." This patch makes the GLSL compiler generate an error if we have a fragment shader defined with conflicting layout qualifier declarations for gl_FragCoord. For example: layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord; layout(pixel_center_integer) in vec4 gl_FragCoord; void main() { } V2: Some code refactoring for better readability. Add compiler error conditions for redeclarations like: layout(origin_upper_left) in vec4 gl_FragCoord; layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord; and in vec4 gl_FragCoord; layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord; V3: Simplify function is_conflicting_fragcoord_redeclaration() V4: Check for null pointer before doing strcmp(var->name, "gl_FragCoord"). Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Anuj Phogat	49c71050de	mesa: Use location VERT_ATTRIB_GENERIC0 for vertex attribute 0 In OpenGL 3.1 attribute 0 becomes non-magic, just like in OpenGL ES 2.0. Earlier versions of OpenGL used attribute 0 exclusively for vertex position. V2: Add a utility function _mesa_attr_zero_aliases_vertex() in varray.h Fixes 4 Khronos OpenGL CTS failures: glGetVertexAttrib depth24_basic depth24_precision rgb8_rgba8_rgb Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Anuj Phogat	dc75479b7a	mesa: Fix querying location of nth element of an array variable This patch makes changes to the behavior of glGetAttribLocation(), glGetFragDataLocation() and glGetFragDataIndex() functions. Code changes handle a case described in following example: shader program: layout(location = 1)in vec4[4] a; void main() { } Currently, glGetAttribLocation("a") returns 1. glGetAttribLocation("a[i]"), where i = {0, 1, 2, 3}, returns -1. But the expected locations for array elements are: 1, 2, 3 and 4 respectively. This clarification came up with the addition of ARB_program_interface_query to OpenGL 4.3. From Page 326 (page 347 of the PDF) of OpenGL 4.3 spec: "Otherwise, the command is equivalent to GetProgramResourceLocation(program, PROGRAM_INPUT, name);" And, From Page 101 (page 122 of the PDF) of OpenGL 4.3 spec: "A string provided to GetProgramResourceLocation or GetProgramResourceLocationIndex is considered to match an active variable if • the string exactly matches the name of the active variable; • if the string identifies the base name of an active array, where the string would exactly match the name of the variable if the suffix "[0]" were appended to the string; or • if the string identifies an active element of the array, where the string ends with the concatenation of the "[" character, an integer (with no "+" sign, extra leading zeroes, or whitespace) identifying an array element, and the "]" character, the integer is less than the number of active elements of the array variable, and where the string would exactly match the enumerated name of the array if the decimal integer were replaced with zero." V2: Simplify get_matching_index() function. Add relevant text from OpenGL spec in commit message. Fixes failures in Khronos OpenGL CTS tests: explicit_attrib_location_room draw_instanced_max_vertex_attribs NVIDIA's proprietary Linux drivers (331.49) match the behavior expected by OpenGL 4.3 spec. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Anuj Phogat	8c61b6a99b	glsl: Allow overlapping locations for vertex input attributes Currently overlapping locations of input variables are not allowed for all the shader types in OpenGL and OpenGL ES. From OpenGL ES 3.0 spec, page 56: "Binding more than one attribute name to the same location is referred to as aliasing, and is not permitted in OpenGL ES Shading Language 3.00 vertex shaders. LinkProgram will fail when this condition exists. However, aliasing is possible in OpenGL ES Shading Language 1.00 vertex shaders." Taking into account what different versions of OpenGL and OpenGL ES specs say about aliasing: - It is allowed only on vertex shader input attributes in OpenGL (2.0 and above) and OpenGL ES 2.0. - It is explicitly disallowed in OpenGL ES 3.0. Fixes Khronos CTS failing test: explicit_attrib_location_vertex_input_aliased.test See more details about this in the Khronos bug referenced below. V2: Fix the case where location exceeds the maximum allowed attribute location. V3: Simplify the condition added in V2. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Bugzilla: Khronos #9609 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-05-01 10:58:39 -07:00
Roland Scheidegger	a773fdc64d	glx/drisw: fix memory leak when destroying screen. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-05-01 16:13:38 +02:00
Roland Scheidegger	64d6460a56	gallivm: fix 2 leaks in disassembly code don't leak the MCSubtargetInfo (not really big, was already fixed with llvm master) and TargetMachine (big). While this is only used for debugging the leak is large enough to get you into trouble in some cases. Tested with llvm 3.1 and master. Before (llvm 3.1), GALLIVM_DEBUG=asm glxgears: ==14152== LEAK SUMMARY: ==14152== definitely lost: 105,228 bytes in 20 blocks ==14152== indirectly lost: 347,252 bytes in 261 blocks ==14152== possibly lost: 866,625 bytes in 1,453 blocks ==14152== still reachable: 7,344,677 bytes in 6,494 blocks ==14152== suppressed: 0 bytes in 0 blocks After: ==13799== LEAK SUMMARY: ==13799== definitely lost: 3,108 bytes in 6 blocks ==13799== indirectly lost: 0 bytes in 0 blocks ==13799== possibly lost: 804,143 bytes in 1,429 blocks ==13799== still reachable: 7,314,267 bytes in 6,473 blocks ==13799== suppressed: 0 bytes in 0 blocks Reviewed-by: Brian Paul <brianp@vmware.com>	2014-05-01 16:13:38 +02:00
José Fonseca	6d911a5944	mesa: Move declaration to top of block. To fix MSVC build. Trivial.	2014-05-01 10:00:10 +01:00
José Fonseca	b0de67ad2d	osmesa: Fix typo in _MaxEnabledTexImageUnit.	2014-05-01 09:55:20 +01:00
Kenneth Graunke	85ce2242cb	i965/vec4: Port untyped atomic message support to Broadwell. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:24:12 -07:00
Kenneth Graunke	45367d2d09	i965/vec4: Port untyped surface reads support to Broadwell. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:24:10 -07:00
Kenneth Graunke	e9e89d5756	i965/fs: Port untyped atomic message support to Broadwell. v2: Fix SIMD mode comment (caught by Eric Anholt). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:24:08 -07:00
Kenneth Graunke	54a48984b3	i965/fs: Port untyped surface read support to Broadwell. v2: Drop unused num_components variable; fix SIMD Mode comment (caught by Eric Anholt). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:24:06 -07:00
Kenneth Graunke	f1cd9fee53	i965/fs: Set fs_inst::header_present for untyped atomics/surface reads. The brw_eu_emit.c code manually forces the header present bit when used in align1 (scalar) mode. So, this has no effect currently. However, it is nice to have fs_inst::header_present reflect reality. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:24:04 -07:00
Kenneth Graunke	4d9c27df45	i965: Disassemble atomic operations and other DP:DC1 stuff on Broadwell. This is similar to what Eric did for Gen7 a little while ago; it also has support for untyped surface reads. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:24:02 -07:00
Kenneth Graunke	3b3c46656e	i965: Implement the create_raw_surface() hook on Broadwell. Otherwise we crash when setting up atomic buffer objects. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77221 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:23:59 -07:00
Kenneth Graunke	69fd055166	i965: Drop mark_surface_used from gen8 generators. Francisco made brw_mark_surface_used a freestanding function in commit `a32817f3c2`. We should use it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:23:57 -07:00
Kenneth Graunke	b10785f9a9	i965/fs: Add support for fs_inst::force_writemask_all on Broadwell. This must not have existed when I wrote the original code. The atomic operation header setup code uses this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-05-01 00:23:44 -07:00
Kenneth Graunke	ac30e1adb4	i965: Actually emit PIPELINE_SELECT and 3DSTATE_VF_STATISTICS. For platforms using hardware contexts (currently Gen6+), we failed to emit PIPELINE_SELECT and 3DSTATE_VF_STATISTICS, instead emitting MI_NOOP for both. During one of the context initialization reordering patches, we accidentally moved brw_init_state before we set brw->CMD_PIPELINE_SELECT and brw->CMD_VF_STATISTICS. So, when brw_init_state uploaded initial GPU state (brw_init_state -> brw_upload_initial_gpu_state -> brw_upload_invariant_state), these would be 0 (MI_NOOP). Storing the commands in the context is not worthwhile. We have many generation checks in our state upload code, and for platforms with hardware contexts, this only gets called once per GL context anyway. The cost is negligible, and it's easy to botch context creation ordering. This may fix hangs on Gen6+ when using the media pipeline. Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2014-05-01 00:12:22 -07:00
Kenneth Graunke	0380ec467d	i965: Don't enable reset notification support on Gen4-5. arekm reported that using Chrome with GPU acceleration enabled on GM45 triggered the hw_ctx != NULL assertion in brw_get_graphics_reset_status. We definitely do not want to advertise reset notification support on Gen4-5 systems, since it needs hardware contexts, and we never even request a hardware context on those systems. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75723 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 23:08:22 -07:00
Carl Worth	4546b70e08	doc: Add pointer to the Mesa Stable Queue page. Since this is now updated daily and looks to be useful.	2014-04-30 16:27:03 -07:00
Eric Anholt	862986ade3	i965: Fix state flag comments on color_buffer_write_enabled() calls. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:21 -07:00
Eric Anholt	e739558c9d	i965: Drop bogus state flag comment. This was introduced with the comment and code below it, though the code only touches prog_data (CACHE_NEW_WM_PROG). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:21 -07:00
Eric Anholt	60c5f9716c	i965: Track the number of samples in the drawbuffer. This keeps us from having to emit the nonpipelined state packet on every FBO binding. -4.42003% +/- 1.09961% effect on cairo-perf-trace runtime on glamor (n=110). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:21 -07:00
Eric Anholt	973345fc23	mesa: Track maximum CurrentTexUnit to reduce glDeleteTextures() overhead. No more walking 96*6 pointers looking to see if they're the current texture, when we only use the first 2 out of 96 units. -6.26002% +/- 1.87817% effect on cairo runtime on no-fbo-cache glamor (n=36). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:21 -07:00
Eric Anholt	6a97deb88a	mesa: Rewrite shader-based texture image state updates. Instead of walking 6 shader stages for each of the 96 combined texture image units, now we just walk the samplers used in each shader stage. With cairo-perf-trace on Xephyr with glamor, I'm seeing a -6.50518% +/- 2.55601% effect on runtime (n=22) since the "drop _EnabledUnits" change. No significant performance difference on an apitrace of minecraft (n=442). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	a580b500ed	mesa: Split the shader texture update logic from fixed function. I want to avoid walking the entire long array of texture image units, but the obvious way to do so means walking program samplers, and thus hitting the units in a random order. This change replaces the previous behavior of only setting up the fallback texture for a fragment shader with setting up the fallback texture for any shader that's missing a complete texture of the right target in its unit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	e5e50fae6a	mesa: Finish removing the _ReallyEnabled field. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	741f5d58e6	radeon: Drop the remaining driver usage of _ReallyEnabled. This is kind of ugly, but I think it's worth it to finish off the last consumers of _ReallyEnabled. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	2f8749af20	swrast: Drop remaining use of _ReallyEnabled. The _MaxEnabledTexImageUnit check assures us that Unit[0].Current != NULL. This is the last consumer of _ReallyEnabled outside of the radeons. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	8061f90a64	gallium: Drop use of _ReallyEnabled. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	cef82a64bd	mesa: Drop _ReallyEnabled usage from ff_fragment_shader. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	07b94c99a7	i915: Drop use of _ReallyEnabled. We can just look at _Current's target. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	ff9c3e8e5a	mesa: Replace use of _ReallyEnabled as a boolean with use of _Current. I'm probably not the only person that has tried to kill _ReallyEnabled. This does the mechanical part of the work, and cleans _ReallyEnabled from i965. I think that using _Current makes texture management clearer: You can't have multiple targets in use in the same texture image unit at the same time, because there's just that one pointer. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	62d46332d8	mesa: Ensure that (unit->_Current != 0) == (unit->_ReallyEnabled != 0). I'm going to try to delete _ReallyEnabled, which is this weird bitfield with either 0 or 1 bits set with just the reference to _Current. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	6bac47c05a	mesa: Drop dead last_ReallyEnabled fields from drivers. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:20 -07:00
Eric Anholt	c703658b39	mesa: Drop _EnabledUnits. The field wasn't really valid, since we've got more than 32 units now. It turns out it was mostly just used for checking != 0, or checking for fixed function coordinates, though. v2: Fix mis-conversion in xm_line.c (caught by Ken). Reviewed-by: Matt Turner <mattst88@gmail.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:33:17 -07:00
Eric Anholt	3dfe56c53b	swrast: Just use _EnabledCoordUnits for figuring out which texcoords to build. _EnabledUnits is all of the first 32 image units that are used by fixed function or programs, while _EnabledCoordUnits is just which fixed function fragment shader texcoords need to be generated. This is a theoretical bugfix in the case of a vertex shader texturing from a large texture image unit number (we'd end up flagging something other than a VARYING_SLOT_TEXn as needing to be generated), but it's actually just motivated by trying to kill _EnabledUnits. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:21:59 -07:00
Eric Anholt	1ad443ecdd	i915: Redo texture unit walking on i830. We now know what the max unit is in the context state. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 14:21:59 -07:00
Matt Turner	9565392031	i965/vec4: Remove 'mul_arg' from try_emit_mad(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 11:41:29 -07:00
Matt Turner	1e50bc9ee1	i965/fs: Remove 'mul_arg' from try_emit_mad(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-30 11:41:29 -07:00
Brian Paul	475f5ff64d	mesa: change invalid texture swizzle error to GL_INVALID_ENUM The original GL_EXT_texture_swizzle extension said GL_INVALID_OPERATION was to be generated when an invalid swizzle was passed to glTexParameter(). But in OpenGL 3.3 and later, the error should be GL_INVALID_ENUM. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-30 10:09:44 -06:00
Andreas Hartmetz	1c6aa6599e	translate_sse: Use the correct buffer index in this fast path. It is possible that there are multiple input buffers but only one is relevant for translation. Then there will be only a single translation group, which might need to source data from a buffer index != 0. Fixes wrong vertex shader inputs as observed while debugging with an application and driver combination that requires translation of a vertex attribute in a non-trivial set of attributes and input buffers. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-29 20:35:10 -04:00
Tom Stellard	ca848e8bee	clover: Query drivers for max clock frequency Igor Gnatenko: v2: PIPE_COMPUTE_CAP_MAX_CLOCK_FREQUENCY instead of PIPE_COMPUTE_MAX_CLOCK_FREQUENCY Bruno Jiménez: v3: Drivers report clock in MHz Signed-off-by: Igor Gnatenko <i.gnatenko.brain@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-04-29 15:28:17 -07:00
Tom Stellard	0a41054b7f	radeon/compute: Implement PIPE_COMPUTE_CAP_MAX_CLOCK_FREQUENCY Igor Gnatenko: v2: in define RADEON_INFO_MAX_SCLK use 0x1a instead of 0x19 (upstream changes) Bruno Jiménez: v3: Convert the frequency to MHz from kHz after getting it in 'do_winsys_init' Signed-off-by: Igor Gnatenko <i.gnatenko.brain@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-04-29 15:25:50 -07:00
Tom Stellard	5fe1a0ebad	gallium: Add PIPE_COMPUTE_CAP_MAX_CLOCK_FREQUENCY Bruno Jiménez: v2: Updated the docs v3: Remove trailing comma Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-04-29 15:24:53 -07:00
Kenneth Graunke	979a015bc1	i965: Fix a few base addresses on Broadwell. We intended to set these 64-bit addresses to 0, and set the enable bit. But, I accidentally placed the DWord with the high bits first, when it should have been second. This generally worked out, by luck - presumably General State Base Address is initially zero, and ends up remaining that way in our contexts since we bungled the "modify enable" bit. v2: Fix MOCS shift on GSBA. It should be 4, and I had 2. (Caught by Ben Widawsky.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>	2014-04-29 14:01:06 -07:00
EdB	7fb05f9298	clover: Stub implementation of CL 1.2 sub-devices. The implementation is basically a NOP but it conforms with OpenCL 1.2. [ Francisco Jerez: Initialize property return buffer for CL_DEVICE_PARTITION_PROPERTIES, CL_DEVICE_PARTITION_TYPE, CL_DEVICE_PARTITION_AFFINITY_DOMAIN, and make the latter a scalar rather than a vector. Some clean-up and code style fixes. ] Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-04-29 16:14:50 +02:00
EdB	5827781d25	clover: Add clEnqueue{Marker, Barrier}WithWaitList. Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-04-29 13:12:38 +02:00
Jan Vesely	7b11c97d31	clover: Align kernel argument sizes to nearest power of 2 v2: use a new variable for aligned size add comment make both vars const only use the aligned value in argument constructors fix comment typo Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-04-29 13:09:21 +02:00
Francisco Jerez	df985cc8f6	clover: Avoid warnings from references to deprecated CL 1.1 APIs. Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-04-29 13:01:37 +02:00
Francisco Jerez	beadd6b0cc	clover: Update OpenCL headers to version 1.2 from Khronos. The C++ headers are not updated because they rely on CL 1.2 APIs that we do not implement yet when the core CL 1.2 headers are present. Acked-by: Tom Stellard <thomas.stellard@amd.com>	2014-04-29 13:01:10 +02:00
Ilia Mirkin	f782d6e792	nvc0/ir: offset appears to come before the Z ref Fixes textureGatherOffset when used with a shadow sampler. Also verified against blob compiler with textureLodOffset manually (no piglit tests for texture[Lod]Offset + shadow samplers). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-28 20:32:36 -04:00
Brian Paul	50034c0171	mesa: remove unused #pragma export on/off lines PRAGMA_EXPORT_SUPPORTED is never defined. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77749 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-28 17:16:42 -06:00
Ilia Mirkin	f3aa999383	nv50/ir: change texture offsets to ValueRefs, allow nonconst This allows us to have non-constant offsets for textureGatherOffset and textureGatherOffsets. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-28 19:09:18 -04:00
Ilia Mirkin	46364a53ef	nvc0/ir: do constant folding of extbf/insbf Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-28 19:05:16 -04:00
Ilia Mirkin	1c85177419	nvc0/ir: add support for MUL_HI tgsi opcodes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-28 19:05:16 -04:00
Ilia Mirkin	b4b20d42f6	nvc0/ir: add support for new bitfield manipulation opcodes This adds support for: IBFE, UBFE, BFI, LSB, IMSB, UMSB, BREV, POPC Which are all required for ARB_gs5 support. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-28 19:05:16 -04:00
Ilia Mirkin	1db993f2fe	tgsi: add tgsi_exec support for new bit manipulation opcodes Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-28 19:05:11 -04:00
Ilia Mirkin	ab4927f3e0	gallium/util: add helpers for bitfield manipulation Add bitwise reversing and signed MSB helpers for software implementation of the new TGSI opcodes. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-28 19:05:07 -04:00
Ilia Mirkin	3e73bf2724	mesa/st: implement new bit manipulation opcodes Also pipe through [IU]MUL_HI, MAD, and lower ldexp. This provides coverage of all new ARB_gpu_shader5 functions except uaddCarry, usubBorrow and interpolateAt*. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-28 19:05:04 -04:00
Ilia Mirkin	a52eaba787	gallium: add new opcodes for ARB_gs5 bit manipulation support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-28 19:04:46 -04:00
Emil Velikov	b125c92aa9	glx/drisw: explicitly assign struct components for glx_*_vtable ... to improve readability of code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:13:39 +01:00
Emil Velikov	a2454bdfbd	glx/dri3: explicitly assign struct components for glx_*_vtable ... to improve readability of code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:13:39 +01:00
Emil Velikov	55d82adec6	glx/dri2: explicitly assign struct components for glx_*_vtable ... to improve readability of code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:13:38 +01:00
Emil Velikov	76ae25d7e8	glx/dri: explicitly assign struct components for glx_*_vtable ... to improve readability of code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:13:38 +01:00
Emil Velikov	2f519e4635	glx/indirect: explicitly assign struct components for glx_*_vtable ... to improve readability of code. Set indirect_screen_vtable as a static const. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:13:38 +01:00
Emil Velikov	31a3b58cb7	glx/apple: explicitly assign struct components for glx_*_vtable ... to improve readability of code. Set applegl_screen_vtable as a static const. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:13:38 +01:00
Emil Velikov	5f280d0c44	egl_dri: rework dri extension handling Use designated initialisers, and store the extensions pointers as const. The loader extensions __DRIdri2LoaderExtension and __DRIswrastLoaderExtension are setup by the platform backends so they should not be constified. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:13:38 +01:00
Emil Velikov	5457caa58c	gbm: cleanup __DRI*extension handling Use designated initialisers, store all extension pointers as const and use a const __DRIextensions array over assigning each element individually. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-28 19:13:38 +01:00
Emil Velikov	c812557a0e	dri_util: cleanup dri extension handling Explicitly set the version that is implemented, as that may differ from the one defined in dri_interface.h. The remaining __DRI*Extensions are treated as constants, so go ahead and declare them as such. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-28 19:13:38 +01:00
Emil Velikov	51e3569573	glx/tests: explicitly set __DRI2rendererQueryExtension members While we're here use the typecasted name and constify. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-28 19:13:38 +01:00
Emil Velikov	ecfe986120	glx/dri3: rework __DRIextension handling Use a const array with the extensions, rather than assigning each one to a fixed size array at runtime. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-28 19:13:37 +01:00
Emil Velikov	4be3874c97	glx/dri2: rework __DRIextension handling Make sure that the DRI*Extensions report the version of the interface implemented over the one listed in the headers. While both are currently the same, this may change in the future. v2: Keep loader extensions handling as is. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:13:18 +01:00
Emil Velikov	98e2a8e2f9	st/dri: cleanup dri extension handling Explicitly set the version that is implemented, as that may differ from the one defined in dri_interface.h. Use designated initialisers and constify wherever possible. Note: __DRIimageExtension should not be made const as it's modified at runtime. This patch should have no side effects on compilers that do not support designated initialisers, as the existing code in dri/common already uses them. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:11:28 +01:00
Emil Velikov	748b35a69f	dri/radeon: use a const __DRIextension array Rather than keeping a separate and unused copy of the screen extensions within the radeon screen, use a constant array that can be used directly with __DRIscreen. [Kristian Høgsberg] The copy in the radeon screen isn't unused, that's where the array is built and stored, the dri screen just points to that. The pattern here was used for cases where the extensions exported by a dri driver could vary at runtime, for example depending on chipset. In this case, it's known at compile time, so it makes sense to use a static const array instead. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:11:27 +01:00
Emil Velikov	38f20f79da	drivers/dri: cleanup dri extension instantiation Uniformly use the typecasted extension name, constify extension instances and use designated initialisers. Set the implemented version of the extension, over the one defined in dri_interface.h. Patch covers the following extensions: __DRItexBufferExtension __DRIimageExtension __DRIrobustnessExtension __DRI2rendererQueryExtension __DRIdri2LoaderExtension Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:11:27 +01:00
Emil Velikov	9b42fd1772	dri_interface: Update __DRItexBufferExtensionRec to version 3 With commit e59fa4c46c8("dri2: release texture image.") we updated the extension without bumping the version number. The patch itself added an interface required to enable texture_from_pixmap on certain platforms. The new code was effectively never built, as it depended on __DRI_TEX_BUFFER_VERSION >= 3, which never came to be in upstream mesa. This commit bumps the version number, drops the __DRI_TEX_BUFFER_VERSION checks and resolves all the build conflicts. Additionally it adds a version check in egl and dri3, as they require version 2 of the extension, which does not have the releaseTexBuffer hook. Cc: Juan Zhao <juan.j.zhao@intel.com> Cc: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-28 19:11:27 +01:00
Jon TURNEY	ec8ebff342	Check for dladdr(), rather than assuming we have it if we have RTLD_DEFAULT Unfortunately, Cygwin defines RTLD_DEFAULT (for glibc compatibility), but can't provide dladdr(), so add a check for dladdr() Since I don't think scons is ever used to build for Cygwin, just set HAVE_DLADDR in SConscript, assuming that if we have RTLD_DEFAULT, we have dladdr(). Cc: Jonathan Gray <jsg@jsg.id.au> Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-28 19:11:02 +01:00
Richard Sandiford	6c8f547f66	util: Fix cross-compiles between endiannesses The old python code used sys.is_big_endian to select between little-endian and big-endian formats, which meant that the build and host endiannesses needed to be the same. This patch instead generates both big- and little- endian layouts, using PIPE_ARCH_BIG_ENDIAN to select between them. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Signed-off-by: José Fonseca <jfonseca@vmware.com>	2014-04-28 13:16:27 +01:00
Richard Sandiford	6944796cbe	util: Split out channel-parsing Python code Splits out the code that parses the channel list, so that we can have different lists for little and big endian. There is no change to the generated u_format_table.c. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Signed-off-by: José Fonseca <jfonseca@vmware.com>	2014-04-28 13:16:25 +01:00
Richard Sandiford	1a3746212d	util: Split out channel-printing Python code Rather than iterate over format.channels and format.swizzles directly, use Python subfunctions that take the channel and swizzle lists as arguments. This allows the channel and swizzle lists to depend on endianness. There is no change to the generated u_format_table.c. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Signed-off-by: José Fonseca <jfonseca@vmware.com>	2014-04-28 13:16:24 +01:00
Richard Sandiford	0ee3ac938a	util: Turn inv_swizzle into a global function With the big-endian changes, there can be two swizzle orders for each format. This patch turns Format.inv_swizzle() into a global function that takes the swizzle list as a parameter. There is no change to the generated u_format_table.c. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Signed-off-by: José Fonseca <jfonseca@vmware.com>	2014-04-28 13:16:22 +01:00
Richard Sandiford	227d7a6a3c	util: Add more query methods to u_format_parse.Format The main aim is to reduce the number of places that access channels[0], swizzles[0] and swizzles[1] directly. There is no change to the generated u_format_table.c. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com> Signed-off-by: José Fonseca <jfonseca@vmware.com>	2014-04-28 13:16:20 +01:00
Michel Dänzer	136c437cea	st/mesa: Fix NULL pointer dereference for incomplete framebuffers This can happen with glamor, which uses EGL_KHR_surfaceless_context and only explicitly binds GL_READ_FRAMEBUFFER for glReadPixels. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-28 12:12:03 +09:00
Chris Forbes	151a20dcd4	glsl: fix spelling of derived Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-04-27 21:37:23 +12:00
Ilia Mirkin	e88644c1f2	docs: mark off nv50/nvc0 for ARB_sample_shading, update relnotes relnotes weren't updated this whole time, so I went through all the GL3.txt changes and picked out the nouveau ones since 10.1. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-27 00:16:29 -04:00
Chia-I Wu	7b2dd89041	mesa: overhaul debug namespace support _mesa_HashTable is not well-suited for us: it locks a mutex unnecessarily and it does not accept 0 as the key (and has branches to handle 1 specially). What we really need is a sparse array. Whether it should be implemented as a hash table, a list, or a bsearch()-able array requires investigations of the use models. We choose to implement it as a list for now, assuming it is common to have a short list of IDs in each (source, type) namespace. The code is simpler, and the memory footprint is lower. This also fixes several corner cases such as making messages have different states at different severities. v2: use GLbitfield for State/DefaultState, and add a comment Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:21 +08:00
Chia-I Wu	70e4337014	mesa: delay copying of debug groups Do not copy the debug group until it is about to be written. One likely scenario of using glPushDebugGroup/glPopDebugGroup is to enclose a sequence of GL commands and give them a human-readable description. There is no message control change in this scenario, and thus no need to copy. This also reduces the initial size of gl_debug_state from 306KB to 7KB. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:21 +08:00
Chia-I Wu	a30c4c6ca0	mesa: clean up debug output namespace handling Add functions to provide these operations on a struct gl_debug_namespace: init(): initialize the namespace copy(): copy all elements from one namespace to another clear(): clear all elements (to free the memories) set(): set the value of an element set_all(): set the value of all elements get(): get the value of an element A debug namespace is like a sparse array. The length of the array is huge, 2^sizeof(GLuint), but most of the elements assume the same value specified by set_all(). Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:21 +08:00
Chia-I Wu	44a1374793	mesa: clean up debug groups Add struct gl_debug_group to hold all namespaces of a debug group. Replace the 3-dimensional array, Namespaces, in struct gl_debug_state by a 1-dimensional array of type struct gl_debug_groups. Turn the 4-dimensional array, Defaults, in struct gl_debug_state to a 1-dimensional array in struct gl_debug_namespace. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:21 +08:00
Chia-I Wu	e412305f9f	mesa: clean up debug message log Remove NextMsgLength, and move members of struct gl_debug_state that belong to the message log to a new struct, gl_debug_log. Rename gl_debug_msg to gl_debug_message. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:21 +08:00
Chia-I Wu	cf61ea3029	mesa: use accessors for struct gl_debug_state When GL_DEBUG_OUTPUT_SYNCHRONOUS is GL_TRUE, drivers are allowed to log debug messages from other threads. That requires gl_debug_state to be protected by a mutex, even when it is a context state. While we do not spawn threads in Mesa yet, this commit makes it easier to do when we want to. Since the definition of struct gl_debug_state is no longer needed by the rest of the driver, move it to main/errors.c. This should make it even harder to use the struct incorrectly. v2: add comments for the accessors Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	94e45c98e1	mesa: eliminate debug output message_insert Add validate_length, and call it together with log_msg directly instead of message_insert. No functional change. v2: make sure length is non-negative (i.e., known) before calling validate_length, noted by Timothy Arceri Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	188d22d9b7	mesa: eliminate debug output should_log In both call sites, it could be easily replaced by direct debug_is_message_enabled calls. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	c9dfb6b76c	mesa: eliminate debug output control_app_messages Merge control_app_messages with the only caller. Eliminate set_message_state and control_messages too as they are unused. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	274913c42c	mesa: eliminate debug output get_msg Merge get_msg with the only caller. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	04a8baad37	mesa: refactor _mesa_PopDebugGroup and _mesa_free_errors_data Replace free_errors_data by debug_clear_group. Add debug_pop_group and debug_destroy for use in _mesa_PopDebugGroup and _mesa_free_errors_data respectively. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	f1d00dce43	mesa: refactor _mesa_PushDebugGroup Move group copying to debug_push_group. Save the group message before pushing instead of after, since we will need it after popping. No functional change otherwise. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	de0e0ae4b6	mesa: refactor debug output control_messages Move most of the code to debug_set_message_enable_all. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	7e9451dc46	mesa: refactor debug output get_msg Move message fetching to debug_fetch_message and message deletion to debug_delete_messages. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	e9d1b5c8af	mesa: refactor debug out log_msg Move message logging to debug_log_message. Replace store_message_details by debug_message_store. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	880183fee8	mesa: refactor debug output set_message_state Move message state update to debug_set_message_enable. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	7554d27de4	mesa: refactor debug output should_log Move the message filtering logic to debug_is_message_enabled. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Chia-I Wu	672b209225	mesa: refactor _mesa_get_debug_state Move gl_debug_state allocation to a new function, debug_create. No functional change. Signed-off-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-27 10:06:20 +08:00
Ilia Mirkin	9339f8ac1b	nvc0/ir: fetch shadow value from proper place for TG4 cube array Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-26 12:01:13 -04:00
Ilia Mirkin	b86d78b4c1	nvc0/ir: set gatherComp for non-shadow targets Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-26 12:01:13 -04:00
Ilia Mirkin	24e68c9024	nvc0/ir: set instance count based on the GS_INVOCATIONS property Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-26 12:01:13 -04:00
Ilia Mirkin	802fe8d9af	nvc0/ir: add support for INVOCATIONID system value Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-26 12:01:13 -04:00
Ilia Mirkin	b3a2398ade	nvc0/ir: add support for SAMPLEMASK sysval Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-26 11:57:18 -04:00
Ilia Mirkin	c3d2bda53e	mesa/st: translate gl_InvocationID to INVOCATIONID semantic Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-26 11:57:15 -04:00
Ilia Mirkin	389379e81d	mesa/st: translate gl_SampleMaskIn to SAMPLEMASK semantic Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-26 11:57:12 -04:00
Ilia Mirkin	4be146b108	gallium: add GS_INVOCATIONS property Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-26 11:57:09 -04:00
Ilia Mirkin	76db20fc67	gallium: add INVOCATIONID semantic Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-26 11:56:39 -04:00
Ilia Mirkin	af38ef907c	nvc0: add support for PIPE_CAP_SAMPLE_SHADING Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-26 11:53:34 -04:00
Ilia Mirkin	f715a0a39a	nv50: add support for PIPE_CAP_SAMPLE_SHADING Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-26 11:53:24 -04:00
Ilia Mirkin	c5d822dad9	mesa/st: add support for ARB_sample_shading Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-26 11:52:52 -04:00
Ilia Mirkin	88d8d88d8c	gallium: add basic support for ARB_sample_shading Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-26 11:52:01 -04:00
Enrico Horn	3a2885fb26	mapi: OpenVG symbol exports. Fixes another mistake in `144bbb7b78`. Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77502	2014-04-25 19:34:38 -07:00
Matt Turner	18993f7892	glsl: Use properly typed arguments for bitfieldInsert. bitfieldInsert takes scalar integers for its last two arguments. Since bitfieldInsert is lowered on i965 to two instructions that have more flexible arguments, I didn't notice when I wrote this. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-25 19:24:39 -07:00
Eric Anholt	07730e9463	i965: Don't bother flushing the batch if it doesn't ref our mt to map. -1.1372% +/- 0.858033% effect on cairo runtime on glamor (n=175). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-25 18:19:55 -07:00
Ander Conselvan de Oliveira	17860309f1	egl: Protect use of gbm_dri with ifdef HAVE_DRM_PLATFORM Otherwise it fails to compile if the drm egl platform is disabled. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-25 21:17:54 +01:00
Neil Roberts	63d4661ab2	wayland: Fix the logic in disabling the prime capability It looks like this bit of code is trying to disable the prime capability if the driver doesn't support createImageFromFds. However the logic looks a bit broken and what it would actually do is disable all other capabilities apart from prime. This patch fixes it to actually disable prime. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-25 21:17:05 +01:00
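The prime-capability fix above comes down to a familiar bit-mask slip. A minimal sketch in C, assuming hypothetical flag names (`CAP_NAME`, `CAP_PRIME`) that stand in for the real capability bits, which this log does not show:

```c
#include <stdint.h>

/* Hypothetical capability bits, for illustration only. */
#define CAP_NAME  (1u << 0)
#define CAP_PRIME (1u << 1)

/* Buggy form: AND-ing with the flag itself keeps only that flag,
 * so every capability other than prime is cleared. */
static uint32_t disable_prime_buggy(uint32_t caps)
{
    return caps & CAP_PRIME;
}

/* Intended form: AND-ing with the complement clears just that flag
 * and leaves the other capabilities untouched. */
static uint32_t disable_prime_fixed(uint32_t caps)
{
    return caps & ~CAP_PRIME;
}
```

With both bits set on input, the buggy form leaves only `CAP_PRIME`, while the fixed form leaves only `CAP_NAME`.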
Ander Conselvan de Oliveira	49964fa28b	gbm: Set errno on errors This should give the caller some information about what caused the error. For the gbm_bo_import() case, for instance, it is possible to know if the import is not supported or the error was caused by an invalid parameter. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-25 21:16:45 +01:00
Ander Conselvan de Oliveira	aa91fe1c09	gbm/dri: Fix out-of-memory error path in dri_device_create() Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-25 21:16:00 +01:00
Emil Velikov	c0953cf06e	gallium/tests: conditionally include sw/dri winsys In all fairness we allow the gallium tests to be built with --disable-dri, which results in the appropriate winsys not being built, thus the build will fail. ./configure --disable-dri --with-gallium-drivers=svga --enable-gallium-tests Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-25 21:09:26 +01:00
Emil Velikov	6c44d43bae	automake: cleanup pipe-loader handling when using sw/xlib winsys Rather than defining our own set of variables, use NEED_WINSYS_XLIB and based on it include the sw/xlib winsys. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-25 21:09:17 +01:00
Emil Velikov	5c6a1445d5	pipe-loader: conditionally build and use pipe_loader_sw_probe_dri The function relies on the sw/dri winsys which is built only when --enable-dri is set. Fixes build issues with the following config ./configure --disable-dri --with-gallium-drivers=svga --enable-xa Issue can be reproduced with any hw gallium driver + st that uses the pipe-loader. Cc: Brian Paul <brianp@vmware.com> Reported-by: Brian Paul <brianp@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-25 21:09:09 +01:00
Roland Scheidegger	a7a03d84fc	llvmpipe: fix clearing of individual color buffers in a fb GL (3.0) allows you to clear individual color buffers in a fb. In fact for fbs containing both int and float/normalized color buffers this is required (because the clearing values are otherwise undefined if applied to all buffers). The gallium interface was changed a while ago, but llvmpipe ignored it (hence doing such individual clears always resulted in clearing all buffers, plus some assorted asserts due to the mixed fbs). So change the clear command to indicate the buffer to be cleared. Also, because indicating the buffer to be cleared would have made lp_rast_arg_cmd larger which is unacceptable (we're trying to shrink it some day) allocate the clear value in the scene and just pass a pointer. There are several advantages and disadvantages here: + clearing individual buffers works (we could also actually bin such clears now if they'd come through clear_render_target() if the surface is in the current fb, though we didn't do this before for the single rb case and still don't try). + since there's one clear per rb, we do the format conversion in setup rather than per bin. Aside from the (drop in the ocean...) performance advantage this means that clearing to very small values (that is, denormal when converted to the format) should work for small float (fp16 etc.) formats, as the util code couldn't handle it correctly before (because cpu denorms are disabled when executing the bin commands, screwing up the magic conversion and flushing the values to 0, though this was not verified). - there's some overhead for traditional old-style clear-all MRT cases, since there's one rast clear command per rb instead of one for all rbs. This fixes https://bugs.freedesktop.org/show_bug.cgi?id=76976. v2: get rid of the ugly manual memcpy stuff and just use union util_color. This is 32 bytes instead of 16 but as the allocation is per scene we can live with those additional 16 bytes (and the additional 128 bytes in the setup context), which makes the code much more obvious. Suggested by Brian. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-25 19:29:30 +02:00
Roland Scheidegger	fa4082320a	gallium/util: use ui[4] instead of ui in union util_color util_color often merely represents a collection of bytes, however it is inconvenient if those bytes can only be accessed as floats/doubles for int formats exceeding 32bits. (Note that rgba8 formats use one uint, not 4 bytes, hence the byte and short members were left as is.)	2014-04-25 19:29:30 +02:00
Roland Scheidegger	2f65f61bea	llvmpipe: (trivial) use correct LP_MIN_VECTOR_ALIGN define for alignment. Currently it's the same value. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-25 19:29:30 +02:00
Marek Olšák	3a3b1bf60e	r600g: fix hang on RV740 by using DX_RASTERIZATION_KILL instead of SX_MISC Changing SX_MISC hangs RV740. While we're at it, let's use DX_RASTERIZATION_KILL on all R700 and later chipsets. Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-25 01:33:13 +02:00
Marek Olšák	3d0c4f3b01	r600g: fix for an MSAA hang on RV770 Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-25 01:33:12 +02:00
Marek Olšák	ecc8a37ec5	r600g: fix for broken CULL_FRONT behavior on R6xx Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-25 01:33:12 +02:00
Marek Olšák	ef162cf13d	r600g: fix for HTILE on R6xx Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-25 01:33:12 +02:00
Marek Olšák	0967970768	r600g: fix buffer copying on R600-R700 This fixes broken rendering in DOTA 2. Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-25 01:33:12 +02:00
Marek Olšák	042e40f67b	r600g: fix flushing on RV670, RS780, RS880 again Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-25 01:33:12 +02:00
Marek Olšák	20a9b784da	r600g: fix MSAA resolve on R6xx when the destination is 1D-tiled Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-25 01:33:12 +02:00
Marek Olšák	6dd045ef40	r600g: disable async DMA on R700 Cc: 10.0 10.1 mesa-stable@lists.freedesktop.org	2014-04-25 01:33:12 +02:00
Marek Olšák	e5741f1e91	r600g: fix edge flags and layered rendering on R600-R700 We forgot to set these bits. Cc: 10.1 mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-25 01:33:12 +02:00
Marek Olšák	8a1dfba73e	st/mesa: remove trailing NULL colorbuffers Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-25 01:33:12 +02:00
Marek Olšák	e522c455e4	r300g: don't crash when getting NULL colorbuffers Cc: mesa-stable@lists.freedesktop.org	2014-04-25 01:33:12 +02:00
Marek Olšák	ba4f6a5fc9	r300g: fix runtime warning after winsys cleanup Broken by: `b2238b3452` winsys/radeon: remove cs_write_reloc, add simpler cs_get_reloc	2014-04-25 01:33:12 +02:00
Marek Olšák	7920adb45c	radeonsi: implement GL_ARB_vertex_type_10f_11f_11f_rev Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-04-25 01:33:12 +02:00
José Fonseca	f438a82492	st/xlib: Do minimal version checking in glXCreateContextAttribsARB. The current version checking is wrongly refusing to create 3.3 contexts; unsupported versions are checked elsewhere; and the DRI path doesn't do this sort of checking either. This enables piglit glsl 3.30 tests to run without skipping. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-24 20:26:23 +01:00
José Fonseca	7380ce9bf6	llvmpipe: Advertise GLSL 3.30. According to Roland all TGSI support is there in theory. In practice there are a few piglit failures and crashes, as this hadn't been tested before. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-24 20:26:23 +01:00
José Fonseca	5f493eed69	st/xlib: Honour request of 3.1 contexts through core profile where available. The GLX_ARB_create_context_profile spec says: "If version 3.1 is requested, the context returned may implement any of the following versions: * Version 3.1. The GL_ARB_compatibility extension may or may not be implemented, as determined by the implementation. * The core profile of version 3.2 or greater." Mesa does not support GL_ARB_compatibility, and there are no plans to ever support it, therefore the only chance to honour a 3.1 context is through core profile, i.e., the 2nd alternative from the spec. This change does that. And with it piglit tests that require 3.1 contexts no longer skip. Assuming there is no objection with this change, src/glx/dri_common.c and src/gallium/state_trackers/wgl/stw_context.c should also be updated accordingly, given they have the same logic. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-24 20:26:23 +01:00
Zack Rusin	1c73e919a4	draw/llvm: reduce memory usage Let's make draw_get_option_use_llvm function available unconditionally and use it to avoid useless allocations when LLVM paths are active. TGSI machine is never used when we're using LLVM. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-04-24 13:59:24 -04:00
Brian Paul	552a8e44a9	docs: fix typo in 10.1.1 release notes URL	2014-04-24 08:37:23 -06:00
Brian Paul	0a92c88a51	swrast: move texture_slices() calls out of loops Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-04-24 08:16:01 -06:00
Brian Paul	1a7fa8b2eb	swrast: move null pointer check earlier in _swrast_map_teximage() There's no reason to compute texel size, stride, etc. if there's no image data to map. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-04-24 08:16:01 -06:00
Brian Paul	5e81e6e268	swrast: remove _mesa_ prefix from static function And add a const qualifier. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-04-24 08:16:01 -06:00
Brian Paul	7cc2e2e99d	swrast: allocate swrast_texture_image::ImageSlices array if needed Fixes a segmentation fault in conform divzero.c test. This happens when glTexImage(level, width=0, height=0) is called. We don't allocate texture memory in that case so the ImageSlices array was never allocated. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-04-24 08:16:01 -06:00
nick	15c92464df	swrast: Fix vertex color in _swsetup_Translate() Straightforward fix to properly load dest->color with color data, as opposed to position data as previously implemented. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27499 Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-24 08:16:00 -06:00
José Fonseca	1527a545a4	gallivm: Fix wrong operator in lp_exec_default. Courtesy of MSVC static code analyser. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-24 14:49:53 +01:00
José Fonseca	878877d3c4	mesa/st: Handle empty frame-buffers without asserting. Fixes assertion failures with radeonsi. Tested-by: Marek Olšák <maraeo@gmail.com>	2014-04-24 14:48:37 +01:00
José Fonseca	fd92346c53	mesa/st: Fix pipe_framebuffer_state::height for PIPE_TEXTURE_1D_ARRAY. This prevents buffer overflow w/ llvmpipe when running piglit bin/gl-3.2-layered-rendering-clear-color-all-types 1d_array single_level -fbo -auto v2: Compute the framebuffer size as the minimum size, as pointed out by Brian; compacted code; ran piglit quick test list (with no regressions.) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-23 19:12:23 +01:00
José Fonseca	7a8667f2b3	util/u_debug: Pass correct size to strncat. Courtesy of Clang static analyzer. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-23 19:12:23 +01:00
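The strncat fix above touches a common pitfall: the size argument bounds the number of characters appended, not the total size of the destination. A minimal sketch, assuming a hypothetical helper name (`append_bounded`) that is not taken from util/u_debug:

```c
#include <string.h>

/* Append src to the NUL-terminated string in dst without overflowing
 * a buffer of dst_size bytes. strncat()'s size argument is the maximum
 * number of characters to append, so it must be the space remaining
 * after the current contents, minus one for the terminating NUL. */
static void append_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);

    if (used + 1 >= dst_size)
        return; /* no room left for even one more character */

    strncat(dst, src, dst_size - used - 1);
}
```

Passing `dst_size` itself as the third argument would let strncat write past the end of the buffer whenever `dst` is not empty.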
Rob Clark	05b3cea77b	freedreno/a3xx: fix TOTALATTRTOVS In cases where varying fetches are optimized away (just pass-through in vertex shader, but unused in fragment shader) we need to calculate the correct TOTALATTRTOVS based on the actual number of varyings fetched, otherwise lockup. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-04-23 07:32:16 -04:00
Kenneth Graunke	34a68345e2	i965: Make Broadwell HiZ path arrange for TC flushes. HiZ operations make the depth/render caches out of sync with the sampler caches. We need to arrange for a TC flush to happen before the target buffer is used by the sampler. Calling brw_render_cache_set_add_bo makes that happen. On previous generations, brw_blorp_exec took care of flushing the texture cache by calling intel_batchbuffer_emit_mi_flush after doing any rendering. If we were to use the normal drawing path, then brw_postdraw_set_buffers_need_resolve would handle this. On Broadwell, we don't use BLORP, and we don't emit a rectangle primitive via the normal drawing path. The 3DSTATE_WM_HZ_OP and PIPE_CONTROL implicitly make drawing happen. So, none of our existing code makes this flush happen - we need to do it directly. Fixes 11 Piglit copyteximage subtests. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77223 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77226 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-22 10:57:11 -07:00
Matt Turner	fe49949392	i965: Use uint16_t for control/src index tables. No need to use 32-bits to store 15 and 12. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-04-22 09:12:31 -07:00
Matt Turner	f02f489295	i965/disasm: Fix s/xoo/xor/ typo. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-22 09:12:31 -07:00
Matt Turner	06501b3cf0	i965/disasm: Remove tables with obvious mappings. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-22 09:12:31 -07:00
Ilia Mirkin	5ce3f2fe72	mesa/st: enable EXT_shader_integer_mix when NativeIntegers is on Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-22 11:27:34 -04:00
Christian König	7eda318ffe	st/omx/enc: implement frame reordering and B-frames Signed-off-by: Christian König <christian.koenig@amd.com>	2014-04-22 16:42:08 +02:00
Leo Liu	b03be6908e	st/omx/enc: replace omx buffer with texture buffer Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-22 15:13:08 +02:00
Michel Dänzer	360038fa50	radeonsi: Fix calculation of number of banks for SI The way cik_num_banks() was calculating the index only makes sense for the CIK specific macrotile mode array. For SI, we need to use the tile mode index directly. This happened to work most of the time because most of the SI tiling modes use the same number of banks. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-22 12:07:07 +09:00
Chris Forbes	0dfa6e7cf5	glsl: Only allow `invariant` on shader in/out between stages. Previously this was special-cased for VS and FS; it never got updated when geometry shaders came along. Generalize using is_varying_var() so this won't be broken again with tessellation. Note that there are two copies of the logic for `invariant`: It can be present as part of a new declaration, and also as a redeclaration of an existing variable or block member. Fixes the four new piglits: spec/glsl-1.50/compiler/invariant-qualifier-*.geom Note for stable: This won't quite pick cleanly due to whitespace and state->target -> state->stage renames. Should be straightforward adjustments though. Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-22 09:07:05 +12:00
Brian Paul	0a0075666c	svga: move draw debug code into separate function Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-04-21 14:54:28 -06:00
Brian Paul	e959274081	mesa: move declaration before code To fix MSVC build.	2014-04-21 13:24:26 -06:00
Anuj Phogat	f8ae2a56c6	mesa: Fix error code generation in glReadPixels() Section 4.3.1, page 220, of OpenGL 3.3 specification explains the error conditions for glReadPixels(): "If the format is DEPTH_STENCIL, then values are taken from both the depth buffer and the stencil buffer. If there is no depth buffer or if there is no stencil buffer, then the error INVALID_OPERATION occurs. If the type parameter is not UNSIGNED_INT_24_8 or FLOAT_32_UNSIGNED_INT_24_8_REV, then the error INVALID_ENUM occurs." Fixes failing Khronos CTS test packed_depth_stencil_error.test V2: Avoid code duplication Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-21 11:20:50 -07:00
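The spec text quoted in the commit above can be sketched as a small decision function. This is an illustration only, not Mesa's code: the enum and function names are hypothetical, and the order of the two checks is an assumption, since the quoted passage does not say which error wins when both conditions fail.

```c
#include <stdbool.h>

/* Hypothetical error codes standing in for the GL error enums. */
enum sketch_error {
    SKETCH_NO_ERROR,
    SKETCH_INVALID_ENUM,
    SKETCH_INVALID_OPERATION,
};

/* Error selection for a read with format DEPTH_STENCIL.
 * type_is_packed: type is UNSIGNED_INT_24_8 or
 *                 FLOAT_32_UNSIGNED_INT_24_8_REV.
 * Check order is an assumption of this sketch. */
static enum sketch_error
check_depth_stencil_read(bool type_is_packed, bool has_depth, bool has_stencil)
{
    if (!type_is_packed)
        return SKETCH_INVALID_ENUM;
    if (!has_depth || !has_stencil)
        return SKETCH_INVALID_OPERATION;
    return SKETCH_NO_ERROR;
}
```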
Anuj Phogat	bd1880dfe8	mesa: Add an error condition in glGetFramebufferAttachmentParameteriv() From the OpenGL 4.4 spec page 275: "If pname is FRAMEBUFFER_ATTACHMENT_COMPONENT_TYPE, param will contain the format of components of the specified attachment, one of FLOAT, INT, UNSIGNED_INT, SIGNED_NORMALIZED, or UNSIGNED_NORMALIZED for floating-point, signed integer, unsigned integer, signed normalized fixedpoint, or unsigned normalized fixed-point components respectively. If no data storage or texture image has been specified for the attachment, param will contain NONE. This query cannot be performed for a combined depth+stencil attachment, since it does not have a single format." Fixes Khronos CTS test: packed_depth_stencil_parameters.test Khronos Bug# 9170 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-21 11:20:50 -07:00
Brian Paul	7cb3bbf2cd	libgl-gdi: silence unused variable warning when not using LLVM	2014-04-21 09:50:53 -06:00
Brian Paul	1f043cd95a	docs: import 10.0.5 release notes and update links	2014-04-21 09:03:32 -06:00
Brian Paul	3fd9943a65	docs: import 10.1.1 release notes, update links	2014-04-21 09:03:32 -06:00
Benjamin Bellec	9b3b9c613f	mesa: fix GetStringi error message with correct function name Signed-off-by: Benjamin Bellec <b.bellec@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: <mesa-stable@lists.freedesktop.org>	2014-04-21 08:44:20 -06:00
Brian Paul	27496af67f	st/mesa: fix invalid pointer use in st_texture_get_sampler_view() The '*used' pointer was pointing into the stObj->sampler_views array. If 'free' was null, we'd realloc that array, thus making the 'used' pointer invalid. This soon led to memory errors. Just change the pointer to be 'used' so it points directly at the pipe_sampler_view. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-04-21 08:30:46 -06:00
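The invalid-pointer bug described above is an instance of holding a pointer into storage that realloc() may move. A minimal sketch of the safe pattern, using hypothetical types (`struct view`, `struct view_array`) rather than the real st/mesa structures:

```c
#include <stdlib.h>

struct view { int id; };            /* stand-in for the pointed-to object */

struct view_array {                 /* hypothetical growable slot array */
    struct view **slots;
    size_t count;
};

/* Grow the slot array by one NULL entry. Returns 0 on success. */
static int view_array_grow(struct view_array *a)
{
    struct view **grown = realloc(a->slots, (a->count + 1) * sizeof(*grown));
    if (!grown)
        return -1;
    grown[a->count] = NULL;
    a->slots = grown;
    a->count++;
    return 0;
}

/* Find the first non-NULL view, then grow the array.
 * Keeping `struct view **p = &a->slots[i]` across view_array_grow()
 * would dangle, because realloc() may move the slot storage.
 * Copying the element itself (a `struct view *`) is safe: the object
 * it points to is not part of the reallocated block. */
static struct view *first_view_then_grow(struct view_array *a)
{
    struct view *used = NULL;

    for (size_t i = 0; i < a->count; i++) {
        if (a->slots[i]) {
            used = a->slots[i];
            break;
        }
    }
    if (view_array_grow(a) != 0)
        return NULL;
    return used;
}
```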
Chris Forbes	9fec560e63	glsl: Fix typo Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-04-21 16:02:02 +12:00
Chris Forbes	d63026f62a	i965: Use ctx->Texture._MaxEnabledTexImageUnit for upper bound Avoid looping over 32/48/96 (!!) tex image units every draw, most of which we don't care about. Improves performance on everyone's favorite not-a-benchmark by 2.9% on Haswell. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-21 10:13:07 +12:00
Chris Forbes	c4a98e76d7	mesa: Track max enabled tex image unit This gives us a better bound for some hot loops in the drivers than MAX_COMBINED_TEXTURE_IMAGE_UNITS, which is ridiculously large on modern hardware, and only getting worse as more shader stages are added. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-21 10:12:00 +12:00
Ilia Mirkin	ba6dcb3c2b	nouveau/codegen: add missing values for OP_TXLQ into the target arrays Also rework things so that if someone were to add an opcode without adjusting the values in these arrays, there will be a compilation error. This fixes a few quadop-related piglit regressions since commit `d5faf8e786`. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-19 13:23:32 -04:00
Ilia Mirkin	47c19a5819	nvc0: change logic for centering of eng2d blit when downsampling We want to center the sample. The old code may have been correct given the limited values of ms_x/y, but the new logic should be more intuitive. Note that ms_x can only be 1/2 and ms_y can only be 0/1. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-19 13:23:32 -04:00
Ilia Mirkin	6d5c3c8260	nv50: use 2d blit when src/dst have same number of samples The 2D engine should be usable in more cases, but this fixes MS blits between textures with the same MS settings. Otherwise a single sample is selected to be the target texel value. This allows other tests to work that render to a RB and then blit that to a texture for input into a shader that uses sampler2DMS to verify it. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-19 13:23:32 -04:00
Ilia Mirkin	2d2e60bdee	gallium/docs: fix PIPE_CAP_ENDIANNESS delimiter, remove trailing spaces Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-19 13:23:32 -04:00
Petri Latvala	b45f65e760	mesa: update glext.h to version 20140313 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-18 14:30:57 -07:00
Kenneth Graunke	a1273a07ed	i965/fs: Implement fs_inst::force_sechalf support on Broadwell. Back when I originally wrote this code, force_sechalf was only used for Gen4 code, so I didn't bother hooking it up. However, it's used more generally these days. In particular, we use it for computing gl_SamplePosition. Fixes Piglit's spec/ARB_sample_shading/builtin-gl-sample-position tests. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77222 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-18 11:57:33 -07:00
Chris Forbes	92840aabf7	glsl: Allow explicit binding on atomics again As of `943b2d52bf`, layout(binding) on an atomic would fail the assertion here. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-18 10:35:05 -07:00
Alex Deucher	7489f3eeda	radeonsi: fix num banks selection on SI for dma setup (v2) The number of banks varies based on the tile mode index just like CIK. Bug: https://bugs.freedesktop.org/show_bug.cgi?id=77533 v2: fix ordering for nbanks calculation for consistency Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2014-04-18 13:24:12 -04:00
Matt Turner	f770123f58	i965/fs: Reduce restrictions on interference in register coalescing. We previously only allowed coalescing registers that interfere (i.e., whose live ranges overlap) if the destination register's live range was entirely inside the source's live range. This is unnecessary -- we only need to check for interfering writes in the intersection of their live ranges. total instructions in shared programs: 1639470 -> 1638453 (-0.06%) instructions in affected programs: 84751 -> 83734 (-1.20%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-18 09:16:19 -07:00
Matt Turner	55de1c035c	i965/fs: Give up in interference check if we see a WHILE. Rather than any old control flow. Muchnick's algorithm just checks for interfering writes between the MOV and the end of the program. Handling this when you have backward branches is hard, so don't, but there's no reason to bail if you see forward branches. instructions in affected programs: 4270 -> 4248 (-0.52%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-18 09:16:19 -07:00
Matt Turner	5ff1e446d4	i965/fs: Simplify interference scan in register coalescing. We were starting at the beginning of the instruction list, rather than with the MOV instruction itself. This allows us to coalesce after control flow. Excluding the shaders from an unreleased title, the shader-db results: total instructions in shared programs: 1603791 -> 1594215 (-0.60%) instructions in affected programs: 678772 -> 669196 (-1.41%) GAINED: 5 LOST: 0 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-18 09:16:19 -07:00
Matt Turner	04a4e43eb2	i965/fs: Unindent can_coalesce_vars(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-18 09:16:19 -07:00
Matt Turner	a975b2f55c	i965/fs: Recognize nop-MOV instructions early. And avoid rewriting other instructions unnecessarily. Removes a few self-moves we weren't able to handle because they were components of a large VGRF. instructions in affected programs: 830 -> 826 (-0.48%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-18 09:16:19 -07:00
Matt Turner	ef6127ff69	i965/fs: Only sweep NOPs if register coalescing made progress. Otherwise there's nothing to do. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-18 09:16:19 -07:00
Marek Olšák	352e06ddea	r600g,radeonsi: don't skip the context flush if a fence should be returned Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77589	2014-04-18 13:33:57 +02:00
Brian Paul	744d2a225d	svga: fix comment for emit_adjusted_vertex_attribs()	2014-04-17 16:15:37 -06:00
Brian Paul	cb34575e19	svga: compute need_swvfetch in svga_create_vertex_elements_state() This saves us doing it at state validation time. Reviewed-by: Matthew McClure <mcclurem@vmware.com>	2014-04-17 11:31:15 -07:00
Brian Paul	851645a3e7	svga: add VS code to set attribute W component to 1 There are a few 3-component vertex attribute formats that have no equivalent SVGA3D_DECLTYPE_x format. Previously, we had to use the swtnl code to handle them. This patch lets us use hwtnl for more vertex attribute types by fetching 3-component attributes as 4-component attributes and explicitly setting the W component to 1. This lets us handle PIPE_FORMAT_R16G16B16_SNORM/UNORM and PIPE_FORMAT_R8G8B8_UNORM vertex attribs without using the swtnl path. Fixes piglit normal3b3s GL_SHORT test. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-04-17 11:29:33 -07:00
Brian Paul	615a356ee3	svga: implement support for signed byte vertex attributes There's no SVGA3D_DECLTYPE that directly corresponds to PIPE_FORMAT_R8G8B8_SNORM. Previously, we used the swtnl fallback path to handle this but that's slow and causes invariance issues. Now we fetch the attribute as SVGA3D_DECLTYPE_UBYTE4N and insert some extra VS instructions to remap the attributes from the range [0,1] to the range [-1,1]. Fixes Sauerbraten sw fallback. Fixes piglit normal3b3s-invariance test. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-04-17 11:29:33 -07:00
Brian Paul	52faafa174	svga: move translated vertex declaration types into svga_velems_state Now only translate the formats once in svga_create_vertex_elements_state(). And rename the array and use the proper SVGA3dDeclType type. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-04-17 11:29:32 -07:00
Brian Paul	0f5add1959	Revert "svga: add work-around for Sauerbraten Z fighting issue" This reverts commit `c875d6e57a`. Conflicts: src/gallium/drivers/svga/svga_context.c This work-around will no longer be needed after the next patch which properly supports signed-byte vertex attributes. Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-04-17 11:29:32 -07:00
Brian Paul	7c7ab5434a	svga: use new inst_token_setp() helper function Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-04-17 11:29:32 -07:00
Brian Paul	8e131576ee	svga: use new inst_token_predicated() helper function Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2014-04-17 11:29:32 -07:00
Kenneth Graunke	71846a943f	i965: Retype pre-Gen6 varying pull load destination to UW. This sets up the proper execution mask for sends in SIMD16 mode. Fixes Piglit's glsl-fs-normalmatrix, glsl-fs-uniform-array-2, glsl-fs-uniform-array-6, and glsl-fs-uniform-array-7 on Ironlake, which regressed when I enabled SIMD16 pull parameter support in commit `b207e88b25`. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-17 10:54:00 -07:00
Anuj Phogat	ee10e893cb	mesa: Fix error condition for multisample proxy texture targets Fixes failures in Khronos OpenGL CTS test proxy_textures_invalid_samples Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-17 10:26:39 -07:00
Anuj Phogat	1d350b9e22	i965: Add glBlitFramebuffer to commands affected by conditional rendering Fixes failures in Khronos OpenGL CTS test conditional_render_test9 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-17 10:26:39 -07:00
Anuj Phogat	8ed42ddd7d	swrast: Add glBlitFramebuffer to commands affected by conditional rendering Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-17 10:26:05 -07:00
Anuj Phogat	48fc2703e5	i965: Fix component mask and varying_to_slot mapping for gl_ViewportIndex gl_ViewportIndex doesn't get its own varying slot. It is stored in VARYING_SLOT_PSIZ.z. This patch fixes the issue for both gen7 and gen8 because gen7_upload_3dstate_so_decl_list() is shared between them. Fixes failures in OpenGL Khronos CTS test transform_feedback_builtins. Makes new piglit test glsl-1.50-transform-feedback-builtins pass for 'gl_ViewportIndex'. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-17 10:08:28 -07:00
Anuj Phogat	7928b9c249	i965: Fix component mask and varying_to_slot mapping for gl_Layer gl_Layer doesn't get its own varying slot. It is stored in VARYING_SLOT_PSIZ.y. This patch fixes the issue for both gen7 and gen8 because gen7_upload_3dstate_so_decl_list() is shared between them. Fixes failures in OpenGL Khronos CTS test transform_feedback_builtins. Makes new piglit test glsl-1.50-transform-feedback-builtins pass for 'gl_Layer'. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-17 10:08:28 -07:00
Anuj Phogat	969b461c2b	i965: Put an assertion to check valid varying_to_slot[varying] Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-17 10:08:28 -07:00
Darren Powell	bc86690f13	radeonsi: Added Diag Handler to receive LLVM Error messages Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-04-17 19:37:58 -04:00
Marek Olšák	9f9ab8ec0d	winsys/radeon: remove some unused code Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-04-17 13:54:19 +02:00
Marek Olšák	8b966bcaf2	winsys/radeon: remove is_handle_added array Use index -1 if a buffer is not added. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-04-17 13:54:19 +02:00
Marek Olšák	b0fca0a378	winsys/radeon: remove local variable reloc from radeon_get_reloc Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-04-17 13:54:18 +02:00
Marek Olšák	3384a41aa9	winsys/radeon: remove parameter reloc from radeon_get_reloc Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-04-17 13:54:18 +02:00
José Fonseca	75e487538d	util: Add __declspec(noreturn) to _debug_assert_fail(). Mostly for consistency, as MSVC's static source code analysis doesn't seem to rely on assertions, but instead on a different kind of source annotation (http://msdn.microsoft.com/en-us/library/hh916383.aspx). Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-17 09:56:49 +01:00
José Fonseca	a2b89c4ae1	auxiliary/os,auxiliary/util: Fix the `‘noreturn’ function does return` warning. Now that _debug_assert_fail() has the noreturn attribute, it is better that execution truly never returns. Not just for sake of silencing the warning, but because the code at the return IP address may be invalid or lead to inconsistent results. This removes support for the GALLIUM_ABORT_ON_ASSERT debugging environment variable, but between the usefulness of GALLIUM_ABORT_ON_ASSERT and better static code analysis I think better static code analysis wins. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-17 09:56:48 +01:00
José Fonseca	97fa9cd220	scons: Enable building through Clang Static Analyzer. Same intent as commit `a45a50a482`, but this time the C compiler is detected via C-preprocessor macros, similar to how autotools do it, as that seems to be the most reliable method. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-17 09:56:48 +01:00
Maarten Lankhorst	74f19445cc	gallium glsl: Fix crash with piglit fs-deref-literal-array-of-structs.shader_test This allows the following shader code to work without a weird crash: struct Foo { int value[1]; }; int actual_value = Foo[2](Foo(int[1](100)), Foo(int[1](200)))[i].value[0]; Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2014-04-17 10:34:10 +02:00
Maarten Lankhorst	49d26a277d	nouveau/vdec: small fixes to h264 handling nouveau_vp3_inter_sizes requires slice_count as an argument, just as the other places that call it from h264 code do. Hopefully fixes something. Fix the status_vp code to allow status == 0 too, when processing hasn't started yet. Set h264->second_field correctly.	2014-04-17 10:30:39 +02:00
Thomas Hellstrom	09cd376353	st/xa: Cache render target surface Otherwise it will trick the gallium driver into thinking that the render target has actually changed (due to different pipe_surface pointing to same underlying pipe_resource). This is really badness for tiling GPUs like adreno. This also appears to fix a rendering error with Motif on vmwgfx. Why that is is still under investigation. Based on an idea by Rob Clark. Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-04-17 09:56:28 +02:00
Rob Clark	a45ae814d1	st/xa: scissor to help tilers Keep track of the maximal bounds of all the operations and set scissor accordingly. For tiling GPU's this can be a big win by reducing the memory bandwidth spent moving pixels from system memory to tile buffer and back. You could imagine being more sophisticated and splitting up disjoint operations. But this simplistic approach is good enough for the common cases. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2014-04-17 09:42:06 +02:00
Rob Clark	3c52013273	st/xa: remove unneeded args Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2014-04-17 09:40:42 +02:00
Iago Toral Quiroga	cda5e0c25e	glsl: Small optimization for constant conditionals Once the relevant branch has been identified do not iterate over the instructions in the branch, do a linked list insertion instead to avoid the loop. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-16 23:39:57 -07:00
Iago Toral Quiroga	4472ab9e6d	glsl: Fix incorrect indentation. Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-16 23:22:24 -07:00
Chris Forbes	d1b6f67110	meta: Clip src/dest rects in BlitFramebuffer, using the scissor Fixes piglit's fbo-blit-stretch test on drivers which use the meta path. (i965: should fix Broadwell, but also fixes Sandybridge/Ivybridge/Haswell since this test falls off the blorp path now due to format conversion) V2: Use scissor instead of just mangling the rects, to avoid texcoord rounding problems. (Thanks Marek) V3: Rebase on Eric's CTSI meta changes; re-add _mesa_update_state in the CTSI path so that _mesa_clip_blit sees the correct bounds. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77414 Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-04-17 18:11:24 +12:00
Samuel Iglesias Gonsalvez	9927180714	mesa: fix check for dummy renderbuffer in _mesa_FramebufferRenderbufferEXT() According to the spec: <renderbuffertarget> must be RENDERBUFFER and <renderbuffer> should be set to the name of the renderbuffer object to be attached to the framebuffer. <renderbuffer> must be either zero or the name of an existing renderbuffer object of type <renderbuffertarget>, otherwise an INVALID_OPERATION error is generated. This patch changes the previous returned GL_INVALID_VALUE to GL_INVALID_OPERATION. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76894 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>	2014-04-16 23:00:40 -07:00
Matt Turner	42a26cb5e4	i965: Don't make instructions with a null dest a barrier to scheduling. Now that we properly track accumulator dependencies, the scheduler is able to schedule instructions between the mach and mov in the common integer multiplication pattern: mul acc0, x, y mach null, x, y mov dest, acc0 Since a null destination implies no dependency on the destination, we can also safely schedule instructions (that don't write the accumulator) between the mul and mach. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-16 22:46:45 -07:00
Juha-Pekka Heikkila	a6860100b8	i965/fs: Change fs_visitor::emit_lrp to use MAC for gen<6 This allows us to emit ADD/MUL/MAC instead of MUL/ADD/MUL/ADD, saving one instruction and two temporary registers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-04-16 22:46:45 -07:00
Juha-Pekka Heikkila	da0c3b02e7	i965/fs: Add support for the MAC instruction. This allows us to generate the MAC (multiply-accumulate) instruction, which can be used to implement some expressions in fewer instructions than doing a series of MUL and ADDs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-04-16 22:46:45 -07:00
Juha-Pekka Heikkila	2dfbbeca50	i965/vec4: Change vec4_visitor::emit_lrp to use MAC for gen<6 This allows us to emit ADD/MUL/MAC instead of MUL/ADD/MUL/ADD, saving one instruction and two temporary registers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-04-16 22:46:45 -07:00
Juha-Pekka Heikkila	0974706671	i965/vec4: Add support for the MAC instruction. This allows us to generate the MAC (multiply-accumulate) instruction, which can be used to implement some expressions in fewer instructions than doing a series of MUL and ADDs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-04-16 22:46:45 -07:00
Juha-Pekka Heikkila	306ed81b93	i965: Add writes_accumulator flag Our hardware has an "accumulator" register, which can be used to store intermediate results across multiple instructions. Many instructions can implicitly write a value to the accumulator in addition to their normal destination register. This is enabled by the "AccWrEn" flag. This patch introduces a new flag, inst->writes_accumulator, which allows us to express the AccWrEn notion in the IR. It also creates an ALU2_ACC macro to easily define emitters for instructions that implicitly write the accumulator. Previously, we only supported implicit accumulator writes from the ADDC, SUBB, and MACH instructions. We always enabled them on those instructions, and left them disabled for other instructions. To take advantage of the MAC (multiply-accumulate) instruction, we need to be able to set AccWrEn on other types of instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-04-16 22:46:45 -07:00
Juha-Pekka Heikkila	30c35d1dcb	i965: Add is_accumulator() function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-04-16 22:46:45 -07:00
Matt Turner	6541f1b4d0	i965: Add reads_accumulator_implicitly() function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-16 22:46:44 -07:00
Anuj Phogat	cb6566f9df	mesa: Add error condition for integer formats in glGetTexImage() OpenGL 4.0 spec, page 306 suggests an INVALID_OPERATION in glGetTexImage if : "format is one of the integer formats in table 3.3 and the internal format of the texture image is not integer, or format is not one of the integer formats in table 3.3 and the internal format is integer." V2: Use helper function _mesa_is_format_integer() Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-16 18:37:06 -07:00
Anuj Phogat	3135668254	mesa: Add helper function _mesa_is_format_integer() This function will be used in the following patch. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-16 18:37:06 -07:00
Anuj Phogat	fdd8bebc22	mesa: Fix glGetVertexAttribi(GL_VERTEX_ATTRIB_ARRAY_SIZE) mesa currently returns 4 when GL_VERTEX_ATTRIB_ARRAY_SIZE is queried for a vertex array initially set up with size=GL_BGRA. This patch makes changes to return size=GL_BGRA as required by the spec. Fixes Khronos OpenGL CTS test: vertex_array_bgra_basic.test V2: Use array->Format instead of adding a new variable Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: <mesa-stable@lists.freedesktop.org>	2014-04-16 18:37:06 -07:00
Anuj Phogat	80b4a36fed	glsl: Fix copy-paste error in linker_warning() Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-16 18:37:06 -07:00
Michel Dänzer	7286739b9b	r600g: Disable LLVM by default at runtime for graphics For graphics, the LLVM compiler backend currently has many shortcomings compared to the non-LLVM one. E.g. it can't handle geometry shaders yet, but that's just the tip of the iceberg. So building Mesa with --enable-r600-llvm-compiler is currently not recommended for anyone who doesn't want to work on fixing those issues. However, for protection of users who end up enabling it anyway for some reason, let's disable the LLVM backend at runtime by default. It can be enabled with the environment variable R600_DEBUG=llvm. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-04-17 10:15:59 +09:00
Roland Scheidegger	f23d1160c2	gallivm: fix compilation with llvm 3.5 r206241+ Just adjust to the ever-changing API, pass in MCContext when creating the MCDisassembler. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-04-16 19:57:47 +02:00
José Fonseca	e3c58cdfd9	Revert "scons: Enable building through Clang Static Analyzer." This reverts commit `a45a50a482`. Unfortunately gcc dumps argv[0] as the first word of --version, so it is unreliable for detecting gcc. In particular `cc --version` and `i686-w64-mingw32-gcc --version` give wrong results. A better solution needs to be found -- most likely using C-preprocessing like autotools does. Revert for now.	2014-04-16 13:18:06 +01:00
Marek Olšák	11459436d9	r600g,radeonsi: share some of gfx flush code Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:52 +02:00
Marek Olšák	adfadeadd8	r600g,radeonsi: share r600_flush_from_st Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:52 +02:00
Marek Olšák	586011486d	r600g: merge r600_flush with r600_context_flush Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
Marek Olšák	d4edc60767	radeonsi: merge si_flush with si_context_flush This also removes si_flush_gfx_ring. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
Marek Olšák	70cf6639c3	gallium/radeon: create and return a fence in the flush function All flush functions get a fence parameter. cs_create_fence is removed. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
Marek Olšák	3e9d2cbca2	r600g: remove redundant r600_flush_dma_from_winsys Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
Marek Olšák	dd72c327e9	winsys/radeon: fold cs_set_flush_callback into cs_create Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
Marek Olšák	c6033a6cb8	radeonsi: cleanup redundant computation of flush flags and rename a function Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
Marek Olšák	fc151b08be	r600g: remove redundant r600_flush_from_winsys Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
Marek Olšák	b2238b3452	winsys/radeon: remove cs_write_reloc, add simpler cs_get_reloc The only difference is that it doesn't write to the CS and only returns the index. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
Marek Olšák	927213f33d	winsys/radeon: consolidate hash table lookup I should have done this long ago. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-16 14:02:51 +02:00
José Fonseca	d3c0e236f2	scons: Add an analyze option. For Clang static code analyzer, the scan-build script will produce more comprehensive output. Nevertheless you can invoke it as CC=clang CXX=clang++ scons analyze=1 For MSVC this is the best way to use its static code analysis. Simply invoke as scons analyze=1 Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-16 11:44:21 +01:00
José Fonseca	f81305c0cb	util/u_debug: Add noreturn attribute to _debug_assert_fail(). As recommended by http://clang-analyzer.llvm.org/annotations.html#attr_noreturn Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-16 11:44:17 +01:00
José Fonseca	a45a50a482	scons: Enable building through Clang Static Analyzer. By accurately detecting gcc/clang through --version option instead of executable name. Clang Static Analyzer reports many issues, most false positives, but it found at least one real and subtle use-after-free issue in st_texture_get_sampler_view(): http://people.freedesktop.org/~jrfonseca/scan-build-2014-04-14-1/report-869047.html#EndPath Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-16 11:44:06 +01:00
Iago Toral Quiroga	6d0e30c6a3	glsl: Properly handle blocks that define the same field name. Currently we can have name space collisions between blocks that define the same fields. For example: in block { vec4 Color; } In[]; out block { vec4 Color; } Out; These two blocks will assign the same interface name (block.Color) to the Color field in flatten_named_interface_blocks_declarations.cpp, leading to havoc. This was badly breaking the gl-320-primitive-shading test from ogl-samples. The patch uses the block instance name to avoid collisions, producing names like block.In.Color and block.Out.Color. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76394 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 22:18:43 -07:00
Michel Dänzer	6ac5a5e383	r600g/radeonsi: Map transfer staging texture unsynchronized when possible The transfer staging texture is always freshly allocated, so for write-only transfers we don't need to explicitly wait for the BO to become idle. Squeezes a few hundred MB/s more out of x11perf -shmput500 with glamor. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-16 12:11:27 +09:00
Matt Turner	9fed627234	Revert "i965/fs: Only sweep NOPs if register coalescing made progress." This reverts commit `f092e8951c`. Didn't mean to push this...	2014-04-15 17:27:55 -07:00
Matt Turner	f092e8951c	i965/fs: Only sweep NOPs if register coalescing made progress. Otherwise there's nothing to do.	2014-04-15 16:28:04 -07:00
Eric Anholt	7ae870211d	i965: Fix buffer overruns in MSAA MCS buffer clearing. This manifested as rendering failures or sometimes GPU hangs in compositors when they accidentally got MSAA visuals due to a bug in the X Server. Today we decided that the problem in compositors was equivalent to a corruption bug we'd noticed recently in resizing MSAA-visual glxgears, and debugging got a lot easier. When we allocate our MCS MT, libdrm takes the size we request, aligns it to Y tile size (blowing it up from 300x300=90000 bytes to 384*320=122880 bytes, 30 pages), then puts it into a power-of-two-sized BO (131072 bytes, 32 pages). Because it's Y tiled, we attach a 384-byte-stride fence to it. When we memset by the BO size in Mesa, between bytes 122880 and 131072 the data gets stored to the first 20 or so scanlines of each of the 3 tiled pages in that row, even though only 2 of those pages were allocated by libdrm. In the glxgears case, the missing 3rd page happened to consistently be the static VBO that got mapped right after the first MCS allocation, so corruption only appeared once window resize made us throw out the old MCS and then allocate the same BO to back the new MCS. Instead, just memset the amount of data we actually asked libdrm to allocate for, which will be smaller (more efficient) and not overrun. Thanks go to Kenneth for doing most of the hard debugging to eliminate a lot of the search space for the bug. Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77207 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 14:34:47 -07:00
Eric Anholt	e5b86cb64b	meta: Add support for MSAA resolves from 2D_MS_ARRAY textures. We don't have any piglit tests for this currently. v2: Use vec3s for the texcoords so it has some hope of working. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 14:34:45 -07:00
Eric Anholt	234db60954	meta: Add an accelerated glCopyTexSubImage using glBlitFramebuffer. You'll note from the previous commits that there's something of a loop here: You call CTSI, which calls BlitFB, then if things go wrong that falls back to CTSI. As a result, meta CTSI reaches over into blitfb to tell it "no, don't try that fallback". v2: Drop the _mesa_update_state(), which was only necessary due to use of _mesa_clip_blit() in _mesa_meta_BlitFramebuffer() in another patch series. v3: Drop an _EXT suffix I copy-and-pasted. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v2) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 14:34:22 -07:00
Eric Anholt	70961c032f	meta: Add support for CUBE_MAP_ARRAY to generatemipmap. I added support to bind_fbo_image in the process of building meta CopyTexSubImage, and found that it broke generatemipmap because previously we would just throw a GL error there and then end up with an incomplete FBO and fallback. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 14:34:22 -07:00
Eric Anholt	bb3f983d10	meta: Infer bind_fbo_image parameters from an incoming image. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 14:34:22 -07:00
Eric Anholt	cd808ac848	meta: Move bind_fbo_image() code back to meta.c, to reuse it elsewhere. I need to do the same code again for CopyTexSubImage(). v2: Drop incorrect, not-terribly-useful comment (review by Ken) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 14:34:22 -07:00
Eric Anholt	4cc42805e7	meta: Refactor the BlitFramebuffer depth CopyTexImage fallback. This avoids a ReadPixels() if there's accelerated CopyTexImage present. It now requires GLSL as opposed to just fragment programs, but we don't have any drivers that do ARB_fp but not GLSL. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 14:34:22 -07:00
Eric Anholt	b702233f53	meta: Refactor the BlitFramebuffer color CopyTexImage fallback. There shouldn't be anything special about copying out a subset of the src rb to a temp before texturing from it, so just do it when we're figuring out our src texture binding. This drops Anuj's change to copy an extra border of 1 pixel around the src area. I can't see how that change could be valid, and presumably if there's some filtering problem at edges we just need to set the right wrap mode. v2: Don't fall back to swrast on non-2D/RECT/2D_MS textures when we can still CopyTexSubImage. Fixes a segfault regression on i965 with gl-3.2-layered-rendering-blit. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) Tested-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-04-15 14:34:06 -07:00
Eric Anholt	4e43299633	meta: Drop blit src size fallback. I think we can assert that renderbuffer size is <= maximum 2D texture size. Our source coordinates should have already been clipped to the src renderbuffer size, but haven't actually (so we could potentially have trouble if there's scaling, and we're in the CopyTexImage path that tries to use src size). However, this texture size dependency was blocking the next refactors, so I'm not sure if we want to go ahead with this series before we get the clipping sorted out or not. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 12:27:37 -07:00
Mike Stroyan	602510395a	i965: Avoid dependency hints on math opcodes Putting NoDDClr and NoDDChk dependency control on instruction sequences that include math opcodes can cause corruption of channels. Treat math opcodes like send opcodes and suppress dependency hinting. Signed-off-by: Mike Stroyan <mike@LunarG.com> Tested-by: Tony Bertapelli <anthony.p.bertapelli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-15 10:31:46 -07:00
Matt Turner	ad48a9a319	i965: Expand INTEL_DEBUG to uint64_t. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 10:29:00 -07:00
Matt Turner	58db339599	dri: Expand driParseDebugString return value to uint64_t. Users will downcast if they don't have >32 debug flags. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-15 10:28:57 -07:00
Matt Turner	73400d8f70	i965/fs: Remove dead_code_eliminate_local(). Subsumed by the new dead_code_eliminate() function. No shader-db changes. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-15 09:25:47 -07:00
Matt Turner	18d12336b9	i965/fs: Clear variable from live-set if it's completely overwritten. One program affected: instructions in affected programs: 246 -> 244 (-0.81%) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-15 09:25:44 -07:00
Matt Turner	f34f39330b	i965/fs: Reimplement dead_code_elimination(). total instructions in shared programs: 1653399 -> 1651790 (-0.10%) instructions in affected programs: 92157 -> 90548 (-1.75%) GAINED: 2 LOST: 2 Also significantly reduces the number of optimization loop iterations: total loop iterations in shared programs: 39724 -> 31651 (-20.32%) loop iterations in affected programs: 21617 -> 13544 (-37.35%) Including some great pathological cases, like 29 -> 3 in Strike Suit Zero and 24 -> 3 in Dota2. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-15 09:25:11 -07:00
Matt Turner	596737ee91	i965/vec4: Let DCE eliminate dead writes in other basic blocks. We previously stopped searching for unread writes after encountering control flow, but we can instead just search backwards until we hit control flow. instructions in affected programs: 22854 -> 22194 (-2.89%)	2014-04-15 09:24:09 -07:00
Matt Turner	4dcfb92417	i965/gs: Add dummy source to prepare_channel_masks instruction. The generator uses its destination as a source implicitly, which breaks some assumptions in dead code elimination. Giving the instruction a source allows us to reason about it better.	2014-04-15 09:24:09 -07:00
Matt Turner	d877c643be	glsl: Use M_PI_* macros. Notice our multiple values for M_PI_2, which rounded ...32 up to ...4 and ...5.	2014-04-15 09:24:09 -07:00
Kenneth Graunke	4f20b7d3dd	i965: Disable Z16 in all APIs. We originally thought that GL 3.0 required GL_DEPTH_COMPONENT16 to map exactly to Z16. However, we misread the specification, thanks in part to LaTeX reordering the tables in the PDF. Page 180 of the GL 3.0 specification (glspec30.20080923.pdf) says: "[...] memory allocation per texture component is assigned by the GL to match the allocations listed in tables 3.16-3.18 as closely as possible. [...] Required Texture Formats [...] In addition, implementations are required to support the following sized internal formats. Requesting one of these internal formats for any texture type will allocate exactly the internal component sizes and types shown for that format in tables 3.16-3.17:" Notably, however, GL_DEPTH_COMPONENT16 does /not/ appear in table 3.16 or table 3.17. It appears in table 3.18, where the "exact" rule doesn't apply, and it falls back to the "closely as possible" rule. The confusing part is that the ordering of the tables in the PDF is: Table 3.16 (pages 182-184) Table 3.18 (bottom of page 184 to top of 185) Table 3.17 (page 185) Presumably, people saw table 3.16, then saw the table immediately following with DEPTH_COMPONENT* formats, and assumed it was 3.17. Based on a patch by Chia-I Wu, but without the driconf option to force Z16 to be used. It's not required, and there's apparently no benefit to actually using it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chia-I Wu <olv@lunarg.com>	2014-04-15 02:15:11 -07:00
Kenneth Graunke	be000b4d19	i965: Update comments about Z16 being slow. We've learned a few things since we originally disabled Z16; this attempts to summarize the issue. I am no expert on this subject, though, so the comment may not be totally accurate. I did some benchmarking on GM45 and Ironlake, and discovered that for GLBenchmark 2.7 EgyptHD, using Z16 was 3% slower on GM45 (n=15), and 4.5% slower on Ironlake (n=95). So, we can drop the "on Ivybridge" aspect of the comment - it's always slower. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chia-I Wu <olv@lunarg.com>	2014-04-15 02:15:11 -07:00
Michel Dänzer	313104e8d5	r600g/radeonsi: Use caching buffer manager for textures as well Significantly reduces BO allocation / destruction overhead for transfers, e.g. measurable via x11perf -shm{ge,pu}t* with glamor. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-15 11:34:56 +09:00
Jordan Justen	24c773fb06	i965/gen8: add debug code to show FS disasm with jump locations Copied from similar code in gen8_vec4_generator.cpp. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-14 10:32:42 -07:00
Chia-I Wu	73a4761058	ilo: remove GPE state size estimation Use size defines from genhw.	2014-04-14 20:45:04 +08:00
Chia-I Wu	8fa8e9b1b8	ilo: remove GPE command size estimation Use size defines from genhw.	2014-04-14 20:45:04 +08:00
Chia-I Wu	bdd0546d7c	ilo: remove unused headers Remove intel_*.h. brw_*.h is still needed by the state dumper and disassembler.	2014-04-14 20:45:04 +08:00
Chia-I Wu	e55e1610e5	ilo: use only defines from genhw headers Stop including classic driver headers in genhw.h, with some formatting fixes.	2014-04-14 20:45:04 +08:00
Chia-I Wu	6c6bd796ad	ilo: scripted conversion to genhw headers Hopefully my four hundred line sed script is correct.	2014-04-14 20:45:04 +08:00
Chia-I Wu	01e3e82a56	ilo: add genhw headers All except genhw.h are generated by https://github.com/olvaffe/envytools/. intel_chipset.h is deprecated.	2014-04-14 20:45:03 +08:00
Chia-I Wu	d75a8799fd	ilo: avoid brw_wm_barycentric_interp_mode in compiler In preparation for genhw.	2014-04-14 20:45:03 +08:00
Chia-I Wu	ad39b991ce	ilo: add TOY_OPCODE_DO We used to give BRW_OPCODE_DO a special meaning, while we should have used TOY_OPCODE_DO.	2014-04-14 20:45:03 +08:00
Vinson Lee	36fb36aa36	gtest: Update to 1.7.0. This patch fixes gtest build errors on Mac OS X 10.9. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73106 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Tested-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-14 00:06:53 -07:00
Chris Forbes	936dda08ee	mesa: Consider gl_VertexID and gl_InstanceID active attribs Fixes piglit's spec/gl-3.2/get-active-attrib-returns-all-inputs. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-13 19:27:01 +12:00
Chris Forbes	ca5c8d6cd4	mesa: Extract is_active_attrib() in shaderapi The rules are about to get a bit more complex to account for gl_InstanceID and gl_VertexID, which are system values. Extracting this first avoids introducing duplication. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-13 19:26:56 +12:00
Chris Forbes	aeb03f8aea	glsl: Fix typo in interface block comment Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-04-13 17:02:11 +12:00
Simone Scanzoni	c3b701d63c	egl-static: fix build after recent radeon winsys changes Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-04-13 02:37:36 +02:00
Chris Forbes	b92e7f2da9	mesa: Fix typo in error message Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-04-13 12:38:24 +12:00
Iago Toral Quiroga	a5957f7bc5	i965: glClearBuffer() should only clear a single buffer. glClearBuffer() is currently clearing all active draw color buffers (all buffers that have not been set to GL_NONE when calling glDrawBuffers) instead of only clearing the one it receives as parameter. Although brw_clear() receives a bit mask indicating the color buffers that should be cleared, this mask is ignored when calling brw_blorp_clear_color(). This was breaking the 'fbo-drawbuffers-none glClearBuffer' piglit test. The patch provides the bit mask to brw_blorp_clear_color() so it can limit clearing to the color buffers present in the mask. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76832 Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-13 12:28:25 +12:00
Chris Forbes	26224d3e00	i965: Add comment to explain the weird-looking shadow compares. This always looks crazy when I stumble across it, until I remember what the hardware is doing. Describing it ought to short-circuit that process next time :) V2: Fix indents to 6 spaces, not 7. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-13 08:51:46 +12:00
Kenneth Graunke	857f3a68ea	glsl: Ignore loop-too-large heuristic if there's bad variable indexing. Many shaders use a pattern such as: for (int i = 0; i < NUM_LIGHTS; i++) { ...access a uniform array, or shader input/output array... } where NUM_LIGHTS is a small constant (such as 2, 4, or 8). The expectation is that the compiler will unroll those loops, turning the array access into constant indexing, which is more efficient, and which may enable array splitting and other optimizations. In many cases, our heuristic fails - either there's another tiny nested loop inside, or the estimated number of instructions is just barely beyond the threshold. So, we fail to unroll the loop, leaving the variable indexing in place. Drivers which don't support the particular flavor of variable indexing will call lower_variable_index_to_cond_assign(), which generates piles and piles of immensely inefficient code. We'd like to avoid generating that. This patch detects unsupported forms of variable-indexing in loops, where the array index is a loop induction variable. In that case, it bypasses the loop-too-large heuristic and forces unrolling. Improves performance in various microbenchmarks: Gl32PSBump8 by 47%, Gl32ShMapVsm by 80%, and Gl32ShMapPcf by 27%. No changes in shader-db. v2: Check ir->array for being an array or matrix, rather than the ir_dereference_array itself. v3: Fix and expand statistics in commit message. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-11 17:41:43 -07:00
Kenneth Graunke	2231db5598	glsl: Rename loop_unroll_count::fail to "nested_loop." The "fail" flag is set if loop_unroll_count encounters a nested loop; calling the flag "nested_loop" is a bit clearer. The original reasoning was that count is inaccurate (too small) if there are nested loops, as we don't do any sort of analysis on the inner loop. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-11 17:41:41 -07:00
Kenneth Graunke	8268a2f347	glsl: Pass gl_shader_compiler_optimizations to unroll_loops(). Loop unrolling will need to know a few more options in the future. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-11 17:41:39 -07:00
Kenneth Graunke	da22221aa3	glsl: Drop do_common_optimization's max_unroll_iterations parameter. Now that we pass in gl_shader_compiler_options, it makes sense to just use options->MaxUnrollIterations, rather than passing a separate parameter. Half of the invocations already passed options->MaxUnrollIterations, while the other half passed in a hardcoded value of 32. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-11 17:41:37 -07:00
Kenneth Graunke	f00a6483e9	i965: Use EmitNoIndirect flags in lower_variable_index_to_cond_assign. This will prevent the two from getting out of sync again. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-11 17:41:36 -07:00
Kenneth Graunke	320e0c5205	i965: Correct EmitNoIndirect shader compiler option flags. These were out of sync with the flags used to control lower_variable_index_to_cond_assign in brw_shader.cpp. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-11 17:41:25 -07:00
Matt Turner	509b2a6523	i965/fs: Reset reg_from when we can't coalesce. Not setting this would prevent coalescing after a failed attempt if the sources for both MOVs were the same. total instructions in shared programs: 1654531 -> 1650224 (-0.26%) instructions in affected programs: 423167 -> 418860 (-1.02%) GAINED: 2 LOST: 0 Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-11 15:27:46 -07:00
Eric Anholt	7e034a8d77	i965: Fill in a bunch of gen7/hsw data cache-related disasm. This gets us disasm of atomic ops. v2: Fix fallthrough on pre-gen7. (bug caught by Ilia Mirkin). Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-11 13:38:53 -07:00
Eric Anholt	99442bc7b2	i965: Stop setting up a 1:1 "attrib" member in our vertex inputs. It's just the array index, so we can just go look at the array and see which element we are. No significant performance difference (n=140) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-11 13:38:53 -07:00
Eric Anholt	9a5d19d680	i965: Skip a bunch of IB BO refcount twiddling. Improves cairo performance on glamor by 1.64828% +/- 1.04742% (n=65). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-11 13:38:52 -07:00
Eric Anholt	3f9440cfbb	i965/gen7: Skip repeated NULL depth/stencil state emits. Improves cairo performance on glamor by 2.87752% +/- 0.966977 (n=57). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-11 13:38:52 -07:00
Chris Forbes	fe4f373eb4	docs: Fix ubo indexing description Ian points out that this being unrestricted was an oversight in the spec, and is corrected in GLSL4.40. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-04-12 08:31:05 +12:00
Brian Paul	e5f306e3ff	draw: remove unused 'start' variable in draw_stats_clipper_primitives() It was computed, but never actually used. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-11 13:54:17 -06:00
Kenneth Graunke	ae2a03b573	glsl: Try vectorizing when seeing a repeated assignment to a channel. When considering assignment expressions like: v.x += u.x; v.x += u.x; the vectorizer would incorrectly keep going, attempting to find more instructions to vectorize. It would overwrite the saved assignment to point at the second one, and increment channels a second time, resulting in try_vectorize thinking the expression was a vec2 instead of a float. Instead, if we see a repeated assignment to a channel, just try to vectorize everything we've found so far. This clears the saved state so it will start over. Fixes Piglit's repeated-channel-assignments.vert. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-11 12:39:37 -07:00
Ian Romanick	625cf8c874	glsl: Propagate explicit binding information from the AST all the way to the linker Information about the binding was not being properly communicated from the front-end compiler to the linker. As a result, the linker never knew that any UBOs had explicit bindings! Fixes the piglit test arb_shading_language_420pack-binding-layout. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: github@socker.lepus.uberspace.de [v0] Cc: "10.1" <mesa-stable@lists.freedesktop.org> Cc: github@socker.lepus.uberspace.de	2014-04-11 12:26:01 -07:00
Ian Romanick	25a6656875	linker: Set binding for all elements of UBO array Previously, a UBO like layout(binding=2) uniform U { ... } my_constants[4]; wouldn't get any bindings set. The code would try to set the binding of U, but that would fail. It should instead set the bindings for U[0], U[1], ... Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Cc: github@socker.lepus.uberspace.de	2014-04-11 12:26:01 -07:00
Ian Romanick	cc42717b50	linker: Set block bindings based on UniformBlocks rather than UniformStorage For blocks, gl_shader_program::UniformStorage isn't very useful. The names stored there are the names of the elements of the block, so finding blocks with an instance name is hard. There is also only one entry in ::UniformStorage for each element of a block array, and that is a deal breaker. Using ::UniformBlocks is what _mesa_GetUniformBlockIndex does. I contemplated sharing code between set_block_binding and _mesa_GetUniformBlockIndex, but building the stand-alone compiler and the unit tests make this hard. I plan to return to this effort shortly. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Cc: github@socker.lepus.uberspace.de	2014-04-11 12:26:01 -07:00
Ian Romanick	157391a41b	linker: Clean up "unused parameter" warnings ../../src/glsl/link_uniform_initializers.cpp:87:1: warning: unused parameter 'mem_ctx' [-Wunused-parameter] ../../src/glsl/link_uniform_initializers.cpp:87:1: warning: unused parameter 'type' [-Wunused-parameter] ../../src/glsl/link_uniform_initializers.cpp:127:1: warning: unused parameter 'mem_ctx' [-Wunused-parameter] ../../src/glsl/link_uniform_initializers.cpp:127:1: warning: unused parameter 'type' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Cc: github@socker.lepus.uberspace.de	2014-04-11 12:26:01 -07:00
Ian Romanick	943b2d52bf	linker: Fold set_uniform_binding into call site In the next patch, we'll see that using gl_shader_program::UniformStorage is not correct for uniform blocks. That means we can't use ::UniformStorage to select between the sampler path and the block path. Instead we want to just use the type of the variable. That's never passed to set_uniform_binding, and it's easier to just remove the function (especially for later patches in the series) than to add another parameter. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Cc: github@socker.lepus.uberspace.de	2014-04-11 12:26:01 -07:00
Ian Romanick	881c52f13f	linker: Various trivial clean-ups in set_sampler_binding - Remove the spurious block left from the previous commit and re-indent. - Constify elements. - Make the spec reference in the code look like other spec references in the compiler. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Cc: github@socker.lepus.uberspace.de	2014-04-11 12:26:01 -07:00
Ian Romanick	6e2f63b69e	linker: Split set_uniform_binding into separate functions for blocks and samplers The two code paths are quite different, and there are some problems in the handling of uniform blocks. Future changes will cause these paths to diverge further. Ultimately, selecting between the two functions will happen at the set_uniform_binding call site, and set_uniform_binding will be deleted. NOTE: This patch just moves code around. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Cc: github@socker.lepus.uberspace.de	2014-04-11 12:26:01 -07:00
Heinrich Janzing	c8e7568f97	softpipe: fix shadow sampling And remove nonsensical approximation of linear interpolation behavior for shadow samplers. Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Brian Paul <brianp@vmware.com>	2014-04-11 11:47:32 -06:00
Brian Paul	86b8843e9c	softpipe: add PIPE_CAP_MIN/MAX_TEXTURE_GATHER_OFFSET query cases To silence compiler warnings. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-11 11:47:31 -06:00
Brian Paul	f61edd509b	mesa: use _mesa_get_srgb_format_linear() in sRGB texstore functions Instead of switch statements. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-11 11:47:31 -06:00
Brian Paul	c5631b341e	swrast: use macros to initialize texfetch_funcs[] table Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-11 11:47:31 -06:00
Brian Paul	4da1efb370	swrast: fix more fetch_texel function names These were missed/typo'd in the previous patch series: s/R8G8B8A/R8G8B8A8/ s/rgba_16/RGBA_UNORM16/ s/rgba_uint/RGBA_UINT/ s/rgba_int/RGBA_SINT/ Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-11 11:47:31 -06:00
José Fonseca	9d36a8d4d2	egl-static: Fix missing radeon_surface.h includes. Fixes fatal error: radeon_surface.h: No such file or directory when libdrm is not present, or non-Linux OSes. Trivial.	2014-04-11 16:46:02 +01:00
Knut Andre Tidemann	5ac3435a47	gallium/radeon: fix missing winsys include in pipe-loader. The commit `3b0b44f7de` introduced a build error: error: dereferencing pointer to incomplete type This patch fixes this issue in all the affected files. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-04-11 19:22:17 -04:00
Christian König	68bba1801e	st/omx/enc: separate input buffer private and task structure Keep tasks as linked list, this way we can associate more than one encoding task with each buffer. Signed-off-by: Christian König <christian.koenig@amd.com>	2014-04-11 11:35:03 +02:00
Christian König	7806dbeb70	radeon/vce: implement B-frame support Signed-off-by: Slava Grigorev <slava.grigorev@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2014-04-11 11:35:03 +02:00
Christian König	a56fa0e83b	radeon/vce: add proper CPB backtrack Remember what frames we encoded at which position. Signed-off-by: Christian König <christian.koenig@amd.com>	2014-04-11 11:35:03 +02:00
Christian König	d7d41ce133	vl: add interface for H264 B-frame encoding Signed-off-by: Christian König <christian.koenig@amd.com>	2014-04-11 11:35:03 +02:00
Christian König	ee4439c562	radeon/vce: remove RVCE_NUM_CPB_EXTRA_FRAMES Doesn't seem to be needed any more. Signed-off-by: Christian König <christian.koenig@amd.com>	2014-04-11 11:35:02 +02:00
Chris Forbes	ce57c8e925	docs/relnotes: Fix consistency, add i965 to ARB_buffer_storage. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-04-11 21:20:13 +12:00
Kenneth Graunke	227049098b	i965: Fix missing _NEW_SCISSOR in Broadwell SF_CLIP_VIEWPORT state. The _Xmin/_Xmax/_Ymin/_Ymax values need to be guarded by _NEW_SCISSOR. Fixes Piglit's scissor-many, and rendering in GNOME Shell. Hopefully fixes similar issues with Unity and ChromeOS. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75879 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: James Ausmus <james.ausmus@intel.com> Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>	2014-04-10 23:38:10 -07:00
Ilia Mirkin	31640f4c38	mesa/st: set min/max texture gather offset to driver-reported value It was always getting set to -8/7 unconditionally. Use the driver-reported value instead. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-10 20:42:48 -04:00
Ilia Mirkin	c2f9ad5289	gallium: add a way to query min/max texture gather offsets Defaults to providing the same offsets as MIN/MAX_TEXEL_OFFSET. For nvc0, the offset can be -32/31. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-10 20:42:36 -04:00
Marek Olšák	8291f6d5c5	configure.ac: require libdrm_radeon 2.4.53 We need latest radeon_drm.h.	2014-04-10 21:24:50 +02:00
Marek Olšák	3b0b44f7de	winsys/radeon: fix a race condition in initialization of radeon_winsys::screen Create the screen in the winsys while the mutex is locked. This also results in a nice code cleanup! Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-10 20:50:17 +02:00
Marek Olšák	ac330d4130	winsys/radeon: fix a race condition between winsys_create and winsys_destroy This also hides the reference count from drivers. v2: update the reference count while the mutex is locked in winsys_create Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-10 20:50:17 +02:00
Marek Olšák	7c57b01564	winsys/radeon: fix a race condition between 2 calls to radeon_winsys_create This fixes random crashes of: piglit/glx-multithread-shader-compile. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-10 20:50:17 +02:00
Marek Olšák	b5ebfc33b8	winsys/radeon: remove unused radeon_info variables, move backend_map Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-10 20:50:17 +02:00
Marek Olšák	9b8449ae90	winsys/radeon: unify radeon_bo::flink and radeon_bo::name Both contained the GEM flink name. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-10 20:50:17 +02:00
Marek Olšák	34564c8753	winsys/radeon: remove definitions already present in radeon_drm.h Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-10 20:50:17 +02:00
Marek Olšák	e3e05c6db9	winsys/radeon: handle squared micro tiling from GEM_GET_TILING Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-10 20:50:17 +02:00
Marek Olšák	38858207a1	gallium/u_gen_mipmap: rewrite using pipe->blit (v2) This replaces u_gen_mipmap with an extremely simple implementation based on pipe->blit. st/mesa is also cleaned up. Pros: - less code - correct mipmap generation for NPOT 3D textures (u_blitter uses a better formula) - queries are not affected by mipmap generation if drivers disable them v2: add "first_layer", "last_layer" parameters, drop "face" v2.1: add format v2.2: document the format parameter	2014-04-10 20:50:16 +02:00
Marek Olšák	26c41398cc	st/mesa: properly implement MapTextureImage with multiple mapped slices (v2) This is needed by _mesa_generate_mipmap. This adds an array of pipe_transfers to st_texture_image. Each transfer is for mapping a single layer. v2: allocate the array of transfers on demand	2014-04-10 20:50:16 +02:00
Brian Paul	5206d4bc09	mesa: remove the MALLOC, CALLOC and FREE macros No longer used anywhere. These also caused trouble in the Gallium state tracker code where we include both core Mesa and Gallium util headers (and the macros were defined differently in each world.) Removing these macros should help avoid macro mix-ups in the future. Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-10 07:53:12 -06:00
Brian Paul	7e55050301	xlib: s/FREE/free/ Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-10 07:53:11 -06:00
Brian Paul	3b323c4d40	mesa: s/FREE/free/ in vdpau code Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-10 07:53:11 -06:00
Brian Paul	00f31bdd32	mesa: s/FREE/free/ in _mesa_free_errors_data() Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-10 07:53:10 -06:00
Brian Paul	7fbb8ba499	mesa: use malloc/free instead of MALLOC/FREE in attrib stack code We moved away from MALLOC/FREE in the rest of core Mesa a while ago. Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-10 07:53:05 -06:00
Brian Paul	f9985db0bc	st/mesa: fix sampler_view REALLOC/FREE macro mix-up We were using REALLOC() from u_memory.h but FREE() from imports.h. This mismatch caused us to trash the heap on Windows after we deleted a texture object. This fixes a regression from commit `6c59be7776`. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-04-10 07:53:05 -06:00
Chris Forbes	87502bbcd7	docs: Expand ARB_gpu_shader5 to describe status of individual features This extension is a huge grab-bag of "stuff that's in DX11". Break it apart to make it clear what still needs to be done. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-10 18:52:03 +12:00
Chris Forbes	0d653b948f	docs: Mark off ARB_texture_view and add to release notes for 10.2. V4: Don't claim Gen8 yet. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:42 +12:00
Chris Forbes	2a2f8cd9d2	i965: Enable ARB_texture_view on Gen7 V4: Don't enable this for Gen8 yet -- that still needs to be wired up. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:42 +12:00
Chris Forbes	ea477817d7	i965: Account for view parameters in blit CTSI path Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	01d6a2ad16	i965: Account for MinLayer/MinLevel in blorp CTSI path Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	058f353a15	i965: Account for view parameters in fast depth clears V2: - No need for layer_multiplier; multisampled depth surfaces are IMS. - Remove unused num_layers. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	540d53d9b0	i965/blorp: Account for nonzero MinLayer in layered clears. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	d581247569	i965/blorp: Use irb->layer_count in clear Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	98328e4c19	i965: Add layer_count to intel_renderbuffer This is the effective layer count, for clears etc. This differs from the depth of the miptree level when views are involved. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	0a08147fcb	i965: Pull out layer_multiplier in intel_update_renderbuffer_wrapper We're about to need this in another place. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	a76cde35d8	i965: Add `layered` parameter to intel_update_renderbuffer_wrapper We're about to need this so we can determine the layer count of the wrapper. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	85dda825fe	i965: Adjust renderbuffer wrapper to account for MinLevel/MinLayer Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	24f490fb37	i965: Enable texture upload fast path with MinLevel We'll still avoid MinLayer here since the fast path doesn't understand arrays at all, but it's straightforward to do levels. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	5de52541e5	i965: Account for MinLevel in texture upload fast path Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	ba3499ba01	i965: Adjust map/unmap code for MinLevel/MinLayer This allows core mesa's TexSubImage paths etc to work correctly with views which have nonzero MinLevel or MinLayer. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	ca1d1b2fc1	i965: Don't try to use fast upload path for nontrivial views This will eventually be relaxed, but we'll get the fallback path working first. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	c9c08867ed	i965: Adjust surface_state emission to account for view parameters V4: Comment style, remove magic shift. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	771c2ae0af	i965: Add _Format to intel_texobj. This is the actual mesa_format to use. In non-view cases this is always the same as the mt's format. V4: Comment style Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	b7f011fdc9	i965: Add driver hook for TextureView We need to wire the original texture's mt into the view. All the hard work of setting up an appropriate tree of gl_texture_image structures has already been done by core mesa. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	93fa16bdd1	i965: Ensure that texture validation is skipped for immutable textures. If we were to relayout the miptree, we'd break any views that are sharing it. (Simplified based on suggestions from Eric) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:41 +12:00
Chris Forbes	a98b675945	i965: refactor format selection for unsupported ETC* formats We will need to call this to munge view formats. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	14c116433d	i965: refactor format munging for separate stencil We will need this for munging the view's format. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	215c9432b9	i965: Include #slices in miptree debug Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	c1b017472b	mesa: Adjust _MaxLevel computation to account for views Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	61e264f4fc	mesa: Prefer non-swizzled formats for most sized internalformats These formats can be cast to others (with different component types or sizes) via ARB_texture_view or ARB_shader_image_load_store. We want them to be laid out consistently so that we can just reinterpret the memory with a different format. In V1, this was done conditionally on a 'prefer_no_swizzle' flag which was set in TexStorage/TextureView paths, but we need the same behavior for ARB_shader_image_load_store (which also works with images created via TexImage), so we don't want it to be conditional. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	58790043bb	i965: Render R8G8B8X8 as R8G8B8A8 The sampler can handle R8G8B8X8 (and substitute 1.0 for the fourth component) but we can't use it as a render target. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	50eed4eed5	i965: Pretend we don't support BRW_SURFACEFORMAT_R16G16B16_FLOAT for textures. None of the other 3-component 16bpc formats are directly supported, so they get promoted to XRGB equivalents. Not promoting RGB16F the same way makes texture views much more fiddly -- we don't want to have to do crazy copying behind the scenes. (with my other master + my experimental ARB_texture_view support) fixes the piglit test: `spec/ARB_texture_view/view compare 48bit formats` No regressions in gpu.tests on Haswell. V4: Don't alter the formats table -- just don't match it to a mesa_format. [Kenneth] Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	66b0554fa6	i965: Enable R10G10B10A2_UNORM format This is supported by all generations, and is required for memory layout consistency for texture_view. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	932a1eeac8	i965: Enable R8G8B8A8_UNORM_SRGB format Now this is the preferred format for GL_SRGB8_ALPHA8. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	6ef7205613	swrast: Add support for fetching from MESA_FORMAT_R10G10B10A2_UNORM V4: Fix rebase conflicts with Brian's renaming of the texfetch functions. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Chris Forbes	a421be1dcb	mesa: fix packing of float texels to GL_SHORT/GL_BYTE Previously, we would unpack the texels to floats using _TO_FLOAT_TEX, and then pack them into the desired format using FLOAT_TO_. Unfortunately, this isn't quite the inverse operation, and so some texel values would end up off-by-one. This fixes the GL_RGB8_SNORM and GL_RGB16_SNORM subcases in piglit's arb_texture_view-format-consistency-get test on i965. The similar 1-, 2- and 4-component cases already worked because they took the memcpy path rather than repacking. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-04-10 18:27:40 +12:00
Michel Dänzer	ee2bcf38a4	r600g: Don't leak bytecode on shader compile failure Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74868 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-10 14:00:43 +09:00
Emil Velikov	55f9bbd46c	build: force .so extension for the gallium dri modules While linux uses .so as a default extension for shared libraries that is not the case for other platforms. The loader in libGL (and others) assumes that the dri module will always have a .so extension, thus it will fail to load on the affected platforms. Spotted-by: Jon TURNEY <jon.turney@dronecode.org.uk> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-09 22:12:36 +01:00
Jon TURNEY	92d0786f88	Partially revert `bba9c28` "configure: use LIB_EXT rather than hardcoded .so" Filenames passed to dlopen() don't need to use the platform's default extension for shared libraries. Using the '.so' extension when dlopen()ing DRI drivers is hardcoded into mesa and the X server, so it should be hardcoded here in the Makefile as well. A similar fix is probably also needed for gallium DRI drivers. (Consider that if we were starting from scratch, perhaps we would use a custom extension like .dri instead) Cc: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-09 22:12:36 +01:00
Emil Velikov	56f531657c	Partially revert "st/xa: Fix advertized version number and try to avoid future discrepancies" This reverts commit `61bedc3d6b`. As the header is the one defining the API/ABI and is distributed during installation, we should be using it rather than re-defining the XA version in configure.ac. Bump the version in the header to 2.2.0, to reflect what was the original intent of commit `42158926c6`. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2014-04-09 22:12:35 +01:00
Emil Velikov	f9832f960f	glx: drop obsolete _XUnlock_Mutex in __glXInitialize error path With commit 1f1928db001 (glx: Drop _Xglobal_lock while we create and initialize glx display) we've split the big _Xglobal_lock handling in a more fine grained manner. Unfortunately we forgot to drop the unlock_mutex on the error paths, leading to undefined behaviour as the mutex is already unlocked. Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-09 22:12:35 +01:00
Rob Clark	6afd7be132	freedreno/a3xx: assert() -> debug_assert() We hit this assert with some piglit tests. Which appears to be a bug outside of freedreno. Previously we were relying on assert() being redefined to debug_assert() so that we didn't crash in release builds. Somehow that stopped working. So just use debug_assert() directly. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-04-09 16:37:04 -04:00
Brian Paul	e853ade544	svga: move LIST_INITHEAD(dirty_buffers) earlier in svga_context_create() Fixes a crash in svga_context_flush_buffers() if we use the 'draw' module for AA lines (when the device doesn't support that feature). We need to initialize this list before we setup the swtnl pieces. Found/fixed by Charmaine Lee. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-04-09 12:02:03 -06:00
Kenneth Graunke	26ae030fcc	i965: Stop advertising GL_MESA_ycbcr_texture. The "new" fragment shader backend has never supported the necessary color conversion code for this to work. We began using the new backend in Mesa 7.10 for GLSL (commit `a81d423d93`, October 2010), and for ARB_fragment_program in Mesa 9.1 (commit `97615b2d8c`, August 2012). I haven't heard any complaints, so I don't think anyone will miss this feature. I believe mplayer used it at one point, but these days defaults to other paths anyway. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <idr@freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-09 08:28:25 -07:00
Rob Clark	4a92c12232	freedreno/a3xx/compiler: add CEIL fixes piglit glsl-fs-ceil Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-04-09 10:59:18 -04:00
Rob Clark	9604e31dc9	freedreno/a3xx/compiler: fix neg mov's create_mov() was fixed up to handle neg/abs properly for internal mov's, using absneg.f, but forgot to fix it for TGSI MOV's. The problem with using add.f to handle negated mov's is that we can only take a single const reg src. So: MOV TEMP[n], -CONST[m] would turn into: add.f Rdst, (neg)CONST[m], 0.0 which would not work. Anyways, just remove the extra code and always use create_mov() which DTRT. This fixes piglit vs-op-neg-int test. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-04-09 10:59:18 -04:00
Marek Olšák	4d641803e8	radeonsi: allow fast color clear and Hyper-Z with 1D-tiled surfaces on CIK This depends on my kernel fix. Hyper-Z is still disabled by default.	2014-04-09 01:45:16 +02:00
Marek Olšák	fb5cf3490e	r600g,radeonsi: add a bunch of useful queries for the HUD	2014-04-09 01:45:16 +02:00
Marek Olšák	4a5519f1e0	r600g,radeonsi: set correct initial domain for shared resources	2014-04-09 01:45:16 +02:00
Marek Olšák	5f7faff61b	gallium/radeon: fix warnings	2014-04-09 01:45:16 +02:00
Iago Toral Quiroga	1a92637c68	tnl: Merge _tnl_vbo_draw_prims() into _tnl_draw_prims(). This should help prevent situations where we render without proper index bounds. For example: https://bugs.freedesktop.org/show_bug.cgi?id=59455 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-08 15:10:10 -07:00
Topi Pohjolainen	2ffb50d77b	i965: Remove unused sampler key fields Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 13:34:59 -07:00
Brian Paul	6f059725fa	mesa: move declaration before code in etc2_unpack_rgb8() To fix MSVC build since `cb4ad13685`.	2014-04-08 14:17:40 -06:00
Kenneth Graunke	ec1baea95a	i965: Delete "fast color clear unsupported" performance warning. Applications frequently clear to colors other than 0.0 or 1.0, which prevents us from doing fast color clears. In that case, we issue this performance warning on basically every glClear call, resulting in so much spam that it's nearly impossible to see any other messages. Plus, I don't think it's useful. We aren't suggesting a better way to do what the application developers want---we're just telling them it would be faster to do something they don't want. Driver developers have no control over the clear color, so this message is totally useless to them. A better alternative to get this sort of information is to use INTEL_DEBUG=blorp, which tells you whether color clears were fast, simd16 repdata, or slow. v2: Rebase on has_color_component changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-08 13:09:46 -07:00
Rob Clark	ee839cc6ef	freedreno/a3xx: deal with optimized tex instructions Keep track of whether we actually have any sam instructions in the resulting shader, rather than using TGSI SAMP declarations. If the sam instruction is optimized out, because the result is not used, we don't want to emit texture state, etc. In fact emitting sampler state and/or setting PIXLODENABLE bit when there are no texture fetches seems to cause lockup. In theory this should never happen for a "normal" shader, unless the state tracker is wonky. But it is a very real possibility for binning pass shaders. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-04-08 16:06:49 -04:00
Courtney Goeltzenleuchter	cb4ad13685	mesa: add bounds checking to eliminate buffer overrun Decompressing ETC2 textures was causing intermittent segfault by copying resulting 4x4 texel block to the destination texture regardless of the size of the destination texture. Issue found via application crash in GLBenchmark 3.0's Manhattan test. v2: add more detail comment. Compute limit outside inner loops. v3: add bugzilla reference v4: Correct cc syntax in commit log v5: really grab the right patch Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74988 Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1, suggested v2-3]	2014-04-08 12:55:25 -07:00
Leo Liu	a22d944fdb	st/omx/enc: cleanup omx/vid_enc.c cleanup by moving each step into a separate function Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-08 17:51:19 +02:00
Christian König	5f374826f8	st/omx/enc: allocate input buffer private on demand v2: move allocation to a function as first step to clean vid_enc_EncodeFrame Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com>	2014-04-08 17:51:15 +02:00
Brian Paul	9bb2ec6fd1	svga: replace sampler assertion with conditional For TEX instructions, the set of samplers and sampler views should be consistent. The XA state tracker sometimes passes an inconsistent set of samplers and sampler views. Rather than assert and die, issue a warning. v2: add debugging code to detect inconsistent state. v3: also check for null sampler in svga_state_tss.c Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2014-04-08 08:45:18 -06:00
Chia-I Wu	4ddf51db6a	i965/vec4: fix record clearing in copy propagation Given mov vgrf7, vgrf9.xyxz add vgrf9.xyz, vgrf4.xyzw, vgrf5.xyzw add vgrf10.x, vgrf6.xyzw, vgrf7.wwww the last instruction would be wrongly changed to add vgrf10.x, vgrf6.xyzw, vgrf9.zzzz during copy propagation. The issue is that when deciding if a record should be cleared, the old code checked for inst->dst.writemask & (1 << ch) instead of inst->dst.writemask & (1 << BRW_GET_SWZ(src->swizzle, ch)) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76749 Signed-off-by: Chia-I Wu <olv@lunarg.com> Cc: Jordan Justen <jljusten@gmail.com> Cc: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "10.1" <mesa-stable@freedesktop.org>	2014-04-08 21:04:22 +08:00
Eric Anholt	57d6e7b7ee	i965/vec4: Add a test for copy propagation behavior. I thought I was seeing a bug in the code while reviewing, but it's not there. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-08 00:59:48 -07:00
Eric Anholt	6230b646a5	i965/fs: Track whether we're doing dual source in a more obvious way. I'm going to be turning dual_src_output into an array in a moment. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:48 -07:00
Eric Anholt	14b85e3a47	i965/fs: Add a couple more global special regs to special[] Nothing bad came of this because they weren't used after visitor running, but leaving them in a bad state seems like a recipe for pain later. Suggested-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:48 -07:00
Eric Anholt	4303d26f93	i965/fs: Handle arrays of special regs more cleanly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:48 -07:00
Eric Anholt	72b845e640	i965/fs: Fix dump_instructions() on uniforms. All of a vec4 uniform was being printed as "u0" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:48 -07:00
Eric Anholt	caa2605db5	i965/fs: Fix vgrf0 live interval when no interpolation was done. When you've got a simple solid-color shader that doesn't generate pixel_x/y interpolation, we were deciding that the first vgrf was both the undefined pixel_x and pixel_y, and extending its live interval to avoid the stride problem. That tricked other optimization that tries to see if a particular instruction is the last use of a variable. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:48 -07:00
Eric Anholt	cf40ebacb1	i965: Drop pointless check for variable declarations in splitting. We're walking the whole instruction stream, so we know the declaration will be found. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:48 -07:00
Eric Anholt	66b15ad9db	i965: Remove stale comment. We stopped doing variable index lowering for uniforms in `a64c1eb9b1`, 5 months after the comment was added. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:48 -07:00
Eric Anholt	8c2bfbc6b9	glsl: Move tree grafting's debug output to stderr. The rest of our compiler dumps are there, now. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:48 -07:00
Eric Anholt	e9822f77a9	glsl: Skip making a temporary for assignments when we don't need one. While we wish our optimization passes could identify all the cases where we can coalesce our variables, we miss out on a lot of opportunities. total instructions in shared programs: 1673849 -> 1673166 (-0.04%) instructions in affected programs: 299521 -> 298838 (-0.23%) GAINED: 7 LOST: 0 Note that many programs are "hurt". The notable ones are where we produce unrolling in cases we didn't before (presumably just because of the lower instruction count). But there are also some cases where pushing things right into the variables prevents copy propagation and tree grafting, since we don't split our variable usage webs apart. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:59:47 -07:00
Iago Toral Quiroga	dff3439fef	i915: Fix build error. is_power_of_two() is now provided by mesa so its definition must be removed from the i915 driver code. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-08 00:29:59 -07:00
Kenneth Graunke	73f80c20f6	glsl: Pass ctx->Const.NativeIntegers to do_algebraic. The next patch will introduce an optimization that only works when integers are not represented as floating point values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-08 00:02:06 -07:00
Kenneth Graunke	169c645f12	glsl: Pass ctx->Const.NativeIntegers to do_common_optimization(). The next few patches will introduce an optimization that only works when integers are not represented as floating point values. v2: Re-word-wrap a line, as requested by Ian Romanick. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-08 00:02:03 -07:00
Kenneth Graunke	40d9337406	glsl: Validate that base types match for a number of binops. The IR is not supposed to support implicit type conversions; we just failed to validate it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-08 00:02:01 -07:00
Kenneth Graunke	e14b93371c	glsl: Fix lack of i2u in lower_ubo_reference. ir_binop_ubo_load takes unsigned integer operands. However, the array index used to compute these offsets may be a signed integer. (For example, see Piglit's spec/glsl-1.40/uniform_buffer/fs-bvec-array). For some reason, we were missing an ir_binop_i2u cast, and ir_validator was failing to catch that. Without this change, ir_builder's type inference code broke for me when writing a new optimization pass. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-08 00:01:58 -07:00
Kenneth Graunke	4311f9878d	i965/fs: Skip emitting MACH/MOV for small integers. The vector backend already implemented this optimization, but surprisingly, we never bothered to implement it in the scalar backend. In addition to saving two instructions, this eliminates a use of the accumulator as an explicit source, which is unsupported in SIMD16 mode on Gen7+, which could help us gain SIMD16 programs. Cuts 19.23% of the instructions in dolphin/efb2ram.shader_test. v2: Rebase on is_16bit_integer_constant -> is_uint16_constant rename. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-08 00:01:55 -07:00
Kenneth Graunke	7540be22d1	glsl: Make is_16bit_constant from i965 an ir_constant method. The i965 MUL instruction doesn't natively support 32-bit by 32-bit integer multiplication; additional instructions (MACH/MOV) are required. However, we can avoid those if we know one of the operands can be represented in 16 bits or less. The vector backend's is_16bit_constant static helper function checks for this. We want to be able to use it in the scalar backend as well, which means moving the function to a more generally-usable location. Since it isn't i965 specific, I decided to make it an ir_constant method, in case it ends up being useful to other people as well. v2: Rename from is_16bit_integer_constant to is_uint16_constant, as suggested by Ilia Mirkin. Update comments to clarify that it does apply to both int and uint types, as long as the value is non-negative and fits in 16-bits. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-08 00:01:53 -07:00
Kenneth Graunke	bd69f65f90	mesa: Move is_power_of_two() function from brw_context.h to macros.h. This makes the function available from core Mesa code, including the GLSL compiler. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-08 00:01:51 -07:00
Kenneth Graunke	6bda3a5267	i965: Fix "SIMD16 unsupported" messages via KHR_debug. Performance warnings are logged via KHR_debug in addition to when the INTEL_DEBUG=perf environment variable is set. Without this, messages in debug contexts would have "(null)" for the reason. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-08 00:01:34 -07:00
Kenneth Graunke	ee12a03805	i965: Fix missing dirty bits in the gen8_sbe_state atom. These are clearly needed---the comments in the function are even present for each one of them. I originally had two separate state atoms for 3DSTATE_SBE and 3DSTATE_SBE_SWIZ. When I combined the functions, I must have forgotten to add the atoms for 3DSTATE_SBE_SWIZ. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-07 23:37:18 -07:00
Kenneth Graunke	47682f2ca1	i965: Drop BRW_NEW_RASTERIZER_DISCARD flag from Broadwell SOL atom. Nothing actually uses this---we handle rasterizer discard in the clipper in order for statistics counters to work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-07 23:37:16 -07:00
Kenneth Graunke	f68353c57c	i965: Use the correct program when uploading Broadwell SOL state. This is the equivalent of commit `43e77215b1`. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-07 23:36:19 -07:00
Thomas Hellstrom	47f60cbb71	st/xa: Make sure unused samplers are set to NULL renderer_copy_prepare was setting the first sampler but never telling the cso code how many samplers were actually used. Fix this. Cc: "10.1" <mesa-stable@freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-07 22:34:10 -07:00
Thomas Hellstrom	e5d2c5b899	st/xa: Bind destination before setting new state Binding a new destination may cause the svga driver to emit draw calls while propagating the surface. Make sure this doesn't happen in the middle of sampler state setup where state may be inconsistent. In practice, surface propagation should never happen here and even if it did, it wouldn't be a valid reason for the svga driver to emit partially set up state, but to avoid future uncertainties, make sure this doesn't happen anyway. Found while auditing the state tracker for inconsistent sampler state / sampler view setup. Cc: "10.1" <mesa-stable@freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-04-07 22:34:10 -07:00
Eric Anholt	34f15903d6	glapi: Fix libglapi build. This line appears to have been accidentally dropped from the last commit, and the resulting libglapi was missing symbols.	2014-04-07 14:34:49 -07:00
Matt Turner	144bbb7b78	glapi/build: Add headers to distribution. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:45:26 -07:00
Matt Turner	fbca1ab780	glapi/gen: Ship more Python files Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:45:19 -07:00
Matt Turner	b0f37a6bd2	glapi/gen: Ship XML and Python files Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:43:21 -07:00
Matt Turner	f76ac9c9a6	glapi/gen: Add missing XML files to API_XML Also (re)move XML files from COMMON to API_XML. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:43:21 -07:00
Matt Turner	cdc3a6bb21	src/build: Add getopt to distribution. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:41:02 -07:00
Matt Turner	a97611313d	gbm/build: Add headers to distribution. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:41:01 -07:00
Matt Turner	3f64c3d591	egl/build: Sort egl sources alphabetically. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:41:00 -07:00
Matt Turner	5ae2f28ca7	egl/build: Remove unused -DXF86VIDMODE. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:40:58 -07:00
Matt Turner	5074117928	egl/build: Include headers and XML in distribution. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:40:57 -07:00
Matt Turner	1d4007fbd9	egl/build: Drop two unnecessary Makefiles. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-07 09:40:31 -07:00
Matt Turner	5c770ba919	i965/fs: Remove left-over 'removed' variable. I think this was used for coalescing out partly dead large virtual registers, but the patch that enabled that caused regressions and didn't make it upstream. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-07 10:29:43 -07:00
Matt Turner	99437b730f	i965/fs: Check for interference after finding all channels. It's more likely that we won't find writes to all channels than one will interfere, and calculating interference is more expensive. This change will also help prepare for coalescing load_payload instructions' operands. Also update the live intervals for all channels, and not just the last that we saw. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-07 10:29:22 -07:00
Jordan Justen	70285f607c	i965: initialize more device info fields for Cherryview The intent in `9b6b084eb7` was for urb .size and .min_vs_entries fields to use the values from the GEN8_FEATURES macro. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-04-07 09:52:32 -07:00
Brian Paul	d3ef6f5427	swrast: reindent s_texfetch_tmp.h, remove trailing whitespace Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:27 -06:00
Brian Paul	a19d60faef	swrast: remove out of date comments in s_texfetch_tmp.h The comments were out of date and redundant (the functions are pretty much self-explanatory). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:27 -06:00
Brian Paul	56db16fb5b	swrast: rename texture fetch functions (pt. 7) sed commands: s/f_z24_s8/S8_UINT_Z24_UNORM/g s/f_s8_z24/Z24_UNORM_S8_UINT/g s/f_z16/Z_UNORM16/g s/f_z32/Z_UNORM32/g s/z32f_x24s8/Z32_FLOAT_S8X24_UINT/g s/f_ycbcr_rev/YCBCR_REV/g s/f_ycbcr/YCBCR/g s/dudv8/DUDV8/g Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:27 -06:00
Brian Paul	d41fe0aec2	swrast: rename texture fetch functions (pt. 6) sed commands: s/rgb9_e5/R9G9B9E5_FLOAT/g s/r11_g11_b10f/R11G11B10_FLOAT/g s/f_alpha_f16/A_FLOAT16/g s/f_alpha_f32/A_FLOAT32/g s/f_luminance_f16/L_FLOAT16/g s/f_luminance_f32/L_FLOAT32/g s/f_luminance_alpha_f16/LA_FLOAT16/g s/f_luminance_alpha_f32/LA_FLOAT32/g s/f_intensity_f16/I_FLOAT16/g s/f_intensity_f32/I_FLOAT32/g s/f_r_f16/R_FLOAT16/g s/f_r_f32/R_FLOAT32/g s/f_rg_f16/RG_FLOAT16/g s/f_rg_f32/RG_FLOAT32/g s/f_rgb_f16/RGB_FLOAT16/g s/f_rgb_f32/RGB_FLOAT32/g s/f_rgba_f16/RGBA_FLOAT16/g s/f_rgba_f32/RGBA_FLOAT32/g s/xbgr16161616_float/RGBX_FLOAT16/g s/xbgr32323232_float/RGBX_FLOAT32/g Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:26 -06:00
Brian Paul	9eb45114fd	swrast: rename texture fetch functions (pt. 5) sed commands: s/srgba8/A8B8G8R8_SRGB/g s/sargb8/B8G8R8A8_SRGB/g s/sabgr8/R8G8B8A8_SRGB/g s/sxbgr8/R8G8B8X8_SRGB/g s/sla8/L8A8_SRGB/g s/sl8/L_SRGB8/g s/srgb8/BGR_SRGB8/g Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:26 -06:00
Brian Paul	faa8a8e8b2	swrast: rename texture fetch functions (pt. 4) sed commands: s/signed_rg1616/R16G16_SNORM/g s/signed_rg88_rev/R8G8_SNORM/g s/signed_al88/L8A8_SNORM/g s/signed_a8/A_SNORM8/g s/signed_a16/A_SNORM16/g s/signed_l8/L_SNORM8/g s/signed_l16/L_SNORM16/g s/signed_i8/I_SNORM8/g s/signed_i16/I_SNORM16/g s/signed_r8/R_SNORM8/g s/signed_r16/R_SNORM16/g s/signed_al1616/LA_SNORM16/g s/signed_rgb_16/RGB_SNORM16/g s/signed_rgba_16/RGBA_SNORM16/g Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:26 -06:00
Brian Paul	a401362019	swrast: rename texture fetch functions (pt. 3) Rename functions to match format names. sed commands: s/f_rg1616_rev/G16R16_UNORM/g s/f_rg1616/R16G16_UNORM/g s/f_argb2101010/B10G10R10A2_UNORM/g s/f_a8/A_UNORM8/g s/f_a16/A_UNORM16/g s/f_i8/I_UNORM8/g s/f_i16/I_UNORM16/g s/f_r8/R_UNORM8/g s/f_r16/R_UNORM16/g s/f_rgb888/BGR_UNORM8/g s/f_bgr888/RGB_UNORM8/g s/f_l8/L_UNORM8/g s/f_l16/L_UNORM16/g s/xbgr16161616_unorm/RGBX_UNORM16/g Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:26 -06:00
Brian Paul	e4ebb24b35	swrast: rename texture fetch functions (pt. 2) Rename functions to match format names. sed commands: s/f_al1616_rev/A16L16_UNORM/g s/f_al1616/L16A16_UNORM/g s/f_rgb565_rev/R5G6B5_UNORM/g s/f_rgb565/B5G6R5_UNORM/g s/f_argb4444_rev/A4R4G4B4_UNORM/g s/f_argb4444/B4G4R4A4_UNORM/g s/f_rgba5551/A1B5G5R5_UNORM/g s/f_argb1555_rev/A1R5G5B5_UNORM/g s/f_al88_rev/A8L8_UNORM/g s/f_al88/L8A8_UNORM/g s/f_gr88/R8G8_UNORM/g s/f_rg88/G8R8_UNORM/g s/f_al44/L4A4_UNORM/g s/f_rgb332/B2G3R3_UNORM/g Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:25 -06:00
Brian Paul	fde3258389	swrast: rename texture fetch functions (pt. 1) Rename functions to match format names. sed commands: s/signed_rgba8888_rev/R8G8B8A8_SNORM/g s/signed_rgba8888/A8B8G8R8_SNORM/g s/f_rgba8888_rev/R8G8B8A_UNORM/g s/f_rgba8888/A8B8G8R8_UNORM/g s/f_rgbx8888_rev/R8G8B8X8_UNORM/g s/f_rgbx8888/X8B8G8R8_UNORM/g s/f_argb8888_rev/A8R8G8B8_UNORM/g s/f_argb8888/B8G8R8A8_UNORM/g s/f_xrgb8888_rev/X8R8G8B8_UNORM/g s/f_xrgb8888/B8G8R8X8_UNORM/g s/signed_rgbx8888/X8B8G8R8_SNORM/g Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:25 -06:00
Brian Paul	e0fafd1913	mesa: rename stencil/Z functions in format_unpack.c So the function names match the format names. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-07 09:21:24 -06:00
Ilia Mirkin	89c5b56be6	nouveau: fix firmware check on nvd7/nvd9 The kernel driver expects the class to be based on chipset generation rather than VP generation. Make sure to pass 90b1 for NVDX chipsets instead of 95b1. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77102 Fixes: `40dd777b33` Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.1 10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@ubunutu.com>	2014-04-07 08:58:15 -04:00
Thomas Hellstrom	2f6fcd65f2	winsys/svga: Fix prime surface references also for guest-backed surfaces Implement guest-backed surface sharing using prime fds. Previously only legacy surfaces could use this functionality. Also use the vmwgfx 2.6 single-ioctl prime fd reference if available. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>	2014-04-07 03:34:52 -07:00
Thomas Hellstrom	0887b499e9	winsys/svga: Update the vmwgfx_drm.h header to latest version from kernel Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>	2014-04-07 03:34:47 -07:00
Ilia Mirkin	159cec9dec	docs: mark ARB_texture_gather as done on nvc0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-07 01:06:19 -04:00
Ilia Mirkin	f6579e4b17	nvc0: add support for texture gather Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-07 01:06:19 -04:00
Ilia Mirkin	91900c6d33	docs: mark ARB_texture_query_lod as done for nv50, nvc0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-07 01:06:18 -04:00
Ilia Mirkin	423f64e83a	nvc0: enable texture query lod Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-07 01:06:18 -04:00
Ilia Mirkin	d5faf8e786	nv50: enable texture query lod Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-07 01:06:18 -04:00
Dave Airlie	4dc13e3c71	st/mesa: add support for ARB_texture_query_lod Add support for the LODQ texture instruction. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-07 01:06:18 -04:00
Dave Airlie	be5276ae7d	gallium: add support for LODQ opcodes. This opcode provides support for GL_ARB_texture_query_lod. Signed-off-by: Dave Airlie <airlied@redhat.com> [imirkin: rebase, docs update] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-07 01:06:18 -04:00
Matt Turner	5d0b3ec4ae	i965/vec4: Allow constant propagation into dot product. total instructions in shared programs: 1667088 -> 1667055 (-0.00%) instructions in affected programs: 3362 -> 3329 (-0.98%) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-04-05 09:52:54 -07:00
Matt Turner	34ec1a24d6	glsl: Optimize (x + y cmp 0) into (x cmp -y). Cuts a small handful of instructions in Serious Sam 3: instructions in affected programs: 4692 -> 4666 (-0.55%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-05 09:47:37 -07:00
Matt Turner	6499ecafa5	i965/fs: Split out can_coalesce_vars() function. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-04-05 09:47:37 -07:00
Matt Turner	29841fbe20	i965/fs: Split out is_coalesce_candidate() function. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-04-05 09:47:37 -07:00
Matt Turner	0fbcdec2f6	i965/fs: Split fs_visitor::register_coalesce() into its own file. The function has gotten large, and brw_fs.cpp is the largest source file in the driver. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-04-05 09:47:37 -07:00
Matt Turner	8b1ab5c93b	i965/fs: Mark appropriate fs_inst members as const. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-04-05 09:47:36 -07:00
Matt Turner	39ecfca121	i965: Mark is_tex() and friends as const. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-04-05 09:47:36 -07:00
Matt Turner	92d03f7f28	i965/fs: Don't propagate saturation modifiers if there are source modifiers. Which would lead to translating mad vgrf9:F, vgrf3:F, u0:F, vgrf6:F mov.sat vgrf7:F, -vgrf9:F into mad.sat vgrf9:F, vgrf3:F, u0:F, vgrf6:F mov vgrf7:F, -vgrf9:F Fixes some lighting effects in Dota2. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76749 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-05 09:47:36 -07:00
Matt Turner	7a7b8a02be	i965/fs: Don't propagate saturate modifiers into partial writes. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-05 09:47:36 -07:00
Matt Turner	86ae6f477d	i965/fs: Fix off-by-one in saturate propagation. ip needs to be initialized to start_ip - 1, since the first thing in the main loop is ip++. Otherwise we would incorrectly propagate the saturate from the mov to the mad: mad a, b, c, d mov.sat x, a add y, z, a Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-05 09:47:36 -07:00
Matt Turner	20dee82a75	i965/vec4: Consider sources of non-GRF-dst instructions for dead channels. Previously we'd ignore the sources of instructions with non-GRF destinations when calculating the dead channels. This would lead to us incorrectly removing the first instruction in this sequence: mov vgrf11, ... cmp.ne.f0 null, vgrf11, 1.0 mov vgrf11, ... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76616	2014-04-05 09:47:36 -07:00
Matt Turner	63d57f3b08	i965/fs: Name temporary ralloc contexts something other than mem_ctx. Or else poor programmers might mistakenly use the temporary mem_ctx, instead of the fs_visitor's mem_ctx and wonder why their code is crashing. Also remove the parenting. These contexts are local to the optimization passes they're in and are freed at the end.	2014-04-05 09:44:54 -07:00
Matt Turner	26012c1673	i965/fs: Recalculate live intervals in calculate_register_pressure(). Otherwise calling dump_instructions() after declaring a new fs_reg would segfault when calculate_register_pressure()'s loop over reg walked off the end of the virtual_grf_start[] array that calculate_live_intervals() would have reallocated for you, if it had known there was a new register.	2014-04-05 09:44:54 -07:00
Jonathan Gray	c973e440d5	egl/dri2: use drm macros to construct device name Don't hardcode /dev/dri/card0 but instead use the drm macros which allows the correct /dev/drm0 device to be opened on OpenBSD. v2: use snprintf and fallback to /dev/dri/card0 v3: check for snprintf truncation Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-05 13:36:29 +01:00
Jonathan Gray	81799c82e4	configure: don't require libudev for gbm or egl drm/wayland After the loader changes libudev is no longer required for gbm or the egl drm/wayland platforms. Lets these build/run on OpenBSD. v2: preserve the libudev requirement for Linux as suggested by Emil Velikov. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 13:35:25 +01:00
Jonathan Gray	0295953c5d	egl/dri2: don't require libudev to build drm/wayland platforms After the loader changes libudev is no longer required to build gbm or the egl drm/wayland platforms. Remove a libudev ifdef which allows the drm egl driver to be loaded on OpenBSD. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 13:33:48 +01:00
Jonathan Gray	11623be934	automake: don't enable -Wl,--no-undefined on OpenBSD OpenBSD does not have DT_NEEDED entries for libc by design, over concerns how the symbols would be referenced after changing the major version of the library. So avoid -no-undefined checks on OpenBSD as they will fail. v2: don't include the -no-undefined libtool option in the variable and change -Wl,--no-undefined references in Automake.inc as well. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76856 Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-05 13:30:27 +01:00
Emil Velikov	e4bd00c1c6	targets/dri: move common libraries to GALLIUM_DRI_LIB_DEPS Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 13:02:54 +01:00
Emil Velikov	fc91e7e4ae	targets/omx: use GALLIUM_COMMON_LIB_DEPS The targets do not require expat or selinux. Use GALLIUM_COMMON_LIB_DEPS which provides the core requirements for each gallium target. Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 13:02:48 +01:00
Emil Velikov	6b41043050	targets/xvmc: use GALLIUM_COMMON_LIB_DEPS The targets do not require expat or selinux. Use GALLIUM_COMMON_LIB_DEPS which provides the core requirements for each gallium target. Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 13:02:46 +01:00
Emil Velikov	432b5776f2	r600/omx: drop -lstdc++ hack The build system will use g++ to link the static library due to the dummy.cpp source(s). Thus one does not need the explicit link against stdc++. Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 13:02:30 +01:00
Emil Velikov	28a4276442	drivers/nouveau: mention dummy.cpp to use g++ linker The build system does not know that the static library is C++. Mention the cpp file to trigger generation of the proper variable and drop the hacky stdc++ linking. Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-05 13:00:32 +01:00
Emil Velikov	16372969c7	drivers/nouveau: use GALLIUM_COMMON_LIB_DEPS Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-04-05 13:00:14 +01:00
Emil Velikov	c8129604ef	drivers/r300: use GALLIUM_COMMON_LIB_DEPS Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76848 Tested-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 13:00:07 +01:00
Emil Velikov	ba5eba5008	automake: introduce GALLIUM_COMMON_LIB_DEPS Rather than copying the core four dependencies all over gallium, introduce the above variable to avoid all the duplication. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76848 Tested-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 13:00:02 +01:00
Emil Velikov	16c13aaeb8	automake: move GALLIUM_DRI_LIB_DEPS to Automake.inc With recent commit we started de-duplicating all of the compiler/ linker flags moving their handling inside Automake.inc. This did not take into consideration that the above variable was set at configure time, leading to issues on certain build combinations. Move the variable to where it's used/handled thus cleaning up configure.ac. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76848 Tested-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 12:59:44 +01:00
Johannes Nixdorf	476db98e03	configure.ac: fix the detection of expat with pkg-config The pkg-config module was called "EXPAT" instead of "expat" in PKG_CHECK_EXISTS. This seems to have been wrong because the wrong argument was copied from PKG_CHECK_MODULES. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 12:24:01 +01:00
Jonathan Gray	1cc742d912	megadriver_stub.c: don't use _GNU_SOURCE to gate the compat code _GNU_SOURCE is only set/required for linux*|*-gnu*|gnu*) and as the functionality is available on other systems check for RTLD_DEFAULT instead. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 12:21:31 +01:00
Jonathan Gray	380f05ccc3	loader: don't limit the non-udev path to only android Platforms that lack libudev (OpenBSD and possibly others) need this change in order to load the correct dri driver. Under linux we unconditionally require libudev, thus this code will never get built. v2: Add commit message (Emil Velikov) Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 12:17:28 +01:00
Jonathan Gray	727f54a76e	loader: use 0 instead of FALSE which isn't defined Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-04-05 12:16:45 +01:00
Francisco Jerez	4ccff1499c	clover: Document that the obj() helpers already take care of object validation.	2014-04-05 12:18:29 +02:00
Matt Turner	489cb0b2d1	i965: Mark SNB GT1 as a GT1. brw->gt only seems to be used on gen >= 7, so this shouldn't have any effect. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-04-04 15:07:41 -07:00
Marek Olšák	78f754b739	gallium/u_blitter: implement scaled blitting in the Z direction So that pipe->blit can be used for 3D mipmap generation.	2014-04-04 19:38:36 +02:00
Marek Olšák	8ab7bb4707	gallium/u_blitter: don't adjust cubemap coordinates by a small number It may cause issues with mipmap generation. I think it was used to make some piglit tests pass on r300g.	2014-04-04 19:38:36 +02:00
Leo Liu	0817182b2f	Revert "radeon: just don't map VRAM buffers at all" This reverts commit `96e8b916a7`. In the case of VCE encoding with raw YUV file, CPU load directly to VRAM is faster than combination of CPU writing to GTT and then blit to VRAM with GPU. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-04-04 16:21:04 +02:00
Leo Liu	de1a59b7a7	radeon/vce: cleanup cpb handling v2: fix whitespace errors, minor coding style changes Signed-off-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2014-04-04 12:35:55 +02:00
Christian König	6c59be7776	st/mesa: improve sampler view handling Keep a dynamically increasing array of all the views created for a texture instead of just the last one. v2: add comments, fix array size calculation, release only the first sampler view found Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-04 10:25:35 +02:00
Thomas Hellstrom	61bedc3d6b	st/xa: Fix advertized version number and try to avoid future discrepancies The xa version number had to be set in two places. In configure.ac and in xa_tracker.h. Furthermore, xa_tracker.h is an installed header so we can't use mesa internal defines. So therefore, at configure time, modify the xa_tracker.h header to use the version given by configure.ac Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Rob Clark <robdclark@gmail.com>	2014-04-04 08:33:43 +02:00
Ian Romanick	4fa58ae5c7	glapi: Fix make check /me puts a paper bag on his head and sits in the corner. This was supposed to be included in `5a68f731`, which added glPointSizePointerOES back to the list of functions exposed by libGLESv1_CM. It looks like it was an uncommitted change in my tree when I sent the patch out. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-03 20:12:19 -07:00
Brian Paul	177c9be615	llvmpipe: remove no-op checks in sampler, sampler_view functions Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-04-03 20:05:56 -06:00
Brian Paul	61a3e9936c	softpipe: remove no-op checks in sampler, sampler_view functions Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-04-03 19:39:23 -06:00
Brian Paul	4105ad825f	svga: remove no-op checks in sampler, sampler_view functions We are checking for no-ops in the CSO module for both of these items so there's no reason to do it in the driver. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-04-03 19:39:23 -06:00
Brian Paul	5a2f8b2c48	cso: check for no sampler view changes in cso_set_sampler_views() As we do for sampler states in single_sampler_done() and many other CSO functions. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-03 19:39:23 -06:00
Timothy Arceri	ffa39ab067	docs: Add note about updating tests to dev info Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>	2014-04-04 06:48:11 +11:00
José Fonseca	c6050ce7da	st/wgl: Remove wglGalliumMESA(). These were only used by the Python state tracker, which was removed, hence they have no practical use. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-03 12:52:09 +01:00
Ian Romanick	572a25be2f	glapi: Fix scons build Put the -c in the correct place (and match Makefile.am). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76960 Tested-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: José Fonseca <jfonseca@vmware.com>	2014-04-03 12:52:09 +01:00
Adel Gadllah	d120506e15	glx: Do not advertise buffer_age on dri2 Previously GLX_EXT_buffer_age has always been advertised as supported because both client_glx_support and client_glx_only were set. So it did not matter that direct_support is only set when running dri3 and we ended up always advertising it. Fix that by not setting client_glx_only for buffer_age in known_glx_extensions. Signed-off-by: Adel Gadllah <adel.gadllah@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-04-02 21:28:26 +01:00
Brian Paul	2355a64414	cso: fix sampler view count in cso_set_sampler_views() We want to call pipe->set_sampler_views() with count being the maximum of the old number of sampler views and the new number. This makes sure we null-out any old sampler views. We already do the same thing for sampler states in single_sampler_done(). Fixes some assertions seen in the VMware driver with XA tracker. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Tested-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-04-02 13:58:05 -06:00
Ian Romanick	5a68f73102	glapi: Add static dispatch for glPointSizePointerOES The OpenGL ES 1.1 conformance tests expect this function to be statically available from libGLESv1_CM.so. The comment "required for es1.1" in the XML file should have been a clue. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76926 Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Lu Hua <huax.lu@intel.com>	2014-04-02 11:30:52 -07:00
Ian Romanick	065ca63043	Revert "Revert "glapi/es1: Don't mark core functions as static_dispatch=false"" This reverts commit `526e49290c`. The original build problem should be fixed by the previous commit. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Brian Paul <brianp@vmware.com> Tested-by: Lu Hua <huax.lu@intel.com>	2014-04-02 11:30:49 -07:00
Ian Romanick	cecffa08d1	glapi: Enable ES compatibility mode Ages ago Chia-I added an ES compatibility flag to several of the various generator scripts. The intention was to bridge differences between ES and desktop in Mesa builds without ES. It doesn't appear that it has ever been used. Recent changes to static_dispatch status of several ES1 functions caused problems in desktop-only, non-shared-glapi builds. Enabling the ES compatibility mode appears to fix these build problems. This is kind of a duct tape solution to this problem. As I mentioned in the cover letter for the series that triggered the build problem, I would like to make some major changes to the generator architecture and the XML. The whole point of the proposed architecture changes is to better handle the differences between desktop GL and ES. I think duct tape is okay for now. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76869 Tested-by: Brian Paul <brianp@vmware.com> Tested-by: Lu Hua <huax.lu@intel.com> Cc: Vinson Lee <vlee@freedesktop.org> Cc: Chia-I Wu <olv@lunarg.com>	2014-04-02 11:30:45 -07:00
Ian Romanick	8e3a7c6204	glapi: Fix build break in 'make check' on non-shared-glapi builds Commit `fb78fa58` made the GL_ARB_debug_output functions aliases of the GL_KHR_debug output functions. As a result, the function names in struct _glapi_table also changed. The table in check_table.cpp used the ARB names. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> Tested-by: Brian Paul <brianp@vmware.com> Tested-by: Lu Hua <huax.lu@intel.com> Cc: Vinson Lee <vlee@freedesktop.org>	2014-04-02 11:30:42 -07:00
Ian Romanick	4e18279fae	glapi: Remove support for "short string" mode C89 has a fairly short minimum-maximum string length. To support compilers limited by the C89 limits, this script had a mode where it would generate a character array instead of a giant string. These were functionally the same, but the code generated for the character array is HUGE and difficult to read. As far as I can tell, nothing in Mesa uses '-m short' any more. The generated files used to be tracked in revision control, but I think we stopped using '-m short' when we stopped tracking the generated files. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Brian Paul <brianp@vmware.com> Tested-by: Lu Hua <huax.lu@intel.com> Cc: Vinson Lee <vlee@freedesktop.org>	2014-04-02 11:30:37 -07:00
Juha-Pekka Heikkila	0f641b2d50	mesa: remove redundant running of check_symbol_table() Nested for loops that walk the tables only to perform a final assert were also run in optimized builds. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-02 19:54:37 +03:00
Juha-Pekka Heikkila	17e7cbe078	mesa: Add missing null check in _mesa_parse_arb_program() Add missing null check in program_parse.tab.c through program_parse.y Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-02 19:54:37 +03:00
Juha-Pekka Heikkila	68a45b130e	mesa: Prevent negative indexing on noise2, noise3 and noise4 % operator could return negative value which would cause indexing before perm table. Change %256 to &0xff Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-02 19:54:37 +03:00
Juha-Pekka Heikkila	1056c50d57	glx: add extra null check in getFBConfigs Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-02 19:54:37 +03:00
Juha-Pekka Heikkila	88976daea9	glx: remove unused __glXClientInfo() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-02 19:54:37 +03:00
Tapani Pälli	e14cc504f3	i965/vec4: do not trim dead channels on gen6 for math Do not set a writemask on Gen6 for math instructions, those are executed using align1 mode that does not support a destination mask. v2: cleanups, better comment (Matt) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76883 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-02 19:50:48 +03:00
Thomas Hellstrom	5dc206525b	winsys/svga: Replace the query mm buffer pool with a slab pool v3 This is to avoid running out of query buffer space due to winsys limitations. Instead of a fixed size per screen pool of query buffers, use a slab allocator that allocates a new slab if we run out of space in the first one. v2: Correct email addresses. v3: s/8192/VMW_QUERY_POOL_SIZE/. Improve documentation and log message. Reported-and-tested-by: Brian Paul <brianp@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-04-02 18:32:44 +02:00
Dave Airlie	76ba50a25a	mesa/soft/llvmpipe: add fake MSAA support This adds a gallium cap that allows us to fake GL3.0 by not exposing MSAA on sw rendering. It also forces the extra extensions needed for GL3.2. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-04-02 12:12:04 +10:00
Kristian Høgsberg	882b46a42e	gbm: Add gbm_bo_get_fd to gbm-symbols-check script	2014-04-01 14:08:38 -07:00
Kristian Høgsberg	a43d286ef7	gbm: Add import from fd Add a new import type that lets us create a gbm bo from a DMA-BUF file descriptor. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-01 12:27:26 -07:00
Kristian Høgsberg	f54f5891be	gbm: Add gbm_bo_get_fd() Add gbm function to get a DMA-BUF file descriptor for a gbm bo. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2014-04-01 12:27:13 -07:00
Jordan Justen	7c379ebe17	include/GLES3: add OpenGL ES 3.1 Headers From: http://www.khronos.org/registry/gles/api/GLES3/gl31.h http://www.khronos.org/registry/gles/api/GLES2/gl2ext.h http://www.khronos.org/registry/gles/api/GLES3/gl3platform.h Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 09:30:32 -07:00
Brian Paul	526e49290c	Revert "glapi/es1: Don't mark core functions as static_dispatch=false" This reverts commit `f6e290f80c`. To fix the broken build. The DRI-enabled build seems OK after reverting. The non-DRI/gallium build is still suffering from an unrelated issue in the pipe-loader code.	2014-04-01 08:42:15 -06:00
Iago Toral Quiroga	f5904b732e	mesa: Allow setting GL_TEXTURE_MAX_LEVEL to 0 with GL_TEXTURE_RECTANGLE. Currently, we raise an error when doing this which breaks a conformance test from the OpenGL samples pack. Even if this is a bit silly it is not an error. From http://www.opengl.org/wiki/Rectangle_Texture: "Rectangle textures contain exactly one image; they cannot have mipmaps. Therefore, any texture parameters that depend on LODs are irrelevant when used with rectangle textures; attempting to set these parameters to any value other than 0 will result in an error." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76496 Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 08:37:06 -06:00
Ilia Mirkin	c13ff5a763	gallium/docs: fix silent math failures due to ~ and & Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 10:17:13 -04:00
Ilia Mirkin	b4cf180695	gallium/docs: line up some of the equations Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 10:17:13 -04:00
Ilia Mirkin	05d0223da3	gallium/docs: fix incorrect/missing references Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 10:17:13 -04:00
Ilia Mirkin	45e383bfae	gallium/docs: fix use of _ in math sections Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 10:17:13 -04:00
Ilia Mirkin	2f14e5eb09	gallium/docs: add format to index Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 10:17:13 -04:00
Ilia Mirkin	4ca110a7b9	gallium/docs: fix a lot of bad formatting Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 10:17:13 -04:00
Chia-I Wu	5d76e44643	glsl: remove UBO fields from _mesa_glsl_parse_state They are not needed since `514f8c7ec7`. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-04-01 13:41:20 +08:00
Ilia Mirkin	010171b562	nv50: implement clear_buffer to accelerate ARB_clear_buffer_object Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-31 21:55:03 -04:00
Ilia Mirkin	f5ba1a1f7f	mesa/st: Accelerate ARB_clear_buffer_object with clear_buffer Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-31 21:21:11 -04:00
Ilia Mirkin	24b86cb304	gallium: add interface to clear buffers Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-31 21:20:02 -04:00
Ian Romanick	4c035706dc	mapi_abi: Remove ABI-check work arounds for functions that are no longer exported The previous commit stopped exporting 21 libGLESv2 and 88 libGLESv1_CM functions. This removes the work-arounds for those functions from ABI-check. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-31 14:47:25 -07:00
Ian Romanick	1a59f9a131	mapi_abi: Make ES1 and ES2 static_dispatch=false functions hidden This has been a long standing issue with the ES libraries. Functions marked in the XML with 'static_dispatch=false' were still incorrectly exported. ABI-check is supposed to detect this case, but we have to paper over failures every time a new extension is added. This change will cause a big pile of functions to disappear from libGLESv2 and libGLESv1_CM. libGLESv2 loses (20 functions): glBindVertexArrayOES glCompressedTexImage3DOES glCompressedTexSubImage3DOES glCopyTexSubImage3DOES glDeleteVertexArraysOES glDiscardFramebufferEXT glDrawBuffersNV glFlushMappedBufferRangeEXT glFramebufferTexture3DOES glGenVertexArraysOES glGetBufferPointervOES glGetProgramBinaryOES glIsVertexArrayOES glMapBufferOES glMapBufferRangeEXT glProgramBinaryOES glReadBufferNV glTexImage3DOES glTexSubImage3DOES glUnmapBufferOES libGLESv1_CM loses (88 functions): glAlphaFuncxOES glBindFramebufferOES glBindRenderbufferOES glBlendEquationOES glBlendEquationSeparateOES glBlendFuncSeparateOES glCheckFramebufferStatusOES glClearColorxOES glClearDepthfOES glClearDepthxOES glClipPlanefOES glClipPlanexOES glColor4xOES glDeleteFramebuffersOES glDeleteRenderbuffersOES glDepthRangefOES glDepthRangexOES glDiscardFramebufferEXT glDrawTexfOES glDrawTexfvOES glDrawTexiOES glDrawTexivOES glDrawTexsOES glDrawTexsvOES glDrawTexxOES glDrawTexxvOES glFlushMappedBufferRangeEXT glFogxOES glFogxvOES glFramebufferRenderbufferOES glFramebufferTexture2DOES glFrustumfOES glFrustumxOES glGenerateMipmapOES glGenFramebuffersOES glGenRenderbuffersOES glGetBufferPointervOES glGetClipPlanefOES glGetClipPlanexOES glGetFixedvOES glGetFramebufferAttachmentParameterivOES glGetLightxvOES glGetMaterialxvOES glGetRenderbufferParameterivOES glGetTexEnvxvOES glGetTexGenfvOES glGetTexGenivOES glGetTexGenxvOES glGetTexParameterxvOES glIsFramebufferOES glIsRenderbufferOES glLightModelxOES glLightModelxvOES glLightxOES glLightxvOES glLineWidthxOES glLoadMatrixxOES glMapBufferOES glMapBufferRangeEXT glMaterialxOES glMaterialxvOES glMultiTexCoord4xOES glMultMatrixxOES glNormal3xOES glOrthofOES glOrthoxOES glPointParameterxOES glPointParameterxvOES glPointSizePointerOES glPointSizexOES glPolygonOffsetxOES glQueryMatrixxOES glRenderbufferStorageOES glRotatexOES glSampleCoveragexOES glScalexOES glTexEnvxOES glTexEnvxvOES glTexGenfOES glTexGenfvOES glTexGeniOES glTexGenivOES glTexGenxOES glTexGenxvOES glTexParameterxOES glTexParameterxvOES glTranslatexOES glUnmapBufferOES Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Chia-I Wu <olv@lunarg.com> Cc: Paul Berry <stereotype441@gmail.com>	2014-03-31 14:47:00 -07:00
Ian Romanick	dfccd5ccd7	mapi: Hack around glGetInternalformativ not being hidden in GLES This is hella ugly. The same-named function in desktop OpenGL is hidden, but it needs to be exposed by libGLESv2 for OpenGL ES 3.0. There's no way to express in the XML that a function should be hidden in one API but exposed in another. This won't effect any change now, but it will prevent a regression in a later patch. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-31 14:46:48 -07:00
Ian Romanick	f6e290f80c	glapi/es1: Don't mark core functions as static_dispatch=false Functions that are part of OpenGL ES 1.0 or 1.1 should have static dispatch functions in libGLESv1_CM. This doesn't effect any change yet, but it will prevent later regressions. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-31 14:46:39 -07:00
Ian Romanick	d457eb193c	glapi: Mark all GL_ARB_separate_shader_objects functions with static_dispatch=false This prevents the entrypoints from being (incorrectly) advertised by libGL. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-31 14:46:32 -07:00
Ian Romanick	5ccc4e7a8d	glapi: Remove some duplicate ignore="true" lines It looks like these were added accidentally by Paul in commit `1a1db174`. From the commit message and the look of the patch, I think this was just some sed-job left overs. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-31 14:45:37 -07:00
Matt Turner	3a8bd97241	i965/vec4: Don't trim writemasks of texture instructions. It was my understanding that the writemask works in SIMD4x2 mode for texturing instructions and doesn't require a message header. Some bit of this logic must be wrong, so disable it until it's understood. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76617 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-31 10:24:10 -07:00
Emil Velikov	d681b22ed7	automake: ask the linker to do garbage collection By doing GC the linker removes all the symbols that are not referenced and/or used by the final library. This results in a saving of ~100K up-to ~600K per (stripped) binary (classic vs gallium drivers). If interested one can ask the compiler to print the sections that are removed using -Wl,--print-gc-sections. v2: Check if ld supports the flag before using it. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com> (v1)	2014-03-31 14:56:14 +01:00
Emil Velikov	d187a150d4	automake: add -Wl,--no-undefined to all libraries ... apart from the dri drivers. With this final change we can build mesa without fear that the resulting libraries will have unresolved symbols. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 13:09:23 +01:00
Emil Velikov	902dc61f88	gallium/targets: add missing library dependencies Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 13:08:55 +01:00
Emil Velikov	354a5cad74	pipe-loader: reorder PIPE_LIBS Reorder -lm, -lrt, -lpthreads and -ldl to be consistent with the rest of mesa. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-03-31 13:05:36 +01:00
Emil Velikov	0177ff0039	pipe-loader: use PTHREAD_LIBS over -lpthread Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 13:02:47 +01:00
Emil Velikov	501af7a1a0	dri/i965: use CLOCK_LIBS over -lrt Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 13:01:10 +01:00
Emil Velikov	5503c227d9	automake: consistently use -no-undefined Set the flag for all but the dri targets. They have missing glapi symbols which are required for the normal operation with the X server. Jon, I fear that you'll need to carry the "no-undefined" hunk locally when building the dri drivers under cygwin. Cc: Jon TURNEY <jon.turney@dronecode.org.uk> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:59:16 +01:00
Emil Velikov	6c8d8119ca	targets/egl-static: move the common LDFLAGS into AM_LDFLAGS Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:56:25 +01:00
Emil Velikov	c323273201	targets/omx: do not link against the trace driver Unused due to the missing GALLIUM_TRACE define. Requested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-31 12:55:29 +01:00
Emil Velikov	0484b8446a	gallium/targets: explicitly include a dummy.cpp and remove all the LINK mayhem Explicitly setting the linker variable was required for old and broken build toolchains. At this point this should no longer be needed, and setting the sources lists will trigger generation of the correct LINK variables. Explicitly include dummy.cpp to use g++ to link the static library which in most cases is based upon C++ code. v2: Reword commit message. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:26:47 +01:00
Emil Velikov	2d9c33009a	gallium/targets: move LLVM_LIBS handling inside Automake.inc Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-03-31 12:26:32 +01:00
Emil Velikov	2328900f66	gallium/targets: fold LLVM_LDFLAGS inside Automake.inc Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-03-31 12:26:16 +01:00
Emil Velikov	1ea1767f72	targets/omx: use GALLIUM_OMX_LINKER_FLAGS Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-31 12:25:34 +01:00
Emil Velikov	e6f8db1e56	targets/omx: introduce GALLIUM_OMX_LIB_DEPS Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-31 12:25:04 +01:00
Emil Velikov	55bc658e4b	targets/pipe-loader: move LLVM_LIBS handling inside PIPE_LIBS This lets us have only one if HAVE_MESA_LLVM block, rather than one for each driver. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:23:59 +01:00
Emil Velikov	e36cc99880	targets/pipe-loader: include dummy.cpp irrespective of HAVE_MESA_LLVM Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:22:58 +01:00
Emil Velikov	029bc4510b	targets/pipe-loader: compact duplicating LDFLAGS Every library uses the same libtool/linker flags. Compact those into AM_LDFLAGS and append the version script to it. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:22:30 +01:00
Joakim Sindholt	e6545aaaeb	pipe-loader/swrast: add soft/llvmpipe defines Or it compiles them in, but pretends they don't exist v2: Rebase (Emil) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:22:08 +01:00
Emil Velikov	613b4d59e4	targets/xa: drop libudev references from automake build Mesa does _not_ link against libudev. Additionally the only place that deals with it is the loader, thus we can drop the CFLAGS. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:21:47 +01:00
Emil Velikov	f5466b7b93	dri/common: LIBDRM_LIBS is not a linker/libtool flag, add it to LIBADD Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:21:42 +01:00
Emil Velikov	46ae286b9d	drivers/x11: GL_LIB_DEPS is not a linker/libtool flag, add it to LIBADD Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:21:36 +01:00
Emil Velikov	e62b7d38a1	configure: autodetect video state-trackers when non swrast driver is present It makes little sense to enable the vdpau, xvmc and omx state-trackers as they do not make use of (don't work with) the software driver. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:21:30 +01:00
Emil Velikov	3dc174e85e	configure: use grep in quiet mode, rather than piping stderr/stdout to /dev/null grep -q is easier to read and consistent with the rest of configure.ac. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:20:10 +01:00
Emil Velikov	e8e1158ac3	configure: error out when building gallium-osmesa without softpipe Gallium osmesa links against the softpipe driver, thus the build will fail if it's missing. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:18:39 +01:00
Emil Velikov	4d8267ef20	Partially revert "automake: allow only shared builds" Evidently at least static OSMesa is still used as shared one causes substantial increase in the load time for some programs that use it (from seconds up-to ~30min). Rather than forcing everyone to use shared mesa, revert commit `a6efbac9fb` and default to shared build when both shared and static are disabled. v2: Whitespace cleanup, drop silly comment. Reported-by: Burlen Loring <burlen.loring@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:18:17 +01:00
Emil Velikov	23740ed031	configure: enable dri3 only for linux Currently only linux can make use of dri3, so it would make sense to enable it explicitly for the platform. Drop a duplicated libudev check while we're at it. v3: Properly handle dri3 and reword commit message. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76377 Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-31 12:11:37 +01:00
Chris Forbes	ec4b8d1697	mesa: Fix format matching checks for GL_INTENSITY* internalformats. GL_INTENSITY has never been valid as a pixel format -- to get the memcpy pack/unpack paths, the app needs to specify GL_RED as the pixel format (or GL_RED_INTEGER for the integer formats). Note: This was briefly merged before, but exposed some breakage in gallium, so was reverted. Hopefully it will stick this time. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 11:56:48 +13:00
Chris Forbes	e3cdbdb14b	st: fix st_choose_matching_format to ignore intensity _mesa_format_matches_format_and_type() returns true for GL_RED/GL_RED_INTEGER (with an appropriate type) into an intensity mesa_format. We want the `red`-based format instead, regardless of the order we find them in our walk of the mesa formats list. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-04-01 11:56:18 +13:00
Chris Forbes	3196c53c5d	mesa: fix texstore for MESA_FORMAT_R8G8B8A8_SRGB The case for this was in the wrong function, and this format's store func was not set in the table at all. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-04-01 11:54:56 +13:00
Rob Clark	db414c4686	freedreno/a3xx/compiler: fix RECT textures Whether or not the coords are normalized is handled in the texture state. But we otherwise need to treat RECT sample instructions as 2D. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-30 12:10:26 -04:00
Rob Clark	83808a90be	freedreno/a3xx/compiler: avoid negative register ids In some cases, we need a register to be assigned up to three components before the base. Since we can't have negative register #'s, just shift everything up. May increase register usage for trivial shaders, but I don't think we are shader limited in those cases. A proper solution is going to require a better register assignment algorithm (which is on the TODO list), this is just a hack to get us by until then. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-30 09:53:32 -04:00
Rob Clark	2346ea6347	freedreno/a3xx: missing wfi RB_FRAME_BUFFER_DIMENSION is not a banked context register, so we need to wait for the GPU to idle before updating it. But we'd rather not have unnecessary WFI's, so actually keep track if we need to emit it or not. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-30 09:50:24 -04:00
Rob Clark	ae5efaf285	freedreno/a3xx: little extra debug Catch things which should not happen in debug builds. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-30 09:40:00 -04:00
Rob Clark	92141afd0e	freedreno: handle null sampler This is something that XA triggers. In some cases it will only use SAMP[1] (composite mask) but not SAMP[0] (composite src). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-30 09:38:16 -04:00
Kenneth Graunke	9b6b084eb7	i965: Add Cherryview support. Based on a patch by Ville Syrjälä. As usual, these are placeholder values; actual values will come later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-28 17:10:09 -07:00
Ian Romanick	4047263cb1	glsl: Clean up "unused parameter" warnings ../../src/glsl/builtin_functions.cpp:72:1: warning: unused parameter 'state' [-Wunused-parameter] ../../src/glsl/ir_clone.cpp:31:1: warning: unused parameter 'ht' [-Wunused-parameter] ../../src/glsl/ir_equals.cpp:44:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/ir_equals.cpp:50:1: warning: unused parameter 'ignore' [-Wunused-parameter] ../../src/glsl/ir_equals.cpp:68:1: warning: unused parameter 'ignore' [-Wunused-parameter] ../../src/glsl/ir_print_visitor.cpp:149:6: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/ir_print_visitor.cpp:556:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/ir_print_visitor.cpp:562:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/link_uniforms.cpp:213:1: warning: unused parameter 'record_type' [-Wunused-parameter] ../../src/glsl/loop_analysis.cpp:225:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/loop_unroll.cpp:73:30: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/loop_unroll.cpp:79:30: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/loop_unroll.cpp:85:30: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/opt_copy_propagation_elements.cpp:189:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/opt_cse.cpp:402:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/opt_dead_code_local.cpp:117:30: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/opt_redundant_jumps.cpp:53:1: warning: unused parameter 'ir' [-Wunused-parameter] ../../src/glsl/opt_vectorize.cpp:301:1: warning: unused parameter 'ir' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-28 10:57:58 -07:00
Ian Romanick	1b28c8d77a	mesa: Clean up "unused parameter" warnings program/ir_to_mesa.cpp:2008:1: warning: unused parameter 'ir' [-Wunused-parameter] program/ir_to_mesa.cpp:2272:1: warning: unused parameter 'ir' [-Wunused-parameter] program/ir_to_mesa.cpp:2278:1: warning: unused parameter 'ir' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-28 10:57:55 -07:00
Ian Romanick	1bdf65f743	mesa/program: Constify find_variable_storage Also clean up an old whitespace blooper. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-28 10:57:53 -07:00
Ian Romanick	22128e30f3	glsl: Move Doxygen block closing to the correct place This is the closing for the "\defgroup IR Intermediate representation nodes" all the way at the top of the file. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-28 10:57:49 -07:00
Iago Toral Quiroga	029ccd773d	i965: Make sure we always compute valid index bounds before drawing. When doing software rendering (i.e. rendering to the selection buffer) we need to make sure that we have valid index bounds before calling _tnl_draw_prims(), otherwise we can crash. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59455 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-28 08:48:14 -07:00
Chia-I Wu	e7f7574598	glsl: remove {add,get}_type_ast from glsl_symbol_table They are not needed since `0da1a2cc36`. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-28 10:59:49 +08:00
Brian Paul	e341856294	mesa: fix glMultiDrawArrays inside a display list The underlying glDrawArrays() calls weren't getting compiled into the display list. We simply need to use the current dispatch table so the CALL_DrawArrays() is routed to the display list save function. This patch also fixes glMultiModeDrawArraysIBM and glMultiModeDrawElementsIBM. Fixes the new piglit gl-1.4-dlist-multidrawarrays test. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-27 11:09:30 -06:00
Brian Paul	12b959c351	st/mesa: overhaul texture / sample swizzle code Previously we only examined the GL_DEPTH_MODE state to determine the sampler view swizzle for depth textures. Now we also consider the texture base format for color textures too. The basic idea is if we're sampling from a RGB texture we always want to get A=1, even if the actual hardware format might be RGBA. We had assumed that the texture's A values were always one since that's what Mesa's texstore code does. But if we render to the RGBA texture, the A values might not be 1. Subsequent sampling didn't return the right values. Now we examine the user-specified texture base format vs. the actual gallium format to determine the right swizzle. Fixes several fbo-blending-formats, fbo-clear-formats and fbo-tex-rgbx failures with VMware/svga driver (and possibly other drivers). No other piglit regressions with softpipe or VMware/svga. Reviewed-by: Marek Olšák <maraeo@gmail.com>	2014-03-27 09:45:25 -06:00
Brian Paul	0151707cfc	st/mesa: simplify apply_depthmode() In preparation for following changes. I used a temporary test harness to compare the old code to the new for all possible swizzle inputs. No change in results.	2014-03-27 08:08:26 -06:00
Eric Anholt	b02bcea715	i965: Use intel_upload_space() for pull constant uploads. This also happens to fix a leak of the current GS pull constant BO on context destroy, by just not holding on to the pull const bos after the surface state is generated. No statistically significant performance difference on GLB2.7 on HSW at 1024x768 (n=40) or 320x240 (n=44), or on BYT at 320x240 (n=47). v2: Rebase on intel_upload simplification. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-26 13:14:57 -07:00
Eric Anholt	3b57988290	i965: Massively simplify the intel_upload implementation. The implementation kept a page-sized area for uploading data, and uploaded chunks from that to a 64kb-sized streamed buffer. This wasted cache footprint (and extra state tracking to do so) when we want to just write our data into the buffer immediately. Instead, build it around an interface like brw_state_batch() that just gets you a pointer to BO memory to upload your stuff immediately. Improves OpenArena on HSW by 1.62209% +/- 0.355299% (n=61) and on BYT by 1.7916% +/- 0.415743% (n=31). v2: Rebase on Mesa master, drop old prototypes. Re-do performance comparison on a kernel that doesn't punish CPU efficiency improvements. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-26 13:13:26 -07:00
Zack Rusin	b1909b260f	draw/llvm: improve debugging output a bit it's useful to know what the llvmbuildstore arguments are going to be before executing it because it can crash and make sure to print out the inputs only if we're not generating a gs because it fetches inputs differently. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 15:58:59 -04:00
Zack Rusin	a3c0fa2d22	draw/gs: reduce the size of the gs output buffer We used to overallocate the output buffer sometimes running out of memory with applications rendering large geometries. The actual maximum number of vertices out is simply the maximum number of primitives in (number of gs invocations) multiplied by the maximum number of output vertices per gs input primitive (i.e. gs invocation). Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 15:58:32 -04:00
Brian Paul	c875d6e57a	svga: add work-around for Sauerbraten Z fighting issue Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-26 10:31:13 -06:00
Brian Paul	070951b6ba	svga: null out query's hwbuf pointer after destroying Just to be extra safe. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-26 10:31:13 -06:00
Brian Paul	8bbc84d1e5	svga: add some debug_printf() calls in the query object code To help debug failures. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-26 10:31:13 -06:00
Brian Paul	488d4c4826	st/mesa: add null pointer checking in query object functions Don't pass null query object pointers into gallium functions. This avoids segfaulting in the VMware driver (and others?) if the pipe_context::create_query() call fails and returns NULL. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:31:13 -06:00
Brian Paul	82246f7939	svga: fix a comment (sampler vs. sampler_view)	2014-03-26 10:31:13 -06:00
Brian Paul	1f4ebfaa88	mesa: fix unpack_Z32_FLOAT_X24S8() / unpack_Z32_FLOAT() mix-up And use the z32f_x24s8 helper struct in unpack_Z32_FLOAT_X24S8(). Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:31:13 -06:00
Brian Paul	c1377ed464	mesa: fix indentation, formatting, etc in fbobject.c	2014-03-26 10:31:13 -06:00
Brian Paul	f5e0d024d1	mesa: rename format_(un)pack.c functions to match format names (pt. 7) sed commands: s/z_Z24_S8\b/S8_UINT_Z24_UNORM/g s/z_S8_Z24\b/Z24_UNORM_S8_UINT/g s/z_Z16\b/Z_UNORM16/g s/z_Z32\b/Z_UNORM32/g s/z_Z32_FLOAT/Z_FLOAT32/g Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:29:12 -06:00
Brian Paul	7f37802c8a	mesa: rename format_(un)pack.c functions to match format names (pt. 6) sed commands: s/ARGB2101010_UINT\b/B10G10R10A2_UINT/g s/ABGR2101010_UINT\b/R10G10B10A2_UINT/g Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:29:12 -06:00
Brian Paul	e51c3f9523	mesa: rename format_(un)pack.c functions to match format names (pt. 5) sed commands: s/SIGNED_R_UNORM8\b/R_SNORM8/g s/SIGNED_RG88_REV\b/R8G8_SNORM/g s/SIGNED_RGBX8888\b/X8B8G8R8_SNORM/g s/SIGNED_A8B8G8R8_UNORM\b/A8B8G8R8_SNORM/g s/SIGNED_R8G8B8A8_UNORM\b/R8G8B8A8_SNORM/g s/SIGNED_R_UNORM16\b/R_SNORM16/g s/SIGNED_R16G16_UNORM\b/R16G16_SNORM/g s/SIGNED_RGB_16\b/RGB_SNORM16/g s/SIGNED_RGBA_16\b/RGBA_SNORM16/g s/SIGNED_A_UNORM8\b/A_SNORM8/g s/SIGNED_L_UNORM8\b/L_SNORM8/g s/SIGNED_L8A8_UNORM\b/L8A8_SNORM/g s/SIGNED_L_UNORM8\b/I_SNORM8/g s/SIGNED_A_UNORM16\b/A_SNORM16/g s/SIGNED_L_UNORM16\b/L_SNORM16/g s/SIGNED_L16A16_UNORM\b/LA_SNORM16/g s/SIGNED_L_UNORM16\b/I_SNORM16/g s/XBGR16161616_SNORM\b/RGBX_SNORM16/g s/SIGNED_G8R8_UNORM\b/G8R8_SNORM/g s/SIGNED_G16R16_UNORM\b/G16R16_SNORM/g s/SIGNED_I_UNORM8\b/I_SNORM8/g s/SIGNED_I_UNORM16\b/I_SNORM16/g Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:29:12 -06:00
Brian Paul	f10f5b8822	mesa: rename format_(un)pack.c functions to match format names (pt. 4) sed commands: s/SRGBA_UNORM8\b/A8B8G8R8_SRGB/g s/SABGR_UNORM8\b/R8G8B8A8_SRGB/g s/SARGB8\b/B8G8R8A8_SRGB/g s/XBGR8888_SRGB\b/R8G8B8X8_SRGB/g s/XRGB8888_SRGB\b/B8G8R8X8_SRGB/g s/SL_UNORM8\b/L_SRGB8/g s/SLA_UNORM8\b/L8A8_SRGB/g manually changed SRGB8 -> BGR_SRGB8 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:29:12 -06:00
Brian Paul	be9eee3bcf	mesa: rename format_(un)pack.c functions to match format names (pt. 3) sed commands: s/LUMINANCE_FLOAT32\b/L_FLOAT32/g s/LUMINANCE_FLOAT16\b/L_FLOAT16/g s/LUMINANCE_ALPHA_FLOAT32\b/LA_FLOAT32/g s/LUMINANCE_ALPHA_FLOAT16\b/LA_FLOAT16/g s/ALPHA_FLOAT32\b/A_FLOAT32/g s/ALPHA_FLOAT16\b/A_FLOAT16/g s/XBGR32323232_FLOAT\b/RGBX_FLOAT32/g s/RGB9_E5_FLOAT\b/R9G9B9E5_FLOAT/g s/R11_G11_B10_FLOAT\b/R11G11B10_FLOAT/g s/INTENSITY_FLOAT16\b/I_FLOAT16/g s/INTENSITY_FLOAT32\b/I_FLOAT32/g v2: removed a few redundant/no-op substitutions Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:29:12 -06:00
Brian Paul	a49f46b15a	mesa: rename format_(un)pack.c functions to match format names (pt. 2) sed commands: s/ABGR2101010\b/R10G10B10A2_UNORM/g s/XRGB2101010_UNORM\b/B10G10R10X2_UNORM/g s/XBGR16161616_UNORM\b/RGBX_UNORM16/g s/ABGR2101010\b/R10G10B10A2_UNORM/g s/I8\b/I_UNORM8/g s/I16\b/I_UNORM16/g Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:29:12 -06:00
Brian Paul	5c619ace6f	mesa: rename format_(un)pack.c functions to match format names (pt. 1) sed commands: s/RGBA8888\b/A8B8G8R8_UNORM/g s/RGBA8888_REV\b/R8G8B8A8_UNORM/g s/ARGB8888\b/B8G8R8A8_UNORM/g s/ARGB8888_REV\b/A8R8G8B8_UNORM/g s/RGBA8888\b/X8B8G8R8_UNORM/g s/RGBA8888_REV\b/R8G8B8X8_UNORM/g s/XRGB8888\b/B8G8R8X8_UNORM/g s/XRGB8888_REV\b/X8R8G8B8_UNORM/g s/RGB888\b/BGR_UNORM8/g s/BGR888\b/RGB_UNORM8/g s/RGB565\b/B5G6R5_UNORM/g s/RGB565_REV\b/R5G6B5_UNORM/g s/ARGB4444\b/B4G4R4A4_UNORM/g s/ARGB4444_REV\b/A4R4G4B4_UNORM/g s/RGBA5551\b/A1B5G5R5_UNORM/g s/ARGB1555\b/B5G5R5A1_UNORM/g s/ARGB1555_REV\b/A1R5G5B5_UNORM/g s/AL44\b/L4A4_UNORM/g s/AL88\b/L8A8_UNORM/g s/AL88_REV\b/A8L8_UNORM/g s/AL1616\b/L16A16_UNORM/g s/AL1616_REV\b/A16L16_UNORM/g s/RGB332\b/B2G3R3_UNORM/g s/A8\b/A_UNORM8/g s/A16\b/A_UNORM16/g s/L8\b/L_UNORM8/g s/L16\b/L_UNORM16/g s/L8\b/I_UNORM8/g s/L16\b/I_UNORM16/g s/R8\b/R_UNORM8/g s/GR88\b/R8G8_UNORM/g s/RG88\b/G8R8_UNORM/g s/R16\b/R_UNORM16/g s/GR1616\b/R16G16_UNORM/g s/RG1616\b/G16R16_UNORM/g s/ARGB2101010\b/B10G10R10A2_UNORM/g Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-26 10:29:12 -06:00
Zack Rusin	bbdefabfc9	llvmpipe: Fix llvmpipe_create_gs_state. Revert unintended behaviour change from commit `b995a010e6`. Tested-by: José Fonseca <jfonseca@vmware.com>	2014-03-26 16:11:28 +00:00
Christian König	aa2274c1d2	st/omx/dec: fix possible segfault at eos Signed-off-by: Christian König <christian.koenig@amd.com>	2014-03-26 16:29:20 +01:00
José Fonseca	2de70fe23f	mapi/glapi: Use ElementTree instead of libxml2. It is quite hard to meet the dependency of the libxml2 python bindings outside Linux, and in particular on MacOSX; whereas ElementTree is part of Python's standard library. ElementTree is more limited than libxml2: no DTD verification, defaults from DTD, or XInclude support, but none of these limitations is serious enough to justify using libxml2. In fact, it was easier to refactor the code to use ElementTree than to try to get libxml2 python bindings. In the process, gl_item_factory class was refactored so that there is one method for each kind of object to be created, as it simplifies things substantially. I confirmed that precisely the same output is generated for GL/GLX/GLES. v2: Remove m4/ax_python_module.m4 as suggested by Matt Turner. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-26 13:51:32 +00:00
José Fonseca	b761dfa0c3	mapi/glapi: Remove glX_doc.py. As suggested by Ian Romanick, given it's no longer used. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-26 12:32:57 +00:00
Christian König	d117ddbe31	st/mesa: fix sampler view handling with shared textures v4 Release the references to the sampler views before destroying the pipe context. v2: remove TODO and unrelated change v3: move to st_texture.[ch], rename callback, add comment v4: fix rebase mess up and add further cleanups Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>	2014-03-26 12:06:43 +01:00
Roland Scheidegger	3b421daf32	gallivm: fix no-op n:n lp_build_resize() This can get called in some circumstances if both src type and dst type have same width (seen with float32->unorm32). While this particular case was bogus anyway let's just fix that as it can work trivially (due to the way it was called it actually worked anyway apart from the assert). Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-26 01:44:23 +01:00
Kevin Rogovin	fe635d51ff	i965: For fast color clears, only check the color of live channels. When deciding if a clear color is suitable for fast clear, take into account if a color channel is active in the buffer format. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-25 15:34:28 -07:00
Kenneth Graunke	ee4484be3d	i965: Set Broadwell MOCS values everywhere it's possible. This patch introduces two pre-canned MOCS values: BDW_MOCS_WB (write-back, all caches) and BDW_MOCS_WT (write-through, all caches). We use write-through caching for render targets, and write-back for all other data. (At least on Haswell, I believe write-back LLC/eLLC didn't work for scan-out buffers, while write-through did.) No performance analysis has been done on the impact of this patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-03-25 15:14:08 -07:00
Kenneth Graunke	1afe335925	mesa: In core profile, refuse to draw unless a VAO is bound. Core profile requires a non-default VAO to be bound. Currently, calls to glVertexAttribPointer raise INVALID_OPERATION unless a VAO is bound, and we never actually get any vertex data set. Trying to draw without any vertex data can only cause problems. In i965, it causes a crash. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76400 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: mesa-stable@lists.freedesktop.org	2014-03-25 15:13:49 -07:00
Ilia Mirkin	29bcc73d4d	Revert "build: llvm libs may not be in system search path, add rpath" This reverts commit `d9b983519c`. Unfortunately it seems like rpath is evaluated before LD_LIBRARY_PATH, so this breaks e.g. steam, as well as any other user of that env var, if the llvm path happens to be where other libs also reside. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76082 Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-25 17:18:46 -04:00
Chris Forbes	4002daf095	Revert "mesa: Fix format matching checks for GL_INTENSITY* internalformats." This reverts commit `40d7b51953`.	2014-03-26 10:06:10 +13:00
Brian Paul	64278b36d6	mesa: move GLbitfield any_valid_stages declaration before code To fix MSVC build.	2014-03-25 13:33:10 -06:00
Ian Romanick	c4cec40883	glsl: Clean up "unused parameter" warnings ../../src/glsl/ir_constant_expression.cpp:486:1: warning: unused parameter 'variable_context' [-Wunused-parameter] ../../src/glsl/ir_constant_expression.cpp:1633:1: warning: unused parameter 'variable_context' [-Wunused-parameter] ../../src/glsl/ir_constant_expression.cpp:1752:1: warning: unused parameter 'variable_context' [-Wunused-parameter] ../../src/glsl/ir_constant_expression.cpp:1761:1: warning: unused parameter 'variable_context' [-Wunused-parameter] ../../src/glsl/ir_constant_expression.cpp:1769:1: warning: unused parameter 'variable_context' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-25 12:09:36 -07:00
Ian Romanick	f3ab987b70	glsl: Minor clean ups in constant_referenced These could probably be squashed into one of the previous commits. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-25 12:09:36 -07:00
Ian Romanick	6429d6276d	glsl: Remove ir_dereference::constant_referenced All of the functionality is implemented in a private function in the one file where it is used. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-25 12:09:36 -07:00
Ian Romanick	bb0d6db974	glsl: Fold implementation of ir_dereference_array::constant_referenced into wrapper Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-25 12:09:36 -07:00
Ian Romanick	35bf94f901	glsl: Fold implementation of ir_dereference_record::constant_referenced into wrapper Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-25 12:09:36 -07:00
Ian Romanick	b66319b006	glsl: Fold implementation of ir_dereference_variable::constant_referenced into wrapper Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-25 12:09:36 -07:00
Ian Romanick	14f0faacb6	glsl: Add wrapper function that calls ir_dereference::constant_referenced Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-25 12:09:36 -07:00
Ian Romanick	c11c7e4f01	glsl: Group all of the constant_referenced functions together Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-25 12:09:36 -07:00
Gwenole Beauchesne	3bd65dc8a1	i965: fix dma_buf import with non-zero offset. Fix eglCreateImage() from a packed dma_buf surface with a non-zero offset to pixels data. In particular, this fixes support for planar YUV surfaces when they are individually mapped on a per-plane basis, i.e. when the OES_EGL_image_external is not used and user application wants to use its own shader code for composition, or processing on individual plane (OCL). Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 18:56:41 +01:00
Gregory Hainaut	1c29068074	mesa/sso: Implement ValidateProgramPipeline Implementation note: I don't use context for ralloc (don't know how). The check on PROGRAM_SEPARABLE flags is also done when the pipeline isn't bound. It doesn't make any sense in a DSA style API. Maybe we could replace _mesa_validate_program by _mesa_validate_program_pipeline. For example we could recreate a dummy pipeline object. However the new function checks also the TEXTURE_IMAGE_UNIT number not sure of the impact. V2: Fix memory leak with ralloc_strdup Formatting improvement V3 (idr): * Actually fix the leak of the InfoLog. :) * Directly generate logs in to gl_pipeline_object::InfoLog via ralloc_asprintf instead of using a temporary buffer. * Split out from previous uber patch. * Change spec references to include section numbers, etc. * Fix a bug in checking that a different program isn't active in a stage between two stages that have the same program. Specifically, if (pipe->CurrentVertexProgram->Name == pipe->CurrentGeometryProgram->Name && pipe->CurrentGeometryProgram->Name != pipe->CurrentVertexProgram->Name) should have been if (pipe->CurrentVertexProgram->Name == pipe->CurrentFragmentProgram->Name && pipe->CurrentGeometryProgram->Name != pipe->CurrentVertexProgram->Name) v4 (idr): Rework to use CurrentProgram array in loops. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:26 -07:00
Gregory Hainaut	95426b28ac	mesa/sso: Add _mesa_sampler_uniforms_pipeline_are_valid This is much like _mesa_sampler_uniforms_are_valid, but it operates across an entire pipeline object. This function differs from _mesa_sampler_uniforms_are_valid in that it directly creates the gl_pipeline_object::InfoLog instead of writing to some temporary buffer. This was originally included in another patch, but it was split out by Ian Romanick. v2 (idr): Fix the loop bounds. shProg isn't an array, so ARRAY_SIZE(shProg) was 1, so only the vertex program was validated. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:26 -07:00
Gregory Hainaut	aa46ad26b1	mesa/sso: Add gl_pipeline_object::InfoLog support V2 (idr): * Keep the behavior of other info logs in Mesa: an empty info log reports a GL_INFO_LOG_LENGTH of zero. * Use a NULL pointer to denote an empty info log. * Split out from previous uber patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:26 -07:00
Gregory Hainaut	658eaa3229	mesa/sso: Implement GL_PROGRAM_PIPELINE_BINDING for glGet Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:26 -07:00
Gregory Hainaut	9e9fac4714	mesa/sso: Implement _mesa_BindProgramPipeline Tests become green in piglit: The updated ext_transform_feedback-api-errors:useprogstage_noactive useprogstage_active bind_pipeline arb_separate_shader_object-GetProgramPipelineiv arb_separate_shader_object-IsProgramPipeline For the moment I reuse Driver.UseProgram but I guess it will be better to create a UseProgramStages function. Opinions are welcome V2: formatting & rename V3 (idr): * Change spec references to core OpenGL versions instead of issues in the extension spec. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:25 -07:00
Gregory Hainaut	78578b7599	mesa/sso: Implement _mesa_UseProgramStages Now arb_separate_shader_object-GetProgramPipelineiv should pass. V3 (idr): * Change spec references to core OpenGL versions instead of issues in the extension spec. * Split out from previous uber patch. v4 (idr): Use _mesa_has_geometry_shaders in _mesa_UseProgramStages to detect availability of geometry shaders. v5 (idr): Whitespace cleanup, use _mesa_lookup_shader_program_err instead of open-coding it again, and update some comments at the end of _mesa_UseProgramStages. All suggested by Eric. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:25 -07:00
Gregory Hainaut	4caa9db71c	mesa/sso: Add gl_pipeline_object parameter to _mesa_use_shader_program Extend use_shader_program to support a different target. Allow to reuse the function to update the pipeline state. Note I bypass the flush when target isn't current. Maybe it would be better to create a new UseProgramStages driver function This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:25 -07:00
Gregory Hainaut	de4f85f52d	meta/sso: Update meta to save and restore SSO state. save and restore _Shader/Pipeline binding point. Rationale: we don't want any conflict when the program will be unattached. V2: formatting improvement V3 (idr): * Build fix. The original patch added calls to _mesa_use_shader_program with 4 parameters, but the fourth parameter isn't added to that function until a much later patch. Just drop that parameter for now. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:25 -07:00
Gregory Hainaut	c03477050a	mesa/sso: rename Shader to the pointer _Shader Basically a sed but shaderapi.c and get.c. get.c => GL_CURRENT_PROGRAM always refers to the "old" UseProgram behavior shaderapi.c => the old api still updates the Shader object directly V2: formatting improvement V3 (idr): * Rebase fixes after a block of code was moved from ir_to_mesa.cpp to shaderapi.c. * Trivial reformatting. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:25 -07:00
Gregory Hainaut	b2bddaf7a0	mesa/sso: replace Shader binding point with _Shader To avoid NULL pointer check a default pipeline object is installed in _Shader when no program is current The spec says that UseProgram/UseShaderProgramEXT/ActiveProgramEXT have a higher priority than the pipeline object. When the default program is uninstalled, the pipeline is used if any was bound. Note: A careful rename needs to be done now... V2: formatting improvement V3 (idr): * Build fix. The original patch added calls to _mesa_use_shader_program with 4 parameters, but the fourth parameter isn't added to that function until a much later patch. Just drop that parameter for now. * Trivial reformatting. * Updated comment of gl_context::_Shader v4 (idr): Reformat spec quotations to look like spec quotations. Update comments describing what gl_context::_Shader can point to. Both suggested by Eric. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-25 10:25:25 -07:00
José Fonseca	b995a010e6	llvmpipe: Simplify vertex and geometry shaders. Eliminate lp_vertex_shader, as it added nothing over draw_vertex_shader. Simplify lp_geometry_shader, as most of the incoming state is unneeded. (We could also just use draw_geometry_shader if we were willing to peek inside the structure.) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2014-03-25 12:54:39 +00:00
José Fonseca	ee89432a47	draw: Duplicate TGSI tokens in draw_pipe_pstipple module. As done in draw_pipe_aaline and draw_pipe_aapoint modules. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>	2014-03-25 12:54:39 +00:00
Alexander von Gluck IV	7683fce878	haiku: Fix build through scons corrections and viewport fixes * Add HAVE_PTHREAD, we do have pthread support wrappers now for non-native Haiku threaded applications. * Viewport changed behavior recently breaking the build. We fix this by looking at the gl_context ViewportArray (Thanks Brian for the idea) Acked-by: Brian Paul <brianp@vmware.com>	2014-03-24 19:01:53 -05:00
Kenneth Graunke	eccad18bd8	i965: For color clears, only disable writes to components that exist. The SIMD16 replicated FB write message only works if we don't need the color calculator to mask our framebuffer writes. Previously, we bailed on it if color_mask wasn't <true, true, true, true>. However, this was needlessly strict for formats with fewer than four components - only the components that actually exist matter. WebGL Aquarium attempts to clear a BGRX texture with the ColorMask set to <true, true, true, false>. This will work perfectly fine with the replicated data message; we just bailed unnecessarily. Improves performance of WebGL Aquarium on Iris Pro (at 1920x1080) by about 50%, and Bay Trail (at 1366x768) by over 70% (using Chrome 24). v2: Use _mesa_format_has_color_component() to properly handle ALPHA formats (and generally be less fragile). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dylan Baker <baker.dylan.c@gmail.com>	2014-03-24 14:46:05 -07:00
Kenneth Graunke	630bf288de	mesa: Skip clearing color buffers when color writes are disabled. WebGL Aquarium in Chrome 24 actually hits this. v2: Move to core Mesa (wisely suggested by Ian); only consider components which actually exist. v3: Use _mesa_format_has_color_component to determine whether components actually exist, fixing alpha format handling. v4: Add a comment, as requested by Brian. No actual code changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dylan Baker <baker.dylan.c@gmail.com>	2014-03-24 14:45:03 -07:00
Kenneth Graunke	92234b1b2a	mesa: Introduce a _mesa_format_has_color_component() helper. When considering color write masks, we often want to know whether an RGBA component actually contains any meaningful data. This function provides an easy way to answer that question, and handles luminance, intensity, and alpha formats correctly. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Dylan Baker <baker.dylan.c@gmail.com>	2014-03-24 14:38:51 -07:00
Eric Anholt	0d99aef6c8	i965: Fix compiler warning about signed/unsigned. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-24 11:16:38 -07:00
Eric Anholt	4545ec1691	i965/gen8: Change the winsys MSAA blits from blorp to meta. This gets us equivalent code paths on BDW and pre-BDW, except for stencil (where we don't have MSAA stencil resolve code yet) Improves MSAA-forced citybench by 7.94496% +/- 2.38429% (n=16). Reduces DRI2 MSAA glxgears performance by -12.3559% +/- 1.52845% (n=9). v2: Move the new meta code to brw_meta_updownsample.c, name it brw_meta_updownsample(), add a comment about intel_rb_storage_first_mt_slice(), and rename that function and move the RB generation into it (review ideas by Ken). v3: Fix 2 src vs dst pasteos in previous change. v4: Skip this path pre-gen8 for now, until we can analyze the glxgears performance delta some more. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-24 11:15:05 -07:00
Eric Anholt	7ccb26fdec	mesa: Stop skipping the FinishRenderTexture calls for winsys FBOs. Now that BindRenderbufferTexImage() is a thing that drivers can do, winsys FBOs can have NeedsFinishRenderTexture set. v2: Keep the short-circuit for non-BindRenderbufferTexImage() drivers (review by Ken). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-24 11:15:04 -07:00
Eric Anholt	dd4b226184	i965: Skip reallocating the private MSAA miptree, unless it's resized. Even if the singlesample_mt got reopened from DRI due to pageflipping/buffer swapping, our private miptree shouldn't need any changes. Improves performance of a little swapbuffers-loving microbenchmark with MSAA forced on, by 1.2371% +/- 0.624802% (n=102) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-24 11:15:04 -07:00
Eric Anholt	44e944c87c	i965: Simplify the no-reopening-the-winsys-buffer tests. The formatting was weird, and the tests were duplicated, and it is guaranteed that mt->region exists. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-24 11:15:04 -07:00
Eric Anholt	e07e7e9f89	i965: Don't forget to free the old singlesample_mt. Fixes a memory leak with MSAA winsys buffers since my move of singlesample_mt to the rb in `4e0924c5de` Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-24 11:15:04 -07:00
Eric Anholt	41033509f2	i965: Add an env var for forcing window system MSAA. Sometimes it would be nice to benchmark some app with MSAA versus not, but it doesn't offer the controls you want. Just provide a handy knob to force the issue. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-24 11:15:04 -07:00
Matt Turner	764e25d79d	i965/vec4: Eliminate dead writes to the flag register. For each write, search previous instructions for unread writes to the flag register and remove them. Note that this will not eliminate the last unread write. total instructions in shared programs: 788074 -> 788004 (-0.01%) instructions in affected programs: 4930 -> 4860 (-1.42%) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:26 -07:00
Matt Turner	9cd51bb0c4	i965/vec4: Eliminate writes that are never read. With an awful O(n^2) algorithm that searches previous instructions for dead writes. total instructions in shared programs: 805582 -> 788074 (-2.17%) instructions in affected programs: 144561 -> 127053 (-12.11%) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:26 -07:00
Matt Turner	1b8f143a23	i965/vec4: Factor code out of DCE into a separate function. Will be reused in the next commit. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:26 -07:00
Matt Turner	9630ba6c6e	i965/vec4: Let dead code eliminate trim dead channels. That is, modify mad dst, a, b, c to be mad dst.xyz, a, b, c if dst.w is never read. total instructions in shared programs: 811869 -> 805582 (-0.77%) instructions in affected programs: 168287 -> 162000 (-3.74%) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:26 -07:00
Matt Turner	dc0f5099fa	i965/vec4: Track live ranges per-channel, not per vgrf. Will be squashed with the next patch. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:26 -07:00
Matt Turner	89ccd11eeb	i965/vec4: Don't dead code eliminate instructions writing the flag. A future patch adds support for removing dead writes to the flag register. This patch simplifies the logic until then. total instructions in shared programs: 811813 -> 811869 (0.01%) instructions in affected programs: 3378 -> 3434 (1.66%) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:26 -07:00
Matt Turner	3a12f50f9c	i965/vec4: Preparatory clean up of dead_code_eliminate(). Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:26 -07:00
Matt Turner	10dd6eca89	i965/vec4: Add is_null() method to dst_reg. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:25 -07:00
Matt Turner	0884ce8f42	i965/vec4: Print the predicate in dump_instructions(). Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:25 -07:00
Matt Turner	a6367dfc15	i965/vec4: Rename depends_on_flags() to reads_flag(). To be consistent with the fs backend. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:25 -07:00
Matt Turner	de4692f56c	i965/vec4: Add and use vec4_instruction::writes_flag(). To be consistent with the fs backend. Also the instruction scheduler incorrectly considered SEL with a conditional modifier to read the flag register. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:25 -07:00
Matt Turner	b0d3205c2a	i965/vec4: Add missing doxygen close brace. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-24 11:06:25 -07:00
Chris Forbes	a419a1c565	mesa: Generate FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT earlier The ARB_framebuffer_object spec lists this case before the FRAMEBUFFER_INCOMPLETE_DRAW_BUFFER and FRAMEBUFFER_INCOMPLETE_READ_BUFFER cases. Fixes two broken cases in piglit's fbo-incomplete test, if ARB_ES2_compatibility is not advertised. (If it is, this is masked because the FRAMEBUFFER_INCOMPLETE_DRAW_BUFFER / FRAMEBUFFER_INCOMPLETE_READ_BUFFER cases are removed by that extension) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-25 06:49:25 +13:00
Chris Forbes	40d7b51953	mesa: Fix format matching checks for GL_INTENSITY* internalformats. GL_INTENSITY has never been valid as a pixel format -- to get the memcpy pack/unpack paths, the app needs to specify GL_RED as the pixel format (or GL_RED_INTEGER for the integer formats). Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-25 06:47:50 +13:00
Christian König	92e543c45d	st/mesa: recreate sampler view on context change v3 With shared glx contexts it is possible that a texture is created and used in one context and then used in another one resulting in incorrect sampler view usage. v2: avoid template copy v3: add XXX comment Signed-off-by: Christian König <christian.koenig@amd.com> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-24 17:50:38 +01:00
Kenneth Graunke	eabfadf4af	i965: Report the type of color clear in INTEL_DEBUG=blorp. It's useful to know whether a clear is fast (MCS-based), using the SIMD16 repdata message, or slow. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-03-23 00:32:53 -07:00
Marek Olšák	011569b5b7	radeonsi: disable fast color clear for 1D-tiled surfaces on CIK This will be re-enabled once my kernel fix lands.	2014-03-22 18:44:58 +01:00
Kenneth Graunke	4c79f088c0	Revert "i965: For color clears, only disable writes to components that exist." This reverts commit `2919c3fdb4`. For formats like BGRX, looping through 0..num_components works fine. But for formats like XRGB, we'd check the color mask for X and fail to check it for B.	2014-03-21 17:03:20 -07:00
Kenneth Graunke	2919c3fdb4	i965: For color clears, only disable writes to components that exist. The SIMD16 replicated FB write message only works if we don't need the color calculator to mask our framebuffer writes. Previously, we bailed on it if color_mask wasn't <true, true, true, true>. However, this was needlessly strict for formats with fewer than four components - only the components that actually exist matter. WebGL Aquarium attempts to clear a BGRX texture with the ColorMask set to <true, true, true, false>. This will work perfectly fine with the replicated data message; we just bailed unnecessarily. Improves performance of WebGL Aquarium on Iris Pro (at 1920x1080) by about 40%, and Bay Trail (at 1366x768) by over 70% (using Chrome 24). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Tested-by: Dylan Baker <baker.dylan.c@gmail.com>	2014-03-21 15:35:08 -07:00
Kenneth Graunke	a63db538ad	i965: Print number of multisamples in INTEL_DEBUG=blorp output. This lets us distinguish MSAA resolves from other ordinary blits. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-03-21 15:34:59 -07:00
Kenneth Graunke	9834058a91	i965: Drop BLT TexSubImage Y-tiling restriction on Gen6+. Currently, we don't use this path on Sandybridge because we suspect other paths will be faster. But we potentially could. If we do, we should allow it to support Y-tiled BLTs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-03-21 15:31:45 -07:00
Chris Forbes	351e13c5ad	i965: Enable ARB_vertex_type_10f_11f_11f_rev for Gen4/5 also. Tested on ILK and CTG (with the GL3isms taken out of the piglits). Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-22 09:19:55 +13:00
Tom Stellard	8d8d0cb09e	clover: Fix typo in validate_object() Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-03-21 19:12:12 +01:00
Roland Scheidegger	9477d8c862	llvmpipe: add support for b5g6r5_srgb The conversion code for srgb was tuned for n x 4x8bit AoS -> 4 x nxfloat SoA (and vice versa), fix this to handle also 16bit 565-style srgb formats. Still not really all that generic, things like r10g10b10a2_srgb or r4g4b4a4_srgb wouldn't work (the latter trivial to fix, the former would not require more work to not crash but near certainly need some higher precision calculation) but not needed right now. The code is not fully optimized for this (could use more direct calculation instead of expanding to 8-bit range first) but should be good enough. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-21 17:23:38 +01:00
Roland Scheidegger	2aa77f2777	gallium: add b5g6r5 srgb format GL generally doesn't seem to allow srgb formats with less (or more) than 8 bit for the rgb channels, though some hw could easily do it (typically for formats with up to 10 bits for the rgb channels, at least for formats with less than 8 bits support is likely widespread even). While it may be true there aren't really any benefits for such formats, we need it for d3d, though luckily only for b5g6r5_srgb it seems. So add this format along with the util code for conversion - since that util code is heavily tuned for 8bit srgb this isn't really all that well optimized and rounding doesn't seem right but at least it should give some halfway meaningful results. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-21 17:23:38 +01:00
Ilia Mirkin	19ba573a57	nvc0/ir: move sample id to second source arg to fix sampler2DMS The nvc0 texfetch instruction expects the sample id to be in the second source (usually used for the offset) rather than as part of the texture coordinate. This fixes all the sampler2DMS/Array tests on nvc0. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-03-20 20:47:47 -04:00
Marek Olšák	e5f6b6d0fe	st/mesa: drop the lowering of quad strips to triangle strips This fallback to triangle strips is silly and should be done in drivers if they need it. This should fix the case when quad strips are used with flatshading that is enabled by the "flat" GLSL varying modifier. It also fixes primitive restart for quad strips. This fixes piglit: NV_primitive_restart/primitive-restart-draw-mode-quad_strip Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-21 00:50:53 +01:00
Marek Olšák	2706448a10	gallium/u_gen_mipmap: remove the software fallback The last changes to it are from 2008 and 2009. It doesn't support most texture formats and some texture targets. Nobody can possibly be using this. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-21 00:50:53 +01:00
Marek Olšák	db722bdcab	st/mesa: fix generating mipmaps for cube arrays Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-21 00:50:53 +01:00
Marek Olšák	91df26842f	mesa: fix software fallback for generating mipmaps for 3D textures It didn't use the driver-provided src/dstRowStride at all. This was broken for the cases when stride != width*bpp. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-21 00:50:53 +01:00
Marek Olšák	78c60d1b63	mesa: fix software fallback for generating mipmaps for cube arrays Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-21 00:50:53 +01:00
Marek Olšák	185ad78ffd	mesa: allow generating mipmaps for cube arrays Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-21 00:50:53 +01:00
Marek Olšák	55cf320ed8	mesa: fix texture border handling for cube arrays Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-21 00:50:53 +01:00
Marek Olšák	54690a5f3b	r600g: use more appropriate names for async DMA functions _dma_copy calls either _dma_copy_buffer or *_dma_copy_tile. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-20 19:03:40 +01:00
Marek Olšák	6c487ff3bd	r600g: deobfuscate async DMA code Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-20 18:56:11 +01:00
Marek Olšák	2c703ee8ad	r600g: don't flush the gfx IB explicitly before doing DMA It's flushed by calling r600_context_bo_reloc. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-20 18:41:18 +01:00
Marek Olšák	e914d0052f	winsys/radeon: only add duplicate relocations for DMA if VM isn't supported Also rewrite the comment for it to be readable and reorder the code. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-03-20 18:41:17 +01:00
Niels Ole Salscheider	71254732db	radeonsi: Implement DMA blit This code is a slightly modified version of evergreen_dma_blit (and evergreen_dma_copy as well as evergreen_dma_copy_tile). It would be nice to share some of the code in the long term. I have reused some "cik"-prefixed functions that also return the right value for SI. I am not sure if they should be renamed. v2: Marek> removed gfx.flush in si_dma_copy_tile Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-03-20 17:21:16 +01:00
Niels Ole Salscheider	acf55e7325	radeon: Move r600_need_dma_space to common code Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-03-20 17:21:16 +01:00
Richard Sandiford	f4b3430a36	llvmpipe: Tighten check for alpha-only formats The AoS version of lp_build_blend_factor was assuming that if the first channel was alpha, there were no rgb components. Fixes glean/blendFunc on System z. No piglit regressions on x86_64. The shortcut is still used in tests like spec/ARB_framebuffer_object/ fbo-alpha. Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>	2014-03-20 16:50:40 +01:00
Jonathan Gray	8044fd6769	nouveau: don't assume libdrm include prefix drm headers may be installed in a different directory Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-20 08:32:12 -04:00
Jonathan Gray	8fbc9d9b6f	nouveau: use DLOPEN_LIBS instead of -ldl libdl does not exist on many platforms which have dlopen in libc. Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-20 08:32:12 -04:00
Brian Paul	eaf9affa5e	c11/threads: don't include assert.h if the assert macro is already defined In the gallium code, the assert() macro could come from either the system's assert.h file (via c11/threads.h) or from gallium's u_debug.h. It looks like all known assert.h files unconditionally #undef assert before defining their own version. So the assert you get depends on whether threads.h or u_debug.h was included last. In the gallium code we really want to use the assert() from u_debug.h (it behaves better on Windows). In gallium, c11/threads.h is only included after u_debug.h in the os_thread.h wrapper. So adding an #ifndef assert test in the threads*.h files avoids using the system's assert(). Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-19 17:13:31 -06:00
Ilia Mirkin	e58071355e	nouveau: there may not have been a texture if the fbo was incomplete Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>	2014-03-19 18:20:29 -04:00
Ilia Mirkin	b676df9abf	nouveau: add forgotten GL_COMPRESSED_INTENSITY to texture format list Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>	2014-03-19 18:17:40 -04:00
Ilia Mirkin	18690995a6	mesa/main: condition GL_DEPTH_STENCIL on ARB_depth_texture EXT_packed_depth_stencil is supported by all drivers, but ARB_depth_texture isn't (notably nouveau_vieux). This should avoid passing unexpected values down to ChooseTextureFormat. The EXT_packed_depth_stencil spec does not make any explicit references to requiring ARB_depth_texture in order to allow textures with that format, however if there is no dependency, ARB_depth_texture would be practically implied by the extension. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Note for 10.0 backport: This will produce a conflict, the solution is to move the surrounding if as well.	2014-03-19 18:17:40 -04:00
Ilia Mirkin	51989817e6	loader: add special logic to distinguish nouveau from nouveau_vieux There are a lot of different pci ids supported by nouveau, and more are added all the time. The relevant distinguisher between drivers is the chipset id. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-03-19 18:17:40 -04:00
Matt Turner	c049dd4396	glsl: Allow dot() on scalars, and throw out dotlike(). In all uses of dotlike() we're writing generic code that operates on 1-4 component vectors. That our IR requires ir_binop_dot expressions' operands to be 2+ component vectors is an implementation detail that's not important when implementing built-in functions with dot(), which is defined for scalar floats in GLSL. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 23:20:29 -07:00
Matt Turner	6cbc64c3cb	glsl: Optimize pow(x, 2) into x * x. Cuts two instructions out of SynMark's Gl32VSInstancing benchmark. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 23:20:29 -07:00
Matt Turner	9a9eaaa79a	glsl: Match whitespace changes from previous patch. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-18 23:20:29 -07:00
Matt Turner	7988b4804f	glsl: Expose pack/unpack built-ins for ARB_gpu_shader5. ARB_gpu_shader5 and ES 3.0 expose different subsets of ARB_shading_language_packing. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-18 23:20:29 -07:00
Eric Anholt	651b8baa82	i965: Drop some more dead code from the old CACHED_BATCH feature. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-18 14:45:09 -07:00
Eric Anholt	512c88f826	i965: Drop special case for edgeflag thanks to Marek's change to core. As of `780ce576bb`, we end up with R8_SSCALED anyway. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-18 14:45:09 -07:00
Brian Paul	f4435da940	mesa: include stdbool.h in register_allocate.h to fix build https://bugs.freedesktop.org/show_bug.cgi?id=76331	2014-03-18 13:28:17 -06:00
Ian Romanick	f74cf5f80e	i965: Enable EWA anisotropic filtering algorithm Volume 4, part 1 of the Ivybridge PRM says, "Generally, the EWA approximation algorithm results in higher image quality than the legacy algorithm." Using a classic anisotropic filtering "tunnel" demo, it appears that there is no anisotropic filtering on IVB without this bit set. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-18 10:56:38 -07:00
Kenneth Graunke	dd2e5d3999	i965: Actually initialize simd16_unsupported and no16_msg. I meant to include these fixes in v3 of commit `de7ad2c88f`, but accidentally pushed a previous version. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-18 10:50:48 -07:00
Kenneth Graunke	91f4528da6	i965/upload: Refactor open-coded ALIGN-like computations. Sadly, we can't use actual ALIGN(), since that only supports power-of-two values for the alignment parameter. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-18 10:39:04 -07:00
Kenneth Graunke	b8b4e280b4	i965: Fix indentation in brw_upload_indices(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-18 10:38:48 -07:00
Kenneth Graunke	051edcc144	i965: Consolidate code for setting brw->ib.start_vertex_offset. This was set identically in three places. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-18 10:38:44 -07:00
Kenneth Graunke	7a0fd3ca1d	i965: Allocate register sets at screen creation, not context creation. Register sets depend on the particular hardware generation, but don't depend on anything in the actual OpenGL context. Computing them is fairly expensive, and they take up a large amount of memory. Putting them in the screen allows us to compute/allocate them once for all contexts, saving both time and space. Improves the performance of a context creation/destruction microbenchmark by about 3x on my Haswell i7-4750HQ. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:35:53 -07:00
Kenneth Graunke	b3e4b769dd	i965: Allocate the screen using ralloc rather than calloc. This will allow us to use the screen as a memory context. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:31:12 -07:00
Eric Anholt	41097db91b	ra: Convert another bool array to bitsets. This one saves about 2MB peak allocation in glsl-fs-algebraic-add-add-1, with no performance difference on timing short shader-db runs (n=9/10, warmup outlier removed). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-18 10:20:28 -07:00
Kenneth Graunke	da1cce2d68	ra: Use a bitset for storing which registers belong to a class. This should use 1/8 the memory. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Christoph Brill <egore911@gmail.com>	2014-03-18 10:15:24 -07:00
Kenneth Graunke	8d856c3937	ra: Create a reg_belongs_to_class() helper function. This is a little easier to read. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Christoph Brill <egore911@gmail.com>	2014-03-18 10:15:23 -07:00
Kenneth Graunke	786a647245	ra: Use bool instead of GLboolean. This isn't the GL API, so there's no reason to use GLboolean. Using bool is safer: any non-zero value is treated as "true". When converting a value to a GLboolean, all but the low byte is discarded, which means that values like 256 will be incorrectly rendered as false. Done via the following vim commands: :%s/GLboolean/bool/g :%s/GL_TRUE/true/g :%s/GL_FALSE/false/g and one line of manual whitespace tidying. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-18 10:15:18 -07:00
Kenneth Graunke	de7ad2c88f	i965: Accurately bail on SIMD16 compiles. Ideally, we'd like to never even attempt the SIMD16 compile if we could know ahead of time that it won't succeed---it's purely a waste of time. This is especially important for state-based recompiles, which happen at draw time. The fragment shader compiler has a number of checks like: if (dispatch_width == 16) fail("...some reason..."); This patch introduces a new no16() function which replaces the above pattern. In the SIMD8 compile, it sets a "SIMD16 will never work" flag. Then, brw_wm_fs_emit can check that flag, skip the SIMD16 compile, and issue a helpful performance warning if INTEL_DEBUG=perf is set. (In SIMD16 mode, no16() calls fail(), for safety's sake.) The great part is that this is not a heuristic---if the flag is set, we know with 100% certainty that the SIMD16 compile would fail. (It might fail anyway if we run out of registers, but it's always worth trying.) v2: Fix missing va_end in early-return case (caught by Ilia Mirkin). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> [v1] Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1] Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:11:38 -07:00
Kenneth Graunke	b207e88b25	i965/fs: Support pull parameters in SIMD16 mode. This is just a matter of reusing the pull/push constant information set up by the SIMD8 compile. This gains us 78 SIMD16 programs in shader-db. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:11:36 -07:00
Kenneth Graunke	229319e0f0	i965/fs: Use a single instance of the pull_constant_loc[] array. Now that we don't renumber uniform registers, assign_constant_locations and move_uniform_array_access_to_pull_constants use the same names. So, they can share a single copy of the pull_constant_loc[] array. This simplifies the code considerably. assign_constant_locations() doesn't need to walk through pull_params[] to rediscover reladdr demotions; it just has that information in pull_constant_loc[]. We also only need to rewrite the instruction stream once, instead of twice. Even better, we now have a single array describing the layout of all pull parameters, which we can pass to the SIMD16 program. This actually hurts a few shaders in Serious Sam 3, and one in KWin: total instructions in shared programs: 1841957 -> 1842035 (0.00%) instructions in affected programs: 1165 -> 1243 (6.70%) Comparing dump_instructions() before and after the pull constant transformations with and without this patch, it appears that there is a uniform array with variable indexing (reladdr) and constant indexing (of array element 0). Previously, we uploaded array element 0 as both a pull constant (for reladdr) /and/ a push constant. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:11:32 -07:00
Kenneth Graunke	542f2e47f2	i965/fs: Don't renumber UNIFORM registers. Previously, remove_dead_constants() would renumber the UNIFORM registers to be sequential starting from zero, and the resulting register number would be used directly as an index into the params[] array. This renumbering made it difficult to collect and save information about pull constant locations, since setup_pull_constants() and move_uniform_array_access_to_pull_constants() used different names. This patch generalizes setup_pull_constants() to decide whether each uniform register should be a pull constant, push constant, or neither (because it's unused). Then, it stores mappings from UNIFORM register numbers to params[] or pull_params[] indices in the push_constant_loc and pull_constant_loc arrays. (We already did this for pull constants.) Then, assign_curb_setup() just needs to consult the push_constant_loc array to get the real index into the params[] array. This effectively folds all the remove_dead_constants() functionality into assign_constant_locations(), while being less irritable to work with. v2: Add assert(remapped <= i), requested by Topi. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:11:29 -07:00
Kenneth Graunke	d9f339eccd	i965/fs: Split pull parameter decision making from mechanical demoting. move_uniform_array_access_to_pull_constants() and setup_pull_constants() both have two parts: 1. Decide which UNIFORM registers to demote to pull constants, and assign locations. 2. Mechanically rewrite the instruction stream to pull the uniform value into a temporary VGRF and use that, eliminating the UNIFORM file access. In order to support pull constants in SIMD16 mode, we will need to make decisions exactly once, but rewrite both instruction streams. Separating these two tasks will make this easier. This patch introduces a new helper, demote_pull_constants(), which takes care of rewriting the instruction stream, in both cases. For the moment, a single invocation of demote_pull_constants can't safely handle both reladdr and non-reladdr tasks, since the two callers still use different names for uniforms due to remove_dead_constants() remapping of things. So, we get an ugly boolean parameter saying which to do. This will go away. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:11:26 -07:00
Kenneth Graunke	2163e0fd5a	i965/fs: Record pull constant locations for all array elements. When demoting a variably indexed uniform array to pull constants, we only recorded the location for the base of the array (element 0). Recording locations for all array elements is a trivial amount of code and will make subsequent refactoring easier. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:11:24 -07:00
Kenneth Graunke	7c7627781f	i965/fs: Save push constant location information. Previously, both move_uniform_array_access_to_pull_constants() and setup_pull_constants() maintained stack-local arrays with this information. Storing this information will allow it to be used from multiple functions, allowing us to split and move code around. We'll also eventually want to pass pull constant location information to the SIMD16 compile. Saving this information will help us do that. Unfortunately, the two functions cannot share the contents of the array just yet. remove_dead_constants() renumbers all the UNIFORM registers to be contiguous starting at zero, so the two functions talk about uniforms using different names. We can't even remap them, since move_uniform_array_access_to_pull_constants() deletes UNIFORM registers that are only accessed with reladdr, so remove_dead_constants can't even see them. This situation will improve in the next few patches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:11:21 -07:00
Kenneth Graunke	de77efde91	i965/fs: Delete dead code to fail compiles with SIMD16 pull parameters. The SIMD8 compile will determine whether pull parameters are necessary. If so, it will set prog_data->nr_pull_params to a value greater than 0. brw_wm_fs_emit checks if nr_pull_params > 0 and skips the SIMD16 compile altogether. So, this code should never occur. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-18 10:11:08 -07:00
Brian Paul	63e7b51912	gallium/docs: update SLT, SGE, SFL, STR opcode docs To emphasize that the result is floating point 1.0 or 0.0, to match other opcodes like SLE and SEQ. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-18 08:03:27 -06:00
Charmaine Lee	81f342ce64	glx: Fix incorrect pdp assignment in dri2_bind_context(). pdp should be set to dpyPriv->dri2Display. Fixes blank frame failure running glretrace ClearView. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-18 08:03:27 -06:00
Maarten Lankhorst	8fe888fafd	nvc0: Handle user mapped vertex buffer for edgeflag Handle mapping edgeflag data similar to the code around it. This fixes a crash in piglit test gl-2.0-edgeflag. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2014-03-18 14:51:06 +01:00
Francisco Jerez	d70ad1a4f9	clover: Fix region size error checking in some buffer transfer commands. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-03-18 12:14:46 +01:00
Ilia Mirkin	c8309cde30	nv50/ir/gk110: add postfactor support for fmul Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:55 -04:00
Ilia Mirkin	d8e0d1e882	nv50/ir/gk110: set not modifier on first source of logic op Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:55 -04:00
Ilia Mirkin	b56e50b8af	nv50/ir/gk110: use shl/shr instead of lshf/rshf so that c[] is supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:55 -04:00
Ilia Mirkin	34bf5e27c6	nv50/ir/gk110: add 64/128-bit fetch/export support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:55 -04:00
Ilia Mirkin	3c40be2615	nv50/ir/gk110: fix handling of OP_SUB for floating point ops Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	72310869f0	nv50/ir/gk110: presin/preex2 take their source at bit 23 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	48a9ba63f5	nv50/ir/gk110: add implementations of div u32/s32 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	4bb14aca29	nv50/ir/gk110: implement quadop Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	67cb8a6996	nv50/ir/gk110: fill in mov from predicate Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	563083ef57	nv50/ir/gk110: handle derivAll flag, fix useOffsets for non-txf Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	ece734b3c1	nv50/ir/gk110: fix setting texture for txd/txf/txq Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	08505549ab	nv50/ir/gk110: add texcsaa implementation Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	c17f7247ec	nv50/ir/gk110: add pfetch support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:54 -04:00
Ilia Mirkin	15b1f420d0	nv50/ir/gk110: add emit/restart implementations Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:53 -04:00
Ilia Mirkin	1b68009466	nv50/ir/gk110: add missing break in sched emit Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:53 -04:00
Ilia Mirkin	76554d2d1f	nv50/ir/gk110: implement partial txq support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:53 -04:00
Ilia Mirkin	cb3dcb1430	nv50/ir/gk110: fill out texture instruction support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:53 -04:00
Ilia Mirkin	ce75a3e8d3	nv50/ir/gk110: fix control flow opcode emission, add sat flag Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-18 05:56:34 -04:00
Chad Versace	468cc866b4	egl/main: Enable Linux platform extensions Enable EGL_EXT_platform_base and the Linux platform extensions layered atop it: EGL_EXT_platform_x11, EGL_EXT_platform_wayland, and EGL_MESA_platform_gbm. Tested with Piglit's EGL_EXT_platform_base tests under an X11 session. To enable running the Wayland and GBM tests, windowed Weston was running and the kernel had render nodes enabled. I regression tested my EGL_EXT_platform_base patch set with Piglit on Ivybridge under X11/EGL, standalone Weston, and GBM with rendernodes. No regressions found. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:49:06 -07:00
Chad Versace	9a40ee16d0	egl/wayland: Emit EGL_BAD_PARAMETER for eglCreatePlatformPixmapSurface From the EGL_EXT_wayland_spec, version 3: It is not valid to call eglCreatePlatformPixmapSurfaceEXT with a <dpy> that belongs to Wayland. Any such call fails and generates EGL_BAD_PARAMETER. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:23 -07:00
Chad Versace	1787f5632f	egl/gbm: Emit EGL_BAD_PARAMETER for eglCreatePlatformPixmapSurface From the EGL_MESA_platform_gbm spec, version 5: It is not valid to call eglCreatePlatformPixmapSurfaceEXT with a <dpy> that belongs to the GBM platform. Any such call fails and generates EGL_BAD_PARAMETER. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:23 -07:00
Chad Versace	6d1f83ec09	egl/main: Stop using EGLNative types internally Internally, much of the EGL code uses EGLNativeDisplayType, EGLNativeWindowType, and EGLPixmapType. However, the EGLNative type often does not match the variable's actual type. The concept of EGLNative types is a bad match for Linux, as explained below. And the EGL platform extensions don't use EGLNative types at all. Those extensions attempt to solve cross-platform issues by moving the EGL API away from the EGLNative types. The core of the problem is that eglplatform.h can define each EGLNative type once only, but Linux supports multiple EGL platforms. To work around the problem, Mesa's eglplatform.h contains multiple definitions of each EGLNative type, selected by feature macros. Mesa expects EGL clients to set the feature macro appropriately. But the feature macros don't work when a single codebase must be built with support for multiple EGL platforms, such as Mesa itself. When building libEGL, autotools chooses the EGLNative typedefs based on the first element of '--with-egl-platforms'. For example, '--with-egl-platforms=x11,drm,wayland' defines the following: typedef Display* EGLNativeDisplayType; typedef Window EGLNativeWindowType; typedef Pixmap EGLNativePixmapType; Clearly, this doesn't work well for Wayland and GBM. Mesa works around the problem by casting the EGLNative types to different things in different files. For sanity's sake, and to prepare for the EGL platform extensions, this patch removes from egl/main and egl/dri2 all internal use of the EGLNative types. It replaces them with 'void*' and checks each explicit cast with a static assertion. Also, the patch touches egl_gallium the minimal amount to keep it compatible with eglapi.h. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:23 -07:00
Chad Versace	cefa06cd69	egl: Add STATIC_ASSERT() macro Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:23 -07:00
Chad Versace	eef68a9094	egl/dri2: Dispatch eglCreateImageKHR by display, not driver Add dri2_egl_display_vtbl::create_image, set it for each platform, and let egl_dri2 dispatch eglCreateImageKHR to that. To remove ambiguity, rename egl_dri2.c:dri2_create_image() to dri2_create_image_from_dri(). This prepares for the EGL platform extensions. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:23 -07:00
Chad Versace	88b9e600a6	egl/dri2/x11: Don't clobber _EGLDriver::API dri2_initialize_x11_swrast() does a strange thing. For some extensions it doesn't support, it sets the corresponding functions in _EGLDriver::API to NULL. The intention here is clear, but misplaced. NULL or not, the function pointers never get called because their extensions aren't supported. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:23 -07:00
Chad Versace	eadd5e0c0a	egl/dri2: Dispatch eglCreateWaylandBufferFromImageWL by display, not driver Add dri2_egl_display_vtbl::create_wayland_buffer_from_image, set it for each platform, and let egl_dri2 dispatch eglCreateWaylandBufferFromImageWL to that. This prepares for the EGL platform extensions. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:22 -07:00
Chad Versace	f506ef6784	egl/dri2: Consolidate eglTerminate egl_dri2.c:dri2_terminate() handled terminating X11 and DRM displays. The Wayland platform implemented its own dri2_wl_terminate(), which was nearly a copy of the common one. To implement the EGL platform extensions, we either need to dispatch eglTerminate per display or define a common implementation for all platforms. This patch chooses consolidation. It removes dri2_wl_terminate() by folding it into the common dri2_terminate(). It was necessary to invert the `if (disp->PlatformDisplay == NULL)` and the switch statement because, unlike DRM and X11, Wayland's terminator performed action even when EGL didn't own the native display. In the inversion, I replaced `disp->PlatformDisplay == NULL` with `dri2_dpy->own_device` because the two expressions are synonymous, but the latter's meaning is clearer. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:22 -07:00
Chad Versace	31cd0fee31	egl/dri2/x11: Set dri2_dpy->own_device When the user calls eglGetDisplay(EGL_DEFAULT_DISPLAY), the Wayland and DRM platforms set dri2_dpy->own_device=true. This patch makes the X11 platform do the same for consistency. Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:39:22 -07:00
Chad Versace	688a0e8e73	egl/dri2: Dispatch eglPostSubBufferNV by display, not driver Add dri2_egl_display_vtbl::post_sub_buffer, set it for each platform, and let egl_dri2 dispatch eglPostSubBufferNV to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:04 -07:00
Chad Versace	75d398ed93	egl/dri2: Dispatch eglSwapBuffersRegionNOK by display, not driver Add dri2_egl_display_vtbl::swap_buffers_region, set it for each platform, and let egl_dri2 dispatch eglSwapBuffersRegionNOK to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:04 -07:00
Chad Versace	bc2cbc0951	egl/dri2: Dispatch eglCopyBuffers by display, not driver Add dri2_egl_display_vtbl::copy_buffers, set it for each platform, and let egl_dri2 dispatch eglCopyBuffers to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:04 -07:00
Chad Versace	3fdfbd2572	egl/dri2: Dispatch API.QueryBufferAge by display, not driver Add dri2_egl_display_vtbl::query_buffer_age, set it for each platform, and let egl_dri2 dispatch API.QueryBufferAge to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:04 -07:00
Chad Versace	958dd80c40	egl/dri2: Dispatch eglDestroySurface by display, not driver Add dri2_egl_display_vtbl::destroy_surface, set it for each platform, and let egl_dri2 dispatch eglDestroySurface to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:04 -07:00
Chad Versace	bf20076baf	egl/dri2: Dispatch eglCreatePbufferSurface by display, not driver Add dri2_egl_display_vtbl::create_pbuffer_surface, set it for each platform, and let egl_dri2 dispatch eglCreatePbufferSurface to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:04 -07:00
Chad Versace	bc8b07a657	egl/dri2: Dispatch eglCreatePixmapSurface by display, not driver Add dri2_egl_display_vtbl::create_pbuffer_surface, set it for each platform, and let egl_dri2 dispatch eglCreatePixmapSurface to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:04 -07:00
Chad Versace	0a0c881a13	egl/dri2: Dispatch eglCreateWindowSurface by display, not driver Add dri2_egl_display_vtbl::create_window_surface, set it for each platform, and let egl_dri2 dispatch eglCreateWindowSurface to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:03 -07:00
Chad Versace	d03948a766	egl/dri2: Dispatch eglSwapBuffersWithDamage by display, not driver Add dri2_egl_display_vtbl::swap_buffers_with_damage, set it for each platform, and let egl_dri2 dispatch eglSwapBuffersWithDamageEXT to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:03 -07:00
Chad Versace	ad173bcfdb	egl/dri2: Dispatch eglSwapBuffers by display, not driver Add dri2_egl_display_vtbl::swap_buffers, set it for each platform, and let egl_dri2 dispatch eglSwapBuffers to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:03 -07:00
Chad Versace	8b9298af0a	egl/dri2: Dispatch eglSwapInterval by display, not driver Add dri2_egl_display_vtbl::swap_interval, set it for each platform, and let egl_dri2 dispatch eglSwapInterval to that. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:03 -07:00
Chad Versace	a218765478	egl/wl,x11: Call dri2_swap_interval() statically Don't call it through the driver dispatch table. Just call it statically. This prepares for the EGL platform extensions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:03 -07:00
Chad Versace	d019cd81b5	egl/dri2: Put platform func names into proper namespaces Each of the egl_dri2 platforms (except Android) prefix their function names with "dri2", not "dri2_${platform}". This means many function names have three separate definitions in the egl_dri2 directory: one in each of platform_drm.c, platform_wayland.c, and platform_x11.c. For example, each of the three files defines dri2_create_window_surface(). The name collisions make it difficult to review patches for correctness ("Is this patch hunk calling a platform_x11 function or a global egl_dri2 function?"), complicate debugging, and confuse code navigation tools. For each function in platform_x11.c prefixed with 'dri2', this patch changes its prefix to 'dri2_x11'. Likewise for platform_drm.c and 'dri2_drm'; and platform_wayland.c and 'dri2_wl'. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:03 -07:00
Chad Versace	90502b18b2	egl/dri2: Move dri2_egl_display virtual funcs to vtbl dri2_egl_display has only one virtual function, 'authenticate'. Define dri2_egl_display::vtbl and move 'authenticate' there. This prepares for the EGL platform extensions, which will add many more virtual functions to dri2_egl_display. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:03 -07:00
Chad Versace	38848b6217	egl: Update to revision 24567 of eglext.h This pulls in EGL_EXT_platform_base, EGL_EXT_platform_wayland, EGL_EXT_platform_x11, and EGL_MESA_platform_gbm. This patch has a lot of churn because Khronos recently changed its method of generating headers. Khronos now generates its headers from XML. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-03-17 15:36:03 -07:00
Michel Dänzer	7e0396dd73	winsys/radeon: Store GPU virtual memory addresses of BOs in a hash table This allows retrieving the existing BO and incrementing its reference count, instead of creating a separate winsys representation for it, when the kernel reports that the BO was already assigned a virtual memory address. This fixes problems with XWayland using radeonsi and the xf86-video-wlglamor driver, which calls GEM flink outside of the radeon winsys code and creates BOs from the flinked names using the same DRM file descriptor. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-17 11:53:59 +09:00
Chia-I Wu	361902ec04	targets/dri-ilo: make the driver installable install-gallium-links.mk fails to create the compat link for ilo_dri.so because it looks for dri_LTLIBRARIES instead of noinst_LTLIBRARIES. Fix this by switching to dri_LTLIBRARIES (and make the driver installable). Since pci_id_driver_map.h and the DDX both tell libGL.so to look for "i965", ilo_dri.so will never be loaded even when enabled and installed. The change should not create any more confusion. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-03-16 13:26:22 +08:00
Marek Olšák	2e361160ff	mesa: mark GL_RGB9_E5 as not color-renderable The GL 4.4 spec says it's not color-renderable and not accepted by RenderBufferStorage. The EXT_texture_shared_exponent spec says it's not color-renderable but it's accepted by RenderBufferStorageEXT. This seems to be a bug in the extension spec. Let's do what GL 4.4 says. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-15 18:39:50 +01:00
Aaron Watry	ec1ada7327	radeonsi/compute: Fix memory leak Free shader buffer object for all kernels when deleting compute state. Signed-off-by: Aaron Watry <awatry@gmail.com>	2014-03-15 11:59:19 -05:00
Marek Olšák	8199d149ed	st/mesa: remove _NEW_POLYGON dependency from vertex shader We can just check the polygon mode when updating the edge flag state. Also, we can just flag ST_NEW_VERTEX_PROGRAM directly, which makes ST_NEW_EDGEFLAGS_DATA useless.	2014-03-15 17:47:36 +01:00
Marek Olšák	4e634c5240	st/mesa: implement zero-stride edge flag by culling primitives This was unimplemented.	2014-03-15 17:47:36 +01:00
Marek Olšák	3d42696d10	st/mesa: fix per-vertex edge flags and GLSL support (v2) This fixes piglit/gl-2.0-edgeflag. v2: use StrideB to recognize per-vertex edge flags Cc: mesa-stable@lists.freedesktop.org	2014-03-15 17:47:35 +01:00
Kenneth Graunke	7554539d7e	i965/fs: Invalidate live intervals when demoting uniforms to pull params. Normally, nothing uses live intervals at this point, so this isn't necessary. However, dump_instructions() calculates them and uses them to show register pressure. So, calling dump_instructions() in this area of the code would segfault due to the arrays being the wrong size. This is not a candidate for stable branches because it only serves to fix internal debugging code that you manually have to invoke by altering the source code or using gdb. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-14 13:18:46 -07:00
Kenneth Graunke	13782dcf9d	i965/fs: Print "+reladdr" on variably-indexed uniform arrays. Previously, dump_instruction() would print output such as: { 2} 3: mov vgrf1:F, u0:F { 3} 4: mov vgrf7:F, u0:F { 4} 5: mov vgrf8:F, u0:F which looked like either a scalar access or perhaps a constant-indexed access of element 0, when it was really a variable index. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-14 13:17:57 -07:00
Kenneth Graunke	01d9023a9b	i965: Fix register types in dump_instructions(), again. In commit `e57d77280e`, I fixed this for destinations in the Vec4 backend, and sources in the scalar backend. But not both types in both backends. To prevent this mess from continuing, make the reg_encoding table static, so only the disassembler can use it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-14 13:17:57 -07:00
Kenneth Graunke	4d2e79269a	i965/fs: Fix register comparisons in saturate propagation. opt_saturate_propagation_local compares scan_inst->dst.reg/reg_offset with inst->src[0].reg/reg_offset, and ensures that scan_inst->dst.file is GRF. But nothing ensured that inst->src[0].file was GRF. In the following program, this resulted in u1:F matching vgrf1:UW, and a saturate being incorrectly propagated from instruction 8 to instruction 1. { 1} 0: add vgrf0:UW, hw_reg1+8:UW, hw_reg0:V { 1} 1: add vgrf1:UW, hw_reg1+10:UW, hw_reg0:V { 1} 2: linterp vgrf6:F, hw_reg2:F, hw_reg3:F, hw_reg0:F { 2} 3: linterp vgrf27:F, hw_reg2:F, hw_reg3:F, hw_reg0+16:F { 4} 4: mov vgrf10+0.0:F, vgrf6:F { 3} 5: mov vgrf10+1.0:F, vgrf27:F { 6} 6: tex vgrf8+0.0:F, vgrf10+0.0:F { 5} 7: mov vgrf32:F, u1:F { 5} 8: mov.sat vgrf12:F, u1:F From shader-db: total instructions in shared programs: 1841932 -> 1841957 (0.00%) instructions in affected programs: 5823 -> 5848 (0.43%) I inspected two of the 25 hurt shaders, and concluded that they were both hitting this bug, and not legitimately optimized. This fixes bugs in Left 4 Dead 2 and Team Fortress 2, possibly among others. The optimization pass didn't exist in 10.0, so this is only a candidate for 10.1. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-14 13:17:57 -07:00
Eric Anholt	2dbebbd37d	glsl: Improve debug output and variable names for opt_dead_code_local. I know this code has confused others, and it confused me 3 years later, too. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-14 13:02:05 -07:00
Eric Anholt	2f879356b5	i965: Add support for GL_ARB_buffer_storage. It turns out we can allow COHERENT storage/mappings all the time, regardless of LLC vs non-LLC. It just means never using temporary mappings to avoid GPU stalls, and on non-LLC we have to use the GTT instead of CPU mappings. If we were to use CPU maps on non-LLC (which might be useful if apps end up using buffer_storage on PBO reads, to avoid WC read slowness), those would be PERSISTENT but not COHERENT, but doing that would require us driving the clflushes from userspace somehow. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-14 12:56:22 -07:00
Eric Anholt	1990da2568	i965: Always use CPU mappings for BOs on LLC platforms. It looks like there's no big difference for write-only workloads, but using a CPU map means that if they happen to read without having set the MAP_READ_BIT, they get 100x the performance for those reads. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-14 12:56:22 -07:00
Eric Anholt	bb63df0c2d	i965: Drop the system-memory temporary allocations for flush explicit. While in expected usage patterns nobody will ever hit this path, doubling our bandwidth used seems like a waste, and it cost us extra code too. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-14 12:56:22 -07:00
Eric Anholt	ea93246c00	i965: Switch mapping modes for non-explicit-flush blit-temporary maps. On LLC, it should always be better to use a cached mapping than the GTT. On non-LLC, it seems pretty silly to try to optimize read performance for the INVALIDATE_RANGE_BIT case. This will make the buffer_storage logic easier. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-14 12:56:21 -07:00
Jeff Muizelaar	ff1e850eec	gallivm: optimize repeat linear npot code in the aos int path Similar to the other cases, shift some weight/coord calculations to int space. This should be slightly faster (on x86 sse it should actually save one instruction, and generally int instructions are cheaper).	2014-03-14 19:41:18 +01:00
Roland Scheidegger	9954f01497	gallivm: use correct rounding for nearest wrap mode (in the aos int path) The previous code used coords which were calculated as (int) (f_coord * tex_size * 256) >> 8. This is not only unnecessarily complex but can give the wrong texel due to rounding for negative coords (as an example, after denormalization coords from -1.0 to 0.0 should give -1, but this will give -1 for numbers from -1.0-1/256 to 0.0-1/256). Instead, just use ifloor, dropping the shift stuff. Unfortunately, this will most likely be slower - with arch rounding available it shouldn't be too bad (trades an int shift for a round but also saves an int mul, which is shared by all coords), but otherwise it's a mess.	2014-03-14 19:41:18 +01:00
Jeff Muizelaar	88637e5764	gallivm: use correct rounding for linear wrap mode (in the aos int path) The previous method for converting coords to ints was slightly inaccurate (effectively losing 1bit from the 8bit lerp weight). This is probably especially noticeable when trying to draw a pixel-aligned texture. As an example, for a 100x100 texture after denormalization the texture coords in this case would turn up as 0.5, 1.5, 2.5, 3.5, 4.5, ... After the mul by 256, conversion to int and 128 subtraction, they end up as 0, 256, 512, 768, 1024, ... which gets us the correct coords/weights of 0/0, 1/0, 2/0, 3/0, 4/0, ... But even LSB errors (which are unavoidable) in the input coords may cause these coords/weights to be wrong, e.g. for a coord of 3.49999 we'd get a coord/weight of 2/255 instead. Fix this by using round-to-nearest int instead of FPToSi (trunc). Should be equally fast on x86 sse though other archs probably suffer a little.	2014-03-14 19:41:18 +01:00
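The rounding problem in the nearest wrap mode commit above can be reproduced in scalar C. This is only a sketch under assumptions: gallivm emits LLVM IR for vectors, whereas the two hypothetical helpers below take a single already-denormalized coordinate, and the floor is written by hand to avoid depending on libm.

```c
/* Old approach as described: scale by 256, truncate toward zero, shift right by 8.
 * Note that >> on a negative int is implementation-defined in C. */
static int nearest_texel_shift(float coord)
{
    int fixed = (int)(coord * 256.0f);
    return fixed >> 8;
}

/* New approach as described: a plain floor of the coordinate. */
static int nearest_texel_floor(float coord)
{
    int i = (int)coord;                 /* truncates toward zero */
    return (coord < (float)i) ? i - 1 : i;
}
```

For a coordinate of -0.001 the truncating cast yields 0 before the shift, so the old path selects texel 0, while the floor selects texel -1 as the commit message expects for coords between -1.0 and 0.0.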
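The 3.49999 example from the linear wrap mode commit above can be worked through in scalar C. This is a sketch under assumptions, not the gallivm code: the struct and function names are hypothetical, and rounding is done as `(int)(x + 0.5f)`, which is only valid for the non-negative coordinates used here.

```c
/* Integer texel index plus 8-bit lerp weight. */
struct texel_weight {
    int texel;
    int weight;
};

/* Subtract the half-texel offset (128 in 8.8 fixed point), then split. */
static struct texel_weight split_fixed(int fixed)
{
    struct texel_weight tw;
    fixed -= 128;
    tw.texel = fixed >> 8;      /* assumes fixed >= 0 after the subtraction */
    tw.weight = fixed & 0xff;
    return tw;
}

/* Old behaviour: float-to-int conversion truncates. */
static struct texel_weight linear_coord_trunc(float coord)
{
    return split_fixed((int)(coord * 256.0f));
}

/* New behaviour: round to nearest (non-negative inputs only in this sketch). */
static struct texel_weight linear_coord_round(float coord)
{
    return split_fixed((int)(coord * 256.0f + 0.5f));
}
```

With truncation, 3.49999 * 256 = 895.997 becomes 895, giving 767 after the subtraction and therefore texel 2 with weight 255. With rounding it becomes 896, giving 768 and therefore texel 3 with weight 0.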
Brian Paul	6757ec3f8e	glapi: restore _glthread_GetID() function This partially reverts patch `02cb04c68f`. This fixes an unresolved symbol error when using older builds of libGL. Tested-by: Chia-I Wu <olv@lunarg.com>	2014-03-14 12:12:07 -06:00
Niels Ole Salscheider	f9901f1ab2	radeonsi: flush the dma ring in si_flush_from_st Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-03-14 15:01:14 +01:00
Niels Ole Salscheider	087b0ff1c1	radeon: Move DMA ring creation to common code Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-03-14 15:01:14 +01:00
Emil Velikov	a9cf3aa208	mesa: return v.value_int64 when the requested type is TYPE_INT64 Fixes "Operands don't affect result" defect reported by Coverity. Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-14 13:01:47 +00:00
Emil Velikov	f064bcdfbf	nvc0: minor cleanups in stream output handling Constify the offsets parameter to silence gcc warning 'assignment from incompatible pointer type' due to function prototype mismatch. Use a boolean changed as a shorthand for target != current_target. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-14 13:00:01 +00:00
Emil Velikov	ad4a44ebfc	nouveau: honor fread return value in the nouveau_compiler There is little point of continuing if fread returns zero, as it indicates that either the file is empty or cannot be read from. Bail out if fread returns zero after closing the file. Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-14 13:00:01 +00:00
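The pattern described in the fread commit above (read, close the file, then bail out on a zero-length result) might look like the following. The function name and signature are hypothetical and are not taken from nouveau_compiler.

```c
#include <stdio.h>

/* Reads up to cap bytes from path into buf.
 * Returns the number of bytes read; 0 means the file is missing,
 * empty or unreadable, and the caller should stop. */
static size_t read_input_file(const char *path, char *buf, size_t cap)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return 0;

    size_t len = fread(buf, 1, cap, f);
    fclose(f);          /* close first, then let the caller decide to bail out */
    return len;
}
```

A caller would check the return value and exit early when it is zero, rather than handing an empty buffer to the compiler.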
Emil Velikov	ae7d236172	nouveau: typecast the prime_fd handle when calling nouveau_bo_set_prime Core drm defines that the handle is of type int, while all drivers treat it as uint internally. Typecast the value to silence gcc warning messages and be consistent amongst all drivers. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-03-14 13:00:01 +00:00
Emil Velikov	c26b488088	nv50: add missing brackets when handling the samplers array Commit 3805a864b1d (nv50: assert before trying to out-of-bounds access samplers) introduced a series of asserts as a precaution against a previous illegal memory access. However, it failed to put brackets around the loop body within nv50_sampler_state_delete, effectively failing to clear the sampler state, as well as exacerbating the illegal memory access issue. Fixes gcc warning "array subscript is above array bounds" and "Nesting level does not match indentation" and "Out-of-bounds read" defects reported by Coverity. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-14 13:00:01 +00:00
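The class of bug described in the nv50 commit above can be illustrated with a reduced loop. The struct, the array size and the function name are hypothetical; only the shape of the loop follows the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_SAMPLERS 16

/* Hypothetical per-stage sampler bookkeeping. */
struct sketch_stage {
    unsigned num_samplers;
    void *samplers[SKETCH_MAX_SAMPLERS];
};

/* Clears every slot that still points at the sampler being deleted. */
static void sketch_unbind_sampler(struct sketch_stage *st, const void *hwcso)
{
    for (unsigned i = 0; i < st->num_samplers; ++i) {
        /* Without the braces only this assert would be the loop body,
         * and the check below would run once with i == num_samplers. */
        assert(i < SKETCH_MAX_SAMPLERS);
        if (st->samplers[i] == hwcso)
            st->samplers[i] = NULL;
    }
}
```

Dropping the braces leaves the indentation unchanged but moves the `if` out of the loop, which matches both the "Nesting level does not match indentation" and the out-of-bounds read reports mentioned in the message.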
Anuj Phogat	4d0e30accd	i965: Fix build warning of unused variable Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-14 02:57:00 -07:00
Adel Gadllah	a69fabc76c	dri3: Add GLX_EXT_buffer_age support v2: Indent according to Mesa style, reuse sbc instead of making a new swap_count field, and actually get a usable back before returning the age of the back (fixing updated piglit tests). Changes by anholt. Signed-off-by: Adel Gadllah <adel.gadllah@gmail.com> Reviewed-by: Robert Bragg <robert@sixbynine.org> (v1) Reviewed-by: Adel Gadllah <adel.gadllah@gmail.com> (v2) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-13 14:19:21 -07:00
Eric Anholt	0b02d8a633	dri3: Prefer the last chosen back when finding a new one. With the buffer_age code, I need to be able to potentially call this more than once per frame, and it would be bad if a new special event showing up meant I chose a different back mid-frame. Now, once we've chosen a back for the frame, another find_back will choose it again since we know that it won't have ->busy set until swap. Note that this makes find_back return a buffer id instead of a backbuffer index. That's kind of a silly distinction anyway, since it's an identity mapping between the two (it's the front buffer that is at an offset). Reviewed-By: Adel Gadllah <adel.gadllah@gmail.com>	2014-03-13 14:19:16 -07:00
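The selection policy from the dri3 commit above (start the search at the back buffer chosen last time, so repeated calls within a frame agree) could be sketched as follows. The buffer count, struct and function name are assumptions for illustration and do not mirror the dri3 loader code.

```c
#define SKETCH_NUM_BACK 3

/* Hypothetical back buffer state; busy stays clear until the swap. */
struct sketch_buffer {
    int busy;
};

/* Returns the id of a free back buffer, preferring cur_back, or -1 if all are busy. */
static int sketch_find_back(const struct sketch_buffer *bufs, int cur_back)
{
    for (int i = 0; i < SKETCH_NUM_BACK; i++) {
        int id = (cur_back + i) % SKETCH_NUM_BACK;
        if (!bufs[id].busy)
            return id;
    }
    return -1;
}
```

Because the chosen buffer is not marked busy until the swap, a second call in the same frame starts at the same id and returns it again.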
Neil Roberts	551d459af4	Add the EGL_MESA_configless_context extension This extension provides a way for an application to render to multiple surfaces with different buffer formats without having to use multiple contexts. An EGLContext can be created without an EGLConfig by passing EGL_NO_CONFIG_MESA. In that case there are no restrictions on the surfaces that can be used with the context apart from that they must be using the same EGLDisplay. _mesa_initialize_context can now take a NULL gl_config which will mark the context as ‘configless’. It will memset the visual to zero in that case. Previously the i965 and i915 drivers were explicitly creating a zeroed visual whenever 0 is passed for the EGLConfig. Mesa needs to be aware that the context is configless because it affects the initial value to use for glDrawBuffer. The first time the context is bound it will set the initial value for configless contexts depending on whether the framebuffer used is double-buffered. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-03-12 14:40:47 -07:00
Neil Roberts	4b17dff3e5	eglCreateContext: Remove the check for whether config == 0 In eglCreateContext there is a check for whether the config parameter is zero and in this case it will avoid reporting an error if the EGL_KHR_surfaceless_context extension is supported. However there is nothing in that extension which says you can create a context without a config and Mesa breaks if you try this so it is probably better to leave it reporting an error. The original check was added in `b90a3e7d8b` based on the API-specific extensions EGL_KHR_surfaceless_opengl/gles1/gles2. This was later changed to refer to EGL_KHR_surfaceless_context in `b50703aea5`. Perhaps the original extensions specified a configless context but the new one does not. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-03-12 14:40:47 -07:00
Neil Roberts	4954518125	Fix the initial value of glDrawBuffers for GLES Under GLES 3 it is not valid to pass GL_FRONT to glDrawBuffers. Instead, GL_BACK has a magic interpretation which means it will render to the front buffer on single-buffered contexts and the back buffer on double-buffered. We were incorrectly setting the initial value to GL_FRONT for single-buffered contexts. This probably doesn't really matter at the moment except that presumably it would be exposed in the API via glGetIntegerv. When we switch to configless contexts this is more important because in that case we always want to rely on the magic interpretation of GL_BACK in order to automatically switch between the front and back buffer when a new surface with a different number of buffers is bound. We also do this for GLES 1 and 2 because the internal value doesn't matter in that case and it is convenient to use the same code to have the magic interpretation of GL_BACK. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-03-12 14:40:47 -07:00
Neil Roberts	0c58c96e54	Use the magic behaviour of GL_BACK in GLES 1 and 2 as well as 3 In GLES 3 it is not possible to select rendering to the front buffer and instead selecting GL_BACK has the magic interpretation that it is either the front buffer on single-buffered configs or the back buffer on double-buffered. GLES 1 and 2 have no way of selecting the draw buffer at all. In that case we were initialising the draw buffer to either GL_FRONT or GL_BACK depending on the context's config and then leaving it at that. When we switch to having configless contexts we ideally want Mesa to automatically switch between the front and back buffer whenever a double- or single-buffered surface is bound. To make this happen we can just allow the magic behaviour from GLES 3 in GLES 1 and 2 as well. It shouldn't matter what the internal value of the draw buffer is in GLES 1 and 2 because there is no way to query it from the external API. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-03-12 14:40:46 -07:00
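The "magic" interpretation of GL_BACK described in the two commits above amounts to resolving the target at bind time from the surface's buffering. A minimal sketch, with a hypothetical enum and function name rather than Mesa's own buffer indices:

```c
/* Hypothetical physical render targets. */
enum sketch_target { SKETCH_TARGET_FRONT, SKETCH_TARGET_BACK };

/* Resolves a draw buffer of GL_BACK against the currently bound surface:
 * the back buffer if the surface has one, the front buffer otherwise. */
static enum sketch_target sketch_resolve_gl_back(int double_buffered)
{
    return double_buffered ? SKETCH_TARGET_BACK : SKETCH_TARGET_FRONT;
}
```

Since the stored draw buffer value stays GL_BACK, binding a surface with a different number of buffers changes the resolved target without any state update by the application.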
Ian Romanick	87c66a4ff7	glsl: Fix typo Remove extra "any" and re-word-wrap the comment. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-12 11:16:50 -07:00
Ian Romanick	6bdc1d96c3	glsl: Rewrite unrolled link_invalidate_variable_locations calls as a loop Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-12 11:16:50 -07:00
Carl Worth	7b8acb9026	docs: Import 10.0.4 release notes, add news item.	2014-03-12 10:22:22 -07:00
Mike Stroyan	6e627b49f9	mesa: Release gl_debug_state when destroying context. Commit `6e8d04a` caused a leak by allocating ctx->Debug but never freeing it. Release the memory in _mesa_free_errors_data when destroying a context. Use FREE to match CALLOC_STRUCT from _mesa_get_debug_state. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-12 09:43:05 -06:00
Niels Ole Salscheider	2c886eba78	r600g: compute memory pool size is given in dw Multiply the dw value by 4 in order to map the complete buffer. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>	2014-03-11 19:00:08 -07:00
Eric Anholt	d3eb709ded	meta: Always restore the framebuffers and current renderbuffer. The few paths that were playing with framebuffers and renderbuffer were saving and restoring them. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-11 12:47:46 -07:00
Eric Anholt	feb3d8dacd	i965: Drop intel_check_front_buffer_rendering(). This was being applied in a subset of the places that intel_prepare_render() was called, to set the same flag that intel_prepare_render() was setting. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-11 12:47:44 -07:00
Eric Anholt	ec542d7457	i965: Drop broken front_buffer_reading/drawing optimization. The flag wasn't getting updated correctly when the ctx->DrawBuffer or ctx->ReadBuffer changed. It usually ended up working out because most apps only have one window system framebuffer, or if they have more than one and they have any front read/drawing, they will have called glReadBuffer()/glDrawBuffer() on it when they get started on the new buffer. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-11 12:47:41 -07:00
Eric Anholt	66073ef438	intel: When checking for updating front buffer reading, use the right fb. It's the ctx->ReadBuffer that gets read from, not the ctx->DrawBuffer. So, if you happened to have a ctx->ReadBuffer that was the winsys buffer, and it had previously been intel_prepare_render()ed but not invalidated since then, and you called glReadBuffer() to switch to front buffer instead of back buffer reading on the winsys fbo while your drawbuffer was a user FBO, you'd never get the front buffer's miptree fetched, and segfault. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-11 12:46:59 -07:00
Marek Olšák	e1a9a54464	r600g,radeonsi: attempt to fix racy multi-context apps calling BufferData Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75061 v2: minimize the window where cs_buf != new_buf	2014-03-11 19:18:02 +01:00
Marek Olšák	74d95adea0	r600g,radeonsi: fix broken buffer download Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 19:18:02 +01:00
Marek Olšák	4ca3486b19	r600g,radeonsi: use a fallback in dma_copy instead of failing v2: - allow byte-aligned DMA buffer copies on Evergreen - fix piglit/texsubimage regression - use the fallback for 3D copies (depth > 1) as well	2014-03-11 19:18:02 +01:00
Marek Olšák	de5094d102	radeonsi: small cleanup in get_param Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	e219842282	radeonsi: set correct alignment for texture buffers and constant buffers I think these are all equivalent to vertex buffer fetches which should be dword-aligned. Scalar loads are also dword-aligned. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	f549129564	r600g, radeonsi: fix primitives-generated query with disabled streamout Buffers are disabled by VGT_STRMOUT_BUFFER_CONFIG, but the query only works if VGT_STRMOUT_CONFIG.STREAMOUT_0_EN is enabled. This moves VGT_STRMOUT_CONFIG to its own state. The register is set to 1 if either streamout or the primitives-generated query is enabled. However, the primitives-emitted query is also incremented, so it's disabled by setting VGT_STRMOUT_BUFFER_SIZE to 0 when there is no buffer bound. This fixes piglit: ARB_transform_feedback2/counting with pause EXT_transform_feedback/primgen-query transform-feedback-disabled Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	958ef47a6d	r600g,radeonsi: don't add streamout.num_dw_for_end twice It's already added in need_cs_space. Also don't calculate anything if there are no buffers. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	4f1f32306a	r600g,radeonsi: fix MAX_TEXTURE_3D_LEVELS and MAX_TEXTURE_ARRAY_LAYERS limits CB_COLORi_VIEW.SLICE_MAX can be at most 2047. This fixes the maxlayers piglit test. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	8bd7a6f48c	st/dri: flush drawable textures before unreferencing This fixes piglit/fbo-sys-blit with fast clear on radeonsi. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	a38e1fd78b	radeonsi: implement fast color clear This works for both multi-sample and single-sample color buffers. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	28eb0bcf19	r600g: move fast color clear code to a common place Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	d3c1be530a	r600g,radeonsi: move CMASK register values from r600_surface to r600_texture When doing fast clear for single-sample color buffers for the first time, a CMASK buffer has to be allocated and the CMASK state in all pipe_surfaces referencing the color buffer must be updated. Updating all surfaces is kinda silly, so let's move the values to r600_texture instead. This is only for Evergreen and later. R600-R700 don't have fast clear. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	61a2fac199	radeonsi: convert the framebuffer state to atom-based This looks like r600g. The shared Cayman MSAA code is used here. The real motivation for this is that I need the ability to change values of color registers after the framebuffer state is set. The PM4 state cannot be modified easily after it's generated. With this, I can just change r600_surface::cb_color_xxx and set framebuffer.atom.dirty=true and it's done. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	946d1cfe39	r600g: move cayman MSAA setup to a common place I will use this in radeonsi. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	6a5499b9d9	radeonsi: move framebuffer-related state to a new struct si_framebuffer Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-11 18:51:20 +01:00
Marek Olšák	bee2b96b02	r600g,radeonsi: set priorities for relocations	2014-03-11 18:51:19 +01:00
Marek Olšák	3edb3b86b2	r300g,uvd,vce: set priorities for relocations This updates all occurrences of cs_add_reloc.	2014-03-11 18:51:19 +01:00
Marek Olšák	db1a7f78c2	winsys/radeon: add interface for setting a priority number for each relocation The cs_add_reloc change is commented out not to break compilation. The highest priority of all cs_add_reloc calls is sent to the kernel.	2014-03-11 18:51:19 +01:00
Jonathan Gray	0d6f573f6e	glsl: Link glsl_compiler with pthreads library. Fixes the following build error on OpenBSD: ./.libs/libglsl.a(builtin_functions.o)(.text+0x973): In function `mtx_lock': ../../include/c11/threads_posix.h:195: undefined reference to `pthread_mutex_lock' ./.libs/libglsl.a(builtin_functions.o)(.text+0x9a5): In function `mtx_unlock': ../../include/c11/threads_posix.h:248: undefined reference to `pthread_mutex_unlock' Signed-off-by: Jonathan Gray <jsg@jsg.id.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-11 08:47:12 -06:00
Jonathan Gray	40214267ab	gallium: add endian detection for OpenBSD Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-11 08:47:12 -06:00
Emil Velikov	a6efbac9fb	automake: allow only shared builds Static and shared builds were possible in the good old days of static makefiles. Currently the build system does not distinguish nor does anything special when one requests a static build. Print a warning message for the packager that static builds are not supported and continue building shared libs. Currently only Debian and derivatives use static build, and they use it for building a Xlib powered libGL. This patch will only change the warning message they are seeing but the binaries produced will be identical. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:44 +00:00
Emil Velikov	065b6ca52b	configure: update enable-llvm-shared-libs comments - As of commit cb080a10b68(configure.ac: Don't require shared LLVM when building OpenCL) opencl does not mandate using shared llvm. - Add a warning message that building with static llvm may cause compilation problems. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-03-11 12:50:44 +00:00
Emil Velikov	e267e4318c	st/dri: build the drm backend when libdrm is present Prevent build issues on systems lacking libdrm. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:44 +00:00
Emil Velikov	f41a65397b	glx: cleanup unneeded headers - xf86dri.h is the old dri1 header, not required by dri2 nor dri3 - fold xf86drm.h inclusion inside dri2.h - dri3_glx does not have any drm specific dependencies - glapi.h is not required by the dri2 and dri3 codepaths Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-11 12:50:43 +00:00
Jon TURNEY	e5214dd8f1	glx/tests: honor enable-driglx-direct configure flag Recent commit fixed build issues in dri2_query_renderer.c by wrapping in defined(direct_rendering) && !defined(applegl) This patch targets the query_renderer tests, so that make check passes on platforms such as hurd and cygwin. v2: (Emil) - Rebase and update commit message. Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-11 12:50:43 +00:00
Emil Velikov	254aafba3e	configure: read libomxil-bellagio.pc only when it exists Currently configure.ac will print a warning when one is missing the package. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-11 12:50:43 +00:00
Emil Velikov	22c133546a	automake: create compat symlinks only for linux systems The primary users of these are linux developers, although it can be extended for BSD and others if needed. Fixes make install for Cygwin and OpenBSD at least. v2: - Wrap vdpau targets as well. v3: - Fold HAVE_COMPAT_SYMLINKS conditional within installlinks.mk Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63269 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk> (v1) Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-11 12:50:43 +00:00
Emil Velikov	bba9c28215	configure: use LIB_EXT rather than hardcoded .so Some platforms use a different library extension - dll, dylib, a. Honor that when we are creating the required links. Rename LIB_EXTENSION to LIB_EXT while we're here. With libglapi linking aside, building classic drivers on non-linux platforms should be possible now. v2: Resolve conflicts. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:43 +00:00
Emil Velikov	020bc0d0dd	automake: do not use symbols names for static glapi.la In the cases where one links against the static glapi.la there is no need to create temporary variables only to explicitly link against it. Instead use SHARED_GLAPI_LIB to explicitly indicate when one is building and linking with the shared glapi provider. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:43 +00:00
Emil Velikov	3c5599b276	configure: remove old makefile variables All the variables were used before the automake conversion and do not make sense (nor are used) currently. Replace GL_LIB_NAME with lib$(GL_LIB).$(LIB_EXTENSION) for apple-glx. The build has been broken for ages, but this will ease the recovery process as it happens. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:43 +00:00
Emil Velikov	49d7bcea82	gallium/targets: use install-gallium-targets.mk Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:42 +00:00
Emil Velikov	f3595b6748	gallium/targets: drop link generation for non DRI targets All three (xvmc and omx) do not have an alternative loading similar to the dri modules. Thus one needs to explicitly install them in order to use/test them. v2: - Keep vdpau targets, as an equivalent of LIBGL_DRIVERS_PATH is being worked on. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:42 +00:00
Emil Velikov	d8ba951ad6	targets/vdpau: use install-gallium-links.mk Drop the duplication across all vdpau targets. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-11 12:50:42 +00:00
Emil Velikov	ce24bcd394	targets/dri: use install-gallium-links.mk Drop the duplication across all dri targets. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:42 +00:00
Emil Velikov	bbae65e25c	automake: introduce install-gallium-links.mk This helper script will be used to minimise the duplication during link generation across all gallium targets. v2: - Handle vdpau_LTLIBRARIES. Requested by Christian König. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:42 +00:00
Emil Velikov	7b4ccad33d	automake: use install-lib-links.mk across all classic mesa Use the handy script and minimise the boilerplate in the makefiles. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:42 +00:00
Emil Velikov	b496ab0567	automake: make install-lib-links less chatty There is little point in echoing everything that the script does to stdout. Wrap it in AM_V_GEN so that a reasonable message is printed as an indication of its invocation. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:42 +00:00
Emil Velikov	90a4ffdea5	automake: use only the folder name if it's a subfolder of the present one v2: Resolve rebase conflicts. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:41 +00:00
Emil Velikov	b15b1fbb51	automake: silence folder creation There is little gain in printing whenever a folder is created. v2: - Use $(AM_V_at) over @ to have control in verbose builds. Suggested by Erik Faye-Lund. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:41 +00:00
Emil Velikov	c690f8dd9b	automake: use MKDIR_P when possible Use the automake predefined macro over hardcoding mkdir -p everywhere. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2014-03-11 12:50:41 +00:00
Vinson Lee	e6c565fcc5	radeon: Fix build. Fix build error introduced with commit `dfa25ea5cd`. CC r600_streamout.lo r600_streamout.c:108:6: error: conflicting types for 'r600_set_streamout_targets' void r600_set_streamout_targets(struct pipe_context ctx, ^ ./r600_pipe_common.h:413:6: note: previous declaration is here void r600_set_streamout_targets(struct pipe_context ctx, ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76009 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-03-10 22:54:59 -07:00
Zack Rusin	dfa25ea5cd	gallium: allow setting of the internal stream output offset D3D10 allows setting of the internal offset of a buffer, which is in general only incremented via actual stream output writes. By allowing setting of the internal offset draw_auto is capable of rendering from buffers which have not been actually streamed out to. Our interface didn't allow this. This change functionally shouldn't make any difference to OpenGL where instead of an append_bitmask you just get a real array where -1 means append (like in D3D) and 0 means do not append. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-07 12:49:33 -05:00
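The offset convention described in the commit above (-1 means append, any other value sets the internal offset explicitly, with 0 restarting at the beginning) can be sketched as a small helper. The macro and function name are hypothetical and this is not the gallium interface itself.

```c
/* Hypothetical sentinel: a requested offset of -1 (as unsigned) means "append". */
#define SKETCH_SO_APPEND ((unsigned)-1)

/* Picks where the next stream output write starts.
 * requested: per-target offset passed by the state tracker.
 * internal:  offset the buffer reached through earlier stream output writes. */
static unsigned sketch_so_start_offset(unsigned requested, unsigned internal)
{
    return requested == SKETCH_SO_APPEND ? internal : requested;
}
```

Under this reading, OpenGL callers only ever pass the sentinel or 0, while a D3D10-style caller can pass an arbitrary offset so that draw_auto can consume data that was never streamed out.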
Brian Paul	7d5903980e	meta: use non-ARB shader/program create/delete functions The non-ARB versions take GLuint ids, not GLhandleARB. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-10 17:07:05 -06:00
Brian Paul	d96ed5c088	mesa: s/GLhandleARB/GLuint/ for glGetUniform functions The GL specs say the parameter is GLuint, not GLhandleARB. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-10 17:06:57 -06:00
Brian Paul	a19b19fb94	mesa: rename MESA_FORMAT_X8Z24_UNORM -> MESA_FORMAT_X8_UINT_Z24_UNORM To follow the example of MESA_FORMAT_Z24_UNORM_X8_UINT. Reviewed-by: Michel Dänzer <michel@daenzer.net> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-10 16:11:54 -06:00
Brian Paul	9b5fff2dd7	mesa: reorder MESA_FORMAT enums The MESA_FORMAT_x enums in formats.h weren't declared in any sort of reasonable order. Now it should be a little more logical. This also required reordering tables in formats.c and s_texfetch.c Reviewed-by: Michel Dänzer <michel@daenzer.net> Acked-by: Eric Anholt <eric@anholt.net>	2014-03-10 16:11:50 -06:00
Brian Paul	10738727ae	mesa: trim down format.h comments There's no real reason to list all the formats in the comments. Reviewed-by: Michel Dänzer <michel@daenzer.net> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-10 16:11:45 -06:00
Matt Turner	3330dec90c	i965/vec4: Don't fix-up scalar uniforms for 3 src instructions. Removes unnecessary MOV instructions in L4D2, TF2, Dota2, and many other Steam games. total instructions in shared programs: 1668126 -> 1657509 (-0.64%) instructions in affected programs: 242235 -> 231618 (-4.38%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-10 14:13:45 -07:00
Matt Turner	b823d5df0f	i965: Disassemble 3 src instructions' rep_ctrl field. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-10 14:13:45 -07:00
Matt Turner	dafcc1b7c4	i965: Disassemble 3-src operands' widths correctly. <4,1,1> isn't a real thing. We meant <4,4,1>, i.e., each component of the whole register. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-10 14:13:45 -07:00
Eric Anholt	30259856a8	i965: Move binding table update packets to binding table setup time. This keeps us from needing to reemit all the other stage state just because a surface changed. Improves unoptimized glamor x11perf -f8text by 1.10201% +/- 0.489869% (n=296). [v1] v2: - Drop binding table packets from Gen8 unit state as well. - Pass _3DSTATE_BINDING_TABLE_POINTERS_XS to brw_upload_binding_table, cutting even more code. v3: Don't forget to drop them from 3DSTATE_GS (botched refactor in v2). Signed-off-by: Eric Anholt <eric@anholt.net> [v1] Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v1] Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [v2, v3] Reviewed-by: Eric Anholt <eric@anholt.net> [v3]	2014-03-10 13:05:12 -07:00
Kenneth Graunke	db26253a48	i965: Reorganize the code in brw_upload_binding_tables. This makes both the empty and non-empty binding table paths exit through the bottom of the function, which gives us a place to share code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-10 13:05:12 -07:00
Maarten Lankhorst	8c136b53b7	fix vdpau interop when using -Bsymbolic-functions in ldflags Explicitly add radeon_drm_winsys_create and nouveau_drm_screen_create to the dynamic list. This will ensure vdpau interop still works even when the user links with -Bsymbolic-functions in hardened builds. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Tested-by: Rachel Greenham <rachel@strangenoises.org> Reported-by: Peter Frühberger <peter.fruehberger@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-10 17:08:19 +01:00
Chia-I Wu	952fda4d3f	ilo: do not set I915_EXEC_NO_RELOC This reverts most of commit `d80f0c34b7`. Upon a closer reading, having the presumed offsets written is not enough to set the flag. EXEC_OBJECT_NEEDS_GTT and/or EXEC_OBJECT_WRITE of the reloc entries must also be set appropriately.	2014-03-10 19:04:43 +08:00
Chia-I Wu	5ecdd7ba22	ilo: add support for PIPE_QUERY_PIPELINE_STATISTICS	2014-03-10 16:43:53 +08:00
Chia-I Wu	8fc2f0c874	ilo: add ILO_3D_PIPELINE_WRITE_STATISTICS The command writes statistics registers to the specified bo.	2014-03-10 16:43:53 +08:00
Chia-I Wu	d8b2e3c25e	ilo: add some MI commands to GPE We will need MI commands that load/store registers.	2014-03-10 16:43:53 +08:00
Chia-I Wu	0f41f9c63d	ilo: set PIPE_CONTROL_GLOBAL_GTT_WRITE automatically Set the flag automatically in gen6_emit_PIPE_CONTROL(), and set it only for GEN6.	2014-03-10 16:43:53 +08:00
Chia-I Wu	345bf92f13	ilo: print a warning when PPGTT is disabled Despite what the PRMs say, the driver appears to work fine when PPGTT is disabled. But at least print a warning in that case.	2014-03-10 16:42:42 +08:00
Chia-I Wu	747627d045	ilo: require hardware logical context support The code paths have not been tested for a while, and have some known issues.	2014-03-10 16:42:42 +08:00
Chia-I Wu	72956ed374	ilo: protect the decode context with a mutex The decode context is not thread safe.	2014-03-10 16:42:42 +08:00
Chia-I Wu	d80f0c34b7	ilo: set I915_EXEC_NO_RELOC when available The winsys makes it clear that the pipe drivers should write presumed offsets. We can always set I915_EXEC_NO_RELOC when the kernel supports it.	2014-03-10 16:42:42 +08:00
Chia-I Wu	0b462d3ab1	ilo: move ring types to winsys It results in less code despite that i915_drm.h specifies the ring type as part of the execution flags.	2014-03-10 16:42:42 +08:00
Chia-I Wu	42c1ce4c03	ilo: winsys may limit the batch buffer size The maximum batch buffer size is determined at the time of drm_intel_bufmgr_gem_init(). Make sure the pipe driver does not exceed the limit.	2014-03-10 16:42:42 +08:00
Chia-I Wu	a434ac045e	ilo: PIPE_CAP_QUERY_TIMESTAMP may not be supported Reading TIMESTAMP register may fail, depending on both kernel and hardware.	2014-03-10 16:42:42 +08:00
Chia-I Wu	249b1ad984	ilo: rework winsys batch buffer functions Rename intel_winsys_check_aperture_size() to intel_winsys_can_submit_bo(), intel_bo_exec() to intel_winsys_submit_bo(), and intel_winsys_decode_commands() to intel_winsys_decode_bo(). Make a semantic change to ignore intel_context when the ring is not the render ring.	2014-03-10 16:42:42 +08:00
Chia-I Wu	3e324f99d3	ilo: replace bo alloc flags by initial domains The only alloc flag is INTEL_ALLOC_FOR_RENDER, which can as well be expressed by specifying the initial write domain. The change makes it obvious that we failed to set INTEL_ALLOC_FOR_RENDER in several places.	2014-03-10 16:42:42 +08:00
Chia-I Wu	76713ed5d6	ilo: remove intel_bo_get_size() Commit `bfa8d21759` uses it to work around a hardware limitation. But there are other ways to do it without the need for intel_bo_get_size().	2014-03-10 16:42:42 +08:00
Chia-I Wu	790c32ec75	ilo: remove intel_bo_get_virtual() Make the map functions return the pointer directly.	2014-03-10 16:42:42 +08:00
Chia-I Wu	90786613e9	ilo: rework winsys bo reloc functions Rename intel_bo_emit_reloc() to intel_bo_add_reloc(), intel_bo_clear_relocs() to intel_bo_truncate_relocs(), and intel_bo_references() to intel_bo_has_reloc(). Besides, we need intel_bo_get_offset() only to get the presumed offset after adding a reloc entry. Remove the function and make intel_bo_add_reloc() return the presumed offset. While at it, switch to gem_bo->offset64 from gem_bo->offset.	2014-03-10 16:42:42 +08:00
Chia-I Wu	76ed4f75dd	ilo: add a wrapper to cast struct intel_bo It is just drm_intel_bo, but having a wrapper makes the code cleaner.	2014-03-10 16:42:42 +08:00
Chia-I Wu	4491f0a971	ilo: fix DRM_API_HANDLE_TYPE_FD export It can be exported by drm_intel_bo_gem_export_to_prime(). The code is already in winsys, just not enabled.	2014-03-10 16:42:42 +08:00
Chia-I Wu	276348e85a	ilo: improve winsys documentation/comments Document the interface, and add comments as to why some features are enabled and why some checks are made.	2014-03-10 16:42:41 +08:00
Chia-I Wu	f2aabecbb0	ilo: remove intel_winsys_enable_reuse() It should be an (winsys) implementation detail.	2014-03-10 16:42:41 +08:00
Tapani Pälli	56b1be4399	mesa/glsl: introduce a remap table for uniform locations Patch adds a remap table for uniforms that is used to provide a mapping from application specified uniform location to actual location in the UniformStorage. Existing UniformLocationBaseScale usage is removed as table can be used to set sequential values for array uniform elements. This mapping helps to implement GL_ARB_explicit_uniform_location so that uniform locations can be reorganized and handled more easily. v2: small fixes + rename parameters for merge and split functions (Ian) improve documentation, remove old check for location bounds (Eric) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-10 09:46:24 +02:00
Tapani Pälli	aa0d95a08d	mesa: remove _mesa_symbol_table_iterator structure Nothing uses this structure, removal fixes Klocwork error about the possible oom condition in _mesa_symbol_table_iterator_ctor. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-10 09:45:41 +02:00
Michel Dänzer	678cf9618f	radeonsi: Use proper member name for deleting export shader PM4 state Fixes double-free with some piglit tests using geometry shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-10 12:21:50 +09:00
Marek Olšák	9c2a3934c5	r600g: document why texture offset emulation is needed	2014-03-10 00:19:59 +01:00
Ilia Mirkin	897f40f25d	Revert nvc0 part of "nv50: adjust blit_3d handling of ms output textures" The nvc0 bits don't appear to work, and I thought I had removed them from the commit. Oops. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>	2014-03-09 01:38:10 -05:00
Ilia Mirkin	253314d487	nv50: adjust blit_3d handling of ms output textures This fixes some unwanted scaling when the output is multisampled. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>	2014-03-09 01:32:06 -05:00
Ilia Mirkin	507f0230d4	nouveau: fix fence waiting logic in screen destroy nouveau_fence_wait has the expectation that an external entity is holding onto the fence being waited on, not that it is merely held onto by the current pointer. Fixes a use-after-free in nouveau_fence_wait when used on the screen's current fence. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75279 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at> Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>	2014-03-09 01:31:59 -05:00
Ilia Mirkin	5bf90cb521	nouveau: add valid range tracking to nouveau_buffer This logic is borrowed from the radeon code. The transfer logic will only get called for PIPE_BUFFER resources, so it shouldn't be necessary to worry about them becoming render targets. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2014-03-09 01:31:21 -05:00
Julien Cristau	cf1c52575d	gbm: make 'devices' array static It's only used in this one file as far as I can tell, and exporting a symbol named 'devices' from a shared library is a recipe for trouble. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-03-08 20:43:54 +00:00
Emil Velikov	330a3799d0	automake: make clean the correct git_sha1.h.tmp When building out of tree, the file ends up dangling which may result in a binary with the old git sha. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-08 20:40:56 +00:00
Christian König	6a402359fd	radeonsi: fix freeing descriptor buffers That structure member is a pointer, so the loop with the Elements macro only freed up the first entry. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-08 16:08:15 +01:00
Christian König	58d2afa223	radeonsi: fix leaking the bound state on destruction v2 v2: rebased on stale pointer fixes Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-08 16:08:15 +01:00
Christian König	1fa2acba61	radeonsi: avoid stale state pointers Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-08 16:08:15 +01:00
Christian König	1a8c66023b	radeonsi: avoid stale pointers in si_delete_shader_selector Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-08 16:08:15 +01:00
Marek Olšák	c1a06da465	Revert "winsys/radeon: if there's VRAM-only usage, keep it" This reverts commit `67aef6dafa`. It caused GPU hangs. The question is why. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75900	2014-03-08 16:00:25 +01:00
Christian König	a995f564c7	radeon/vce: fix memory leak Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-08 14:43:53 +01:00
Sir Anthony	6e39a8f6ec	glcpp: Do not remove spaces to preserve locations. After preprocessing by glcpp all adjacent spaces were replaced by single one and glsl parser received column-shifted shader source. It negatively affected ast location set up and produced wrong error messages for heavily-spaced shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-08 01:38:32 -08:00
Sir Anthony	da2275cd9b	glsl: Change locations from yylloc to appropriate tokens positions. Reviewed-by: Carl Worth <cworth@cworth.org>	2014-03-08 01:29:00 -08:00
Sir Anthony	5656775cf6	glsl: Add ast_node method to set location range. Reviewed-by: Carl Worth <cworth@cworth.org>	2014-03-08 01:29:00 -08:00
Sir Anthony	654ee41cd3	glsl: Make ast_node location comments more informative. Reviewed-by: Carl Worth <cworth@cworth.org>	2014-03-08 01:29:00 -08:00
Sir Anthony	433d562ac6	glsl: Extend ast location structure to handle end token position. Reviewed-by: Carl Worth <cworth@cworth.org>	2014-03-08 01:29:00 -08:00
Sir Anthony	6984aa4350	glsl: Update lexers in glsl and glcpp to handle end position of token. Reviewed-by: Carl Worth <cworth@cworth.org>	2014-03-08 01:29:00 -08:00
Vinson Lee	98fb8c95c0	scons: Add drivers/common/meta_generate_mipmap.c to src/mesa/SConscript. This patch fixes this SCons build error introduced with commit `70e7905608`. build/linux-x86_64-debug/mesa/libmesa.a(driverfuncs.os): In function `_mesa_init_driver_functions': src/mesa/drivers/common/driverfuncs.c:99: undefined reference to `_mesa_meta_GenerateMipmap' Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-03-07 23:39:29 -08:00
Kenneth Graunke	14ca611258	meta: Support GenerateMipmaps on 1DArray textures. I don't know how many people care about this case, but it's easy enough to do, so we may as well. The tricky part is that for some reason Mesa stores the number of array slices in Height, not Depth. I thought the easiest way to handle that here was to make Height = 1 (the actual height), and srcDepth = srcImage->Height. This requires some munging when calling _mesa_prepare_mipmap_level, so I created a wrapper that sorts it out for us. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:25 -08:00
Kenneth Graunke	158a7440c3	meta: Use srcWidth/Height/Depth rather than srcImage->Width and such. This is equivalent for now, and will differ once we add 1DArray support. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:19 -08:00
Kenneth Graunke	ec23d5197e	meta: Support GenerateMipmaps on 2DArray textures. This is largely a matter of looping over the number of slices/layers, and not minifying depth (presumably that code exists for the unfinished 3D texture support). Normally, I would have made the loop over array slices the outermost loop. I suspect that would make it trickier to support 3D textures someday, though, so I didn't. The advantage is that we would only have one BufferData call per slice, rather than one per miplevel and slice. However, a GenerateMipmaps microbenchmark indicates that either way is basically just as fast. So I'm not sure it's worth bothering. Improves performance in a GenerateMipmaps microbenchmark by nearly 5x. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:17 -08:00
Kenneth Graunke	15b2f69b9c	meta: Add a 'layer' argument to bind_fbo_image(). For array textures and 3D textures, this represents the layer to use. Just pass 0 for now. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:16 -08:00
Kenneth Graunke	be84d53d44	meta: Refactor code for binding a texture image to the FBO. Almost the exact same code appeared twice, and it needs to expand to handle additional texture targets. Refactor it to tidy up the code and avoid duplicating more work in the future. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:14 -08:00
Kenneth Graunke	45ee1b30d7	meta: Use minify() in GenerateMipmaps code. This is what the macro is for. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:13 -08:00
Kenneth Graunke	9afca91984	meta: Drop redundant FBO creation code in GenerateMipmaps. fallback_required() already creates the FBO in order to check whether we can render to the format. So it's guaranteed to exist. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:11 -08:00
Kenneth Graunke	1285bc87ac	meta: Replace GLboolean with bool in fallback_required(). This doesn't interact with the GL API, so we shouldn't use GL types. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:10 -08:00
Kenneth Graunke	092b7edb3f	meta: Make _mesa_meta_check_generate_mipmap_fallback static. This was only ever used in one place; there's no reason for it to be non-static. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:09 -08:00
Kenneth Graunke	70e7905608	meta: Split GenerateMipmap() into its own file. Putting the implementation of each GL function in its own file makes it much easier not to get lost. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:07 -08:00
Kenneth Graunke	3a7f3d843a	meta: De-static setup_texture_coords(). This will be used in multiple files soon. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 22:45:04 -08:00
Timothy Arceri	1308d21fbf	glapi: Add KHR_debug.xml	2014-03-08 15:45:26 +11:00
Timothy Arceri	6c3f5abc2d	mesa: add missing DebugMessageControl types Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-08 15:38:31 +11:00
Timothy Arceri	fb78fa58d2	mesa: make ARB_debug_output functions an alias of KHR_debug Also update dispatch sanity removing ARB_debug_output checks and removing KHR_debug placeholders as the checks have already been added V2: Make sure we exit case statements with conditional breaks rather than just dropping through. Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-08 15:38:31 +11:00
Timothy Arceri	0608d346aa	glapi: move KHR_debug into its own file Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-08 15:31:59 +11:00
Adel Gadllah	b972e55684	glx_pbuffer: Refactor GetDrawableAttribute Move the pdraw != NULL check out so that they don't have to be duplicated. Signed-off-by: Adel Gadllah <adel.gadllah@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 16:59:57 -08:00
Adel Gadllah	6b13cd1f7f	glx: Update glxext.h to revision 25407 Signed-off-by: Adel Gadllah <adel.gadllah@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 16:59:57 -08:00
Tom Stellard	a1b189ac90	radeon: Include radeon_elf_util.c in the list of LLVM_C_FILES v2 This fixes a build breakage caused by `6974eb9076` on build configurations where all the following are true: 1. radeonsi is not being built 2. r600g is being built 3. opencl is disabled 4. --enable-r600-llvm-compiler is not being used 5. libelf is not installed v2: - Add $(RADEON_CFLAGS) to libllvmradeon_la_CFLAGS Tested-by: Brian Paul <brianp@vmware.com>	2014-03-07 18:06:59 -05:00
Brian Paul	9b322d540a	st/mesa: only mark framebuffer as sRGB capable if Mesa supports the format Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-07 15:43:36 -07:00
Tom Stellard	6974eb9076	radeon/llvm: Factor elf parsing code out into its own function Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-07 13:31:52 -05:00
Tom Stellard	1f4a9fc84e	radeon: Rename struct radeon_llvm_binary to radeon_shader_binary v2 And move its definition into r600_pipe_common.h; This struct is a just a container for shader code and has nothing to do with LLVM. v2: - Drop unrelated Makefile change Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-07 13:31:51 -05:00
Marek Olšák	d8fde8ffed	gallium: rename R4A4 and A4R4 formats to match their swizzle Like L4A4. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-07 18:07:05 +01:00
Marek Olšák	780ce576bb	mesa: fix the format of glEdgeFlagPointer Softpipe expects a float in the vertex shader, which is what glEdgeFlag generates. This fixes piglit/gl-2.0-edgeflag. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-07 18:07:05 +01:00
Marek Olšák	472ac0db08	radeonsi: fix blit compressed texture workaround to support 2D arrays We don't have a piglit test for this, but I think it's correct. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-07 18:07:05 +01:00
Marek Olšák	fcdf6fa86c	r600g: fix blitting the last 2 mipmap levels for Evergreen This fixes a lot of compressedteximage piglit tests. R600-R700 don't have this issue. Cc: mesa-stable@lists.freedesktop.org	2014-03-07 18:07:05 +01:00
Marek Olšák	8a08051e2a	r600g: fix texelFetchOffset GLSL functions Cc: mesa-stable@lists.freedesktop.org	2014-03-07 18:07:05 +01:00
Marek Olšák	67aef6dafa	winsys/radeon: if there's VRAM-only usage, keep it	2014-03-07 18:07:05 +01:00
Niels Ole Salscheider	f112ba03bb	radeon: Use upload manager for buffer downloads Using DMA for reads is much faster. Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-03-07 18:07:05 +01:00
Brian Paul	b46e8622f1	glapi: use 'Mesa' in error messages A user would have no idea what "_glthread_" is. This removes the last remaining instance of the _glthread_ string in Mesa. Reviewed-by: Chia-I Wu <olv@lunarg.com>	2014-03-07 09:04:01 -07:00
Brian Paul	6d2dffe8b1	st/mesa: add test_format_conversion() debug function To check that the st_mesa_format_to_pipe_format() and st_pipe_format_to_mesa_format() functions correctly convert all corresponding Mesa/Gallium formats. This found that MESA_FORMAT_YCBCR_REV was missing in st_mesa_format_to_pipe_format(). Fixed that too. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-07 07:31:29 -07:00
Brian Paul	d8f7e3d79e	st/mesa: add MESA_FORMAT_R8G8B8A8_SRGB in st_mesa_format_to_pipe_format() v2: rename patch after rebasing on top of Jose's changes. Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2014-03-07 07:31:18 -07:00
José Fonseca	b3689adf51	mesa/st: Fix PIPE_FORMAT_R8G8B8A8_SRGB -> MESA_FORMAT_ conversion. Copy'n'paste typo introduced in my `1d8e3067fd` commit. This fixes swapped RB channels I was seeing in my test machines. Trivial.	2014-03-07 13:35:24 +00:00
Kusanagi Kouichi	7233d4479e	st/vdpau: Add rotation v2 v2: add static asserts Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-07 09:20:11 +01:00
Kusanagi Kouichi	e7e207658c	vl: Add rotation v3 v2: rotate in gen_rect_verts instead v3: clear rotate in vl_compositor_clear_layers, update calc_drawn_area as well Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Signed-off-by: Christian König <christian.koenig@amd.com>	2014-03-07 09:20:11 +01:00
Christian König	53d1d879d5	st/omx/enc: fix crash on destruction Signed-off-by: Christian König <christian.koenig@amd.com>	2014-03-07 08:55:57 +01:00
Kenneth Graunke	378c6f2246	mesa: Drop unused hash_table::mem_ctx field. It's never used, and it's equivalent to ralloc_parent(ht) if you really need it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-06 20:55:34 -08:00
Michel Dänzer	9ceee5f4be	clover: Fix build against LLVM SVN r203065 or newer llvm/Linker.h was moved to llvm/Linker/Linker.h. Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-03-07 11:12:12 +09:00
Brian Paul	0f0c16b238	mesa: add MESA_FORMAT_R8G8B8A8_SRGB To match PIPE_FORMAT_R8G8B8A8_SRGB. v2: fix component name copy&paste bugs Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-06 18:17:14 -07:00
Matt Turner	8d3f739383	mesa: Wrap SSE4.1 code in #ifdef __SSE4_1__. Because people insist on doing things like explicitly disabling SSE 4.1. Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Tested-by: David Heidelberger <david.heidelberger@ixit.cz> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71547	2014-03-06 15:46:54 -08:00
Eric Anholt	c10896b593	i965: Fix render-to-texture in non-FinishRenderTexture cases. We've had several problems now with FinishRenderTexture not getting called enough, and we're ready to just give up on it ever doing what we need. In particular, an upcoming Steam title had rendering bugs that could be fixed by always_flush_cache=true. Instead of hoping Mesa core can figure out when we need to flush our caches, just track what BOs we've rendered to in a set, and when we render from a BO in that set, emit a flush and clear the set. There's some overhead to keeping this set, but most of that is just hashing the pointer -- it turns out our set never even gets very large, because cache flushes are so common (even on cairo-gl). No statistically significant performance difference in cairo-gl (n=100), despite spending ~.5% CPU in these set operations. v1: (Original patch by Eric Anholt.) v2: (Changes by Ken Graunke.) - Rebase forward from May 7th 2013 -> March 4th 2014. - Drop the FinishRenderTexture hook entirely; after rebasing the patch, the hook was just an empty function. - Move the brw_render_cache_set_clear() call from intel_batchbuffer_emit_flush() to brw_emit_pipe_control_flush(). In theory, this could catch more cases where we've flushed. - Consider stencil as a possible texturing source. v3: (changes by anholt): - Move set_clear() back to emit_mi_flush() -- it means we can drop more forced flushes from the code. In the previous location, it wouldn't have been called when we wanted pre-gen6. - Move the set clear from batch init to reset -- it should be empty at the start of every batch, since the kernel handled any inter-batch flush for us. v4: Drop the debug code in set.c that I accidentally committed. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Dylan Baker <baker.dylan.c@gmail.com> [v2]	2014-03-06 11:35:17 -08:00
Brian Paul	1e25aa4cdb	mesa: fix copy & paste bugs in pack_ubyte_SRGB8() Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-06 11:39:41 -07:00
Brian Paul	9493fc729e	mesa: fix copy & paste bugs in pack_ubyte_SARGB8() Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-06 11:16:15 -07:00
Aaron Watry	fb78152678	gallium/util: Fix memory leak Fix a leaked vertex shader in u_blitter.c Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> CC: "10.1" <mesa-stable@lists.freedesktop.org>	2014-03-06 11:38:26 -06:00
José Fonseca	1d8e3067fd	st/mesa: Add R8G8B8A8_SRGB case to st_pipe_format_to_mesa_format. With the recent SRGB changes all my automated OpenGL llvmpipe tests (piglit, conform, glretrace) start asserting with the backtrace below. I'm hoping this change will fix it. I'm not entirely sure, as this doesn't happen in my development machine (the bug probably depends on the exact X visual). Anyway, it seems the sensible thing to do here. Program terminated with signal 5, Trace/breakpoint trap. #0 _debug_assert_fail (expr=expr@entry=0x7fa324df2ed7 "0", file=file@entry=0x7fa324e3fc30 "src/mesa/state_tracker/st_format.c", line=line@entry=758, function=function@entry=0x7fa324e40160 <__func__.34798> "st_pipe_format_to_mesa_format") at src/gallium/auxiliary/util/u_debug.c:281 No locals. 
#1 0x00007fa3241d22b3 in st_pipe_format_to_mesa_format (format=format@entry=PIPE_FORMAT_R8G8B8A8_SRGB) at src/mesa/state_tracker/st_format.c:758 __func__ = "st_pipe_format_to_mesa_format" #2 0x00007fa3241c8ec5 in st_new_renderbuffer_fb (format=format@entry=PIPE_FORMAT_R8G8B8A8_SRGB, samples=0, sw=<optimised out>) at src/mesa/state_tracker/st_cb_fbo.c:295 strb = 0x19e8420 #3 0x00007fa32409d355 in st_framebuffer_add_renderbuffer (stfb=stfb@entry=0x19e7fa0, idx=<optimised out>) at src/mesa/state_tracker/st_manager.c:314 rb = <optimised out> format = PIPE_FORMAT_R8G8B8A8_SRGB sw = <optimised out> #4 0x00007fa32409e635 in st_framebuffer_create (st=0x19e7fa0, st=0x19e7fa0, stfbi=0x19e7a30) at src/mesa/state_tracker/st_manager.c:458 stfb = 0x19e7fa0 mode = {rgbMode = 1 '\001', floatMode = 0 '\000', colorIndexMode = 0 '\000', doubleBufferMode = 0, stereoMode = 0, haveAccumBuffer = 0 '\000', haveDepthBuffer = 1 '\001', haveStencilBuffer = 1 '\001', redBits = 8, greenBits = 8, blueBits = 8, alphaBits = 8, redMask = 0, greenMask = 0, blueMask = 0, alphaMask = 0, rgbBits = 32, indexBits = 0, accumRedBits = 0, accumGreenBits = 0, accumBlueBits = 0, accumAlphaBits = 0, depthBits = 24, stencilBits = 8, numAuxBuffers = 0, level = 0, visualRating = 0, transparentPixel = 0, transparentRed = 0, transparentGreen = 0, transparentBlue = 0, transparentAlpha = 0, transparentIndex = 0, sampleBuffers = 0, samples = 0, maxPbufferWidth = 0, maxPbufferHeight = 0, maxPbufferPixels = 0, optimalPbufferWidth = 0, optimalPbufferHeight = 0, swapMethod = 0, bindToTextureRgb = 0, bindToTextureRgba = 0, bindToMipmapTexture = 0, bindToTextureTargets = 0, yInverted = 0, sRGBCapable = 1} idx = <optimised out> #5 st_framebuffer_reuse_or_create (st=st@entry=0x19dfce0, fb=<optimised out>, stfbi=stfbi@entry=0x19e7a30) at src/mesa/state_tracker/st_manager.c:728 No locals. 
#6 0x00007fa32409e8cc in st_api_make_current (stapi=<optimised out>, stctxi=0x19dfce0, stdrawi=0x19e7a30, streadi=0x19e7a30) at src/mesa/state_tracker/st_manager.c:747 st = 0x19dfce0 stdraw = 0x640064 stread = 0x1300000006 ret = <optimised out> #7 0x00007fa324074a20 in XMesaMakeCurrent2 (c=c@entry=0x195bb00, drawBuffer=0x19e7e90, readBuffer=0x19e7e90) at src/gallium/state_trackers/glx/xlib/xm_api.c:1194 No locals. #8 0x00007fa3240783c8 in glXMakeContextCurrent (dpy=0x194e900, draw=8388610, read=8388610, ctx=0x195bac0) at src/gallium/state_trackers/glx/xlib/glx_api.c:1177 drawBuffer = <optimised out> readBuffer = <optimised out> xmctx = 0x195bb00 glxCtx = 0x195bac0 firsttime = 0 '\000' no_rast = 0 '\000' #9 0x00007fa32407852f in glXMakeCurrent (dpy=<optimised out>, drawable=<optimised out>, ctx=<optimised out>) at src/gallium/state_trackers/glx/xlib/glx_api.c:1211 No locals. Acked-by: Brian Paul <brianp@vmware.com>	2014-03-06 17:23:17 +00:00
Brian Paul	84094a273e	glapi: remove u_mutex wrapper code, use c99 thread mutexes directly v2: fix initializer mistake spotted by Chia-I Wu. Reviewed-by: Chia-I Wu <olv@lunarg.com>	2014-03-06 07:53:06 -07:00
Brian Paul	846a7e8630	glapi: rename u_current dispatch table functions Put "table" in the names to make things more understandable. Reviewed-by: Chia-I Wu <olv@lunarg.com>	2014-03-06 07:47:12 -07:00
Brian Paul	280e065707	glapi: replace 'user' with 'context' in u_current.[ch] code To make the functions more understandable. Reviewed-by: Chia-I Wu <olv@lunarg.com>	2014-03-06 07:47:05 -07:00
Brian Paul	ef8a19ed4f	glsl: fix compiler warnings in link_uniforms.cpp With a non-debug build, gcc has two complaints: 1. 'found' var not used. Silence with '(void) found;' 2. 'id' not initialized. It's assigned by the UniformHash->get() call, actually. But init it to zero to silence gcc. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-06 07:45:36 -07:00
Ilia Mirkin	3649800009	mesa/st: only compare the one scissor sizeof(scissor) returns the size of the full array rather than a single element. Fix it to consider just the one element. Fixes: `0705fa35` ("st/mesa: add support for GL_ARB_viewport_array (v0.2)") Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Dave Airlie <airlied@redhat.com>	2014-03-05 22:51:58 -05:00
Chia-I Wu	4c68c6dcff	st/mesa: make winsys fbo sRGB-capable when supported The texture formats of winsys fbo are always linear because the st manager (st/dri for example) could not know the colorspace used. But it does not mean that we cannot make the fbo sRGB-capable. By - setting rb->Visual.sRGBCapable to GL_TRUE when the pipe driver supports the format in sRGB colorspace, - giving rb an sRGB internal format, and - updating code to check rb->Format instead of strb->texture->format, we should be good. Fixed bug 75226 for at least llvmpipe and ilo, with no piglit regression. v2: do not set rb->Visual.sRGBCapable for GLES contexts to avoid surprises Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75226 Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2014-03-06 10:59:25 +08:00
Chia-I Wu	6d23ca1621	st/mesa: add mappings for MESA_FORMAT_B8G8R8X8_SRGB The format is mapped to PIPE_FORMAT_B8G8R8X8_SRGB. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-06 10:59:25 +08:00
Chia-I Wu	5a27491a76	mesa: add MESA_FORMAT_B8G8R8X8_SRGB The format is needed to represent an RGB-only winsys framebuffer that is sRGB-capable. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-06 10:59:25 +08:00
Brian Paul	48a9094b69	mesa: fix packing/unpacking for MESA_FORMAT_A4R4G4B4_UNORM Spotted by Chia-I Wu. v2: also fix unpack_ubyte_ARGB4444_REV() Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Chia-I Wu <olv@lunarg.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-03-05 16:06:54 -07:00
Eric Anholt	171ec9585f	i965: Fix predicated-send-based discards with MRT. We need the header setup to not be predicated on which pixels are undiscarded. I'm not sure originally if I had thought that the mask disable implied predicate disable, or if I had just misread the mask disable as predicate disable. Either way, I know I had spent more time thinking about this in the gen8 generator than the gen7 generator. Plus, it turns out that I had mis-implemented the "the GPU will use the predicate unless this header is present" comment, by skipping setting up the pixel mask when the header was present. Fixes GPU hangs in piglit glsl-fs-discard-mrt, Trine, Trine 2 and presumably MLL. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75207 Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-05 13:37:33 -08:00
Eric Anholt	9856d658ce	configure: Fix bashism. /bin/sh defaults to dash on debian. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-05 13:37:33 -08:00
Andreas Boll	c1958911f1	docs: update 10.2 release notes	2014-03-05 22:20:48 +01:00
Brian Paul	02cb04c68f	mesa: remove remaining uses of _glthread_GetID() It was really only used in the radeon driver for a debug printf. And evidently, libGL.so referenced it just to work around some sort of linker issue. This patch removes the two calls to the function and the function itself. Fixes undefined _glthread_GetID symbol in libGL reported by 'nm'. Though, the missing symbol doesn't cause any issues on my system but it does cause glxinfo to fail on one of our test systems. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-03-05 11:05:48 -07:00
Brian Paul	0b0114cc3b	mesa: new init_teximage_fields_ms() function to init MS texture images Before, it was kind of ugly to set the multisample fields with assignments after we called _mesa_init_teximage_fields(). Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-03-05 11:05:47 -07:00
Rob Clark	4de1e5eddc	WIP: freedreno/a3xx: incorrect scissor for binning pass If scissor optimization is used (to avoid bringing scissored portions of the render target into GMEM and then back out to system memory) in combination with hw binning pass, the result would be a scissor mismatch between binning pass and rendering pass. This would cause rendering bugs in some scenarios with (for example) gnome-shell. I would have expected that simply using the correct screen-scissor during the binning pass would be enough, but seems like there is something else missing. So for now disable binning pass if scissor optimization is used.	2014-03-05 12:37:21 -05:00
Topi Pohjolainen	12d55d5f19	i965: Mark invariants in backend_visitor as constants Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-05 10:31:57 +02:00
Topi Pohjolainen	a290cd039c	i965: Merge resolving of shader program source Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-05 10:31:44 +02:00
Topi Pohjolainen	81494ec613	i965: Merge initialisation of backend_visitor Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-05 10:20:21 +02:00
Topi Pohjolainen	afed5354aa	i965/wm: Use resolved miptree consistently in surface setup Most of the logic refers to the local variable 'mt' directly but a few cases use 'intelObj->mt' instead. These are the same for now but will be different once stencil miptree gets used. v2 (Ian): fixed also indentation in surrounding lines Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-05 10:19:19 +02:00
Topi Pohjolainen	9b169a1893	i965/vec4: Mark invariant members as constants in vec4_visitor Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-05 10:13:57 +02:00
Topi Pohjolainen	8a9b4ade03	i965: Mark sources for offset getters as constants Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-03-05 10:13:05 +02:00
Ian Romanick	8f049dc298	docs: Import 10.1 release notes, add news item. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-05 09:32:26 +02:00
Ilia Mirkin	c74783abfa	nv50,nvc0: add 11f_11f_10f vertex support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-03-04 21:54:54 -05:00
Kenneth Graunke	dfa1ab0e52	i965: Implement ARB_stencil_texturing on Gen8+. On earlier hardware, we had to implement math in the shader to translate Y-tiled or untiled coordinates to W-tiled coordinates (which is what BLORP does today in order to texture from stencil buffers). On Broadwell, we can simply state that it's W-tiled in SURFACE_STATE, and adjust the pitch. This is much easier. In the surface state code, I chose to handle the "should we sample depth or stencil?" question separately from the setup for sampling from stencil. This should make it work with the BindRenderbufferTexImage hook as well, and hopefully be reusable for GL_ARB_texture_stencil8 someday. v2: Update docs/GL3.txt (caught by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-04 17:23:03 -08:00
Kenneth Graunke	23e81b93bb	mesa: Add core API support for GL_ARB_stencil_texturing (from 4.3). While the GL_ARB_stencil_texturing extension does not allow the creation of stencil textures, it does allow shaders to sample stencil values stored in packed depth/stencil textures. Specifically, applications can call glTexParameter* with a pname of GL_DEPTH_STENCIL_TEXTURE_MODE and value of either GL_DEPTH_COMPONENT or GL_STENCIL_INDEX to select which component they wish to sample. The default value is GL_DEPTH_COMPONENT (for traditional depth sampling). Shaders should use an unsigned integer sampler (presumably usampler2D) to access stencil data. Otherwise, results are undefined. Using shadow samplers with GL_STENCIL_INDEX selected also is undefined behavior. This patch creates a new gl_texture_object field, StencilSampling, to indicate that stencil should be sampled rather than depth. (I chose to use a boolean since I figured it would be more convenient for drivers.) It also introduces the [Get]TexParameter code to get and set the value, and of course the extension plumbing. v2: Also consider textures incomplete when sampling stencil with non-NEAREST min/mag filters (caught by Eric Anholt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-03-04 17:21:06 -08:00
Dieter Nützel	5f23a2d9c2	radeon/uvd: fix typo in documentation s/grap/grab/ Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2014-03-04 17:54:07 -05:00
Eric Anholt	b959fd9674	dri: Require libudev-dev for building DRI on Linux. The loader infrastructure for everything but DRI2 requires that udev be present, so we can figure out an appropriate driver from the fd. We don't have a portable solution yet, but presumably it will have similar lookup based on the device node. It will also be even more required for krh's udev-based hwdb support, which lets us have a loader that actually loads DRI drivers not included in the loader's source distribution. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75212 Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-04 14:32:52 -08:00
Tom Stellard	262e15fdd4	clover: Use correct LLVM version in #if for DataLayout construction Spotted by Michel Dänzer.	2014-03-04 16:22:09 -05:00
Zack Rusin	1dd84357ec	translate: fix buffer overflows Because in draw we always inject position at slot 0 whenever fragment shader would take the maximum number of inputs (32) it meant that we had PIPE_MAX_ATTRIBS + 1 slots to translate, which meant that we were crashing with fragment shaders that took the maximum number of attributes as inputs. The actual max number of attributes we need to translate thus is PIPE_MAX_ATTRIBS + 1. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Matthew McClure <mcclurem@vmware.com>	2014-03-04 15:56:04 -05:00
Zack Rusin	08f174daa4	draw/llvm: fix generation of the VS with GS present draw_current_shader_* functions return a final output when considering both the geometry shader and the vertex shader. But when code generating vertex shader we can not be using output slots from the geometry shader because, obviously, those can be completely different. This fixes a number of very non-obvious crashes. A side-effect of this bug was that sometimes the vertex shading code could save some random outputs as position/clip when the geometry shader was writing them and vertex shader had different outputs at those slots (sometimes writing garbage and sometimes something correct). Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Matthew McClure <mcclurem@vmware.com>	2014-03-04 15:37:52 -05:00
Anuj Phogat	079bff5a99	mesa: Allow GL_DEPTH_COMPONENT and GL_DEPTH_STENCIL combinations in glTexImage{123}D() From OpenGL 3.3 spec, page 141: "Textures with a base internal format of DEPTH_COMPONENT or DEPTH_STENCIL require either depth component data or depth/stencil component data. Textures with other base internal formats require RGBA component data. The error INVALID_OPERATION is generated if one of the base internal format and format is DEPTH_COMPONENT or DEPTH_STENCIL, and the other is neither of these values." Fixes Khronos OpenGL CTS test failure: proxy_textures_invalid_size Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-04 11:23:04 -08:00
Anuj Phogat	0f6f92e284	mesa: Use clear_teximage_fields() in place of _mesa_init_teximage_fields() This patch makes no functional changes to the code. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-04 11:22:58 -08:00
Anuj Phogat	063980151e	mesa: Set initial internal format of a texture to GL_RGBA From OpenGL 4.0 spec, page 398: "The initial internal format of a texel array is RGBA instead of 1. TEXTURE_COMPONENTS is deprecated; always use TEXTURE_INTERNAL_FORMAT." Fixes Khronos OpenGL CTS test failure: proxy_textures_invalid_size Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-04 11:22:39 -08:00
Vinson Lee	f2d724c686	scons: Build with C++11 with LLVM >= 3.5. Starting with llvm-3.5svn r202574, LLVM expects C++11 mode. commit f8bc17fadc8f170c1126328d203f0dab78960137 Author: Chandler Carruth <chandlerc@gmail.com> Date: Sat Mar 1 06:31:00 2014 +0000 [C++11] Turn off compiler-based detection of R-value references, relying on the fact that we now build in C++11 mode with modern compilers. This should flush out any issues. If the build bots are happy with this, I'll GC all the code for coping without R-value references. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202574 91177308-0d34-0410-b5e6-96231b3b80d8 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-03-04 10:12:20 -08:00
Brian Paul	cbacee207f	st/osmesa: check buffer size when searching for buffers Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75543 Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-03-04 08:49:15 -07:00
José Fonseca	3d7c8836a6	configure: s/--with-llvm-shared-libs/--enable-llvm-shared-libs/ `--enable-llvm-shared-libs` option was recently renamed as `--with-llvm-shared-libs`, but several error messages still mention the old option, causing confusion. Trivial.	2014-03-04 14:09:37 +00:00
José Fonseca	a61d859519	c11/threads: Don't implement thrd_current on Windows. GetCurrentThread() returns a pseudo-handle (a constant which only makes sense when used within the calling thread) and not a real handle. DuplicateHandle() will return a real handle, but it will create a new handle every time we call. Calling DuplicateHandle() here means we will leak handles, which can cause serious problems. In short, the Windows implementation of thrd_t needs a thorough makeover, and it won't be pretty. It looks like the C11 committee over-simplified things: it would be much better to have separate objects for threads and thread IDs like C++11 does. For now, just comment out the thrd_current() implementation, so we get build errors if anybody tries to use it. Thanks to Brian Paul for spotting and diagnosing this problem. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-04 12:05:23 +00:00
José Fonseca	e8d85034da	mapi/u_thread: Use GetCurrentThreadId u_thread_self() expects thrd_current() to return a unique numeric ID for the current thread, but this is not feasible on Windows. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-03-04 12:05:23 +00:00
José Fonseca	f34d75d6f6	c11/threads: Fix nano to millisecond conversion. Per https://gist.github.com/yohhoy/2223710/#comment-710118 Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Michel Dänzer <michel@daenzer.net>	2014-03-04 12:05:23 +00:00
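For reference, the unit arithmetic behind a nanosecond-to-millisecond conversion can be sketched as follows. This is an illustration under stated assumptions, not the c11/threads code: `struct example_time` and `example_time_to_ms()` are hypothetical names, and the commit log does not say what the original expression looked like.

```c
#include <stdint.h>

/* One millisecond is 1,000,000 nanoseconds. */
#define NSEC_PER_MSEC 1000000L

/* Hypothetical seconds + nanoseconds pair, similar in shape to a timespec. */
struct example_time {
    int64_t sec;
    long nsec;
};

/* Convert to whole milliseconds: seconds scale up by 1000, nanoseconds
 * scale down by 1,000,000 (truncating any sub-millisecond remainder). */
static int64_t example_time_to_ms(const struct example_time *t)
{
    return t->sec * 1000 + t->nsec / NSEC_PER_MSEC;
}
```

Dividing the nanosecond field by 1000 instead would yield microseconds, which overstates the timeout by a factor of 1000.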
Marek Olšák	1337da5115	r600g: implement edge flags Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-03-04 12:26:16 +01:00
Marek Olšák	ac35ded473	r600g: port color buffer format conversion from radeonsi r600_translate_colorformat is rewritten to look like radeonsi. r600_translate_colorswap is shared with radeonsi. r600_colorformat_endian_swap is consolidated. This adds some formats which were missing. Future "plain" formats will automatically be supported. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-03-04 12:26:16 +01:00
Marek Olšák	dff3eccd15	radeonsi: move translate_colorswap to common code Also translate the Y__X swizzle. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-03-04 12:26:16 +01:00
Emil Velikov	1a568e0f2b	Revert "configure: use enable_dri_glx local variable" This reverts commit `dfe8cb48fc`. Accidentally pushed this commit, over 1bb23abe065(configure: disable shared glapi when building xlib powered glx).	2014-03-04 02:13:48 +00:00
Emil Velikov	1bb23abe06	configure: disable shared glapi when building xlib powered glx With commit 0432aa064b(configure: use shared-glapi when more than one gl* API is used) we removed "disable shared-glapi when building without dri" hunk. In the good old days of classic mesa, dri and xlib-glx were mutually exclusive thus the hunk made sense. Currently enable-dri is used as a synonym for a range of things thus it's more appropriate to handle xlib-glx explicitly. Fixes a missing symbol '_glapi_Dispatch' in a xlib powered libGL, built using the following ./autogen.sh --enable-xlib-glx --disable-dri --with-gallium-drivers=swrast Cc: Brian Paul <brianp@vmware.com> Reported-by: Brian Paul <brianp@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-03-04 02:13:14 +00:00
Brian Paul	1e3bdb35a6	mesa: remove unneeded glthread.c file The _glthread_GetID() function is also defined in mapi_glapi.c Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:09:00 -07:00
Brian Paul	db806cacfd	mesa: remove empty glthread.h file Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:59 -07:00
Brian Paul	94dc91d7ec	mesa: remove unused glthread/TSD macros Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:59 -07:00
Brian Paul	bc76e9f28d	xlib: remove unneeded context tracking code This removes the only use of _glthread_Get/SetTSD(), etc. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:59 -07:00
Brian Paul	c00b250c80	xlib: simplify context handling Get rid of the fake_glx_context struct. Now, an XMesaContext is the same as a GLXContext. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:59 -07:00
Brian Paul	9b8e267976	xlib: remove unused realglx.[ch] files At one point in time, the xlib driver could call the real GLX functions. But that's long dead. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:59 -07:00
Brian Paul	afbc9b3537	mesa: remove unused _glthread_*MUTEX() macros Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:59 -07:00
Brian Paul	f19000550d	glsl: switch to c11 mutex functions Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:58 -07:00
Brian Paul	d129ea7fa2	mesa: switch to c11 mutex functions Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:58 -07:00
Brian Paul	2706db701d	xlib: switch to c11 mutex functions The _glthread_LOCK/UNLOCK_MUTEX() macros are just wrappers around the c11 mutex functions. Let's start getting rid of those wrappers. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-03-03 13:08:58 -07:00
Brian Paul	657436da7e	mesa: update packed format layout comments Update the comments for the packed formats to accurately reflect the layout of the bits in the pixel. For example, for the packed format MESA_FORMAT_R8G8B8A8, R is in the least significant position while A is in the most-significant position of the 32-bit word. v2: also fix MESA_FORMAT_A1B5G5R5_UNORM, per Roland.	2014-03-03 13:08:58 -07:00
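The layout described in the commit above (R in the least significant byte, A in the most significant byte of the 32-bit word) can be sketched as plain shifts. The function names are hypothetical, and placing G and B in bits 8-15 and 16-23 is an assumption that follows the component order in the format name.

```c
#include <stdint.h>

/* Pack four 8-bit channels into one 32-bit word with R in bits 0-7 and
 * A in bits 24-31. G and B positions are assumed from the name order. */
static uint32_t pack_r8g8b8a8(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return (uint32_t)r |
           ((uint32_t)g << 8) |
           ((uint32_t)b << 16) |
           ((uint32_t)a << 24);
}

/* Red lives in the least significant byte. */
static uint8_t unpack_red(uint32_t pixel)
{
    return (uint8_t)(pixel & 0xffu);
}

/* Alpha lives in the most significant byte. */
static uint8_t unpack_alpha(uint32_t pixel)
{
    return (uint8_t)(pixel >> 24);
}
```

Because the layout is defined on the 32-bit word rather than on byte addresses, the same shifts apply regardless of host endianness.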
Hans	837da9bdae	mesa: don't define c99 math functions for MSVC >= 1800 Signed-off-by: Brian Paul <brianp@vmware.com> Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>	2014-03-03 11:56:33 -07:00
Hans	bf25660325	util: don't define isfinite(), isnan() for MSVC >= 1800 Signed-off-by: Brian Paul <brianp@vmware.com> Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>	2014-03-03 11:56:30 -07:00
Brian Paul	aff7c5e78a	mesa: don't call ctx->Driver.ClearBufferSubData() if size==0 Fixes failed assertion when trying to map zero-length region. https://bugs.freedesktop.org/show_bug.cgi?id=75660 Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-03-03 10:41:42 -07:00
Brian Paul	465b2c42bc	softpipe: use 64-bit arithmetic in softpipe_resource_layout() To avoid 32-bit integer overflow for large textures. Note: we're already doing this in llvmpipe. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-03-03 10:41:42 -07:00
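The overflow that 64-bit arithmetic avoids can be shown with a small size computation. This is a sketch only: the function names and the width × height × bytes-per-pixel formula are assumptions for illustration, not the body of softpipe_resource_layout().

```c
#include <stdint.h>

/* 32-bit computation: the product wraps modulo 2^32 for large textures. */
static uint32_t layer_bytes_32(uint32_t width, uint32_t height,
                               uint32_t bytes_per_pixel)
{
    return (uint32_t)(width * height * bytes_per_pixel);
}

/* 64-bit computation: widening the first operand makes the whole
 * multiplication happen in 64 bits, so the true size is preserved. */
static uint64_t layer_bytes_64(uint32_t width, uint32_t height,
                               uint32_t bytes_per_pixel)
{
    return (uint64_t)width * height * bytes_per_pixel;
}
```

For a 32768 × 32768 texture at 4 bytes per pixel the true size is 2^32 bytes, which the 32-bit version reports as 0.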
Grigori Goronzy	070036ca39	NV_vdpau_interop: fix IsSurfaceNV return type The spec incorrectly used void as return type, when it should have been GLboolean. This has now been fixed. According to Nvidia, their implementation always used GLboolean. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-03 18:37:59 +01:00
Grigori Goronzy	86c06871a2	st/vdpau: fix possible NULL dereference Reviewed-by: Christian König <christian.koenig@amd.com>	2014-03-03 18:37:35 +01:00
Christian König	bd6654aa38	st/omx: always advertise all components omx_component_library_Setup should return all entrypoints the library implements, independent of what is available on the current hardware. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74944 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com>	2014-03-03 18:22:38 +01:00
Bruno Jiménez	79c83837c9	clover: Fix building with latest llvm Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-03-03 17:16:58 +01:00
Bruno Jiménez	089d0660c7	configure: Remove more flags from llvm-config This way, we are left with only the preprocessor flags and '-std=X' Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-03-03 17:16:52 +01:00
Fabio Pedretti	8a8dd86edc	configure.ac: consolidate dependencies version check Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-03 16:45:16 +01:00
Julien Cristau	6f0e2731e8	glx/dri2: fix build failure on HURD Patch from Debian package. Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-03 16:44:44 +01:00
Dave Airlie	15b4ff3f4e	st/dri: add support for dma-buf importer (DRIimage v8) This is just a simple implementation that stores the extra values into the DRIimage struct and just uses the fd importer. I haven't looked into what is required to import YUV or deal with the extra parameters. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-03-03 11:14:38 +10:00
Dave Airlie	3fd081d1a5	st/dri: move fourcc->format conversion to a common place Before I cut-n-paste this a 3rd time let's consolidate it. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-03-03 11:14:38 +10:00
Kenneth Graunke	c95ec27a4a	mesa: Move MESA_GLSL=dump output to stderr. i965 recently moved debug printfs to use stderr, including ones which trigger on MESA_GLSL=dump. This resulted in scrambled output. For drivers using ir_to_mesa, print_program was already using stderr, yet all the code around it was using stdout. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-02 13:37:09 -08:00
Kenneth Graunke	3f37dd913f	glsl: Fix broken LRP algebraic optimization. opt_algebraic was translating lrp(x, 0, a) into add(x, -mul(x, a)). Unfortunately, this references "x" twice, which is invalid in the IR, leading to assertion failures in the validator. Normally, cloning IR solves this. However, "x" could actually be an arbitrary expression tree, so copying it could result in huge piles of wasted computation. This is why we avoid reusing subexpressions. Instead, transform it into mul(x, add(1.0, -a)), which is equivalent but doesn't need two references to "x". Fixes a regression since `d5fa8a9562`, which isn't in any stable branches. Fixes 18 shaders in shader-db (bastion and yofrankie). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-03-02 13:35:03 -08:00
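The equivalence used by the LRP fix above follows from the definition lrp(x, y, a) = x·(1 − a) + y·a: with y = 0 this is x − x·a, which factors to x·(1 − a). A small numeric sketch of the three forms, with hypothetical function names (the real pass rewrites IR expression trees rather than evaluating floats):

```c
/* Reference definition of linear interpolation. */
static float lrp(float x, float y, float a)
{
    return x * (1.0f - a) + y * a;
}

/* The rejected rewrite, add(x, -mul(x, a)): note that x appears twice. */
static float lrp_zero_two_refs(float x, float a)
{
    return x + -(x * a);
}

/* The replacement, mul(x, add(1.0, -a)): x appears only once. */
static float lrp_zero_one_ref(float x, float a)
{
    return x * (1.0f + -a);
}
```

With exactly representable inputs such as x = 8 and a = 0.25, all three functions return 6.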
Rob Clark	ecb71cfa66	freedreno/a3xx/compiler: overflow in trans_endif The logic to count number of block outputs was out of sync with the actual array construction. But to simplify / make things less fragile, we can just allocate the arrays for worst case size. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Rob Clark	e0007f733d	freedreno/a3xx/compiler: fix for resolving PHI's A value may be assigned on only one side of an if/else. In this case we can simply substitute a mov.f32f32. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Rob Clark	26530716ab	freedreno/lowering: two-sided-color Add option to generate fragment shader to emulate two sided color. Additional inputs are added to shader for BCOLOR's (one corresponding to each COLOR input). CMP instructions are used to select whether to use COLOR or BCOLOR. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Rob Clark	8dd70125fc	freedreno/a3xx/compiler: add SSG Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Rob Clark	44c8f96b0d	freedreno/a3xx: fix gl_PointSize If vertex writes pointsize, there are a few extra bits we need to turn on in the cmdstream here and there. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Rob Clark	05a9bda971	freedreno: resync generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Rob Clark	cb540c21f2	freedreno/a3xx: binning-pass vertex shader variant Now that we have the infrastructure for shader variants, add support to generate an optimized shader for hw binning pass (with varyings/outputs other than position/pointsize removed). This exposes the possibility that the shader uses fewer constants than what is bound, so we have to take care to not emit consts beyond what the shader uses, lest we provoke the wrath of the HLSQ lockup! Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Rob Clark	664045752f	freedreno/a3xx: add support for frag coord/face Fixes anything that tries to use gl_FrontFacing/gl_FragCoord. Also, face support is needed to emulate two sided color. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Rob Clark	76924e3b51	freedreno/a3xx: fix for unused inputs An unused input might not have a register assigned. We don't want bogus regid to result in impossibly high max_reg.. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-03-02 11:26:35 -05:00
Chris Forbes	befbda56a2	i965: Validate (and resolve) all the bound textures. BRW_MAX_TEX_UNIT is the static limit on the number of textures we support per-stage, not in total. Core's `Unit` array is sized by MAX_COMBINED_TEXTURE_IMAGE_UNITS, which is significantly larger, and across the various shader stages, up to ctx->Const.MaxCombinedTextureImageUnits elements of it may be actually used. Fixes invisible bad behavior in piglit's max-samplers test (although this escalated to an assertion failure on HSW with texture_view, since non-immutable textures only have _Format set by validation.) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-02 21:14:56 +13:00
Chris Forbes	590920f93e	i965: Widen sampler key bitfields for 32 samplers Previously the `high` 16 samplers on Haswell+ would not get sampler workarounds applied. Don't bother widening YUV fields, since they're ignored and going away soon anyway. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.1" <mesa-stable@lists.freedesktop.org> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-03-02 21:14:18 +13:00
Emil Velikov	fc25956bad	dri/i9*5: correctly calculate the amount of system memory The variable name states megabytes, while we calculate the amount in kilobytes. Correct this by dividing with the correct amount. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-03-01 08:49:59 -08:00
Ilia Mirkin	f19271c7bf	gallium/util: add missing u_math include This is needed for MIN2/MAX2 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-02-28 20:00:34 -05:00
Brian Paul	a12d4d0398	mesa: add unpacking code for MESA_FORMAT_Z32_FLOAT_S8X24_UINT Fixes glGetTexImage() when converting from MESA_FORMAT_Z32_FLOAT_S8X24_UINT to GL_UNSIGNED_INT_24_8. Hit by the piglit ext_packed_depth_stencil-getteximage test. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-28 17:16:37 -07:00
Siavash Eliasi	2a399d9eae	glx/apple: Fixed glx context memory leak in case of failure. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Jeremy Huddleston Sequoia: <jeremyhu@apple.com>	2014-02-28 15:57:15 -08:00
Siavash Eliasi	f4416323fc	gbm/dri: Fixed buffer object memory leak in case of failure. Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-02-28 15:57:15 -08:00
Siavash Eliasi	0fe8d71667	r300g/tests: Added missing fclose for FILE resource. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-02-28 15:57:15 -08:00
Ian Romanick	ff2cbf9e0c	i915: Allocate the sys_buffer using _mesa_align_malloc Though it won't matter on Linux, use _mesa_align_free to release it. Since i965 doesn't have sys_buffer, I overlooked this in the GL_ARB_map_buffer_alignment work a few months ago. Fixes i915 (and presumably i830) regressions in ARB_map_buffer_range tests and the failure in arb_map_buffer_alignment-sanity_test. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74960 Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-28 15:05:39 -08:00
Ian Romanick	8ba157006f	i915: Only allow 8 vertex texture units There's no reason to have more vertex texture units than fragment texture units on this hardware. Since increasing the default maximum number of texture units from 16 to 32, this has triggered some segfault in i915 driver. There's probably some array or bitfield that isn't properly sized now. This really papers over the bug, but I don't think I'll lose any sleep over that. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74071 Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-28 15:05:38 -08:00
Petri Latvala	59989a4a92	i965: Assert array index on access to vec4_visitor's arrays. v2: vec4_visitor::pack_uniform_registers(): Use correct comparison in the assert, this->uniforms is already adjusted. Compare the actual value used to index uniform_size and uniform_vector_size instead. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-28 15:05:38 -08:00
Petri Latvala	7189fce237	i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically. v2: Don't add function parameters, pass the required size in prog_data->nr_params. v3: - Use the name uniform_array_size instead of uniform_param_count. - Round up when dividing param_count by 4. - Use MAX2() instead of taking the maximum by hand. - Don't crash if prog_data passed to vec4_visitor constructor is NULL v4: Rebase for current master v5 (idr): Trivial whitespace change. Signed-off-by: Petri Latvala <petri.latvala@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71254 Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-28 15:05:38 -08:00
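The v3 notes above mention rounding up when dividing the parameter count by 4 and using MAX2(). A minimal sketch of that sizing arithmetic follows; the local macro definitions, the helper name, and the lower bound of 1 are assumptions for illustration and are not confirmed by the commit message.

```c
/* Local stand-ins for the usual helper macros (defined here so the
 * sketch is self-contained). */
#define MAX2(a, b) ((a) > (b) ? (a) : (b))
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: number of vec4 slots needed to hold nr_params
 * scalar parameters, never less than one so the array is non-empty. */
static unsigned uniform_array_size_for(unsigned nr_params)
{
    return MAX2(DIV_ROUND_UP(nr_params, 4u), 1u);
}
```

Plain integer division would give 1 slot for 5 parameters; rounding up gives the 2 slots actually required.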
Marek Chalupa	96f324e229	gbm: export gbm_device_is_format_supported Probably depending on compiler settings, the definition can be hidden, so undefined reference error can be encountered during linking. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75528 Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-28 22:57:30 +00:00
Emil Velikov	dfe8cb48fc	configure: use enable_dri_glx local variable GLX can be either dri or xlib based, while enable_dri is used in a variety of contexts. With enable_dri_glx the context is clearly visible. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-28 22:56:33 +00:00
Emil Velikov	4687b0a1a7	configure: enable the drm pipe-loader for non swrast drivers All hardware drivers including the virtual vmwgfx require the drm pipe-loader in order to be properly loaded by xa, gbm and opencl. Note this does _not_ add support for the above three it only allows the pipe driver to be loaded by the library. Eg. GBM will now properly open the pipe-i915 driver, should one be working on such hardware. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75453 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-28 22:48:38 +00:00
Emil Velikov	e283e96666	configure: error out when building xa only with swrast Building to provide acceleration using swrast does not make sense. Note: update your build script to explicitly mention svga in the gallium drivers list, if you are building the vmwgfx xa library. v2: Update error message to provide more clarity, add an example. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-28 22:47:56 +00:00
Emil Velikov	2e830bba21	configure: avoid setting variables as empty strings Recent patch converted our logic to use test -n and test -z. An empty string variable (empty_str="") returns true for both thus making the check unreliable. Fix this by correctly setting the variable when applicable. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-28 22:34:50 +00:00
Emil Velikov	f42333b6b6	configure: avoid constantly building megadrivers 'core' The issue is caused by a thinko that an empty string will be considered of zero length by 'test'. This is not the case, thus we were building the 'core' of megadrivers even when no classic drivers were built. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-28 22:34:50 +00:00
Tom Stellard	f61e382f0a	r600g/compute: PIPE_CAP_COMPUTE should be false for pre-evergreen GPUs This prevents clover from using unsupported devices. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> CC: "10.0 10.1" <mesa-stable@lists.freedesktop.org>	2014-02-28 16:17:34 -05:00
Matt Turner	4bd7f1d044	glsl: Don't vectorize horizontal expressions. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75224	2014-02-28 10:37:52 -08:00
Matt Turner	5eff8576ba	glsl: Add is_horizontal() method to ir_expression. Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-28 10:37:46 -08:00
Matt Turner	d5fa8a9562	glsl: Optimize lrp(x, 0, a) into x - (x * a). Helps one program in shader-db: instructions in affected programs: 96 -> 92 (-4.17%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-28 10:36:12 -08:00
Matt Turner	ecc6c3d4ab	glsl: Optimize lrp(0, y, a) into y * a. Helps two programs in shader-db: instructions in affected programs: 254 -> 234 (-7.87%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-28 10:36:06 -08:00
Brian Paul	43dee0295e	mesa: do depth/stencil format conversion in glGetTexImage glGetTexImage(GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8) was just using memcpy() instead of _mesa_unpack_uint_24_8_depth_stencil_row() to convert texels from the hardware format to the GL format. Fixes issue reported by David Meng at Intel. The new piglit ext_packed_depth_stencil-getteximage test checks for this bug. Also, add some format/type assertions. We don't yet handle the GL_FLOAT_32_UNSIGNED_INT_24_8_REV type. That should be fixed in a follow-on patch. Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-28 07:02:55 -07:00
Brian Paul	84787aae95	mesa: fix depth/stencil comments in formats.h	2014-02-28 07:02:36 -07:00
Thomas Hellstrom	f5e681f3fa	winsys/svga: Avoid calling drm getparam for max surface size on older kernels This avoids the kernel driver spewing out errors about the param not being supported. Also correct the max surface size used when the kernel does not support the query. Reported-by: Brian Paul <brianp@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-28 11:11:21 +01:00
Kenneth Graunke	085f61bd4e	meta: Drop ctx->API checks. API is always API_OPENGL_COMPAT (since commit `4e4a537ad5`, "meta: Push into desktop GL mode when doing meta operations."), so most of these checks do nothing. We could instead check save->API to only bother setting/restoring relevant GL state, but I'm not sure saving a few _mesa_set_enable calls is worth the complexity. My understanding is the point of the ctx->API guards was to avoid raising GL errors. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-27 10:07:40 -08:00
Kenneth Graunke	cf719a0204	meta: Restore API at the end of _mesa_meta_end(), not the start. In _mesa_meta_begin(), we switch to API_OPENGL_COMPAT, then munge a lot of state (including some that doesn't exist in the actual API - like PolygonStipple in API_OPENGL_CORE). It seems reasonable that in _mesa_meta_end(), we should restore it, then switch back to the original API. This at least makes it symmetric. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-27 10:07:40 -08:00
Roland Scheidegger	612a1d5be1	util/u_format: don't crash in util_format_translate if we can't do translation Some formats can't be handled - in particular cannot handle ints/uints formats, which lack the pack_rgba_float/unpack_rgba_float functions. Instead of trying to call these (and crash) return an error (I'm not sure yet if we should try to translate such formats too here might not make much sense). v2: suggested by Jose, use separate checks for pack/unpack of rgba_8unorm and rgba_float functions (right now if one exists the other should as well). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-02-27 17:56:10 +01:00
Kenneth Graunke	80c1b9349c	i965: Convert VUE map generation checks to if rather than switch. There are currently only two VUE map layouts: one for Gen4-5, and one for everything else. We keep having to add new "case N+1" labels for every new hardware generation, and so far it's always been the same. This patch makes it so we only have to do work in the case where something actually changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-27 00:05:55 -08:00
Kenneth Graunke	9b1a6745f6	i965: Only emit VS state pipe control workaround on IVB and BYT. According to the BSpec's 3D workarounds page, this is unnecessary on shipping Haswell hardware, and was never necessary on Broadwell. It unfortunately doesn't say anything about Baytrail. The workaround database confirms those results for Ivybridge, Haswell, and Broadwell. Baytrail is less clear - one page says it's necessary, while the other says it isn't. For now, be conservative and leave it enabled. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-27 00:05:48 -08:00
Ilia Mirkin	51fc093421	nouveau: add a nouveau_compiler binary to compile TGSI into shader ISA This makes it easy to compare output between different cards, especially for ones that you don't have (and/or not in the current machine). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-26 23:35:48 -05:00
Ilia Mirkin	dd370f0af6	nv30: remove nv30_context use from nvfx_*prog This should pave the way to being able to use the compiler without a context. Also leads to cleaner code. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-26 23:35:47 -05:00
Ilia Mirkin	41dbc4c444	nv30: remove unused sprite flipping parameter Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-26 23:35:47 -05:00
Ilia Mirkin	fe2738f998	nv30: remove unused render_mode and hw_pointsprite_control Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-26 23:35:46 -05:00
Ilia Mirkin	8f23d08928	nv30: remove use_nv4x, it is identical to is_nv4x Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-26 23:35:45 -05:00
Ilia Mirkin	734fe2d246	docs: update nvc0 state ARB_texture_buffer_object_rgb32 has been supported for a while already.	2014-02-26 23:35:45 -05:00
Michel Daenzer	59936a49dd	radeonsi: Prevent geometry shader from emitting too many vertices	2014-02-27 10:27:55 +09:00
Anuj Phogat	b3094d9927	i965: Fix the region's pitch condition to use blitter intelEmitCopyBlit uses a signed 16-bit integer to represent buffer pitch, so it can only handle buffer pitches < 32k. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-26 13:43:00 -08:00
Brian Paul	863a1f7757	glsl: add switch case for MESA_SHADER_COMPUTE To fix warning about unhandled enum value. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-26 13:29:16 -07:00
Kenneth Graunke	fe8f3bef31	meta: Use a #define for the vector type to avoid %svec4 everywhere. By adding "#define gvec4 %svec4" to the top of our fragment shader, we can write generic code without needing to specialize it to vec4, ivec4, or uvec4 via asprintf. This also makes the INT and UNSIGNED_INT merge function code identical, so I combined those two cases. It's not a big savings, but a little bit tidier. v2: Rebase on Vinson's MSVC build fixes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-26 02:33:58 -08:00
Kenneth Graunke	f896e82301	i965: Don't try to dump shader source for fixed-function FS programs. sh->Source is NULL and this will segfault. Fixes MESA_GLSL=dump with "The Swapper". Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-26 02:31:24 -08:00
Kenneth Graunke	b18871c863	i965: Don't forget to subtract mt->first_level in minify calls. This fixes fbo-clear-formats GL_ARB_depth_texture on Ironlake, which regressed since commit `f128bcc7c2` ("i965: Drop mt->levels[].width/height.") intel_miptree_copy_slice was calling minify(.., 7) on a 2x2 texture with mt->first_level == 7. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75292 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-26 02:29:44 -08:00
Kenneth Graunke	ac0a8b9540	glsl: Delete LRP_TO_ARITH lowering pass flag. It's kind of a trap---calling do_common_optimization() after lower_instructions() may cause opt_algebraic() to reintroduce ir_triop_lrp expressions that were lowered, effectively defeating the point. Because of this, nobody uses it. v2: Delete more code (caught by Ian Romanick). Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Eric Anholt <eric@anholt.net>	2014-02-26 02:16:56 -08:00
Kenneth Graunke	2fdea48e21	i965: Stop lowering ir_triop_lrp. Both the vector and scalar backends now support it natively, so there's no point in lowering it. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Eric Anholt <eric@anholt.net>	2014-02-26 02:16:55 -08:00
Kenneth Graunke	56879a7ac4	i965/vec4: Handle ir_triop_lrp on Gen4-5 as well. When the vec4 backend encountered an ir_triop_lrp, it always emitted an actual LRP instruction, which only exists on Gen6+. Gen4-5 used lower_instructions() to decompose ir_triop_lrp at the IR level. Since commit `8d37e9915a` ("glsl: Optimize open-coded lrp into lrp."), we've had a bug where lower_instructions translates ir_triop_lrp into arithmetic, but opt_algebraic reassembles it back into a lrp. To avoid this ordering concern, just handle ir_triop_lrp in the backend. The FS backend already does this, so we may as well do likewise. v2: Add a comment reminding us that we could emit better assembly if we implemented the infrastructure necessary to support using MAC. (Assembly code provided by Eric Anholt). Cc: "10.1" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75253 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Eric Anholt <eric@anholt.net>	2014-02-26 02:16:53 -08:00
Kenneth Graunke	ffde483f3c	i965/vec4: Add a brw->gen >= 6 assertion in three-source emitters. Three source instructions didn't exist until Gen6. vec4_generator has assertions to catch this, but catching it in the visitor provides a nicer backtrace. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Eric Anholt <eric@anholt.net>	2014-02-26 02:16:34 -08:00
Chia-I Wu	bb9c8071ea	ilo: create u_upload_mgr last Similar to u_blitter, u_upload_mgr is now a client of the pipe context. Its creation needs to be delayed until the context has been (almost) initialized.	2014-02-26 11:33:37 +08:00
Fredrik Höglund	3616e862f2	glx: Fix the GLXFBConfig attrib sort priorities The sort priorities for GLX_SAMPLES and GLX_SAMPLE_BUFFERS are not defined in GL_ARB_multisample, but they are defined in the GLX 1.4 specification. Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-26 02:17:12 +01:00
Fredrik Höglund	f41c2f6c33	glx: Fix the default values for GLXFBConfig attributes The default values for GLX_DRAWABLE_TYPE and GLX_RENDER_TYPE are GLX_WINDOW_BIT and GLX_RGBA_BIT respectively, as specified in the GLX 1.4 specification. This fixes the glx-choosefbconfig-defaults piglit test. Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-26 02:16:42 +01:00
Tom Stellard	54df6a0491	Re-commit 'clover: Fix build with LLVM 3.5' This was accidentally reverted in `9dfd7c5f75`	2014-02-25 14:43:26 -08:00
Vinson Lee	f094866d93	mesa: Add GL_ARB_buffer_storage to dispatch_sanity.cpp. Fixes 'make check' failure introduced with commit `119ffa7307`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75503 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-25 14:00:08 -08:00
Timothy Arceri	9dfd7c5f75	Revert "Merge branch 'master' of git+ssh://git.freedesktop.org/git/mesa/mesa" This reverts commit `1b79582f32`, reversing changes made to `376a98d345`.	2014-02-26 08:46:08 +11:00
Timothy Arceri	1b79582f32	Merge branch 'master' of git+ssh://git.freedesktop.org/git/mesa/mesa ry,	2014-02-26 08:39:32 +11:00
Tom Stellard	fcd499730b	clover: Fix build with LLVM 3.5	2014-02-25 13:32:37 -08:00
Timothy Arceri	376a98d345	glsl: removed unused dimension_count variable This variable is no longer needed after the cleanup to the code prior to the first arrays of array series Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-26 08:31:25 +11:00
Ilia Mirkin	d9b983519c	build: llvm libs may not be in system search path, add rpath On my gentoo system, llvm libs are in /usr/lib64/llvm, and llvm-config --ldflags does not provide the rpath (it does, of course, provide a -L). This adds the llvm dir to the rpath. It should be harmless if the path is a system path, and should make things work when it's not. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Tested-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-25 15:30:13 -05:00
Eric Anholt	42c2366de5	i965: Fix segfaults since the buffer_storage changes.	2014-02-25 12:19:15 -08:00
Ilia Mirkin	6417cabd9c	docs: update nv50 support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-25 14:42:35 -05:00
Ilia Mirkin	d1b1329c3a	nv50: enable txg where supported Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-25 14:42:34 -05:00
Ilia Mirkin	0e71c65db0	nv50: enable cube map array texture support Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-25 14:42:34 -05:00
Brian Paul	5a3dc449a9	libgl-xlib: add -Isrc/gallium/winsys flag So that sw/xlib/xlib_sw_winsys.h can be found. Fixes a build break. Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-25 12:35:07 -07:00
Brian Paul	c88a0b6af3	st/mesa: add comment to explain _min(), _maxf(), etc. functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-02-25 12:35:07 -07:00
Marek Olšák	9855477e90	r600g,radeonsi: consolidate create_surface and surface_destroy Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:26 +01:00
Marek Olšák	b9aa8ed009	radeonsi: inline util_blitter_copy_texture This will be used for changing texture properties without modifying pipe_resource like r600g, but not in this series. For now, this change allows consolidation of pipe_surface functions. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:22 +01:00
Marek Olšák	f7176d700f	radeonsi: remove useless psbox variable from resource_copy_region Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:20 +01:00
Marek Olšák	80eb377a37	radeonsi: compute depth surface registers only once Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:18 +01:00
Marek Olšák	629b019a40	radeonsi: compute color surface registers only once Same as r600g. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:17 +01:00
Marek Olšák	6b4e03216a	r600g: remove r600_resource.h Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:15 +01:00
Marek Olšák	ec266d06d0	r600g: remove r600_surface::htile_enabled v2: use one of the htile registers instead Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:12 +01:00
Marek Olšák	7fc6ece40e	r600g: use r600_surface::db_z_info db_z_info was unused. This just renames the variable to match the register name. Now, db_depth_info is unused on Evergreen. Both variables will be needed on SI though. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:10 +01:00
Marek Olšák	40b9812a76	r600g,radeonsi: share r600_surface I'm gonna use this in radeonsi. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:08 +01:00
Marek Olšák	933eaeee25	radeonsi: move PA_SU_POLY_OFFSET_DB_FMT_CNTL to framebuffer state It doesn't depend on anything else. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-25 16:08:05 +01:00
Marek Olšák	dca350201e	mesa: allow buffers to be mapped multiple times OpenGL allows a buffer to be mapped only once, but we also map buffers internally, e.g. in the software primitive restart fallback, for PBOs, vbo_get_minmax_index, etc. This has always been a problem, but it will be a bigger problem with persistent buffer mappings, which will prevent all Mesa functions from mapping buffers for internal purposes. This adds a driver interface to core Mesa which supports multiple buffer mappings and allows 2 mappings: one for the GL user and one for Mesa. Note that Gallium supports an unlimited number of buffer and texture mappings, so it's not really an issue for Gallium. v2: fix unmapping in xm_dd.c, remove the GL errors there v3: fix the intel driver (by Fredrik) Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:07:33 +01:00
Marek Olšák	86e68b0f1f	docs: update ARB_buffer_storage status Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:07:33 +01:00
Marek Olšák	04fb4bf61b	gallium/upload_mgr: remove useless variable "size" Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:07:33 +01:00
Marek Olšák	7ea3f6bce5	gallium/upload_mgr: don't unmap buffers if persistent mappings are supported Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:07:33 +01:00
Marek Olšák	db8886ed09	gallium: the other drivers don't support ARB_buffer_storage Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:07:33 +01:00
Marek Olšák	6381dd7e9d	r300g,r600g,radeonsi: add support for ARB_buffer_storage All GTT memory mappings are coherent and therefore can be persistent. Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:05:41 +01:00
Marek Olšák	dfa0b8d9b8	st/mesa: implement ARB_buffer_storage Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:05:41 +01:00
Marek Olšák	5f61f052b5	gallium: add interface for persistent and coherent buffer mappings Required for ARB_buffer_storage.	2014-02-25 16:05:41 +01:00
Marek Olšák	d26a065b74	mesa: allow buffers mapped with the persistent flag to be used by the GPU v2: also fixed InvalidateBufferData, added citations from the 4.4 spec Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:04:22 +01:00
Marek Olšák	4f78e17f6d	mesa: add error checks to glMapBufferRange, glMapBuffer for ARB_buffer_storage Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:04:22 +01:00
Marek Olšák	119ffa7307	glapi: add ARB_buffer_storage Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:04:22 +01:00
Marek Olšák	e592f11227	mesa: implement glBufferStorage, immutable buffers; add extension enable flag Reviewed-by: Fredrik Höglund <fredrik@kde.org> v2: dropped the error that DYNAMIC_STORAGE is required for MAP_WRITE_BIT, the error is removed in the latest revision of GL 4.4	2014-02-25 16:04:22 +01:00
Marek Olšák	7e548d0507	mesa: add storage flags parameter to Driver.BufferData It will be used by glBufferStorage. The parameters are chosen according to ARB_buffer_storage. Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:04:22 +01:00
Marek Olšák	aea4933287	mesa: remove unused driver hook BindBuffer Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2014-02-25 16:04:21 +01:00
Emil Velikov	882070cc81	nv50: correctly calculate the number of vertical blocks during transfer map Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-25 12:19:07 +00:00
Dave Airlie	7c3138acb9	st/mesa: add texture gather support. (v2) This adds support for GL_ARB_texture_gather, and one step of support for GL_ARB_gpu_shader5. This adds support for passing the TG4 instruction, along with non-constant texture offsets, and tracking them for the optimisation passes. This doesn't support native textureGatherOffsets hw, to do that you'd need to add a CAP and if set disable the lowering pass, and bump the MAX offsets to 4, then do the i0,j0 sampling using those. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-25 13:29:37 +10:00
Dave Airlie	2fcbec48d7	gallium: add texture gather support to gallium (v3) This adds support to gallium for a TG4 instruction, and two CAPs. The first CAP is required for GL_ARB_texture_gather. The second CAP is required to expose GL_ARB_gpu_shader5. However so far we haven't found any hardware that natively exposes the textureGatherOffsets feature from GL, so just lower it for now. If hardware appears for this we can add another CAP to allow TG4 to take 4 offsets. v2: add component selection src and a cap to say hw can do it. (st can use to help control GL_ARB_gpu_shader5/GLSL 4.00). Add docs. v3: rename to SM5, add docs. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-25 13:29:17 +10:00
Dave Airlie	122c3b9486	glsl/i965: move lower_offset_array up to GLSL compiler level. This lowering pass will be useful for gallium drivers as well, in order to support the GL TG4 oddity that is textureGatherOffsets. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-25 13:28:57 +10:00
Tom Stellard	945d87f958	clover: Pass buffer offsets to the driver in set_global_binding() v3 The offsets will be stored in the handles parameter. This makes it possible to use sub-buffers. v2: - Style fixes - Add support for constant sub-buffers - Store handles in device byte order v3: - Use endian helpers Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-02-24 12:56:27 -08:00
Tom Stellard	eac7236042	radeonsi: Use SI_BIG_ENDIAN now that it exists Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-24 12:56:27 -08:00
Tom Stellard	8f3bcedde2	r600g: Use util_cpu_to_le32() instead of bswap32() on big-endian systems Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-24 12:56:27 -08:00
Tom Stellard	195ee10673	radeonsi: Use util_cpu_to_le32() instead of bswap32() on big-endian systems Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-24 12:56:27 -08:00
Tom Stellard	9f30685fae	util: Add util_cpu_to_le* helpers Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-02-24 12:56:27 -08:00
Tom Stellard	a9f88e2ae8	util: Add util_bswap64() v3 v2: - Use __builtin_bswap64() - Remove unnecessary mask - Add util_le64_to_cpu() helper v3: - Remove unnecessary AC_SUBST Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-02-24 12:56:27 -08:00
Tom Stellard	f8ba0f55d3	configure.ac: Use AX_GCC_BUILTIN to check availability of __builtin_bswap32 v2 v2: - Remove unnecessary AC_SUBST Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-24 12:56:26 -08:00
Emil Velikov	73b46136b0	targets/opencl: resolve undefined symbols at link time Current automake build does not try to resolve undefined symbols thus we could end up with a broken library. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-02-24 14:59:39 +00:00
Emil Velikov	1ad9534337	gallium/targets: resolve undefined reference to pipe_loader_sw_probe_dri With the introduction of the pipe_loader_sw_probe_dri helper we require the sw/dri winsys during linking stage despite it being unused by any of the targets. This will cause a minor increase in the resulting library which will be cleaned up via linker options with upcoming patches. v2: Link with libswdri.la only when available. Reported-and-tested-by: Tom Stellard <thomas.stellard@amd.com> (v1) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-24 14:59:34 +00:00
Emil Velikov	61973ffe5b	configure: correctly report if we're building the sw/xlib winsys While looking at bug 75356, I've noticed that the presence of x11 egl platform pulls in sw/xlib as "needed" but fails to report so at the end of configure. Tested-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-24 14:57:41 +00:00
Emil Velikov	3445e8bb92	pipe-loader: wrap pipe_loader_sw_probe_xlib within HAVE_PIPE_LOADER_XLIB The above function implies using the xlib winsys, which has additional library dependencies that should not be forced. Make the software xlib pipe loader optional thus avoid all the dependency hell. A user that wishes to use the particular pipe-loader would need to set the following within configure.ac. enable_gallium_xlib_loader=yes v2: - Wrap sw/xlib/xlib_sw_winsys.h to handle compilation on systems lacking X11 headers. Spotted by Christian Prochaska. Tested-by: Tom Stellard <thomas.stellard@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75356 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-24 14:52:27 +00:00
Emil Velikov	0e7c30233f	targets/gbm: exit gracefully if pipe_loader_drm_probe_fd is not available When one builds without gallium_drm_loader, the above function will not be available, thus we'll segfault in gallium_screen_create due to memory access violation. Tested-by: Tom Stellard <thomas.stellard@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75335 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-24 14:51:45 +00:00
Kenneth Graunke	73c78c514f	i965: Don't try to use the hardware blitter for multisampled miptrees. The blitter is completely ignorant of MSAA buffer layouts, so any attempt to use BLT paths with MSAA buffers is likely to break spectacularly. In most cases, BLORP handles MSAA blits, so we never hit this bug. Until recently, it also wasn't worth fixing, since Meta couldn't handle MSAA either, so there was nothing to fall back to. But now there is. +143 piglit tests on Broadwell (which doesn't have BLORP support). Surprisingly, three also start failing. Since non-IMS MSAA buffers store samples in successive array slices, using the blitter ought to access sample 0 and ignore the rest, which is apparently good enough for a few not-very-picky Piglit tests. Presumably the meta replacement code is still broken. No Piglit changes on Ivybridge. v2: Move the early return to the top of the function (suggested by Paul). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-23 20:19:00 -08:00
Rob Clark	3f7239ca0e	freedreno/a3xx/compiler: half-precision output Using generic shaders caused a measurable fps drop, which was isolated to use of full precision (vs half precision) output. This is an attempt to regain that lost performance by using half precision solid/blit shaders (when the output format is not float32). Note: for the built-in shaders, I would not expect them to be register starved. And in fact it is the solid frag shader that seems to have the biggest impact. So I suspect you get double the pixel pipe units (or half the cycles) when the output is half precision. So there may be some gain to using half precision output for application shaders as well, even though the rest of register usage is still full precision. But for half precision to work for more complex shaders, we need to deal with some constraints, like cat2 needing same precision for its two src registers. So for now it is not enabled by default except for the built-in shaders. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-23 14:58:24 -05:00
Rob Clark	141ae71671	freedreno/a3xx: add shader variants Start putting in place infrastructure to deal with multiple shader variants. Initially we'll use this for two sided color (frag) and binning pass (vert) shaders. Possibly need for others later (such as YUV vs RGB eglImage?). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-23 14:58:23 -05:00
Rob Clark	9bbfae6265	freedreno/a3xx/compiler: collapse nop's with repeat Easier than making more extensive use of rpt, and the more compact shaders seem to bring some bit of performance boost. (Perhaps repeat flag benefits are more than just instruction cache, possibly it saves on instruction decode as well?) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-23 14:58:23 -05:00
Rob Clark	bb255fdf06	freedreno/a3xx: drop hand-coded blit/solid shaders Instead in the common code, construct these shaders from TGSI. For now we let a2xx keep its hand coded shaders, as its compiler isn't quite up to the job yet. All the same it is a net drop in code size and gets rid of special cases. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-23 14:58:23 -05:00
Rob Clark	1c953b7cda	freedreno/lowering: cleanup api Make things configurable, and tweak the API a bit to avoid an extra tgsi_shader_scan(). Getting closer to something generic which can be moved out of freedreno and shared by other drivers. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-23 14:58:23 -05:00
Rob Clark	67cea4b32a	freedreno/a3xx: add float 16 and 32bit formats Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-23 14:58:23 -05:00
Rob Clark	e819885b99	freedreno: resync generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-23 14:58:23 -05:00
Emil Velikov	f92fbba11b	glx/drisw: use the implemented version of __DRIswrastLoaderExtension ... over the one provided by the headers. Explicitly set extension members to improve clarity. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:17 +00:00
Emil Velikov	f6537d0608	glx/dri: use the implemented version of __DRIdamageExtension ... over the one provided by the headers. Explicitly set extension members to improve clarity. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:17 +00:00
Emil Velikov	ef342aad80	glx/dri_common: use the implemented version of __DRIsystemTimeExtension ... over the one provided by the headers. Explicitly set extension members to improve clarity. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:17 +00:00
Emil Velikov	fbbf5ec471	glx/dri: use the implemented version of __DRIgetDrawableInfoExtension ... over the one provided by the headers. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:17 +00:00
Emil Velikov	15db8c0801	dri_util: use the implemented version of __DRIimageDriverExtension ... over the one provided by the headers. Currently both versions are identical, but that is not guaranteed to be the case in the future. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:17 +00:00
Emil Velikov	e9eb3ec331	glx/dri3: set the implemented version of __DRIimageLoaderExtension ... over the one provided by the spec. Currently both versions are identical, but that is not guaranteed to be the case in the future. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:17 +00:00
Emil Velikov	4e229a6e86	gbm: explicitly set __DRIimageLoaderExtension members Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:17 +00:00
Emil Velikov	9e627ccc0d	egl/wayland: explicitly set __DRIimageLoaderExtension members Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:16 +00:00
Emil Velikov	73b35b913e	drivers/dri: explicitly set __DRI2flushExtension members Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:16 +00:00
Emil Velikov	8b45bc0ad5	gbm: explicitly set __DRIdri2LoaderExtension members Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:16 +00:00
Emil Velikov	92273962f5	glx/dri2: set the implemented version of __DRIdri2LoaderExtension ... over the version number provided by the headers. Explicitly set extension members to improve clarity. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:16 +00:00
Emil Velikov	6dffab2092	dri_interface: note introduction of __DRIdri2LoaderExtension members Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:16 +00:00
Emil Velikov	c9fff0740e	dri_interface: note introduction of various __DRItexBufferExtension members Note the member function releaseTexBuffer was added without bumping spec version, and currently no drivers implement it. v2: releaseTexBuffer was introduced by version 3 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:16 +00:00
Emil Velikov	acf2fae64e	dri_interface: Note the version introducing __DRIswrastLoaderExtensionRec::putImage2 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:16 +00:00
Emil Velikov	13e5daf2da	dri_util: explicitly set __DRIcopySubBufferExtension members Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:15 +00:00
Emil Velikov	01814734e6	dri_util: explicitly set __DRIswrastExtension members. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-23 16:42:15 +00:00
Kenneth Graunke	5e639a5f59	glsl: Pass stdout to _mesa_print_ir from st_glsl_to_tgsi. Fixes the Gallium build since commit `1e3bd9f9a5`. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75389 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-22 22:10:11 -08:00
Eric Anholt	83daa88035	i965: Move the remaining driver debug over to stderr. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-22 19:23:21 -08:00
Eric Anholt	a76e5dce4f	i965: Move compiler debugging output to stderr. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-22 19:23:21 -08:00
Eric Anholt	1e3bd9f9a5	glsl: Add a file argument to the IR printer. While we want to be able to print to stdout for glsl_compiler, for debugging drivers we want to be able to dump to stderr because that's where other driver debug (like LIBGL_DEBUG) tends to go, and because some apps actually close stdout to shut up their own messages (such as the X Server, or NWN). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-22 19:23:21 -08:00
Eric Anholt	f28c920865	i965: Refactor debug dumping of GLSL IR. This was only going to get worse when tessellation shows up, and was causing too much extra duplication in my stderr changes coming up. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-22 19:23:21 -08:00
Eric Anholt	9ac9d133ed	intel: Remove some dead code I noticed in intel_screen.c. It was present in the initial i915tex import. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-22 19:23:20 -08:00
Eric Anholt	fdcf6c8fad	i965: Use the object label when available for INTEL_DEBUG=vs,gs,fs output. Note that this requires updated run.py in shader_db. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-22 19:23:20 -08:00
Eric Anholt	f474ced0d1	i965: Use the object label when available for shader_time output. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-22 19:23:20 -08:00
Eric Anholt	0e2c7e2f6e	meta: Set some object labels on our meta shaders. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-22 19:23:20 -08:00
Ilia Mirkin	6152ba0894	nv50: make sure to clear _all_ layers of all attachments Unfortunately there's only one RT_ARRAY_MODE setting for all attachments, so clears were previously truncated to the minimum number of layers any attachment had. Instead set the RT_ARRAY_MODE to 512 (the max number of layers) before doing the clear. This fixes gl-3.2-layered-rendering-clear-color-mismatched-layer-count. Also fix clears of individual layered rt/zeta, in case it ever happens. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at> Cc: 10.1 <mesa-stable@lists.freedesktop.org>	2014-02-22 18:42:31 -05:00
Chia-I Wu	d5cbd73d21	ilo: fix and enable fast depth clear Use tex->bo_format instead of zs->format in ilo_blitter_rectlist_clear_zs() because the latter may be combined depth/stencil format. hiz_can_clear_zs() is no-op for GEN7+, but move the GEN check so that the assertions are tested. Finally, call the fast depth clear function from ilo_clear().	2014-02-22 22:45:13 +08:00
Chia-I Wu	f57bddc7e4	ilo: add slice clear value It is needed for 3DSTATE_CLEAR_PARAMS, and can also be used to track what value the slice has been cleared to.	2014-02-22 22:45:13 +08:00
Chia-I Wu	4afb8a7fb5	ilo: better readability and doc for texture flags Improve comments for the flags, and explicitly separate their uses in slice flags and resolve flags.	2014-02-22 22:45:13 +08:00
Chia-I Wu	cb8a0d2be1	ilo: fix for stencil only rectlist ops 3DSTATE_STENCIL_BUFFER inherits some states from 3DSTATE_DEPTH_BUFFER. We need to emit both even if the surface is stencil only.	2014-02-22 22:45:13 +08:00
Chia-I Wu	409add30b3	ilo: fix a false assertion failure on GEN6 Layer offsetting is possible when it is level 0, layer 0.	2014-02-22 22:45:12 +08:00
Chia-I Wu	e7307fe708	ilo: pipe_texture::usage is not a bitfield It happens to work because PIPE_USAGE_STAGING is 0x100.	2014-02-22 22:45:12 +08:00
Chia-I Wu	f8d19a58dc	ilo: set ILO_TEXTURE_CPU_WRITE for imported textures Assume the bo has been written by another process, which will trigger a HiZ resolve.	2014-02-22 22:45:12 +08:00
Christoph Bumiller	1f4bfb8797	nv50/ir/ra: fix SpillCodeInserter::offsetSlot usage We were turning non-memory spill slots into NULL. Cc: 10.1 <mesa-stable@lists.freedesktop.org>	2014-02-22 13:17:23 +01:00
Matt Turner	7770b02693	Revert "i965/fs: Make fs_reg's type an enum for better debugging." This reverts commit `5ceadd29b0`. I rebased and apparently failed to build test. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75355	2014-02-21 23:53:36 -08:00
Kenneth Graunke	760c6777a0	i965/fs: Drop the emit(fs_inst) overload. Using this emit function implicitly creates three copies, which is pointlessly inefficient. 1. Code creates the original instruction. 2. Calling emit(fs_inst) copies it into the function. 3. It then allocates a new fs_inst and copies it into that. The second could be eliminated by changing the signature to fs_inst(const fs_inst &) but that wouldn't eliminate the third. Making callers heap allocate the instruction and call emit(fs_inst *) allows us to just use the original one, with no extra copies, and isn't much more of a burden. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:33 -08:00
Matt Turner	326fc60ee9	i965/fs: Pass fs_regs by constant reference where possible. These functions (modulo emit_lrp, necessitating the small fix-up) pass these arguments by value unmodified to other functions. No point in making an additional copy. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:33 -08:00
Matt Turner	070f20272f	i965/fs: Move setting opcode = NOP to its one useful location. All other callers of init() immediately set opcode to something else. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:33 -08:00
Matt Turner	4fbebd6e65	i965/fs: Use a bitfield for fs_inst's bool fields. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:33 -08:00
Matt Turner	d91035a8f6	i965/fs: Reorder fs_inst's fields for better packing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:33 -08:00
Matt Turner	109c211ffd	i965/fs: Reduce the sizes of some fs_inst members. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:33 -08:00
Matt Turner	0fc1a77e14	i965/fs: Reorder fs_reg for better packing. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:33 -08:00
Matt Turner	5ceadd29b0	i965/fs: Make fs_reg's type an enum for better debugging. Since the enum is marked as packed, it'll still take only one byte. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:32 -08:00
Matt Turner	3f6baf5755	i965/fs: Reduce the sizes of some fs_reg members. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:32 -08:00
Matt Turner	98e2654880	i965: Mark brw_reg_type and register_file enums as PACKED. The C99 spec says the type of an enum is implementation defined (but can be char, signed int, or unsigned int). gcc appears to always give enums four bytes, even when they can fit in less. It does so because this is what other compilers seem to do [0] and therefore to maintain ABI compatibility with them. gcc has an -fshort-enum flag that tells the compiler to use only as much space as needed for an enum. Adding __attribute__((__packed__)) to an enum definition has the same behavior, but on a per-enum basis. brw_reg_type and register_file are not part of the ABI, so we can safely mark them as PACKED so that they'll take only a byte, rather than four. [0] http://gcc.gnu.org/onlinedocs/gcc/Non-bugs.html#index-fshort-enums-3868 Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:32 -08:00
Matt Turner	00c567e897	i965: Reduce predicate field of backend_instruction to uint8_t. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 22:51:32 -08:00
Vinson Lee	079773d1cb	libgl-xlib: Fix xlib_sw_winsys.h include path. This patch fixes this SCons build error introduced with commit `4f37e52f37`. Compiling src/gallium/targets/libgl-xlib/xlib.c ... src/gallium/targets/libgl-xlib/xlib.c:35:42: fatal error: state_tracker/xlib_sw_winsys.h: No such file or directory #include "state_tracker/xlib_sw_winsys.h" ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75347 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-02-21 19:56:17 -08:00
Vinson Lee	24ce678f83	mesa: Move declarations before code. This patch fixes these MSVC build errors. Compiling src\mesa\drivers\common\meta_blit.c ... meta_blit.c src\mesa\drivers\common\meta_blit.c(255) : error C2143: syntax error : missing ';' before 'type' src\mesa\drivers\common\meta_blit.c(255) : error C2143: syntax error : missing ')' before 'type' src\mesa\drivers\common\meta_blit.c(255) : error C2065: 'i' : undeclared identifier src\mesa\drivers\common\meta_blit.c(255) : warning C4552: '<' : operator has no effect; expected operator with side-effect src\mesa\drivers\common\meta_blit.c(255) : error C2059: syntax error : ')' src\mesa\drivers\common\meta_blit.c(255) : error C2143: syntax error : missing ';' before '{' src\mesa\drivers\common\meta_blit.c(258) : error C2065: 'i' : undeclared identifier src\mesa\drivers\common\meta_blit.c(263) : error C2143: syntax error : missing ';' before 'type' src\mesa\drivers\common\meta_blit.c(263) : error C2143: syntax error : missing ')' before 'type' src\mesa\drivers\common\meta_blit.c(263) : error C2065: 'step' : undeclared identifier src\mesa\drivers\common\meta_blit.c(263) : warning C4552: '<=' : operator has no effect; expected operator with side-effect src\mesa\drivers\common\meta_blit.c(263) : error C2059: syntax error : ')' src\mesa\drivers\common\meta_blit.c(263) : error C2143: syntax error : missing ';' before '{' src\mesa\drivers\common\meta_blit.c(264) : error C2143: syntax error : missing ';' before 'type' src\mesa\drivers\common\meta_blit.c(264) : error C2143: syntax error : missing ')' before 'type' src\mesa\drivers\common\meta_blit.c(264) : error C2065: 'i' : undeclared identifier src\mesa\drivers\common\meta_blit.c(264) : warning C4552: '<' : operator has no effect; expected operator with side-effect src\mesa\drivers\common\meta_blit.c(264) : error C2059: syntax error : ')' src\mesa\drivers\common\meta_blit.c(264) : error C2065: 'step' : undeclared identifier src\mesa\drivers\common\meta_blit.c(264) : 
error C2143: syntax error : missing ';' before '{' src\mesa\drivers\common\meta_blit.c(268) : error C2065: 'step' : undeclared identifier src\mesa\drivers\common\meta_blit.c(268) : error C2065: 'i' : undeclared identifier src\mesa\drivers\common\meta_blit.c(269) : error C2065: 'step' : undeclared identifier src\mesa\drivers\common\meta_blit.c(269) : error C2065: 'i' : undeclared identifier src\mesa\drivers\common\meta_blit.c(270) : error C2065: 'step' : undeclared identifier src\mesa\drivers\common\meta_blit.c(270) : error C2065: 'i' : undeclared identifier src\mesa\drivers\common\meta_blit.c(559) : warning C4244: 'function' : conversion from 'const GLint' to 'GLfloat', possible loss of data src\mesa\drivers\common\meta_blit.c(723) : warning C4244: 'function' : conversion from 'const GLint' to 'GLfloat', possible loss of data src\mesa\drivers\common\meta_blit.c(773) : warning C4244: 'function' : conversion from 'const GLint' to 'GLfloat', possible loss of data Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-02-21 19:40:00 -08:00
Emil Velikov	dcbf404c0d	pipe-loader: introduce pipe_loader_sw_probe_null helper function v2: Handle null_sw_create failure, add missing function return type Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> (v1)	2014-02-22 03:26:29 +00:00
Emil Velikov	969e8d15b7	pipe-loader: introduce pipe_loader_sw_probe_dri helper Will be used in the following commits. v2: Link gallium tests against the library. v3: Handle dri_create_sw_winsys failure v4: Rebase on top of the targets/xa changes Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> (v2)	2014-02-22 03:26:29 +00:00
Emil Velikov	cc3aeacab6	pipe-loader: introduce pipe_loader_sw_probe_xlib helper Will be used in the upcoming patches. v2: handle xlib_create_sw_winsys failure, drop unneeded header Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> (v1)	2014-02-22 03:26:29 +00:00
Emil Velikov	6325fdd6cf	pipe-loader: use bool type for pipe_loader_drm_probe_fd() v2: Rebase on top of the rendernode changes. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> (v1) Reviewed-by: Francisco Jerez <currojerez@riseup.net> (v1)	2014-02-22 03:26:29 +00:00
Emil Velikov	4f37e52f37	winsys/xlib: move xlib_create_sw_winsys within the winsys v2: Rebase on top of vl_winsys_xsp.c removal Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> (v1)	2014-02-22 03:26:28 +00:00
Emil Velikov	b4e8572bca	pipe-loader: handle memory allocation failure Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-02-22 03:26:28 +00:00
Emil Velikov	1fb750f7f7	pipe-loader: build pipe_loader_drm_x_auth whenever HAVE_PIPE_LOADER_XCB is defined Currently HAVE_PIPE_LOADER_XCB is defined, rather than being set to 1/0. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-02-22 03:26:28 +00:00
Emil Velikov	ed092a8e1f	pipe-loader: destroy sw_winsys on sw_release The sw pipe-loader implicitly handles winsys_create, thus it would make sense to implicitly destroy it upon releasing the loader. Currently we leak the sw_winsys when releasing the pipe-loader. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-02-22 03:26:28 +00:00
Emil Velikov	636ac989b2	vl/winsys_dri: cleanup vl_screen_create error path Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-22 03:26:27 +00:00
Emil Velikov	0c9912b266	targets/pipe-loader: link pipe-nouveau against libdrm Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-02-22 03:26:27 +00:00
Kenneth Graunke	6984a6be5c	meta: Eliminate samplers[] array in favor of using vec4_prefix. We don't need an array mapping the shader index to "sampler2DMS", "isampler2DMS", and so on. We can simply do "%ssampler2DMS" and pass in vec4_prefix, which is "", "i", or "u". This eliminates the use of C99 array initializers and should fix the MSVC build. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75344 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-21 19:18:07 -08:00
Kenneth Graunke	119aa50929	i965: Delete the fabulous target_to_target() function. gl_texture_object's Target field is never a cube face enumeration, so target_to_target is just the identity function. Aptly named, at least. I verified this by putting an assert(!"ZOMG, CUBES!") in the cube face case, and running Piglit. Nothing ever hit it. Beyond that, I inspected the code in mesa/main. This could probably also be deleted from i915, but I haven't tested there. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-21 19:17:55 -08:00
Kenneth Graunke	82f9ad8c60	i965: Fix S8 and X8 reversal in brw_depthbuffer_format refactor. In commit `09d9a8913e`, I accidentally botched the X8 and S8 cases. (I wrote this patch before realizing that X8 and S8 had been swapped in the big MESA_FORMAT rename, and apparently didn't rebase it properly after fixing that...) Fixes regressions in 13 Piglit tests on Ironlake. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75291 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-21 19:17:50 -08:00
Vinson Lee	5a0b08e9ea	mesa: Move declarations before code. This patch fixes these MSVC build errors introduced with `73b78f9c9f`. Compiling src\mesa\main\uniforms.c ... uniforms.c src\mesa\main\uniforms.c(291) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(294) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(294) : warning C4047: 'function' : 'gl_shader_program ' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(294) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 src\mesa\main\uniforms.c(306) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(309) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(309) : warning C4047: 'function' : 'gl_shader_program ' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(309) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 src\mesa\main\uniforms.c(322) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(325) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(325) : warning C4047: 'function' : 'gl_shader_program ' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(325) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 src\mesa\main\uniforms.c(345) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(348) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(348) : warning C4047: 'function' : 'gl_shader_program ' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(348) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 src\mesa\main\uniforms.c(360) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(363) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(363) : warning 
C4047: 'function' : 'gl_shader_program ' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(363) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 src\mesa\main\uniforms.c(376) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(379) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(379) : warning C4047: 'function' : 'gl_shader_program ' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(379) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 src\mesa\main\uniforms.c(588) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(591) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(591) : warning C4047: 'function' : 'gl_shader_program ' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(591) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 src\mesa\main\uniforms.c(603) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(606) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(606) : warning C4047: 'function' : 'gl_shader_program ' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(606) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 src\mesa\main\uniforms.c(619) : error C2143: syntax error : missing ';' before 'type' src\mesa\main\uniforms.c(622) : error C2065: 'shProg' : undeclared identifier src\mesa\main\uniforms.c(622) : warning C4047: 'function' : 'gl_shader_program *' differs in levels of indirection from 'int' src\mesa\main\uniforms.c(622) : warning C4024: '_mesa_uniform' : different types for formal and actual parameter 2 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-02-21 19:11:58 -08:00
Vinson Lee	aaefc85f3b	mesa/sso: Change CreateShaderProgramv return type from uint to GLuint. This patch fixes this MinGW build error. Compiling src/mapi/glapi/glapi_dispatch.c ... In file included from src/mapi/glapi/glapi_dispatch.c:41:0: build/windows-x86_64-debug/mapi/glapi/glapitable.h:930:4: error: expected specifier-qualifier-list before 'uint' uint (GLAPIENTRYP CreateShaderProgramv)(GLenum type, GLsizei count, const GLchar * const * strings); /* 886 */ ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-02-21 18:05:40 -08:00
Vinson Lee	34587e4a00	scons: Add main/pipelineobj.c to src/mesa/SConscript. This patch fixes this SCons build error. build/linux-x86_64-debug/mesa/libmesa.a(context.os): In function `init_attrib_groups': src/mesa/main/context.c:815: undefined reference to `_mesa_init_pipeline' Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-02-21 17:00:47 -08:00
Vinson Lee	897a5fa360	mesa/sso: Fix typo of 'unsigned'. Fix build error introduced with commit `f4c13a890f`. CC pixeltransfer.lo main/pipelineobj.c: In function '_mesa_delete_pipeline_object': main/pipelineobj.c:59:4: error: unknown type name 'unsinged' unsinged i; ^ Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-02-21 16:41:04 -08:00
Gregory Hainaut	4719ad79ec	mesa/sso: Implement _mesa_GetProgramPipelineiv This was originally included in another patch, but it was split out by Ian Romanick. v2 (idr): * Trivial reformatting. * Remove GL_COMPUTE_SHADER. Compute shaders don't participate in pipeline objects anyway. Suggested by Matt Turner. v3 (idr): * Use _mesa_has_geometry_shaders. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:03 -08:00
Gregory Hainaut	c171834b49	mesa/sso: Implement _mesa_ActiveShaderProgram This was originally included in another patch, but it was split out by Ian Romanick. v2 (idr): Return early from _mesa_ActiveShaderProgram if _mesa_lookup_shader_program_err returns an error. Suggested by Jordan. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> [v2]	2014-02-21 15:41:03 -08:00
Gregory Hainaut	e9ff3b9918	mesa/sso: Implement _mesa_CreateShaderProgramv This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:03 -08:00
Gregory Hainaut	3659eade53	mesa/sso: Refactor implementation of _mesa_CreateShaderProgramEXT This will allow the guts of the implementation to be shared with _mesa_CreateShaderProgramv. This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:03 -08:00
Gregory Hainaut	8ed8592fd6	mesa/sso: Add support for GL_PROGRAM_SEPARABLE query This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	4177d39c1e	mesa/sso: Implement _mesa_IsProgramPipeline Implement IsProgramPipeline based on the VAO code. This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	0c26552662	mesa/sso: Implement _mesa_GenProgramPipelines Implement GenProgramPipelines based on the VAO code. This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	55311557fd	mesa/sso: Implement _mesa_DeleteProgramPipelines Implement DeleteProgramPipelines based on the VAO code. This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	f4c13a890f	mesa/sso: Add pipeline container/state V1: * Extend gl_shader_state as pipeline object state * Add a new container gl_pipeline_shader_state that contains binding point of the previous object * Update mesa init/free shader state due to the extension of the attribute * Add an init/free pipeline function for the context V2: * Rename gl_shader_state to gl_pipeline_object * Rename Pipeline.PipelineObj to Pipeline.Current * Formatting improvement V3 (idr): * Split out from previous uber patch. * Remove '#if 0' debug printfs. V4 (idr): * Fix some errors in comments. Suggested by Jordan. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	0f137a1d73	mesa: Add a mutex and refcounting to gl_shader_state Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	47476fa673	mesa: Make get_shader_flags publicly available Future patches will use this function outside shaderapi.c. This was originally included in another patch, but it was split out by Ian Romanick. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Gregory Hainaut	73b78f9c9f	mesa/sso: Add extension entry points for GL_ARB_separate_shader_objects Nothing is implemented yet except glProgramUniform*, which are mostly a copy/paste of the older glUniform* functions. I created dedicated pipelineobj.[ch] files that will contain functions related to the "new" pipeline container object. V2: formatting improvement V3: * indentation fix * Update copyright * Add a comment on ProgramParameteri already present in another extension * Remove TODO, will be readded on correct patch V4 (idr): * Fix dispatch_sanity unit test * Make extension string available in core profiles (instead of just compatibility). * Trivial reformatting Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Ian Romanick	4d14b190bb	glsl/sso: Add parser and AST-to-HIR support for separate shader object layouts GL_ARB_separate_shader_objects adds the ability to specify location layouts for interstage inputs and outputs. In addition, this extension makes 'in' and 'out' generally available for shader inputs and outputs. This mimics the behavior of GL_ARB_explicit_attrib_location. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Ian Romanick	f3b184590f	mesa/sso: Add extension tracking for ARB_separate_shader_objects This adds the necessary bits for both the API and the GLSL compiler. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:02 -08:00
Ian Romanick	79146065f9	mesa: Refactor per-stage link check to its own function Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-21 15:41:01 -08:00
Emil Velikov	68bc1e2025	specs: MESA_query_renderer.spec resolve a couple of typos Cc: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-21 22:52:46 +00:00
Emil Velikov	0432aa064b	configure: use shared-glapi when more than one gl* API is used Current behaviour states that shared-glapi is useful when building with dri, which is not the case. Shared-glapi is used to dispatch the gl* functions across the one or more gl api's which can be dri based but do not need to be. Fixed the following build ./configure --enable-gles2 --disable-dri --enable-gallium-egl \ --with-egl-platforms=fbdev --with-gallium-drivers=swrast Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75098 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-21 22:48:50 +00:00
Emil Velikov	9eae750317	configure: use default dri drivers whenever opengl and dri are enabled Commit ee55500c22a(configure: cleanup classic dri drivers handling) cleaned up the logic handling autodetection of dri drivers, but missed the case when one can explicitly disable dri, and still request opengl. Fixes build issues for the following ./autogen.sh --disable-dri --with-gallium-drivers=swrast While we're here, explicitly clear with_dri_drivers whenever building without such drivers to prevent choking later on. v2: Simplify with_dri_drivers handling. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75126 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-21 22:47:51 +00:00
Eric Anholt	c2ebbe2728	i965: Stop throwing away our double precision for time calculations. Fixes negative times being reported in our perf debug. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:50 -08:00
Eric Anholt	f2f337c6d5	meta: Add support for integer blits. Compared to i965, the code generated doesn't use the AVG instruction. But I'm not sure that multisampled integer resolves are really that important to worry about. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Eric Anholt	b0a8d0ee40	meta: Add support for doing MSAA to MSAA blits. These are non-stretched, non-resolving blits, so it's just a matter of sampling once from our gl_SampleID and storing that to our color/depth. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Eric Anholt	eb55b01eef	meta: Save and restore a bunch of MSAA state. We're disabling GL_MULTISAMPLE, so we didn't need to worry about a lot of that state. But to do MSAA to MSAA blits, we need to start handling more state. v2: Fix pasteo caught by Kenneth. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Eric Anholt	f7f15d3c2d	meta: Try to do blending of sRGB values in linear colorspace. Blending of values would occur when doing GL_LINEAR filtering with scaling, and in an upcoming commit when doing MSAA resolves. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Eric Anholt	7d2f73e737	meta: Add support for doing multisample resolves. Note that this doesn't handle GL_EXT_multisample_scaled_blit yet. The i965 code for that extension bakes in knowledge of the sample positions (well, knowledge of the sample positions aligned to a lower-resolution grid), which we would have to do at runtime somehow for meta. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Eric Anholt	aba85d960e	i965: Fix miptree matching for multisampled, non-interleaved miptrees. We haven't been executing this code before the meta-blit case, because we've been flagging the miptree as validated at texstorage time, and never having to revalidate. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-21 10:43:38 -08:00
Courtney Goeltzenleuchter	941769be81	mesa: Remove unnecessary condition. Identified by Valgrind memory check. Initialized block-opaque in a different patch. This test seems unnecessary. If opaque must be true, just set to true. Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>	2014-02-21 10:16:10 -08:00
Francisco Jerez	9b2fe7cf96	clover: Unabbreviate a few data accessor names for consistency. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:23 +01:00
Francisco Jerez	a0d99937a0	clover: Replace the transfer(new ...) idiom with a safer create(...) helper function. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:22 +01:00
Francisco Jerez	c4578d2277	clover: Migrate a bunch of pointers and references in the object tree to smart references. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:22 +01:00
Francisco Jerez	d82b39ce38	clover: Allow storing a range into a container of different (but compatible) element type. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:22 +01:00
Francisco Jerez	1b9fb2fd91	clover: Define an intrusive smart reference class. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:22 +01:00
Francisco Jerez	9ae0bd3829	clover: Some improvements for the intrusive pointer class. Define some additional convenience operators, clean up the implementation slightly, and rename it to 'intrusive_ptr' for reasons that will be obvious in the next commit. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:51:22 +01:00
Francisco Jerez	198cd136b9	clover: Fix up NULL constant pointer arguments. Tested-by: Tom Stellard <thomas.stellard@amd.com>	2014-02-21 12:29:05 +01:00
Jordan Justen	c97763ca2d	tgsi_ureg: add property_gs_invocations Fixes a build break in state_tracker/st_program.c Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75278 Reviewed-by: Dave Airlie <airlied@redhat.com>	2014-02-20 16:41:01 -08:00
Kenneth Graunke	1336ccb7dd	i965: Enable Broadwell support. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:51:38 -08:00
Kenneth Graunke	808952a095	i965/fs: Implement FS_OPCODE_[UN]PACK_HALF_2x16_SPLIT[_XY] opcodes. I'd neglected to port these to Broadwell. Most of this code is copy and pasted from Gen7, but instead of using F32TO16/F16TO32, we just use MOV with HF register types. Fixes fs-packHalf2x16 and fs-unpackHalf2x16 tests (both the ARB extension and ES 3.0 variants). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:59 -08:00
Kenneth Graunke	850e372fc7	i965: Drop bogus F32TO16/F16TO32 instructions on Broadwell - use MOV. Broadwell removed the F32TO16 and F16TO32 instructions. However, it has actual support for HF values, so they're actually just MOV. Fixes vs-packHalf2x16 and vs-unpackHalf2x16 tests (both the ARB extension and ES 3.0 variants). v2: Emulate F32TO16's align16 zeroing bug, since Chad's front end code relies on it happening. We can probably refactor this code to be better later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:57 -08:00
Kenneth Graunke	3663bbe773	i965: Create a hardware context before initializing state module. brw_init_state() calls brw_upload_initial_gpu_state(). If hardware contexts are enabled (brw->hw_ctx != NULL), this will upload some initial invariant state for the GPU. Without hardware contexts, we rely on this state being uploaded via atoms that subscribe to the BRW_NEW_CONTEXT bit. Commit `46d3c2bf4d` accidentally moved the call to brw_init_state() before creating a hardware context. This meant brw_upload_initial_gpu_state would always early return. Except on Gen6+, we stopped uploading the initial GPU state via state atoms, so it never happened. Fixes a regression since `46d3c2bf4d`. Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:08 -08:00
Kenneth Graunke	e3823147a5	i965/fs: Implement scratch read/write support for Broadwell. To make sure that both the Gen4 and Gen7 style messages work, I initially disabled the SHADER_OPCODE_GEN7_SCRATCH_READ optimization, ran Piglit, re-enabled it, and ran Piglit again. Both worked fine. Fixes 40 Piglit tests (most of the varying-packing category). v2: Move num_regs assertion from gen8_fs_generator to gen8_set_dp_scratch_message() (suggested by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:08 -08:00
Kenneth Graunke	29a6974403	i965: Add Gen8 assembly support for DP Scratch messages. The new accessors will make it easy to do Gen7-style scratch messages. v2: Move num_regs assertion from gen8_fs_generator into gen8_set_dp_scratch_message() (suggested by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:08 -08:00
Kenneth Graunke	a5e54c91a3	i965: Store absolute thread count in max_wm_threads on Broadwell. In the past, 3DSTATE_PS took an absolute number of threads. Conversely, on Broadwell you always program 64, and it implicitly scales based on the GT-level with no special programming. So, I stored 64 in brw_device_info::max_wm_threads. However, I didn't realize that we also use max_wm_threads to compute the size of the scratch space buffer. In that case, we really need the absolute number of threads. This patch hardcodes 3DSTATE_PS to use the value it expects, and changes max_wm_threads back to a (completely fake) absolute thread count (once again copied from Haswell). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:08 -08:00
Kenneth Graunke	dca84b4b5b	i965: Use MOV, not OR for setting URB write channel enables on Gen8+. On Broadwell, g0.5 contains the "Scratch Space Pointer"; using OR puts some bits of that into "ignored" sections of our message header. While this doesn't hurt, it's also not terribly /useful/. Using MOV is sufficient to set the only interesting bits in this part of the message header. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:07 -08:00
Kenneth Graunke	e643c7d036	i965: Implement a CS stall workaround on Broadwell. According to the latest documentation, any PIPE_CONTROL with the "Command Streamer Stall" bit set must also have another bit set, with five different options: - Render Target Cache Flush - Depth Cache Flush - Stall at Pixel Scoreboard - Post-Sync Operation - Depth Stall I chose "Stall at Pixel Scoreboard" since we've used it effectively in the past, but the choice is fairly arbitrary. Implementing this in the PIPE_CONTROL emit helpers ensures that the workaround will always take effect when it ought to. Apparently, this workaround may be necessary on older hardware as well; for now I've only added it to Broadwell as it's absolutely necessary there. Subsequent patches could add it to older platforms, provided someone tests it there. v2: Only flag "Stall at Pixel Scoreboard" when none of the other bits are set (suggested by Ian Romanick). v3: Prefix the function with "gen8" (requested by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v2) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-20 15:50:07 -08:00
Jordan Justen	741782b594	i965: support instanced GS on gen7 v3: * Properly prevent dual object mode execution when the invocation count > 1 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:09 -08:00
Jordan Justen	008338bc4e	i965: support gl_InvocationID for gen7 v2: * Make gl_InvocationID a system value v3: * Properly shift from R0.1 into DST.4 by adding GS_OPCODE_GET_INSTANCE_ID Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Acked-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:09 -08:00
Jordan Justen	d099019935	glsl: add gl_InvocationID variable for ARB_gpu_shader5 v2: * Make gl_InvocationID a system value Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:09 -08:00
Jordan Justen	22388e2208	main/shaderapi: GL_GEOMETRY_SHADER_INVOCATIONS GetProgramiv support v3: * Add check for ARB_gpu_shader5 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:09 -08:00
Jordan Justen	86d6b5546b	mesa: initialize gl_geometry_program Invocations field Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:09 -08:00
Jordan Justen	313402048f	glsl/linker: produce gl_shader_program Geom.Invocations Grab the parsed invocation count, check for consistency during linking, and finally save the result in gl_shader_program Geom.Invocations. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:08 -08:00
Jordan Justen	02dc74fbd7	glsl: parse invocations layout qualifier for ARB_gpu_shader5 _mesa_glsl_parse_state in_qualifier->invocations will store the invocations count. v3: * Use in_qualifier to allow the primitive to be specified separately from the invocations count (merge_qualifiers) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:08 -08:00
Jordan Justen	738c9c3c54	glsl: Generate error for invalid input layout declarations Fixes various piglit tests: spec/glsl-1.50/compiler/incorrect-in-layout-qualifier-*.geom Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:08 -08:00
Jordan Justen	0c558f9ee6	glsl: convert GS input primitive to use ast_type_qualifier We introduce a new merge_in_qualifier ast_type_qualifier which allows specialized handling of merging input layout qualifiers. By merging layout qualifiers into state->in_qualifier, we allow multiple input qualifiers. For example, the primitive type can be specified separately from the invocations count (ARB_gpu_shader5). state->gs_input_prim_type is moved into state->in_qualifier->prim_type state->gs_input_prim_type_specified is still processed separately so we can determine when the input primitive is specified. This is important since certain scenarios are not supported until after the primitive type has been specified in the shader code. v4: * Merge with compute shader input layout qualifiers Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-20 10:33:08 -08:00
Eric Anholt	5bc0b2f432	i965: Fix extra return value after winsys rb update refactor. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75172 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-20 10:15:13 -08:00
Eric Anholt	9245206cbf	i965/vs: Use samplers for UBOs in the VS like we do for non-UBO pulls. Improves performance of a dolphin emulator trace I had laying around by 3.60131% +/- 0.995887% (n=128). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-20 10:15:13 -08:00
Eric Anholt	9e3cab8881	i965/fs: Add an optimization pass to remove redundant flags movs. We generate steaming piles of these for the centroid workaround, and this quickly cleans them up. total instructions in shared programs: 1591228 -> 1590047 (-0.07%) instructions in affected programs: 26111 -> 24930 (-4.52%) GAINED: 0 LOST: 0 (Improved apps are l4d2, csgo, and dolphin) Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-20 10:15:13 -08:00
Roland Scheidegger	b2b2a2c06c	gallivm: add smallfloat to float conversion not relying on cpu denorm handling The previous code relied on cpu denorm support for converting small float formats (such as r11g11b10_float and r16_float) to floats, otherwise denorms are flushed to zero. We worked around that in llvmpipe blend code by reenabling denorms, but this did nothing for texture sampling. Now it would be possible to reenable it there too but I'm not really a fan of messing with fpu flags (and it seems we can't actually do it reliably with llvm in any case looking at some bug reports). (Not to mention if you actually have a lot of denorms in there, you can expect some order-of-magnitude slowdown with x86 cpus.) So instead use code which adjusts exponents etc. directly hence not relying on cpu denorm support for the rescaling mul. (We still need the fpu flag handling as we can't do float-to-smallfloat without using cpu denorms at least for now - I actually wanted to keep both the old and new code and using one or the other depending on from where it's called but that didn't work out as the parameter would have to be passed through more layers than I'd like.) Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Si Chen <sichen@vmware.com>	2014-02-20 18:41:42 +01:00
Leo Liu	0206f0b3d4	st/omx/enc: add multi scaling buffers for performance improvement Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-20 13:34:16 +01:00
Christian König	754fa3a0d2	st/omx/dec/h264: fix prevFrameNumOffset handling Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-20 13:34:06 +01:00
Kenneth Graunke	57405605a8	i965: Actually claim to support MSAA on Broadwell. We need to advertise 8x, 4x, and 2x multisamples. Previously, we only claimed to support 0/1 samples. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:43:22 -08:00
Kenneth Graunke	4af8c95783	i965: Update physical width/height munging for 2x IMS MSAA. I can't find any documentation to explain what ought to be done here, so I simply guessed based on the pattern I observed in the 4x/8x cases. It appears to work, but it could be totally wrong. I was able to find the Sandybridge PRM quote from the comments in the latest documentation: Shared Functions > 3D Sampler > Multisampled Surface Behavior. However, it only mentions 4x MSAA - not even 8x. After a substantial amount more digging, I was able to find a second page (incorrectly tagged) which confirmed the formulas in our code for 8x MSAA. However, that page didn't mention 2x MSAA at all. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:43:22 -08:00
Kenneth Graunke	51145a24f7	i965: Enable smooth points when multisampling without point sprites. According to the "Point Multisample Rasterization" of the OpenGL specification (3.0 or later), smooth points are supposed to be enabled implicitly when multisampling, regardless of the GL_POINT_SMOOTH flag. However, if GL_POINT_SPRITE is enabled, you get square points no matter what. Core contexts always enable point sprites, so this effectively makes smooth points go away, even in the case of multisampling. Fixes Piglit's EXT_framebuffer_multisample/point-smooth tests. (Yes, that's right folks, we actually have Piglit tests for this.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:43:22 -08:00
Kenneth Graunke	a3d70580b5	i965: Thwack multisample enable bit in 3DSTATE_RASTER. The meaning and effects of this bit are surprisingly complicated. See Rasterization > Windower > Multisampling > Multisample ModesState. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:43:22 -08:00
Kenneth Graunke	0c5873c9b9	i965: Only use the SIMD16 program for per-sample shading on Broadwell. This restriction carries forward from earlier platforms. The code is ported straight from gen7_wm_state.c. v2: Actually do it right. v3: Add missing _NEW_MULTISAMPLE bit (caught by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:42:54 -08:00
Kenneth Graunke	61d7ea4b16	i965: Set "Position XY Offset Select" bits in 3DSTATE_PS on Broadwell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:42:16 -08:00
Kenneth Graunke	01c42b2be6	i965: Add missing sample shading bits to Gen8's 3DSTATE_PS_EXTRA. v2: Also set the "oMask Present to Render Target" bit, which is required for shaders that write oMask. Otherwise the hardware won't expect the extra data. v3: Add missing _NEW_MULTISAMPLE (caught by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:42:02 -08:00
Kenneth Graunke	77c37ed74b	i965/fs: Implement FS_OPCODE_SET_OMASK on Broadwell. I made a few changes which I think simplify the code a bit compared to the Gen7 implementation, but which are largely pointless. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:39:41 -08:00
Kenneth Graunke	5476da79f8	i965/fs: Implement FS_OPCODE_SET_SAMPLE_ID on Broadwell. Largely cut and paste from Gen7; it works the same way. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:39:41 -08:00
Kenneth Graunke	80c4edfc27	i965: Disable MCS on Broadwell for now. v2: Add a perf_debug() message to remind us to come back to this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:39:21 -08:00
Kenneth Graunke	4eba0d124d	i965: Use gen7_surface_msaa_bits in Broadwell SURFACE_STATE code. We already set the number of samples, but were missing the MSAA layout mode. Reusing gen7_surface_msaa_bits makes it easy to set both. This also lets us drop the Gen8 surface_num_multisamples function. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:35:54 -08:00
Kenneth Graunke	6eeae17c02	i965: Use ffs() for sample counting in gen7_surface_msaa_bits(). The enumerations are just log2(num_samples) shifted by 3, which we can easily compute via ffs(). This also makes it reusable for Broadwell, which has 2x MSAA. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:35:53 -08:00
Kenneth Graunke	2ed5824a5d	i965: Simplify Broadwell's 3DSTATE_MULTISAMPLE sample count handling. These enumerations are simply log2 of the number of multisamples shifted by a bit, so we can calculate them using ffs() in a lot less code. Suggested by Eric Anholt. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:35:32 -08:00
Ian Romanick	7700c73cf4	glsl: Silence "type qualifiers ignored on function return type" warning The const in const unsigned foo(void); is meaningless. Removing it silences this warning: src/glsl/ast_to_hir.cpp:1802:56: warning: type qualifiers ignored on function return type [-Wignored-qualifiers] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-02-19 15:08:50 -08:00
Ian Romanick	2c85fd5a96	glsl: Only warn for macro names containing __ From page 14 (page 20 of the PDF) of the GLSL 1.10 spec: "In addition, all identifiers containing two consecutive underscores (__) are reserved as possible future keywords." The intention is that names containing __ are reserved for internal use by the implementation, and names prefixed with GL_ are reserved for use by Khronos. Names simply containing __ are dangerous to use, but should be allowed. Per the Khronos bug mentioned below, a future version of the GLSL specification will clarify this. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de> Cc: Tapani Pälli <lemody@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870 Bugzilla: Khronos #11702	2014-02-19 15:08:50 -08:00
Ian Romanick	0bd7892630	glcpp: Only warn for macro names containing __ Section 3.3 (Preprocessor) of the GLSL 1.30 spec (and later) and the GLSL ES spec (all versions) say: "All macro names containing two consecutive underscores ( __ ) are reserved for future use as predefined macro names. All macro names prefixed with "GL_" ("GL" followed by a single underscore) are also reserved." The intention is that names containing __ are reserved for internal use by the implementation, and names prefixed with GL_ are reserved for use by Khronos. Since every extension adds a name prefixed with GL_ (i.e., the name of the extension), that should be an error. Names simply containing __ are dangerous to use, but should be allowed. In similar cases, the C++ preprocessor specification says, "no diagnostic is required." Per the Khronos bug mentioned below, a future version of the GLSL specification will clarify this. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de> Cc: Tapani Pälli <lemody@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870 Bugzilla: Khronos #11702	2014-02-19 15:08:50 -08:00
Tom Stellard	a4c734297f	configure: Use LLVM shared libraries by default Linking with LLVM static libraries is easily broken by changes to the llvm-config program or when LLVM adds, removes, or changes library components. Keeping up with these changes requires a lot of maintenance effort to keep the build working on the master and stable branches. Also, because of past issues with LLVM static libraries, the release manager is currently configuring with --with-llvm-shared-libs when checking the build before release. Enabling shared libraries by default would allow the release manager to run ./configure with no arguments, and be reasonably confident that the build would succeed. Acked-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-19 14:35:49 -05:00
Francisco Jerez	8928d7860a	i965/fs: Allocate the param_size array dynamically. Useful because the total number of uniform components might exceed MAX_UNIFORMS * 4 in some cases because of the image metadata we'll be passing as push constants. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 19:03:56 +01:00
Francisco Jerez	eef710fc53	i965/fs: Use a separate variable to keep track of the last uniform index seen. Like the VEC4 back-end does. It will make dynamic allocation of the param_size array easier in a future commit. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 19:03:56 +01:00
Rob Clark	9186cd39d4	freedreno: tweak ringbuffer sizes/count Since we are now consuming two ringbuffers at a time, we probably want a pool larger than 4.. but we don't need each individual ringbuffer to be so large, so offset the pool size increase by reducing rb size. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-19 12:02:57 -05:00
Rob Clark	5993723471	freedreno/a3xx/compiler: scheduling/legalize fixes It seems the write-after-read hazard that applies to texture fetch instructions, also applies to sfu instructions. Also, cat5/cat6 instructions do not have a (ss) bit, so in these cases we need to insert a dummy nop instruction with (ss) bit set. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-19 12:01:26 -05:00
Francisco Jerez	bbf8239f92	i965: Have brw_imm_vf4() take the vector components as integer values. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:56:57 +01:00
Francisco Jerez	51b00c5cb9	i965: Add helper function to find out the signedness of a register type. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:56:57 +01:00
Francisco Jerez	560f10e573	i965/vec4: Use swizzle() in the ARB_vertex_program code. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	8797ccf3fa	i965/fs: Use offset() in the ARB_fragment_program code. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	6f56d5dc60	i965/fs: Remove fs_reg::retype. There doesn't seem to be any reason for it to be a method, and it's surprising that the expression 'reg.retype(t)' doesn't retype its object but rather it creates a temporary with the new type. Use 'retype(reg, t)' instead. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	3b03273275	i965/vec4: Trivial improvements to the with_writemask() function. Add assertion that the register is not in the HW_REG or IMM file, calculate the conjunction of the old and new mask instead of replacing the old [consistent with the behavior of brw_writemask(), causes no functional changes right now], make it static inline to let the compiler do a slightly better job at optimizing things, and shorten its name. v2: Assert that the new writemask is not zero to avoid undefined hardware behaviour. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	42b226ef82	i965: Make sure that backend_reg::type and brw_reg::type are consistent for fixed regs. And define non-mutating helper functions to retype fixed and normal regs with a common interface. At some point we may want to get rid of ::fixed_hw_reg completely and have fixed regs use the normal register data members (e.g. backend_reg::reg to select a fixed GRF number, src_reg::swizzle to store the swizzle, etc.), I have the feeling that this is not the last headache we're going to get because of the multiple ways to represent the same thing and the different register interface depending on the file a register is stored in... Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	98306e727b	i965/vec4: Add non-mutating helper functions to modify src_reg::swizzle and ::negate. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	2337820d49	i965: Add non-mutating helper functions to modify the register offset. Yes, we could avoid having four copies of essentially the same code by using templates here. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	af25addcd0	i965/vec4: Fix off-by-one register class overallocation. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	a32817f3c2	i965: Unify fs_generator:: and vec4_generator::mark_surface_used as a free function. This way it can be used anywhere. I need it from the visitor. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:25 +01:00
Francisco Jerez	ae8b066da5	i965: Move up duplicated fields from stage-specific prog_data to brw_stage_prog_data. There doesn't seem to be any reason for nr_params, nr_pull_params, param, and pull_param to be duplicated in the stage-specific subclasses of brw_stage_prog_data. Moving their definition to the common base class will allow some code sharing in a future commit, the removal of brw_vec4_prog_data_compare and brw__prog_data_free, and the simplification of the stage-specific brw__prog_data_compare. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 16:27:22 +01:00
Francisco Jerez	7f00c5f1a3	i965/vec4: Add constructor of src_reg from a fixed hardware reg. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-19 15:10:57 +01:00
Kenneth Graunke	98e048cf32	i965: Enable fast depth clears. They work fine now, too. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-19 01:46:17 -08:00
Kenneth Graunke	7023786417	i965: Enable HiZ on Broadwell. It appears to work fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-19 01:46:17 -08:00
Kenneth Graunke	8cad1c115a	i965: Implement HiZ resolves on Broadwell. Broadwell's 3DSTATE_WM_HZ_OP packet makes this much easier. Instead of programming the whole pipeline, we simply have to emit the depth/stencil packets, a state override, and a pipe control. Then arrange for the state to be put back. This is easily done from a single function. v2: Use minify(mt->logical_{width,height}0, level) in 3DSTATE_WM_HZ_OP instead of intel_mipmap_level's width/height fields. Those were based on the physical width/height, and thus wrong for MSAA buffers. Eric also deleted those fields. v3: Use 0xFFFF as the sample mask regardless of what the user set (as this operation is unrelated); set the drawing rectangle to the miplevel being operated on, rather than the whole surface; remove unnecessary MAX2(..., 1) around mt->logical_depth0 (all suggested by Eric Anholt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-19 01:46:17 -08:00
Kenneth Graunke	82711611cf	i965: Refactor Gen8 depth packet emission. The existing code followed the vtable function signature, which is not a great fit: many of the parameters are unused, and the function still inspects global state, making it less reusable. This patch refactors the depth buffer packet emission code into a new function which takes exactly the parameters it needs, and which uses no global state. It then makes the existing vtable function call the new one. Ideally, we would remove the vtable function, and clean up that interface. But that can happen once HiZ is working. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-19 01:46:17 -08:00
Kenneth Graunke	67f073b91c	i965: Add #defines for the 3DSTATE_WM_HZ_OP packet's contents. We're going to need these to implement HiZ. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-19 01:46:17 -08:00
Kenneth Graunke	577fdf1f48	i965: Bump generation check in code to disable HiZ at LODs > 0. Broadwell's "HiZ Resolve" operation still has the restriction that the rectangle primitive must be 8x4 aligned. So I believe we still need this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-19 01:46:17 -08:00
Kenneth Graunke	a5d2eb6b98	i965: Program 3DSTATE_HIER_DEPTH_BUFFER properly on Broadwell. HiZ buffers still don't exist, but when they do, we'll set them up. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-19 01:46:16 -08:00
Kenneth Graunke	09d9a8913e	i965: Pull format conversion logic out of brw_depthbuffer_format. brw_depthbuffer_format is not very reusable at the moment, since it uses global state (ctx->DrawBuffer) to access a particular depth buffer. For HiZ on Broadwell, I need a function which simply converts the formats. However, at least one existing user of brw_depthbuffer_format really wants the existing interface. So, I've created a new function. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-19 01:46:16 -08:00
Chia-I Wu	4695f64895	egl: clarify what _eglInitResource does It is a helper called from the initializers of its subclasses.	2014-02-19 13:08:54 +08:00
Chia-I Wu	dc97e54d97	Revert "egl: Unhide functionality in _eglInitContext()" This reverts commit `1456ed85f0`. _eglInitResource can and is supposed to be called on subclass objects. Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-02-19 13:08:52 +08:00
Chia-I Wu	924490a747	Revert "egl: Unhide functionality in _eglInitSurface()" This reverts commit `498d10e230`. _eglInitResource can and is supposed to be called on subclass objects. Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>	2014-02-19 13:08:44 +08:00
Kenneth Graunke	c593ad6e46	i965: Bump MaxTexMbytes from 1GB to 1.5GB. Even with the other limits raised, TestProxyTexImage would still reject textures > 1GB in size. This is an artificial limit; nothing prevents us from having a larger texture. I stayed shy of 2GB to avoid the larger-than-aperture situation. For 3D textures, this raises the effective limit: - RGBA8: 645 -> 738 - RGBA16: 512 -> 586 - RGBA32F: 406 -> 465 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-18 18:59:24 -08:00
Kenneth Graunke	6c04423153	i965: Bump GL_MAX_CUBE_MAP_TEXTURE_SIZE to 8192. Gen4+ supports 8192x8192 cube maps. Ivybridge and later can actually support 16384, but that would place GL_MAX_CUBE_MAP_TEXTURE_SIZE above GL_MAX_TEXTURE_SIZE, which seems like a bad idea. (Unfortunately, we can't bump GL_MAX_TEXTURE_SIZE to 16384 without causing regressions due to awful W-tiled stencil buffer interactions.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-18 18:59:18 -08:00
Kenneth Graunke	06b047ebc7	i965: Bump MAX_3D_TEXTURE_SIZE to 2048. It's highly unlikely that there will be enough memory in the system to allocate enough space for this, but we should still expose the hardware limit. It's what the Intel Windows driver does, and it seems most other vendors do likewise. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74130 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-18 18:58:57 -08:00
Ian Romanick	f0fdee5095	docs: Trivial updates to MESA_query_renderer.spec Fix the version and the status before sending to Khronos for listing in the registry. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-18 15:25:04 -08:00
Sinclair Yeh	6c9d6898fd	Prevent zero sized wl_egl_window It is illegal to create or resize a window to zero (or negative) width and/or height. This patch prevents such a request from happening.	2014-02-18 14:12:11 -08:00
Anuj Phogat	03597cf802	glsl: Fix condition to generate shader link error GL_ARB_ES2_compatibility doesn't say anything about shader linking when one of the shaders (vertex or fragment shader) is absent. So, the extension shouldn't change the behavior specified in the GLSL specification. Tested the behavior on the proprietary Linux drivers of NVIDIA and AMD. Both of them allow linking a version 100 shader program in an OpenGL context when one of the shaders is absent. Makes the following Khronos CTS tests pass: successfulcompilevert_linkprogram.test successfulcompilefrag_linkprogram.test Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-18 11:07:09 -08:00
Anuj Phogat	6bd2472a8b	mesa: Add GL_TEXTURE_CUBE_MAP_ARRAY to legal_get_tex_level_parameter_target() Fixes failing Khronos CTS test packed_depth_stencil_init.test Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-18 11:07:09 -08:00
Eric Anholt	d92f593d87	i965/fs: Use conditional sends to do FB writes on HSW+. This drops the MOVs for header setup, which are totally mis-scheduled. total instructions in shared programs: 1590047 -> 1589331 (-0.05%) instructions in affected programs: 43729 -> 43013 (-1.64%) GAINED: 0 LOST: 0 glb27-trex (x = before, + = after; ASCII distribution plot omitted): before: N 49, Min 62.33, Max 65.41, Median 63.49, Avg 63.53449, Stddev 0.62757822; after: N 50, Min 62.28, Max 65.4, Median 63.7, Avg 63.6982, Stddev 0.656564. No difference proven at 95.0% confidence Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 10:11:36 -08:00
Eric Anholt	4226798354	i965/fs: Drop dead comment about the old proj_attrib_mask optimization. The code was removed early last year. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 10:01:45 -08:00
Eric Anholt	f128bcc7c2	i965: Drop mt->levels[].width/height. It often confused people because it was unclear on whether it was the physical or logical, and people needed the other one as well. We can recompute it trivially using the minify() macro, clarifying which value is being used and making getting the other value obvious. v2: Fix a pasteo in intel_blit.c's dst flip. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> (v1) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 10:01:45 -08:00
Eric Anholt	4e0924c5de	i965: Move singlesample_mt to the renderbuffer. Since only window system renderbuffers can have a singlesample_mt, this lets us drop a bunch of sanity checking to make sure that we're just a renderbuffer-like thing. v2: Fix a badly-written comment (thanks Kenneth!), drop the now trivial helper function for set_needs_downsample. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 10:01:45 -08:00
Eric Anholt	019560c127	i965: Drop some duplicated code in DRI winsys BO updates. The only DRI2 vs DRI3 delta was just how to decide about frontbuffer-ness for doing the upsample. v2: Fix missing singlesample_mt->region->name update in the merged code, which would have broken the DRI2 don't-recreate-the-miptree optimization. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 09:56:36 -08:00
Eric Anholt	0440e677b9	i965: Simplify intel_miptree_updownsample. Pretty silly to pass in values dereferenced out of one of the arguments. v2: Get the destination size from the dst, even though the callers are always dealing with src size == dst size cases. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 09:56:34 -08:00
Eric Anholt	bbd85ad27c	i965: Don't try to use the ctx->ReadBuffer when asked to blorp miptrees. So far it's happened to be that we're only ever calling intel_miptree_blit() (up/downsampling) from the ReadBuffer, but I stumbled over a null ReadBuffer case when debugging later parts of the series. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 09:56:32 -08:00
Eric Anholt	af4f758a44	i965: Make the mt->target of multisample renderbuffers be 2D_MS. Mostly mt->target == 2D_MS just results in a few checks that we don't try to allocate multiple LODs and don't try to do slice copies with them. But with the introduction of binding renderbuffers to textures, we need more consistency. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 09:56:29 -08:00
Eric Anholt	4e4a537ad5	meta: Push into desktop GL mode when doing meta operations. This lets us simplify our shaders, and rely on GLES-prohibited functionality (like ARB_texture_multisample) when writing these driver-internal functions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 09:56:27 -08:00
Eric Anholt	b3dcce65c9	meta: Fix blit shader compile on non-glsl-130 drivers. Compare this VS to the one for the post-130 case. Fixes piglit glsl-lod-bias, and presumably tons of other code (I haven't done a full piglit run on swrast). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74911 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-18 09:56:06 -08:00
Rob Clark	20d14ef263	configure: fix build error with XA Fixes: xa_tracker.c: In function 'xa_tracker_create': xa_tracker.c:147:5: error: implicit declaration of function 'pipe_loader_drm_probe_fd' [-Werror=implicit-function-declaration] in some build configurations, as XA now implicitly depends on gallium_drm_loader. Signed-off-by: Rob Clark <robclark@freedesktop.org> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-02-18 08:12:37 -05:00
Michel Dänzer	cf0172d46a	r600g,radeonsi: Consolidate logic for short-circuiting flushes Fixes radeonsi emitting command streams to the kernel even when there have been no draw calls before a flush, potentially powering up the GPU needlessly. Incidentally, this also cuts the runtime of piglit gpu.py in about half on my Kaveri system, probably because an X11 client going away no longer always results in a command stream being submitted to the kernel via glamor. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=65761 Cc: "10.1" mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-02-18 10:46:23 +09:00
Emil Velikov	adad8fb2e9	st/dri: remove #ifdef DRM_CAP_PRIME guard Required for libdrm 2.4.37 and earlier. Both scons and automake require version 2.4.38 now, so that guard is no longer needed. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:08:26 +00:00
Emil Velikov	6fbd00e43a	automake: remove leftover XORG and LIBKMS variables No longer set or used since the removal of st/xorg. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:08:03 +00:00
Emil Velikov	4b3a4c799a	scons: sync package requirements xorg-server and libkms are no longer required since the removal of the xorg state-tracker. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:04:07 +00:00
Emil Velikov	5fe47969c0	configure: bump up libdrm requirement to 2.4.38 This is the first version that introduced DRM_CAP_PRIME, which is implicitly required by egl/wayland. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:04:02 +00:00
Emil Velikov	f41102b538	configure: use test -n whenever possible Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:00:30 +00:00
Emil Velikov	8015ffeea1	configure: use test -z whenever possible Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:00:23 +00:00
Emil Velikov	ee55500c22	configure: cleanup classic dri drivers handling * Make sure that only drivers that are handled by configure.ac are included in DRI_DIRS. * Change with_dri_drivers default value to auto, and enable autodetection when enable_opengl is on. v2: Move "test" to the correct location. v3: Squash DRI_DIRS handling before the switch statement. Suggested by Ilia Mirkin Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:00:19 +00:00
Emil Velikov	35f6eed742	configure: compact ppc/sparc DRI_DIRS handling Both arches have the same list of dri_dirs. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:00:13 +00:00
Emil Velikov	65e67b9bf7	configure: drop explicit DRI_DIRS assignment on some platforms/arches Both x86_64|amd64 and *bsd already set the full range of available classic dri drivers. Drop the explicit assignment, and fall back to the generic default. Keep the explicit list for platforms/arches that do not handle the default list. Update help strings to explicitly mention "classic" for applicable DRI drivers. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-18 00:00:05 +00:00
Emil Velikov	49e93e8945	configure: cleanup switch statement Move all the cases within one switch statement and handle i9{1,6}5 and r{adeon,200} independently. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-17 23:59:25 +00:00
Kusanagi Kouichi	d23f9e3390	targets/vdpau: Don't link unused libraries libvdpau, libselinux and libexpat are not used. Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>	2014-02-17 21:14:17 +00:00
Kusanagi Kouichi	6ba4392da2	configure: Try pkg-config first for libselinux v2 (Emil) Add SELINUX_CFLAGS in the respective locations Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)	2014-02-17 21:14:16 +00:00
Kusanagi Kouichi	61f6cddef7	targets/vdpau: Always use c++ to link If built without llvm, the following error occurs with mplayer: Failed to open VDPAU backend .../libvdpau_r600.so: undefined symbol: _ZTVN10__cxxabiv117__class_type_infoE [vo/vdpau] Error when calling vdp_device_create_x11: 1 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-17 21:14:16 +00:00
Ilia Mirkin	6958fb341f	st/xvmc: fix tests so that they pass Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-16 23:21:57 -05:00
Rob Clark	8b5f894e13	pipe-loader: add pipe loader for freedreno/msm Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:36:23 -05:00
Rob Clark	24fa96163a	st/xa: missing handle type DRM_API_HANDLE_TYPE_SHARED is zero, so doesn't actually fix anything. But we shouldn't rely on SHARED handle type being zero. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:36:23 -05:00
Rob Clark	42158926c6	st/xa: use pipe-loader to get screen This lets multiple gallium drivers use XA. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:36:19 -05:00
Rob Clark	a122c75599	pipe-loader: split out "client" version Build two versions of pipe-loader, with only the client version linking in x11 client side dependencies. This will allow the XA state tracker to use pipe-loader. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:31:10 -05:00
Rob Clark	d73b2c0517	freedreno/a3xx/compiler: use (ss) for WAR hazards Seems texture sample instructions don't immediately consume their src(s). In fact, some shaders from the blob compiler seem to indicate that it does not even count the texture sample instructions when calculating the number of delay slots to fill for non-sample instructions. (Although so far it seems inconclusive as to whether this is required.) In particular, when a src register of a previous texture sample instruction is clobbered, the (ss) bit is needed to synchronize with the tex pipeline to ensure it has picked up the previous values before they are overwritten. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:17:23 -05:00
Rob Clark	e8cca57a3f	freedreno/a3xx/compiler: fix RA typo Was supposed to be a '+', otherwise we end up with a negative offset and choosing registers below the assigned range. This seems to fix the scheduling mystery "solved" by adding in extra delay slots. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:17:23 -05:00
Rob Clark	579473f8f8	freedreno/a3xx/compiler: handle kill properly (new compiler) Since 'kill' does not produce a result, the new compiler was happily optimizing them out. We need to instead track 'kill's similar to outputs. But since there is no non-predicated kill instruction (and for flattened if/else we do want them to be predicated), we need to track the topmost branch condition on the stack and use that as src arg to the kill. For a kill at the topmost level, we have to generate an immediate 1.0 to feed into the cmps.f for setting the predicate register. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:17:23 -05:00
Rob Clark	e35747b882	freedreno/a3xx/compiler: trans_cmp() sanity Thanks to figuring out 32bit float render target, and adding regdump test in fdre-a3xx, I can more easily play around with instructions to figure out range of inputs/outputs/etc. And from this I can conclude that cmps.f works more like expected and I can do something much more simple in trans_cmp() (compared to before which was more closely emulating the instruction sequence of the blob compiler). And using sel.b32 (binary 0/1) often makes more sense than sel.f32 (+/- float) or sel.u32 (+/- uint) as it can use the output directly from cmps.f without needing the 'add.s tmp0, tmp0, -1'. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:17:23 -05:00
Rob Clark	89dc282581	freedreno: fix problems if no color buf bound Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:17:23 -05:00
Eric Anholt	1020d8937e	meta: Don't try to enable FF texturing when we're using GLSL. On a core context, this would throw an error. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-14 12:09:42 -08:00
Carl Worth	a92581acf2	main: Avoid double-free of shader Label As documented, the _mesa_free_shader_program_data function: "Frees all the data that hangs off a shader program object, but not the object itself." This means that this function may be called multiple times on the same object, (and has been observed to). Meanwhile, the shProg->Label field was not being set to NULL after its free(). This led to a second call to free() of the same address on the second call to this function. Fix this by setting this field to NULL after free(), (just as with all other calls to free() in this function). Reviewed-by: Brian Paul <brianp@vmware.com> CC: mesa-stable@lists.freedesktop.org	2014-02-14 11:45:48 -08:00
Brian Paul	e4a5a9fd2f	gallium/pipebuffer: change pb_cache_manager_create() size_factor to float Requested by Marek. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 09:56:55 -07:00
Thomas Hellstrom	141e39a893	svga/winsys: Propagate surface shared information to the winsys The linux winsys needs to know whether a surface is shared. For guest-backed surfaces we need this information to avoid allocating a mob out of the mob cache for shared surfaces, but instead allocate a shared mob, that is never put in the mob cache, from the kernel. Also previously, all surfaces were given the "shareable" attribute when allocated from the kernel. This is too permissive for client-local surfaces. Now that we have the needed info, only set the "shareable" attribute if the client indicates that it needs to share the surface. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	fe6a854477	svga/winsys: implement GBS support This is a squash commit of many commits by Thomas Hellstrom. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Thomas Hellstrom	59e7c59621	gallium/util: Add flush/map debug utility code Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Thomas Hellstrom	8af358d8bc	gallium/pipebuffer: Add a cache buffer manager bypass mask In some situations, it may be desirable to bypass the cache at buffer creation but to insert the buffer in the cache at buffer destruction. One such situation is where we already have a kernel representation of a buffer that we want to use, but we also want to insert it in the cache when it's freed up. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Thomas Hellstrom	c9e9b1862b	pipebuffer, winsys: Add a size match parameter to the cached buffer manager In some situations it's important to restrict the sizes of buffers that the cached buffer manager is allowed to return Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	3d1fd6df53	svga: update texture code for GBS Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	72b0e959fc	svga: update buffer code for GBS Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	e0a6fb09bd	svga: add new helper functions for GBS buffers Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	6476bcbc50	svga: remove a couple unneeded assertions Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	f8bbd8261d	svga: adjust adjustment for point coordinates Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	d0c22a6d53	svga: track which textures are rendered to Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	c1e60a61e8	svga: add helpers for tracking rendering to textures Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	f84c830b14	svga: update shader code for GBS Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	2f1fc8db10	svga: update constant buffer code for GBS Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	31dfefc47f	svga: add svga_have_gb_objects/dma() functions Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	823fbfdca7	svga: add new GBS commands And update some existing commands. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	d993ada50c	svga: update svga_winsys interface for GBS This adds new interface functions for guest-backed surfaces and adds a mobid parameter to the surface_relocation() function. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	024711385e	svga: update dumping code with new GBS commands, etc Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:44 -07:00
Brian Paul	2e0c90847f	svga: split / update svga3d header files The old svga3d_reg.h file is split into separate header files and we add new items for guest-backed surfaces. Plus some minor code fixes because of renamed symbols. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: "10.1" <mesa-stable@lists.freedesktop.org>	2014-02-14 08:21:43 -07:00
Grigori Goronzy	6d1cecbfd7	st/vdpau: add support for DEINTERLACE_TEMPORAL Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-14 09:05:20 +01:00
Grigori Goronzy	af34c3fd10	vl: add motion adaptive deinterlacer Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-14 08:55:33 +01:00
Leo Liu	f87dfc35bc	st/omx/enc: fix scaling src alignment issue Signed-off-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-14 08:50:32 +01:00
Alex Deucher	01e6371149	radeon: reverse DBG_NO_HYPERZ logic Change the flag to DBG_HYPERZ and reverse the logic so setting the flag enables the feature. This disables hyperz on r600g and radeonsi by default. It can be enabled by setting the env var. There are just too many issues with certain apps, so leave it disabled for now until we sort out the issues with the problematic apps. Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=58660 https://bugs.freedesktop.org/show_bug.cgi?id=64471 https://bugs.freedesktop.org/show_bug.cgi?id=66352 https://bugs.freedesktop.org/show_bug.cgi?id=68799 https://bugs.freedesktop.org/show_bug.cgi?id=72685 https://bugs.freedesktop.org/show_bug.cgi?id=73088 https://bugs.freedesktop.org/show_bug.cgi?id=74428 https://bugs.freedesktop.org/show_bug.cgi?id=74803 https://bugs.freedesktop.org/show_bug.cgi?id=74863 https://bugs.freedesktop.org/show_bug.cgi?id=74892 https://bugzilla.kernel.org/show_bug.cgi?id=70411 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org> Acked-by: Marek Olšák <marek.olsak@amd.com>	2014-02-13 20:55:54 -05:00
Tom Stellard	3c4bd95b62	pipe-loader: Add support for render nodes v2 v2: - Add missing call to pipe_loader_drm_release() - Fix render node macros - Drop render-node configure option	2014-02-13 19:53:15 -05:00
Tom Stellard	8481d208ce	pipe-loader: Add auth_x parameter to pipe_loader_drm_probe_fd() The caller can use this boolean parameter to tell the pipe-loader to authenticate with the X server when probing a file descriptor.	2014-02-13 19:53:15 -05:00
Christian König	0320ba9988	st/omx/dec/h264: fix pic_order_cnt_type==2 Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-13 18:00:44 +01:00
Ilia Mirkin	0c8b165366	nouveau: fix chipset checks for nv1a by using the oclass instead Commit `f4ebcd133b` ("dri/nouveau: NV17_3D class is not available for NV1a chipset") fixed this partially by using the correct 3d class. However there were a lot of checks left over comparing against the chipset. Reported-and-tested-by: John F. Godfrey <jfgodfrey@gmail.com> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 9.2 10.0 10.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-02-13 11:06:41 -05:00
Christian König	0ef3ce4155	st/omx: initial OpenMAX H264 encoder v7 v2 (chk): fix eos handling v3 (leo): implement scaling configuration support v4 (leo): fix bitrate bug v5 (chk): add workaround for bug in Bellagio v6 (chk): fix div by 0 if framerate isn't known, use separate pipe object for scale and transfer, always flush the transfer pipe before encoding v7 (chk): make suggested changes, cleanup a bit more, only advertise encoder on supported hardware Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com>	2014-02-13 11:11:24 +01:00
Christian König	9ff0cf903d	radeon/vce: initial VCE support v8 v2 (chk): revert feedback buffer hack v3 (slava): fixed bitstream size calculation v4 (chk): always create buffers in the right domain v5 (chk): flush async v6 (chk): rework fw interface add version check v7 (leo): implement cropping support v8 (chk): add hw checks Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Slava Grigorev <slava.grigorev@amd.com>	2014-02-13 11:11:24 +01:00
Christian König	cbdd052577	radeon/winsys: add VCE support v4 v2: add fw version query v3: add README.VCE v4: avoid error msg when kernel doesn't support it Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-13 11:11:24 +01:00
Ilia Mirkin	ef9a6ded10	nv50: mark scissors/viewports dirty on context switch Commit `246ca4b001` ("nv50: implement multiple viewports/scissors, enable ARB_viewport_array") added dirty tracking to scissors/viewports. However it neglected to mark them all as dirty on a context switch. This fixes an apparent regression in webgl in chrome, but probably in any application that switches contexts. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-02-13 10:08:29 +01:00
Christian König	1ef7b9de06	gallium/vl: remove remaining softpipe video functions Unused and unmaintained for quite a while. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2014-02-13 09:46:54 +01:00
Ilia Mirkin	18caef953f	docs: add nv50 to the ARB_viewport_array list	2014-02-12 22:14:41 -05:00
Ilia Mirkin	246ca4b001	nv50: implement multiple viewports/scissors, enable ARB_viewport_array Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2014-02-12 21:47:36 -05:00
Ilia Mirkin	a7012eede8	mesa/st: hardcode the viewport bounds range The bound range is disconnected from the viewport dimensions. This is the relevant bit from glViewportArray: """ The location of the viewport's bottom left corner, given by (x, y) is clamped to be within the implementation-dependent viewport bounds range. The viewport bounds range [min, max] can be determined by calling glGet with argument GL_VIEWPORT_BOUNDS_RANGE. Viewport width and height are silently clamped to a range that depends on the implementation. To query this range, call glGet with argument GL_MAX_VIEWPORT_DIMS. """ Just set it to +/-16384, as that is the minimum required by ARB_viewport_array and the value that all current drivers provide. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-13 12:44:36 +10:00
Brian Paul	f0e967f212	scons: add meta_blit.c to src/mesa/SConscript	2014-02-12 17:46:11 -07:00
Eric Anholt	255bd9c0b8	meta: Add acceleration for depth glBlitFramebuffer(). Surprisingly, the GLSL shaders already wrote the sampled r value to FragDepth. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51600 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-12 16:17:11 -08:00
Eric Anholt	067c7b67e8	meta: Use BindRenderbufferTexImage() for meta glBlitFramebuffer(). This avoids a CopyTexImage() on Intel i965 hardware without blorp. v2: Move the !readAtt check up higher. v3: Rebase on idr's changes, plus readAtt check is totally gone, and also fix a typo in a comment. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v2)	2014-02-12 16:17:11 -08:00
Eric Anholt	f29c25fc1d	i965: Add a driver hook for binding renderbuffers to textures. This will let us use meta's acceleration from renderbuffers without having to do a CopyTexImage first. This is like what we do for TFP, but just taking an existing renderbuffer and binding it to a texture with whatever its format was. The implementation won't work for stencil renderbuffers, and it only does non-texture renderbuffers (but then, if you're using a texture renderbuffer, you can just pull the texture object/level/slice out of the renderbuffer, anyway). v2: Don't forget to propagate NumSamples to the teximage. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-12 16:17:11 -08:00
Eric Anholt	431decf16f	meta: Do a massive unindent (and rename) of blitframebuffer_texture(). This function is only handling the color case. We can just unindent as long as we're willing to do the check for the bit outside of the function. v2: Rebase on idr's changes, drop readAtt check that's always non-null anyway (it's a pointer into the statically-allocated attachments array in the renderbuffer). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2014-02-12 16:17:11 -08:00
Eric Anholt	3e4ccf499e	meta: Move glBlitFramebuffer() to a separate file. v2: Drop a bunch of unnecessary includes (by Kenneth), rebase on idr's changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2014-02-12 16:17:08 -08:00
Eric Anholt	81ddbdaaba	meta: De-static some of meta's functions. I want to split some meta.c code off to a separate file, so these functions can't be static any more. v2: Rebase on idr's changes, also expose setup_blit_shader, blit_shader_table_cleanup, setup_vertex_objects, setup_ff_tnl_for_blit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2014-02-12 16:16:03 -08:00
Eric Anholt	2c8f182c86	meta: Move the meta structures to the meta header. I'd like to split some of our code to separate files, since 4k lines and growing is pretty unreasonable for all these separate operations. v2: Rebase on idr's changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)	2014-02-12 15:38:58 -08:00
Eric Anholt	cd084aa297	meta: Fold the texture setup into setup_copypix_texture(). There was this funny argument passed to setup for "did alloc decide we need to allocate new texture storage?", which goes away if we don't have the caller do alloc as a separate step. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 15:38:58 -08:00
Eric Anholt	397b2c3966	meta: Drop the src == dst restriction on meta glBlitFramebuffer(). From the GL_ARB_fbo spec: If the source and destination buffers are identical, and the source and destination rectangles overlap, the result of the blit operation is undefined. As far as I know, that's the only thing that would have been of concern for this. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 15:38:58 -08:00
Eric Anholt	a4f3e2ca0e	mesa: Make TexImage error cases about internalFormat more informative. I tripped over one of these when debugging meta, and it's a lot nicer to just see the internalFormat being complained about. v2: Drop a note in the other errors path that there is one early return. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 15:38:58 -08:00
Eric Anholt	56b031d8ae	meta: Rename the "sampler" stuff to "blit shader". While these structs are generated per GLSL sampler type, they're structs of data-about-shaders (notably, the ID of a shader program), not data-about-samplers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 15:38:57 -08:00
Eric Anholt	e455c8283b	meta: Drop a now-trivial helper function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 15:38:57 -08:00
Eric Anholt	e48a6378c9	meta: Fold the glUseProgram() into the blit program generator. Everyone was just immediately calling it and doing nothing else with the shader program id. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 15:38:57 -08:00
Eric Anholt	b719aa3902	meta: Simplify the blit shader setup steps. The only thing that wants to track the glsl_sampler structure is the shader string generator. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 15:38:57 -08:00
Francisco Jerez	b424da4be0	i965/vec4: Fix confusion between SWIZZLE and BRW_SWIZZLE macros. Most of the VEC4 back-end agrees on src_reg::swizzle being one of the BRW_SWIZZLE macros defined in brw_reg.h, except in two places where we use Mesa's SWIZZLE macros. There is even a doxygen comment saying that Mesa's macros are the right ones. They are incompatible swizzle representations (3 bits vs. 2 bits per component), and the code using Mesa's works by pure luck. Fix it. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 23:39:42 +01:00
Francisco Jerez	a3a55067bd	i965/fs: Remove fs_reg::sechalf. The same effect can be achieved using ::subreg_offset. Remove the less flexible alternative and define a convenience function to keep the fs_reg interface sane. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 23:39:24 +01:00
Francisco Jerez	019bf6ed8d	i965/fs: Remove fs_reg::smear. The same effect can be achieved using a combination of ::stride and ::subreg_offset. Remove the less flexible ::smear to keep the data members of fs_reg orthogonal. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 23:07:57 +01:00
Francisco Jerez	756d37b1d6	i965/fs: Add support for specifying register horizontal strides. v2: Some improvements for copy propagation with non-contiguous register strides and mismatching types. v3: Add example of the situation that the copy propagation changes are intended to avoid. Clarify that 'fs_reg::apply_stride()' is expected to work with zero strides too. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 23:07:57 +01:00
Francisco Jerez	4c7206bafd	i965/fs: Add support for sub-register byte offsets to the FS back-end IR. It would be nice if we could have a single 'reg_offset' field expressed in bytes that would serve the purpose of both, but the semantics of 'reg_offset' are quite complex currently (it's measured in units of one, eight or sixteen dwords depending on the register file and the dispatch width) and changing it to bytes would be a very intrusive change at this stage. Add a separate 'subreg_offset' field for now. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 23:07:57 +01:00
Brian Paul	248606a5f0	glsl: rename _restrict to restrict_flag To fix MSVC compile breakage. Evidently, _restrict is an MSVC keyword, though the docs only mention __restrict (with two underscores). Note: we may want to also rename _volatile to volatile_flag to be consistent. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74900 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 13:37:09 -07:00
Brian Paul	fd0620ff6c	mesa: assorted clean-ups in detach_shader() Fix formatting, add new comments, get rid of extraneous indentation. Suggested by Ian in bug 74723. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-12 11:21:47 -07:00
Brian Paul	23d4ff53d4	svga: replace out-of-temps assertion with debug warning Signed-off-by: Brian Paul <brianp@vmware.com>	2014-02-12 11:21:46 -07:00
Francisco Jerez	76f95ba272	mesa: Handle binding of uniforms to image units with glUniform(). v2: Set driver-specified flag in NewDriverState when glUniform is used to bind an image unit. v3: Abbreviate argument type check. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:06 +01:00
Francisco Jerez	212122543b	glsl/linker: Propagate image uniform access qualifiers to the driver. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:06 +01:00
Francisco Jerez	c318a677dd	glsl/linker: Assign image uniform indices. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:06 +01:00
Francisco Jerez	e51158f2e7	glsl/linker: Count and check image resources. v2: Add comment about the reason why image variables take up space from the default uniform block. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:06 +01:00
Francisco Jerez	e8dbe430aa	glsl: Add image built-in function generator. Because of the combinatorial explosion of different image built-ins with different image dimensionalities and base data types, enumerating all the 242 possibilities would be annoying and a waste of .text space. Instead use a special path in the built-in builder that loops over all the known image types. v2: Generate built-ins on GLSL version 4.20 too. Rename '_has_float_data_type' to '_supports_float_data_type'. Avoid duplicating enumeration of image built-ins in create_intrinsics() and create_builtins(). v3: Use a more orthodox approach for passing image built-in generator parameters. v4: Cosmetic changes. Acked-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:06 +01:00
Francisco Jerez	87acc7c650	glsl: Add built-in constants for ARB_shader_image_load_store. v2: Add them on GLSL version 4.20 too. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	6057300ec6	glcpp: Add built-in define for ARB_shader_image_load_store. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	60c89f8bff	glsl: Add built-in types defined by ARB_shader_image_load_store. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	7af167d2be	glsl/ast: Generalize some sampler variable restrictions to all opaque types. No opaque types may be statically initialized in the shader, all opaque variables must be declared uniform or be part of an "in" function parameter declaration, no opaque types may be used as the return type of a function. v2: Add explicit check for opaque types in interface blocks. Check for opaque types in ir_dereference::is_lvalue(). Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	2158749e52	glsl/ast: Forbid declaration of image variables in structures and uniform blocks. Aggregating images inside uniform blocks is explicitly disallowed by the standard, aggregating them inside structures is not (as of GL 4.4), but there is a similar problem as with atomic counters: image uniform declarations require either a "writeonly" memory qualifier or an explicit format qualifier, which are explicitly forbidden in structure member declarations. In the resolution of Khronos bug #10903 the same wording applied to atomic counters was decided to mean that they're not allowed inside structures -- Rejecting image member declarations within structures seems the most reasonable option for now. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	6b28528d1c	glsl/ast: Make sure that image argument qualifiers match the function prototype. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	81c167ef1c	glsl/ast: Verify that function calls don't discard image format qualifiers. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	94a95e03d9	glsl/ast: Validate and apply memory qualifiers to image variables. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	910311c4a6	glsl/parser: Handle image built-in types. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	f9cf61df3b	glsl/parser: Handle image memory qualifiers. v2: Make the "map" array static const. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	fcd869ed56	glsl/parser: Handle the early_fragment_tests input layout qualifier. v2: Only allow the early_fragment_tests qualifier in fragment shaders. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	b0b26faa25	glsl/lexer: Add new tokens for ARB_shader_image_load_store. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	299e869d25	glsl/ast: Keep track of type qualifiers defined by ARB_shader_image_load_store. v2: Add comment next to the read_only and write_only qualifier flags. Change temporary copies of the type qualifier mask to use uint64_t too. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	c116541b2c	glsl: Add gl_uniform_storage fields to keep track of image uniform indices. v2: Promote anonymous struct into named struct. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:05 +01:00
Francisco Jerez	bb13691d1c	glsl: Add image memory and layout qualifiers to ir_variable. v2: Add comment next to the read_only and write_only qualifier flags. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:44:04 +01:00
Francisco Jerez	107d03a6d5	glsl: Add helper methods to glsl_type for dealing with images. Add predicates to query if a GLSL type is or contains an image. Rename sampler_coordinate_components() to coordinate_components(). v2: Use assert instead of unreachable. v3: No need to use a separate code-path for images in coordinate_components() after merging image and sampler fields in the glsl_type structure. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:43:37 +01:00
Francisco Jerez	8a2508ee07	glsl: Add image type to the GLSL IR. v2: Reuse the glsl_sampler_dim enum for images. Reuse the glsl_type::sampler_* fields instead of creating new ones specific to image types. Reuse the same constructor as for samplers adding a new 'base_type' argument. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:39:48 +01:00
Francisco Jerez	9e611fc72d	glsl: Add ARB_shader_image_load_store extension enables. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-02-12 18:39:48 +01:00
Fredrik Höglund	9afbd04d89	mesa: Preserve the NewArrays state when copying a VAO Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72895 Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-12 18:22:42 +01:00
Maarten Lankhorst	fee0686c21	nouveau: create only 1 shared screen between vdpau and opengl This fixes bug 73200 "vdpau-GL interop fails due to different screen objects" in the same way radeon does. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-12 14:57:25 +01:00
Maarten Lankhorst	572a8345bf	gallium makefiles: use a linker script for building dri drivers Only export __driDriverExtensions by default, and radeon_drm_winsys_create on radeons. Remove -Bsymbolic which should no longer be needed. As a side effect, it ought to fix a manifestation of bug 73200 on radeon. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-12 13:51:51 +01:00
Matt Turner	025d99ce3c	glsl: Do not vectorize vector array dereferences. Array dereferences must have scalar indices, so we cannot vectorize them. Cc: "10.1" <mesa-stable@lists.freedesktop.org> Reported-by: Andrew Guertin <lists@dolphinling.net> Tested-by: Andrew Guertin <lists@dolphinling.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-11 16:05:55 -08:00
Ian Romanick	4cffd3e791	meta: Enable cubemap array texture support to decompress_texture_image Fixed piglit test getteximage-targets S3TC CUBE_ARRAY on systems that don't have libtxc_dxtn installed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	daa3eea877	meta: Add cubemap array support to generic blit shader code Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	e68aa12849	meta: Get the correct info log Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	10f7c54477	meta: Expand texture coordinate from vec3 to vec4 This will be necessary to support cubemap array textures because they use all four components. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	b2ad3dbfa4	meta: Use GLSL to decompress 2D-array textures Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72582 Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	c1417aae6c	meta: Use common GLSL code for blits Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	d524654c34	meta: Improve GLSL version check We want to use the GLSL 1.30-ish path for OpenGL ES 3.0. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	4825af972a	meta: Add rectangle textures to the shader-per-sampler-type table Rectangle textures were not necessary for mipmap generation (because they cannot have mipmaps), but all of the future users of this common code will need to support rectangle textures. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	f5a477ab76	meta: Refactor shader generation code out of mipmap generation path This is quite like code we want for blits. Pull it out so that it can be shared by other paths. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	ed3bc38ee7	meta: Refactor the table of glsl_sampler structures This will allow the same table of shader-per-sampler-type to be used for paths in meta other than just mipmap generation. This is also the reason the declarations of the structures was moved towards the top of the file. v2: Code formatting change suggested by Brian. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	b514f24101	meta: Use common vertex setup code for _mesa_meta_Bitmap too Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	75227a0968	meta: Add storage to the vertex structure for R, G, B, and A Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Ian Romanick	5e5d87ff32	meta: Use common routine to configure fixed-function TNL state Also... glOrtho(-1.0, 1.0, -1.0, 1.0, -1.0, 1.0) is the identity matrix, so drop the unnecessary call to _mesa_Ortho. v2: Rename setup_ff_TNL_for_blit() to setup_ff_tnl_for_blit(). Seems silly to capitalize one out of two to three acronyms in the name (change by anholt, acked by idr). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1) Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 16:00:12 -08:00
Kenneth Graunke	35e8de383c	i965: Fix General and Indirect Base Addresses on Broadwell. I set the "address modify enable" bit in the wrong DWord. The first DWord is the high 16 bits of the address, while the second is the low 32-bits and enable bit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 15:25:45 -08:00
Kenneth Graunke	b0e90ea09f	i965: Drop VECTOR_MASK_ENABLE in Broadwell's 3DSTATE_VS packet. We never set it on previous generations, but I had to set it in 3DSTATE_PS for correct behavior. For symmetry, I set it in 3DSTATE_VS as well, but there's no actual need to do so. Piglit works fine either way. The documentation also remarks that there should never be a need to program this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 15:25:29 -08:00
Kenneth Graunke	4dd1002518	i965/gs: Fix EndPrimitive on Broadwell. My earlier patch (i965: Reserve space for "Vertex Count" in GS outputs.) incremented Global Offset for most URB writes to make room for the new "Vertex Count" field, but failed to shift the URB writes used for writing control bits. Confusingly, Global Offset must be incremented by 2 here, rather than 1. The URB writes we use for actual data are HWord writes, which treat Global Offset as a 256-bit offset. These are OWord writes, so it's treated as a 128-bit offset instead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 15:25:03 -08:00
Kenneth Graunke	5ebfac8d72	i965/vec4: Support arbitrarily large sampler indices on Broadwell+. I added support for these on Haswell, but forgot to update the Broadwell code before landing it. Fixes Piglit's max-samplers test. v2: Use get_element_ud() for the destination as well as the source. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 15:24:36 -08:00
Kenneth Graunke	b371734331	i965/fs: Support arbitrarily large sampler indices on Broadwell+. I added support for these on Haswell, but forgot to update the Broadwell code before landing it. Partially fixes Piglit's max-samplers test. v2: Use get_element_ud() consistently, rather than using it for the source but using brw_vec1_grf for the destination. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 15:22:22 -08:00
Kenneth Graunke	0e21ba07f2	i965/fs: Fix Broadwell texture header setup to be uncompressed. MOV_RAW disables masking, but doesn't force the instruction to be uncompressed. That needs to be done by hand. Fixes textureGather and texture offset tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 15:21:10 -08:00
Ian Romanick	1edca151a0	mesa: GL_ARB_half_float_pixel is not optional Almost every driver already supported it. All current and future Gallium drivers always support it, and most existing classic drivers support it. This only changes radeon and nouveau. This extension only adds data types that can be passed to, for example, glTexImage2D. It does not add internal formats. Since you can already pass GL_FLOAT to glTexImage2D this shouldn't pose any additional issues with those drivers. Note that r200 and i915 already supported this extension, and they don't support floating-point textures either. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:36:43 -08:00
Ian Romanick	6d6a290181	mesa: Fix extension dependency for half-float TexBOs Half-float TexBOs should require both GL_ARB_half_float_pixel and GL_ARB_texture_float. This doesn't matter much in practice. Every driver that supports GL_ARB_texture_buffer_object already supports GL_ARB_half_float_pixel. We only expose the TexBO extension in core profiles, and those require GL_ARB_texture_float. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:36:43 -08:00
Ian Romanick	54b1082828	meta: Silence unused parameter warning in _mesa_meta_CopyTexSubImage drivers/common/meta.c: In function '_mesa_meta_CopyTexSubImage': drivers/common/meta.c:3744:52: warning: unused parameter 'rb' [-Wunused-parameter] Unfortunately, the parameter can't just be removed because it is part of the dd_function_table::CopyTexSubImage interface. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:36:43 -08:00
Ian Romanick	d156281cfe	meta: Silence unused parameter warning in setup_drawpix_texture drivers/common/meta.c: In function 'setup_drawpix_texture': drivers/common/meta.c:1572:30: warning: unused parameter 'texIntFormat' [-Wunused-parameter] setup_drawpix_texture has never used this parameter. Before the refactor commit `04f8193aa` it was used in several locations. After that commit, texIntFormat was only used in alloc_texture. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:36:43 -08:00
Ian Romanick	f34d599a5b	meta: Refactor common VAO and VBO initialization code v2: Clean up some stray binding calls Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> (v1) Reviewed-by: Eric Anholt <eric@anholt.net> (v2)	2014-02-11 14:24:02 -08:00
Ian Romanick	beb33fc5b7	meta: Track the _mesa_meta_DrawPixels VBO just like the others All of the other meta routines have a particular pattern for creating and tracking the VAO and VBO. This one function deviated from that pattern for no apparent reason. Almost all of the code added in this patch will be removed shortly. v2: Drop glDeleteBuffers() of the old, now-uninitialized vbo variable. Fixes getteximage-formats and fbo-mipmap-copypix regression when "2" landed in the variable (change by anholt). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:23:55 -08:00
Ian Romanick	83c90c9239	meta: Expand the vertex structure for the GenerateMipmap and decompress paths Final intermediate step leading to some code sharing. Note that the new GenerateMipmap and decompress vertex structures are the same as the new vertex structure in BlitFramebuffer and the others. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:11:21 -08:00
Ian Romanick	897f975668	meta: Expand the vertex structure for the DrawPixels paths Another step leading to some code sharing. Note that the new DrawPixels vertex structure is the same as the new vertex structure in BlitFramebuffer and the others. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:11:21 -08:00
Ian Romanick	d7ac102c7b	meta: Expand the vertex structure for the Clear paths Another step leading to some code sharing. Note that the new Clear vertex structure is the same as the new BlitFramebuffer and CopyPixels vertex structure. The "sizeof(float) * 7" hack is temporary. It will magically disappear in just a couple more patches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:11:21 -08:00
Ian Romanick	545fd9bc9b	meta: Expand the vertex structure for the CopyPixels paths Another step leading to some code sharing. Note that the new CopyPixels vertex structure is the same as the new BlitFramebuffer vertex structure. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:11:21 -08:00
Ian Romanick	9b4e659e62	meta: Expand the vertex structure for the BlitFramebuffer paths This is the first of several steps leading to some code sharing. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-11 14:11:21 -08:00
Ilia Mirkin	908a711313	nv30,nvc0: only claim a single viewport It should be possible to make this be 16 on nvc0. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 22:08:01 +00:00
Emil Velikov	82cd6e6317	st/clover: use VISIBILITY_CXXFLAGS where appropriate Use the c++ visibility flags when building cpp files. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 21:36:52 +00:00
Emil Velikov	7ed32c9af9	omx: use VISIBILITY_CFLAGS to control exported symbols Initial step of cleaning the exported symbols from targets/omx - Mark omx_component_library_Setup as public v2: Keep export-symbols-regex Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> (v1)	2014-02-11 21:36:16 +00:00
Emil Velikov	eda9a66f7e	osmesa: drop obsolete AM_CXXFLAGS There are no cpp files during the build process, thus we can safely drop the unused cxxflags. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 21:32:39 +00:00
Emil Velikov	927b9e8eb8	st/vdpau: automake: export only PUBLIC symbols Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-11 21:27:45 +00:00
Emil Velikov	255b39f17a	st/vdpau: do not export VdpPresentationQueueTargetCreateX11 The function pointer is retrieved via VdpGetProcAddress just like all the other vdpau functions and should not be exported. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-11 21:25:11 +00:00
Emil Velikov	d84e0eb406	wayland-egl: automake: add symbol test Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 20:19:46 +00:00
Emil Velikov	6405563783	st/egl: automake: avoid exporting all symbols Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 20:19:01 +00:00
Emil Velikov	11926e8997	targets/egl-static: automake: don't export local symbols Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 20:16:55 +00:00
Emil Velikov	5c7f75f70a	gbm: automake: add symbol tests Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 19:00:09 +00:00
Emil Velikov	33b9c0d465	targets/gbm: automake: do not export internal symbols Add VISIBILITY_CFLAGS to automake build, so that only required symbols are exported. v2: Rebase Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 19:00:09 +00:00
Emil Velikov	10e5ffd496	gbm: do not export _gbm_mesa_get_device This symbol is internal and was never part of the API. Unused by any of the gbm backends, it makes sense to simply not export it. Cc: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 19:00:09 +00:00
Emil Velikov	d00b319f40	gbm: automake: add VISIBILITY_CFLAGS Currently the library exports every symbol imaginable, rather than the ones defined by the API. Note: This may cause issues for libraries that are linking against libgbm's internals. Cc: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 19:00:09 +00:00
Emil Velikov	631cc6105d	st/gbm: automake: do not export gbm_gallium_drm_device_create Symbol is internal and was never meant to be exported. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 19:00:09 +00:00
Emil Velikov	90ed101322	auxiliary/pipe-loader: automake: avoid exporting all symbols Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-11 19:00:09 +00:00
Emil Velikov	165eecf1f6	egl/dri2/android: free driver_name in dri2_initialize_android error path v2: Cleanup driver name if dri2_load_driver() fails. Spotted by Chad Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 19:00:09 +00:00
Emil Velikov	76d9f6d972	dri/nouveau: Pass the API into _mesa_initialize_context Currently we create a OPENGL_COMPAT context regardless of what was requested by the program. Correct that by retaining the program's request and passing it into _mesa_initialize_context. Based on a similar commit for radeon/r200 by Ian Romanick. Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 19:00:09 +00:00
Emil Velikov	118c36adb4	configure: cleanup libudev handling Add the explicit note about the required version during configure. Require the same version (151) of udev when building the pipe-loader. Mention the udev version requirement in GBM Requires.private. v2: Resolve a couple of silly typos. Spotted by Ilia v3: Cleanup platfrom/platform typo. Spotted by Stefan Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 18:59:59 +00:00
Emil Velikov	31f50f3149	gbm: drop unneeded dependency of libudev As of recently we dlopen the library, additionally the only code that is including the libudev.h header, is the loader. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 17:17:50 +00:00
Emil Velikov	d57dc6dc30	opencl: do not link against libudev Previously the linking was required due to dependency of udev in the pipe-loader. Now this is no longer the case, as we dlopen the library. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 17:17:50 +00:00
Emil Velikov	e19fba7cc6	gallium/tests: do not link against libudev Previously the linking was required due to dependency of udev in the pipe-loader. Now this is no longer the case, as we dlopen the library. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 17:17:50 +00:00
Emil Velikov	897e1989da	egl-static: stop linking against libudev No longer required since all the udev code is in the loader. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 17:17:50 +00:00
Emil Velikov	053e095ecb	egl_dri2: remove LIBUDEV_CFLAGS from Makefile.am None of the code within builds or (explicitly) requires udev. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 17:17:50 +00:00
Emil Velikov	6fe2ca7a08	configure: drop LIBUDEV_CFLAGS from X11_INCLUDES The cflags are explicitly included in the only Makefile that handles udev-dependent code. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 17:17:50 +00:00
Emil Velikov	7536d744ee	pipe-loader: drop obsolete libudev.h include All the udev code is in the loader, so there is no reason for us to include this header. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 17:17:49 +00:00
Emil Velikov	929f83376a	configure: error out when building radeonsi without gallium-llvm --enable-gallium-llvm is required by radeonsi. Currently we check only for LLVM_VERSION_INT which is 0, whenever gallium-llvm is disabled explicitly. ./configure --with-gallium-drivers=r600,radeonsi --disable-gallium-llvm v2: Correct typo in error message. Spotted by Tom Stellard Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-11 17:04:18 +00:00
Christian König	4ca8439dce	omx/radeonsi: fix target Another minor typo. Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-11 17:10:22 +01:00
Christian König	79aa29d45e	omx: fix some minor configure.ac issues Matt Turner noted the incorrect order, but I somehow forgot to change it before pushing upstream. The other one is a typo during rebase. Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-11 17:08:42 +01:00
Christian König	ee978aee94	vl: add H264 encoding interface Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com>	2014-02-11 13:26:13 +01:00
Kenneth Graunke	eaf3358e0a	i965: Don't call abort() on an unknown device. If we don't recognize the PCI ID, we can't reasonably load the driver. However, calling abort() is quite rude - it means the application that tried to initialize us (possibly the X server) can't continue via fallback paths. We already have a more polite mechanism - failing to create the context. So, just use that. While we're at it, improve the error message. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73024 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Lu Hua <huax.lu@intel.com>	2014-02-11 02:23:22 -08:00
Daniel Kurtz	b47d231526	glsl: Add locking to builtin_builder singleton Consider a multithreaded program with two contexts A and B, and the following scenario: 1. Context A calls initialize(), which allocates mem_ctx and starts building built-ins. 2. Context B calls initialize(), which sees mem_ctx != NULL and assumes everything is already set up. It returns. 3. Context B calls find(), which fails to find the built-in since it hasn't been created yet. 4. Context A finally finishes initializing the built-ins. This will break at step 3. Adding a lock ensures that subsequent callers of initialize() will wait until initialization is actually complete. Similarly, if any thread calls release while another thread is still initializing, or calling find(), the mem_ctx/shader would get freed out from under it, leading to corruption or use-after-free crashes. Fixes sporadic failures in Piglit's glx-multithread-shader-compile. Bugzilla: https://bugs.freedesktop.org/69200 Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.1 10.0" <mesa-stable@lists.freedesktop.org>	2014-02-11 02:21:41 -08:00
Kenneth Graunke	e95a4ed296	i965/fs: Simplify FS_OPCODE_SET_OMASK stride mashing a bit. In the first case, we can simply call stride(mask, 16, 8, 2) rather than creating a new register with a different stride, then immediately changing it a second time. In the second case, the stride was already what we wanted, so we can just use mask without any changes at all. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-11 02:21:35 -08:00
Kenneth Graunke	f948ad2a07	i965/fs: Simplify FS_OPCODE_SET_SAMPLE_ID stride mashing a bit. stride(brw_vec1_reg(...) ...) takes some register, changes the strides, then changes the strides again. Let's do it once. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-11 02:21:26 -08:00
Dave Airlie	08fd34c8a3	docs/GL3.txt: denote r600g support for ARB_viewport_array Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-11 14:15:18 +10:00
Dave Airlie	6d434252e2	r600g: add support for multiple viewports. tested on rv635 and barts. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-11 14:14:50 +10:00
Dave Airlie	0705fa35cd	st/mesa: add support for GL_ARB_viewport_array (v0.2) this just ties the mesa code to the pre-existing gallium interface, I'm not sure what to do with the CSO stuff yet. 0.2: fix min/max bounds Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-11 14:14:50 +10:00
Dave Airlie	c116ee6042	st/mesa: add support for viewport index semantic This adds GS output and FS input support, even though FS input support isn't supported until GLSL 4.30 from what I can see. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-11 14:06:40 +10:00
Kenneth Graunke	a21552a96b	i965: Program 2x MSAA sample positions. There are only two sensible placements for 2x MSAA samples - and one is the mirror image of the other. I chose (0.25, 0.25) and (0.75, 0.75). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-02-10 08:18:29 -08:00
Kenneth Graunke	f4bc0ac83e	i965: Store 4x MSAA sample positions in a scalar value, not an array. Storing a single value in an array is rather pointless. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-02-10 08:18:29 -08:00
Kenneth Graunke	16f7510ad3	i965: Duplicate less code in GetSamplePositions driver hook. The 4x and 8x cases contained identical code for extracting the X and Y sample offset values and converting them from U0.4 back to float. Without this refactoring, we'd have to duplicate it a third time in order to support 2x MSAA. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-02-10 08:18:28 -08:00
Ilia Mirkin	40dd777b33	nouveau/video: make sure that firmware is present when checking caps Apparently some players are ill-prepared for us claiming that a decoder exists only to have creating it fail, and express this poor preparation with crashes (e.g. flash). Check that firmware is there to increase the chances of there being a high correlation between reported capabilities and ability to create a decoder. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 10.0 10.1 <mesa-stable@lists.freedesktop.org> Tested-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-02-10 14:00:17 +01:00
Kenneth Graunke	a487ef87fe	mesa: Fix MESA_FORMAT_Z24_UNORM_S8_UINT vs. X8_UINT mix-up. In commit `eeed49f5f2`, Mark accidentally renamed MESA_FORMAT_S8_Z24 to MESA_FORMAT_Z24_UNORM_X8_UINT and MESA_FORMAT_X8_Z24 to MESA_FORMAT_Z24_UNORM_S8_UINT, reversing their sense. The commit message was correct, but what sed commands actually got run didn't match that. This patch swaps the two enum names, reversing them. This should undo the damage, but might break things if people have manually fixed a few instances in the meantime... Mark's commit also failed to mention renames: s/MESA_FORMAT_ARGB2101010_UINT\b/MESA_FORMAT_B10G10R10A2_UINT/g s/MESA_FORMAT_ABGR2101010\b/MESA_FORMAT_R10G10B10A2_UNORM/g but those seem okay. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-02-09 16:57:45 -08:00
Maxence Le Doré	b903be50b0	mesa: remove duplicated init of MaxViewports Already declared 5 lines before. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-09 16:45:23 -08:00
Grigori Goronzy	d34d5fddf8	gallium: add geometry shader output limits v2: adjust limits for radeonsi and llvmpipe v3: add documentation Cc: "10.1" <mesa-stable@lists.freedesktop.org> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-02-09 23:31:38 +01:00
Siavash Eliasi	61bc014c96	mesa: Removed unnecessary check for NULL pointer when freeing memory Note that it is OK to pass NULL pointers to this function since this commit: mesa: modified _mesa_align_free() to accept NULL pointer http://cgit.freedesktop.org/mesa/mesa/commit/?id=f0cc59d68a9f5231e8e2111393a1834858820735 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-02-09 16:16:34 +01:00
Ilia Mirkin	356aff3a5c	nv30: report 8 maximum inputs nvfx_fragprog_assign_generic only allows for up to 10/8 texcoords for nv40/nv30. This fixes compilation of the varying-packing tests. Furthermore it appears that the last 2 inputs on nv4x don't seem to work in those tests, so just report 8 everywhere for now. Tested on NV42, NV44. NV4B appears to have additional problems. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 9.1 9.2 10.0 10.1 <mesa-stable@lists.freedesktop.org>	2014-02-08 19:06:51 -05:00
Christoph Bumiller	2e9ee44797	nv50/ir/ra: some register spilling fixes Cc: 10.1 <mesa-stable@lists.freedesktop.org>	2014-02-09 00:04:13 +01:00
Brian Paul	c325ec8965	mesa: update assertion in detach_shader() for geom shaders Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74723 Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>	2014-02-08 14:21:28 -07:00
Brian Paul	6e8d04ac3e	mesa: allocate gl_debug_state on demand We don't need to allocate all the state related to GL_ARB_debug_output until some aspect of that extension is actually needed. The sizeof(gl_debug_state) is huge (~285KB on 64-bit systems), not even counting the 54(!) hash tables and lists that it contains. This change reduces the size of gl_context alone from 431KB to 145KB on 64-bit systems and from 277KB to 78KB on 32-bit systems. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-08 11:27:58 -07:00
Brian Paul	31b2625cb5	mesa: trivial clean-ups in errors.c Whitespace changes, 78-column rewrapping, comment clean-ups, add some braces, etc. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-08 11:27:58 -07:00
Brian Paul	1dc209d8f2	mesa: remove _mesa_ prefix from some static functions Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-08 11:27:57 -07:00
Kenneth Graunke	dcb0330d30	i965: Label JIP and UIP in Broadwell shader disassembly. This makes it obvious which number is which. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-07 19:38:15 -08:00
Kenneth Graunke	8a7fe50067	i965: Don't disassemble UIP field for Broadwell WHILE instructions. The WHILE instruction doesn't have UIP. It only has JIP. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-07 19:38:12 -08:00
Kenneth Graunke	5230655a2e	i965: Don't print source registers for Broadwell flow control. The bits which normally contain the source register descriptions actually contain the JIP/UIP jump targets, which we already printed. Interpreting JIP/UIP as source registers results in some really creepy looking output, like IF statements with acc14.4<0,1,0>UD sources. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-07 19:37:34 -08:00
Kenneth Graunke	8e0a0e4d30	i965: Fix fast depth clear values on Broadwell. Broadwell's 3DSTATE_CLEAR_PARAMS packet expects a floating point value regardless of format. This means we need to stop converting it to UNORM. Storing the value as float would make sense, but since we already have a uint32_t field, this patch continues shoehorning it into that. In a sense, this makes mt->depth_clear_value the DWord you emit in the packet, rather than the clear value itself. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-07 19:36:14 -08:00
Christoph Bumiller	882e98e5e6	nvc0: handle TGSI_SEMANTIC_LAYER Cc: 10.1 <mesa-stable@lists.freedesktop.org>	2014-02-07 23:14:00 +01:00
Christoph Bumiller	dd2229d4c6	nvc0: create the SW object It's required for being able to use software methods now.	2014-02-07 22:53:37 +01:00
Christoph Bumiller	b7233acf78	nvc0/ir/emit: hardcode vertex output stream to 0 for now	2014-02-07 22:53:36 +01:00
Chris Forbes	0c14c5c62a	i965: Enable ARB_texture_gather for one component on Gen6. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-08 10:32:24 +13:00
Chris Forbes	31d1077dd2	i965/vec4: Emit shader w/a for Gen6 gather Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-08 10:32:23 +13:00
Chris Forbes	73b91fe05a	i965/fs: Emit shader w/a for Gen6 gather Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-08 10:32:20 +13:00
Chris Forbes	c2d51aaa11	i965: Add surface format overrides for Gen6 gather Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-08 10:32:19 +13:00
Chris Forbes	2b7bbd89e8	i965: Add Gen6 gather wa to sampler key Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-08 10:32:06 +13:00
Eric Anholt	1e12dafcac	glsl: Optimize triop_csel with all-true or all-false. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-07 12:46:48 -08:00
Eric Anholt	de796b0ef0	glsl: Optimize various cases of fma (aka MAD). Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-07 12:46:48 -08:00
Eric Anholt	44577c4857	glsl: Optimize lrp(x, x, coefficient) --> x. total instructions in shared programs: 1627754 -> 1624534 (-0.20%) instructions in affected programs: 45748 -> 42528 (-7.04%) GAINED: 3 LOST: 0 (serious sam, humus domino demo) Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-07 12:46:48 -08:00
Eric Anholt	d72956790f	glsl: Optimize pow(x, 1) -> x. total instructions in shared programs: 1627826 -> 1627754 (-0.00%) instructions in affected programs: 6640 -> 6568 (-1.08%) GAINED: 0 LOST: 0 (HoN and savage2) Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-07 12:46:48 -08:00
Eric Anholt	6d7c123d6c	glsl: Optimize log(exp(x)) and exp(log(x)) into x. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-07 12:46:47 -08:00
Eric Anholt	2c2aa35336	glsl: Optimize ~~x into x. v2: Fix pasteo of an extra abs being inserted (caught by many). Rewrite to drop the silly switch statement. Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)	2014-02-07 12:46:47 -08:00
Eric Anholt	0f6279bab2	i965: Add some informative debug when the X Server botches DRI2 GetBuffers. We've had various bug reports over the years where miptrees are missing, and when I screwed it up while adding DRI2 to the modesetting driver, I figured I should put the info necessary for debug here. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-07 12:46:47 -08:00
Eric Anholt	b5e5f34dd2	i965: Remove redundant check in blitter-based glBlitFramebuffer(). The intel_miptree_blit() code checks the format for us now, plus it handles xrgb vs argb for us. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-07 12:46:47 -08:00
Kenneth Graunke	697f401a31	i965: Fix Gen8+ disassembly of half float subregister numbers. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-07 12:36:38 -08:00
Kenneth Graunke	e990234ff6	i965: Use the new brw_load_register_mem helper for draw indirect. This makes it work on Broadwell, too. v2: Drop bogus double write to 3DPRIM_BASE_VERTEX register (caught by Chris Forbes). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-02-07 12:36:38 -08:00
Kenneth Graunke	b7c435b261	i965: Implement a brw_load_register_mem helper function. This saves some boilerplate and hides the OUT_RELOC/OUT_RELOC64 distinction. Placing the function in intel_batchbuffer.c is rather arbitrary; there wasn't really an obvious place for it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-02-07 12:36:38 -08:00
Kenneth Graunke	2f97119950	i965: Fix INTEL_DEBUG=vs for fixed-function/ARB programs. Since commit `9cee3ff562`, INTEL_DEBUG=vs has caused a NULL pointer dereference for fixed-function/ARB programs. In the vec4 generators, "prog" is a gl_program, and "shader_prog" is the gl_shader_program. This is different than the FS visitor. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-07 12:36:38 -08:00
Kenneth Graunke	2062f40d81	glsl: Don't lose precision qualifiers when encountering "centroid". Mesa fails to retain the precision qualifier when parsing: #version 300 es centroid in mediump vec2 v; Consider how the parser's type_qualifier production is applied. First, the precision_qualifier rule creates a new ast_type_qualifier: <precision: mediump> Then the storage_qualifier rule creates a second one: <flags: in> and calls merge_qualifier() to fold in any previous qualifications, returning: <flags: in, precision: mediump> Finally, the auxiliary_storage_qualifier creates one for "centroid": <flags: centroid> it then does $$ = $1 and $$.flags |= $2.flags, resulting in: <flags: centroid, in> Since precision isn't stored in the flags bitfield, it is lost. We need to instead call merge_qualifier to combine all the fields. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reported-by: Kevin Rogovin <kevin.rogovin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-02-07 12:36:38 -08:00
Brian Paul	f47e596288	st/mesa: avoid sw fallback for getting/decompressing textures If st_GetTexImage() is to decompress the texture, avoid the fallback path even if prefer_blit_based_texture_transfer = false. For drivers that returned PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER = 0, we were always taking the fallback path for texture decompression rather than rendering a quad. The latter is a lot faster. Cc: "10.0" "10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-02-07 09:54:43 -07:00
Erik Faye-Lund	5125165dde	gallium/tgsi: correct typo propagated from NV_vertex_program1_1 In the specification text of NV_vertex_program1_1, the upper limit of the RCC instruction is written as 1.884467e+19 in scientific notation, but as 0x5F800000 in binary. But the binary version translates to 1.84467e+19 rather than 1.884467e+19 in scientific notation. Since the lower-limit equals 2^-64 and the binary version equals 2^+64, let's assume the value in scientific notation is a typo and implement this using the value from the binary version instead. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:22:23 -07:00
Erik Faye-Lund	7a49a796a4	gallium/tgsi: use CLAMP instead of open-coded clamps Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:22:14 -07:00
Juha-Pekka Heikkila	498d10e230	egl: Unhide functionality in _eglInitSurface() _eglInitResource() was used to memset entire _EGLSurface by writing more than size of pointed target. This does work as long as Resource is the first element in _EGLSurface, this patch fixes such dependency. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:14:05 -07:00
Juha-Pekka Heikkila	1456ed85f0	egl: Unhide functionality in _eglInitContext() _eglInitResource() was used to memset entire _EGLContext by writing more than size of pointed target. This does work as long as Resource is the first element in _EGLContext, this patch fixes such dependency. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:14:05 -07:00
Juha-Pekka Heikkila	d530745169	glx: Add missing null check in __glX_send_client_info() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:14:05 -07:00
Juha-Pekka Heikkila	d3e948340b	i965: Add missing null check in fs_visitor::dead_code_eliminate_local() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:14:05 -07:00
Juha-Pekka Heikkila	e503609e6f	glx: Add some missing null checks in glx_pbuffer.c Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:14:05 -07:00
Juha-Pekka Heikkila	88cad8356e	glsl: Fix null access on file read error Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:14:04 -07:00
Juha-Pekka Heikkila	2ae1437a8e	glx: Add missing null check in __glXCloseDisplay Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:14:04 -07:00
Juha-Pekka Heikkila	d28e92ff74	glx: Add missing null checks in glxcmds.c Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-07 08:14:04 -07:00
Jordan Justen	020c43f401	main/get: support ARB_gpu_shader5 If a driver enables ARB_gpu_shader5 and sets Const.MaxVertexStreams >= 4, then piglit's arb_gpu_shader5-minmax test should now pass. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-06 16:58:33 -08:00
Jordan Justen	60914fa80d	glapi: add definitions for ARB_gpu_shader5 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-06 16:58:33 -08:00
Ilia Mirkin	0befbafb4b	nouveau/codegen: allow tex offsets on non-TXF instructions (e.g. TXL) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2014-02-06 18:50:19 -05:00
Ilia Mirkin	f76c7ad5b1	nv50: only over-allocate by a page for code The pre-fetching doesn't go too far. Tested with over-allocating by only a page, and didn't see any errors in dmesg. Saves ~512KB of VRAM. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 10.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2014-02-06 18:50:19 -05:00
Ilia Mirkin	364bdd2419	nv50: fix layerid to be the fp input number rather than vp output number In the tests they were the same so it didn't matter, but indications are that this is the correct behaviour. Also take this opportunity to (trivially) support using gl_Layer in fp. Cc: 10.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2014-02-06 18:03:24 -05:00
Ilia Mirkin	c7373b7dc7	nv50: rework primid logic Functionally identical but much simpler. Should also better integrate with future layer/viewport changes/fixes. Cc: 10.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>	2014-02-06 18:02:57 -05:00
Kristian Høgsberg	f658150639	glx: Pass NULL DRI drawables into the DRI driver for None GLX drawables GLX_ARB_create_context allows making a GLX context current with None drawable and readables, but this was never implemented correctly in GLX. We would create a __DRIdrawable for the None GLX drawable and pass that to the DRI driver and that would somehow work. Now it's somehow broken. The way this should have worked is that we pass a NULL DRI drawable to the DRI driver when the GLX user calls glXMakeContextCurrent() with None for drawable and readables. https://bugs.freedesktop.org/show_bug.cgi?id=74143 Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2014-02-06 14:23:42 -08:00
Christian König	db54fca9b8	st/vdpau: add flush on unmap Flush the context when we unmap a buffer, otherwise VDPAU might start rendering the next frame while we still reference that buffer. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: StrangeNoises (rachel@strangenoises.org)	2014-02-06 20:58:38 +01:00
Marek Olšák	3f98053fc9	vdpau: flush the context before exporting the surface v2 Bugzilla (bug needs XBMC changes as well): https://bugs.freedesktop.org/show_bug.cgi?id=73191 When VL uploads vertex buffers, it uses PIPE_TRANSFER_DONTBLOCK, which always flushes the context in the winsys if the buffer being mapped is busy. Since I added handling of DISCARD_RANGE, DONTBLOCK has had no effect when combined with DISCARD_RANGE and I think the context isn't flushed anywhere else, so no commands are submitted to the GPU until the IB is full, which takes a lot of frames. Using DISCARD_RANGE is not the only way to trigger this bug. The other way is to reallocate the vertex buffer before every upload. BTW, I'm not sure if this is the right place for flushing, but it does fix the bug. v2 (chk): move the flush to the right place. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: StrangeNoises (rachel@strangenoises.org)	2014-02-06 20:58:07 +01:00
Matt Turner	e2ef93cf94	glsl: Initialize ubo_binding_mask flags to zero. Missed in commit `e63bb298`. Caused sporadic test failures, like incorrect-in-layout-qualifier-repeated-prim.geom. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-06 10:36:54 -08:00
Marek Olšák	559af1df10	gallium/radeon: fix warnings	2014-02-06 17:43:29 +01:00
Marek Olšák	c32114460d	gallium: remove PIPE_USAGE_STATIC Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-06 17:37:34 +01:00
Marek Olšák	eeb5a4a50e	gallium: define the behavior of PIPE_USAGE_* flags properly STATIC will be removed in the following commit. v2: changed the definition of IMMUTABLE Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-06 17:30:00 +01:00
Marek Olšák	ed84fb3167	gallium: remove PIPE_RESOURCE_FLAG_GEN_MIPS Unused. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-06 17:30:00 +01:00
Marek Olšák	2be5bbdd97	r600g,radeonsi: set resource domains in one place (v2) v2: This doesn't change the behavior. It only moves the tiling check to r600_init_resource and removes the usage parameter. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-02-06 17:29:59 +01:00
Marek Olšák	c6dbcf10df	st/mesa: fix crash when a shader uses a TBO and it's not bound This binds a NULL sampler view in that case. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74251 Cc: "10.1" "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-06 17:29:59 +01:00
Christian König	b862cc23f2	st/omx: add workaround for bug in Bellagio Not blocking for the message thread can lead to accessing freed up memory. Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-06 16:19:39 +01:00
Christian König	15e39ca28a	st/omx: initial OpenMAX support v3 Featuring a full grown MPEG2 and H264 decoder and a couple of hundred bugs. v2 (Leo): fix an error for pic_order_cnt_type 1 v3 (Leo): implement support for field decoding Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Leo Liu <leo.liu@amd.com>	2014-02-06 16:16:34 +01:00
Christian König	c9b941ff1b	vl/rbsp: add H.264 RBSP implementation Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-06 16:16:33 +01:00
Christian König	b8b28bf94a	vl/vlc: add function to limit the vlc size Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-06 16:16:33 +01:00
Christian König	9ef42a54a7	vl/vlc: add remove bits function Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-06 16:16:33 +01:00
Christian König	fe0f9ab056	radeon: update legal notes on UVD Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-06 16:15:58 +01:00
Christian König	96e8b916a7	radeon: just don't map VRAM buffers at all Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-02-06 16:08:22 +01:00
Christian König	9b218dcdd7	radeon/video: directly create buffers in the right domain Avoid moving things around on start of stream. Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-06 15:54:14 +01:00
Christian König	7bcfb0bc8f	radeon/video: separate common video functions Signed-off-by: Christian König <christian.koenig@amd.com>	2014-02-06 15:54:13 +01:00
Axel Davy	57f94bff71	gallium/dri2: Fix dri2_dup_image dri2_dup_image was not copying the dri_format field. This was causing some bugs, for example: . we create a gbm_bo. . we get an EGLImage from the gbm_bo. . Bug: impossible to get the gbm_bo back from the EGLImage by importing. (gbm dri2 backend) Signed-off-by: Axel Davy <axel.davy@ens.fr>	2014-02-05 22:22:00 -08:00
Chris Forbes	bba1105d52	i965/vs: Fix typo in brw_compute_vue_map Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-05 22:02:23 -08:00
Kenneth Graunke	e57d77280e	i965: Fix register types in dump_instructions(). This regressed when I converted BRW_REGISTER_TYPE_* to be an abstract type that doesn't match the hardware description. dump_instruction() was using reg_encoding[] from brw_disasm.c, which no longer matches (and was incorrect for Gen8+ anyway). This patch introduces a new function to convert the abstract enum values into the letter suffix we expect. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reported-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 21:07:48 -08:00
Chad Versace	1340e24406	egl/glx: Remove egl_glx driver Mesa now has a real, feature-rich EGL implementation on X11 via xcb. Therefore I believe there is no longer a practical need for the egl_glx driver. Furthermore, egl_glx appears to be unmaintained. The most recent nontrivial commit to egl_glx was `6baa5f1` on 2011-11-25. Tested by running weston-smoke in windowed Weston on X with i965. Signed-off-by: Chad Versace <chad.versace@linux.intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Kristian Høgsberg <krh@bitplanet.net>	2014-02-05 18:19:26 -08:00
Dave Airlie	0224bd20f3	docs: update 10.1 relnotes to note GL 3.3 on r600 and radeonsi. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-06 01:13:05 +00:00
Zack Rusin	8a3c990823	tgsi/ureg: increase the number of immediates ureg_program is allocated on the heap so we can just bump the number of immediates that it can handle. It's needed for d3d10. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-02-05 19:40:53 -05:00
Zack Rusin	efb152dd04	gallivm: make sure analysis works with large number of immediates We need to handle a lot more immediates and in order to do that we also switch from allocating this structure on the stack to allocating it on the heap. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-02-05 19:40:53 -05:00
Zack Rusin	69ee3f431f	gallivm: handle huge number of immediates We only supported up to 256 immediates, which isn't enough. We had code which was allocating immediates as an allocated array, but it was always used along a statically backed array for performance reasons. This commit adds code to skip that performance optimization and always use just the dynamically allocated immediates if the number of them is too great. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-02-05 19:40:53 -05:00
Zack Rusin	8507afc97f	gallivm: allow large numbers of temporaries The number of allowed temporaries increases almost with every iteration of an api. We used to support 128, then we started increasing and the newer api's support 4096+. So if we notice that the number of temporaries is larger than our statically allocated storage would allow we just treat them as indexable temporaries and allocate them as an array from the start. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-02-05 19:40:53 -05:00
Chris Forbes	5eeb12c0bc	i965/fs: Assume FBO rendering in precompile if MRT. If multiple color outputs are written, this shader is unlikely to be useful with a winsys framebuffer. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-06 10:58:52 +13:00
Chris Forbes	046f8d8a6f	i965/fs: Guess nr_color_regions better in precompile Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-06 10:58:37 +13:00
Chris Forbes	6c9de691c7	docs: Add relnotes for 10.2 Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-02-06 10:28:36 +13:00
Chris Forbes	87e916a240	mesa: Bump version to 10.2.0-devel Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>	2014-02-06 10:15:09 +13:00
Kristian Høgsberg	44338cd826	i965: Move intel_prepare_render() above first buffer access The driver is supposed to ensure buffers before any drawing operation, but in do_blit_drawpixels() and do_blit_copypixels() we inspect the buffer format before calling intel_prepare_render(). That was covered up by the unconditional call to intel_prepare_render() in intelMakeCurrent(), but we now only do this on the initial intelMakeCurrent call for a context (to get the size for the initial viewport values). https://bugs.freedesktop.org/show_bug.cgi?id=74083 Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Tested-by: Alexander Monakov <amonakov@gmail.com>	2014-02-05 11:10:39 -08:00
Brian Paul	db98d238e2	st/mesa: add MESA_SHADER_COMPUTE case in shader_stage_to_ptarget() Silences compiler warning. Trivial.	2014-02-05 11:00:41 -07:00
Brian Paul	357faa5a36	mesa: re-wrap, fix-up comment text in formats.h Wrap to 78 columns, fix comment formatting. Trivial.	2014-02-05 10:43:21 -07:00
Paul Berry	25268b930d	i965/cs: Allow ARB_compute_shader to be enabled via env var. This will allow testing of compute shader functionality before it is completed. To enable ARB_compute_shader functionality in the i965 driver, set INTEL_COMPUTE_SHADER=1. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:14:16 -08:00
Paul Berry	3bbf93045a	i965/cs: Create the brw_compute_program struct, and the code to initialize it. v2: Fix comment. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:05:04 -08:00
Paul Berry	1fe274b3d7	glsl/cs: Prohibit mixing of compute and non-compute shaders. Fixes piglit test: spec/ARB_compute_shader/linker/mix_compute_and_non_compute Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:05:01 -08:00
Paul Berry	5a79bdab30	glsl/cs: Prohibit user-defined ins/outs in compute shaders. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:04:58 -08:00
Paul Berry	f5c5438e1f	main/cs: Implement query for COMPUTE_WORK_GROUP_SIZE. v2: Improve error message. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:04:55 -08:00
Paul Berry	28ce604b7f	mesa/cs: Handle compute shader local size during linking. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:04:20 -08:00
Paul Berry	0fa74e848f	glsl/cs: Handle compute shader local_size_{x,y,z} declaration. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:03:44 -08:00
Paul Berry	0398b69954	mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_COUNT constant. v2: Document that the 3-element array MaxComputeWorkGroupCount is indexed by dimension. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:03:08 -08:00
Paul Berry	c85c50997f	mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_INVOCATIONS constant. Reviewed-by: Matt Turner <mattst88@gmail.com> v2: Use CONTEXT_INT rather than CONTEXT_ENUM. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:02:30 -08:00
Paul Berry	347dde82e6	mesa/cs: Implement MAX_COMPUTE_WORK_GROUP_SIZE constant. v2: Document that the 3-element array MaxComputeWorkGroupSize is indexed by dimension. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-02-05 09:01:54 -08:00
Paul Berry	47d480e3e4	mesa/cs: Create the gl_compute_program struct, and the code to initialize it. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 09:01:18 -08:00
Paul Berry	9b34ae2e64	mesa/cs: Handle compute shaders in _mesa_use_program(). v2: do cs after the ordered pipeline stages for consistency. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 09:01:16 -08:00
Paul Berry	c15064c169	glsl/cs: update main.cpp to use the ".comp" extension for compute shaders. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 09:01:13 -08:00
Paul Berry	d861c2963a	glsl/cs: Populate default values for ctx->Const.Program[MESA_SHADER_COMPUTE]. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 09:01:10 -08:00
Paul Berry	c61ec8d8e3	mesa/cs: Add a MESA_SHADER_COMPUTE stage and update switch statements. This patch adds MESA_SHADER_COMPUTE to the gl_shader_stage enum. Also, where it is trivial to do so, it adds a compute shader case to switch statements that switch based on the type of shader. This avoids "unhandled switch case" compiler warnings. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 09:00:34 -08:00
Paul Berry	28e526d558	glsl/cs: Change some linker loops to use MESA_SHADER_FRAGMENT as a bound. Linker loops that iterate through all the stages in the pipeline need to use MESA_SHADER_FRAGMENT as a bound, so that we can add an additional MESA_SHADER_COMPUTE stage, without it being erroneously included in the pipeline. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 09:00:31 -08:00
Paul Berry	79134cb516	mesa/cs: Add dispatch API stubs for ARB_compute_shader. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 09:00:14 -08:00
Paul Berry	b7d05a58ae	mesa/cs: Add extension enable flags for ARB_compute_shader. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-05 08:59:37 -08:00
Roland Scheidegger	4a7da3bec5	gallivm: fix F2U opcode Previously, we were really doing F2I. And also move it to generic section. (Note that for llvmpipe the code generated is definitely bad, due to lack of unsigned conversions with sse. I think though what llvm does (using scalar conversions to 64bit signed either with x87 fpu (32bit) or sse (64bit) including lots of domain changes) is quite suboptimal, could do something like: is_large = arg >= 2^31; half_arg = 0.5 * arg; small_c = fptoint(arg); large_c = fptoint(half_arg) << 1; res = select(is_large, large_c, small_c); which should be far fewer instructions but that's something llvm should do itself.) This fixes piglit fs/vs-float-uint-conversion.shader_test (maybe more, needs GL 3.0 version override to run.) Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2014-02-05 17:45:31 +01:00
José Fonseca	5c975966dc	tools/trace: Handle index buffer overflow gracefully. Trivial.	2014-02-05 10:58:38 +00:00
Dave Airlie	16215a9723	docs/GL3.txt: update r600 status This updates the r600 driver status to 3.3 being fully supported. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-05 10:56:58 +10:00
Dave Airlie	79ea0f4506	r600g: add support for geom shaders to r600/r700 chipsets (v2) This is my first attempt at enabling r600/r700 geometry shaders; the basic tests pass on both my rv770 and my rv635. It requires this kernel patch: http://www.spinics.net/lists/dri-devel/msg52745.html v2: address Alex comments. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:43 +10:00
Dave Airlie	ccea799ee3	r600g: enable GLSL 3.30 on evergreen GPUs This throws the switch to enable GL 3.3 and GLSL 330. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:43 +10:00
Dave Airlie	c6cfc54db0	r600g: properly propagate clip dist write value This moves the value from the GS shader to the copy shader so the registers are setup correctly. fixes tests/spec/glsl-1.50/execution/geometry/clip-distance-out-values.shader_test Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:43 +10:00
Dave Airlie	b209afb153	r600g: calculate a better value for array_size (v2) attempt to calculate a better value for array size to avoid breaking apps. v2: use 0xfff like streamout, suggested by Grigori Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:42 +10:00
Dave Airlie	ce9e939144	r600g: fix CAYMAN geometry shader support cayman has a different end of program bit, so do that properly. fixes hangs with geom shader tests on cayman. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:42 +10:00
Dave Airlie	7ec5e883f2	r600g: fix up shader out misc stuff for copy shader set the correct values so the misc out register is setup correctly for the copy shader. This also updates the state for the gs copy shader so the hw gets programmed correctly. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:42 +10:00
Dave Airlie	7863611de3	r600g: port the layered surface rendering patch from radeonsi This just makes r600 and evergreen do what the radeonsi codepaths do for layered rendering. This makes the 2d amd_vertex_shader_layer test pass on evergreen. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:41 +10:00
Dave Airlie	f89394be98	r600g: initial VS output layer support This just adds support for emitting the proper value in the VS out misc. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:41 +10:00
Dave Airlie	5191937352	r600g: setup const texture buffers for geom shaders This just enables the workarounds we have for vertex/pixel shaders for geom shaders as well. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:41 +10:00
Dave Airlie	afce47fb0b	r600g: calculate correct cut value This selects the cut value depending on the shader selected. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:41 +10:00
Dave Airlie	0d79d5da40	r600g: fix dynamic_input_array_index.shader_test This follows what fglrx does, it unpacks the input we are going to indirect into a bunch of registers and indirects inside them. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:40 +10:00
Dave Airlie	e12147e9f6	r600g: add support for indirect geom ring writes We need to be able to write to the ring using a base register for when we emit vertices in a loop, in theory the SB compiler could collapse these indirect writes to direct writes if the register value is constant and known, but that is outside my pay grade. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:40 +10:00
Dave Airlie	cda63db780	r600g: write proper output prim type Vadim's code derived it from the info.mode, but it needs to be taken from the geometry shader output primitive. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:40 +10:00
Dave Airlie	2b0be2015d	r600g: enable instance cnt register with new enough kernel The instance cnt register was missing for a few kernels, with a new enough kernel we can output it. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:39 +10:00
Dave Airlie	f4652babbd	r600g: add primitive input support for gs only enable prim id if gs uses it Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:39 +10:00
Dave Airlie	b0e842bd9f	r600g: emit streamout from dma copy shader This enables streamout with GS in the mix, from the VS dma shader. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:39 +10:00
Dave Airlie	20adc7449c	r600g/gs: fix cases where number of gs inputs != number of gs outputs this fixes a bunch of the geom shader built-in tests Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:39 +10:00
Dave Airlie	defebc0293	r600g: increase array base for exported parameters Trivial fix to Vadim's code. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:38 +10:00
Dave Airlie	d9954e402f	r600g: initialise the geom shader loop registers. As we do for vertex and pixel shaders. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:38 +10:00
Dave Airlie	461c463bb2	r600g: emit NOPs at end of shaders in more cases If the shader has no CF clauses at all, emit a NOP. If the last instruction is an ENDLOOP, add a NOP for the LOOP to go to. If the last instruction is CALL_FS, add a NOP. These fix a bunch of hangs in the geometry shader tests. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:38 +10:00
Dave Airlie	c4782a58c3	r600g: don't enable SB for geom shaders SB needs fixes for three GS instructions it seems to raise them outside loops etc despite my best efforts. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:37 +10:00
Dave Airlie	5758a76d04	r600g/sb: add MEM_RING support Although we don't use SB on geom shaders, the VS copy shader will use it so we might as well implement MEM_RING support in sb. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:37 +10:00
Dave Airlie	eeead9b8ed	r600g: don't fail if we can't map VS->GS ring entries This can happen in normal operation, so don't report an error on it, just continue. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:37 +10:00
Vadim Girlin	1371d65a7f	r600g: initial support for geometry shaders on evergreen (v2) This is Vadim's initial work with a few regression fixes squashed in. v2: (airlied) fix regression in glsl-max-varyings - need to use vs and ps_dirty fix regression in shader exports from rebasing. whitespace fixing. v2.1: squash fix assert Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:49:11 +10:00
Vadim Girlin	34ee1d0f9f	r600g: add hw register definitions for GS block setup Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:40:42 +10:00
Vadim Girlin	a144bc29b5	r600g: defer shader variant selection and depending state updates [airlied: fix dropped streamout line - fix for master] Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:40:38 +10:00
Dave Airlie	ae29a098ea	r600g/bc: add support for indexed memory writes. It looks like we need these for geom shaders in the future. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-02-05 10:40:33 +10:00
Vadim Girlin	552aae7e47	r600g: move barrier and end_of_program bits from output to cf struct (v2) v2: fix regression on r600 NOP instructions. Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-05 10:40:23 +10:00
Dave Airlie	29a43cb0b6	r600g: split streamout emit code into a separate function For geometry shaders we need to call this code from a second place. Just move it out for now to keep future patches cleaner. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-02-05 10:40:17 +10:00
Marek Olšák	07075cf350	r600g,radeonsi: skip unnecessary buffer_is_busy call, add a comment	2014-02-04 20:19:16 +01:00
Marek Olšák	08f0344cf3	r600g,radeonsi: skip busy-checking for DISCARD_RANGE if it has been done already	2014-02-04 20:19:16 +01:00
Marek Olšák	796e2fba8c	r600g,radeonsi: treat DYNAMIC and STREAM usage as STAGING	2014-02-04 20:19:16 +01:00
Marek Olšák	0354b769c2	gallium: remove PIPE_CAP_MAX_COMBINED_SAMPLERS This can be derived from the shader caps. All GPUs from ATI/AMD, NVIDIA, and INTEL have separate texture slots for each shader stage.	2014-02-04 20:19:16 +01:00
Brian Paul	82c0914266	mesa: remove stray bits of GL_EXT_cull_vertex GL_EXT_cull_vertex was removed back in 2010 in commit `02984e3536` but these bits still lingered. Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-04 11:53:21 -07:00
Paul Berry	7f5740899f	glsl: Fix continue statements in do-while loops. From the GLSL 4.40 spec, section 6.4 (Jumps): The continue jump is used only in loops. It skips the remainder of the body of the inner most loop of which it is inside. For while and do-while loops, this jump is to the next evaluation of the loop condition-expression from which the loop continues as previously defined. Previously, we incorrectly treated a "continue" statement as jumping to the top of a do-while loop. This patch fixes the problem by replicating the loop condition when converting the "continue" statement to IR. (We already do a similar thing in "for" loops, to ensure that "continue" causes the loop expression to be executed). Fixes piglit tests: - glsl-fs-continue-inside-do-while.shader_test - glsl-vs-continue-inside-do-while.shader_test - glsl-fs-continue-in-switch-in-do-while.shader_test - glsl-vs-continue-in-switch-in-do-while.shader_test Cc: mesa-stable@lists.freedesktop.org Acked-by: Carl Worth <cworth@cworth.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-04 09:06:09 -08:00
Paul Berry	56790856b3	glsl: Make condition_to_hir() callable from outside ast_iteration_statement. In addition to making it public, we also need to change its first argument from an ir_loop * to an exec_list *, so that it can be used to insert the condition anywhere in the IR (rather than just in the body of the loop). This will be necessary in order to make continue statements work properly in do-while loops. Cc: mesa-stable@lists.freedesktop.org Acked-by: Carl Worth <cworth@cworth.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-04 09:06:09 -08:00
Topi Pohjolainen	933be19cdf	i965/blorp: do not use unnecessary hw-blending support This is really not needed as blorp blit programs already sample XRGB normally and get alpha channel set to 1.0 automatically by the sampler engine. This is simply copied directly to the payload of the render target write message and hence there is no need for any additional blending support from the pixel processing pipeline. The blending formula is anyway broken for color components, it multiplies the color component with itself (blend factor is the component itself). Alpha blending in turn would not fix the alpha to one independent of the source but simply used the source alpha as is instead (1.0 * src_alpha + 0.0 * dst_alpha). Quoting Eric: "If we want to actually make the no-alpha-bits-present thing work, we need to override the bits in the surface state or in the generated code. In the normal draw path, it's done for sampling by the swizzling code in brw_wm_surface_state.c, and the blending overrides is just to fix up the alpha blending stage which doesn't pay attention to that for the destination surface." If one modifies piglit test gl-3.2-layered-rendering-blit to use color component values other than zero or one, this change will kick in on IVB. No regressions on IVB. This is effectively revert of `c0554141a9`: i965/blorp: Support overriding destination alpha to 1.0. Currently, Blorp requires the source and destination formats to be equal. However, we'd really like to be able to blit between XRGB and ARGB formats; our BLT engine paths have supported this for a long time. For ARGB -> XRGB, nothing needs to occur: the missing alpha is already interpreted as 1.0. For XRGB -> ARGB, we need to smash the alpha channel to 1.0 when writing the destination colors. This is fairly straightforward with blending. For now, this code is never used, as the source and destination formats still must be equal. The next patch will relax that restriction. NOTE: This is a candidate for the 9.1 branch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2014-02-04 16:39:23 +02:00
Christian König	c3c24c3acc	radeon/uvd: fix feedback buffer handling v2 Without the correct feedback buffer size UVD runs into an error on each frame, reducing the maximum FPS. v2: fixing Michel's comments Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Cc: "10.1" "10.0" "9.2" <mesa-stable@lists.freedesktop.org>	2014-02-04 13:10:50 +01:00
Kenneth Graunke	adaa5a6ca6	i965: Use brw_bo_map[_gtt]() in intel_miptree_map_raw(). This moves the intel_batchbuffer_flush before the drm_intel_bo_busy call, which is a change in behavior. However, the old behavior was broken. In the future, we may want to only flush if the batchbuffer references the BO being mapped. That's certainly more typical. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-03 16:16:38 -08:00
Kenneth Graunke	e396674d5f	i965: Use brw_bo_map() in intel_texsubimage_tiled_memcpy(). This additionally measures the time stalled, while also simplifying the code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-03 16:16:35 -08:00
Kenneth Graunke	d613bafe91	i965: Create drm_intel_bo_map wrappers with performance warnings. Mapping a buffer is a common place where we could stall the CPU. In a few places, we've added special code to check whether a buffer is busy and log the stall as a performance warning. Most of these give no indication of the severity of the stall, though, since measuring the time is a small hassle. This patch introduces a new brw_bo_map() function which wraps drm_intel_bo_map, but additionally measures the time stalled and reports a performance warning. If performance debugging is not enabled, it simply maps the buffer with negligible overhead. We also add a similar wrapper for drm_intel_gem_bo_map_gtt(). This should make it easy to add performance warnings in lots of places. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-02-03 16:16:26 -08:00
Rob Clark	1b886078db	freedreno: enabling binning and opt by default Hw binning pass doesn't seem to have broken anything. And optimizing compiler fixes a lot of shaders and doesn't seem to break anything. So re-org slightly FD_MESA_DEBUG params and make both hw binning and optimizer enabled by default. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-03 18:26:53 -05:00
Rob Clark	554f1ac00c	freedreno/a3xx/compiler: new compiler The new compiler generates a dependency graph of instructions, including a few meta-instructions to handle PHI and preserve some extra information needed for register assignment, etc. The depth pass assigned a weight/depth to each node (based on sum of instruction cycles of a given node and all its dependent nodes), which is used to schedule instructions. The scheduling takes into account the minimum number of cycles/slots between dependent instructions, etc. Which was something that could not be handled properly with the original compiler (which was more of a naive TGSI translator than an actual compiler). The register assignment is currently split out as a standalone pass. I expect that it will be replaced at some point, once I figure out what to do about relative addressing (which is currently the only thing that should cause fallback to old compiler). There are a couple new debug options for FD_MESA_DEBUG env var: optmsgs - enable debug prints in optimizer optdump - dump instruction graph in .dot format, for example: http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot.png http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot At this point, thanks to proper handling of instruction scheduling, the new compiler fixes a lot of things that were broken before, and does not appear to break anything that was working before[1]. So even though it is not finished, it seems useful to merge it in its current state. [1] Not merged in this commit, because I'm not sure if it really belongs in mesa tree, but the following commit implements a simple shader emulator, which I've used to compare the output of the new compiler to the original compiler (ie. run it on all the TGSI shaders dumped out via ST_DEBUG=tgsi with various games/apps): `163b6306b1` Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-03 18:26:53 -05:00
Rob Clark	f0e2d7ab46	freedreno/a3xx/compiler: split out old compiler For the time being, keep old compiler as fallback for things that the new compiler does not support yet. Split out as its own commit to make the later new-compiler commits easier to follow. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-03 18:26:53 -05:00
Rob Clark	a418573c4d	freedreno/a3xx/compiler: prepare for new compiler Shuffle things around to prepare for new compiler. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-03 18:26:53 -05:00
Rob Clark	f08d2b1c0f	freedreno/a3xx: remove useless reg tracking in disasm-a3xx Not really used for anything anymore. So strip it out and avoid conflicting symbols with upcoming new-compiler. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-03 18:26:53 -05:00
Carl Worth	1597788d12	docs: Add release notes for 10.0.3 Which was just made.	2014-02-03 13:55:24 -08:00
Brian Paul	fc3fcd1e01	draw: fix incorrect color of flat-shaded clipped lines When we clipped a line we weren't copying the provoking vertex color to the second vertex. We also weren't checking for first vs. last provoking vertex. Fixes failures found with the new piglit line-flat-clip-color test. Cc: "10.0, 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-02-03 12:50:04 -07:00
Brian Paul	349b76a553	mesa: change GL_ALL_ATTRIB_BITS to 0xFFFFFFFF This has been wrong for many years. It was originally 0x000FFFFF and long ago there was discussion about whether GL_ALL_ATTRIB_BITS should include the then-new GL_MULTISAMPLE_BIT bit. Eventually the ARB decided that glPushAttrib(GL_ALL_ATTRIB_BITS) should save all current and future attribute groups (hence ~0). Unfortunately, Mesa's gl.h was never updated. This was just recently spotted by Eric Anholt and reported as a bug to the ARB. Ian, Jon Leech and I discussed it at the ARB meeting and decided to change Mesa's value to reflect the ARB's decision. Acked-by: Eric Anholt <eric@anholt.net>	2014-02-03 12:50:03 -07:00
Brian Paul	307fd76053	gallium/auxiliary/indices: replace free() with FREE() To match the CALLOC_STRUCT() call. Cc: "10.0, 10.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-02-03 12:49:55 -07:00
Brian Paul	97fdace6d7	svga: check shader size against max command buffer size If the shader is too large, plug in a dummy shader. This patch also reworks the existing dummy shader code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-02-03 12:40:13 -07:00
Brian Paul	4686f610b1	svga: refactor some shader code Put common code in new svga_shader.c file. Consolidate separate vertex/fragment shader ID generation. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-02-03 12:40:13 -07:00
Zack Rusin	9bace99d77	gallivm: fix opcode and function nesting gallivm soa code supported only a single level of nesting for control flow opcodes (if, switch, loops...) but the d3d10 spec clearly states that those are nested within functions. To support nesting of conditionals inside functions we need to store the nesting data inside function contexts and keep a stack of those. Furthermore we make sure that if nesting for subroutines is deeper than 32 then we simply ignore all subsequent 'call' invocations. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-02-03 13:29:14 -05:00
Kenneth Graunke	595bcf38a6	mesa: Drop unnecessary (void) ctx from VAO code. ctx is always used, even on release builds. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:53:16 -08:00
Kenneth Graunke	4323b92479	mesa: Remove "APPLE" from some VAO error messages. Chances are, people will be using the core names these days. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:53:15 -08:00
Kenneth Graunke	cf62e59673	mesa: Update some comments relating to VAOs. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:53:13 -08:00
Kenneth Graunke	e1b1f2a687	mesa: Rename ElementArrayBufferObj to IndexBufferObj. DirectX and most hardware documentation use the term "Index Buffer" to refer to a buffer containing indexes into arrays of vertex data, which allows random access to vertex data, rather than sequential access. OpenGL uses a different term for this concept: "Element Array Buffer". However, "Index Buffer" has become much more widespread. A quick Google search shows 29,300 hits for "Element Array Buffer" vs. 82,300 hits for "Index Buffer." Arguably, "Index Buffer" is clearer: an "element of an array" (or list) usually refers to an actual item stored in the array, not the index used to refer to it. The terminology is also already used in Mesa: some VBO module code for dealing with ElementArrayBufferObj names local variables "ib". Completely generated by: $ find . -type f -print0 \| xargs -0 sed -i \ 's/ElementArrayBufferObj/IndexBufferObj/g' Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:53:11 -08:00
Kenneth Graunke	0354e50798	mesa: Rename _mesa_lookup_arrayobj to _mesa_lookup_vao. For consistency with the previous renames. Completely generated by: $ find . -type f -print0 \| xargs -0 sed -i \ 's/_mesa_lookup_arrayobj/_mesa_lookup_vao/g' Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:53:09 -08:00
Kenneth Graunke	de47fd2668	mesa: Rename _mesa_..._array_obj functions to _mesa_..._vao. _mesa_update_vao_client_arrays() is less of a mouthful than _mesa_update_array_object_client_arrays(), and generally clearer. Generated by: $ find . -type f -print0 \| xargs -0 sed -i \ 's/_mesa_\([^_]*\)_array_object/_mesa_\1_vao/g' with manual whitespace and indentation fixes applied. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:53:07 -08:00
Kenneth Graunke	aac1415b66	mesa: Rename "struct gl_array_object" to gl_vertex_array_object. I considered replacing it with "gl_vao", but spelling it out seemed to fit better with Mesa's traditional style. Mesa doesn't shy away from long type names - consider gl_transform_feedback_object, gl_fragment_program_state, gl_uniform_buffer_binding, and so on. Completely generated by: $ find . -type f -print0 \| xargs -0 sed -i \ 's/gl_array_object/gl_vertex_array_object/g' v2: Rerun command to resolve conflicts with Ian's meta patches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:53:05 -08:00
Kenneth Graunke	94e07c1960	mesa: Rename "arrayObj" local variables to "vao". Now that the field is named "VAO" instead of "ArrayObj", it makes sense to call the local variables "vao" instead of "arrayObj". Completely generated by: $ find . -type f -print0 \| xargs -0 sed -i 's/arrayObj/vao/g' Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:53:02 -08:00
Kenneth Graunke	0dfe50f1a6	mesa: Rename ArrayObj to VAO and DefaultArrayObj to DefaultVAO. When reading through the Mesa drawing code, it's not immediately obvious to me that "ArrayObj" (gl_array_object) is the Vertex Array Object (VAO) state. The comment above the structure explains this, but readers still have to remember this and translate accordingly. Out of context, "array object" is fairly vague. Even in context, "array" has a lot of meanings: glDrawArrays, vertex data stored in user arrays, gl_client_arrays, gl_vertex_attrib_arrays, and so on. Using the term "VAO" immediately associates these fields with the OpenGL concept, clarifying the situation and aiding programmer sanity. Completely generated by: $ find . -type f -print0 \| xargs -0 sed -i \ -e 's/ArrayObj;/VAO;/g' \ -e 's/->ArrayObj/->VAO/g' \ -e 's/Array\.ArrayObj/Array.VAO/g' \ -e 's/Array\.DefaultArrayObj/Array.DefaultVAO/g' v2: Rerun command to resolve conflicts with Ian's meta patches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-03 00:52:58 -08:00
Ian Romanick	81144c049b	meta: Silence several 'unused parameter' warnings Silences many GCC warnings of the form: drivers/common/meta.c: In function 'cleanup_temp_texture': drivers/common/meta.c:1208:41: warning: unused parameter 'ctx' [-Wunused-parameter] drivers/common/meta.c: In function 'setup_ff_blit_framebuffer': drivers/common/meta.c:1453:46: warning: unused parameter 'ctx' [-Wunused-parameter] drivers/common/meta.c: In function 'meta_glsl_blit_cleanup': drivers/common/meta.c:1998:43: warning: unused parameter 'ctx' [-Wunused-parameter] drivers/common/meta.c: In function 'meta_glsl_clear_cleanup': drivers/common/meta.c:2287:44: warning: unused parameter 'ctx' [-Wunused-parameter] drivers/common/meta.c: In function 'setup_ff_generate_mipmap': drivers/common/meta.c:3365:45: warning: unused parameter 'ctx' [-Wunused-parameter] drivers/common/meta.c: In function 'meta_glsl_generate_mipmap_cleanup': drivers/common/meta.c:3556:54: warning: unused parameter 'ctx' [-Wunused-parameter] There are a couple other similar warnings, but they are less trivial. I want to investigate these further before axing them. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-02 16:49:09 +01:00
Ian Romanick	2bf4db1697	meta: Don't use fixed-function to decompress array textures Array textures can't be used with fixed-function, so don't. Instead, just drop the decompress request on the floor. This is no worse than what was done previously because generating the GL error (in _mesa_set_enable) broke everything anyway. A later patch will get GL_TEXTURE_2D_ARRAY targets working. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-02 16:49:09 +01:00
Ian Romanick	eb65d4b84d	meta: Use NDC in decompress_texture_image There is no need to use pixel coordinates, and using NDC directly will simplify the GLSL paths. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-02 16:49:09 +01:00
Ian Romanick	abfa65ca81	meta: Consistently use non-Apple VAO functions For these objects, meta was already using the non-Apple function to delete the objects. Everywhere else in the file uses _mesa_GenVertexArrays and _mesa_BindVertexArrays. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>	2014-02-02 16:49:09 +01:00
Ian Romanick	070f55d893	meta: Fallback to software for GetTexImage of compressed GL_TEXTURE_CUBE_MAP_ARRAY The hardware decompression path isn't even close to being able to handle this. This converts the crash (assertion failure) in "EXT_texture_compression_s3tc/getteximage-targets S3TC CUBE_ARRAY" to a plain old failure. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>	2014-02-02 16:49:09 +01:00
Ian Romanick	fcb498302b	meta: Release resources used by _mesa_meta_DrawPixels _mesa_meta_DrawPixels creates a VAO and (potentially) two fragment programs, but none of them are ever released. Leaking piles of memory is generally frowned upon. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>	2014-02-02 16:49:08 +01:00
Ian Romanick	2d3f92e881	meta: Release resources used by decompress_texture_image decompress_texture_image creates an FBO, an RBO, a VBO, a VAO, and a sampler object, but none of them are ever released. Later patches will add program objects, exacerbating the problem. Leaking piles of memory is generally frowned upon. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org>	2014-02-02 16:49:08 +01:00
Ian Romanick	a722454dac	mesa: Use common _mesa_tex_target_to_index in tex param code TEXTURE_BUFFER_INDEX has to be specially called out because it is not allowed in any of the glTexParameter or glGetTexParameter functions. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-02 16:49:08 +01:00
Ian Romanick	35e7027dab	mesa: Make target_enum_to_index available outside texobj.c The next patch will use this function in another file. v2: Rename _mesa_target_enum_to_index to _mesa_tex_target_to_index. Suggested by Brian. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-02-02 16:49:08 +01:00
Brian Paul	9451281aca	mesa: make several FBO functions static The four functions in question weren't called from any other file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:52:38 -07:00
Brian Paul	3abd4f4d90	mesa: move glGenerateMipmap() code into new genmipmap.c file Mipmap generation has nothing to do with FBOs. v2: update gl_genexec.py too (not api_exec.c) Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:52:37 -07:00
Brian Paul	bfcb9bb204	mesa: move glBlitFramebuffer code into new blit.c file Just for better organization. v2: update gl_genexec.py too (not api_exec.c) Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:52:37 -07:00
Brian Paul	20fedfd80a	mesa: don't signal _NEW_TEXTURE in TexSubImage() functions glTexSubImage(), glCopyTexSubImage() and glCompressedTexSubImage() only change the texel data, not other state like texture size or format. If a driver really needs to do something special it can hook into the corresponding driver functions or Map/UnmapTextureImage(). This should avoid some needless state validation effort. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-02-02 06:52:37 -07:00
Brian Paul	c55e3e6811	mesa: add some comments about mipmap generation Trivial.	2014-02-02 06:52:37 -07:00
Brian Paul	e286b63c8f	mesa: simplify comment in texstorage.c Trivial.	2014-02-02 06:52:37 -07:00
Brian Paul	8b3e383820	mesa: formatting fixes, 78-column wrappings in dd.h Trivial.	2014-02-02 06:52:37 -07:00
Brian Paul	deb9dd6e27	mesa: remove target param from ctx->Driver.TexParameter() Not really used anywhere. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:52:37 -07:00
Brian Paul	c20b48c48e	gallivm: add a few const qualifiers Trivial.	2014-02-02 06:52:36 -07:00
Brian Paul	c6d94648cf	translate: reindent translate_sse.c Trivial.	2014-02-02 06:52:36 -07:00
Brian Paul	8689076925	mesa: make _mesa_get_proxy_target() static Wasn't used in any other file. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:47:32 -07:00
Brian Paul	9eaed3eb6e	mesa: remove unused _mesa_select_tex_object() function The _mesa_get_current_tex_object() function is now used everywhere that _mesa_select_tex_object() was formerly used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:47:32 -07:00
Brian Paul	d5df28381e	swrast: use _mesa_get_current_tex_object() in swrastSetTexBuffer2() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:47:32 -07:00
Brian Paul	ed72115891	st/mesa: use _mesa_get_current_tex_object() in st_context_teximage() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:47:32 -07:00
Brian Paul	f09a1261ad	mesa: use _mesa_get_current_tex_object() in GetTexLevelParameteriv() And update a related comment. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:47:32 -07:00
Brian Paul	8b4f6fada2	radeon: use _mesa_get_current_tex_object() in radeonSetTexBuffer2() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:47:32 -07:00
Brian Paul	76c33e383c	r200: use _mesa_get_current_tex_object() in r200SetTexBuffer2() Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-02-02 06:47:32 -07:00
Paul Seidler	1cdeeef6c4	build: move ARCH_LIBS definition outside of ASM definition _mesa_streaming_load_memcpy is also needed even if assembling is disabled Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-01 15:01:06 -08:00
Eric Anholt	c849ecc19a	dri: Add a useful error message if someone's packages missed libudev deps. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-01 10:09:11 -08:00
Eric Anholt	63546b8e3d	dri: Also support the loader with libudev.so.0. As far as I know, this should be safe. If not, we have to decide whether to have variable lookup of the functions, or just drop support for .so.0 (which is a year and a half old it looks like) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74127 Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-02-01 10:08:36 -08:00
Rob Clark	dc00ec154b	freedreno: better manage our WFI's Updates to non-banked registers, CP_LOAD_STATE, etc, need a WFI if there is potentially pending rendering. Track this better, and add fd_wfi() calls everywhere that might potentially need CP_WAIT_FOR_IDLE. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 12:10:17 -05:00
Rob Clark	1fe9df8f29	freedreno/a3xx: add logicop Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:59:25 -05:00
Rob Clark	8d27be2633	freedreno/a3xx: handle frag z write Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:58:47 -05:00
Rob Clark	083b27a1b1	freedreno: resync generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:57:39 -05:00
Rob Clark	98c1111462	freedreno/a3xx: fix const confusion Gallium can leave const buffers bound above what is used by the current shader. Which can have a couple bad effects: 1) write beyond const space assigned, which can trigger HLSQ lockup 2) double emit of immed consts, first with bound const buffer vals, followed by actual immed vals. This seems to be a sort of undefined condition. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:57:09 -05:00
Rob Clark	5c6961efae	freedreno/a3xx/compiler: compiler cleanups Drop color/pos/psize_regid, plus a few compiler and IR cleanups. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:53:21 -05:00
Rob Clark	69eca28dd0	freedreno/compiler/a3xx: remove lowered instructions Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:52:27 -05:00
Rob Clark	0f2df4ff90	freedreno: add tgsi lowering pass Currently lowers the following instructions: DST, XPD, SCS, LRP, FRC, POW, LIT, EXP, LOG, DP4, DP3, DPH, DP2 translating these into equivalent simpler TGSI instructions. This probably should be moved to util so other drivers can use it, but just adding under freedreno for now so that I can clear out a lot of the lowering code in a3xx compiler before beginning to add new compiler. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:50:10 -05:00
Rob Clark	7524756199	freedreno/a3xx/compiler: add CLAMP Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:49:31 -05:00
Rob Clark	fafe16a8a0	freedreno/a3xx/compiler: various fixes Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:49:06 -05:00
Rob Clark	4971628bae	freedreno: ctx should hold ref to dev The ctx should hold ref to dev to avoid problems if screen is destroyed before ctx. Doesn't really fix the egl/glx issues, but at least it prevents things from getting much worse. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:47:08 -05:00
Rob Clark	303df12db8	freedreno: add prims-emitted driver query Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-01 11:45:19 -05:00
Kenneth Graunke	80bf1fbaf6	i965: Silence unused variable 'ctx' warning. Somehow I missed this before pushing the Broadwell PS state upload code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-31 21:40:27 -08:00
Kenneth Graunke	e1cdafe6f7	i965: Fix math instruction hstride assertions on Broadwell. In the final revision of my gen8_generator patch, I updated the MATH instruction's assertion from (dst.hstride == 1) to check that source and destination hstride matched. Unfortunately, I didn't test this enough, and many Piglit tests fail this test. The documentation indicates that "scalar source is also supported", which we believe means <0,1,0> access mode (hstride == 0). If hstride is non-zero, then it must match the destination register. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-31 17:50:09 -08:00
Kenneth Graunke	d8878055f5	i965: Add (disabled) Broadwell PCI IDs. This puts the PCI IDs in place so it's easy to enable support. However, it doesn't actually enable support since it's very preliminary still, and a few crucial pieces (such as BLORP) are still missing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:08 -08:00
Kenneth Graunke	3ade766684	i965: Disable 3DSTATE_WM_HZ_OP fields. Eric believes this to be wrong and unnecessary, as the command is supposed to emit an implicit rectangle primitive. However, empirically the pixel pipeline is completely unreliable without it. So for now, it stays until someone comes up with a better solution. We'll need to do better than this when we implement multisampling, HiZ, or fast clears...but for now, this will do. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:08 -08:00
Kenneth Graunke	4c4e0ed64b	i965: Update GS state for Broadwell. This is quite similar to the Gen7 code. The main changes: - 48-bit relocations - Thread count is specified as U/2-1 instead of U-1. - An extra DWord (DW9) with clip planes, URB entry output length/offsets - We need to program the "Expected Vertex Count" (VerticesIn) v2: Set the number of binding table entries so they can be prefetched (requested by Eric Anholt). v3: Add a WARN_ONCE for a missing workaround. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:08 -08:00
Kenneth Graunke	a0d4311072	i965: Update multisampling state for Broadwell. On previous platforms, 3DSTATE_MULTISAMPLE contained the number of samples, pixel location, and the positions of each sample within a pixel for each multisampling mode (4x and 8x). It was also a non-pipelined command, presumably since changing the sample positions is fairly drastic. Broadwell improves upon this by splitting the sample positions out into a separate non-pipelined state packet, 3DSTATE_SAMPLE_PATTERN. With that removed, 3DSTATE_MULTISAMPLE becomes a pipelined state packet. Broadwell also supports 2x and 16x multisampling, in addition to the 4x and 8x supported by Gen7. This patch, however, does not implement 2x and 16x. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:08 -08:00
Kenneth Graunke	9cd65e3289	i965: Update 3DSTATE_{DEPTH,STENCIL,...}_BUFFER and such for Broadwell. The amount of cut and paste from Gen7 is rather ugly, and should probably be cleaned up in the future. Even the Gen7 code is in need of some tidying though; many of the function parameters aren't used on platforms that use level/layer rather than tile offsets. Tidying both can be left to a future patch series. This at least gets things going. v2: Rebase on Paul's rename of NumLayers -> MaxNumLayers. v3: Shift QPitch by 2 when storing it in the packet. Bits 14:0 store bits 16:2 of the actual value. Fixes tests. v4: Add missing stencil buffer QPitch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:08 -08:00
Kenneth Graunke	2fce1e3c69	i965: Update BLEND_STATE for Broadwell. v2: Allow logic ops on all surface types. The UNORM restriction was lifted with Haswell and I simply hadn't noticed. Also, add missing BRW_NEW_STATE_BASE_ADDRESS dirty bit. Both caught by Eric Anholt. v3: Fix swapped per-RT DWord pairs. Eliminates bizarre hacks. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:08 -08:00
Kenneth Graunke	460e0df330	i965: Update SF_CLIP_VIEWPORT for Broadwell. It has additional fields to support clipping to the viewport even if guardband clipping is enabled. v2: Update for viewport array changes. v3: No, seriously, update for viewport array changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v1]	2014-01-31 17:50:08 -08:00
Kenneth Graunke	dcbf25969e	i965: Rework SURFACE_STATE entries for Broadwell. v2: Add missing SCS setting in gen8_emit_buffer_surface_state (caught by Eric Anholt). v3: Use stored QPitch rather than recomputing it. v4: Shift QPitch by 2 when setting it in the packet; bits 14:0 store bits 16:2 of the actual value (fixes myriads of cube and array texturing tests). Also, only enable cube face bits for cubemaps (matches Chris Forbes' commit on master). Port to use offset64. v5: s/gl_format/mesa_format/g v6: Fix DW5 of renderbuffer state, which neglected to subtract irb->mt->first_level. Use vertical_alignment() rather than hardcoding 4. Use ffs for multisample counts rather than a large switch statement (all caught/suggested by Eric). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:07 -08:00
Kenneth Graunke	990aaf87c4	i965: Update SOL state for Broadwell. Unlike on Gen7, we can directly set the offset via the state packet. We also -have- to: the kernel SOL reset code won't work anymore. v2: Fix copy and paste mistake in buffer stride setup; drop stale comment (caught by Eric Anholt). Add a perf_debug for missing MOCS setup. v3: Rebase on Paul Berry's changes to CurrentVertexProgram. v4: Fix SO Write Offset handling. We need to set bits 20 and 21 so the hardware both loads and saves the offset. There's also a restriction that 3DSTATE_SO_BUFFER can only be programmed once per buffer between primitives, so the "reset to zero" code needed reworking. Fixes most of the transform feedback Piglit tests. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v2]	2014-01-31 17:50:07 -08:00
Kenneth Graunke	fd91ab662d	i965: Update the code that disables unused shader stages for Broadwell. v2: Also disable 3DSTATE_WM_CHROMAKEY for safety. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v1]	2014-01-31 17:50:07 -08:00
Kenneth Graunke	3d3c351cfb	i965: Update 3DSTATE_CLIP for Broadwell. Broadwell's winding order, polygon fill, and viewport Z test fields have moved to DWord 1 of 3DSTATE_RASTER. v2: Add a perf_debug for a future optimization and improve commit message (both suggested by Eric Anholt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:07 -08:00
Kenneth Graunke	5c0d7dbcb9	i965: Rework vertex uploads for Broadwell. v2: Emit a dummy 3DSTATE_VF_SGVS packet when not needed. v3: Add WARN_ONCE and perf_debugs requested by Eric Anholt. v4: Program 3DSTATE_SGVS even in the no-elements case so gl_VertexID continues working. Fix 3DSTATE_VF_INSTANCING to not use an element index to access the buffers array. Some ARB_draw_indirect prep work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:07 -08:00
Kenneth Graunke	08a4714959	i965: Update STATE_BASE_ADDRESS for Broadwell. v2: Fix missing "change" bit on instruction state base address (caught by Haihao Xiang). v3: Add a perf_debug for missing MOCS setup, requested by Eric. v4: Fix buffer sizes. The value, specified at bit 12 and up, is actually measured in 4k pages. We need to round up to the next multiple of 4k. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v3] Reviewed-by: Matt Turner <mattst88@gmail.com> [v4]	2014-01-31 17:50:07 -08:00
Kenneth Graunke	f3c6d6f1e1	i965: Update 3DSTATE_PS, 3DSTATE_WM, and add 3DSTATE_PS_EXTRA. v2: Fix setting of GEN8_PSX_ATTRIBUTE_ENABLE after rebases. v3: Add missing binding table entry counts. Don't worry about alpha testing or alpha to coverage when setting the "Kill Pixel" bit; those are specified in 3DSTATE_PS_BLEND (caught by Eric Anholt). Drop unused _NEW_BUFFERS. Tidy comments. v4: Rebase on Paul Berry's changes to CurrentFragmentProgram. v5: Re-enable line stippling. It doesn't crash or anything. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v3]	2014-01-31 17:50:07 -08:00
Kenneth Graunke	20d9286f71	i965: Rework 3DSTATE_VS for Broadwell. v2: Remove incorrect MOCS shifts; rename urb_entry_write_offset to urb_entry_output_offset to closer match the documentation. v3: Only emit a non-zero constant buffer read length when active. v4: Add missing binding table counts (caught by Eric). v5: Rebase on Paul Berry's changes to CurrentVertexProgram. v6: Drop bogus SBE read length/offset field code. We were programming the wrong values, and our 3DSTATE_SBE code overrides any value we put here anyway with the correct one. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v4]	2014-01-31 17:50:06 -08:00
Kenneth Graunke	c96686a6cc	i965: Add the new 3DSTATE_PS_BLEND state packet. v2: Only set GEN8_PS_BLEND_HAS_WRITEABLE_RT if color buffer writes are enabled (caught by Eric Anholt). v3: Set non-blending flags (writeable RT, alpha test, alpha to coverage) for integer formats too. +14 Piglits. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v2]	2014-01-31 17:50:06 -08:00
Kenneth Graunke	17768bb7b4	i965: Replace DEPTH_STENCIL_STATE with Gen8's 3DSTATE_WM_DEPTH_STENCIL. v2: Use stencil->_WriteEnabled instead of setting GEN8_WM_DS_STENCIL_BUFFER_WRITE_ENABLE twice (suggested by Eric). v3: Mask stencil->WriteMask and stencil->ValueMask with 0xff. The field is only 8-bits, so we'd trip the new SET_FIELD assertion when core Mesa gave us a value like 0xFFFFFFFF. The Gen7 code uses structure field widths to implicitly do this truncation. Fixes Piglit tests. v4: Use uint32_t for dw1/dw2, not uint8_t. Worst. Typo. Ever. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v2]	2014-01-31 17:50:06 -08:00
Kenneth Graunke	90fff1354b	i965: Update SF, SBE, and RASTER state for Broadwell. The attribute override portion of 3DSTATE_SBE was split out into 3DSTATE_SBE_SWIZ; various bits of 3DSTATE_SF were split out into 3DSTATE_RASTER. v2: Set Force URB Read Offset bit. Eventually the URB read offset should be set in 3DSTATE_VS, but that will require some refactoring. v3: Rebase on viewport array changes. v4: Improve comments about URB read length/offset overrides. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:06 -08:00
Kenneth Graunke	4552a22f04	i965: Bump generation assertions on workaround flushes. I haven't investigated whether these are necessary on Broadwell or not, but for paranoia's sake, we may as well continue doing them for now. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-31 17:50:06 -08:00
Kenneth Graunke	2184b519cd	i965: Duplicate gen7_atoms to gen8_atoms. It's going to diverge significantly. Starting out with a copy allows future patches to change atoms one by one. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-31 17:50:06 -08:00
Brian Paul	f51ca46f0c	radeon: move driContextSetFlags(ctx) call after ctx var is initialized CC: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-01-31 17:09:44 -07:00
Brian Paul	2d6d69bab6	r200: move driContextSetFlags(ctx) call after ctx var is initialized Otherwise, ctx was a garbage value. CC: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-01-31 17:09:44 -07:00
Roland Scheidegger	1d53603f1f	llvmpipe: fix denorm handling for r11g11b10_float format when blending The code re-enabling denorms for small float formats did not recognize this format due to format handling hacks (mainly, the lp_type doesn't have the floating bit set). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-01-31 19:51:06 +01:00
Matt Turner	606544214e	glsl: Expand non-expr & non-swizzle scalar rvalues in vectorizing.	2014-01-31 10:21:50 -08:00
Matt Turner	3f49a8c9a5	glcpp: Reject #version after the version has been resolved. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74166 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Carl Worth <cworth@cworth.org>	2014-01-31 10:21:50 -08:00
Carl Worth	9d4a6bd6bb	glcpp: Rename the variable used to enable debugging. The -p option we now use when calling bison means that this variable will be named glcpp_parser_debug not yydebug. This was not caught when the -p option was added because this variable isn't used in the code as committed. (I prefer the declaration to remain since it allows a developer to easily find this variable name to enable debugging.)	2014-01-31 10:02:58 -08:00
Carl Worth	2dc93bd5d1	glcpp: Add "make check" test for comment-parsing bug This is the innocent-looking but killer test case to verify the bug fixed in the preceding commit. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-31 10:02:54 -08:00
Carl Worth	71978cf66f	glcpp: Don't enter lexer's NEWLINE_CATCHUP start state for single-line comments In commit `6005e9cb28` a new start state of NEWLINE_CATCHUP was added to the lexer. This start state is used whenever the lexer is emitting a NEWLINE token to emit additional NEWLINE tokens for any newline characters that were skipped by an immediately preceding multi-line comment. However, that commit erroneously entered the NEWLINE_CATCHUP state for single-line comments. This is not desired since in the case of a single-line comment, the lexer is not emitting any NEWLINE token. The result is that the lexer will remain in the NEWLINE_CATCHUP state and proceed to fail to emit a NEWLINE token for the subsequent newline character, (since the case to match \n expects only the INITIAL start state). The fix is quite simple, remove the "BEGIN NEWLINE_CATCHUP" code from the single-line comment case, (preserving it only in exactly the cases where the lexer is actually emitting a NEWLINE token). Many thanks to Petri Latvala for reporting this bug and for providing the minimal test case to exercise it. The bug showed up only with a multi-line comment which was followed immediately by a single-line comment (without any intervening newline), such as: /* */ // Kablam! Since `6005e9cb28`, and before this commit, that very innocent-looking combination of comments would yield a parse failure in the compiler. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72686 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-31 10:02:36 -08:00
Brian Paul	df21f31788	mesa: use _mesa_align_free() in _mesa_delete_buffer_object() To match _mesa_align_malloc() call in _mesa_buffer_data(). Found by Colin Harrison <colin.harrison@virgin.net> Signed-off-by: Brian Paul <brianp@vmware.com>	2014-01-31 09:52:11 -07:00
Michel Dänzer	db8b6fb2df	st/dri: Fix tests for no draw/read buffers in dri_make_current() Fixes piglit glx/GLX_ARB_create_context/current with no framebuffer. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-31 11:06:26 +09:00
Keith Packard	3fbd1b0cb5	dri3: Track current Present swap mode and adjust buffer counts This automatically adjusts the number of buffers that we want based on what swapping mode the X server is using and the current swap interval (swap mode / interval / buffers): copy / > 0 / 1; copy / 0 / 2; flip / > 0 / 2; flip / 0 / 3. Note that flip with swap interval 0 is currently limited to twice the underlying refresh rate because of how the kernel manages flipping. Moving from 3 to 4 buffers would help, but that seems ridiculous. v2: Just update num_back at the point that the values that change num_back change. This means we'll have the updated value at the point that the freeing of old going-to-be-unused backbuffers happens, which might not have been the case before (change by anholt, acked by keithp). Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-30 17:29:33 -08:00
Keith Packard	aea4757eb4	dri3, i915, i965: Add __DRI_IMAGE_FOURCC_SARGB8888 The __DRIimage createImageFromFds function takes a fourcc code, but there was no fourcc code that matched __DRI_IMAGE_FORMAT_SARGB8. This adds a define for that format, adds a translation in DRI3 from __DRI_IMAGE_FORMAT_SARGB8 to __DRI_IMAGE_FOURCC_SARGB8888 and then adds translations back to __DRI_IMAGE_FORMAT_SARGB8 in both the i915 and i965 drivers. I'll refrain from comments on whether I think having two separate sets of format defines in dri_interface.h is a good idea or not... Fixes piglit glx-tfp and glx-visuals-depth Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-30 17:29:23 -08:00
Keith Packard	f12d6d613a	dri3: Flush XCB before blocking for special events XCB doesn't flush the output buffer automatically, so we have to call xcb_flush ourselves before waiting. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-30 16:40:25 -08:00
Keith Packard	09d6c19720	dri3: Enable GLX_INTEL_swap_event Now that we're tracking SBC values correctly, and the X server has the ability to send the GLX swap events from a PresentPixmap request, enable this extension. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-30 16:40:06 -08:00
Keith Packard	1525474ead	dri3: Fix dri3_wait_for_sbc to wait for completion of requested SBC Eric figured out that glXWaitForSbcOML wanted to block until the requested SBC had been completed, which means to wait until the PresentCompleteNotify event for that SBC had been received. This replaces the simple sleep(1) loop (which was bogus) with a loop that just checks to see if we've seen the specified SBC value come back in a PresentCompleteNotify event yet. The change is a bit larger than that as I've broken out a piece of common code to wait for and process a single Present event for the target drawable. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-30 16:38:36 -08:00
Keith Packard	71d614250e	dri3: Track full 64-bit SBC numbers, instead of just 32-bits Tracking the full 64-bit SBC values makes it clearer how those values are being used, and simplifies the wait_msc code. The only trick is in re-constructing the full 64-bit value from Present's 32-bit serial number that we use to pass the SBC value from request to event. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-30 16:35:00 -08:00
Mark Mueller	34a8a0820f	mesa: Add warning to _REV pack/unpack functions with incorrect behavior Signed-off-by: Mark Mueller <MarkKMueller@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-01-31 00:51:36 +01:00
Siavash Eliasi	03065ea05c	r600g: Removed unnecessary positivity check for unsigned int variable. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-01-31 00:50:08 +01:00
Michel Dänzer	9f26ad00d7	st/dri: Allow creating OpenGL 3.3 core contexts Enables OpenGL 3.3 piglit tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-30 10:30:41 +09:00
Kristian Høgsberg	cbecd958a7	build: Share the all-local rule for linking libraries into the build dir This consolidates how we link the libraries into the build directory. It works for lib_LTLIBRARIES but not custom shared libraries like DRI drivers or gallium state trackers, which need special casing (cf. dri mega drivers, for example) Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-29 12:58:13 -08:00
Emil Velikov	7965908976	loader: do not print the pci id during normal operation Spamming the pci id is not beneficial. Make sure it's printed only when needed. v2: Change severity to _LOADER_DEBUG, rather than removing the message. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-29 19:55:02 +00:00
Emil Velikov	780dfc1fec	loader: print WARNING and FATAL messages using the default logger Lower values are used for more severe cases. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-29 19:53:53 +00:00
Emil Velikov	4c35e32594	glsl: s/_NDEBUG/NDEBUG/ The former symbol is never defined within mesa. Based on the code it seems that the original intent was to use NDEBUG. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 19:52:35 +00:00
Kristian Høgsberg	e3afbe3ad7	dir-locals.el: Set indent-tabs-mode true for makefile-mode Makefiles need hard tabs, let's not make that harder than it needs to be. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-29 11:45:49 -08:00
Courtney Goeltzenleuchter	3e894e213b	mesa: Return after ScissorArrayv or ScissorIndexed detect a parameter error Fixes piglit arb_viewport_array-scissor-ignore. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jon Ashburn <jon@lunarg.com>	2014-01-29 09:40:02 -07:00
Ian Romanick	ca385bffa6	docs: Add GL_ARB_map_buffer_alignment status to GL3.txt and release notes Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:11:40 -07:00
Siavash Eliasi	7fd6ad7adc	mesa: GL_ARB_map_buffer_alignment is not optional Every driver supports it. All current and future Gallium drivers always support it, and all existing classic drivers support it. v2: Making GL_ARB_map_buffer_alignment a desktop OpenGL extension only. v3: Squash two commits together. v4 (idr): MIN_MAP_BUFFER_ALIGNMENT queries don't have any dependencies. In previous versions of the patch it depended on EXTRA_API_GL which would prevent the query from working in core profile contexts. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:11:39 -07:00
Siavash Eliasi	b9aaa96ec3	nouveau: Use gl_constants::MinMapBufferAlignment as the alignment in nouveau_bo_new This driver does not support GL_ARB_map_buffer_range, so no special treatment is needed for unaligned offsets in the mapping. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:11:39 -07:00
Siavash Eliasi	d38867d80c	radeon / r200: Use gl_constants::MinMapBufferAlignment as the alignment in radeon_bo_open These drivers do not support GL_ARB_map_buffer_range, so no special treatment is needed for unaligned offsets in the mapping. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:11:39 -07:00
Siavash Eliasi	f772d51c25	mesa: Use _mesa_align_malloc in _mesa_buffer_data v2: Fixed memory leak. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:11:39 -07:00
Siavash Eliasi	689b20cfe0	mesa: Set gl_constants::MinMapBufferAlignment to 64 by default Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:11:39 -07:00
Siavash Eliasi	6bb27ee51c	mesa/st: Unconditionally enable ARB_map_buffer_alignment. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:11:39 -07:00
Ian Romanick	25c14f40f3	freedreno: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64 Allocations actually have page alignment, but 64 is still a reasonable value. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rob Clark <robclark@freedesktop.org>	2014-01-29 09:11:39 -07:00
Siavash Eliasi	205e624048	ilo: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64 Ian manually ran the map_buffer_range* tests and the arb_map_buffer_alignment-* tests, but he did not do a full piglit run. v2 (idr): Use 64 instead of 4096 Tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com>	2014-01-29 09:11:39 -07:00
Siavash Eliasi	75081391a4	svga: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64 v2: Fixed setting switch cases prior to PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT incorrectly. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:11:39 -07:00
Siavash Eliasi	d273fe72df	i915g: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64 v2: Fixed setting switch cases prior to PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT incorrectly.	2014-01-29 09:11:39 -07:00
Siavash Eliasi	4329e99b23	i915g: Use alignment of 64 instead of 16 for buffer allocation Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:09:41 -07:00
Siavash Eliasi	809d3a7d25	llvmpipe: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64 v2: Fixed setting switch cases prior to PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT incorrectly. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:09:41 -07:00
Siavash Eliasi	6317664de0	llvmpipe: Use alignment of 64 instead of 16 for buffer allocation v2: Changed allocation alignment of llvmpipe_displaytarget_layout. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:09:41 -07:00
Siavash Eliasi	c83b34c43b	softpipe: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64 v2: Fixed setting switch cases prior to PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT incorrectly. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:09:41 -07:00
Siavash Eliasi	e36759a81e	softpipe: Use alignment of 64 instead of 16 for buffer allocation v2: Changed allocation alignment in softpipe_displaytarget_layout. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-29 09:09:41 -07:00
Stéphane Marchesin	023a50dd9b	i915g: support more PIPE_CAPs	2014-01-28 18:56:54 -08:00
Michel Dänzer	f8e16010e5	radeonsi: Put GS ring buffer descriptors with streamout buffer descriptors And mark the constant buffers as read only for the GPU again. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:09:26 +09:00
Michel Dänzer	d7c68e2dc1	radeonsi: Enable OpenGL 3.3 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:09:14 +09:00
Michel Dänzer	db9d6af862	radeonsi: Geometry shader micro-optimizations Move parameter loads out of loops, and use the instruction offset instead of a VGPR for the vertex attribute offset when writing to the ESGS ring buffer. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:09:04 +09:00
Michel Dänzer	3b3687adcb	radeonsi: We don't support indirect addressing of geometry shader inputs Fixes piglit spec/glsl-1.50/execution/geometry/dynamic_input_array_index Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:08:54 +09:00
Michel Dänzer	b4e14931a9	radeonsi: Pass VS resource descriptors to the HW ES shader stage as well This makes sure constants and samplers work in the vertex shader even when a geometry shader is active. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:08:43 +09:00
Michel Dänzer	67e385b3b7	radeonsi: Fix streamout from geometry shader Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:08:33 +09:00
Michel Dänzer	d88a375229	radeonsi: Simplify shader PM4 state handling Just always bind the current states before drawing. Besides the simplification, as a bonus this makes sure the VS hardware shader stage always uses the GS copy shader when a geometry shader is active, fixing a number of GS related piglit tests. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:08:21 +09:00
Michel Dänzer	e884c560a6	radeonsi: Properly match ES outputs to GS inputs Fixes piglit vs-gs-arrays-within-blocks-pass. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:08:10 +09:00
Michel Dänzer	e1df0d45c4	radeonsi: Really dump TGSI code before any TGSI->LLVM conversion attempt While we're at it, use the local variable 'sel'. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:07:58 +09:00
Michel Dänzer	7b19c391f4	radeonsi: Also export clip distances with geometry shader Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:07:48 +09:00
Michel Dänzer	8afde9fa23	radeonsi: Take GS into account for VS state in more places Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:07:35 +09:00
Michel Dänzer	28630713b2	radeonsi: Handle adjacency primitives Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:07:23 +09:00
Michel Dänzer	d8b3d806fc	radeonsi: Handle TGSI_SEMANTIC_PRIMID Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:07:11 +09:00
Michel Dänzer	7c7d7380f1	radeonsi: Generalize counting of shader parameters Now it covers ES->GS as well as VS->PS. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:06:58 +09:00
Michel Dänzer	f07a96dad1	radeonsi: Fix handling of geometry shader output vertex ID It needs to increment at shader runtime, not at shader compile time, as the geometry shader can emit vertices in loops. LLVM automagically converts the ID back to an immediate value if its value can be determined at compile time. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:06:45 +09:00
Michel Dänzer	404b29d765	radeonsi: Initial geometry shader support Partly based on the corresponding r600g work by Vadim Girlin and Dave Airlie. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:06:28 +09:00
Michel Dänzer	51f89a03e1	radeonsi: Refactor shader input / output handling code In preparation for adding geometry shader support. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-29 11:05:58 +09:00
Matt Turner	947c828d5c	i965/fs: Add a saturation propagation optimization pass. Transforms, for example, `mul vgrf3, vgrf2, vgrf1; mov.sat vgrf4, vgrf3` into `mul.sat vgrf3, vgrf2, vgrf1; mov vgrf4, vgrf3`, which gives register_coalescing an opportunity to remove the MOV instruction. total instructions in shared programs: 1515039 -> 1504634 (-0.69%) instructions in affected programs: 798586 -> 788181 (-1.30%) GAINED: 0 LOST: 4 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-28 17:47:41 -08:00
Matt Turner	39d7ec2c9a	i965: Add can_do_saturate() method to backend_instruction. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-28 17:47:41 -08:00
Anuj Phogat	3303475558	mesa: Generate correct error code in glDrawBuffers() OpenGL 3.3 spec expects GL_INVALID_OPERATION: "For both the default framebuffer and framebuffer objects, the constants FRONT, BACK, LEFT, RIGHT, and FRONT AND BACK are not valid in the bufs array passed to DrawBuffers, and will result in the error INVALID OPERATION." But OpenGL 4.0 spec changed the error code to GL_INVALID_ENUM: "For both the default framebuffer and framebuffer objects, the constants FRONT, BACK, LEFT, RIGHT, and FRONT_AND_BACK are not valid in the bufs array passed to DrawBuffers, and will result in the error INVALID_ENUM." This patch changes the behaviour to match OpenGL 4.0 spec Fixes Khronos OpenGL CTS draw_buffers_api.test. V2: Update the comment in code. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-28 15:30:55 -08:00
Dave Airlie	faee376869	loader: fix running with --disable-egl builds I sometimes build without EGL just for speed purposes, however it no longer finds my drivers when I do due to the HAVE_LIBUDEV defines being wrong. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-28 21:51:21 +00:00
Anuj Phogat	dc2f94bc78	i965: Ignore 'centroid' interpolation qualifier in case of persample shading I missed this change in commit `f5cfb4a`. It fixes the incorrect rendering caused in Dolphin Emulator. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73915 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Tested-by: Markus Wick <wickmarkus@web.de> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-28 13:32:20 -08:00
Matt Turner	10dc994e09	gbm: Make libgbm.so.1 symlink. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-28 07:29:14 -08:00
Kevin Rogovin	1db9ed6495	mesa: Allow depth = 0 parameter for TexImage3D. Fixes the tests for the depth parameter for TexImage3D calls when the target type is GL_TEXTURE_2D_ARRAY or GL_TEXTURE_CUBE_MAP_ARRAY so that a depth value of 0 is accepted. Previously, the check incorrectly required the depth argument to be at least 1. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-28 07:29:14 -08:00
Tom Stellard	7b4592a489	r600g,radeonsi: Don't set resource_create in r600_common_screen_init() r600g and radeonsi have different implementations of resource_create. https://bugs.freedesktop.org/show_bug.cgi?id=74139 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-28 07:24:11 -08:00
José Fonseca	f29968b270	c11: Add missing stdlib.h include. For malloc/free. Silences gcc mingw warnings.	2014-01-28 14:35:04 +00:00
Emil Velikov	61c825e862	loader: include dlfcn.h when building with HAVE_LIBUDEV The code depending on the definitions is already wrapped in the same conditional so go ahead and wrap the include. Otherwise we'll break compilation on platforms that are missing the header. Add assert.h in there as well, as it is introduced and used in the same fashion. Cc: Eric Anholt <eric@anholt.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74122 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-01-28 14:32:03 +00:00
José Fonseca	2eddf91faf	gallivm: Workaround http://llvm.org/PR18600 We have code generation paths that carry out swizzles of AoS vectors via bitwise shifts, as these tend to generate more efficient code than straightforward byte shuffles. But when the input is a constant the additional bitwise arithmetic operations somehow don't really get constant propagated properly, eventually causing an assertion failure in the InstCombine pass. Therefore avoid the bug by using the trivial shuffles for constant inputs. Although the sample LLVM IR can cause a crash with any LLVM version, this was only seen in practice with LLVM 3.2. Reviewed-by: Matthew McClure <mcclurem@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-28 14:27:27 +00:00
Matt Turner	37f1903e00	glsl: Avoid combining statements from different basic blocks. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74113 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 21:15:35 -08:00
Matt Turner	8e2b8bd0e6	glsl: Set proper swizzle when a channel is missing in vectorizing. Previously, for example if the x channel was missing from a series of assignments we were attempting to vectorize, the wrong swizzle mask would be applied. a.y = b.y; a.z = b.z; a.w = b.w; would be incorrectly transformed into a.yzw = b.xyz; Fixes two transform feedback tests in the ES3 conformance suite. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73978 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73954 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-27 21:15:35 -08:00
Matt Turner	57109d57f8	glsl: Use bitfieldInsert in ldexp() lowering. Shaves a few instructions off of lowered ldexp(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-27 21:15:35 -08:00
Matt Turner	3ea64f9093	glsl: Add constant evaluation of ir_binop_bfm. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-27 21:15:35 -08:00
Matt Turner	c59a605c70	glcpp: Resolve implicit GLSL version to 100 if the API is ES. Fixes a regression since `b2d1c579` where ES shaders without a #version declaration would fail to compile if their precision declaration was wrapped in the standard #ifdef GL_ES check. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74066 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-27 21:15:35 -08:00
Matt Turner	3e0e9e3bf9	glcpp: Check version_resolved in the proper place. The check was in the wrong place, such that if a shader incorrectly put a preprocessor token before the #version declaration, the version would be resolved twice, leading to a segmentation fault when attempting to redefine the __VERSION__ macro. #extension GL_ARB_sample_shading: require #version 130 void main() {} Also, rename glcpp_parser_resolve_version to glcpp_parser_resolve_implicit_version to avoid confusion. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Carl Worth <cworth@cworth.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-27 21:15:35 -08:00
Michel Dänzer	a818bf481a	r600g: s/r600_llvm_gpu_string/r600_get_llvm_processor_name/ Fixes build failure introduced by commit `65dc588bfd` ('r600g,radeonsi: consolidate get_compute_param'), which consolidated the former into the latter.	2014-01-28 10:12:32 +09:00
Marek Olšák	7209703432	radeonsi: cleanup includes, add missing license Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:40:13 +01:00
Marek Olšák	2942124db8	radeonsi: remove open-coded PS_PARTIAL_FLUSH event Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:40:10 +01:00
Marek Olšák	8a4d7c296f	radeonsi: move some inline functions from si_pipe.h to si_state.c And si_tex_aniso_filter is unused. v2: remove INLINE occurrences Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:40:05 +01:00
Marek Olšák	530348680a	radeonsi: remove si_resource.h Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:40:04 +01:00
Marek Olšák	6e38a3de8a	radeonsi: remove si.h Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:40:02 +01:00
Marek Olšák	27a73a1b94	radeonsi: move si_upload_const_buffer to a better place This gets rid of another file. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:59 +01:00
Marek Olšák	9f5c037ab9	radeonsi: inline si_translate_index_buffer Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:57 +01:00
Marek Olšák	0932f0ff14	radeonsi: inline si_upload_index_buffer Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:53 +01:00
Marek Olšák	ed42e95404	r600g,radeonsi: consolidate remaining obviously duplicated pipe_screen code Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:50 +01:00
Marek Olšák	65dc588bfd	r600g,radeonsi: consolidate get_compute_param v2: added fprintf to r600_get_llvm_processor_name Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:48 +01:00
Marek Olšák	d41bd71bcf	r600g,radeonsi: consolidate get_paramf and get_video_param radeonsi now reports PIPE_VIDEO_CAP_SUPPORTS_PROGRESSIVE = true if UVD support isn't available. It's what all the other drivers do. Also, some #include directives were missing in radeon_uvd.h. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:46 +01:00
Marek Olšák	a4c218f398	r600g,radeonsi: consolidate variables for CS tracing Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:42 +01:00
Marek Olšák	ba0c16f7b2	r600g,radeonsi: consolidate get_timestamp, get_driver_query_info This enables more queries for the Gallium HUD with radeonsi. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:39 +01:00
Marek Olšák	4df3f25fa2	r600g,radeonsi: consolidate get_name and get_vendor queries Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:37 +01:00
Marek Olšák	f4612105e8	radeon: place context-related functions first in r600_pipe_common.c To follow the unwritten convention of r600g and radeonsi. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:27 +01:00
Marek Olšák	a9ae7635b7	r600g,radeonsi: consolidate the contents of r600_resource.c Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:25 +01:00
Marek Olšák	8739c60796	radeonsi: advertise the pipeline statistics query Implemented by the common code. You can now visualize the statistics with the HUD, see GALLIUM_HUD=help for all available queries. For example: GALLIUM_HUD=clipper-primitives-generated Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:15 +01:00
Marek Olšák	62d55c0a2d	radeonsi: use queries from r600g Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:10 +01:00
Marek Olšák	c53b8de335	r600g: remove a no-op while loop for (;;) { } while (); I was surprised to see such a statement. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:08 +01:00
Marek Olšák	aa90f17126	r600g: convert query emission code to radeon_emit Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:39:03 +01:00
Marek Olšák	dc76eea22c	r600g: only emit NOP relocations for queries if VM is disabled Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:38:59 +01:00
Marek Olšák	4e5c70e066	r600g: move queries to drivers/radeon Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com>	2014-01-28 01:38:56 +01:00
Mark Mueller	f5bd5568ab	mesa: Fix Type A _INT formats to MESA_FORMAT naming standard Replace Type A _INT formats names with _SINT to match naming spec, and update type C formats as follows: s/MESA_FORMAT_R_INT8\b/MESA_FORMAT_R_SINT8/g s/MESA_FORMAT_R_INT16\b/MESA_FORMAT_R_SINT16/g s/MESA_FORMAT_R_INT32\b/MESA_FORMAT_R_SINT32/g s/MESA_FORMAT_RG_INT8\b/MESA_FORMAT_RG_SINT8/g s/MESA_FORMAT_RG_INT16\b/MESA_FORMAT_RG_SINT16/g s/MESA_FORMAT_RG_INT32\b/MESA_FORMAT_RG_SINT32/g s/MESA_FORMAT_RGB_INT8\b/MESA_FORMAT_RGB_SINT8/g s/MESA_FORMAT_RGB_INT16\b/MESA_FORMAT_RGB_SINT16/g s/MESA_FORMAT_RGB_INT32\b/MESA_FORMAT_RGB_SINT32/g s/MESA_FORMAT_RGBA_INT8\b/MESA_FORMAT_RGBA_SINT8/g s/MESA_FORMAT_RGBA_INT16\b/MESA_FORMAT_RGBA_SINT16/g s/MESA_FORMAT_RGBA_INT32\b/MESA_FORMAT_RGBA_SINT32/g s/\bMESA_FORMAT_RED_RGTC1\b/MESA_FORMAT_R_RGTC1_UNORM/g s/\bMESA_FORMAT_SIGNED_RED_RGTC1\b/MESA_FORMAT_R_RGTC1_SNORM/g s/\bMESA_FORMAT_RG_RGTC2\b/MESA_FORMAT_RG_RGTC2_UNORM/g s/\bMESA_FORMAT_SIGNED_RG_RGTC2\b/MESA_FORMAT_RG_RGTC2_SNORM/g s/\bMESA_FORMAT_L_LATC1\b/MESA_FORMAT_L_LATC1_UNORM/g s/\bMESA_FORMAT_SIGNED_L_LATC1\b/MESA_FORMAT_L_LATC1_SNORM/g s/\bMESA_FORMAT_LA_LATC2\b/MESA_FORMAT_LA_LATC2_UNORM/g s/\bMESA_FORMAT_SIGNED_LA_LATC2\b/MESA_FORMAT_LA_LATC2_SNORM/g	2014-01-27 14:34:04 -08:00
Mark Mueller	8b47b6bc32	mesa: Fix MESA_FORMAT names containing SIGNED Update comments. Replace format names containing SIGNED with SNORM appended w/decoration per the format name spec: s/MESA_FORMAT_SIGNED_R8\b/MESA_FORMAT_R_SNORM8/g s/MESA_FORMAT_SIGNED_RG88_REV\b/MESA_FORMAT_R8G8_SNORM/g s/MESA_FORMAT_SIGNED_RGBX8888\b/MESA_FORMAT_X8B8G8R8_SNORM/g s/MESA_FORMAT_SIGNED_RGBA8888\b/MESA_FORMAT_A8B8G8R8_SNORM/g s/MESA_FORMAT_SIGNED_RGBA8888_REV\b/MESA_FORMAT_R8G8B8A8_SNORM/g s/MESA_FORMAT_SIGNED_R16\b/MESA_FORMAT_R_SNORM16/g s/MESA_FORMAT_SIGNED_GR1616\b/MESA_FORMAT_R16G16_SNORM/g s/MESA_FORMAT_SIGNED_RGB_16\b/MESA_FORMAT_RGB_SNORM16/g s/MESA_FORMAT_SIGNED_RGBA_16\b/MESA_FORMAT_RGBA_SNORM16/g s/MESA_FORMAT_SIGNED_A8\b/MESA_FORMAT_A_SNORM8/g s/MESA_FORMAT_SIGNED_I8\b/MESA_FORMAT_I_SNORM8/g s/MESA_FORMAT_SIGNED_L8\b/MESA_FORMAT_L_SNORM8/g s/MESA_FORMAT_SIGNED_A16\b/MESA_FORMAT_A_SNORM16/g s/MESA_FORMAT_SIGNED_I16\b/MESA_FORMAT_I_SNORM16/g s/MESA_FORMAT_SIGNED_L16\b/MESA_FORMAT_L_SNORM16/g s/MESA_FORMAT_SIGNED_AL88\b/MESA_FORMAT_L8A8_SNORM/g s/MESA_FORMAT_SIGNED_RG88\b/MESA_FORMAT_G8R8_SNORM/g s/MESA_FORMAT_SIGNED_RG1616\b/MESA_FORMAT_G16R16_SNORM/g	2014-01-27 14:33:29 -08:00
Mark Mueller	2e02e195fe	mesa: Fix MESA_FORMAT names with ALPHA, INTENSITY, and LUMINANCE Compress the spelled-out color components ALPHA, INTENSITY, and LUMINANCE to A, I, and L: s/MESA_FORMAT_ALPHA_UINT8\b/MESA_FORMAT_A_UINT8/g s/MESA_FORMAT_ALPHA_UINT16\b/MESA_FORMAT_A_UINT16/g s/MESA_FORMAT_ALPHA_UINT32\b/MESA_FORMAT_A_UINT32/g s/MESA_FORMAT_ALPHA_INT32\b/MESA_FORMAT_A_SINT32/g s/MESA_FORMAT_ALPHA_INT16\b/MESA_FORMAT_A_SINT16/g s/MESA_FORMAT_ALPHA_INT8\b/MESA_FORMAT_A_SINT8/g s/MESA_FORMAT_INTENSITY_UINT8\b/MESA_FORMAT_I_UINT8/g s/MESA_FORMAT_INTENSITY_UINT16\b/MESA_FORMAT_I_UINT16/g s/MESA_FORMAT_INTENSITY_UINT32\b/MESA_FORMAT_I_UINT32/g s/MESA_FORMAT_INTENSITY_INT32\b/MESA_FORMAT_I_SINT32/g s/MESA_FORMAT_INTENSITY_INT16\b/MESA_FORMAT_I_SINT16/g s/MESA_FORMAT_INTENSITY_INT8\b/MESA_FORMAT_I_SINT8/g s/MESA_FORMAT_LUMINANCE_UINT8\b/MESA_FORMAT_L_UINT8/g s/MESA_FORMAT_LUMINANCE_UINT16\b/MESA_FORMAT_L_UINT16/g s/MESA_FORMAT_LUMINANCE_UINT32\b/MESA_FORMAT_L_UINT32/g s/MESA_FORMAT_LUMINANCE_INT32\b/MESA_FORMAT_L_SINT32/g s/MESA_FORMAT_LUMINANCE_INT16\b/MESA_FORMAT_L_SINT16/g s/MESA_FORMAT_LUMINANCE_INT8\b/MESA_FORMAT_L_SINT8/g s/MESA_FORMAT_LUMINANCE_ALPHA_UINT8\b/MESA_FORMAT_LA_UINT8/g s/MESA_FORMAT_LUMINANCE_ALPHA_UINT16\b/MESA_FORMAT_LA_UINT16/g s/MESA_FORMAT_LUMINANCE_ALPHA_UINT32\b/MESA_FORMAT_LA_UINT32/g s/MESA_FORMAT_LUMINANCE_ALPHA_INT32\b/MESA_FORMAT_LA_SINT32/g s/MESA_FORMAT_LUMINANCE_ALPHA_INT16\b/MESA_FORMAT_LA_SINT16/g s/MESA_FORMAT_LUMINANCE_ALPHA_INT8\b/MESA_FORMAT_LA_SINT8/g s/MESA_FORMAT_ALPHA_FLOAT16\b/MESA_FORMAT_A_FLOAT16/g s/MESA_FORMAT_ALPHA_FLOAT32\b/MESA_FORMAT_A_FLOAT32/g s/MESA_FORMAT_INTESITY_FLOAT16\b/MESA_FORMAT_I_FLOAT16/g s/MESA_FORMAT_INTESITY_FLOAT32\b/MESA_FORMAT_I_FLOAT32/g s/MESA_FORMAT_INTENSITY_FLOAT16\b/MESA_FORMAT_I_FLOAT16/g s/MESA_FORMAT_INTENSITY_FLOAT32\b/MESA_FORMAT_I_FLOAT32/g s/MESA_FORMAT_LUMINANCE_FLOAT16\b/MESA_FORMAT_L_FLOAT16/g s/MESA_FORMAT_LUMINANCE_FLOAT32\b/MESA_FORMAT_L_FLOAT32/g s/MESA_FORMAT_LUMINANCE_ALPHA_FLOAT16\b/MESA_FORMAT_LA_FLOAT16/g s/MESA_FORMAT_LUMINANCE_ALPHA_FLOAT32\b/MESA_FORMAT_LA_FLOAT32/g	2014-01-27 14:32:41 -08:00
Mark Mueller	eeed49f5f2	mesa: Change many Type P MESA_FORMATs to meet naming spec Conversion of Type P formats as follows (w/related comment fixes): s/MESA_FORMAT_RGB565\b/MESA_FORMAT_B5G6R5_UNORM/g s/MESA_FORMAT_RGB565_REV\b/MESA_FORMAT_R5G6B5_UNORM/g s/MESA_FORMAT_ARGB4444\b/MESA_FORMAT_B4G4R4A4_UNORM/g s/MESA_FORMAT_ARGB4444_REV\b/MESA_FORMAT_A4R4G4B4_UNORM/g s/MESA_FORMAT_RGBA5551\b/MESA_FORMAT_A1B5G5R5_UNORM/g s/MESA_FORMAT_XBGR8888_SNORM\b/MESA_FORMAT_R8G8B8X8_SNORM/g s/MESA_FORMAT_XBGR8888_SRGB\b/MESA_FORMAT_R8G8B8X8_SRGB/g s/MESA_FORMAT_ARGB1555\b/MESA_FORMAT_B5G5R5A1_UNORM/g s/MESA_FORMAT_ARGB1555_REV\b/MESA_FORMAT_A1R5G5B5_UNORM/g s/MESA_FORMAT_AL44\b/MESA_FORMAT_L4A4_UNORM/g s/MESA_FORMAT_RGB332\b/MESA_FORMAT_B2G3R3_UNORM/g s/MESA_FORMAT_ARGB2101010\b/MESA_FORMAT_B10G10R10A2_UNORM/g s/MESA_FORMAT_Z24_S8\b/MESA_FORMAT_S8_UINT_Z24_UNORM/g s/MESA_FORMAT_S8_Z24\b/MESA_FORMAT_Z24_UNORM_S8_UINT/g s/MESA_FORMAT_X8_Z24\b/MESA_FORMAT_Z24_UNORM_X8_UINT/g s/MESA_FORMAT_Z24_X8\b/MESA_FORMAT_X8Z24_UNORM/g s/MESA_FORMAT_RGB9_E5_FLOAT\b/MESA_FORMAT_R9G9B9E5_FLOAT/g s/MESA_FORMAT_R11_G11_B10_FLOAT\b/MESA_FORMAT_R11G11B10_FLOAT/g s/MESA_FORMAT_Z32_FLOAT_X24S8\b/MESA_FORMAT_Z32_FLOAT_S8X24_UINT/g s/MESA_FORMAT_ABGR2101010_UINT\b/MESA_FORMAT_R10G10B10A2_UINT/g s/MESA_FORMAT_XRGB4444_UNORM\b/MESA_FORMAT_B4G4R4X4_UNORM/g s/MESA_FORMAT_XRGB1555_UNORM\b/MESA_FORMAT_B5G5R5X1_UNORM/g s/MESA_FORMAT_XRGB2101010_UNORM\b/MESA_FORMAT_B10G10R10X2_UNORM/g s/MESA_FORMAT_AL88\b/MESA_FORMAT_L8A8_UNORM/g s/MESA_FORMAT_AL88_REV\b/MESA_FORMAT_A8L8_UNORM/g s/MESA_FORMAT_AL1616\b/MESA_FORMAT_L16A16_UNORM/g s/MESA_FORMAT_AL1616_REV\b/MESA_FORMAT_A16L16_UNORM/g s/MESA_FORMAT_RG88\b/MESA_FORMAT_G8R8_UNORM/g s/MESA_FORMAT_GR88\b/MESA_FORMAT_R8G8_UNORM/g s/MESA_FORMAT_GR1616\b/MESA_FORMAT_R16G16_UNORM/g s/MESA_FORMAT_RG1616\b/MESA_FORMAT_G16R16_UNORM/g s/MESA_FORMAT_SRGBA8\b/MESA_FORMAT_A8B8G8R8_SRGB/g s/MESA_FORMAT_SARGB8\b/MESA_FORMAT_B8G8R8A8_SRGB/g s/MESA_FORMAT_SLA8\b/MESA_FORMAT_L8A8_SRGB/g Conflicts: src/mesa/drivers/dri/i965/brw_surface_formats.c src/mesa/main/format_pack.c src/mesa/main/format_unpack.c src/mesa/main/formats.c src/mesa/main/texformat.c src/mesa/main/texstore.c	2014-01-27 14:31:55 -08:00
Mark Mueller	50a01d2aca	mesa: Change many Type A MESA_FORMATs to meet naming standard Update comments. Conversion of the following Type A formats: s/MESA_FORMAT_RGB888\b/MESA_FORMAT_BGR_UNORM8/g s/MESA_FORMAT_BGR888\b/MESA_FORMAT_RGB_UNORM8/g s/MESA_FORMAT_A8\b/MESA_FORMAT_A_UNORM8/g s/MESA_FORMAT_A16\b/MESA_FORMAT_A_UNORM16/g s/MESA_FORMAT_L8\b/MESA_FORMAT_L_UNORM8/g s/MESA_FORMAT_L16\b/MESA_FORMAT_L_UNORM16/g s/MESA_FORMAT_I8\b/MESA_FORMAT_I_UNORM8/g s/MESA_FORMAT_I16\b/MESA_FORMAT_I_UNORM16/g s/MESA_FORMAT_R8\b/MESA_FORMAT_R_UNORM8/g s/MESA_FORMAT_R16\b/MESA_FORMAT_R_UNORM16/g s/MESA_FORMAT_Z16\b/MESA_FORMAT_Z_UNORM16/g s/MESA_FORMAT_Z32\b/MESA_FORMAT_Z_UNORM32/g s/MESA_FORMAT_S8\b/MESA_FORMAT_S_UINT8/g s/MESA_FORMAT_SRGB8\b/MESA_FORMAT_BGR_SRGB8/g s/MESA_FORMAT_RGBA_16\b/MESA_FORMAT_RGBA_UNORM16/g s/MESA_FORMAT_SL8\b/MESA_FORMAT_L_SRGB8/g s/MESA_FORMAT_Z32_FLOAT\b/MESA_FORMAT_Z_FLOAT32/g s/MESA_FORMAT_XBGR16161616_UNORM\b/MESA_FORMAT_RGBX_UNORM16/g s/MESA_FORMAT_XBGR16161616_SNORM\b/MESA_FORMAT_RGBX_SNORM16/g s/MESA_FORMAT_XBGR16161616_FLOAT\b/MESA_FORMAT_RGBX_FLOAT16/g s/MESA_FORMAT_XBGR16161616_UINT\b/MESA_FORMAT_RGBX_UINT16/g s/MESA_FORMAT_XBGR16161616_SINT\b/MESA_FORMAT_RGBX_SINT16/g s/MESA_FORMAT_XBGR32323232_FLOAT\b/MESA_FORMAT_RGBX_FLOAT32/g s/MESA_FORMAT_XBGR32323232_UINT\b/MESA_FORMAT_RGBX_UINT32/g s/MESA_FORMAT_XBGR32323232_SINT\b/MESA_FORMAT_RGBX_SINT32/g s/MESA_FORMAT_XBGR8888_UINT\b/MESA_FORMAT_RGBX_UINT8/g s/MESA_FORMAT_XBGR8888_SINT\b/MESA_FORMAT_RGBX_SINT8/g	2014-01-27 14:30:50 -08:00
Mark Mueller	ef145ba4de	mesa: Rename 4 color component unsigned byte MESA_FORMATs Change all 4 color component unsigned byte formats to meet spec for P Type formats: s/MESA_FORMAT_RGBA8888\b/MESA_FORMAT_A8B8G8R8_UNORM/g s/MESA_FORMAT_RGBA8888_REV\b/MESA_FORMAT_R8G8B8A8_UNORM/g s/MESA_FORMAT_ARGB8888\b/MESA_FORMAT_B8G8R8A8_UNORM/g s/MESA_FORMAT_ARGB8888_REV\b/MESA_FORMAT_A8R8G8B8_UNORM/g s/MESA_FORMAT_RGBX8888\b/MESA_FORMAT_X8B8G8R8_UNORM/g s/MESA_FORMAT_RGBX8888_REV\b/MESA_FORMAT_R8G8B8X8_UNORM/g s/MESA_FORMAT_XRGB8888\b/MESA_FORMAT_B8G8R8X8_UNORM/g s/MESA_FORMAT_XRGB8888_REV\b/MESA_FORMAT_X8R8G8B8_UNORM/g	2014-01-27 14:29:13 -08:00
Mark Mueller	71fe943716	mesa: change gl_format to mesa_format s/\bgl_format\b/mesa_format/g. Use better name for Mesa Formats enum	2014-01-27 14:28:46 -08:00
Ian Romanick	bc0ed68275	docs: Update GL3.txt due to recent work v2: Note that Fredrik Höglund is working on GL_ARB_multi_bind, not Maxence Le Doré. Suggested by Matt. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-27 14:35:19 -07:00
Ian Romanick	6901c278ca	glcpp: Make sure GL_AMD_shader_trinary_minmax is defined The define was only available if gl_extensions::AMD_shader_trinary_minmax was set, but no driver set it. Since the extension is advertised by default, remove that field too. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: Maxence Le Doré <maxence.ledore@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-27 14:28:24 -07:00
Ian Romanick	764be9f9e8	mesa: Clean up bad code formatting left from previous commit Also s/_EXT// on enums that are now part of core. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 14:21:43 -07:00
Ian Romanick	a6729731af	mesa: GL_EXT_framebuffer_blit is not optional Every driver supports it. All current and future Gallium drivers always support it, and all existing classic drivers support it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 14:21:43 -07:00
Ian Romanick	71cc510ef6	radeon: Enable GL_EXT_framebuffer_blit The dd_function_table::BlitFramebuffer is already initialized to _mesa_meta_BlitFramebuffer, so it should just work. Tested on a Radeon 7500 (OpenGL renderer string: Mesa DRI R100 (RV200 5157) TCL DRI2). I couldn't do a full piglit run because it would tank the system with or without this patch. I just ran all the blit tests (-t blit to piglit-run.py). Only fbo-sys-sub-blit failed. All of the other tests that weren't skipped (i.e., all the multisample and sRGB tests skip) passed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 14:21:43 -07:00
Ian Romanick	bed51a4858	r200: Enable GL_EXT_framebuffer_blit The dd_function_table::BlitFramebuffer is already initialized to _mesa_meta_BlitFramebuffer, so it should just work. Tested on a FireGL 8800 (OpenGL renderer string: Mesa DRI R200 (R200 5148) TCL DRI). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 14:21:43 -07:00
Ian Romanick	33214679bb	radeon / r200: Pass the API into _mesa_initialize_context Otherwise an application that requested an OpenGL ES 1.x context would actually get a desktop OpenGL context. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 14:21:43 -07:00
Ian Romanick	af0b34783e	mesa: Validate internalFormat with target in glTexStorage paths Fixes the glTexStorage3D failure in ext_packed_depth_stencil-depth-stencil-texture and oes_packed_depth_stencil-depth-stencil-texture_gles2. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 14:21:43 -07:00
Ian Romanick	421b5958eb	mesa: Refactor internalFormat / target checks to a separate function We need almost identical code in the glTexStorage path. v2: Fix typo in a comment noticed by Topi. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 14:21:42 -07:00
Ian Romanick	88db6ad7db	mesa: Generate the correct error for a depth format with a 3D texture All versions of the OpenGL spec are quite clear that GL_INVALID_OPERATION should be generated. I added a quotation from the 3.3 core profile spec. Fixes the glTexImage3D subcases of ext_packed_depth_stencil-depth-stencil-texture and oes_packed_depth_stencil-depth-stencil-texture_gles2. The same subtests of oes_packed_depth_stencil-depth-stencil-texture_gles1 fail, but they fail with a different wrong error code. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-27 14:21:42 -07:00
Matt Turner	3f3aafbfee	glx: Update glxext.h to revision 24777. It readds the GLXContextID typedef, but under #ifndef GLX_VERSION_1_3 and glx.h already defines GLX_VERSION_1_3. Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=11454 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-27 09:57:12 -08:00
Emil Velikov	a6031a82f9	loader: Add missing \n on message printing Cover both loader and glx/dri_glx Drop \n from the default loader logger Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-27 09:37:29 -08:00
Eric Anholt	867d7c0e10	dri: Reuse dri_message to implement our other message handlers. Reviewed-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-01-27 09:37:29 -08:00
Eric Anholt	4a8da40fc0	dri: Fix the logger error message handling. Since the loader changes, there has been a compiler warning that the prototype didn't match. It turns out that if a loader error message was ever thrown, you'd segfault because of trying to use the warning level as a format string. Reviewed-by: Keith Packard <keithp@keithp.com> Tested-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-01-27 09:37:29 -08:00
Eric Anholt	7bd95ec437	dri2: Trust our own driver name lookup over the server's. This allows Mesa to choose to rename driver .sos (or split drivers), without needing a flag day with the corresponding 2D driver. v2: Undo the loader-only-for-dri3 change. Reviewed-by: Keith Packard <keithp@keithp.com> [v1] Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> [v1]	2014-01-27 09:37:10 -08:00
Eric Anholt	be7a6976a8	dri2: Open the fd before loading the driver. I want to stop trusting the server for the driver name, and instead decide on our own based on the fd, so I needed this code motion. Reviewed-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-01-27 09:36:24 -08:00
Eric Anholt	378e7ad26f	dri3: Fix two little memory leaks. Noticed when valgrinding an unrelated bug. Reviewed-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-01-27 09:36:24 -08:00
Eric Anholt	4556c73470	loader: Use dlsym to get our udev symbols instead of explicit linking. Steam links against libudev.so.0, while we're linking against libudev.so.1. The result is that the symbol names (which are the same in the two libraries) end up conflicting, and some of the usage of .so.1 calls the .so.0 bits, which have different internal structures, and segfaults happen. By using a dlopen() with RTLD_LOCAL, we can explicitly look for the symbols we want, while they get the symbols they want. Reviewed-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com> Tested-by: Mike Lothian <mike@fireburn.co.uk>	2014-01-27 09:36:24 -08:00
Tom Stellard	d51dbe048a	r600g/compute: Emit DEALLOC_STATE on cayman after dispatching a compute shader. This is necessary to prevent the next SURFACE_SYNC packet from hanging the GPU. https://bugs.freedesktop.org/show_bug.cgi?id=73418 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>	2014-01-27 11:09:15 -05:00
Ilia Mirkin	3518606c14	docs: sync up nv50/nvc0 status on GL4.x extensions Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	59e334194b	docs: update GL3.txt, relnotes to reflect current nv50/nvc0 status Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	839bd3cff7	nv50, nvc0: update reported glsl version to 330 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Christoph Bumiller	3efed4cd05	mesa/st: expose ARB_texture_rgb10_a2ui if R10G10B10A2_UINT is supported v2 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-27 16:40:43 +01:00
Christoph Bumiller	c7b14ba23f	nv50: add more RGB10A2 formats	2014-01-27 16:40:43 +01:00
Christoph Bumiller	f3bd2bc7b2	st/mesa: fix GS varyings for PIPE_CAP_TGSI_TEXCOORD	2014-01-27 16:40:43 +01:00
Ilia Mirkin	dc8da4c29b	nv50: enable seamless cube maps on all hw Some of the hardware support is missing. The NVIDIA-provided driver, which claims seamless cube map support, fails the relevant tests as well. As this is the last extension before we can have OpenGL 3.2, doing this allows us to expose geometry shaders without doing the additional work involved in supporting ARB_geometry_shader4. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	b9b7cfbabf	nv50: report glsl 1.50 now that gp tests pass Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	3bd40073b9	nv50: add support for texelFetch'ing MS textures, ARB_texture_multisample Creates two areas in the AUX constbuf: - Sample offsets for MS textures - Per-texture MS settings When executing a texelFetch with a MS sampler, looks up that texture's settings and adjusts the parameters given to the texfetch instruction. With this change, all the ARB_texture_multisample piglits pass, so turn on PIPE_CAP_TEXTURE_MULTISAMPLE. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	a6cf950ba2	nv50: copy nvc0's get_sample_position implementation Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	b87f5abd21	nv50: add comments about CB_AUX contents Updates a few inconsistencies as well, like the size of the buffer, location of the runout, etc. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	250e7c835e	nvc0: don't forget to also clear additional layers Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	e3247355cc	nv50: don't forget to also clear additional layers Fixes most of the tests/spec/gl-3.2/layered-rendering/* piglits. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	d98b85b507	nv50: allocate an extra code bo to avoid dmesg spam Each code BO is a heap that allocates at the end first, and so GPs are allocated at the very end of the allocated space. When executing, we see PAGE_NOT_PRESENT errors for the next page. Just over-allocate to make sure that there's something there. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:43 +01:00
Ilia Mirkin	58589f6c6d	nv50: GP_REG_ALLOC_RESULT must be positive Set max_out to 1 when there are no outputs. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	006095b38a	nv50: VP_RESULT_MAP_SIZE has to be positive Make sure that we never try to use a 0-sized map. This can happen when using a gp, so add a dummy mapping when computing vp_gp_mapping in that case. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	c4adbd5a57	nv50: enable primitive id generation when it is an FP input without GP Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	70a07ac352	nv50: handle gl_Layer writes in GP Marks gl_Layer as only having one component, and makes sure to keep track of where it is and emit it in the output map, since it is not an input to the FP. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	7c624148a6	nv50: properly set the PRIMITIVE_ID enable flag when it is a gp input. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	6f3219a8f3	nv50/ir: add support for gl_PrimitiveIDIn Note that the primitive id is stored in a[0x18], while usually the geometry instructions are of the form a[$a1 + 0x4] which gets mapped to p[] space. We need to avoid the change from a[] to p[] here, so it's keyed on whether the access is indirect or not. Note that there's also a use-case for accessing e.g. a[$r1], however that's not supported for now. (Could be added by checking the register file of the indirect parameter.) Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	f77069419a	nv50/ir: fix support for shader input + immediate in gp This only works for up to $a3, hopefully we won't go that high. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	45b7f1701e	nv50/ir: disallow shader input + cbuf in same instruction in gp Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	42dc414cc6	nv50/ir: disallow predicates on emit/restart ops	2014-01-27 16:40:42 +01:00
Ilia Mirkin	20929963d3	nv50: allow vert_count to be >255 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Bryan Cain	02b317a0d6	nv50: add support for geometry shaders Layer output probably doesn't work yet, but other than that everything seems to be working. Signed-off-by: Bryan Cain <bryancain3@gmail.com> [calim: fix up minor bugs, code formatting] Signed-off-by: Christoph Bumiller <e0425955@student.tuwien.ac.at> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Bryan Cain	b3f82e1a63	nv50/ir: delay calculation of indirect addresses Instead of emitting an SHL 4 to an address register on the TGSI ARL and UARL instructions, emit the shift when the loaded address is actually used. This is necessary because input vertex and attribute indices in geometry shaders on nv50 need to be shifted left by 2 instead of 4. Signed-off-by: Bryan Cain <bryancain3@gmail.com> [calim: various updates to the indirect address logic] Signed-off-by: Christoph Bumiller <e0425955@student.tuwien.ac.at> [imirkin: remove OP_MAD change that calim made, add OP_RESTART handling same as OP_EMIT for code flow analysis] Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Christoph Bumiller	67250acbab	nv50/ir: fix PFETCH and add RDSV to get VSTRIDE for GPs	2014-01-27 16:40:42 +01:00
Ilia Mirkin	2689b59cab	nv50/ir: txg not available on nvaa/nvac Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	e05de038bf	nv50, nvc0: only clear out the buffers that we were asked to clear Fixes fbo-drawbuffers-none glClearBuffer piglit test. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	c75eeab609	nv50, nvc0: clear out RT on a null cbuf This is needed since commit `9baa45f78b` (st/mesa: bind NULL colorbuffers as specified by glDrawBuffers). This implementation is highly based on a larger commit by Christoph Bumiller <e0425955@student.tuwien.ac.at> in his gallium-nine branch. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	3f264e16e2	nv50: don't leak heap on tls alloc failure Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	18d97a8df7	nouveau/codegen: set dType to S32 for OP_NEG U32 It doesn't make sense to do an OP_NEG from U32 to U32. This was manifested on nv50 in glsl-fs-atan-3 which was generating a UMAD TEMP[0].x, TEMP[0].xxxx, -TEMP[5].xxxx, TEMP[0].xxxx instruction. (For some reason, nvc0 causes a different shader to be generated.) This led to a cvt neg u32 $r1 u32 $r1 which did not yield the desired result. This changes the final output to cvt neg s32 $r1 u32 $r1 which produces the desired output and the piglit test passes. My assumption is that this is also what we want on nvc0, but could not test as there was no suitable shader that generated the problem instruction. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	45b64e52f4	util/u_vbuf: correct map offset calculation for crazy offsets When the min_index is very large (or very negative), the multiplication can overflow 32 bits and result in an incorrect map pointer modification. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-27 16:40:42 +01:00
Ilia Mirkin	3de97ce920	translate: deal with size overflows by casting to ptrdiff_t This was discovered as a result of the draw-elements-base-vertex-neg piglit test, which passes very negative offsets in, followed up by large indices. The nouveau code correctly adjusts the pointer, but the translate code needs to do the proper inverse correction. Similarly fix up the SSE code to do a 64-bit multiply to compute the proper offset. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-27 16:40:42 +01:00
Emil Velikov	4dd445f1cf	gallium/rtasm: handle mmap failures appropriately For a variety of reasons (selinux and pax to name a few), mmap can fail, and with the current code this will result in a crash in the driver, if not worse. This has been the case since the inception of the gallium copy of rtasm. Cc: 9.1 9.2 10.0 <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73473 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-01-27 13:24:51 +00:00
Alexander von Gluck IV	e5e4120723	haiku: change atomic int to non-volatile * Our atomic calls changed recently and no longer want atomic int pointers to be volatile * Spellcheck	2014-01-26 18:56:05 -06:00
Kenneth Graunke	07149f0252	i965: Don't store qpitch / 4 as mt->qpitch for compressed surfaces. Broadwell requires software to specify QPitch in a bunch of packets, so we decided to store it in the miptree. However, when I did that refactoring, I missed a subtlety: the hardware expects QPitch to be "in units of rows in the uncompressed surface". This is the value we originally compute. However, for compressed surfaces, we then divided it by 4 (the block height), to obtain the physical layout. This is no longer the QPitch Broadwell expects. So, store the original undivided value in mt->qpitch, but continue to use the divided value in brw_miptree_layout_texture_array(). For non-Broadwell platforms, this should have no impact at all. Helps fix Piglit's "getteximage-targets S3TC CUBE" test on Broadwell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-25 19:20:17 -08:00
Vinson Lee	a487b4d0e3	c11: Do not use pthread_mutex_timedlock on NetBSD. This patch fixes the NetBSD build. NetBSD does not have pthread_mutex_timedlock. CC glapi_dispatch.lo threads_posix.h: In function 'mtx_timedlock': threads_posix.h:216:5: error: implicit declaration of function 'pthread_mutex_timedlock' Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-01-24 18:20:42 -08:00
Kenneth Graunke	6709f0549f	glsl: Simplify built-in generator functions for min3/max3/mid3. The types of all three parameters are identical, so we don't need to specify the type three times. The predicate is always identical too, so we don't need to make it a parameter, either. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-24 14:18:15 -08:00
Kenneth Graunke	44a86e2b4f	glsl: Fix chained assignments of vector channels. Simple shaders such as: void splat(vec2 v, float f) { v[0] = v[1] = f; } failed to compile with the following error: error: value of type vec2 cannot be assigned to variable of type float First, we would process v[1] = f, and transform: LHS: (expression float vector_extract (var_ref v) (constant int (1))) RHS: (var_ref f) into: LHS: (var_ref v) RHS: (expression vec2 vector_insert (var_ref v) (constant int (1)) (var_ref f)) Note that the LHS type is now vec2, not a float. This is surprising, but not the real problem. After emitting assignments, this ultimately becomes: (declare (temporary) vec2 assignment_tmp) (assign (xy) (var_ref assignment_tmp) (expression vec2 vector_insert (var_ref v) (constant int (1)) (var_ref f))) (assign (xy) (var_ref v) (var_ref assignment_tmp)) We would then return (var_ref assignment_tmp) as the rvalue, which has the wrong type---it should be float, but is instead a vec2. To fix this, we simply return (vector_extract (var_ref assignment_tmp) <the appropriate channel>) to pull out the desired float value. Fixes Piglit's chained-assignment-with-vector-constant-index.vert and chained-assignment-with-vector-dynamic-index.vert tests. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74026 Reported-by: Dan Ginsburg <dang@valvesoftware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-24 14:18:15 -08:00
Kenneth Graunke	6c158e110c	glsl: Rename "expr" to "lhs_expr" in vector_extract munging code. When processing assignments, we have both an LHS and RHS. At a glance, "lhs_expr" clearly refers to the LHS, while a generic name like "expr" is ambiguous. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-24 14:18:15 -08:00
Paul Berry	eab32bb8f1	Update .gitignore for Catalan translations build artifacts Causes git to ignore the new build artifacts introduced by commit `d5e5367e89` (driconf: Add Catalan translations).	2014-01-24 13:45:16 -08:00
Ian Romanick	c11d76c51a	mesa: Increment the list pointer while freeing instruction data Since the list pointer was never incremented when an OPCODE_PIXEL_MAP opcode was encountered, the data for the instruction would get freed over and over and over... resulting in a crash. Fixes gl-1.0-beginend-coverage. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72214 Reviewed-by: Brian Paul <brianp@vmware.com> Cc: Lu Ha <huax.lu@intel.com>	2014-01-24 13:43:10 -08:00
Brian Paul	a44554870e	svga: rename "tex_usage" to "bindings", add comments Trivial.	2014-01-24 13:33:29 -07:00
Brian Paul	e2dd240e32	st/mesa: add a simple sanity check assertion in st_validate_attachment() Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-24 13:33:13 -07:00
Paul Berry	43e77215b1	i965/gen7: Use the correct program when uploading transform feedback state. Transform feedback may come from either the geometry shader or the vertex shader, so we can't use ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX] to find the current post-link transform feedback information. Fortunately we can use ctx->TransformFeedback.CurrentObject->shader_program. Cc: 10.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-23 13:41:36 -08:00
Paul Berry	e190709119	mesa: Ensure that transform feedback refers to the correct program. Previous to this patch, the _mesa_{Begin,Resume}TransformFeedback functions were using ctx->Shader.CurrentProgram[MESA_SHADER_VERTEX] to find the program that would be the source of transform feedback data. This isn't correct--if there's a geometry shader present it should be ctx->Shader.CurrentProgram[MESA_SHADER_GEOMETRY]. (These might be different if separate shader objects are in use). This patch creates a function get_xfb_source(), which figures out the correct program to use based on GL state, and updates _mesa_{Begin,Resume}TransformFeedback to call it. get_xfb_source() is written in terms of the gl_shader_stage enum, so it should not need modification when we add tessellation shaders in the future. It also creates a new driver flag, NewTransformFeedbackProg, which is flagged whenever this program changes. To reduce future confusion, this patch also rewords some comments and error message text to avoid referring to vertex shaders. Cc: 10.0 <mesa-stable@lists.freedesktop.org> v2: make the for loop in get_xfb_source() clearer. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-23 13:41:01 -08:00
Paul Berry	9cee3ff562	i965: Remove *_generator::shader field; use prog field instead. The "shader" field in fs_generator, vec4_generator, and gen8_generator was only used for one purpose; to figure out if we were compiling an assembly program or a GLSL shader (shader is NULL for assembly programs). And it wasn't being used properly: in vec4 shaders we were always initializing it based on prog->_LinkedShaders[MESA_SHADER_FRAGMENT], regardless of whether we were compiling a geometry shader or a vertex shader. This patch simplifies things by using the "prog" field instead; this is also NULL for assembly programs. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-23 13:40:55 -08:00
Matt Turner	00c672086c	gles3: Update gl3.h to revision 24614. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-23 11:33:22 -08:00
Matt Turner	d519ebb34c	gles2: Update gl2ext.h to revision 24614. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-23 11:33:22 -08:00
Matt Turner	117d8ce27b	gles2: Update gl2.h to revision 24614. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-23 11:33:22 -08:00
Matt Turner	66ef8feb4d	glcpp: Define GL_EXT_shader_integer_mix in both GL and ES. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-23 11:33:22 -08:00
Matt Turner	73c3c7e37d	glcpp: Remove unused gl_api bits. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-23 11:33:22 -08:00
Matt Turner	b2d1c579bb	glcpp: Set extension defines after resolving the GLSL version. Instead of defining preprocessor macros in glcpp_parser_create based on the GL API, wait until the shader version has been resolved. Doing this allows us to correctly set (and not set) preprocessor macros for extensions allowed by the API but not the shader, as in the case of ARB_ES3_compatibility. The shader version has been resolved when the preprocessor encounters the first preprocessor token, since the GLSL spec says "The #version directive must occur in a shader before anything else, except for comments and white space." Specifically, if a #version token is found the version is known explicitly, and if any other preprocessor token is found then the GLSL version is implicitly 1.10. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71630 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-23 11:33:22 -08:00
Anuj Phogat	c907595ba7	glsl: Disable ARB_texture_rectangle in shader version 100. OpenGL with ARB_ES2_compatibility allows shaders that specify #version 100. This fixes the Khronos OpenGL test(Texture_Rectangle_Samplers_frag.test) failure. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-23 11:33:22 -08:00
Matt Turner	e0648015e9	glsl: Mark GLSL 4.40 as a known version. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-23 11:33:22 -08:00
Brian Paul	f7c118ffbf	st/mesa: fix glReadBuffer(GL_NONE) segfault Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73956 Cc: 10.0 <mesa-stable@lists.freedesktop.org> Tested-by: Ahmed Allam <ahmabdabd@hotmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-23 11:08:40 -07:00
Brian Paul	349efdbba1	svga: fix PS output register setup regression Fixes glean fragProg1 regression caused by commit `b9f68d927e` (implement TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS). This bug only appears when the fragment shader emits fragment.Z before color outputs. The bug was caused by confusion between register indexes and semantic indexes. Also added some comments to better explain register indexing. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-01-23 11:08:40 -07:00
Emil Velikov	c6b6916b9a	glx: link loader util lib only when building with dri3 Otherwise we pull libudev as a dependency and crash games/programs that ship their own version of libudev. Either way we should link the loader lib only when needed. This fixes a regression caused by commit `eac776cf77` Author: Emil Velikov <emil.l.velikov@gmail.com> Date: Sat Jan 11 02:24:43 2014 +0000 glx: use the loader util lib Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73854 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-01-23 18:04:22 +00:00
Alex Henrie	d5e5367e89	driconf: Add Catalan translations See the instructions in Makefile.am under "Adding new translations". Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-23 09:10:19 -08:00
Alex Henrie	84529a5ddb	driconf: Correct and update Spanish translations Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-23 09:10:18 -08:00
Alex Henrie	822b4315b7	driconf: Synchronize po files See the instructions in Makefile.am under "Updating existing translations". Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-23 09:10:18 -08:00
Ian Romanick	e4fcae0755	mesa: Set gl_constants::MinMapBufferAlignment Leaving it set to zero isn't really correct since every allocation has at least an alignment of 1 byte. It also caused a problem in the i965 driver after I removed the MAX(64, ...) from the alignment calculation. That's what I get for changing a patch without retesting it. :( Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73907 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Lu Hua <huax.lu@intel.com>	2014-01-23 08:50:58 -08:00
Ian Romanick	7a0f26dec9	radeon / r200: Eliminate BEGIN_BATCH_NO_AUTOSTATE Sed job: grep -lr BEGIN_BATCH_NO_AUTOSTATE src/mesa/drivers/dri/ | while read f; do cat $f | sed 's/BEGIN_BATCH_NO_AUTOSTATE/BEGIN_BATCH/g' > x; mv x $f; done Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:50:58 -08:00
Ian Romanick	2d5fd20690	radeon / r200: Remove unused 'dostate' parameter This parameter hasn't been used since January 2010 (commit `29e02c7`). Fixes the following warning in both radeon and r200: radeon_common.c: In function 'r200_rcommonBeginBatch': radeon_common.c:762:14: warning: unused parameter 'dostate' [-Wunused-parameter] Note that now BEGIN_BATCH and BEGIN_BATCH_NO_AUTOSTATE are identical. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:50:58 -08:00
Ian Romanick	5b4c12972c	radeon / r200: Fix 'empty body' warning radeon_common.c: In function 'radeon_draw_buffer': radeon_common.c:237:3: warning: suggest braces around empty body in an 'if' statement [-Wempty-body] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:50:58 -08:00
Ian Romanick	b790bed21e	radeon / r200: Fix incompatible pointer type warning When parameters were removed from dd_function_table::Viewport (commit `065bd6ff`), radeon_viewport (in both radeon and r200) started generating a warning. radeon_common.c: In function 'r200_radeon_viewport': radeon_common.c:415:15: warning: assignment from incompatible pointer type [enabled by default] radeon_common.c:419:23: warning: assignment from incompatible pointer type [enabled by default] I didn't notice this initially, and it's harmless because the function is never called through the incorrectly typed pointer. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:50:58 -08:00
José Fonseca	840154dc50	draw: Save original driver functions earlier. Otherwise they will be NULL when stage destroy is invoked prematurely, (i.e, on out of memory). Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-23 15:49:32 +00:00
Brian Paul	1a44180578	mesa: whitespace fixes in glformats.c Reindent _mesa_get_nongeneric_internalformat() to match other functions. Remove extraneous empty lines in _mesa_get_linear_internalformat(). Trivial.	2014-01-23 08:31:21 -07:00
Brian Paul	a15eb19676	svga: minor code movement in svga_tgsi_insn.c Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-23 08:23:01 -07:00
Brian Paul	f12954e1cb	svga: whitespace, formatting fixes in svga_state_framebuffer.c Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-23 08:23:01 -07:00
Brian Paul	56b876ecd0	svga: simplify common immediate value construction Use some new helper functions to make the code much more readable. And fix wrong value for XPD's w result. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-23 08:23:01 -07:00
Brian Paul	023020d740	svga: add comments, etc to svga_tgsi_insn.c code To make things a little easier to understand for newcomers. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-23 08:23:01 -07:00
Brian Paul	fe043ae554	svga: assorted cleanups in shader code Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-23 08:23:00 -07:00
Brian Paul	2a30379dcd	svga: rename shader_result -> variant To be more consistent with other parts of gallium. Plus, update/add various comments. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-23 08:22:58 -07:00
Brian Paul	35ddd2cc5d	mesa: rename unbind_texobj_from_imgunits() ... to unbind_texobj_from_image_units() and change a local var's type to silence an MSVC warning. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:13:14 -07:00
Brian Paul	1f2007429e	glsl: silence a couple warnings in find_active_atomic_counters() Silence uninitialized variable 'id' warning. Silence unused 'found' warning. Only seen in release builds. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:13:14 -07:00
Brian Paul	5306ee736e	mesa: initialize "is_layered" variable to silence warning Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:13:14 -07:00
Brian Paul	b98fa6fe6f	mesa: fix/add some cases in _mesa_get_linear_internalformat() In some cases we were converting generic formats to sized formats and vice versa. The point is to simply convert sRGB formats to corresponding linear formats. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:13:13 -07:00
Brian Paul	91567b83bf	mesa: add missing ETC2_SRGB cases in formats.c In the _mesa_get_format_color_encoding() and _mesa_get_srgb_format_linear() functions. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-23 08:13:13 -07:00
José Fonseca	ab6f9fccd4	radeon: More missing stdio.h includes.	2014-01-23 14:20:20 +00:00
José Fonseca	fa75cc4b89	os/os_thread: Revert pipe_barrier pre-processing logic. Whitelist platforms instead of blacklisting, as several pthread implementations are missing pthread_barrier_t, in particular MacOSX.	2014-01-23 13:44:10 +00:00
José Fonseca	cd978ce26a	c11: Fix missing pthread_mutex_timedlock declaration warnings on MacOSX.	2014-01-23 13:42:38 +00:00
José Fonseca	6b6fdb6aa9	radeon: Adding missing stdio.h include. Became apparent with the C11 thread changes. Unfortunately I didn't have all dependencies to build the driver, and only noticed this issue on build server.	2014-01-23 13:23:43 +00:00
José Fonseca	ab5dc45b2f	mapi: Prevent cast from pointer to integer of different size. On Windows64.	2014-01-23 13:21:52 +00:00
José Fonseca	799f30f385	c11: Update docs/license.html and include verbatim copy of Boost license.	2014-01-23 12:55:55 +00:00
José Fonseca	f298720cbc	egl: Use C11 thread abstractions. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-23 12:55:55 +00:00
José Fonseca	54876afcf0	mapi: Use C11 thread abstractions. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-23 12:55:55 +00:00
José Fonseca	fd33a6bcd7	gallium: Use C11 thread abstractions. Note that PIPE_ROUTINE now returns an int. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-23 12:55:55 +00:00
José Fonseca	ecaa81bd96	c11: Import threads.h emulation library. Implementation is based on https://gist.github.com/2223710 with the following modifications: - inline implementation - retain XP compatibility - add temporary hack for static mutex initializers (as they are not part of the stack but still widely used internally) - make TIME_UTC a conditional macro (some system headers already define it, so this prevents conflict) - respect HAVE_PTHREAD macro Reviewed-by: Brian Paul <brianp@vmware.com> Acked-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-23 12:55:55 +00:00
José Fonseca	349f0a94ae	os: Remove pipe_static_condvar. Never used. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-23 12:55:55 +00:00
Timothy Arceri	815e064fb6	docs: Mark ARB_arrays_of_arrays as started Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 23:37:37 +11:00
Timothy Arceri	b0c64d3cc6	glsl: remove remaining is_array variables Previously the reason we needed is_array was because we used array_size == NULL to represent both non-arrays and unsized arrays. Now that we use a non-NULL array_specifier to represent an unsized array, is_array is redundant. Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 23:37:37 +11:00
Timothy Arceri	61a5846099	glsl: create type name for arrays of arrays We need to insert outermost dimensions in the correct spot otherwise the dimension order will be backwards Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 23:37:36 +11:00
Timothy Arceri	3d492f19f6	glsl: Allow arrays of arrays as input to vertex shader Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 23:37:36 +11:00
Timothy Arceri	3dc932d450	glsl: only call mark_max_array if we are assigning an array This change does not help fix or prevent any bugs; it just seems reasonable to do. Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 23:37:36 +11:00
Timothy Arceri	bfb48750f0	glsl: Add ARB_arrays_of_arrays support to yacc definition and ast Adds array specifier object to hold array information Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 23:31:10 +11:00
Timothy Arceri	72288e0c7b	mesa: Add ARB_arrays_of_arrays Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 23:15:29 +11:00
Topi Pohjolainen	bda88f121b	i965/blorp: switch eu-emitter to use FS IR and fs_generator No regressions on IVB (piglit quick + unit tests). v2 (Paul): - no need to patch the unit tests anymore. Original logic was altered and unit tests updated to match the fs-generator - lrp emission moves from the blorp compiler core into the emitter here (previously there was a separate refactoring patch which is not really needed anymore as the lrp logic got refactored when the original lrp logic got fixed). - pass 'BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX' to the generator in fs_inst::target instead of hardcoding it Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:47:12 +02:00
Topi Pohjolainen	8f3e5363ad	i965/fs: add support for BRW_OPCODE_AVG in fs_generator Needed for compiling blorp blit programs. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:47:12 +02:00
Topi Pohjolainen	9927d7ae68	i965/fs: introduce blorp specific rt-write for fs_generator The compiler for blorp programs likes to emit instructions for the message construction itself meaning that the generator needs to skip any such when blorp programs are translated for the hw. In addition, the binding table control is special for blorp programs and the generator does not need to update the binding tables associated with the compiler bookkeeping (this in fact gets thrown away as the blorp compiler sets the program data in its own way). v2 (Paul): do not hardcode the binding table index but use fs_inst::target instead. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:46:57 +02:00
Topi Pohjolainen	85fc724df5	i965/fs: allow unit tests to dump the final patched assembly Unit tests comparing generated blorp programs to known good need to have the dump in designated file instead of in default standard output. The comparison also expects the jump counters of if-else-instructions to be correctly set and hence the dump needs to be taken _after_ 'patch_IF_ELSE()' is run (the default dump of the fs_generator does this before). v2 (Paul): dropped the redundant 'dump_enabled' argument Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:57 +02:00
Topi Pohjolainen	757b4cf011	i965/blorp: wrap brw_IF/ELSE/ENDIF() into eu-emitter v2 (Paul): renamed emit_if() to emit_cmp_if() Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:53 +02:00
Topi Pohjolainen	8c0030678a	i965/blorp: wrap RNDD (/brw_RNDD(&func, /emit_rndd(/) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:51 +02:00
Topi Pohjolainen	44524cb42f	i965/blorp: wrap FRC (/brw_FRC(&func, /emit_frc(/) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:49 +02:00
Topi Pohjolainen	f9d875926e	i965/blorp: wrap MUL (/brw_MUL(&func, /emit_mul(/) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:47 +02:00
Topi Pohjolainen	bbab8068d2	i965/blorp: wrap OR (/brw_OR(&func, /emit_or(/) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:44 +02:00
Topi Pohjolainen	de6ea2fe25	i965/blorp: wrap SHL (/brw_SHL(&func, /emit_shl(/) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:42 +02:00
Topi Pohjolainen	d256a5f843	i965/blorp: wrap SHR (/brw_SHR(&func, /emit_shr(/) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:39 +02:00
Topi Pohjolainen	0df1f5ce4e	i965/blorp: wrap ADD (/brw_ADD(&func, /emit_add(/) In addition, the special case requiring explicit execution size control is wrapped manually. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:37 +02:00
Topi Pohjolainen	c777e72bd8	i965/blorp: wrap AND (/brw_AND(&func, /emit_and(/) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:34 +02:00
Topi Pohjolainen	8b5fd98043	i965/blorp: wrap MOV (/brw_MOV(&func, /emit_mov(/) In addition, the two special cases requiring explicit execution size control are wrapped manually. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:30 +02:00
Topi Pohjolainen	250494f742	i965/blorp: wrap emission of if-equal-assignment Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:28 +02:00
Topi Pohjolainen	9e9617f797	i965/blorp: wrap emission of conditional assignment Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:25 +02:00
Topi Pohjolainen	8c42ade7a4	i965/blorp: move emission of sample combining into eu-emitter v2 (Paul): pass the combining opcode as an argument to emit_combine(). This keeps manual_blend_average() self-contained documentation-wise. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:16 +02:00
Topi Pohjolainen	ecf795615c	i965/blorp: move emission of rt-write into eu-emitter Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:13 +02:00
Topi Pohjolainen	aac6bace9f	i965/blorp: move emission of texture lookup into eu-emitter Resolving of the hardware message type is moved into the emitter also in preparation for switching to use fs_generator. The generator wants to translate the high level op-code into the message type and hence the emitter needs to know the original op-code. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:10 +02:00
Topi Pohjolainen	41d397f22b	i965/fs: introduce non-compressed equivalent of tex_cms v2: introduces 'SHADER_OPCODE_TXF_UMS' also for gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:45:04 +02:00
Topi Pohjolainen	ce527a6722	i965: rename tex_ms to tex_cms Prepares for the introduction of non-compressed multi-sampled lookup used in the blorp programs. v2: now also taking into account gen8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:44:58 +02:00
Topi Pohjolainen	3c44e43357	i965/blorp: move emission of pixel kill into eu-emitter The combination of four separate comparison operations and the masked "and" require special treatment when moving to FS LIR. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:44:52 +02:00
Topi Pohjolainen	f031487dcb	i965/blorp: introduce separate eu-emitter for blit compiler Prepares for presenting blorp blit programs using FS IR that allows EU-assembly generation using i965 glsl-compiler backend (fs_generator). v2: rebased on top of endif-jump counter fix (moving the added brw_set_uip_jip() into the emitter) Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-23 08:44:44 +02:00
Kenneth Graunke	d8c7740dda	i965: Support 32 texture image units on Haswell+. The Intel closed source OpenGL driver recently began supporting 32 texture image units on Haswell. This makes the open source driver support 32 as well. Earlier generations don't have the message header field required to support more than 16 sampler states, so we continue to advertise 16 there. On Haswell, this causes us to advertise: - GL_MAX_TEXTURE_IMAGE_UNITS = 32 - GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS = 32 - GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS = 96 instead of the old values of 16, 16, and 48. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:58 -08:00
Kenneth Graunke	5a51a26804	i965/fs: Switch from BRW_MAX_TEX_UNIT to the actual limit. BRW_MAX_TEX_UNIT is about to grow, but only Gen7+ will be able to support the new larger value. On older platforms, we don't want to allocate the extra space - it would just be a waste. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:56 -08:00
Kenneth Graunke	50ce6f682d	mesa: Bump MAX_TEXTURE_IMAGE_UNITS to 32. This allows drivers to optionally support more than 16 texture units. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:55 -08:00
Kenneth Graunke	15fc919491	i965/vec4: Support arbitrarily large sampler state indices on Haswell+. Like the scalar backend, we add an offset to the "Sampler State Pointer" field to select a group of 16 samplers, then use the "Sampler Index" field to select within that group. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:53 -08:00
Kenneth Graunke	d58e03fe4f	i965/vec4: Refactor sampler message setup. The next patch adds an additional case where the message header is necessary. So we want to do the g0 copy if inst->header_present is set, rather than inst->texture_offset. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:51 -08:00
Kenneth Graunke	e0a5602911	i965/vec4: Don't set header_present if texel offsets are all 0. In theory, a shader might use textureOffset() but set all the texel offsets to zero. In that case, we don't actually need to set up the message header - zero is the implicit default. By moving the texture_offset setup before the header_present setup, we can easily only set header_present when there are non-zero texel offset values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:49 -08:00
Kenneth Graunke	6943ac0bd9	i965/fs: Support arbitrarily large sampler state indices on Haswell+. The message descriptor's "Sampler Index" field is only 4 bits (on all generations of hardware), so it can only represent indices 0 through 15. Haswell introduced a new field in the message header - "Sampler State Pointer". Normally, this is copied straight from g0, but we can also add a byte offset (as long as it's a multiple of 32). This patch uses a "Sampler State Pointer" offset to select a group of 16 sampler states, and then uses the "Sampler Index" field to select the state within that group. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:48 -08:00
Kenneth Graunke	d7450e52e6	i965/fs: Plumb sampler index into emit_texture_gen7. We'll need this in the next patch. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:46 -08:00
Kenneth Graunke	ebfe43d5ad	i965/fs: Refactor sampler message header to duplicate less code. Previously, the code to copy g0 to the message header existed in two places - one for the texture offset case, and one for any other case. By treating texture_offset as a special case of header_present, we can remove this duplication and shorten the code. Future patches which add new header fields also won't have to add additional duplication. This also clarifies a confusing construct. The old code contained: } else if (inst->header_present) { if (brw->gen >= 7) { ...explicit copy from g0 to the message header... } else { /* Set up an implied move from g0 to the MRF. */ } } This looks like it might set up an implied move on Sandybridge, which doesn't support those. However, Sandybridge only uses a message header for texture offsets, so it would never hit this code path. The new code avoids this implicit knowledge by only setting up an implied move on Gen4-5. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:42 -08:00
Kenneth Graunke	87e7326735	i965: Use get_element_ud to shorten texture header access. This is shorter, easier to read, and further from the 80 column limit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-22 17:18:18 -08:00
Marek Olšák	d40532f260	gallium/util: util_format_srgb should not return FORMAT_NONE for sRGB formats This fixes a serious regression introduced in `4e549ddb50`. Cc: 9.2 10.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-23 01:47:14 +01:00
Marek Olšák	d382e90614	gallium: remove PIPE_CAP_SCALED_RESOLVE If any driver doesn't support this, it can use a blit after resolving the samples. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-23 01:47:14 +01:00
Marek Olšák	a8930adbf8	radeonsi: use hardware scissors correctly Use the WINDOW and VPORT scissors for the framebuffer and scissor test, respectively. The other two scissors are disabled (they cover the max fb size). We actually have 16 VPORT scissors, which will map well to ARB_viewport_array. Also, we don't need to write SC_WINDOW_OFFSET with this commit, because it's disabled everywhere. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-01-23 01:47:14 +01:00
Marek Olšák	69c29cb147	radeonsi: handle R600_CONTEXT_PS_PARTIAL_FLUSH in si_emit_cache_flush For consistency only. This is unused by radeonsi currently. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-01-23 01:47:14 +01:00
Marek Olšák	5dfb10b2f5	r600g,radeonsi: if discarding whole buffer range, discard whole resource instead Also set the unsynchronized flag if the whole resource was discarded to avoid doing buffer-busy checks again. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-01-23 01:47:14 +01:00
Marek Olšák	ee0dc659c8	gallium/u_upload_mgr: don't expose u_upload_flush It's unused and shouldn't be used at all in my opinion. If some driver doesn't support the unsynchronized flag, u_upload_mgr should avoid the synchronization by other means, e.g. by using the DONTBLOCK flag.	2014-01-23 01:47:14 +01:00
Marek Olšák	0c20bff4b6	gallium/hud: just unmap the upload vertex buffer instead of recreating it	2014-01-23 01:47:14 +01:00
Marek Olšák	2b033f3aab	gallium/vl: use u_upload_mgr to upload vertices for vl_compositor This is the recommended way for streaming vertices. Always use this if you need to upload vertices every frame. Reviewed-by: Christian König <christian.koenig@amd.com>	2014-01-23 01:47:14 +01:00
Kristian Høgsberg	11baad3508	intel: Fix initial MakeCurrent for single-buffer drawables Commit `05da4a7a5e` attempts to eliminate the call to intel_update_renderbuffer() in the case where we already have a drawbuffer for the drawable. Unfortunately this only checks the back left renderbuffer, which breaks in case of single buffer drawables. This means that the initial viewport will not be set in that case. Instead, we now check whether the initial viewport has not been set, in which case we call out to intel_update_renderbuffer(). https://bugs.freedesktop.org/show_bug.cgi?id=73862 Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>	2014-01-22 12:30:59 -08:00
Paul Berry	0da1a2cc36	glsl: Simplify aggregate type inference to prepare for ARB_arrays_of_arrays. Most of the time it is not necessary to perform type inference to compile GLSL; the type of every expression can be inferred from the contents of the expression itself (and previous type declarations). The exception is aggregate initializers: their type is determined by the LHS of the variable being assigned to. For example, in the statement: mat2 foo = { { 1, 2 }, { 3, 4 } }; the type of { 1, 2 } is only known to be vec2 (as opposed to, say, ivec2, uvec2, int[2], or a struct) because of the fact that the result is being assigned to a mat2. Previous to this patch, we handled this situation by doing some type inference during parsing: when parsing a declaration like the one above, we would call _mesa_set_aggregate_type(), which would infer the type of each aggregate initializer and store it in the corresponding ast_aggregate_initializer::constructor_type field. Since this happened at parse time, we couldn't do the type inference using glsl_type objects; we had to use ast_type_specifiers, which are much more awkward to work with. Things are about to get more complicated when we add support for ARB_arrays_of_arrays. This patch simplifies things by postponing the call to _mesa_set_aggregate_type() until ast-to-hir time, when we have access to glsl_type objects. As a side benefit, we only need to have one call to _mesa_set_aggregate_type() now, instead of six. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-22 11:08:30 -08:00
Jan Vesely	6ec210989f	clover: Don't crash on NULL global buffer objects. Specs say "If the argument is a buffer object, the arg_value pointer can be NULL or point to a NULL value in which case a NULL value will be used as the value for the argument declared as a pointer to __global or __constant memory in the kernel." So don't crash when somebody does that. v2: Insert NULL into input buffer instead of buffer handle pair Fix constant_argument too Drop r600 driver changes v3: Fix inserting NULL pointer Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-01-22 13:30:35 +01:00
Vinson Lee	6caf34b97e	meta: Move loop variable declaration outside loop. Fixes MSVC build error introduced with commit `69b258cb46`. meta.c(618) : error C2143: syntax error : missing ';' before 'type' meta.c(618) : error C2143: syntax error : missing ')' before 'type' meta.c(618) : error C2065: 'i' : undeclared identifier meta.c(618) : warning C4552: '<' : operator has no effect; expected operator with side-effect meta.c(618) : error C2059: syntax error : ')' meta.c(618) : error C2143: syntax error : missing ';' before '{' meta.c(619) : error C2065: 'i' : undeclared identifier meta.c(620) : error C2065: 'i' : undeclared identifier Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-01-21 22:59:16 -08:00
Topi Pohjolainen	8b16b0255b	i965/blorp: use BRW_COMPRESSION_2NDHALF for second half LPR No known bugs fixed but this is now in line with fs-generator. No regressions on IVB. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-22 08:13:32 +02:00
Topi Pohjolainen	89347dd61b	i965/blorp: patch jump counters also for endif No known bugs fixed but this is now in line with fs-generator. No regressions on IVB. Eric further explained that: "The endif jump, since it's forward, is just an optimization to have set right -- otherwise, the GPU will just step forward instruction by instruction until it hits something else that updates the per-channel PC." Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-22 08:13:32 +02:00
Paul Berry	1032c33cb9	mesa: Change redundant code into loops in texstate.c. This is possible now that ctx->Shader.CurrentProgram is an array. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:25:52 -08:00
Paul Berry	6ac2e1e199	mesa: Change redundant code into loops in shaderapi.c. This is possible now that ctx->Shader.CurrentProgram is an array. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:25:49 -08:00
Paul Berry	5808c44bab	mesa: Remove ad-hoc arrays of gl_shader_program. Now that we have a ctx->Shader.CurrentProgram array, we can just use it directly. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:25:47 -08:00
Paul Berry	69b258cb46	meta: Replace save_state::{Vertex,Geometry,Fragment}Shader with an array. Since ctx->Shader.Current{Vertex,Geometry,Fragment}Program is an array, this allows some meta code to be rolled up into loops. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:25:44 -08:00
Paul Berry	b4b70674ea	i965: Fix comments to refer to the new ctx->Shader.CurrentProgram array. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:25:41 -08:00
Paul Berry	1aef45578c	mesa: Fold long lines introduced by the previous patch. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:25:38 -08:00
Paul Berry	3b22146dc7	mesa: Replace ctx->Shader.Current{Vertex,Fragment,Geometry}Program with an array. These are replaced with ctx->Shader.CurrentProgram[MESA_SHADER_{VERTEX,FRAGMENT,GEOMETRY}]. In patches to follow, this will allow us to replace a lot of ad-hoc logic with a variable index into the array. With the exception of the changes to mtypes.h, this patch was generated entirely by the command: find src -type f '(' -iname '*.c' -o -iname '*.cpp' ')' -print0 | xargs -0 sed -i -e 's/\.CurrentVertexProgram/.CurrentProgram[MESA_SHADER_VERTEX]/g' -e 's/\.CurrentGeometryProgram/.CurrentProgram[MESA_SHADER_GEOMETRY]/g' -e 's/\.CurrentFragmentProgram/.CurrentProgram[MESA_SHADER_FRAGMENT]/g' Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:25:02 -08:00
Paul Berry	cd18ba1c7a	glsl/linker: Refactor in preparation for adding more shader stages. Rather than maintain separately named arrays and counts for vertex, geometry, and fragment shaders, just maintain these as arrays indexed by the gl_shader_type enum. v2: When there is neither a vertex nor a geometry shader, set prog->LastClipDistanceArraySize = 0, and clarify that the value is not used. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:24:59 -08:00
Paul Berry	4a91675b26	mesa: use _mesa_validate_shader_target() more frequently. This patch replaces code in _mesa_new_shader() and delete_shader_cb() that checks the type of a shader with calls to _mesa_validate_shader_target(). This has two advantages: it allows for a more thorough check (since _mesa_validate_shader_target() doesn't permit shader targets that aren't supported by the back-end), and it reduces the amount of code that will need to be modified when adding new shader stages. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:24:56 -08:00
Paul Berry	020919b2ae	main: Allow ctx == NULL in _mesa_validate_shader_target(). This will allow this function to be used in circumstances where there is no context available, such as when building built-in GLSL functions. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:24:54 -08:00
Paul Berry	6ab2a6148a	mesa: Make validate_shader_target() non-static. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:24:49 -08:00
Paul Berry	46d210d38f	mesa: Replace _mesa_program_index_to_target with _mesa_shader_stage_to_program. In my recent zeal to refactor Mesa's handling of the gl_shader_stage enum, I accidentally wound up with two functions that do the same thing: _mesa_program_index_to_target(), and _mesa_shader_stage_to_program(). This patch keeps _mesa_shader_stage_to_program(), since its name is more consistent with other related functions. However, it changes the signature so that it accepts an unsigned integer instead of a gl_shader_stage--this avoids awkward casts when the function is called from C++ code. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 20:24:43 -08:00
Dave Airlie	2212a97fe3	llvmpipe: dump geometry shaders when using LP_DEBUG=tgsi for consistency with vs and fs dumpers. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-01-22 14:08:03 +10:00
Ian Romanick	178c1bf1ad	mesa: Generate GL_INVALID_OPERATION for unsupported DSA TexStorage functions We have to make the functions available to work around a GLEW bug (see comments already in the code), but if an application calls one of these functions we should still generate GL_INVALID_OPERATION. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 15:39:54 -08:00
Ian Romanick	17594dccfd	mesa: Silence many unused parameter warnings main/texstorage.c: In function '_mesa_alloc_texture_storage': main/texstorage.c:240:53: warning: unused parameter 'width' [-Wunused-parameter] main/texstorage.c:241:37: warning: unused parameter 'height' [-Wunused-parameter] main/texstorage.c:241:53: warning: unused parameter 'depth' [-Wunused-parameter] main/texstorage.c: In function '_mesa_TextureStorage1DEXT': main/texstorage.c:464:34: warning: unused parameter 'texture' [-Wunused-parameter] main/texstorage.c:464:50: warning: unused parameter 'target' [-Wunused-parameter] main/texstorage.c:464:66: warning: unused parameter 'levels' [-Wunused-parameter] main/texstorage.c:465:34: warning: unused parameter 'internalformat' [-Wunused-parameter] main/texstorage.c:466:35: warning: unused parameter 'width' [-Wunused-parameter] main/texstorage.c: In function '_mesa_TextureStorage2DEXT': main/texstorage.c:473:34: warning: unused parameter 'texture' [-Wunused-parameter] main/texstorage.c:473:50: warning: unused parameter 'target' [-Wunused-parameter] main/texstorage.c:473:66: warning: unused parameter 'levels' [-Wunused-parameter] main/texstorage.c:474:34: warning: unused parameter 'internalformat' [-Wunused-parameter] main/texstorage.c:475:35: warning: unused parameter 'width' [-Wunused-parameter] main/texstorage.c:475:50: warning: unused parameter 'height' [-Wunused-parameter] main/texstorage.c: In function '_mesa_TextureStorage3DEXT': main/texstorage.c:483:34: warning: unused parameter 'texture' [-Wunused-parameter] main/texstorage.c:483:50: warning: unused parameter 'target' [-Wunused-parameter] main/texstorage.c:483:66: warning: unused parameter 'levels' [-Wunused-parameter] main/texstorage.c:484:34: warning: unused parameter 'internalformat' [-Wunused-parameter] main/texstorage.c:485:35: warning: unused parameter 'width' [-Wunused-parameter] main/texstorage.c:485:50: warning: unused parameter 'height' [-Wunused-parameter] main/texstorage.c:485:66: warning: unused parameter 'depth' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 15:39:54 -08:00
Anuj Phogat	f5cfb4ae21	i965: Ignore 'centroid' interpolation qualifier in case of persample shading This patch handles the use of 'centroid' qualifier with 'in' variables in a fragment shader when persample shading is enabled. Per sample shading for the whole fragment shader can be enabled by: glEnable(GL_SAMPLE_SHADING) or using {gl_SamplePosition, gl_SampleID} builtin variables in fragment shader. Explaining it below in more detail. /* Enable sample shading using OpenGL API */ glEnable(GL_SAMPLE_SHADING); glMinSampleShading(1.0); Example fragment shader: in vec4 a; centroid in vec4 b; main() { ... } Variable 'a' will be interpolated at sample location. But, what interpolation should we use for variable 'b' ? ARB_sample_shading recommends interpolation at sample position for all the variables. GLSL 400 (and earlier) spec says that: "When an interpolation qualifier is used, it overrides settings established through the OpenGL API." But, this text got deleted in later versions of GLSL. NVIDIA's and AMD's proprietary Linux drivers (at OpenGL 4.3) interpolate at sample position. This convinces me to use a similar approach on Intel hardware. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-21 14:42:28 -08:00
Anuj Phogat	a92e5f7cf6	i965: Use sample barycentric coordinates with per sample shading Current implementation of arb_sample_shading doesn't set 'Barycentric Interpolation Mode' correctly. We use pixel barycentric coordinates for per sample shading. Instead we should select perspective sample or non-perspective sample barycentric coordinates. It also enables using sample barycentric coordinates in case of a fragment shader variable declared with 'sample' qualifier. e.g. sample in vec4 pos; A piglit test to verify the implementation has been posted on piglit mailing list for review. V2: Do not interpolate all the 'in' variables at sample position if fragment shader uses 'sample' qualifier with one of them. For example we have a fragment shader: #version 330 #extension ARB_gpu_shader5: require sample in vec4 a; in vec4 b; main() { ... } Only 'a' should be sampled at sample location, not 'b'. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-21 14:42:27 -08:00
Anuj Phogat	3313cc269b	i965: Add an option to ignore sample qualifier This will be useful in my next patch which depends on a functionality of _mesa_get_min_invocations_per_fragment() to ignore the sample qualifier (prog->IsSample) based on a flag passed to it. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-21 14:42:27 -08:00
Matt Turner	78d65476b6	mesa/x86: Remove dead read_rgba_span_x86.h. Dead since `304f7a13`.	2014-01-21 14:20:44 -08:00
Matt Turner	bf0773aeca	i965/fs: Optimize LRP with x == y into a MOV. total instructions in shared programs: 1487331 -> 1485988 (-0.09%) instructions in affected programs: 45638 -> 44295 (-2.94%) GAINED: 7 LOST: 0 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:20:44 -08:00
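The LRP-to-MOV rewrite in bf0773aeca rests on a simple identity: when both endpoints are equal, the blend factor drops out. The sketch below assumes GLSL `mix(x, y, a)` semantics for LRP and checks values at run time. The real pass compares instruction operands at compile time, and both function names here are hypothetical.

```c
/* Reference semantics assumed for this sketch: GLSL mix(x, y, a). */
static float lrp(float x, float y, float a)
{
    return x * (1.0f - a) + y * a;
}

/* Hypothetical stand-in for the optimization: if both endpoints are the
 * same value, x*(1-a) + x*a == x, so a plain copy (MOV) is enough. */
static float lrp_optimized(float x, float y, float a)
{
    if (x == y)
        return x;           /* the MOV case */
    return lrp(x, y, a);    /* the general case */
}
```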
Jordan Justen	8d37e9915a	glsl: Optimize open-coded lrp into lrp. total instructions in shared programs: 1498191 -> 1487051 (-0.74%) instructions in affected programs: 669388 -> 658248 (-1.66%) GAINED: 1 LOST: 0 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:20:44 -08:00
Matt Turner	13100ac142	i965: Enable AOS optimizations for the geometry shader. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-21 14:20:44 -08:00
Matt Turner	4bd6e0d7c6	glsl: Vectorize multiple scalar assignments Reduces vertex shader instruction counts in DOTA2 by 6.42%, L4D2 by 4.61%, and CS:GO by 5.71%. total instructions in shared programs: 1500153 -> 1498191 (-0.13%) instructions in affected programs: 59919 -> 57957 (-3.27%) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-21 14:20:44 -08:00
Matt Turner	5e82d8a9da	glsl: Add parameter to .equals() to ignore an IR type. Only implemented for ir_swizzles currently, but perhaps will be useful for other IR types in the future. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-21 14:20:44 -08:00
Matt Turner	ebf91993c1	mesa: rename PreferDP4 to OptimizeForAOS. This flag was really just a proxy for determining whether the backend was vector (AOS) or scalar (SOA). It will be used to apply a future optimization only for vector backends. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-21 14:20:44 -08:00
Matt Turner	413622fbef	i965/fs: Print the maximum register pressure. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:20:44 -08:00
Kenneth Graunke	391eaa59bd	i965/fs: Show register pressure in dump_instructions() output. Dumping the number of live registers at each IP allows us to see register pressure and identify any local maxima. This should aid in debugging passes designed to reduce register pressure, as well as optimizations that suddenly trigger spilling. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:20:44 -08:00
Kenneth Graunke	3b74f4b233	i965: Compute the number of live registers at each IP. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-21 14:20:44 -08:00
Matt Turner	0ea600ef1a	i965/fs: Call opt_peephole_sel later in the optimization loop. Calling it after value numbering (added in the next commit) prevents some instruction count regressions. total instructions in shared programs: 1524387 -> 1523905 (-0.03%) instructions in affected programs: 13112 -> 12630 (-3.68%) GAINED: 0 LOST: 3 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Matt Turner	ede6c341f6	i965/fs: Calculate interference better in register_coalesce. Previously we simply considered two registers whose live ranges overlapped to interfere. Cases such as set A ------ ... | mov B, A -- | ... | B | A use B -- | ... | use A ------ would be considered to interfere, even though B is an unmodified copy of A whose live range fit wholly inside that of A. If no writes to A or B occur between the mov B, A and the use of B then we can safely coalesce them. Instead of removing MOV instructions, we make them NOPs and remove them at once after the main pass is finished in order to avoid recomputing live intervals (which are needed to perform the previous step). total instructions in shared programs: 1543768 -> 1513077 (-1.99%) instructions in affected programs: 951563 -> 920872 (-3.23%) GAINED: 46 LOST: 22 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Matt Turner	4a7d0c550e	i965/fs: Support coalescing registers of size > 1. total instructions in shared programs: 1550048 -> 1549880 (-0.01%) instructions in affected programs: 1896 -> 1728 (-8.86%) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Matt Turner	78fa6172e1	i965/fs: Assert that var < num_vars. Helped to track down a problem in a version of the next commit. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Matt Turner	9bb4d71fd2	i965/fs: Add a comment explaining how register coalescing works. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Matt Turner	2dfb067139	i965/fs: Add and use MAX_SAMPLER_MESSAGE_SIZE definition. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Matt Turner	81d52419cf	mesa: Add STRINGIFY macro. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Matt Turner	80b949f16b	i965/fs: Fix the example about overwriting uniforms in SIMD16. mov takes only a single source argument. Example instruction inexplicably changed from add to mov in commit `f10f5e49`. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Matt Turner	71bc11a375	i965: Print reg_offset for vgrf of size > 1 in dump_instruction(). Previously we wouldn't print the +0 for the first part of a VGRF of size greater than 1. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-21 14:09:33 -08:00
Grigori Goronzy	955c93dc08	glsl: Match unnamed record types across stages. Unnamed record types are assigned to separate types per stage, e.g. if uniform struct { ... } a; is defined in both vertex and fragment shader, two separate types will result with different names. When linking the shader, this results in a type conflict. However, there is no reason why this should not be allowed according to GLSL specifications. Compare and match record types when linking shader stages to avoid this conflict. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-21 14:01:09 -08:00
Grigori Goronzy	41c9bf884f	glsl: Extract function for record comparisons. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-21 14:01:09 -08:00
Brian Paul	6d8cf5181a	docs: remove some ancient README.* files None of this info is relevant anymore. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-21 10:53:51 -08:00
Brian Paul	b9f68d927e	svga: implement TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS Fixes several colorbuffer tests, including piglit "fbo-drawbuffers-none" for "gl_FragColor" and "glDrawPixels" cases. v2: rework patch to only avoid creating extra shader variants when TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS is not specified. Per Jose. Use a write_color0_to_n_cbufs key field to replicate color0 to N color buffers only when N > 0 and WRITES_ALL_CBUFS is set. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-21 10:53:51 -08:00
Brian Paul	384fd64ab1	svga: rename color output variables Just to be bit more readable. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-21 10:53:51 -08:00
Brian Paul	f6bc7d6586	svga: fix clearing for null color buffers Fixes piglit "fbo-drawbuffers-none glClear" test. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-21 10:53:51 -08:00
Brian Paul	ff59b3d9ee	mesa: add missing TYPE_DOUBLEN_2 cases in get.c The new TYPE_DOUBLEN_2 type was added in `0e60d850` but the code to return values of that type wasn't completed. Fixes conform's default state test. glGetFloatv(GL_DEPTH_RANGE) wasn't returning anything. v2: remove stray 'break' statements. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-01-21 10:53:12 -08:00
Paul Berry	51000c2ff8	i965: Modify some error messages to refer to "vec4" instead of "vs". These messages are in code that is shared between the VS and GS back-ends, so use the terminology "vec4" to avoid confusion. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-21 09:05:33 -08:00
Paul Berry	a4d68e9ee9	i965: Add GS support to INTEL_DEBUG=shader_time. Previously, time spent in geometry shaders would be counted as part of the vertex shader time. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-21 09:05:12 -08:00
Roland Scheidegger	e23e4f67be	draw: fix points with negative w coords for d3d style point clipping Even with depth clipping disabled, vertices which have negative w coords must be discarded. And since we don't have a proper guardband implementation yet (relying on driver to handle all values except infs/nans in rasterization for such points) we need to kill them off manually (as they can end up with coordinates inside viewport otherwise). v2: use 0.0f instead of 0 (spotted by Brian). Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-21 17:49:02 +01:00
Kenneth Graunke	ad04e396fa	i965: Reserve space for "Vertex Count" in GS outputs. v2: Also increment ir->offset in the GS visitor, rather than at the final assembly generation stage (requested by Paul). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-21 00:20:14 -08:00
Kenneth Graunke	94c0a11b19	i965: Update blitter code for 48-bit addresses. v2: Rebase on Eric's SET_FIELD changes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> [v1]	2014-01-20 16:21:52 -08:00
Kenneth Graunke	23827756f3	i965: Update PIPE_CONTROL packet lengths for Broadwell. On Broadwell, PIPE_CONTROL needs an extra DWord to accommodate the 48-bit addressing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-20 15:38:24 -08:00
Kenneth Graunke	f7e76e00b6	i965: Re-combine the Gen4-5 and Gen6+ write_depth_count functions. Now that we have a helper function that handles the PIPE_CONTROL variations between the various platforms, these are basically the same. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-20 15:38:23 -08:00
Kenneth Graunke	f5dd608db2	i965: Create a helper function for emitting PIPE_CONTROL writes. There are a lot of places that use PIPE_CONTROL to write a value to a buffer (either an immediate write, TIMESTAMP, or PS_DEPTH_COUNT). Creating a single function to do this seems convenient. As part of this refactor, we now set the PPGTT/GTT selection bit correctly on Gen7+. Previously, we set bit 2 of DW2 on all platforms. This is correct for Sandybridge, but actually part of the address on Ivybridge and later! Broadwell will also increase the length of these packets by 1; with the refactoring, we should have to adjust that in substantially fewer places, giving us confidence that we've hit them all. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-20 15:38:23 -08:00
Kenneth Graunke	35458a99c0	i965: Use full-length PIPE_CONTROL packets for workaround writes. I believe that PIPE_CONTROL uses the length field to decide whether to do 32-bit or 64-bit writes. A length of 4 would do a 32-bit write, while a length of 5 would do a 64-bit write. (I haven't verified this, though.) For workaround writes, we don't care what value gets written, or how much data. We're only writing something because hardware bugs mandate that we do so. So using a 64-bit write should be fine. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-20 15:38:23 -08:00
Kenneth Graunke	4b9e5c985c	i965: Emit full-length PIPE_CONTROLs for (non-write) flushes. The PIPE_CONTROL packet actually has 5 DWords on Gen6+: 1. Header 2. Flags 3. Address 4. Immediate Data: Lower DWord 5. Immediate Data: Upper DWord We just never emitted the last one. While it appears to work, it's probably safer to emit the entire thing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-20 15:38:23 -08:00
Kenneth Graunke	9420b577dd	i965: Create a helper function for emitting PIPE_CONTROL flushes. These days, we need to emit PIPE_CONTROL flushes all over the place. Being able to do that via a single function call seems convenient. Broadwell will also increase the length of these packets by 1; with the refactoring, we should have to do this in substantially fewer places. v2: Add back forgotten intel_emit_post_sync_nonzero_flush (caught by Eric Anholt). Drop unlikely() from BLT_RING check. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-20 15:38:16 -08:00
Kenneth Graunke	ded5674689	i965: Fix MI_STORE_REGISTER_MEM for Broadwell. It now takes a 48-bit address. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-20 15:12:23 -08:00
Kenneth Graunke	f11c1feaf7	i965: Introduce an OUT_RELOC64 macro. Broadwell uses 48-bit addresses. The first DWord is the low 32 bits, and the second DWord is the high 16 bits. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-20 15:12:23 -08:00
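The DWord layout described in f11c1feaf7 (low 32 bits first, then the high 16 bits) can be illustrated with a minimal sketch. `struct addr_dwords` and `split_address48` are hypothetical names used only for illustration; this is not the `OUT_RELOC64` macro itself and it performs no relocation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two DWords of a 48-bit address. */
struct addr_dwords {
    uint32_t low;   /* address bits 31:0 */
    uint32_t high;  /* address bits 47:32; upper half of the DWord is zero */
};

/* Hypothetical helper: split a 48-bit address into two DWords. */
static struct addr_dwords split_address48(uint64_t address)
{
    struct addr_dwords out;

    /* Illustrative sanity check: the address must fit in 48 bits. */
    assert((address >> 48) == 0);

    out.low = (uint32_t)(address & 0xffffffffu);
    out.high = (uint32_t)(address >> 32);
    return out;
}
```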
Kenneth Graunke	67ebcb4711	i965: Use the new drm_intel_bo offset64 field. libdrm 2.4.52 introduces a new 'uint64_t offset64' field, intended to replace the old 'unsigned long offset' field. To preserve ABI, libdrm continues to store the presumed offset in both locations. On Broadwell, a 64-bit kernel may place BOs at "high" (> 4G) addresses. However, with a 32-bit userspace, the 'unsigned long offset' field will only be 32-bit, which is not large enough to hold this value. We need to use a proper uint64_t (like the kernel does). Technically, a lot of this code doesn't affect Broadwell, so we could leave it using the old field. But it makes sense to just switch to the new, properly typed field. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-20 15:12:23 -08:00
Kenneth Graunke	77425ef91a	build: Require libdrm 2.4.52 for Intel. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 15:12:23 -08:00
Kenneth Graunke	5f4eed3575	i965: Delete intel_batchbuffer_emit_reloc_fenced. Nothing in i965 uses it. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-20 15:12:12 -08:00
Ian Romanick	4cd8011907	i915: Silence warning: unused parameter warning in intel_bufferobj_buffer intel_buffer_objects.c: In function 'old_intel_bufferobj_buffer': intel_buffer_objects.c:471:17: warning: unused parameter 'flag' [-Wunused-parameter] The parameter hasn't been used since the i915 and i965 drivers had their breakup. i965 got the flags, and i915 got to cry itself to sleep. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:40:46 -08:00
Ian Romanick	8468f437e8	i915: Ensure that intel_bufferobj_map_range meets alignment guarantees Not actually tested, but the changes are identical to the i965 changes that are tested. v2: Remove MAX2(64, ...). Suggested by Ken (in the i965 version of this patch). Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: Siavash Eliasi <siavashserver@gmail.com>	2014-01-20 11:40:41 -08:00
Ian Romanick	1ec663ab19	i965: Ensure that intel_bufferobj_map_range meets alignment guarantees No piglit regressions on IVB. With minor tweaks to the arb_map_buffer_alignment-map-invalidate-range test (disable the extension check, set alignment to 64 instead of querying), the i965 driver would fail the test without this patch (as predicted by Eric). With this patch, it passes. v2: Remove MAX2(64, ...). Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: Siavash Eliasi <siavashserver@gmail.com>	2014-01-20 11:40:34 -08:00
Ian Romanick	c2352a88ed	docs: Note that GL_ARB_viewport_array is done on i965 At least for GEN7+, anyway. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:05 -08:00
Courtney Goeltzenleuchter	7837f425e7	i965: Enable ARB_viewport_array v2 (idr): Only enable the extension on GEN7+ w/core profile because it requires geometry shaders. v3 (idr): Add some casting to fix setting of ViewportBounds.Min. Negating an unsigned value, then casting to float doesn't do what you might think it does. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:05 -08:00
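The v3 note in 7837f425e7 about negating an unsigned value before casting to float can be reproduced in isolation. In this sketch `max_size` is a hypothetical stand-in for an unsigned limit, not the actual field used in the driver.

```c
/* Negating first: -max_size wraps around modulo 2^N (unsigned arithmetic),
 * so the cast yields a large positive float rather than a negative one. */
static float negate_then_cast(unsigned max_size)
{
    return (float)(-max_size);
}

/* Casting first: the negation happens in floating point, giving the
 * intended negative bound. */
static float cast_then_negate(unsigned max_size)
{
    return -(float)max_size;
}
```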
Ian Romanick	d3ee8ba346	i965: Consider all viewports before enabling guardband clipping Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:05 -08:00
Ian Romanick	bdff9a6e47	i965: Consider only the scissor rectangle for viewport 0 for clears noop_scissor (correctly) only examines the scissor rectangle for viewport 0. Therefore, it should only be called when that scissor rectangle is enabled. v2: Remove spurious change to radeon code. Noticed by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:01 -08:00
Ian Romanick	2c27f1d47a	i965: Set all the supported scissor rectangles for GEN7 Currently MaxViewports is still 1, so this won't effect any change. v2: Minor code reformatting suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:01 -08:00
Ian Romanick	a2b946cb35	mesa: Refactor bounding-box calculation out of _mesa_update_draw_buffer_bounds Drivers that currently use _Xmin and friends to set their scissor rectangle will need to use this code directly once they are updated for GL_ARB_viewport_array. v2: Use different bit-test idiom and fix mixed tabs and spaces. Both were suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:01 -08:00
Ian Romanick	d989c4b134	i965: Set all the supported viewports for GEN7 Currently MaxViewports is still 1, so this won't effect any change. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:01 -08:00
Ian Romanick	fceb8b55c0	i965: Emit writes to viewport index This variable is handled in a fashion identical to gl_Layer. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:01 -08:00
Ian Romanick	37f65b0751	i965: Set the maximum VPIndex At various stages the hardware clamps the gl_ViewportIndex to these values. Setting them to zero effectively makes gl_ViewportIndex be ignored. This is actually useful in blorp (so that we don't have to modify all of the viewport / scissor state). v2: Use INTEL_MASK to create GEN6_CLIP_MAX_VP_INDEX_MASK. Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:01 -08:00
Courtney Goeltzenleuchter	9ef16befd0	mesa: Add ARB_viewport_array plumbing Define API connections to extension entry points added in previous commits. Update entry points to use floating point arguments as required by the extension. Add get tokens for ARB_viewport_array state. v2: Include review feedback. v3 (idr): Fix 'make check'. Add missing Get infrastructure (some was culled from other patches). Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Courtney Goeltzenleuchter	c2eefb06aa	glsl: Add gl_ViewportIndex built-in variable v2 (idr): Fix copy-and-paste bug... s/LAYER/VIEWPORT/ Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Ian Romanick	5439964270	glsl: Add extension infrastructure for ARB_viewport_array Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Ian Romanick	3815264d7d	mesa: Add varying slot for viewport index Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Courtney Goeltzenleuchter	86231c4ab3	mesa: Add new viewport and depth-range entry points for GL_ARB_viewport_array v2 (idr): Use set_viewport_no_notify / set_depth_range_no_notify (and manually notify the driver) instead of calling _mesa_set_viewporti / _mesa_set_depthrangei. Refactor bodies of _mesa_ViewportIndexed and _mesa_ViewportIndexedv into a shared function. Remove spurious CLAMP calls in _mesa_DepthRangeArrayv and _mesa_DepthRangeIndexed. v3 (idr): Add some missing return-statements after calls to _mesa_error. v4 (idr): Only perform the ViewportBounds.Min / ViewportBounds.Max clamping in set_viewport_no_notify if GL_ARB_viewport_array is enabled. Otherwise the driver may not have set ViewportBounds, and the clamping will do bad things. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Courtney Goeltzenleuchter	0a7baa68a8	mesa: Add new scissor entry points for GL_ARB_viewport_array v2 (idr): Use set_scissor_no_notify (and manually notify the driver) instead of calling _mesa_set_scissori. Refactor bodies of _mesa_ScissorIndexed and _mesa_ScissorIndexedv into a shared function. Perform parameter validation in the same order in all three functions. Pull MaxViewports comparison fix (in _mesa_ScissorArrayv) from the next patch to this patch. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Courtney Goeltzenleuchter	917db0bc3d	mesa: Add custom get function for SCISSOR_TEST to _mesa_IsEnabledi Now that the scissor enable state is a bitfield, we need a custom function to extract the correct value from gl_context. Modeled Scissor.EnableFlags after Color.BlendEnabled. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Courtney Goeltzenleuchter	6d9c0011a0	mesa: Add new get entrypoints for ARB_viewport_array v2 (idr): Fix several "comparison between signed and unsigned integer expressions" warnings. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Ian Romanick	a4bc73f7ba	mesa: Change parameter to _mesa_set_viewport to float This matches the expectations of GL_ARB_viewport_array and the storage type where the values will land. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:32:00 -08:00
Ian Romanick	91ad851876	meta: Restore all scissor state Previously the restore code would enable all scissor rectangles if any scissor rectangles were enabled on entry to meta. When there is only one scissor rectangle, this is fine. As soon as a driver supports multiple viewports, this will be a problem. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:59 -08:00
Ian Romanick	6d3b1dc150	mesa: Set all scissor rects In _mesa_Scissor, make sure that ctx->Driver.Scissor is only called once instead of once per scissor rectangle. v2: Use MAX_VIEWPORTS instead of ctx->Const.MaxViewports because the driver may not set ctx->Const.MaxViewports yet. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:59 -08:00
Ian Romanick	454cec4299	mesa: Set all viewports from _mesa_Viewport and _mesa_DepthRange In _mesa_Viewport and _mesa_DepthRange, make sure that ctx->Driver.Viewport is only called once instead of once per viewport or depth range. v2: Make _mesa_DepthRange actually set all of the depth ranges (instead of just index 0). Noticed by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:59 -08:00
Ian Romanick	562f353434	mesa: Restore all the viewports in _mesa_PopAttrib Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:59 -08:00
Ian Romanick	c65db3ebed	mesa: Restore all the scissor rectangles in _mesa_PopAttrib Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:59 -08:00
Ian Romanick	9de863603d	mesa: Initialize all the viewports v2: Use MAX_VIEWPORTS instead of ctx->Const.MaxViewports because the driver may not set ctx->Const.MaxViewports yet. v3: Handle all viewport entries in update_viewport_matrix and _mesa_copy_context too. This was previously in an earlier patch. Having the code in the earlier patch could cause _mesa_copy_context to access a matrix that hadn't been constructed. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v2]	2014-01-20 11:31:59 -08:00
Ian Romanick	f6d7cd4a11	mesa: Add an index parameter to _mesa_set_scissor Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:59 -08:00
Ian Romanick	5232a7ded0	mesa: Refactor scissor rectangle setting even more Create an internal function that just writes data into the scissor rectangle. In future patches this will see more use because we only want to call dd_function_table::Scissor once after setting all of the scissor rectangles instead of once per scissor rectangle. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:59 -08:00
Ian Romanick	799265aadc	mesa: Refactor viewport setting even more Create an internal function that just writes data into the viewport. In future patches this will see more use because we only want to call dd_function_table::Viewport once after setting all of the viewport instead of once per viewport. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:59 -08:00
Ian Romanick	42f916e150	mesa: Refactor depth range setting even more Create an internal function that just writes data into the depth range. In future patches this will see more use because we only want to call dd_function_table::DepthRange once after setting all of the depth ranges instead of once per depth range. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:58 -08:00
Ian Romanick	3eb135d1c7	mesa: Add an index parameter to _mesa_set_viewport Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:58 -08:00
Courtney Goeltzenleuchter	cbb271a488	mesa: Convert gl_context::Viewport to gl_context::ViewportArray Only element 0 of the array is used anywhere at this time, so there should be no changes. v4: Split out from a single megapatch. Suggested by Ken. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:56 -08:00
Courtney Goeltzenleuchter	5b84226c31	mesa: Convert gl_viewport_attrib::X, ::Y, ::Width, and ::Height to float v4: Split out from a single megapatch. Suggested by Ken. Also make meta's save_state::ViewportX, ::ViewportY, ::ViewportW, and ::ViewportH to match gl_viewport_attrib. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:53 -08:00
Courtney Goeltzenleuchter	d4dc359875	mesa: Convert gl_viewport_attrib::Near and ::Far to double v4: Split out from a single megapatch. Suggested by Ken. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:50 -08:00
Courtney Goeltzenleuchter	0e60d85029	mesa: Allow glGet of values that are 2 doubles This will be used when the viewport near and far plane are stored as doubles instead of as floats. v4 (idr): Split out from a single megapatch. Suggested by Ken. Also drop value_double_4. It's never used anywhere in the patch series. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:31:47 -08:00
Ian Romanick	83bd850cc7	mesa: Move parameter validation from _mesa_set_viewport to _mesa_Viewport Internal callers should do the right thing. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:29:42 -08:00
Courtney Goeltzenleuchter	a9c73fb778	mesa: Update gl_scissor_attrib to support ARB_viewport_array Update Mesa and drivers to access updated gl_scissor_attrib. Now have an enable bitfield and array of gl_scissor_rects. Drivers have been updated to the new scissor enable state attribute (gl_context.scissor.EnableFlags) but still treat it as a single boolean which is okay as mesa will only use bit 0 when communicating with a driver that does not support ARB_viewport_array. v2 (idr): Rebase fixes. v3 (idr): Small code formatting fix suggested by Ken. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:29:42 -08:00
Ian Romanick	1f59e963b4	mesa: Add new constants related to GL_ARB_viewport_array These limits will be queryable by GL_MAX_VIEWPORTS, GL_VIEWPORT_SUBPIXEL_BITS, and GL_VIEWPORT_BOUNDS_RANGE. Drivers that actually implement the extension must set values for these constants that comply with the minimum-maximums from the spec. Most of these changes were part of other patches. They were separated out because it makes reordering of later patches easier. Also, MaxViewports wasn't set by that patch, and I completely overlooked it in review. It's now obvious that it's set. :) v2 (idr): Split these changes out from the original patches. Keep MaxViewportWidth and MaxViewportHeight as GLuint. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:29:41 -08:00
Courtney Goeltzenleuchter	b39bfa4f49	mesa: Add extension tracking bit for ARB_viewport_array v2 (idr): Split these changes out from the original patch. Only advertise GL_ARB_viewport_array in a core profile because it requires geometry shaders. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-20 11:29:41 -08:00
Brian Paul	d6b6ab51d4	draw: use some cast wrappers in draw_pt_fetch_shade_pipeline*.c Trivial.	2014-01-20 11:01:48 -08:00
Brian Paul	807cbb9023	draw: whitespace and formatting fixes in draw_pt_fetch_shade_pipeline*.c Trivial.	2014-01-20 11:00:32 -08:00
Brian Paul	ad814d04ca	draw: fix incorrect vertex size computation in LLVM drawing code We were calling draw_total_vs_outputs() too early. The call to draw_pt_emit_prepare() could result in the vertex size changing. So call draw_total_vs_outputs() after draw_pt_emit_prepare(). This fix would seem to be needed for the non-LLVM code as well, but it's not obvious. Instead, I added an assertion there to try to catch this problem if it were to occur there. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72926 Cc: 10.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-20 10:57:20 -08:00
Brian Paul	3a4255148b	docs: note reduced display list memory usage in 10.1 relnotes	2014-01-20 10:52:11 -08:00
Roland Scheidegger	8c0368abb9	draw: clean up d3d style point clipping Instead of skipping x/y clipping completely if there's point_tri_clip points use guard band clipping. This should be easier (previously we could not disable generating the x/y bits in the clip mask for llvm path, hence requiring custom clip path), and it also allows us to enable this for tris-as-points more easily too (this would require custom tri clip filtering too otherwise). Moreover, some unexpected things could have happened if there's a NaN or just a huge number in some tri-turned-point, as the driver's rasterizer would need to deal with it and that might well lead to undefined behavior in typical rasterizers (which need to convert these numbers to fixed point). Using a guardband should hence be more robust, while "usually" guaranteeing the same results. (Only "usually" because unlike hw guardbands draw guardband is always just twice the vp size, hence small vp but large points could still lead to different results.) Unfortunately because the clipmask generated is completely unaffected by guard band clipping, we still need a custom clip stage for points (but not for tris, as the actual clipping there takes guard band into account). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-01-20 17:45:53 +01:00
Brian Paul	799abb271a	swrast: check for null/-1 when mapping renderbuffers Fixes fbo-drawbuffers-none crash (but test still fails). https://bugs.freedesktop.org/show_bug.cgi?id=73757 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-20 08:18:21 -08:00
Brian Paul	3ede8dd5f1	softpipe: fix crash when accessing null colorbuffer Fixes piglit fbo-missing-attachment-blit test. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73755 Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-20 08:18:21 -08:00
Brian Paul	33ae0c24d0	st/vdpau: s/surface/resource/ to fix compiler warning Reviewed-by: Christian König <christian.koenig@amd.com>	2014-01-20 07:54:42 -08:00
José Fonseca	a1e528a0f0	i915,r200,radeon,vega: Change vendor from "VMware, Inc." to "Mesa Project". These are components which were originally developed by Tungsten Graphics, which was in turn acquired by VMware, but are de facto now being maintained by third-party contributors of the Mesa open-source community. This matches what's reported by swrast driver and a few other components. Suggested by Ian Romanick.	2014-01-20 14:15:27 +00:00
José Fonseca	f0c2662b12	logger: Remove unused variable. Silences gcc "unused variable ‘buf’" warning. Trivial.	2014-01-20 13:58:11 +00:00
José Fonseca	d43260b59e	logger: s/\<log\>/log_/ Currently the MSVC build is broken because of conflicting definitions of 'log' function. I didn't investigate thoroughly, but I suspect it is conflicting with standard math.h's log. log_ is admittedly not a great name, but it is better than a broken build. A better one can be used in a follow-on build.	2014-01-20 13:57:12 +00:00
Topi Pohjolainen	9ab553cf52	i965/blorp: reduce the scope of the explicit compression control Highlighting these special cases makes it clearer to switch to the fs-generator as the wider scoped compression control settings used in the current implementation can be simply dropped. No regressions on IVB (piglit quick + unit tests). v2 (Ian): typo in a comment Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-20 09:42:36 +02:00
Topi Pohjolainen	d0f63b3757	i965/blorp: remove dependency to compression control state Effectively only the mask control bit gets altered for the single addition in question and hence there is no real need to use a fresh state control level for it -- that is more useful when multiple instructions share the same mask and compression settings. This is a preparation step for removing the explicit compression control modifiers in the blit compiler. After this patch there are no nested state control levels making the constant nature of the compression settings more apparent. No regressions on IVB (piglit quick + unit tests). v2 (Matt, Ian): use temporary variable instead of assigning directly on the same line with a function call. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-20 09:42:27 +02:00
Kristian Høgsberg	05da4a7a5e	i965: Only update renderbuffers on initial intelMakeCurrent We call intel_prepare_render() in intelMakeCurrent() to make sure we have renderbuffers before calling _mesa_make_current(). The only reason we do this is so that we can have valid defaults for width and height. If we already have buffers for the drawable we're making current, we don't need this step. In itself, this is a small optimization, but it also avoids a round trip that could block on the display server in an unexpected place. https://bugs.freedesktop.org/show_bug.cgi?id=72540 https://bugs.freedesktop.org/show_bug.cgi?id=72612 Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-19 20:48:19 -08:00
Ilia Mirkin	f5788e042a	st/vdpau: check surface params before creating surfaces Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-01-19 20:02:10 -05:00
Ilia Mirkin	813ce219c8	st/vdpau: fix bogus error handling in output/bitmap creation Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-01-19 20:02:10 -05:00
Ilia Mirkin	00e4314f6d	st/vdpau: don't return a device if the screen doesn't support NPOT NV3x cards don't support NPOT textures. Technically this restriction could be worked around, but since it also doesn't expose any video decoding hw, just turn it off entirely. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: 10.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Christian König <christian.koenig@amd.com>	2014-01-19 20:01:48 -05:00
Armin K	ad3c99e22a	pipe-loader: Fix build pipe_loader_drm.c: In function 'pipe_loader_drm_probe_fd': pipe_loader_drm.c:120:4: error: implicit declaration of function 'loader_get_pci_id_for_fd' [-Werror=implicit-function-declaration] Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-01-19 15:20:58 +00:00
Emil Velikov	26d380da69	loader: ifdef libdrm specific code and include Mesa provides the flexibility of building without the need to have libdrm present on the system. The situation has regressed with the recent commit `8c2e7fd846` Author: Emil Velikov <emil.l.velikov@gmail.com> Date: Fri Jan 10 23:36:16 2014 +0000 loader: introduce the loader util lib By isolating libdrm code by #ifndef __NOT_HAVE_DRM_H we can have libdrm-less builds across all build systems. This patch converts Android's _EGL_NO_DRM to __NOT_HAVE_DRM_H to provide consistency with the other cases within mesa, allows compilation of libloader on libdrm-less scons and conditionally links against libdrm if present under automake. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73776 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73777 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-19 15:17:00 +00:00
Kenneth Graunke	a33d1339d5	i965: Double the push constant space multipliers on Broadwell too. Broadwell has 2Kb push constant size increments like Haswell GT3. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-18 21:58:13 -08:00
Kenneth Graunke	4c6a1d380a	i965: Update invariant state for Broadwell. The only difference is that STATE_SIP takes a 48-bit address, so we need to output two zeroes. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-18 21:57:59 -08:00
Kenneth Graunke	37e9b5e305	i965: Use the Sandybridge VUE format on Broadwell as well. It hasn't changed. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-18 21:56:23 -08:00
Kenneth Graunke	11f6882e1d	i965: Create a new fragment shader backend for Broadwell. This replaces the old fs_generator backend. v2: Port to the C-based representation of assembly instructions. Fix texturing after the texture-grf merge. v3: Add high quality derivative support. Fix SET_SIMD4X2_OFFSET. v4: Pass brw_context to gen8_instruction functions as required. v5: Fixes for MRT, as well as zero render targets (alpha test only). v6: Replace n-wide with SIMDn in comments and messages; port over Topi's blorp-generator changes; add missing TXF_MCS opcode, fix missing high quality derivatives for DDX; fix typo (all caught by Eric). Simplify ADDC/SUBB handling; drop "Used only on Gen6+" comment (caught by Matt). Emit SIMD16 versions of three source instructions (caught by both Eric and Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-18 21:56:08 -08:00
Kenneth Graunke	9eb568d753	i965: Create a new vec4 backend for Broadwell. This replaces the old vec4_generator backend. v2: Port to use the C-based instruction representation. Also, remove Geometry Shader offset hacks - the visitor will handle those instead of this code. v3: Texturing fixes (including adding textureGather support). v4: Pass brw_context to gen8_instruction functions as required. v5: Add SHADER_OPCODE_TXF_MCS support; port DUAL_INSTANCED gs fixes (caught by Eric). Simplify ADDC/SUBB handling; add comments to gen8_set_dp_message calls (suggested by Matt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-18 21:56:02 -08:00
Kenneth Graunke	f8035ba036	i965: Add a new infrastructure for generating Broadwell shader assembly. This replaces the brw_eu_emit.c layer for Broadwell. It will be used by both the vector and scalar shader backends. v2: Port to use the C-based instruction representation. v3: Fix destination register type for CMP. v4: Pass brw to gen8_instruction functions (required by rebase). v5: Remove bogus assertion on math instructions (caught by Piglit). v6: Remove more restrictions on math instructions (caught by Eric). Make ADDC and SUBB helpers set accumulator writes, like MAC and MACH (caught by Matt). v7: Don't implicitly force ALU3 operations to SIMD8 (we've been able to do SIMD16 versions since Haswell, but didn't when I originally wrote this code). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-18 21:55:54 -08:00
Kenneth Graunke	8ea4b16eea	i965: Implement a disassembler for Broadwell's new instruction encoding. Heavily based on Keith Packard's existing brw_disasm.c code. I've tried to go through most of the pieces (like SFIDs) and update the lists to include features added in recent generations. v2: Port to use the C-based instruction emitters. This allows us to use C99 array initializers, which tidies up some of the code. v3: Improve decoding of render target write messages. v4: Update for BRW_REGISTER_TYPE becoming an abstraction. v5: Rebase on Chris Forbes' SFID message defines. v6: Fix disassembly of UV immediates; remove silly casts. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-18 21:55:45 -08:00
Kenneth Graunke	0923dad90a	i965: Add a new representation for Broadwell shader instructions. Broadwell significantly changes the EU instruction encoding. Many of the fields got moved to different bit positions; some even got split in two. With so many changes, it was infeasible to continue using struct brw_instruction. We needed a new representation. This new approach is a bit different: rather than a struct, I created a class that has four DWords, and helper functions that read/write various bits. This has several advantages: 1. We can create several different names for the same bits. For example, conditional modifiers, SFID for SEND instructions, and the MATH instruction's function opcode are all stored in bits 27:24. In each situation, we can use the appropriate setter function: set_sfid(), set_math_function(), or set_cond_modifier(). This is much easier to follow. 2. Since the fields are expressed using the original 128-bit numbers, the code to create the getter/setter functions follows the table in the documentation very closely. To aid in debugging, I've enabled -fkeep-inline-functions when building gen8_instruction.c. Otherwise, these functions cannot be called by gdb, making it insanely difficult to print out anything. Kenneth Graunke wrote most of this code. Damien Lespiau ported it to C99. Xiang Haihao added media fields. Zhao Yakui added indirect addressing support. Eric Anholt added an assertion to make sure that values fit in the allotted number of bits. v2: Update for brw_reg_type_to_hw_type(), which necessitates passing brw_context pointers around everywhere. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com> Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Matt Turner <mattst88@gmail.com>	2014-01-18 21:55:37 -08:00
Kenneth Graunke	f4cf231cac	i965: Add SFID #defines for media stuff. While we probably won't ever use these, having them makes it easy to share disassembler code between intel-gpu-tools and Mesa. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-18 21:55:31 -08:00
Kenneth Graunke	9e7da0c716	i965: Add #defines for new Broadwell math functions. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-18 21:55:25 -08:00
Chris Forbes	45607b5c5f	i965: add struct and SFID for pixel interpolator messages Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-18 21:55:17 -08:00
Chris Forbes	566e0ddfd0	i965/Gen7: Only emit cube face enables for cubes. This is not observed to actually fix anything, but the PRM says this field must be zero for other surface types. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-19 11:22:34 +13:00
Chris Forbes	b0042f2c23	i965: Improve dumping of Gen7 SURFACE_STATE Previously this was missing many interesting fields. Having them decoded makes debugging views much easier. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-19 11:22:32 +13:00
Chris Forbes	9b5eda8544	i965: Add masks for more SURFACE_STATE fields Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2014-01-19 11:22:00 +13:00
Emil Velikov	66fd5057d3	nv50: drop obsolete check from error path At 'out_err' the nv50_context has been calloc-ated. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:45 +00:00
Emil Velikov	e1e30f6dfb	nv50: assert before trying to out-of-bounds access framebuffer.cbufs Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:41 +00:00
Emil Velikov	3805a864b1	nv50: assert before trying to out-of-bounds access samplers Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:37 +00:00
Emil Velikov	6a53b81086	nv50: assert before trying to out-of-bounds access textures Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:34 +00:00
Emil Velikov	19069803be	nv50: pass vtxbuf index as unsigned The index passed to the function is already unsigned, and internally we treat it as unsigned. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:28 +00:00
Emil Velikov	1773611c52	nv50: assert before trying to out-of-bounds access vtxbuf Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:24 +00:00
Emil Velikov	741e935a72	nv50: typecast the result of ffs() to unsigned Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:20 +00:00
Emil Velikov	5e130f2371	nv50: assert before trying to out-of-bounds access constbuf Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:15 +00:00
Emil Velikov	12e744abbb	nv50: access only the available amount of constbuf The textures array is defined as a number of NV50_MAX_PIPE_CONSTBUFS per shader stage. Currently the nv50 driver handles only 3 shader stages, thus we wreak havoc when accessing array-out-of-bounds. Cc: 9.1 9.2 10.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:17:09 +00:00
Emil Velikov	d606ca37eb	nv50: access only the available amount of textures The textures array is defined as a number of PIPE_MAX_SAMPLERS per shader stage. Currently the nv50 driver handles only 3 shader stages, thus we wreak havoc when accessing array-out-of-bounds. Fixes a segfault in piglit/bin/arb_texture_buffer_object-data-sync -fbo -auto Cc: 9.1 9.2 10.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-18 19:16:16 +00:00
Rob Clark	bf70c238a7	loader: fallback to drmGetVersion() for non-pci devices Use the kernel driver name as returned by drmGetVersion() for non-pci (platform) devices. Signed-off-by: Rob Clark <robclark@freedesktop.org> v2 (Emil): Rebased and tweaked commit message. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-18 18:52:07 +00:00
Emil Velikov	26458420d8	pipe-loader: add support for non-pci (platform) devices Culled out of the "loader: refactor duplicated code into loader util lib" patch by Rob Clark. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-01-18 18:52:07 +00:00
Emil Velikov	3d3ae75c86	pci_ids: do not include loader.h As per original approach by Rob, each user of the loader lib should include loader.h and the pci_id_driver_map.h header will be used exclusively by the loader. Add back the include guard __IS_LOADER and remove no longer needed include folder in the scons build. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-18 18:51:54 +00:00
Emil Velikov	8d4357b5ba	egl_dri2: use loader util lib Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-18 18:47:49 +00:00
Emil Velikov	a0a1c60fb0	pipe-loader: use loader util lib Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-01-18 18:47:49 +00:00
Emil Velikov	0e78c35234	st/egl: use loader util lib Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>	2014-01-18 18:47:48 +00:00
Emil Velikov	a980024224	egl-static: use loader util lib v2 * Drop the no longer used _EGL_NO_DRM from Android.mk. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-18 18:47:48 +00:00
Emil Velikov	fae0dfa59b	gbm: use the loader util lib Additionally this commit removes the following exported functions _gbm_udev_device_new_from_fd() _gbm_fd_get_device_name() _gbm_log() All three were erroneously marked as exported since their inception. Neither of them has ever been a part of the API thus there should be no users of them. Cc: Chad Versace <chad.versace@linux.intel.com> Cc: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-18 18:47:48 +00:00
Emil Velikov	eac776cf77	glx: use the loader util lib v2 * Set logger to ErrorMessageF. Spotted by Kristian Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-18 18:47:48 +00:00
Emil Velikov	8c2e7fd846	loader: introduce the loader util lib All the various window system integration layers duplicate roughly the same code for figuring out device and driver name, pci-id's, etc. Which is sad. So extract it out into a loader util lib. v2 (Emil) * Separate the introduction of libloader from the code de-duplication. * Strip out non-pci devices support. * Add scons + Android build system support. * Add VISIBILITY_CFLAGS to avoid exporting the loader funcs. v3 (Emil) * PIPE_OS_ANDROID is undefined at this scope, use ANDROID * Make sure we define _EGL_NO_DRM when building only swrast Signed-off-by: Rob Clark <robclark@freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-18 18:47:27 +00:00
Kenneth Graunke	1c5e2965a0	i965: Remove CACHED_BATCH support altogether. Using an unoptimized variant of glamor spending 50% of its CPU time in brw_draw_prims() (and hitting the cache very frequently): N Min Max Median Avg Stddev x 200 29200 40500 34900 34750 958.43256 + 200 31000 40300 34700 34622 916.35941 No difference proven at 95.0% confidence Similarly, no difference on GLB2.7: N Min Max Median Avg Stddev x 63 64.1 71.36 70.69 70.113175 1.6782026 + 63 63.6 71.18 70.75 70.223651 1.6044186 No difference proven at 95.0% confidence v2: Rebase on master (by anholt) v3: Add a missing BEGIN_BATCH(3) to aa_line_parameters -- CACHED_BATCH didn't have the asserts about batchbuffer usage that ADVANCE_BATCH does, so we started assertion failing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-17 13:21:11 -08:00
Eric Anholt	746e3e3b3a	i965: Replace 8-wide and 16-wide with SIMD8 and SIMD16. Those are the terms used in the docs, and I think "n-wide" was something I just happened to say. Note that shader-db needs updating for the INTEL_DEBUG=fs parsing. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-17 12:58:43 -08:00
Eric Anholt	26a3bf5c72	i965: Stop doing our optimization on a copy of the GLSL IR. The original intent was that we'd keep a driver-private copy, and there would be the normal copy for swrast to make use of without the tuning (or anything more invasive we might do) specific to i965. Only, we don't generate swrast code any more, because swrast can't render current shaders anyway. Thus, our private copy is rather a waste, and we can just do our backend-specific operations on the linked shader. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-17 12:58:37 -08:00
José Fonseca	8771285054	s/Tungsten Graphics/VMware/ Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the old copyright name is creating unnecessary confusion, hence this change. This was the sed script I used: $ cat tg2vmw.sed # Run as: # # git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed # # Rename copyrights s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g /Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./ s/TUNGSTEN GRAPHICS/VMWARE/g # Rename emails s/alanh@tungstengraphics.com/alanh@vmware.com/ s/jens@tungstengraphics.com/jowen@vmware.com/g s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/ s/jrfonseca\?@tungstengraphics.com/jfonseca@vmware.com/g s/keithw\?@tungstengraphics.com/keithw@vmware.com/g s/michel@tungstengraphics.com/daenzer@vmware.com/g s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/ s/zack@tungstengraphics.com/zackr@vmware.com/ # Remove dead links s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g # C string src/gallium/state_trackers/vega/api_misc.c s/"Tungsten Graphics, Inc"/"VMware, Inc"/ Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-17 20:00:32 +00:00
José Fonseca	27307a73e5	trace: Re-license trace.xsl under MIT license. I was the sole author, as Tungsten Graphics employee, which was since then acquired by VMware Inc. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-17 20:00:32 +00:00
Brian Paul	3618ac4f20	svga: fix crash when clearing null color buffer Fixes regression since `9baa45f78b` but some of the piglit fbo-drawbuffers-none tests still don't pass. v2: use the right pointer type for 'h' Reviewed-by: José Fonseca <jfonseca@vmware.com>	2014-01-17 08:52:37 -08:00
Brian Paul	d6fa71fbb0	llvmpipe: handle NULL color buffer pointers Fixes regression from `9baa45f78b` v2: incorporate a few small changes suggested by Roland. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-17 08:52:11 -08:00
Brian Paul	7b4ceec0b7	softpipe: handle NULL color buffer pointers Fixes regression from `9baa45f78b` Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-17 08:52:11 -08:00
Roland Scheidegger	3b64714da4	llvmpipe: fix large point rasterization with point_quad_rasterization The whole round-pointsize-to-int stuff must only be done with GL legacy rules (no point_quad_rasterization) or all the wrong edges are lit up. This was previously in a private branch (d3d pointsprite test complains loudly otherwise) and got lost in a merge. However, it should certainly apply to GL point sprite rasterization as well. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-01-17 17:01:01 +01:00
Roland Scheidegger	4b9bcf31f4	gallium: add bits for clipping points as tris (d3d-style) OpenGL does whole-point clipping, that is a large point is either fully clipped or fully unclipped (the latter means it may extend beyond the viewport as long as the center is inside the viewport). d3d9 (d3d10 has no large points) however requires points to be clipped after they are expanded to a rectangle. (Note some IHVs are known to ignore GL rules at least with some hw/drivers.) Hence add a rasterizer bit indicating which way points should be clipped (some drivers probably will always ignore this), and add the draw interaction this requires. Drivers wanting to support this and using draw must support large points on their own as draw doesn't implement vp clipping on the expanded points (it potentially could but the complexity doesn't seem warranted), and the driver needs to do viewport scissoring on such points. Conflicts: src/gallium/drivers/llvmpipe/lp_context.c src/gallium/drivers/llvmpipe/lp_state_derived.c Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-01-17 17:01:01 +01:00
Ilia Mirkin	739dc95e67	mesa: fix GL_COLOR_SUM enum for drivers without ARB_vertex_program Commit `c13970808` (mesa: GL_EXT_secondary_color is not optional) changed CHECK_EXTENSION2(EXT_secondary_color, ARB_vertex_program, cap) to CHECK_EXTENSION(ARB_vertex_program, cap) However CHECK_EXTENSION2 checks that either extension is available, not both. Remove the extension check entirely since the intent was for it to always be enabled. v2: Fix glGet*(GL_COLOR_SUM) too. Suggested by Ian. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: 9.2 10.0 <mesa-stable@lists.freedesktop.org>	2014-01-16 16:42:33 -08:00
Zack Rusin	93b953d139	llvmpipe: do constant buffer bounds checking in shaders It's possible to bind a smaller buffer as a constant buffer, than what the shader actually uses/requires. This could cause nasty crashes. This patch adds the architecture to pass the maximum allowable constant buffer index to the jit to let it make sure that the constant buffer indices are always within bounds. The behavior follows the d3d10 spec, which says the overflow should always return all zeros, and overflow is only defined as access beyond the size of the currently bound buffer. Accesses beyond the declared shader constant register size are not considered an overflow and expected to return garbage but consistent garbage (we follow the behavior which some wlk tests expect which is to return the actual values from the bound buffer). Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-16 16:33:57 -05:00
Ilia Mirkin	dd687fb8d0	nv50, nvc0: initialize ctx->sample_mask to ~0 Commit `95bf222603` (cso_context: Fix cso_context::sample_mask initial value.) fixed the cso sample mask to be initialized to ~0. The cso code is also careful not to needlessly call set_sample_mask, so we ended up with the ctx->sample_mask never being set. This broke a number of EXT_framebuffer_multisample piglit tests. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2014-01-16 19:26:05 +01:00
Aaron Watry	188383591d	mesa/main: Free ctx->DrawIndirectBuffer during teardown ctx->DrawIndirectBuffer wasn't being free'd in _mesa_free_buffer_objects With this patch, "valgrind --leak-check=full glxgears" on evergreen (CEDAR) now shows: LEAK SUMMARY: definitely lost: 0 bytes in 0 blocks indirectly lost: 0 bytes in 0 blocks possibly lost: 0 bytes in 0 blocks still reachable: 70,228 bytes in 651 blocks suppressed: 0 bytes in 0 blocks Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-16 10:10:04 -06:00
Aaron Watry	ce3528896b	st/dri: prevent leak of dri option default values v2: Change comment style CC: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-16 10:10:04 -06:00
Aaron Watry	5ac3229f76	radeon: Move gfx/dma cs cleanup to r600_common_context_cleanup The radeonsi code was not cleaning up either of these items leading to leaked memory. v2: Move cleanup to r600_common_context_cleanup instead of duplicating the logic for SI CC: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-16 10:10:04 -06:00
Ian Romanick	a05c596a00	mesa: Eliminate parameters to dd_function_table::Scissor The i830 and i915 drivers used them, but they didn't really need to. They will just be annoying in future patches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-15 10:02:48 -08:00
Ian Romanick	6dbab6b2bb	mesa: Eliminate parameters to dd_function_table::DepthRange No driver uses them. They will just be annoying in future patches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-15 10:02:48 -08:00
Ian Romanick	065bd6ffc2	mesa: Eliminate parameters to dd_function_table::Viewport No driver uses them. They will just be annoying in future patches. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-15 10:02:48 -08:00
Ian Romanick	fbc0c9a553	radeon: Remove dead code A future patch will rename some of the fields of gl_viewport_attrib, and I don't want to update dead code that I can't test. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Dave Airlie <airlied@redhat.com>	2014-01-15 10:02:47 -08:00
Ian Romanick	4fcdb75268	i915: Remove spurious calls to DepthRange For both i830 and i915, the driver DepthRange function just calls intelCalcViewport. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Eric Anholt <eric@anholt.net>	2014-01-15 10:02:47 -08:00
Ian Romanick	0a75909b3f	mesa: Add COMPRESSED_RGBA_S3TC_DXT1_EXT to COMPRESSED_TEXTURE_FORMATS for GLES The ES and desktop GL specs diverge here. Yay! In desktop OpenGL, the driver can perform online compression of uncompressed texture data. GL_NUM_COMPRESSED_TEXTURE_FORMATS and GL_COMPRESSED_TEXTURE_FORMATS give the application a list of formats that it could ask the driver to compress with some expectation of quality. The GL_ARB_texture_compression spec calls this "suitable for general-purpose usage." As noted above, this means GL_COMPRESSED_RGBA_S3TC_DXT1_EXT is not included in the list. In OpenGL ES, the driver never performs compression. GL_NUM_COMPRESSED_TEXTURE_FORMATS and GL_COMPRESSED_TEXTURE_FORMATS give the application a list of formats that the driver can receive from the application. It is the complete list of formats. The GL_EXT_texture_compression_s3tc spec says: "New State for OpenGL ES 2.0.25 and 3.0.2 Specifications The queries for NUM_COMPRESSED_TEXTURE_FORMATS and COMPRESSED_TEXTURE_FORMATS include COMPRESSED_RGB_S3TC_DXT1_EXT, COMPRESSED_RGBA_S3TC_DXT1_EXT, COMPRESSED_RGBA_S3TC_DXT3_EXT, and COMPRESSED_RGBA_S3TC_DXT5_EXT." Note that the addition is only to the OpenGL ES specification! Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> See-also: http://lists.freedesktop.org/archives/mesa-dev/2013-October/047439.html Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Brian Paul <brianp@vmware.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2014-01-15 10:02:47 -08:00
Brian Paul	bf27d02390	scons: add new shaderimage.c file to the build	2014-01-15 09:17:04 -07:00
Francisco Jerez	bd62666224	clover: Fix clover::keys and ::values to deal with r-value references properly. Returning a reference is incorrect if the specified pair was a temporary -- Instead of that, use decltype() to deduce the correct return type qualifiers. Fixes a crash in clCreateProgramWithBinary(). Reported-and-tested-by: "Dorrington, Albert" <albert.dorrington@lmco.com>	2014-01-15 16:48:37 +01:00
Francisco Jerez	5662602ba0	clover: Don't try to build programs created from a binary again. According to the spec it's allowed to call clBuildProgram() on a program created from a user-specified binary. We don't need to do anything to build the program in that case. Reported-and-tested-by: "Dorrington, Albert" <albert.dorrington@lmco.com>	2014-01-15 16:48:05 +01:00
Francisco Jerez	5195f1d9c6	clover: Add missing fields to the clover::module serialization code. Tested-by: "Dorrington, Albert" <albert.dorrington@lmco.com>	2014-01-15 16:46:12 +01:00
Francisco Jerez	efcc84f425	clover: Store map result into a temporary vector in clCreateProgramWithBinary. This avoids the inefficient multiple evaluation of the map result in the code below. It should cause no functional changes. Tested-by: "Dorrington, Albert" <albert.dorrington@lmco.com>	2014-01-15 16:45:05 +01:00
Francisco Jerez	83db4a30b8	docs: Mark ARB_shader_image_load_store as work in progress. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:08 +01:00
Francisco Jerez	647344bf3e	mesa: Validate image units when the texture state changes. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:08 +01:00
Francisco Jerez	ace31f4bc0	mesa: Unbind deleted textures from the shader image units. From ARB_shader_image_load_store: If a texture object bound to one or more image units is deleted by DeleteTextures, it is detached from each such image unit, as though BindImageTexture were called with <unit> identifying the image unit and <texture> set to zero. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:08 +01:00
Francisco Jerez	902f9df36b	mesa: Add image parameter queries for ARB_shader_image_load_store. v2: Fix off-by-one error in index parameter bound checking. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:08 +01:00
Francisco Jerez	eb0de7c432	mesa: Add ARB_shader_image_load_store to the extension table. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:08 +01:00
Francisco Jerez	a167e354e7	glapi: Update dispatch XML files for ARB_shader_image_load_store. And uncomment the relevant lines of the dispatch sanity test. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:08 +01:00
Francisco Jerez	bcc49e17ff	mesa: Implement the GL entry points defined by ARB_shader_image_load_store. v2: Name image format classes consistently, fix array and 3D teximage selection with layered = GL_FALSE, make sure that the user-specified layer is less than the number of texture layers, add some asserts. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:07 +01:00
Francisco Jerez	7510c10209	mesa: Add MESA_FORMAT_SIGNED_RG88 and _RG1616. Including pack/unpack and texstore code. ARB_shader_image_load_store requires support for the GL_RG8_SNORM and GL_RG16_SNORM formats, which map to MESA_FORMAT_SIGNED_GR88 and MESA_FORMAT_SIGNED_GR1616 on little-endian hosts, and MESA_FORMAT_SIGNED_RG88 and MESA_FORMAT_SIGNED_RG1616 respectively on big-endian hosts -- only the former were already present, add support for the latter. Acked-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:07 +01:00
Francisco Jerez	87942749a3	mesa: Add MESA_FORMAT_ABGR2101010. Including pack/unpack and texstore code. This texture format is a requirement for ARB_shader_image_load_store. Acked-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:07 +01:00
Francisco Jerez	16070716bc	mesa: Add driver interface for ARB_shader_image_load_store. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:07 +01:00
Francisco Jerez	7a98741ef2	mesa: Add state data structures required for ARB_shader_image_load_store. v2: Increase MAX_IMAGE_UNITS to what i965 wants and add a separate MAX_IMAGE_UNIFORMS define, clarify a couple of comments. Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2014-01-15 16:42:07 +01:00
Francisco Jerez	d9b0b4e960	mesa: Define helper function to get the number of texture layers. And to check if it can have layers at all. This will be used by the implementation of ARB_shader_image_load_store. v2: Fix constness of texobj argument, use assert and return reasonable default rather than calling unreachable() in default switch case. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-15 16:42:07 +01:00
Emil Velikov	bfcf78c110	st/mesa: use signed temporary variable to store _ColorDrawBufferIndexes The temporary variable used to store _ColorDrawBufferIndexes must be signed (GLint), otherwise the following conditional will be incorrectly evaluated, leading to crashes in the driver/mesa or reads/writes to arbitrary memory locations. The bug dates back to 2009. Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-01-15 14:33:28 +00:00
Emil Velikov	3515a648a9	automake: include the git sha in the opengl version string for oot builds Acked-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-01-15 14:32:24 +00:00
Emil Velikov	10368e1446	mesa: use signed temporary variable to store _ColorDrawBufferIndexes _ColorDrawBufferIndexes is defined as GLint* and using a GLuint* will result in the first part of the conditional always evaluating to true. Unintentionally introduced by the following commit, this will result in a driver segfault if one is using an old version of the piglit test bin/clearbuffer-mixed-format -auto -fbo commit `03d848ea10` Author: Marek Olšák <marek.olsak@amd.com> Date: Wed Dec 4 00:27:20 2013 +0100 mesa: fix interpretation of glClearBuffer(drawbuffer) The corresponding piglit tests supported this incorrect behavior instead of pointing it out. Cc: Marek Olšák <marek.olsak@amd.com> Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2014-01-15 14:31:04 +00:00
Ilia Mirkin	716b512dcf	nouveau: add framebuffer validation callback Fixes assertions when trying to attach textures to fbs with formats not supported by the render engines. See https://bugs.freedesktop.org/show_bug.cgi?id=73459 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2014-01-15 12:12:00 +01:00
Francisco Jerez	e457aca7fa	clover: Use cl_ulong in the maximum allocation size calculation to avoid overflow.	2014-01-14 22:10:24 +01:00
Kenneth Graunke	8c4a9f631d	i965: Emit 3DSTATE_VF on Broadwell too. It's not just for Haswell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-14 00:59:03 -08:00
Kenneth Graunke	eadabec4cd	i965: Disable workaround flush for push constants on Broadwell. If it wasn't necessary for Haswell, it's likely not to be necessary for Broadwell either. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-14 00:59:03 -08:00
Kenneth Graunke	8618407d15	i965: Enable native ETC texture support on Broadwell. Broadwell, like Baytrail, has native ETC texture support. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-14 00:58:17 -08:00
Chia-I Wu	fa772aa92b	ilo: handle NULL renderbuffers correctly Renderbuffers may be NULL since `9baa45f78b`.	2014-01-14 16:27:57 +08:00
Chia-I Wu	7fdab3b201	ilo: disable HiZ for misaligned levels We need to disable HiZ for non-8x4 aligned levels, except for level 0, layer 0. For the very first layer we can adjust Width and Height fields of 3DSTATE_DEPTH_BUFFER to make it aligned. Specifically, add ILO_TEXTURE_HIZ and set the flag only for properly aligned levels. ilo_texture_can_enable_hiz() is updated to check for the flag. In tex_layout_validate(), align the depth bo to 8x4 so that we can adjust Width/Height of 3DSTATE_DEPTH_BUFFER without introducing out-of-bound access. Finally in rectlist blitter, add the ability to adjust 3DSTATE_DEPTH_BUFFER.	2014-01-14 15:43:20 +08:00
Chia-I Wu	18645d1533	ilo: use a helper to determine if HiZ is enabled Add ilo_texture_can_enable_hiz and replace all checks for tex->hiz.bo by calls to ilo_texture_can_enable_hiz().	2014-01-14 15:43:20 +08:00
Chia-I Wu	1427c3f79f	ilo: decide on hiz first in texture allocation Add tex_layout_init_hiz() before tex_layout_init_format() to decide whether HiZ should be enabled. On GEN6, because of layer offsetting, HiZ is enabled only when the texture is non-mipmapped and non-array. PIPE_USAGE_STAGING is also taken as a hint to disable HiZ.	2014-01-14 15:43:20 +08:00
Chia-I Wu	194a61cd39	ilo: emit gen7_wa_pipe_control_wm_max_threads_stall on Haswell Rename the workaround, as it is for 3DSTATE_PS instead of 3DSTATE_WM, and emit it on Haswell too. This does not fix any app, but an assertion failure.	2014-01-14 15:43:19 +08:00
Chia-I Wu	c6605c51de	ilo: use HALIGN_4 on GEN7 for depth buffers The comment was no longer true since `6642381e75`.	2014-01-14 15:42:53 +08:00
Chia-I Wu	e90e3e39c2	ilo: OOM for HiZ is fatal on GEN6 On GEN6, HiZ and Separate Stencil Buffer must be enabled at the same time.	2014-01-14 15:19:41 +08:00
Chia-I Wu	5b1c516080	ilo: fix a HiZ bo leakage Dereference the HiZ bo when the texture is destroyed.	2014-01-14 15:19:41 +08:00
Chia-I Wu	af57378e59	ilo: simplify ilo_texture_set_slice_flags() Call ilo_texture_get_slice() for the last slice so that we can get rid of the duplicated assert().	2014-01-14 15:19:41 +08:00
Vinson Lee	8f9b70fa3c	egl-static: Fix build error. Fix build regression introduced with commit `786af2f963`. egl_pipe.c:46:38: fatal error: radeonsi/radeonsi_public.h: No such file or directory #include "radeonsi/radeonsi_public.h" ^ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73578 Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2014-01-13 15:54:26 -08:00
Andreas Hartmetz	aa7ae4fd6e	radeonsi: Rename the commonly occurring rscreen variable. The "r" stands for R600. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:14 +01:00
Andreas Hartmetz	8662e66bf2	radeonsi: Rename the commonly occurring rctx/r600 variables. The "r" stands for R600. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:14 +01:00
Andreas Hartmetz	44d27ce2b2	radeonsi: Rename r600_trace_emit->si_trace_emit. I had previously considered that unsafe. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	0b57fc15e1	radeonsi: Rename R600->SI in some remaining defines. I had previously considered that unsafe. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	1b79764f49	radeonsi: Rename radeonsi->si remaining identifiers in si_uvd.c. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	b902298615	radeonsi: Rename r600->si remaining identifiers in si_state_draw.c. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	3a4b87511e	radeonsi: Rename r600->si remaining identifiers in si_resource.c. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	5d068f734c	radeonsi: Rename r600->si remaining identifiers in si_query.c. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	eb0ddb6d5b	radeonsi: Rename r600->si remaining identifiers in si_pipe.c. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	238427625f	radeonsi: Rename r600->si remaining identifier in si_hw_context.c. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	3160aa4877	radeonsi: Rename radeonsi->si remaining identifiers in si_compute.c. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	7b7eb4dd1f	radeonsi: Rename r600->si remaining identifiers in si_blit.c. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	45578def71	radeonsi: Rename r600->si for functions in si_pipe.h. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	280c360c02	radeonsi: Rename r600->si for functions in si.h. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	f2a21ed8b9	radeonsi: Rename r600->si for functions in si_resource.h. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	a88f46bc9b	radeonsi: Rename r600->si for structs in si_resource.h. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	3e81883a42	radeonsi: Rename r600->si for structs in si.h. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	238aeabce0	radeonsi: Rename r600->si for structs in si_pipe.h. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Andreas Hartmetz	786af2f963	radeonsi: Apply si_* file naming scheme. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2014-01-14 00:07:13 +01:00
Michał Górny	5ea2376334	Use AC_PATH_TOOL instead of AC_PATH_PROG for llvm-config. This should help with cross-compiling and multilib when $CHOST-specific llvm-config is expected rather than build host default one. It will help us a bit in Gentoo where we've started using i686-pc-linux-gnu-llvm-config for 32-bit multilib LLVM. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Michał Górny <mgorny@gentoo.org> Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=73100 CC: "10.0" <mesa-stable@lists.freedesktop.org>	2014-01-13 14:37:55 -08:00
Tom Stellard	6a19bb56e0	configure: Disable xvmc by default The xvmc unit tests are failing on r300g and r600g. Reviewed-by: Vinson Lee <vlee@freedesktop.org>	2014-01-13 14:37:55 -08:00
Kenneth Graunke	277dbf08b0	glsl: Remove exec_list iterators now that nothing uses them. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:49:47 -08:00
Kenneth Graunke	826d9fb8c0	glsl: Replace iterators in ir_reader.cpp with ad-hoc list walking. These can't use foreach_list since they want to skip over the first few list elements. Just doing the ad-hoc list walking isn't too bad. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:49:45 -08:00
Kenneth Graunke	48d0faaa43	glsl: Use a new foreach_two_lists macro for walking two lists at once. When handling function calls, we often want to walk through the list of formal parameters and list of actual parameters at the same time. (Both are guaranteed to be the same length.) Previously, we used a pattern of: exec_list_iterator 1st_iter = <1st list>.iterator(); foreach_iter(exec_list_iterator, 2nd_iter, <2nd list>) { ... 1st_iter.next(); } This was awkward, since you had to manually iterate through one of the two lists. This patch introduces a foreach_two_lists macro which safely walks through two lists at the same time, so you can simply do: foreach_two_lists(1st_node, <1st list>, 2nd_node, <2nd list>) { ... } v2: Rename macro from foreach_list2 to foreach_two_lists, as suggested by Ian Romanick. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:49:42 -08:00
Kenneth Graunke	02ff2a2758	glsl: Statically cast parameter exec_node to ir_variable. Formal function parameters are always ir_variable objects, not an arbitrary ir_instruction. So there's no need to dynamically cast here. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:38:19 -08:00
Kenneth Graunke	8050584096	glsl: Cast ir_call parameters to ir_rvalue, not ir_instruction. A function call's parameters are always rvalues. ir_rvalue may not always be a subclass of ir_instruction in the future, so we should use the right one. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:38:19 -08:00
Kenneth Graunke	2e113dfab8	glsl: Replace foreach_iter and iter.remove() with foreach_list_safe. foreach_list_safe allows you to safely remove the current node. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:38:19 -08:00
Kenneth Graunke	838a6871bb	glsl: Convert piles of foreach_iter to foreach_list_safe. In these cases, we edit the list (or at least might be), so we use the foreach_list_safe variant. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:38:19 -08:00
Kenneth Graunke	5f7e778fa1	glsl: Convert piles of foreach_iter to the newer foreach_list macro. foreach_iter and exec_list_iterators have been deprecated for some time now; we just hadn't ever bothered to convert code to the newer foreach_list and foreach_list_safe macros. In these cases, we aren't editing the list, so we can use foreach_list rather than foreach_list_safe. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-13 11:38:19 -08:00
Paul Berry	fb6d9798a0	i965: Ensure that all necessary state is re-emitted if we run out of aperture. Prior to this patch, if we ran out of aperture space during brw_try_draw_prims(), we would rewind the batch buffer pointer (potentially throwing away some state that may have been emitted by brw_upload_state()), flush the batch, and then try again. However, we wouldn't reset the dirty bits to the state they had before the call to brw_upload_state(). As a result, when we tried again, there was a danger that we wouldn't re-emit all the necessary state. (Note: prior to the introduction of hardware contexts, this wasn't a problem because flushing the batch forced all state to be re-emitted). This patch fixes the problem by leaving the dirty bits set at the end of brw_upload_state(); we only clear them after we have determined that we don't need to rewind the batch buffer. Cc: 10.0 9.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-13 09:44:39 -08:00
Marek Olšák	df918b5b90	r600g: fix glClearBuffer by handling PIPE_CLEAR_COLORi flags correctly also restructure the code	2014-01-13 15:48:08 +01:00
Marek Olšák	6e98a17551	r600g: handle NULL colorbuffers correctly on R600-R700	2014-01-13 15:48:08 +01:00
Marek Olšák	07032d4068	r600g: handle NULL colorbuffers correctly on Evergreen	2014-01-13 15:48:08 +01:00
Marek Olšák	a86de9a72f	radeonsi: handle NULL colorbuffers correctly Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-01-13 15:48:08 +01:00
Marek Olšák	9677cfab32	gallium/util: easy fixes for NULL colorbuffers Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-13 15:48:08 +01:00
Marek Olšák	9baa45f78b	st/mesa: bind NULL colorbuffers as specified by glDrawBuffers An example why it is required: Let's say there's a fragment shader writing to gl_FragData[0..1]. The user calls: glDrawBuffers(2, {GL_NONE, GL_COLOR_ATTACHMENT0}); That means gl_FragData[0] is unused and gl_FragData[1] is written to GL_COLOR_ATTACHMENT0. st/mesa was skipping the GL_NONE draw buffer, therefore gl_FragData[0] was written to GL_COLOR_ATTACHMENT0, which was wrong. This commit fixes it, but drivers must also be fixed not to crash when binding NULL colorbuffers. There is also a new set of piglit tests for this. The MSAA state also had to be fixed not to crash when reading fb->cbufs[0]. Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-13 15:48:07 +01:00
Marek Olšák	9bf9578c1b	mesa: handle GL_NONE draw buffers correctly in glClear Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-13 15:48:07 +01:00
Marek Olšák	4e549ddb50	st/mesa: use sRGB formats for MSAA resolving if destination is sRGB Copied from the i965 driver, including the big comment. Cc: 9.2 10.0 <mesa-stable@lists.freedesktop.org>	2014-01-13 15:48:07 +01:00
Marek Olšák	355686a69f	st/mesa: check depth and stencil writemask before clearing	2014-01-13 15:25:31 +01:00
Marek Olšák	9ea3f88c0a	st/mesa: always prefer pipe->clear over clear_with_quad (v2) v2: clear depth and stencil together	2014-01-13 15:25:31 +01:00
Martin Andersson	c156d24525	st/egl: Flush resources before presentation Fixes wayland regression on r600g due to fast clear introduced by commit `edbbfac6`. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2014-01-13 15:25:31 +01:00
Tapani Pälli	99abb87c63	dri: set yInverted default to GL_TRUE yInverted is used by EGL_NOK_texture_from_pixmap to indicate that window system rendering is y-inverted compared to OpenGL texture representation. This extension is only known to be used with X11 window system where sane default is GL_TRUE. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73371 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-13 08:00:37 +02:00
Tapani Pälli	f8c5b8a17d	egl_dri2: call dri2_add_configs_for_visuals after extensions set dri2_add_config makes decisions based on NOK_texture_from_pixmap so it needs to be enabled before calling dri2_add_configs_for_visuals. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-13 07:59:56 +02:00
Ian Romanick	2dc35a619c	mesa: Set the correct error in _mesa_BeginConditionalRender Piglit was recently changed to expect the correct error code (piglit commit 271b998), so it started failing on Mesa. This corrects that failure and adds some spec quotations to justify the errors set. The code was rearranged a little bit to match the order listed in the spec. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-10 17:19:48 -08:00
Kenneth Graunke	db1dc21a75	i965: Delete duplicate write_timestamp function. brw_queryobj.c needs a version of write_timestamp that works on all generations for the QueryCounter() driver hook. So there's no point in duplicating it in gen6_queryobj.c. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2014-01-10 15:35:01 -08:00
Paul Berry	532b1fecd9	i965: Fix clears of layered framebuffers with mismatched layer counts. Previously, Mesa enforced the following rule (from ARB_geometry_shader4's list of criteria for framebuffer completeness): * If any framebuffer attachment is layered, all attachments must have the same layer count. For three-dimensional textures, the layer count is the depth of the attached volume. For cube map textures, the layer count is always six. For one- and two-dimensional array textures, the layer count is simply the number of layers in the array texture. { FRAMEBUFFER_INCOMPLETE_LAYER_COUNT_ARB } However, when ARB_geometry_shader4 was adopted into GL 3.2, this rule was dropped; GL 3.2 permits different attachments to have different layer counts. This patch brings Mesa in line with GL 3.2. In order to ensure that layered clears properly clear all layers, we now have to keep track of the maximum number of layers in a layered framebuffer. Fixes the following piglit tests in spec/!OpenGL 3.2/layered-rendering: - clear-color-all-types 1d_array mipmapped - clear-color-all-types 1d_array single_level - clear-color-mismatched-layer-count - framebuffer-layer-count-mismatch Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-10 05:58:49 -08:00
Paul Berry	28af1dc217	main: check texture target when validating layered framebuffers. From section 4.4.4 (Framebuffer Completeness) of the GL 3.2 spec: If any framebuffer attachment is layered, all populated attachments must be layered. Additionally, all populated color attachments must be from textures of the same target. We weren't checking that the attachments were from textures of the same target. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-10 05:58:46 -08:00
Chad Versace	90368875e7	i965/gen6/blorp: Remove redundant HiZ workaround Commit `1a92881` added extra flushes to fix a HiZ hang in WebGL Google Maps. With the extra flushes emitted by the previous two patches, the flushes added by `1a92881` are redundant. Tested with the same criteria as in `1a92881`: by zooming in and out continuously for 2 hours on Sandybridge Chrome OS (codename Stumpy) without a hang. CC: Kenneth Graunke <kenneth@whitecape.org> CC: Stéphane Marchesin <marcheu@chromium.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-09 15:02:45 -08:00
Chad Versace	6a5c86f486	i965/gen6/blorp: Set need_workaround_flush at top of blorp Unconditionally set brw->need_workaround_flush at the top of gen6 blorp state emission. The art of emitting workaround flushes on Sandybridge is mysterious and not fully understood. Ken and I believe that intel_emit_post_sync_nonzero_flush() may be required when switching from regular drawing to blorp. This is an extra safety measure to prevent undiscovered difficult-to-diagnose gpu hangs. I verified that on ChromeOS, pre-patch, need_workaround_flush was not set at the top of blorp, as Paul expected. To verify, I inserted the following debug code at the top of gen6_blorp_exec(), restarted the ui, and inspected the logs in /var/log/ui. The abort gets triggered so early that the browser never appears on the display. static void gen6_blorp_exec(...) { if (!brw->need_workaround_flush) { fprintf(stderr, "chadv: %s:%d\n", __FILE__, __LINE__); abort(); } ... } CC: Kenneth Graunke <kenneth@whitecape.org> CC: Stéphane Marchesin <marcheu@chromium.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-09 15:02:39 -08:00
Chad Versace	5e0cd58de4	i965/gen6/blorp: Set need_workaround_flush immediately after primitive This patch makes the workaround code in gen6 blorp follow the pattern established in the regular draw path. It shouldn't result in any behavioral change. On gen6, there are two places where we emit 3D_CMD_PRIM: brw_emit_prim() and gen6_blorp_emit_primitive(). brw_emit_prim() sets need_workaround_flush immediately after emitting the primitive, but blorp does not. Blorp sets need_workaround_flush at the bottom of brw_blorp_exec(). This patch moves the need_workaround_flush from brw_blorp_exec() to gen6_blorp_emit_primitive(). There is no need to set need_workaround_flush in gen7_blorp_emit_primitive() because the workaround applies only to gen6. Reviewed-by: Paul Berry <stereotype441@gmail.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2014-01-09 15:02:36 -08:00
Carl Worth	3587fbc586	docs: Import 10.0.2 release notes, add news item.	2014-01-09 12:05:53 -08:00
Brian Paul	513a324b88	mesa: add missing SNORM formats in _mesa_base_fbo_format() We weren't handling the LUMINANCE_SNORM, LUMINANCE_ALPHA_SNORM and INTENSITY_SNORM cases. Note that adding these cases here does not require a driver to support rendering to these surface types. If the driver can't do it we'll report an incomplete framebuffer. NVIDIA doesn't support GL_EXT_texture_snorm but their driver accepts these formats in glRenderBufferStorage(). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2014-01-09 11:35:52 -07:00
Brian Paul	689ec8dfb2	mesa: remove dead geom shader code I doubt the swrast-based drivers will ever support GS. Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-09 11:35:52 -07:00
Brian Paul	c47207d517	docs: minor updates to VMware SVGA3D driver page Signed-off-by: Brian Paul <brianp@vmware.com>	2014-01-09 11:35:50 -07:00
Brian Paul	d046fd731a	mesa: check bits per channel for GL_RGBA_SIGNED_COMPONENTS_EXT query If a channel has zero bits it's not signed. v2: also check for luminance and intensity format bits. Bruce Merry's proposed piglit test hits the luminance case. Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=73096 Cc: 10.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-09 11:35:50 -07:00
Brian Paul	0fc8d7c66e	mesa: check for MESA_FORMAT_RGB9_E5_FLOAT in _mesa_is_format_signed() This packed floating point format only stores positive values. Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=73096 Cc: 10.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-09 11:35:50 -07:00
Brian Paul	d81d263eeb	st/mesa: fix breakage from gl_constant::Program[] change	2014-01-09 11:35:13 -07:00
Paul Berry	8668eaaa00	mesa: Use functions to convert gl_shader_stage to PROGRAM enum or pipe target. Suggested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Improve assert message.	2014-01-09 09:31:27 -08:00
Paul Berry	e654216ac7	main: Change init_program_limits() to use gl_shader_stage. This allows the caller to execute it in a loop rather than hand-rolling a separate call for each stage. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-09 09:31:23 -08:00
Paul Berry	bce8bc0b25	glsl: Index into ctx->Const.Program[] rather than using ad-hoc code. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-09 09:31:19 -08:00
Paul Berry	b539385789	mesa: Index into ctx->Const.Program[] rather than using ad-hoc code. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-09 09:31:16 -08:00
Paul Berry	84732a982c	mesa: replace ctx->Const.{Vertex,Fragment,Geometry}Program with an array. These are replaced with ctx->Const.Program[MESA_SHADER_{VERTEX,FRAGMENT,GEOMETRY}]. In patches to follow, this will allow us to replace a lot of ad-hoc logic with a variable index into the array. With the exception of the changes to mtypes.h, this patch was generated entirely by the command: find src -type f '(' -iname '*.c' -o -iname '*.cpp' -o -iname '*.py' \ -o -iname '*.y' ')' -print0 | xargs -0 sed -i \ -e 's/Const\.VertexProgram/Const.Program[MESA_SHADER_VERTEX]/g' \ -e 's/Const\.GeometryProgram/Const.Program[MESA_SHADER_GEOMETRY]/g' \ -e 's/Const\.FragmentProgram/Const.Program[MESA_SHADER_FRAGMENT]/g' Suggested-by: Brian Paul <brianp@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-09 09:31:01 -08:00
José Fonseca	9b96be595b	llvmpipe: Honour pipe_rasterizer::point_quad_rasterization. Commit `eda21d2a30` fixed the rasterization of points for Direct3D but ended up breaking the rasterization of OpenGL non-sprite points, in particular conform's pntrast.c test. The only way to get both working is to properly honour pipe_rasterizer::point_quad_rasterization, and follow the weird OpenGL rule when it is false. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-09 12:35:11 +00:00
Eric Anholt	f46563fe1c	i965: Don't do the temporary-and-blit-copy for INVALIDATE_RANGE maps. We definitely want to fall through to the unsynchronized map case, instead of wasting bandwidth on a copy. Prevents a -43.2407% +/- 1.06113% (n=49) performance regression on aa10perf when teaching glamor to provide the GL_INVALIDATE_RANGE_BIT information. This is a performance fix, which I usually wouldn't cherry-pick to stable. But this really was just a bug in the code, its presence would discourage developers from giving us the best information they can, and I think we've got fairly high confidence in the unsynchronized map path already. Cc: 10.0 9.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-09 15:39:20 +08:00
Eric Anholt	e186b927b8	i965: Fix handling of MESA_pack_invert in blit (PBO) readpixels. Fixes piglit GL_MESA_pack_invert/readpixels and GPU hangs with glamor and cairo-gl. Cc: 10.0 9.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-09 15:30:33 +08:00
Eric Anholt	a4b222ac13	i965: Fix incorrect bounds tracking for blit readpixels's GPU access. While incorrect, it probably wouldn't affect anyone ever: You'd have to do an appropriately-formatted readpixels into a PBO, then overwrite the tail end of the updated area of the PBO with glBufferSubData(), and you wouldn't get appropriate synchronization. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-09 15:30:32 +08:00
Eric Anholt	66524daf17	i965: Use SET_FIELD to safety check our x/y offsets in blits. The earlier assert made sure that our math didn't exceed our bounds, but this makes sure that we don't overflow from the high bits of X into the low bits of Y. We've already put checks in intel_miptree_blit(), but I've wanted to expand the type in our prototype from short to uint32_t, and we could get in trouble with intel_emit_linear_blit() if we did. v2: Add Ken's comment about the funny language extension used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1) Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v1)	2014-01-09 15:30:11 +08:00
Eric Anholt	5d2e86924e	i965: Add an assert for when SET_FIELD's value exceeds the field size. This was one of the things we always wanted to do to this, to make it more useful than just (value << FIELD_MASK). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-09 15:23:27 +08:00
Eric Anholt	98cdb2ceed	i965: Add a safety check for emitting blits. With all of the flipping and pitch twiddling and miptree layout involved in our blits, there are lots of ways for us to scribble outside of a buffer. Put in a check that we're not about to do so. This catches a bug that glamor was running into. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-09 15:23:23 +08:00
Eric Anholt	bdc5241af4	i965: Don't call the blitter on addresses it can't handle. Noticed by tex3d-maxsize on my next commit to check that our addresses don't overflow. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2014-01-09 15:23:00 +08:00
Thomas Sondergaard	e8ff08edd8	mesa: Namespace qualify fma to override ambiguity with fma from math.h MSVC 2013 version of math.h includes an fma() function. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 17:33:07 -07:00
Thomas Sondergaard	8fcddd325c	mesa: Work around internal compiler error This small rearrangement avoids MSVC 2013 ICE. Also, this should be a better memory access order. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-08 17:33:06 -07:00
Thomas Sondergaard	067ad6e53e	mesa: Fix compile error with MSVC 2013 This fixes the following compile error: src\glsl\ir_constant_expression.cpp(1405) : error C2666: 'copysign' : 3 overloads have similar conversions Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 17:33:06 -07:00
Thomas Sondergaard	20e65c92c7	mesa: Preliminary support for MSVC_VERSION=12.0 Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 17:33:06 -07:00
Rob Clark	646c16af6e	freedreno: add basic query support Add for now some simple/basic query support (ie. things not actually requiring the GPU). Might change around a bit when I actually add GPU queries, but for now this enables some useful performance info in the GALLIUM_HUD. For example: GALLIUM_HUD=fps+batches+batches-sysmem+batches-gmem+restores,draw-calls The driver-specific queries are: + draw-calls + batches - number of batches per second, sum of batches-sysmem plus batches-gmem + batches-gmem - render a set of tiles in GMEM, for each tile (optionally) system mem -> gmem (restore), plus N draws, plus gmem -> system mem (resolve) per second + batches-sysmem - N draws to system memory (GMEM bypass) per second + restores - number of GMEM batches that required restore per second Ideally for GMEM rendering, you want batches-gmem to equal fps. If the app is doing something that triggers multiple passes (ie. requires extra round trip gmem <-> system memory) then the # of batches per second will go up relative to fps. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-01-08 16:30:18 -05:00
Rob Clark	725d736f6a	freedreno/a3xx: use cs patch instead of WFI+RMW Since we now have the cmdstream patch mechanism needed for hw binning, might as well also use it for RB_RENDER_CONTROL updates. This avoids the need to use RMW (and associated WFI) to update RB_RENDER_CONTROL. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-01-08 16:30:18 -05:00
Rob Clark	c0766528ba	freedreno/a3xx: support for hw binning pass The binning pass sorts vertices into which bins/tiles they apply to. The visibility information generated during the binning pass can be used to speed up the rendering pass by filtering out vertices which do not apply to the current tile. See: https://github.com/freedreno/freedreno/wiki/Adreno-tiling#optimized-approach This brings a significant fps boost. A rough assortment of tests (supertuxkart, etracer, tremulous, glmark2 'build' test, etc) seems to yield a ~35-45% fps improvement. For now, to be conservative, the binning pass is not enabled yet by default. To enable it use: FD_MESA_DEBUG=binning So far I haven't found anything that breaks with binning enabled, but I'd like a bit more testing before I enable it as default. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-01-08 16:30:18 -05:00
Rob Clark	bfb44c24bc	freedreno: be more clever about gmem usage Only need to leave room for depth/stencil if it is actually used, etc. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-01-08 16:30:18 -05:00
Rob Clark	42c5e2a2ed	freedreno: resync generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-01-08 16:30:18 -05:00
Chris Forbes	9e99735f30	i965: fold offset into coord for textureOffset(gsampler2DRect) The hardware is broken with nonzero texel offsets and unnormalized coordinates; instead of doing correct offsetting, we get garbage. This just extends the existing workaround for ir_txf and ir_tg4+gsampler2DRect to also consider ir_tex+gsampler2DRect. Fixes broken rendering in 'tesseract' when 'mesa_texrectoffset_bug' is not enabled; also fixes the new piglit test 'tests/spec/glsl-1.30/execution/fs-textureOffset-Rect'. Has been broken ~forever; suggesting including this in only 10.0 because the lowering pass doesn't exist in 9.2 or earlier so would require quite a different patch. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Lee Salzman <lsalzman@gmail.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2014-01-09 10:09:01 +13:00
Paul Berry	31ec2f8338	mesa: Remove _mesa_progshader_enum_to_string(), which is no longer used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:32:14 -08:00
Paul Berry	acfc58a7e5	glsl: Make more use of gl_shader_stage enum in ir_set_program_inouts.cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:32:01 -08:00
Paul Berry	2adb9fea77	glsl: Make more use of gl_shader_stage enum in lower_clip_distance.cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:31:58 -08:00
Paul Berry	80ee24823f	glsl: Make more use of gl_shader_stage enum in link_varyings.cpp. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Also rename "shaderType" param of is_varying_var() to "stage". Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:31:55 -08:00
Paul Berry	9110078209	glsl: Change _mesa_glsl_parse_state ctor to use gl_shader_stage enum. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Also rename "target" param to "stage". Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:31:49 -08:00
Paul Berry	e3b86f07da	mesa: Use gl_shader::Stage instead of gl_shader::Type where possible. This reduces confusion since gl_shader::Type is sometimes GL_SHADER_PROGRAM_MESA but is more frequently GL_SHADER_{VERTEX,GEOMETRY,FRAGMENT}. It also has the advantage that when switching on gl_shader::Stage, the compiler will alert if one of the possible enum types is unhandled. Finally, many functions in src/glsl (especially those dealing with linking) already use gl_shader_stage to represent pipeline stages; using gl_shader::Stage in those functions avoids the need for a conversion. Note: in the process I changed _mesa_write_shader_to_file() so that if it encounters an unexpected shader stage, it will use a file suffix of "????" rather than "geom". Reviewed-by: Brian Paul <brianp@vmware.com> v2: Split from patch "mesa: Store gl_shader_stage enum in gl_shader objects." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-08 07:31:45 -08:00
Paul Berry	65511e5f22	mesa: Store gl_shader_stage enum in gl_shader objects. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-08 07:31:28 -08:00
Paul Berry	1722f5e73e	mesa: Move declaration of gl_shader_stage earlier in mtypes.h. Also move the related #define MESA_SHADER_STAGES. This will allow gl_shader_stage to be used in struct gl_shader. Reviewed-by: Brian Paul <brianp@vmware.com> v2: Split from patch "mesa: Store gl_shader_stage enum in gl_shader objects." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-08 07:30:54 -08:00
Paul Berry	72a995d307	glsl: make _mesa_shader_stage_to_string() available to non-C++ code. Reviewed-by: Brian Paul <brianp@vmware.com> v2: Split from patch "mesa: Store gl_shader_stage enum in gl_shader objects." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-08 07:30:48 -08:00
Paul Berry	665b8d7b6d	mesa: Clean up nomenclature for pipeline stages. Previously, we had an enum called gl_shader_type which represented pipeline stages in the order they occur in the pipeline (i.e. MESA_SHADER_VERTEX=0, MESA_SHADER_GEOMETRY=1, etc), and several inconsistently named functions for converting between it and other representations: - _mesa_shader_type_to_string: gl_shader_type -> string - _mesa_shader_type_to_index: GLenum (GL_*_SHADER) -> gl_shader_type - _mesa_program_target_to_index: GLenum (GL_*_PROGRAM) -> gl_shader_type - _mesa_shader_enum_to_string: GLenum (GL_*_{SHADER,PROGRAM}) -> string This patch tries to clean things up so that we use more consistent terminology: the enum is now called gl_shader_stage (to emphasize that it is in the order of pipeline stages), and the conversion functions are: - _mesa_shader_stage_to_string: gl_shader_stage -> string - _mesa_shader_enum_to_shader_stage: GLenum (GL_*_SHADER) -> gl_shader_stage - _mesa_program_enum_to_shader_stage: GLenum (GL_*_PROGRAM) -> gl_shader_stage - _mesa_progshader_enum_to_string: GLenum (GL_*_{SHADER,PROGRAM}) -> string In addition, MESA_SHADER_TYPES has been renamed to MESA_SHADER_STAGES, for consistency with the new name for the enum. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> v2: Also rename the "target" field of _mesa_glsl_parse_state and the "target" parameter of _mesa_shader_stage_to_string to "stage". Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 07:30:30 -08:00
José Fonseca	eda21d2a30	llvmpipe: Fix the bottom_edge_rule adjustment for points. The adjustment needs to be applied to the y coordinates and not the x coordinates, just like the equivalent code for lines and triangles in lp_setup_line.c and lp_setup_tri.c. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2014-01-08 12:18:17 +00:00
José Fonseca	37de6b0682	llvmpipe: Respect bottom_edge_rule when computing the rasterization bounding boxes. This was inadvertently forgotten when replacing gl_rasterization_rules with lower_left_origin and half_pixel_center (commit `2737abb44e`). This makes a difference when lower_left_origin != half_pixel_center, e.g, D3D10. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Zack Rusin <zackr@vmware.com>	2014-01-08 12:18:17 +00:00
Chia-I Wu	76edf44f9e	ilo: enable HiZ The support is still early. Fast depth buffer clear is not enabled yet. HiZ can be forced off with ILO_DEBUG=nohiz.	2014-01-08 18:11:36 +08:00
Chia-I Wu	e7b4219e22	ilo: resolve Z/HiZ correctly When the depth buffer is to be read, perform a Depth Buffer Resolve if it has been rendered. When the depth buffer is to be rendered, perform a HiZ Buffer Resolve when the depth buffer is modified externally.	2014-01-08 18:11:35 +08:00
Chia-I Wu	77e3db464f	ilo: add flags to texture slices The flags are used to mark who (CPU, BLT, or RENDER) has accessed the resource and how (READ or WRITE).	2014-01-08 18:11:35 +08:00
Chia-I Wu	846f70a6ef	ilo: rename and add an accessor for texture slices Rename ilo_texture::slice_offsets to ilo_texture::slices and add an accessor, ilo_texture_get_slice().	2014-01-08 18:11:35 +08:00
Chia-I Wu	127fbc086b	ilo: add HiZ op support to the pipelines Add blitter functions to perform Depth Buffer Clear, Depth Buffer Resolve, and Hierarchical Depth Buffer Resolve. Those functions set ilo_blitter up and pass it to the pipelines to emit the commands.	2014-01-08 18:11:35 +08:00
Chia-I Wu	546416d495	ilo: add support for HiZ allocation Add tex_create_hiz() to create HiZ bo. It is not really called yet.	2014-01-08 18:11:35 +08:00
Chia-I Wu	e372819589	ilo: refactor separate stencil allocation Move separate stencil allocation code to tex_create_separate_stencil to keep tex_create sane.	2014-01-08 18:11:35 +08:00
Chia-I Wu	82676f5d34	ilo: assorted GPE fixes for HiZ Allow HiZ op to be specified in 3DSTATE_WM. Pass depth format directly in gen7_emit_3DSTATE_SF. Use tex->hiz.bo to determine if HiZ exists. Fix 3DSTATE_SF for the case when there is no ilo_rasterizer_state. Fix 3DSTATE_PS for the case when there is no ilo_shader_state.	2014-01-08 18:11:35 +08:00
Chia-I Wu	6642381e75	ilo: no layer offsetting on GEN7+ Even though the Ivy Bridge PRM lists some restrictions that require layer offsetting as the Sandy Bridge PRM does, it seems they are actually lifted.	2014-01-08 18:11:34 +08:00
Chia-I Wu	011fde4bf2	ilo: offset to layers only when necessary GEN6 has several requirements regarding the LOD/Depth/Width/Height of the render targets and the depth buffer. We used to offset to the layers in question unconditionally to meet the requirements. With this commit, offsetting is done only when the requirements are not met.	2014-01-08 18:11:34 +08:00
Chia-I Wu	0a2a221d01	ilo: allow ilo_zs_surface to skip layer offsetting Make offset to layer optional in ilo_gpe_init_zs_surface.	2014-01-08 18:11:34 +08:00
Chia-I Wu	8d9f5d57e2	ilo: allow ilo_view_surface to skip layer offsetting Make offset to layer optional in ilo_gpe_init_view_surface_for_texture. render_cache_rw is always the same as is_rt and is replaced.	2014-01-08 18:11:34 +08:00
Tapani Pälli	0978a6966a	i965/fs: do SEL optimization only when src type for MOV matches Fixes a bug where then branch operates with ivec4 while else uses vec4. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72379 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-08 07:06:45 +02:00
Kenneth Graunke	847bc36a38	glsl: Optimize pow(2, x) --> exp2(x). On Haswell, POW takes 24 cycles, while EXP2 only takes 14. Plus, using POW requires putting 2.0 in a register, while EXP2 doesn't. I believe that EXP2 will be faster than POW on basically all GPUs, so it makes sense to optimize it. Looking at the savage2 subset of shader-db: total instructions in shared programs: 113225 -> 113179 (-0.04%) instructions in affected programs: 2139 -> 2093 (-2.15%) instances of 'math pow': 795 -> 749 (-6.14%) instances of 'math exp': 389 -> 435 (11.8%) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-07 12:54:57 -08:00
Kenneth Graunke	5e3fd6a9db	glsl: Refactor is_zero/one/negative_one into an is_value() method. This patch creates a new generic is_value() method, which checks if an ir_constant has a particular value. (For vectors, it must have the single value repeated across all components.) It then rewrites the is_zero/is_one/is_negative_one methods to use this generic helper. All three were basically identical except for the value they checked for. The other difference is that is_negative_one rejects boolean types. The new is_value function maintains this behavior, only allowing boolean types when checking for 0 or 1. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-07 12:54:57 -08:00
Kenneth Graunke	d6c1d66d3a	glsl: Optimize pow(1.0, X) --> 1.0. Surprisingly, this helps one vertex shader in 3DMMES. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2014-01-07 12:54:57 -08:00
Kenneth Graunke	05fbb021a6	mesa: Use get_local_param_pointer in glProgramLocalParameters4fvEXT(). Using the get_local_param_pointer helper ensures that the LocalParams arrays have actually been allocated before attempting to use them. glProgramLocalParameters4fvEXT needs to do a bit of extra checking, but it can be simplified since the helper has already validated the target. Fixes crashes in programs that use Cg (for example, Awesomenauts, Rocketbirds: Hardboiled Chicken, and Tiny and Big: Grandpa's Leftovers) since commit `e5885c119d` (mesa: Dynamically allocate the storage for program local parameters.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73136 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Brian Paul <brianp@vmware.com> Tested-by: Laurent Carlier <lordheavym@gmail.com>	2014-01-07 12:50:23 -08:00
José Fonseca	2d368b982a	llvmpipe: Basic implementation of pipe_context::set_sample_mask. We don't support MSAA (ie, number of samples is always one) therefore sample_mask boils down to a synonym of the rasterizer_discard flag. Also, this change makes setup actually use the value received in lp_setup_set_rasterizer_discard instead of reaching out to llvmpipe upper layers to re-fetch it. Based on Si Chen's draft. With this patch `wgf11multisample Coverage passes 100%` on the UMD D3D10 state tracker. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Si Chen <sichen@vmware.com>	2014-01-07 16:04:42 +00:00
José Fonseca	95bf222603	cso_context: Fix cso_context::sample_mask initial value. The initial value of cso_context::sample_mask_saved is irrelevant as it will be overwritten with cso_context::sample_mask in cso_save_sample_mask. Therefore it is cso_context::sample_mask that needs to be properly initialized. This fixes regressions in blits and mipmap generation after adding support for sample_mask to llvmpipe. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-07 16:04:42 +00:00
Si Chen	72c6d0e506	llvmpipe: Implement alpha_to_coverage for non-MSAA framebuffers. Implement Alpha to Coverage by discarding a fragment when its alpha component is less than 0.5. This is joint work by Jose and Si. Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2014-01-07 16:04:42 +00:00
Andreas Fänger	2a0fb946e1	swrast: fix delayed texel buffer allocation regression for OpenMP Commit `9119269ca1` moved the texel buffer allocation to _swrast_texture_span(), however, when compiled with OpenMP support this code already runs multi-threaded so a critical section is required to prevent multiple allocations and rendering errors. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-07 08:03:49 -07:00
Dave Airlie	aa4e2243a2	gallium/draw: remove double semicolon code cleanup. Signed-off-by: Dave Airlie <airlied@redhat.com>	2014-01-07 18:52:46 +10:00
Brian Paul	8d1400fe12	glsl: rename min(), max() functions to fix MSVC build Evidently, there's some other definition of "min" and "max" that causes MSVC to choke on these function names. Renaming to min2() and max2() fixes things. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 16:57:49 -07:00
Kenneth Graunke	f6b10544cd	i965: Remove unused PIPE_CONTROL defines. Both brw_defines.h and intel_reg.h defined PIPE_CONTROL fields, which had similar names, but couldn't be used in the same way. (One had built-in shifts, and the other didn't...) Delete the unused set to preserve sanity. (Eric wrote an almost identical patch back in August, so I believe he approves.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 15:45:42 -08:00
Vinson Lee	f8432832a7	mesa: Remove GLXContextID typedef from glxext.h. This patch fixes this build error with gcc <= 4.5 and clang <= 3.1. CC clientattrib.lo In file included from ../../include/GL/glx.h:333:0, from glxclient.h:45, from clientattrib.c:32: ../../include/GL/glxext.h:275:13: error: redefinition of typedef 'GLXContextID' ../../include/GL/glx.h:171:13: note: previous declaration of 'GLXContextID' was here Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70591 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:57:23 -08:00
Maxence Le Doré	a44ca3595e	docs/relnotes/10.1.html: report AMD_shader_trinary_minmax support Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:11 -08:00
Maxence Le Doré	1a9e8c23eb	mesa: enable AMD_shader_trinary_minmax Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:10 -08:00
Maxence Le Doré	eb5dc75601	glsl: implement mid3 built-in function Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:09 -08:00
Maxence Le Doré	73c7451587	glsl: implement max3 built-in function Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:08 -08:00
Maxence Le Doré	ce46e14729	glsl: Implement min3 built-in function Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:08 -08:00
Maxence Le Doré	61c450fc81	glsl: add min() and max() functions to builder.cpp Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:07 -08:00
Maxence Le Doré	cf70d2a7c0	glsl: add a shader_trinary_minmax predicate Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:06 -08:00
Maxence Le Doré	ff50493bb3	glsl: Add extension tracking for AMD_shader_trinary_minmax Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 14:28:02 -08:00
Alexander von Gluck IV	61ef697afc	haiku libGL: Move from gallium target to src/hgl * The Haiku renderers need to link to libGL to function properly in all usage contexts. As mesa drivers build before gallium targets, we couldn't properly link the mesa swrast driver to the gallium libGL target for Haiku. * This is likely better as it mimics how glx is laid out ensuring the Haiku libGL is better understood. * All renderers properly link in libGL now. Acked-by: Brian Paul <brianp@vmware.com>	2014-01-06 15:50:21 -06:00
Alexander von Gluck IV	b236314a11	haiku: Fix missing HaikuGL header paths Acked-by: Brian Paul <brianp@vmware.com>	2014-01-06 15:50:15 -06:00
Brian Paul	3486f6f31b	mesa: implement missing glGet(GL_RGBA_SIGNED_COMPONENTS_EXT) query This is part of the GL_EXT_packed_float extension. Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=73096 Cc: 10.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2014-01-06 13:37:00 -07:00
Eric Anholt	7db56ddee0	i965: Warning fix Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 10:54:22 -08:00
Kenneth Graunke	242ca9acb4	i965: Delete unused INTEL_WRITE_{PART,FULL} and INTEL_READ #defines. These are just software flag values (not hardware specific values), and aren't used anywhere. Delete them to avoid confusion. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-06 10:52:43 -08:00
Marek Olšák	346b6abab9	radeonsi: calculate NUM_BANKS for DB correctly on CIK NUM_BANKS is not constant on CIK. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2014-01-06 18:40:42 +01:00
Marek Olšák	bf3c361113	radeonsi: set correct pipe config for Hawaii in DB Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-01-06 18:40:42 +01:00
Marek Olšák	2748b7da7e	radeonsi: disable HTILE for 1D-tiled depth-stencil buffers Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2014-01-06 18:40:41 +01:00
Juha-Pekka Heikkila	d41f5396f3	glx: check memory allocations in __glXInitVertexArrayState() Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-06 10:23:26 -07:00
Juha-Pekka Heikkila	0c04cca0e1	glx: Add missing null check in __glXNewIndirectAPI() Add extra null check in auto generated indirect_init.c via src/mapi/glapi/gen/glX_proto_send.py Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-06 10:23:12 -07:00
Nathan Kidd	0691b37732	docs: fix misspellings Fixed what I noticed; no warranty for exhaustiveness. Signed-off-by: Nathan Kidd <nkidd@opentext.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-06 09:55:38 -07:00
Chris Forbes	a61ae2aa01	i965: set size of txf_mcs payload vgrf properly Previously we left the size of this vgrf as 1, which caused register allocation to be subtly broken. If we were lucky we would explode in the post-alloc instruction scheduler; if we were unlucky we'd just stomp on someone else and get broken rendering. Fixes crash when running `tesseract` with the following settings: msaa 4 glineardepth 0 Also fixes the piglit test: arb_sample_shading-builtin-gl-sample-id Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: Anuj Phogat <anuj.phogat@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72859 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2014-01-04 20:24:29 +13:00
Erik Faye-Lund	eb212c5a30	glcpp: error on multiple #else/#elif directives The preprocessor currently accepts multiple else/elif-groups per if-section. The GLSL-preprocessor is defined by the C++ specification, which defines the following parse-rule: if-section: if-group elif-groups(opt) else-group(opt) endif-line This clearly only allows a single else-group, that has to come after any elif-groups. So let's modify the code to follow the specification. Add test to prevent regressions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Carl Worth <cworth@cworth.org> Cc: 10.0 <mesa-stable@lists.freedesktop.org>	2014-01-02 14:22:58 -08:00
Carl Worth	6005e9cb28	glcpp: Replace multi-line comment with a space (even as part of macro definition) The preprocessor has always replaced multi-line comments with a single space character, (as required by the specification), but as of commit `bd55ba568b` the lexer also emitted a NEWLINE token for each newline within the comment, (in order to preserve line numbers). The emitting of NEWLINE tokens within the comment broke the rule of "replace a multi-line comment with a single space" as could be exposed by code like the following: #define FOO a/* */b FOO Prior to commit `bd55ba568b`, this code defined the macro FOO as "a b" as desired. Since that commit, this code instead defines FOO as "a" and leaves a stray "b" in the output. In this commit, we fix this by not emitting the NEWLINE tokens while lexing the comment, but instead merely counting them in the commented_newlines variable. Then, when the lexer next encounters a non-commented newline it switches to a NEWLINE_CATCHUP state to emit as many NEWLINE tokens as necessary (so that subsequent parsing stages still generate correct line numbers). Of course, it would have been more clear if we could have written a loop to emit all the newlines, but flex conventions prevent that, (we must use "return" for each token we emit). It similarly would have been clear to have a new rule restricted to the <NEWLINE_CATCHUP> state with an action much like the body of this if condition. The problem with that is that this rule must not consume any characters. It might be possible to write a rule that matches a single lookahead of any character, but then we would also need an additional rule to ensure for the <EOF> case where there are no additional characters available for the lookahead to match. Given those considerations, and given that the SKIP-state manipulation already involves a code block at the top of the lexer function, before any rules, it seems best to me to go with the implementation here which adds a similar pre-rule code block for the NEWLINE_CATCHUP. Finally, this commit also changes the expected output of a few, existing glcpp tests. The change here is that the space character resulting from the multi-line comment is now emitted before the newlines corresponding to that comment. (Previously, the newlines were emitted first, and the space character afterward.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72686 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-02 14:15:51 -08:00
Carl Worth	61cea49014	glcpp: Add a more descriptive comment for the SKIP state manipulation Two things make this code confusing: 1. The uncharacteristic manipulation of lexer start state outside of flex rules. 2. The confusing semantics of the skip_stack (including the "lexing_if" override and the SKIP_NO_SKIP state). This new comment is intended to bring a bit more clarity for any readers. There is no intended behavioral change to the code here. The actual code changes include better indentation to avoid an excessively-long line, and using the more descriptive INITIAL rather than 0. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2014-01-02 14:15:24 -08:00
Courtney Goeltzenleuchter	5a51c1b01a	i965: Enhance intel_texsubimage_tiled_memcpy() to support all levels Support all levels of a supported texture format. Using 1024x1024, RGBA 8888 source, mipmap internal-format Before (MB/sec) mipmap (MB/sec) GL_RGBA 627.15 615.90 GL_RGB 456.35 611.53 512x512 GL_RGBA 597.00 619.95 GL_RGB 440.62 611.28 256x256 GL_RGBA 487.80 587.42 GL_RGB 376.63 585.00 Benchmark has been sent to mesa-dev list: teximage_enh Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-12-30 14:57:49 -08:00
Courtney Goeltzenleuchter	85784fd832	i965: Add XRGB to intel_texsubimage_tiled_memcpy() MESA_FORMAT_XRGB8888 is equivalent to MESA_FORMAT_ARGB8888 in terms of storage on the device, so okay to use this optimized copy routine. This series builds on work from Frank Henigman to optimize the process of uploading a texture to the GPU. This series adds support for MESA_XRGB_8888 and full miptrees, which were found to be common activities in the Smokin' Guns game. The issue was found while profiling the app but that part is not benchmarked. Smokin-Guns uses mipmap textures with an internal format of GL_RGB (MESA_XRGB_8888 in the driver). These changes need a performance tool to run against to show how they improve execution performance for specific texture formats. Using this benchmark I've measured the following improvement on my Ivybridge Intel(R) Xeon(R) CPU E3-1225 V2 @ 3.20GHz. 1024x1024 texture size internal-format Before (MB/sec) XRGB (MB/sec) GL_RGBA 628.15 627.15 GL_RGB 265.95 456.35 512x512 texture size internal-format Before (MB/sec) XRGB (MB/sec) GL_RGBA 600.23 597.00 GL_RGB 255.50 440.62 256x256 texture size internal-format Before (MB/sec) XRGB (MB/sec) GL_RGBA 489.08 487.80 GL_RGB 229.03 376.63 Benchmark has been sent to mesa-dev list: teximage Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-12-30 14:57:48 -08:00
Paul Berry	77c74c647b	glsl: Fix gl_type of usamplerCube built-in type. I'm not aware of any piglit tests that this fixes, but the old code was obviously wrong. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-30 11:21:39 -08:00
Paul Berry	7e0b4b5e9b	mesa: Add an assertion to _mesa_program_index_to_target(). Only a Mesa bug could cause this function to be called with an out-of-range index, so raise an assertion if that ever happens. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-30 11:21:33 -08:00
Paul Berry	99e822fa18	mesa: Improve static error checking of arrays sized by MESA_SHADER_TYPES. This patch replaces the following pattern: foo bar[MESA_SHADER_TYPES] = { ... }; With: foo bar[] = { ... }; STATIC_ASSERT(Elements(bar) == MESA_SHADER_TYPES); This way, when a new shader type is added in a future version of Mesa, we will get a compile error to remind us that the array needs to be updated. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-30 11:21:27 -08:00
Paul Berry	b30e25f297	glsl: Remove extraneous shader_type argument from analyze_clip_usage(). This argument was carrying the name of the shader target (as a string). We can get this just as easily by calling _mesa_shader_enum_to_string(). Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-30 11:21:24 -08:00
Paul Berry	d343e3d98c	glsl: Get rid of hardcoded arrays of shader target names. We already have a function for converting a shader type index to a string: _mesa_shader_type_to_string(). Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-30 11:21:21 -08:00
Paul Berry	89c35c59a4	main: Remove unused function _mesa_shader_index_to_type(). Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-30 11:21:14 -08:00
Paul Berry	26707abe56	Rename overloads of _mesa_glsl_shader_target_name(). Previously, _mesa_glsl_shader_target_name() had an overload for GLenum and an overload for the gl_shader_type enum, each of which behaved differently. However, since GLenum is a synonym for unsigned int, and unsigned ints are often used in place of gl_shader_type (e.g. in loop indices), there was a big risk of calling the wrong overload by mistake. This patch gives the two overloads different names so that it's always clear which one we mean to call. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-30 11:21:08 -08:00
Kenneth Graunke	f425d56ba4	Revert "mesa: Remove GLXContextID typedef from glx.h." This reverts commit `136a12ac98`. According to belak51 on IRC, this commit broke Allegro, which would no longer compile. Applications apparently expect the GLXContextID typedef to exist in glx.h; removing it breaks them. A bit of searching around the internet revealed other complaints since upgrading to Mesa 10. Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-29 23:23:33 -08:00
Kenneth Graunke	da031f83f7	i965: Remove unused depth_mode parameter from translate_tex_format(). According to git blame, this hasn't been used in over two years: commit `d2235b0f46` Author: Eric Anholt <eric@anholt.net> Date: Thu Nov 17 17:01:58 2011 -0800 i965: Always handle GL_DEPTH_TEXTURE_MODE through the shader. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-29 23:18:24 -08:00
Topi Pohjolainen	597a7ccc72	i965/blorp: unit test compiling integer typed texture fetches Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:45 +02:00
Topi Pohjolainen	1c76b53482	i965/blorp: unit test compiling simple gen6 zero-src sampled Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:38 +02:00
Topi Pohjolainen	118c093d56	i965/blorp: unit test compiling gen6 msaa-8 cms alpha blend Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:34 +02:00
Topi Pohjolainen	b03319ddb1	i965/blorp: unit test compiling bilinear filtered Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:31 +02:00
Topi Pohjolainen	b928e345e4	i965/blorp: unit test compiling simple zero-src sampled Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:27 +02:00
Topi Pohjolainen	001b92c112	i965/blorp: unit test compiling unaligned msaa-8 Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:23 +02:00
Topi Pohjolainen	0f89ebacbb	i965/blorp: unit test compiling msaa-8 cms alpha blend Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:19 +02:00
Topi Pohjolainen	90dcf31631	i965/blorp: unit test compiling msaa-4 ums to cms Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:15 +02:00
Topi Pohjolainen	11d2986a53	i965/blorp: unit test compiling msaa-8 cms to cms Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:11 +02:00
Topi Pohjolainen	28d2c969e7	i965/blorp: unit test compiling msaa-8 ums to cms Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:07 +02:00
Topi Pohjolainen	812f1e94c0	i965/blorp: unit test compiling blend and scaled Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:59:03 +02:00
Topi Pohjolainen	a7757bf518	i965/blorp: allow unit tests to compile and dump assembly Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:58:59 +02:00
Topi Pohjolainen	1cb22f0da2	i965: dump the disassembly to the given file instead of ignoring the argument and always dumping to standard output. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:58:52 +02:00
Topi Pohjolainen	1958a9bbdf	i965/fs: allow fs-generator use without gl_fragment_program Prepares the generator to accept hand-crafted blorp programs. Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:58:46 +02:00
Topi Pohjolainen	ca53704f4b	i965/fs: generate fs programs also without any 8-width instructions Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-27 11:58:36 +02:00
Rob Clark	8ab47b4353	freedreno/a3xx: fix blend state corruption issue Using RMW on banked context registers is not safe. The value read could be the wrong one. So if there has been a DRAW_IDX launched, the RMW must be preceded by a WAIT_FOR_IDLE to ensure the read part of RMW sees the correct value. To avoid unnecessary WFI's, keep track if there is a need for WFI, and only emit one if needed. Furthermore, keep track if we even need to update the register in the first place. And to cut down on the amount of RMW to avoid excessive WFI's, at the tiling/GMEM level we can always overwrite RB_RENDER_CONTROL, as the state at beginning of draw/clear cmds (which we IB to) is always undefined. In the draw/clear commands, we always still use RMW (with WFI if needed), but only if the register value actually changes. (At points where the current value cannot be known, the saved value is reset to ~0, which includes bits outside of RBRC_DRAW_STATE, so there is never a chance for confusion.) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-12-26 12:13:42 -05:00
Rob Clark	be01d7a905	freedreno: prepare for hw binning Actually assign VSC_PIPE's properly, which will be needed for tiling. And introduce fd_tile for per-tile state (including the assignment of tile to VSC_PIPE). This gives us the proper pipe setup that we'll need for hw binning pass, and also cleans things up a bit by not having to pass so many parameters around. And will also make it easier to introduce different tiling patterns (since we may no longer render tiles in a simple left-to-right top-to-bottom pattern). Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-12-26 12:06:29 -05:00
Rob Clark	64fe067066	freedreno: resync generated headers Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-12-26 12:06:29 -05:00
Alex Deucher	e2d53fac1c	r600g: fix SUMO2 pci id 0x9649 is sumo2, not sumo. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-24 15:22:31 -05:00
Vinson Lee	35a3414302	scons: Add system library linker flags on LLVM 3.5. llvm-3.5svn r197664 split out the linker flags from ldflags to system-libs. Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-12-23 11:33:29 -08:00
Aaron Watry	3ddabe0d52	r600/pipe: Stop leaking context->start_compute_cs_cmd.buf on EG/CM Found while tracking down memory leaks in VDPAU playback Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-23 07:24:50 -06:00
Aaron Watry	20446d0e53	st/vdpau: Destroy context when initialization fails Prevents a potential memory leak found when tracking down something else. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-23 07:24:50 -06:00
Aaron Watry	767b0f82c3	radeon/llvm: Free target data at end of optimization Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-23 07:24:50 -06:00
Aaron Watry	0bd858d7ff	r600/compute: Use the correct FREE macro when deleting compute state Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-23 07:24:50 -06:00
Aaron Watry	e19717d075	r600/compute: Free compiled kernels when deleting compute state v2: Remove unnecessary null pointer check CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-23 07:24:50 -06:00
Aaron Watry	8c9a9205d9	radeon/compute: Stop leaking LLVMContexts in radeon_llvm_parse_bitcode Previously we were creating a new LLVMContext every time that we called radeon_llvm_parse_bitcode, which caused us to leak the context every time that we compiled a CL program. Sadly, we can't dispose of the LLVMContext at the point that it was being created because evergreen_launch_grid (and possibly the SI equivalent) was assuming that the context used to compile the kernels was still available. Now, we'll create a new LLVMContext when creating EG/SI compute state, store it there, and pass it to all of the places that need it. The LLVM Context gets destroyed when we delete the EG/SI compute state. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-23 07:24:50 -06:00
Aaron Watry	a7653c19a3	pipe_loader/sw: close dev->lib when initialization fails Prevents a memory leak. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-23 07:24:50 -06:00
Aaron Watry	862f55c29c	clover: Remove unused variable Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-23 07:24:50 -06:00
Jonathan Liu	7990ab58fa	llvmpipe: use pipe_sampler_view_release() to avoid segfault This fixes another case of faulting when freeing a pipe_sampler_view that belongs to a previously destroyed context. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jonathan Liu <net147@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-22 07:07:56 -07:00
Jonathan Liu	670be71bd8	st/mesa: use pipe_sampler_view_release() This fixes a crash where old_view->context was already freed in the pipe_sampler_view_reference function contained in src/gallium/auxiliary/utils/u_inlines.h. As a result, the sampler_view_destroy function pointer contained 0xfeeefeee indicating freed heap memory. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jonathan Liu <net147@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-22 07:07:07 -07:00
Henri Verbeet	b094b3b9f4	i915: Add support for gl_FragData[0] reads. Similar to `556a47a262`, without this reading from gl_FragData[0] would cause a software fallback. Bugzilla: https://bugs.winehq.org/show_bug.cgi?id=33964 Signed-off-by: Henri Verbeet <hverbeet@gmail.com> Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-22 11:55:39 +01:00
Andreas Hartmetz	2efe7927d3	radeonsi: Use htile_buffer for depth only when there is no stencil. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-12-22 01:41:03 +01:00
Niels Ole Salscheider	900ac63ee8	winsys/radeon: remove superfluous distinction of cases Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-12-22 01:41:02 +01:00
Mark Mueller	852db050b9	mesa: inline r200 radeon texture format macros to facilitate search and replace Signed-off-by: Mark Mueller <MarkKMueller@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2013-12-21 15:27:29 +01:00
Lauri Kasanen	fcefdc9a59	mesa: Fix build to properly check for supported compiler flags Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72708 Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Lauri Kasanen <cand@gmx.com>	2013-12-20 17:00:57 -08:00
Ian Romanick	79f268978d	mesa: It is not possible to have GLSL < 1.20 This hasn't been possible for a long time. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-20 16:43:08 -08:00
Ian Romanick	4949322462	mesa: Clean up bad code formatting left from previous commit Also s/_EXT// on enums that are now part of core. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-20 16:43:08 -08:00
Ian Romanick	a92b9e60ab	mesa: GL_EXT_packed_depth_stencil is not optional Every driver supports it. All current and future Gallium drivers always support it, and all existing classic drivers support it. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-20 16:43:08 -08:00
Ian Romanick	b66edff435	radeon: Sort list of enabled extensions Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-20 16:43:08 -08:00
Ian Romanick	1bf436e014	r200: Sort list of enabled extensions Note that ARB_occlusion_query was previously enabled twice. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-20 16:43:08 -08:00
Lauri Kasanen	fe2079c4c0	glx: Simplify __glxGetMscRate, it only needs the screen, not a drawable Useful in its own right, but also needed for adaptive vsync. No regressions in the piglit glx-oml-sync-control-getmscrate test. Signed-off-by: Lauri Kasanen <cand@gmx.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-20 16:43:08 -08:00
Keith Packard	6b51113981	dri3: Rename DRI3_MAX_BACK to DRI3_NUM_BACK It is the maximum number of back buffers, but the name is confusing and is easily read as the maximum back buffer index. Change to DRI3_NUM_BACK to make the intended usage a bit clearer. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-20 16:31:09 -08:00
Keith Packard	547bcc4b57	i965: Set fast color clear mcs_state on newly allocated image miptrees Just copying code from the dri2 path to set up the fast color clear state. This also removes a couple of bogus intel_region_reference calls. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-20 16:19:52 -08:00
Keith Packard	c426fb08cf	i965: Correct check for re-bound buffer in intel_update_image_buffer The buffer-object is the persistent thing passed through the loader, so when updating an image buffer, check to see if it is already bound to the provided bo. The region, on the other hand, is allocated separately for the miptree, and so will never be the same as that passed back from the loader. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-20 16:18:37 -08:00
Keith Packard	ca2012a912	dri3: Clean up struct dri3_drawable Move the depth field up with width and height. Remove unused previous_time and frames fields. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-20 16:18:11 -08:00
Keith Packard	95b04850d0	dri3: Free resources when drawable is destroyed. Always nice to clean up after ourselves. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-20 16:17:59 -08:00
Keith Packard	568a27588d	dri3: Switch to libxshmfence version 1.1 libxshmfence v1.0 foolishly used 'int32_t *' for the fence type, which works when the fence is a linux futex. However, version 1.1 changes the exported datatype to 'struct xshmfence *'. Require libxshmfence version 1.1 and switch the API around. Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-20 16:17:54 -08:00
Kenneth Graunke	9f330481c3	i965: Use RED for depth texture formats rather than INTENSITY. While looking through the documentation, I found this in the Sandybridge PRM (Volume 4, Part 1, Page 140): "Use of sample_c with SURFTYPE_CUBE surfaces is undefined with the following surface formats: I24X8_UNORM, L24X8_UNORM, A24X8_UNORM, I32_FLOAT, L32_FLOAT, A32_FLOAT." I haven't observed this to be true, but it suggests that we may want to use other formats. We already perform DEPTH_TEXTURE_MODE swizzling in the shaders, and don't rely on the surface format to splat things appropriately. So using RED should work just as well as INTENSITY. A few notes about the formats: - R24_UNORM_X8_TYPELESS has the exact same properties as I24X8_UNORM. - R16_UNORM and R32_FLOAT are additionally supported as a render target, while the old I16_UNORM/I32_FLOAT formats are not. - R32_FLOAT_X8X24_TYPELESS is not supported as a render target, while the old format (R32G32_FLOAT) was. However, it shares the same properties as the formats we use for Z24, so it should suffice. This makes translate_tex_format and brw_blorp_surface_info::set a bit more similar. No Piglit changes on Sandybridge or Ivybridge. No oglconform changes on Sandybridge. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-20 16:14:35 -08:00
Chad Versace	1a928816a1	i965/gen6: Fix HiZ hang in WebGL Google Maps Emitting flushes before depth and hiz resolves at the top of blorp's state emission fixes the hang. Marchesin and I found the fix experimentally, as opposed to adhering to a documented hardware workaround. A more minimal fix likely exists, but this gets the job done. Fixes HiZ hangs in the new WebGL Google maps on Sandybridge Chrome OS. Tested by zooming in and out continuously for 2 hours. This patch is based on `8bc07bb701` CC: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70740 Signed-off-by: Stéphane Marchesin <marcheu@chromium.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-20 15:20:30 -08:00
Kenneth Graunke	b97fa1e75b	i965: Store QPitch in intel_mipmap_tree. Broadwell allows us to specify an arbitrary value for QPitch, rather than baking a specific formula into the hardware and requiring software to lay things out to match. The only restriction is that the software provided QPitch needs to be large enough so successive array slices do not overlap. In order to support this flexibility, software needs to specify QPitch in a bunch of packets. Storing QPitch makes that easy, and allows us to adjust it in a single place should we wish to change it in the future. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-20 12:41:54 -08:00
Kenneth Graunke	1e8e17ccd7	i965: Add support for Broadwell's new register types. Broadwell introduces support for Q, UQ, and HF types. It also extends DF support to allow immediate values. Irritatingly, although HF and DF both support immediates, they're represented by a different value depending on the register file. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-20 12:34:43 -08:00
Kenneth Graunke	15b9aa22d7	i965: Add BRW_REGISTER_TYPE_DF. Ivybridge, Baytrail, and Haswell support double float register types, but do not support them as immediate values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-20 12:34:41 -08:00
Kenneth Graunke	54e91e7420	i965: Abstract BRW_REGISTER_TYPE_* into an enum with unique values. On released hardware, values 4-6 are overloaded. For normal registers, they mean UB/B/DF. But for immediates, they mean UV/VF/V. Previously, we just created #defines for each name, reusing the same value. This meant we could directly splat the brw_reg::type field into the assembly encoding, which was fairly nice, and worked well. Unfortunately, Broadwell makes this infeasible: the HF and DF types are represented as different numeric values depending on whether the source register is an immediate or not. To preserve sanity, I decided to simply convert BRW_REGISTER_TYPE_* to an abstract enum that has a unique value for each register type, and write translation functions. One nice benefit is that we can add assertions about register files and generations. I've chosen not to convert brw_reg::type to the enum, since converting it caused a lot of trouble due to C++ enum rules (even though it's defined in an extern "C" block...). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-20 12:34:39 -08:00
Kenneth Graunke	13454fc3de	i965: Decode three-source register types directly. Three-source instructions use a different encoding for register types (and have a much more limited set to choose from). Previously, we translated those into BRW_REGISTER_TYPE_* values, then reused the existing reg_encoding mapping. Doing it directly is more straightforward and actually less code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-20 12:34:38 -08:00
Kenneth Graunke	4e95a09937	i965: Disassemble UV types, not UB types. UB types have never been supported as immediates. On Gen4-5, register encoding 4 is "Reserved." On Gen6+, it means UV. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-20 12:34:36 -08:00
Kenneth Graunke	d10242c5f7	i965: Add missing BRW_REGISTER_TYPE_UV. Sandybridge added support for packed unsigned vectors. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-20 12:34:15 -08:00
Kenneth Graunke	51c9cfc296	i965: Fix 3DSTATE_PUSH_CONSTANT_ALLOC_PS packet creation. When adding geometry shader support, we accidentally reversed the size and offset parameters. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-20 12:25:43 -08:00
Kenneth Graunke	0d0edf8e4c	i965: Use {point_sprite,flat}_enable variable names instead of dw. Calling the local variables flat_enable and point_sprite_enable is clearer than dw16 and such. It also matches the names used in calculate_attr_overrides, which computes them. v2: Add /* dw16 */ and /* dw10 */ comments, requested by Jordan. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-12-20 12:25:33 -08:00
Kenneth Graunke	23fc845f81	i965: Zero out {point_sprite,flat}_enables in calculate_attr_overrides. calculate_attr_overrides is responsible for computing the point sprite and flat-shading enable bitfields. It does so by OR'ing in a bunch of bits. However, it relied on the caller to set the initial value to zero. This is pretty fragile - if the caller neglects to zero out those variables, then the enable bitfields end up full of garbage, which shows up as random things being flat-shaded. This patch moves the zero-initialization into calculate_attr_overrides, so that the computation is completely in one place. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-12-20 12:25:33 -08:00
Kenneth Graunke	da872ddcc6	i965: Delete bogus BRW_REGISTER_TYPE_HF define. git blame ascribes this to the initial commit of the driver. No released hardware has ever supported half float, according to the documentation for SrcType in the ISA reference. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-20 12:25:33 -08:00
Kevin Rogovin	3b1195f8a6	Report that no function found if signature lookup is empty If no function signature is found for a function name, report that the function is not found instead of printing an empty list of candidates. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-20 09:03:54 -08:00
Kevin Rogovin	23d294bb60	Use line number information from entire function expression This patch changes the error reporting behavior for incorrect function invocation (triggered by match_function_by_name() unable to find a matching function call) from using the line number information associated to the function name term to using the line number information of the entire function expression. Fixes bug #72264. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72264 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-20 09:03:54 -08:00
Michel Dänzer	d580905000	radeonsi: Only scan pixel shaders for TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS It's not relevant for other shader types. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-12-20 18:51:09 +09:00
Aaron Watry	8252847b7b	r600g: Fix spelling error Trivial change, testing commit access	2013-12-19 14:30:51 -06:00
Quanxian Wang	1413a09f34	egl: break instead of looping after driver is found Stop searching for a driver after success. Signed-off-by: Quanxian Wang <quanxian.wang@intel.com> Reviewed-By: Gong, Zhigang <zhigang.gong@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-19 12:44:11 -07:00
Juha-Pekka Heikkila	22bf0f3eb4	mesa: Assert variable coming from get_variable() in get_current_attrib Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-19 08:26:17 -07:00
Juha-Pekka Heikkila	a7d8607d9e	mesa: Add asserts into emit_fog_instructions Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-19 08:25:58 -07:00
Juha-Pekka Heikkila	cd6aaf2920	glx: Fix two identical null check errors in driSet/GetInterval Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-19 08:25:45 -07:00
Dave Airlie	149140e922	st_glsl_to_tgsi: add support for prim id fragment shader input For GLSL 1.50 we can get frag shaders with primitive id as an input, add support to the translator for this. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-12-18 22:46:29 +00:00
Juha-Pekka Heikkila	28b552bf6b	mesa: add asserts in load_texunit_bumpmap In load_texunit_bumpmap tc_array is asserted, so let's also assert rot_mat_0 and rot_mat_1, which come from the same path. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:40:29 -07:00
Juha-Pekka Heikkila	c02f6c26d3	glx: add missing null check in dri2_bind_tex_image Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:40:19 -07:00
Brian Paul	a9bf5999d1	mesa: minor simplification in _mesa_es3_error_check_format_and_type() The type_valid local was set to true and never changed.	2013-12-18 09:06:52 -07:00
Juha-Pekka Heikkila	ca3df5eeda	glx: Add missing null check in dri2CreateDrawable Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:52 -07:00
Juha-Pekka Heikkila	56c5ba8f92	mesa: Verify memory allocations success in _mesa_PushAttrib Check for malloc() returning null to fix Klocwork warnings. Minor clean-ups by BrianP. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:52 -07:00
Juha-Pekka Heikkila	2a83e4182c	mesa: Verify memory allocations success in _mesa_PushClientAttrib Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:52 -07:00
Juha-Pekka Heikkila	d08ac826c5	mesa: Change save_attrib_data() to return boolean Change save_attrib_data() to return true/false depending on success. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:52 -07:00
Brian Paul	aa4001b607	mesa: add API/extension checks for 3-component texture buffer formats The GL_RGB32F, GL_RGB32UI and GL_RGB32I texture buffer formats are only supposed to be allowed if the GL_ARB_texture_buffer_object_rgb32 extension is supported. Note that the texture buffer extensions require a core profile. This patch adds those checks. Fixes the soon-to-be-added arb_clear_buffer_object-negative-bad-internalformat piglit test.	2013-12-18 09:06:52 -07:00
Brian Paul	eaaa9695b2	mesa: 78-column wrapping in extensions.c	2013-12-18 09:06:52 -07:00
Pi Tabred	4bf3afdde9	mesa: Cleanup mesa/main/bufferobj.h Column wrapping and space between lines. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:52 -07:00
Pi Tabred	3b0f5fc084	Modify release notes to include ARB_clear_buffer_object extension Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:52 -07:00
Pi Tabred	78216fb485	Add ARB_clear_buffer_object to list of supported extensions Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:51 -07:00
Brian Paul	787dadbeea	st/mesa: plug in default buffer object driver functions In particular, this plugs in the new ClearBufferSubData() fallback driver function.	2013-12-18 09:06:51 -07:00
Pi Tabred	5f7bc0c759	mesa: Implement functions for clear_buffer_object extensions Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:51 -07:00
Pi Tabred	7d94653052	mesa: Modify get_buffer() to allow for a variable error code Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:51 -07:00
Pi Tabred	84c4ea571d	mesa: Add bufferobj_range_mapped function Add function to test if the buffer is already mapped and if so, if the mapped range overlaps the given range. Modify the _mesa_InvalidateBufferSubData function to use the new function. Enable buffer_object_subdata_range_good() to use bufferobj_range_mapped Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:51 -07:00
Pi Tabred	72d872ad82	mesa: get_texbuffer_format(): differentiate between core and compat context Alpha, luminance and intensity formats are illegal in a core context. Add a check to return MESA_FORMAT_NONE if one of those is requested within a core context. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:51 -07:00
Pi Tabred	1ec2d0a9a8	mesa: Modify format validation to check for extension not context version Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:51 -07:00
Pi Tabred	d5e6fe4d29	mesa: Make validate_texbuffer_format function available externally - change storage class from static to extern - rename validate_texbuffer_format to _mesa_validate_texbuffer_format Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:51 -07:00
Pi Tabred	1f7c3e541f	mesa: Add infrastructure for GL_ARB_clear_buffer_object - add xml file for extension - add reference in gl_API.xml - add pointer to device driver function table (dd.h) - update dispatch_sanity.cpp Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-18 09:06:51 -07:00
Jan Vesely	56647c5d8f	clover: Append buffers that use CL_MEM_USE_HOST_PTR. Specs say it's legal for implementations to use internal copies, and the write synchronization seems to work. Fixes clCreateBuffer (together with previous patches) and buffer-flags piglits. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Acked-by: Francisco Jerez <currojerez@riseup.net>	2013-12-18 16:21:59 +01:00
Jan Vesely	21f82188ce	clover: Add parameter checks to clCreateBuffer. v2: Use fewer if statements and functional tricks instead of single-use method, suggested by Francisco Jerez. Squash two small patches into one. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-12-18 16:18:15 +01:00
Markus Trippelsdorf	78fcc31d4a	configure.ac: remove -fcolor-diagnostics from LLVM flags When LLVM is built with Clang, "llvm-config --cxxflags" contains the -fcolor-diagnostics flag. It is not recognized by gcc and the build fails. Fix by removing the flag. Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Brian Paul <brianp@vmware.com>	2013-12-18 07:12:13 -07:00
Thomas Hellstrom	00cf048b12	st/dri: Check for kernel support before enabling fd sharing v2 The dri2 state tracker is checking for driver support before enabling dri2ImageExtension version 7. This commit adds a check that the kernel driver also supports fd sharing through prime. Note that this adds a libdrm dependency on dri2.c. v2: Removed unnecessary clamping of bool expression Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>	2013-12-18 09:11:24 +01:00
Marek Olšák	37c24e6d86	radeonsi: set CB_DISABLE if the color mask is 0 Also needed for the DB in-place decompression according to hw docs. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-12-18 01:20:11 +01:00
Marek Olšák	3352ff97c2	radeonsi: add the htile buffer to the CS ioctl buffer list This may fix the GPU crashes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-12-18 01:20:11 +01:00
Paul Berry	7963fde37b	glsl: Replace _mesa_glsl_parser_targets enum with gl_shader_type. These enums were redundant. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-17 12:31:36 -08:00
Paul Berry	abab438543	main: Move MESA_SHADER_TYPES outside of gl_shader_type enum. This will avoid spurious compiler warnings in the patch that follows. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-17 12:31:36 -08:00
Paul Berry	d9b55244fd	glsl: Don't return bad values from _mesa_shader_type_to_index. This will avoid compiler warnings in the patch that follows. There should be no user-visible effect because the change only affects the behaviour when an invalid enum is passed to _mesa_shader_type_to_index(), and that can only happen if there is a bug elsewhere in Mesa. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-17 12:31:35 -08:00
Brian Paul	188630dc13	swrast: silence driContextSetFlags() parameter type warning	2013-12-17 09:47:47 -08:00
Brian Paul	d79058d1c6	st/dri: fix compiler warning for driCopySubBufferExtension	2013-12-17 09:47:47 -08:00
Marek Olšák	2b404a6504	radeonsi: improve HiZ precision for less and lequal depth functions r600g needs this too. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-17 15:41:46 +01:00
Marek Olšák	1a63f278f2	radeonsi: make DB_RENDER_OVERRIDE an invariant register All this cruft was ported from r600g and isn't needed on SI and later according to hw docs. If we implemented HiS, we would set it to 0. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-17 15:41:46 +01:00
Marek Olšák	249cb511c5	radeonsi: flush HTILE when appropriate Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-17 15:41:46 +01:00
Thomas Hellstrom	3e2b0f801d	st/xa: Add new map flags Replicate some of the gallium pipe transfer functionality. Also bump minor to signal availability of this feature. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2013-12-17 09:01:29 +01:00
Alexander von Gluck IV	56d920a5c1	Haiku: Add in public GL kit headers * These make up the base of what C++ GL Haiku applications use for 3D rendering. * Not placed in includes/GL to prevent Haiku headers from getting installed on non-Haiku systems. Acked-by: Brian Paul <brianp@vmware.com>	2013-12-16 18:18:12 -06:00
Rob Clark	f9cfe5ce82	freedreno: dummy-draw workaround for a320 Fixes gpu lockups in supertuxkart. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-12-14 12:35:07 -05:00
Marek Olšák	b56c7f4df1	r600g: expose 32-bit integer vertex formats This advertises GL_ARB_texture_buffer_object_rgb32.	2013-12-14 17:42:08 +01:00
Marek Olšák	2eb321b992	radeonsi: move invariant regs to si_init_config Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-14 17:42:08 +01:00
Marek Olšák	696229523d	r600g: use shader-based MSAA resolving when hw-based one cannot be used This fixes some MSAA integer tests.	2013-12-14 17:42:08 +01:00
Marek Olšák	9ebb9a3c8e	radeonsi: use shader-based MSAA resolving when hw-based one cannot be used This fixes MSAA resolving for 32-bit integer colorbuffers, which isn't implemented by the hardware. It also fixes VM protection faults when resolving MSAA 2D array textures. This may be a CB bug, because shader-based resolving works fine. It may also be faster for upside-down and scaled blits. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-14 17:42:08 +01:00
Marek Olšák	5a609fbcb5	gallium/u_blitter: implement shader-based MSAA resolve with bilinear filtering For scaled resolve. The filter is only good for magnification. If somebody has an idea how to implement a good filter for minification, I'm all ears. I'd have to use derivatives probably. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-14 17:42:08 +01:00
Marek Olšák	fc21098a95	gallium/u_blitter: implement shader-based MSAA resolve We need this for integer formats and upside-down blits, which Radeons don't support for MSAA resolving. It can be used by calling util_blitter_blit. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-14 17:42:08 +01:00
Marek Olšák	f0ed082bab	gallium/u_blitter: remove useless parameters from some functions Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-14 17:42:08 +01:00
Marek Olšák	072c5d0573	st/dri: resolve sRGB buffers in linear colorspace Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-14 17:42:08 +01:00
Roland Scheidegger	27d47bd42f	gallivm: fix pointer type for stmxcsr/ldmxcsr The argument is an i8 pointer, not an i32 pointer (even though the value actually stored/loaded IS i32). Older llvm versions didn't care, but 3.2 and newer do, leading to crashes. Reviewed-by: Zack Rusin <zackr@vmware.com>	2013-12-14 17:11:03 +01:00
Roland Scheidegger	7c027666da	llvmpipe: get rid of barycentric calculation of a0 Didn't really work as well as hoped (in particular it was not generally more accurate), will solve this differently. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-12-14 17:11:03 +01:00
Roland Scheidegger	bfcf1ba1c4	llvmpipe: (trivial) get rid of triangle subdivision code This code was always problematic, and with 64bit rasterization we no longer need it at all. Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-14 17:11:03 +01:00
Kenneth Graunke	35f0aafaa4	i965: Treat Haswell as 75 in the surface format table. Much like we do for G45. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-12-13 21:14:19 -08:00
Chris Forbes	8bb666cee3	mesa: fix texture view use of _mesa_get_tex_image() The target parameter to _mesa_get_tex_image() is a target enum, not an index. When we're setting up faces for a cubemap, it should be CUBE_MAP_POSITIVE_X .. CUBE_MAP_NEGATIVE_Z; for all other targets it should be the same as the texobj's target. Fixes broken cubemaps [had only +X face but claimed to have all] produced by glTextureView, which then caused various crashes in the driver when we tried to use them. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-14 16:32:41 +13:00
Chris Forbes	544869377d	i965/fs: add support for gl_SampleMaskIn[] v2: - add assert so we don't run into trouble on Gen6. - adjust for Tapani's rearrangement of ir_variable Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-14 16:28:11 +13:00
Chris Forbes	1d71f38924	glsl: add gl_SampleMaskIn[] builtin Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-14 16:24:22 +13:00
Chris Forbes	c1e1dd2298	mesa: add SYSTEM_VALUE_SAMPLE_MASK_IN Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-14 16:24:21 +13:00
Brian Paul	7d91390359	mesa: document _mesa_texstore() return value	2013-12-13 17:02:43 -07:00
Brian Paul	19fa540219	st/mesa: only set up sampler compare mode for depth textures The GL_ARB_shadow spec says the shadow compare mode should have no effect when sampling a color texture. As it was, it was up to drivers to check for that (softpipe, llvmpipe, svga and probably the rest don't do that). Note: it looks like DX10 allows shadow compare with some non-depth formats, so this case really should be handled in the state tracker. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-12-13 16:06:07 -07:00
Brian Paul	31b0e7d024	st/mesa: add const qualifiers in sampler validation code Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-12-13 16:06:06 -07:00
Brian Paul	9f9860b004	st/mesa: add const qualifier to st_translate_color() Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-12-13 16:06:06 -07:00
Brian Paul	eff11b5a4a	st/mesa: simplify integer texture check Just use the gl_texture_object::_IsInteger field instead of computing it from scratch. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-12-13 16:06:06 -07:00
Brian Paul	b5cc710473	mesa: update glext.h to version 20131212 Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-13 16:04:23 -07:00
Brian Paul	d6a8421f3b	svga: don't emit extraneous fs shadow code Depending on the depth texture format, we may or may not have to emit explicit fs code to do the shadow comparison. Before, we were emitting it more often than needed. v2: check the actual texture format rather than the screen->depth.z16 field. The screen->depth.z16, x8z24, s8z24 fields may not all be set to a consistent set of depth formats. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-12-13 12:01:28 -08:00
Brian Paul	e735dfd35b	mesa: s/uint/GLuint/ to fix MSVC error	2013-12-13 12:51:10 -07:00
Courtney Goeltzenleuchter	375f660e27	mesa: Update TexStorage to support ARB_texture_view Call TextureView helper function to set TextureView state appropriately for the TexStorage calls. Misc updates from review feedback. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:54 -07:00
Courtney Goeltzenleuchter	1db4cb841b	mesa: add texture_view helper function for TexStorage Add helper function to set texture_view state from TexStorage calls. Include review feedback. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:54 -07:00
Courtney Goeltzenleuchter	f07ca59839	mesa: Fill out ARB_texture_view entry points Add Mesa TextureView logic. Incorporate feedback on ARB_texture_view: - Add S3TC VIEW_CLASSes to compatibility table - Use existing _mesa_get_tex_image - Clean up error strings - Use bool instead of GLboolean for internal functions - Split compound level & layer test into individual tests - eliminate helper macro for VIEW_CLASS table - do not call driver if ptr null. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:54 -07:00
Courtney Goeltzenleuchter	bb5947de99	mesa: consolidate multiple next_mipmap_level_size Refactor to make next_mipmap_level_size defined in mipmap.c a _mesa_ helper function that can then be used by texture_view Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:54 -07:00
Courtney Goeltzenleuchter	320ec1deac	mesa: Add driver entry point for ARB_texture_view Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:54 -07:00
Courtney Goeltzenleuchter	f1563e6392	mesa: ARB_texture_view get parameters Add support for ARB_texture_view get parameters: GL_TEXTURE_VIEW_MIN_LEVEL GL_TEXTURE_VIEW_NUM_LEVELS GL_TEXTURE_VIEW_MIN_LAYER GL_TEXTURE_VIEW_NUM_LAYERS Incorporate feedback regarding when to allow query of GL_TEXTURE_IMMUTABLE_LEVELS. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:54 -07:00
Courtney Goeltzenleuchter	668f3614ca	mesa: update texture object for ARB_texture_view Add state needed by glTextureView to the gl_texture_object. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:53 -07:00
Courtney Goeltzenleuchter	2e8493af51	mesa: Tracking for ARB_texture_view extension Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:53 -07:00
Courtney Goeltzenleuchter	d77d2af20a	mesa: Add API definitions for ARB_texture_view Stub in glTextureView API call to go with the glTextureView API xml definition. Includes dispatch test for glTextureView Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 12:31:53 -07:00
Anuj Phogat	7a73c6acb0	mesa: Fix error code generation in glBeginConditionalRender() This patch changes the error condition to satisfy below statement from OpenGL 4.3 core specification: "An INVALID_OPERATION error is generated if id is the name of a query object with a target other than SAMPLES_PASSED, ANY_SAMPLES_PASSED, or ANY_SAMPLES_PASSED_CONSERVATIVE, or if id is the name of a query currently in progress." Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-13 11:13:25 -08:00
Carl Worth	93e399f641	Makefile: Add bin/test-driver to EXTRA_FILES I'm not sure why this change is necessary. When I've built previous tar files (such as 9.2.4) with the "make tarballs" target, they include the bin/test-driver file. But at my first attempt to build the tar files for the 10.0.1 release this file was not being included and the build failed. (cherry picked from commit `d573899b93`) [The cherry pick is because I originally applied this on the 10.0 branch while working on the 10.0.1 release. But if we don't have this on master as well, this issue will trip us up again the next time we make a new major-release branch off of master.]	2013-12-13 11:12:23 -08:00
Kristian Høgsberg	38366c0c6e	dri_util: Don't assume __DRIcontext->driverPrivate is a gl_context The driverPrivate pointer is opaque to the driver and we can't assume it's a struct gl_context in dri_util.c. Instead provide a helper function to set the struct gl_context flags from the incoming DRI context flags. v2 (idr): Modify the other classic drivers to also use driContextSetFlags. I ran all the piglit GLX_ARB_create_context tests with i965 and classic swrast without regressions. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1] Reviewed-by: Eric Anholt <eric@anholt.net> Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> [v1 on Gallium nouveau] Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-13 08:19:50 -08:00
Carl Worth	d6c8365795	docs: Update note regarding nominating patches for the stable branch. This brings the documentation up to date with the current practice of using the CC syntax for patch nomination.	2013-12-12 23:10:53 -08:00
Carl Worth	16c2919972	docs: Fix typo Simply replacing Extentions with the correct Extensions.	2013-12-12 23:02:54 -08:00
Carl Worth	66d9cbfe6d	docs: Import 9.2.5 release notes, add news item.	2013-12-12 22:58:40 -08:00
Carl Worth	79c60999dc	docs: Import 10.0.1 release notes, add news item.	2013-12-12 22:21:08 -08:00
Dave Airlie	ba00f2f6f5	swrast* (gallium, classic): add MESA_copy_sub_buffer support (v3) This patch adds MESA_copy_sub_buffer support to the dri sw loader and then to gallium state tracker, llvmpipe, softpipe and other bits. It reuses the dri1 driver extension interface, and it updates the swrast loader interface for a new putimage which can take a stride. I've tested this with gnome-shell with a cogl hacked to reenable sub copies for llvmpipe and the one piglit test. I could probably split this patch up as well. v2: pass a pipe_box, to reduce the entrypoints, as per Jose's review, add to p_screen doc comments. v3: finish off winsys interfaces, add swrast classic support as well. Reviewed-by: Jose Fonseca <jfonseca@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com> swrast: add support for copy_sub_buffer	2013-12-13 14:37:01 +10:00
Brian Paul	40070e72d4	util: fix compile breakage D'oh!	2013-12-12 11:11:32 -07:00
Brian Paul	ba67d72c64	util: move variable declaration out of for-loop To fix MSVC build.	2013-12-12 11:09:02 -07:00
Marek Olšák	be909274aa	gallium/util: implement new color clear API in u_blitter	2013-12-12 18:48:04 +01:00
Marek Olšák	f09de87735	st/mesa: set correct PIPE_CLEAR_COLORn flags This also fixes the clear_with_quad function for glClearBuffer.	2013-12-12 18:48:04 +01:00
Marek Olšák	164dc6216a	gallium: allow choosing which colorbuffers to clear Required for glClearBuffer, which only clears one colorbuffer attachment. Example: If the first colorbuffer is float and the second one is int: pipe->clear(pipe, PIPE_CLEAR_COLOR0, float_clear_color, ...); pipe->clear(pipe, PIPE_CLEAR_COLOR1, int_clear_color, ...); This doesn't need any driver changes yet, because all drivers just use: if (flags & PIPE_CLEAR_COLOR) .. The drivers which support GL 3.0 will have to implement it properly though.	2013-12-12 18:48:04 +01:00
Marek Olšák	0612005aa6	st/mesa: fix glClear with multiple colorbuffers and different formats Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>	2013-12-12 18:48:04 +01:00
Marek Olšák	03d848ea10	mesa: fix interpretation of glClearBuffer(drawbuffer) The corresponding piglit tests supported this incorrect behavior instead of pointing at it. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>	2013-12-12 18:48:04 +01:00
Marek Olšák	0ad57bef96	docs/GL3: better documentation of GL 3.0	2013-12-12 18:48:04 +01:00
Marek Olšák	e4ef639a57	r600g,radeonsi: fix initialized buffer range tracking for DMA, add comments The DMA functions modify dst_offset and size and util_range_add gets wrong values. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:48:04 +01:00
Marek Olšák	7fa8fb7382	radeonsi: fix binding the dummy pixel shader This fixes valgrind errors in glxinfo. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:48:04 +01:00
Marek Olšák	0eb528abf2	radeonsi: fix FS_COLOR0_WRITES_ALL_CBUFS with mixed colorbuffer formats The 16bpc packing must be done separately for each render target. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:48:04 +01:00
Marek Olšák	cd86f773a7	radeonsi: use the colorbuffer count from the shader key As a result, the initialization of write_all must be done before the compilation. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:48:04 +01:00
Marek Olšák	e9fc552837	radeonsi: remove unused variable in si_pipe_shader_ps Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:48:04 +01:00
Andreas Hartmetz	8ee7370c9b	radeonsi: Write htile state to hardware.	2013-12-12 18:34:11 +01:00
Andreas Hartmetz	a32aa2617d	radeon: Allocate htile buffer for SI in r600_texture.	2013-12-12 18:34:11 +01:00
Andreas Hartmetz	ca5812b45c	radeon: rearrange r600_texture and related code a bit. This should make the differences and similarities between color and depth buffer handling more clear.	2013-12-12 18:34:11 +01:00
Marek Olšák	91aca8c662	r600g,radeonsi: consolidate buffer code, add handling of DISCARD_RANGE for SI This adds 2 optimizations for radeonsi: - handling of DISCARD_RANGE - mapping an uninitialized buffer range is automatically UNSYNCHRONIZED Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:34:11 +01:00
Marek Olšák	12806449fa	r600g,radeonsi: add common interface for buffer invalidation This will be used by common code in the next commit. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:34:11 +01:00
Marek Olšák	e1374d86fe	r600g,radeonsi: consolidate some debug flags Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:34:11 +01:00
Marek Olšák	43ea10eb1d	r600g: refactor out code for buffer invalidation Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:34:11 +01:00
Marek Olšák	bba39d8804	r600g,radeonsi: share flags has_cp_dma and has_streamout Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:34:11 +01:00
Marek Olšák	32fd445daa	radeonsi: handle PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE which can come from glBufferData and glMapBufferRange. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:34:11 +01:00
Marek Olšák	cc2c100274	radeonsi: implement accelerated buffer copying Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:34:11 +01:00
Marek Olšák	171e4842ec	r600g: use common interfaces in buffer_transfer_unmap i.e. dma_copy and resource_copy_region. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 18:34:11 +01:00
Marek Olšák	0aea43db93	radeon: move some functions to r600_buffer_common.c Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christoph Brill <egore911@gmail.com> v2: Renamed r600_buffer.c to r600_buffer_common.c. The stupid build system doesn't allow 2 files of the same name in different directories.	2013-12-12 18:34:05 +01:00
Marek Olšák	0b37737cc3	winsys/radeon: set/get the scanout flag with the tiling ioctls If we assume that all buffers allocated by the DDX are scanout, a new flag that says "this is not scanout" has to be added to support the non-scanout buffers and maintain backward compatibility. This fixes bad rendering on Wayland. The flag is defined as: #define RADEON_TILING_R600_NO_SCANOUT RADEON_TILING_SWAP_16BIT AFAIK, RADEON_TILING_SWAP_16BIT is not used on SI. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-12 17:26:41 +01:00
Tapani Pälli	a6345f1559	glsl: modify ir_clone to use memcpy Patch copies the whole data structure at once instead of assigning individual variables. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-12 17:28:13 +02:00
Tapani Pälli	447bb9029f	glsl: move variables in to ir_variable::data, part II This patch moves following bitfields and variables to the data structure: explicit_location, explicit_index, explicit_binding, has_initializer, is_unmatched_generic_inout, location_frac, from_named_ifc_block_nonarray, from_named_ifc_block_array, depth_layout, location, index, binding, max_array_access, atomic Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-12 17:28:11 +02:00
Tapani Pälli	33ee2c67c0	glsl: move variables in to ir_variable::data, part I This patch moves following bitfields in to the data structure: used, assigned, how_declared, mode, interpolation, origin_upper_left, pixel_center_integer Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-12 17:28:08 +02:00
Tapani Pälli	c1d3080ee8	glsl: introduce data section to ir_variable Data section helps serialization and cloning of an ir_variable. This patch includes the helper bits used for read only ir_variables. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-12 17:28:06 +02:00
Tapani Pälli	cbe7431cdb	mesa: fix a typo in glDetachShader error message Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-12 07:50:06 +02:00
Brian Paul	ccd6bf8272	svga: expose HW smooth/stipple/wide lines Newer virtual HW versions support smooth/stipple/wide lines. Use that instead of 'draw' fallbacks when possible. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-12-11 17:19:44 -08:00
Juha-Pekka Heikkila	84b1716b5e	glx: Add missing null check in DRI2WireToEvent Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-11 18:18:43 -07:00
Matthew McClure	e84a1ab3c4	llvmpipe: add plumbing for ARB_depth_clamp With this patch llvmpipe will adhere to the ARB_depth_clamp enabled state when clamping the fragment's zw value. To support this, the variant key now includes the depth_clamp state. key->depth_clamp is derived from pipe_rasterizer_state's (depth_clip == 0), thus depth clamp is only enabled when depth clip is disabled. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-12-11 18:24:21 +00:00
Vadim Girlin	00faf82832	r600g/sb: fix stack size computation on evergreen On evergreen we have to reserve 1 stack element in some additional cases besides the ones mentioned in the docs, but stack size computation was recently reimplemented exactly as described in the docs by the patch that added workarounds for stack issues on EG/CM, resulting in regressions with some apps (Serious Sam 3). This patch fixes it by restoring previous behavior. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=72369 Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org> Tested-by: Andre Heider <a.heider@gmail.com>	2013-12-11 04:08:32 +04:00
Zack Rusin	7a50d38a2b	llvmpipe: add a very useful (disabled) debugging output Disabled by default, but it's very useful when needed. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-12-10 16:41:11 -05:00
Zack Rusin	48b07fb4fc	draw: fix vbuf caching of vertices with inject front face Caching in the vbuf module meant that once a vertex has been emitted it was cached, but it's possible for a vertex at the same location to be emitted again, but this time with a different front-face semantic. Caching was causing the first version of the vertex to be emitted, which resulted in the renderer getting incorrect front-face attributes. By resetting the vertex_id (which is used for caching) we make sure that once a front-face info has been injected the vertex will end up getting emitted. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-12-10 16:40:54 -05:00
Zack Rusin	155139059b	llvmpipe: fix blending with half-float formats The fact that we flush denorms to zero breaks our half-float conversion and blending. This patch enables denorms for blending. It's a little tricky due to the llvm bug that makes it incorrectly reorder the mxcsr intrinsics: http://llvm.org/bugs/show_bug.cgi?id=6393 Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Signed-off-by: Zack Rusin <zackr@vmware.com>	2013-12-10 16:39:48 -05:00
Thomas Hellstrom	1e71493afa	svga/winsys: Implement surface sharing using prime fd handles This needs a prime-aware vmwgfx kernel module to work properly. (With additions by Christopher James Halse Rogers <raof@ubuntu.com>) Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:46:51 +01:00
Christopher James Halse Rogers	db687011e0	gallium/radeon: Implement hooks for DRI Image 7 (v2) v2: Fix transliteration of lseek arguments Ignore busy return from RADEON_GEM_BUSY ioctl; we're only after the domain Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:46:45 +01:00
Christopher James Halse Rogers	bff6c5d2b5	radeon: Rename bo_handles hashtable to match its actual contents. It's a map of GEM name->bo, so identify it as such Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:46:41 +01:00
Christopher James Halse Rogers	7d2c1df99e	ilo: Support DRI Image 7 Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:46:29 +01:00
Maarten Lankhorst	3e680de1eb	nouveau: Support DRI Image 7 extension Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:46:17 +01:00
Christopher James Halse Rogers	df3b20b2cf	gallium/dri: Support DRI Image extension version 7 v2: Fix up queryImage return for ATTRIB_FD Use driver_descriptor.configuration to determine whether the driver supports DMA-BUF import/export. v3: Really, truly, fix up queryImage return for ATTRIB_FD Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:46:13 +01:00
Christopher James Halse Rogers	6b5e15360a	gallium/dri2: Set winsys_handle type to KMS for stride query. Otherwise the default is TYPE_SHARED, which will flink the bo. This seems rather unnecessary for a simple stride query. Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:46:09 +01:00
Christopher James Halse Rogers	d5a3a2d2fb	gallium/winsys/drm: Prepare for passing prime fds in winsys_handle Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:46:05 +01:00
Christopher James Halse Rogers	343133167f	gallium/dri: Support DRI Image extension version 6 v2: Pick out the correct gl_context pointer v3: Don't leak pipe_resources on error path Set img->dri_format correctly Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>	2013-12-10 09:45:59 +01:00
Ilia Mirkin	bad8871e52	nv50: report 15 max inputs for fragment programs First off, nv50_program only has 16 in/out varyings. However reporting 16 makes 'm' become 68 in nv50_fp_linkage_validate with the varying-packing-simple piglit test. (Subverting the assert makes it compile but fail.) With this patch, varying-packing-simple passes. See: https://bugs.freedesktop.org/show_bug.cgi?id=69155 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>	2013-12-10 08:45:59 +01:00
Maarten Lankhorst	5576ad11ed	nouveau: Fix compiler warning regression cfg is now unused, remove it. Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-10 08:43:41 +01:00
Dave Airlie	0b16042377	swrast: fix readback regression since inversion fix This readback from the frontbuffer with swrast was broken, that bug just made it more obviously broken, this fixes it by inverting the sub image gets. Also fixes a few other piglits. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=72327 Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=72325 (for 9.2 the patches this depends on were asked to be backported separately in an email). Cc: "9.2" "10.0" mesa-stable@lists.fedoraproject.org Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-12-10 13:33:40 +10:00
Jordan Justen	4859d492b2	dri megadriver_stub: add compatibility for older DRI loaders To help the transition period when DRI loaders are being updated to support the newer __driDriverExtensions_foo mechanism, we populate __driDriverExtensions with the extensions returned by __driDriverExtensions_foo during a library constructor function. We find the driver foo's name by using the dladdr function which gives the path of the dynamic library's name that was being loaded. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Keith Packard <keithp@keithp.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-09 16:33:45 -08:00
Kristian Høgsberg	4ed055b4a6	egl/wayland: Return -1 from get_back_bo to indicate error A return value of -1 indicates failure to allocate the back buffer and means we don't segfault on the way out.	2013-12-09 16:14:33 -08:00
Neil Roberts	0b7058c46a	egl_dri2: Remove the unused swap_interval member of dri2_egl_surface The _EGLSurface struct which is embedded into dri2_egl_surface also contains a swap interval member so the other member is redundant. Nothing was using it as far as I can tell.	2013-12-09 16:14:32 -08:00
Kenneth Graunke	19190c2b8c	i965: Replace OUT_RELOC_FENCED with OUT_RELOC. On Gen4+, OUT_RELOC_FENCED is equivalent to OUT_RELOC; libdrm silently ignores the fenced flag: /* We never use HW fences for rendering on 965+ */ if (bufmgr_gem->gen >= 4) need_fence = false; Thanks to Eric for noticing this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-09 13:52:18 -08:00
Paul Berry	088494aa03	glsl/loops: Get rid of lower_bounded_loops and ir_loop::normative_bound. Now that loop_controls no longer creates normatively bound loops, there is no need for ir_loop::normative_bound or the lower_bounded_loops pass. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:55:09 -08:00
Paul Berry	7ea3baa64d	glsl/loops: Stop creating normatively bound loops in loop_controls. Previously, when loop_controls analyzed a loop and found that it had a fixed bound (known at compile time), it would remove all of the loop terminators and instead set the loop's normative_bound field to force the loop to execute the correct number of times. This made loop unrolling easy, but it had a serious disadvantage. Since most GPUs don't have a native mechanism for executing a loop a fixed number of times, in order to implement the normative bound, the back-ends would have to synthesize a new loop induction variable. As a result, many loops wound up having two induction variables instead of one. This caused extra register pressure and unnecessary instructions. This patch modifies loop_controls so that it doesn't set the loop's normative_bound anymore. Instead it leaves one of the terminators in the loop (the limiting terminator), so the back-end doesn't have to go to any extra work to ensure the loop terminates at the right time. This complicates loop unrolling slightly: when deciding whether a loop can be unrolled, we have to account for the presence of the limiting terminator. And when we do unroll the loop, we have to remove the limiting terminator first. For an example of how this results in more efficient back end code, consider the loop: for (int i = 0; i < 100; i++) { total += i; } Previous to this patch, on i965, this loop would compile down to this (vec4) native code: mov(8) g4<1>.xD 0D mov(8) g8<1>.xD 0D loop: cmp.ge.f0(8) null g8<4;4,1>.xD 100D (+f0) if(8) break(8) endif(8) add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD add(8) g8<1>.xD g8<4;4,1>.xD 1D add(8) g4<1>.xD g4<4;4,1>.xD 1D while(8) loop (notice that both g8 and g4 are loop induction variables; one is used to terminate the loop, and the other is used to accumulate the total). 
After this patch, the same loop compiles to: mov(8) g4<1>.xD 0D loop: cmp.ge.f0(8) null g4<4;4,1>.xD 100D (+f0) if(8) break(8) endif(8) add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD add(8) g4<1>.xD g4<4;4,1>.xD 1D while(8) loop Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:55:06 -08:00
Paul Berry	4d844cfa56	glsl/loops: Get rid of loop_variable_state::max_iterations. This value is now redundant with loop_variable_state::limiting_terminator->iterations and ir_loop::normative_bound. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:55:03 -08:00
Paul Berry	e734c9f677	glsl/loops: Simplify loop unrolling logic by breaking into functions. The old logic of loop_unroll_visitor::visit_leave(ir_loop *) was: heuristics to skip unrolling in various circumstances; if (loop contains more than one jump) return; else if (loop contains one jump) { if (the jump is an unconditional "break" at the end of the loop) { remove the break and set iteration count to 1; fall through to simple loop unrolling code; } else { for (each "if" statement in the loop body) see if the jump is a "break" at the end of one of its forks; if (the "break" wasn't found) return; splice the remainder of the loop into the other fork of the "if"; remove the "break"; complex loop unrolling code; return; } } simple loop unrolling code; return; These tasks have been moved to their own functions: - splice the remainder of the loop into the other fork of the "if" - simple loop unrolling code - complex loop unrolling code And the logic has been flattened to: heuristics to skip unrolling in various circumstances; if (loop contains more than one jump) return; if (loop contains no jumps) { simple loop unroll; return; } if (the jump is an unconditional "break" at the end of the loop) { remove the break; simple loop unroll with iteration count of 1; return; } for (each "if" statement in the loop body) { if (the jump is a "break" at the end of one of its forks) { splice the remainder of the loop into the other fork of the "if"; remove the "break"; complex loop unroll; return; } } This will make it easier to modify the loop unrolling algorithm in a future patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:59 -08:00
Paul Berry	ffc29120c4	glsl/loops: Move some analysis from loop_controls to loop_analysis. Previously, the sole responsibility of loop_analysis was to find all the variables referenced in the loop that are either loop constant or induction variables, and find all of the simple if statements that might terminate the loop. The remainder of the analysis necessary to determine how many times a loop executed was performed by loop_controls. This patch makes loop_analysis also responsible for determining the number of iterations after which each loop terminator will terminate the loop, and for figuring out which terminator will terminate the loop first (I'm calling this the "limiting terminator"). This will allow loop unrolling to make use of information that was previously only visible from loop_controls, namely the identity of the limiting terminator. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:56 -08:00
Paul Berry	4bbf6d1d2b	glsl/loops: Allocate loop_terminator using new(mem_ctx) syntax. Patches to follow will introduce code into the loop_terminator constructor. Allocating loop_terminator using new(mem_ctx) syntax will ensure that the constructor runs. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:53 -08:00
Paul Berry	714e1b331e	glsl/loops: Remove unnecessary list walk from loop_control_visitor. When loop_control_visitor::visit_leave(ir_loop *) is analyzing a loop terminator that acts on a certain ir_variable, it doesn't need to walk the list of induction variables to find the loop_variable entry corresponding to the variable. It can just look it up in the loop_variable_state hashtable and verify that the loop_variable entry represents an induction variable. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:49 -08:00
Paul Berry	115fd75ab0	glsl/loops: Remove unused fields iv_scale and biv from loop_variable class. These fields were part of some planned optimizations that never materialized. Remove them for now to simplify things; if we ever get round to adding the optimizations that would require them, we can always re-introduce them. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:46 -08:00
Paul Berry	e00b93a1f7	glsl/loops: replace loop controls with a normative bound. This patch replaces the ir_loop fields "from", "to", "increment", "counter", and "cmp" with a single integer ("normative_bound") that serves the same purpose. I've used the name "normative_bound" to emphasize the fact that the back-end is required to emit code to prevent the loop from running more than normative_bound times. (By contrast, an "informative" bound would be a bound that is informational only). Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:33 -08:00
Paul Berry	2c17f97fe6	glsl/loops: consolidate bounded loop handling into a lowering pass. Previously, all of the back-ends (ir_to_mesa, st_glsl_to_tgsi, and the i965 fs and vec4 visitors) had nearly identical logic for handling bounded loops. This replaces the duplicate logic with an equivalent lowering pass that is used by all the back-ends. Note: on i965, there is a slight increase in instruction count. For example, a loop like this: for (int i = 0; i < 100; i++) { total += i; } would previously compile down to this (vec4) native code: mov(8) g4<1>.xD 0D mov(8) g8<1>.xD 0D loop: cmp.ge.f0(8) null g8<4;4,1>.xD 100D (+f0) break(8) add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD add(8) g8<1>.xD g8<4;4,1>.xD 1D add(8) g4<1>.xD g4<4;4,1>.xD 1D while(8) loop After this patch, the "(+f0) break(8)" turns into: (+f0) if(8) break(8) endif(8) because the back-end isn't smart enough to recognize that "if (condition) break;" can be done using a conditional break instruction. However, it should be relatively easy for a future peephole optimization to properly optimize this. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:26 -08:00
Paul Berry	97d8b77054	glsl: In loop analysis, handle unconditional second assignment. Previously, loop analysis would set this->conditional_or_nested_assignment based on the most recently visited assignment to the variable. As a result, if a variable was assigned to more than once in a loop, the flag might be set incorrectly. For example, in a loop like this: int x; for (int i = 0; i < 3; i++) { if (i == 0) x = 10; ... x = 20; ... } loop analysis would have incorrectly concluded that all assignments to x were unconditional. In practice this was a benign bug, because conditional_or_nested_assignment is only used to disqualify variables from being considered as loop induction variables or loop constant variables, and having multiple assignments also disqualifies a variable from being considered as either of those things. Still, we should get the analysis correct to avoid future confusion. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:23 -08:00
Paul Berry	cb38a0dc0a	glsl: Fix handling of function calls inside nested loops. Previously, when visiting an ir_call, loop analysis would only mark the innermost enclosing loop as containing a call. As a result, when encountering a loop like this: for (i = 0; i < 3; i++) { for (int j = 0; j < 3; j++) { foo(); } } it would incorrectly conclude that the outer loop ran three times. (This is not certain; if foo() modifies i, then the outer loop might run more or fewer times). Fixes piglit test "vs-call-in-nested-loop.shader_test". Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:20 -08:00
Paul Berry	877db5a792	glsl: Fix loop analysis of nested loops. Previously, when visiting a variable dereference, loop analysis would only consider its effect on the innermost enclosing loop. As a result, when encountering a loop like this: for (int i = 0; i < 3; i++) { for (int j = 0; j < 3; j++) { ... i = 2; } } it would incorrectly conclude that the outer loop ran three times. Fixes piglit test "vs-inner-loop-modifies-outer-loop-var.shader_test". Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:16 -08:00
Paul Berry	2e060551bd	glsl: Extract functions from loop_analysis::visit(ir_dereference_variable *). This function is about to get more complex. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-09 10:54:13 -08:00
Paul Berry	69c44d65c8	i965/gen7+: Implement fast color clears for MSAA buffers. Fast color clears of MSAA buffers work just like fast color clears with non-MSAA buffers, except that the alignment and scaledown requirements are different. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-12-09 10:54:10 -08:00
Paul Berry	0ac622accf	i965/blorp: Refactor code for computing fast clear align/scaledown factors. This will make it easier to add fast color clear support to MSAA buffers, since they have different alignment and scaling requirements. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-09 10:54:07 -08:00
Paul Berry	da08ee8e3b	i965/blorp: allow multisample blorp clears Previously, we didn't do multisample blorp clears because we couldn't figure out how to get them to work. The reason for this was because we weren't setting the brw_blorp_params num_samples field consistently with dst.num_samples. Now that those two fields have been collapsed down into one, we can do multisample blorp clears. However, we need to do a few other pieces of bookkeeping to make them work correctly in all circumstances: - Since blorp clears may now operate on multisampled window system framebuffers, they need to call intel_renderbuffer_set_needs_downsample() to ensure that a downsample happens before buffer swap (or glReadPixels()). - When clearing a layered multisample buffer attachment using UMS or CMS layout, we need to advance layer by multiples of num_samples (since each logical layer is associated with num_samples physical layers). Note: we still don't do multisample fast color clears; more work needs to be done to enable those. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-09 10:54:03 -08:00
Paul Berry	73e8bd9f5c	i965/blorp: Get rid of redundant num_samples blorp param. Previously, brw_blorp_params contained two fields for determining sample count: num_samples (which determined the multisample configuration of the rendering pipeline) and dst.num_samples (which determined the multisample configuration of the render target surface). This was redundant, since both fields had to be set to the same value to avoid rendering errors. This patch eliminates num_samples to avoid future confusion. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-09 10:54:00 -08:00
Paul Berry	25195b0041	i965/gen7+: Disentangle MSAA layout from fast clear state. This patch renames the enum that's used to keep track of fast clear state from "mcs_state" to "fast_clear_state", and it removes the enum value INTEL_MCS_STATE_MSAA (which previously meant, "this is an MSAA buffer, so we're not keeping track of fast clear state"). The only real purpose that enum value was serving was to prevent us from trying to do fast clear resolves on MSAA buffers, and it's just as easy to prevent that by checking the buffer's msaa_layout. This paves the way for implementing fast clears of MSAA buffers. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-09 10:51:10 -08:00
Paul Berry	f416a15096	i965: Don't try to use HW blitter for glCopyPixels() when multisampled. The hardware blitter doesn't understand multisampled layouts, so there's no way this could possibly succeed. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-09 10:51:07 -08:00
Paul Berry	b5fe413b4d	i965: Document conventions for counting layers in 2D multisample buffers. The "layer" parameters used in blorp, and the intel_renderbuffer::mt_layer field, represent a physical layer rather than a logical layer. This is important for 2D multisample arrays on Gen7+ because the UMS and CMS multisample layouts use N physical layers to represent each logical layer, where N is the number of samples. Also add an assertion to blorp to help catch bugs if we fail to follow these conventions. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-09 10:51:03 -08:00
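The layer convention documented in b5fe413b4d can be illustrated with a small helper. This is a sketch under assumptions: the enum and function name are hypothetical, and it only computes the first physical layer belonging to a logical layer, which is the part the commit messages state directly.

```c
#include <assert.h>

/* Hypothetical layout tags mirroring the distinction drawn in the text. */
enum msaa_layout { LAYOUT_NONE, LAYOUT_IMS, LAYOUT_UMS, LAYOUT_CMS };

/* Return the first physical layer that backs a given logical layer.
 * UMS and CMS use num_samples physical layers per logical layer;
 * other layouts map logical layers to physical layers one to one. */
static unsigned first_physical_layer(enum msaa_layout layout,
                                     unsigned logical_layer,
                                     unsigned num_samples)
{
    if (layout == LAYOUT_UMS || layout == LAYOUT_CMS) {
        assert(num_samples > 0);
        return logical_layer * num_samples;
    }
    return logical_layer;
}
```

For a 4x CMS array, logical layer 2 would therefore start at physical layer 8, which is why a layered clear has to step in multiples of num_samples.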
Paul Berry	3a2925bfa9	i965/blorp: Improve fast color clear comment. Clarify the fact that we only optimize full buffer clears using fast color clear, and why. Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-09 10:51:00 -08:00
Tom Stellard	9a5ce0c4c9	r300/compiler/tests: Fix line length check in test parser Reviewed-by: Alex Deucher <alexander.deucher@amd.com> CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-09 09:40:15 -05:00
Tom Stellard	1896431f79	r300/compiler/tests: Fix segfault Reviewed-by: Alex Deucher <alexander.deucher@amd.com> CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-09 09:40:15 -05:00
Ilia Mirkin	2cd2b9705e	nouveau/video: update a few more h264 picparm field names Based on comments by Benjamin Morris <bmorris@nvidia.com> in http://lists.freedesktop.org/archives/nouveau/2013-December/015328.html This adds setting of is_long_term, and updates a few field names we were unclear about. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-09 15:11:50 +01:00
Ilia Mirkin	78525dae8a	nouveau/video: update h264 picparm field names based on usage Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-09 15:11:42 +01:00
Ilia Mirkin	e01ba9d6b0	nv50: enable h264 and mpeg4 for nv98+ (vp3, vp4.0) Create the ref_bo without any storage type flags set for now. The issue probably arises from our use of the additional buffer space at the end of the ref_bo. It should probably be split up in the future. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Tested-by: Martin Peres <martin.peres@labri.fr> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-09 15:11:20 +01:00
Ilia Mirkin	e796fa22d4	nvc0: make sure nvd7 gets NVC8_3D_CLASS as well Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-12-09 15:10:37 +01:00
Ilia Mirkin	1386cb9488	nv50: TXF already has integer arguments, don't try to convert from f32 Fixes the texelFetch piglit tests Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-12-09 15:10:37 +01:00
Matthew McClure	0319ea9ff6	llvmpipe: clamp fragment shader depth write to the current viewport depth range. With this patch, generate_fs_loop will clamp any fragment shader depth writes to the viewport's min and max depth values. Viewport selection is determined by the geometry shader output for the viewport array index. If no index is specified, then the default viewport index is zero. Semantics for this path can be found in draw_clamp_viewport_idx and lp_clamp_viewport_idx. lp_jit_viewport was created to store viewport information visible to JIT code, and is validated when the LP_NEW_VIEWPORT dirty flag is set. lp_rast_shader_inputs is responsible for passing the viewport_index through the rasterizer stage to fragment stage (via lp_jit_thread_data). Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-12-09 12:57:02 +00:00
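The clamping behavior described in 0319ea9ff6 might look roughly like this in scalar C. It is only a sketch: llvmpipe generates this logic as JIT code, and the struct, the `MAX_VIEWPORTS` value, and the fallback to viewport 0 for out-of-range indices are assumptions made for illustration.

```c
#include <assert.h>

#define MAX_VIEWPORTS 16  /* assumed limit for this sketch */

/* Hypothetical stand-in for the per-viewport depth range seen by JIT code. */
struct viewport_depth {
    float min_depth;
    float max_depth;
};

/* Assumed semantics: an out-of-range index falls back to viewport 0. */
static int clamp_viewport_idx(int idx)
{
    return (idx >= 0 && idx < MAX_VIEWPORTS) ? idx : 0;
}

/* Clamp a fragment shader depth write to the selected viewport's range. */
static float clamp_fs_depth(const struct viewport_depth *vps,
                            int viewport_idx, float z)
{
    const struct viewport_depth *vp = &vps[clamp_viewport_idx(viewport_idx)];
    assert(vp->min_depth <= vp->max_depth);
    if (z < vp->min_depth)
        return vp->min_depth;
    if (z > vp->max_depth)
        return vp->max_depth;
    return z;
}
```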
Neil Roberts	992a2dbba8	wayland: Add support for eglSwapInterval The Wayland EGL platform now respects the eglSwapInterval value. The value is clamped to either 0 or 1 because it is difficult (and probably not useful) to sync to more than 1 redraw. The main change is that if the swap interval is 0 then Mesa won't install a frame callback so that eglSwapBuffers can be executed as often as necessary. Instead it will do a sync request after the swap buffers. It will block for sync complete event in get_back_bo instead of the frame callback. The compositor is likely to send a release event while processing the new buffer attach and this makes sure we will receive that before deciding whether to allocate a new buffer. If there are no buffers available then instead of returning with an error, get_back_bo will now poll the compositor by repeatedly sending sync requests every 10ms. This is a last resort and in theory this shouldn't happen because there should be no reason for the compositor to hold on to more than three buffers. That means whenever we attach the fourth buffer we should always get an immediate release event which should come in with the notification for the first sync request that we are throttled to. When the compositor is directly scanning out from the application's buffer it may end up holding on to three buffers. These are the one that it is currently scanning out from, one that has been given to DRM as the next buffer to flip to, and one that has been attached and will be given to DRM as soon as the previous flip completes. When we attach a fourth buffer to the compositor it should replace that third buffer so we should get a release event immediately after that. This patch therefore also changes the number of buffer slots to 4 so that we can accommodate that situation. If DRM eventually gets a way to cancel a pending page flip then the compositors can be changed to only need to hold on to two buffers and this value can be put back to 3. 
This also moves the vblank configuration defines from platform_x11.c to the common egl_dri2.h header so they can be shared by both platforms.	2013-12-07 22:36:02 -08:00
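The swap interval handling in 992a2dbba8 can be summarized in a few lines. The names below are hypothetical and the real platform code is organized differently; treating negative intervals as 0 is an assumption of this sketch.

```c
#include <assert.h>

/* Hypothetical tags for the two throttling mechanisms named in the text. */
enum throttle { THROTTLE_SYNC_REQUEST, THROTTLE_FRAME_CALLBACK };

/* Clamp the requested interval to 0 or 1, as the commit message describes. */
static int clamp_swap_interval(int interval)
{
    int clamped = interval > 0 ? 1 : 0;
    assert(clamped == 0 || clamped == 1);
    return clamped;
}

/* Interval 0: no frame callback, throttle on a sync request instead.
 * Interval 1: wait for the compositor's frame callback. */
static enum throttle throttle_for_interval(int interval)
{
    return clamp_swap_interval(interval) == 0 ? THROTTLE_SYNC_REQUEST
                                              : THROTTLE_FRAME_CALLBACK;
}
```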
Neil Roberts	25cc889004	wayland: Block for the frame callback in get_back_bo not dri2_swap_buffers Consider a typical game-style main loop which might be like this: while (1) { draw_something(); eglSwapBuffers(); } In this case the game is relying on eglSwapBuffers to throttle to a sensible frame rate. Previously this game would end up using three buffers even though it should only need two. This is because Mesa decides whether to allocate a new buffer in get_back_bo which would be before it has tried to read any events from the compositor so it wouldn't have seen any buffer release events yet. This patch just moves the block for the frame callback to get_back_bo. Typically the compositor will send a release event immediately after one of the attaches so if we block for the frame callback here then we can be sure to have completed at least one roundtrip and received that release event after attaching the previous buffer before deciding whether to allocate a new one. dri2_swap_buffers always calls get_back_bo so even if the client doesn't render anything we will still be sure to block to the frame callback. The code to create the new frame callback has been moved to after this call so that we can be sure to have cleared the previous frame callback before requesting a new one.	2013-12-07 22:36:02 -08:00
Vinson Lee	965cde9232	glapi: Do not include dlfcn.h on Windows. This patch fixes this MinGW build error. CC glapi_gentable.lo glapi_gentable.c:47:19: fatal error: dlfcn.h: No such file or directory Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-07 14:31:01 -08:00
Vincent Lejeune	797894036d	r600/llvm: Allow arbitrary amount of temps in tgsi to llvm	2013-12-07 18:39:10 +01:00
Rob Clark	a1d808638d	freedreno/a3xx: add adreno 330 support Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-12-07 09:37:24 -05:00
Rob Clark	d36ae204d5	freedreno/a3xx/compiler: add ROUND Signed-off-by: Rob Clark <robclark@freedesktop.org>	2013-12-07 08:45:27 -05:00
Chris Forbes	88dc246630	mesa: Require per-sample shading if the `sample` qualifier is used. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-12-07 17:15:05 +13:00
Chris Forbes	2625a34bfc	glsl: Populate gl_fragment_program::IsSample bitfield Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-12-07 17:15:03 +13:00
Chris Forbes	6429cc05ca	mesa: add IsSample bitfield to gl_fragment_program Drivers will need to look at this to decide if they need to do per-sample fragment shader dispatch. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-12-07 17:15:01 +13:00
Chris Forbes	5d326fa963	glsl: Put `sample`-qualified varyings in their own packing classes Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-12-07 17:14:59 +13:00
Chris Forbes	51c5fc85e1	glsl: Add ir support for `sample` qualifier; adjust compiler and linker Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-12-07 17:14:58 +13:00
Chris Forbes	51aa15aca2	glsl: Add frontend support for `sample` auxiliary storage qualifier Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-12-07 17:14:39 +13:00
Chris Forbes	a1ca580240	i965: Don't flag gather quirks for Gen8+ My understanding is that Broadwell retains the same SCS mechanism that Haswell has, so even if the underlying issue with this format is not fixed, the w/a will be applied in SCS rather than needing shader code. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-07 16:17:27 +13:00
Chris Forbes	83b83fb984	i965/Gen7: Allow CMS layout for multisample textures Now that all the pieces are in place, this should provide a nice performance boost for apps using multisample textures. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-07 16:10:04 +13:00
Chris Forbes	3122c2421a	i965/vs: Sample from MCS surface when required Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-07 16:10:02 +13:00
Chris Forbes	7810162053	i965/fs: Sample from MCS surface when required Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-07 16:09:49 +13:00
Chris Forbes	7629c489c8	i965: Add shader opcode for sampling MCS surface Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-07 16:09:32 +13:00
Chris Forbes	27359b8079	i965/Gen7: Include bitfield in the sampler key for CMS layout We need to emit extra shader code in this case to sample the MCS surface first; we can't just blindly do this all the time since IVB will sometimes try to access the MCS surface even if disabled. V3: Use actual MSAA layout from the texture's mt, rather than computing what would have been used based on the format. This is simpler and less fragile - there's at least one case where we might want to have a texture's MSAA layout change based on what the app does (CMS SINT falling back to UMS if the app ever attempts to render to it with a channel disabled.) This also obsoletes V2's 1/10 -- compute_msaa_layout can now remain an implementation detail of the miptree code. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-07 16:09:12 +13:00
Chris Forbes	b1604841c2	i965/Gen7: Move decision to allocate MCS surface into intel_mipmap_create This gives us correct behavior for both renderbuffers (which previously worked) and multisample textures (which would never get an MCS surface allocated, even if CMS layout was selected) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-07 16:08:55 +13:00
Chris Forbes	6ca9a6f4d7	i965/Gen7: emit mcs info for multisample textures Previously this was only done for render targets. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-07 16:08:52 +13:00
Chris Forbes	dfa952da97	i965/wm: Set copy of sample mask in 3DSTATE_PS correctly for Haswell The bspec says: "SW must program the sample mask value in this field so that it matches with 3DSTATE_SAMPLE_MASK" I haven't observed this to actually fix anything, but stumbled across it while adding the rest of the support for CMS layout for multisample textures. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-07 16:08:47 +13:00
Chris Forbes	8064b0f2c4	i965: refactor sample mask calculation Haswell needs a copy of the sample mask in 3DSTATE_PS; this makes that convenient. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-07 16:07:53 +13:00
Ian Romanick	758658850b	glsl: Don't emit empty declaration warning for a struct specifier The intention is that things like int; will generate a warning. However, we were also accidentally emitting the same warning for things like struct Foo { int x; }; Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68838 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Aras Pranckevicius <aras@unity3d.com> Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>	2013-12-06 08:06:54 -08:00
Thomas Hellstrom	453651e521	st/xa: Bump major version number to 2 For some reason this was left out when the version was changed... Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2013-12-06 06:18:03 -08:00
Ben Skeggs	92ceb327ba	nvc0: fixup gk110 and up not being listed in various switch statements Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2013-12-06 11:28:45 +10:00
Kenneth Graunke	26f3ff8a91	i965: Replace non-standard INLINE macro with "inline". These are identical: main/compiler.h defines INLINE to "inline". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-05 13:59:18 -08:00
Kenneth Graunke	11d9af7c0a	i965: Don't use GL types in files shared with intel-gpu-tools. sed -i -e 's/GLuint/unsigned/g' -e 's/GLint/int/g' \ -e 's/GLfloat/float/g' -e 's/GLubyte/uint8_t/g' \ -e 's/GLshort/int16_t/g' \ brw_eu* brw_disasm.c brw_structs.h Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-05 13:59:18 -08:00
Kenneth Graunke	a7bdd4cba8	i965: Drop trailing whitespace from the rest of the driver. Performed via: $ for file in ; do sed -i 's/ //g'; done Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-05 13:59:18 -08:00
Kenneth Graunke	d542c45c75	i965: Drop trailing whitespace from files shared with intel-gpu-tools. Performed via s/ *$//g. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-05 13:59:18 -08:00
José Fonseca	3be333ed30	tools/trace: More tweaks to state dumping. - Ignore buffer format (it is totally arbitrary) - Initialize state. - Handle begin/end_query statements.	2013-12-05 13:35:06 +00:00
José Fonseca	9648b76dc4	trace: Reorder dumping of pipe_rasterizer_state. Such that it matches the pipe_rasterizer_state declaration, making it easier to double-check that all state is being actually dumped. Trivial.	2013-12-05 13:35:06 +00:00
José Fonseca	10450cbbe6	trace: Dump pipe_sampler_state::seamless_cube_map. Trivial.	2013-12-05 13:35:06 +00:00
Michel Dänzer	7435d9f77c	radeonsi: Remove some stale XXX / FIXME comments Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-12-05 13:50:07 +09:00
Matt Turner	cbb49cb2f7	i965: Emit better code for ir_unop_sign. total instructions in shared programs: 1550449 -> 1550048 (-0.03%) instructions in affected programs: 15207 -> 14806 (-2.64%) Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>	2013-12-04 20:05:44 -08:00
Matt Turner	d30b2ed5f8	i965/fs: New peephole optimization to flatten IF/BREAK/ENDIF. total instructions in shared programs: 1550713 -> 1550449 (-0.02%) instructions in affected programs: 7931 -> 7667 (-3.33%) Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-04 20:05:44 -08:00
Matt Turner	9658b04fc4	i965/fs: Emit a MOV instead of a SEL if the sources are the same. One program affected. instructions in affected programs: 436 -> 428 (-1.83%) Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-04 20:05:44 -08:00
Matt Turner	4532cac06a	i965/fs: Extend SEL peephole to handle only matching MOVs. Before this patch, the following code would not be optimized even though the first two instructions were common to the then and else blocks: (+f0) IF MOV dst0 ... MOV dst1 ... MOV dst2 ... ELSE MOV dst0 ... MOV dst1 ... MOV dst3 ... ENDIF This commit extends the peephole to handle this case. No shader-db changes. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:44 -08:00
Matt Turner	13de9f03f1	i965/fs: New peephole optimization to generate SEL. fs_visitor::try_replace_with_sel optimizes only if statements whose "then" and "else" bodies contain a single MOV instruction. It also could not handle constant arguments, since they cause an extra MOV immediate to be generated (since we haven't run constant propagation, there are more than the single MOV). This peephole fixes both of these and operates as a normal optimization pass. fs_visitor::try_replace_with_sel is still arguably necessary, since it runs before pull constant loads are lowered. total instructions in shared programs: 1559129 -> 1545833 (-0.85%) instructions in affected programs: 167120 -> 153824 (-7.96%) GAINED: 13 LOST: 6 Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-04 20:05:44 -08:00
Matt Turner	fa227e7cbc	i965/fs: Add SEL() convenience function. Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-04 20:05:43 -08:00
Matt Turner	4b0ef4bf38	glsl: Use fabs() on floating point values. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-12-04 20:05:43 -08:00
Matt Turner	8814806c97	i965: Print conditional mod in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:43 -08:00
Matt Turner	b9af66528e	i965: Externalize conditional_modifier for use in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:43 -08:00
Matt Turner	637dda1c30	i965: Print argument types in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:43 -08:00
Matt Turner	21e92e74c8	i965: Externalize reg_encoding for use in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:43 -08:00
Matt Turner	729fe77e3b	i965/vec4: Don't print swizzles for immediate values. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:43 -08:00
Matt Turner	2b8e0a73fb	i965/vec4: Print negate and absolute value for src args. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:43 -08:00
Matt Turner	a85f1b7adf	i965/vec4: Add support for printing HW_REGs in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:43 -08:00
Matt Turner	942151af30	i965/fs: Print ARF registers properly in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:43 -08:00
Matt Turner	0e4053234d	i965: Don't print extra (null) arguments in dump_instruction(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	d79e711718	glsl: Remove silly OR(..., 0x0) from ldexp() lowering. I translated copysign(0.0f, x) a little too literally. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
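To see why the OR in d79e711718 was redundant, consider what copysign(0.0f, x) means at the bit level. The following is an illustrative C sketch with hypothetical helper names, assuming a 32-bit IEEE 754 float; it is not the GLSL IR lowering itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build copysign(0.0f, x) with bit operations. The magnitude of 0.0f is
 * all zero bits, so OR-ing it into the extracted sign bit changes nothing. */
static float signed_zero(float x)
{
    uint32_t bits;
    float result;

    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &x, sizeof bits);
    bits &= 0x80000000u;            /* keep only the sign bit */
    /* bits |= 0x0; would be the literal, and pointless, translation */
    memcpy(&result, &bits, sizeof result);
    return result;
}

/* Helper for inspecting the sign bit of a float (1 if set, 0 otherwise). */
static unsigned sign_bit_of(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits >> 31;
}
```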
Matt Turner	b1eb2ad8d1	i965: Allow commuting the operands of ADDC for const propagation. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	04d83396ee	i965/fs: Rename register_coalesce_2() -> register_coalesce(). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	9a6b14f674	i965/fs: Remove now useless register_coalesce() pass. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	1520ae48b8	i965/fs: Let register_coalesce_2() eliminate self-moves. This is the last thing that register_coalesce() still handled. total instructions in shared programs: 1561060 -> 1560908 (-0.01%) instructions in affected programs: 15758 -> 15606 (-0.96%) Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	8786f381ec	i965: Allow constant propagation into ASR and BFI1. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	ba84800275	i965/cfg: Document cur_* variables. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	7642c3c6ff	i965/cfg: Remove ip & cur from brw_cfg. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	d2fcdd0973	i965/cfg: Clean up cfg_t constructors. parent_mem_ctx was unused since `db47074a`, so remove the two wrappers around create() and make create() the constructor. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	c6450fa963	i965/cfg: Throw out confusing make_list method. make_list is just a one-line wrapper and was confusingly called by NULL objects. E.g., cur_if == NULL; cur_if->make_list(mem_ctx). Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	f3bce19f6c	i965/cfg: Include only needed headers. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:42 -08:00
Matt Turner	f4b50a1466	i965/cfg: Remove unnecessary endif_stack. Unnecessary since last commit. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:41 -08:00
Matt Turner	2eb9bbfb68	i965/cfg: Rework to make IF & ELSE blocks flow into ENDIF. Previously we made the basic block following an ENDIF instruction a successor of the basic blocks ending with IF and ELSE. The PRM says that IF and ELSE instructions jump to the ENDIF, rather than over it. This should be immaterial to dataflow analysis, except for if, break, endif sequences: START B1 <-B0 <-B9 0x00000100: cmp.g.f0(8) null g15<8,8,1>F g4<0,1,0>F 0x00000110: (+f0) if(8) 0 0 null 0x00000000UD END B1 ->B2 ->B4 START B2 <-B1 break 0x00000120: break(8) 0 0 null 0D END B2 ->B10 START B3 0x00000130: endif(8) 2 null 0x00000002UD END B3 ->B4 The ENDIF block would have no parents, so dataflow analysis would generate incorrect results, preventing copy propagation from eliminating some instructions. This patch changes the CFG to make ENDIF start rather than end basic blocks, so that it can be the jump target of the IF and ELSE instructions. It helps three programs (including two fs8/fs16 pairs). total instructions in shared programs: 1561126 -> 1561060 (-0.00%) instructions in affected programs: 837 -> 771 (-7.89%) More importantly, it allows copy propagation to handle more cases. Disabling the register_coalesce() pass before this patch hurts 58 programs, while afterward it only hurts 11 programs. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:41 -08:00
Matt Turner	ed85c0f409	i965/cfg: Keep pointers to IF/ELSE/ENDIF instructions in the cfg. Useful for finding the associated control flow instructions, given a block ending in one. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:41 -08:00
Matt Turner	51194932d3	i965/cfg: Add code to dump blocks and cfg. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-04 20:05:41 -08:00
Ian Romanick	fa1923ac3a	mesa: Remove GL_MESA_texture_array cruft from gl.h glext.h has had all the necessary bits for years. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 17:22:43 -08:00
Ian Romanick	2a3d1e2e06	mesa: Remove support for GL_MESA_texture_array This extension enabled the use of texture array with fixed-function and assembly fragment shaders. No applications are known to use this extension. NOTE: This patch regresses GL_TEXTURE_1D_ARRAY and GL_TEXTURE_2D_ARRAY cases of the copyteximage piglit test. The test is incorrectly using texture arrays with fixed function while only requiring the GL_EXT_texture_array extension. A fix for the test has been posted to the piglit mailing list. http://lists.freedesktop.org/archives/piglit/2013-November/008639.html Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 17:22:42 -08:00
Ian Romanick	538a7f2a80	mesa: Use a single enable for GL_EXT_texture_array and GL_MESA_texture_array Every driver that enables one also enables the other. The difference between the two is MESA adds support for fixed-function and assembly fragment shaders, but EXT only adds support for GLSL. The MESA extension was created back when Mesa did not support GLSL. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 17:22:42 -08:00
Ian Romanick	e0587fb9d0	mesa: Minor clean-up of target_enum_to_index Constify the gl_context parameter, and remove suffixes from enums that have non-suffix versions. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 17:22:42 -08:00
Ian Romanick	b092af40a5	mesa: Silence GCC warning in count_tex_size main/texobj.c: In function 'count_tex_size': main/texobj.c:886:23: warning: unused parameter 'key' [-Wunused-parameter] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 17:22:42 -08:00
Ian Romanick	6c84fc2dbf	mesa: Silence GCC warning in _mesa_test_texobj_completeness main/texobj.c: In function '_mesa_test_texobj_completeness': main/texobj.c:553:34: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] main/texobj.c:553:193: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] main/texobj.c:553:254: warning: signed and unsigned type in conditional expression [-Wsign-compare] main/texobj.c:553:148: warning: signed and unsigned type in conditional expression [-Wsign-compare] Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 17:22:42 -08:00
Ian Romanick	7144b76872	mesa: Add missing API check for GL_TEXTURE_3D There are no 3D textures in OpenGL ES 1.x. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 17:22:42 -08:00
Ian Romanick	01bbebce4d	mesa: Add missing checks for GL_TEXTURE_CUBE_MAP_ARRAY That enum requires GL_ARB_texture_cube_map_array, and it is only available on desktop GL. It looks like this has been an un-noticed issue since GL_ARB_texture_cube_map_array support was added in commit `e0e7e295`. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 17:22:42 -08:00
Neil Roberts	5cddb1ce3c	wayland: Add an extension to create wl_buffers from EGLImages This adds an extension called EGL_WL_create_wayland_buffer_from_image which adds the following single function: struct wl_buffer * eglCreateWaylandBufferFromImageWL(EGLDisplay dpy, EGLImageKHR image); The function creates a wl_buffer which shares its contents with the given EGLImage. The expected use case for this is in a nested Wayland compositor which is using subsurfaces to present buffers from its clients. Using this extension it can attach the client buffers directly to the subsurface without having to blit the contents into an intermediate buffer. The compositing can then be done in the parent compositor. The extension is only implemented in the Wayland EGL platform because of course it wouldn't make sense anywhere else.	2013-12-04 17:04:57 -08:00
Kristian Høgsberg	bce64c6c83	egl/wayland: Damage INT32_MAX x INT32_MAX region for eglSwapBuffers If we're not using EGL_EXT_swap_buffers_with_damage, we have to damage the full extent. EGL operates on buffer coordinates, but wl_surface.damage takes surface coordinates. EGL doesn't know the buffer transformation (rotated or scaled) and can't post accurate damage in surface coordinates. The damage event however is clipped to the surface extents so we can just damage the maximum rectangle. In case of EGL_EXT_swap_buffers_with_damage, the application knows the buffer transform and is expected to pass in rectangles in surface space. https://bugs.freedesktop.org/show_bug.cgi?id=70250 Cc: "10.0" mesa-stable@lists.freedesktop.org	2013-12-04 16:13:42 -08:00
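The full-extent damage described in the commit above relies on the damage being clipped to the surface extents. A minimal sketch of that clipping, assuming a hypothetical `struct rect` and `clip_damage()` helper (neither is a Wayland or Mesa name; in practice the clipping happens on the compositor side):

```c
#include <stdint.h>

/* Hypothetical rectangle type for illustration; not a Wayland or Mesa struct. */
struct rect {
    int32_t x, y, width, height;
};

/* Clip a damage rectangle to a surface whose extents are
 * (0, 0) .. (surf_w, surf_h). Far edges are computed in 64 bits so that
 * x + INT32_MAX cannot overflow. */
static struct rect
clip_damage(struct rect d, int32_t surf_w, int32_t surf_h)
{
    int64_t x0 = d.x > 0 ? d.x : 0;
    int64_t y0 = d.y > 0 ? d.y : 0;
    int64_t x1 = (int64_t)d.x + d.width;
    int64_t y1 = (int64_t)d.y + d.height;
    struct rect out = { 0, 0, 0, 0 };

    if (x1 > surf_w)
        x1 = surf_w;
    if (y1 > surf_h)
        y1 = surf_h;

    /* An empty intersection is reported as a zero-sized rectangle. */
    if (x1 > x0 && y1 > y0) {
        out.x = (int32_t)x0;
        out.y = (int32_t)y0;
        out.width = (int32_t)(x1 - x0);
        out.height = (int32_t)(y1 - y0);
    }
    return out;
}
```

Passing `{0, 0, INT32_MAX, INT32_MAX}` with any surface size yields exactly the surface rectangle, which is why the client does not need to know the buffer transform in the non-damage path.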
Axel Davy	afcce46fd5	Enable throttling in SwapBuffers flush_with_flags, when available, allows the driver to throttle. Using this suppresses input lag issues that can be observed in heavy rendering situations on non-intel cards. Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.0" mesa-stable@lists.freedesktop.org	2013-12-04 15:58:29 -08:00
Kristian Høgsberg	33eb5eabee	egl/wayland: Send commit after flushing the driver context This typically won't make a difference, since we only send the requests at wl_display_flush() time. There might be a small race with another thread calling wl_display_flush() after our commit request, but before we flush the DRI driver. Moving the commit below the DRI driver flush call looks more natural and eliminates the small race. Cc: "10.0" mesa-stable@lists.freedesktop.org	2013-12-04 15:48:28 -08:00
Axel Davy	402bf6e8d0	egl/wayland: Flush the wl_display at the end of SwapBuffers We would like the compositor to receive the committed buffer as soon as possible, so it has the time to treat it, and release old ones. We shouldn't rely on the client to flush the queue for us. Signed-off-by: Axel Davy <axel.davy@ens.fr> Cc: "10.0" mesa-stable@lists.freedesktop.org	2013-12-04 15:48:28 -08:00
Brian Paul	50205e11c6	mesa: reduce memory used for short display lists Display lists allocate memory in chunks of 256 tokens (1KB) at a time. If an app creates many short display lists or uses glXUseXFont() this can waste quite a bit of memory. This patch uses realloc() to trim short lists and reduce the memory used. Also, null/zero-out some list construction fields in _mesa_EndList(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-04 15:40:32 -07:00
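The trimming idea in the commit above can be sketched as follows. This is an illustration only, assuming a hypothetical `struct token_list` with 4-byte tokens; the 256-token chunk size is taken from the commit message, and none of the names below are Mesa's:

```c
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_TOKENS 256            /* chunk size named in the commit message */

typedef uint32_t token_t;           /* hypothetical 4-byte token */

struct token_list {                 /* hypothetical container, not a Mesa type */
    token_t *tokens;
    size_t used;                    /* tokens actually written */
    size_t capacity;                /* tokens currently allocated */
};

/* Start a list with one full chunk. Returns 0 on success, -1 on failure. */
static int
token_list_begin(struct token_list *l)
{
    l->tokens = malloc(BLOCK_TOKENS * sizeof(token_t));
    l->used = 0;
    l->capacity = l->tokens ? BLOCK_TOKENS : 0;
    return l->tokens ? 0 : -1;
}

/* Finish a list: shrink the allocation to what was actually used.
 * If realloc() fails, the original (larger) block stays valid. */
static void
token_list_end(struct token_list *l)
{
    if (l->used > 0 && l->used < l->capacity) {
        token_t *trimmed = realloc(l->tokens, l->used * sizeof(token_t));
        if (trimmed) {
            l->tokens = trimmed;
            l->capacity = l->used;
        }
    }
}
```

A three-token list ends up holding 12 bytes instead of the full 1KB chunk.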
Brian Paul	314ccf6901	mesa: update/remove display list comments Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-04 09:46:07 -07:00
Brian Paul	483dc973c4	mesa: remove gl_dlist_node::next pointer to reduce dlist memory use Now, sizeof(gl_dlist_node)==4 even on 64-bit systems. This can halve the memory used by some display lists on 64-bit systems. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-04 09:46:07 -07:00
Brian Paul	b6468b4597	mesa: begin reducing memory used by display lists This is a first step in reducing memory used by display lists on 64-bit systems. On 64-bit systems, the gl_dlist_node union type is 8 bytes because of the 'data' and 'next' fields. This causes every display list node/token to occupy 8 bytes instead of 4 as originally designed. This basically doubles the memory used by some display lists on 64-bit systems. The fix is to remove the 64-bit 'data' and 'next' pointer fields from the union and instead store them as a pair of 32-bit values. Easily done with a few helper functions. The next patch will take care of the 'next' field. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-04 09:46:07 -07:00
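One way to picture the "pair of 32-bit values" approach from the commit above: keep every union member 32 bits wide, and copy a pointer across consecutive nodes with small helpers. The names `union node`, `save_pointer()` and `get_pointer()` below are assumptions chosen for illustration and may not match the helpers the patch actually adds:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical node: every member is 32 bits wide, so the union is
 * 4 bytes on common platforms regardless of pointer size. */
union node {
    uint32_t ui;
    int32_t i;
    float f;
};

/* Number of nodes a pointer occupies: 1 on 32-bit, 2 on 64-bit systems. */
#define POINTER_NODES \
    ((sizeof(void *) + sizeof(union node) - 1) / sizeof(union node))

/* Store a pointer across POINTER_NODES consecutive nodes starting at dest.
 * The caller must have reserved that many nodes. */
static void
save_pointer(union node *dest, void *src)
{
    memcpy(dest, &src, sizeof(src));
}

/* Read back a pointer stored by save_pointer(). */
static void *
get_pointer(const union node *src)
{
    void *p;
    memcpy(&p, src, sizeof(p));
    return p;
}
```

Nodes that carry plain integers or floats stay at 4 bytes; only the few nodes that carry a pointer pay for two slots on 64-bit systems.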
Ilia Mirkin	06359e368b	nouveau: Add lots of comments to the buffer transfer logic Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-12-04 16:38:50 +01:00
Ilia Mirkin	0e5bf85651	nv50: wait on the buf's fence before sticking it into pushbuf This resolves some rendering issues in source games. See https://bugs.freedesktop.org/show_bug.cgi?id=64323 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>	2013-12-04 16:38:50 +01:00
Ilia Mirkin	ce6dd69697	nouveau: avoid leaking fences while waiting This fixes a memory leak in some situations. Also avoids emitting an extra fence if the kick handler does the call to nouveau_fence_next itself. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>	2013-12-04 16:38:50 +01:00
Ilia Mirkin	f50a45452a	nv50: fix a small leak on context destroy Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2013-12-04 16:38:50 +01:00
Brian Paul	657466a3f6	docs: put MD5 sums in 9.2.4 relnotes file Signed-off-by: Brian Paul <brianp@vmware.com>	2013-12-04 07:47:13 -07:00
Brian Paul	2732d0d21d	docs: use --disable-dri3 for VMware guest driver build For the time being at least. Suggested by Adrian Rangel. Signed-off-by: Brian Paul <brianp@vmware.com>	2013-12-04 07:41:29 -07:00
Siavash Eliasi	f0cc59d68a	mesa: modified _mesa_align_free() to accept NULL pointer So that it acts like ordinary free(). This lets us remove a bunch of if statements where the function is called. v2: - Avoiding compile error on MSVC and possible warnings on other compilers. - Added comment regards passing NULL pointer being safe. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-04 07:31:27 -07:00
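A minimal sketch of an aligned allocator whose free function tolerates NULL, in the spirit of the commit above. The names `align_malloc()` and `align_free()` and the stash-the-raw-pointer layout are assumptions for illustration, not a copy of Mesa's implementation:

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate `bytes` bytes aligned to `alignment`, which is assumed to be a
 * power of two no smaller than sizeof(void *). The pointer returned by
 * malloc() is stashed just before the aligned block. */
static void *
align_malloc(size_t bytes, size_t alignment)
{
    uintptr_t raw = (uintptr_t)malloc(bytes + alignment + sizeof(void *));
    uintptr_t aligned;

    if (!raw)
        return NULL;

    aligned = (raw + sizeof(void *) + alignment - 1) &
              ~(uintptr_t)(alignment - 1);
    ((void **)aligned)[-1] = (void *)raw;
    return (void *)aligned;
}

/* Like free(): passing NULL is safe and does nothing. */
static void
align_free(void *ptr)
{
    if (ptr)
        free(((void **)ptr)[-1]);
}
```

With the NULL check inside the function, call sites can drop their own `if (ptr)` guards, which is the cleanup the commit message describes.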
Ilia Mirkin	267679be84	mesa: don't leak performance monitors on context destroy Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-12-04 06:20:36 -08:00
Ilia Mirkin	c45cf6199f	nv50: Fix GPU_READING/WRITING bit removal Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> CC: "9.1, 9.2, 10.0" <mesa-stable@lists.freedesktop.org>	2013-12-04 14:24:30 +01:00
Michel Dänzer	79e6512629	pipe-loader: Fix llvmpipe.la path Fixes make[3]: *** No rule to make target `.../src/gallium/drivers/softpipe/libllvmpipe.la', needed by `pipe_swrast.la'. Stop.	2013-12-04 11:56:10 +09:00
Kenneth Graunke	26b7b50afe	i965: Fix BRW_BATCH_STRUCT to specify RENDER_RING, not UNKNOWN_RING. I missed this in the boolean -> enum conversion. C cheerfully casts false -> 0 -> UNKNOWN_RING. On Gen4-5, this causes the render ring prelude hook to get called in the middle of the batch, which is crazy. BRW_BATCH_STRUCT is not used on Gen6+. Fixes regressions since `395a32717d` ("i965: Introduce an UNKNOWN_RING state."). Fixes "fips -v glxgears" on Ironlake. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-03 16:24:58 -08:00
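The pitfall in the commit above (a leftover `false` silently becoming the first enumerator) can be reproduced in a few lines. `UNKNOWN_RING` and `RENDER_RING` come from the commit message; the third enumerator and the function names are hypothetical and only serve the illustration:

```c
#include <stdbool.h>

/* UNKNOWN_RING is enumerator 0, which is exactly what `false` converts to. */
enum ring {
    UNKNOWN_RING,
    RENDER_RING,
    OTHER_RING      /* hypothetical extra enumerator */
};

/* Hypothetical stand-in for a function whose parameter changed
 * from a boolean to an enum. */
static enum ring
require_space(enum ring ring)
{
    return ring;
}

/* A call site that was missed in the conversion: C accepts `false`
 * for an enum parameter without complaint. */
static enum ring
stale_call_site(void)
{
    return require_space(false);
}

/* The corrected call site names the ring explicitly. */
static enum ring
fixed_call_site(void)
{
    return require_space(RENDER_RING);
}
```

Because the conversion is implicit, a missed call site like this compiles cleanly in many default configurations, which is why the bug only showed up at run time.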
Kenneth Graunke	e03994bf47	Revert "i965: Move brw_emit_query_begin() to the render ring prelude." This reverts commit `a4bf7f6b6e`. It breaks occlusion queries on Gen4-5. Doing this right will likely require larger changes, which should be done at a future date. Some Piglit tests still passed due to other bugs; fixing those revealed this problem. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-03 16:24:53 -08:00
Kenneth Graunke	da07e1b683	i965: Fix OACONTROL assertion failures on Ironlake. I guarded half of the callers to start/stop_oa_counters with generation checks, but missed the other half (which were added later). OACONTROL doesn't exist on Ironlake, so we better not write it. Also, there's no need---Ironlake's performance counters are always running. This patch moves the generation checks into start/stop_oa_counters, rather than requiring the caller to do them. Fixes assertion failures in Piglit's AMD_performance_monitor/measure. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-03 16:24:49 -08:00
Emil Velikov	4c11099453	gallium/radeon: use PRIu64 macro for printing uint64_t Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-12-03 21:44:26 +00:00
Emil Velikov	f60737a525	pipe-loader: build llvmpipe on top of softpipe One can select if they want to fallback to softpipe. Current approach makes this not possible, whereas other targets (dri-swrast) handle this appropriately. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-12-03 21:44:26 +00:00
Emil Velikov	bc2627a98a	mesa: resolve typo DTXn/DXTn Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-12-03 21:44:26 +00:00
Emil Velikov	507c2356e3	automake: include only one copy VERSION in tarball The VERSION file is tracked by git (git ls-files), thus adding it to EXTRA_FILES will result in a duplicate copy within the final tarball. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72230 Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reported-by: Patrick Steinhardt <ps@pks.im> Tested-by: Patrick Steinhardt <ps@pks.im> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-12-03 21:44:26 +00:00
Juha-Pekka Heikkila	03ef57950a	glx: Add missing null check in gxl/dri2_glx.c Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-03 14:35:41 -07:00
Juha-Pekka Heikkila	b8875cb7c8	glx: Check malloc return value before accessing memory in glx/clientattrib.c Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-12-03 14:35:41 -07:00
Chad Versace	998018d7be	i965: Add extra-alignment for non-msrt fast color clear for all hw (v2) The BSpec states that the alignment for the non-msrt clear rectangle must be doubled; the BSpec does not restrict the workaround to specific hardware. Commit `9a1a67b` applied the workaround to Haswell GT3. Commit `8b659ce` expanded the workaround to all Haswell variants. This commit expands it to all hardware. No Piglit regressions on Ivybridge 0x0166. No fixes either. I know no Ivybridge nor Baytrail bug related to this workaround. However, the BSpec says the extra alignment is required, so let's do it. v2: Apply to all hardware, not just gen7. CC: "9.2, 10.0" <mesa-stable@lists.freedesktop.org> CC: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Paul Berry <stereotype441@gmail.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-12-03 13:19:54 -08:00
Marek Olšák	40e2856123	configure.ac: require libdrm_radeon 2.4.50	2013-12-03 20:07:35 +01:00
Marek Olšák	e47af58bb4	st/mesa: implement layered framebuffer clear for the clear_with_quad fallback Same approach as in u_blitter.	2013-12-03 19:39:13 +01:00
Marek Olšák	6b919b1b2d	gallium/util: implement layered framebuffer clear in u_blitter All bound layers (from first_layer to last_layer) should be cleared. This uses a vertex shader which outputs gl_Layer = gl_InstanceID, so each instance goes to a different layer. By rendering a quad and setting the instance count to the number of layers, it will trivially clear all layers. This requires AMD_vertex_shader_layer (or PIPE_CAP_TGSI_VS_LAYER), which only radeonsi supports at the moment. r600 could do this too. Standard DX11 hardware will have to use a geometry shader though, which has higher overhead.	2013-12-03 19:39:13 +01:00
Marek Olšák	1a02bb71dd	gallium: add support for AMD_vertex_shader_layer	2013-12-03 19:39:13 +01:00
Marek Olšák	d52791a708	radeonsi: add driver support for layered rendering and AMD_vertex_shader_layer Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-03 19:39:13 +01:00
Marek Olšák	053606ddae	radeonsi: implement OpenGL edge flags Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2013-12-03 19:39:13 +01:00
Marek Olšák	d8d67d2e1f	st/mesa: add support for layered framebuffers and consolidate code This is a subset of geometry shaders. It's all about setting first_layer and last_layer correctly. Also some code between st_render_texture and update_framebuffer_state is consolidated. It doesn't use rtt_level and derives the level from dimensions instead as the code in st_atom_framebuffer.c did.	2013-12-03 19:39:13 +01:00
Marek Olšák	0b3b901cff	mesa: expose AMD_vertex_shader_layer in the core profile only It needs glFramebufferTexture, which isn't available in the compatibility profile. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-03 19:39:13 +01:00
Tapani Pälli	a057b837dd	egl: add HAVE_LIBDRM define, fix EGL X11 platform Commit `a594cec` broke EGL X11 backend by adding dependency between X11 and DRM backends requiring HAVE_EGL_PLATFORM_DRM defined for X11. This patch fixes the issue by adding additional define for libdrm detection independent of which backend is being compiled. Tested by compiling Mesa with '--with-egl-platforms=x11' and running es2gears_x11 + glbenchmark2.7 successfully. v2: return true for dri2_auth if running without libdrm (Samuel) v3: check libdrm when building EGL drm platform + AM_CFLAGS fix (Emil) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72062 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: mesa-stable@lists.freedesktop.org	2013-12-03 09:21:24 -08:00
Andreas Heider	ad3937fd4e	freedreno: Add a few texture formats	2013-12-02 17:37:03 -05:00
Kenneth Graunke	decf070258	i965: Skip the register write check on Broadwell. MI_STORE_REGISTER_MEM has to take a 48-bit address, so the existing code doesn't work. But supposedly Broadwell has a register whitelist and just works out of the box anyway, so there's no need to check. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-02 13:26:03 -08:00
Kenneth Graunke	8ed9f69b36	i965: Fix texture border color on Broadwell. The Gen7 sampler state code still works. Increasing the alignment to 64 bytes makes bit 5 zero, which is good because it's now reserved. Since we don't use the new filter bits, we can leave those as zero too, which means we don't need to update the code to update the pointer. (We probably should anyway, for clarity, but alas, another day.) Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-02 13:25:52 -08:00
Kenneth Graunke	bc9d3a0254	i965: Don't use MACH for integer multiplies on Gen8+. The documentation is really hard to follow, but apparently a 32-bit x 32-bit multiply just works without the MACH macro. The macro apparently is only necessary to get the full 64-bit value. Fixes Piglit tests [vf]s-op-mult-int-int.shader_test. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-02 13:25:32 -08:00
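The distinction the commit above draws, between a plain 32-bit x 32-bit multiply and the full 64-bit product, can be shown in portable C. This is a conceptual sketch only; it says nothing about how the hardware instructions are encoded, and the function names are hypothetical:

```c
#include <stdint.h>

/* Low 32 bits of the product: what an ordinary integer multiply yields.
 * The multiply is done on unsigned values to avoid signed overflow. */
static int32_t
mul_low32(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a * (uint32_t)b);
}

/* High 32 bits of the full 64-bit product: the part that needs extra work.
 * Assumes arithmetic right shift of negative values, as on common compilers. */
static int32_t
mul_high32(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}
```

Shader code that only needs `a * b` as a 32-bit integer corresponds to `mul_low32()`; the extra step matters only when the upper half is wanted too.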
Kenneth Graunke	5720832f23	i965: Fix texture swizzling on Broadwell. Like Haswell, we do this in SURFACE_STATE rather than shader workarounds. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-02 13:25:23 -08:00
Kenneth Graunke	1110ba4c08	i965: Set vertical alignment unit to 4 on Broadwell. Broadwell doesn't support a surface vertical alignment of 2. It only supports VALIGN_4, VALIGN_8, or VALIGN_16. I chose 4 since it's the least wasteful. v2: Replace my comment with a better one from Eric. Move Broadwell checks earlier so it's more obvious that "return 2" won't be hit. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-02 13:25:11 -08:00
Kenneth Graunke	93658054c0	i965/vs: Always store pull constant offsets in GRFs on Gen8. We need to SEND from a GRF, and we can only obtain those prior to register allocation. This allows us to do pull constant loads without the MRF hack. v2: Reword comments (suggested by Paul). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-12-02 13:19:10 -08:00
Kenneth Graunke	dd159f25e4	i965/vs: Don't copy propagate into SEND-from-GRF messages. SEND can't deal with swizzles, source modifiers, and so on. This should avoid problems with VS pull constant loads on Broadwell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-12-02 13:10:12 -08:00
Francisco Jerez	ce34158680	clover: Fix missing minus sign in 'iterator_adaptor::operator-='. The method is currently unused, this probably doesn't fix anything at this point.	2013-12-02 11:55:02 -08:00
Chad Versace	8b659cef3a	i965/hsw: Apply non-msrt fast color clear w/a to all HSW GTs Pre-patch, the workaround was applied to only HSW GT3. However, the workaround also fixes render corruption on the HSW GT1 Chromebook, codenamed Falco. Also, update the BSpec quote that discusses the workaround to reflect the latest BSpec. The BSpec states that the workaround is required for Ivybridge and Baytrail as well as Haswell. But, we apply the workaround to only Haswell because (a) we suspect that is the only hardware where it is actually required and (b) we haven't yet validated the workaround for the other hardware. CC: "9.2, 10.0" <mesa-stable@lists.freedesktop.org> CC: Anuj Phogat <anuj.phogat@gmail.com> OTC-Tracker: CHRMOS-812 Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-12-02 10:53:33 -08:00
Kenneth Graunke	5b331f6fcb	glsl: Simplify the built-in function linking code. Previously, we stored an array of up to 16 additional shaders to link, as well as a count of how many each shader actually needed. Since the built-in functions rewrite, all the built-ins are stored in a single shader. So all we need is a boolean indicating whether a shader needs to link against built-ins or not. During linking, we can avoid creating the temporary array if none of the shaders being linked need built-ins. Otherwise, it's simply a copy of the array that has one additional element. This is much simpler. This patch saves approximately 128 bytes of memory per gl_shader object. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-01 15:33:04 -08:00
Kenneth Graunke	1b557b1606	glsl: Create an accessor for the built-in function shader. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-01 15:33:02 -08:00
Kenneth Graunke	5af97b43c9	glsl: Drop crazy looping from no_matching_function_error(). Since the built-in functions rewrite, num_builtins_to_link is always either 0 or 1, so we don't need the crazy loop starting at -1 with a special case. All we need to do is print the prototypes from the current shader, and the single built-in function shader (if present). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-01 15:33:00 -08:00
Kenneth Graunke	e04a97ff23	glsl: Merge "candidates are: " message to the previous line. Previously, when we hit a "no matching function" error, it looked like: 0:0(0): error: no matching function for call to `cos()' 0:0(0): error: candidates are: float cos(float) 0:0(0): error: vec2 cos(vec2) 0:0(0): error: vec3 cos(vec3) 0:0(0): error: vec4 cos(vec4) Now it looks like: 0:0(0): error: no matching function for call to `cos()'; candidates are: 0:0(0): error: float cos(float) 0:0(0): error: vec2 cos(vec2) 0:0(0): error: vec3 cos(vec3) 0:0(0): error: vec4 cos(vec4) This is not really any worse and removes the need for the prefix variable. It will also help with the next commit's refactoring. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-01 15:32:59 -08:00
Kenneth Graunke	e5e191a6b1	glsl: Drop unused call_ir parameter from generate_call(). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-01 15:32:57 -08:00
Kenneth Graunke	c5adc1c8b5	glsl: Remove useless iteration through function parameters. There's no need to loop through the "parameters" list and remove every element; move_nodes_to(&parameters) already throws away all elements of the destination list. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-12-01 15:32:55 -08:00
Jon TURNEY	61e0f11170	Fix 'make check' in src/mapi/glapi/tests when builddir != srcdir make[5]: Entering directory `/jhbuild/build/mesa/mesa/src/mapi/glapi/tests' CXX check_table.o /jhbuild/checkout/mesa/mesa/src/mapi/glapi/tests/check_table.cpp:29:30: fatal error: glapi/glapitable.h: No such file or directory We should look for the generated file glapi/glapitable.h in builddir, not srcdir Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2013-12-01 12:30:25 +00:00
Ian Romanick	862044c7f7	docs: Import 10.0 release notes, add news item Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-30 23:42:51 -08:00
Paul Berry	c4cf487315	i965/gen6: Fix multisample resolve blits for luminance/intensity 32F formats. On gen6, multisample resolve blits use the SAMPLE message to blend together the 4 samples for each texel. For some reason, SAMPLE doesn't blend together the proper samples when the source format is L32_FLOAT or I32_FLOAT, resulting in blocky artifacts. To work around this problem, sample from the source surface using R32_FLOAT. This shouldn't affect rendering correctness, because when doing these resolve blits, the destination format is R32_FLOAT, so the channel replication done by L32_FLOAT and I32_FLOAT is unnecessary. Fixes piglit tests on Sandy Bridge: - spec/ARB_texture_float/multisample-formats 2 GL_ARB_texture_float - spec/ARB_texture_float/multisample-formats 4 GL_ARB_texture_float No piglit regressions on Sandy Bridge. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70601 Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-29 21:46:31 -08:00
Paul Berry	26498e0f0c	glsl: Remove unused field loop_variable_state::loop. This field was neither initialized nor used. It was just dead memory. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-29 21:46:28 -08:00
Paul Berry	af9af2965b	glsl: Improve documentation of ir_loop counter/control fields. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-29 21:46:23 -08:00
Paul Berry	a810db7b84	glsl: In ir_validate, check that ir_loop::counter always refers to a new var. The compiler back-ends (i965's fs_visitor and brw_visitor, ir_to_mesa_visitor, and glsl_to_tgsi_visitor) have been assuming this for some time. Thanks to the preceding patch, the compiler front-end no longer breaks this assumption. This patch adds code to validate the assumption so that if we have future bugs, we'll be able to catch them earlier. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-29 21:46:20 -08:00
Paul Berry	d6eb4321d0	glsl: Fix inconsistent assumptions about ir_loop::counter. The compiler back-ends (i965's fs_visitor and brw_visitor, ir_to_mesa_visitor, and glsl_to_tgsi_visitor) assume that when ir_loop::counter is non-null, it points to a fresh ir_variable that should be used as the loop counter (as opposed to an ir_variable that exists elsewhere in the instruction stream). However, previous to this patch: (1) loop_control_visitor did not create a new variable for ir_loop::counter; instead it re-used the existing ir_variable. This caused the loop counter to be double-incremented (once explicitly by the body of the loop, and once implicitly by ir_loop::increment). (2) ir_clone did not clone ir_loop::counter properly, resulting in the cloned ir_loop pointing to the source ir_loop's counter. (3) ir_hierarchical_visitor did not visit ir_loop::counter, resulting in the ir_variable being missed by reparenting. Additionally, most optimization passes (e.g. loop unrolling) assume that the variable mentioned by ir_loop::counter is not accessed in the body of the loop (an assumption which (1) violates). The combination of these factors caused a perfect storm in which the code worked properly nearly all of the time: for loops that got unrolled, (1) would introduce a double-increment, but loop unrolling would fail to notice it (since it assumes that ir_loop::counter is not accessed in the body of the loop), so it would unroll the loop the correct number of times. For loops that didn't get unrolled, (1) would introduce a double-increment, but then later when the IR was cloned for linking, (2) would prevent the loop counter from being cloned properly, so it would look to further analysis stages like an independent variable (and hence the double-increment would stop occurring). At the end of linking, (3) would prevent the loop counter from being reparented, so it would still belong to the shader object rather than the linked program object. 
Provided that the client program didn't delete the shader object, the memory would never get reclaimed, and so the shader would function properly. However, for loops that didn't get unrolled, if the client program did delete the shader object, and the memory belonging to the loop counter got re-used, this could cause a use-after-free bug, leading to a crash. This patch fixes loop_control_visitor, ir_clone, and ir_hierarchical_visitor to treat ir_loop::counter the same way the back-ends treat it: as a freshly allocated ir_variable that needs to be visited and cloned independently of other ir_variables. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72026 Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-29 21:46:17 -08:00
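The double-increment described in point (1) of the commit above can be modelled with two ordinary C loops. This is a toy model under stated assumptions, not GLSL IR: `run_shared()` stands for a loop whose implicit counter reuses the body's variable, and `run_fresh()` for a loop whose counter is a separate, freshly created variable. Both function names are hypothetical:

```c
/* Counter shares storage with the variable the body increments:
 * each trip advances it twice, so the loop runs half as often. */
static int
run_shared(int from, int to)
{
    int iterations = 0;
    int i = from;           /* body variable, also reused as the counter */

    while (i < to) {
        iterations++;
        i++;                /* explicit increment in the loop body */
        i++;                /* implicit increment applied by the loop itself */
    }
    return iterations;
}

/* Counter is a fresh variable: the body's increment and the implicit
 * increment no longer interfere. */
static int
run_fresh(int from, int to)
{
    int iterations = 0;
    int i = from;           /* body variable */
    int counter = from;     /* separate loop counter */

    while (counter < to) {
        iterations++;
        i++;                /* explicit increment in the loop body */
        counter++;          /* implicit increment on the fresh counter */
    }
    (void)i;
    return iterations;
}
```

For a loop meant to run from 0 to 4, the shared variant executes two trips and the fresh variant four, which is the behavioural difference the patch series removes.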
Paul Berry	9d2951ea0a	glsl: Teach ir_variable_refcount about ir_loop::counter variables. If an ir_loop has a non-null "counter" field, the variable referred to by this field is implicitly read and written by the loop. We need to account for this in ir_variable_refcount, otherwise there is a danger we will try to dead-code-eliminate the loop counter variable. Note: at the moment the dead code elimination bug doesn't occur due to a bug in ir_hierarchical_visitor: it doesn't visit the "counter" field, so dead code elimination doesn't treat it as a candidate for elimination. But the patch to follow will fix that bug, so we need to fix ir_variable_refcount first in order to avoid breaking dead code elimination. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-29 21:46:13 -08:00
Brian Paul	1fb106527f	mesa: fix mem leak of glPixelMap data in display list And simplify save_PixelMapfv() by using the memdup() function. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-29 06:41:14 -07:00
Brian Paul	90d85aa16c	mesa: added memory-related comment in save_error() Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-29 06:41:14 -07:00
Brian Paul	95d6ed22b3	mesa: fix flags assignment in save_WaitSync() The flags value is a bitfield so use the union's 'bf' field, not 'e' (enum) field. There's no actual change in behavior here since both fields of the union are the same size. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-29 06:41:14 -07:00
Brian Paul	efe7257ea7	mesa: remove old colortable, histogram, etc. code from dlist.c Trying to compile any of these functions into a display list now just generates a GL_INVALID_OPERATION error. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-29 06:41:13 -07:00
Brian Paul	90891091cd	mesa: have old convolution functions generate GL_INVALID_OPERATION Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-29 06:41:13 -07:00
Brian Paul	214399a3bc	mesa: have old glColorTable functions generate GL_INVALID_OPERATION As is done for the old histogram functions. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-29 06:41:12 -07:00
José Fonseca	fb5f5b8188	trace: Dump PIPE_QUERY_* enums. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-28 12:19:42 +00:00
José Fonseca	eb040bd54a	trace: Dump query results faithfully. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-28 12:19:30 +00:00
Carl Worth	eeaa7a05a1	docs: Import 9.2.4 release notes, add news item.	2013-11-28 00:02:52 -08:00
Roland Scheidegger	ca39f4eee2	gallium/cso: fix sampler / sampler_view counts Now that it is possible to query drivers for the max sampler view it should be safe to increase this without crashing. Not entirely convinced this really works correctly though if state trackers using non-linked sampler / sampler_views use this. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-28 04:02:41 +01:00
Roland Scheidegger	2983c039df	gallium: new shader cap bit for the amount of sampler views Ever since introducing separate sampler and sampler view max this was really missing. Every driver but llvmpipe reports the same number as number of samplers for now, so nothing should break. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-28 04:02:18 +01:00
Roland Scheidegger	e4d8084cbd	gallium/drivers: support more sampler views than samplers for more drivers This adds support for this to more drivers, in particular for all the "special" ones useful for debugging. HW drivers are left alone, some should be able to support it if they want but they may not be interested at this point. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-28 04:01:54 +01:00
Ian Romanick	53a65e547c	i965: Properly reject __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS when __DRI2_ROBUSTNESS is not enabled Only allow __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS in brwCreateContext if intelInitScreen2 also enabled __DRI2_ROBUSTNESS (thereby enabling GLX_ARB_create_context). This fixes a regression in the piglit test "glx/GLX_ARB_create_context/invalid flag" v2: Remove commented debug code. Noticed by Jordan. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reported-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-27 15:09:01 -08:00
Matt Turner	0822b2dfbd	Revert "drop old INTEL_DEBUG names for `perf` (fall) and `fs` (wm)" This reverts commit `195994fe4c`. It wasn't sent to the list, Ken didn't review it, and it breaks shader-db.	2013-11-27 13:38:42 -08:00
Vinson Lee	9bf41f09ab	glsl: Link glcpp with math library. This patch fixes this build error with Oracle Solaris Studio. libtool: link: /opt/solarisstudio12.3/bin/cc -g -o glcpp/glcpp glcpp.o prog_hash_table.o ./.libs/libglcpp.a Undefined first referenced symbol in file sqrt prog_hash_table.o Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-27 10:37:37 -08:00
Kenneth Graunke	c4815f6cd6	i965: Always reserve binding table space for at least one render target. In brw_update_renderbuffer_surfaces(), if there are no color draw buffers, we always set up a null render target at surface index 0 so we have something to use with the FB write marking the end of thread. However, when we recently began computing surface indexes dynamically, we failed to reserve space for it. This meant that the first texture would be assigned surface index 0, and our closing FB write would clobber the texture. Fixes Piglit's EXT_packed_depth_stencil/fbo-blit-d24s8 test on Gen4-5, which regressed as of commit `4e5306453d` ("i965/fs: Dynamically set up the WM binding table offsets.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70605 Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Tested-by: lu hua <huax.lu@intel.com> Cc: "10.0" mesa-stable@lists.freedesktop.org	2013-11-27 10:28:43 -08:00
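The reservation described in the commit above amounts to counting at least one render target slot before handing out texture slots. A minimal sketch, assuming a hypothetical `struct binding_table_layout` and `assign_offsets()` (the real driver tracks more sections than these two):

```c
/* Hypothetical layout record; not the driver's actual structure. */
struct binding_table_layout {
    unsigned render_target_start;
    unsigned num_render_targets;
    unsigned texture_start;
};

/* Assign surface indexes dynamically. Even with zero color draw buffers,
 * one slot is reserved for the null render target used by the final
 * FB write, so textures never start at index 0. */
static struct binding_table_layout
assign_offsets(unsigned num_color_draw_buffers)
{
    struct binding_table_layout l;
    unsigned next = 0;

    l.render_target_start = next;
    l.num_render_targets =
        num_color_draw_buffers > 0 ? num_color_draw_buffers : 1;
    next += l.num_render_targets;

    l.texture_start = next;
    return l;
}
```

Without the "at least one" rule, `texture_start` would be 0 for a framebuffer with no color draw buffers, and the closing FB write would overwrite the first texture's surface entry.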
Francisco Jerez	6b2b4cc885	glsl: Initialize _mesa_glsl_parse_state::atomic_counter_offsets before using it. Cc: Ian Romanick <ian.d.romanick@intel.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-26 19:34:24 -08:00
Francisco Jerez	4f64dabb5f	i965/fs: Fix misleading comment. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-26 19:34:02 -08:00
Francisco Jerez	32f69ad86c	i965: Bump number of supported atomic counter buffers. Now that we have dynamic binding tables there's no good reason anymore to expose so few atomic counter buffers. Increase it to 16. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-26 19:34:02 -08:00
Paul Berry	d7fa9eb003	glsl/linker: Validate IR just before reparenting. If reparent_ir() is called on invalid IR, then there's a danger that it will fail to reparent all of the necessary nodes. For example, if the IR contains an ir_dereference_variable which refers to an ir_variable that's not in the tree, that ir_variable won't get reparented, resulting in subtle use-after-free bugs once the non-reparented nodes are freed. (This is exactly what happened in the bug fixed by the previous commit). This patch makes this kind of bug far easier to track down, by transforming it from a use-after-free bug into an explicit IR validation error. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-26 13:22:24 -08:00
Paul Berry	9dfcb05fa6	glsl: Fix lowering of direct assignment in lower_clip_distance. In commit `065da16` (glsl: Convert lower_clip_distance_visitor to be an ir_rvalue_visitor), we failed to notice that since lower_clip_distance_visitor overrides visit_leave(ir_assignment *), ir_rvalue_visitor::visit_leave(ir_assignment *) wasn't getting called. As a result, clip distance dereferences appearing directly on the right hand side of an assignment (not in a subexpression) weren't getting properly lowered. This caused an ir_dereference_variable node to be left in the IR that referred to the old gl_ClipDistance variable. However, since the lowering pass replaces gl_ClipDistance with gl_ClipDistanceMESA, this turned into a dangling pointer when the IR got reparented. Prior to the introduction of geometry shaders, this bug was unlikely to arise, because (a) reading from gl_ClipDistance[i] in the fragment shader was rare, and (b) when it happened, it was likely that it would either appear in a subexpression, or be hoisted into a subexpression by tree grafting. However, in a geometry shader, we're likely to see a statement like this, which would trigger the bug: gl_ClipDistance[i] = gl_in[j].gl_ClipDistance[i]; This patch causes lower_clip_distance_visitor::visit_leave(ir_assignment *) to call the base class visitor, so that the right hand side of the assignment is properly lowered. Fixes piglit test: - spec/glsl-1.50/execution/geometry/clip-distance-itemized-copy Cc: Ian Romanick <idr@freedesktop.org> Cc: "9.2" <mesa-stable@lists.freedesktop.org> Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-26 13:22:24 -08:00
Paul Berry	37bdde1087	i965/gs: Set GS prog_data to NULL if there is no GS program. The previous commit fixes a bug wherein we would incorrectly refer to stale geometry shader prog_data when no geometry shader was active. This patch reduces the likelihood of that sort of bug occurring in the future by setting prog_data to NULL whenever there is no GS program. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-26 13:22:23 -08:00
Paul Berry	2714ca81b9	i965/gs: Properly skip GS binding table upload when no GS active. Previously, in brw_gs_upload_binding_table(), we checked whether brw->gs.prog_data was NULL in order to determine whether a geometry shader was active. This didn't work: brw->gs.prog_data starts off as NULL, but it is set to non-NULL when a geometry shader program is built, and then never set to NULL again. As a result, if we called brw_gs_upload_binding_table() while there was no geometry shader active, but a geometry shader had previously been active, it would refer to a stale (and possibly freed) prog_data structure. This patch fixes the problem by modifying brw_gs_upload_binding_table() to use the proper technique to determine whether a geometry shader is active: by checking whether brw->geometry_program is NULL. This fixes the crash reported in comment 2 of bug 71870 (the incorrect rendering remains, however). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-26 13:21:56 -08:00
Ian Romanick	73e9aa9e3f	dri: Allow __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS in driCreateContextAttribs Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reported-by: Zhenyu Wang <zhenyuw@linux.intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-26 13:13:38 -08:00
Ian Romanick	9b1c68638d	i965: Only enable __DRI2_ROBUSTNESS if kernel support is available Rather than always advertising the extension but failing to create a context with reset notification, just don't advertise it. I don't know why it didn't occur to me to do it this way in the first place. NOTE: Kristian requested that I provide a follow-up for master that dynamically generates the list of DRI extensions instead of selecting between two hardcoded lists. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Suggested-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-26 13:10:52 -08:00
Ian Romanick	0ae8439906	Revert "i965: Make the driver compile until a proper libdrm can be released." libdrm 2.4.48 has been released. This reverts commit `bd4596efac`. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-26 13:10:52 -08:00
Ian Romanick	cb728bb028	i965: Bump libdrm requirement drm_intel_get_reset_stats is only available in libdrm-2.4.48, and libdrm-2.4.49 contains an important bug fix in that function. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-26 13:10:52 -08:00
Chad Versace	97851145bc	egl: Kill macro _EGL_DECLARE_MUTEX Replace all occurrences of the macro with its expansion. It seems that the macro intended to provide cross-platform static mutex initialization. However, it had the same definition in all pre-processor paths: #define _EGL_DECLARE_MUTEX(m) _EGLMutex m = _EGL_MUTEX_INITIALIZER Therefore this abstraction obscured rather than helped. Signed-off-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-26 12:50:30 -08:00
Chad Versace	3c58d4c700	egl: Enable EGL_EXT_client_extensions Insert two fields into _egl_global to hold the client extensions and statically initialize them: ClientExtensions // a struct of bools ClientExtensionString Post-patch, Mesa supports exactly one client extension, EGL_EXT_client_extensions. Signed-off-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-26 12:50:29 -08:00
Tom Stellard	ddc77c5092	radeon/compute: Unconditionally inline all functions v2 We need to do this until function calls are supported. v2: - Fix loop conditional https://bugs.freedesktop.org/show_bug.cgi?id=64225 CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-25 20:42:49 -08:00
Kenneth Graunke	ad542a10c5	i965: Use __attribute__((flatten)) on fast tiled teximage code. The fast tiled texture upload code does not compile with GCC 4.8's -Og optimization flag. memcpy() has the always_inline attribute set. This poses a problem, since {x,y}tile_copy_faster calls it indirectly via {x,y}tile_copy, and {x,y}tile_copy normally aren't inlined at -Og. Using __attribute__((flatten)) tells GCC to inline every function call inside the function, which I believe was the author's intent. Fix suggested by Alexander Monakov. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Cc: mesa-stable@lists.freedesktop.org	2013-11-25 19:13:23 -08:00
Zack Rusin	0510ec67e2	llvmpipe: support 8bit subpixel precision 8 bit precision is required by d3d10 but unfortunately requires 64 bit rasterizer. This commit implements 64 bit rasterization with full support for 8bit subpixel precision. It's a combination of all individual commits from the llvmpipe-rast-64 branch. Signed-off-by: Zack Rusin <zackr@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-25 13:05:03 -05:00
Maarten Lankhorst	5455c818b5	gbm/dri: hide extension loader symbols They should not be exposed. Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-25 13:13:47 +01:00
Chris Forbes	e6a0eca45e	i965: Enable ARB_draw_indirect (and ARB_multi_draw_indirect) on Gen7+ .. and mark them off on the extensions list as done. V2: Enable only if pipelined register writes work. V3: Also update relnotes Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:36 +13:00
Chris Forbes	093965f9e3	vbo: map indirect buffer and extract params if doing sw primitive restart V2: Check for mapping failure (thanks Brian) V3: - Change error on mapping failure to OUT_OF_MEMORY (Brian) - Unconst; remove casting away of const. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:36 +13:00
Chris Forbes	3953766e57	mesa: pass indirect buffer to sw primitive restart Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:36 +13:00
Chris Forbes	803fcc3298	i965: pass indirect buffer to primitive restart check Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
Chris Forbes	02f9757ab5	i965: implement indirect drawing for Gen7 Just prior to emitting the 3DPRIMITIVE command, we load each of the indirect registers. The values loaded are either from offsets into the current indirect BO, or constant zero if the parameter is not used for this draw. Enabling use of the indirect registers is done by turning on a bit in the first dword of the 3DPRIMITIVE command itself. V3: - Deduplicate the common part of both indexed and nonindexed indirect setup. - Just refer to the indirect bo out of the context directly. V4: - Fix bo reference to specify the range we care about. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
Chris Forbes	1a00317169	i965: Add new defines for indirect draws - MMIO registers for draw parameters - New bit in 3DPRIMITIVE command to enable indirection Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
Chris Forbes	5a798e73b5	vbo: Flesh out implementation of indirect draws Based on part of Patch 2 of Christoph Bumiller's ARB_draw_indirect series. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-25 22:01:35 +13:00
Chris Forbes	aadbb0f275	mesa: add indirect_offset, is_indirect to _mesa_prim V3: Add missing cases V4: Add indirect_offset here too Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
Chris Forbes	36046ae278	mesa: Add validation helpers for new indirect draws Based on part of Patch 2 of Christoph Bumiller's ARB_draw_indirect series. V3: - Disallow primcount==0 for DrawMulti*Indirect. The spec is unclear on this, but it's silly. We might go back on this later if it turns out to be a problem. - Make it clear that the caller has dealt with stride==0 V4: - Allow primcount==0 again. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
Chris Forbes	a95236cfc1	mesa: Add binding point for indirect buffer Based on part of Patch 2 of Christoph Bumiller's ARB_draw_indirect series. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
Chris Forbes	56e98fe2fe	mesa: Add extension scaffolding for ARB_draw_indirect We will reuse the same extension flag for ARB_multi_draw_indirect since it can always be supported by looping. Based on part of Patch 2 of Christoph Bumiller's ARB_draw_indirect series. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
Chris Forbes	5127318ae8	glapi: add plumbing for GL_ARB_draw_indirect and GL_ARB_multi_draw_indirect Based on part of Patch 2 of Christoph Bumiller's ARB_draw_indirect series. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
Christoph Bumiller	80ac616fca	mesa: add indirect drawing buffer parameter to draw functions Split from patch implementing ARB_draw_indirect. v2: Const-qualify the struct gl_buffer_object *indirect argument. v3: Fix up some more draw calls for new argument. v4: Fix up rebase conflicts in i965. v5: Undo const-qualification Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-25 22:01:35 +13:00
José Fonseca	eb0892b4b1	docs/llvmpipe: Add one other good reference.	2013-11-25 08:28:23 +00:00
Chris Forbes	90d185544c	docs: describe the INTEL_* envvars that do exist V2: drop description of `fall` and `wm`, which have been removed by the previous patch; describe `stats`. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-25 21:18:33 +13:00
Chris Forbes	195994fe4c	drop old INTEL_DEBUG names for `perf` (fall) and `fs` (wm) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-25 21:18:33 +13:00
Chris Forbes	452721c1fa	i965: remove unused DEBUG_IOCTL Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-25 21:18:33 +13:00
Chris Forbes	e0c98fa401	radeon: change last instance of DEBUG_IOCTL to use RADEON_IOCTL DEBUG_IOCTL comes from i965, and is about to be removed. Both defines have the same value (4). Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2013-11-25 21:18:33 +13:00
Chris Forbes	26eb6ad831	docs: drop INTEL_* envvars which no longer exist These were removed back in 2012. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-25 21:18:33 +13:00
Chris Forbes	f6159afa19	docs: bump supported shading language version Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-25 21:18:33 +13:00
Dave Airlie	72cae2a599	st/mesa: respect higher GLSL levels. (v2) Limit the max glsl version level to what the state tracker supports. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-11-25 13:03:02 +10:00
Timothy Arceri	3c9f0096c7	glsl: Improve error message when attempting assignment to unsized array V2: Return after error to avoid cascading error messages and removed redundant "to" from error message Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-23 15:52:27 -08:00
Jordan Justen	bd00c66500	intel: enable GL_AMD_vertex_shader_layer extension for gen7+ Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-23 10:49:56 -08:00
Marek Olšák	751e8697f2	radeonsi: implement MSAA for CIK There are also some changes to the printfs. Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2013-11-23 01:54:58 +01:00
Marek Olšák	7b136de79a	radeonsi: enable 2D tiling on CIK libdrm does the DRM version check and decides if 2D tiling is used. Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2013-11-23 01:54:58 +01:00
Marek Olšák	a3969aa125	mesa: initialize gl_renderbuffer::Depth in core Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-23 01:54:57 +01:00
Eric Anholt	46cf80fb36	i965/fs: Make the first pre-allocation heuristic be the post heuristic. I recently made us try two different things that tried to reduce register pressure so that we would be more likely to allocate successfully. But now that we have the logic for trying two, we can make the first thing we try be the normal, not-prioritizing-register-pressure heuristic. This means one less scheduling pass in the common case of that heuristic not producing spills, plus the best schedule we know how to produce, if that one happens to succeed. This is important, because our register allocation produces a lot of possibly avoidable dependencies for the post-register-allocation schedule, despite ra_set_allocate_round_robin(). GLB2.7: 1.04127% +/- 0.732461% fps improvement (n=31) nexuiz: No difference (n=5) lightsmark: 0.838512% +/- 0.300147% fps improvement (n=86) minecraft apitrace: No difference (n=15) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-22 16:36:27 -08:00
Eric Anholt	09db4940ee	mesa: Remove the ralloc canary on release builds. The canary is basically just to give a better debugging message when you ralloc_free() something that wasn't rallocated. Reduces maximum memory usage of apitrace replay of the dota2 demo by 60MB on my 64-bit system (so half that on a real 32-bit dota2 environment). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-22 16:36:27 -08:00
Eric Anholt	5891f98145	i965: Fix streamed state dumping/annotation after the blorp-flush change. I think I was thinking of the batch command packet cache when I pasted this in, but this counter is only used for dumping out streamed state for INTEL_DEBUG=batch and for putting annotations in our aub files. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-22 16:36:27 -08:00
Chad Versace	315b06ff62	i965: Let driconf clamp_max_samples affect context version Commit `2f89662` added the driconf option 'clamp_max_samples'. In that commit, the option did not alter the context version. The neglect to alter the context version is a fatal issue for some apps. For example, consider running Chromium with clamp_max_samples=0. Pre-patch, Mesa creates a GL 3.0 context but clamps GL_MAX_SAMPLES to 0. This violates the GL 3.0 spec, which requires GL_MAX_SAMPLES >= 4. The spec violation causes WebGL context creation to fail in many scenarios because Chromium correctly assumes that a GL 3.0 context supports at least 4 samples. Since the driconf option was introduced largely for Chromium, the issue really needs fixing. This patch fixes calculation of the context version to respect the post-clamped value of GL_MAX_SAMPLES. This in turn fixes WebGL on Chromium when clamp_max_samples=0. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-22 15:27:03 -08:00
Chad Versace	95ebabbc5f	i965: Share code between intel_quantize_num_samples and clamp_max_samples clamp_max_samples() and intel_quantize_num_samples() each maintained their own list of which MSAA modes the hardware supports. This patch removes the duplication by making intel_quantize_num_samples() use the same list as clamp_max_samples(), the list maintained in brw_supported_msaa_modes(). By removing the duplication, we prevent the scenario where someone updates one list but forgets to update the other. Move function `brw_context.c:static brw_supported_msaa_modes()` to `intel_screen.c:(non-static) intel_supported_msaa_modes()` and patch intel_quantize_num_samples() to use the list returned by that function. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-22 14:56:15 -08:00
Chad Versace	8d1a8d65b5	i965: Terminate brw_supported_msaa_modes() list with -1, not 0 This simplifies the loop logic in a subsequent patch that refactors intel_quantize_num_samples() to use brw_supported_msaa_modes(). Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-22 14:56:02 -08:00
Brian Paul	aad2511c6d	st/mesa: simplify writemask for emitting fog result Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-11-22 09:01:13 -07:00
Brian Paul	73b19be32d	mesa: fix indentation in ffvertex_prog.c Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-22 08:52:09 -07:00
José Fonseca	69049555af	tgsi: Prevent emission of instructions with empty writemask. These degenerate instructions can often be emitted by state trackers when the semantics of instructions don't match precisely. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-22 15:03:36 +00:00
José Fonseca	4ade77f625	tgsi: Rework calls to ureg_emit_insn(). Mere syntactical change. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-22 15:03:36 +00:00
José Fonseca	68b696e595	docs: Add a section with recommended reading for llvmpipe development. Several of the links were contributed by Keith Whitwell and Roland Scheidegger. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-22 15:03:36 +00:00
Roland Scheidegger	f69d2c857d	llvmpipe: (trivial) disable new accurate origin calculation It looks like there's some bugs in it...	2013-11-22 11:29:00 +00:00
Vinson Lee	bb354c6c27	meta: Move declaration before code. Fixes MSVC build. meta.c(2411) : error C2143: syntax error : missing ';' before 'type' meta.c(2411) : error C2143: syntax error : missing ')' before 'type' meta.c(2411) : error C2065: 'layer' : undeclared identifier meta.c(2411) : error C2059: syntax error : ')' meta.c(2411) : error C2143: syntax error : missing ';' before '{' meta.c(2413) : error C2065: 'layer' : undeclared identifier meta.c(2415) : error C2065: 'layer' : undeclared identifier Signed-off-by: Vinson Lee <vlee@freedesktop.org>	2013-11-21 20:29:38 -08:00
Paul Berry	ec79c05cbf	mesa: Implement GL_FRAMEBUFFER_ATTACHMENT_LAYERED query. From section 6.1.18 (Renderbuffer Object Queries) of the GL 3.2 spec, under the heading "If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is TEXTURE, then": If pname is FRAMEBUFFER_ATTACHMENT_LAYERED, then params will contain TRUE if an entire level of a three-dimensional texture, cube map texture, or one- or two-dimensional array texture is attached. Otherwise, params will contain FALSE. Fixes piglit tests: - spec/!OpenGL 3.2/layered-rendering/framebuffer-layered-attachments - spec/!OpenGL 3.2/layered-rendering/framebuffertexture-defaults Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> v2: Don't include "EXT" in the error message, since this query only makes sense in context versions that have adopted glGetFramebufferAttachmentParameteriv(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-21 18:16:47 -08:00
Paul Berry	af1471dc04	mesa: Fix texture target validation for glFramebufferTexture() Previously we were using the code path for validating glFramebufferTextureLayer(). But glFramebufferTexture() allows additional texture types. Fixes piglit tests: - spec/!OpenGL 3.2/layered-rendering/gl-layer-cube-map - spec/!OpenGL 3.2/layered-rendering/framebuffertexture Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> v2: Clarify comment above framebuffer_texture(). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-21 18:16:44 -08:00
Paul Berry	0831523350	i965: Fix fast clear of depth buffers. From section 4.4.7 (Layered Framebuffers) of the GL 3.2 spec: When the Clear or ClearBuffer* commands are used to clear a layered framebuffer attachment, all layers of the attachment are cleared. This patch fixes the fast depth clear path. Fixes piglit test "spec/!OpenGL 3.2/layered-rendering/clear-depth". Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-21 18:16:42 -08:00
Paul Berry	c1019670ea	i965: Fix blorp clear of layered framebuffers. From section 4.4.7 (Layered Framebuffers) of the GL 3.2 spec: When the Clear or ClearBuffer* commands are used to clear a layered framebuffer attachment, all layers of the attachment are cleared. This patch fixes the blorp clear path for color buffers. Fixes piglit test "spec/!OpenGL 3.2/layered-rendering/clear-color". Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-21 18:16:39 -08:00
Paul Berry	1ec5365429	i965: refactor blorp clear code in preparation for layered clears. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-21 18:16:36 -08:00
Paul Berry	068a073c1d	meta: fix meta clear of layered framebuffers From section 4.4.7 (Layered Framebuffers) of the GL 3.2 spec: When the Clear or ClearBuffer* commands are used to clear a layered framebuffer attachment, all layers of the attachment are cleared. This patch fixes meta clears to properly clear all layers of a layered framebuffer attachment. We accomplish this by adding a geometry shader to the meta clear program which sets gl_Layer to a uniform value. When clearing a layered framebuffer, we execute in a loop, setting the uniform to point to each layer in turn. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-21 18:16:34 -08:00
Paul Berry	95140740ad	mesa: Track number of layers in layered framebuffers. In order to properly clear layered framebuffers, we need to know how many layers they have. The easiest way to do this is to record it in the gl_framebuffer struct when we check framebuffer completeness. This patch replaces the gl_framebuffer::Layered boolean with a gl_framebuffer::NumLayers integer, which is 0 if the framebuffer is not layered, and equal to the number of layers otherwise. v2: Remove gl_framebuffer::Layered and make gl_framebuffer::NumLayers always have a defined value. Fix factor of 6 error in the number of layers in a cube map array. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 18:16:25 -08:00
Ben Skeggs	085ad4821e	nvc0: inform kernel about buffers that screen_create touches Prevents a GPU page fault if somehow the uniform bo gets evicted before the screen_create pushbuf has been submitted. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2013-11-22 11:34:43 +10:00
Tom Stellard	1bdb99330a	radeonsi/compute: Fix LDS size calculation We need to include the number of LDS bytes allocated by the state tracker. CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-21 16:14:58 -08:00
Tom Stellard	7a30cd7085	r600g/compute: Add a work-around for flushing issues on Cayman Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> https://bugs.freedesktop.org/show_bug.cgi?id=69321 CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-21 15:55:16 -08:00
Paul Berry	544e3129c5	glsl: Fix interstage uniform interface block link error detection. Previously, we checked for interstage uniform interface block link errors in validate_interstage_interface_blocks(), which is only called on pairs of adjacent shader stages. Therefore, we failed to detect uniform interface block mismatches between non-adjacent shader stages. Before the introduction of geometry shaders, this wasn't a problem, because the only supported shader stages were vertex and fragment shaders, therefore they were always adjacent. However, now that we allow a program to contain vertex, geometry, and fragment shaders, that is no longer the case. Fixes piglit test "skip-stage-uniform-block-array-size-mismatch". Cc: "10.0" <mesa-stable@lists.freedesktop.org> v2: Rename validate_interstage_interface_blocks() to validate_interstage_inout_blocks() to reflect the fact that it no longer validates uniform blocks. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> v3: Make validate_interstage_inout_blocks() skip uniform blocks. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-21 15:05:09 -08:00
Paul Berry	0f4cacbb53	glsl: Fix cross-version linking between VS and GS. Previously, when attempting to link a vertex shader and a geometry shader that use different GLSL versions, we would sometimes generate a link error due to the implicit declaration of gl_PerVertex being different between the two GLSL versions. This patch fixes that problem by only requiring interface block definitions to match when they are explicitly declared. Fixes piglit test "shaders/version-mixing vs-gs". Cc: "10.0" <mesa-stable@lists.freedesktop.org> v2: In the interface_block_definition constructor, move the assignment to explicitly_declared after the existing if block. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-21 15:05:06 -08:00
Paul Berry	2bbcf19aca	glsl: Prohibit illegal mixing of redeclarations inside/outside gl_PerVertex. From section 7.1 (Built-In Language Variables) of the GLSL 4.10 spec: Also, if a built-in interface block is redeclared, no member of the built-in declaration can be redeclared outside the block redeclaration. We have been regarding this text as a clarification to the behaviour established for gl_PerVertex by GLSL 1.50, so we apply it regardless of GLSL version. This patch enforces the rule by adding an enum to ir_variable to track how the variable was declared: implicitly, normally, or in an interface block. Fixes piglit tests: - gs-redeclares-pervertex-out-after-global-redeclaration.geom - vs-redeclares-pervertex-out-after-global-redeclaration.vert - gs-redeclares-pervertex-out-after-other-global-redeclaration.geom - vs-redeclares-pervertex-out-after-other-global-redeclaration.vert - gs-redeclares-pervertex-out-before-global-redeclaration - vs-redeclares-pervertex-out-before-global-redeclaration Cc: "10.0" <mesa-stable@lists.freedesktop.org> v2: Don't set "how_declared" redundantly in builtin_variables.cpp. Properly clone "how_declared". Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-21 15:04:59 -08:00
Kenneth Graunke	7a70f033b5	i965: Enable the AMD_performance_monitor extension on Gen5+. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	2af1aedeca	i965: Take "bookend" OA snapshots at the start/end of each batch. Unfortunately, our hardware only has one set of aggregating performance counters shared between all 3D programs, and their values are not saved or restored by hardware contexts. Also, at least on Sandybridge and Ivybridge, the counters lose their values if the GPU goes to sleep. To work around both of these problems, we have to snapshot the performance counters at the beginning and end of each batch, similar to how we handle query objects on platforms that don't support hardware contexts. I call these "bookend" snapshots. Since there can be multiple performance monitors active at a time, we store the bookend snapshots in a global BO, shared by all monitors. For monitors that span multiple batches, acquiring results involves adding up three segments: BeginPerfMonitor --> End of Batch 1 ("head") Start of Batch 2 --> End of Batch 2 ... ("middle") Start of Batch N-1 --> End of Batch N-1 Start of Batch N --> EndPerfMonitor ("tail") Monitors that refer to bookend BO snapshots are considered "unresolved". We delay resolving them (and adding up deltas to obtain the results) as long as possible to avoid blocking on mapping monitor->oa_bo. We can also run out of space in the bookend BO, at which point we have to resolve all unresolved monitors. Then we can throw away the snapshots and begin writing at the beginning of the buffer. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	1172974ddd	i965: Reserve batchbuffer space for a closing MI_REPORT_PERF_COUNT. In order to use the Observability Architecture effectively, we'll need to take snapshots of the OA counters via MI_REPORT_PERF_COUNT at the start and end of each batch. Experimentation reveals that we need to flush before and after each MI_REPORT_PERF_COUNT to get working values. For simplicity, I chose to use intel_batchbuffer_emit_mi_flush(), which unfortunately expands to triple pipe controls on Sandybridge. We may want to start computing per-generation reserved batch space to avoid the insanity of Sandybridge's PIPE_CONTROL cost. That said, much of this cost existed before I rewrote the query object support to use hardware contexts, so it's at least not entirely new. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	fedc14a050	i965: Add some plumbing for gathering OA results. Currently, this only considers the monitor start and end snapshots. This is woefully insufficient, but allows me to add a bunch of the infrastructure now and flesh it out later. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	c289c70ce1	i965: Start and stop OA counters as necessary. We need to start OA at the beginning of each batch where monitors are active. OACONTROL isn't part of the hardware context, so to avoid leaving counters enabled for other applications, we turn them off at the end of the batch too. We also need to start them at BeginPerfMonitor time (unless they've already been started). We stop them when the monitor last ends as well. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	834c9575b2	i965: Add functions to start and stop the OA counters. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	367c7c2d7c	i965: Add #defines for the OACONTROL register and fields. We'll need to write this register to start/stop performance counters. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	901cae07ff	i965: Take OA counter snapshots at Begin/EndPerfMonitor time. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	093ecbfe3b	i965: Add a function to emit the MI_REPORT_PERF_COUNT packet. MI_REPORT_PERF_COUNT writes a snapshot of the Observability Architecture counters to a buffer. Exactly how it works varies between generations: Ironlake requires two packets, Sandybridge has to use GGTT, and Ivybridge and later use PPGTT. v2: Assert that we didn't use more space than we reserved (suggested by Eric Anholt). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	b05b1eff1c	i965: Track the number of monitors that need OA counters. Using the OA counters requires some per-batch work. When starting and ending a batch, it's useful to know whether any monitors are actually interested in OA data. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	7329f8dd10	i965: Enumerate Observability Architecture counters on Gen5+. In addition to listing the counter names, we include several "remap" tables. Confusingly, counters are documented with names like "A23", are written to some buffer offset other than 23, and exposed by core Mesa under a counter ID that is different still. The first is inevitable; MI_REPORT_PERF_COUNT writes certain counters to fixed locations in the buffer. The latter could be avoided, but core Mesa uses the "Counters" array index as the ID for a counter. We could do remapping there, but it would just complicate the core Mesa code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	9f41585eb5	i965: Expose pipeline statistics registers via performance monitors. This is fairly simple: - At BeginPerfMonitor time, take an opening snapshot. - At EndPerfMonitor time, take a closing snapshot. - The first time the application asks for results, subtract the two and store that value. Then free the BO containing the snapshots. - On subsequent requests for the results, just return the saved value. - On reset, throw away the results. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	91950d1aea	i965: Enumerate the pipeline statistics register counters on Gen6+. For now, we only support these on Gen6+, since that's what currently uses hardware contexts. When we add Ironlake hardware context support, we can add pipeline statistics register support for that as well. In theory, we could support pipeline statistics counters even without hardware contexts, but it would be annoyingly painful. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:14 -08:00
Kenneth Graunke	569adb40d7	i965: Initialize performance monitor Groups/NumGroups. Since we don't support any counters, there are zero groups. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Kenneth Graunke	7bf3cd4315	i965: Add macros for creating performance monitor counters and groups. The Observability Architecture counters are 32-bit unsigned values, and the Pipeline Statistics Register counters are 64-bit unsigned values. These convenience macros make it easy to create those types of counters. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Kenneth Graunke	63b8ce612f	i965: Periodically dump the list of monitors if INTEL_DEBUG=perfmon. It's useful to see the state of all outstanding monitors; the start of a new batch seems like a reasonable time to print them out. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Kenneth Graunke	379a246fc1	i965: Add basic driver hooks and plumbing for AMD_performance_monitor. These stub functions will be filled out in later patches. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Kenneth Graunke	b64eb100b0	i965: Add INTEL_DEBUG=perfmon support. This will enable debugging printfs for the AMD_performance_monitor code. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Kenneth Graunke	a4bf7f6b6e	i965: Move brw_emit_query_begin() to the render ring prelude. Without hardware contexts, the pipeline statistics registers are free-running and include data from every 3D application running. In order to find out the contributions of one particular context, we need to take a snapshot at the start and end of each batch. Previously, we emitted the PIPE_CONTROL necessary to capture PS_DEPTH_COUNT when drawing primitives. Special tracking ensured it happened only on the first draw of the batch, rather than on every draw. Moving this to brw_new_batch increases symmetry, since the final snapshot has always been in brw_finish_batch, which is just a few lines below. It should be basically equivalent. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Kenneth Graunke	bb9d2eab89	i965: Introduce a "render ring prelude" hook. The new intel_batchbuffer_emit_render_ring_prelude() hook will be called when switching from BLT or UNKNOWN_RING to RENDER_RING. This provides a place to emit state that should go at the start of each render ring batch, with minimal overhead. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Kenneth Graunke	395a32717d	i965: Introduce an UNKNOWN_RING state. When we first create a batch buffer, it's empty. We don't actually know what ring it will be targeted at until the first BEGIN_BATCH or BEGIN_BATCH_BLT macro. Previously, one could determine the state of the batch by checking brw->batch.ring (blit vs. render) and brw->batch.used != 0 (known vs. unknown). This should be functionally equivalent, but the tri-state enum is a bit clearer. v2: Catch three explicit require_space callers (thanks to Carl and Eric). v3: Split the boolean -> enum change from the UNKNOWN_RING change. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Kenneth Graunke	6bc40f9af5	i965: Convert brw->batch.is_blit to a BLT_RING/RENDER_RING enum. Passing BLT_RING or RENDER_RING to batchbuffer functions is a lot more obvious than passing true or false. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 15:01:13 -08:00
Roland Scheidegger	28d7b4147d	llvmpipe: calculate more accurate interpolation value at origin Some rounding errors could crop up when calculating a0. Use a more accurate method (barycentric interpolation essentially) to fix this, though to fix the REAL problem (which is that our interpolation will give very bad results with small triangles far away from the origin when they have steep gradients) this does absolutely nothing (actually makes it worse). (To fix the real problem, we would either need to use a vertex corner (or some other point inside the tri) as the starting point value instead of the fb origin and pass that down to interpolation, or mimic what hw does and use barycentric interpolation (using the coordinates extracted from the rasterizer edge functions) - maybe another time.) Some (silly) tests though really want a high accuracy at fb origin and don't care much about anything else (Just. Don't. Ask.). Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-21 20:39:19 +00:00
Brian Paul	9d1c71e34d	svga: remove special-case code for texkil w component Not actually needed. Fixes piglit ARB_fragment_program/kil-swizzle test. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-11-21 09:08:17 -07:00
José Fonseca	2d5f21ba65	gallium: Make TGSI_SEMANTIC_FOG register four-component wide. D3D9 Shader Model 2 restricted the fog register to one component, http://msdn.microsoft.com/en-us/library/windows/desktop/bb172945.aspx , but that restriction no longer exists in Shader Model 3, and several WHCK tests enforce that. So this change: - lifts the single-component restriction TGSI_SEMANTIC_FOG from Gallium interface - updates the Mesa state tracker to enforce output fog has (f, 0, 0, 1) - draw module was updated to leave TGSI_SEMANTIC_FOG output registers alone Several gallium drivers that are going out of their way to clear TGSI_SEMANTIC_FOG components could be simplified in the future. Thanks to Si Chen and Michal Krol for identifying the problem. Testing done: piglit fogcoord-*.vpfp tests Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-21 14:00:05 +00:00
José Fonseca	edd9efc2fb	tgsi_exec: Fix mask calculation for emit_kill_if. Same as Si Chen's commit `e7a5905d8a` for tgsi_exec module. Not actually tested, because softpipe is failing the test that caught this bug due to unrelated issues. Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-21 13:56:10 +00:00
José Fonseca	bba8f10598	mesa: Use IROUND instead of roundf. roundf is not available on MSVC.	2013-11-21 13:56:00 +00:00
Tapani Pälli	7e61b44dcd	mesa: enable GL_TEXTURE_LOD_BIAS set/get Earlier comments suggest this was removed from GL core spec but it is still there. Enabling makes 'texture_lod_bias_getter' Khronos conformance tests pass, also removes some errors from Metro Last Light game which is using this API. v2: leave NOTE comment (Ian) Cc: "9.0 9.1 9.2 10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Tapani Pälli <tapani.palli@intel.com>	2013-11-21 12:49:18 +02:00
Christian König	ecb37a6e77	winsys/radeon: cleanup virtual memory nonsense The alignment of a virtual memory area must always be at least 4096 bytes. It only worked because size was aligned to 4096 outside of the function. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-11-21 10:24:20 +01:00
Courtney Goeltzenleuchter	f56f875b8b	mesa: Update MESA_INFO to eliminate error If a user set MESA_INFO and the OpenGL application uses a 3.0 or later context then the MESA_INFO debug output will have an error when it queries for extensions using the deprecated enum GL_EXTENSIONS. Passing context argument allows code to return extension list directly regardless of profile. Commit title updated as recommended by Kenneth Graunke. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-21 00:26:20 -08:00
Kenneth Graunke	36c3faf4bf	i965: Disable BLORP on Broadwell for now. BLORP is essential. However, porting it to Gen8 is a huge amount of work. Disabling it for now allows us to proceed with basic hardware enablement. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 00:26:11 -08:00
Kenneth Graunke	01ae16a0e7	i965: Disable HiZ on Broadwell for now. HiZ is difficult to implement, and while it's essential for performance, we don't need it right away for purposes of hardware enabling. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 00:26:11 -08:00
Kenneth Graunke	232140a47a	i965: Claim OpenGL 3.3 support on Broadwell. Bugs aside, basically everything ought to work. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 00:26:11 -08:00
Kenneth Graunke	b61ff94032	i965: Add device info structs for Broadwell. As always, the chipset limits here are placeholders, rather than the actual values. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-21 00:26:11 -08:00
Vinson Lee	b7c0b61782	glsl: Use more portable bash invocation construct. Fixes 'make check' on distros where bash is not at /bin/bash. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Tested-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-20 22:39:59 -08:00
Vinson Lee	7f56780915	gallivm: Ignore unknown file type in non-debug builds. Fixes "Uninitialized pointer read" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-20 22:35:36 -08:00
Dave Airlie	b01a3a9b72	glx: don't fail out when no configs if we have visuals GLX 1.2 servers with no SGIX_fbconfigs exist (some citrix thing), and we fail glxinfo completely in those cases. CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-11-21 10:50:48 +10:00
Dave Airlie	a43b49dfb1	mesa/swrast: fix inverted front buffer rendering with old-school swrast I've no idea when this broke, but we have some people who wanted it fixed, so here's my attempt. To reproduce, run readpix with swrast and hit f, or run trivial tri -sb: things are upside down; after this patch they aren't. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62142 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66213 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-11-21 10:50:17 +10:00
Eric Anholt	81ff29e30c	mesa: Fix setup of LocalParams array. i965 passed piglit, but swrast and gallium both segfaulted without this. i965 happened to work because it never ran _mesa_load_state_parameters() on the new program before the test called glProgramLocalParameter(), which was allocating a LocalParams array for the fallback path. v2: Since v1 threw away old localparams data, leaked old LocalParams memory, only fixed fragment programs, and I was dubious of my previous invariants already (nothing but program_parse.y will generate LocalParams, and only that one path of program_parse.y will), just late-allocate localparams at the other point of dereferencing them. This adds overhead to _mesa_load_state_parameter, which is uncomfortable, but I'm pretty sure that giant switch statement is super slow already. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71734 Tested-by: Michel Dänzer <michel.daenzer@amd.com>	2013-11-20 16:12:46 -08:00
Matt Turner	5fe49d99f2	i965/test: Use unreachable() to silence warning.	2013-11-20 15:04:53 -08:00
Matt Turner	1f9092958d	i965: Link -ldl after libmesa.la DLOPEN_LIBS is part of DRI_LIB_DEPS. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71512 Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-20 15:04:53 -08:00
Matt Turner	a97cd0f4d7	i965: Add a pass to remove dead control flow. Removes IF/ENDIF and IF/ELSE/ENDIF with no intervening instructions. total instructions in shared programs: 1360393 -> 1360387 (-0.00%) instructions in affected programs: 157 -> 151 (-3.82%) (no change in vertex shaders) Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-20 15:04:53 -08:00
Matt Turner	b63d6aae55	i965: Make invalidate_live_intervals() a virtual method of backend_visitor. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-20 15:04:53 -08:00
Matt Turner	1c263f8f4f	i965/vec4: Add invalidate_live_intervals method. Reviewed-by: Paul Berry <stereotype441@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-20 15:04:53 -08:00
Matt Turner	c4464c9eea	i965/fs: Don't emit SIMD16 BFI instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-20 15:04:52 -08:00
Matt Turner	9bbedf6146	i965/fs: Emit compressed 3-source instructions on Haswell. For commit `4df56177` Paul discovered that the hardware restriction that Align16 instructions cannot be compressed was lifted on Haswell. This has prevented us from emitting compressed three-source instructions. For added confirmation, the bspec lists a work around called WaBreakSimd16TernaryInstructionsIntoSimd8 that hasn't been applicable since very early Haswell silicon. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-20 15:04:52 -08:00
Matt Turner	82bfb45e24	i965: Fix disassembled names of BFI1 and BFI2 instructions. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-20 15:04:52 -08:00
Matt Turner	9793fc1335	i965/fs: Use source's original type in register_coalesce(). Previously, register_coalesce() would modify mov vgrf1:f vgrf2:f cmp null vgrf3:d vgrf1:d to be cmp null vgrf3:d vgrf2:f and incorrectly use vgrf2's type in the instruction that the mov was coalesced into. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-20 15:04:52 -08:00
José Fonseca	060159820c	u_gen_mipmap: Use untampered cubemap texture coords when generating mipmaps. It's not necessary to scale down cubemap texture coords when generating mipmaps: we are doing a 2x minification therefore it's guaranteed that the texture coords will always be at least 1 texel away from the edges. Scaling down can actually be harmful, as it may cause artefacts when generating mipmaps with nearest filtering. Sample points will lie exactly in the middle of each 2x2 texels, so the scaling factor was causing different texels to be taken on each quadrant of the cube face. This is apparent with a 1x1 checkerboard pattern in the base mipmap level: instead of the next mipmap level receiving a constant color throughout the face, it will have different colors for each quadrant of the face. The behaviour for blits is left untouched for now, but the cubemap texture coord scaling hack should be reconsidered eventually. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-20 07:12:59 +00:00
Brian Paul	15d8e05e1e	st/mesa: fix GL_FEEDBACK mode inverted Y coordinate bug We need to check the drawbuffer's orientation before inverting Y coordinates. Fixes piglit feedback tests when running with the -fbo option. Cc: "9.2" "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-11-19 13:21:35 -07:00
Si Chen	e7a5905d8a	gallivm: Fix mask calculation for emit_kill_if. The exec_mask must be taken in consideration, just like emit_kill above. The tgsi_exec module has the same bug and should be fixed in a future change. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-11-19 19:16:18 +00:00
Paul Berry	81b998ca48	i965/gen7: Disallow Y tiling of renderable surfaces with valign of 2. Gen7 does not allow render targets to have a vertical alignment of 2. So, when creating a surface, if its format is renderable, and its vertical alignment is 2, force it to use X tiling. Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-19 09:48:51 -08:00
Paul Berry	6b40dd17cf	i965/gen7: Prefer vertical alignment of 4 when possible. Gen6+ allows for color buffers to use a vertical alignment of either 4 or 2. Previously we defaulted to 2. This may have caused problems on Gen7 because Y-tiled render targets are not allowed to use a vertical alignment of 2. This patch changes the vertical alignment to 4 on Gen7, except for the few formats where a vertical alignment of 2 is required. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-19 09:48:48 -08:00
Paul Berry	60b1a118e1	i965/vec4: Fix broken IR annotation in debug output. Commit `70953b5` (i965: Initialize all member variables of vec4_instruction on construction) inadvertently added a line to the vec4_instruction constructor setting this->ir to NULL, wiping out the previously set value. As a result, ever since then, the output of INTEL_DEBUG=vs and INTEL_DEBUG=gs has been missing IR annotations. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-19 09:40:57 -08:00
Brian Paul	92c3d5acf7	svga: improve check for 3D compressed textures This is basically a respin of f1dfcf4bce35e6796f873d9a00103b280da81e4c per Jose's suggestion. Just set the SVGA3dSurfaceFormatCaps flags for 3D and cube textures when checking the texture format capabilities. This will filter out unsupported combinations like 3D+DXT. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-19 09:03:41 -07:00
Jon TURNEY	5ab59e5332	glx/tests: Provide __glXGetCurrentContext() stub when needed Refine `8c533022`. Provide a stub __glXGetCurrentContext() function when $(DEFINES) are such that it is not a macro. Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>	2013-11-19 15:28:22 +00:00
Brian Paul	21ae5135dd	svga: we don't support 3D compressed textures Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2013-11-18 16:34:02 -07:00
Brian Paul	7eab897d4d	st/mesa: pass correct pipe_texture_target to st_choose_format() We were always passing PIPE_TEXTURE_2D, but not all formats are supported for all types of textures. In particular, the driver may not support texture compression for all types of textures. Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2013-11-18 16:34:02 -07:00
Tom Stellard	1b9511d7ce	r600g/compute: Fix handling of global buffers in r600_resource_copy_region() Global buffers do not have an associated cs_buf handle, so we can't copy them using r600_copy_buffer() https://bugs.freedesktop.org/show_bug.cgi?id=64226 Reviewed-by: Marek Olšák <marek.olsak@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-18 12:28:13 -08:00
Tom Stellard	17930a66aa	gallium: Pass version scripts to linker using --version-script= This fixes build failures with the gold linker. CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-18 12:19:04 -08:00
Tom Stellard	a84dd2398f	clover: Optionally return context's devices from clGetProgramInfo() The spec allows clGetProgramInfo() to return information about either the devices associated with the program or the devices associated with the context. If there are no devices associated with the program, then we return devices associated with the context. https://bugs.freedesktop.org/show_bug.cgi?id=52171 Reviewed-by: Francisco Jerez <currojerez@riseup.net> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-18 11:54:28 -08:00
Paul Berry	7dfb4b2d00	i965/gen7: Emit workaround flush when changing GS enable state. v2: Don't go to extra work to avoid extraneous flushes. (Previous experiments in the kernel have suggested that flushing the pipeline when it is already empty is extremely cheap). Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-18 10:09:11 -08:00
Brian Paul	d222202193	osmesa: add missing comma	2013-11-18 09:14:48 -07:00
Brian Paul	cadec45c3d	osmesa: add support for postprocess filters Add new OSMesaPostprocess() function to allow using the gallium postprocessing filters. This only works for OSMesa with gallium drivers, not the legacy swrast OSMesa. Bump OSMESA_MAJOR/MINOR_VERSION numbers to 10.0 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-11-18 08:56:35 -07:00
Brian Paul	7cf40c1cb3	postprocess: document the pp_init() function. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-11-18 08:56:34 -07:00
Brian Paul	b7e5678fe5	postprocess: move #defines to filters.h They're not needed in postprocess.h Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-11-18 08:56:34 -07:00
Brian Paul	c27d8cc0c9	postprocess: refactor header files, etc Move private data structures and function prototypes out of the public postprocess.h header file. Create a pp_private.h for the shared, private data structures, functions. Remove pp_program.h header. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-11-18 08:56:34 -07:00
Brian Paul	de2fd7dd0b	postprocess: rename program to pp_program To match the pp_ namespace convention. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-11-18 08:56:34 -07:00
Brian Paul	401f2d6ea8	postprocess: simplify pp_free() code Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2013-11-18 08:56:34 -07:00
Emil Velikov	d33d260b90	docs: indicate GLX_MESA_query_renderer's completion Cc: "10.0" <mesa-stable@lists.freedesktop.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-18 15:38:37 +00:00
Emil Velikov	b8a1115132	docs: update nv50, nvc0 current status Acked-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-18 15:38:29 +00:00
Joerg Mayer	f9868926ee	docs: restructure GL3.txt - Indent items under a GL version to allow context diffs to do their work. - Move complete drivers into the GL version line - this should make the stuff a little bit easier to read. v2: keep the fd.o link (Emil Velikov) Acked-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Joerg Mayer <jmayer@loplof.de> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-18 15:38:16 +00:00
Emil Velikov	ca9794658e	docs: add a note about removed state tracker/targets The X.Org state tracker is gone, as well as the xvmc/vdpau r300 and softpipe targets. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Acked-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-18 15:37:39 +00:00
Emil Velikov	0faaed2112	targets/xvmc: export only necessary symbols Export only XvMC* symbols for the xvmc targets. Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-18 15:35:21 +00:00
Emil Velikov	5896100a38	drivers/radeon: remove unused CXXFLAGS, LLVM_CPP_FILES The above two variables are unused as of commit `024fe6852a` Author: Tom Stellard <thomas.stellard@amd.com> Date: Tue Apr 2 10:42:50 2013 -0700 radeon/llvm: Use LLVM C API for compiling LLVM IR to ISA v2 which removed the only cpp file from drivers/radeon, but failed to remove the CXXFLAGS. The subsequent commit reintroduced an empty LLVM_CPP_FILES. Let's clean up and remove both. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-18 15:35:21 +00:00
José Fonseca	1e67ee8c9a	mesa/main: Move declaration to beginning of scope. Should fix MSVC build. Trivial.	2013-11-18 14:43:31 +00:00
Courtney Goeltzenleuchter	2cfbf84dad	mesa: Add API debug logging to TexStorage Give glTexStorage* equivalent debug logging to glTexImage*. Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com> Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-17 19:57:17 -08:00
Tapani Pälli	53f89a436f	glsl: cleanup, remove duplicate assignment Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-17 18:51:37 -08:00
Kenneth Graunke	d12e0e8972	mesa: Handle !m->Ended for performance monitor result availability. If a performance monitor has never ended, then no result can be available. Core Mesa can easily handle this, saving drivers a tiny bit of complexity. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-17 18:51:07 -08:00
Kenneth Graunke	bde5e4a1e6	mesa: Track whether a performance monitor has ever ended. If a monitor has ended, it means a result should eventually become available, pending some flushing. This is distinct from !m->Active; if a monitor has not been started, then m->Active == false and m->Ended == false. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-17 18:51:07 -08:00
Kenneth Graunke	a6712f5109	mesa: Also initialize gl_performance_monitor::Active. The i965 implementation uses calloc, so I missed this. It's best to simply initialize it to avoid requiring a zeroing allocator, though. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-17 18:51:06 -08:00
Kenneth Graunke	145138fb3c	mesa: Store the performance monitor object's name. Being able to print monitor->Name is really useful for debugging. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-17 18:51:06 -08:00
Chris Forbes	45a56ce399	mesa: bump version to 10.1 (devel) Now that branch 10.0 is created, bump the minor version in master. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-17 20:31:49 +13:00
Chris Forbes	61143b87c1	i965: Fix broken asserts These would never fire. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-17 18:56:57 +13:00
Chris Forbes	0741997ff0	st/vega: Fix broken assert This would never fire. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-17 18:56:55 +13:00
Chris Forbes	6f7c693a85	r600/sb: Fix broken assert This would never fire. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-17 18:56:40 +13:00
Vadim Girlin	4cb04aa0df	r600g/sb: work around hw issues with stack on eg/cm v2: make it actually work, improve condition Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68503 Cc: "10.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>	2013-11-17 01:36:28 +04:00
Kenneth Graunke	04856ceb5c	i965: Make swizzle_to_scs non-static. We'll need this for Broadwell code as well. Normally, when we make things public, we add the "brw" prefix. I'm not crazy about that in this case, since it deals with prog_instruction.h's SWIZZLE_XYZW values, rather than the BRW_SWIZZLE_XYZW enums. However, I can't think of a better name, and at least the comments and code make it clear. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-11-16 09:12:58 -08:00
Kenneth Graunke	717241bf4a	i965: Move enum brw_urb_write_flags from brw_eu.h to brw_defines.h. Broadwell code should not include brw_eu.h (since it is for Gen4-7 assembly encoding), but needs the URB write flags enum. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-11-16 09:12:58 -08:00
Kenneth Graunke	ec8cc65926	i965/fs: Remove force_sechalf stack Only Gen4 color write setup uses the force_sechalf flag, and it only sets it on a single instruction. It also already has to get a pointer to the instruction and manually set the saturate flag, so we may as well just set force_sechalf the same way and avoid the complexity of a stack. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chris Forbes <chrisf@ijw.co.nz> Acked-by: Anuj Phogat <anuj.phogat@gmail.com>	2013-11-16 09:12:57 -08:00
Emil Velikov	02fdb5cb51	targets/dri: move linker flags out of configure into Automake.inc Previous assumption was that the same set of flags can be reused for both classic and gallium drivers. With megadriver work done the classic drivers ended up using their own (single) instance of the flags. Move these into Automake.inc and rename to indicate that those are gallium specific. Additionally silence an automake/autoconf warning "XXX is not a standard libtool library name", due to the parsing issues of the module tag. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:04 +00:00
Emil Velikov	5b8c2c8f00	targets/dri: compact compiler flags into Automake.inc Greatly reduce duplication and provide a sane minimum of CFLAGS for all DRI targets. Note: This commit adds VISIBILITY_CFLAGS to the following: * freedreno * i915 * ilo * nouveau * vmwgfx Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:04 +00:00
Emil Velikov	38e0b7eeaa	targets/xvmc: do not link against libtrace.la In order to use the trace driver, one needs to define GALLIUM_TRACE. Neither one of the two targets was defining it, thus we're safe to remove libtrace.la. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:04 +00:00
Emil Velikov	dfcdece7c5	targets/xvmc: consolidate lib deps into Automake.inc Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:04 +00:00
Emil Velikov	bfda1460b1	targets/xvmc: move linker flags to Automake.inc Minimise duplication and sources of error (eg nouveau was missing shared and no-undefined) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:03 +00:00
Emil Velikov	5d7d120af1	targets/xvmc: drop duplicated compiler flags Automake.inc already has GALLIUM_VIDEO_CFLAGS, which provides the essential compiler flags needed. Note: this commit adds VISIBILITY_CFLAGS to nouveau. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:03 +00:00
Emil Velikov	f7ac1d5989	gallium/winsys: compact compiler flags into Automake.inc Clean up the duplicated flags and consolidate into a single variable. Note: this patch adds VISIBILITY_CFLAGS to the following targets * freedreno/drm * i915/{drm,sw} * nouveau/drm * sw/fbdev * sw/null * sw/wayland * sw/wrapper * sw/xlib Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:03 +00:00
Emil Velikov	096b988360	targets/vdpau: drop unused libraries from linker In order for one to use trace, noop, rbug and/or galahad, they must set the corresponding GALLIUM_* CFLAG. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:03 +00:00
Emil Velikov	3f920a91f3	targets/vdpau: consolidate lib deps into Automake.inc Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:03 +00:00
Emil Velikov	5f0df8ab22	targets/vdpau: move linker flags to Automake.inc Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:02 +00:00
Emil Velikov	23588a9c04	targets/vdpau: compact compiler flags into Automake.inc Store the compiler flags into a variable, in order to minimise flags duplication (amongst vdpau and xvmc). Note: this commit adds VISIBILITY_CFLAGS to the nouveau target Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:31:02 +00:00
Emil Velikov	7dac1b470a	gallium/drivers: compact compiler flags into Automake.inc * minimise flags duplication * distinguish between VISIBILITY C and CXX flags * set only required flags - C and/or CXX v2: add LLVM_CFLAGS back to AM_CFLAGS (add missing backslash) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 16:29:28 +00:00
Emil Velikov	ad501a535a	targets/radeonsi: move drm_target.c to a common folder ... and symlink to each target. Make automake's subdir-objects work for radeonsi. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:02:52 +00:00
Emil Velikov	23cdf8de32	targets/r600: move drm_target.c to common folder ... and symlink for each target. Make automake's subdir-objects work for r600. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:02:52 +00:00
Emil Velikov	a9a3029541	targets/r300: move drm_target.c to common folder ... and symlink for each target. Make automake's subdir-objects work for r300. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:02:52 +00:00
Emil Velikov	589e0b2305	gallium/drivers: enable automake subdir-objects Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:02:51 +00:00
Emil Velikov	d5e79a9d2b	r300: move the final sources list to Makefile.sources Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:02:47 +00:00
Emil Velikov	2c1bb79213	r300: add symlink to ralloc.c and register_allocate.c Make automake's subdir-objects work. Update includes. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:02:15 +00:00
Emil Velikov	b3c60ff5d0	st/xvmc: enable automake subdir-objects Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:02:15 +00:00
Emil Velikov	01d35eb372	dri/common: move source file lists to Makefile.sources * Allow the lists to be shared among build systems. * Update automake and Android build systems. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-16 14:02:15 +00:00
Emil Velikov	b51b3fc537	gtest: enable subdir-objects to prevent automake warnings Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:01:27 +00:00
Emil Velikov	b5773ee043	gbm: enable subdir-objects to prevent automake warnings Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:00:16 +00:00
Emil Velikov	0b57da0211	scons: move SConscript from gallium/targets/ to mesa/drivers/dri/common/ Store scons side by side with the other build systems. v2: cleanup after a failed rebase Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:00:16 +00:00
Johannes Obermayr	595bd01eb1	freedreno: compact a2xx and a3xx makefiles into parent ones Nearly everything within the three Makefile.am's is identical. Let's simplify things a little. v2: Rebase and rewrite the commit message (Emil Velikov) Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:00:16 +00:00
Emil Velikov	c5062726f1	scons: drop obsolete enabled_apis variable The variable was forgotten during the FEATURE_* removal. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:00:15 +00:00
Emil Velikov	1aeafcb7c5	Android: remove unused MESA_ENABLED_APIS variable The variable was forgotten during the FEATURE_* removal. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 14:00:15 +00:00
Emil Velikov	9560d34fcf	st/egl: use _FILE over _SOURCES names for filelists Silence automake warnings about missing program/library whenever the _SOURCES suffix is used for temporary variable names. warning: variable 'gdi_SOURCES' is defined but no program or library has 'gdi' as canonical name (possible typo) Acked-by: Matt Turner <mattst88@gmail.com> Reported-by: Ilia Mirkin <imirkin@alum.mit.edu> Reported-by: Johannes Obermayr <johannesobermayr@gmx.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70581 Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-16 13:53:31 +00:00
Matt Turner	e133c0103d	i965: Assert that IF with cmod is Gen6 only. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-15 23:31:42 -08:00
Vinson Lee	b570c4229f	i965: Add missing break in SHADER_OPCODE_GEN7_SCRATCH_READ case. Fixes "Missing break in switch" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 18:29:34 -08:00
Eric Anholt	e5885c119d	mesa: Dynamically allocate the storage for program local parameters. The array was 64kb per struct gl_program, plus we statically stored a copy of one on disk for _mesa_DummyProgram. Given that most struct gl_programs we generate are for GLSL shaders that don't have local parameters, this was a waste. Since you can store and fetch parameters beyond what the program actually uses, we do have to do a late allocation if necessary at GetProgramLocalParameter time. Reduces peak memory usage in the dota2 trace I made by 76MB (4.5%) Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-15 11:35:01 -08:00
Eric Anholt	bb1f096975	mesa: Remove PROGRAM_ENV_PARAM enum. This has been replaced with referring to env parameters using PROGRAM_STATE_VAR and _mesa_load_state_parameters. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-15 11:34:59 -08:00
Eric Anholt	33b0455211	mesa: Remove PROGRAM_LOCAL_PARAM enum. This has been replaced with referring to local parameters using PROGRAM_STATE_VAR and _mesa_load_state_parameters. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-15 11:34:57 -08:00
Eric Anholt	fddc17ab36	mesa: Update a comment about valid values of a field. Notably, ENV and LOCAL aren't used any more (replaced by STATE_VAR), but apparently CONSTANT is. Reviewed-by: Brian Paul <brianp@vmware.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-15 11:34:49 -08:00
Eric Anholt	aa6d7bc6d6	glsl: Apply the transformation "1/rsq(x) == sqrt(x)" in opt_algebraic. The comment was stale, because the lowering in question wasn't happening in lower_instructions.cpp. Presumably if the lowering ever moves there, we can plumb the lowering mask through to opt_algebraic. total instructions in shared programs: 1618696 -> 1616810 (-0.12%) instructions in affected programs: 243018 -> 241132 (-0.78%) GAINED: 0 LOST: 0 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-15 11:33:07 -08:00
Eric Anholt	477f8cd08b	glsl: Apply the transformation "(a ^^ a) -> false" in opt_algebraic. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-15 11:33:07 -08:00
Eric Anholt	58a98d32e4	glsl: Apply the transformation "(a && a) -> a" in opt_algebraic. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-15 11:33:07 -08:00
Eric Anholt	ee27048262	glsl: Apply the transformation "(a || a) -> a" in opt_algebraic. total instructions in shared programs: 1732385 -> 1732373 (-0.00%) instructions in affected programs: 416 -> 404 (-2.88%) GAINED: 0 LOST: 0 (That's 4 already-short fragment shaders in dota2) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-15 11:33:07 -08:00
Eric Anholt	8957c6b887	glsl: Move the CSE equality functions to the ir class. I want to reuse them in opt_algebraic. v2: Merge in Chris Forbes's break fix. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-15 11:33:07 -08:00
Matt Turner	fc51e7ac58	clover: Remove dead file from Makefile.sources. Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>	2013-11-15 11:10:32 -08:00
Kenneth Graunke	4ec982ad01	i965: Rework brw_new_batch to actually start a new batch. Previously, brw_new_batch was called just after execbuf, but before intel_batchbuffer_reset. Essentially, it prepared for the creation of a new batch, that wasn't yet available, and which it didn't create. This was a bit awkward. This patch makes brw_new_batch call intel_batchbuffer_reset as the very first operation. This means that brw_new_batch actually creates a new batchbuffer, and thus has it available. It brings the creation of the new batchbuffer and BRW_NEW_BATCH flagging together into one place. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-15 10:24:07 -08:00
Kenneth Graunke	720d935fff	i965: Move cache_used_by_gpu flag setting to brw_finish_batch. It really makes more sense here. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-15 10:24:07 -08:00
Ian Romanick	96a3527a63	i915: Actually enable __DRI2rendererQueryExtensionRec More rebase fail. This code was written long before i915 and i965 were split, so most of the code in i9[16]5/intel_screen.c only needed to exist in one place. It looks like I fixed n-1 of those places after rebasing on the split. I only found this from the defined-but-not-used warning for intelRendererQueryExtension. I noticed this while fixing the other, related warnings. (Note: During review, we decided to not pick this back to 10.0.) Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Paul Berry <stereotype441@gmail.com>	2013-11-15 10:10:29 -08:00
Aaron Watry	2be85e2492	radeon/llvm: Free elf_buffer after use Prevents a memory leak. v2: Remove null check CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:53:31 -08:00
Aaron Watry	01f3622c74	r600/llvm: Free binary.code/binary.config in r600_llvm_compile radeon_llvm_compile allocates memory for binary.code, binary.config, or neither depending on what's being done. We need to make sure to free that memory after it's no longer needed. v2: Don't bother checking for null before FREE() CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:53:31 -08:00
Aaron Watry	dd73b99420	r600/llvm: initialize radeon_llvm_binary use memset to initialize to 0's... otherwise code_size and config_size could be uninitialized when read later in this method. It's also hard to do NULL checks on uninitialized pointers. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> v2: Fix indentation CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:53:31 -08:00
Brian Paul	2bc1680665	svga: remove unused vars in svga_hwtnl_simple_draw_range_elements() And simplify the code. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-15 10:27:01 -07:00
Brian Paul	1a36dfb21e	svga: print warning for unsupported indirect dest reg indexing For DX9-level shaders, there's only limited support for indirect indexing of registers (with the loop counter register, not the general address register.) Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-15 10:23:49 -07:00
Brian Paul	3969330b47	svga: mark dest image as defined in svga_surface_copy() After we blit/copy to a dest texture image we need to mark it as being defined. This fixes broken mipmap generation for quite a few texture formats. Mipgen involves making texture views and svga_texture_view_surface() skips texture images that are undefined. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-15 10:23:48 -07:00
Brian Paul	79984b9928	svga: do primitive trimming in translate_indices() The index translation code expects the number of indexes to be consistent with the primitive type (ex: a multiple of 3 for PIPE_PRIM_TRIANGLES). If it's not, we can write out of bounds in the destination buffer. Fixes failed assertions in the pipebuffer debug code found with Piglit primitive-restart-draw-mode test. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-11-15 10:23:48 -07:00
Brian Paul	491d6397fc	indices: add comments, assertions in u_indices.c file Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2013-11-15 10:23:48 -07:00
Brian Paul	2253fed4a0	mesa: remove duplicated prototypes in varray.h	2013-11-15 10:23:48 -07:00
Aaron Watry	598f61ba28	gallium/pipe_loader: un-reference udev resources when we're done with them. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:16:49 -08:00
Aaron Watry	4c6ac9e614	radeonsi/compute: Dispose of LLVM module after compiling kernels v2: Fix indentation Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:16:49 -08:00
Aaron Watry	35dad4a1e2	radeonsi/compute: Free program and program.kernels on shutdown v2: Fix indentation Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:16:49 -08:00
Aaron Watry	d41b10f811	radeon/llvm: Free created llvm memory buffer v2: Fix indentation Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:16:49 -08:00
Aaron Watry	a2b93da84b	radeon/llvm: Free libelf resources v2: Fix indentation Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:16:49 -08:00
Aaron Watry	df482fe02f	radeon/llvm: fix spelling error Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:16:49 -08:00
Tom Stellard	17af4dd52b	clover: Support multiple devices in clCreateContextFromType() v2 v2: - Use clGetDeviceIDs to query devices. Reviewed-by: Francisco Jerez <currojerez@riseup.net> CC: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-15 09:16:48 -08:00
Paul Berry	f38ac41ed4	glsl: Rework interface block linking. Previously, when doing intrastage and interstage interface block linking, we only checked the interface type; this prevented us from catching some link errors. We now check the following additional constraints: - For intrastage linking, the presence/absence of interface names must match. - For shader ins/outs, the interface names themselves must match when doing intrastage linking (note: it's not clear from the spec whether this is necessary, but Mesa's implementation currently relies on it). - Array vs. nonarray must be consistent, taking into account the special rules for vertex-geometry linkage. - Array sizes must be consistent (exception: during intrastage linking, an unsized array matches a sized array). Note: validate_interstage_interface_blocks currently handles both uniforms and in/out variables. As a result, if all three shader types are present (VS, GS, and FS), and a uniform interface block is mentioned in the VS and FS but not the GS, it won't be validated. I plan to address this in later patches. Fixes the following piglit tests in spec/glsl-1.50/linker: - interface-blocks-vs-fs-array-size-mismatch - interface-vs-array-to-fs-unnamed - interface-vs-unnamed-to-fs-array - intrastage-interface-unnamed-array v2: Simplify logic in intrastage_match() for handling array sizes. Make extra_array_level const. Use an unnamed temporary interface_block_definition in validate_interstage_interface_blocks()'s first call to definitions->store(). Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2013-11-15 08:56:28 -08:00
Paul Berry	b4c3b833ec	i965: Fix vertical alignment for multisampled buffers. From the Sandy Bridge PRM, Vol 1 Part 1 7.18.3.4 (Alignment Unit Size): j [vertical alignment] = 4 for any render target surface is multisampled (4x) From the Ivy Bridge PRM, Vol 4 Part 1 2.12.2.1 (SURFACE_STATE for most messages), under the "Surface Vertical Alignment" heading: This field is intended to be set to VALIGN_4 if the surface was rendered as a depth buffer, for a multisampled (4x) render target, or for a multisampled (8x) render target, since these surfaces support only alignment of 4. Back in 2012 when we added multisampling support to the i965 driver, we forgot to update the logic for computing the vertical alignment, so we were often using a vertical alignment of 2 for multisampled buffers, leading to subtle rendering errors. Note that the specs also require a vertical alignment of 4 for all Y-tiled render target surfaces; I plan to address that in a separate patch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53077 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net>	2013-11-15 08:54:15 -08:00
Paul Berry	46e9f78efc	main: Fix MaxUniformComponents for geometry shaders. For both vertex and fragment shaders we default MaxUniformComponents to 4 * MAX_UNIFORMS. It makes sense to do this for geometry shaders too; if back-ends have different limits they can override them as necessary. Fixes piglit test: spec/glsl-1.50/built-in constants/gl_MaxGeometryUniformComponents Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-15 08:47:41 -08:00
José Fonseca	420ccf7b8f	tools/trace: Several bugfixes/improvements to dump_state.py - Don't crash with user memory pointers. - Support old bind__sampler_ methods. Useful when comparing dumps from old branches. - Misc.	2013-11-15 15:42:02 +00:00
José Fonseca	c5a05a6aef	trace: Dump user_buffer members.	2013-11-15 15:32:33 +00:00
Fredrik Höglund	ff353c218a	mesa: Fix derived vertex state not being updated in glCallList() AEcontext::NewState is not always set when the vertex array state is changed. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71492 Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: José Fonseca <jfonseca@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-15 15:23:23 +00:00
Alex Deucher	469b42ee21	radeonsi: add Hawaii pci ids Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-11-15 08:51:20 -05:00
Alex Deucher	f5778f152b	radeonsi: add support for Hawaii asics (v2) Update additional register fields. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2013-11-15 08:51:09 -05:00
Vinson Lee	78fc159d68	i965: Initialize schedule_node::delay. Fixes "Uninitialized scalar field" defect reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-14 22:36:26 -08:00
Alexander von Gluck IV	f7ce1d772d	haiku/swrast: Inherit gl_config, fix flush * Inherit gl_context so we always have access to it * Thanks curro for the idea. * Last Haiku candidate for 10.0.0 Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2013-11-14 12:33:03 -06:00
Roland Scheidegger	473cb3fe4a	llvmpipe: (trivial) fix more fallout from the setup cleanup. Oops... Should have done some more testing.	2013-11-14 15:49:42 +00:00
Roland Scheidegger	5190c16a04	llvmpipe: (trivial) fix misplaced bld context assignment. Should fix polygon offset crashes...	2013-11-14 14:44:15 +00:00
José Fonseca	a29e40a423	gallivm: Compile flag to debug TGSI execution through printfs. It is similar to tgsi_exec.c's DEBUG_EXECUTION compile flag. I had prototyped this for a while while debugging an issue, but finally cleaned this up and added a few more bells and whistles. v2: Use '$' as marker; better output. Thanks to Brian, Zack and Roland reviews. Here is a sample output. CONST[0].x = 0.00625000009 0.00625000009 0.00625000009 0.00625000009 CONST[0].y = -0.00714285718 -0.00714285718 -0.00714285718 -0.00714285718 CONST[0].z = -1 -1 -1 -1 CONST[0].w = 1 1 1 1 IN[0].x = 143.5 175.5 175.5 143.5 IN[0].y = 123.5 123.5 155.5 155.5 IN[0].z = 0 0 0 0 IN[0].w = 1 1 1 1 $ 1: RCP TEMP[0].w, IN[0].wwww TEMP[0].w = 1 1 1 1 $ 2: MAD TEMP[0].xy, IN[0], CONST[0], CONST[0].zwzw TEMP[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976 TEMP[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316 $ 3: MUL OUT[0].xy, TEMP[0], TEMP[0].wwww OUT[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976 OUT[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316 $ 4: MUL OUT[0].z, IN[0].zzzz, TEMP[0].wwww OUT[0].z = 0 0 0 0 $ 5: MOV OUT[0].w, TEMP[0] OUT[0].w = 1 1 1 1 $ 6: END OUT[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976 OUT[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316 OUT[0].z = 0 0 0 0 OUT[0].w = 1 1 1 1	2013-11-14 14:04:28 +00:00
Roland Scheidegger	673d5391a2	softpipe: (trivial) fix debug code The debug printfs wouldn't actually compile when enabled, so kill them off and insert a new one in another place, and make sure it keeps compiling by enclosing it in an if-0 clause.	2013-11-14 12:24:55 +00:00
Roland Scheidegger	2dd693412a	llvmpipe: clean up state setup code a bit In particular get rid of home-grown vector helpers which didn't add much. And while here fix formatting a bit. No functional change. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-14 12:24:55 +00:00
Roland Scheidegger	754319490f	gallivm,llvmpipe: fix float->srgb conversion to handle NaNs d3d10 requires us to convert NaNs to zero for any float->int conversion. We don't really do that but mostly seems to work. In particular I suspect the very common float->unorm8 path only really passes because it relies on sse2 pack intrinsics which just happen to work by luck for NaNs (float->int conversion in hw gives integer indeterminate value, which just happens to be -0x80000000 hence gets converted to zero in the end after pack intrinsics). However, float->srgb didn't get so lucky, because we need to clamp before blending and clamping resulted in NaN behavior being undefined (and actually got converted to 1.0 by clamping with sse2). Fix this by using a zero/one clamp with defined nan behavior as we can handle the NaN for free this way. I suspect there's more bugs lurking in this area (e.g. converting floats to snorm) as we don't really use defined NaN behavior everywhere but this seems to be good enough. While here respecify nan behavior modes a bit, in particular the return_second mode didn't really do what we wanted. From the caller's perspective, we really wanted to say we need the non-nan result, but we already know the second arg isn't a NaN. So we use this now instead, which means that cpu architectures which actually implement min/max by always returning non-nan (that is adhering to ieee754-2008 rules) don't need to bend over backwards for nothing. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-14 12:24:55 +00:00
Ian Romanick	a15a19f0d1	dri: Change value param to unsigned This silences some compiler warnings in i915 and i965. See also `75982a5`. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-13 14:49:27 -08:00
Ian Romanick	cb6182bdfa	i965: Use drm_intel_get_aperture_sizes instead of hard-coded 2GiB Systems with little physical memory installed will report less than 2GiB, and some systems may (hypothetically?) have a larger address space for the GPU. My IVB still reports 1534. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-13 14:49:27 -08:00
Ian Romanick	9fe108db09	i915: Use drm_intel_get_aperture_sizes instead of drmAgpSize Send the zombie back to the grave before it infects the townsfolk. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-13 14:49:26 -08:00
Alexander Monakov	279e8d2641	i965: implement blit path for PBO glDrawPixels This patch implements accelerated path for glDrawPixels from a PBO in i965. The code follows what intel_pixel_read, intel_pixel_copy, intel_pixel_bitmap and intel_tex_image are doing. Piglit quick.tests show no regressions. In my testing on IVB, performance improvement is huge (about 30x, didn't measure exactly) since generic path goes via _mesa_unpack_color_span_float, memcpy, extract_float_rgba. Signed-off-by: Alexander Monakov <amonakov@ispras.ru> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-13 12:20:59 -08:00
Brian Paul	19c2f40649	docs: fill in md5 checksums for 9.2.3 release	2013-11-13 10:06:23 -07:00
Brian Paul	c093cd3984	docs: fix 9.2.2 -> 9.2.3 typos	2013-11-13 10:03:35 -07:00
Alexander von Gluck IV	df91144a6d	haiku: add swrast driver * This is pretty small and upkeep should be minimal. * Currently fully working. * Candidate for 10.0.0 branch Acked-by: Brian Paul <brianp@vmware.com>	2013-11-13 10:41:10 -06:00
Carl Worth	9976a176ae	docs: Import 9.2.3 release notes, add news item.	2013-11-13 07:32:47 -08:00
Kristian Høgsberg	e048953145	dri: Remove redundant createNewContext function from __DRIimageDriverExtension createContextAttribs is a superset of what createNewContext provides. Also remove the function typedef, since createNewContext is deprecated and no longer used in multiple interfaces. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-12 16:08:17 -08:00
Kristian Høgsberg	68bb26bead	wayland: Use __DRIimage based getBuffers implementation when available This lets us allocate color buffers as __DRIimages and pass them into the driver instead of having to create a __DRIbuffer with the flink that requires. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-12 16:08:17 -08:00
Kristian Høgsberg	04e3ef00db	gbm: Add support for __DRIimage based getBuffers when available This lets us allocate color buffers as __DRIimages and pass them into the driver instead of having to create a __DRIbuffer with the flink that requires. With this patch, we can now run gbm on render-nodes. A render-node is a drm device that doesn't support modesetting and all the legacy DRI ioctls. flink is also not supported, but now that gbm doesn't need flink, we can run piglit on head-less gbm or head-less GPGPU. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Chad Versace <chad.versace@linux.intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Tested-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-12 16:01:40 -08:00
Ander Conselvan de Oliveira	5ba6be2617	dri/i915, dri/i965: Fix support for planar images Planar images have format __DRI_IMAGE_FORMAT_NONE, but the patch that moved the conversion from dri_format to the mesa format made it impossible to allocate a image with that format. Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Reviewed-by: Eric Anholt <eric@anholt.net> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-12 15:57:39 -08:00
Eric Anholt	e9daead784	i965/fs: Try a different pre-scheduling heuristic if the first spills. Since LIFO fails on some shaders in one particular way, and non-LIFO systematically fails in another way on different kinds of shaders, try them both, and pick whichever one successfully register allocates first. Slightly prefer non-LIFO in case we produce extra dependencies in register allocation, since it should start out with fewer stalls than LIFO. This is madness, but I haven't come up with another way to get unigine tropics to not spill while keeping other programs from not spilling and retaining the non-unigine performance wins from texture-grf. total instructions in shared programs: 1626728 -> 1626288 (-0.03%) instructions in affected programs: 1015 -> 575 (-43.35%) GAINED: 50 LOST: 0 Improves Unigine Tropics performance by 14.5257% +/- 0.241838% (n=38) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70445 Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-12 15:06:28 -08:00
Eric Anholt	fbd8303a94	i965/fs: Do instruction pre-scheduling just before register allocation. Long ago, the HW_REG usage in assign_curb/urb_setup() were scheduling barriers, so we had to run scheduler before them in order for it to be able to do basically anything. Now that that's fixed, we can delay the scheduling until we go to allocate (which will make the next change less scary). Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-12 15:06:21 -08:00
Eric Anholt	f72a0d99fe	i965/fs: Ignore actual latency pre-reg-alloc. We care about depth-until-program-end, as a proxy for "make sure I schedule those early instructions that open up the other things that can make progress while keeping register pressure low", not actual latency (since we're relying on the post-register-alloc scheduling to actually schedule for the hardware). total instructions in shared programs: 1609931 -> 1609931 (0.00%) instructions in affected programs: 0 -> 0 GAINED: 55 LOST: 43 Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-12 15:06:00 -08:00
Eric Anholt	7c90947a0b	i965/fs: Fix message setup for SIMD8 spills. In the SIMD16 spilling changes, I replaced a "1" in the spill path with "mlen", but obviously it wasn't mlen before because spills have the g0 header along with the payload. The interface I was trying to use was asking for how many physical regs we're writing, so we're looking for "1" or "2". I'm guessing this actually passed piglit because the high 8 bits of the execution mask in SIMD8 mode are all 0s. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Paul Berry <stereotype441@gmail.com>	2013-11-12 15:05:07 -08:00
Eric Anholt	bc0e3bb4d0	i965/fs: Prefer things we know reduce reg pressure when pre-scheduling. Previously, the best thing we had was to schedule the things unblocked by the last chosen instruction, on the hope that it would be consuming two values at the end of their live intervals while only producing one new value. But that's just a guess, and we can do counting of usage of registers to know when an instruction would (almost surely) reduce register pressure. The only failure mode I know of in this new dominant heuristic is that inside of a loop when scheduling the iterator (for example), choosing the last use of the iterator doesn't actually reduce the live interval of the iterator. But it doesn't seem to matter in shader-db: total instructions in shared programs: 1618700 -> 1618700 (0.00%) instructions in affected programs: 0 -> 0 GAINED: 13 LOST: 0 Note: The new functions are made virtual because I expect we'll soon lift the pre-regalloc scheduling heuristic over to the vec4 backend. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-12 15:04:32 -08:00
Eric Anholt	9b3e1592c2	i965: Fix undefined value usage in ABO setup. Fixes a compiler warning. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-12 15:04:28 -08:00
Eric Anholt	8bd45a7e7e	i965: Add a warning if something ever hits a bug I noticed. We'd have to map the VBO and rewrite things to a lower stride to fix it. Reviewed-by: Matt Turner <mattst88@gmail.com>	2013-11-12 15:04:25 -08:00
Ben Skeggs	c944bde5be	nvc0: release 3d bufctx after drawing Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2013-11-13 08:09:29 +10:00
Francisco Jerez	99d447cc5d	clover: Fix the const variant of adaptor_range::end to deal with mismatching range sizes. Fixes infinite loop in find_grid_optimal_factor() in cases where the user specifies a grid size with less dimensions than the device supports. Reported-by: Tom Stellard <thomas.stellard@amd.com> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-12 11:52:47 -08:00
Roland Scheidegger	50f19e3a66	draw,llvmpipe: use exponent manipulation instead of exp2 for polygon offset Since we explicitly require a integer input we should avoid using exp2 math (even if we were using optimized versions), which turns the exp2 into a int sub (plus some casts). v2: fix bogus uint (needs to be int) math spotted by Matthew, fix comments Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-12 19:08:58 +00:00
Cyril Brulebois	2d77e4f922	gallium: fix build on GNU/Hurd due to missing PIPE_OS_HURD detection Thanks to Pino Toscano. Patch from Debian package. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-12 11:57:21 -07:00
Petr Sebor	f2b844f59d	meta: enable vertex attributes in the context of the newly created array object Otherwise, the function would enable generic vertex attributes 0 and 1 of the array object it does not own. This was causing crashes in Euro Truck Simulator 2, since the incorrectly enabled generic attribute 0 in the foreign context got precedence before vertex position attribute at later time, leading to NULL pointer dereference. Cc: "9.2" <mesa-stable@lists.freedesktop.org> Cc: "10.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Petr Sebor <petr@scssoft.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-12 11:56:30 -07:00
Brian Paul	76317355bd	mesa: 80-column wrapping, remove trailing whitespace in arrayobj.c	2013-11-12 11:05:25 -07:00
Brian Paul	c8f3722129	mesa: add comment for struct gl_vertex_buffer_binding	2013-11-12 11:05:25 -07:00
Brian Paul	ce193d4f01	mesa: call update_array_format() after error checking We try to do all error checking before changing any GL state. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Jordan Justen <jordan.l.justen@intel.com>	2013-11-12 11:05:19 -07:00
Brian Paul	5f22f3207e	mesa: use _mesa_is_bufferobj() helper in _mesa_vertex_attrib_address() And use a regular if statment to slightly improve readability. Jordan Justen <jordan.l.justen@intel.com>	2013-11-12 11:05:14 -07:00
Brian Paul	e032abcb27	mesa: add const qualifiers to vertex array helper functions Jordan Justen <jordan.l.justen@intel.com>	2013-11-12 11:05:04 -07:00
Ilia Mirkin	08122e151a	nouveau/video: mark bitstream-level acceleration as unsupported Adding a vl_mpeg-based helper didn't seem to work, as it produced data that the card couldn't handle. (And I didn't investigate further.) This makes the decoding functionality only accessible via XvMC and avoids crashes when attempting to use VDPAU. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-12 10:11:41 +01:00
Ilia Mirkin	e8d5d3409c	nouveau/video: don't try on nv3x It doesn't work, I don't know why, but no point in hanging people's displays until it gets figured out. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-12 10:10:54 +01:00
Tom Stellard	594fa4a208	egl-static: Only export necessary symbols v3 This fixes a crash in glamor when mesa links against static LLVM. v2: - Inline LINKER_SCRIPT variable v3: Kai Wasserbäch - Fix out out-of-tree-builds Tested-by: Kai Wasserbäch <kai@dev.carbon-project.or>	2013-11-11 17:21:35 -05:00
Tom Stellard	cb080a10b6	configure.ac: Don't require shared LLVM when building OpenCL This works now that pipe_*.so is no longer exporting LLVM symbols. Tested-by: Kai Wasserbäch <kai@dev.carbon-project.or>	2013-11-11 17:21:35 -05:00
Tom Stellard	6d6c749215	pipe-loader: Only export necessary symbols v3 This makes it possible to use clover with statically linked LLVM. v2: - Inline LINKER_SCRIPT variable v3: Kai Wasserbäch - Fix out out-of-tree-builds Tested-by: Kai Wasserbäch <kai@dev.carbon-project.or>	2013-11-11 17:21:34 -05:00
Tom Stellard	a859131003	radeonsi/compute: Add Sea Islands support	2013-11-11 17:21:34 -05:00
Vincent Lejeune	88c8f19729	r600/llvm: Store inputs in function arguments	2013-11-11 23:14:42 +01:00
Rico Schüller	23afe71f44	tests: Fix make check for out of tree builds. Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Rico Schüller <kgbricola@web.de>	2013-11-11 14:06:17 -08:00
Anuj Phogat	348b91b7dc	i965: Move #define's inside function as local variables X_f, Y_f, Xp_f, Yp_f variables are used just inside translate_dst_to_src().So, they can be defined just as local variables. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-11 13:35:37 -08:00
Vinson Lee	227872571a	i915, i965: Fix memory leak in intel_miptree_create_for_bo. Fixes "Resource leak" defects reported by Coverity. Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-11 13:11:07 -08:00
Brian Paul	ab2da985b6	osmesa: assorted code clean-ups	2013-11-11 08:17:46 -07:00
Brian Paul	a66a008b17	osmesa: fix broken triangle/line drawing when using float color buffer Doesn't seem to help with bug 71363 but it fixed a failure I found in my testing. Cc: "9.2" <mesa-stable@lists.freedesktop.org> Cc: "10.0" <mesa-stable@lists.freedesktop.org>	2013-11-11 08:17:24 -07:00
Brian Paul	34ce1a8502	svga: improve loops over color buffers Only loop over the actual number of color buffers supported, not PIPE_MAX_COLOR_BUFS. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-11-11 08:12:18 -07:00
Brian Paul	2182d2db28	svga: document magic number of 8 render targets per batch Grab the comments from commit message `b84b7f19df` to explain what the code is doing.	2013-11-11 08:12:18 -07:00
Brian Paul	dc21b36daf	util: set all unused cbufs to NULL in util_copy_framebuffer_state() This helps fix an issue in the svga driver, and is just safer all-around. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2013-11-11 08:12:18 -07:00
Brian Paul	944eebbdb4	glx: declare glx_screen struct to silence warning	2013-11-11 08:12:05 -07:00
Brian Paul	75982a5df4	glx: change query_renderer_integer() value param to unsigned When this function was added, the returned value was signed in some places, unsigned in others. v2: also add unsigned in the unit test, per Ian. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2013-11-11 08:10:12 -07:00
José Fonseca	6c6f4aa6fd	glx: Fix scons build. Reviewed-by: Brian Paul <brianp@vmware.com>	2013-11-11 07:30:07 +00:00
Samuel Thibault	a594cec7e3	EGL: fix build without libdrm This fixes building EGL without libdrm support. Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>	2013-11-10 22:11:42 +01:00
Chris Forbes	5442c0eae3	i965: convert brw_lower_offset_array_visitor to ir_rvalue_visitor Previously, we would bogusly replace the entire statement containing the ir_texture node with an ir_dereference_variable. Correct this to just replace the ir_texture node itself as intended. Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-10 16:57:07 +13:00
Chris Forbes	d257350949	glsl: fix missing breaks in equals(ir_texture,..) Signed-off-by: Chris Forbes <chrisf@ijw.co.nz> Cc: "10.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2013-11-10 10:20:02 +13:00
Eric Anholt	bd4596efac	i965: Make the driver compile until a proper libdrm can be released. No depending on unreleased code.	2013-11-09 13:00:53 -08:00
Armin K	f0f202e6b7	glx: conditionaly build dri3 and present loader (v3) This patch makes it possible to disable DRI3 if desired. Tested with: ./configure --disable-dri3 --with-dri-drivers=i965 \ --with-gallium-drivers= --disable-vdpau --disable-egl \ --disable-gbm --disable-xvmc Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71397 Cc: 10.0 <mesa-stable@lists.freedesktop.org>	2013-11-09 09:12:46 -08:00
Matt Turner	68349e5219	i965/fs: Don't perform CSE on inst HW_REG dests (unless it's null) Commit `b16b3c87` began performing CSE on CMP instructions with null destinations. I relaxed the restrictions a bit too much, thereby allowing CSE to be performed on instructions with, for instance, an explicit accumulator destination. This broke the arb_gpu_shader5/fs-imulExtended shader tests because they emit MUL instructions with the accumulator as the destination. CSE would instead cause the MUL to write to a GRF, which is lower precision than the accumulator. Reviewed-by: Eric Anholt <eric@anholt.net> Cc: 10.0 <mesa-stable@lists.freedesktop.org>	2013-11-09 09:10:24 -08:00
Chad Versace	b7dfb8528f	i965: Remove some tiny dead code from intel_miptree_map_movntdqa Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Chad Versace <chad.versace@linux.intel.com>	2013-11-08 14:34:41 -08:00
Brian Paul	f41c01c688	swrast: add missing notify_reset parameter to dri_create_context() Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2013-11-08 08:57:03 -07:00
Christian König	754eb6a67d	vl: use a separate context for shader based decode v2 This makes VDPAU thread save again. v2: fix some memory leaks reported by Aaron Watry. Signed-off-by: Christian König <christian.koenig@amd.com>	2013-11-08 14:50:27 +01:00
José Fonseca	cb3c57df3a	scons: Add dri2_query_renderer.c to sources.	2013-11-08 12:22:22 +00:00
José Fonseca	caf1d96862	st/dri: Fix dri_create_context declaration prototype.	2013-11-08 12:20:00 +00:00
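
Commit 50f19e3a66 in the list above replaces an exp2 call with exponent manipulation because its input is known to be an integer. The following is a minimal Python sketch of that general idea, not the draw/llvmpipe code itself: for an integer n in the normal float32 range, 2**n can be assembled by writing the biased exponent straight into the bit pattern. The name `pow2_from_exponent` is hypothetical and chosen only for illustration.

```python
import struct


def pow2_from_exponent(n):
    """Return 2.0**n by building the IEEE 754 float32 bit pattern directly.

    Hypothetical helper for illustration; valid only for normal
    float32 exponents, i.e. -126 <= n <= 127.
    """
    if not -126 <= n <= 127:
        raise ValueError("exponent outside the normal float32 range")
    # float32 layout: 1 sign bit, 8 exponent bits (bias 127), 23 mantissa bits.
    # A power of two has a zero mantissa, so only the exponent field is set.
    bits = (n + 127) << 23
    # Reinterpret the 32-bit integer as a float (the "cast" step).
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

The only arithmetic is an integer add and a shift, which is why this approach can be cheaper than a general exp2 when the input is already an integer.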

3563 changed files with 364477 additions and 200429 deletions

									
										3

.dir-locals.el
									
												
				@@ -1,4 +1,4 @@

				((nil

				((prog-mode

				  (indent-tabs-mode . nil)

				  (tab-width . 8)

				  (c-basic-offset . 3)

				@@ -8,4 +8,5 @@

					    (c-set-offset 'innamespace '0)

					    (c-set-offset 'inline-open '0)))

				  )

				 (makefile-mode (indent-tabs-mode . t))

				 )

1

.gitignore vendored


@@ -18,6 +18,7 @@
 *.tar
 *.tar.bz2
 *.tar.gz
 *.tar.xz
 *.trs
 *.zip
 *~

									
										7

Android.mk
									
												
				@@ -24,7 +24,7 @@

				# BOARD_GPU_DRIVERS should be defined.  The valid values are

				#

				#   classic drivers: i915 i965

				#   gallium drivers: swrast i915g ilo nouveau r300g r600g radeonsi vmwgfx

				#   gallium drivers: swrast freedreno i915g ilo nouveau r300g r600g radeonsi vmwgfx

				#

				# The main target is libGLES_mesa.  For each classic driver enabled, a DRI

				# module will also be built.  DRI modules will be loaded by libGLES_mesa.

				@@ -38,11 +38,10 @@ MESA_ANDROID_VERSION := $(MESA_ANDROID_MAJOR_VERSION).$(MESA_ANDROID_MINOR_VERSI

				MESA_COMMON_MK := $(MESA_TOP)/Android.common.mk

				MESA_PYTHON2 := python

				DRM_TOP := external/drm

				DRM_GRALLOC_TOP := hardware/drm_gralloc

				classic_drivers := i915 i965

				gallium_drivers := swrast i915g ilo nouveau r300g r600g radeonsi vmwgfx

				gallium_drivers := swrast freedreno i915g ilo nouveau r300g r600g radeonsi vmwgfx

				MESA_GPU_DRIVERS := $(strip $(BOARD_GPU_DRIVERS))

				@@ -78,9 +77,11 @@ endif

				ifneq ($(strip $(MESA_GPU_DRIVERS)),)

				SUBDIRS := \

					src/loader \

					src/mapi \

					src/glsl \

					src/mesa \

					src/util \

					src/egl/main

				ifeq ($(strip $(MESA_BUILD_CLASSIC)),true)

									
										7

CleanSpec.mk
									
										Normal file
									
												
				@@ -0,0 +1,7 @@

				$(call add-clean-step, rm -rf $(PRODUCT_OUT)/obj/STATIC_LIBRARIES/libmesa_*_intermediates)

				$(call add-clean-step, rm -rf $(PRODUCT_OUT)/obj/SHARED_LIBRARIES/i9*5_dri_intermediates)

				$(call add-clean-step, rm -rf $(PRODUCT_OUT)/obj/SHARED_LIBRARIES/libglapi_intermediates)

				$(call add-clean-step, rm -rf $(PRODUCT_OUT)/obj/SHARED_LIBRARIES/libGLES_mesa_intermediates)

				$(call add-clean-step, rm -rf $(OUT_DIR)/host/$(HOST_OS)-$(HOST_ARCH)/obj/EXECUTABLES/mesa_*_intermediates)

				$(call add-clean-step, rm -rf $(OUT_DIR)/host/$(HOST_OS)-$(HOST_ARCH)/obj/EXECUTABLES/glsl_compiler_intermediates)

				$(call add-clean-step, rm -rf $(OUT_DIR)/host/$(HOST_OS)-$(HOST_ARCH)/obj/STATIC_LIBRARIES/libmesa_glsl_utils_intermediates)

									
										113

Makefile.am
									
												
				@@ -21,86 +21,41 @@

				SUBDIRS = src

				AM_DISTCHECK_CONFIGURE_FLAGS = \

					--enable-dri3 \

					--enable-gallium-tests \

					--enable-gbm \

					--enable-gles1 \

					--enable-gles2 \

					--enable-glx-tls \

					--enable-va \

					--enable-vdpau \

					--enable-xa \

					--enable-xvmc \

					--with-egl-platforms=x11,wayland,drm

				ACLOCAL_AMFLAGS = -I m4

				doxygen:

					cd doxygen && $(MAKE)

				EXTRA_DIST = \

					autogen.sh \

					common.py \

					docs \

					doxygen \

					scons \

					SConstruct

				.PHONY: doxygen

				noinst_HEADERS = \

					include/c99_alloca.h \

					include/c99_compat.h \

					include/c99_math.h \

					include/c99 \

					include/c11 \

					include/D3D9 \

					include/HaikuGL \

					include/no_extern_c.h \

					include/pci_ids

				# Rules for making release tarballs

				PACKAGE_DIR = Mesa-$(PACKAGE_VERSION)

				PACKAGE_NAME = MesaLib-$(PACKAGE_VERSION)

				EXTRA_FILES = \

					aclocal.m4					\

					configure					\

					VERSION					\

					bin/ar-lib					\

					bin/compile					\

					bin/config.sub					\

					bin/config.guess				\

					bin/depcomp					\

					bin/install-sh					\

					bin/ltmain.sh					\

					bin/missing					\

					bin/ylwrap					\

					src/glsl/glsl_parser.cpp			\

					src/glsl/glsl_parser.h				\

					src/glsl/glsl_lexer.cpp				\

					src/glsl/glcpp/glcpp-lex.c			\

					src/glsl/glcpp/glcpp-parse.c			\

					src/glsl/glcpp/glcpp-parse.h			\

					src/mesa/program/lex.yy.c			\

					src/mesa/program/program_parse.tab.c		\

					src/mesa/program/program_parse.tab.h		\

					`git ls-files | grep "Makefile.am" | sed -e "s/Makefile.am/Makefile.in/"`

				IGNORE_FILES = \

					-x autogen.sh

				parsers: configure

					$(MAKE) -C src/glsl glsl_parser.cpp glsl_parser.h glsl_lexer.cpp glcpp/glcpp-lex.c glcpp/glcpp-parse.c glcpp/glcpp-parse.h

					$(MAKE) -C src/mesa program/lex.yy.c program/program_parse.tab.c program/program_parse.tab.h

				# Everything for new a Mesa release:

				ARCHIVES = $(PACKAGE_NAME).tar.gz \

					$(PACKAGE_NAME).tar.bz2 \

					$(PACKAGE_NAME).zip

				tarballs: md5

					rm -f ../$(PACKAGE_DIR) $(PACKAGE_NAME).tar

				manifest.txt: .git

					( \

						ls -1 $(EXTRA_FILES) ; \

						git ls-files $(IGNORE_FILES) \

					) | sed -e '/^\(.*\/\)\?\./d' -e "s@^@$(PACKAGE_DIR)/@" > $@

				../$(PACKAGE_DIR):

					ln -s $(PWD) $@

				$(PACKAGE_NAME).tar: parsers ../$(PACKAGE_DIR) manifest.txt

					cd .. ; tar -cf $(PACKAGE_DIR)/$(PACKAGE_NAME).tar -T $(PACKAGE_DIR)/manifest.txt

				$(PACKAGE_NAME).tar.gz: $(PACKAGE_NAME).tar ../$(PACKAGE_DIR)

					gzip --stdout --best $(PACKAGE_NAME).tar > $(PACKAGE_NAME).tar.gz

				$(PACKAGE_NAME).tar.bz2: $(PACKAGE_NAME).tar

					bzip2 --stdout --best $(PACKAGE_NAME).tar > $(PACKAGE_NAME).tar.bz2

				$(PACKAGE_NAME).zip: parsers ../$(PACKAGE_DIR) manifest.txt

					rm -f $(PACKAGE_NAME).zip ; \

					cd .. ; \

					zip -q -@ $(PACKAGE_NAME).zip < $(PACKAGE_DIR)/manifest.txt ; \

					mv $(PACKAGE_NAME).zip $(PACKAGE_DIR)

				md5: $(ARCHIVES)

					@-md5sum $(PACKAGE_NAME).tar.gz

					@-md5sum $(PACKAGE_NAME).tar.bz2

					@-md5sum $(PACKAGE_NAME).zip

				.PHONY: tarballs md5

				# We list some directories in EXTRA_DIST, but don't actually want to include

				# the .gitignore files in the tarball.

				dist-hook:

					find $(distdir) -name .gitignore -exec $(RM) {} +
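
The `manifest.txt` rule shown in this diff pipes the file list through `sed` to drop some entries and prefix the rest with the package directory. A rough Python equivalent of that filter is sketched below, as a reading aid rather than part of the build. The helper name `build_manifest` is hypothetical, and any package directory passed to it is just an example value.

```python
import re

# Same pattern as the sed expression '/^\(.*\/\)\?\./d': a path is dropped
# when any of its components (file or directory) starts with a dot.
_DOT_COMPONENT = re.compile(r"^(.*/)?\.")


def build_manifest(paths, package_dir):
    """Drop dot-prefixed entries and prefix the rest with package_dir/."""
    return [
        "%s/%s" % (package_dir, path)
        for path in paths
        if not _DOT_COMPONENT.match(path)
    ]
```

For instance, given `.gitignore`, `src/a.c`, and `src/.gitignore`, only `src/a.c` survives the filter and is emitted with the package directory in front.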

									
										13

SConstruct
									
												
				@@ -59,13 +59,6 @@ else:

				Help(opts.GenerateHelpText(env))

				# fail early for a common error on windows

				if env['gles']:

				    try:

				        import libxml2

				    except ImportError:

				        raise SCons.Errors.UserError, "GLES requires libxml2-python to build"

				#######################################################################

				# Environment setup

				@@ -87,9 +80,6 @@ env.Append(CPPPATH = [

					'#/src/gallium/winsys',

				])

				if env['msvc']:

				    env.Append(CPPPATH = ['#include/c99'])

				# for debugging

				#print env.Dump()

				@@ -122,9 +112,6 @@ if env['crosscompile'] and not env['embedded']:

				    host_env['hostonly'] = True

				    assert host_env['crosscompile'] == False

				    if host_env['msvc']:

				        host_env.Append(CPPPATH = ['#include/c99'])

				    target_env = env

				    env = host_env

				    Export('env')

2

VERSION


@@ -1 +1 @@
 .0.0-devel
 .6.0-devel

									
										4

autogen.sh
									
												
				@@ -6,8 +6,8 @@ test -z "$srcdir" && srcdir=.

				ORIGDIR=`pwd`

				cd "$srcdir"

				autoreconf -v --install || exit 1

				cd $ORIGDIR || exit $?

				autoreconf --force --verbose --install || exit 1

				cd "$ORIGDIR" || exit $?

				if test -z "$NOCONFIGURE"; then

				    "$srcdir"/configure "$@"

									
										96

common.py
									
												
				@@ -26,28 +26,28 @@ else:

				    target_platform = host_platform

				_machine_map = {

					'x86': 'x86',

					'i386': 'x86',

					'i486': 'x86',

					'i586': 'x86',

					'i686': 'x86',

					'BePC': 'x86',

					'Intel': 'x86',

					'ppc' : 'ppc',

					'BeBox': 'ppc',

					'BeMac': 'ppc',

					'AMD64': 'x86_64',

					'x86_64': 'x86_64',

					'sparc': 'sparc',

					'sun4u': 'sparc',

				    'x86': 'x86',

				    'i386': 'x86',

				    'i486': 'x86',

				    'i586': 'x86',

				    'i686': 'x86',

				    'BePC': 'x86',

				    'Intel': 'x86',

				    'ppc': 'ppc',

				    'BeBox': 'ppc',

				    'BeMac': 'ppc',

				    'AMD64': 'x86_64',

				    'x86_64': 'x86_64',

				    'sparc': 'sparc',

				    'sun4u': 'sparc',

				}

				# find host_machine value

				if 'PROCESSOR_ARCHITECTURE' in os.environ:

					host_machine = os.environ['PROCESSOR_ARCHITECTURE']

				    host_machine = os.environ['PROCESSOR_ARCHITECTURE']

				else:

					host_machine = _platform.machine()

				    host_machine = _platform.machine()

				host_machine = _machine_map.get(host_machine, 'generic')

				default_machine = host_machine

				@@ -65,7 +65,8 @@ else:

				    default_llvm = 'no'

				    try:

				        if target_platform != 'windows' and \

				           subprocess.call(['llvm-config', '--version'], stdout=subprocess.PIPE) == 0:

				           subprocess.call(['llvm-config', '--version'],

				                           stdout=subprocess.PIPE) == 0:

				            default_llvm = 'yes'

				    except:

				        pass

				@@ -75,29 +76,38 @@ else:

				# Common options

				def AddOptions(opts):

					try:

						from SCons.Variables.BoolVariable import BoolVariable as BoolOption

					except ImportError:

						from SCons.Options.BoolOption import BoolOption

					try:

						from SCons.Variables.EnumVariable import EnumVariable as EnumOption

					except ImportError:

						from SCons.Options.EnumOption import EnumOption

					opts.Add(EnumOption('build', 'build type', 'debug',

					                  allowed_values=('debug', 'checked', 'profile', 'release')))

					opts.Add(BoolOption('verbose', 'verbose output', 'no'))

					opts.Add(EnumOption('machine', 'use machine-specific assembly code', default_machine,

															 allowed_values=('generic', 'ppc', 'x86', 'x86_64')))

					opts.Add(EnumOption('platform', 'target platform', host_platform,

															 allowed_values=('cygwin', 'darwin', 'freebsd', 'haiku', 'linux', 'sunos', 'windows')))

					opts.Add(BoolOption('embedded', 'embedded build', 'no'))

					opts.Add('toolchain', 'compiler toolchain', default_toolchain)

					opts.Add(BoolOption('gles', 'EXPERIMENTAL: enable OpenGL ES support', 'no'))

					opts.Add(BoolOption('llvm', 'use LLVM', default_llvm))

					opts.Add(BoolOption('openmp', 'EXPERIMENTAL: compile with openmp (swrast)', 'no'))

					opts.Add(BoolOption('debug', 'DEPRECATED: debug build', 'yes'))

					opts.Add(BoolOption('profile', 'DEPRECATED: profile build', 'no'))

					opts.Add(BoolOption('quiet', 'DEPRECATED: profile build', 'yes'))

					opts.Add(BoolOption('texture_float', 'enable floating-point textures and renderbuffers', 'no'))

					if host_platform == 'windows':

						opts.Add(EnumOption('MSVC_VERSION', 'MS Visual C++ version', None, allowed_values=('7.1', '8.0', '9.0', '10.0', '11.0')))

				    try:

				        from SCons.Variables.BoolVariable import BoolVariable as BoolOption

				    except ImportError:

				        from SCons.Options.BoolOption import BoolOption

				    try:

				        from SCons.Variables.EnumVariable import EnumVariable as EnumOption

				    except ImportError:

				        from SCons.Options.EnumOption import EnumOption

				    opts.Add(EnumOption('build', 'build type', 'debug',

				                        allowed_values=('debug', 'checked', 'profile',

				                                        'release')))

				    opts.Add(BoolOption('verbose', 'verbose output', 'no'))

				    opts.Add(EnumOption('machine', 'use machine-specific assembly code',

				                        default_machine,

				                        allowed_values=('generic', 'ppc', 'x86', 'x86_64')))

				    opts.Add(EnumOption('platform', 'target platform', host_platform,

				                        allowed_values=('cygwin', 'darwin', 'freebsd', 'haiku',

				                                        'linux', 'sunos', 'windows')))

				    opts.Add(BoolOption('embedded', 'embedded build', 'no'))

				    opts.Add(BoolOption('analyze',

				                        'enable static code analysis where available', 'no'))

				    opts.Add('toolchain', 'compiler toolchain', default_toolchain)

				    opts.Add(BoolOption('gles', 'EXPERIMENTAL: enable OpenGL ES support',

				                        'no'))

				    opts.Add(BoolOption('llvm', 'use LLVM', default_llvm))

				    opts.Add(BoolOption('openmp', 'EXPERIMENTAL: compile with openmp (swrast)',

				                        'no'))

				    opts.Add(BoolOption('debug', 'DEPRECATED: debug build', 'yes'))

				    opts.Add(BoolOption('profile', 'DEPRECATED: profile build', 'no'))

				    opts.Add(BoolOption('quiet', 'DEPRECATED: profile build', 'yes'))

				    opts.Add(BoolOption('texture_float',

				                        'enable floating-point textures and renderbuffers',

				                        'no'))

				    if host_platform == 'windows':

				        opts.Add('MSVC_VERSION', 'Microsoft Visual C/C++ version')
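
The host-machine detection in this file normalizes a raw architecture string through `_machine_map` and falls back to `'generic'` for anything unrecognized. A small self-contained sketch of that lookup pattern follows. It uses only a subset of the table, and the names `MACHINE_MAP` and `normalize_machine` are hypothetical rather than taken from common.py.

```python
import platform

# Subset of the _machine_map table from the diff above.
MACHINE_MAP = {
    'i386': 'x86',
    'i686': 'x86',
    'AMD64': 'x86_64',
    'x86_64': 'x86_64',
    'ppc': 'ppc',
    'sun4u': 'sparc',
}


def normalize_machine(raw=None):
    """Map a raw machine string to a canonical name, defaulting to 'generic'."""
    if raw is None:
        # Fall back to what the running interpreter reports.
        raw = platform.machine()
    return MACHINE_MAP.get(raw, 'generic')
```

Using `dict.get` with a default keeps unknown architectures from raising `KeyError`, so they are built without machine-specific assembly instead.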

1721

configure.ac


File diff suppressed because it is too large

318

docs/GL3.txt


@@ -18,164 +18,216 @@ are exposed in the 3.0 context as extensions.
 Feature                                               Status
 ----------------------------------------------------- ------------------------
 GL 3.0:
 GL 3.0, GLSL 1.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
 GLSL 1.30                                             DONE (i965, r600, radeonsi)
 glBindFragDataLocation, glGetFragDataLocation         DONE
 Conditional rendering (GL_NV_conditional_render)      DONE (i965, r300, r600, radeonsi, swrast)
 Map buffer subranges (GL_ARB_map_buffer_range)        DONE (i965, r300, r600, radeonsi, swrast)
 Clamping controls (GL_ARB_color_buffer_float)         DONE (i965, r300, r600, radeonsi)
 Float textures, renderbuffers (GL_ARB_texture_float)  DONE (i965, r300, r600, radeonsi)
 GL_EXT_packed_float                                   DONE (i965, r600, radeonsi)
 GL_EXT_texture_shared_exponent                        DONE (i965, r600, radeonsi, swrast)
 Float depth buffers (GL_ARB_depth_buffer_float)       DONE (i965, r600, radeonsi)
 Framebuffer objects (GL_ARB_framebuffer_object)       DONE (i965, r300, r600, radeonsi, swrast)
 Half-float                                            DONE (i965, r300, r600, radeonsi, swrast)
 Non-normalized Integer texture/framebuffer formats    DONE (i965, r600, radeonsi)
 D/2D Texture arrays                                  DONE (i965, r600, radeonsi)
 Per-buffer blend and masks (GL_EXT_draw_buffers2)     DONE (i965, r600, radeonsi, swrast)
 GL_EXT_texture_compression_rgtc                       DONE (i965, r300, r600, radeonsi, swrast)
 Red and red/green texture formats                     DONE (i965, r300, r600, radeonsi, swrast)
 Transform feedback (GL_EXT_transform_feedback)        DONE (i965, r600, radeonsi)
 Vertex array objects (GL_APPLE_vertex_array_object)   DONE (all drivers)
 sRGB framebuffer format (GL_EXT_framebuffer_sRGB)     DONE (i965, r600, radeonsi)
 glClearBuffer commands                                DONE
 glGetStringi command                                  DONE
 glTexParameterI, glGetTexParameterI commands          DONE
 glVertexAttribI commands                              DONE
 Depth format cube textures                            DONE (i965, r600, radeonsi)
 GLX_ARB_create_context (GLX 1.4 is required)          DONE
   glBindFragDataLocation, glGetFragDataLocation         DONE
   Conditional rendering (GL_NV_conditional_render)      DONE ()
   Map buffer subranges (GL_ARB_map_buffer_range)        DONE ()
   Clamping controls (GL_ARB_color_buffer_float)         DONE ()
   Float textures, renderbuffers (GL_ARB_texture_float)  DONE ()
   GL_EXT_packed_float                                   DONE ()
   GL_EXT_texture_shared_exponent                        DONE ()
   Float depth buffers (GL_ARB_depth_buffer_float)       DONE ()
   Framebuffer objects (GL_ARB_framebuffer_object)       DONE ()
   GL_ARB_half_float_pixel                               DONE (all drivers)
   GL_ARB_half_float_vertex                              DONE ()
   GL_EXT_texture_integer                                DONE ()
   GL_EXT_texture_array                                  DONE ()
   Per-buffer blend and masks (GL_EXT_draw_buffers2)     DONE ()
   GL_EXT_texture_compression_rgtc                       DONE ()
   GL_ARB_texture_rg                                     DONE ()
   Transform feedback (GL_EXT_transform_feedback)        DONE ()
   Vertex array objects (GL_ARB_vertex_array_object)     DONE ()
   sRGB framebuffer format (GL_EXT_framebuffer_sRGB)     DONE ()
   glClearBuffer commands                                DONE
   glGetStringi command                                  DONE
   glTexParameterI, glGetTexParameterI commands          DONE
   glVertexAttribI commands                              DONE
   Depth format cube textures                            DONE ()
   GLX_ARB_create_context (GLX 1.4 is required)          DONE
   Multisample anti-aliasing                             DONE (llvmpipe (*), softpipe (*))
 (*) llvmpipe and softpipe have fake Multisample anti-aliasing support
 GL 3.1:
 GL 3.1, GLSL 1.40 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
 GLSL 1.40                                             DONE (i965, r600, radeonsi)
 Forward compatible context support/deprecations       DONE (i965, r600, radeonsi)
 Instanced drawing (GL_ARB_draw_instanced)             DONE (i965, r600, radeonsi, swrast)
 Buffer copying (GL_ARB_copy_buffer)                   DONE (i965, r300, r600, radeonsi, swrast)
 Primitive restart (GL_NV_primitive_restart)           DONE (i965, r300, r600, radeonsi)
 vertex texture image units                         DONE (i965, r600, radeonsi)
 Texture buffer objs (GL_ARB_texture_buffer_object)    DONE for OpenGL 3.1 contexts (i965, r600, radeonsi)
 Rectangular textures (GL_ARB_texture_rectangle)       DONE (i965, r300, r600, radeonsi, swrast)
 Uniform buffer objs (GL_ARB_uniform_buffer_object)    DONE (i965, r600, radeonsi, swrast)
 Signed normalized textures (GL_EXT_texture_snorm)     DONE (i965, r300, r600, radeonsi)
   Forward compatible context support/deprecations       DONE ()
   Instanced drawing (GL_ARB_draw_instanced)             DONE ()
   Buffer copying (GL_ARB_copy_buffer)                   DONE ()
   Primitive restart (GL_NV_primitive_restart)           DONE ()
 vertex texture image units                         DONE ()
   Texture buffer objs (GL_ARB_texture_buffer_object)    DONE for OpenGL 3.1 contexts ()
   Rectangular textures (GL_ARB_texture_rectangle)       DONE ()
   Uniform buffer objs (GL_ARB_uniform_buffer_object)    DONE ()
   Signed normalized textures (GL_EXT_texture_snorm)     DONE ()
 GL 3.2:
 GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
 Core/compatibility profiles                           DONE
 GLSL 1.50                                             DONE (i965)
 Geometry shaders                                      DONE (i965)
 BGRA vertex order (GL_ARB_vertex_array_bgra)          DONE (i965, r300, r600, radeonsi, swrast)
 Base vertex offset(GL_ARB_draw_elements_base_vertex)  DONE (i965, r300, r600, radeonsi, swrast)
 Frag shader coord (GL_ARB_fragment_coord_conventions) DONE (i965, r300, r600, radeonsi, swrast)
 Provoking vertex (GL_ARB_provoking_vertex)            DONE (i965, r300, r600, radeonsi, swrast)
 Seamless cubemaps (GL_ARB_seamless_cube_map)          DONE (i965, r600, radeonsi)
 Multisample textures (GL_ARB_texture_multisample)     DONE (i965, r600, radeonsi)
 Frag depth clamp (GL_ARB_depth_clamp)                 DONE (i965, r600, swrast, radeonsi)
 Fence objects (GL_ARB_sync)                           DONE (i965, r300, r600, radeonsi, swrast)
 GLX_ARB_create_context_profile                        DONE
   Core/compatibility profiles                           DONE
   Geometry shaders                                      DONE ()
   BGRA vertex order (GL_ARB_vertex_array_bgra)          DONE ()
   Base vertex offset(GL_ARB_draw_elements_base_vertex)  DONE ()
   Frag shader coord (GL_ARB_fragment_coord_conventions) DONE ()
   Provoking vertex (GL_ARB_provoking_vertex)            DONE ()
   Seamless cubemaps (GL_ARB_seamless_cube_map)          DONE ()
   Multisample textures (GL_ARB_texture_multisample)     DONE ()
   Frag depth clamp (GL_ARB_depth_clamp)                 DONE ()
   Fence objects (GL_ARB_sync)                           DONE ()
   GLX_ARB_create_context_profile                        DONE
 GL 3.3:
 GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
 GLSL 3.30                                             DONE (i965)
 GL_ARB_blend_func_extended                            DONE (i965, r600, radeonsi, softpipe)
 GL_ARB_explicit_attrib_location                       DONE (i915, i965, r300, r600, radeonsi, swrast)
 GL_ARB_occlusion_query2                               DONE (i965, r300, r600, radeonsi, swrast)
 GL_ARB_sampler_objects                                DONE (i965, r300, r600, radeonsi)
 GL_ARB_shader_bit_encoding                            DONE (i965, r600, radeonsi)
 GL_ARB_texture_rgb10_a2ui                             DONE (i965, r600, radeonsi)
 GL_ARB_texture_swizzle                                DONE (i965, r300, r600, radeonsi, swrast)
 GL_ARB_timer_query                                    DONE (i965, r600, radeonsi)
 GL_ARB_instanced_arrays                               DONE (i965, r300, r600, radeonsi)
 GL_ARB_vertex_type_2_10_10_10_rev                     DONE (i965, r600, radeonsi)
   GL_ARB_blend_func_extended                            DONE ()
   GL_ARB_explicit_attrib_location                       DONE (all drivers that support GLSL)
   GL_ARB_occlusion_query2                               DONE ()
   GL_ARB_sampler_objects                                DONE (all drivers)
   GL_ARB_shader_bit_encoding                            DONE ()
   GL_ARB_texture_rgb10_a2ui                             DONE ()
   GL_ARB_texture_swizzle                                DONE ()
   GL_ARB_timer_query                                    DONE ()
   GL_ARB_instanced_arrays                               DONE ()
   GL_ARB_vertex_type_2_10_10_10_rev                     DONE ()
 GL 4.0:
 GL 4.0, GLSL 4.00:
 GLSL 4.0                                             not started
 GL_ARB_texture_query_lod                             DONE (i965)
 GL_ARB_draw_buffers_blend                            DONE (i965, r600, radeonsi, softpipe)
 GL_ARB_draw_indirect                                 started (Christoph)
 GL_ARB_gpu_shader5                                   started
 GL_ARB_gpu_shader_fp64                               not started
 GL_ARB_sample_shading                                DONE (i965)
 GL_ARB_shader_subroutine                             not started
 GL_ARB_tessellation_shader                           not started
 GL_ARB_texture_buffer_object_rgb32                   DONE (i965, r600, radeonsi, softpipe)
 GL_ARB_texture_cube_map_array                        DONE (i965, r600, softpipe)
 GL_ARB_texture_gather                                DONE (i965)
 GL_ARB_transform_feedback2                           DONE (i965, r600, radeonsi)
 GL_ARB_transform_feedback3                           DONE (i965, r600, radeonsi)
   GL_ARB_draw_buffers_blend                            DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_draw_indirect                                 DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_gpu_shader5                                   DONE (i965, nvc0)
   - 'precise' qualifier                                DONE
   - Dynamically uniform sampler array indices          DONE (r600)
   - Dynamically uniform UBO array indices              DONE (r600)
   - Implicit signed -> unsigned conversions            DONE
   - Fused multiply-add                                 DONE ()
   - Packing/bitfield/conversion functions              DONE (r600, radeonsi)
   - Enhanced textureGather                             DONE (r600, radeonsi)
   - Geometry shader instancing                         DONE (r600)
   - Geometry shader multiple streams                   DONE ()
   - Enhanced per-sample shading                        DONE (r600, radeonsi)
   - Interpolation functions                            DONE (r600)
   - New overload resolution rules                      DONE
   GL_ARB_gpu_shader_fp64                               DONE (nvc0, softpipe)
   GL_ARB_sample_shading                                DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_shader_subroutine                             not started
   GL_ARB_tessellation_shader                           started (Chris, Ilia)
   GL_ARB_texture_buffer_object_rgb32                   DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_texture_cube_map_array                        DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_texture_gather                                DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe)
   GL_ARB_texture_query_lod                             DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_transform_feedback2                           DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_transform_feedback3                           DONE (i965, nv50, nvc0, r600, radeonsi)
 GL 4.1:
 GL 4.1, GLSL 4.10:
 GLSL 4.1                                             not started
 GL_ARB_ES2_compatibility                             DONE (i965, r300, r600, radeonsi)
 GL_ARB_get_program_binary                            DONE (0 binary formats)
 GL_ARB_separate_shader_objects                       some infrastructure done
 GL_ARB_shader_precision                              not started
 GL_ARB_vertex_attrib_64bit                           not started
 GL_ARB_viewport_array                                not started
   GL_ARB_ES2_compatibility                             DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_get_program_binary                            DONE (0 binary formats)
   GL_ARB_separate_shader_objects                       DONE (all drivers)
   GL_ARB_shader_precision                              started (Micah)
   GL_ARB_vertex_attrib_64bit                           started (Dave)
   GL_ARB_viewport_array                                DONE (i965, nv50, nvc0, r600, llvmpipe)
 GL 4.2:
 GL 4.2, GLSL 4.20:
 GLSL 4.2                                             not started
 GL_ARB_texture_compression_bptc                      not started
 GL_ARB_compressed_texture_pixel_storage              not started
 GL_ARB_shader_atomic_counters                        DONE (i965)
 GL_ARB_texture_storage                               DONE (all drivers)
 GL_ARB_transform_feedback_instanced                  DONE (i965, r600, radeonsi)
 GL_ARB_base_instance                                 DONE (i965, nv50, nvc0, r600, radeonsi)
 GL_ARB_shader_image_load_store                       not started
 GL_ARB_conservative_depth                            DONE (all drivers that support GLSL 1.30)
 GL_ARB_shading_language_420pack                      DONE (all drivers that support GLSL 1.30)
 GL_ARB_internalformat_query                          DONE (i965, r300, r600, radeonsi)
 GL_ARB_map_buffer_alignment                          DONE (r300, r600, radeonsi)
   GL_ARB_texture_compression_bptc                      DONE (i965, nvc0, r600, radeonsi)
   GL_ARB_compressed_texture_pixel_storage              DONE (all drivers)
   GL_ARB_shader_atomic_counters                        DONE (i965)
   GL_ARB_texture_storage                               DONE (all drivers)
   GL_ARB_transform_feedback_instanced                  DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_base_instance                                 DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_shader_image_load_store                       in progress (curro)
   GL_ARB_conservative_depth                            DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_420pack                      DONE (all drivers that support GLSL 1.30)
   GL_ARB_shading_language_packing                      DONE (all drivers)
   GL_ARB_internalformat_query                          DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_map_buffer_alignment                          DONE (all drivers)
 GL 4.3:
 GL 4.3, GLSL 4.30:
 GLSL 4.3                                             not started
 GL_ARB_arrays_of_arrays                              not started
 GL_ARB_ES3_compatibility                             DONE (i965)
 GL_ARB_clear_buffer_object                           not started
 GL_ARB_compute_shader                                not started
 GL_ARB_copy_image                                    not started
 GL_KHR_debug                                         DONE (all drivers)
 GL_ARB_explicit_uniform_location                     not started
 GL_ARB_fragment_layer_viewport                       not started
 GL_ARB_framebuffer_no_attachments                    not started
 GL_ARB_internalformat_query2                         not started
 GL_ARB_invalidate_subdata                            DONE (all drivers)
 GL_ARB_multi_draw_indirect                           not started
 GL_ARB_program_interface_query                       not started
 GL_ARB_robust_buffer_access_behavior                 not started
 GL_ARB_shader_image_size                             not started
 GL_ARB_shader_storage_buffer_object                  not started
 GL_ARB_stencil_texturing                             not started
 GL_ARB_texture_buffer_range                          DONE (nv50, nvc0, i965, r600, radeonsi)
 GL_ARB_texture_query_levels                          DONE (i965)
 GL_ARB_texture_storage_multisample                   DONE (all drivers that support GL_ARB_texture_multisample)
 GL_ARB_texture_view                                  not started
 GL_ARB_vertex_attrib_binding                         DONE (all drivers)
   GL_ARB_arrays_of_arrays                              started (Timothy)
   GL_ARB_ES3_compatibility                             DONE (all drivers that support GLSL 3.30)
   GL_ARB_clear_buffer_object                           DONE (all drivers)
   GL_ARB_compute_shader                                in progress (jljusten)
   GL_ARB_copy_image                                    DONE (i965)
   GL_KHR_debug                                         DONE (all drivers)
   GL_ARB_explicit_uniform_location                     DONE (all drivers that support GLSL)
   GL_ARB_fragment_layer_viewport                       DONE (nv50, nvc0, r600, llvmpipe)
   GL_ARB_framebuffer_no_attachments                    not started
   GL_ARB_internalformat_query2                         not started
   GL_ARB_invalidate_subdata                            DONE (all drivers)
   GL_ARB_multi_draw_indirect                           DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_program_interface_query                       not started
   GL_ARB_robust_buffer_access_behavior                 not started
   GL_ARB_shader_image_size                             not started
   GL_ARB_shader_storage_buffer_object                  not started
   GL_ARB_stencil_texturing                             DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_texture_buffer_range                          DONE (nv50, nvc0, i965, r600, radeonsi, llvmpipe)
   GL_ARB_texture_query_levels                          DONE (all drivers that support GLSL 1.30)
   GL_ARB_texture_storage_multisample                   DONE (all drivers that support GL_ARB_texture_multisample)
   GL_ARB_texture_view                                  DONE (i965, nv50, nvc0)
   GL_ARB_vertex_attrib_binding                         DONE (all drivers)
 GL 4.4:
 GL 4.4, GLSL 4.40:
 GLSL 4.4                                             not started
 GL_MAX_VERTEX_ATTRIB_STRIDE                          not started
 GL_ARB_buffer_storage                                not started
 GL_ARB_clear_texture                                 not started
 GL_ARB_enhanced_layouts                              not started
 GL_ARB_multi_bind                                    not started
 GL_ARB_query_buffer_object                           not started
 GL_ARB_texture_mirror_clamp_to_edge                  DONE (i965, nv30, nv50, nvc0, r300, r600, radeonsi, swrast)
 GL_ARB_texture_stencil8                              not started
 GL_ARB_vertex_type_10f_11f_11f_rev                   DONE (i965, r600)
   GL_MAX_VERTEX_ATTRIB_STRIDE                          DONE (all drivers)
   GL_ARB_buffer_storage                                DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_clear_texture                                 DONE (i965)
   GL_ARB_enhanced_layouts                              not started
   GL_ARB_multi_bind                                    DONE (all drivers)
   GL_ARB_query_buffer_object                           not started
   GL_ARB_texture_mirror_clamp_to_edge                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_texture_stencil8                              not started
   GL_ARB_vertex_type_10f_11f_11f_rev                   DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
 GL 4.5, GLSL 4.50:
   GL_ARB_ES3_1_compatibility                           not started
   GL_ARB_clip_control                                  DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_conditional_render_inverted                   DONE (i965, nv50, nvc0, llvmpipe, softpipe)
   GL_ARB_cull_distance                                 not started
   GL_ARB_derivative_control                            DONE (i965, nv50, nvc0, r600)
   GL_ARB_direct_state_access                           started
   - Transform Feedback object                          DONE
   - Buffer object                                      DONE
   - Framebuffer object                                 started (Laura Ekstrand)
   - Renderbuffer object                                DONE
   - Texture object                                     DONE
   - Vertex array object                                started (Fredrik Höglund)
   - Sampler object                                     DONE
   - Program Pipeline object                            DONE
   - Query object                                       DONE (will require changes when GL_ARB_query_buffer_object lands)
   GL_ARB_get_texture_sub_image                         started (Brian Paul)
   GL_ARB_shader_texture_image_samples                  not started
   GL_ARB_texture_barrier                               DONE (nv50, nvc0, r600, radeonsi)
   GL_KHR_context_flush_control                         DONE (all - but needs GLX/EXT extension to be useful)
   GL_KHR_robust_buffer_access_behavior                 not started
   GL_KHR_robustness                                    90% done (the ARB variant)
   GL_EXT_shader_integer_mix                            DONE (all drivers that support GLSL)
 These are the extensions cherry-picked to make GLES 3.1
 GLES3.1, GLSL ES 3.1
   GL_ARB_arrays_of_arrays                              started (Timothy)
   GL_ARB_compute_shader                                in progress (jljusten)
   GL_ARB_draw_indirect                                 DONE (i965, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_explicit_uniform_location                     DONE (all drivers that support GLSL)
   GL_ARB_framebuffer_no_attachments                    not started
   GL_ARB_program_interface_query                       not started
   GL_ARB_shader_atomic_counters                        DONE (i965)
   GL_ARB_shader_image_load_store                       in progress (curro)
   GL_ARB_shader_storage_buffer_object                  not started
   GL_ARB_shading_language_packing                      DONE (all drivers)
   GL_ARB_separate_shader_objects                       DONE (all drivers)
   GL_ARB_stencil_texturing                             DONE (i965/gen8+, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe)
   GL_ARB_vertex_attrib_binding                         DONE (all drivers)
   GS5 Enhanced textureGather                           DONE (i965, nvc0, r600, radeonsi)
   GS5 Packing/bitfield/conversion functions            DONE (i965, nvc0, r600, radeonsi)
   GL_EXT_shader_integer_mix                            DONE (all drivers that support GLSL)
 More info about these features and the work involved can be found at

256

docs/README.CYGWIN


@@ -1,256 +0,0 @@
                           Mesa Cygwin/X11 Information
 WARNING
 =======
 If you installed X11 (packages xorg-x11-devel and xorg-x11-bin-dlls ) with the
 latest setup.exe from Cygwin the GL (Mesa) libraries and include are already
 installed in /usr/X11R6.
 The following will explain how to "replace" them.
 Installation
 ============
 How to compile Mesa on Cygwin/X11 systems:
 . Shared libs:
     type 'make cygwin-sl'.
     When finished, the Mesa DLL will be in the Mesa-x.y/lib/ and
     Mesa-x.y/bin directories.
 . Static libs:
     type 'make cygwin-static'.
     When finished, the Mesa libraries will be in the Mesa-x.y/lib/ directory.
 Header and library files:
    After you've compiled Mesa and tried the demos I recommend the following
    procedure for "installing" Mesa.
    Copy the Mesa include/GL directory to /usr/X11R6/include:
 	cp -a include/GL /usr/X11R6/include
    Copy the Mesa library files to /usr/X11R6/lib:
  	cp -a lib/* /usr/X11R6/lib
    Copy the Mesa bin files (used by the DLL stuff) to /usr/X11R6/bin:
 	cp -a lib/cyg* /usr/X11R6/bin
 Xt/Motif widgets:
    If you want to use Mesa or OpenGL in your Xt/Motif program you can build
    the widgets found in either the widgets-mesa or widgets-sgi directories.
    The former were written for Mesa and the latter are the original SGI
    widgets.  Look in those directories for more information.
    For the Motif widgets you must have downloaded the lesstif package.
 Using the library
 =================
 Configuration options:
    The file src/mesa/main/config.h has many parameters which you can adjust
    such as maximum number of lights, clipping planes, maximum texture size,
    etc.  In particular, you may want to change DEPTH_BITS from 16 to 32
    if a 16-bit depth buffer isn't precise enough for your application.
 Shared libraries:
    If you compile shared libraries (Win32 DLLS) you may have to set an
    environment variable to specify where the Mesa libraries are located.
    Set the PATH variable to include /your-dir/Mesa-2.6/bin.
    Otherwise, when you try to run a demo it may fail with a message saying
    that one or more DLLs couldn't be found.
 Xt/Motif Widgets:
    Two versions of the Xt/Motif OpenGL drawing area widgets are included:
       widgets-sgi/	SGI's stock widgets
       widgets-mesa/	Mesa-tuned widgets
    Look in those directories for details
 Togl:
    Togl is an OpenGL/Mesa widget for Tcl/Tk.
    See http://togl.sourceforge.net for more information.
 X Display Modes:
    Mesa supports RGB(A) rendering into almost any X visual type and depth.
    The glXChooseVisual function tries its best to pick an appropriate visual
    for the given attribute list.  However, if this doesn't suit your needs
    you can force Mesa to use any X visual you want (any supported by your
    X server that is) by setting the MESA_RGB_VISUAL and MESA_CI_VISUAL
    environment variables.  When an RGB visual is requested, glXChooseVisual
    will first look if the MESA_RGB_VISUAL variable is defined.  If so, it
    will try to use the specified visual.  Similarly, when a color index
    visual is requested, glXChooseVisual will look for the MESA_CI_VISUAL
    variable.
    The format of accepted values is:  <visual-class> <depth>
    Here are some examples:
    using the C-shell:
 	% setenv MESA_RGB_VISUAL "TrueColor 8"		// 8-bit TrueColor
 	% setenv MESA_CI_VISUAL "PseudoColor 12"	// 12-bit PseudoColor
 	% setenv MESA_RGB_VISUAL "PseudoColor 8"	// 8-bit PseudoColor
    using the KornShell:
 	$ export MESA_RGB_VISUAL="TrueColor 8"
 	$ export MESA_CI_VISUAL="PseudoColor 12"
 	$ export MESA_RGB_VISUAL="PseudoColor 8"
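    The "<visual-class> <depth>" format described above can be split with a
    few lines of C.  This is only a sketch: parse_visual_spec() is a
    hypothetical helper written for illustration, not a function from the
    Mesa sources, and it does not check that the class name is one the X
    server actually offers.

```c
#include <stdio.h>

/* Hypothetical helper: split a "<visual-class> <depth>" string such as
 * "TrueColor 8" into its two parts.  class_out must hold at least 32
 * bytes.  Returns 1 when both parts are present and the depth is
 * positive, 0 otherwise. */
int parse_visual_spec(const char *value, char class_out[32], int *depth_out)
{
    /* %31s stops at whitespace and leaves room for the terminating NUL. */
    if (sscanf(value, "%31s %d", class_out, depth_out) != 2)
        return 0;
    return *depth_out > 0;
}
```

    A caller would typically pass the result of getenv("MESA_RGB_VISUAL")
    after checking that it is not NULL.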
 Double buffering:
    Mesa can use either an X Pixmap or XImage as the backbuffer when in
    double buffer mode.  Using GLX, the default is to use an XImage.  The
    MESA_BACK_BUFFER environment variable can override this.  The valid
    values for MESA_BACK_BUFFER are:  Pixmap and XImage (only the first
    letter is checked, case doesn't matter).
    A pixmap is faster when drawing simple lines and polygons while an
    XImage is faster when Mesa has to do pixel-by-pixel rendering.  If you
    need depth buffering the XImage will almost surely be faster.  Experiment
    with the MESA_BACK_BUFFER variable to see which is faster for
    your application.
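    The rule that only the first letter of MESA_BACK_BUFFER is checked,
    ignoring case, can be sketched as follows.  The enum and function names
    are assumptions made for this example and are not taken from the Mesa
    sources.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical names for illustration only. */
enum back_buffer_kind { BACK_XIMAGE, BACK_PIXMAP };

/* Choose the back buffer type from a MESA_BACK_BUFFER value.
 * Only the first character is examined, ignoring case.  Any other
 * value, an empty string, or NULL (variable not set) falls back to
 * XImage, which the text above names as the GLX default. */
enum back_buffer_kind parse_back_buffer(const char *value)
{
    if (value != NULL && tolower((unsigned char)value[0]) == 'p')
        return BACK_PIXMAP;
    return BACK_XIMAGE;
}
```

    With this sketch, "Pixmap", "pixmap" and even "p" all select the pixmap
    back buffer.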
 Colormaps:
    When using Mesa directly or with GLX, it's up to the application writer
    to create a window with an appropriate colormap.  The aux, tk, and GLUT
    toolkits try to minimize colormap "flashing" by sharing colormaps when
    possible.  Specifically, if the visual and depth of the window matches
    that of the root window, the root window's colormap will be shared by
    the Mesa window.  Otherwise, a new, private colormap will be allocated.
    When sharing the root colormap, Mesa may be unable to allocate the colors
    it needs, resulting in poor color quality.  This can happen when a
    large number of colorcells in the root colormap are already allocated.
    To prevent colormap sharing in aux, tk and GLUT, define the environment
    variable MESA_PRIVATE_CMAP.  The value isn't significant.
 Gamma correction:
    To compensate for the nonlinear relationship between pixel values
    and displayed intensities, there is a gamma correction feature in
    Mesa.  Some systems, such as Silicon Graphics, support gamma
    correction in hardware (man gamma) so you won't need to use Mesa's
    gamma facility.  Other systems, however, may need gamma adjustment
    to produce images which look correct.  If in the past you thought
    Mesa's images were too dim, read on.
    Gamma correction is controlled with the MESA_GAMMA environment
    variable.  Its value is of the form "Gr Gg Gb" or just "G" where
    Gr is the red gamma value, Gg is the green gamma value, Gb is the
    blue gamma value and G is one gamma value to use for all three
    channels.  Each value is a positive real number typically in the
    range 1.0 to 2.5.  The defaults are all 1.0, effectively disabling
    gamma correction.  Examples using csh:
 	% setenv MESA_GAMMA "2.3 2.2 2.4"	// separate R,G,B values
 	% setenv MESA_GAMMA "2.0"		// same gamma for R,G,B
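    The two accepted forms, "Gr Gg Gb" and "G", can be read with a short
    parser.  This is a minimal sketch under the assumption that values are
    separated by whitespace; parse_gamma() is a hypothetical name and is not
    Mesa's own parsing code.

```c
#include <stdio.h>

/* Hypothetical helper: parse a MESA_GAMMA-style string into red, green
 * and blue gamma values.  A single number is applied to all three
 * channels.  Returns 1 on success, 0 if the string does not contain
 * exactly one or three positive numbers at its start. */
int parse_gamma(const char *value, double rgb[3])
{
    double r, g, b;
    int n = sscanf(value, "%lf %lf %lf", &r, &g, &b);

    if (n == 1)
        g = b = r;          /* "G": same gamma for R, G and B */
    else if (n != 3)
        return 0;           /* empty, malformed, or only two values */

    if (r <= 0.0 || g <= 0.0 || b <= 0.0)
        return 0;           /* gamma must be a positive real number */

    rgb[0] = r;
    rgb[1] = g;
    rgb[2] = b;
    return 1;
}
```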
    The demos/gamma.c program may help you to determine reasonable gamma
    value for your display.  With correct gamma values, the color intensities
    displayed in the top row (drawn by dithering) should nearly match those
    in the bottom row (drawn as grays).
    Alex De Bruyn reports that gamma values of 1.6, 1.6 and 1.9 work well
    on HP displays using the HP-ColorRecovery technology.
    Mesa implements gamma correction with a lookup table which translates
    a "linear" pixel value to a gamma-corrected pixel value.  There is a
    small performance penalty.  Gamma correction only works in RGB mode.
    Also be aware that pixel values read back from the frame buffer will
    not be "un-corrected" so glReadPixels may not return the same data
    drawn with glDrawPixels.
    For more information about gamma correction see:
    http://www.inforamp.net/~poynton/notes/colour_and_gamma/GammaFAQ.html
 Overlay Planes
    Overlay planes in the frame buffer are supported by Mesa but require
    hardware and X server support.  To determine if your X server has
    overlay support you can test for the SERVER_OVERLAY_VISUALS property:
 	xprop -root | grep SERVER_OVERLAY_VISUALS
 HPCR glClear(GL_COLOR_BUFFER_BIT) dithering
    If you set the MESA_HPCR_CLEAR environment variable then dithering
    will be used when clearing the color buffer.  This is only applicable
    to HP systems with the HPCR (Color Recovery) system.
 Extensions
 ==========
    There are three Mesa-specific GLX extensions at this time.
    GLX_MESA_pixmap_colormap
       This extension adds the GLX function:
          GLXPixmap glXCreateGLXPixmapMESA( Display *dpy, XVisualInfo *visual,
                                            Pixmap pixmap, Colormap cmap )
       It is an alternative to the standard glXCreateGLXPixmap() function.
       Since Mesa supports RGB rendering into any X visual, not just
       TrueColor or DirectColor, Mesa needs colormap information to convert RGB
       values into pixel values.  An X window carries this information but a
       pixmap does not.  This function associates a colormap to a GLX pixmap.
       See the xdemos/glxpixmap.c file for an example of how to use this
       extension.
    GLX_MESA_release_buffers
       Mesa associates a set of ancillary (depth, accumulation, stencil and
       alpha) buffers with each X window it draws into.  These ancillary
       buffers are allocated for each X window the first time the X window
       is passed to glXMakeCurrent().  Mesa, however, can't detect when an
       X window has been destroyed in order to free the ancillary buffers.
       The best it can do is to check for recently destroyed windows whenever
       the client calls the glXCreateContext() or glXDestroyContext()
       functions.  This may not be sufficient in all situations though.
       The GLX_MESA_release_buffers extension allows a client to explicitly
       deallocate the ancillary buffers by calling glXReleaseBuffersMESA()
       just before an X window is destroyed.  For example:
          #ifdef GLX_MESA_release_buffers
             glXReleaseBuffersMESA( dpy, window );
          #endif
          XDestroyWindow( dpy, window );
       This extension is new in Mesa 2.0.
    GLX_MESA_copy_sub_buffer
       This extension adds the glXCopySubBufferMESA() function.  It works
       like glXSwapBuffers() but only copies a sub-region of the window
       instead of the whole window.
       This extension is new in Mesa version 2.6.
 Summary of X-related environment variables:
    MESA_RGB_VISUAL - specifies the X visual and depth for RGB mode (X only)
    MESA_CI_VISUAL - specifies the X visual and depth for CI mode (X only)
    MESA_BACK_BUFFER - specifies how to implement the back color buffer (X only)
    MESA_PRIVATE_CMAP - force aux/tk libraries to use private colormaps (X only)
    MESA_GAMMA - gamma correction coefficients (X only)
 ----------------------------------------------------------------------
 README.CYGWIN - lassauge April 2004 - based on README.X11

102

docs/README.MITS


@@ -1,102 +0,0 @@
 			Mesa 3.0 MITS Information
 This software is distributed under the terms of the GNU Library
 General Public License, see the LICENSE file for details.
 This document is a preliminary introduction to help you get
 started. For more detailed information consult the web page.
 http://10-dencies.zkm.de/~mesa/
 Version 0.1 (Yes it's very alpha code so be warned!)
 Contributors:
   Emil Briggs    	(briggs@bucky.physics.ncsu.edu)
   David Bucciarelli 	(tech.hmw@plus.it)
   Andreas Schiffler 	(schiffler@zkm.de)
 . Requirements:
      Mesa 3.0.
      An SMP capable machine running Linux 2.x
      libpthread installed on your machine.
 . What does MITS stand for?
      MITS stands for Mesa Internal Threading System. By adding
      internal threading to Mesa it should be possible to improve
      performance of OpenGL applications on SMP machines.
 . Do applications have to be recoded to take advantage of MITS?
      No. The threading is internal to Mesa and transparent to
      applications.
 . Will all applications benefit from the current implementation of MITS?
      No. This implementation splits the processing of the vertex buffer
      over two threads. There is a certain amount of overhead involved
      with the thread synchronization and if there is not enough work
      to be done the extra overhead outweighs any speedup from using
      dual processors. You will not for example see any speedup when
      running Quake because it uses GL_POLYGON and there is only one
      polygon for each vertex buffer processed. Test results on a
      dual 200 Mhz. Pentium Pro system show that one needs around
 -200 vertices in the vertex buffer before there is any
      appreciable benefit from the threading.
 . Are there any parameters that I can tune to try to improve performance?
      Yes. You can try to vary the size of the vertex buffer which is
      defined by VB_MAX located in the file src/vb.h from your top level
      Mesa distribution. The number needs to be a multiple of 12 and
      the optimum value will probably depend on the capabilities of
      your machine and the particular application you are running.
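      When experimenting with different sizes, the multiple-of-12
      requirement is easy to get wrong.  The helper below is a hypothetical
      sketch (it is not part of src/vb.h) that rounds a candidate size up to
      the next multiple of 12 before it is used as VB_MAX.

```c
/* Hypothetical helper: round a candidate vertex buffer size up to the
 * next multiple of 12, as the text above requires for VB_MAX.
 * Values that are already a multiple of 12 are returned unchanged. */
unsigned round_vb_size(unsigned requested)
{
    return ((requested + 11u) / 12u) * 12u;
}
```

      For example, a requested size of 500 would become 504.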
 . Are there any ways I can modify the application to improve its
    performance with the MITS?
      Yes. Try to use as many vertices between each Begin/End pair
      as possible. This will reduce the thread synchronization
      overhead.
 . What sort of speedups can I expect?
      On some benchmarks performance gains of up to 30% have been
      observed. Others may see no gain at all and in a few rare
      cases even some degradation.
 . What still needs to be done?
      Lots of testing and benchmarking.
      A portable implementation that works within the Mesa thread API.
      Threading of additional areas of Mesa to improve performance
      even more.
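 The VB_MAX tuning advice above (the value must be a multiple of 12) can be
 checked before rebuilding. The following is only a sketch: the function name
 valid_vb_max and the candidate sizes are hypothetical and do not come from the
 MITS distribution.

```shell
# Return success if the candidate VB_MAX value is a positive multiple of 12.
# valid_vb_max is a hypothetical helper, not part of Mesa or MITS.
valid_vb_max() {
    [ "$1" -gt 0 ] && [ $(( $1 % 12 )) -eq 0 ]
}

# Try a few illustrative sizes before editing src/vb.h.
for size in 96 100 480; do
    if valid_vb_max "$size"; then
        echo "$size: ok"
    else
        echo "$size: not a multiple of 12"
    fi
done
```

 Only values reported as ok are worth benchmarking; the best one still depends
 on the machine and the application.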
 Installation:
 . This assumes that you already have a working Mesa 3.0 installation
       from source.
 . Place the tarball MITS.tar.gz in your top level Mesa directory.
 . Unzip it and untar it. It will replace the following files in
       your Mesa source tree so back them up if you want to save them.
 	 README.MITS
          Make-config
 	 Makefile
 	 mklib.glide
          src/vbxform.c
 	 src/vb.h
 . Rebuild Mesa using the command
           make linux-386-glide-mits

207

docs/README.QUAKE

@@ -1,207 +0,0 @@
              Info on using Mesa 3.0 with Linux Quake I and Quake II
 Disclaimer
 ----------
 I am _not_ a Quake expert by any means.  I pretty much only run it to
 test Mesa.  There have been a lot of questions about Linux Quake and
 Mesa so I'm trying to provide some useful info here.  If this file
 doesn't help you then you should look elsewhere for help.  The Mesa
 mailing list or the news://news.3dfx.com/3dfx.linux.glide newsgroup
 might be good.
 Again, all the information I have is in this file.  Please don't email
 me with questions.
 If you have information to contribute to this file please send it to
 me at brianp@elastic.avid.com
 Linux Quake
 -----------
 You can get Linux Quake from http://www.idsoftware.com/
 Quake I and II for Linux were tested with, and include, Mesa 2.6.  You
 shouldn't have too many problems if you simply follow the instructions
 in the Quake distribution.
 RedHat 5.0 Linux problems
 -------------------------
 RedHat Linux 5.x uses the GNU C library ("glibc" or "libc6") whereas
 previous RedHat and other Linux distributions use "libc5" for its
 runtime C library.
 Linux Quake I and II were compiled for libc5.  If you compile Mesa
 on a RedHat 5.x system the resulting libMesaGL.so file will not work
 with Linux Quake because of the different C runtime libraries.
 The symptom of this is a segmentation fault soon after starting Quake.
 If you want to use a newer version of Mesa (like 3.x) with Quake on
 RedHat 5.x then read on.
 The solution to the C library problem is to force Mesa to use libc5.
 libc5 is in /usr/i486-linux-libc5/lib on RedHat 5.x systems.
 Emil Briggs (briggs@tick.physics.ncsu.edu) nicely gave me the following
 info:
 >   I only know what works on a RedHat 5.0 distribution. RH5 includes
 > a full set of libraries for both libc5 and glibc. The loader ld.so
 > uses the libc5 libraries in /usr/i486-linux-libc5/lib for programs
 > linked against libc5 while it uses the glibc libraries in /lib and
 > /usr/lib for programs linked against glibc.
 >
 > Anyway I changed line 41 of mklib.glide to
 >     GLIDELIBS="-L/usr/local/glide/lib -lglide2x -L/usr/i486-linux-libc5/lib"
 >
 > And I started quake2 up with a script like this
 > #!/bin/csh
 > setenv LD_LIBRARY_PATH /usr/i486-linux-libc5/lib
 > setenv MESA_GLX_FX f
 > ./quake2 +set vid_ref gl
 > kbd_mode -a
 > reset
 I've already patched the mklib.glide file.  You'll have to start Quake
 with the script shown above though.
 **********************
 Daryll Strauss writes:
 Here's my thoughts on the problem. On a RH 5.x system, you can NOT build
 a libc5 executable or library. Red Hat just doesn't include the right
 stuff to do it.
 Since Quake is a libc5 based application, you are in trouble. You need
 libc5 libraries.
 What can you do about it? Well there's a package called gcc5 that does
 MOST of the right stuff to compile with libc5. (It brings back older
 header files, makes appropriate symbolic links for libraries, and sets
 up the compiler to use the correct directories) You can find gcc5 here:
 ftp://ecg.mit.edu/pub/linux/gcc5-1.0-1.i386.rpm
 No, this isn't quite enough. There are still a few tricks to getting
 Mesa to compile as a libc5 application. First you have to make sure that
 every compile uses gcc5 instead of gcc. Second, in some cases the link
 line actually lists -L/usr/lib which breaks gcc5 (because it forces you
 to use the glibc version of things)
 If you get all the stuff correctly compiled with gcc5 it should work.
 I've run Mesa 3.0B6  and its demos in a window with my Rush on a Red Hat
 .1 system. It is a big hassle, but it can be done. I've only made Quake
 segfault, but I think that's from my libRush using the wrong libc.
 Yes, mixing libc5 and glibc is a major pain. I've been working to get
 all my libraries compiling correctly with this setup. Someone should
 make an RPM out of it and feed changes back to Brian once they get it
 all working. If no one else has done so by the time I get the rest of my
 stuff straightened out, I'll try to do it myself.
 							- |Daryll
 *********************
 David Bucciarelli (tech.hmw@plus.it) writes:
 I'm using the Mesa-3.0beta7 and the RedHat 5.1 and QuakeII is
 working fine for me.  I had only to make a small change to the
 Mesa-3.0/mklib.glide file, from:
     GLIDELIBS="-L/usr/local/glide/lib -lglide2x
 -L/usr/i486-linux-libc5/lib -lm"
 to:
     GLIDELIBS="-L/usr/i486-linux-libc5/lib -lglide2x"
 and to make two symbolic links:
 [david@localhost Mesa]$ ln -s libMesaGL.so libMesaGL.so.2
 [david@localhost Mesa]$ ln -s libMesaGLU.so libMesaGLU.so.2
 I'm using the Daryll's Linux glide rpm for the Voodoo2 and glibc (it
 includes also the Glide for the libc5). I'm not using the /dev/3Dfx and
 running QuakeII as root with the following env. var:
 export
 LD_LIBRARY_PATH=/dsk1/home/david/src/gl/Mesa/lib:/usr/i486-linux-libc5/lib
 I think that all problems are related to the glibc, Quake will never
 work if you get the following output:
 [david@localhost Mesa]$ ldd lib/libMesaGL.so
         libglide2x.so => /usr/lib/libglide2x.so (0x400f8000)
         libm.so.6 => /lib/libm.so.6 (0x40244000)
         libc.so.6 => /lib/libc.so.6 (0x4025d000)
         /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x00000000)
 You must get the following outputs:
 [david@localhost Mesa]# ldd lib/libMesaGL.so
         libglide2x.so => /usr/i486-linux-libc5/lib/libglide2x.so
 (0x400f3000)
 [root@localhost quake2]# ldd quake2
         libdl.so.1 => /lib/libdl.so.1 (0x40005000)
         libm.so.5 => /usr/i486-linux-libc5/lib/libm.so.5 (0x40008000)
         libc.so.5 => /usr/i486-linux-libc5/lib/libc.so.5 (0x40010000)
 [root@localhost quake2]# ldd ref_gl.so
         libMesaGL.so.2 =>
 /dsk1/home/david/src/gl/Mesa/lib/libMesaGL.so.2 (0x400eb000)
         libglide2x.so => /usr/i486-linux-libc5/lib/libglide2x.so
 (0x401d9000)
         libX11.so.6 => /usr/i486-linux-libc5/lib/libX11.so.6
 (0x40324000)
         libXext.so.6 => /usr/i486-linux-libc5/lib/libXext.so.6
 (0x403b7000)
         libvga.so.1 => /usr/i486-linux-libc5/lib/libvga.so.1
 (0x403c1000)
         libm.so.5 => /usr/i486-linux-libc5/lib/libm.so.5 (0x403f5000)
         libc.so.5 => /usr/i486-linux-libc5/lib/libc.so.5 (0x403fd000)
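 The ldd comparison above can be turned into a quick check. This is a sketch
 only: links_glibc is a hypothetical helper name, and it assumes that a glibc
 dependency always shows up as libc.so.6 in the ldd output, as it does in the
 listings above.

```shell
# Succeeds if ldd output on stdin shows a dependency on glibc's libc.so.6.
# links_glibc is a hypothetical helper name used for illustration.
links_glibc() {
    grep -q 'libc\.so\.6'
}

# The two samples are copied from the ldd listings above.
bad='        libc.so.6 => /lib/libc.so.6 (0x4025d000)'
good='        libc.so.5 => /usr/i486-linux-libc5/lib/libc.so.5 (0x40010000)'

for sample in "$bad" "$good"; do
    if printf '%s\n' "$sample" | links_glibc; then
        echo "glibc: rebuild against libc5"
    else
        echo "libc5: ok"
    fi
done
```

 In practice the input would come from a command such as
 "ldd lib/libMesaGL.so | links_glibc".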
 ***********************
 Steve Davies (steve@one47.demon.co.uk) writes:
 Try using:
     export LD_LIBRARY_PATH=/usr/i486-linux-libc5/lib
     ./quake2 +set vid_ref gl
 to start the game... Works for me, but assumes that you have the
 compatibility libc5 RPMs installed.
 ***************************
 WWW resources - you may find additional Linux Quake help at these URLs:
 http://quake.medina.net/howto
 http://webpages.mr.net/bobz
 http://www.linuxgames.com/quake2/
 ----------------------------------------------------------------------

52

docs/README.THREADS

@@ -1,52 +0,0 @@
 Mesa Threads README
 -------------------
 Thread safety was introduced in Mesa 2.6 by John Stone and
 Christoph Poliwoda.
 It was redesigned in Mesa 3.3 so that thread safety is
 supported by default (on systems which support threads,
 that is).  There is no measurable penalty on single
 threaded applications.
 NOTE that the only _driver_ which is thread safe at this time
 is the OS/Mesa driver!
 At present the mthreads code supports three thread APIs:
 ) POSIX threads (aka pthreads).
 ) Solaris / Unix International threads.
 ) Win32 threads (Win 95/NT).
 Support for other thread libraries can be added in src/glthread.[ch]
 In order to guarantee proper operation, it is
 necessary for both Mesa and application code to use the same threads API.
 So, if your application uses Sun's thread API, then you should build Mesa
 using one of the targets for Sun threads.
 The mtdemos directory contains some example programs which use
 multiple threads to render to osmesa rendering context(s).
 Linux users should be aware that there exist many different POSIX
 threads packages. The best solution is the linuxthreads package
 (http://pauillac.inria.fr/~xleroy/linuxthreads/) as this package is the
 only one that really supports multiprocessor machines (AFAIK). See
 http://pauillac.inria.fr/~xleroy/linuxthreads/README for further
 information about the usage of linuxthreads.
 If you are interested in helping with thread safety work in Mesa
 join the Mesa developers mailing list and post your proposal.
 Regards,
   John Stone           -- j.stone@acm.org  johns@cs.umr.edu
   Christoph Poliwoda   -- poliwoda@volumegraphics.com
 Version info:
    Mesa 2.6 - initial thread support.
    Mesa 3.3 - thread support mostly rewritten (Brian Paul)

31

docs/README.UVD

@@ -11,3 +11,34 @@ INFORMATION FOR PACKAGED MEDIA IS EXPRESSLY PROHIBITED WITHOUT A LICENSE
 UNDER APPLICABLE PATENTS IN THE MPEG-2 PATENT PORTFOLIO, WHICH LICENSES IS
 AVAILABLE FROM MPEG LA, LLC, 6312 S. Fiddlers Green Circle, Suite 400E,
 Greenwood Village, Colorado 80111 U.S.A.
 WARRANTY DISCLAIMER: THE SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY
 KIND.  AMD DISCLAIMS ALL WARRANTIES, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING
 BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 PARTICULAR PURPOSE, TITLE, NON-INFRINGEMENT, THAT THE SOFTWARE WILL RUN
 UNINTERRUPTED OR ERROR-FREE OR WARRANTIES ARISING FROM CUSTOM OF TRADE OR
 COURSE OF USAGE.  THE ENTIRE RISK ASSOCIATED WITH THE USE OF THE SOFTWARE IS
 ASSUMED BY YOU.  Some jurisdictions do not allow the exclusion of implied
 warranties, so the above exclusion may not apply to You.
 LIMITATION OF LIABILITY AND INDEMNIFICATION:  AMD AND ITS LICENSORS WILL NOT,
 UNDER ANY CIRCUMSTANCES BE LIABLE FOR ANY PUNITIVE, DIRECT, INCIDENTAL,
 INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING FROM USE OF THE SOFTWARE OR
 THIS AGREEMENT EVEN IF AMD AND ITS LICENSORS HAVE BEEN ADVISED OF THE
 POSSIBILITY OF SUCH DAMAGES.  In no event shall AMD's total liability to You
 for all damages, losses, and causes of action (whether in contract, tort
 (including negligence) or otherwise) exceed the amount of $100 USD.  You agree
 to defend, indemnify and hold harmless AMD and its licensors, and any of their
 directors, officers, employees, affiliates or agents from and against any and
 all loss, damage, liability and other expenses (including reasonable
 attorneys' fees), resulting from Your use of the Software or violation of the
 terms and conditions of this Agreement.
 U.S. GOVERNMENT RESTRICTED RIGHTS: The Software is provided with "RESTRICTED
 RIGHTS." Use, duplication, or disclosure by the Government is subject to the
 restrictions as set forth in FAR 52.227-14 and DFAR252.227-7013, et seq., or
 its successor.  Use of the Software by the Government constitutes
 acknowledgement of AMD's proprietary rights in them.
 EXPORT RESTRICTIONS: The Software may be subject to export restrictions as
 stated in the Software License Agreement.

43

docs/README.VCE Normal file

@@ -0,0 +1,43 @@
 The software may implement third party technologies (e.g. third party
 libraries) that are not licensed to you by AMD and for which you may need
 to obtain licenses from other parties.  Unless explicitly stated otherwise,
 these third party technologies are not licensed hereunder.  Such third
 party technologies include, but are not limited, to H.264, MPEG-2, MPEG-4,
 AVC, and VC-1.
 For MPEG-2 Intermediate Products: ANY USE OF THIS PRODUCT IN ANY MANNER OTHER
 THAN PERSONAL USE THAT COMPLIES WITH THE MPEG-2 STANDARD IS EXPRESSLY
 PROHIBITED WITHOUT A LICENSE UNDER APPLICABLE PATENTS IN THE MPEG-2 PATENT
 PORTFOLIO, WHICH LICENSES IS AVAILABLE FROM MPEG LA, LLC, 6312 S. Fiddlers
 Green Circle, Suite 400E, Greenwood Village, Colorado 80111 U.S.A.
 WARRANTY DISCLAIMER: THE SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY
 KIND.  AMD DISCLAIMS ALL WARRANTIES, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING
 BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 PARTICULAR PURPOSE, TITLE, NON-INFRINGEMENT, THAT THE SOFTWARE WILL RUN
 UNINTERRUPTED OR ERROR-FREE OR WARRANTIES ARISING FROM CUSTOM OF TRADE OR
 COURSE OF USAGE.  THE ENTIRE RISK ASSOCIATED WITH THE USE OF THE SOFTWARE IS
 ASSUMED BY YOU.  Some jurisdictions do not allow the exclusion of implied
 warranties, so the above exclusion may not apply to You.
 LIMITATION OF LIABILITY AND INDEMNIFICATION:  AMD AND ITS LICENSORS WILL NOT,
 UNDER ANY CIRCUMSTANCES BE LIABLE FOR ANY PUNITIVE, DIRECT, INCIDENTAL,
 INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING FROM USE OF THE SOFTWARE OR
 THIS AGREEMENT EVEN IF AMD AND ITS LICENSORS HAVE BEEN ADVISED OF THE
 POSSIBILITY OF SUCH DAMAGES.  In no event shall AMD's total liability to You
 for all damages, losses, and causes of action (whether in contract, tort
 (including negligence) or otherwise) exceed the amount of $100 USD.  You agree
 to defend, indemnify and hold harmless AMD and its licensors, and any of their
 directors, officers, employees, affiliates or agents from and against any and
 all loss, damage, liability and other expenses (including reasonable
 attorneys' fees), resulting from Your use of the Software or violation of the
 terms and conditions of this Agreement.
 U.S. GOVERNMENT RESTRICTED RIGHTS: The Software is provided with "RESTRICTED
 RIGHTS." Use, duplication, or disclosure by the Government is subject to the
 restrictions as set forth in FAR 52.227-14 and DFAR252.227-7013, et seq., or
 its successor.  Use of the Software by the Government constitutes
 acknowledgement of AMD's proprietary rights in them.
 EXPORT RESTRICTIONS: The Software may be subject to export restrictions as
 stated in the Software License Agreement.

20

docs/README.WIN32

@@ -11,10 +11,6 @@ no longer shipped or supported.
 Run
   scons osmesa mesagdi
 to build classic mesa Windows GDI drivers; or
   scons libgl-gdi
 to build gallium based GDI driver.
@@ -36,17 +32,15 @@ Recipe
 Building on windows requires several open-source packages. These are
 steps that work as of this writing.
 ) install python 2.7
 ) install scons (latest)
 ) install mingw, flex, and bison
 ) install libxml2 from here: http://www.lfd.uci.edu/~gohlke/pythonlibs
   get libxml2-python-2.9.1.win-amd64-py2.7.exe
 ) install pywin32 from here: http://www.lfd.uci.edu/~gohlke/pythonlibs
 - install python 2.7
 - install scons (latest)
 - install mingw, flex, and bison
 - install pywin32 from here: http://www.lfd.uci.edu/~gohlke/pythonlibs
   get pywin32-218.4.win-amd64-py2.7.exe
 ) install git
 ) download mesa from git
 - install git
 - download mesa from git
   see http://www.mesa3d.org/repository.html
 ) run scons
 - run scons
 General
 -------

									
										54

docs/autoconf.html
									
				@@ -97,20 +97,22 @@ shared libraries in a single pass.</p>

				<dt><code>CC, CFLAGS, CXX, CXXFLAGS</code></dt>

				<dd><p>These environment variables

				control the C and C++ compilers used during the build. By default,

				<code>gcc</code> and <code>g++</code> are used with the options

				<code>"-g -O2"</code>.</p>

				<code>gcc</code> and <code>g++</code> are used and the debug/optimisation

				level is left unchanged.</p>

				</dd>

				<dt><code>LDFLAGS</code></dt>

				<dd><p>An environment variable specifying flags to

				pass when linking programs. These are normally empty, but can be used

				to direct the linker to use libraries in nonstandard directories. For

				example, <code>LDFLAGS="-L/usr/X11R6/lib"</code>.</p>

				pass when linking programs. These should be empty and

				<code>PKG_CONFIG_PATH</code> is recommended to be used instead. If needed

				it can be used to direct the linker to use libraries in nonstandard

				directories. For example, <code>LDFLAGS="-L/usr/X11R6/lib"</code>.</p>

				</dd>

				<dt><code>PKG_CONFIG_PATH</code></dt>

				<dd><p>When available, the

				<code>pkg-config</code> utility is used to search for external libraries

				<dd><p>The

				<code>pkg-config</code> utility is a hard requirement for configuring and

				building mesa. It is used to search for external libraries

				on the system. This environment variable is used to control the search

				path for <code>pkg-config</code>. For instance, setting

				<code>PKG_CONFIG_PATH=/usr/X11R6/lib/pkgconfig</code> will search for

				@@ -135,14 +137,32 @@ one of these architectures is detected. This option ensures that

				assembly will not be used.</p>

				</dd>

				<dt><code>--enable-32-bit</code></dt>

				<dt><code>--enable-64-bit</code></dt>

				<dd><p>By default, the build will compile code as directed by the environment

				variables

				<code>CC</code>, <code>CFLAGS</code>, etc. If the compiler is

				<code>gcc</code>, these options offer a helper to add the compiler flags

				to force 32- or 64-bit code generation as used on the x86 and x86_64

				architectures. Note that these options are mutually exclusive.</p>

				<dt><code>--build=</code></dt>

				<dt><code>--host=</code></dt>

				<dd><p>By default, the build will compile code for the architecture that

				it's running on. In order to cross-compile Mesa on an x86-64 machine

				that is to run on an i686, one would need to set the options to:</p>

				<p><code>--build=x86_64-pc-linux-gnu --host=i686-pc-linux-gnu</code></p>

				Note that these can vary from distribution to distribution. For more

				information check with the

				<a href="https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/autoconf-2.69/html_node/Specifying-Target-Triplets.html">

				autoconf manual</a>.

				Note that you will need to correctly set <code>PKG_CONFIG_PATH</code> as well.

				<p>In some cases a single compiler is capable of handling both architectures

				(multilib) in that case one would need to set the <code>CC,CXX</code> variables

				appending the correct machine options. Seek your compiler documentation for

				further information -

				<a href="https://gcc.gnu.org/onlinedocs/gcc/Submodel-Options.html"> gcc

				machine dependent options</a></p>

				<p>In addition to specifying correct <code>PKG_CONFIG_PATH</code> for the target

				architecture, the following should be sufficient to configure multilib Mesa</p>

				<code>./configure CC="gcc -m32" CXX="g++ -m32" --build=x86_64-pc-linux-gnu --host=i686-pc-linux-gnu ...</code>

				</dd>

				</dl>

				@@ -194,7 +214,9 @@ kernel DRM modules are not available.

				<dt><code>--enable-glx-tls</code> <dd><p>

				Enable Thread Local Storage (TLS) in

				GLX.

				<dt><code>--with-expat=DIR</code> <dd> The DRI-enabled libGL uses expat to

				<dt><code>--with-expat=DIR</code>

				<dd><p><strong>DEPRECATED</strong>, use <code>PKG_CONFIG_PATH</code> instead.</p>

				<p>The DRI-enabled libGL uses expat to

				parse the DRI configuration files in <code>/etc/drirc</code> and

				<code>~/.drirc</code>. This option allows a specific expat installation

				to be used. For example, <code>--with-expat=/usr/local</code> will

									
										2

docs/conform.html
									
				@@ -19,7 +19,7 @@

				<p>

				The SGI OpenGL conformance tests verify correct operation of OpenGL

				implementations.  I, Brian Paul, have been given a copy of the tests

				for testing Mesa.  The tests are not publically available.

				for testing Mesa.  The tests are not publicly available.

				</p>

				<p>

				This file has the latest results of testing Mesa with the OpenGL 1.2

									
										1

docs/contents.html
									
				@@ -61,7 +61,6 @@

				<li><a href="shading.html" target="_parent">Shading Language</a>

				<li><a href="egl.html" target="_parent">EGL</a>

				<li><a href="opengles.html" target="_parent">OpenGL ES</a>

				<li><a href="openvg.html" target="_parent">OpenVG / Vega</a>

				<li><a href="envvars.html" target="_parent">Environment Variables</a>

				<li><a href="osmesa.html" target="_parent">Off-Screen Rendering</a>

				<li><a href="debugging.html" target="_parent">Debugging Tips</a>

									
										349

docs/devinfo.html
									
				@@ -17,7 +17,7 @@

				<h1>Development Notes</h1>

				<h2>Adding Extentions</h2>

				<h2>Adding Extensions</h2>

				<p>

				To add a new GL extension to Mesa you have to do at least the following.

				@@ -56,6 +56,11 @@ To add a new GL extension to Mesa you have to do at least the following.

				   If the new extension adds new GL state, the functions in get.c, enable.c

				   and attrib.c will most likely require new code.

				</li>

				<li>

				   The dispatch tests check_table.cpp and dispatch_sanity.cpp

				   should be updated with details about the new extensions functions. These

				   tests are run using 'make check'

				</li>

				</ul>

				@@ -190,18 +195,116 @@ you should add an appropriate note to the commit message.

				Here are some examples of such a note:

				</p>

				<ul>

				  <li>NOTE: This is a candidate for the 9.0 branch.</li>

				  <li>NOTE: This is a candidate for the 8.0 and 9.0 branches.</li>

				  <li>NOTE: This is a candidate for the stable branches.</li>

				  <li>CC: &lt;mesa-stable@lists.freedesktop.org&gt;</li>

				  <li>CC: "9.2 10.0" &lt;mesa-stable@lists.freedesktop.org&gt;</li>

				  <li>CC: "10.0" &lt;mesa-stable@lists.freedesktop.org&gt;</li>

				</ul>

				Simply adding the CC to the mesa-stable list address is adequate to nominate

				the commit for the most-recently-created stable branch. It is only necessary

				to specify a specific branch name, (such as "9.2 10.0" or "10.0" in the

				examples above), if you want to nominate the commit for an older stable

				branch. And, as in these examples, you can nominate the commit for the older

				branch in addition to the more recent branch, or nominate the commit

				exclusively for the older branch.

				<h2>Cherry-picking candidates for a stable branch</h2>

				This "CC" syntax for patch nomination will cause patches to automatically be

				copied to the mesa-stable@ mailing list when you use "git send-email" to send

				patches to the mesa-dev@ mailing list. Also, if you realize that a commit

				should be nominated for the stable branch after it has already been committed,

				you can send a note directly to the mesa-stable@lists.freedesktop.org where

				the Mesa stable-branch maintainers will receive it. Be sure to mention the

				commit ID of the commit of interest (as it appears in the mesa master branch).

				<p>

				Please use <code>git cherry-pick -x &lt;commit&gt;</code> for cherry-picking a commit

				from master to a stable branch.

				</p>

				The latest set of patches that have been nominated, accepted, or rejected for

				the upcoming stable release can always be seen on the

				<a href="http://cworth.org/~cworth/mesa-stable-queue/">Mesa Stable Queue</a>

				page.

				<h2>Criteria for accepting patches to the stable branch</h2>

				Mesa has a designated release manager for each stable branch, and the release

				manager is the only developer that should be pushing changes to these

				branches. Everyone else should simply nominate patches using the mechanism

				described above.

				The stable-release manager will work with the list of nominated patches, and

				for each patch that meets the criteria below will cherry-pick the patch with:

				<code>git cherry-pick -x &lt;commit&gt;</code>. The <code>-x</code> option is

				important so that the picked patch references the commit ID of the original

				patch.
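				The <code>-x</code> option records the original commit by appending a line of the form "(cherry picked from commit ...)" to the commit message. The sketch below shows one way to read that ID back. The helper name picked_from and the sample message (including its hash) are made up for illustration.

```shell
# Print the original commit ID recorded by "git cherry-pick -x".
# Reads a commit message on stdin; prints nothing if no such line is found.
# picked_from is a hypothetical helper name.
picked_from() {
    sed -n 's/^(cherry picked from commit \([0-9a-f]\{7,40\}\))$/\1/p'
}

# Hand-written sample message; the hash is invented for this example.
msg='i965: Fix something

(cherry picked from commit 0123456789abcdef0123456789abcdef01234567)'

origin=$(printf '%s\n' "$msg" | picked_from)
echo "$origin"
```

				On a real stable branch the message could be supplied with something like <code>git log -1 --format=%B &lt;commit&gt; | picked_from</code>.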

				The stable-release manager may at times need to force-push changes to the

				stable branches (for example, to drop a previously-picked patch that was later

				identified as causing a regression). These force-pushes may cause changes to

				be lost from the stable branch if developers push things directly. Consider

				yourself warned.

				The stable-release manager is also given broad discretion in rejecting patches

				that have been nominated for the stable branch. The most basic rule is that

				the stable branch is for bug fixes only, (no new features, no

				regressions). Here is a non-exhaustive list of some reasons that a patch may

				be rejected:

				<ul>

				  <li>Patch introduces a regression. Any reported build breakage or other

				  regression caused by a particular patch, (game no longer works, piglit test

				  changes from PASS to FAIL), is justification for rejecting a patch.</li>

				  <li>Patch is too large, (say, larger than 100 lines)</li>

				  <li>Patch is not a fix. For example, a commit that moves code around with no

				  functional change should be rejected.</li>

				  <li>Patch fix is not clearly described. For example, a commit message

				  of only a single line, no description of the bug, no mention of bugzilla,

				  etc.</li>

				  <li>Patch has not obviously been reviewed. For example, the commit message

				  has no Reviewed-by, Signed-off-by, nor Tested-by tags from anyone but the

				  author.</li>

				  <li>Patch has not already been merged to the master branch. As a rule, bug

				  fixes should never be applied first to a stable branch. Patches should land

				  first on the master branch and then be cherry-picked to a stable

				  branch. (This is to avoid future releases causing regressions if the patch

				  is not also applied to master.) The only things that might look like

				  exceptions would be backports of patches from master that happen to look

				  significantly different.</li>

				  <li>Patch depends on too many other patches. Ideally, all stable-branch

				  patches should be self-contained. It sometimes occurs that a single, logical

				  bug-fix occurs as two separate patches on master, (such as an original

				  patch, then a subsequent fix-up to that patch). In such a case, these two

				  patches should be squashed into a single, self-contained patch for the

				  stable branch. (Of course, if the squashing makes the patch too large, then

				  that could be a reason to reject the patch.)</li>

				  <li>Patch includes new feature development, not bug fixes. New OpenGL

				  features, extensions, etc. should be applied to Mesa master and included in

				  the next major release. Stable releases are intended only for bug fixes.

				  Note: As an exception to this rule, the stable-release manager may accept

				  hardware-enabling "features". For example, backports of new code to support

				  a newly-developed hardware product can be accepted if they can be reasonably

				  determined to not have effects on other hardware.</li>

				  <li>Patch is a performance optimization. As a rule, performance patches are

				  not candidates for the stable branch. The only exception might be a case

				  where an application's performance was recently severely impacted so as to

				  become unusable. The fix for this performance regression could then be

				  considered for a stable branch. The optimization must also be

				  non-controversial and the patches still need to meet the other criteria of

				  being simple and self-contained</li>

				  <li>Patch introduces a new failure mode (such as an assert). While the new

				  assert might technically be correct, for example to make Mesa more

				  conformant, this is not the kind of "bug fix" we want in a stable

				  release. The potential problem here is that an OpenGL program that was

				  previously working, (even if technically non-compliant with the

				  specification), could stop working after this patch. So that would be a

				  regression that is unacceptable for the stable branch.</li>

				</ul>
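				The "too large" criterion in the list above can be estimated mechanically. This is a rough sketch, not a tool used by the release manager: patch_size is a hypothetical helper, the 100-line limit is only the example figure from the list, and the header filter is approximate (a removed line whose content begins with "-- " would be miscounted as a header).

```shell
# Count added and removed lines in a unified diff read from stdin.
# patch_size is a hypothetical helper; "--- " and "+++ " file headers are skipped.
patch_size() {
    awk '/^(\+\+\+|---) /{next} /^[+-]/{n++} END{print n+0}'
}

# Small hand-written patch used only to exercise the function.
sample='--- a/foo.c
+++ b/foo.c
@@ -1,3 +1,3 @@
 int x;
-int y;
+int z;
 int w;'

size=$(printf '%s\n' "$sample" | patch_size)
if [ "$size" -gt 100 ]; then
    echo "$size changed lines: probably too large for stable"
else
    echo "$size changed lines: size is fine"
fi
```

				A real patch could be fed in with something like <code>git show &lt;commit&gt; | patch_size</code>. The other criteria in the list still need human judgment.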

				<h2>Making a New Mesa Release</h2>

				@@ -212,64 +315,205 @@ These are the instructions for making a new Mesa release.

				<h3>Get latest source files</h3>

				<p>

				Use git to get the latest Mesa files from the git repository, from whatever

				branch is relevant.

				branch is relevant. This document uses the convention X.Y.Z for the release

				being created, which should be created from a branch named X.Y.

				</p>

				<h3>Verify and update version info in VERSION</h3>

				<h3>Perform basic testing</h3>

				<p>

				Create a docs/relnotes/x.y.z.html file.

				The bin/bugzilla_mesa.sh and bin/shortlog_mesa.sh scripts can be used to

				create the HTML-formatted lists of bugfixes and changes to include in the file.

				Link the new docs/relnotes/x.y.z.html file into the main <a href="relnotes.html">relnotes.html</a> file.

				The release manager should, at the very least, test the code by compiling it,

				installing it, and running the latest piglit to ensure that no piglit tests

				have regressed since the previous release.

				</p>

				<p>

				Update <a href="index.html">docs/index.html</a>.

				The release manager should do this testing with at least one hardware driver,

				(say, whatever is contained in the local development machine), as well as on

				both Gallium and non-Gallium software drivers. The software testing can be

				performed by running piglit with the following environment-variable set:

				</p>

				<p>

				Tag the files with the release name (in the form <b>mesa-x.y</b>)

				with: <code>git tag -s mesa-x.y -m "Mesa x.y Release"</code>

				Then: <code>git push origin mesa-x.y</code>

				</p>

				<h3>Make the tarballs</h3>

				<p>

				Make the distribution files.  From inside the Mesa directory:

				<pre>

					./autogen.sh

					make tarballs

				LIBGL_ALWAYS_SOFTWARE=1

				</pre>

				And Gallium vs. non-Gallium software drivers can be obtained by using the

				following configure flags on separate builds:

				<pre>

				--with-dri-drivers=swrast

				--with-gallium-drivers=swrast

				</pre>

				<p>

				After the tarballs are created, the md5 checksums for the files will

				be computed.

				Add them to the docs/relnotes/x.y.html file.

				Note: If both options are given in one build, both swrast_dri.so drivers will

				be compiled, but only one will be installed. The following command can be used

				to ensure the correct driver is being tested:

				</p>

				<pre>

				LIBGL_ALWAYS_SOFTWARE=1 glxinfo | grep "renderer string"

				</pre>
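				The renderer-string check above can be scripted. The sketch below is hedged: classify_renderer is a hypothetical helper, it assumes the Gallium software drivers identify themselves as llvmpipe or softpipe in the renderer string, and the sample strings are illustrative, since the exact text varies by build.

```shell
# Classify a "renderer string" line from glxinfo, read from stdin.
# Assumption: Gallium software drivers mention "llvmpipe" or "softpipe";
# anything else is reported as "other".
classify_renderer() {
    line=$(cat)
    case "$line" in
        *llvmpipe*|*softpipe*) echo "gallium" ;;
        *) echo "other" ;;
    esac
}

# Illustrative input only; real output depends on the installed driver.
kind=$(echo 'OpenGL renderer string: Gallium 0.4 on llvmpipe' | classify_renderer)
echo "$kind"
```

				With a real installation the input would come from <code>LIBGL_ALWAYS_SOFTWARE=1 glxinfo | grep "renderer string" | classify_renderer</code>.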

				If any regressions are found in this testing with piglit, stop here, and do

				not perform a release until regressions are fixed.

				<h3>Update version in file VERSION</h3>

				<p>

				Increment the version contained in the file VERSION at Mesa's top-level, then

				commit this change.

				</p>

				<h3>Create release notes for the new release</h3>

				<p>

				Create a new file docs/relnotes/X.Y.Z.html, (follow the style of the previous

				release notes). Note that the sha256sums section of the release notes should

				be empty at this point.

				</p>

				<p>

				Copy the distribution files to a temporary directory, unpack them,

				compile everything, and run some demos to be sure everything works.

				</p>

				Two scripts are available to help generate portions of the release notes:

				<pre>

					./bin/bugzilla_mesa.sh

					./bin/shortlog_mesa.sh

				</pre>

				<h3>Update the website and announce the release</h3>

				<p>

				Make a new directory for the release on annarchy.freedesktop.org with:

				<br>

				<code>

				mkdir /srv/ftp.freedesktop.org/pub/mesa/x.y

				</code>

				The first script identifies commits that reference bugzilla bugs and obtains

				the descriptions of those bugs from bugzilla. The second script generates a

				log of all commits. In both cases, HTML-formatted lists are printed to stdout

				to be included in the release notes.

				</p>

				<p>

				Basically, to upload the tarball files with:

				<br>

				<code>

				rsync -avP -e ssh MesaLib-x.y.* USERNAME@annarchy.freedesktop.org:/srv/ftp.freedesktop.org/pub/mesa/x.y/

				</code>

				Commit these changes

				</p>

				<h3>Make the release archives, signatures, and the release tag</h3>

				<p>

				From inside the Mesa directory:

				<pre>

					./autogen.sh

					make -j1 tarballs

				</pre>

				<p>

				After the tarballs are created, the sha256 checksums for the files will

				be computed and printed. These will be used in a step below.

				</p>
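				<p>
				If the printed checksums scroll away, they can be recomputed from the archives.
				The following is only a minimal sketch, not something shipped in the Mesa tree:
				the function name <code>print_sha256sums</code> is hypothetical, and it assumes
				the <code>sha256sum</code> tool from GNU coreutils is installed.
				</p>

```shell
# Hypothetical helper (not part of Mesa): print the SHA-256 checksum of each
# file given as an argument, in the "checksum  filename" format of sha256sum.
# Assumes the sha256sum tool from GNU coreutils is on PATH.
print_sha256sums() {
    for archive in "$@"; do
        # Fail early on a missing archive instead of printing a partial list.
        if [ ! -f "$archive" ]; then
            echo "missing file: $archive" >&2
            return 1
        fi
        sha256sum "$archive"
    done
}
```

				<p>
				For example: <code>print_sha256sums MesaLib-X.Y.Z.tar.gz MesaLib-X.Y.Z.tar.bz2 MesaLib-X.Y.Z.zip</code>
				</p>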

				<p>

				It's important at this point to also verify that the constructed tar file

				actually builds:

				</p>

				<pre>

					tar xjf MesaLib-X.Y.Z.tar.bz2

					cd Mesa-X.Y.Z

					./configure --enable-gallium-llvm

					make -j6

					make install

				</pre>

				<p>

				Some touch testing should also be performed at this point, (run glxgears or

				more involved OpenGL programs against the installed Mesa).

				</p>

				<p>

				Create detached GPG signatures for each of the archive files created above:

				</p>

				<pre>

					gpg --sign --detach MesaLib-X.Y.Z.tar.gz

					gpg --sign --detach MesaLib-X.Y.Z.tar.bz2

					gpg --sign --detach MesaLib-X.Y.Z.zip

				</pre>

				<p>

				Tag the commit used for the build:

				</p>

				<pre>

					git tag -s mesa-X.Y.Z -m "Mesa X.Y.Z release"

				</pre>
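				<p>
				To reduce the chance of a mistyped tag name, the name can be derived from the
				VERSION file updated earlier. This is a sketch under assumptions: the helper
				<code>release_tag_name</code> is hypothetical, and it assumes the first line of
				VERSION holds exactly the version string being released.
				</p>

```shell
# Hypothetical helper (not part of Mesa): derive the release tag name from a
# VERSION file. Assumes the first line of the file is the version string,
# e.g. "10.5.3", and that tags take the form mesa-<version>.
release_tag_name() {
    version_file=${1:-VERSION}
    if [ ! -r "$version_file" ]; then
        echo "cannot read $version_file" >&2
        return 1
    fi
    # Read only the first line; tolerate a missing trailing newline.
    version=
    IFS= read -r version < "$version_file" || true
    if [ -z "$version" ]; then
        echo "no version found in $version_file" >&2
        return 1
    fi
    printf 'mesa-%s\n' "$version"
}
```

				<p>
				For example: <code>git tag -s "$(release_tag_name VERSION)" -m "Mesa X.Y.Z release"</code>
				</p>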

				<p>

				Note: It would be nice to investigate and fix the issue that causes the

				tarballs target to fail with multiple build processes, such as with "-j4". It

				would also be nice to incorporate all of the above commands into a single

				makefile target. And instead of a custom "tarballs" target, we should

				incorporate things into the standard "make dist" and "make distcheck" targets.

				</p>

				<h3>Add the sha256sums to the release notes</h3>

				<p>

				Edit docs/relnotes/X.Y.Z.html to add the sha256sums printed as part of "make

				tarballs" in the previous step. Commit this change.

				</p>

				<h3>Push all commits and the tag created above</h3>

				<p>

				This is the first step that cannot easily be undone. The release is going

				forward from this point:

				</p>

				<pre>

					git push origin X.Y --tags

				</pre>

				<h3>Install the release files and signatures on the distribution server</h3>

				<p>

				The following commands can be used to copy the release archive files and

				signatures to the freedesktop.org server:

				</p>

				<pre>

					scp MesaLib-X.Y.Z* people.freedesktop.org:

					ssh people.freedesktop.org

					cd /srv/ftp.freedesktop.org/pub/mesa

					mkdir X.Y.Z

					cd X.Y.Z

					mv ~/MesaLib-X.Y.Z* .

				</pre>

				<h3>Back on mesa master, add the new release notes into the tree</h3>

				<p>

				Something like the following steps will do the trick:

				</p>

				<pre>

					cp docs/relnotes/X.Y.Z.html /tmp

				        git checkout master

				        cp /tmp/X.Y.Z.html docs/relnotes

				        git add docs/relnotes/X.Y.Z.html

				</pre>

				<p>

				Also, edit docs/relnotes.html to add a link to the new release notes, and edit

				docs/index.html to add a news entry. Then commit and push:

				</p>

				<pre>

					git commit -a -m "docs: Import X.Y.Z release notes, add news item."

				        git push origin

				</pre>

				<h3>Update the mesa3d.org website</h3>

				<p>

				NOTE: The recent release managers have not been performing this step

				themselves, but leaving this to Brian Paul, (who has access to the

				sourceforge.net hosting for mesa3d.org). Brian is more than willing to grant

				the permission necessary to future release managers to do this step on their

				own.

				</p>

				<p>

				@@ -281,13 +525,22 @@ sftp USERNAME,mesa3d@web.sourceforge.net

				</code>

				</p>

				<h3>Announce the release</h3>

				<p>

				Make an announcement on the mailing lists:

				<em>mesa-dev@lists.freedesktop.org</em>,

				<em>mesa-users@lists.freedesktop.org</em>

				and

				<em>mesa-announce@lists.freedesktop.org</em>

				Follow the template of previously-sent release announcements. The following

				command can be used to generate the log of changes to be included in the

				release announcement:

				<pre>

					git shortlog mesa-X.Y.Z-1..mesa-X.Y.Z

				</pre>

				</p>
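				<p>
				In the command above, <code>mesa-X.Y.Z-1</code> stands for the tag of the
				previous point release. A minimal sketch of computing that range follows. The
				helper <code>shortlog_range</code> is hypothetical, and it assumes tags of the
				form <code>mesa-X.Y.Z</code> and that the previous release is in the same X.Y
				series.
				</p>

```shell
# Hypothetical helper (not part of Mesa): given a version X.Y.Z, print the
# revision range from the previous point release in the same series to this
# one, suitable for "git shortlog". Assumes tags are named mesa-X.Y.Z.
shortlog_range() {
    version=$1
    case $version in
        # Reject non-numeric input, wrong component counts, and leading zeros.
        *[!0-9.]*|*.*.*.*|.*|*.|*..*|*.0[0-9]*)
            echo "expected X.Y.Z, got: $version" >&2
            return 1 ;;
        *.*.*) ;;
        *)
            echo "expected X.Y.Z, got: $version" >&2
            return 1 ;;
    esac
    series=${version%.*}    # X.Y
    patch=${version##*.}    # Z
    if [ "$patch" -eq 0 ]; then
        echo "no earlier point release in the $series series" >&2
        return 1
    fi
    printf 'mesa-%s.%s..mesa-%s\n' "$series" "$((patch - 1))" "$version"
}
```

				<p>
				For example: <code>git shortlog "$(shortlog_range X.Y.Z)"</code>, with the real
				version number substituted for X.Y.Z.
				</p>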

				</div>

									
										15

docs/dispatch.html
									
												
				@@ -25,7 +25,7 @@ href="#overview">overview of Mesa's implementation</a>.</p>

				<h2>1. Complexity of GL Dispatch</h2>

				<p>Every GL application has at least one object called a GL <em>context</em>.

				This object, which is an implicit parameter to ever GL function, stores all

				This object, which is an implicit parameter to every GL function, stores all

				of the GL related state for the application.  Every texture, every buffer

				object, every enable, and much, much more is stored in the context.  Since

				an application can have more than one context, the context to be used is

				@@ -51,7 +51,7 @@ example, <tt>glFogCoordf</tt> may operate differently depending on whether

				or not fog is enabled.</p>

				<p>In multi-threaded environments, it is possible for each thread to have a

				differnt GL context current.  This means that poor old <tt>glVertex3fv</tt>

				different GL context current.  This means that poor old <tt>glVertex3fv</tt>

				has to know which GL context is current in the thread where it is being

				called.</p>

				@@ -204,16 +204,15 @@ terribly relevant.</p>

				few preprocessor defines.</p>

				<ul>

				<li>If <tt>GLX_USE_TLS</tt> is defined, method #4 is used.</li>

				<li>If <tt>HAVE_PTHREAD</tt> is defined, method #3 is used.</li>

				<li>If <tt>WIN32_THREADS</tt> is defined, method #2 is used.</li>

				<li>If none of the preceeding are defined, method #1 is used.</li>

				<li>If <tt>GLX_USE_TLS</tt> is defined, method #3 is used.</li>

				<li>If <tt>HAVE_PTHREAD</tt> is defined, method #2 is used.</li>

				<li>If none of the preceding are defined, method #1 is used.</li>

				</ul>

				<p>Two different techniques are used to handle the various different cases.

				On x86 and SPARC, a macro called <tt>GL_STUB</tt> is used.  In the preamble

				of the assembly source file different implementations of the macro are

				selected based on the defined preprocessor variables.  The assmebly code

				selected based on the defined preprocessor variables.  The assembly code

				then consists of a series of invocations of the macros such as:

				<blockquote>

				@@ -242,7 +241,7 @@ first technique, is to insert <tt>#ifdef</tt> within the assembly

				implementation of each function.  This makes the assembly file considerably

				larger (e.g., 29,332 lines for <tt>glapi_x86-64.S</tt> versus 1,155 lines for

				<tt>glapi_x86.S</tt>) and causes simple changes to the function

				implementation to generate many lines of diffs.  Since the assmebly files

				implementation to generate many lines of diffs.  Since the assembly files

				are typically generated by scripts (see <a href="#autogen">below</a>), this

				isn't a significant problem.</p>

									
										70

docs/egl.html
									
												
				@@ -77,26 +77,22 @@ drivers will be installed to <code>${libdir}/egl</code>.</p>

				</dd>

				<dt><code>--enable-gallium-egl</code></dt>

				<dd>

				<p>Enable the optional <code>egl_gallium</code> driver.</p>

				</dd>

				<dt><code>--with-egl-platforms</code></dt>

				<dd>

				<p>List the platforms (window systems) to support.  Its argument is a comma

				seprated string such as <code>--with-egl-platforms=x11,drm</code>.  It decides

				separated string such as <code>--with-egl-platforms=x11,drm</code>.  It decides

				the platforms a driver may support.  The first listed platform is also used by

				the main library to decide the native platform: the platform the EGL native

				types such as <code>EGLNativeDisplayType</code> or

				<code>EGLNativeWindowType</code> defined for.</p>

				<p>The available platforms are <code>x11</code>, <code>drm</code>,

				<code>fbdev</code>, and <code>gdi</code>.  The <code>gdi</code> platform can

				only be built with SCons.  Unless for special needs, the build system should

				<code>wayland</code>, <code>null</code>, <code>android</code>,

				<code>haiku</code>, and <code>gdi</code>.  The <code>android</code> platform

				can only be built as a system component, part of AOSP, while the

				<code>haiku</code> and <code>gdi</code> platforms can only be built with SCons.

				Unless for special needs, the build system should

				select the right platforms automatically.</p>

				</dd>

				@@ -119,13 +115,6 @@ is required if applications mix OpenGL and OpenGL ES.</p>

				</dd>

				<dt><code>--enable-openvg</code></dt>

				<dd>

				<p>OpenVG must be explicitly enabled by this option.</p>

				</dd>

				</dl>

				<h2>Use EGL</h2>

				@@ -219,52 +208,15 @@ the X server directly using (XCB-)DRI2 protocol.</p>

				</dd>

				<dt><code>egl_gallium</code></dt>

				<dd>

				<p>This driver is based on Gallium3D.  It supports all rendering APIs and

				hardwares supported by Gallium3D.  It is the only driver that supports OpenVG.

				The supported platforms are X11, DRM, FBDEV, and GDI.</p>

				<p>This driver comes with its own hardware drivers

				(<code>pipe_&lt;hw&gt;</code>) and client API modules

				(<code>st_&lt;api&gt;</code>).</p>

				</dd>

				<dt><code>egl_glx</code></dt>

				<dd>

				<p>This driver provides a wrapper to GLX.  It uses exclusively GLX to implement

				the EGL API.  It supports both direct and indirect rendering when the GLX does.

				It is accelerated when the GLX is.  As such, it cannot provide functions that

				is not available in GLX or GLX extensions.</p>

				</dd>

				</dl>

				<h2>Packaging</h2>

				<p>The ABI between the main library and its drivers is not stable.  Nor is

				there a plan to stabilize it at the moment.  Of the EGL drivers,

				<code>egl_gallium</code> has its own hardware drivers and client API modules.

				They are considered internal to <code>egl_gallium</code> and there is also no

				stable ABI between them.  These should be kept in mind when packaging for

				distribution.</p>

				<p>Generally, <code>egl_dri2</code> is preferred over <code>egl_gallium</code>

				when the system already has DRI drivers.  As <code>egl_gallium</code> is loaded

				before <code>egl_dri2</code> when both are available, <code>egl_gallium</code>

				is disabled by default.</p>

				there a plan to stabilize it at the moment.</p>

				<h2>Developers</h2>

				<p>The sources of the main library and the classic drivers can be found at

				<code>src/egl/</code>.  The sources of the <code>egl</code> state tracker can

				be found at <code>src/gallium/state_trackers/egl/</code>.</p>

				<p>The suggested way to learn to write a EGL driver is to see how other drivers

				are written.  <code>egl_glx</code> should be a good reference.  It works in any

				environment that has GLX support, and it is simpler than most drivers.</p>

				<p>The sources of the main library and drivers can be found at

				<code>src/egl/</code>.</p>

				<h3>Lifetime of Display Resources</h3>

				@@ -273,8 +225,8 @@ longer than the display that creates them.</p>

				<p>In EGL, when a display is terminated through <code>eglTerminate</code>, all

				display resources should be destroyed.  Similarly, when a thread is released

				throught <code>eglReleaseThread</code>, all current display resources should be

				released.  Another way to destory or release resources is through functions

				through <code>eglReleaseThread</code>, all current display resources should be

				released.  Another way to destroy or release resources is through functions

				such as <code>eglDestroySurface</code> or <code>eglMakeCurrent</code>.</p>

				<p>When a resource that is current to some thread is destroyed, the resource

									
										42

docs/envvars.html
									
												
				@@ -47,7 +47,7 @@ sometimes be useful for debugging end-user issues.

				<li>MESA_NO_SSE - if set, disables Intel SSE optimizations

				<li>MESA_DEBUG - if set, error messages are printed to stderr.  For example,

				   if the application generates a GL_INVALID_ENUM error, a corresponding error

				   message indicating where the error occured, and possibly why, will be

				   message indicating where the error occurred, and possibly why, will be

				   printed to stderr.<br>

				   If the value of MESA_DEBUG is 'FP' floating point arithmetic errors will

				   generate exceptions.

				@@ -121,10 +121,38 @@ See the <a href="xlibdriver.html">Xlib software driver page</a> for details.

				<h2>i945/i965 driver environment variables (non-Gallium)</h2>

				<ul>

				<li>INTEL_STRICT_CONFORMANCE - if set to 1, enable sw fallbacks to improve

				    OpenGL conformance.  If set to 2, always use software rendering.

				<li>INTEL_NO_BLIT - if set, disable hardware-accelerated glBitmap,

				    glCopyPixels, glDrawPixels.

				<li>INTEL_NO_HW - if set to 1, prevents batches from being submitted to the hardware.

				   This is useful for debugging hangs, etc.</li>

				<li>INTEL_DEBUG - a comma-separated list of named flags, which do various things:

				<ul>

				   <li>tex - emit messages about textures.</li>

				   <li>state - emit messages about state flag tracking</li>

				   <li>blit - emit messages about blit operations</li>

				   <li>miptree - emit messages about miptrees</li>

				   <li>perf - emit messages about performance issues</li>

				   <li>perfmon - emit messages about AMD_performance_monitor</li>

				   <li>bat - emit batch information</li>

				   <li>pix - emit messages about pixel operations</li>

				   <li>buf - emit messages about buffer objects</li>

				   <li>reg - emit messages about regions</li>

				   <li>fbo - emit messages about framebuffers</li>

				   <li>fs - dump shader assembly for fragment shaders</li>

				   <li>gs - dump shader assembly for geometry shaders</li>

				   <li>sync - emit messages about synchronization</li>

				   <li>prim - emit messages about drawing primitives</li>

				   <li>vert - emit messages about vertex assembly</li>

				   <li>dri - emit messages about the DRI interface</li>

				   <li>sf - emit messages about the strips &amp; fans unit (for old gens, includes the SF program)</li>

				   <li>stats - enable statistics counters. You probably want perfmon or intel_gpu_top instead.</li>

				   <li>urb - emit messages about URB setup</li>

				   <li>vs - dump shader assembly for vertex shaders</li>

				   <li>clip - emit messages about the clip unit (for old gens, includes the CLIP program)</li>

				   <li>aub - dump batches into an AUB trace for use with simulation tools</li>

				   <li>shader_time - record how much GPU time is spent in each shader</li>

				   <li>no16 - suppress generation of 16-wide fragment shaders. Useful for debugging broken shaders.</li>

				   <li>blorp - emit messages about the blorp operations (blits &amp; clears)</li>

				   <li>nodualobj - suppress generation of dual-object geometry shader code</li>

				</ul>

				</ul>

				@@ -173,14 +201,14 @@ See src/mesa/state_tracker/st_debug.c for other options.

				    to stderr

				<li>SOFTPIPE_NO_RAST - if set, rasterization is no-op'd.  For profiling purposes.

				<li>SOFTPIPE_USE_LLVM - if set, the softpipe driver will try to use LLVM JIT for

				    vertex shading procesing.

				    vertex shading processing.

				</ul>

				<h3>LLVMpipe driver environment variables</h3>

				<ul>

				<li>LP_NO_RAST - if set LLVMpipe will no-op rasterization

				<li>LP_DEBUG - a comma-separated list of debug options is acceptec.  See the

				<li>LP_DEBUG - a comma-separated list of debug options is accepted.  See the

				    source code for details.

				<li>LP_PERF - a comma-separated list of options to selectively no-op various

				    parts of the driver.  See the source code for details.

									
										8

docs/faq.html
									
												
				@@ -137,7 +137,7 @@ Just follow the Mesa <a href="install.html">compilation instructions</a>.

				<h2>1.6 Are there other open-source implementations of OpenGL?</h2>

				<p>

				Yes, SGI's <a href="http://oss.sgi.com/projects/ogl-sample/index.html">

				OpenGL Sample Implemenation (SI)</a> is available.

				OpenGL Sample Implementation (SI)</a> is available.

				The SI was written during the time that OpenGL was originally designed.

				Unfortunately, development of the SI has stagnated.

				Mesa is much more up to date with modern features and extensions.

				@@ -353,7 +353,7 @@ That's where Mesa development is discussed.

				</p>

				<p>

				The <a href="http://www.opengl.org/documentation">

				OpenGL Specification</a> is the bible for OpenGL implemention work.

				OpenGL Specification</a> is the bible for OpenGL implementation work.

				You should read it.

				</p>

				<p>Most of the Mesa development work involves implementing new OpenGL

				@@ -375,7 +375,7 @@ For a Gallium3D hardware driver, the r300g, r600g and the i915g are good example

				</p>

				<p>The DRI website has more information about writing hardware drivers.

				The process isn't well documented because the Mesa driver interface changes

				over time, and we seldome have spare time for writing documentation.

				over time, and we seldom have spare time for writing documentation.

				That being said, many people have managed to figure out the process.

				</p>

				<p>

				@@ -390,7 +390,7 @@ The <a href="http://oss.sgi.com/projects/ogl-sample/registry/EXT/texture_compres

				indicates that there are intellectual property (IP) and/or patent issues

				to be dealt with.

				</p>

				<p>We've been unsucessful in getting a response from S3 (or whoever owns

				<p>We've been unsuccessful in getting a response from S3 (or whoever owns

				the IP nowadays) to indicate whether or not an open source project can

				implement the extension (specifically the compression/decompression

				algorithms).

									
										276

docs/index.html
									
												
				@@ -16,6 +16,282 @@

				<h1>News</h1>

				<h2>March 28, 2015</h2>

				<p>

				<a href="relnotes/10.5.2.html">Mesa 10.5.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>March 20, 2015</h2>

				<p>

				<a href="relnotes/10.4.7.html">Mesa 10.4.7</a> is released.

				This is a bug-fix release.

				</p>

				<h2>March 13, 2015</h2>

				<p>

				<a href="relnotes/10.5.1.html">Mesa 10.5.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>March 06, 2015</h2>

				<p>

				<a href="relnotes/10.5.0.html">Mesa 10.5.0</a> is released.  This is a new

				development release.  See the release notes for more information about

				the release.

				</p>

				<h2>March 06, 2015</h2>

				<p>

				<a href="relnotes/10.4.6.html">Mesa 10.4.6</a> is released.

				This is a bug-fix release.

				</p>

				<h2>February 21, 2015</h2>

				<p>

				<a href="relnotes/10.4.5.html">Mesa 10.4.5</a> is released.

				This is a bug-fix release.

				</p>

				<h2>February 06, 2015</h2>

				<p>

				<a href="relnotes/10.4.4.html">Mesa 10.4.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>January 24, 2015</h2>

				<p>

				<a href="relnotes/10.4.3.html">Mesa 10.4.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>January 12, 2015</h2>

				<p>

				<a href="relnotes/10.3.7.html">Mesa 10.3.7</a>

				and <a href="relnotes/10.4.2.html">Mesa 10.4.2</a> are released.

				These are bug-fix releases from the 10.3 and 10.4 branches, respectively.

				<br>

				NOTE: It is anticipated that 10.3.7 will be the final release in the 10.3

				series. Users of 10.3 are encouraged to migrate to the 10.4 series in order

				to obtain future fixes.

				</p>

				<h2>December 29, 2014</h2>

				<p>

				<a href="relnotes/10.3.6.html">Mesa 10.3.6</a>

				and <a href="relnotes/10.4.1.html">Mesa 10.4.1</a> are released.

				These are bug-fix releases from the 10.3 and 10.4 branches, respectively.

				</p>

				<h2>December 14, 2014</h2>

				<p>

				<a href="relnotes/10.4.html">Mesa 10.4</a> is released.  This is a new

				development release.  See the release notes for more information about

				the release.

				</p>

				<h2>December 5, 2014</h2>

				<p>

				<a href="relnotes/10.3.5.html">Mesa 10.3.5</a> is released.

				This is a bug-fix release.

				</p>

				<h2>November 21, 2014</h2>

				<p>

				<a href="relnotes/10.3.4.html">Mesa 10.3.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>November 8, 2014</h2>

				<p>

				<a href="relnotes/10.3.3.html">Mesa 10.3.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>October 24, 2014</h2>

				<p>

				<a href="relnotes/10.3.2.html">Mesa 10.3.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>October 12, 2014</h2>

				<p>

				<a href="relnotes/10.2.9.html">Mesa 10.2.9</a>

				and <a href="relnotes/10.3.1.html">Mesa 10.3.1</a> are released.

				These are bug-fix releases from the 10.2 and 10.3 branches, respectively.

				<br>

				NOTE: It is anticipated that 10.2.9 will be the final release in the 10.2

				series. Users of 10.2 are encouraged to migrate to the 10.3 series in order

				to obtain future fixes.

				</p>

				<h2>September 19, 2014</h2>

				<p>

				<a href="relnotes/10.3.html">Mesa 10.3</a> is released.  This is a new

				development release.  See the release notes for more information about

				the release.

				</p>

				<p>

				Also, <a href="relnotes/10.2.8.html">Mesa 10.2.8</a> is released.

				This is a bug fix release from the 10.2 branch.

				</p>

				<h2>September 6, 2014</h2>

				<p>

				<a href="relnotes/10.2.7.html">Mesa 10.2.7</a> is released.

				This is a bug-fix release.

				</p>

				<h2>August 19, 2014</h2>

				<p>

				<a href="relnotes/10.2.6.html">Mesa 10.2.6</a> is released.

				This is a bug-fix release.

				</p>

				<h2>August 2, 2014</h2>

				<p>

				<a href="relnotes/10.2.5.html">Mesa 10.2.5</a> is released.

				This is a bug-fix release.

				</p>

				<h2>July 18, 2014</h2>

				<p>

				<a href="relnotes/10.2.4.html">Mesa 10.2.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>July 7, 2014</h2>

				<p>

				<a href="relnotes/10.2.3.html">Mesa 10.2.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>July 5, 2014</h2>

				<p>

				Mesa demos 8.2.0 is released.

				See the <a href="http://lists.freedesktop.org/archives/mesa-announce/2014-July/000100.html">announcement</a> for more information about the release.

				You can download it from <a href="ftp://ftp.freedesktop.org/pub/mesa/demos/8.2.0/">ftp.freedesktop.org/pub/mesa/demos/8.2.0/</a>.

				</p>

				<h2>June 24, 2014</h2>

				<p>

				<a href="relnotes/10.1.6.html">Mesa 10.1.6</a>

				and <a href="relnotes/10.2.2.html">Mesa 10.2.2</a> are released.

				These are bug-fix releases from the 10.1 and 10.2 branches, respectively.

				</p>

				<h2>June 6, 2014</h2>

				<p>

				<a href="relnotes/10.2.1.html">Mesa 10.2.1</a> is released.  This release

				only fixes a build error in the radeonsi driver that was introduced between

				10.2-rc5 and the 10.2 final release.

				</p>

				<h2>June 6, 2014</h2>

				<p>

				<a href="relnotes/10.2.html">Mesa 10.2</a> is released.  This is a new

				development release.  See the release notes for more information about

				the release.

				</p>

				<p>

				Also, <a href="relnotes/10.1.5.html">Mesa 10.1.5</a> is released.

				This is a bug fix release from the 10.1 branch.

				</p>

				<h2>May 20, 2014</h2>

				<p>

				<a href="relnotes/10.1.4.html">Mesa 10.1.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>May 9, 2014</h2>

				<p>

				<a href="relnotes/10.1.3.html">Mesa 10.1.3</a> is released.

				This is a bug-fix release, and is being released sooner than

				originally scheduled to fix a performance regression (vmware

				swapbuffers falling back to software) introduced to the

				10.1.2 release.

				</p>

				<h2>May 5, 2014</h2>

				<p>

				<a href="relnotes/10.1.2.html">Mesa 10.1.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>April 18, 2014</h2>

				<p>

				<a href="relnotes/10.1.1.html">Mesa 10.1.1</a> is released.

				This is a bug-fix release.

				</p>

				<h2>April 18, 2014</h2>

				<p>

				<a href="relnotes/10.0.5.html">Mesa 10.0.5</a> is released.

				This is a bug-fix release.

				<br>

				NOTE: Since the 10.1.1 release is being released concurrently, it is

				anticipated that 10.0.5 will be the final release in the 10.0

				series. Users of 10.0 are encouraged to migrate to the 10.1 series in

				order to obtain future fixes.

				</p>

				<h2>March 12, 2014</h2>

				<p>

				<a href="relnotes/10.0.4.html">Mesa 10.0.4</a> is released.

				This is a bug-fix release.

				</p>

				<h2>March 4, 2014</h2>

				<p>

				<a href="relnotes/10.1.html">Mesa 10.1</a> is released.

				This is a new development release.

				See the release notes for more information about the release.

				</p>

				<h2>February 3, 2014</h2>

				<p>

				<a href="relnotes/10.0.3.html">Mesa 10.0.3</a> is released.

				This is a bug-fix release.

				</p>

				<h2>January 9, 2014</h2>

				<p>

				<a href="relnotes/10.0.2.html">Mesa 10.0.2</a> is released.

				This is a bug-fix release.

				</p>

				<h2>December 12, 2013</h2>

				<p>

				<a href="relnotes/10.0.1.html">Mesa 10.0.1</a>

				and <a href="relnotes/9.2.5.html">Mesa 9.2.5</a> are released.

				These are both bug-fix releases.

				</p>

				<h2>November 30, 2013</h2>

				<p>

				<a href="relnotes/10.0.html">Mesa 10.0</a> is released.

				This is a new development release.

				See the release notes for more information about the release.

				</p>

				<h2>November 27, 2013</h2>

				<p>

				<a href="relnotes/9.2.4.html">Mesa 9.2.4</a> is released.

				This is a bug fix release.

				</p>

				<h2>November 13, 2013</h2>

				<p>

				<a href="relnotes/9.2.3.html">Mesa 9.2.3</a> is released.

				This is a bug fix release.

				</p>

				<h2>October 18, 2013</h2>

				<p>

				<a href="relnotes/9.2.2.html">Mesa 9.2.2</a> is released.

									
										30

docs/install.html
									
												
				@@ -34,20 +34,29 @@

				<h2>1.1 General</h2>

				<ul>

				<li><a href="http://www.python.org/">Python</a> - Python is required.

				Version 2.6.4 or later should work.

				</li>

				<br>

				<li><a href="http://www.makotemplates.org/">Python Mako module</a> -

				Python Mako module is required. Version 0.7.3 or later should work.

				</li>

				<br>

				<li><a href="http://www.scons.org/">SCons</a> is required for building on

				Windows and optional for Linux (it's an alternative to autoconf/automake.)

				</li>

				<br>

				<li>lex / yacc - for building the GLSL compiler.

				<br>

				<br>

				On Linux systems, flex and bison are used.

				Versions 2.5.35 and 2.4.1, respectively, (or later) should work.

				<br>

				<br>

				On Windows with MinGW, install flex and bison with:

				<pre>mingw-get install msys-flex msys-bison</pre>

				</li>

				<li>python - Python is needed for building the Gallium components.

				Version 2.6.4 or later should work.

				<br>

				<br>

				To build OpenGL ES 1.1 and 2.0 you'll also need

				<a href="http://xmlsoft.org/sources/win32/python/libxml2-python-2.7.7.win32-py2.7.exe">libxml2-python</a>.

				For MSVC on Windows, install

				<a href="http://winflexbison.sourceforge.net/">Win flex-bison</a>.

				</li>

				</ul>

				@@ -73,7 +82,7 @@ the needed dependencies:

				<pre>

				  sudo yum install flex bison imake libtool xorg-x11-proto-devel libdrm-devel \

				  gcc-c++ xorg-x11-server-devel libXi-devel libXmu-devel libXdamage-devel git \

				  expat-devel llvm-devel

				  expat-devel llvm-devel python-mako

				</pre>

				@@ -118,14 +127,13 @@ by -debug for debug builds.

				To build Mesa with SCons for Windows on Linux using the MinGW crosscompiler toolchain do

				</p>

				<pre>

				    scons platform=windows toolchain=crossmingw machine=x86 mesagdi libgl-gdi

				    scons platform=windows toolchain=crossmingw machine=x86 libgl-gdi

				</pre>

				<p>

				This will create:

				</p>

				<ul>

				<li>build/windows-x86-debug/mesa/drivers/windows/gdi/opengl32.dll &mdash; Mesa + swrast, binary compatible with Windows's opengl32.dll

				<li>build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll &mdash; Mesa + Gallium + softpipe, binary compatible with Windows's opengl32.dll

				<li>build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll &mdash; Mesa + Gallium + softpipe (or llvmpipe), binary compatible with Windows's opengl32.dll

				</ul>

				<p>

				Put them all in the same directory to test them.

									
										3

docs/license.html
									
												
				@@ -103,6 +103,9 @@ Device drivers    src/mesa/drivers/*     MIT, generally

				Ext headers       include/GL/glext.h     Khronos

				                  include/GL/glxext.h

				C11 thread        include/c11/threads*.h Boost (permissive)

				emulation

				</pre>

				<p>

									
										111

docs/llvmpipe.html
									
												
				@@ -43,11 +43,7 @@ It's the fastest software rasterizer for Mesa.

				   </p>

				</li>

				<li>

				   <p>LLVM: version 2.9 recommended; 2.6 or later required.</p>

				   <p><b>NOTE</b>: LLVM 2.8 and earlier will not work on systems that support the

				   Intel AVX extensions (e.g. Sandybridge).  LLVM's code generator will

				   fail when trying to emit AVX instructions.  This was fixed in LLVM 2.9.

				   </p>

				   <p>LLVM: version 3.4 recommended; 3.3 or later required.</p>

				   <p>

				   For Linux, on a recent Debian based distribution do:

				   </p>

				@@ -101,13 +97,15 @@ but the rest of these instructions assume that scons is used.

				For Windows the procedure is similar except the target:

				<pre>

				  scons build=debug libgl-gdi

				  scons platform=windows build=debug libgl-gdi

				</pre>

				<h1>Using</h1>

				On Linux, building will create a drop-in alternative for libGL.so into

				<h2>Linux</h2>

				<p>On Linux, building will create a drop-in alternative for libGL.so into</p>

				<pre>

				  build/foo/gallium/targets/libgl-xlib/libGL.so

				@@ -117,15 +115,45 @@ or

				  lib/gallium/libGL.so

				</pre>

				To use it set the LD_LIBRARY_PATH environment variable accordingly.

				<p>To use it set the LD_LIBRARY_PATH environment variable accordingly.</p>

				For performance evaluation pass debug=no to scons, and use the corresponding

				lib directory without the "-debug" suffix.

				<p>For performance evaluation pass build=release to scons, and use the corresponding

				lib directory without the "-debug" suffix.</p>

				On Windows, building will create a drop-in alternative for opengl32.dll. To use

				it put it in the same directory as the application. It can also be used by

				<h2>Windows</h2>

				<p>

				On Windows, building will create

				<code>build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll</code>

				which is a drop-in alternative for the system's <code>opengl32.dll</code>.  To use

				it put it in the same directory as your application.  It can also be used by

				replacing the native ICD driver, but it's quite an advanced usage, so if you

				need to ask, don't even try it.

				</p>

				<p>

				There is however an easy way to replace the OpenGL software renderer that comes

				with Microsoft Windows 7 (or later) with llvmpipe (that is, on systems without

				any OpenGL drivers):

				</p>

				<ul>

				  <li><p>copy build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll to C:\Windows\SysWOW64\mesadrv.dll</p></li>

				  <li><p>load these registry settings:</p>

				  <pre>REGEDIT4

				; http://technet.microsoft.com/en-us/library/cc749368.aspx

				; http://www.msfn.org/board/topic/143241-portable-windows-7-build-from-winpe-30/page-5#entry942596

				[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\OpenGLDrivers\MSOGL]

				"DLL"="mesadrv.dll"

				"DriverVersion"=dword:00000001

				"Flags"=dword:00000001

				"Version"=dword:00000002

				</pre>

				  </li>

				  <li>Ditto for 64 bits drivers if you need them.</li>

				</ul>

				<h1>Profiling</h1>

				@@ -203,11 +231,66 @@ for posterior analysis, e.g.:

				  We use LLVM-C bindings for now. They are not documented, but follow the C++

				  interfaces very closely, and appear to be complete enough for code

				  generation. See 

				  http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html

				  for a stand-alone example.  See the llvm-c/Core.h file for reference.

				  <a href="http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html">

				  this stand-alone example</a>.  See the llvm-c/Core.h file for reference.

				</li>

				</ul>

				<h1 id="recommended_reading">Recommended Reading</h1>

				<ul>

				  <li>

				    <p>Rasterization</p>

				    <ul>

				      <li><a href="http://www.cs.unc.edu/~olano/papers/2dh-tri/">Triangle Scan Conversion using 2D Homogeneous Coordinates</a></li>

				      <li><a href="http://www.drdobbs.com/parallel/rasterization-on-larrabee/217200602">Rasterization on Larrabee</a> (<a href="http://devmaster.net/posts/2887/rasterization-on-larrabee">DevMaster copy</a>)</li>

				      <li><a href="http://devmaster.net/posts/6133/rasterization-using-half-space-functions">Rasterization using half-space functions</a></li>

				      <li><a href="http://devmaster.net/posts/6145/advanced-rasterization">Advanced Rasterization</a></li>

				      <li><a href="http://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/">Optimizing Software Occlusion Culling</a></li>

				    </ul>

				  </li>

				  <li>

				    <p>Texture sampling</p>

				    <ul>

				      <li><a href="http://chrishecker.com/Miscellaneous_Technical_Articles#Perspective_Texture_Mapping">Perspective Texture Mapping</a></li>

				      <li><a href="http://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml">Texturing As In Unreal</a></li>

				      <li><a href="http://www.gamasutra.com/view/feature/3301/runtime_mipmap_filtering.php">Run-Time MIP-Map Filtering</a></li>

				      <li><a href="http://alt.3dcenter.org/artikel/2003/10-26_a_english.php">Will "brilinear" filtering persist?</a></li>

				      <li><a href="http://ixbtlabs.com/articles2/gffx/nv40-rx800-3.html">Trilinear filtering</a></li>

				      <li><a href="http://devmaster.net/posts/12785/texture-swizzling">Texture Swizzling</a></li>

				    </ul>

				  </li>

				  <li>

				    <p>SIMD</p>

				    <ul>

				      <li><a href="http://www.cdl.uni-saarland.de/projects/wfv/#header4">Whole-Function Vectorization</a></li>

				    </ul>

				  </li>

				  <li>

				    <p>Optimization</p>

				    <ul>

				      <li><a href="http://www.drdobbs.com/optimizing-pixomatic-for-modern-x86-proc/184405807">Optimizing Pixomatic For Modern x86 Processors</a></li>

				      <li><a href="http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html">Intel 64 and IA-32 Architectures Optimization Reference Manual</a></li>

				      <li><a href="http://www.agner.org/optimize/">Software optimization resources</a></li>

				      <li><a href="http://software.intel.com/en-us/articles/intel-intrinsics-guide">Intel Intrinsics Guide</a></li>

				    </ul>

				  </li>

				  <li>

				    <p>LLVM</p>

				    <ul>

				      <li><a href="http://llvm.org/docs/LangRef.html">LLVM Language Reference Manual</a></li>

				      <li><a href="http://npcontemplation.blogspot.co.uk/2008/06/secret-of-llvm-c-bindings.html">The secret of LLVM C bindings</a></li>

				    </ul>

				  </li>

				  <li>

				    <p>General</p>

				    <ul>

				      <li><a href="http://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/">A trip through the Graphics Pipeline</a></li>

				      <li><a href="http://msdn.microsoft.com/en-us/library/gg615082.aspx#architecture">WARP Architecture and Performance</a></li>

				    </ul>

				  </li>

				</ul>

				</div>

				</body>

				</html>

									
										4

docs/opengles.html
									
												
				@@ -16,7 +16,7 @@

				<h1>OpenGL ES</h1>

				<p>Mesa implements OpenGL ES 1.1 and OpenGL ES 2.0.  More informations about

				<p>Mesa implements OpenGL ES 1.1 and OpenGL ES 2.0.  More information about

				OpenGL ES can be found at <a href="http://www.khronos.org/opengles/">

				http://www.khronos.org/opengles/</a>.</p>

				@@ -48,7 +48,7 @@ EGL drivers for your hardware.</p>

				<h3>Dispatch Table</h3>

				<p>OpenGL ES has an additional indirection when dispatching fucntions</p>

				<p>OpenGL ES has an additional indirection when dispatching functions</p>

				<pre>

				  Mesa:       glFoo() --&gt; _mesa_Foo()

									
										59

docs/openvg.html
									
											
				@@ -1,59 +0,0 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>OpenVG State Tracker</title>

				  <link rel="stylesheet" type="text/css" href="mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>OpenVG State Tracker</h1>

				<p>

				The current version of the OpenVG state tracker implements OpenVG 1.1.

				</p>

				<p>

				More informations about OpenVG can be found at

				<a href="http://www.khronos.org/openvg/">

				http://www.khronos.org/openvg/</a> .

				</p>

				<p>

				The OpenVG state tracker depends on the Gallium architecture and a working EGL implementation.

				Please refer to <a href="egl.html">Mesa EGL</a> for more information about EGL.

				</p>

				<h2>Building the library</h2>

				<ol>

				<li>Run <code>configure</code> with <code>--enable-openvg</code> and

				<code>--enable-gallium-egl</code>.  If you do not need OpenGL, you can add

				<code>--disable-opengl</code> to save the compilation time.</li>

				<li>Build and install Mesa as usual.</li>

				</ol>

				<h3>Sample build</h3>

				A sample build looks as follows:

				<pre>

				  $ ./configure --disable-opengl --enable-openvg --enable-gallium-egl

				  $ make

				  $ make install

				</pre>

				<p>It will install <code>libOpenVG.so</code>, <code>libEGL.so</code>, and one

				or more EGL drivers.</p>

				<h2>OpenVG Demos</h2>

				<p>OpenVG demos can be found in mesa/demos repository.</p>

				</div>

				</body>

				</html>

									
										44

docs/relnotes.html
									
												
				@@ -21,7 +21,51 @@ The release notes summarize what's new or changed in each Mesa release.

				</p>

				<ul>

				<li><a href="relnotes/10.5.2.html">10.5.2 release notes</a>

				<li><a href="relnotes/10.4.7.html">10.4.7 release notes</a>

				<li><a href="relnotes/10.5.1.html">10.5.1 release notes</a>

				<li><a href="relnotes/10.5.0.html">10.5.0 release notes</a>

				<li><a href="relnotes/10.4.6.html">10.4.6 release notes</a>

				<li><a href="relnotes/10.4.5.html">10.4.5 release notes</a>

				<li><a href="relnotes/10.4.4.html">10.4.4 release notes</a>

				<li><a href="relnotes/10.4.3.html">10.4.3 release notes</a>

				<li><a href="relnotes/10.4.2.html">10.4.2 release notes</a>

				<li><a href="relnotes/10.3.7.html">10.3.7 release notes</a>

				<li><a href="relnotes/10.4.1.html">10.4.1 release notes</a>

				<li><a href="relnotes/10.3.6.html">10.3.6 release notes</a>

				<li><a href="relnotes/10.4.html">10.4 release notes</a>

				<li><a href="relnotes/10.3.5.html">10.3.5 release notes</a>

				<li><a href="relnotes/10.3.4.html">10.3.4 release notes</a>

				<li><a href="relnotes/10.3.3.html">10.3.3 release notes</a>

				<li><a href="relnotes/10.3.2.html">10.3.2 release notes</a>

				<li><a href="relnotes/10.3.1.html">10.3.1 release notes</a>

				<li><a href="relnotes/10.2.9.html">10.2.9 release notes</a>

				<li><a href="relnotes/10.3.html">10.3 release notes</a>

				<li><a href="relnotes/10.2.8.html">10.2.8 release notes</a>

				<li><a href="relnotes/10.2.7.html">10.2.7 release notes</a>

				<li><a href="relnotes/10.2.6.html">10.2.6 release notes</a>

				<li><a href="relnotes/10.2.5.html">10.2.5 release notes</a>

				<li><a href="relnotes/10.2.4.html">10.2.4 release notes</a>

				<li><a href="relnotes/10.2.3.html">10.2.3 release notes</a>

				<li><a href="relnotes/10.2.2.html">10.2.2 release notes</a>

				<li><a href="relnotes/10.2.1.html">10.2.1 release notes</a>

				<li><a href="relnotes/10.2.html">10.2 release notes</a>

				<li><a href="relnotes/10.1.6.html">10.1.6 release notes</a>

				<li><a href="relnotes/10.1.5.html">10.1.5 release notes</a>

				<li><a href="relnotes/10.1.4.html">10.1.4 release notes</a>

				<li><a href="relnotes/10.1.3.html">10.1.3 release notes</a>

				<li><a href="relnotes/10.1.2.html">10.1.2 release notes</a>

				<li><a href="relnotes/10.1.1.html">10.1.1 release notes</a>

				<li><a href="relnotes/10.1.html">10.1 release notes</a>

				<li><a href="relnotes/10.0.5.html">10.0.5 release notes</a>

				<li><a href="relnotes/10.0.4.html">10.0.4 release notes</a>

				<li><a href="relnotes/10.0.3.html">10.0.3 release notes</a>

				<li><a href="relnotes/10.0.2.html">10.0.2 release notes</a>

				<li><a href="relnotes/10.0.1.html">10.0.1 release notes</a>

				<li><a href="relnotes/10.0.html">10.0 release notes</a>

				<li><a href="relnotes/9.2.5.html">9.2.5 release notes</a>

				<li><a href="relnotes/9.2.4.html">9.2.4 release notes</a>

				<li><a href="relnotes/9.2.3.html">9.2.3 release notes</a>

				<li><a href="relnotes/9.2.2.html">9.2.2 release notes</a>

				<li><a href="relnotes/9.2.1.html">9.2.1 release notes</a>

				<li><a href="relnotes/9.2.html">9.2 release notes</a>

									
										150

docs/relnotes/10.0.1.html
									
												
				@@ -0,0 +1,150 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.0.1 Release Notes / (December 12, 2013)</h1>

				<p>

				Mesa 10.0.1 is a bug fix release which fixes bugs found since the 10.0 release.

				</p>

				<p>

				Mesa 10.0.1 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>
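Because the reported version depends on the driver, an application usually inspects the string returned by glGetString(GL_VERSION) at run time. The sketch below parses such a string in Python, assuming the desktop OpenGL layout of "major.minor" or "major.minor.release" optionally followed by a space and vendor information. The function names and the sample string are assumptions, not output captured from a real driver, and OpenGL ES version strings are not handled.

```python
import re

# Version number, optional release number, optional vendor-specific text.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:\s+(.*))?$")


def parse_gl_version(version_string):
    """Split a desktop GL_VERSION string into (major, minor, vendor_info)."""
    match = _VERSION_RE.match(version_string.strip())
    if match is None:
        raise ValueError("unrecognized GL_VERSION string: %r" % version_string)
    major, minor = int(match.group(1)), int(match.group(2))
    vendor_info = match.group(4) or ""
    return major, minor, vendor_info


def meets_requirement(version_string, required=(3, 3)):
    """True if the reported version is at least the required (major, minor) pair."""
    major, minor, _ = parse_gl_version(version_string)
    return (major, minor) >= required
```

With the hypothetical string "2.1 Mesa 10.0.1", `meets_requirement` returns False for the default requirement of (3, 3).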

				<h2>MD5 checksums</h2>

				<pre>

				0a72ca5b36046a658bf6038326ff32ed  MesaLib-10.0.1.tar.bz2

				01bde35c912e504ba62caf1ef9f7022c  MesaLib-10.0.1.tar.gz

				59a174a11a89e6b1b8ee9c3f7e3c388c  MesaLib-10.0.1.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64323">Bug 64323</a> - Severe misrendering in Left 4 Dead 2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68838">Bug 68838</a> - GLSL: struct declarations produce a &quot;empty declaration warning&quot; in 9.2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69155">Bug 69155</a> - [NV50 gallium] [piglit] bin/varying-packing-simple triggers memory corruption/failures</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70250">Bug 70250</a> - weston-terminal rendering corrupted with output transform 90 and 270</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70601">Bug 70601</a> - [SNB Bisected]Piglit spec/ARB_texture_float/multisample-formats 2 GL_ARB_texture_float fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72230">Bug 72230</a> - Unable to extract MesaLib-10.0.0.tar.{gz,bz2} with bsdtar</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72325">Bug 72325</a> - [swrast] piglit glean fbo regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72327">Bug 72327</a> - [swrast] piglit glean pointSprite regression</li>

				</ul>

				<h2>Changes</h2>

				<p>The full set of changes can be viewed by using the following git command:</p>

				<pre>

				  git log mesa-10.0..mesa-10.0.1

				</pre>
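The per-author lists that follow resemble what `git shortlog` prints for the same range. As a sketch of that grouping, the Python below takes text with one commit per line in the form "author, tab, subject" (for instance from `git log --format='%an%x09%s'`) and groups it by author. The function names are hypothetical, and the ordering of subjects simply follows the input.

```python
def group_by_author(log_text):
    """Group 'author<TAB>subject' lines into {author: [subjects]}, sorted by author."""
    groups = {}
    for line in log_text.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        author, subject = line.split("\t", 1)
        groups.setdefault(author, []).append(subject)
    return dict(sorted(groups.items()))


def format_groups(groups):
    """Render groups as 'Author (count):' headers followed by indented subjects."""
    lines = []
    for author, subjects in groups.items():
        lines.append("%s (%d):" % (author, len(subjects)))
        lines.extend("  " + subject for subject in subjects)
    return "\n".join(lines)
```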

				<p>Axel Davy (2):</p>

				<ul>

				  <li>egl/wayland: Flush the wl_display at the end of SwapBuffers</li>

				  <li>Enable throttling in SwapBuffers</li>

				</ul>

				<p>Chad Versace (2):</p>

				<ul>

				  <li>i965/hsw: Apply non-msrt fast color clear w/a to all HSW GTs</li>

				  <li>i965: Add extra-alignment for non-msrt fast color clear for all hw (v2)</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>swrast: fix readback regression since inversion fix</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>automake: include only one copy VERSION in tarball</li>

				</ul>

				<p>Ian Romanick (3):</p>

				<ul>

				  <li>docs: Add 10.0 release md5sums</li>

				  <li>Remove a057b83 from the pick list</li>

				  <li>glsl: Don't emit empty declaration warning for a struct specifier</li>

				</ul>

				<p>Ilia Mirkin (8):</p>

				<ul>

				  <li>mesa: don't leak performance monitors on context destroy</li>

				  <li>nv50: Fix GPU_READING/WRITING bit removal</li>

				  <li>nouveau: avoid leaking fences while waiting</li>

				  <li>nv50: wait on the buf's fence before sticking it into pushbuf</li>

				  <li>nv50: enable h264 and mpeg4 for nv98+ (vp3, vp4.0)</li>

				  <li>nouveau/video: update h264 picparm field names based on usage</li>

				  <li>nouveau/video: update a few more h264 picparm field names</li>

				  <li>nv50: report 15 max inputs for fragment programs</li>

				</ul>

				<p>Jordan Justen (1):</p>

				<ul>

				  <li>dri megadriver_stub: add compatibility for older DRI loaders</li>

				</ul>

				<p>Kristian Høgsberg (2):</p>

				<ul>

				  <li>egl/wayland: Damage INT32_MAX x INT32_MAX region for eglSwapBuffers</li>

				  <li>egl/wayland: Send commit after flushing the driver context</li>

				</ul>

				<p>Maarten Lankhorst (1):</p>

				<ul>

				  <li>nouveau: Fix compiler warning regression</li>

				</ul>

				<p>Paul Berry (1):</p>

				<ul>

				  <li>i965/gen6: Fix multisample resolve blits for luminance/intensity 32F formats.</li>

				</ul>

				<p>Thomas Hellstrom (1):</p>

				<ul>

				  <li>st/xa: Bump major version number to 2</li>

				</ul>

				<p>Tom Stellard (2):</p>

				<ul>

				  <li>r300/compiler/tests: Fix segfault</li>

				  <li>r300/compiler/tests: Fix line length check in test parser</li>

				</ul>

				</div>

				</body>

				</html>

									
										161

docs/relnotes/10.0.2.html
									
												
				@@ -0,0 +1,161 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.0.2 Release Notes / (January 9, 2014)</h1>

				<p>

				Mesa 10.0.2 is a bug fix release which fixes bugs found since the 10.0.1 release.

				</p>

				<p>

				Mesa 10.0.2 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				de7d14baf0101b697c140d2f47ef27e9  MesaLib-10.0.2.tar.gz

				8544c0ab3e438a08b5103421ea15b6d2  MesaLib-10.0.2.tar.bz2

				181b0d6c1afca38e98a930d0e564ed90  MesaLib-10.0.2.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70740">Bug 70740</a> - HiZ on SNB causes GPU hang with WebGL web app</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72026">Bug 72026</a> - SIGSEGV in fs_visitor::visit(ir_dereference_variable*)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72264">Bug 72264</a> - GLSL error reporting</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72369">Bug 72369</a> - glitches in serious sam 3 with the sb shader backend</li>

				</ul>

				<h2>Changes</h2>

				<p>The full set of changes can be viewed by using the following git command:</p>

				<pre>

				  git log mesa-10.0.1..mesa-10.0.2

				</pre>

				<p>Aaron Watry (8):</p>

				<ul>

				  <li>clover: Remove unused variable</li>

				  <li>pipe_loader/sw: close dev-&gt;lib when initialization fails</li>

				  <li>radeon/compute: Stop leaking LLVMContexts in radeon_llvm_parse_bitcode</li>

				  <li>r600/compute: Free compiled kernels when deleting compute state</li>

				  <li>r600/compute: Use the correct FREE macro when deleting compute state</li>

				  <li>radeon/llvm: Free target data at end of optimization</li>

				  <li>st/vdpau: Destroy context when initialization fails</li>

				  <li>r600/pipe: Stop leaking context-&gt;start_compute_cs_cmd.buf on EG/CM</li>

				</ul>

				<p>Alex Deucher (1):</p>

				<ul>

				  <li>r600g: fix SUMO2 pci id</li>

				</ul>

				<p>Alexander von Gluck IV (1):</p>

				<ul>

				  <li>Haiku: Add in public GL kit headers</li>

				</ul>

				<p>Anuj Phogat (1):</p>

				<ul>

				  <li>mesa: Fix error code generation in glBeginConditionalRender()</li>

				</ul>

				<p>Carl Worth (2):</p>

				<ul>

				  <li>docs: Add md5sums for the 10.0.1 release.</li>

				  <li>Update version to 10.0.2</li>

				</ul>

				<p>Chad Versace (1):</p>

				<ul>

				  <li>i965/gen6: Fix HiZ hang in WebGL Google Maps</li>

				</ul>

				<p>Erik Faye-Lund (1):</p>

				<ul>

				  <li>glcpp: error on multiple #else/#elif directives</li>

				</ul>

				<p>Henri Verbeet (1):</p>

				<ul>

				  <li>i915: Add support for gl_FragData[0] reads.</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>nv50: fix a small leak on context destroy</li>

				</ul>

				<p>Jonathan Liu (2):</p>

				<ul>

				  <li>st/mesa: use pipe_sampler_view_release()</li>

				  <li>llvmpipe: use pipe_sampler_view_release() to avoid segfault</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>i965: Fix 3DSTATE_PUSH_CONSTANT_ALLOC_PS packet creation.</li>

				  <li>Revert "mesa: Remove GLXContextID typedef from glx.h."</li>

				</ul>

				<p>Kevin Rogovin (1):</p>

				<ul>

				  <li>Use line number information from entire function expression</li>

				</ul>

				<p>Kristian Høgsberg (1):</p>

				<ul>

				  <li>dri_util: Don't assume __DRIcontext-&gt;driverPrivate is a gl_context</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>mesa: fix interpretation of glClearBuffer(drawbuffer)</li>

				  <li>st/mesa: fix glClear with multiple colorbuffers and different formats</li>

				</ul>

				<p>Paul Berry (2):</p>

				<ul>

				  <li>glsl: Teach ir_variable_refcount about ir_loop::counter variables.</li>

				  <li>glsl: Fix inconsistent assumptions about ir_loop::counter.</li>

				</ul>

				<p>Vadim Girlin (1):</p>

				<ul>

				  <li>r600g/sb: fix stack size computation on evergreen</li>

				</ul>

				</div>

				</body>

				</html>

									
										206

docs/relnotes/10.0.3.html
									
												
				@@ -0,0 +1,206 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.0.3 Release Notes / (February 3, 2014)</h1>

				<p>

				Mesa 10.0.3 is a bug fix release which fixes bugs found since the 10.0.2 release.

				</p>

				<p>

				Mesa 10.0.3 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				5f9f463ef08129f6762106b434910adb  MesaLib-10.0.3.tar.bz2

				fb3997b6500e153bc32370cb3fc4ca9e  MesaLib-10.0.3.tar.gz

				a07b4b6b9eb449b88a6cb5061e51c331  MesaLib-10.0.3.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72708">Bug 72708</a> - Master fails to build with older gcc due to -msse4.1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72926">Bug 72926</a> - [REGRESSION,swrast] Memory-related crash with anti-aliasing enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73096">Bug 73096</a> - Query GL_RGBA_SIGNED_COMPONENTS_EXT missing</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73100">Bug 73100</a> - Please use AC_PATH_TOOL instead of AC_PATH_PROG for llvm-config</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73418">Bug 73418</a> - OpenCL hangs graphics on CAYMAN</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73473">Bug 73473</a> - Potential crash bug in src/gallium/auxiliary/rtasm/rtasm_execmem.c</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73915">Bug 73915</a> - sample shading + centroid broken since f5cfb4a</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73956">Bug 73956</a> - SIGSEGV when passing GL_NONE to glReadBuffer</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74026">Bug 74026</a> - Compiler rejects chained assignments involving array dereferences</li>

				</ul>

				<h2>Changes</h2>

				<p>The full set of changes can be viewed by using the following git command:</p>

				<pre>

				  git log mesa-10.0.2..mesa-10.0.3

				</pre>

				<p>Aaron Watry (2):</p>

				<ul>

				  <li>radeon: Move gfx/dma cs cleanup to r600_common_context_cleanup</li>

				  <li>st/dri: prevent leak of dri option default values</li>

				</ul>

				<p>Andreas Fänger (1):</p>

				<ul>

				  <li>swrast: fix delayed texel buffer allocation regression for OpenMP</li>

				</ul>

				<p>Anuj Phogat (3):</p>

				<ul>

				  <li>glsl: Disable ARB_texture_rectangle in shader version 100.</li>

				  <li>i965: Use sample barycentric coordinates with per sample shading</li>

				  <li>i965: Ignore 'centroid' interpolation qualifier in case of persample shading</li>

				</ul>

				<p>Brian Paul (3):</p>

				<ul>

				  <li>mesa: implement missing glGet(GL_RGBA_SIGNED_COMPONENTS_EXT) query</li>

				  <li>st/mesa: fix glReadBuffer(GL_NONE) segfault</li>

				  <li>draw: fix incorrect vertex size computation in LLVM drawing code</li>

				</ul>

				<p>Carl Worth (5):</p>

				<ul>

				  <li>Add md5sums for 10.0.2. release.</li>

				  <li>cherry-ignore: Ignore several patches not yet ready for the stable branch</li>

				  <li>Drop another couple of patches.</li>

				  <li>cherry-ignore: Ignore 4 patches at teh request of the author, (Anuj).</li>

				  <li>Update version to 10.0.3</li>

				</ul>

				<p>Chad Versace (1):</p>

				<ul>

				  <li>i965/gen6/blorp: Emit more flushes to workaround hangs</li>

				</ul>

				<p>Chris Forbes (1):</p>

				<ul>

				  <li>i965: fold offset into coord for textureOffset(gsampler2DRect)</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>mesa: use signed temporary variable to store _ColorDrawBufferIndexes</li>

				  <li>st/mesa: use signed temporary variable to store _ColorDrawBufferIndexes</li>

				  <li>nv50: access only the available amount of textures</li>

				  <li>nv50: access only the available amount of constbuf</li>

				  <li>gallium/rtasm: handle mmap failures appropriately</li>

				</ul>

				<p>Eric Anholt (2):</p>

				<ul>

				  <li>i965: Fix handling of MESA_pack_invert in blit (PBO) readpixels.</li>

				  <li>i965: Don't do the temporary-and-blit-copy for INVALIDATE_RANGE maps.</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>mesa: Add COMPRESSED_RGBA_S3TC_DXT1_EXT to COMPRESSED_TEXTURE_FORMATS for GLES</li>

				  <li>radeon / r200: Pass the API into _mesa_initialize_context</li>

				</ul>

				<p>Ilia Mirkin (2):</p>

				<ul>

				  <li>mesa: fix GL_COLOR_SUM enum for drivers without ARB_vertex_program</li>

				  <li>st/vdpau: don't return a device if the screen doesn't support NPOT</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>mesa: Use IROUND instead of roundf.</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>glsl: Rename "expr" to "lhs_expr" in vector_extract munging code.</li>

				  <li>glsl: Fix chained assignments of vector channels.</li>

				</ul>

				<p>Lauri Kasanen (1):</p>

				<ul>

				  <li>mesa: Fix build to properly check for supported compiler flags</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>st/mesa: use sRGB formats for MSAA resolving if destination is sRGB</li>

				  <li>gallium/util: util_format_srgb should not return FORMAT_NONE for sRGB formats</li>

				</ul>

				<p>Matt Turner (2):</p>

				<ul>

				  <li>glcpp: Define GL_EXT_shader_integer_mix in both GL and ES.</li>

				  <li>glx: Update glxext.h to revision 24777.</li>

				</ul>

				<p>Michał Górny (1):</p>

				<ul>

				  <li>Use AC_PATH_TOOL instead of AC_PATH_PROG for llvm-config.</li>

				</ul>

				<p>Paul Berry (1):</p>

				<ul>

				  <li>i965: Ensure that all necessary state is re-emitted if we run out of aperture.</li>

				</ul>

				<p>Paul Seidler (1):</p>

				<ul>

				  <li>build: move ARCH_LIBS definition outside of ASM definition</li>

				</ul>

				<p>Thomas Sondergaard (4):</p>

				<ul>

				  <li>mesa: Preliminary support for MSVC_VERSION=12.0</li>

				  <li>mesa: Fix compile error with MSVC 2013</li>

				  <li>mesa: Work around internal compiler error</li>

				  <li>mesa: Namespace qualify fma to override ambiguity with fma from math.h</li>

				</ul>

				<p>Tom Stellard (1):</p>

				<ul>

				  <li>r600g/compute: Emit DEALLOC_STATE on cayman after dispatching a compute shader.</li>

				</ul>

				</div>

				</body>

				</html>

									
										191

docs/relnotes/10.0.4.html
									
												
				@@ -0,0 +1,191 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.0.4 Release Notes / (March 12, 2014)</h1>

				<p>

				Mesa 10.0.4 is a bug fix release which fixes bugs found since the 10.0.3 release.

				</p>

				<p>

				Mesa 10.0.4 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				5a3c5b90776ec8a9fcd777c99e0607e2  MesaLib-10.0.4.tar.gz

				8b148869d2620b0720c8a8d2b7eb3e38  MesaLib-10.0.4.tar.bz2

				da2418d25bfbc273660af7e755fb367e  MesaLib-10.0.4.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71870">Bug 71870</a> - Metro: Last Light rendering issues</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72895">Bug 72895</a> - Missing trees in flightgear 2.12.1 with mesa 10.0.1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74251">Bug 74251</a> - Segfault in st_finalize_texture with Texture Buffer</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74723">Bug 74723</a> - main/shaderapi.c:407: detach_shader: Assertion `shProg-&gt;Shaders[j]-&gt;Type == 0x8B31 || shProg-&gt;Shaders[j]-&gt;Type == 0x8B30' failed.</li>

				</ul>

				<h2>Changes</h2>

				<p>The full set of changes can be viewed by using the following git command:</p>

				<pre>

				  git log mesa-10.0.3..mesa-10.0.4

				</pre>

				<p>Anuj Phogat (4):</p>

				<ul>

				  <li>mesa: Generate correct error code in glDrawBuffers()</li>

				  <li>mesa: Add GL_TEXTURE_CUBE_MAP_ARRAY to legal_get_tex_level_parameter_target()</li>

				  <li>glsl: Fix condition to generate shader link error</li>

				  <li>i965: Fix the region's pitch condition to use blitter</li>

				</ul>

				<p>Brian Paul (8):</p>

				<ul>

				  <li>r200: move driContextSetFlags(ctx) call after ctx var is initialized</li>

				  <li>radeon: move driContextSetFlags(ctx) call after ctx var is initialized</li>

				  <li>gallium/auxiliary/indices: replace free() with FREE()</li>

				  <li>draw: fix incorrect color of flat-shaded clipped lines</li>

				  <li>st/mesa: avoid sw fallback for getting/decompressing textures</li>

				  <li>mesa: update assertion in detach_shader() for geom shaders</li>

				  <li>mesa: do depth/stencil format conversion in glGetTexImage</li>

				  <li>softpipe: use 64-bit arithmetic in softpipe_resource_layout()</li>

				</ul>

				<p>Carl Worth (4):</p>

				<ul>

				  <li>docs: Add md5sums for 10.0.3 release</li>

				  <li>main: Avoid double-free of shader Label</li>

				  <li>get-pick-list: Update to only find patches nominated for the 10.0 branch</li>

				  <li>Update version to 10.0.4</li>

				</ul>

				<p>Chris Forbes (1):</p>

				<ul>

				  <li>i965: Validate (and resolve) all the bound textures.</li>

				</ul>

				<p>Christian König (1):</p>

				<ul>

				  <li>radeon/uvd: fix feedback buffer handling v2</li>

				</ul>

				<p>Daniel Kurtz (1):</p>

				<ul>

				  <li>glsl: Add locking to builtin_builder singleton</li>

				</ul>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>dri/nouveau: Pass the API into _mesa_initialize_context</li>

				  <li>nv50: correctly calculate the number of vertical blocks during transfer map</li>

				  <li>dri/i9*5: correctly calculate the amount of system memory</li>

				</ul>

				<p>Fredrik Höglund (3):</p>

				<ul>

				  <li>mesa: Preserve the NewArrays state when copying a VAO</li>

				  <li>glx: Fix the default values for GLXFBConfig attributes</li>

				  <li>glx: Fix the GLXFBConfig attrib sort priorities</li>

				</ul>

				<p>Hans (2):</p>

				<ul>

				  <li>util: don't define isfinite(), isnan() for MSVC &gt;= 1800</li>

				  <li>mesa: don't define c99 math functions for MSVC &gt;= 1800</li>

				</ul>

				<p>Ian Romanick (6):</p>

				<ul>

				  <li>meta: Release resources used by decompress_texture_image</li>

				  <li>meta: Release resources used by _mesa_meta_DrawPixels</li>

				  <li>meta: Fallback to software for GetTexImage of compressed GL_TEXTURE_CUBE_MAP_ARRAY</li>

				  <li>meta: Consistenly use non-Apple VAO functions</li>

				  <li>glcpp: Only warn for macro names containing __</li>

				  <li>glsl: Only warn for macro names containing __</li>

				</ul>

				<p>Ilia Mirkin (3):</p>

				<ul>

				  <li>nv30: report 8 maximum inputs</li>

				  <li>nouveau/video: make sure that firmware is present when checking caps</li>

				  <li>nouveau: fix chipset checks for nv1a by using the oclass instead</li>

				</ul>

				<p>Julien Cristau (1):</p>

				<ul>

				  <li>glx/dri2: fix build failure on HURD</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>glsl: Don't lose precision qualifiers when encountering "centroid".</li>

				  <li>i965: Create a hardware context before initializing state module.</li>

				</ul>

				<p>Kusanagi Kouichi (1):</p>

				<ul>

				  <li>targets/vdpau: Always use c++ to link</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>st/mesa: fix crash when a shader uses a TBO and it's not bound</li>

				</ul>

				<p>Matt Turner (1):</p>

				<ul>

				  <li>glsl: Initialize ubo_binding_mask flags to zero.</li>

				</ul>

				<p>Paul Berry (2):</p>

				<ul>

				  <li>glsl: Make condition_to_hir() callable from outside ast_iteration_statement.</li>

				  <li>glsl: Fix continue statements in do-while loops.</li>

				</ul>

				<p>Tom Stellard (1):</p>

				<ul>

				  <li>r600g/compute: PIPE_CAP_COMPUTE should be false for pre-evergreen GPUs</li>

				</ul>

				<p>Topi Pohjolainen (1):</p>

				<ul>

				  <li>i965/blorp: do not use unnecessary hw-blending support</li>

				</ul>

				</div>

				</body>

				</html>

									
										173

docs/relnotes/10.0.5.html
									
										Normal file
									
												
				@@ -0,0 +1,173 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.0.5 Release Notes / April 18, 2014</h1>

				<p>

				Mesa 10.0.5 is a bug fix release which fixes bugs found since the 10.0.4 release.

				</p>

				<p>

				Mesa 10.0.5 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				db606aadd0fe321f3664099677d159bc  MesaLib-10.0.5.tar.gz

				e6009ccd8898d7104bb325b6af9ec354  MesaLib-10.0.5.tar.bz2

				c8ab9e502542bf32299a4df85b0b704d  MesaLib-10.0.5.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=58660">Bug 58660</a> - CAYMAN broken with HyperZ on</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64471">Bug 64471</a> - Radeon HD6570 lockup in Brütal Legend with HyperZ</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66352">Bug 66352</a> - GPU lockup in L4D2 on TURKS with HyperZ</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68799">Bug 68799</a> - [APITRACE] Hyper-Z lockup with Falcon BMS 4.32u6 on CAYMAN</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71547">Bug 71547</a> - compilation failure :#error &quot;SSE4.1 instruction set not enabled&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72685">Bug 72685</a> - [radeonsi hyperz] Artifacts in Unigine Sanctuary</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73088">Bug 73088</a> - [HyperZ] Juniper (6770): Gone Home / Unigine Heaven 4.0 lock up system after several minutes of use</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74428">Bug 74428</a> - hyperz causes gpu hang in Counter-strike: Source</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74803">Bug 74803</a> - [r600g] HyperZ broken on RV630 (Cogs shadows are broken)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74863">Bug 74863</a> - [r600g] HyperZ broken on RV770 and CYPRESS (Left 4 Dead 2 trees corruption) bisected!</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74892">Bug 74892</a> - HyperZ GPU lockup with radeonsi 7970M PITCAIRN and Distance Alpha game</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74988">Bug 74988</a> - Buffer overrun (segfault) decompressing ETC2 texture in GLBenchmark 3.0 Manhattan</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75279">Bug 75279</a> - XCloseDisplay() takes one minute around nouveau_dri.so, freezing Firefox startup</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77102">Bug 77102</a> - gallium nouveau has no profile in vdpau and libva</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77207">Bug 77207</a> - [ivb/hsw] batch overwritten with garbage</li>

				</ul>

				<h2>Changes</h2>

				<p>The full set of changes can be viewed by using the following git command:</p>

				<pre>

				  git log mesa-10.0.4..mesa-10.0.5

				</pre>

				<p>Alex Deucher (1):</p>

				<ul>

				  <li>radeon: reverse DBG_NO_HYPERZ logic</li>

				</ul>

				<p>Brian Paul (9):</p>

				<ul>

				  <li>mesa: add unpacking code for MESA_FORMAT_Z32_FLOAT_S8X24_UINT</li>

				  <li>mesa: fix copy &amp; paste bugs in pack_ubyte_SARGB8()</li>

				  <li>mesa: fix copy &amp; paste bugs in pack_ubyte_SRGB8()</li>

				  <li>mesa: fix unpack_Z32_FLOAT_X24S8() / unpack_Z32_FLOAT() mix-up</li>

				  <li>st/mesa: add null pointer checking in query object functions</li>

				  <li>mesa: fix glMultiDrawArrays inside a display list</li>

				  <li>cso: fix sampler view count in cso_set_sampler_views()</li>

				  <li>svga: replace sampler assertion with conditional</li>

				  <li>svga: move LIST_INITHEAD(dirty_buffers) earlier in svga_context_create()</li>

				</ul>

				<p>Carl Worth (3):</p>

				<ul>

				  <li>docs: Add md5sums for the 10.0.4 release.</li>

				  <li>Ignore patches which don't apply.</li>

				  <li>Update version to 10.0.5</li>

				</ul>

				<p>Christian König (2):</p>

				<ul>

				  <li>st/mesa: recreate sampler view on context change v3</li>

				  <li>st/mesa: fix sampler view handling with shared textures v4</li>

				</ul>

				<p>Courtney Goeltzenleuchter (1):</p>

				<ul>

				  <li>mesa: add bounds checking to eliminate buffer overrun</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>mesa: return v.value_int64 when the requested type is TYPE_INT64</li>

				  <li>glx: drop obsolete _XUnlock_Mutex in __glXInitialize error path</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>i965: Fix buffer overruns in MSAA MCS buffer clearing.</li>

				</ul>

				<p>Ilia Mirkin (6):</p>

				<ul>

				  <li>nouveau: fix fence waiting logic in screen destroy</li>

				  <li>nv50: adjust blit_3d handling of ms output textures</li>

				  <li>mesa/main: condition GL_DEPTH_STENCIL on ARB_depth_texture</li>

				  <li>nouveau: add forgotten GL_COMPRESSED_INTENSITY to texture format list</li>

				  <li>nouveau: there may not have been a texture if the fbo was incomplete</li>

				  <li>nouveau: fix firmware check on nvd7/nvd9</li>

				</ul>

				<p>Johannes Nixdorf (1):</p>

				<ul>

				  <li>configure.ac: fix the detection of expat with pkg-config</li>

				</ul>

				<p>Jonathan Gray (1):</p>

				<ul>

				  <li>gallium: add endian detection for OpenBSD</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>draw: Duplicate TGSI tokens in draw_pipe_pstipple module.</li>

				</ul>

				<p>Matt Turner (1):</p>

				<ul>

				  <li>mesa: Wrap SSE4.1 code in #ifdef __SSE4_1__.</li>

				</ul>

				<p>Paul Berry (1):</p>

				<ul>

				  <li>i965/gen7: Prefer vertical alignment of 4 when possible.</li>

				</ul>

				</div>

				</body>

				</html>

									
										83

docs/relnotes/10.0.html
									
												
				@@ -14,7 +14,7 @@

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.0 Release Notes / TBD</h1>

				<h1>Mesa 10.0 Release Notes / (November 30th, 2013)</h1>

				<p>

				Mesa 10.0 is a new development release.

				@@ -33,7 +33,9 @@ because compatibility contexts are not supported.

				<h2>MD5 checksums</h2>

				<pre>

				TBD.

				b38626b96c664db67a534d7859682436  MesaLib-10.0.0.tar.gz

				f3fe55d9735bea158bbe97ed9a0da819  MesaLib-10.0.0.tar.bz2

				c6ee1ce51e3bf35947d2978b872daf51  MesaLib-10.0.0.zip

				</pre>

				@@ -55,16 +57,89 @@ Note: some of the new features are only available with certain drivers.

				<li>GL_ARB_vertex_attrib_binding</li>

				<li>GL_ARB_vertex_type_10f_11f_11f_rev on i965 and r600g</li>

				<li>GL_KHR_debug</li>

				<li>GLX_MESA_query_renderer</li>

				</ul>

				<h2>Bug fixes</h2>

				TBD.

				<p>Attempts have been made to <b>not</b> include bugs fixed in previous 9.2

				releases or bugs that were regressions during 10.0 development. This list is

				likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=47755">Bug 47755</a> - [glsl-compiler] no error checking when Interpolation qualifier for built-in variable is different in vertex and fragment shader</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=52171">Bug 52171</a> - [gallium/r600/clover] Simple benchmarks failed to run</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=53077">Bug 53077</a> - [IVB] Output error with msaa when both of framebuffer and source color's alpha are not 1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54867">Bug 54867</a> - bug in r300 compiler</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60929">Bug 60929</a> - [r600-llvm] mono games with opengl are blocking on start</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62142">Bug 62142</a> - Mesa/demo mipmap_limits upside down with running by SOFTWARE</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62698">Bug 62698</a> - [bisected] WebGL demo &quot;Consumed&quot;: texstate.c:628: update_texture_state: Assertion „__builtin_popcount(enabledTargets) == 1“ failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64225">Bug 64225</a> - bfgminer --scyte generates Segmentation Fault on Northern Island</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64226">Bug 64226</a> - python-opencl package generate segmentation fault at pipe_r600.so</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64261">Bug 64261</a> - [SNB Bisected]Ogles3conform GL3Tests_color_buffer_float_color_buffer_float_clamp_fixed.test fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66213">Bug 66213</a> - Certain Mesa Demos Rendering Inverted (vertically)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66806">Bug 66806</a> - [softpipe] glxgears floating point exception</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=67921">Bug 67921</a> - [bisected commit 883987] crosscompiling fails with util/u_cpu_detect.c:247:4: error: 'asm' undeclared (first use in this function)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68162">Bug 68162</a> - [radeonsi] texture rendering is broken in Source-Engine games</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68451">Bug 68451</a> - Texture flicker in native Dota2 in mesa 9.2.0rc1</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68503">Bug 68503</a> - Graphical glitches in Serious Sam 3 when SB is enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68792">Bug 68792</a> - Problems during playback of h264 files using UVD and VLC on AMD E-350 CPU</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68845">Bug 68845</a> - VDPAU/UVD regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69078">Bug 69078</a> - Modern Warfare (1, 2 and 3) broken in Wine on SNB</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69321">Bug 69321</a> - starting openCL crashes/boots system</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70042">Bug 70042</a> - Major texture flickering in Dota 2 (r600g on HD 6950)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70088">Bug 70088</a> - Glamor on r600g crashes Xserver</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70123">Bug 70123</a> - Freeze caused by 'winsys/radeon: remove cs_queue_empty' commit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70327">Bug 70327</a> - Casting floating point variable to integer not working properly while constant gets converted properly</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70891">Bug 70891</a> - CL_INVALID_BUILD_OPTIONS results in CL_INVALID_DEVICE when asking for build log</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70913">Bug 70913</a> - [PIGLIT,radeonsi] crash in &quot;spec/EXT_framebuffer_multisample/sample-alpha-to-coverage 4 depth&quot; (buffer overflow)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71022">Bug 71022</a> - configure: error: Expat required for DRI.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71110">Bug 71110</a> - xorg_driver.c:1030:2: error: too many arguments to function ‘DamageUnregister’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71172">Bug 71172</a> - Segfault when running glxinfo. NV25GL [Quadro4 900 XGL]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71512">Bug 71512</a> - dlopen.h:54: undefined reference to `dlopen'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71870">Bug 71870</a> - Metro: Last Light rendering issues</li>

				</ul>

				<h2>Changes</h2>

				TBD.

				<ul>

				<li>Removed X.Org state tracker (unmaintained and broken)</li>

				<li>Removed the video-accel r300 targets</li>

				<li>Removed the video-accel softpipe targets</li>

				</ul>

				</div>

				</body>

									
										254

docs/relnotes/10.1.1.html
									
										Normal file
									
												
				@@ -0,0 +1,254 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.1.1 Release Notes / April 18, 2014</h1>

				<p>

				Mesa 10.1.1 is a bug fix release which fixes bugs found since the 10.1 release.

				</p>

				<p>

				Mesa 10.1.1 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				96e63674ccfa98e7ec6eb4fee3f770c3  MesaLib-10.1.1.tar.gz

				1fde7ed079df7aeb9b6a744ca033de8d  MesaLib-10.1.1.tar.bz2

				e64d0a562638664b13d2edf22321df59  MesaLib-10.1.1.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=71547">Bug 71547</a> - compilation failure :#error &quot;SSE4.1 instruction set not enabled&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74868">Bug 74868</a> - r600g: Diablo III Crashes After a few minutes</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74988">Bug 74988</a> - Buffer overrun (segfault) decompressing ETC2 texture in GLBenchmark 3.0 Manhattan</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75279">Bug 75279</a> - XCloseDisplay() takes one minute around nouveau_dri.so, freezing Firefox startup</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75543">Bug 75543</a> - OSMesa Gallium OSMesaMakeCurrent</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75660">Bug 75660</a> - u_inlines.h:277:pipe_buffer_map_range: Assertion `length' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76323">Bug 76323</a> - GLSL compiler ignores layout(binding=N) on uniform blocks</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76377">Bug 76377</a> - DRI3 should only be enabled on Linux due to a udev dependency</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76749">Bug 76749</a> - [HSW] DOTA world lighting has no effect</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77102">Bug 77102</a> - gallium nouveau has no profile in vdpau and libva</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77207">Bug 77207</a> - [ivb/hsw] batch overwritten with garbage</li>

				</ul>

				<h2>Changes</h2>

				<p>Aaron Watry (1):</p>

				<ul>

				  <li>gallium/util: Fix memory leak</li>

				</ul>

				<p>Alexander von Gluck IV (1):</p>

				<ul>

				  <li>haiku: Fix build through scons corrections and viewport fixes</li>

				</ul>

				<p>Anuj Phogat (2):</p>

				<ul>

				  <li>mesa: Set initial internal format of a texture to GL_RGBA</li>

				  <li>mesa: Allow GL_DEPTH_COMPONENT and GL_DEPTH_STENCIL combinations in glTexImage{123}D()</li>

				</ul>

				<p>Brian Paul (12):</p>

				<ul>

				  <li>softpipe: use 64-bit arithmetic in softpipe_resource_layout()</li>

				  <li>mesa: don't call ctx-&gt;Driver.ClearBufferSubData() if size==0</li>

				  <li>st/osmesa: check buffer size when searching for buffers</li>

				  <li>mesa: fix copy &amp; paste bugs in pack_ubyte_SARGB8()</li>

				  <li>mesa: fix copy &amp; paste bugs in pack_ubyte_SRGB8()</li>

				  <li>c11/threads: don't include assert.h if the assert macro is already defined</li>

				  <li>mesa: fix unpack_Z32_FLOAT_X24S8() / unpack_Z32_FLOAT() mix-up</li>

				  <li>st/mesa: add null pointer checking in query object functions</li>

				  <li>mesa: fix glMultiDrawArrays inside a display list</li>

				  <li>cso: fix sampler view count in cso_set_sampler_views()</li>

				  <li>svga: replace sampler assertion with conditional</li>

				  <li>svga: move LIST_INITHEAD(dirty_buffers) earlier in svga_context_create()</li>

				</ul>

				<p>Carl Worth (3):</p>

				<ul>

				  <li>cherry-ignore: Ignore a few patches</li>

				  <li>glsl: Allow explicit binding on atomics again</li>

				  <li>Update VERSION to 10.1.1</li>

				</ul>

				<p>Chia-I Wu (1):</p>

				<ul>

				  <li>i965/vec4: fix record clearing in copy propagation</li>

				</ul>

				<p>Christian König (2):</p>

				<ul>

				  <li>st/mesa: recreate sampler view on context change v3</li>

				  <li>st/mesa: fix sampler view handling with shared textures v4</li>

				</ul>

				<p>Courtney Goeltzenleuchter (1):</p>

				<ul>

				  <li>mesa: add bounds checking to eliminate buffer overrun</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>nv50: add missing brackets when handling the samplers array</li>

				  <li>mesa: return v.value_int64 when the requested type is TYPE_INT64</li>

				  <li>configure: enable dri3 only for linux</li>

				  <li>glx: drop obsolete _XUnlock_Mutex in __glXInitialize error path</li>

				  <li>configure: cleanup libudev handling</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>i965: Fix buffer overruns in MSAA MCS buffer clearing.</li>

				</ul>

				<p>Hans (2):</p>

				<ul>

				  <li>util: don't define isfinite(), isnan() for MSVC &gt;= 1800</li>

				  <li>mesa: don't define c99 math functions for MSVC &gt;= 1800</li>

				</ul>

				<p>Ian Romanick (7):</p>

				<ul>

				  <li>linker: Split set_uniform_binding into separate functions for blocks and samplers</li>

				  <li>linker: Various trivial clean-ups in set_sampler_binding</li>

				  <li>linker: Fold set_uniform_binding into call site</li>

				  <li>linker: Clean up "unused parameter" warnings</li>

				  <li>linker: Set block bindings based on UniformBlocks rather than UniformStorage</li>

				  <li>linker: Set binding for all elements of UBO array</li>

				  <li>glsl: Propagate explicit binding information from the AST all the way to the linker</li>

				</ul>

				<p>Ilia Mirkin (8):</p>

				<ul>

				  <li>nouveau: fix fence waiting logic in screen destroy</li>

				  <li>nv50: adjust blit_3d handling of ms output textures</li>

				  <li>loader: add special logic to distinguish nouveau from nouveau_vieux</li>

				  <li>mesa/main: condition GL_DEPTH_STENCIL on ARB_depth_texture</li>

				  <li>nouveau: add forgotten GL_COMPRESSED_INTENSITY to texture format list</li>

				  <li>nouveau: there may not have been a texture if the fbo was incomplete</li>

				  <li>nvc0/ir: move sample id to second source arg to fix sampler2DMS</li>

				  <li>nouveau: fix firmware check on nvd7/nvd9</li>

				</ul>

				<p>Johannes Nixdorf (1):</p>

				<ul>

				  <li>configure.ac: fix the detection of expat with pkg-config</li>

				</ul>

				<p>Jonathan Gray (7):</p>

				<ul>

				  <li>gallium: add endian detection for OpenBSD</li>

				  <li>loader: use 0 instead of FALSE which isn't defined</li>

				  <li>loader: don't limit the non-udev path to only android</li>

				  <li>megadriver_stub.c: don't use _GNU_SOURCE to gate the compat code</li>

				  <li>egl/dri2: don't require libudev to build drm/wayland platforms</li>

				  <li>egl/dri2: use drm macros to construct device name</li>

				  <li>configure: don't require libudev for gbm or egl drm/wayland</li>

				</ul>

				<p>José Fonseca (4):</p>

				<ul>

				  <li>c11/threads: Fix nano to milisecond conversion.</li>

				  <li>mapi/u_thread: Use GetCurrentThreadId</li>

				  <li>c11/threads: Don't implement thrd_current on Windows.</li>

				  <li>draw: Duplicate TGSI tokens in draw_pipe_pstipple module.</li>

				</ul>

				<p>Kenneth Graunke (4):</p>

				<ul>

				  <li>i965/fs: Fix register comparisons in saturate propagation.</li>

				  <li>glsl: Fix lack of i2u in lower_ubo_reference.</li>

				  <li>i965: Stop advertising GL_MESA_ycbcr_texture.</li>

				  <li>glsl: Try vectorizing when seeing a repeated assignment to a channel.</li>

				</ul>

				<p>Marek Olšák (13):</p>

				<ul>

				  <li>r600g: fix texelFetchOffset GLSL functions</li>

				  <li>r600g: fix blitting the last 2 mipmap levels for Evergreen</li>

				  <li>mesa: fix the format of glEdgeFlagPointer</li>

				  <li>r600g,radeonsi: fix MAX_TEXTURE_3D_LEVELS and MAX_TEXTURE_ARRAY_LAYERS limits</li>

				  <li>st/mesa: fix per-vertex edge flags and GLSL support (v2)</li>

				  <li>mesa: mark GL_RGB9_E5 as not color-renderable</li>

				  <li>mesa: fix texture border handling for cube arrays</li>

				  <li>mesa: allow generating mipmaps for cube arrays</li>

				  <li>mesa: fix software fallback for generating mipmaps for cube arrays</li>

				  <li>mesa: fix software fallback for generating mipmaps for 3D textures</li>

				  <li>st/mesa: fix generating mipmaps for cube arrays</li>

				  <li>st/mesa: drop the lowering of quad strips to triangle strips</li>

				  <li>r600g: implement edge flags</li>

				</ul>

				<p>Matt Turner (4):</p>

				<ul>

				  <li>mesa: Wrap SSE4.1 code in #ifdef __SSE4_1__.</li>

				  <li>i965/fs: Fix off-by-one in saturate propagation.</li>

				  <li>i965/fs: Don't propagate saturate modifiers into partial writes.</li>

				  <li>i965/fs: Don't propagate saturation modifiers if there are source modifiers.</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>r600g: Don't leak bytecode on shader compile failure</li>

				</ul>

				<p>Mike Stroyan (1):</p>

				<ul>

				  <li>i965: Avoid dependency hints on math opcodes</li>

				</ul>

				<p>Thomas Hellstrom (5):</p>

				<ul>

				  <li>winsys/svga: Replace the query mm buffer pool with a slab pool v3</li>

				  <li>winsys/svga: Update the vmwgfx_drm.h header to latest version from kernel</li>

				  <li>winsys/svga: Fix prime surface references also for guest-backed surfaces</li>

				  <li>st/xa: Bind destination before setting new state</li>

				  <li>st/xa: Make sure unused samplers are set to NULL</li>

				</ul>

				<p>Tom Stellard (1):</p>

				<ul>

				  <li>configure: Use LLVM shared libraries by default</li>

				</ul>

				</div>

				</body>

				</html>

									
										179

docs/relnotes/10.1.2.html
									
										Normal file
									
												
				@@ -0,0 +1,179 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.1.2 Release Notes / (May 5, 2014)</h1>

				<p>

				Mesa 10.1.2 is a bug fix release which fixes bugs found since the 10.1.1 release.

				</p>

				<p>

				Mesa 10.1.2 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				37d79f94b1f41852a89d1fc3900bea76  MesaLib-10.1.2.tar.gz

				28b60d15ac9f364da1e0155911eaf44e  MesaLib-10.1.2.tar.bz2

				05300039085a65fc53c5472c4bb5747a  MesaLib-10.1.2.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=27499">Bug 27499</a> - [855GM i915] GL_LINE_STIPPLE displays incorrect colors</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75723">Bug 75723</a> - (regression since Linux 3.14?) brw_get_graphics_reset_status: Assertion `brw-&gt;hw_ctx != ((void *)0)' failed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76894">Bug 76894</a> - Piglit/spec/EXT_framebuffer_object/fbo-bind-renderbuffer failed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77702">Bug 77702</a> - [i965 Bisected]Piglit spec/NV_conditional_render_blitframebuffer fails</li>

				</ul>

				<h2>Changes</h2>

				<p>Ander Conselvan de Oliveira (2):</p>

				<ul>

				  <li>gbm/dri: Fix out-of-memory error path in dri_device_create()</li>

				  <li>egl: Protect use of gbm_dri with ifdef HAVE_DRM_PLATFORM</li>

				</ul>

				<p>Anuj Phogat (27):</p>

				<ul>

				  <li>mesa: Fix glGetVertexAttribi(GL_VERTEX_ATTRIB_ARRAY_SIZE)</li>

				  <li>swrast: Add glBlitFramebuffer to commands affected by conditional rendering</li>

				  <li>mesa: Fix error condition for multisample proxy texture targets</li>

				  <li>i965: Put an assertion to check valid varying_to_slot[varying]</li>

				  <li>i965: Fix component mask and varying_to_slot mapping for gl_Layer</li>

				  <li>i965: Fix component mask and varying_to_slot mapping for gl_ViewportIndex</li>

				  <li>mesa: Add helper function _mesa_is_format_integer()</li>

				  <li>mesa: Add error condition for integer formats in glGetTexImage()</li>

				  <li>mesa: Add an error condition in glGetFramebufferAttachmentParameteriv()</li>

				  <li>mesa: Fix error code generation in glReadPixels()</li>

				  <li>glsl: Allow overlapping locations for vertex input attributes</li>

				  <li>mesa: Fix querying location of nth element of an array variable</li>

				  <li>mesa: Use location VERT_ATTRIB_GENERIC0 for vertex attribute 0</li>

				  <li>glsl: Compile error if fs defines conflicting qualifiers for gl_FragCoord</li>

				  <li>glsl: Compile error if fs uses gl_FragCoord before first redeclaration</li>

				  <li>mesa: Add entry for extension ARB_texture_stencil8</li>

				  <li>mesa: Add error condition for format=STENCIL_INDEX in glGetTexImage()</li>

				  <li>i965: Fix crash in do_blit_readpixels()</li>

				  <li>mesa: Add missing types in _mesa_texstore_xx_xx() functions</li>

				  <li>mesa: Allow srcFormat=GL_DEPTH_STENCIL in _mesa_texstore_xx_xx() functions</li>

				  <li>mesa: Add new helper function _mesa_unpack_depth_stencil_row()</li>

				  <li>mesa: Add support to unpack depth-stencil texture in to FLOAT_32_UNSIGNED_INT_24_8_REV</li>

				  <li>mesa: Allow FLOAT_32_UNSIGNED_INT_24_8_REV in get_tex_depth_stencil()</li>

				  <li>i965: Add glBlitFramebuffer to commands affected by conditional rendering</li>

				  <li>glsl: Use switch to allow adding more shader types</li>

				  <li>glsl: Link error if fs defines conflicting qualifiers for gl_FragCoord</li>

				  <li>glsl: Apply the link error conditions to GL_ARB_fragment_coord_conventions</li>

				</ul>

				<p>Benjamin Bellec (1):</p>

				<ul>

				  <li>mesa: fix GetStringi error message with correct function name</li>

				</ul>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>swrast: allocate swrast_texture_image::ImageSlices array if needed</li>

				</ul>

				<p>Carl Worth (4):</p>

				<ul>

				  <li>docs: Add the MD5 sums for the 10.1.1 release tar files.</li>

				  <li>cherry-ignore: Ignore a patch causing a regression</li>

				  <li>cherry-ignore: Drop an ignored patch now that piglit has been updated.</li>

				  <li>Update VERSION to 10.1.2</li>

				</ul>

				<p>Chris Forbes (1):</p>

				<ul>

				  <li>glsl: Only allow `invariant` on shader in/out between stages.</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>i965: Fix render-to-texture in non-FinishRenderTexture cases.</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>dri3: Enable GLX_MESA_query_renderer on DRI3 too</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>i965: Don't enable reset notification support on Gen4-5.</li>

				  <li>i965: Actually emit PIPELINE_SELECT and 3DSTATE_VF_STATISTICS.</li>

				</ul>

				<p>Marek Olšák (10):</p>

				<ul>

				  <li>r300g: don't crash when getting NULL colorbuffers</li>

				  <li>st/mesa: remove trailing NULL colorbuffers</li>

				  <li>r600g: fix edge flags and layered rendering on R600-R700</li>

				  <li>r600g: disable async DMA on R700</li>

				  <li>r600g: fix MSAA resolve on R6xx when the destination is 1D-tiled</li>

				  <li>r600g: fix flushing on RV670, RS780, RS880 again</li>

				  <li>r600g: fix buffer copying on R600-R700</li>

				  <li>r600g: fix for broken CULL_FRONT behavior on R6xx</li>

				  <li>r600g: fix for an MSAA hang on RV770</li>

				  <li>r600g: fix hang on RV740 by using DX_RASTERIZATION_KILL instead of SX_MISC</li>

				</ul>

				<p>Michel Dänzer (2):</p>

				<ul>

				  <li>r600g: Disable LLVM by default at runtime for graphics</li>

				  <li>st/mesa: Fix NULL pointer dereference for incomplete framebuffers</li>

				</ul>

				<p>Neil Roberts (1):</p>

				<ul>

				  <li>wayland: Fix the logic in disabling the prime capability</li>

				</ul>

				<p>Samuel Iglesias Gonsalvez (1):</p>

				<ul>

				  <li>mesa: fix check for dummy renderbuffer in _mesa_FramebufferRenderbufferEXT()</li>

				</ul>

				<p>Thomas Hellstrom (1):</p>

				<ul>

				  <li>st/xa: Cache render target surface</li>

				</ul>

				<p>nick (1):</p>

				<ul>

				  <li>swrast: Fix vertex color in _swsetup_Translate()</li>

				</ul>

				</div>

				</body>

				</html>

									
										90

docs/relnotes/10.1.3.html
									
												
				@@ -0,0 +1,90 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.1.3 Release Notes / (May 9, 2014)</h1>

				<p>

				Mesa 10.1.3 is a bug fix release which fixes bugs found since the 10.1.2 release.

				</p>

				<p>

				Note: Mesa 10.1.3 is being released sooner than originally scheduled to make

				available a fix for a performance regression that was inadvertently introduced

				in Mesa 10.1.2. The performance regression is reported to make VMware

				swapbuffers fall back to software.

				</p>

				<p>

				Mesa 10.1.3 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				665fe1656aaa2c37b32042068aff92cb  MesaLib-10.1.3.tar.gz

				ba6dbe2b9cab0b4de840c996b9b6a3ad  MesaLib-10.1.3.tar.bz2

				4e6f26330a63d3c47e62ac4bdead39e8  MesaLib-10.1.3.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77245">Bug 77245</a> - Bogus GL_ARB_explicit_attrib_location layout identifier warnings</li>

				</ul>

				<h2>Changes</h2>

				<p>Carl Worth (3):</p>

				<ul>

				  <li>docs: Add MD5 sums for Mesa 10.1.2</li>

				  <li>get-pick-list.sh: Require explicit "10.1" for nominating stable patches</li>

				  <li>VERSION: Update to 10.1.3</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>mesa: Fix MaxNumLayers for 1D array textures.</li>

				  <li>i965: Fix depth (array slices) computation for 1D_ARRAY render targets.</li>

				</ul>

				<p>Tapani Pälli (1):</p>

				<ul>

				  <li>glsl: fix bogus layout qualifier warnings</li>

				</ul>

				<p>Thomas Hellstrom (1):</p>

				<ul>

				  <li>st/xa: Fix performance regression introduced by commit "Cache render target surface"</li>

				</ul>

				</div>

				</body>

				</html>

									
										100

docs/relnotes/10.1.4.html
									
												
				@@ -0,0 +1,100 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.1.4 Release Notes / (May 20, 2014)</h1>

				<p>

				Mesa 10.1.4 is a bug fix release which fixes bugs found since the 10.1.3 release.

				</p>

				<p>

				Mesa 10.1.4 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				e934365d77f384bfaec844999440bef8  MesaLib-10.1.4.tar.gz

				6fddee101f49b7409cd29994c34ddee7  MesaLib-10.1.4.tar.bz2

				ba5f48e7d5e373922c804c2651fec6c1  MesaLib-10.1.4.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78225">Bug 78225</a> - Compile error due to undefined reference to `gbm_dri_backend', fix attached</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78537">Bug 78537</a> - no anisotropic filtering in a native Half-Life 2</li>

				</ul>

				<h2>Changes</h2>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>mesa: fix double-freeing of dispatch tables inside glBegin/End.</li>

				</ul>

				<p>Carl Worth (3):</p>

				<ul>

				  <li>docs: Add MD5 sums for 10.1.3</li>

				  <li>cherry-ignore: Roland and Michel agreed to drop these patches.</li>

				  <li>VERSION: Update to 10.1.4</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>configure: error out if building GBM without dri</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>i965/vs: Use samplers for UBOs in the VS like we do for non-UBO pulls.</li>

				</ul>

				<p>Ilia Mirkin (3):</p>

				<ul>

				  <li>nv50/ir: make sure to reverse cond codes on all the OP_SET variants</li>

				  <li>nv50: fix setting of texture ms info to be per-stage</li>

				  <li>nv50/ir: fix integer mul lowering for u32 x u32 -&gt; high u32</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>radeonsi: Fix anisotropic filtering state setup</li>

				</ul>

				<p>Tom Stellard (2):</p>

				<ul>

				  <li>configure.ac: Add LLVM_VERSION_PATCH to DEFINES</li>

				  <li>radeonsi: Enable geometry shaders with LLVM 3.4.1</li>

				</ul>

				</div>

				</body>

				</html>

									
										105

docs/relnotes/10.1.5.html
									
												
				@@ -0,0 +1,105 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.1.5 Release Notes / (June 6, 2014)</h1>

				<p>

				Mesa 10.1.5 is a bug fix release which fixes bugs found since the 10.1.4 release.

				</p>

				<p>

				Mesa 10.1.5 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				b0aceaa75bc9a9b2d9215a113e2ad488b5cf85c99005a7624f8cf7c37c5d0eaa  MesaLib-10.1.5.tar.gz

				bc6c5ec7836f254a49d055a29d9aa34c97c54c038f47ad3a00fa57a5fef15bbc  MesaLib-10.1.5.tar.bz2

				78b7255cab0af7918945452a84de7989096ebcdd27e99b31c56c0589274cbc77  MesaLib-10.1.5.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79115">Bug 79115</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79421">Bug 79421</a> - </li>

				</ul>

				<h2>Changes</h2>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>glsl: fix use-after free bug/crash in ast_declarator_list::hir()</li>

				</ul>

				<p>Carl Worth (5):</p>

				<ul>

				  <li>docs: Add md5sums for 10.1.4 release</li>

				  <li>Merge remote-tracking branch 'origin/10.1' into 10.1</li>

				  <li>cherry-ignore: Ignore two commits.</li>

				  <li>Ignore a patch that is not needed for the 10.1 branch.</li>

				  <li>Update version to 10.1.5</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>glx: do not leak dri3Display</li>

				</ul>

				<p>Ilia Mirkin (2):</p>

				<ul>

				  <li>nv50/ir: fix s32 x s32 -&gt; high s32 multiply logic</li>

				  <li>nv50/ir: fix constant folding for OP_MUL subop HIGH</li>

				</ul>

				<p>James Legg (1):</p>

				<ul>

				  <li>mesa: Fix unbinding GL_DEPTH_STENCIL_ATTACHMENT</li>

				</ul>

				<p>Jeremy Huddleston Sequoia (2):</p>

				<ul>

				  <li>glapi: Avoid heap corruption in _glapi_table</li>

				  <li>darwin: Fix test for kCGLPFAOpenGLProfile support at runtime</li>

				</ul>

				<p>Pavel Popov (2):</p>

				<ul>

				  <li>i965: Properly return *RESET* status in glGetGraphicsResetStatusARB</li>

				  <li>i965: Fix Line Stipple enable bit in 3DSTATE_SF for Haswell.</li>

				</ul>

				<p>Roland Scheidegger (1):</p>

				<ul>

				  <li>llvmpipe: fix crash when not all attachments are populated in a fb</li>

				</ul>

				</div>

				</body>

				</html>

									
										138

docs/relnotes/10.1.6.html
									
												
				@@ -0,0 +1,138 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.1.6 Release Notes / (June 24, 2014)</h1>

				<p>

				Mesa 10.1.6 is a bug fix release which fixes bugs found since the 10.1.5 release.

				</p>

				<p>

				Mesa 10.1.6 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				cde60e06b340d7598802fe4a4484b3fb8befd714f9ab9caabe1f27d3149e8815  MesaLib-10.1.6.tar.bz2

				e4e726d7805a442f7ed07d12f71335e6126796ec85328a5989eb5348a8042d00  MesaLib-10.1.6.tar.gz

				bf7e3f721a7ad0c2057a034834b6fea688e64f26a66cf8d1caa2827e405e72dd  MesaLib-10.1.6.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54372">Bug 54372</a> - GLX_INTEL_swap_event crashes driver when swapping window buffers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74005">Bug 74005</a> - [i965 Bisected]Piglit/glx_glx-make-glxdrawable-current fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78581">Bug 78581</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79729">Bug 79729</a> - [i965] glClear on a multisample texture doesn't work</li>

				</ul>

				<h2>Changes</h2>

				<p>Adrian Negreanu (7):</p>

				<ul>

				  <li>add megadriver_stub_FILES</li>

				  <li>android: adapt to the megadriver mechanism</li>

				  <li>android: add libloader to libGLES_mesa and libmesa_egl_dri2</li>

				  <li>android: add src/gallium/auxiliary as include path for libmesa_dricore</li>

				  <li>android, egl: add correct drm include for libmesa_egl_dri2</li>

				  <li>android, mesa_gen_matypes: pull in timespec POSIX definition</li>

				  <li>android, dricore: undefined reference to _mesa_streaming_load_memcpy</li>

				</ul>

				<p>Beren Minor (1):</p>

				<ul>

				  <li>egl/main: Fix eglMakeCurrent when releasing context from current thread.</li>

				</ul>

				<p>Carl Worth (3):</p>

				<ul>

				  <li>docs: Add SHA256 checksums for the 10.1.5 release</li>

				  <li>cherry-ignore: Add a patch to ignore</li>

				  <li>Update VERSION to 10.1.6</li>

				</ul>

				<p>Daniel Manjarres (1):</p>

				<ul>

				  <li>glx: Don't crash on swap event for a Window (non-GLXWindow)</li>

				</ul>

				<p>Emil Velikov (1):</p>

				<ul>

				  <li>configure: error out when building opencl without LLVM</li>

				</ul>

				<p>Iago Toral Quiroga (1):</p>

				<ul>

				  <li>mesa: Copy Geom.UsesEndPrimitive when cloning a geometry program.</li>

				</ul>

				<p>José Fonseca (3):</p>

				<ul>

				  <li>mesa/main: Make get_hash.c values constant.</li>

				  <li>mesa: Make glGetIntegerv(GL_*_ARRAY_SIZE) return GL_BGRA.</li>

				  <li>mesa/main: Prevent sefgault on glGetIntegerv(GL_ATOMIC_COUNTER_BUFFER_BINDING).</li>

				</ul>

				<p>Kristian Høgsberg (1):</p>

				<ul>

				  <li>mesa: Remove glClear optimization based on drawable size</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>configure: Only check for OpenCL without LLVM when the latter is certain</li>

				</ul>

				<p>Neil Roberts (1):</p>

				<ul>

				  <li>i965: Set the fast clear color value for texture surfaces</li>

				</ul>

				<p>Roland Scheidegger (1):</p>

				<ul>

				  <li>draw: (trivial) fix clamping of viewport index</li>

				</ul>

				<p>Tobias Klausmann (1):</p>

				<ul>

				  <li>nv50/ir: clear subop when folding constant expressions</li>

				</ul>

				<p>Tom Stellard (2):</p>

				<ul>

				  <li>clover: Prevent Clang from printing number of errors and warnings to stderr.</li>

				  <li>clover: Don't use llvm's global context</li>

				</ul>

				</div>

				</body>

				</html>

									
										75

docs/relnotes/10.1.html
									
												
				@@ -0,0 +1,75 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.1 Release Notes / March 4, 2014</h1>

				<p>

				Mesa 10.1 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 10.1.1.

				</p>

				<p>

				Mesa 10.1 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				3ec43f79dbcd9aa2a4a27bf1f51655b6  MesaLib-10.1.0.tar.bz2

				08e796ec7122aa299d32d4f67a254315  MesaLib-10.1.0.tar.gz

				bd365356543f4b38e57c1ddf7a317c40  MesaLib-10.1.0.zip

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>GL_ARB_draw_indirect on i965.</li>

				<li>GL_ARB_clear_buffer_object</li>

				<li>GL_ARB_viewport_array on i965.</li>

				<li>GL_ARB_map_buffer_alignment on all drivers that did not previously support

				it.</li>

				<li>GL_AMD_shader_trinary_minmax.</li>

				<li>GL_EXT_framebuffer_blit on r200 and radeon.</li>

				<li>Reduced memory usage for display lists.</li>

				<li>OpenGL 3.3 support on nv50, nvc0, r600 and radeonsi</li>

				</ul>

				<h2>Bug fixes</h2>

				TBD.

				<h2>Changes</h2>

				<ul>

				<li>Removed support for the GL_MESA_texture_array extension.  This extension

				  enabled the use of texture array with fixed-function and assembly fragment

				  shaders.  No applications are known to use this extension.</li>

				</ul>

				</div>

				</body>

				</html>

									
										61

docs/relnotes/10.2.1.html
									
												
				@@ -0,0 +1,61 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.2.1 Release Notes / June 6, 2014</h1>

				<p>

				Mesa 10.2.1 is a bug fix release which fixes bugs found since the 10.2 release.

				</p>

				<p>

				Mesa 10.2.1 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				96f892dae2d0bb14ac9c2113f586c909  MesaLib-10.2.1.tar.gz

				093f9b5d077e5f6061dcd7b01b7aa51a  MesaLib-10.2.1.tar.bz2

				6ab76c1608e5deed1eb8b54c62d7a48a  MesaLib-10.2.1.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>

				Mesa 10.2 had a build problem in the radeonsi driver due to an error resolving

				conflicts in a patch cherry-pick from master.  The build error is fixed.

				</p>

				<h2>Changes</h2>

				<p>Ian Romanick (3):</p>

				<ul>

				  <li>docs: Add MD5 checksum, etc. for 10.1 release</li>

				  <li>radeonsi: Fix build error introduced in 5ab9a9c</li>

				  <li>Bump version to 10.2.1</li>

				</ul>

				</div>

				</body>

				</html>

									
										181

docs/relnotes/10.2.2.html
									
												
				@@ -0,0 +1,181 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.2.2 Release Notes / June 24, 2014</h1>

				<p>

				Mesa 10.2.2 is a bug fix release which fixes bugs found since the 10.2.1 release.

				</p>

				<p>

				Mesa 10.2.2 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				38c4a40364000f89cddaa1694f6f3cfb444981d1110238ce603093585477399c  MesaLib-10.2.2.tar.bz2

				2af2ec8b4db624c352e961eefbcce6c8d1f86d44c5542f6f378c50e1b958d453  MesaLib-10.2.2.tar.gz

				d4c0372da59367a344d62ebcdf5cf61039c9cae6925f40f2dab8f8d95cf22da9  MesaLib-10.2.2.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54372">Bug 54372</a> - GLX_INTEL_swap_event crashes driver when swapping window buffers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66452">Bug 66452</a> - JUNIPER UVD accelerated playback of WMV3 streams does not work</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74005">Bug 74005</a> - [i965 Bisected]Piglit/glx_glx-make-glxdrawable-current fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77865">Bug 77865</a> - [BDW] Many Ogles3conform framebuffer_blit cases fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78581">Bug 78581</a> - OpenCL: clBuildProgram prints error messages directly rather than storing them</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79029">Bug 79029</a> - INTEL_DEBUG=shader_time is full of lies</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79729">Bug 79729</a> - [i965] glClear on a multisample texture doesn't work</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79907">Bug 79907</a> - Mesa 10.2.1 --enable-vdpau default=auto broken</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80115">Bug 80115</a> - MESA_META_DRAW_BUFFERS induced GL_INVALID_VALUE errors</li>

				</ul>

				<h2>Changes</h2>

				<p>Adrian Negreanu (8):</p>

				<ul>

				  <li>add megadriver_stub_FILES</li>

				  <li>android: adapt to the megadriver mechanism</li>

				  <li>android: add libloader to libGLES_mesa and libmesa_egl_dri2</li>

				  <li>android: add src/gallium/auxiliary as include path for libmesa_dricore</li>

				  <li>android, egl: add correct drm include for libmesa_egl_dri2</li>

				  <li>android, egl: typo dri2_fallback_pixmap_surface -&gt; dri2_fallback_create_pixmap_surface</li>

				  <li>android, mesa_gen_matypes: pull in timespec POSIX definition</li>

				  <li>android, dricore: undefined reference to _mesa_streaming_load_memcpy</li>

				</ul>

				<p>Carl Worth (1):</p>

				<ul>

				  <li>Update VERSION to 10.2.2</li>

				</ul>

				<p>Daniel Manjarres (1):</p>

				<ul>

				  <li>glx: Don't crash on swap event for a Window (non-GLXWindow)</li>

				</ul>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>targets/xa: limit the amount of exported symbols</li>

				  <li>configure: error out when building opencl without LLVM</li>

				  <li>configure: correctly autodetect xvmc/vdpau/omx</li>

				</ul>

				<p>Grigori Goronzy (1):</p>

				<ul>

				  <li>radeon/uvd: disable VC-1 simple/main on UVD 2.x</li>

				</ul>

				<p>Iago Toral Quiroga (1):</p>

				<ul>

				  <li>mesa: Copy Geom.UsesEndPrimitive when cloning a geometry program.</li>

				</ul>

				<p>Ian Romanick (3):</p>

				<ul>

				  <li>docs: Add initial 10.2.1 release notes</li>

				  <li>docs: Add MD5 checksum, etc. for 10.2.1 release</li>

				  <li>meta: Respect the driver's maximum number of draw buffers</li>

				</ul>

				<p>Ilia Mirkin (7):</p>

				<ul>

				  <li>gk110/ir: emit saturate flag on fadd when needed</li>

				  <li>gk110/ir: fix emitting constbuf file index</li>

				  <li>gk110/ir: fix bfind emission</li>

				  <li>nv50: make sure to mark first scissor dirty after blit</li>

				  <li>nv30: plug some memory leaks on screen destroy and shader compile</li>

				  <li>nv30: avoid dangling references to deleted contexts</li>

				  <li>nv30: hack to avoid errors on unexpected color/zeta combinations</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>meta_blit: properly compute texture width for the CopyTexSubImage fallback</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>mesa/main: Prevent sefgault on glGetIntegerv(GL_ATOMIC_COUNTER_BUFFER_BINDING).</li>

				</ul>

				<p>Kenneth Graunke (9):</p>

				<ul>

				  <li>i965: Don't use the head sentinel as an fs_inst in Gen4 workaround code.</li>

				  <li>i965: Invalidate live intervals when inserting Gen4 SEND workarounds.</li>

				  <li>i965/vec4: Fix dead code elimination for VGRFs of size &gt; 1.</li>

				  <li>i965: Add missing MOCS setup for 3DSTATE_INDEX_BUFFER on Broadwell.</li>

				  <li>i965: Drop Broadwell perf_debugs about missing MOCS that aren't missing.</li>

				  <li>i965: Add missing newlines to a few perf_debug messages.</li>

				  <li>i965/vec4: Use the sampler for pull constant loads on Broadwell.</li>

				  <li>i965: Use 8x4 aligned rectangles for HiZ operations on Broadwell.</li>

				  <li>i965: Save meta stencil blit programs in the context.</li>

				</ul>

				<p>Kristian Høgsberg (1):</p>

				<ul>

				  <li>mesa: Remove glClear optimization based on drawable size</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>configure: Only check for OpenCL without LLVM when the latter is certain</li>

				</ul>

				<p>Neil Roberts (1):</p>

				<ul>

				  <li>i965: Set the fast clear color value for texture surfaces</li>

				</ul>

				<p>Tom Stellard (2):</p>

				<ul>

				  <li>clover: Prevent Clang from printing number of errors and warnings to stderr.</li>

				  <li>clover: Don't use llvm's global context</li>

				</ul>

				<p>Ville Syrjälä (1):</p>

				<ul>

				  <li>i915: Fix gen2 texblend setup</li>

				</ul>

				</div>

				</body>

				</html>

									
										130

docs/relnotes/10.2.3.html
									
												
				@@ -0,0 +1,130 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.2.3 Release Notes / July 7, 2014</h1>

				<p>

				Mesa 10.2.3 is a bug fix release which fixes bugs found since the 10.2.2 release.

				</p>

				<p>

				Mesa 10.2.3 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				e482a96170c98b17d6aba0d6e4dda4b9a2e61c39587bb64ac38cadfa4aba4aeb  MesaLib-10.2.3.tar.bz2

				96cffacaa1c52ae659b3b0f91be2eebf5528b748934256751261fb79ea3d6636  MesaLib-10.2.3.tar.gz

				82cab6ff14c8038ee39842dbdea0d447a78d119efd8d702d1497bc7c246434e9  MesaLib-10.2.3.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76223">Bug 76223</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79823">Bug 79823</a> - </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80015">Bug 80015</a> - </li>

				</ul>

				<h2>Changes</h2>

				<p>Aaron Watry (1):</p>

				<ul>

				  <li>radeon/llvm: Allocate space for kernel metadata operands</li>

				</ul>

				<p>Carl Worth (2):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.2.2 release</li>

				  <li>cherry-ignore: Add a patch that's been rejected</li>

				</ul>

				<p>Ilia Mirkin (4):</p>

				<ul>

				  <li>nouveau: dup fd before passing it to device</li>

				  <li>nv50: disable dedicated ubo upload method</li>

				  <li>nv50: do an explicit flush on draw when there are persistent buffers</li>

				  <li>nvc0: add a memory barrier when there are persistent UBOs</li>

				</ul>

				<p>Jasper St. Pierre (1):</p>

				<ul>

				  <li>glxext: Send the Drawable's ID in the GLX_BufferSwapComplete event</li>

				</ul>

				<p>Kenneth Graunke (3):</p>

				<ul>

				  <li>i965: Don't emit SURFACE_STATEs for gather workarounds on Broadwell.</li>

				  <li>i965: Include marketing names for Broadwell GPUs.</li>

				  <li>i965/disasm: Fix INTEL_DEBUG=fs on Broadwell for ARB_fp applications.</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>radeon/llvm: Use the llvm.rsq.clamped intrinsic for RSQ</li>

				</ul>

				<p>Rob Clark (9):</p>

				<ul>

				  <li>xa: fix segfault</li>

				  <li>freedreno: use OUT_RELOCW when buffer is written</li>

				  <li>freedreno/a3xx: fix depth/stencil GMEM positioning</li>

				  <li>freedreno/a3xx: fix depth/stencil gmem restore</li>

				  <li>freedreno/a3xx: fix blend opcode</li>

				  <li>freedreno: few caps fixes</li>

				  <li>freedreno/a3xx: texture fixes</li>

				  <li>freedreno: fix for null textures</li>

				  <li>freedreno/a3xx: vtx formats</li>

				</ul>

				<p>Roland Scheidegger (1):</p>

				<ul>

				  <li>draw: (trivial) fix clamping of viewport index</li>

				</ul>

				<p>Takashi Iwai (1):</p>

				<ul>

				  <li>llvmpipe: Fix zero-division in llvmpipe_texture_layout()</li>

				</ul>

				<p>Thomas Hellstrom (1):</p>

				<ul>

				  <li>st/xa: Don't close the drm fd on failure v2</li>

				</ul>

				<p>Tobias Klausmann (1):</p>

				<ul>

				  <li>nv50/ir: allow gl_ViewportIndex to work on non-provoking vertices</li>

				</ul>

				</div>

				</body>

				</html>

									
										127

docs/relnotes/10.2.4.html
									
										Normal file
									
				@@ -0,0 +1,127 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.2.4 Release Notes / July 18, 2014</h1>

				<p>

				Mesa 10.2.4 is a bug fix release which fixes bugs found since the 10.2.3 release.

				</p>

				<p>

				Mesa 10.2.4 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				06a2341244eb85c283f59f70161e06ded106f835ed9b6be1ef0243bd9344811a  MesaLib-10.2.4.tar.bz2

				33e3c8b4343503e7d7d17416c670438860a2fd99ec93ea3327f73c3abe33b5e4  MesaLib-10.2.4.tar.gz

				e26791a4a62a61b82e506e6ba031812d09697d1a831e8239af67e5722a8ee538  MesaLib-10.2.4.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81157">Bug 81157</a> - [BDW]Piglit some spec_glsl-1.50_execution_built-in-functions* cases fail</li>

				</ul>

				<h2>Changes</h2>

				<p>Abdiel Janulgue (3):</p>

				<ul>

				  <li>i965/fs: Refactor check for potential copy propagated instructions.</li>

				  <li>i965/fs: skip copy-propate for logical instructions with negated src entries</li>

				  <li>i965/vec4: skip copy-propate for logical instructions with negated src entries</li>

				</ul>

				<p>Brian Paul (3):</p>

				<ul>

				  <li>mesa: fix geometry shader memory leaks</li>

				  <li>st/mesa: fix geometry shader memory leak</li>

				  <li>gallium/u_blitter: fix some shader memory leaks</li>

				</ul>

				<p>Carl Worth (2):</p>

				<ul>

				  <li>docs: Add sha256 checksums for the 10.2.3 release</li>

				  <li>Update VERSION to 10.2.4</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>i965: Generalize the pixel_x/y workaround for all UW types.</li>

				</ul>

				<p>Ilia Mirkin (4):</p>

				<ul>

				  <li>nv50/ir: retrieve shadow compare from first arg</li>

				  <li>nv50/ir: ignore bias for samplerCubeShadow on nv50</li>

				  <li>nvc0/ir: do quadops on the right texture coordinates for TXD</li>

				  <li>nvc0/ir: use manual TXD when offsets are involved</li>

				</ul>

				<p>Jordan Justen (1):</p>

				<ul>

				  <li>i965: Add auxiliary surface field #defines for Broadwell.</li>

				</ul>

				<p>Kenneth Graunke (9):</p>

				<ul>

				  <li>i965: Don't copy propagate abs into Broadwell logic instructions.</li>

				  <li>i965: Set execution size to 8 for instructions with force_sechalf set.</li>

				  <li>i965/fs: Set force_uncompressed and force_sechalf on samplepos setup.</li>

				  <li>i965/fs: Use WE_all for gl_SampleID header register munging.</li>

				  <li>i965: Add plumbing for Broadwell's auxiliary surface support.</li>

				  <li>i965: Drop SINT workaround for CMS layout on Broadwell.</li>

				  <li>i965: Hook up the MCS buffers in SURFACE_STATE on Broadwell.</li>

				  <li>i965: Add 2x MSAA support to the MCS allocation function.</li>

				  <li>i965: Enable compressed multisample support (CMS) on Broadwell.</li>

				</ul>

				<p>Marek Olšák (4):</p>

				<ul>

				  <li>gallium: fix u_default_transfer_inline_write for textures</li>

				  <li>st/mesa: fix samplerCubeShadow with bias</li>

				  <li>radeonsi: fix samplerCubeShadow with bias</li>

				  <li>radeonsi: add support for TXB2</li>

				</ul>

				<p>Matt Turner (8):</p>

				<ul>

				  <li>i965/vec4: Don't return void from a void function.</li>

				  <li>i965/vec4: Don't fix_math_operand() on Gen &gt;= 8.</li>

				  <li>i965/fs: Don't fix_math_operand() on Gen &gt;= 8.</li>

				  <li>i965/fs: Make try_constant_propagate() static.</li>

				  <li>i965/fs: Constant propagate into 2-src math instructions on Gen8.</li>

				  <li>i965/vec4: Constant propagate into 2-src math instructions on Gen8.</li>

				  <li>i965/fs: Don't use brw_imm_* unnecessarily.</li>

				  <li>i965/fs: Set correct number of regs_written for MCS fetches.</li>

				</ul>

				</div>

				</body>

				</html>

									
										188

docs/relnotes/10.2.5.html
									
										Normal file
									
				@@ -0,0 +1,188 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.2.5 Release Notes / August 2, 2014</h1>

				<p>

				Mesa 10.2.5 is a bug fix release which fixes bugs found since the 10.2.4 release.

				</p>

				<p>

				Mesa 10.2.5 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				b4459f0bf7f4a3c8fb78ece3c9d2eac3d0e5bf38cb470f2a72705e744bd0310d  MesaLib-10.2.5.tar.bz2

				7b4dd0cb683f8c7dc48a3e7a315742bed58ddcd7b756c462aca4177bd1acdc79  MesaLib-10.2.5.tar.gz

				6180565914fb238dd77ccdaff96b6155d9a6e1b3e981ebbf6a6851301b384fed  MesaLib-10.2.5.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80991">Bug 80991</a> - [BDW]Piglit spec_ARB_sample_shading_builtin-gl-sample-mask_2 fails</li>

				</ul>

				<h2>Changes</h2>

				<p>Abdiel Janulgue (3):</p>

				<ul>

				  <li>i965/fs: Refactor check for potential copy propagated instructions.</li>

				  <li>i965/fs: skip copy-propate for logical instructions with negated src entries</li>

				  <li>i965/vec4: skip copy-propate for logical instructions with negated src entries</li>

				</ul>

				<p>Adel Gadllah (1):</p>

				<ul>

				  <li>i915: Fix up intelInitScreen2 for DRI3</li>

				</ul>

				<p>Anuj Phogat (2):</p>

				<ul>

				  <li>i965: Fix z_offset computation in intel_miptree_unmap_depthstencil()</li>

				  <li>mesa: Don't use memcpy() in _mesa_texstore() for float depth texture data</li>

				</ul>

				<p>Brian Paul (3):</p>

				<ul>

				  <li>mesa: fix geometry shader memory leaks</li>

				  <li>st/mesa: fix geometry shader memory leak</li>

				  <li>gallium/u_blitter: fix some shader memory leaks</li>

				</ul>

				<p>Carl Worth (6):</p>

				<ul>

				  <li>docs: Add sha256 checksums for the 10.2.3 release</li>

				  <li>Update VERSION to 10.2.4</li>

				  <li>Add release notes for 10.2.4</li>

				  <li>docs: Add SHA256 checksums for the 10.2.4 release</li>

				  <li>cherry-ignore: Ignore a few patches picked in the previous stable release</li>

				  <li>Update version to 10.2.5</li>

				</ul>

				<p>Christian König (1):</p>

				<ul>

				  <li>radeonsi: fix order of r600_need_dma_space and r600_context_bo_reloc</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>i965: Generalize the pixel_x/y workaround for all UW types.</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>mesa: Don't allow GL_TEXTURE_BORDER queries outside compat profile</li>

				  <li>mesa: Don't allow GL_TEXTURE_{LUMINANCE,INTENSITY}_* queries outside compat profile</li>

				</ul>

				<p>Ilia Mirkin (5):</p>

				<ul>

				  <li>nv50/ir: retrieve shadow compare from first arg</li>

				  <li>nv50/ir: ignore bias for samplerCubeShadow on nv50</li>

				  <li>nvc0/ir: do quadops on the right texture coordinates for TXD</li>

				  <li>nvc0/ir: use manual TXD when offsets are involved</li>

				  <li>nvc0: make sure that the local memory allocation is aligned to 0x10</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>main/format_pack: Fix a wrong datatype in pack_ubyte_R8G8_UNORM</li>

				  <li>main/get_hash_params: Add GL_SAMPLE_SHADING_ARB</li>

				</ul>

				<p>Jordan Justen (1):</p>

				<ul>

				  <li>i965: Add auxiliary surface field #defines for Broadwell.</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>st/wgl: Clamp wglChoosePixelFormatARB's output nNumFormats to nMaxFormats.</li>

				</ul>

				<p>Kenneth Graunke (13):</p>

				<ul>

				  <li>i965: Don't copy propagate abs into Broadwell logic instructions.</li>

				  <li>i965: Set execution size to 8 for instructions with force_sechalf set.</li>

				  <li>i965/fs: Set force_uncompressed and force_sechalf on samplepos setup.</li>

				  <li>i965/fs: Use WE_all for gl_SampleID header register munging.</li>

				  <li>i965: Add plumbing for Broadwell's auxiliary surface support.</li>

				  <li>i965: Drop SINT workaround for CMS layout on Broadwell.</li>

				  <li>i965: Hook up the MCS buffers in SURFACE_STATE on Broadwell.</li>

				  <li>i965: Add 2x MSAA support to the MCS allocation function.</li>

				  <li>i965: Enable compressed multisample support (CMS) on Broadwell.</li>

				  <li>i965: Add missing persample_shading field to brw_wm_debug_recompile.</li>

				  <li>i965/fs: Fix gl_SampleID for 2x MSAA and SIMD16 mode.</li>

				  <li>i965/fs: Fix gl_SampleMask handling for SIMD16 on Gen8+.</li>

				  <li>i965/fs: Set LastRT on the final FB write on Broadwell.</li>

				</ul>

				<p>Marek Olšák (14):</p>

				<ul>

				  <li>gallium: fix u_default_transfer_inline_write for textures</li>

				  <li>st/mesa: fix samplerCubeShadow with bias</li>

				  <li>radeonsi: fix samplerCubeShadow with bias</li>

				  <li>radeonsi: add support for TXB2</li>

				  <li>r600g: switch SNORM conversion to DX and GLES behavior</li>

				  <li>radeonsi: fix CMASK and HTILE calculations for Hawaii</li>

				  <li>gallium/util: add a helper for calculating primitive count from vertex count</li>

				  <li>radeonsi: fix a hang with instancing on Hawaii</li>

				  <li>radeonsi: fix a hang with streamout on Hawaii</li>

				  <li>winsys/radeon: fix vram_size overflow with Hawaii</li>

				  <li>radeonsi: fix occlusion queries on Hawaii</li>

				  <li>r600g,radeonsi: switch all occurences of array_size to util_max_layer</li>

				  <li>radeonsi: fix build because of lack of draw_indirect infrastructure in 10.2</li>

				  <li>radeonsi: use DRAW_PREAMBLE on CIK</li>

				</ul>

				<p>Matt Turner (8):</p>

				<ul>

				  <li>i965/vec4: Don't return void from a void function.</li>

				  <li>i965/vec4: Don't fix_math_operand() on Gen &gt;= 8.</li>

				  <li>i965/fs: Don't fix_math_operand() on Gen &gt;= 8.</li>

				  <li>i965/fs: Make try_constant_propagate() static.</li>

				  <li>i965/fs: Constant propagate into 2-src math instructions on Gen8.</li>

				  <li>i965/vec4: Constant propagate into 2-src math instructions on Gen8.</li>

				  <li>i965/fs: Don't use brw_imm_* unnecessarily.</li>

				  <li>i965/fs: Set correct number of regs_written for MCS fetches.</li>

				</ul>

				<p>Thorsten Glaser (1):</p>

				<ul>

				  <li>nv50: fix build failure on m68k due to invalid struct alignment assumptions</li>

				</ul>

				<p>Tom Stellard (1):</p>

				<ul>

				  <li>clover: Call end_query before getting timestamp result v2</li>

				</ul>

				</div>

				</body>

				</html>

									
										118

docs/relnotes/10.2.6.html
									
										Normal file
									
				@@ -0,0 +1,118 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.2.6 Release Notes / August 19, 2014</h1>

				<p>

				Mesa 10.2.6 is a bug fix release which fixes bugs found since the 10.2.5 release.

				</p>

				<p>

				Mesa 10.2.6 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				193314d2adba98e43697d726739ac46b4299aae324fa1821aa226890c28ac806  MesaLib-10.2.6.tar.bz2

				f7a45a5977b485eb95ac024205c584a0c112fe3951c2313c797579bb16a7a448  MesaLib-10.2.6.tar.gz

				6d086d6fcda8f317adfaaae40011decf2f2e2dc80819c4a7a77c76f73512e8d8  MesaLib-10.2.6.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81450">Bug 81450</a> - [BDW]Piglit spec_glsl-1.30_execution_tex-miplevel-selection_textureGrad_1DArray cases intel_do_flush_locked failed</li>

				</ul>

				<h2>Changes</h2>

				<p>Anuj Phogat (15):</p>

				<ul>

				  <li>mesa: Fix error condition for valid texture targets in glTexStorage* functions</li>

				  <li>mesa: Turn target_can_be_compressed() in to a utility function</li>

				  <li>mesa: Add error condition for using compressed internalformat in glTexStorage3D()</li>

				  <li>mesa: Fix condition for using compressed internalformat in glCompressedTexImage3D()</li>

				  <li>mesa: Add utility function _mesa_is_enum_format_snorm()</li>

				  <li>mesa: Don't allow snorm internal formats in glCopyTexImage*() in GLES3</li>

				  <li>mesa: Add a helper function _mesa_is_enum_format_unsized()</li>

				  <li>mesa: Add a gles3 error condition for sized internalformat in glCopyTexImage*()</li>

				  <li>mesa: Add gles3 error condition for GL_RGBA10_A2 buffer format in glCopyTexImage*()</li>

				  <li>mesa: Add utility function _mesa_is_enum_format_unorm()</li>

				  <li>mesa: Add gles3 condition for normalized internal formats in glCopyTexImage*()</li>

				  <li>mesa: Allow GL_TEXTURE_CUBE_MAP target with compressed internal formats</li>

				  <li>meta: Use _mesa_get_format_bits() to get the GL_RED_BITS</li>

				  <li>egl: Fix OpenGL ES version checks in _eglParseContextAttribList()</li>

				  <li>meta: Fix datatype computation in get_temp_image_type()</li>

				</ul>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>mesa: fix assertion in _mesa_drawbuffers()</li>

				</ul>

				<p>Carl Worth (2):</p>

				<ul>

				  <li>docs: Add sha256 sums to the 10.2.5 release notes</li>

				  <li>Update VERSION to 10.2.6</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>mesa/st: only convert AND(a, NOT(b)) into MAD when not using native integers</li>

				</ul>

				<p>Jordan Justen (1):</p>

				<ul>

				  <li>i965/miptree: Layout 1D Array as 2D Array with height of 1</li>

				</ul>

				<p>Maarten Lankhorst (1):</p>

				<ul>

				  <li>configure.ac: Do not require llvm on x32</li>

				</ul>

				<p>Marek Olšák (4):</p>

				<ul>

				  <li>st/mesa: fix blit-based partial TexSubImage for 1D arrays</li>

				  <li>radeon,r200: fix buffer validation after CS flush</li>

				  <li>radeonsi: fix a hang with instancing in Unigine Heaven/Valley on Hawaii</li>

				  <li>radeonsi: fix CMASK and HTILE allocation on Tahiti</li>

				</ul>

				<p>Pali Rohár (1):</p>

				<ul>

				  <li>configure: check for dladdr via AC_CHECK_FUNC/AC_CHECK_LIB</li>

				</ul>

				<p>Roland Scheidegger (1):</p>

				<ul>

				  <li>gallivm: fix up out-of-bounds level when using conformant out-of-bound behavior</li>

				</ul>

				</div>

				</body>

				</html>

									
										211

docs/relnotes/10.2.7.html
									
										Normal file
									
				@@ -0,0 +1,211 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.2.7 Release Notes / September 06, 2014</h1>

				<p>

				Mesa 10.2.7 is a bug fix release which fixes bugs found since the 10.2.6 release.

				</p>

				<p>

				Mesa 10.2.7 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				cb67dfaabf88acba29aa2cf0dd58ee17b21ebf9594f8d1226c41794da8de3e9d  MesaLib-10.2.7.tar.gz

				27b958063a4c002071f14ed45c7d2a1ee52cd85e4ac8876e8a1c273495a7d43f  MesaLib-10.2.7.tar.bz2

				a2796a2d5bbbc2edd22857ecc267cba68dfe5d0296f5d84ba7510877b216cc40  MesaLib-10.2.7.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=36193">Bug 36193</a> - [i965] brw_eu_emit.c:182: validate_reg: Assertion `execsize &gt;= width' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66184">Bug 66184</a> - src/mesa/state_tracker/st_glsl_to_tgsi.cpp:3216:simplify_cmp: Assertion `inst-&gt;dst.index &lt; 4096' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70441">Bug 70441</a> - [Gen4-5 clip] Piglit spec_OpenGL_1.1_polygon-offset hits (execsize &gt;= width) assertion</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76188">Bug 76188</a> - EGL_EXT_image_dma_buf_import fd ownership is incorrect</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76789">Bug 76789</a> - [radeonsi] si_descriptors.c requires -std=gnu99 or -fms-extensions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82139">Bug 82139</a> - [r600g, bisected] multiple ubo piglit regressions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82255">Bug 82255</a> - [VP2] Chroma planes are vertically stretched during VDPAU playback</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82671">Bug 82671</a> - [r600g-evergreen][compute]Empty kernel execution causes crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82709">Bug 82709</a> - OpenCL not working on radeon hainan</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82814">Bug 82814</a> - glDrawBuffers(0, NULL) segfaults in _mesa_drawbuffers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83079">Bug 83079</a> - [NVC0] Dota 2 (Linux native and Wine) crash with Nouveau Drivers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83355">Bug 83355</a> - FTBFS: src/mesa/program/program_lexer.l:122:64: error: unknown type name 'YYSTYPE'</li>

				</ul>

				<h2>Changes</h2>

				<p>Adam Jackson (1):</p>

				<ul>

				  <li>radeonsi: Don't use anonymous struct trick in atom tracking</li>

				</ul>

				<p>Alex Deucher (2):</p>

				<ul>

				  <li>radeonsi: add new CIK pci ids</li>

				  <li>radeonsi: add new SI pci ids</li>

				</ul>

				<p>Andreas Boll (1):</p>

				<ul>

				  <li>winsys/radeon: fix nop packet padding for hawaii</li>

				</ul>

				<p>Anuj Phogat (1):</p>

				<ul>

				  <li>i965: Bail on vec4 copy propagation for scratch writes with source modifiers</li>

				</ul>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>mesa: fix NULL pointer deref bug in _mesa_drawbuffers()</li>

				</ul>

				<p>Carl Worth (2):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.2.6 release</li>

				  <li>Makefile: Switch from md5sums to sha256sums</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>i965: add missing parens in vec4 visitor</li>

				</ul>

				<p>Emil Velikov (17):</p>

				<ul>

				  <li>configure.ac: bail out if building gallium_gbm without gallium_egl</li>

				  <li>android: gallium/nouveau: fix include folders, link against libstlport</li>

				  <li>android: egl/main: fixup the nouveau build</li>

				  <li>automake: gallium/freedreno: drop spurious include dirs</li>

				  <li>android: gallium/freedreno: add preliminary build</li>

				  <li>android: egl/main: add/enable freedreno</li>

				  <li>android: gallium/auxiliary: drop log2/log2f redefitions</li>

				  <li>android: drop HAL_PIXEL_FORMAT_RGBA_{5551,4444}</li>

				  <li>android: glsl: the stlport over the limited Android STL</li>

				  <li>android: dri/i915: do not build an 'empty' driver</li>

				  <li>cherry-ignore: remove patch that lacking previous dependencies</li>

				  <li>cherry-ignore: PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE is not it 10.2</li>

				  <li>cherry-ignore: drop whitespace fix</li>

				  <li>cherry-ignore: reject a15088338eb</li>

				  <li>get-pick-list.sh: Require explicit "10.2" for nominating stable patches</li>

				  <li>mesa: fix make tarballs</li>

				  <li>Update VERSION to 10.2.7</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>mesa: Handle uninitialized textures like other textures in get_tex_level_parameter_image</li>

				</ul>

				<p>Ilia Mirkin (9):</p>

				<ul>

				  <li>nouveau: make sure to invalidate any vbo state as well</li>

				  <li>nouveau: don't keep stale pointer to free'd data</li>

				  <li>nvc0/ir: avoid infinite recursion when finding first uses of tex</li>

				  <li>nv50: zero out unbound samplers</li>

				  <li>nvc0: don't make 1d staging textures linear</li>

				  <li>nv50/ir: avoid creating instructions that can't be emitted</li>

				  <li>nv50: set the miptree address when clearing bo's in vp2 init</li>

				  <li>nv50: mt address may not be the underlying bo's start address</li>

				  <li>nv50: attach the buffer bo to the miptree structures</li>

				</ul>

				<p>Jan Vesely (1):</p>

				<ul>

				  <li>gallivm: Fix build with latest LLVM</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>mesa: Move declaration to top of block.</li>

				</ul>

				<p>Kenneth Graunke (3):</p>

				<ul>

				  <li>i965/vec4: Set NoMask for GS_OPCODE_SET_VERTEX_COUNT on Gen8+.</li>

				  <li>i965/vec4: Respect ir-&gt;force_writemask_all in Gen8 code generation.</li>

				  <li>i965/clip: Fix brw_clip_unfilled.c/compute_offset's assembly.</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>r600g: fix constant buffer fetches</li>

				  <li>radeonsi: save scissor state and sample mask for u_blitter</li>

				  <li>glsl_to_tgsi: allocate and enlarge arrays for temporaries on demand</li>

				</ul>

				<p>Paulo Sergio Travaglia (2):</p>

				<ul>

				  <li>android: gallium/radeon: attempt to fix the android build</li>

				  <li>android: egl/main: resolve radeon linking issues</li>

				</ul>

				<p>Pekka Paalanen (1):</p>

				<ul>

				  <li>egl_dri2: fix EXT_image_dma_buf_import fds</li>

				</ul>

				<p>Robert Bragg (1):</p>

				<ul>

				  <li>meta: save and restore swizzle for _GenerateMipmap</li>

				</ul>

				<p>Tom Stellard (7):</p>

				<ul>

				  <li>radeon/compute: Fix reported values for MAX_GLOBAL_SIZE and MAX_MEM_ALLOC_SIZE</li>

				  <li>radeonsi/compute: Update reference counts for buffers in si_set_global_binding()</li>

				  <li>radeonsi/compute: Call si_pm4_free_state() after emitting compute state</li>

				  <li>clover: Flush the command queue in clReleaseCommandQueue()</li>

				  <li>radeon: Add work-around for missing Hainan support in clang &lt; 3.6 v2</li>

				  <li>pipe-loader: Fix memory leak v2</li>

				  <li>r600g/compute: Don't initialize vertex_buffer_state masks to 0x2</li>

				</ul>

				<p>Vinson Lee (1):</p>

				<ul>

				  <li>gallivm: Fix build with LLVM &gt;= 3.6 r215967.</li>

				</ul>

				</div>

				</body>

				</html>

									
										130

docs/relnotes/10.2.8.html
									
										Normal file
									
				@@ -0,0 +1,130 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.2.8 Release Notes / September 19, 2014</h1>

				<p>

				Mesa 10.2.8 is a bug fix release which fixes bugs found since the 10.2.7 release.

				</p>

				<p>

				Mesa 10.2.8 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				4c5a25ccaf1a9734bbd10d62a1420cc8fd35a1060ce679f2fc846769a25fbeec  MesaLib-10.2.8.tar.gz

				1ef9ad3f241788d454f2ff8c9d65b6849dfc31c8fe91f70fd2930b81c8af1398  MesaLib-10.2.8.tar.bz2

				d26218da3b44734b1d555267b4c63c48803c4c8b14d2bc53071be57014da37fa  MesaLib-10.2.8.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77493">Bug 77493</a> - lp_test_arit fails with llvm &gt;= llvm-3.5svn r206094</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82539">Bug 82539</a> - vmw_screen_dri.lo In file included from vmw_screen_dri.c:41: vmwgfx_drm.h:32:17: error: drm.h: No such file or directory</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82882">Bug 82882</a> - [swrast] piglit glsl-fs-uniform-bool-1 regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83432">Bug 83432</a> - r600_query.c:269:r600_emit_query_end: Assertion `ctx-&gt;num_pipelinestat_queries &gt; 0' failed [Gallium HUD]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83567">Bug 83567</a> - Mesa 10.2.6 does not compile with llvm 3.5</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83735">Bug 83735</a> - [mesa-10.2.x] broken with llvm-3.5 and old CPUs</li>

				</ul>

				<h2>Changes</h2>

				<p>Aaron Watry (1):</p>

				<ul>

				  <li>gallivm: Fix build after LLVM commit 211259</li>

				</ul>

				<p>Christoph Bumiller (2):</p>

				<ul>

				  <li>nv50/ir/util: fix BitSet issues</li>

				  <li>nvc0/ir: clarify recursion fix to finding first tex uses</li>

				</ul>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.2.7 release</li>

				  <li>configure: bail out if building svga without libdrm</li>

				  <li>Update VERSION to 10.2.8</li>

				</ul>

				<p>Ilia Mirkin (4):</p>

				<ul>

				  <li>nv50/ir: avoid array overrun when checking for supported mods</li>

				  <li>nouveau: only enable the depth test if there actually is a depth buffer</li>

				  <li>nouveau: only enable stencil func if the visual has stencil bits</li>

				  <li>nouveau: change internal variables to avoid conflicts with macro args</li>

				</ul>

				<p>Jonathan Gray (1):</p>

				<ul>

				  <li>configure.ac: strip _GNU_SOURCE from llvm-config output</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>gallivm: Disable workaround for PR12833 on LLVM 3.2+.</li>

				</ul>

				<p>Maarten Lankhorst (4):</p>

				<ul>

				  <li>nouveau: re-allocate bo's on overflow</li>

				  <li>nouveau: fix MPEG4 hw decoding</li>

				  <li>nouveau: rework reference frame handling</li>

				  <li>nouveau: remove unneeded assert</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>r600g,radeonsi: make sure there's enough CS space before resuming queries</li>

				  <li>mesa: set UniformBooleanTrue = 1.0f by default</li>

				  <li>st/mesa: use 1.0f as boolean true on drivers without integer support</li>

				</ul>

				<p>Richard Sandiford (1):</p>

				<ul>

				  <li>gallivm: Fix uses of 2^24</li>

				</ul>

				<p>Roland Scheidegger (1):</p>

				<ul>

				  <li>gallivm: set mcpu when initializing llvm execution engine</li>

				</ul>

				<p>Thomas Hellstrom (1):</p>

				<ul>

				  <li>winsys/svga: Fix incorrect type usage in IOCTL v2</li>

				</ul>

				</div>

				</body>

				</html>

									
										101

docs/relnotes/10.2.9.html
									
										Normal file
									
				@@ -0,0 +1,101 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.2.9 Release Notes / October 12, 2014</h1>

				<p>

				Mesa 10.2.9 is a bug fix release which fixes bugs found since the 10.2.8 release.

				This is the final planned release for the 10.2 branch.

				</p>

				<p>

				Mesa 10.2.9 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				f8d62857eed8f604a57710c58a8ffcfb8dab2dc4977ec27c956c7c4fd14032f6  MesaLib-10.2.9.tar.gz

				f6031f8b7113a92325b60635c504c510490eebb2e707119bbff7bd86aa34657d  MesaLib-10.2.9.tar.bz2

				11c0ef4f3308fc29d9f15a77fd8f4842a946fce9e830250a1c95b171a446171a  MesaLib-10.2.9.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79462">Bug 79462</a> - [NVC0/Codegen] Shader compilation fails in spill logic</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83570">Bug 83570</a> - Glyphy demo throws unhandled Integer division by zero exception</li>

				</ul>

				<h2>Changes</h2>

				<p>Andreas Pokorny (2):</p>

				<ul>

				  <li>egl/drm: expose KHR_image_pixmap extension</li>

				  <li>i915: Fix black buffers when importing prime fds</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.2.8 release</li>

				  <li>Update VERSION to 10.2.9</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>nv50/ir: avoid deleting pseudo instructions too early</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>radeonsi: release GS rings at context destruction</li>

				  <li>radeonsi: properly destroy the GS copy shader and scratch_bo for compute</li>

				  <li>st/dri: remove GALLIUM_MSAA and __GL_FSAA_MODE environment variables</li>

				</ul>

				<p>Roland Scheidegger (1):</p>

				<ul>

				  <li>gallivm: fix idiv</li>

				</ul>

				<p>Thomas Hellstrom (1):</p>

				<ul>

				  <li>st/xa: Fix regression in xa_yuv_planar_blit()</li>

				</ul>

				<p>Tom Stellard (1):</p>

				<ul>

				  <li>configure.ac: Compute LLVM_VERSION_PATCH using llvm-config</li>

				</ul>

				<p>rconde (1):</p>

				<ul>

				  <li>gallivm,tgsi: fix idiv by zero crash</li>

				</ul>

				</div>

				</body>

				</html>

									
										97

docs/relnotes/10.2.html
									
										Normal file
									
												
				@@ -0,0 +1,97 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.2 Release Notes / June 6, 2014</h1>

				<p>

				Mesa 10.2 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 10.2.1.

				</p>

				<p>

				Mesa 10.2 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				c87bfb6dd5cbcf1fdef42e5ccd972581  MesaLib-10.2.0.tar.gz

				7aaba90bd7169a94ae2fe83febdec963  MesaLib-10.2.0.tar.bz2

				58b203aca15dadc25ab4d1126db1052b  MesaLib-10.2.0.zip

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>GL_ARB_buffer_storage on i965, nv30, nv50, nvc0, r300, r600, and radeonsi</li>

				<li>GL_ARB_multi_bind on all drivers</li>

				<li>GL_ARB_sample_shading on nv50 (GT21x only), nvc0</li>

				<li>GL_ARB_separate_shader_objects (desktop OpenGL) and

				  GL_EXT_separate_shader_objects (OpenGL ES 2.0 and 3.0) on all drivers</li>

				<li>GL_ARB_stencil_texturing on i965/gen8+</li>

				<li>GL_ARB_texture_cube_map_array on nv50 (GT21x only)</li>

				<li>GL_ARB_texture_gather on nv50 (GT21x only), nvc0</li>

				<li>GL_ARB_texture_query_lod on nv50 (GT21x only), nvc0</li>

				<li>GL_ARB_texture_view on i965/gen7</li>

				<li>GL_ARB_vertex_type_10f_11f_11f_rev on nv50, nvc0, radeonsi</li>

				<li>GL_ARB_viewport_array on nv50, r600</li>

				<li>GL_INTEL_performance_query on i965/gen5+</li>

				</ul>

				<h2>Bug fixes</h2>

				TBD.

				<h2>Changes</h2>

				<ul>

				<li>Renamed <i>--with-llvm-shared-libs</i> to <i>--enable-llvm-shared-libs</i></li>

				<p>

				The option is used to control how mesa is linked against LLVM, and now

				defaults to enabled (shared linking).

				</p>

				<li>Split <i>libxatracker.so</i> into a standalone library which can be used

				with any gallium driver.</li>

				<p>

				Previously the library was linked statically against VMware's virtual GPU

				driver (svga), whereas now it loads a shared pipe_*.so driver. To build with

				support for the svga driver, pass the following options to configure:

				<i>--enable-xa --with-gallium-drivers=svga</i>

				</p>

				<p>

				Note: The files are installed in $(libdir)/gallium-pipe/ and the interface

				between them and libxatracker.so is <strong>not</strong> stable.

				</p>

				<li>The environment variable GALLIUM_MSAA that forced a multisample GLX visual was removed.</li>

				</ul>

				</div>

				</body>

				</html>

									
										158

docs/relnotes/10.3.1.html
									
										Normal file
									
												
				@@ -0,0 +1,158 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.3.1 Release Notes / October 12, 2014</h1>

				<p>

				Mesa 10.3.1 is a bug fix release which fixes bugs found since the 10.3 release.

				</p>

				<p>

				Mesa 10.3.1 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				155afcbad17be8bb80282c761b957d5cc716c14a1fa16c4f5ee04e76df729c6d  MesaLib-10.3.1.tar.gz

				b081d077d717e5d56f2d59677490856052c41573e50378ff86d6c72456714add  MesaLib-10.3.1.tar.bz2

				07a14febfed06412d519e091a62d24513fee6745f1a6f8a8f1956bfe04b77d15  MesaLib-10.3.1.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79462">Bug 79462</a> - [NVC0/Codegen] Shader compilation fails in spill logic</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82932">Bug 82932</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.shaders.indexing.vector_subscript.vec3_static_loop_subscript_write_direct_read_vertex fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83506">Bug 83506</a> - [UBO] row_major layout ignored inside structures</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83533">Bug 83533</a> - [UBO] nested structures don't get appropriate padding</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83570">Bug 83570</a> - Glyphy demo throws unhandled Integer division by zero exception</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83741">Bug 83741</a> - [UBO] row_major layout partially ignored for arrays of structures</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84178">Bug 84178</a> - Big glamor regression in Xorg server 1.6.99.1 GIT: x11perf 1.5 Test: PutImage XY 500x500 Square</li>

				</ul>

				<h2>Changes</h2>

				<p>Andreas Pokorny (2):</p>

				<ul>

				  <li>egl/drm: expose KHR_image_pixmap extension</li>

				  <li>i915: Fix black buffers when importing prime fds</li>

				</ul>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>mesa: fix prog_optimize.c assertions triggered by SWZ opcode</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: Add 10.3 sha256 sums, news item and link release notes</li>

				  <li>Update VERSION to 10.3.1</li>

				</ul>

				<p>Ian Romanick (4):</p>

				<ul>

				  <li>glsl: Make sure fields after small structs have correct padding</li>

				  <li>glsl: Make sure row-major array-of-structure get correct layout</li>

				  <li>glsl: Round struct size up to at least 16 bytes</li>

				  <li>glsl: Strip arrayness from ir_type_dereference_variable too</li>

				</ul>

				<p>Ilia Mirkin (5):</p>

				<ul>

				  <li>nv50/ir: avoid deleting pseudo instructions too early</li>

				  <li>gm107/ir: fix manual TXD for array targets</li>

				  <li>gm107/ir: fix texture argument order</li>

				  <li>gm107/ir: add support for indirect const buffer selection</li>

				  <li>gm107/ir: take relative pfetch offset into account</li>

				</ul>

				<p>Keith Packard (1):</p>

				<ul>

				  <li>glx/dri3: Provide error diagnostics when DRI3 allocation fails</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>mesa: Use proper structure for glGet*(GL_TEXTURE_COORD_ARRAY*).</li>

				  <li>mesa: Set correct array element in vbo_exec_vtx_init.</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>radeonsi: release GS rings at context destruction</li>

				  <li>radeonsi: properly destroy the GS copy shader and scratch_bo for compute</li>

				  <li>st/dri: remove GALLIUM_MSAA and __GL_FSAA_MODE environment variables</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>st/mesa: Use PIPE_USAGE_STAGING for GL_STATIC/DYNAMIC/STREAM_READ buffers</li>

				</ul>

				<p>Richard Sandiford (2):</p>

				<ul>

				  <li>mesa: Fix alpha component in unpack_R8G8B8X8_SRGB.</li>

				  <li>swrast: Fix handling of MESA_FORMAT_L8A8_SRGB for big-endian</li>

				</ul>

				<p>Roland Scheidegger (1):</p>

				<ul>

				  <li>gallivm: fix idiv</li>

				</ul>

				<p>Thomas Hellstrom (1):</p>

				<ul>

				  <li>st/xa: Fix regression in xa_yuv_planar_blit()</li>

				</ul>

				<p>Tom Stellard (2):</p>

				<ul>

				  <li>clover: Add support to mem objects for multiple destructor callbacks v2</li>

				  <li>configure.ac: Compute LLVM_VERSION_PATCH using llvm-config</li>

				</ul>

				<p>Tomasz Figa (3):</p>

				<ul>

				  <li>util: Include in Android builds</li>

				  <li>st/mesa: Generate format_info.c in Android builds</li>

				  <li>st/mesa: Fix paths used in Android builds</li>

				</ul>

				<p>rconde (1):</p>

				<ul>

				  <li>gallivm,tgsi: fix idiv by zero crash</li>

				</ul>

				</div>

				</body>

				</html>

									
										115

docs/relnotes/10.3.2.html
									
										Normal file
									
												
				@@ -0,0 +1,115 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.3.2 Release Notes / October 24, 2014</h1>

				<p>

				Mesa 10.3.2 is a bug fix release which fixes bugs found since the 10.3.1 release.

				</p>

				<p>

				Mesa 10.3.2 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				e65f8e691f06f111c1aeb3a376b13c9cc88cb162bee2709e0e7e6b0e6628ca75  MesaLib-10.3.2.tar.gz

				e9849bcb9aa9acd98a753d6d46d2e7d7238d3367036e11357a60efd16de8bea3  MesaLib-10.3.2.tar.bz2

				427dc0d670d38e713ebff2675665ec2fe4ff7d04ce227bd54de946999fc1d234  MesaLib-10.3.2.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54372">Bug 54372</a> - GLX_INTEL_swap_event crashes driver when swapping window buffers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81680">Bug 81680</a> - [r600g] Firefox crashes with hardware acceleration turned on</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84140">Bug 84140</a> - mplayer crashes playing some files using vdpau output</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84662">Bug 84662</a> - Long pauses with Unreal demo Elemental on R9270X since : Always flush the HDP cache before submitting a CS to the GPU</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85267">Bug 85267</a> - vlc crashes with vdpau (Radeon 3850HD) [r600]</li>

				</ul>

				<h2>Changes</h2>

				<p>Brian Paul (3):</p>

				<ul>

				  <li>mesa: fix spurious wglGetProcAddress / GL_INVALID_OPERATION error</li>

				  <li>st/wgl: add WINAPI qualifiers on wgl function typedefs</li>

				  <li>glsl: fix several use-after-free bugs</li>

				</ul>

				<p>Daniel Manjarres (1):</p>

				<ul>

				  <li>glx: Fix glxUseXFont for glxWindow and glxPixmaps</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>mesa: fix GetTexImage for 1D array depth textures</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.3.1 release</li>

				  <li>Update VERSION to 10.3.2</li>

				</ul>

				<p>Ilia Mirkin (4):</p>

				<ul>

				  <li>gm107/ir: add dnz emission for fmul</li>

				  <li>gk110/ir: add dnz flag emission for fmul/fmad</li>

				  <li>nouveau: 3d textures are unsupported, limit 3d levels to 1</li>

				  <li>st/gbm: fix order of arguments passed to is_format_supported</li>

				</ul>

				<p>Kenneth Graunke (3):</p>

				<ul>

				  <li>i965: Add a BRW_MOCS_PTE #define.</li>

				  <li>i965: Use BDW_MOCS_PTE for renderbuffers.</li>

				  <li>i965: Fix register write checks.</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>st/mesa: use pipe_sampler_view_release for releasing sampler views</li>

				  <li>glsl_to_tgsi: fix the value of gl_FrontFacing with native integers</li>

				</ul>

				<p>Michel Dänzer (4):</p>

				<ul>

				  <li>radeonsi: Clear sampler view flags when binding a buffer</li>

				  <li>r600g,radeonsi: Always use GTT again for PIPE_USAGE_STREAM buffers</li>

				  <li>winsys/radeon: Use separate caching buffer manager for each set of flags</li>

				  <li>r600g: Drop references to destroyed blend state</li>

				</ul>

				</div>

				</body>

				</html>

									
										209

docs/relnotes/10.3.3.html
									
										Normal file
									
												
				@@ -0,0 +1,209 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.3.3 Release Notes / November 8, 2014</h1>

				<p>

				Mesa 10.3.3 is a bug fix release which fixes bugs found since the 10.3.2 release.

				</p>

				<p>

				Mesa 10.3.3 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				23a0c36d88cd5d8968ae6454160de2878192fd1d37b5d606adca1f1b7e788b79  MesaLib-10.3.3.tar.gz

				0e4eee4a2ddf86456eed2fc44da367f95471f74249636710491e85cc256c4753  MesaLib-10.3.3.tar.bz2

				a83648f17d776b7cf6c813fbb15782d2644b937dc6a7c53d8c0d1b35411f4840  MesaLib-10.3.3.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70410">Bug 70410</a> - egl-static/Makefile: linking fails with llvm &gt;= 3.4</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82921">Bug 82921</a> - layout(location=0) emits error &gt;= MAX_UNIFORM_LOCATIONS due to integer underflow</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83574">Bug 83574</a> - [llvmpipe] [softpipe] piglit arb_explicit_uniform_location-use-of-unused-loc regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85454">Bug 85454</a> - Unigine Sanctuary with Wine crashes on Mesa Git</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85918">Bug 85918</a> - Mesa: MSVC 2010/2012 Compile error</li>

				</ul>

				<h2>Changes</h2>

				<p>Anuj Phogat (2):</p>

				<ul>

				  <li>glsl: Fix crash due to negative array index</li>

				  <li>glsl: Use signed array index in update_max_array_access()</li>

				</ul>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>mesa: fix UNCLAMPED_FLOAT_TO_UBYTE() macro for MSVC</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.3.2 release</li>

				  <li>Update version to 10.3.3</li>

				</ul>

				<p>Ilia Mirkin (27):</p>

				<ul>

				  <li>freedreno/ir3: fix FSLT/etc handling to return 0/-1 instead of 0/1.0</li>

				  <li>freedreno/ir3: INEG operates on src0, not src1</li>

				  <li>freedreno/ir3: add UARL support</li>

				  <li>freedreno/ir3: negate result of USLT/etc</li>

				  <li>freedreno/ir3: use unsigned comparison for UIF</li>

				  <li>freedreno/ir3: add TXL support</li>

				  <li>freedreno/ir3: fix UCMP handling</li>

				  <li>freedreno/ir3: implement UMUL correctly</li>

				  <li>freedreno: add default .dir-locals.el for emacs settings</li>

				  <li>freedreno/ir3: make texture instruction construction more dynamic</li>

				  <li>freedreno/ir3: fix TXB/TXL to actually pull the bias/lod argument</li>

				  <li>freedreno/ir3: add TXQ support</li>

				  <li>freedreno/ir3: add TXB2 support</li>

				  <li>freedreno: dual-source render targets are not supported</li>

				  <li>freedreno: instanced drawing/compute not yet supported</li>

				  <li>freedreno/ir3: avoid fan-in sources referring to same instruction</li>

				  <li>freedreno/ir3: add IDIV/UDIV support</li>

				  <li>freedreno/ir3: add UMOD support, based on UDIV</li>

				  <li>freedreno/ir3: add MOD support</li>

				  <li>freedreno/ir3: add ISSG support</li>

				  <li>freedreno/ir3: add UMAD support</li>

				  <li>freedreno/ir3: make TXQ return integers, not floats</li>

				  <li>freedreno/ir3: shadow comes before array</li>

				  <li>freedreno/ir3: add texture offset support</li>

				  <li>freedreno/ir3: add TXD support and expose ARB_shader_texture_lod</li>

				  <li>freedreno/ir3: add TXF support</li>

				  <li>freedreno: positions come out as integers, not half-integers</li>

				</ul>

				<p>Jan Vesely (1):</p>

				<ul>

				  <li>configure: include llvm systemlibs when using static llvm</li>

				</ul>

				<p>Marek Olšák (5):</p>

				<ul>

				  <li>r600g: fix polygon mode for points and lines and point/line fill modes</li>

				  <li>radeonsi: fix polygon mode for points and lines and point/line fill modes</li>

				  <li>radeonsi: fix incorrect index buffer max size for lowered 8-bit indices</li>

				  <li>Revert "st/mesa: set MaxUnrollIterations = 255"</li>

				  <li>r300g: remove enabled/disabled hyperz and AA compression messages</li>

				</ul>

				<p>Mauro Rossi (1):</p>

				<ul>

				  <li>gallium/nouveau: fully build the driver under android</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>radeon/llvm: Dynamically allocate branch/loop stack arrays</li>

				</ul>

				<p>Rob Clark (62):</p>

				<ul>

				  <li>freedreno/ir3: detect scheduler fail</li>

				  <li>freedreno/ir3: add TXB</li>

				  <li>freedreno/ir3: add DDX/DDY</li>

				  <li>freedreno/ir3: bit of debug</li>

				  <li>freedreno/ir3: fix error in bail logic</li>

				  <li>freedreno/ir3: fix constlen with relative addressing</li>

				  <li>freedreno/ir3: add no-copy-propagate fallback step</li>

				  <li>freedreno: don't overflow cmdstream buffer so much</li>

				  <li>freedreno/ir3: fix potential segfault in RA</li>

				  <li>freedreno: update generated headers</li>

				  <li>freedreno/a3xx: enable hw primitive-restart</li>

				  <li>freedreno/a3xx: handle rendering to layer != 0</li>

				  <li>freedreno: update generated headers</li>

				  <li>freedreno/a3xx: format fixes</li>

				  <li>util/u_format: add _is_alpha()</li>

				  <li>freedreno/a3xx: alpha render-target shenanigans</li>

				  <li>freedreno/ir3: catch incorrect usage of tmp-dst</li>

				  <li>freedreno/ir3: add missing put_dst</li>

				  <li>freedreno: "fix" problems with excessive flushes</li>

				  <li>freedreno: update generated headers</li>

				  <li>freedreno/a3xx: 3d/array textures</li>

				  <li>freedreno: add DRM_CONF_SHARE_FD</li>

				  <li>freedreno/a3xx: more texture array fixes</li>

				  <li>freedreno/a3xx: initial texture border-color</li>

				  <li>freedreno: fix compiler warning</li>

				  <li>freedreno: don't advertise mirror-clamp support</li>

				  <li>freedreno: update generated headers</li>

				  <li>freedreno: we have more than 0 viewports!</li>

				  <li>freedreno: turn missing caps into compile warnings</li>

				  <li>freedreno/a3xx: add LOD_BIAS</li>

				  <li>freedreno/a3xx: add flat interpolation mode</li>

				  <li>freedreno/a3xx: add 32bit integer vtx formats</li>

				  <li>freedreno/a3xx: fix border color order</li>

				  <li>freedreno: move bind_sampler_states to per-generation</li>

				  <li>freedreno: add texcoord clamp support to lowering</li>

				  <li>freedreno/a3xx: add support to emulate GL_CLAMP</li>

				  <li>freedreno/a3xx: re-emit shaders on variant change</li>

				  <li>freedreno/lowering: fix token calculation for lowering</li>

				  <li>freedreno: destroy transfer pool after blitter</li>

				  <li>freedreno: max-texture-lod-bias should be 15.0f</li>

				  <li>freedreno: update generated headers</li>

				  <li>freedreno/a3xx: handle large shader program sizes</li>

				  <li>freedreno/a3xx: emit all immediates in one shot</li>

				  <li>freedreno/ir3: fix lockups with lame FRAG shaders</li>

				  <li>freedreno/a3xx: handle VS only outputting BCOLOR</li>

				  <li>freedreno: query fixes</li>

				  <li>freedreno/a3xx: refactor vertex state emit</li>

				  <li>freedreno/a3xx: refactor/optimize emit</li>

				  <li>freedreno/ir3: optimize shader key comparision</li>

				  <li>freedreno: inline fd_draw_emit()</li>

				  <li>freedreno: fix layer_stride</li>

				  <li>freedreno: update generated headers</li>

				  <li>freedreno/ir3: large const support</li>

				  <li>freedreno/a3xx: more layer/level fixes</li>

				  <li>freedreno/ir3: comment + better fxn name</li>

				  <li>freedreno/ir3: fix potential gpu lockup with kill</li>

				  <li>freedreno/a3xx: disable early-z when we have kill's</li>

				  <li>freedreno/ir3: add debug flag to disable cp</li>

				  <li>freedreno: clear vs scissor</li>

				  <li>freedreno: mark scissor state dirty when enable bit changes</li>

				  <li>freedreno/a3xx: fix viewport state during clear</li>

				  <li>freedreno/a3xx: fix depth/stencil restore format</li>

				</ul>

				<p>Tapani Pälli (2):</p>

				<ul>

				  <li>glsl: fix uniform location count used for glsl types</li>

				  <li>mesa: check that uniform exists in glUniform* functions</li>

				</ul>

				</div>

				</body>

				</html>

									
										106

docs/relnotes/10.3.4.html
									
										Normal file
									
												
				@@ -0,0 +1,106 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.3.4 Release Notes / November 21, 2014</h1>

				<p>

				Mesa 10.3.4 is a bug fix release which fixes bugs found since the 10.3.3 release.

				</p>

				<p>

				Mesa 10.3.4 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				26482495ef6177f889dbd87c7edcccfedd995598785bbbd7e3e066352574c8e0  MesaLib-10.3.4.tar.gz

				e6373913142338d10515daf619d659433bfd2989988198930c13b0945a15e98a  MesaLib-10.3.4.tar.bz2

				8c3ebbb6535daf3414305860ebca6ac67dbb6e3d35058c7a6ce18b84b5945b7f  MesaLib-10.3.4.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76252">Bug 76252</a> - Dynamic loading/unloading of opengl32.dll results in a deadlock</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78770">Bug 78770</a> - [SNB bisected]Webglc conformance/textures/texture-size-limit.html fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83500">Bug 83500</a> - si_dma_copy_tile causes GPU hangs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85647">Bug 85647</a> - Random radeonsi crashes with mesa 10.3.x</li>

				</ul>

				<h2>Changes</h2>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>st/mesa: copy sampler_array_size field when copying instructions</li>

				</ul>

				<p>Chad Versace (1):</p>

				<ul>

				  <li>i965: Fix segfault in WebGL Conformance on Ivybridge</li>

				</ul>

				<p>Dave Airlie (5):</p>

				<ul>

				  <li>r600g/cayman: fix integer multiplication output overwrite (v2)</li>

				  <li>r600g/cayman: fix texture gather tests</li>

				  <li>r600g/cayman: handle empty vertex shaders</li>

				  <li>r600g: geom shaders: always load texture src regs from inputs</li>

				  <li>r600g: limit texture offset application to specific types (v2)</li>

				</ul>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.3.3 release</li>

				  <li>configure.ac: roll up a program for the sse4.1 check</li>

				  <li>get-pick-list.sh: Require explicit "10.3" for nominating stable patches</li>

				</ul>

				<p>Ilia Mirkin (1):</p>

				<ul>

				  <li>st/mesa: add a fallback for clear_with_quad when no vs_layer</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>llvmpipe: Avoid deadlock when unloading opengl32.dll</li>

				</ul>

				<p>Kenneth Graunke (1):</p>

				<ul>

				  <li>i915g: we also have more than 0 viewports!</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>radeonsi: Disable asynchronous DMA except for PIPE_BUFFER</li>

				</ul>

				</div>

				</body>

				</html>

									
										88

docs/relnotes/10.3.5.html
									
										Normal file
									
												
				@@ -0,0 +1,88 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.3.5 Release Notes / December 5, 2014</h1>

				<p>

				Mesa 10.3.5 is a bug fix release which fixes bugs found since the 10.3.4 release.

				</p>

				<p>

				Mesa 10.3.5 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				7ea71c3cce89114df3dc050376afa1c6f6bf235d77a68f9703273603d6a90621  MesaLib-10.3.5.tar.gz

				eb75d2790f1606d59d50a6acaa637b6c75f2155b3e0eca3d5099165c0d9556ae  MesaLib-10.3.5.tar.bz2

				164bc64ba63fb07ff255ff8de6ed3c95ff545dfe8f864c44c33abe94788da910  MesaLib-10.3.5.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86618">Bug 86618</a> - [NV96] neg modifiers not working in MIN and MAX operations</li>

				</ul>

				<h2>Changes</h2>

				<p>Brian Paul (2):</p>

				<ul>

				  <li>mesa: fix arithmetic error in _mesa_compute_compressed_pixelstore()</li>

				  <li>mesa: fix height error check for 1D array textures</li>

				</ul>

				<p>Chris Forbes (2):</p>

				<ul>

				  <li>i965: Handle nested uniform array indexing</li>

				  <li>mesa: Fix Get(GL_TRANSPOSE_CURRENT_MATRIX_ARB) to transpose</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.3.5 release</li>

				  <li>Update version to 10.3.5</li>

				</ul>

				<p>Ilia Mirkin (6):</p>

				<ul>

				  <li>nv50/ir: set neg modifiers on min/max args</li>

				  <li>nv50,nvc0: actually check constbufs for invalidation</li>

				  <li>nv50,nvc0: buffer resources can be bound as other things down the line</li>

				  <li>freedreno/ir3: don't pass consts to madsh.m16 in MOD logic</li>

				  <li>freedreno/a3xx: only enable blend clamp for non-float formats</li>

				  <li>freedreno/ir3: fix UMAD</li>

				</ul>

				<p>Rob Clark (1):</p>

				<ul>

				  <li>configure.ac: bump libdrm_freedreno requirement</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/10.3.6.html
									
										Normal file
									
												
				@@ -0,0 +1,124 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.3.6 Release Notes / December 29, 2014</h1>

				<p>

				Mesa 10.3.6 is a bug fix release which fixes bugs found since the 10.3.5 release.

				</p>

				<p>

				Mesa 10.3.6 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				c4d053d6bc6604cb5c93c99e0ef2e815c539f26dc5a03737eb3809bc1767d12f  MesaLib-10.3.6.tar.gz

				8d43673c6788fbf85f9c36c3a95c61ccf46f8835fc9c0d85d34474490d80572b  MesaLib-10.3.6.tar.bz2

				6b5b1e9a13949cfdb76fe51e8dcc3ea71e464a5ca73d11fdc29c20c4ba3f411a  MesaLib-10.3.6.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60879">Bug 60879</a> - [radeonsi] X11 can't start with acceleration enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82585">Bug 82585</a> - geometry shader with optional out variable segfaults</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82991">Bug 82991</a> - Inverted bumpmap in webgl applications</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84777">Bug 84777</a> - [BSW]Piglit spec_glsl-1.50_execution_geometry-basic fails</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (1):</p>

				<ul>

				  <li>i965/brw_reg: struct constructor now needs explicit negate and abs values.</li>

				</ul>

				<p>Ben Widawsky (1):</p>

				<ul>

				  <li>i965/gs: Avoid DW * DW mul</li>

				</ul>

				<p>Dave Airlie (1):</p>

				<ul>

				  <li>r600g: only init GS_VERT_ITEMSIZE on r600</li>

				</ul>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.3.5 release</li>

				  <li>Revert "glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)"</li>

				  <li>Update version to 10.3.6</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>linker: Wrap access of producer_var with a NULL check</li>

				  <li>linker: Assign varying locations geometry shader inputs for SSO</li>

				</ul>

				<p>Ilia Mirkin (3):</p>

				<ul>

				  <li>util/primconvert: pass index bias through</li>

				  <li>util/primconvert: support instanced rendering</li>

				  <li>util/primconvert: take ib offset into account</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>util/primconvert: Avoid point arithmetic; apply offset on all cases.</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>docs/relnotes: document the removal of GALLIUM_MSAA</li>

				</ul>

				<p>Mario Kleiner (4):</p>

				<ul>

				  <li>glx/dri3: Fix glXWaitForSbcOML() to handle targetSBC==0 correctly. (v2)</li>

				  <li>glx/dri3: Track separate (ust, msc) for PresentPixmap vs. PresentNotifyMsc (v2)</li>

				  <li>glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)</li>

				  <li>glx/dri3: Don't fail on glXSwapBuffersMscOML(dpy, window, 0, 0, 0) (v2)</li>

				</ul>

				<p>Maxence Le Doré (1):</p>

				<ul>

				  <li>glsl: Add gl_MaxViewports to available builtin constants</li>

				</ul>

				<p>Tom Stellard (1):</p>

				<ul>

				  <li>radeonsi: Program RASTER_CONFIG for harvested GPUs v5</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/10.3.7.html
									
										Normal file
									
												
				@@ -0,0 +1,93 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.3.7 Release Notes / January 12, 2015</h1>

				<p>

				Mesa 10.3.7 is a bug fix release which fixes bugs found since the 10.3.6 release.

				</p>

				<p>

				Mesa 10.3.7 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				bc13f33c19bc9f44a0565fdd51a8f9d1c0153a3365c429ceaf4ef43b7022b052  MesaLib-10.3.7.tar.gz

				43c6ced15e237cbb21b3082d7c0b42777c50c1f731d0d4b5efb5231063fb6a5b  MesaLib-10.3.7.tar.bz2

				d821fd46baf804fecfcf403e901800a4b996c7dd1c83f20a354b46566a49026f  MesaLib-10.3.7.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85529">Bug 85529</a> - Surfaces not drawn in Unvanquished</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87619">Bug 87619</a> - Changes to state such as render targets change fragment shader without marking it dirty.</li>

				</ul>

				<h2>Changes</h2>

				<p>Chad Versace (2):</p>

				<ul>

				  <li>i965: Use safer pointer arithmetic in intel_texsubimage_tiled_memcpy()</li>

				  <li>i965: Use safer pointer arithmetic in gather_oa_results()</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.3.6 release</li>

				  <li>Update version to 10.3.7</li>

				</ul>

				<p>Ilia Mirkin (2):</p>

				<ul>

				  <li>nv50,nvc0: set vertex id base to index_bias</li>

				  <li>nv50/ir: fix texture offsets in release builds</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>i965: Add missing BRW_NEW_*_PROG_DATA to texture/renderbuffer atoms.</li>

				  <li>i965: Fix start/base_vertex_location for &gt;1 prims but !BRW_NEW_VERTICES.</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>glsl_to_tgsi: fix a bug in copy propagation</li>

				  <li>vbo: ignore primitive restart if FixedIndex is enabled in DrawArrays</li>

				  <li>st/mesa: fix GL_PRIMITIVE_RESTART_FIXED_INDEX</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>radeonsi: Don't modify PA_SC_RASTER_CONFIG register value if rb_mask == 0</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/10.3.html
									
										Normal file
									
												
				@@ -0,0 +1,335 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.3 Release Notes / September 19, 2014</h1>

				<p>

				Mesa 10.3 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 10.3.1.

				</p>

				<p>

				Mesa 10.3 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				9a1bf52040fc3dda81e83a35f944f1c3f532847dbe9fdf57161265cf71ea1bae  MesaLib-10.3.0.tar.gz

				0283bfe710fa449ed82e465cfa09612a269e19abb7e0382082608062ce7960b5  MesaLib-10.3.0.tar.bz2

				221420763c2c3a244836a736e735612c4a6a0377b4e5223fca1e612f49906789  MesaLib-10.3.0.zip

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>GL_ARB_ES3_compatibility on nv50, nvc0, r600, radeonsi, softpipe, llvmpipe</li>

				<li>GL_ARB_clear_texture on i965</li>

				<li>GL_ARB_compressed_texture_pixel_storage on all drivers</li>

				<li>GL_ARB_conditional_render_inverted on i965, nvc0, softpipe, llvmpipe</li>

				<li>GL_ARB_derivative_control on i965, nv50, nvc0, r600</li>

				<li>GL_ARB_draw_indirect on nvc0, radeonsi</li>

				<li>GL_ARB_explicit_uniform_location (all drivers that support GLSL)</li>

				<li>GL_ARB_fragment_layer_viewport on nv50, nvc0, llvmpipe, r600</li>

				<li>GL_ARB_gpu_shader5 on i965/gen7, nvc0</li>

				<li>GL_ARB_multi_draw_indirect on nvc0, radeonsi</li>

				<li>GL_ARB_sample_shading on radeonsi</li>

				<li>GL_ARB_seamless_cubemap_per_texture on i965, llvmpipe, nvc0, r600, radeonsi, softpipe</li>

				<li>GL_ARB_stencil_texturing on nv50, nvc0, r600, and radeonsi</li>

				<li>GL_ARB_texture_barrier on nv50, nvc0, r300, r600, radeonsi</li>

				<li>GL_ARB_texture_compression_bptc on i965/gen7+, nvc0, r600/evergreen+, radeonsi</li>

				<li>GL_ARB_texture_cube_map_array on radeonsi</li>

				<li>GL_ARB_texture_gather on r600, radeonsi</li>

				<li>GL_ARB_texture_query_levels on nv50, nvc0, llvmpipe, r600, radeonsi, softpipe</li>

				<li>GL_ARB_texture_query_lod on r600, radeonsi</li>

				<li>GL_ARB_viewport_array on nvc0</li>

				<li>GL_AMD_vertex_shader_viewport_index on i965/gen7+, r600</li>

				<li>GL_OES_compressed_ETC1_RGB8_texture on nv30, nv50, nvc0, r300, r600, radeonsi, softpipe, llvmpipe</li>

				<li>GLX_MESA_query_renderer on nv30, nv50, nvc0, r300, r600, radeonsi, softpipe, llvmpipe</li>

				<li>A new software rasterizer driver (kms_swrast_dri.so) that works with

				DRM drivers that don't have a full-fledged GEM (such as qxl or simpledrm)</li>

				<li>Distribute the Khronos GL/glcorearb.h header file.</li>

				</ul>

				<h2>Bug fixes</h2>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=50754">Bug 50754</a> - Building 32 bit mesa on 64 bit OS fails since change for automake</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=53617">Bug 53617</a> - [llvmpipe] piglit fbo-depthtex regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54372">Bug 54372</a> - GLX_INTEL_swap_event crashes driver when swapping window buffers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=56127">Bug 56127</a> - [ILK bisected]unigine-sanctruary performance reduced by 98%</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66184">Bug 66184</a> - src/mesa/state_tracker/st_glsl_to_tgsi.cpp:3216:simplify_cmp: Assertion `inst-&gt;dst.index &lt; 4096' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66452">Bug 66452</a> - JUNIPER UVD accelerated playback of WMV3 streams does not work</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68365">Bug 68365</a> - [SNB Bisected]Piglit spec_ARB_framebuffer_object_fbo-blit-stretch  fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70441">Bug 70441</a> - [Gen4-5 clip] Piglit spec_OpenGL_1.1_polygon-offset hits (execsize &gt;= width) assertion</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73846">Bug 73846</a> - [llvmpipe] lp_test_format fails with llvm-3.5svn &gt;= r199602</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74005">Bug 74005</a> - [i965 Bisected]Piglit/glx_glx-make-glxdrawable-current fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74863">Bug 74863</a> - [r600g] HyperZ broken on RV770 and CYPRESS (Left 4 Dead 2 trees corruption) bisected!</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75010">Bug 75010</a> - clang: error: unknown argument: '-fstack-protector-strong'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75478">Bug 75478</a> - [BDW]Some Piglit and Ogles2conform cases cause GPU hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75664">Bug 75664</a> - Unigine Valley &amp; Heaven &quot;error: syntax error, unexpected EXTENSION, expecting $end&quot; IVB HD4000</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75878">Bug 75878</a> - [BDW] GPU hang running Raytracer WebGL demo</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76188">Bug 76188</a> - EGL_EXT_image_dma_buf_import fd ownership is incorrect</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76223">Bug 76223</a> - [radeonsi] luxmark segfault</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76939">Bug 76939</a> - [BDW] GPU hang when running “Metro:Last Light “ /“Crusader Kings II”</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77245">Bug 77245</a> - Bogus GL_ARB_explicit_attrib_location layout identifier warnings</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77493">Bug 77493</a> - lp_test_arit fails with llvm &gt;= llvm-3.5svn r206094</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77703">Bug 77703</a> - [ILK Bisected]Piglit glean_texCombine4 fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77704">Bug 77704</a> - [IVB/HSW Bisected]Ogles3conform GL3Tests_shadow_shadow_execution_frag.test fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77705">Bug 77705</a> - [SNB/IVB/HSW/BYT/BDW Bisected]Ogles3conform GL3Tests/packed_pixels/packed_pixels_pixelstore.test  segfault</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77707">Bug 77707</a> - [ILK Bisected]Ogles2conform GL_sin_sin_float_frag_xvary.test fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77740">Bug 77740</a> - i965: Relax accumulator dependency scheduling on Gen &lt; 6</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77852">Bug 77852</a> - [BDW]Piglit spec_ARB_framebuffer_object_fbo-drawbuffers-none_glBlitFramebuffer fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77856">Bug 77856</a> - [BDW]Piglit spec_OpenGL_3.0_clearbuffer-mixed-format fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77865">Bug 77865</a> - [BDW] Many Ogles3conform framebuffer_blit cases fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78225">Bug 78225</a> - Compile error due to undefined reference to `gbm_dri_backend', fix attached</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78258">Bug 78258</a> - make check link_varyings.gl_ClipDistance failure</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78403">Bug 78403</a> - query_renderer_implementation_unittest.cpp:144:4: error: expected primary-expression before ‘.’ token</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78468">Bug 78468</a> - Compiling of shader gets stuck in infinite loop</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78537">Bug 78537</a> - no anisotropic filtering in a native Half-Life 2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78546">Bug 78546</a> - [swrast] piglit copyteximage-border regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78581">Bug 78581</a> - OpenCL: clBuildProgram prints error messages directly rather than storing them</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78648">Bug 78648</a> - Texture artifacts in Kerbal Space Program</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78665">Bug 78665</a> - macros in builtin_functions.cpp make invalid assumptions about M_PI definitions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78679">Bug 78679</a> - Gen4-5 code lost: runtime_check_aads_emit</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78691">Bug 78691</a> - [G45 - Tesseract] Mesa 10.1.2 implementation error: Unsupported opcode 169872468 in FS</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78692">Bug 78692</a> - Football Manager 2014, gameplay rendered black &amp; white</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78716">Bug 78716</a> - Fix Mesa bugs for running Unreal Engine 4.1 Cave effects demo compiled for Linux</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78803">Bug 78803</a> - gallivm/lp_bld_debug.cpp:42:28: fatal error: llvm/IR/Module.h: No such file or directory</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78842">Bug 78842</a> - [swrast] piglit fcc-read-after-clear copy rb regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78843">Bug 78843</a> - [swrast] piglit copyteximage 1D regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78872">Bug 78872</a> - [ILK Bisected]Piglit spec_ARB_depth_buffer_float_fbo-depthstencil-GL_DEPTH32F_STENCIL8-blit Aborted</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78875">Bug 78875</a> - [ILK Bisected]Webglc conformance/uniforms/uniform-default-values.html fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78888">Bug 78888</a> - test_eu_compact.c:54:3: error: implicit declaration of function ‘brw_disasm’ [-Werror=implicit-function-declaration]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79029">Bug 79029</a> - INTEL_DEBUG=shader_time is full of lies</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79095">Bug 79095</a> - x86/common_x86.c:348:14: error: use of undeclared identifier 'bit_SSE4_1'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79115">Bug 79115</a> - glFramebufferRenderbuffer(GL_DRAW_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_RENDERBUFFER, 0) doesn't unbind stencil buffer</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79263">Bug 79263</a> - Linking error in egl_gallium.la when compiling 32 bit on multiarch</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79294">Bug 79294</a> - Xlib-based build broken on non x86/x86-64 architectures</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79373">Bug 79373</a> - Non-const initializers for matrix and vector constructors</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79382">Bug 79382</a> - build error: multiple definition of `loader_get_pci_id_for_fd'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79421">Bug 79421</a> - [llvmpipe] SIGSEGV src/gallium/drivers/llvmpipe/lp_rast_priv.h:218</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79440">Bug 79440</a> - prog_hash_table.c:146: undefined reference to `_mesa_error_no_memory'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79469">Bug 79469</a> - Commit e3cc0d90e14e62a0a787b6c07a6df0f5c84039be breaks unigine heaven</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79534">Bug 79534</a> - gen&lt;7 renders garbage</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79616">Bug 79616</a> - L4D2 crash on startup</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79724">Bug 79724</a> - switch statement type check</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79729">Bug 79729</a> - [i965] glClear on a multisample texture doesn't work</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79809">Bug 79809</a> - radeonsi: mouse cursor corruption using weston on AMD Kaveri</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79823">Bug 79823</a> - [NV30/gallium] Mozilla apps freeze on startup with nouveau-dri-10.2.1 libs on dual-screen</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79885">Bug 79885</a> - commit b52a530 (gallium/egl: st_profiles are build time decision, treat them as such) broke egl</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79903">Bug 79903</a> - [HSW Bisected]Some Piglit and Ogles2conform cases fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79907">Bug 79907</a> - Mesa 10.2.1 --enable-vdpau default=auto broken</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79948">Bug 79948</a> - [i965] Incorrect pixels when using discard and uniform loads</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80015">Bug 80015</a> - Transparency glitches in native Civilization 5 (Civ5) port</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80115">Bug 80115</a> - MESA_META_DRAW_BUFFERS induced GL_INVALID_VALUE errors</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80211">Bug 80211</a> - [ILK/SNB Bisected]Piglit shaders_glsl-fs-copy-propagation-texcoords-1 fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80247">Bug 80247</a> - Khronos conformance test  ES3-CTS.gtf.GL3Tests.transform_feedback.transform_feedback_vertex_id  fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80254">Bug 80254</a> - pipe_loader_sw.c:90: undefined reference to `dri_create_sw_winsys'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80541">Bug 80541</a> - [softpipe] piglit levelclamp regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80561">Bug 80561</a> - Incorrect implementation of some VDPAU APIs.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80614">Bug 80614</a> - [regression] Error in `omxregister-bellagio': munmap_chunk(): invalid pointer: 0x00007f5f76626dab</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80778">Bug 80778</a> - [bisected regression] piglit spec/glsl-1.50/compiler/incorrect-in-layout-qualifier-repeated-prim.geom</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80827">Bug 80827</a> - [radeonsi,R9 270X] Corruptions in window menus in KDE</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80880">Bug 80880</a> - Unreal Engine 4 demos fail GLSL compiler assertion</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80991">Bug 80991</a> - [BDW]Piglit spec_ARB_sample_shading_builtin-gl-sample-mask_2 fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81020">Bug 81020</a> - [radeonsi][regresssion] Wireframe of background rendered through objects in Half-Life 2: Episode 2 with MSAA enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81150">Bug 81150</a> - [SNB]Piglit spec_arb_shading_language_packing_execution_built-in-functions_fs-packSnorm4x8 fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81157">Bug 81157</a> - [BDW]Piglit some spec_glsl-1.50_execution_built-in-functions* cases fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81450">Bug 81450</a> - [BDW]Piglit spec_glsl-1.30_execution_tex-miplevel-selection_textureGrad_1DArray cases intel_do_flush_locked failed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81828">Bug 81828</a> - [BDW Bisected]Ogles3conform GL3Tests_packed_pixels_packed_pixels_pbo.test fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81834">Bug 81834</a> - TGSI constant buffer overrun causes assertion failure</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81857">Bug 81857</a> - [SNB+]Piglit spec_glsl-1.30_execution_switch_fs-default_last sporadically fail</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81967">Bug 81967</a> - [regression] Selections in Blender renders wrong</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82139">Bug 82139</a> - [r600g, bisected] multiple ubo piglit regressions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82159">Bug 82159</a> - No rule to make target `../../../../src/mesa/libmesa.la', needed by `collision'.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82255">Bug 82255</a> - [VP2] Chroma planes are vertically stretched during VDPAU playback</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82268">Bug 82268</a> - Add support for the OpenRISC architecture (or1k)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82428">Bug 82428</a> - [radeonsi,R9 270X] System lockup when using mplayer/mpv with VDPAU</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82472">Bug 82472</a> - piglit 16385-consecutive-chars regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82483">Bug 82483</a> - format_srgb.h:145: undefined reference to `util_format_srgb_to_linear_8unorm_table'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82517">Bug 82517</a> - [RADEONSI,VDPAU] SIGSEGV in map_msg_fb_buf called from ruvd_destroy, when closing a Tab with accelerated video player</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82534">Bug 82534</a> - src\egl\main\eglapi.h : fatal error LNK1107: invalid or corrupt file: cannot read at 0x2E02</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82536">Bug 82536</a> - u_current.h:72: undefined reference to `__imp__glapi_Dispatch'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82538">Bug 82538</a> - Super Maryo Chronicles fails with st/mesa assertion failure</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82539">Bug 82539</a> - vmw_screen_dri.lo In file included from vmw_screen_dri.c:41: vmwgfx_drm.h:32:17: error: drm.h: No such file or directory</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82546">Bug 82546</a> - [regression] libOSMesa build failure</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82574">Bug 82574</a> - GLSL: opt_vectorize goes wrong on texture lookups</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82628">Bug 82628</a> - bisected: GALLIUM_HUD hangs radeon 7970M (PRIME)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82671">Bug 82671</a> - [r600g-evergreen][compute]Empty kernel execution causes crash</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82709">Bug 82709</a> - OpenCL not working on radeon hainan</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82796">Bug 82796</a> - [IVB/BYT-M/HSW/BDW Bisected]Synmark2_v6.0_OglTerrainFlyInst/OglTerrainPanInst cannot run as image validation failed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82804">Bug 82804</a> - unreal engine 4 rendering errors</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82814">Bug 82814</a> - glDrawBuffers(0, NULL) segfaults in _mesa_drawbuffers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82828">Bug 82828</a> - Regression: Crash in 3Dmark2001</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82846">Bug 82846</a> - [BDW Bisected] Gpu hang when running Lightsmark v2008/Warsow v1.0/Xonotic v0.7/unigine-demos</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82881">Bug 82881</a> - test_vec4_register_coalesce regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82882">Bug 82882</a> - [swrast] piglit glsl-fs-uniform-bool-1 regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82929">Bug 82929</a> - [BDW Bisected]glxgears causes X hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82932">Bug 82932</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.shaders.indexing.vector_subscript.vec3_static_loop_subscript_write_direct_read_vertex fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83046">Bug 83046</a> - [BDW bisected]] Warsow v1.0/Xonotic v0.7/Gputest v0.5_triangle_fullscreen/synmark2_v6/GLBenchmark v2.5.0/GLBenchmark v2.7.0/Ungine-demos performance reduced 30%~60%</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83079">Bug 83079</a> - [NVC0] Dota 2 (Linux native and Wine) crash with Nouveau Drivers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83081">Bug 83081</a> - [BDW Bisected]Piglit spec_ARB_sample_shading_builtin-gl-sample-mask_2 is core dumped</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83127">Bug 83127</a> - [ILK Bisected]Piglit glean_texCombine fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83355">Bug 83355</a> - FTBFS: src/mesa/program/program_lexer.l:122:64: error: unknown type name 'YYSTYPE'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83432">Bug 83432</a> - r600_query.c:269:r600_emit_query_end: Assertion `ctx-&gt;num_pipelinestat_queries &gt; 0' failed [Gallium HUD]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83468">Bug 83468</a> - [UBO] Using bool from UBO as if-statement condition asserts</li>

				</ul>

				<h2>Changes</h2>

				<ul>

				<li>Removed support for the GL_ATI_envmap_bumpmap extension</li>

				<li>The hacky --enable-32/64-bit is no longer available in configure. To build

				32-bit or 64-bit Mesa, refer to the default method recommended by your distribution.</li>

				<li>The environment variable GALLIUM_MSAA that forced a multisample GLX visual was removed.</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/10.4.1.html
									
										Normal file
									
												
				@@ -0,0 +1,97 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.4.1 Release Notes / December 29, 2014</h1>

				<p>

				Mesa 10.4.1 is a bug fix release which fixes bugs found since the 10.4.0 release.

				</p>

				<p>

				Mesa 10.4.1 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>
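				<p>Because the reported version depends on the driver, an application may want to check it at run time. The following is a minimal sketch, not part of Mesa: it assumes the string returned by glGetString(GL_VERSION) has already been obtained from a current context, and that it begins with "major.minor" as desktop OpenGL version strings do. The function names and the sample strings are hypothetical.</p>

```python
import re


def parse_gl_version(version_string):
    """Return (major, minor) parsed from the start of a GL_VERSION string."""
    # Desktop OpenGL version strings start with "major.minor", optionally
    # followed by ".release" and vendor-specific text.
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized GL_VERSION string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def supports(version_string, required=(3, 3)):
    """True if the reported version is at least the required (major, minor)."""
    return parse_gl_version(version_string) >= required


# Hypothetical strings that two different drivers might report.
print(supports("3.3 (Core Profile) Mesa 10.4.1"))  # True
print(supports("3.0 Mesa 10.4.1"))                 # False
```

				<p>Comparing tuples keeps the check correct across major versions, for example (4, 1) compares greater than (3, 3).</p>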

				<h2>SHA256 checksums</h2>

				<pre>

				5311285e791a6bfaa468ad002bd1e1164acb3eaa040b5a1bf958bdb7c27e0a9d  MesaLib-10.4.1.tar.gz

				91e8b71c8aff4cb92022a09a872b1c5d1ae5bfec8c6c84dbc4221333da5bf1ca  MesaLib-10.4.1.tar.bz2

				e09c8135f5a86ecb21182c6f8959aafd39ae2f98858fdf7c0e25df65b5abcdb8  MesaLib-10.4.1.zip

				</pre>
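				<p>A downloaded archive can be checked against the sums listed above. This is a minimal sketch using Python's standard hashlib module; the helper names are hypothetical, and the local file path is an assumption about where the archive was saved.</p>

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sum(path, expected_hex):
    """Compare a file's digest with a published sum, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

				<p>For example, matches_published_sum("MesaLib-10.4.1.tar.gz", "5311285e791a6bfaa468ad002bd1e1164acb3eaa040b5a1bf958bdb7c27e0a9d") should return True for an intact copy of that archive.</p>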

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82585">Bug 82585</a> - geometry shader with optional out variable segfaults</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82991">Bug 82991</a> - Inverted bumpmap in webgl applications</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83908">Bug 83908</a> - [i965] Incorrect icon colors in Steam Big Picture</li>

				</ul>

				<h2>Changes</h2>

				<p>Andres Gomez (1):</p>

				<ul>

				  <li>i965/brw_reg: struct constructor now needs explicit negate and abs values.</li>

				</ul>

				<p>Cody Northrop (1):</p>

				<ul>

				  <li>i965: Require pixel alignment for GPU copy blit</li>

				</ul>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>docs: Add 10.4 sha256 sums, news item and link release notes</li>

				  <li>Revert "glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)"</li>

				  <li>Update version to 10.4.1</li>

				</ul>

				<p>Ian Romanick (2):</p>

				<ul>

				  <li>linker: Wrap access of producer_var with a NULL check</li>

				  <li>linker: Assign varying locations geometry shader inputs for SSO</li>

				</ul>

				<p>Mario Kleiner (4):</p>

				<ul>

				  <li>glx/dri3: Fix glXWaitForSbcOML() to handle targetSBC==0 correctly. (v2)</li>

				  <li>glx/dri3: Track separate (ust, msc) for PresentPixmap vs. PresentNotifyMsc (v2)</li>

				  <li>glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)</li>

				  <li>glx/dri3: Don't fail on glXSwapBuffersMscOML(dpy, window, 0, 0, 0) (v2)</li>

				</ul>

				<p>Maxence Le Doré (1):</p>

				<ul>

				  <li>glsl: Add gl_MaxViewports to available builtin constants</li>

				</ul>

				</div>

				</body>

				</html>

									
										127

docs/relnotes/10.4.2.html
									
										Normal file
									
												
				@@ -0,0 +1,127 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.4.2 Release Notes / January 12, 2015</h1>

				<p>

				Mesa 10.4.2 is a bug fix release which fixes bugs found since the 10.4.1 release.

				</p>

				<p>

				Mesa 10.4.2 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				e303e77dd774df0d051b2870b165f98c97084a55980f884731df89c1b56a6146  MesaLib-10.4.2.tar.gz

				08a119937d9f2aa2f66dd5de97baffc2a6e675f549e40e699a31f5485d15327f  MesaLib-10.4.2.tar.bz2

				c2c2921a80a3395824f02bee4572a6a17d6a12a928a3e497618eeea04fb06490  MesaLib-10.4.2.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85529">Bug 85529</a> - Surfaces not drawn in Unvanquished</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87619">Bug 87619</a> - Changes to state such as render targets change fragment shader without marking it dirty.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87658">Bug 87658</a> - [llvmpipe] SEGV in sse2_has_daz on ancient Pentium4-M</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87913">Bug 87913</a> - CPU cacheline size of 0 can be returned by CPUID leaf 0x80000006 in some virtual machines</li>

				</ul>

				<h2>Changes</h2>

				<p>Chad Versace (2):</p>

				<ul>

				  <li>i965: Use safer pointer arithmetic in intel_texsubimage_tiled_memcpy()</li>

				  <li>i965: Use safer pointer arithmetic in gather_oa_results()</li>

				</ul>

				<p>Dave Airlie (3):</p>

				<ul>

				  <li>Revert "r600g/sb: fix issues cause by GLSL switching to loops for switch"</li>

				  <li>r600g: fix regression since UCMP change</li>

				  <li>r600g/sb: implement r600 gpr index workaround. (v3.1)</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.4.1 release</li>

				  <li>Update version to 10.4.2</li>

				</ul>

				<p>Ilia Mirkin (2):</p>

				<ul>

				  <li>nv50,nvc0: set vertex id base to index_bias</li>

				  <li>nv50/ir: fix texture offsets in release builds</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>i965: Add missing BRW_NEW_*_PROG_DATA to texture/renderbuffer atoms.</li>

				  <li>i965: Fix start/base_vertex_location for &gt;1 prims but !BRW_NEW_VERTICES.</li>

				</ul>

				<p>Leonid Shatz (1):</p>

				<ul>

				  <li>gallium/util: make sure cache line size is not zero</li>

				</ul>

				<p>Marek Olšák (4):</p>

				<ul>

				  <li>glsl_to_tgsi: fix a bug in copy propagation</li>

				  <li>vbo: ignore primitive restart if FixedIndex is enabled in DrawArrays</li>

				  <li>st/mesa: fix GL_PRIMITIVE_RESTART_FIXED_INDEX</li>

				  <li>radeonsi: fix VertexID for OpenGL</li>

				</ul>

				<p>Michel Dänzer (1):</p>

				<ul>

				  <li>radeonsi: Don't modify PA_SC_RASTER_CONFIG register value if rb_mask == 0</li>

				</ul>

				<p>Roland Scheidegger (1):</p>

				<ul>

				  <li>gallium/util: fix crash with daz detection on x86</li>

				</ul>

				<p>Tiziano Bacocco (1):</p>

				<ul>

				  <li>nv50,nvc0: implement half_pixel_center</li>

				</ul>

				<p>Vadim Girlin (1):</p>

				<ul>

				  <li>r600g/sb: fix issues with loops created for switch</li>

				</ul>

				</div>

				</body>

				</html>

									
										145

docs/relnotes/10.4.3.html
									
										Normal file
									
												
				@@ -0,0 +1,145 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.4.3 Release Notes / January 24, 2015</h1>

				<p>

				Mesa 10.4.3 is a bug fix release which fixes bugs found since the 10.4.2 release.

				</p>

				<p>

				Mesa 10.4.3 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				c53eaafc83d9c6315f63e0904d9954d929b841b0b2be7a328eeb6e14f1376129  MesaLib-10.4.3.tar.gz

				ef6ecc9c2f36c9f78d1662382a69ae961f38f03af3a0c3268e53f351aa1978ad  MesaLib-10.4.3.tar.bz2

				179325fc8ec66529d3b0d0c43ef61a33a44d91daa126c3bbdd1efdfd25a7db1d  MesaLib-10.4.3.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80568">Bug 80568</a> - [gen4] GPU Crash During Google Chrome Operation</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85367">Bug 85367</a> - [gen4] GPU hang in glmark-es2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85696">Bug 85696</a> - r600g+nine: Bioshock shader failure after 7b1c0cbc90d456384b0950ad21faa3c61a6b43ff</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88219">Bug 88219</a> - include/c11/threads_posix.h:197: undefined reference to `pthread_mutex_lock'</li>

				</ul>

				<h2>Changes</h2>

				<p>Axel Davy (39):</p>

				<ul>

				  <li>st/nine: Add new texture format strings</li>

				  <li>st/nine: Correctly advertise D3DPMISCCAPS_CLIPTLVERTS</li>

				  <li>st/nine: NineBaseTexture9: fix setting of last_layer</li>

				  <li>st/nine: CubeTexture: fix GetLevelDesc</li>

				  <li>st/nine: Fix crash when deleting non-implicit swapchain</li>

				  <li>st/nine: Return D3DERR_INVALIDCALL when trying to create a texture of bad format</li>

				  <li>st/nine: NineBaseTexture9: update sampler view creation</li>

				  <li>st/nine: Check if srgb format is supported before trying to use it.</li>

				  <li>st/nine: Add ATI1 and ATI2 support</li>

				  <li>st/nine: Rework of boolean constants</li>

				  <li>st/nine: Convert integer constants to floats before storing them when cards don't support integers</li>

				  <li>st/nine: Remove some shader unused code</li>

				  <li>st/nine: Saturate oFog and oPts vs outputs</li>

				  <li>st/nine: Correctly declare NineTranslateInstruction_Mkxn inputs</li>

				  <li>st/nine: Fix typo for M4x4</li>

				  <li>st/nine: Fix POW implementation</li>

				  <li>st/nine: Handle RSQ special cases</li>

				  <li>st/nine: Handle NRM with input of null norm</li>

				  <li>st/nine: Correct LOG on negative values</li>

				  <li>st/nine: Rewrite LOOP implementation, and a0 aL handling</li>

				  <li>st/nine: Fix CND implementation</li>

				  <li>st/nine: Clamp ps 1.X constants</li>

				  <li>st/nine: Fix some fixed function pipeline operation</li>

				  <li>st/nine: Implement TEXCOORD special behaviours</li>

				  <li>st/nine: Fill missing dst and src number for some instructions.</li>

				  <li>st/nine: Fix TEXM3x3 and implement TEXM3x3VSPEC</li>

				  <li>st/nine: implement TEXM3x2DEPTH</li>

				  <li>st/nine: Implement TEXM3x2TEX</li>

				  <li>st/nine: Implement TEXM3x3SPEC</li>

				  <li>st/nine: Implement TEXDEPTH</li>

				  <li>st/nine: Implement TEXDP3</li>

				  <li>st/nine: Implement TEXDP3TEX</li>

				  <li>st/nine: Implement TEXREG2AR, TEXREG2GB and TEXREG2RGB</li>

				  <li>st/nine: Correct rules for relative adressing and constants.</li>

				  <li>st/nine: Remove unused code for ps</li>

				  <li>st/nine: Fix sm3 relative addressing for non-debug build</li>

				  <li>st/nine: Add variables containing the size of the constant buffers</li>

				  <li>st/nine: Allocate the correct size for the user constant buffer</li>

				  <li>st/nine: Allocate vs constbuf buffer for indirect addressing once.</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.4.2 release</li>

				  <li>Update version to 10.4.3</li>

				</ul>

				<p>Jason Ekstrand (1):</p>

				<ul>

				  <li>mesa: Fix clamping to -1.0 in snorm_to_float</li>

				</ul>

				<p>Jonathan Gray (1):</p>

				<ul>

				  <li>glsl: Link glsl_test with pthreads library.</li>

				</ul>

				<p>Jose Fonseca (1):</p>

				<ul>

				  <li>nine: Drop use of TGSI_OPCODE_CND.</li>

				</ul>

				<p>Kenneth Graunke (2):</p>

				<ul>

				  <li>i965: Respect the no_8 flag on Gen6, not just Gen7+.</li>

				  <li>i965: Work around mysterious Gen4 GPU hangs with minimal state changes.</li>

				</ul>

				<p>Stanislaw Halik (1):</p>

				<ul>

				  <li>st/nine: Hack to generate resource if it doesn't exist when getting view</li>

				</ul>

				<p>Xavier Bouchoux (3):</p>

				<ul>

				  <li>st/nine: Additional defines to d3dtypes.h</li>

				  <li>st/nine: Add missing c++ declaration for IDirect3DVolumeTexture9</li>

				  <li>st/nine: Fix D3DRS_POINTSPRITE support</li>

				</ul>

				</div>

				</body>

				</html>

									
										100

docs/relnotes/10.4.4.html
									
										Normal file
									
												
				@@ -0,0 +1,100 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.4.4 Release Notes / February 06, 2015</h1>

				<p>

				Mesa 10.4.4 is a bug fix release which fixes bugs found since the 10.4.3 release.

				</p>

				<p>

				Mesa 10.4.4 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				5cb427eaf980cb8555953e9928f5797979ed783e277745d5f8cbae8bc5364086  MesaLib-10.4.4.tar.gz

				f18a967e9c4d80e054b2fdff8c130ce6e6d1f8eecfc42c9f354f8628d8b4df1c  MesaLib-10.4.4.tar.bz2

				86baad73b77920c80fe58402a905e7dd17e3ea10ead6ea7d3afdc0a56c860bd7  MesaLib-10.4.4.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88662">Bug 88662</a> - unaligned access to gl_dlist_node</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88930">Bug 88930</a> - [osmesa] osbuffer-&gt;textures should be indexed by attachment type</li>

				</ul>

				<h2>Changes</h2>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>mesa: fix display list 8-byte alignment issue</li>

				</ul>

				<p>Emil Velikov (2):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.4.3 release</li>

				  <li>Update version to 10.4.4</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>egl: Pass the correct X visual depth to xcb_put_image().</li>

				</ul>

				<p>Mario Kleiner (1):</p>

				<ul>

				  <li>glx/dri3: Request non-vsynced Present for swapinterval zero. (v3)</li>

				</ul>

				<p>Matt Turner (1):</p>

				<ul>

				  <li>gallium/util: Don't use __builtin_clrsb in util_last_bit().</li>

				</ul>

				<p>Niels Ole Salscheider (1):</p>

				<ul>

				  <li>configure: Link against all LLVM targets when building clover</li>

				</ul>

				<p>Park, Jeongmin (1):</p>

				<ul>

				  <li>st/osmesa: Fix osbuffer-&gt;textures indexing</li>

				</ul>

				<p>Ville Syrjälä (1):</p>

				<ul>

				  <li>i965: Fix max_wm_threads for CHV</li>

				</ul>

				</div>

				</body>

				</html>

									
										114

docs/relnotes/10.4.5.html
									
										Normal file
									
												
				@@ -0,0 +1,114 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.4.5 Release Notes / February 21, 2015</h1>

				<p>

				Mesa 10.4.5 is a bug fix release which fixes bugs found since the 10.4.4 release.

				</p>

				<p>

				Mesa 10.4.5 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				e12bbdaee9a758617e8ebd0bb0e987f72addd11db2e4da25ba695e386cd63843  MesaLib-10.4.5.tar.gz

				bf60000700a9d58e3aca2bfeee7e781053b0d839e61a95b1883e05a2dee247a0  MesaLib-10.4.5.tar.bz2

				3b926de8eee500bb67cf85332c51292f826cc539b8636382aadbb8e70c76527a  MesaLib-10.4.5.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82477">Bug 82477</a> - [softpipe] piglit fp-long-alu regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88658">Bug 88658</a> - (bisected) Slow video playback on Kabini</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89069">Bug 89069</a> - Lack of grass in The Talos Principle on radeonsi (native\wine\nine)</li>

				</ul>

				<h2>Changes</h2>

				<p>Carl Worth (1):</p>

				<ul>

				  <li>Revert use of Mesa IR optimizer for ARB_fragment_programs</li>

				</ul>

				<p>Emil Velikov (3):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.4.4 release</li>

				  <li>get-pick-list.sh: Require explicit "10.4" for nominating stable patches</li>

				  <li>Update version to 10.4.5</li>

				</ul>

				<p>Ilia Mirkin (3):</p>

				<ul>

				  <li>nvc0: bail out of 2d blits with non-A8_UNORM alpha formats</li>

				  <li>st/mesa: treat resource-less xfb buffers as if they weren't there</li>

				  <li>nvc0: allow holes in xfb target lists</li>

				</ul>

				<p>Jeremy Huddleston Sequoia (2):</p>

				<ul>

				  <li>darwin: build fix</li>

				  <li>darwin: build fix</li>

				</ul>

				<p>Kenneth Graunke (4):</p>

				<ul>

				  <li>i965: Override swizzles for integer luminance formats.</li>

				  <li>i965: Use a gl_color_union for sampler border color.</li>

				  <li>i965: Fix integer border color on Haswell.</li>

				  <li>glsl: Reduce memory consumption of copy propagation passes.</li>

				</ul>

				<p>Laura Ekstrand (1):</p>

				<ul>

				  <li>main: Fixed _mesa_GetCompressedTexImage_sw to copy slices correctly.</li>

				</ul>

				<p>Marek Olšák (5):</p>

				<ul>

				  <li>r600g,radeonsi: don't append to streamout buffers that haven't been used yet</li>

				  <li>radeonsi: fix instanced arrays with non-zero start instance</li>

				  <li>radeonsi: small fix in SPI state</li>

				  <li>mesa: fix AtomicBuffer typo in _mesa_DeleteBuffers</li>

				  <li>radeonsi: fix a crash if a stencil ref state is set before a DSA state</li>

				</ul>

				<p>Michel Dänzer (2):</p>

				<ul>

				  <li>st/mesa: Don't use PIPE_USAGE_STREAM for GL_PIXEL_UNPACK_BUFFER_ARB</li>

				  <li>Revert "radeon/llvm: enable unsafe math for graphics shaders"</li>

				</ul>

				</div>

				</body>

				</html>

									
										143

docs/relnotes/10.4.6.html
									
										Normal file
									
												
				@@ -0,0 +1,143 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.4.6 Release Notes / March 06, 2015</h1>

				<p>

				Mesa 10.4.6 is a bug fix release which fixes bugs found since the 10.4.5 release.

				</p>

				<p>

				Mesa 10.4.6 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				46c9082142e811c01e49a2c332a9ac0a1eb98f2908985fb9df216539d7eaeaf4  MesaLib-10.4.6.tar.gz

				d8baedd20e79ccd98a5a7b05e23d59a30892e68de1fcc057ca6873dafca02735  MesaLib-10.4.6.tar.bz2

				6aded6eac7f0d4d55117b8b581d8424710bbb4c768fc90f7b881f29311a751aa  MesaLib-10.4.6.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=45348">Bug 45348</a> - [swrast] piglit fbo-drawbuffers-arbfp regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84613">Bug 84613</a> - [G965, bisected] piglit regressions : glslparsertest.glsl2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87516">Bug 87516</a> - glProgramBinary violates spec</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88885">Bug 88885</a> - Transform feedback uses incorrect interleaving if a previous draw did not write gl_Position</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89180">Bug 89180</a> - [IVB regression] Rendering issues in Mass Effect through VMware Workstation</li>

				</ul>

				<h2>Changes</h2>

				<p>Abdiel Janulgue (2):</p>

				<ul>

				  <li>glsl: Don't optimize min/max into saturate when EmitNoSat is set</li>

				  <li>st/mesa: For vertex shaders, don't emit saturate when SM 3.0 is unsupported</li>

				</ul>

				<p>Andreas Boll (1):</p>

				<ul>

				  <li>glx: Fix returned values of GLX_RENDERER_PREFERRED_PROFILE_MESA</li>

				</ul>

				<p>Brian Paul (2):</p>

				<ul>

				  <li>swrast: fix multiple color buffer writing</li>

				  <li>st/mesa: fix sampler view reference counting bug in glDraw/CopyPixels</li>

				</ul>

				<p>Chris Forbes (1):</p>

				<ul>

				  <li>i965/gs: Check newly-generated GS-out VUE map against correct stage</li>

				</ul>

				<p>Eduardo Lima Mitev (1):</p>

				<ul>

				  <li>mesa: Fix error validating args for TexSubImage3D</li>

				</ul>

				<p>Emil Velikov (6):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.4.5 release</li>

				  <li>install-lib-links: remove the .install-lib-links file</li>

				  <li>Revert "mesa: Correct backwards NULL check."</li>

				  <li>mesa: cherry-pick the second half of commit 2aa71e9485a</li>

				  <li>Revert "gallivm: Update for RTDyldMemoryManager becoming an unique_ptr."</li>

				  <li>Update version to 10.4.6</li>

				</ul>

				<p>Ian Romanick (3):</p>

				<ul>

				  <li>mesa: Add missing error checks in _mesa_ProgramBinary</li>

				  <li>mesa: Ensure that length is set to zero in _mesa_GetProgramBinary</li>

				  <li>mesa: Always generate GL_INVALID_OPERATION in _mesa_GetProgramBinary</li>

				</ul>

				<p>Jonathan Gray (1):</p>

				<ul>

				  <li>auxilary/os: correct sysctl use in os_get_total_physical_memory()</li>

				</ul>

				<p>José Fonseca (1):</p>

				<ul>

				  <li>gallivm: Update for RTDyldMemoryManager becoming an unique_ptr.</li>

				</ul>

				<p>Leo Liu (1):</p>

				<ul>

				  <li>st/omx/dec/h264: fix picture out-of-order with poc type 0 v2</li>

				</ul>

				<p>Lucas Stach (1):</p>

				<ul>

				  <li>install-lib-links: don't depend on .libs directory</li>

				</ul>

				<p>Marek Olšák (2):</p>

				<ul>

				  <li>vbo: fix an unitialized-variable warning</li>

				  <li>radeonsi: fix point sprites</li>

				</ul>

				<p>Matt Turner (4):</p>

				<ul>

				  <li>glsl: Rewrite and fix min/max to saturate optimization.</li>

				  <li>mesa: Correct backwards NULL check.</li>

				  <li>i965/fs: Don't use backend_visitor::instructions after creating the CFG.</li>

				  <li>mesa: Correct backwards NULL check.</li>

				</ul>

				</div>

				</body>

				</html>

									
										134

docs/relnotes/10.4.7.html
									
										Normal file
									
												
				@@ -0,0 +1,134 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.4.7 Release Notes / March 20, 2015</h1>

				<p>

				Mesa 10.4.7 is a bug fix release which fixes bugs found since the 10.4.6 release.

				</p>

				<p>

				Mesa 10.4.7 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				9e7b59267199658808f8b33e0410b86fbafbdcd52378658b9df65fac9d24947f  MesaLib-10.4.7.tar.gz

				2c351c98671f9a7ab3fd9c601bb7a255801b1580f5dd0992639f99152801b0d2  MesaLib-10.4.7.tar.bz2

				d14ac578b5ce16560757b53fbd1cb4d6b34652f8e110e4b10a019adc82e67ffd  MesaLib-10.4.7.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79202">Bug 79202</a> - valgrind errors in glsl-fs-uniform-array-loop-unroll.shader_test; random code generation</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89156">Bug 89156</a> - r300g: GL_COMPRESSED_RED_RGTC1 / ATI1N support broken</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89224">Bug 89224</a> - Incorrect rendering of Unigine Valley running in VM on VMware Workstation</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89530">Bug 89530</a> - FTBFS in loader: missing fstat</li>

				</ul>

				<h2>Changes</h2>

				<p>Andrey Sudnik (1):</p>

				<ul>

				  <li>i965/vec4: Don't lose the saturate modifier in copy propagation.</li>

				</ul>

				<p>Daniel Stone (1):</p>

				<ul>

				  <li>egl: Take alpha bits into account when selecting GBM formats</li>

				</ul>

				<p>Emil Velikov (6):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.4.6 release</li>

				  <li>cherry-ignore: add not applicable/rejected commits</li>

				  <li>mesa: rename format_info.c to format_info.h</li>

				  <li>loader: include &lt;sys/stat.h&gt; for non-sysfs builds</li>

				  <li>auxiliary/os: fix the android build - s/drm_munmap/os_munmap/</li>

				  <li>Update version to 10.4.7</li>

				</ul>

				<p>Iago Toral Quiroga (1):</p>

				<ul>

				  <li>i965: Fix out-of-bounds accesses into pull_constant_loc array</li>

				</ul>

				<p>Ilia Mirkin (4):</p>

				<ul>

				  <li>freedreno: move fb state copy after checking for size change</li>

				  <li>freedreno/ir3: fix array count returned by TXQ</li>

				  <li>freedreno/ir3: get the # of miplevels from getinfo</li>

				  <li>freedreno: fix slice pitch calculations</li>

				</ul>

				<p>Marc-Andre Lureau (1):</p>

				<ul>

				  <li>gallium/auxiliary/indices: fix start param</li>

				</ul>

				<p>Marek Olšák (4):</p>

				<ul>

				  <li>r300g: fix RGTC1 and LATC1 SNORM formats</li>

				  <li>r300g: fix a crash when resolving into an sRGB texture</li>

				  <li>r300g: fix sRGB-&gt;sRGB blits</li>

				  <li>radeonsi: increase coords array size for radeon_llvm_emit_prepare_cube_coords</li>

				</ul>

				<p>Mario Kleiner (1):</p>

				<ul>

				  <li>glx: Handle out-of-sequence swap completion events correctly. (v2)</li>

				</ul>

				<p>Matt Turner (2):</p>

				<ul>

				  <li>r300g: Use PATH_MAX instead of limiting ourselves to 100 chars.</li>

				  <li>r300g: Check return value of snprintf().</li>

				</ul>

				<p>Rob Clark (2):</p>

				<ul>

				  <li>freedreno/ir3: fix silly typo for binning pass shaders</li>

				  <li>freedreno: update generated headers</li>

				</ul>

				<p>Samuel Iglesias Gonsalvez (1):</p>

				<ul>

				  <li>glsl: optimize (0 cmp x + y) into (-x cmp y).</li>

				</ul>

				<p>Stefan Dösinger (1):</p>

				<ul>

				  <li>r300g: Fix the ATI1N swizzle (RGTC1 and LATC1)</li>

				</ul>

				</div>

				</body>

				</html>

									
										259

docs/relnotes/10.4.html
									
										Normal file
									
												
				@@ -0,0 +1,259 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.4 Release Notes / December 14, 2014</h1>

				<p>

				Mesa 10.4 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 10.4.1.

				</p>

				<p>

				Mesa 10.4 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				abfbfd2d91ce81491c5bb6923ae649212ad5f82d0bee277de8704cc948dc221e  MesaLib-10.4.0.tar.gz

				98a7dff3a1a6708c79789de8b9a05d8042e867067f70e8f30387c15026233219  MesaLib-10.4.0.tar.bz2

				443a6d46d0691b5ac811d8d30091b1716c365689b16d49c57cf273c2b76086fe  MesaLib-10.4.0.zip

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>GL_ARB_conditional_render_inverted on nv50</li>

				<li>GL_ARB_sample_shading on r600</li>

				<li>GL_ARB_texture_view on nv50, nvc0</li>

				<li>GL_ARB_clip_control on nv50, nvc0, r300, r600, radeonsi, llvmpipe, softpipe</li>

				<li>GL_KHR_context_flush_control on all drivers</li>

				</ul>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79963">Bug 79963</a> - [ILK Bisected]some piglit and ogles2conform cases fail </li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=29661">Bug 29661</a> - MSVC built u_format_test fails on Windows</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=38873">Bug 38873</a> - [855gm] gnome-shell misrendered</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=54372">Bug 54372</a> - GLX_INTEL_swap_event crashes driver when swapping window buffers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60879">Bug 60879</a> - [radeonsi] X11 can't start with acceleration enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=61415">Bug 61415</a> - Clover ignores --with-opencl-libdir path</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64471">Bug 64471</a> - Radeon HD6570 lockup in Brütal Legend with HyperZ</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66184">Bug 66184</a> - src/mesa/state_tracker/st_glsl_to_tgsi.cpp:3216:simplify_cmp: Assertion `inst-&gt;dst.index &lt; 4096' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=67672">Bug 67672</a> - [llvmpipe] lp_test_arit fails on old CPUs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69200">Bug 69200</a> - [Bisected]Piglit glx/glx-multithread-shader-compile aborted</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70410">Bug 70410</a> - egl-static/Makefile: linking fails with llvm &gt;= 3.4</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72685">Bug 72685</a> - [radeonsi hyperz] Artifacts in Unigine Sanctuary</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72819">Bug 72819</a> - [855GM] Incorrect drop shadow color on windows and strange white rectangle when showing/hiding GLX-dock...</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74563">Bug 74563</a> - Surfaceless contexts are not properly released by DRI drivers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=74863">Bug 74863</a> - [r600g] HyperZ broken on RV770 and CYPRESS (Left 4 Dead 2 trees corruption) bisected!</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75011">Bug 75011</a> - [hyperz] Performance drop since git-01e6371 (disable hyperz by default) with radeonsi</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=75112">Bug 75112</a> - Meta Bug for HyperZ issues on r600g and radeonsi</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76252">Bug 76252</a> - Dynamic loading/unloading of opengl32.dll results in a deadlock</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=76861">Bug 76861</a> - mid3 generates slow code for constant arguments</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77957">Bug 77957</a> - Variably-indexed constant arrays result in terrible shader code</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78468">Bug 78468</a> - Compiling of shader gets stuck in infinite loop</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78770">Bug 78770</a> - [SNB bisected]Webglc conformance/textures/texture-size-limit.html fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79155">Bug 79155</a> - [Tesseract Game] Global Illumination: Medium Causes Color Distortion</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79462">Bug 79462</a> - [NVC0/Codegen] Shader compilation fails in spill logic</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80011">Bug 80011</a> - [softpipe] tgsi/tgsi_exec.c:2023:exec_txf: Assertion `0' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80012">Bug 80012</a> - [softpipe] draw/draw_gs.c:113:tgsi_fetch_gs_outputs: Assertion `!util_is_inf_or_nan(output[slot][0])' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80050">Bug 80050</a> - [855GM] Incorrect drop shadow color under windows in Cinnamon persists with MESA 10.1.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80247">Bug 80247</a> - Khronos conformance test  ES3-CTS.gtf.GL3Tests.transform_feedback.transform_feedback_vertex_id  fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80561">Bug 80561</a> - Incorrect implementation of some VDPAU APIs.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80615">Bug 80615</a> - Files in bellagio directory [omx tracker] don't respect installation folder</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80848">Bug 80848</a> - [dri3] Building mesa fails with dri3 enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=81680">Bug 81680</a> - [r600g] Firefox crashes with hardware acceleration turned on</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82255">Bug 82255</a> - [VP2] Chroma planes are vertically stretched during VDPAU playback</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82472">Bug 82472</a> - piglit 16385-consecutive-chars regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82537">Bug 82537</a> - Stunt Rally GLSL compiler assertion failure</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82538">Bug 82538</a> - Super Maryo Chronicles fails with st/mesa assertion failure</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82539">Bug 82539</a> - vmw_screen_dri.lo In file included from vmw_screen_dri.c:41: vmwgfx_drm.h:32:17: error: drm.h: No such file or directory</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82796">Bug 82796</a> - [IVB/BYT-M/HSW/BDW Bisected]Synmark2_v6.0_OglTerrainFlyInst/OglTerrainPanInst cannot run as image validation failed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82804">Bug 82804</a> - unreal engine 4 rendering errors</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82828">Bug 82828</a> - Regression: Crash in 3Dmark2001</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82846">Bug 82846</a> - [BDW Bisected] Gpu hang when running Lightsmark v2008/Warsow v1.0/Xonotic v0.7/unigine-demos</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82881">Bug 82881</a> - test_vec4_register_coalesce regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82882">Bug 82882</a> - [swrast] piglit glsl-fs-uniform-bool-1 regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82921">Bug 82921</a> - layout(location=0) emits error &gt;= MAX_UNIFORM_LOCATIONS due to integer underflow</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82929">Bug 82929</a> - [BDW Bisected]glxgears causes X hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82932">Bug 82932</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.shaders.indexing.vector_subscript.vec3_static_loop_subscript_write_direct_read_vertex fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83079">Bug 83079</a> - [NVC0] Dota 2 (Linux native and Wine) crash with Nouveau Drivers</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83080">Bug 83080</a> - [SNB+ Bisected]ES3-CTS.shaders.loops.do_while_constant_iterations.mixed_break_continue_fragment fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83081">Bug 83081</a> - [BDW Bisected]Piglit spec_ARB_sample_shading_builtin-gl-sample-mask_2 is core dumped</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83127">Bug 83127</a> - [ILK Bisected]Piglit glean_texCombine fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83148">Bug 83148</a> - Unity invisible under Ubuntu 14.04 and 14.10</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83355">Bug 83355</a> - FTBFS: src/mesa/program/program_lexer.l:122:64: error: unknown type name 'YYSTYPE'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83380">Bug 83380</a> - Linking fails when not writing gl_Position.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83418">Bug 83418</a> - EU IV is incorrectly rendered after git1409011930.d571f2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83432">Bug 83432</a> - r600_query.c:269:r600_emit_query_end: Assertion `ctx-&gt;num_pipelinestat_queries &gt; 0' failed [Gallium HUD]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83463">Bug 83463</a> - [swrast] piglit glsl-vs-clamp-1 regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83468">Bug 83468</a> - [UBO] Using bool from UBO as if-statement condition asserts</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83500">Bug 83500</a> - si_dma_copy_tile causes GPU hangs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83506">Bug 83506</a> - [UBO] row_major layout ignored inside structures</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83533">Bug 83533</a> - [UBO] nested structures don't get appropriate padding</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83573">Bug 83573</a> - [swrast] piglit fs-op-not-bool-using-if regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83574">Bug 83574</a> - [llvmpipe] [softpipe] piglit arb_explicit_uniform_location-use-of-unused-loc regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83741">Bug 83741</a> - [UBO] row_major layout partially ignored for arrays of structures</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83777">Bug 83777</a> - [regression] ilo fails to build</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83934">Bug 83934</a> - Structures must have same name to be considered same type.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84140">Bug 84140</a> - mplayer crashes playing some files using vdpau output</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84145">Bug 84145</a> - UE4: Realistic Rendering Demo render blue</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84178">Bug 84178</a> - Big glamor regression in Xorg server 1.6.99.1 GIT: x11perf 1.5 Test: PutImage XY 500x500 Square</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84355">Bug 84355</a> - texture2DProjLod and textureCubeLod are not supported when using GLES.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84529">Bug 84529</a> - [IVB bisected] glean fragProg1 CMP test failed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84538">Bug 84538</a> - lp_test_format.c:226:4: error: too few arguments to function ‘gallivm_create’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84539">Bug 84539</a> - brw_fs_register_coalesce.cpp:183: bool fs_visitor::register_coalesce(): Assertion `src_size &lt;= 11' failed.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84557">Bug 84557</a> - [HSW] &quot;Emit ELSE/ENDIF JIP with type D on Gen 7&quot; causes Atomic Afterlife and GPU hangs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84651">Bug 84651</a> - Distorted graphics or black window when running Battle.net app on Intel hardware via wine</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84662">Bug 84662</a> - Long pauses with Unreal demo Elemental on R9270X since : Always flush the HDP cache before submitting a CS to the GPU</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84777">Bug 84777</a> - [BSW]Piglit spec_glsl-1.50_execution_geometry-basic fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84807">Bug 84807</a> - Build issue starting between bf4aecfb2acc8d0dc815105d2f36eccbc97c284b and a3e9582f09249ad27716ba82c7dfcee685b65d51</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85189">Bug 85189</a> - llvm/invocation.cpp: In function 'void {anonymous}::optimize(llvm::Module*, unsigned int, const std::vector&lt;llvm::Function*&gt;&amp;)': llvm/invocation.cpp:324:18: error: expected type-specifier</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85267">Bug 85267</a> - vlc crashes with vdpau (Radeon 3850HD) [r600]</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85377">Bug 85377</a> - lp_test_format failure with llvm-3.6</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85425">Bug 85425</a> - [bisected] Compiler error in clip control operations in meta</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85429">Bug 85429</a> - indirect.c:296: multiple definition of `__indirect_glNewList'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85454">Bug 85454</a> - Unigine Sanctuary with Wine crashes on Mesa Git</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85647">Bug 85647</a> - Random radeonsi crashes with mesa 10.3.x</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85683">Bug 85683</a> - [i965 Bisected]Piglit shaders_glsl-vs-raytrace-bug26691 segfault</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85691">Bug 85691</a> - 'glsl: Drop constant 0.0 components from dot products.' broke piglit shaders/glsl-gnome-shell-dim-window and a few others with Gallium</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86025">Bug 86025</a> - src\glsl\list.h(535) : error C2143: syntax error : missing ';' before 'type'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86089">Bug 86089</a> - [r600g][mesa 10.4.0-dev] shader failure - r600_sb::bc_finalizer::cf_peephole() when starting Second Life</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86145">Bug 86145</a> - Pipeline statistic counter values for VF always 0</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86618">Bug 86618</a> - [NV96] neg modifiers not working in MIN and MAX operations</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86760">Bug 86760</a> - mesa doesn't build: recipe for target 'r600_llvm.lo' failed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86764">Bug 86764</a> - [SNB+ Bisected]Piglit glean/pointSprite fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86788">Bug 86788</a> - (bisected) 32bit UrbanTerror 4.1 timedemo sse4.1 segfault...</li>

				</ul>

				<h2>Changes</h2>

				<ul>

				<li>The environment variable GALLIUM_MSAA that forced a multisample GLX visual was removed.</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/10.5.0.html
									
												
				@@ -0,0 +1,212 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.5.0 Release Notes / March 06, 2015</h1>

				<p>

				Mesa 10.5.0 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 10.5.1.

				</p>

				<p>

				Mesa 10.5.0 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				2bb6e2e982ee4d8264d52d638c2a4e3f8a164190336d72d4e34ae1304d87ed91  mesa-10.5.0.tar.gz

				d7ca9f9044bbdd674377e3eebceef1fae339c8817b9aa435c2053e4fea44e5d3  mesa-10.5.0.tar.xz

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>GL_ARB_framebuffer_sRGB on freedreno</li>

				<li>GL_ARB_texture_rg on freedreno</li>

				<li>GL_EXT_packed_float on freedreno</li>

				<li>GL_EXT_polygon_offset_clamp on i965, nv50, nvc0, r600, radeonsi, llvmpipe</li>

				<li>GL_EXT_texture_shared_exponent on freedreno</li>

				<li>GL_EXT_texture_snorm on freedreno</li>

				</ul>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=10370">Bug 10370</a> - Incorrect pixels read back if draw bitmap texture through Display list</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=45348">Bug 45348</a> - [swrast] piglit fbo-drawbuffers-arbfp regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=60879">Bug 60879</a> - [radeonsi] X11 can't start with acceleration enabled</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=67672">Bug 67672</a> - [llvmpipe] lp_test_arit fails on old CPUs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77544">Bug 77544</a> - i965: Try to use LINE instructions to perform MAD with immediate arguments</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=78770">Bug 78770</a> - [SNB bisected]Webglc conformance/textures/texture-size-limit.html fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=80568">Bug 80568</a> - [gen4] GPU Crash During Google Chrome Operation</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82477">Bug 82477</a> - [softpipe] piglit fp-long-alu regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82585">Bug 82585</a> - geometry shader with optional out variable segfaults</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=82991">Bug 82991</a> - Inverted bumpmap in webgl applications</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83463">Bug 83463</a> - [swrast] piglit glsl-vs-clamp-1 regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83500">Bug 83500</a> - si_dma_copy_tile causes GPU hangs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83510">Bug 83510</a> - Graphical glitches in Unreal Engine 4</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=83908">Bug 83908</a> - [i965] Incorrect icon colors in Steam Big Picture</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84212">Bug 84212</a> - [BSW]ES3-CTS.shaders.loops.do_while_dynamic_iterations.vector_counter_vertex fails and causes GPU hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84651">Bug 84651</a> - Distorted graphics or black window when running Battle.net app on Intel hardware via wine</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84777">Bug 84777</a> - [BSW]Piglit spec_glsl-1.50_execution_geometry-basic fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85367">Bug 85367</a> - [gen4] GPU hang in glmark-es2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85467">Bug 85467</a> - [llvmpipe] piglit gl-1.0-dlist-beginend failure with llvm-3.6.0svn</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85529">Bug 85529</a> - Surfaces not drawn in Unvanquished</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85647">Bug 85647</a> - Random radeonsi crashes with mesa 10.3.x</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=85696">Bug 85696</a> - r600g+nine: Bioshock shader failure after 7b1c0cbc90d456384b0950ad21faa3c61a6b43ff</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86089">Bug 86089</a> - [r600g][mesa 10.4.0-dev] shader failure - r600_sb::bc_finalizer::cf_peephole() when starting Second Life</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86618">Bug 86618</a> - [NV96] neg modifiers not working in MIN and MAX operations</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86760">Bug 86760</a> - mesa doesn't build: recipe for target 'r600_llvm.lo' failed</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86764">Bug 86764</a> - [SNB+ Bisected]Piglit glean/pointSprite fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86788">Bug 86788</a> - (bisected) 32bit UrbanTerror 4.1 timedemo sse4.1 segfault...</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86811">Bug 86811</a> - [BDW/BSW Bisected]Piglit spec_arb_shading_language_packing_execution_built-in-functions_vs-unpackSnorm4x8 fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86837">Bug 86837</a> - kodi segfault since auxiliary/vl: rework the build of the VL code</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86939">Bug 86939</a> - test_vf_float_conversions.cpp:63:12: error: expected primary-expression before ‘union’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86944">Bug 86944</a> - glsl_parser_extras.cpp&quot;, line 1455: Error: Badly formed expression. (Oracle Studio)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86958">Bug 86958</a> - lp_bld_misc.cpp:503:40: error: no matching function for call to ‘llvm::EngineBuilder::setMCJITMemoryManager(ShaderMemoryManager*&amp;)’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86969">Bug 86969</a> - _drm_intel_gem_bo_references() function takes half the CPU with Witcher2 game</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87076">Bug 87076</a> - Dead Island needs allow_glsl_extension_directive_midshader</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87516">Bug 87516</a> - glProgramBinary violates spec</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87619">Bug 87619</a> - Changes to state such as render targets change fragment shader without marking it dirty.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87658">Bug 87658</a> - [llvmpipe] SEGV in sse2_has_daz on ancient Pentium4-M</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87694">Bug 87694</a> - [SNB] Crash in brw_begin_transform_feedback</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87886">Bug 87886</a> - constant fps drops with Intel and Radeon</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87887">Bug 87887</a> - [i965 Bisected]ES2-CTS.gtf.GL.cos.cos_float_vert_xvary fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=87913">Bug 87913</a> - CPU cacheline size of 0 can be returned by CPUID leaf 0x80000006 in some virtual machines</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88079">Bug 88079</a> - dEQP-GLES3.functional.fbo.completeness.renderable.renderbuffer.color0 tests fail due to enabling of GL_RGB and GL_RGBA</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88170">Bug 88170</a> - 32 bits opengl apps crash with latest llvm 3.6 git / mesa git / radeonsi</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88219">Bug 88219</a> - include/c11/threads_posix.h:197: undefined reference to `pthread_mutex_lock'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88227">Bug 88227</a> - Radeonsi: High GTT usage in Prison Architect large map</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88248">Bug 88248</a> - Calling glClear while there is an occlusion query in progress messes up the results</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88335">Bug 88335</a> - format_pack.c:9567:22: error: expected ')'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88385">Bug 88385</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels core dumped</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88467">Bug 88467</a> - nir.c:140: error: ‘nir_src’ has no member named ‘ssa’</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88478">Bug 88478</a> - #error &quot;&lt;malloc.h&gt; has been replaced by &lt;stdlib.h&gt;&quot;</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88519">Bug 88519</a> - sha1.c:210:22: error: 'grcy_md_hd_t' undeclared (first use in this function)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88523">Bug 88523</a> - sha1.c:37: error: 'SHA1_CTX' undeclared (first use in this function)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88561">Bug 88561</a> - [radeonsi][regression,bisected] Depth test/buffer issues in Portal</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88658">Bug 88658</a> - (bisected) Slow video playback on Kabini</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88662">Bug 88662</a> - unaligned access to gl_dlist_node</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88783">Bug 88783</a> - FTBFS: Clover: src/gallium/state_trackers/clover/llvm/invocation.cpp:335:49: error: no matching function for call to 'llvm::TargetLibraryInfo::TargetLibraryInfo(llvm::Triple)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88792">Bug 88792</a> - [BDW/BSW Bisected]Piglit spec_ARB_pixel_buffer_object_pbo-read-argb8888 fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88806">Bug 88806</a> - nir/nir_constant_expressions.c:2754:15: error: controlling expression type 'unsigned int' not compatible with any generic association type</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88841">Bug 88841</a> - [SNB/IVB/HSW/BDW Bisected]Piglit spec_EGL_NOK_texture_from_pixmap_basic fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88852">Bug 88852</a> - macros.h(181) : error C2143: syntax error : missing '{' before 'enum [tag]'</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88905">Bug 88905</a> - [SNB+ Bisected]Ogles3conform ES3-CTS.gtf.GL3Tests.packed_pixels.packed_pixels fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88930">Bug 88930</a> - [osmesa] osbuffer-&gt;textures should be indexed by attachment type</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88962">Bug 88962</a> - [osmesa] Crash on postprocessing if z buffer is NULL</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89032">Bug 89032</a> - [BDW/BSW/SKL Bisected]Piglit spec_OpenGL_1.1_infinite-spot-light fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89037">Bug 89037</a> - [SKL]Piglit spec_EXT_texture_array_copyteximage_1D_ARRAY_samples=2 sporadically causes GPU hang</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89068">Bug 89068</a> - glTexImage2D regression by texstore_rgba switch to _mesa_format_convert</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89069">Bug 89069</a> - Lack of grass in The Talos Principle on radeonsi (native\wine\nine)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89180">Bug 89180</a> - [IVB regression] Rendering issues in Mass Effect through VMware Workstation</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86330">Bug 86330</a> - lp_bld_debug.cpp:112: multiple definition of `raw_debug_ostream::write_impl(char const*, unsigned long)'</li>

				</ul>

				<h2>Changes</h2>

				<ul>

				<li>Removed support for GCC versions earlier than 4.2.0.</li>

				</ul>

				</div>

				</body>

				</html>

									

docs/relnotes/10.5.1.html
									
												
				@@ -0,0 +1,217 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.5.1 Release Notes / March 13, 2015</h1>

				<p>

				Mesa 10.5.1 is a bug fix release which fixes bugs found since the 10.5.0 release.

				</p>

				<p>

				Mesa 10.5.1 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				b5b6256a6d46023e16a675257fd11a0f94d7b3e60a76cf112952da3d0fef8e9b  mesa-10.5.1.tar.gz

				ffc51943d15c6812ee7611d053d8980a683fbd6a4986cff567b12cc66637d679  mesa-10.5.1.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=79202">Bug 79202</a> - valgrind errors in glsl-fs-uniform-array-loop-unroll.shader_test; random code generation</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84613">Bug 84613</a> - [G965, bisected] piglit regressions : glslparsertest.glsl2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86747">Bug 86747</a> - Noise in Football Manager 2014 textures</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=86974">Bug 86974</a> - INTEL_DEBUG=shader_time always asserts in fs_generator::generate_code() when Mesa is built with --enable-debug (= with asserts)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88246">Bug 88246</a> - Commit 2881b12 causes 43 DrawElements test regressions</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88793">Bug 88793</a> - [BDW/BSW Bisected]Piglit/shaders_glsl-max-varyings fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88883">Bug 88883</a> - ir-a2xx.c: variable changed in assert statement</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88885">Bug 88885</a> - Transform feedback uses incorrect interleaving if a previous draw did not write gl_Position</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89095">Bug 89095</a> - [SNB/IVB/BYT Bisected]Webglc conformance/glsl/functions/glsl-function-mix-float.html fails</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89156">Bug 89156</a> - r300g: GL_COMPRESSED_RED_RGTC1 / ATI1N support broken</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89224">Bug 89224</a> - Incorrect rendering of Unigine Valley running in VM on VMware Workstation</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89292">Bug 89292</a> - [regression,bisected] incomplete screenshots in some cases</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89311">Bug 89311</a> - [regression, bisected] dEQP: Added entry points for glCompressedTextureSubImage*D.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89312">Bug 89312</a> - [regression, bisected] main: Added entry points for CopyTextureSubImage*D. (d6b7c40cecfe01)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89315">Bug 89315</a> - [HSW, regression, bisected] i965/fs: Emit MAD instructions when possible.</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89317">Bug 89317</a> - [HSW, regression, bisected] i965: Add LINTERP/CINTERP to can_do_cmod() (d91390634)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89416">Bug 89416</a> - UE4Editor crash after load project</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89430">Bug 89430</a> - [g965][bisected] arb_copy_image-targets gl_texture* tests fail</li>

				</ul>

				<h2>Changes</h2>

				<p>Andrey Sudnik (1):</p>

				<ul>

				  <li>i965/vec4: Don't lose the saturate modifier in copy propagation.</li>

				</ul>

				<p>Chris Forbes (1):</p>

				<ul>

				  <li>i965/gs: Check newly-generated GS-out VUE map against correct stage</li>

				</ul>

				<p>Daniel Stone (1):</p>

				<ul>

				  <li>egl: Take alpha bits into account when selecting GBM formats</li>

				</ul>

				<p>Emil Velikov (5):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.5.0 release</li>

				  <li>egl/main: no longer export internal function</li>

				  <li>cherry-ignore: ignore a few more commits picked without -x</li>

				  <li>mapi: fix commit 90411b56f6bc817e229d8801ac0adad6d4e3fb7a</li>

				  <li>Update version to 10.5.1</li>

				</ul>

				<p>Frank Henigman (1):</p>

				<ul>

				  <li>intel: fix EGLImage renderbuffer _BaseFormat</li>

				</ul>

				<p>Iago Toral Quiroga (1):</p>

				<ul>

				  <li>i965: Fix out-of-bounds accesses into pull_constant_loc array</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>i965/fs/nir: Use emit_math for nir_op_fpow</li>

				</ul>

				<p>Ilia Mirkin (3):</p>

				<ul>

				  <li>freedreno: move fb state copy after checking for size change</li>

				  <li>freedreno/ir3: fix array count returned by TXQ</li>

				  <li>freedreno/ir3: get the # of miplevels from getinfo</li>

				</ul>

				<p>Jason Ekstrand (2):</p>

				<ul>

				  <li>meta/TexSubImage: Stash everything other than PIXEL_TRANSFER/store in meta_begin</li>

				  <li>main/base_tex_format: Properly handle STENCIL_INDEX1/4/16</li>

				</ul>

				<p>Kenneth Graunke (8):</p>

				<ul>

				  <li>i965: Split Gen4-5 BlitFramebuffer code; prefer BLT over Meta.</li>

				  <li>glsl: Mark array access when copying to a temporary for the ?: operator.</li>

				  <li>i965/fs: Set force_writemask_all on shader_time instructions.</li>

				  <li>i965/fs: Set smear on shader_time diff register.</li>

				  <li>i965/fs: Make emit_shader_time_write return rather than emit.</li>

				  <li>i965/fs: Make get_timestamp() pass back the MOV rather than emitting it.</li>

				  <li>i965/fs: Make emit_shader_time_end() insert before EOT.</li>

				  <li>i965/fs: Don't issue FB writes for bound but unwritten color targets.</li>

				</ul>

				<p>Laura Ekstrand (2):</p>

				<ul>

				  <li>main: Fix target checking for CompressedTexSubImage*D.</li>

				  <li>main: Fix target checking for CopyTexSubImage*D.</li>

				</ul>

				<p>Marc-Andre Lureau (1):</p>

				<ul>

				  <li>gallium/auxiliary/indices: fix start param</li>

				</ul>

				<p>Marek Olšák (3):</p>

				<ul>

				  <li>r300g: fix RGTC1 and LATC1 SNORM formats</li>

				  <li>r300g: fix a crash when resolving into an sRGB texture</li>

				  <li>r300g: fix sRGB-&gt;sRGB blits</li>

				</ul>

				<p>Matt Turner (12):</p>

				<ul>

				  <li>i965/vec4: Fix implementation of i2b.</li>

				  <li>mesa: Indent break statements and add a missing one.</li>

				  <li>mesa: Free memory allocated for luminance in readpixels.</li>

				  <li>mesa: Correct backwards NULL check.</li>

				  <li>i965: Consider scratch writes to have side effects.</li>

				  <li>i965/fs: Don't use backend_visitor::instructions after creating the CFG.</li>

				  <li>r300g: Use PATH_MAX instead of limiting ourselves to 100 chars.</li>

				  <li>r300g: Check return value of snprintf().</li>

				  <li>i965/fs: Don't propagate cmod to inst with different type.</li>

				  <li>i965: Tell intel_get_memcpy() which direction the memcpy() is going.</li>

				  <li>Revert SHA1 additions.</li>

				  <li>i965: Avoid applying negate to wrong MAD source.</li>

				</ul>

				<p>Neil Roberts (4):</p>

				<ul>

				  <li>meta: In pbo_{Get,}TexSubImage don't repeatedly rebind the source tex</li>

				  <li>Revert "common: Fix PBOs for 1D_ARRAY."</li>

				  <li>meta: Allow GL_UN/PACK_IMAGE_HEIGHT in _mesa_meta_pbo_Get/TexSubImage</li>

				  <li>meta: Fix the y offset for 1D_ARRAY in _mesa_meta_pbo_TexSubImage</li>

				</ul>

				<p>Rob Clark (11):</p>

				<ul>

				  <li>freedreno/ir3: fix silly typo for binning pass shaders</li>

				  <li>freedreno/a2xx: fix increment in assert</li>

				  <li>freedreno/a4xx: bit of cleanup</li>

				  <li>freedreno: update generated headers</li>

				  <li>freedreno/a4xx: set PC_PRIM_VTX_CNTL.VAROUT properly</li>

				  <li>freedreno: update generated headers</li>

				  <li>freedreno/a4xx: aniso filtering</li>

				  <li>freedreno/ir3: fix up cat6 instruction encodings</li>

				  <li>freedreno/ir3: add support for memory (cat6) instructions</li>

				  <li>freedreno/ir3: handle flat bypass for a4xx</li>

				  <li>freedreno/ir3: fix failed assert in grouping</li>

				</ul>

				<p>Stefan Dösinger (1):</p>

				<ul>

				  <li>r300g: Fix the ATI1N swizzle (RGTC1 and LATC1)</li>

				</ul>

				</div>

				</body>

				</html>

									
										130

docs/relnotes/10.5.2.html
									
										Normal file
									
												
				@@ -0,0 +1,130 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.5.2 Release Notes / March 28, 2015</h1>

				<p>

				Mesa 10.5.2 is a bug fix release which fixes bugs found since the 10.5.1 release.

				</p>

				<p>

				Mesa 10.5.2 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>SHA256 checksums</h2>

				<pre>

				755220e160a9f22fda0dffd47746f997b6e196d03f8edc390df7793aecaaa541  mesa-10.5.2.tar.gz

				2f4b6fb77c3e7d6f861558d0884a3073f575e1e673dad8d1b0624e78e9c4dd44  mesa-10.5.2.tar.xz

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=88534">Bug 88534</a> - include/c11/threads_posix.h PTHREAD_MUTEX_RECURSIVE_NP not defined</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89328">Bug 89328</a> - python required to build Mesa release tarballs</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89530">Bug 89530</a> - FTBFS in loader: missing fstat</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89590">Bug 89590</a> - Crash in glLinkProgram with shaders with multiple constant arrays</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89680">Bug 89680</a> - Hard link exist in Mesa 10.5.1 sources</li>

				</ul>

				<h2>Changes</h2>

				<p>Anuj Phogat (1):</p>

				<ul>

				  <li>glsl: Generate link error for non-matching gl_FragCoord redeclarations</li>

				</ul>

				<p>Emil Velikov (7):</p>

				<ul>

				  <li>docs: Add sha256 sums for the 10.5.1 release</li>

				  <li>automake: add missing egl files to the tarball</li>

				  <li>st/egl: don't ship the dri2.c link at the tarball</li>

				  <li>loader: include &lt;sys/stat.h&gt; for non-sysfs builds</li>

				  <li>auxiliary/os: fix the android build - s/drm_munmap/os_munmap/</li>

				  <li>cherry-ignore: add commit non applicable for 10.5</li>

				  <li>Update version to 10.5.2</li>

				</ul>

				<p>Felix Janda (1):</p>

				<ul>

				  <li>c11/threads: Use PTHREAD_MUTEX_RECURSIVE by default</li>

				</ul>

				<p>Francisco Jerez (1):</p>

				<ul>

				  <li>i965: Set nr_params to the number of uniform components in the VS/GS path.</li>

				</ul>

				<p>Ilia Mirkin (2):</p>

				<ul>

				  <li>freedreno/a3xx: use the same layer size for all slices</li>

				  <li>freedreno: fix slice pitch calculations</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>radeonsi: increase coords array size for radeon_llvm_emit_prepare_cube_coords</li>

				</ul>

				<p>Mario Kleiner (2):</p>

				<ul>

				  <li>glx: Handle out-of-sequence swap completion events correctly. (v2)</li>

				  <li>mapi: Make private copies of name strings provided by client.</li>

				</ul>

				<p>Rob Clark (1):</p>

				<ul>

				  <li>freedreno: update generated headers</li>

				</ul>

				<p>Samuel Iglesias Gonsalvez (2):</p>

				<ul>

				  <li>glsl: optimize (0 cmp x + y) into (-x cmp y).</li>

				  <li>configure: Introduce new output variable to ax_check_python_mako_module.m4</li>

				</ul>

				<p>Tapani Pälli (1):</p>

				<ul>

				  <li>glsl: fix names in lower_constant_arrays_to_uniforms</li>

				</ul>

				<p>Tom Stellard (1):</p>

				<ul>

				  <li>clover: Return 0 as storage size for local kernel args that are not set v2</li>

				</ul>

				</div>

				</body>

				</html>

									
										77

docs/relnotes/10.6.0.html
									
										Normal file
									
												
				@@ -0,0 +1,77 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 10.6.0 Release Notes / TBD</h1>

				<p>

				Mesa 10.6.0 is a new development release.

				People who are concerned with stability and reliability should stick

				with a previous release or wait for Mesa 10.6.1.

				</p>

				<p>

				Mesa 10.6.0 implements the OpenGL 3.3 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.3.  OpenGL

				3.3 is <strong>only</strong> available if requested at context creation

				because compatibility contexts are not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				TBD.

				</pre>

				<h2>New features</h2>

				<p>

				Note: some of the new features are only available with certain drivers.

				</p>

				<ul>

				<li>GL_AMD_pinned_memory on r600, radeonsi</li>

				<li>GL_ARB_draw_indirect, GL_ARB_multi_draw_indirect on r600</li>

				<li>GL_ARB_draw_instanced on freedreno</li>

				<li>GL_ARB_gpu_shader_fp64 on nvc0, softpipe</li>

				<li>GL_ARB_instanced_arrays on freedreno</li>

				<li>GL_ARB_pipeline_statistics_query on i965, nv50, nvc0, r600, radeonsi, softpipe</li>

				<li>GL_ARB_uniform_buffer_object on freedreno</li>

				<li>GL_EXT_draw_buffers2 on freedreno</li>

				<li>GL_ARB_clip_control on i965</li>

				</ul>

				<h2>Bug fixes</h2>

				TBD.

				<h2>Changes</h2>

				<ul>

				<li>Removed classic Windows software rasterizer.</li>

				<li>Removed egl_gallium EGL driver.</li>

				<li>Removed gbm_gallium GBM driver.</li>

				<li>Removed OpenVG support.</li>

				<li>Removed the galahad gallium driver.</li>

				<li>Removed the identity gallium driver.</li>

				<li>Removed the EGL loader from the Windows SCons build.</li>

				<li>Removed the classic osmesa from the Windows SCons build.</li>

				</ul>

				</div>

				</body>

				</html>

									
										2

docs/relnotes/7.6.html
									
												
				@@ -48,7 +48,7 @@ c49c19c2bbef4f3b7f1389974dff25f4  MesaGLUT-7.6.zip

				<h2>New features</h2>

				<ul>

				<li><a href="../openvg.html">OpenVG</a> front-end (state tracker for Gallium).

				<li>OpenVG front-end (state tracker for Gallium).

				This was written by Zack Rusin at Tungsten Graphics.

				<li>GL_ARB_vertex_array_object and GL_APPLE_vertex_array_object extensions

				    (supported in Gallium drivers, Intel DRI drivers, and software drivers)</li>

									
										115

docs/relnotes/9.2.3.html
									
										Normal file
									
												
				@@ -0,0 +1,115 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 9.2.3 Release Notes / (November 13, 2013)</h1>

				<p>

				Mesa 9.2.3 is a bug fix release which fixes bugs found since the 9.2.2 release.

				</p>

				<p>

				Mesa 9.2 implements the OpenGL 3.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.1.  OpenGL

				3.1 is <strong>only</strong> available if requested at context creation

				because GL_ARB_compatibility is not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				66e9a33a414f801e1c33398bf627d56b  MesaLib-9.2.3.tar.gz

				f56b6beb556e4b9072814419f7c554e3  MesaLib-9.2.3.tar.bz2

				ed852dab576faac237ac4298bf55d0a1  MesaLib-9.2.3.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69437">Bug 69437</a> - Composite Bypass no longer works</li>

				</ul>

				<h2>Changes</h2>

				<p>The full set of changes can be viewed by using the following GIT command:</p>

				<pre>

				  git log mesa-9.2.2..mesa-9.2.3

				</pre>

				<p>Brian Paul (2):</p>

				<ul>

				  <li>st/mesa: move out of memory check in st_draw_vbo()</li>

				  <li>osmesa: fix broken triangle/line drawing when using float color buffer</li>

				</ul>

				<p>Carl Worth (7):</p>

				<ul>

				  <li>Remove error when calling glGenQueries/glDeleteQueries while a query is active</li>

				  <li>Bump version to 9.2.3</li>

				</ul>

				<p>Daniel Vetter (1):</p>

				<ul>

				  <li>i965: CS writes/reads should use I915_GEM_INSTRUCTION</li>

				</ul>

				<p>Eric Anholt (1):</p>

				<ul>

				  <li>i965: Fix texture buffer rendering after a whole buffer replacement.</li>

				</ul>

				<p>Kenneth Graunke (6):</p>

				<ul>

				  <li>i965: Emit post-sync non-zero flush before 3DSTATE_GS_SVB_INDEX.</li>

				  <li>i965: Emit post-sync non-zero flush before 3DSTATE_DRAWING_RECTANGLE.</li>

				  <li>i965: Also guard 3DSTATE_DRAWING_RECTANGLE with a flush in blorp.</li>

				  <li>i965: Move post-sync non-zero flush for 3DSTATE_MULTISAMPLE.</li>

				  <li>i965: Also emit HIER_DEPTH and STENCIL packets when disabling depth.</li>

				  <li>i965: Also emit HiZ and Stencil packets when disabling depth on Gen6.</li>

				</ul>

				<p>Kristian Høgsberg (1):</p>

				<ul>

				  <li>wayland: Don't rely on static variable for identifying wl_drm buffers</li>

				</ul>

				<p>Marek Olšák (1):</p>

				<ul>

				  <li>radeonsi: fix blitting the last 2 mipmap levels of compressed textures</li>

				</ul>

				<p>Petr Sebor (1):</p>

				<ul>

				  <li>meta: enable vertex attributes in the context of the newly created array object</li>

				</ul>

				<p>Scott Graham (1):</p>

				<ul>

				  <li>mesa: fixes for MSVC 2013</li>

				</ul>

				</div>

				</body>

				</html>

									
										102

docs/relnotes/9.2.4.html
									
										Normal file
									
												
				@@ -0,0 +1,102 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 9.2.4 Release Notes / (November 27, 2013)</h1>

				<p>

				Mesa 9.2.4 is a bug fix release which fixes bugs found since the 9.2.3 release.

				</p>

				<p>

				Mesa 9.2 implements the OpenGL 3.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.1.  OpenGL

				3.1 is <strong>only</strong> available if requested at context creation

				because GL_ARB_compatibility is not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				28190b831b0271d69dbc44b2686eab1c  MesaLib-9.2.4.tar.gz

				e630c0a307cec4f0f70ddd029d2fe084  MesaLib-9.2.4.tar.bz2

				8ef5e1e92e1d30fbedec31f716a7619e  MesaLib-9.2.4.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=53077">Bug 53077</a> - [IVB] Output error with msaa when both of framebuffer and source color's alpha are not 1</li>

				<li>Fix freedreno to compile with recent libdrm.</li>

				</ul>

				<h2>Changes</h2>

				<p>The full set of changes can be viewed by using the following GIT command:</p>

				<pre>

				  git log mesa-9.2.3..mesa-9.2.4

				</pre>

				<p>Brian Paul (1):</p>

				<ul>

				  <li>st/mesa: fix GL_FEEDBACK mode inverted Y coordinate bug</li>

				</ul>

				<p>Paul Berry (2):</p>

				<ul>

				  <li>i965: Fix vertical alignment for multisampled buffers.</li>

				  <li>glsl: Fix lowering of direct assignment in lower_clip_distance.</li>

				</ul>

				<p>Rob Clark (17):</p>

				<ul>

				  <li>freedreno/a3xx: fix color inversion on mem-&gt;gmem restore</li>

				  <li>freedreno/a3xx: fix viewport on gmem-&gt;mem resolve</li>

				  <li>freedreno: add debug option to disable scissor optimization</li>

				  <li>freedreno: update register headers</li>

				  <li>freedreno/a3xx: some texture fixes</li>

				  <li>freedreno/a3xx/compiler: fix CMP</li>

				  <li>freedreno/a3xx/compiler: handle saturate on dst</li>

				  <li>freedreno/a3xx/compiler: use max_reg rather than file_count</li>

				  <li>freedreno/a3xx/compiler: cat4 cannot use const reg as src</li>

				  <li>freedreno: fix segfault when no color buffer bound</li>

				  <li>freedreno/a3xx/compiler: make compiler errors more useful</li>

				  <li>freedreno/a3xx/compiler: bit of re-arrange/cleanup</li>

				  <li>freedreno/a3xx/compiler: fix SGT/SLT/etc</li>

				  <li>freedreno/a3xx: don't leak so much</li>

				  <li>freedreno/a3xx/compiler: better const handling</li>

				  <li>freedreno/a3xx/compiler: handle sync flags better</li>

				  <li>freedreno: updates for msm drm/kms driver</li>

				</ul>

				<p>Tapani Pälli (1):</p>

				<ul>

				  <li>mesa: enable GL_TEXTURE_LOD_BIAS set/get</li>

				</ul>

				</div>

				</body>

				</html>

									
										120

docs/relnotes/9.2.5.html
									
										Normal file
									
												
				@@ -0,0 +1,120 @@

				<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

				<html lang="en">

				<head>

				  <meta http-equiv="content-type" content="text/html; charset=utf-8">

				  <title>Mesa Release Notes</title>

				  <link rel="stylesheet" type="text/css" href="../mesa.css">

				</head>

				<body>

				<div class="header">

				  <h1>The Mesa 3D Graphics Library</h1>

				</div>

				<iframe src="../contents.html"></iframe>

				<div class="content">

				<h1>Mesa 9.2.5 Release Notes / (December 12, 2013)</h1>

				<p>

				Mesa 9.2.5 is a bug fix release which fixes bugs found since the 9.2.4 release.

				</p>

				<p>

				Mesa 9.2 implements the OpenGL 3.1 API, but the version reported by

				glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /

				glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.

				Some drivers don't support all the features required in OpenGL 3.1.  OpenGL

				3.1 is <strong>only</strong> available if requested at context creation

				because GL_ARB_compatibility is not supported.

				</p>

				<h2>MD5 checksums</h2>

				<pre>

				9fb4de29ca1d9cfd03cbdefa123ba336  MesaLib-9.2.5.tar.bz2

				1146c7c332767174f3de782b88d8e8ca  MesaLib-9.2.5.tar.gz

				a9a6c46dac7ea26fd272bf14894d95f3  MesaLib-9.2.5.zip

				</pre>

				<h2>New features</h2>

				<p>None</p>

				<h2>Bug fixes</h2>

				<p>This list is likely incomplete.</p>

				<ul>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62142">Bug 62142</a> - Mesa/demo mipmap_limits upside down with running by SOFTWARE</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=64323">Bug 64323</a> - Severe misrendering in Left 4 Dead 2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=66213">Bug 66213</a> - Certain Mesa Demos Rendering Inverted (vertically)</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68838">Bug 68838</a> - GLSL: struct declarations produce a &quot;empty declaration warning&quot; in 9.2</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=69155">Bug 69155</a> - [NV50 gallium] [piglit] bin/varying-packing-simple triggers memory corruption/failures</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72325">Bug 72325</a> - [swrast] piglit glean fbo regression</li>

				<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72327">Bug 72327</a> - [swrast] piglit glean pointSprite regression</li>

				</ul>

				<h2>Changes</h2>

				<p>The full set of changes can be viewed by using the following GIT command:</p>

				<pre>

				  git log mesa-9.2.4..mesa-9.2.5

				</pre>

				<p>Chad Versace (2):</p>

				<ul>

				  <li>i965/hsw: Apply non-msrt fast color clear w/a to all HSW GTs</li>

				  <li>i965: Add extra-alignment for non-msrt fast color clear for all hw (v2)</li>

				</ul>

				<p>Chris Forbes (4):</p>

				<ul>

				  <li>i965: Gen4-5: Don't enable hardware alpha test with MRT</li>

				  <li>i965: Gen4-5: Include alpha func/ref in program key</li>

				  <li>i965/fs: Gen4-5: Setup discard masks for MRT alpha test</li>

				  <li>i965/fs: Gen4-5: Implement alpha test in shader for MRT</li>

				</ul>

				<p>Chí-Thanh Christopher Nguyễn (1):</p>

				<ul>

				  <li>st/xorg: Handle new DamageUnregister API which has only one argument</li>

				</ul>

				<p>Dave Airlie (3):</p>

				<ul>

				  <li>mesa/swrast: fix inverted front buffer rendering with old-school swrast</li>

				  <li>glx: don't fail out when no configs if we have visuals</li>

				  <li>swrast: fix readback regression since inversion fix</li>

				</ul>

				<p>Ian Romanick (1):</p>

				<ul>

				  <li>glsl: Don't emit empty declaration warning for a struct specifier</li>

				</ul>

				<p>Ilia Mirkin (4):</p>

				<ul>

				  <li>nv50: Fix GPU_READING/WRITING bit removal</li>

				  <li>nouveau: avoid leaking fences while waiting</li>

				  <li>nv50: wait on the buf's fence before sticking it into pushbuf</li>

				  <li>nv50: report 15 max inputs for fragment programs</li>

				</ul>

				<p>Tom Stellard (2):</p>

				<ul>

				  <li>r300/compiler/tests: Fix segfault</li>

				  <li>r300/compiler/tests: Fix line length check in test parser</li>

				</ul>

				</div>

				</body>

				</html>

									
										2

docs/repository.html
									
												
				@@ -156,7 +156,7 @@ each time you do a pull.

				</p>

				<li>Small changes to master

				<p>

				If you are an experienced git user working on substancial modifications,

				If you are an experienced git user working on substantial modifications,

				you are probably

				working on a separate branch and would rebase your branch prior to

				merging with master.

									
										4

docs/shading.html
									
												
				@@ -67,7 +67,7 @@ Example:  export MESA_GLSL=dump,nopt

				<h2 id="support">GLSL Version</h2>

				<p>

				The GLSL compiler currently supports version 1.40 of the shading language.

				The GLSL compiler currently supports version 3.30 of the shading language.

				</p>

				<p>

				@@ -234,7 +234,7 @@ This option is only relevant if EmitHighLevelInstructions is set.

				<dt>EmitComments</dt>

				<dd>

				If set, instructions will be annoted with comments to help with debugging.

				If set, instructions will be annotated with comments to help with debugging.

				Extra NOP instructions will also be inserted.

				</dd>

				</dl>

									
										4

docs/sourcetree.html
									
												
				@@ -123,7 +123,7 @@ each directory.

				          Currently there's run-time code generation for x86/SSE, PowerPC

				          and Cell SPU.

				      <li><b>tgsi</b> - TG Shader Infrastructure.  Code for encoding,

				          manipulating and interpretting GPU programs.

				          manipulating and interpreting GPU programs.

				      <li><b>translate</b> - module for translating vertex data from one format

				          to another.

				      <li><b>util</b> - assorted utilities for arithmetic, hashing, surface

				@@ -133,10 +133,8 @@ each directory.

				       <ul>

				       <li><b>clover</b> - OpenCL state tracker

				       <li><b>dri</b> - Meta state tracker for DRI drivers

				       <li><b>egl</b> - Meta state tracker for EGL drivers

				       <li><b>glx</b> - Meta state tracker for GLX

				       <li><b>vdpau</b> - VDPAU state tracker

				       <li><b>vega</b> - OpenVG 1.x state tracker

				       <li><b>wgl</b> -

				       <li><b>xorg</b> - Meta state tracker for Xorg video drivers

				       <li><b>xvmc</b> - XvMC state tracker

									
										125

docs/specs/MESA_configless_context.spec
									
										Normal file
									
												
				@@ -0,0 +1,125 @@

				Name

				    MESA_configless_context

				Name Strings

				    EGL_MESA_configless_context

				Contact

				    Neil Roberts <neil.s.roberts@intel.com>

				Status

				    Proposal

				Version

				    Version 1, February 28, 2014

				Number

				    EGL Extension #not assigned

				Dependencies

				    Requires EGL 1.4 or later.  This extension is written against the

				    wording of the EGL 1.4 specification.

				Overview

				    This extension provides a means to use a single context to render to

				    multiple surfaces which have different EGLConfigs. Without this extension

				    the EGLConfig for every surface used by the context must be compatible

				    with the one used by the context. The only way to render to surfaces with

				    different formats would be to create multiple contexts but this is

				    inefficient with modern GPUs where this restriction is unnecessary.

				IP Status

				    Open-source; freely implementable.

				New Procedures and Functions

				    None.

				New Tokens

				    Accepted as <config> in eglCreateContext

				        EGL_NO_CONFIG_MESA                  ((EGLConfig)0)

				Additions to the EGL Specification section "2.2 Rendering Contexts and Drawing

				Surfaces"

				    Add the following to the 3rd paragraph:

				   "EGLContexts can also optionally be created with respect to an EGLConfig

				    depending on the parameters used at creation time. If a config is provided

				    then additional restrictions apply on what surfaces can be used with the

				    context."

				    Replace the last sentence of the 6th paragraph with:

				   "In order for a context to be compatible with a surface they both must have

				    been created with respect to the same EGLDisplay. If the context was

				    created without respect to an EGLConfig then there are no further

				    constraints. Otherwise they are only compatible if:"

				    Remove the last bullet point in the list of constraints.

				Additions to the EGL Specification section "3.7.1 Creating Rendering Contexts"

				    Replace the paragraph starting "If config is not a valid EGLConfig..."

				    with

				   "The config argument can either be a valid EGLConfig or EGL_NO_CONFIG_MESA.

				    If it is neither of these then an EGL_BAD_CONFIG error is generated. If a

				    valid config is passed then the error will also be generated if the config

				    does not support the requested client API (this includes requesting

				    creation of an OpenGL ES 1.x context when the EGL_RENDERABLE_TYPE

				    attribute of config does not contain EGL_OPENGL_ES_BIT, or creation of an

				    OpenGL ES 2.x context when the attribute does not contain

				    EGL_OPENGL_ES2_BIT).

				    Passing EGL_NO_CONFIG_MESA will create a configless context. When a

				    configless context is used with the OpenGL API it can be assumed that the

				    initial values of the context's state will be decided when the context is

				    first made current. In particular this means that the decision of whether

				    to use GL_BACK or GL_FRONT for the initial value of the first output in

				    glDrawBuffers will be decided based on the config of the draw surface when

				    it is first bound."

				Additions to the EGL Specification section "3.7.3 Binding Contexts and

				Drawables"

				    Replace the first bullet point with the following:

				   "* If draw or read are not compatible with ctx as described in section 2.2,

				      then an EGL_BAD_MATCH error is generated."

				    Add a second bullet point after that:

				   "* If draw and read are not compatible with each other as described in

				      section 2.2, then an EGL_BAD_MATCH error is generated."

				Issues

				    1.  What happens when an OpenGL context with a double-buffered surface and

				        draw buffer set to GL_BACK is made current with a single-buffered

				        surface?

				        NOT RESOLVED: There are a few options here.  An implementation can

				        raise an error, change the drawbuffer state to GL_FRONT or just do

				        nothing, expecting the application to set GL_FRONT drawbuffer before

				        drawing.  However, this extension deliberately does not specify any

				        required behavior in this corner case and applications should avoid

				        mixing single- and double-buffered surfaces with configless contexts.

				        Future extensions may specify required behavior in this case.

				Revision History

				    Version 1, February 28, 2014

				        Initial draft (Neil Roberts)

142

docs/specs/MESA_image_dma_buf_export.txt Normal file

@@ -0,0 +1,142 @@
 Name
     MESA_image_dma_buf_export
 Name Strings
     EGL_MESA_image_dma_buf_export
 Contributors
     Dave Airlie
 Contact
     Dave Airlie (airlied 'at' redhat 'dot' com)
 Status
     Proposal
 Version
     Version 2, Mar 30, 2015
 Number
     EGL Extension #not assigned
 Dependencies
     Requires EGL 1.4 or later.  This extension is written against the
     wording of the EGL 1.4 specification.
     EGL_KHR_base_image is required.
     The EGL implementation must be running on a Linux kernel supporting the
     dma_buf buffer sharing mechanism.
 Overview
     This extension provides entry points for integrating EGLImage with the
     dma-buf infrastructure.  The extension allows creating a Linux dma_buf
     file descriptor or multiple file descriptors, in the case of multi-plane
     YUV image, from an EGLImage.
     It is designed to provide the complementary functionality to EGL_EXT_image_dma_buf_import.
 IP Status
     Open-source; freely implementable.
 New Types
     This is a 64 bit unsigned integer.
     typedef khronos_uint64_t EGLuint64MESA;
 New Procedures and Functions
     EGLBoolean eglExportDMABUFImageQueryMESA(EGLDisplay dpy,
                                   EGLImageKHR image,
 				  int *fourcc,
 				  int *num_planes,
 				  EGLuint64MESA *modifiers);
     EGLBoolean eglExportDMABUFImageMESA(EGLDisplay dpy,
                                         EGLImageKHR image,
                                         int *fds,
 				        EGLint *strides,
 					EGLint *offsets);
 New Tokens
     None
 Additions to the EGL 1.4 Specification:
     To mirror the import extension, this extension attempts to return
     enough information to enable an exported dma-buf to be imported
     via eglCreateImageKHR and EGL_LINUX_DMA_BUF_EXT token.
     Retrieving the information is a two step process, so two APIs
     are required.
     The first entrypoint
        EGLBoolean eglExportDMABUFImageQueryMESA(EGLDisplay dpy,
                                   EGLImageKHR image,
 				  int *fourcc,
 				  int *num_planes,
 				  EGLuint64MESA *modifiers);
     is used to retrieve the pixel format of the buffer, as specified by
     drm_fourcc.h, the number of planes in the image and the Linux
     drm modifiers. <fourcc>, <num_planes> and <modifiers> may be NULL,
     in which case no value is retrieved.
     The second entrypoint retrieves the dma_buf file descriptors,
     strides and offsets for the image. The caller should pass
     arrays sized according to the num_planes values retrieved previously.
     Passing arrays of the wrong size will have undefined results.
     If the number of fds is less than the number of planes, then
     subsequent fd slots should contain -1.
         EGLBoolean eglExportDMABUFImageMESA(EGLDisplay dpy,
                                          EGLImageKHR image,
 					 int *fds,
                                          EGLint *strides,
                                          EGLint *offsets);
     <fds>, <strides>, <offsets> can be NULL if the information isn't
     required by the caller.
 Issues
 . Should the API look more like an attribute getting API?
 ANSWER: No, from a user interface pov, having to iterate across calling
 the API up to 12 times using attribs seems like the wrong solution.
 . Should the API take a plane and just get the fd/stride/offset for that
    plane?
 ANSWER: UNKNOWN, this might be just as valid an API.
 . Does ownership of the file descriptor remain with the app?
 ANSWER: Yes, the app is responsible for closing any fds retrieved.
 . If number of planes and number of fds differ what should we do?
 ANSWER: Return -1 for the secondary slots, as this avoids having
 to dup the fd extra times to make the interface sane.
 Revision History
     Version 2, March, 2015
         Add a query interface (Dave Airlie)
     Version 1, June 3, 2014
         Initial draft (Dave Airlie)
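
The two-step export described in MESA_image_dma_buf_export.txt above can be sketched as follows. This is only an illustration under stated assumptions, not code from the commit: `export_image`, `struct exported_image`, `query_fn`, `export_fn` and the `fake_*` functions are hypothetical names, and the typedefs are stand-ins so the sketch builds without EGL headers. In a real program the two entry points would normally be looked up with eglGetProcAddress and passed in their place.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in typedefs so the sketch builds without EGL headers. Real code
 * would include <EGL/egl.h> and <EGL/eglext.h> instead of these. */
typedef void *EGLDisplay;
typedef void *EGLImageKHR;
typedef unsigned int EGLBoolean;
typedef int32_t EGLint;
typedef uint64_t EGLuint64MESA;

/* Pointer types mirroring the two prototypes quoted in the spec. */
typedef EGLBoolean (*query_fn)(EGLDisplay dpy, EGLImageKHR image, int *fourcc,
                               int *num_planes, EGLuint64MESA *modifiers);
typedef EGLBoolean (*export_fn)(EGLDisplay dpy, EGLImageKHR image, int *fds,
                                EGLint *strides, EGLint *offsets);

/* Hypothetical container for everything the two calls return. */
struct exported_image {
    int fourcc;
    int num_planes;
    int *fds;         /* one slot per plane; unused slots hold -1 */
    EGLint *strides;
    EGLint *offsets;
};

/* Hypothetical helper. Returns 0 on success, -1 on failure. On success the
 * caller owns the arrays and must close() every fd that is not -1. */
static int export_image(query_fn query, export_fn do_export, EGLDisplay dpy,
                        EGLImageKHR image, struct exported_image *out)
{
    /* Step 1: format and plane count. Modifiers are skipped by passing NULL. */
    if (!query(dpy, image, &out->fourcc, &out->num_planes, NULL) ||
        out->num_planes <= 0)
        return -1;

    /* Step 2: size every array by num_planes, as the spec requires. */
    size_t n = (size_t)out->num_planes;
    out->fds = malloc(n * sizeof *out->fds);
    out->strides = malloc(n * sizeof *out->strides);
    out->offsets = malloc(n * sizeof *out->offsets);
    if (out->fds && out->strides && out->offsets) {
        for (size_t i = 0; i < n; i++)
            out->fds[i] = -1;
        if (do_export(dpy, image, out->fds, out->strides, out->offsets))
            return 0;
    }
    free(out->fds);
    free(out->strides);
    free(out->offsets);
    out->fds = NULL;
    out->strides = out->offsets = NULL;
    return -1;
}

/* Fake driver entry points, for illustration only: a two-plane image whose
 * planes share a single dma_buf, so the second fd slot is -1. */
static EGLBoolean fake_query(EGLDisplay dpy, EGLImageKHR image, int *fourcc,
                             int *num_planes, EGLuint64MESA *modifiers)
{
    (void)dpy; (void)image; (void)modifiers;
    if (fourcc)
        *fourcc = 'N' | ('V' << 8) | ('1' << 16) | ('2' << 24);
    if (num_planes)
        *num_planes = 2;
    return 1;
}

static EGLBoolean fake_export(EGLDisplay dpy, EGLImageKHR image, int *fds,
                              EGLint *strides, EGLint *offsets)
{
    (void)dpy; (void)image;
    if (fds) { fds[0] = 10; fds[1] = -1; }
    if (strides) { strides[0] = 256; strides[1] = 256; }
    if (offsets) { offsets[0] = 0; offsets[1] = 256 * 64; }
    return 1;
}
```

Passing the entry points as function pointers keeps the helper independent of how they were obtained, which is also what lets the fake driver stand in for a real one here. As the Issues section of the spec notes, the application remains responsible for closing any file descriptors it receives.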

									
										13

docs/specs/MESA_query_renderer.spec
									
				@@ -16,11 +16,11 @@ IP Status

				Status

				    Incomplete.  DO NOT SHIP.

				    Shipping as of Mesa 10.0

				Version

				    Version 6, 7-November-2013

				    Version 8, 14-February-2014

				Number

				@@ -211,7 +211,7 @@ Additions to the GLX 1.4 Specification

				    The attribute name GLX_RENDERER_ID_MESA specified the index of the render

				    against which the context should be created.  The default value of

				    GLX_RENDER_ID_MESA is 0.

				    GLX_RENDERER_ID_MESA is 0.

				    [Add to list of errors for glXCreateContextAttribsARB in section section

				@@ -373,7 +373,7 @@ Issues

				        should make every attempt to return as much information as is

				        possible.  For example, if the implementation is running on a non-PCI

				        SoC with a Qualcomm GPU, GLX_RENDERER_VENDOR_ID_MESA should return

				        0x168C, but GLX_RENDERER_DEVICE_ID_MESA will return 0x0000.

				        0x5143, but GLX_RENDERER_DEVICE_ID_MESA will return 0xFFFFFFFF.

				Revision History

				@@ -403,3 +403,8 @@ Revision History

				    Version 7, 2013/11/07 - Fix a couple more typos.  Add issue #17 regarding

				                            the PCI queries on systems that don't have PCI.

				    Version 8, 2014/02/14 - Fix a couple typos. GLX_RENDER_ID_MESA should

				                            read GLX_RENDERER_ID_MESA. The VENDOR/DEVICE_ID

				                            example given in issue #17 should be 0x5143 and

				                            0xFFFFFFFF respectively.

									
										2

docs/specs/MESA_texture_array.spec
									
				@@ -16,7 +16,7 @@ IP Status

				Status

				    Shipping in Mesa 7.1

				    DEPRECATED - Support removed in Mesa 10.1.

				Version

									
										101

docs/specs/WL_create_wayland_buffer_from_image.spec
									
										Normal file
									
				@@ -0,0 +1,101 @@

				Name

				    WL_create_wayland_buffer_from_image

				Name Strings

				    EGL_WL_create_wayland_buffer_from_image

				Contributors

				    Neil Roberts

				    Axel Davy

				    Daniel Stone

				Contact

				    Neil Roberts <neil.s.roberts@intel.com>

				Status

				    Proposal

				Version

				    Version 2, October 25, 2013

				Number

				    EGL Extension #not assigned

				Dependencies

				    Requires EGL 1.4 or later.  This extension is written against the

				    wording of the EGL 1.4 specification.

				    EGL_KHR_base_image is required.

				Overview

				    This extension provides an entry point to create a wl_buffer which shares

				    its contents with a given EGLImage. The expected use case for this is in a

				    nested Wayland compositor which is using subsurfaces to present buffers

				    from its clients. Using this extension it can attach the client buffers

				    directly to the subsurface without having to blit the contents into an

				    intermediate buffer. The compositing can then be done in the parent

				    compositor.

				    The nested compositor can create an EGLImage from a client buffer resource

				    using the existing WL_bind_wayland_display extension. It should also be

				    possible to create buffers using other types of images although there is

				    no expected use case for that.

				IP Status

				    Open-source; freely implementable.

				New Procedures and Functions

				    struct wl_buffer *eglCreateWaylandBufferFromImageWL(EGLDisplay dpy,

				                                                        EGLImageKHR image);

				New Tokens

				    None.

				Additions to the EGL 1.4 Specification:

				    To create a client-side wl_buffer from an EGLImage call

				      struct wl_buffer *eglCreateWaylandBufferFromImageWL(EGLDisplay dpy,

				                                                          EGLImageKHR image);

				    The returned buffer will share the contents with the given EGLImage. Any

				    updates to the image will also be updated in the wl_buffer. Typically the

				    EGLImage will be generated in a nested Wayland compositor using a buffer

				    resource from a client via the EGL_WL_bind_wayland_display extension.

				    If there was an error then the function will return NULL. In particular it

				    will generate EGL_BAD_MATCH if the implementation is not able to represent

				    the image as a wl_buffer. The possible reasons for this error are

				    implementation-dependent but may include problems such as an unsupported

				    format or tiling mode or that the buffer is in memory that is inaccessible

				    to the GPU that the given EGLDisplay is using.

				Issues

				    1) Under what circumstances can the EGL_BAD_MATCH error be generated? Does

				       this include for example unsupported tiling modes?

				       RESOLVED: Yes, the EGL_BAD_MATCH error can be generated for any reason

				       which prevents the implementation from representing the image as a

				       wl_buffer. For example, these problems can be but are not limited to

				       unsupported tiling modes, inaccessible memory or an unsupported pixel

				       format.

				Revision History

				    Version 1, September 6, 2013

				        Initial draft (Neil Roberts)

				    Version 2, October 25, 2013

				        Added a note about more possible reasons for returning EGL_BAD_MATCH.

									
										2

docs/thanks.html
									
				@@ -14,7 +14,7 @@

				<iframe src="contents.html"></iframe>

				<div class="content">

				<h1>Acknowledgments</h1>

				<h1>Acknowledgements</h1>

				The following individuals and groups are to be acknowledged for their

									
										94

docs/viewperf.html
									
				@@ -19,6 +19,7 @@

				<p>

				This page lists known issues with

				<a href="http://www.spec.org/gwpg/gpc.static/vp11info.html" target="_main">SPEC Viewperf 11</a>

				and <a href="https://www.spec.org/gwpg/gpc.static/vp12info.html" target="_main">SPEC Viewperf 12</a>

				when running on Mesa-based drivers.

				</p>

				@@ -40,13 +41,15 @@ These issues have been reported to the SPEC organization in the hope that

				they'll be fixed in the future.

				</p>

				<h2><u>Viewperf 11</u></h2>

				<p>

				Some of the Viewperf tests use a lot of memory.

				Some of the Viewperf 11 tests use a lot of memory.

				At least 2GB of RAM is recommended.

				</p>

				<h2>Catia-03 test 2</h2>

				<h3>Catia-03 test 2</h3>

				<p>

				This test creates over 38000 vertex buffer objects.  On some systems

				@@ -59,7 +62,7 @@ either in Viewperf or the Mesa driver.

				<h2>Catia-03 tests 3, 4, 8</h2>

				<h3>Catia-03 tests 3, 4, 8</h3>

				<p>

				These tests use features of the

				@@ -79,7 +82,7 @@ Subsequent drawing calls become no-ops and the rendering is incorrect.

				<h2>sw-02 tests 1, 2, 4, 6</h2>

				<h3>sw-02 tests 1, 2, 4, 6</h3>

				<p>

				These tests depend on the

				@@ -99,7 +102,7 @@ color.  This is probably due to some uninitialized state somewhere.

				<h2>sw-02 test 6</h2>

				<h3>sw-02 test 6</h3>

				<p>

				The lines drawn in this test appear in a random color.

				@@ -111,7 +114,7 @@ situation, we get a random color.

				<h2>Lightwave-01 test 3</h2>

				<h3>Lightwave-01 test 3</h3>

				<p>

				This test uses a number of mipmapped textures, but the textures are

				@@ -172,7 +175,7 @@ However, we have no plans to implement this work-around in Mesa.

				</p>

				<h2>Maya-03 test 2</h2>

				<h3>Maya-03 test 2</h3>

				<p>

				This test makes some unusual calls to glRotate.  For example:

				@@ -204,7 +207,7 @@ and with a semi-random color (between white and black) since GL_FOG is enabled.

				</p>

				<h2>Proe-05 test 1</h2>

				<h3>Proe-05 test 1</h3>

				<p>

				This uses depth testing but there's two problems:

				@@ -232,7 +235,7 @@ glClear is called so clearing the depth buffer would be a no-op anyway.

				</p>

				<h2>Proe-05 test 6</h2>

				<h3>Proe-05 test 6</h3>

				<p>

				This test draws an engine model with a two-pass algorithm.

				@@ -261,6 +264,79 @@ blending with appropriate patterns/modes to ensure the same fragments

				are produced in both passes.

				</p>

				<h2><u>Viewperf 12</u></h2>

				<p>

				Note that Viewperf 12 only runs on 64-bit Windows 7 or later.

				</p>

				<h3>catia-04</h3>

				<p>

				One of the catia tests calls wglGetProcAddress() to get some

				GL_EXT_direct_state_access functions (such as glBindMultiTextureEXT) and some

				GL_NV_half_float functions (such as glMultiTexCoord3hNV).

				If the extension/function is not supported, wglGetProcAddress() can return NULL.

				Unfortunately, Viewperf doesn't check for null pointers and crashes when it

				later tries to use the pointer.

				</p>

				<p>

				Another catia test uses OpenGL 3.1's primitive restart feature.

				But when Viewperf creates an OpenGL context, it doesn't request version 3.1.

				If the driver returns version 3.0 or earlier all the calls related to primitive

				restart generate an OpenGL error.

				Some of the rendering is then incorrect.

				</p>

				<h3>energy-01</h3>

				<p>

				This test creates a 3D luminance texture of size 1K x 1K x 1K.

				If the OpenGL driver/device doesn't support a texture of this size

				the glTexImage3D() call will fail with GL_INVALID_VALUE or GL_OUT_OF_MEMORY

				and all that's rendered is plain white polygons.

				Ideally, the test would use a proxy texture to determine the max 3D

				texture size.  But it does not do that.

				</p>

				<h3>maya-04</h3>

				<p>

				This test generates many GL_INVALID_OPERATION errors in its calls to

				glUniform().

				Causes include:

				<ul>

				<li> Trying to set float uniforms with glUniformi()

				<li> Trying to set float uniforms with glUniform3f()

				<li> Trying to set matrix uniforms with glUniform() instead of glUniformMatrix().

				</ul>

				<p>

				Apparently, the indexes returned by glGetUniformLocation() were hard-coded

				into the application trace when it was created.

				Since different implementations of glGetUniformLocation() may return different

				values for any given uniform name, subsequent calls to glUniform() will be

				invalid since they refer to the wrong uniform variables.

				This causes many OpenGL errors and leads to incorrect rendering.

				</p>

				<h3>medical-01</h3>

				<p>

				This test uses a single GLSL fragment shader which contains a GLSL 1.20

				array initializer statement, but it neglects to specify

				<code>#version 120</code> at the top of the shader code.

				So, the shader does not compile and all that's rendered is plain white polygons.

				</p>

				<h3>showcase-01</h3>

				<p>

				This is actually a DX11 test based on Autodesk's Showcase product.

				As such, it won't run with Mesa.

				</p>

				</div>

				</body>

									
										16

docs/vmware-guest.html
									
				@@ -27,9 +27,10 @@ MacOS are all supported.

				</p>

				<p>

				End users shouldn't have to go through all these steps once the driver is

				included in newer Linux distributions.

				Fedora 18 and Ubuntu 12.10 include the VMware guest GL driver, for example.

				Most modern Linux distros include the SVGA3D driver so end users shouldn't

				be concerned with this information.

				But if your distro lacks the driver or you want to update to the latest code

				these instructions explain what to do.

				</p>

				<p>

				@@ -53,6 +54,13 @@ The components involved in this include:

				<li>Mesa/gallium OpenGL driver: "svga"

				</ul>

				<p>

				All of these components reside in the guest Linux virtual machine.

				On the host, all you're doing is running VMware

				<a href="http://www.vmware.com/products/workstation/">Workstation</a> or

				<a href="http://www.vmware.com/products/fusion/">Fusion</a>.

				</p>

				<h2>Prerequisites</h2>

				@@ -134,7 +142,7 @@ As before, if you're on a 32-bit system, you should skip the --libdir

				configure option.

				  <pre>

				  cd $TOP/mesa

				  ./autogen.sh --prefix=/usr --libdir=/usr/lib64 --with-gallium-drivers=svga --with-dri-drivers= --enable-xa

				  ./autogen.sh --prefix=/usr --libdir=/usr/lib64 --with-gallium-drivers=svga --with-dri-drivers= --enable-xa --disable-dri3

				  make

				  sudo make install

				  </pre>

									
										2

docs/xlibdriver.html
									
				@@ -107,7 +107,7 @@ for your application.

				<p>

				When using Mesa directly or with GLX, it's up to the application

				writer to create a window with an appropriate colormap.  The GLUT

				toolkit tris to minimize colormap <em>flashing</em> by sharing

				toolkit tries to minimize colormap <em>flashing</em> by sharing

				colormaps when possible.  Specifically, if the visual and depth of the

				window matches that of the root window, the root window's colormap

				will be shared by the Mesa window.  Otherwise, a new, private colormap

1

doxygen/core_subset.doxy

@@ -73,7 +73,6 @@ FILE_PATTERNS          = \
 			fog.h \
 			get.h \
 			glheader.h \
 			glthread.h \
 			hash.[ch] \
 			hint.h \
 			histogram.h \

2

doxygen/main.doxy

@@ -34,7 +34,7 @@ SEARCH_INCLUDES        = YES
 INCLUDE_PATH           = ../include/
 INCLUDE_FILE_PATTERNS  =
 PREDEFINED             =
 EXPAND_AS_DEFINED      = _glthread_DECLARE_STATIC_MUTEX
 EXPAND_AS_DEFINED      =
 SKIP_FUNCTION_MACROS   = YES
 #---------------------------------------------------------------------------
 # Configuration::additions related to external references

									
										364

include/CL/cl.h
									
				@@ -1,5 +1,5 @@

				/*******************************************************************************

				 * Copyright (c) 2008-2010 The Khronos Group Inc.

				 * Copyright (c) 2008 - 2012 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				@@ -21,8 +21,6 @@

				 * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.

				 ******************************************************************************/

				/* $Revision: 11985 $ on $Date: 2010-07-15 11:16:06 -0700 (Thu, 15 Jul 2010) $ */

				#ifndef __OPENCL_CL_H

				#define __OPENCL_CL_H

				@@ -58,8 +56,10 @@ typedef cl_uint             cl_device_mem_cache_type;

				typedef cl_uint             cl_device_local_mem_type;

				typedef cl_bitfield         cl_device_exec_capabilities;

				typedef cl_bitfield         cl_command_queue_properties;

				typedef intptr_t            cl_device_partition_property;

				typedef cl_bitfield         cl_device_affinity_domain;

				typedef intptr_t			cl_context_properties;

				typedef intptr_t            cl_context_properties;

				typedef cl_uint             cl_context_info;

				typedef cl_uint             cl_command_queue_info;

				typedef cl_uint             cl_channel_order;

				@@ -67,6 +67,7 @@ typedef cl_uint             cl_channel_type;

				typedef cl_bitfield         cl_mem_flags;

				typedef cl_uint             cl_mem_object_type;

				typedef cl_uint             cl_mem_info;

				typedef cl_bitfield         cl_mem_migration_flags;

				typedef cl_uint             cl_image_info;

				typedef cl_uint             cl_buffer_create_type;

				typedef cl_uint             cl_addressing_mode;

				@@ -75,24 +76,43 @@ typedef cl_uint             cl_sampler_info;

				typedef cl_bitfield         cl_map_flags;

				typedef cl_uint             cl_program_info;

				typedef cl_uint             cl_program_build_info;

				typedef cl_uint             cl_program_binary_type;

				typedef cl_int              cl_build_status;

				typedef cl_uint             cl_kernel_info;

				typedef cl_uint             cl_kernel_arg_info;

				typedef cl_uint             cl_kernel_arg_address_qualifier;

				typedef cl_uint             cl_kernel_arg_access_qualifier;

				typedef cl_bitfield         cl_kernel_arg_type_qualifier;

				typedef cl_uint             cl_kernel_work_group_info;

				typedef cl_uint             cl_event_info;

				typedef cl_uint             cl_command_type;

				typedef cl_uint             cl_profiling_info;

				typedef struct _cl_image_format {

				    cl_channel_order        image_channel_order;

				    cl_channel_type         image_channel_data_type;

				} cl_image_format;

				typedef struct _cl_image_desc {

				    cl_mem_object_type      image_type;

				    size_t                  image_width;

				    size_t                  image_height;

				    size_t                  image_depth;

				    size_t                  image_array_size;

				    size_t                  image_row_pitch;

				    size_t                  image_slice_pitch;

				    cl_uint                 num_mip_levels;

				    cl_uint                 num_samples;

				    cl_mem                  buffer;

				} cl_image_desc;

				typedef struct _cl_buffer_region {

				    size_t                  origin;

				    size_t                  size;

				} cl_buffer_region;

				/******************************************************************************/

				/* Error Codes */

				@@ -111,6 +131,11 @@ typedef struct _cl_buffer_region {

				#define CL_MAP_FAILURE                              -12

				#define CL_MISALIGNED_SUB_BUFFER_OFFSET             -13

				#define CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST -14

				#define CL_COMPILE_PROGRAM_FAILURE                  -15

				#define CL_LINKER_NOT_AVAILABLE                     -16

				#define CL_LINK_PROGRAM_FAILURE                     -17

				#define CL_DEVICE_PARTITION_FAILED                  -18

				#define CL_KERNEL_ARG_INFO_NOT_AVAILABLE            -19

				#define CL_INVALID_VALUE                            -30

				#define CL_INVALID_DEVICE_TYPE                      -31

				@@ -147,14 +172,21 @@ typedef struct _cl_buffer_region {

				#define CL_INVALID_MIP_LEVEL                        -62

				#define CL_INVALID_GLOBAL_WORK_SIZE                 -63

				#define CL_INVALID_PROPERTY                         -64

				#define CL_INVALID_IMAGE_DESCRIPTOR                 -65

				#define CL_INVALID_COMPILER_OPTIONS                 -66

				#define CL_INVALID_LINKER_OPTIONS                   -67

				#define CL_INVALID_DEVICE_PARTITION_COUNT           -68

				/* OpenCL Version */

				#define CL_VERSION_1_0                              1

				#define CL_VERSION_1_1                              1

				#define CL_VERSION_1_2                              1

				/* cl_bool */

				#define CL_FALSE                                    0

				#define CL_TRUE                                     1

				#define CL_BLOCKING                                 CL_TRUE

				#define CL_NON_BLOCKING                             CL_FALSE

				/* cl_platform_info */

				#define CL_PLATFORM_PROFILE                         0x0900

				@@ -168,6 +200,7 @@ typedef struct _cl_buffer_region {

				#define CL_DEVICE_TYPE_CPU                          (1 << 1)

				#define CL_DEVICE_TYPE_GPU                          (1 << 2)

				#define CL_DEVICE_TYPE_ACCELERATOR                  (1 << 3)

				#define CL_DEVICE_TYPE_CUSTOM                       (1 << 4)

				#define CL_DEVICE_TYPE_ALL                          0xFFFFFFFF

				/* cl_device_info */

				@@ -221,7 +254,7 @@ typedef struct _cl_buffer_region {

				#define CL_DEVICE_VERSION                           0x102F

				#define CL_DEVICE_EXTENSIONS                        0x1030

				#define CL_DEVICE_PLATFORM                          0x1031

				/* 0x1032 reserved for CL_DEVICE_DOUBLE_FP_CONFIG */

				#define CL_DEVICE_DOUBLE_FP_CONFIG                  0x1032

				/* 0x1033 reserved for CL_DEVICE_HALF_FP_CONFIG */

				#define CL_DEVICE_PREFERRED_VECTOR_WIDTH_HALF       0x1034

				#define CL_DEVICE_HOST_UNIFIED_MEMORY               0x1035

				@@ -233,6 +266,20 @@ typedef struct _cl_buffer_region {

				#define CL_DEVICE_NATIVE_VECTOR_WIDTH_DOUBLE        0x103B

				#define CL_DEVICE_NATIVE_VECTOR_WIDTH_HALF          0x103C

				#define CL_DEVICE_OPENCL_C_VERSION                  0x103D

				#define CL_DEVICE_LINKER_AVAILABLE                  0x103E

				#define CL_DEVICE_BUILT_IN_KERNELS                  0x103F

				#define CL_DEVICE_IMAGE_MAX_BUFFER_SIZE             0x1040

				#define CL_DEVICE_IMAGE_MAX_ARRAY_SIZE              0x1041

				#define CL_DEVICE_PARENT_DEVICE                     0x1042

				#define CL_DEVICE_PARTITION_MAX_SUB_DEVICES         0x1043

				#define CL_DEVICE_PARTITION_PROPERTIES              0x1044

				#define CL_DEVICE_PARTITION_AFFINITY_DOMAIN         0x1045

				#define CL_DEVICE_PARTITION_TYPE                    0x1046

				#define CL_DEVICE_REFERENCE_COUNT                   0x1047

				#define CL_DEVICE_PREFERRED_INTEROP_USER_SYNC       0x1048

				#define CL_DEVICE_PRINTF_BUFFER_SIZE                0x1049

				#define CL_DEVICE_IMAGE_PITCH_ALIGNMENT             0x104A

				#define CL_DEVICE_IMAGE_BASE_ADDRESS_ALIGNMENT      0x104B

				/* cl_device_fp_config - bitfield */

				#define CL_FP_DENORM                                (1 << 0)

				@@ -242,6 +289,7 @@ typedef struct _cl_buffer_region {

				#define CL_FP_ROUND_TO_INF                          (1 << 4)

				#define CL_FP_FMA                                   (1 << 5)

				#define CL_FP_SOFT_FLOAT                            (1 << 6)

				#define CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT         (1 << 7)

				/* cl_device_mem_cache_type */

				#define CL_NONE                                     0x0

				@@ -266,8 +314,23 @@ typedef struct _cl_buffer_region {

				#define CL_CONTEXT_PROPERTIES                       0x1082

				#define CL_CONTEXT_NUM_DEVICES                      0x1083

				/* cl_context_info + cl_context_properties */

				/* cl_context_properties */

				#define CL_CONTEXT_PLATFORM                         0x1084

				#define CL_CONTEXT_INTEROP_USER_SYNC                0x1085

				/* cl_device_partition_property */

				#define CL_DEVICE_PARTITION_EQUALLY                 0x1086

				#define CL_DEVICE_PARTITION_BY_COUNTS               0x1087

				#define CL_DEVICE_PARTITION_BY_COUNTS_LIST_END      0x0

				#define CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN      0x1088

				/* cl_device_affinity_domain */

				#define CL_DEVICE_AFFINITY_DOMAIN_NUMA                     (1 << 0)

				#define CL_DEVICE_AFFINITY_DOMAIN_L4_CACHE                 (1 << 1)

				#define CL_DEVICE_AFFINITY_DOMAIN_L3_CACHE                 (1 << 2)

				#define CL_DEVICE_AFFINITY_DOMAIN_L2_CACHE                 (1 << 3)

				#define CL_DEVICE_AFFINITY_DOMAIN_L1_CACHE                 (1 << 4)

				#define CL_DEVICE_AFFINITY_DOMAIN_NEXT_PARTITIONABLE       (1 << 5)

				/* cl_command_queue_info */

				#define CL_QUEUE_CONTEXT                            0x1090

				@@ -282,6 +345,14 @@ typedef struct _cl_buffer_region {

				#define CL_MEM_USE_HOST_PTR                         (1 << 3)

				#define CL_MEM_ALLOC_HOST_PTR                       (1 << 4)

				#define CL_MEM_COPY_HOST_PTR                        (1 << 5)

				/* reserved                                         (1 << 6)    */

				#define CL_MEM_HOST_WRITE_ONLY                      (1 << 7)

				#define CL_MEM_HOST_READ_ONLY                       (1 << 8)

				#define CL_MEM_HOST_NO_ACCESS                       (1 << 9)

				/* cl_mem_migration_flags - bitfield */

				#define CL_MIGRATE_MEM_OBJECT_HOST                  (1 << 0)

				#define CL_MIGRATE_MEM_OBJECT_CONTENT_UNDEFINED     (1 << 1)

				/* cl_channel_order */

				#define CL_R                                        0x10B0

				@@ -297,6 +368,8 @@ typedef struct _cl_buffer_region {

				#define CL_Rx                                       0x10BA

				#define CL_RGx                                      0x10BB

				#define CL_RGBx                                     0x10BC

				#define CL_DEPTH                                    0x10BD

				#define CL_DEPTH_STENCIL                            0x10BE

				/* cl_channel_type */

				#define CL_SNORM_INT8                               0x10D0

				@@ -314,11 +387,16 @@ typedef struct _cl_buffer_region {

				#define CL_UNSIGNED_INT32                           0x10DC

				#define CL_HALF_FLOAT                               0x10DD

				#define CL_FLOAT                                    0x10DE

				#define CL_UNORM_INT24                              0x10DF

				/* cl_mem_object_type */

				#define CL_MEM_OBJECT_BUFFER                        0x10F0

				#define CL_MEM_OBJECT_IMAGE2D                       0x10F1

				#define CL_MEM_OBJECT_IMAGE3D                       0x10F2

				#define CL_MEM_OBJECT_IMAGE2D_ARRAY                 0x10F3

				#define CL_MEM_OBJECT_IMAGE1D                       0x10F4

				#define CL_MEM_OBJECT_IMAGE1D_ARRAY                 0x10F5

				#define CL_MEM_OBJECT_IMAGE1D_BUFFER                0x10F6

				/* cl_mem_info */

				#define CL_MEM_TYPE                                 0x1100

				@@ -339,6 +417,10 @@ typedef struct _cl_buffer_region {

				#define CL_IMAGE_WIDTH                              0x1114

				#define CL_IMAGE_HEIGHT                             0x1115

				#define CL_IMAGE_DEPTH                              0x1116

				#define CL_IMAGE_ARRAY_SIZE                         0x1117

				#define CL_IMAGE_BUFFER                             0x1118

				#define CL_IMAGE_NUM_MIP_LEVELS                     0x1119

				#define CL_IMAGE_NUM_SAMPLES                        0x111A

				/* cl_addressing_mode */

				#define CL_ADDRESS_NONE                             0x1130

				@@ -361,6 +443,7 @@ typedef struct _cl_buffer_region {

				/* cl_map_flags - bitfield */

				#define CL_MAP_READ                                 (1 << 0)

				#define CL_MAP_WRITE                                (1 << 1)

				#define CL_MAP_WRITE_INVALIDATE_REGION              (1 << 2)

				/* cl_program_info */

				#define CL_PROGRAM_REFERENCE_COUNT                  0x1160

				@@ -370,11 +453,20 @@ typedef struct _cl_buffer_region {

				#define CL_PROGRAM_SOURCE                           0x1164

				#define CL_PROGRAM_BINARY_SIZES                     0x1165

				#define CL_PROGRAM_BINARIES                         0x1166

				#define CL_PROGRAM_NUM_KERNELS                      0x1167

				#define CL_PROGRAM_KERNEL_NAMES                     0x1168

				/* cl_program_build_info */

				#define CL_PROGRAM_BUILD_STATUS                     0x1181

				#define CL_PROGRAM_BUILD_OPTIONS                    0x1182

				#define CL_PROGRAM_BUILD_LOG                        0x1183

				#define CL_PROGRAM_BINARY_TYPE                      0x1184

				/* cl_program_binary_type */

				#define CL_PROGRAM_BINARY_TYPE_NONE                 0x0

				#define CL_PROGRAM_BINARY_TYPE_COMPILED_OBJECT      0x1

				#define CL_PROGRAM_BINARY_TYPE_LIBRARY              0x2

				#define CL_PROGRAM_BINARY_TYPE_EXECUTABLE           0x4

				/* cl_build_status */

				#define CL_BUILD_SUCCESS                            0

				@@ -388,6 +480,32 @@ typedef struct _cl_buffer_region {

				#define CL_KERNEL_REFERENCE_COUNT                   0x1192

				#define CL_KERNEL_CONTEXT                           0x1193

				#define CL_KERNEL_PROGRAM                           0x1194

				#define CL_KERNEL_ATTRIBUTES                        0x1195

				/* cl_kernel_arg_info */

				#define CL_KERNEL_ARG_ADDRESS_QUALIFIER             0x1196

				#define CL_KERNEL_ARG_ACCESS_QUALIFIER              0x1197

				#define CL_KERNEL_ARG_TYPE_NAME                     0x1198

				#define CL_KERNEL_ARG_TYPE_QUALIFIER                0x1199

				#define CL_KERNEL_ARG_NAME                          0x119A

				/* cl_kernel_arg_address_qualifier */

				#define CL_KERNEL_ARG_ADDRESS_GLOBAL                0x119B

				#define CL_KERNEL_ARG_ADDRESS_LOCAL                 0x119C

				#define CL_KERNEL_ARG_ADDRESS_CONSTANT              0x119D

				#define CL_KERNEL_ARG_ADDRESS_PRIVATE               0x119E

				/* cl_kernel_arg_access_qualifier */

				#define CL_KERNEL_ARG_ACCESS_READ_ONLY              0x11A0

				#define CL_KERNEL_ARG_ACCESS_WRITE_ONLY             0x11A1

				#define CL_KERNEL_ARG_ACCESS_READ_WRITE             0x11A2

				#define CL_KERNEL_ARG_ACCESS_NONE                   0x11A3

				/* cl_kernel_arg_type_qualifer */

				#define CL_KERNEL_ARG_TYPE_NONE                     0

				#define CL_KERNEL_ARG_TYPE_CONST                    (1 << 0)

				#define CL_KERNEL_ARG_TYPE_RESTRICT                 (1 << 1)

				#define CL_KERNEL_ARG_TYPE_VOLATILE                 (1 << 2)

				/* cl_kernel_work_group_info */

				#define CL_KERNEL_WORK_GROUP_SIZE                   0x11B0

				@@ -395,6 +513,7 @@ typedef struct _cl_buffer_region {

				#define CL_KERNEL_LOCAL_MEM_SIZE                    0x11B2

				#define CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE 0x11B3

				#define CL_KERNEL_PRIVATE_MEM_SIZE                  0x11B4

				#define CL_KERNEL_GLOBAL_WORK_SIZE                  0x11B5

				/* cl_event_info  */

				#define CL_EVENT_COMMAND_QUEUE                      0x11D0

				@@ -425,13 +544,17 @@ typedef struct _cl_buffer_region {

				#define CL_COMMAND_WRITE_BUFFER_RECT                0x1202

				#define CL_COMMAND_COPY_BUFFER_RECT                 0x1203

				#define CL_COMMAND_USER                             0x1204

				#define CL_COMMAND_BARRIER                          0x1205

				#define CL_COMMAND_MIGRATE_MEM_OBJECTS              0x1206

				#define CL_COMMAND_FILL_BUFFER                      0x1207

				#define CL_COMMAND_FILL_IMAGE                       0x1208

				/* command execution status */

				#define CL_COMPLETE                                 0x0

				#define CL_RUNNING                                  0x1

				#define CL_SUBMITTED                                0x2

				#define CL_QUEUED                                   0x3

				/* cl_buffer_create_type  */

				#define CL_BUFFER_CREATE_TYPE_REGION                0x1220

				@@ -470,22 +593,35 @@ clGetDeviceInfo(cl_device_id    /* device */,

				                size_t          /* param_value_size */, 

				                void *          /* param_value */,

				                size_t *        /* param_value_size_ret */) CL_API_SUFFIX__VERSION_1_0;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clCreateSubDevices(cl_device_id                         /* in_device */,

				                   const cl_device_partition_property * /* properties */,

				                   cl_uint                              /* num_devices */,

				                   cl_device_id *                       /* out_devices */,

				                   cl_uint *                            /* num_devices_ret */) CL_API_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clRetainDevice(cl_device_id /* device */) CL_API_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clReleaseDevice(cl_device_id /* device */) CL_API_SUFFIX__VERSION_1_2;

				/* Context APIs  */

				extern CL_API_ENTRY cl_context CL_API_CALL

				clCreateContext(const cl_context_properties * /* properties */,

				                cl_uint                       /* num_devices */,

				                const cl_device_id *          /* devices */,

				                cl_uint                 /* num_devices */,

				                const cl_device_id *    /* devices */,

				                void (CL_CALLBACK * /* pfn_notify */)(const char *, const void *, size_t, void *),

				                void *                        /* user_data */,

				                cl_int *                      /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;

				                void *                  /* user_data */,

				                cl_int *                /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;

				extern CL_API_ENTRY cl_context CL_API_CALL

				clCreateContextFromType(const cl_context_properties * /* properties */,

				                        cl_device_type                /* device_type */,

				                        cl_device_type          /* device_type */,

				                        void (CL_CALLBACK *     /* pfn_notify*/ )(const char *, const void *, size_t, void *),

				                        void *                        /* user_data */,

				                        cl_int *                      /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;

				                        void *                  /* user_data */,

				                        cl_int *                /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clRetainContext(cl_context /* context */) CL_API_SUFFIX__VERSION_1_0;

				@@ -520,25 +656,6 @@ clGetCommandQueueInfo(cl_command_queue      /* command_queue */,

				                      void *                /* param_value */,

				                      size_t *              /* param_value_size_ret */) CL_API_SUFFIX__VERSION_1_0;

				#ifdef CL_USE_DEPRECATED_OPENCL_1_0_APIS

				#warning CL_USE_DEPRECATED_OPENCL_1_0_APIS is defined. These APIs are unsupported and untested in OpenCL 1.1!

				/* 

				 *  WARNING:

				 *     This API introduces mutable state into the OpenCL implementation. It has been REMOVED

				 *  to better facilitate thread safety.  The 1.0 API is not thread safe. It is not tested by the

				 *  OpenCL 1.1 conformance test, and consequently may not work or may not work dependably.

				 *  It is likely to be non-performant. Use of this API is not advised. Use at your own risk.

				 *

				 *  Software developers previously relying on this API are instructed to set the command queue 

				 *  properties when creating the queue, instead. 

				 */

				extern CL_API_ENTRY cl_int CL_API_CALL

				clSetCommandQueueProperty(cl_command_queue              /* command_queue */,

				                          cl_command_queue_properties   /* properties */, 

				                          cl_bool                        /* enable */,

				                          cl_command_queue_properties * /* old_properties */) CL_EXT_SUFFIX__VERSION_1_0_DEPRECATED;

				#endif /* CL_USE_DEPRECATED_OPENCL_1_0_APIS */

				/* Memory Object APIs */

				extern CL_API_ENTRY cl_mem CL_API_CALL

				clCreateBuffer(cl_context   /* context */,

				@@ -555,26 +672,12 @@ clCreateSubBuffer(cl_mem                   /* buffer */,

				                  cl_int *                 /* errcode_ret */) CL_API_SUFFIX__VERSION_1_1;

				extern CL_API_ENTRY cl_mem CL_API_CALL

				clCreateImage2D(cl_context              /* context */,

				                cl_mem_flags            /* flags */,

				                const cl_image_format * /* image_format */,

				                size_t                  /* image_width */,

				                size_t                  /* image_height */,

				                size_t                  /* image_row_pitch */, 

				                void *                  /* host_ptr */,

				                cl_int *                /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;

				extern CL_API_ENTRY cl_mem CL_API_CALL

				clCreateImage3D(cl_context              /* context */,

				                cl_mem_flags            /* flags */,

				                const cl_image_format * /* image_format */,

				                size_t                  /* image_width */, 

				                size_t                  /* image_height */,

				                size_t                  /* image_depth */, 

				                size_t                  /* image_row_pitch */, 

				                size_t                  /* image_slice_pitch */, 

				                void *                  /* host_ptr */,

				                cl_int *                /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;

				clCreateImage(cl_context              /* context */,

				              cl_mem_flags            /* flags */,

				              const cl_image_format * /* image_format */,

				              const cl_image_desc *   /* image_desc */, 

				              void *                  /* host_ptr */,

				              cl_int *                /* errcode_ret */) CL_API_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clRetainMemObject(cl_mem /* memobj */) CL_API_SUFFIX__VERSION_1_0;

				@@ -609,7 +712,7 @@ clSetMemObjectDestructorCallback(  cl_mem /* memobj */,

				                                    void (CL_CALLBACK * /*pfn_notify*/)( cl_mem /* memobj */, void* /*user_data*/), 

				                                    void * /*user_data */ )             CL_API_SUFFIX__VERSION_1_1;  

				/* Sampler APIs  */

				/* Sampler APIs */

				extern CL_API_ENTRY cl_sampler CL_API_CALL

				clCreateSampler(cl_context          /* context */,

				                cl_bool             /* normalized_coords */, 

				@@ -647,6 +750,13 @@ clCreateProgramWithBinary(cl_context                     /* context */,

				                          cl_int *                       /* binary_status */,

				                          cl_int *                       /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;

				extern CL_API_ENTRY cl_program CL_API_CALL

				clCreateProgramWithBuiltInKernels(cl_context            /* context */,

				                                  cl_uint               /* num_devices */,

				                                  const cl_device_id *  /* device_list */,

				                                  const char *          /* kernel_names */,

				                                  cl_int *              /* errcode_ret */) CL_API_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clRetainProgram(cl_program /* program */) CL_API_SUFFIX__VERSION_1_0;

				@@ -662,7 +772,30 @@ clBuildProgram(cl_program           /* program */,

				               void *               /* user_data */) CL_API_SUFFIX__VERSION_1_0;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clUnloadCompiler(void) CL_API_SUFFIX__VERSION_1_0;

				clCompileProgram(cl_program           /* program */,

				                 cl_uint              /* num_devices */,

				                 const cl_device_id * /* device_list */,

				                 const char *         /* options */, 

				                 cl_uint              /* num_input_headers */,

				                 const cl_program *   /* input_headers */,

				                 const char **        /* header_include_names */,

				                 void (CL_CALLBACK *  /* pfn_notify */)(cl_program /* program */, void * /* user_data */),

				                 void *               /* user_data */) CL_API_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_program CL_API_CALL

				clLinkProgram(cl_context           /* context */,

				              cl_uint              /* num_devices */,

				              const cl_device_id * /* device_list */,

				              const char *         /* options */, 

				              cl_uint              /* num_input_programs */,

				              const cl_program *   /* input_programs */,

				              void (CL_CALLBACK *  /* pfn_notify */)(cl_program /* program */, void * /* user_data */),

				              void *               /* user_data */,

				              cl_int *             /* errcode_ret */ ) CL_API_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clUnloadPlatformCompiler(cl_platform_id /* platform */) CL_API_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clGetProgramInfo(cl_program         /* program */,

				@@ -710,6 +843,14 @@ clGetKernelInfo(cl_kernel       /* kernel */,

				                void *          /* param_value */,

				                size_t *        /* param_value_size_ret */) CL_API_SUFFIX__VERSION_1_0;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clGetKernelArgInfo(cl_kernel       /* kernel */,

				                   cl_uint         /* arg_indx */,

				                   cl_kernel_arg_info  /* param_name */,

				                   size_t          /* param_value_size */,

				                   void *          /* param_value */,

				                   size_t *        /* param_value_size_ret */) CL_API_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clGetKernelWorkGroupInfo(cl_kernel                  /* kernel */,

				                         cl_device_id               /* device */,

				@@ -718,7 +859,7 @@ clGetKernelWorkGroupInfo(cl_kernel                  /* kernel */,

				                         void *                     /* param_value */,

				                         size_t *                   /* param_value_size_ret */) CL_API_SUFFIX__VERSION_1_0;

				/* Event Object APIs  */

				/* Event Object APIs */

				extern CL_API_ENTRY cl_int CL_API_CALL

				clWaitForEvents(cl_uint             /* num_events */,

				                const cl_event *    /* event_list */) CL_API_SUFFIX__VERSION_1_0;

				@@ -750,7 +891,7 @@ clSetEventCallback( cl_event    /* event */,

				                    void (CL_CALLBACK * /* pfn_notify */)(cl_event, cl_int, void *),

				                    void *      /* user_data */) CL_API_SUFFIX__VERSION_1_1;

				/* Profiling APIs  */

				/* Profiling APIs */

				extern CL_API_ENTRY cl_int CL_API_CALL

				clGetEventProfilingInfo(cl_event            /* event */,

				                        cl_profiling_info   /* param_name */,

				@@ -771,7 +912,7 @@ clEnqueueReadBuffer(cl_command_queue    /* command_queue */,

				                    cl_mem              /* buffer */,

				                    cl_bool             /* blocking_read */,

				                    size_t              /* offset */,

				                    size_t              /* cb */, 

				                    size_t              /* size */, 

				                    void *              /* ptr */,

				                    cl_uint             /* num_events_in_wait_list */,

				                    const cl_event *    /* event_wait_list */,

				@@ -781,8 +922,8 @@ extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueReadBufferRect(cl_command_queue    /* command_queue */,

				                        cl_mem              /* buffer */,

				                        cl_bool             /* blocking_read */,

				                        const size_t *      /* buffer_origin */,

				                        const size_t *      /* host_origin */, 

				                        const size_t *      /* buffer_offset */,

				                        const size_t *      /* host_offset */, 

				                        const size_t *      /* region */,

				                        size_t              /* buffer_row_pitch */,

				                        size_t              /* buffer_slice_pitch */,

				@@ -798,7 +939,7 @@ clEnqueueWriteBuffer(cl_command_queue   /* command_queue */,

				                     cl_mem             /* buffer */, 

				                     cl_bool            /* blocking_write */, 

				                     size_t             /* offset */, 

				                     size_t             /* cb */, 

				                     size_t             /* size */, 

				                     const void *       /* ptr */, 

				                     cl_uint            /* num_events_in_wait_list */, 

				                     const cl_event *   /* event_wait_list */, 

				@@ -808,8 +949,8 @@ extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueWriteBufferRect(cl_command_queue    /* command_queue */,

				                         cl_mem              /* buffer */,

				                         cl_bool             /* blocking_write */,

				                         const size_t *      /* buffer_origin */,

				                         const size_t *      /* host_origin */, 

				                         const size_t *      /* buffer_offset */,

				                         const size_t *      /* host_offset */, 

				                         const size_t *      /* region */,

				                         size_t              /* buffer_row_pitch */,

				                         size_t              /* buffer_slice_pitch */,

				@@ -820,13 +961,24 @@ clEnqueueWriteBufferRect(cl_command_queue    /* command_queue */,

				                         const cl_event *    /* event_wait_list */,

				                         cl_event *          /* event */) CL_API_SUFFIX__VERSION_1_1;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueFillBuffer(cl_command_queue   /* command_queue */,

				                    cl_mem             /* buffer */, 

				                    const void *       /* pattern */, 

				                    size_t             /* pattern_size */, 

				                    size_t             /* offset */, 

				                    size_t             /* size */, 

				                    cl_uint            /* num_events_in_wait_list */, 

				                    const cl_event *   /* event_wait_list */, 

				                    cl_event *         /* event */) CL_API_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueCopyBuffer(cl_command_queue    /* command_queue */, 

				                    cl_mem              /* src_buffer */,

				                    cl_mem              /* dst_buffer */, 

				                    size_t              /* src_offset */,

				                    size_t              /* dst_offset */,

				                    size_t              /* cb */, 

				                    size_t              /* size */, 

				                    cl_uint             /* num_events_in_wait_list */,

				                    const cl_event *    /* event_wait_list */,

				                    cl_event *          /* event */) CL_API_SUFFIX__VERSION_1_0;

				@@ -872,6 +1024,16 @@ clEnqueueWriteImage(cl_command_queue    /* command_queue */,

				                    const cl_event *    /* event_wait_list */,

				                    cl_event *          /* event */) CL_API_SUFFIX__VERSION_1_0;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueFillImage(cl_command_queue   /* command_queue */,

				                   cl_mem             /* image */, 

				                   const void *       /* fill_color */, 

				                   const size_t *     /* origin[3] */, 

				                   const size_t *     /* region[3] */, 

				                   cl_uint            /* num_events_in_wait_list */, 

				                   const cl_event *   /* event_wait_list */, 

				                   cl_event *         /* event */) CL_API_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueCopyImage(cl_command_queue     /* command_queue */,

				                   cl_mem               /* src_image */,

				@@ -911,7 +1073,7 @@ clEnqueueMapBuffer(cl_command_queue /* command_queue */,

				                   cl_bool          /* blocking_map */, 

				                   cl_map_flags     /* map_flags */,

				                   size_t           /* offset */,

				                   size_t           /* cb */,

				                   size_t           /* size */,

				                   cl_uint          /* num_events_in_wait_list */,

				                   const cl_event * /* event_wait_list */,

				                   cl_event *       /* event */,

				@@ -939,6 +1101,15 @@ clEnqueueUnmapMemObject(cl_command_queue /* command_queue */,

				                        const cl_event *  /* event_wait_list */,

				                        cl_event *        /* event */) CL_API_SUFFIX__VERSION_1_0;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueMigrateMemObjects(cl_command_queue       /* command_queue */,

				                           cl_uint                /* num_mem_objects */,

				                           const cl_mem *         /* mem_objects */,

				                           cl_mem_migration_flags /* flags */,

				                           cl_uint                /* num_events_in_wait_list */,

				                           const cl_event *       /* event_wait_list */,

				                           cl_event *             /* event */) CL_API_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueNDRangeKernel(cl_command_queue /* command_queue */,

				                       cl_kernel        /* kernel */,

				@@ -959,7 +1130,7 @@ clEnqueueTask(cl_command_queue  /* command_queue */,

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueNativeKernel(cl_command_queue  /* command_queue */,

									  void (*user_func)(void *), 

									  void (CL_CALLBACK * /*user_func*/)(void *), 

				                      void *            /* args */,

				                      size_t            /* cb_args */, 

				                      cl_uint           /* num_mem_objects */,

				@@ -970,16 +1141,17 @@ clEnqueueNativeKernel(cl_command_queue  /* command_queue */,

				                      cl_event *        /* event */) CL_API_SUFFIX__VERSION_1_0;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueMarker(cl_command_queue    /* command_queue */,

				                cl_event *          /* event */) CL_API_SUFFIX__VERSION_1_0;

				clEnqueueMarkerWithWaitList(cl_command_queue /* command_queue */,

				                            cl_uint           /* num_events_in_wait_list */,

				                            const cl_event *  /* event_wait_list */,

				                            cl_event *        /* event */) CL_API_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueWaitForEvents(cl_command_queue /* command_queue */,

				                       cl_uint          /* num_events */,

				                       const cl_event * /* event_list */) CL_API_SUFFIX__VERSION_1_0;

				clEnqueueBarrierWithWaitList(cl_command_queue /* command_queue */,

				                             cl_uint           /* num_events_in_wait_list */,

				                             const cl_event *  /* event_wait_list */,

				                             cl_event *        /* event */) CL_API_SUFFIX__VERSION_1_2;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueBarrier(cl_command_queue /* command_queue */) CL_API_SUFFIX__VERSION_1_0;

				/* Extension function access

				 *

				@@ -988,7 +1160,51 @@ clEnqueueBarrier(cl_command_queue /* command_queue */) CL_API_SUFFIX__VERSION_1_

				 * check to make sure the address is not NULL, before using or 

				 * calling the returned function address.

				 */

				extern CL_API_ENTRY void * CL_API_CALL clGetExtensionFunctionAddress(const char * /* func_name */) CL_API_SUFFIX__VERSION_1_0;

				extern CL_API_ENTRY void * CL_API_CALL 

				clGetExtensionFunctionAddressForPlatform(cl_platform_id /* platform */,

				                                         const char *   /* func_name */) CL_API_SUFFIX__VERSION_1_2;

				/* Deprecated OpenCL 1.1 APIs */

				extern CL_API_ENTRY CL_EXT_PREFIX__VERSION_1_1_DEPRECATED cl_mem CL_API_CALL

				clCreateImage2D(cl_context              /* context */,

				                cl_mem_flags            /* flags */,

				                const cl_image_format * /* image_format */,

				                size_t                  /* image_width */,

				                size_t                  /* image_height */,

				                size_t                  /* image_row_pitch */, 

				                void *                  /* host_ptr */,

				                cl_int *                /* errcode_ret */) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;

				extern CL_API_ENTRY CL_EXT_PREFIX__VERSION_1_1_DEPRECATED cl_mem CL_API_CALL

				clCreateImage3D(cl_context              /* context */,

				                cl_mem_flags            /* flags */,

				                const cl_image_format * /* image_format */,

				                size_t                  /* image_width */, 

				                size_t                  /* image_height */,

				                size_t                  /* image_depth */, 

				                size_t                  /* image_row_pitch */, 

				                size_t                  /* image_slice_pitch */, 

				                void *                  /* host_ptr */,

				                cl_int *                /* errcode_ret */) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;

				extern CL_API_ENTRY CL_EXT_PREFIX__VERSION_1_1_DEPRECATED cl_int CL_API_CALL

				clEnqueueMarker(cl_command_queue    /* command_queue */,

				                cl_event *          /* event */) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;

				extern CL_API_ENTRY CL_EXT_PREFIX__VERSION_1_1_DEPRECATED cl_int CL_API_CALL

				clEnqueueWaitForEvents(cl_command_queue /* command_queue */,

				                        cl_uint          /* num_events */,

				                        const cl_event * /* event_list */) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;

				extern CL_API_ENTRY CL_EXT_PREFIX__VERSION_1_1_DEPRECATED cl_int CL_API_CALL

				clEnqueueBarrier(cl_command_queue /* command_queue */) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;

				extern CL_API_ENTRY CL_EXT_PREFIX__VERSION_1_1_DEPRECATED cl_int CL_API_CALL

				clUnloadCompiler(void) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;

				extern CL_API_ENTRY CL_EXT_PREFIX__VERSION_1_1_DEPRECATED void * CL_API_CALL

				clGetExtensionFunctionAddress(const char * /* func_name */) CL_EXT_SUFFIX__VERSION_1_1_DEPRECATED;

				#ifdef __cplusplus

				}
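
Reviewer note on the cl.h hunks above: the new cl_kernel_arg_type_qualifier values are bit flags, so a value returned for CL_KERNEL_ARG_TYPE_QUALIFIER may combine several of them. The sketch below shows one way to decode such a value. It is a minimal illustration only: the macro values are copied from the hunk so that the snippet builds without the OpenCL headers, and describe_type_qualifier is a hypothetical helper, not something cl.h provides.

```c
#include <stdio.h>
#include <stddef.h>

/* Values copied from the cl_kernel_arg_type_qualifier hunk in this diff. */
#define CL_KERNEL_ARG_TYPE_NONE      0
#define CL_KERNEL_ARG_TYPE_CONST     (1 << 0)
#define CL_KERNEL_ARG_TYPE_RESTRICT  (1 << 1)
#define CL_KERNEL_ARG_TYPE_VOLATILE  (1 << 2)

/*
 * Hypothetical helper (not part of cl.h): writes a space-separated list of
 * the qualifier names set in q into buf and returns buf.
 */
static const char *describe_type_qualifier(unsigned long q, char *buf, size_t len)
{
    static const struct { unsigned long bit; const char *name; } table[] = {
        { CL_KERNEL_ARG_TYPE_CONST,    "const"    },
        { CL_KERNEL_ARG_TYPE_RESTRICT, "restrict" },
        { CL_KERNEL_ARG_TYPE_VOLATILE, "volatile" },
    };
    size_t used = 0;
    size_t i;

    if (len == 0)
        return buf;
    buf[0] = '\0';

    /* NONE is zero, so it has to be tested by equality, not by masking. */
    if (q == CL_KERNEL_ARG_TYPE_NONE) {
        snprintf(buf, len, "none");
        return buf;
    }

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if ((q & table[i].bit) && used < len) {
            int n = snprintf(buf + used, len - used, "%s%s",
                             used ? " " : "", table[i].name);
            if (n < 0)
                break;
            used += (size_t)n;
        }
    }
    return buf;
}
```

In real code the value would come from clGetKernelArgInfo with CL_KERNEL_ARG_TYPE_QUALIFIER; here the caller simply passes the bitfield and a scratch buffer.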

									

include/CL/cl_d3d10.h
									
												
				@@ -0,0 +1,126 @@

				/**********************************************************************************

				 * Copyright (c) 2008-2012 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				 * "Materials"), to deal in the Materials without restriction, including

				 * without limitation the rights to use, copy, modify, merge, publish,

				 * distribute, sublicense, and/or sell copies of the Materials, and to

				 * permit persons to whom the Materials are furnished to do so, subject to

				 * the following conditions:

				 *

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Materials.

				 *

				 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

				 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY

				 * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,

				 * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE

				 * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.

				 **********************************************************************************/

				/* $Revision: 11708 $ on $Date: 2010-06-13 23:36:24 -0700 (Sun, 13 Jun 2010) $ */

				#ifndef __OPENCL_CL_D3D10_H

				#define __OPENCL_CL_D3D10_H

				#include <d3d10.h>

				#include <CL/cl.h>

				#include <CL/cl_platform.h>

				#ifdef __cplusplus

				extern "C" {

				#endif

				/******************************************************************************

				 * cl_khr_d3d10_sharing                                                       */

				#define cl_khr_d3d10_sharing 1

				typedef cl_uint cl_d3d10_device_source_khr;

				typedef cl_uint cl_d3d10_device_set_khr;

				/******************************************************************************/

				/* Error Codes */

				#define CL_INVALID_D3D10_DEVICE_KHR                  -1002

				#define CL_INVALID_D3D10_RESOURCE_KHR                -1003

				#define CL_D3D10_RESOURCE_ALREADY_ACQUIRED_KHR       -1004

				#define CL_D3D10_RESOURCE_NOT_ACQUIRED_KHR           -1005

				/* cl_d3d10_device_source_nv */

				#define CL_D3D10_DEVICE_KHR                          0x4010

				#define CL_D3D10_DXGI_ADAPTER_KHR                    0x4011

				/* cl_d3d10_device_set_nv */

				#define CL_PREFERRED_DEVICES_FOR_D3D10_KHR           0x4012

				#define CL_ALL_DEVICES_FOR_D3D10_KHR                 0x4013

				/* cl_context_info */

				#define CL_CONTEXT_D3D10_DEVICE_KHR                  0x4014

				#define CL_CONTEXT_D3D10_PREFER_SHARED_RESOURCES_KHR 0x402C

				/* cl_mem_info */

				#define CL_MEM_D3D10_RESOURCE_KHR                    0x4015

				/* cl_image_info */

				#define CL_IMAGE_D3D10_SUBRESOURCE_KHR               0x4016

				/* cl_command_type */

				#define CL_COMMAND_ACQUIRE_D3D10_OBJECTS_KHR         0x4017

				#define CL_COMMAND_RELEASE_D3D10_OBJECTS_KHR         0x4018

				/******************************************************************************/

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clGetDeviceIDsFromD3D10KHR_fn)(

				    cl_platform_id             platform,

				    cl_d3d10_device_source_khr d3d_device_source,

				    void *                     d3d_object,

				    cl_d3d10_device_set_khr    d3d_device_set,

				    cl_uint                    num_entries,

				    cl_device_id *             devices,

				    cl_uint *                  num_devices) CL_API_SUFFIX__VERSION_1_0;

				typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromD3D10BufferKHR_fn)(

				    cl_context     context,

				    cl_mem_flags   flags,

				    ID3D10Buffer * resource,

				    cl_int *       errcode_ret) CL_API_SUFFIX__VERSION_1_0;

				typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromD3D10Texture2DKHR_fn)(

				    cl_context        context,

				    cl_mem_flags      flags,

				    ID3D10Texture2D * resource,

				    UINT              subresource,

				    cl_int *          errcode_ret) CL_API_SUFFIX__VERSION_1_0;

				typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromD3D10Texture3DKHR_fn)(

				    cl_context        context,

				    cl_mem_flags      flags,

				    ID3D10Texture3D * resource,

				    UINT              subresource,

				    cl_int *          errcode_ret) CL_API_SUFFIX__VERSION_1_0;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueAcquireD3D10ObjectsKHR_fn)(

				    cl_command_queue command_queue,

				    cl_uint          num_objects,

				    const cl_mem *   mem_objects,

				    cl_uint          num_events_in_wait_list,

				    const cl_event * event_wait_list,

				    cl_event *       event) CL_API_SUFFIX__VERSION_1_0;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueReleaseD3D10ObjectsKHR_fn)(

				    cl_command_queue command_queue,

				    cl_uint          num_objects,

				    const cl_mem *   mem_objects,

				    cl_uint          num_events_in_wait_list,

				    const cl_event * event_wait_list,

				    cl_event *       event) CL_API_SUFFIX__VERSION_1_0;

				#ifdef __cplusplus

				}

				#endif

				#endif  /* __OPENCL_CL_D3D10_H */
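
Reviewer note on cl_d3d10.h: the four cl_khr_d3d10_sharing error codes sit outside the core OpenCL error range, so generic error-to-string tables will usually not know them. The sketch below maps them to their names. It assumes nothing beyond the values in the new header, which are copied here so that the snippet builds without d3d10.h; d3d10_sharing_error_name is a hypothetical helper, not part of the header.

```c
#include <stddef.h>

/* Error codes copied from the new include/CL/cl_d3d10.h in this diff. */
#define CL_INVALID_D3D10_DEVICE_KHR             -1002
#define CL_INVALID_D3D10_RESOURCE_KHR           -1003
#define CL_D3D10_RESOURCE_ALREADY_ACQUIRED_KHR  -1004
#define CL_D3D10_RESOURCE_NOT_ACQUIRED_KHR      -1005

/*
 * Hypothetical helper (not part of cl_d3d10.h): returns the macro name for a
 * cl_khr_d3d10_sharing error code, or NULL when the code belongs elsewhere.
 */
static const char *d3d10_sharing_error_name(int err)
{
    switch (err) {
    case CL_INVALID_D3D10_DEVICE_KHR:
        return "CL_INVALID_D3D10_DEVICE_KHR";
    case CL_INVALID_D3D10_RESOURCE_KHR:
        return "CL_INVALID_D3D10_RESOURCE_KHR";
    case CL_D3D10_RESOURCE_ALREADY_ACQUIRED_KHR:
        return "CL_D3D10_RESOURCE_ALREADY_ACQUIRED_KHR";
    case CL_D3D10_RESOURCE_NOT_ACQUIRED_KHR:
        return "CL_D3D10_RESOURCE_NOT_ACQUIRED_KHR";
    default:
        return NULL; /* not a D3D10 sharing error */
    }
}
```

Returning NULL for unknown codes lets a caller fall back to whatever lookup it already uses for core error values.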

									
										126

include/CL/cl_d3d11.h
									
										Normal file
									
												
				@@ -0,0 +1,126 @@

				/**********************************************************************************

				 * Copyright (c) 2008-2012 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				 * "Materials"), to deal in the Materials without restriction, including

				 * without limitation the rights to use, copy, modify, merge, publish,

				 * distribute, sublicense, and/or sell copies of the Materials, and to

				 * permit persons to whom the Materials are furnished to do so, subject to

				 * the following conditions:

				 *

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Materials.

				 *

				 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

				 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY

				 * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,

				 * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE

				 * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.

				 **********************************************************************************/

				/* $Revision: 11708 $ on $Date: 2010-06-13 23:36:24 -0700 (Sun, 13 Jun 2010) $ */

				#ifndef __OPENCL_CL_D3D11_H

				#define __OPENCL_CL_D3D11_H

				#include <d3d11.h>

				#include <CL/cl.h>

				#include <CL/cl_platform.h>

				#ifdef __cplusplus

				extern "C" {

				#endif

				/******************************************************************************

				 * cl_khr_d3d11_sharing                                                       */

				#define cl_khr_d3d11_sharing 1

				typedef cl_uint cl_d3d11_device_source_khr;

				typedef cl_uint cl_d3d11_device_set_khr;

				/******************************************************************************/

				/* Error Codes */

				#define CL_INVALID_D3D11_DEVICE_KHR                  -1006

				#define CL_INVALID_D3D11_RESOURCE_KHR                -1007

				#define CL_D3D11_RESOURCE_ALREADY_ACQUIRED_KHR       -1008

				#define CL_D3D11_RESOURCE_NOT_ACQUIRED_KHR           -1009

				/* cl_d3d11_device_source */

				#define CL_D3D11_DEVICE_KHR                          0x4019

				#define CL_D3D11_DXGI_ADAPTER_KHR                    0x401A

				/* cl_d3d11_device_set */

				#define CL_PREFERRED_DEVICES_FOR_D3D11_KHR           0x401B

				#define CL_ALL_DEVICES_FOR_D3D11_KHR                 0x401C

				/* cl_context_info */

				#define CL_CONTEXT_D3D11_DEVICE_KHR                  0x401D

				#define CL_CONTEXT_D3D11_PREFER_SHARED_RESOURCES_KHR 0x402D

				/* cl_mem_info */

				#define CL_MEM_D3D11_RESOURCE_KHR                    0x401E

				/* cl_image_info */

				#define CL_IMAGE_D3D11_SUBRESOURCE_KHR               0x401F

				/* cl_command_type */

				#define CL_COMMAND_ACQUIRE_D3D11_OBJECTS_KHR         0x4020

				#define CL_COMMAND_RELEASE_D3D11_OBJECTS_KHR         0x4021

				/******************************************************************************/

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clGetDeviceIDsFromD3D11KHR_fn)(

				    cl_platform_id             platform,

				    cl_d3d11_device_source_khr d3d_device_source,

				    void *                     d3d_object,

				    cl_d3d11_device_set_khr    d3d_device_set,

				    cl_uint                    num_entries,

				    cl_device_id *             devices,

				    cl_uint *                  num_devices) CL_API_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromD3D11BufferKHR_fn)(

				    cl_context     context,

				    cl_mem_flags   flags,

				    ID3D11Buffer * resource,

				    cl_int *       errcode_ret) CL_API_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromD3D11Texture2DKHR_fn)(

				    cl_context        context,

				    cl_mem_flags      flags,

				    ID3D11Texture2D * resource,

				    UINT              subresource,

				    cl_int *          errcode_ret) CL_API_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromD3D11Texture3DKHR_fn)(

				    cl_context        context,

				    cl_mem_flags      flags,

				    ID3D11Texture3D * resource,

				    UINT              subresource,

				    cl_int *          errcode_ret) CL_API_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueAcquireD3D11ObjectsKHR_fn)(

				    cl_command_queue command_queue,

				    cl_uint          num_objects,

				    const cl_mem *   mem_objects,

				    cl_uint          num_events_in_wait_list,

				    const cl_event * event_wait_list,

				    cl_event *       event) CL_API_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueReleaseD3D11ObjectsKHR_fn)(

				    cl_command_queue command_queue,

				    cl_uint          num_objects,

				    const cl_mem *   mem_objects,

				    cl_uint          num_events_in_wait_list,

				    const cl_event * event_wait_list,

				    cl_event *       event) CL_API_SUFFIX__VERSION_1_2;

				#ifdef __cplusplus

				}

				#endif

				#endif  /* __OPENCL_CL_D3D11_H */

									
										127

include/CL/cl_dx9_media_sharing.h
									
										Normal file
									
												
				@@ -0,0 +1,127 @@

				/**********************************************************************************

				 * Copyright (c) 2008-2012 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				 * "Materials"), to deal in the Materials without restriction, including

				 * without limitation the rights to use, copy, modify, merge, publish,

				 * distribute, sublicense, and/or sell copies of the Materials, and to

				 * permit persons to whom the Materials are furnished to do so, subject to

				 * the following conditions:

				 *

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Materials.

				 *

				 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

				 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY

				 * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,

				 * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE

				 * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.

				 **********************************************************************************/

				/* $Revision: 11708 $ on $Date: 2010-06-13 23:36:24 -0700 (Sun, 13 Jun 2010) $ */

				#ifndef __OPENCL_CL_DX9_MEDIA_SHARING_H

				#define __OPENCL_CL_DX9_MEDIA_SHARING_H

				#include <CL/cl.h>

				#include <CL/cl_platform.h>

				#ifdef __cplusplus

				extern "C" {

				#endif

				/******************************************************************************

				/* cl_khr_dx9_media_sharing                                                   */

				#define cl_khr_dx9_media_sharing 1

				typedef cl_uint             cl_dx9_media_adapter_type_khr;

				typedef cl_uint             cl_dx9_media_adapter_set_khr;

				#if defined(_WIN32)

				#include <d3d9.h>

				typedef struct _cl_dx9_surface_info_khr

				{

				    IDirect3DSurface9 *resource;

				    HANDLE shared_handle;

				} cl_dx9_surface_info_khr;

				#endif

				/******************************************************************************/

				/* Error Codes */

				#define CL_INVALID_DX9_MEDIA_ADAPTER_KHR                -1010

				#define CL_INVALID_DX9_MEDIA_SURFACE_KHR                -1011

				#define CL_DX9_MEDIA_SURFACE_ALREADY_ACQUIRED_KHR       -1012

				#define CL_DX9_MEDIA_SURFACE_NOT_ACQUIRED_KHR           -1013

				/* cl_media_adapter_type_khr */

				#define CL_ADAPTER_D3D9_KHR                              0x2020

				#define CL_ADAPTER_D3D9EX_KHR                            0x2021

				#define CL_ADAPTER_DXVA_KHR                              0x2022

				/* cl_media_adapter_set_khr */

				#define CL_PREFERRED_DEVICES_FOR_DX9_MEDIA_ADAPTER_KHR   0x2023

				#define CL_ALL_DEVICES_FOR_DX9_MEDIA_ADAPTER_KHR         0x2024

				/* cl_context_info */

				#define CL_CONTEXT_ADAPTER_D3D9_KHR                      0x2025

				#define CL_CONTEXT_ADAPTER_D3D9EX_KHR                    0x2026

				#define CL_CONTEXT_ADAPTER_DXVA_KHR                      0x2027

				/* cl_mem_info */

				#define CL_MEM_DX9_MEDIA_ADAPTER_TYPE_KHR                0x2028

				#define CL_MEM_DX9_MEDIA_SURFACE_INFO_KHR                0x2029

				/* cl_image_info */

				#define CL_IMAGE_DX9_MEDIA_PLANE_KHR                     0x202A

				/* cl_command_type */

				#define CL_COMMAND_ACQUIRE_DX9_MEDIA_SURFACES_KHR        0x202B

				#define CL_COMMAND_RELEASE_DX9_MEDIA_SURFACES_KHR        0x202C

				/******************************************************************************/

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clGetDeviceIDsFromDX9MediaAdapterKHR_fn)(

				    cl_platform_id                   platform,

				    cl_uint                          num_media_adapters,

				    cl_dx9_media_adapter_type_khr *  media_adapter_type,

				    void *                           media_adapters,

				    cl_dx9_media_adapter_set_khr     media_adapter_set,

				    cl_uint                          num_entries,

				    cl_device_id *                   devices,

				    cl_uint *                        num_devices) CL_API_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromDX9MediaSurfaceKHR_fn)(

				    cl_context                    context,

				    cl_mem_flags                  flags,

				    cl_dx9_media_adapter_type_khr adapter_type,

				    void *                        surface_info,

				    cl_uint                       plane,                                                                          

				    cl_int *                      errcode_ret) CL_API_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueAcquireDX9MediaSurfacesKHR_fn)(

				    cl_command_queue command_queue,

				    cl_uint          num_objects,

				    const cl_mem *   mem_objects,

				    cl_uint          num_events_in_wait_list,

				    const cl_event * event_wait_list,

				    cl_event *       event) CL_API_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueReleaseDX9MediaSurfacesKHR_fn)(

				    cl_command_queue command_queue,

				    cl_uint          num_objects,

				    const cl_mem *   mem_objects,

				    cl_uint          num_events_in_wait_list,

				    const cl_event * event_wait_list,

				    cl_event *       event) CL_API_SUFFIX__VERSION_1_2;

				#ifdef __cplusplus

				}

				#endif

				#endif  /* __OPENCL_CL_DX9_MEDIA_SHARING_H */

									
										133

include/CL/cl_egl.h
									
										Normal file
									
												
				@@ -0,0 +1,133 @@

				/*******************************************************************************

				 * Copyright (c) 2008-2010 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				 * "Materials"), to deal in the Materials without restriction, including

				 * without limitation the rights to use, copy, modify, merge, publish,

				 * distribute, sublicense, and/or sell copies of the Materials, and to

				 * permit persons to whom the Materials are furnished to do so, subject to

				 * the following conditions:

				 *

				 * The above copyright notice and this permission notice shall be included

				 * in all copies or substantial portions of the Materials.

				 *

				 * THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

				 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

				 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

				 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY

				 * CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,

				 * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE

				 * MATERIALS OR THE USE OR OTHER DEALINGS IN THE MATERIALS.

				 ******************************************************************************/

				#ifndef __OPENCL_CL_EGL_H

				#define __OPENCL_CL_EGL_H

				#ifdef __APPLE__

				#else

				#include <CL/cl.h>

				#include <EGL/egl.h>

				#include <EGL/eglext.h>

				#endif  

				#ifdef __cplusplus

				extern "C" {

				#endif

				/* Command type for events created with clEnqueueAcquireEGLObjectsKHR */

				#define CL_COMMAND_EGL_FENCE_SYNC_OBJECT_KHR  0x202F

				#define CL_COMMAND_ACQUIRE_EGL_OBJECTS_KHR    0x202D

				#define CL_COMMAND_RELEASE_EGL_OBJECTS_KHR    0x202E

				/* Error type for clCreateFromEGLImageKHR */

				#define CL_INVALID_EGL_OBJECT_KHR             -1093

				#define CL_EGL_RESOURCE_NOT_ACQUIRED_KHR      -1092

				/* CLeglImageKHR is an opaque handle to an EGLImage */

				typedef void* CLeglImageKHR;

				/* CLeglDisplayKHR is an opaque handle to an EGLDisplay */

				typedef void* CLeglDisplayKHR;

				/* CLeglSyncKHR is an opaque handle to an EGLSync object */

				typedef void* CLeglSyncKHR;

				/* properties passed to clCreateFromEGLImageKHR */

				typedef intptr_t cl_egl_image_properties_khr;

				#define cl_khr_egl_image 1

				extern CL_API_ENTRY cl_mem CL_API_CALL

				clCreateFromEGLImageKHR(cl_context                  /* context */,

				                        CLeglDisplayKHR             /* egldisplay */,

				                        CLeglImageKHR               /* eglimage */,

				                        cl_mem_flags                /* flags */,

				                        const cl_egl_image_properties_khr * /* properties */,

				                        cl_int *                    /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;

				typedef CL_API_ENTRY cl_mem (CL_API_CALL *clCreateFromEGLImageKHR_fn)(

					cl_context                  context,

					CLeglDisplayKHR             egldisplay,

					CLeglImageKHR               eglimage,

					cl_mem_flags                flags,

					const cl_egl_image_properties_khr * properties,

					cl_int *                    errcode_ret);

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueAcquireEGLObjectsKHR(cl_command_queue /* command_queue */,

				                              cl_uint          /* num_objects */,

				                              const cl_mem *   /* mem_objects */,

				                              cl_uint          /* num_events_in_wait_list */,

				                              const cl_event * /* event_wait_list */,

				                              cl_event *       /* event */) CL_API_SUFFIX__VERSION_1_0;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueAcquireEGLObjectsKHR_fn)(

					cl_command_queue command_queue,

					cl_uint          num_objects,

					const cl_mem *   mem_objects,

					cl_uint          num_events_in_wait_list,

					const cl_event * event_wait_list,

					cl_event *       event);

				extern CL_API_ENTRY cl_int CL_API_CALL

				clEnqueueReleaseEGLObjectsKHR(cl_command_queue /* command_queue */,

				                              cl_uint          /* num_objects */,

				                              const cl_mem *   /* mem_objects */,

				                              cl_uint          /* num_events_in_wait_list */,

				                              const cl_event * /* event_wait_list */,

				                              cl_event *       /* event */) CL_API_SUFFIX__VERSION_1_0;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clEnqueueReleaseEGLObjectsKHR_fn)(

					cl_command_queue command_queue,

					cl_uint          num_objects,

					const cl_mem *   mem_objects,

					cl_uint          num_events_in_wait_list,

					const cl_event * event_wait_list,

					cl_event *       event);

				#define cl_khr_egl_event 1

				extern CL_API_ENTRY cl_event CL_API_CALL

				clCreateEventFromEGLSyncKHR(cl_context      /* context */,

				                            CLeglSyncKHR    /* sync */,

				                            CLeglDisplayKHR /* display */,

				                            cl_int *        /* errcode_ret */) CL_API_SUFFIX__VERSION_1_0;

				typedef CL_API_ENTRY cl_event (CL_API_CALL *clCreateEventFromEGLSyncKHR_fn)(

					cl_context      context,

					CLeglSyncKHR    sync,

					CLeglDisplayKHR display,

					cl_int *        errcode_ret);

				#ifdef __cplusplus

				}

				#endif

				#endif /* __OPENCL_CL_EGL_H */

									
										119

include/CL/cl_ext.h
									
												
				@@ -1,5 +1,5 @@

				/*******************************************************************************

				- * Copyright (c) 2008-2010 The Khronos Group Inc.

				+ * Copyright (c) 2008-2013 The Khronos Group Inc.

				 *

				 * Permission is hereby granted, free of charge, to any person obtaining a

				 * copy of this software and/or associated documentation files (the

				@@ -34,15 +34,12 @@ extern "C" {

				#endif

				#ifdef __APPLE__

				-	#include <OpenCL/cl.h>

				+        #include <OpenCL/cl.h>

				    #include <AvailabilityMacros.h>

				#else

				-	#include <CL/cl.h>

				+        #include <CL/cl.h>

				#endif

				/* cl_khr_fp64 extension - no extension #define since it has no functions  */

				#define CL_DEVICE_DOUBLE_FP_CONFIG                  0x1032

				/* cl_khr_fp16 extension - no extension #define since it has no functions  */

				#define CL_DEVICE_HALF_FP_CONFIG                    0x1033

				@@ -64,7 +61,7 @@ extern "C" {

				 * before using.

				 */

				#define cl_APPLE_SetMemObjectDestructor 1

				-cl_int	CL_API_ENTRY clSetMemObjectDestructorAPPLE(  cl_mem /* memobj */, 

				+cl_int  CL_API_ENTRY clSetMemObjectDestructorAPPLE(  cl_mem /* memobj */, 

				                                        void (* /*pfn_notify*/)( cl_mem /* memobj */, void* /*user_data*/), 

				                                        void * /*user_data */ )             CL_EXT_SUFFIX__VERSION_1_0;  

				@@ -118,6 +115,52 @@ typedef CL_API_ENTRY cl_int (CL_API_CALL *clIcdGetPlatformIDsKHR_fn)(

				    cl_uint *        /* num_platforms */);

				/* Extension: cl_khr_image2D_buffer

				 *

				 * This extension allows a 2D image to be created from a cl_mem buffer without a copy.

				 * The type associated with a 2D image created from a buffer in an OpenCL program is image2d_t.

				 * Both the sampler and sampler-less read_image built-in functions are supported for 2D images

				 * and 2D images created from a buffer.  Similarly, the write_image built-ins are also supported

				 * for 2D images created from a buffer.

				 *

				 * When the 2D image from buffer is created, the client must specify the width,

				 * height, image format (i.e. channel order and channel data type) and optionally the row pitch

				 *

				 * The pitch specified must be a multiple of CL_DEVICE_IMAGE_PITCH_ALIGNMENT pixels.

				 * The base address of the buffer must be aligned to CL_DEVICE_IMAGE_BASE_ADDRESS_ALIGNMENT pixels.

				 */

				/*************************************

				 * cl_khr_initalize_memory extension *

				 *************************************/

				#define CL_CONTEXT_MEMORY_INITIALIZE_KHR            0x200E

				/**************************************

				 * cl_khr_terminate_context extension *

				 **************************************/

				#define CL_DEVICE_TERMINATE_CAPABILITY_KHR          0x200F

				#define CL_CONTEXT_TERMINATE_KHR                    0x2010

				#define cl_khr_terminate_context 1

				extern CL_API_ENTRY cl_int CL_API_CALL clTerminateContextKHR(cl_context /* context */) CL_EXT_SUFFIX__VERSION_1_2;

				typedef CL_API_ENTRY cl_int (CL_API_CALL *clTerminateContextKHR_fn)(cl_context /* context */) CL_EXT_SUFFIX__VERSION_1_2;

				/*

				 * Extension: cl_khr_spir

				 *

				 * This extension adds support to create an OpenCL program object from a 

				 * Standard Portable Intermediate Representation (SPIR) instance

				 */

				#define CL_DEVICE_SPIR_VERSIONS                     0x40E0

				#define CL_PROGRAM_BINARY_TYPE_INTERMEDIATE         0x40E1

				/******************************************

				* cl_nv_device_attribute_query extension *

				******************************************/

				@@ -130,12 +173,16 @@ typedef CL_API_ENTRY cl_int (CL_API_CALL *clIcdGetPlatformIDsKHR_fn)(

				#define CL_DEVICE_KERNEL_EXEC_TIMEOUT_NV            0x4005

				#define CL_DEVICE_INTEGRATED_MEMORY_NV              0x4006

				/*********************************

				* cl_amd_device_attribute_query *

				*********************************/

				#define CL_DEVICE_PROFILING_TIMER_OFFSET_AMD        0x4036

				/*********************************

				* cl_arm_printf extension

				*********************************/

				#define CL_PRINTF_CALLBACK_ARM                      0x40B0

				#define CL_PRINTF_BUFFERSIZE_ARM                    0x40B1

				#ifdef CL_VERSION_1_1

				   /***********************************

				@@ -201,7 +248,63 @@ typedef CL_API_ENTRY cl_int (CL_API_CALL *clIcdGetPlatformIDsKHR_fn)(

				    #define CL_PARTITION_BY_COUNTS_LIST_END_EXT         ((cl_device_partition_property_ext) 0)

				    #define CL_PARTITION_BY_NAMES_LIST_END_EXT          ((cl_device_partition_property_ext) 0 - 1)

				/*********************************

				* cl_qcom_ext_host_ptr extension

				*********************************/

				#define CL_MEM_EXT_HOST_PTR_QCOM                  (1 << 29)

				#define CL_DEVICE_EXT_MEM_PADDING_IN_BYTES_QCOM   0x40A0      

				#define CL_DEVICE_PAGE_SIZE_QCOM                  0x40A1

				#define CL_IMAGE_ROW_ALIGNMENT_QCOM               0x40A2

				#define CL_IMAGE_SLICE_ALIGNMENT_QCOM             0x40A3

				#define CL_MEM_HOST_UNCACHED_QCOM                 0x40A4

				#define CL_MEM_HOST_WRITEBACK_QCOM                0x40A5

				#define CL_MEM_HOST_WRITETHROUGH_QCOM             0x40A6

				#define CL_MEM_HOST_WRITE_COMBINING_QCOM          0x40A7

				typedef cl_uint                                   cl_image_pitch_info_qcom;

				extern CL_API_ENTRY cl_int CL_API_CALL

				clGetDeviceImageInfoQCOM(cl_device_id             device,

				                         size_t                   image_width,

				                         size_t                   image_height,

				                         const cl_image_format   *image_format,

				                         cl_image_pitch_info_qcom param_name,

				                         size_t                   param_value_size,

				                         void                    *param_value,

				                         size_t                  *param_value_size_ret);

				typedef struct _cl_mem_ext_host_ptr

				{

				    /* Type of external memory allocation. */

				    /* Legal values will be defined in layered extensions. */

				    cl_uint  allocation_type;

					/* Host cache policy for this external memory allocation. */

				    cl_uint  host_cache_policy;

				} cl_mem_ext_host_ptr;

				/*********************************

				* cl_qcom_ion_host_ptr extension

				*********************************/

				#define CL_MEM_ION_HOST_PTR_QCOM                  0x40A8

				typedef struct _cl_mem_ion_host_ptr

				{

				    /* Type of external memory allocation. */

				    /* Must be CL_MEM_ION_HOST_PTR_QCOM for ION allocations. */

				    cl_mem_ext_host_ptr  ext_host_ptr;

				    /* ION file descriptor */

				    int                  ion_filedesc;

				    /* Host pointer to the ION allocated memory */

				    void*                ion_hostptr;

				} cl_mem_ion_host_ptr;

				#endif /* CL_VERSION_1_1 */
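
The cl_khr_image2D_buffer comment in the cl_ext.h hunk above says the row pitch must be a multiple of CL_DEVICE_IMAGE_PITCH_ALIGNMENT pixels. A minimal sketch of that rounding step follows. The helper names `round_up_pixels` and `aligned_row_pitch_bytes` are hypothetical and not part of any header in this diff, and the sketch assumes the alignment value has already been queried from the device and is non-zero.

```c
#include <stddef.h>

/* Hypothetical helper: round width_px up to the next multiple of align_px.
 * Assumes align_px > 0 (e.g. a value queried via
 * CL_DEVICE_IMAGE_PITCH_ALIGNMENT). */
size_t round_up_pixels(size_t width_px, size_t align_px)
{
    return ((width_px + align_px - 1) / align_px) * align_px;
}

/* Hypothetical helper: smallest row pitch in bytes whose length in pixels
 * is a multiple of pitch_align_px. bytes_per_pixel depends on the chosen
 * image format (channel order and channel data type). */
size_t aligned_row_pitch_bytes(size_t width_px,
                               size_t bytes_per_pixel,
                               size_t pitch_align_px)
{
    return round_up_pixels(width_px, pitch_align_px) * bytes_per_pixel;
}
```

For example, a 100-pixel-wide image with 4 bytes per pixel and a 64-pixel pitch alignment would need a pitch of 128 pixels, i.e. 512 bytes, under these assumptions.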

Compare commits

9727 Commits

mesa-10.0 ... cros-mesa-

3

.dir-locals.el

View File

1

.gitignore vendored

View File

7

Android.mk

View File

7

CleanSpec.mk Normal file

View File

113

Makefile.am

View File

13

SConstruct

View File

2

VERSION

View File

4

autogen.sh

View File

96

common.py

View File

1721

configure.ac

View File

318

docs/GL3.txt

View File

256

docs/README.CYGWIN

View File

102

docs/README.MITS

View File

207

docs/README.QUAKE

View File

52

docs/README.THREADS

View File

31

docs/README.UVD

View File

43

docs/README.VCE Normal file

View File

20

docs/README.WIN32

View File

54

docs/autoconf.html

View File

2

docs/conform.html

View File

1

docs/contents.html

View File

349

docs/devinfo.html

View File

15

docs/dispatch.html

View File

70

docs/egl.html

View File

42

docs/envvars.html

View File

8

docs/faq.html

View File

276

docs/index.html

View File

30

docs/install.html

View File

3

docs/license.html

View File

111

docs/llvmpipe.html

View File

4

docs/opengles.html

View File

59

docs/openvg.html

View File

44

docs/relnotes.html

View File

150

docs/relnotes/10.0.1.html Normal file

View File

161

docs/relnotes/10.0.2.html Normal file

View File

206

docs/relnotes/10.0.3.html Normal file

View File

191

docs/relnotes/10.0.4.html Normal file

View File

173

docs/relnotes/10.0.5.html Normal file

View File

83

docs/relnotes/10.0.html

View File

254

docs/relnotes/10.1.1.html Normal file

View File

179

docs/relnotes/10.1.2.html Normal file

View File

90

docs/relnotes/10.1.3.html Normal file

View File

100

docs/relnotes/10.1.4.html Normal file

View File

105

docs/relnotes/10.1.5.html Normal file

View File

138

docs/relnotes/10.1.6.html Normal file

View File

75

docs/relnotes/10.1.html Normal file

View File

61

docs/relnotes/10.2.1.html Normal file

View File

181

docs/relnotes/10.2.2.html Normal file

View File

130

docs/relnotes/10.2.3.html Normal file

View File

127

docs/relnotes/10.2.4.html Normal file

View File

188

docs/relnotes/10.2.5.html Normal file

View File

118

docs/relnotes/10.2.6.html Normal file

View File

211

docs/relnotes/10.2.7.html Normal file

View File

130

docs/relnotes/10.2.8.html Normal file

View File

101

docs/relnotes/10.2.9.html Normal file

View File

97

docs/relnotes/10.2.html Normal file

View File

158

docs/relnotes/10.3.1.html Normal file

View File

115

docs/relnotes/10.3.2.html Normal file

View File

209

docs/relnotes/10.3.3.html Normal file

View File

106

docs/relnotes/10.3.4.html Normal file

View File

88

docs/relnotes/10.3.5.html Normal file

View File

124

docs/relnotes/10.3.6.html Normal file

View File

93

docs/relnotes/10.3.7.html Normal file

View File

335

docs/relnotes/10.3.html Normal file

View File

97

docs/relnotes/10.4.1.html Normal file

View File

127

docs/relnotes/10.4.2.html Normal file

View File

145

docs/relnotes/10.4.3.html Normal file

View File

100

docs/relnotes/10.4.4.html Normal file

View File

114

docs/relnotes/10.4.5.html Normal file

View File

143

docs/relnotes/10.4.6.html Normal file

View File

134

docs/relnotes/10.4.7.html Normal file

View File

259

docs/relnotes/10.4.html Normal file

View File

212

docs/relnotes/10.5.0.html Normal file

View File

217

docs/relnotes/10.5.1.html Normal file

View File

130

docs/relnotes/10.5.2.html Normal file

View File

77

docs/relnotes/10.6.0.html Normal file

View File

2

docs/relnotes/7.6.html

View File

115

docs/relnotes/9.2.3.html Normal file

View File

102

docs/relnotes/9.2.4.html Normal file

View File